Proposal for a Bangla (or Bengali) Script Root Zone Label Generation Rule-set (LGR) - Incorporating comments from IP
LGR Version: 4.0
Current Date: 2020-03-02
Document version: 482
Authors: Neo-Brahmi Generation Panel [NBGP]
1 General Information
This document lays down the Label Generation Rule Set (LGR) for the Bangla (or 'Bengali')¹ script under the general rubric of the Neo-Brāhmī Writing System. The three main components of the Bangla Script LGR, i.e. (i) Code-point repertoire, (ii) Variants and (iii) Whole Label Evaluation Rules, are described in detail here, after a brief historical background of the script in Section 3. All these components will be incorporated in a machine-readable format in an XML file named "proposal-bengali-lgr-02mar20-en.xml". Labels for testing can be found in the accompanying text document "bangla-test-labels-02mar20-en.txt".
2 Script for Which the LGR Is Proposed
ISO 15924 Code: Beng
ISO 15924 Key N°: 325
ISO 15924 English Name: Bengali (Bangla)
Latin transliteration of native script names [in IPA]: bɑːŋlɑː, ôxômiya
Native names of the script: বাংলা, অসমীয়া
Maximal Starting Repertoire (MSR) version: MSR-4
¹ The term 'Bangla' is used in the descriptive text and the term 'Bengali' is used in the normative part of this proposal.
3 Background on Script & Principal Languages Using It
3.0 Introduction
'Bangla' (or Bengali) is historically and genealogically regarded as an eastern Indo-Aryan language, with around 178.2 million speakers in Bangladesh (98% speakers) and 83.4 million speakers in the Indian states of West Bengal (68.37 million), Tripura (2.15 million), South Assam (7.3 million), Odisha (0.49 million) and Delhi (0.21 million), as well as in the Andaman and Nicobar Islands (close to a hundred thousand), accounting for 8.3% of India. It is a major language in Jharkhand (2.6 million) too, and a language with a sizable population in Bihar (0.44 million). Apart from these, there are a huge number of Bangla-speaking diasporas spread all over the world. It is the seventh largest spoken and written language in the world. Bangla is the national and official language of Bangladesh and one of the 22 official languages in India (listed in the 8th Schedule of the Indian Constitution). It is also one of the official languages of Sierra Leone. The script is also called Bangla [102], which is an eastern variety of the 'Brāhmī' writing system, written from left to right. Historically, it derives from the Brāhmī alphabet as used in the Ashokan inscriptions (269-232 BC).
Bangla and its cognate languages, as mentioned above, together form a linguistic group known as the Eastern New Indo-Aryan (NIA). There is a gross inadequacy of inscriptions and manuscripts in the Eastern Apabhranśa or 'Avahaṭṭha', except for small inscriptions and the manuscripts of the Tantric Buddhist text titled 'Caryyācaryyaviniścaya' or the Caryā-Pada [114], dating back to the 9th-11th century. As a result, there is not much epigraphic evidence for the development of its writing system. However, what evidence is available of the genesis of the Bangla writing system is discussed in Section 3.1 [109]. Historically, the Bangla language is divided into three periods, as evident from various sources:
(i) Firstly, the Old Bangla Period (roughly 950/1000 to AD 1200/1350), of which three specimens are found: (a) 47 Caryā songs, the Dohākōṣa of Saraha and the Dohākōṣa of Kānha (mostly in Apabhranśa), and the Ḍākārṇava (in a variety of Prakrt); (b) Old Bangla specimens of over 300 words in a commentary [141].
(ii) Then there is the Middle Bangla Period, 1200-1800 AD, again divided into three stages: (a) Transitional Middle Bangla (1200-1300 AD, for which no genuine specimens are found) [147]; (b) Early Middle Bangla (1300-1500 AD); and (c) Late Middle Bangla (1500-1800 AD).
(iii) Finally, after 1800 AD, we find the Modern or New Bangla, marked by the introduction of written prose [109] in the books of Fort William College (established in 1800). The colloquial variety of Bangla based on the speech variety of Calcutta (called 'Kolkata' now) made its first appearance through the Hutōm Pẽcāra Nakśā (1862) by Peari Chand Mitra. The influence of English on the vocabulary, idioms and expressions, as well as on the writing styles of Bangla, is significant by this time. The fonts and types for Bangla developed during this time also spread to all parts of the Bangla speech community [101, 120]. The same fonts, with some extensions, were also used for the neighbouring languages deploying this writing system.
Bangla prose had developed two literary styles during the 19th-20th century: the Sādhubhāṣā (সাধুভাষা, Elegant Language or Style) and the Calitabhāṣā (চলিতভাষা, Current Language or Modern Style). It is the latter style that is prevalent today in written prose.
The Language Movement in Bangladesh (the then East Pakistan) began in 1948, as civil society dissented to the elimination of the Bangla script from currency and stamps, which had been in use since the British Raj. The movement reached its pinnacle in 1952 when, on 21 February, the police fired on demonstrating students and civilians, causing numerous injuries and deaths.² Later, following the Language Movement, on 27 April 1952 the All Party National Language Committee decided to demand the establishment of an organization for the promotion of the Bengali language. Bangla Academy, Dhaka, right from its inception in 1955, has been engaged in promoting and fostering Bangla as the lingua franca of the country, before and after independence from Pakistan in 1971. Through the various commissions and committees constituted by the Government of Bangladesh (Banladesa Jatiya Siksa Kamisana in 1972, Jatiya Siksa Upadesta Parisad in 1979, Banla Bhasa Bastabayana Sela in 1982, Banla Bhasa Kamiti in 1983, etc.³) after independence in 1971, Bangla was made the primary medium of instruction/communication in all Governmental and educational activities. Through a great struggle and bloodshed, the Bengalis established Bangla as an official language of the state.⁴
² The UN declared Ekuśe February (21st February) as the International Mother Language Day at the UNESCO General Conference in Paris on 17 November 1999, "in recognition of the sanctity and preservation of all vernacular languages in the world". 22
³ Bāṅlā Bhāṣā Kamiṭi. 1983. Bāṅlā Bhāṣā Kamiṭi Riporṭ (Report of the Bangla Bhasha Committee). Dhaka: Śikṣā, Dharma, Krīṛā O Saṅskṛti Mantraṇālaya, People's Republic of Bangladesh.
⁴ Chakraborty, Rajib. 2018. The Fishermen's Community: A Language-Culture Interplay (A Study of Post-1971 Select Bangla Novels). Unpublished PhD Dissertation, Visva-Bharati.
3.1 Written Bangla
The 'Bangla alphabet' (বাংলা লিপি, Bānglā lipi, ISO 15924) is derived from the Brāhmī writing system, which is related to the Nāgarī (also known as Devanāgarī⁵) script [108] as well as to the Tirhutā writing system [106]. Considered to be the fifth most widely used writing system in the world, this combined Bangla-Asamiyā-Manipuri script (showing some variations for Asamiyā and Meitei or Bisnupriya Manipuri) (130) was used in the eastern Indian Sanskrit manuscripts too. For Chakma in India and Bangladesh and for Kokborok in Tripura, it was and still is one of the scripts used. A close variant called Tirhutā (123; now available also in Unicode 10.0 as 11480-114DF; see 110) or Mithilākṣara was used for Maithili from the 14th century until the early 20th century [106]. In this context one finds a mention of 'Sylheti Nāgarī lipi' or 'Siloti' (added to the Unicode Standard in March 2005 with the release of version 4.1), the details of which could be of interest only to historians and historical linguists (see 137 and 144). But Sylheti Bangla is now generally written by many in the modern-day Bangla script for all practical purposes.
Originally, during the reign of the Pāla dynasty (750-1154 AD) in eastern India, and perhaps even earlier during the Malla period (694 AD onwards), the present-day Bangla writing system got a shape comparable to the modern-day one [111, 119]. A pictorial description of Brāhmī to Modern Bangla script could be presented here in tabular form:
Modern ক জ ম র স অ
k j m r s a
Table 1: Pictorial depiction of the evolution of Brāhmī to Bangla
⁵ William Dwight Whitney, in his Sanskrit Grammar, unequivocally said: "This name (Devanāgarī) is of doubtful origin and value" (Whitney, William Dwight. 1994 reprint. Sanskrit Grammar. New Delhi: Motilal Banarasidass Publishers, p. 1).
The inscriptional evidence in Brāhmī is found in the Archaic Brāhmī from the 3rd century BC to the 1st century BC, in Middle Brāhmī soon after (1st-3rd century AD), and then on in the Late Brāhmī (4th-6th century AD). This evidence can be seen in both Bangladesh and West Bengal [108] in (1) the Mahasthanagara (Bogra district, Bangladesh; the ancient name being Pundranagara or Paundravardhanapura) inscriptions, (2) Brāhmī (and Kharoṣṭhī) inscriptions from the lower 'Gangetic Bengal', and (3) copper plate inscriptions of the Imperial Guptas from the northern part of West Bengal and North-West Bangladesh, in the areas under Dharmaditya, Gopachandra and Samācāradeva (about whom one only knows from five copper plates found in Kotalipara in the Faridpur district in Bangladesh, one in Mallasarul in the Burdwan district (West Bengal) and one in Jayramapura (Ballesvara district, now in Odisha)). These epigraphs from the eastern part of undivided India (dating back to the 4th-6th centuries AD) showed some characteristic features of letters (especially in ম 'ma', ল 'la', শ 'sa', স 'sa' and হ 'ha'), which led to the development of the eastern variety of the Gupta script.
Epigraphic records from Bangladesh demonstrate remarkable developments in Eastern Brāhmī. Relevant in this context are the Tippera copper plate inscription of the 'Samatata' rulers (139, pp. 265) such as Lokanātha (dated to the latter half of the 7th century AD), the Kailan inscription of Śrīdharana Rāta, and the Astafpur copper plates. The letters seem to hang down from wedge-shaped solid triangles, with right-hand verticals bending down at the bottom, because of which the script was described by Prinsep and Fleet as Kuṭila-lipi (literally 'cursive writing style'), whereas the term Siddhamātrikā (as a matra or bar is placed over each of the letters) was used by Al Biruni (973-1048) to designate the script of Northern India. The next stage of development is illustrated by the 9th-century copper plate inscriptions from Khalimpur of the reign of Dharmapāla, from Monghyr and Nalanda of the time of Devapāla in Bihar, and from Jagjīvanpura (Malda) of the reign of Mahendrapāla. The Siddhamātrikā (mentioned as 'Siddham' in Chinese sources) is said to have been prevalent also in this region up to the end of the tenth century. Also called the Gauri (i.e. Gandi) in Pūrvadeśā or the Eastern country, it was regarded as the same script to which the appellative Proto-Bangla is given, with Bangla characteristics appearing in rudimentary forms in the period between AD 875 and AD 1025. In some epigraphs it is considered as belonging to the second quarter of the eleventh century AD. Flattening of head-marks becomes prominent in comparison to the wedge-shaped serifs. An important landmark in the development of the Bangla script is the Ramaganja copper plate inscription of Mahāmānḍalika in the last quarter of the eleventh century AD. It is the earliest document from this entire region which bears the letter m with a tick rising upwards. The full vowel i develops a tick at the right end of the upper horizontal bar above and a curved hook below. Initial e approaches the modern Bangla character. A mature form of Proto-Bangla, the immediate precursor of the Bangla script, is illustrated in the inscriptions of the Varmana, Sena and Deva rulers of the twelfth and thirteenth centuries [104].
The evolution of the Bangla script (cf. 136) is aligned with the story of the advancement of printing technology. The first "movable type" scripts were technically created and used while printing Nathaniel Brassey Halhed's (1751-1830) 1778 book titled A Grammar of the Bengal Language. In 1785, Governor-General Warren Hastings (1732-1818) requested another civilian, Charles Wilkins (1749-1836), to cut punches for Bangla printing characters. The current printed form of the Bangla script appeared soon after. It is generally agreed that Wilkins developed the Bangla print script [111]. He passed on this knowledge to Pancanana Karmakara (-1804), a renowned artist in Bengal. Later it was Karmakar and his family that became famous in Bangla printing technology. Shepherd was another assistant of Wilkins in this designing of the script, which became more angular, with sharper turns and edges [133]. A few archaic letters were modernized during the 19th century. The script was standardized by Pandit Ishwar Chandra Vidyasagar when the Bangla type fonts were to be used to publish on a large scale under the Calcutta School Book Society [116, for several references].
Much later, the Linotype technique, invented by Ottmar Mergenthaler (1854-1899) in 1886, was introduced into Bangla printing in 1935 by the efforts of Suresh Chandra Majumdar (1888-1954), Rajsekhar Basu (1880-1960), Jatindra Kumar Sen (1882-1966) and his disciple Sushil Kumar Bhattacharya, and began to be used by the Anandabazara Patrika group, later followed by others. Within a few years the more advanced monotype technology came to be used in Bangla printing. However, in Bangla printing culture monotype has had a very limited acceptance, and linotype held the stage until digital technology eventually came in to replace all earlier techniques. All these could be presented in a table:
PERIOD | DESCRIPTION | NAMES
3rd Century BC | Use of the Brāhmī and Kharosthī scripts begins in the subcontinent. Brāhmī was widely used during the time of the Mauryan King Aśoka. In one theory, Brāhmī is based on the North Semitic alphabet but suitably modified to fit the needs of local languages. It is currently believed to have been an independent development. | Brāhmī
1st-3rd Century AD | The Kusana script, named after the Kusana royal dynasty. | Kusana script
4th-5th Century AD | The next stage of its evolution was into the Gupta script, named after the Gupta royal dynasty. | Gupta script
7th Century AD | Epigraphic records from Bangladesh demonstrate remarkable developments in Eastern Brāhmī, giving rise to the Kuṭila-lipi. | Kuṭila-lipi
8th Century AD | Some copper plate inscriptions are found in Khalimpur, Bangladesh, during the reign of Dharmapāla, from Monghyr and Nālandā in Bihar of the time of Devapāla, and from Jagjīvanapura in West Bengal of the reign of Mahendrapāla. | Siddhamātrikā
9th Century AD until 1025 AD | Proto-Bangla characteristics in rudimentary forms develop. An important landmark in the development of the Bangla script is the Ramaganja copper plate inscription of Mahāmāndalika, found in the last quarter of the eleventh century AD. | Proto-Bangla Script & Language
12th-13th Century AD | A mature form of Proto-Bangla, the immediate precursor of the Bangla script, is found in the inscriptions of the Varmana, Sena and Deva rulers of the twelfth and thirteenth centuries. | Matured Proto-Bangla
14th-15th Century AD | The characteristics of the typical Bangla script began to develop, as could be seen in the copper plate inscription of Vijayamānikya-I of Tripura, dated 1478 AD, which also illustrates forms of Bangla letters in the fifteenth century AD. | Modern Bangla Script era begins (see Ross 1999)
16th-17th Century AD | The chart of the Bangla alphabet appended to the China Monuments, published from Amsterdam in 1667, and The Code of Gentoo Law, published from London in 1776, both show a chart of the Bangla alphabet. They show 16 vowel letters, including the long 'ৡ' 'ḹ', Anusvāra and Visarga, and 34 consonants. | Printed Charts of Bangla
18th-19th Century AD | Charles Wilkins develops printing in Bangla in 1778 and Vidyasagar reforms it. | Bangla Type Fonts
Table 2: Development of the Bangla Writing System
The overall development of the Bangla script from the Kuṭila-lipi period to Modern Bangla could be seen here in Table 3 ([102 and 146]; see also the web-page in 147).
Table 3: Bangla Script in Different Centuries
3.2 Languages Considered
Below is the tabular representation of the languages using the Bangla script that are placed on EGIDS Scale 1-6 (see 117 for details). Some languages under EGIDS 5 and 6 have also developed their own scripts for printing and publishing. Some had used the Bangla script earlier (such as Bodo) or used it in West Bengal at some point of time (Santali) but have later shifted to another writing system. Bodo is now written in Nāgarī or Devanāgarī, and for Santali one uses both Nāgarī/Devanāgarī and Ol-chiki (145). For the purposes of the Bangla LGR, only languages belonging to the EGIDS scale 1 to 4 have been considered. Consider the following table:
EGIDS Scale 1 | EGIDS Scale 2 | EGIDS Scale 3 | EGIDS Scale 4 | EGIDS Scale 5 | EGIDS Scale 6
Bangla (Bengali)
Santali, Bodo, Riang, Khumi, Mru(ng), Asho
Lepcha, Pnar, Koda, Kora, Chak
Asamiyā (Assamese)
Koch or Rajabansi
Malto or Malpahariya
Manipuri or Meitei
Bisnupriya Manipuri, Kok-Borok (Tripura & Bangladesh)
Chakma, Hajong, Mundari & Kurux (of Bangladesh)
Toto, Rohingya, Tippera, Megam, Tanchangya
Usoi, Limbu, Sadri or Oraon
Bhumij or Mundari, Bawm, Chin
Table 4: Main languages in India and Bangladesh that use the Bangla script, on the EGIDS Scale
3.3 Notable Features of Bangla Script [150]
The Bangla writing system has certain features that show how it has to be written, or how type-setting in Bangla could be done. This section is followed by a section that explains the code points (and fixed code-point sequences) which show certain distinctive characteristics of Bangla and which make up the Repertoire. The next sections will also cover the 'akshar'-formation rules (ABNF) showing character classes, Whole Label Evaluation (WLE) and Context Rules, as well as In-Script and Cross-Script Variants. Here we present some basic features of the script and its pronunciation:
• The Bangla script is an alpha-syllabic writing system in which all consonants are assumed to contain an accompanying 'inherent' vowel (theoretically before or after each consonant). It varies between ɔ and o depending on the position of the consonant in the word. At times these 'assumed' or 'inherent' vowels are not pronounced at all [142].
• Vowels can be written as independent letters, or by using a variety of diacritical marks which are written above, below, before, after, or on both sides of the consonant they follow in pronunciation [105].
• All Bangla consonants, when pronounced in isolation, are uttered with an inherent vowel -ɔ; hence ক 'k', খ 'kh' or গ 'g' are usually pronounced as [kɔ], [khɔ] or [gɔ], etc. Phonologically, the Bangla vowel -ɔ corresponds to the Hindi schwa ə.
• When consonants occur together in clusters, special conjunct letters are formed. In printed Bangla many of these consonantal clusters or conjoined consonants are in use. The letters for the consonants other than the final one in the group are generally reduced. But there are a few special conjunct characters which are compounds of the consonant characters, e.g. ক্ (k) + ষ (s) = ক্ষ (ks), ঞ্ (n) + জ (j) = ঞ্জ (nj), জ্ (j) + ঞ (n) = জ্ঞ (jn), হ্ (h) + ম (m) = হ্ম (hm). There are other issues also: র as the second member of a cluster is reduced to a secondary symbol, e.g. প্ (p) + র (r) = প্র (pr), ষ্ (s) + ট্ (t) + র (r) = ষ্ট্র (str) (as in উষ্ট্র ustra "camel"). য (y), when used as a primary symbol, represents jɔ in Bangla. But its secondary symbol (allograph), jɔ-phala, has two phonetic values. When added to the initial consonant in a word it is a vowel æ (as in শ্যামল (syamala) "green", র‍্যাপার (ryapara) "wrapper", etc.). But after a non-initial consonant it just doubles it in pronunciation (as in কার্য, ধার্য, etc.). The র্ (r) + য (y) combination has two renderings: র‍্য (ry) and র্য (ry). In the case of দ্ (d) + ধ (dh), গ্ (g) + ধ (dh) and ন্ (n) + ধ (dh), the shape of the second member is changed, e.g. দ্ধ (ddh), গ্ধ (gdh) and ন্ধ (ndh) respectively. The solitary example of র্ (r) + ঋ (r) = র্ঋ (as in নৈর্ঋত nairrt "Southwest"), used mostly in cases of Classical borrowings, shows the use of the secondary symbol of a consonant followed by the primary symbol of a vowel. The inherent vowel only applies to the final consonant of the cluster.
• In consonant clusters many consonants took a completely different form. Some typical examples are ক্ত (kt), ক্র (kr), ক্ষ (ks), গ্ধ (gdh), জ্ঞ (jn), ঞ্চ (nc), ঞ্জ (nj), ত্ত (tt), ন্ত (nt), ন্ধ (ndh), ব্ধ (bdh), ভ্র (bhr), ম্ব (mb), স্ত (st), etc. র has two allographs apart from its full shape: one is 'repha', as found in র্ক (rk) and র্প (rp), and another is ra-phala, as in প্র (pr) and ক্র (kr). ষ্ণ (s+n) is another one, where the cerebral nasal consonant sign takes a queer shape [151].
• The Bangla script has at least fifty-two primary symbols and quite a few allographs (positional variants of them) corresponding to forty-four phonemes (7 oral and 7 nasal vowels and 30 consonants) (150), or functional speech sounds, with some obvious redundancies, although in one of the first phonemic analyses the number was thought to be thirty-five phonemes [140].
• As mentioned above, in Bangla several graphemic symbols have secondary shapes, technically called 'allographs', with a complementary distribution in each case. These graphs or markings are generally added to the following positions of the primary symbol [113]:
1) Below (e.g. কু (ku), ন্ত (nta), কূ (kū), হ্র (hra), etc.)
2) Above (e.g. চঁ (cã), র্ক (rka), etc.)
3) Right side (e.g. কা (ka), কং (kan), etc.)
4) Left side (e.g. কে (ke))
5) Left side and above simultaneously (e.g. কৈ (kai), কি (ki), etc.)
6) Right side and above simultaneously (e.g. কী (kī))
7) Right side and left side simultaneously (e.g. কো (ko))
8) Right side, left side and above simultaneously (e.g. কৌ (kau))
• As for the complementary distribution of vowel letters (word- or syllable-initial) and vowel Matras, which are relevant for ABNF, let us consider the following. Besides some simple vowel modifiers called 'Kars' in Bangla (also referred to as Matra in the other LGR documents of Neo-Brāhmī), there are some combinatory modifiers of Bangla vowels with certain consonants. For example, whereas
আ U+0986 BENGALI LETTER AA is substituted by া U+09BE BENGALI VOWEL SIGN AA,
ই U+0987 BENGALI LETTER I is substituted by pre-posed ি U+09BF BENGALI VOWEL SIGN I,
ঈ U+0988 BENGALI LETTER II is substituted by ী U+09C0 BENGALI VOWEL SIGN II, or
উ U+0989 BENGALI LETTER U is substituted by ু U+09C1 BENGALI VOWEL SIGN U by marking below the primary grapheme,
there are some special vowel modifiers of উ, as in the following combined letters:
গু gu, rather than writing as গ (g) + ু (u)
রু ru, rather than writing as র (r) + ু (u)
শু śu, rather than writing as শ (s) + ু (u)
হু hu, rather than writing as হ (h) + ু (u)
ন্তু ntu, rather than writing as ন্ (n) + ত (t) + ু (u)
Similarly, there could be vowel modifiers of ঊ or '(Long) ū' as well, e.g. ভ্ (bh) + র (r) (ভ্রূ bhru "eyebrow"), শ্ (s) + র (r) (শ্রূ sru), ঋ (r) after হ (h) (হৃ hr), etc.
• There have been many notable contributions in simplifying and modifying Bangla spellings and combinatory techniques, especially by scholars such as Pabitra Sarkar (1992) [134]. In this, there has been an attempt to reduce the number of allographs of both vowels and consonants in clusters, and it has been widely accepted in the printing of school texts in both Bangladesh and West Bengal [151, 152]. As of now, two systems, the old (traditional) and the new, go on side by side, operative in different domains.
However, in preparation of this LGR document, the aim has been to consider the widely used and usable sequences and combinations and their variations across the sister scripts belonging to the basket of Brāhmī writing systems.
Bangla Academy, Dhaka published Standard Bangla Spelling Rules in 1992, following the recommendations of a committee constituted through a workshop jointly organized by the Jatiya Siksakrama and Pathyapustaka Board in 1988. A thoroughly revised edition of the Rules was published in September 2012.⁶
After the establishment of Banla Akademi of West Bengal in 1986, its first President, Annadasankar Ray (1904-2002), in his inaugural address gave a direction for standardization of the Bangla alphabet, script and spelling system, and clearly argued that they would not blindly follow the Sanskritic model of conventional grammar. A broad list of proposals was sent to experts on Bangla, and a broad agreement was reached for 'homogenization of Bangla spelling' by 1988. Based on opinions received from different quarters, a unanimous list of 'rules' was agreed upon. This was published as a 'Spelling Dictionary' titled Ākādemi Bānāna Abhidhāna (1997), which was obviously more comprehensive than 'The University of Calcutta proposals' made in 1936. Along with the 'rationalization' of spellings, another step was taken to make the writing system easier to read by making the symbols used, both single and combined ones, more 'transparent'. These reforms were originally suggested by Sarkar (1987, first published in 1978) [134][153], where he used the terms Swaccha ('Transparent') and Aswaccha ('Opaque' or non-transparent), even adding Ardha Swaccha ('half transparent') in between the two. Some sample examples are:
⁶ Bangla Academy. 2012. Bāṅlā Ekaḍemī Pramita Bāṅlā Bānānera Niyama (Bangla Academy Standard Bangla Spelling Rules). Dhaka: Bangla Academy.
Transparent: ন্ন (nn), প্ত (pt), স্ত (st), where both members of the cluster can be recognized.
Opaque: where neither of the two could be (easily) recognized, e.g. ক্ষ ks (ক্ k + ষ s), জ্ঞ jn (জ্ j + ঞ n), ঙ্গ ng (ঙ্ n + গ g), হ্ম hm (হ্ h + ম m).
Semi-transparent: প্র (pr), র্প (rp), where one symbol is recognizable and the other is not.
In the case of three-term clusters, at least one symbol will not be transparent, e.g. স্ত্র str (স্ s + ত্ t + র r), ষ্ট্র str (ষ্ s + ট্ t + র r), etc.
There were in fact two types of proposals. One concerned the shape of the letters: those of consonant + vowel (CV) combinations, and conjuncts, which are consonant + consonant combinations. There were further complex shapes, i.e. those of consonant + consonant + (consonant +) vowel (CC(C)V) signs, as in প্রু (pru) or স্ক্রু (skru). Some decisions in this area were necessary because a few of the CC(C) symbols represented complexities that made learning them difficult for children. The other dealt with the spellings of words only, without any reference to the shapes of the letters in which they were written. The basic objective here was 'one word, one spelling' to the greatest extent that was possible [151].
Below we place a statement of the most salient changes that affect the consonant + vowel combinations [153]:
a. The variants of the short u (হ্রস্ব উ-কার, hrasva u-kāra) vowel sign have been brought down to one, i.e. ু. So gu, formerly written with a special combined shape, is now গু (গ + ু). Similarly ru > রু, śu > শু, hu > হু, and therefore, for cluster + short u sign, ntu > ন্তু (ন + ্ + ত + উ), stu > স্তু (স + ্ + ত + উ).
b. The variants of the long u (দীর্ঘ ঊ-কার, dīrgha ū-kāra) have also been reduced: rū > রূ, bhrū > ভ্রূ (ভ bh + ্ + র r + ঊ ū), drū > দ্রূ (দ d + ্ + র r + ঊ ū), śrū > শ্রূ (শ ś + ্ + র r + ঊ ū).
c. The variants of ঋ-কার (ṛ-kāra, the secondary symbol of ṛ) have been brought down to one: hṛ > হৃ.
Regarding consonant + consonant + (consonant) ... + (vowel) clusters, Paschimbanga Bangla Akademi proposed transparent or semi-transparent shapes for clusters to the extent admissible in the Bangla writing system. Some examples will clarify the proposal; each entry gives the cluster, its transliteration and, in parentheses, the component letters that the proposed shape keeps recognizable [153]:
ব্ধ bdh (ব্ b + ধ dh), দ্ধ ddh (দ্ d + ধ dh), ন্থ nth (ন্ n + থ th), ঞ্চ ñc (ঞ্ ñ + চ c), ঞ্ছ ñch (ঞ্ ñ + ছ ch), ঞ্জ ñj (ঞ্ ñ + জ j), ক্ত kt (ক্ k + ত t), ক্র kr (ক্ k + র r), গ্ধ gdh (গ্ g + ধ dh), ঙ্ক ṅk (ঙ্ ṅ + ক k), ঙ্গ ṅg (ঙ্ ṅ + গ g), ষ্ণ ṣṇ (ষ্ ṣ + ণ ṇ), ন্ধ্র ndhr (ন্ n + ধ্ dh + র r), ণ্ড্র ṇḍr (ণ্ ṇ + ড্ ḍ + র r), ক্ত্র ktr (ক্ k + ত্ t + র r)
3.3.1 The Consonants
As per traditional classification, Bangla consonants are categorized according to their phonetic properties, especially in terms of place and manner of articulation [107]. There are five 'Varga' (pronounced as 'Barga' in Bangla) or groups (sets or classes), distinguished by place of articulation, and one Non-'varga' group [105]. Each Varga, which corresponds to stops at a certain place of articulation, contains a series of five consonants classified as per their phonetic qualities (i.e. manner of articulation), beginning from Unvoiced and Unaspirated to Voiced and Aspirated (in the fourth column), finally ending with a homorganic or corresponding nasal [107]. Consider the following table:
'Varga' or Sets | Unvoiced -Asp | Unvoiced +Asp | Voiced -Asp | Voiced +Asp | Nasal
Velar | ক 'K' U+0995 | খ 'KH' U+0996 | গ 'G' U+0997 | ঘ 'GH' U+0998 | ঙ 'NG' U+0999
Palatal | চ 'C' U+099A | ছ 'CH' U+099B | জ 'J' U+099C | ঝ 'JH' U+099D | ঞ 'NY' U+099E
Retroflex | ট 'TT' U+099F | ঠ 'TTH' U+09A0 | ড 'DD' U+09A1 | ঢ 'DDH' U+09A2 | ণ 'NN' U+09A3
Dental | ত 'T' U+09A4 | থ 'TH' U+09A5 | দ 'D' U+09A6 | ধ 'DH' U+09A7 | ন 'N' U+09A8
Bilabial | প 'P' U+09AA | ফ 'PH' U+09AB | ব 'B' U+09AC | ভ 'BH' U+09AD | ম 'M' U+09AE
Table 5: Varga classification of Bangla consonants
(Falling into a pattern of five sets of Unvoiced Unaspirated, Unvoiced Aspirated, Voiced Unaspirated, Voiced Aspirated and Nasals, called the five 'Varga')
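The regular five-by-five pattern of the varga consonants is mirrored in their code points, which run consecutively from U+0995 to U+09AE with a single gap at U+09A9 (unassigned in the Bengali block). The following is a minimal sketch, not part of the proposal, that rebuilds the grid from that range; the names PLACES, MANNERS and varga_table are illustrative assumptions.

```python
import unicodedata

# Row and column labels follow the varga table above (names are illustrative).
PLACES = ["Velar", "Palatal", "Retroflex", "Dental", "Bilabial"]
MANNERS = ["unvoiced -asp", "unvoiced +asp", "voiced -asp", "voiced +asp", "nasal"]


def varga_table():
    """Return {place: {manner: letter}} for the 25 varga consonants."""
    # U+0995..U+09AE, skipping U+09A9, which is unassigned in the Bengali block.
    code_points = [cp for cp in range(0x0995, 0x09AF) if cp != 0x09A9]
    table = {}
    for row, place in enumerate(PLACES):
        letters = [chr(cp) for cp in code_points[row * 5:(row + 1) * 5]]
        table[place] = dict(zip(MANNERS, letters))
    return table


if __name__ == "__main__":
    for place, cells in varga_table().items():
        for manner, letter in cells.items():
            print(f"{place:10} {manner:14} {letter} "
                  f"U+{ord(letter):04X} {unicodedata.name(letter)}")
```

Printing the Unicode character names next to each cell is a quick way to cross-check the code points listed in the table.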
Non-Varga | য 'Y' U+09AF | য় 'YY' U+09DF | র 'R' U+09B0 | ড় 'RR' U+09DC | ঢ় 'RH' U+09DD
Non-Varga | ল 'L' U+09B2 | শ 'SH' U+09B6 | ষ 'SS' U+09B7 | স 'S' U+09B8 | হ 'H' U+09B9
Table 6: Non-Varga consonants (not falling into any of the five categories)
3.3.2 The Implicit Vowel Killer: Hasanta (called 'Halant' or 'Halanta' in other Brāhmī-based scripts)
As stated earlier, all consonants are pronounced in isolation with an implicit vowel (central back -ɔ in Bangla as the neutral vowel) assumed to be associated with them [121]. The 'Hasanta' (= 'Halant' or 'Halanta' in other Brāhmī-based scripts) or the term 'Virāma'⁷ (= 'Dari' in Bangla), as preferred in Unicode (cf. Unicode 3.0 and above), have been used in this report to denote the character that marks the absence of this inherent vowel. It may be noted that the term virama has been adopted in Unicode in a sense that is different from the traditional definition of grammar, and hence it requires some explanation here. Considering the importance of the document, this note should be a part of this LGR document, so that anybody referring to it should be able to know the proper grammatical explanation of the term.
Because a special sign is needed whenever this implicit vowel is stripped off, the symbol is known as the Hasanta (= Halant) (U+09CD). By placing the Hasanta under the first consonant of a combination or cluster, one could, in common parlance, "kill" its vowel and create conjuncts. In this manner conjunct characters can generally be written by joining two to four consonants. In rare cases this process can join up to five consonants. However, the notion of a maximum number of consonants joining to form one aksara⁸ is to be bounded empirically. This is an observation based on the CIIL-Emille Corpora of Bangla words [132 & 133] as seen in print these days. Given the mixture of scripts and languages happening on the web, the possibility that one may want a generic Top Level Domain [gTLD] which has more than the observed maximum cannot be ruled out. This can be the case when a foreign-language word which admits a large number of consonants is transliterated into Bangla. Hence in the Bangla LGR work this limit will not be enforced.
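To make the conjunct mechanism concrete, the following minimal sketch joins consonants with the Hasanta (U+09CD) and counts how many consonants take part in each cluster of a string. It is an illustration only and not part of the LGR: the helper names are hypothetical, and the consonant test is a simplifying assumption (assigned letters in U+0995..U+09B9, which leaves out ZWJ/ZWNJ handling).

```python
import unicodedata

HASANTA = "\u09CD"  # BENGALI SIGN VIRAMA


def is_consonant(ch):
    """Simplified test: an assigned letter in the range U+0995..U+09B9."""
    return 0x0995 <= ord(ch) <= 0x09B9 and unicodedata.category(ch) == "Lo"


def make_conjunct(*consonants):
    """Join consonants with Hasanta, 'killing' every inherent vowel but the last."""
    return HASANTA.join(consonants)


def cluster_sizes(text):
    """Return the number of consonants in each Hasanta-joined run of the text."""
    sizes, run, prev = [], 0, ""
    for ch in text:
        if is_consonant(ch):
            if prev == HASANTA and run:
                run += 1              # consonant continues the current cluster
            else:
                if run:
                    sizes.append(run)
                run = 1               # consonant starts a new cluster
        elif ch != HASANTA:
            if run:                   # vowel sign or other mark ends the cluster
                sizes.append(run)
            run = 0
        prev = ch
    if run:
        sizes.append(run)
    return sizes


if __name__ == "__main__":
    ksa = make_conjunct("\u0995", "\u09B7")  # ka + hasanta + ssa
    print(ksa, [f"U+{ord(c):04X}" for c in ksa])
    print(cluster_sizes("\u0989\u09B7\u09CD\u099F\u09CD\u09B0"))  # ustra
```

Since the proposal does not enforce a maximum cluster length, a counter like this would only be useful for surveying corpora, not for rejecting labels.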
⁷ Virāma as used here is also a misnomer according to the Indian grammatical traditions. Nowhere is the mere absence of a vowel marked as virama. Hasanta just marks the absence of a vowel, nothing else. (Abhyankar, Kashinath Vasudev & J. M. Shukla. 1961. A Dictionary of Sanskrit Grammar. Baroda: Oriental Institute.)
⁸ This term needs to be disambiguated. Aksara also means 'syllable' in Indian grammatical traditions.
3.3.3 Vowels
Separate symbols exist for all 'Swara' or vowels in Bangla which are pronounced independently, either at the beginning of the word or after another vowel or consonant sound. To indicate a vowel sound other than the implicit one, a vowel sign called 'kār' in Bangla or Mātrā in Nāgarī⁹ is attached to the consonant. Since the consonant has this built-in neutral vowel at the end, there are equivalent kāras (Mātrās) for all vowels except অ (pronounced -ɔ). The correlation is shown as follows:
Vowel | Corresponding vowel sign (kāras / Mātrās)
অ 'A' U+0985 | -
আ 'AA' U+0986 | া U+09BE
ই 'I' U+0987 | ি U+09BF
ঈ 'II' U+0988 | ী U+09C0
উ 'U' U+0989 | ু U+09C1
ঊ 'UU' U+098A | ূ U+09C2
ঋ Vocalic 'R' U+098B | ৃ U+09C3
ৠ Vocalic 'RR' U+09E0 | ৄ U+09C4
ঌ Vocalic 'L' U+098C | ৢ U+09E2
ৡ Vocalic 'LL' U+09E1 | ৣ U+09E3
এ 'E' U+098F | ে U+09C7
ঐ 'AI' U+0990 | ৈ U+09C8
ও 'O' U+0993 | ো U+09CB
ঔ 'AU' U+0994 | ৌ U+09CC
- | ৗ U+09D7
Could appear on top of অ 'A' U+0985 or any other vowel | ঁ U+0981 Candrabindu
Could appear after অ 'A' U+0985 or any other vowel | ং U+0982 Anusvara
Could appear after অ 'A' U+0985 or any other vowel | ঃ U+0983 Visarga
After any consonant | ্ U+09CD (Hasanta)
- | ঽ U+09BD Avagraha
Table 7: Bangla vowels with corresponding kārs
⁹ Although the term 'Matra' in Bangla stands for an altogether different concept, viz. the top bar placed over a letter, typically available in Hindi and Bangla but missing in Gujarati.
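The correspondence between independent vowels and their kārs can be written down as a small lookup table. The sketch below is illustrative and not part of the LGR: it encodes the vowel rows of the table above, attaches a vowel to a consonant, and cross-checks each pair against the Unicode character names (every "BENGALI LETTER X" vowel listed here has a matching "BENGALI VOWEL SIGN X"). The names VOWEL_SIGNS, syllable and sign_by_name are assumptions made for the example.

```python
import unicodedata

# Independent vowel -> dependent vowel sign (kār), as listed in the table above.
VOWEL_SIGNS = {
    "\u0986": "\u09BE",  # AA
    "\u0987": "\u09BF",  # I
    "\u0988": "\u09C0",  # II
    "\u0989": "\u09C1",  # U
    "\u098A": "\u09C2",  # UU
    "\u098B": "\u09C3",  # VOCALIC R
    "\u09E0": "\u09C4",  # VOCALIC RR
    "\u098C": "\u09E2",  # VOCALIC L
    "\u09E1": "\u09E3",  # VOCALIC LL
    "\u098F": "\u09C7",  # E
    "\u0990": "\u09C8",  # AI
    "\u0993": "\u09CB",  # O
    "\u0994": "\u09CC",  # AU
}
INHERENT_VOWEL = "\u0985"  # A has no sign: it is implicit in every consonant


def syllable(consonant, vowel):
    """Return consonant + kār; the inherent vowel A adds nothing."""
    if vowel == INHERENT_VOWEL:
        return consonant
    return consonant + VOWEL_SIGNS[vowel]


def sign_by_name(vowel):
    """Derive the vowel sign from the Unicode character name of the letter."""
    name = unicodedata.name(vowel).replace("LETTER", "VOWEL SIGN")
    return unicodedata.lookup(name)


if __name__ == "__main__":
    for vowel, sign in VOWEL_SIGNS.items():
        print(f"U+{ord(vowel):04X} -> U+{ord(sign):04X}", sign_by_name(vowel) == sign)
    print(syllable("\u0995", "\u0987"))  # ka + vowel sign i
```

Note that the stored order is always consonant first, sign second, even for pre-posed signs such as U+09BF, which are only displayed to the left of the consonant.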
3.3.4 The Anusvāra / onuʃʃār (ং - U+0982)
The Anusvāra or onuʃʃār in Bangla at times represents a homorganic nasal, but not always. It replaces a conjunct group of a 'Nasal Consonant + Hasanta + Consonant' where the second consonant belongs to the Velar varga or set, as in লংকা. But it often also appears for such combinations involving non-velars as the last member of the combination, as in ল্যাংটা "naked" or ল্যাংচা "a kind of sweet; to limp". Before a non-varga consonant, the Anusvāra represents a nasal sound that may have an alternative conjoined writing symbol representing the corresponding nasal consonant of the particular set. Although Modern Hindi, Marathi and Konkani prefer the anusvāra to the corresponding half-nasal, in Bangla it is clearly demarcated where one must use the Anusvāra and where it has to be a conjunct cluster with a nasal as the first or the second component.
3.3.5 Nasalization: Candrabindu (ঁ - U+0981)
Candrabindu denotes nasalization of the preceding vowel, as in চাঁদ cãd 'moon' (U+099A U+09BE U+0981 U+09A6). This sign, with a dot inside the half-moon mark, is used as a nasalization marker in many Brāhmī-based scripts [143].
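Code-point sequences such as the one quoted above for the word for 'moon' are easy to mistype, for instance by slipping into the neighbouring Devanāgarī block (U+0900..U+097F). The following minimal sketch, with hypothetical helper names, lists the code points and Unicode names of a string and checks that every character lies in the Bengali block (U+0980..U+09FF).

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) pairs for each character of the text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


def in_bengali_block(text):
    """True if every character lies in the Bengali block U+0980..U+09FF."""
    return all(0x0980 <= ord(ch) <= 0x09FF for ch in text)


if __name__ == "__main__":
    moon = "\u099A\u09BE\u0981\u09A6"  # ca + aa sign + candrabindu + da
    for code_point, name in describe(moon):
        print(code_point, name)
    print(in_bengali_block(moon))
```

A block check of this kind is only a sanity test; whether a code point is actually admitted is decided by the repertoire of the LGR, not by the block it belongs to.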
3.3.6 Nukta (় - U+09BC)
The nukta sign does not exist in Bangla orthography. It is predominantly used in many Brāhmī-derived scripts such as Devanāgarī (for Hindi, Bodo, Maithili, Santali, Kashmiri and Sindhi). The term and the concept of nukta are borrowed in Bangla.
The IDNA Protocol (RFC 5891) states that IDNs must be in Unicode Normalization Form C (NFC). RFC 7940 applies this requirement to LGRs. The definition of NFC in the Unicode Standard contains a number of composition exclusions. As a result, the Bangla letters য় YYA, ড় RRA and ঢ় RHA have to be represented in this LGR by using the sequences (YA + Nukta: U+09AF + U+09BC), (DDA + Nukta: U+09A1 + U+09BC) and (DDHA + Nukta: U+09A2 + U+09BC) instead of the single code points YYA (U+09DF), RRA (U+09DC) and RHA (U+09DD), although the use of 'Nukta' is otherwise completely unnatural in Bangla.
It is noted that in the current Unicode Standard chart these characters are listed as additional consonants. As per the LGR Procedure, however, these decisions depend on the IDNA Protocol through a set of procedures developed by the IETF. Even though the Unicode Standard also prescribes methods to produce these three characters as atomic characters (for example 09DC for ড় [r], 09DD for ঢ় [rh] and 09DF for য় [y], as a single keystroke), the IDNA protocol requires that we treat them as conjunct characters and then allocate codes for these in the Unicode Bengali Block.
It may be noted that there could be sporadic attempts or cases of writing Muslim names, Urdu poetic words and Perso-Arabic loanwords with nukta under ক (k), খ (kh), গ (g), জ (j) and ফ (ph), only for the sake of correct pronunciation and for maintaining the sanctity of the loan word. These were also like using the Bangla writing system to work like the IPA script. It is, however, not in use in Bangla writing in printing.
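The effect of the composition exclusions can be observed directly with a standard normalization routine. The sketch below, which uses Python's unicodedata module and hypothetical names (PRECOMPOSED, to_nfc), shows that NFC turns each of the three single code points into the base letter followed by Nukta, and that the two-code-point sequence is left unchanged, which is why only the sequences can appear in an NFC-normalized label.

```python
import unicodedata

NUKTA = "\u09BC"  # BENGALI SIGN NUKTA

# Single code point -> base letter + nukta sequence required under NFC.
PRECOMPOSED = {
    "\u09DC": "\u09A1" + NUKTA,  # RRA -> DDA + nukta
    "\u09DD": "\u09A2" + NUKTA,  # RHA -> DDHA + nukta
    "\u09DF": "\u09AF" + NUKTA,  # YYA -> YA + nukta
}


def to_nfc(label):
    """Normalize a label to NFC, the form required for IDNs."""
    return unicodedata.normalize("NFC", label)


if __name__ == "__main__":
    for single, sequence in PRECOMPOSED.items():
        result = to_nfc(single)
        print(f"U+{ord(single):04X} ->",
              " ".join(f"U+{ord(c):04X}" for c in result),
              result == sequence)
```

Because the sequence is already stable under NFC, input typed with either the single key or the two-key sequence ends up as the same label after normalization.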
3.3.7 Visarga / biʃɔrgo (ঃ - U+0983) and Avagraha (ঽ - U+09BD)
The Visarga or biʃɔrgo (U+0983) is frequently used in Bangla loan words borrowed from Sanskrit and represents a sound very close to h. One could quote as an example দুঃখ duhkho "sorrow", "unhappiness" (U+09A6 U+09C1 U+0983 U+0996).
The Avagraha ঽ (U+09BD) is mainly used in Sanskrit, Pali, Prakrit or Maithili texts written in Bangla. It is gradually being replaced by an upper comma (e.g. নরোঽপরাণি re-written as নরো'পরাণি). It is rarely used now, even in other languages using the Bangla script. The Avagraha is not part of the LGR repertoire: it has been decided not to retain Avagraha (ঽ, U+09BD) because it is blocked in TLDs as per the Maximal Starting Repertoire (MSR). Please see Appendix II in section 11 for a complete list of Bangla consonants and their allographs.
3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)
This note is pertinent to the use of Zero Width Joiner (ZWJ) and Zero Width Non-Joiner (ZWNJ) as used in Bangla. It needs to be noted that Nepali, Konkani and Hindi use these two signs in a different manner.
ZWJ (U+200D) and ZWNJ (U+200C) are code points that have been provided by the Unicode Standard to instruct the rendering of a string where the script has the option between joining and non-joining characters. Without the use of these control codes, the string may be rendered in a form other than what is intended.
Use of ZWJ
• Insofar as Bangla is concerned, ZWJ is used for the proper rendering of characters such as khaṇḍa-ta ৎ, as in সত্যজিৎ (satyajit) "Satyajit" and সৎ (sat) "honest". This is typed as follows: ta + Hasanta + ZWJ (U+200D).
• However, ZWJ is more important where the same combination of consonantal characters is represented differently depending upon the context. E.g. র + ্ + য has two representations in Bangla: as র্য and as র‍্য. To get the form র্য one has to type র + ্ + য, but for র‍্য the sequence would be র + ZWJ + ্ + য [154]. In other words, ZWJ is used in the rendering of words demanding ya-phalā after ra, which is otherwise not possible to type (render), because the same order ra + hasanta + antastha ja is used in the medial and/or final position to type repha on the consonant antastha ja, as in কার্য (kaarjo). In order to get a ya-phalā after the consonant ra, it is therefore obligatory to use ZWJ after ra, as in র‍্যাপার (wrapper), র‍্যাশ (rash), র‍্যালি (rally), etc. The typing sequence is given below:
ra (র) + ZWJ + hasanta (্) + antastha ja (য) = র‍্য
Use of ZWNJ
• ZWNJ is used in Bangla to represent the explicit Hasanta or Halant. In order to avoid conjunct formation in cases where there is an explicit hasanta before the succeeding consonant, the ZWNJ is used:
Consonant + hasanta + ZWNJ + consonant = explicit hasanta
Example: প্রাক্‌কথন (prakkathana, prakkɔtʰon)
The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Hasanta) where default conjunct formation is to be explicitly restricted and the Hasanta joining the two consonants participating in the conjunct formation needs to be explicitly shown.
The use of ZWJ and ZWNJ has been ruled out from the root zone by the [Procedure]. Since they are used in Bangla to create alternate renderings, the insertion of these two signs can affect searching as well as NLP.
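Since both joiners are excluded from the root zone, a label check can reject them before any other rule is evaluated. The following is a minimal sketch; the function name `contains_joiner` is a hypothetical helper introduced only for illustration.

```python
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"   # ZERO WIDTH JOINER


def contains_joiner(label):
    """Return True if the label contains ZWJ or ZWNJ anywhere."""
    return any(ch in (ZWNJ, ZWJ) for ch in label)


# ra + ZWJ + hasanta + ya (the ya-phalā-after-ra form) contains a joiner.
ra_ya_phala = "\u09B0\u200D\u09CD\u09AF"
# ra + hasanta + ya (repha on ya) does not.
repha_on_ya = "\u09B0\u09CD\u09AF"

print(contains_joiner(ra_ya_phala), contains_joiner(repha_on_ya))
```

A consequence worth noting is that forms which depend on a joiner for their rendering, such as the ya-phalā after ra, cannot be expressed in a root zone label.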
3.3.9 Use of Ya-phalaa
Ya-phalaa sequences are the two instances in Bangla where Hasanta is preceded by a full vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E):
• অ্যা 0985 09CD 09AF 09BE: BENGALI LETTER A + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
• এ্যা 098F 09CD 09AF 09BE: BENGALI LETTER E + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
To render Ya-phalā after অ or এ, it is necessary to type U+09CD Hasanta plus U+09AF ya preceded by the said vowel. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English 'acid' অ্যাসিড, 'association' অ্যাসোসিয়েশন, 'bat' ব্যাট, 'fat' ফ্যাট, 'mat' ম্যাট, 'cap' ক্যাপ, etc.
The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as 'vowel killer', although it actually indicates absence of a vowel after the marked consonant. Only consonants can be marked with the Hasanta. But as we see here, Bangla ends up with a deviant feature in the orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
3.3.10 Formation of Ra-phalaa and Reph Sequences
This case refers to the formation of repha and ra-phalā as follows:
Ra-Hasanta = (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA, Unicode name BENGALI LETTER RA WITH MIDDLE DIAGONAL) and H is 09CD (্ - BENGALI SIGN VIRAMA).
Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance:
repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun")
ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র chakra "cycle")
The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or by Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases. The LGR makes a note of this point of concern with respect to the two RAs in disguise, as it would be completely impossible to distinguish between them with the naked eye in a label so generated, which may consequently lead to concerns related to spoofing and other kinds of cyber irregularities. The motive for classing these two code points as (blocking) variants is that fully rendered labels may mask the distinction between Bangla ra র (U+09B0) and Assamese ra ৰ (U+09F0). That provides the justification for Variant Set 4, though only in the context of a following Hasanta. The difference between the RAs is only distinguishable if one looks at their Unicode values. Therefore labels such as অর্ক arka, শীর্ষ śīrṣa 'top, apex', অভ্র abhra 'cloud, the sky' and শ্রম śrama 'physical labour' could be extremely dangerous, as the web user may never verify the digital content (the labels) against its Unicode code points. This point is made explicitly with reference to Table 9 (of sequences) and Table 16 (of WLE symbols) that follow.
Moreover, it is noteworthy that the REPHA can also occur with KHANDA TA. The condition in this context is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
4 Overall Development Process and Methodology
The Neo-Brāhmī Generation Panel (NBGP) has been formed by members having experience in Linguistics (especially in NLP / Computational Linguistics), Literature, Language, History and Epigraphy. Under the Neo-Brāhmī Generation Panel, Bangla and eight other scripts belonging to separate Unicode blocks are being taken up to assign a separate LGR for each. However, an attempt is made to ensure that the fundamental philosophy behind building those LGRs is consistent with all other Brāhmī-derived scripts. The present LGR will cater to multiple languages belonging to EGIDS scale 1 to 4 (see Table 4) that use the Bangla script. The following guiding principles are used in making decisions about Bangla LGR code points.
4.1 Guiding Principles
The NBGP adopts the following broad principles for selection of code points in the code-point repertoire, across the board for all the Neo-Brāhmī scripts within its ambit.
4.1.1 Inclusion Principles
4.1.1.1 Modern Usage
Every character proposed should be in the everyday usage of a particular linguistic community. Characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code-point repertoire.
4.1.1.2 Unambiguous Use
Every character proposed should have unambiguous understanding among linguists about its usage in the language.
4.2 Exclusion Principles
The main exclusion principle is that of External Limits on Scope. These consist of protocols or standards which are prerequisites to the Label Generation Rule-sets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.
4.2.1 External Limits of Scope
The code point repertoire for the root zone being a very special case at the top of the protocol hierarchies, the canvas of available characters for selection as a part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:
i. The Unicode Chart
Out of all the characters that are needed by the script in question, if a particular character is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, and especially so in the Bangla-Asamiyā-Manipuri Writing System, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.
ii. IDNA Protocol
Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded, as far as possible, all the possible characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters of the Unicode repertoire from being part of domain names.
iii. Maximal Starting Repertoire (MSR)
The Root-zone LGR being the repertoire of characters which are going to be used for creation of the Root-zone TLDs, which in turn constitute an even more specialized case of domain names, the ROOT LGR procedure introduces additional exclusions on IDNA's allowed set of characters.
Example: Bangla Sign Avagraha ঽ (U+09BD), even if allowed by the IDNA protocol, is not permitted in the Root Zone Repertoire as per the MSR.
To sum up, the restrictions start off with admitting only such characters as are part of the code block of the given script/language. The IDNA Protocol further narrows this down, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
4.2.1.1 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brāhmī-based scripts will not be included.
4.2.1.2 No Symbols and Abbreviations
Abbreviations, weights and measures, and other such iconic characters like BANGLA ISSHAR (U+09FA), BANGLA CURRENCY DENOMINATOR SIXTEEN ৹ (U+09F9), etc. will also not be included.
4.2.1.3 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms such as the Sanskritic VOCALIC RR ৠ (U+09E0) and VOCALIC L ঌ (U+098C), as well as VOCALIC LL ৡ (U+09E1), and the allographic kāra forms of the latter two symbols: VOWEL SIGN VOCALIC L ৢ (U+09E2) and VOWEL SIGN VOCALIC LL ৣ (U+09E3). All such characters are excluded, which complies with the Conservatism principle as laid down in the Root Zone LGR procedure. However, in Bangla, the kāra corresponding to VOCALIC RR ৠ (U+09E0), which is VOWEL SIGN VOCALIC RR ৄ (U+09C4), is still in active use in certain limited borrowed or Sanskritic words and is therefore retained.
4.2.1.4 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit will not be included. This is also in consonance with the Letter principle as laid down in the Root Zone LGR procedure.
4.2.1.5 ABNF
The Augmented Backus-Naur Formalism (ABNF) is described in Section 5.4.1 and the Appendix (Section 10.1).
5 Repertoire
The Bangla Writing System is represented in Unicode using the Bengali (Bangla) script name as enumerated in ISO 15924, corresponding to languages such as Asamiyā (Assamese), Bangla (Bengali) and Manipuri. The BENGALI block used for Bangla-Asamiyā-Manipuri in Unicode has 93 entries. This section details the code-point repertoire that the Neo-Brāhmī Generation Panel [NBGP] proposes to be included in the Bangla LGR.
It may be mentioned here that the Government of Assam submitted a proposal to the Bureau of Indian Standards (BIS) on 26th February 2016 for dis-unification of the Bangla and Asamiyā scripts. The BIS, in its 8th Meeting of the Indian Language Technologies and Products Sectional Committee, LITD 20, held on 23rd Aug 2017, decided to refer the proposal for recognition of the Assamese script in ISO/IEC 10646 to ISO. Until the Unicode Consortium takes any further action, it will be assumed that the Code Point Repertoire under Table 8 will be valid for all the three languages as above.
For each of the code points, language references have been given in the last column, titled Reference, of Table 8, the "Code Point Repertoire". For entire coverage of Bangla code points, references for Bangla, Asamiyā (Assamese), Manipuri (Meitei) and Bishnupriya are given. Kokborok written in Bangla script is not known to have introduced many new complications, except for one particular character. Though only a few representative languages under EGIDS Scale 1-4 have been chosen for referencing, they together cover all the code points required for all the languages that NBGP has considered, as given under Bangla Unicode Points (as given in Unicode 6.3).
However, before the details are presented, it is ideal to look at the Bangla Code Point Chart from the Maximal Starting Repertoire [MSR] Version 3. It may be noted that the shapes of the reference glyphs given below in the code charts are based on one of the many fonts designed and are not prescriptive, because there could be some variations in actual fonts, both Unicode-compatible and True-Type ones. Consider the following code point table:
Figure 1: Bangla Code Page from [MSR] for Bangla-Asamiyā-Manipuri
Colour convention:
• All characters that are included in the [MSR]: yellow background
• PVALID in IDNA 2008 but excluded from the [MSR]: pinkish background
• Not PVALID in IDNA 2008, or ineligible for the root zone (digits, hyphen): white background
Given the Bangla Unicode block as in Figure 1, among the code points that are included in the MSR the following symbols will need a separate treatment:
ৎ U+09CE Bangla Letter Khanda-Ta
ৰ U+09F0 Asamiyā-Bangla Letter Ra With Middle Diagonal
ৱ U+09F1 Asamiyā-Bangla Letter Ra With Lower Diagonal
5.1 Code Point Repertoire Inclusion
No | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
1 | U+0981 | ঁ | BENGALI SIGN CANDRABINDU | Candra-bindu | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
2 | U+0982 | ং | BENGALI SIGN ANUSVARA | Onushshar (Anusvara) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
3 | U+0983 | ঃ | BENGALI SIGN VISARGA | Biśarga (Visarga) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
4 | U+0985 | অ | BENGALI LETTER A | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
5 | U+0986 | আ | BENGALI LETTER AA | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
6 | U+0987 | ই | BENGALI LETTER I | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
7 | U+0988 | ঈ | BENGALI LETTER II | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
8 | U+0989 | উ | BENGALI LETTER U | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
9 | U+098A | ঊ | BENGALI LETTER UU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
10 | U+098B | ঋ | BENGALI LETTER VOCALIC R | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
11 | U+098F | এ | BENGALI LETTER E | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
12 | U+0990 | ঐ | BENGALI LETTER AI | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
13 | U+0993 | ও | BENGALI LETTER O | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
14 | U+0994 | ঔ | BENGALI LETTER AU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
15 | U+0995 | ক | BENGALI LETTER KA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
16 | U+0996 | খ | BENGALI LETTER KHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
17 | U+0997 | গ | BENGALI LETTER GA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
18 | U+0998 | ঘ | BENGALI LETTER GHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
19 | U+0999 | ঙ | BENGALI LETTER NGA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
20 | U+099A | চ | BENGALI LETTER CA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
21 | U+099B | ছ | BENGALI LETTER CHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
22 | U+099C | জ | BENGALI LETTER JA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
23 | U+099D | ঝ | BENGALI LETTER JHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
24 | U+099E | ঞ | BENGALI LETTER NYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
25 | U+099F | ট | BENGALI LETTER TTA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
26 | U+09A0 | ঠ | BENGALI LETTER TTHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
27 | U+09A1 | ড | BENGALI LETTER DDA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
28 | 09A1 09BC (U+09DC) | ড় | Normalized form of BENGALI LETTER RRA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]. 09DC is the preferred code point; however, it is not available for LGR as per the standards governing this LGR development
29 | U+09A2 | ঢ | BENGALI LETTER DDHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
30 | 09A2 09BC (U+09DD) | ঢ় | Normalized form of BENGALI LETTER RHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]. 09DD is the preferred code point; however, it is not available for LGR as per the standards governing this LGR development
31 | U+09A3 | ণ | BENGALI LETTER NNA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
32 | U+09A4 | ত | BENGALI LETTER TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
33 | U+09A5 | থ | BENGALI LETTER THA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
34 | U+09A6 | দ | BENGALI LETTER DA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
35 | U+09A7 | ধ | BENGALI LETTER DHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
36 | U+09A8 | ন | BENGALI LETTER NA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
37 | U+09AA | প | BENGALI LETTER PA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
38 | U+09AB | ফ | BENGALI LETTER PHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
39 | U+09AC | ব | BENGALI LETTER BA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
43 | 09AF 09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]. 09DF is the preferred code point; however, it is not available for LGR as per the standards governing this LGR development
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112], [125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
50 | U+09BE | া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
51 | U+09BF | ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
52 | U+09C0 | ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
53 | U+09C1 | ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112], [122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (= Halant), Virama (= Dari) | 1 Bangla, 2 Assamese, 2 Manipuri | [112], [122], [125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122], [125]
Table 8: Bangla Code-Point Repertoire
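For machine processing, the rows of Table 8 can be held as a set of single code points plus the three nukta sequences. The following is a minimal sketch transcribed from the table; the names `REPERTOIRE`, `NUKTA_SEQUENCES` and `in_repertoire` are illustrative assumptions and do not come from the LGR XML file.

```python
def _span(first, last):
    """Inclusive range of code points as a set."""
    return set(range(first, last + 1))


# Single code points of Table 8 (the three nukta rows are handled separately).
REPERTOIRE = (
    _span(0x0981, 0x0983)                       # candrabindu, anusvara, visarga
    | _span(0x0985, 0x098B)                     # vowels A .. VOCALIC R
    | {0x098F, 0x0990, 0x0993, 0x0994}          # vowels E, AI, O, AU
    | _span(0x0995, 0x09A8)                     # consonants KA .. NA
    | _span(0x09AA, 0x09B0)                     # consonants PA .. RA
    | {0x09B2}                                  # LA
    | _span(0x09B6, 0x09B9)                     # SHA, SSA, SA, HA
    | _span(0x09BE, 0x09C4)                     # vowel signs AA .. VOCALIC RR
    | {0x09C7, 0x09C8, 0x09CB, 0x09CC}          # vowel signs E, AI, O, AU
    | {0x09CD, 0x09CE, 0x09F0, 0x09F1}          # virama, khanda ta, Assamese ra/wa
)

# Rows 28, 30 and 43: base letter followed by U+09BC (nukta).
NUKTA_SEQUENCES = {(0x09A1, 0x09BC), (0x09A2, 0x09BC), (0x09AF, 0x09BC)}


def in_repertoire(label):
    """True if every code point is listed, with nukta allowed only in the three sequences."""
    cps = [ord(ch) for ch in label]
    for i, cp in enumerate(cps):
        if cp == 0x09BC:
            if i == 0 or (cps[i - 1], cp) not in NUKTA_SEQUENCES:
                return False
        elif cp not in REPERTOIRE:
            return False
    return True


print(len(REPERTOIRE), len(NUKTA_SEQUENCES))
```

This check covers repertoire membership only; it says nothing about whether the code points appear in a permitted order.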
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes some specific sequences which enable conditional inclusion of the Bangla LETTER A and E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, again followed by Bangla VOWEL SIGN AA, in the repertoire, for enabling inclusion of the æ sound as in English 'bat', 'cat', etc.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
S1 | 0985 09CD 09AF 09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112], [122]
S2 | 098F 09CD 09AF 09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9: Sequences
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find a place in Unicode but have not been included in the repertoire in the LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.
Sr. No. | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10: Excluded Code Points
5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire since it will never be used alone. It will be used only in sequences, in the normalized forms of the three special characters ড়, ঢ় and য়.
Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড়, ঢ় and য় respectively
Table 10b: Excluded Code Points
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20.11.2009 and 18.07.2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) began with the following variables:
C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta
The Augmented Backus-Naur Formalism (ABNF) will use the following operators:
Sr. Number | Operator | Function
1 | "|" | Alternative
2 | "[ ]" | Optional
3 | "*" | Variable Repetition
4 | "( )" | Sequence Group
Table 11: The ABNF Formalism
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given to facilitate understanding.
5.4.2 The Vowel Sequence
To facilitate understanding by users of other Brāhmī scripts, equivalents in Devanāgarī are provided wherever necessary.
A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra/onuʃʃār (B), a Candrabindu (D) or a Visarga/biʃɔrgo (X). The number of D, B or X which can follow a V in Bangla need not be restricted to one. Going by the rules illustrated in this document, formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have sequences in which a Candrabindu is followed by an Anusvāra on the one hand, or by a Visarga on the other, as in হ্যাঁংচা 'hænchā' and হ্যাঁঃ 'hæn' respectively; that is, the possibility of a Visarga or Anusvāra (onuʃʃār) following a Candrabindu exists in Bangla.
A vowel can optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phala. Ya-phala is a presentation form of U+09AF য, the Bangla letter 'ya', represented by the sequence <U+09CD ্ BENGALI SIGN VIRAMA (Hasanta), U+09AF য BENGALI LETTER YA>. Ya-phala has a special form, ্য. Again, when combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. 'aa' (ā)), it is used for transcribing [æ], as in the "a" in the English word "bat", written in Bangla as ব্যাট.
A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example:
V: অ (अ)
5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples:
VB: অং (अं)
VD: অঁ (अँ)
VX: অঃ (अः)
VDB: অঁং (अँं)
VDX: অঁঃ (अँः)
VHCM: অ্যা, এ্যা
5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
VHCMB: অ্যাং, এ্যাং
VHCMD: অ্যাঁ, এ্যাঁ
VHCMX: অ্যাঃ, এ্যাঃ
VHCMDB: অ্যাঁং, এ্যাঁং
VHCMDX: অ্যাঁঃ, এ্যাঁঃ
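The vowel-sequence combinations listed above can be sketched as a regular expression over the class symbols V, C, M, B, D, X and H. This is only an illustration of the grammar as written in Section 5.4.2: the names `classes` and `is_vowel_sequence`, and the code point ranges used for classification (taken from the repertoire table), are assumptions made for the example.

```python
import re

VOWELS = set(range(0x0985, 0x098C)) | {0x098F, 0x0990, 0x0993, 0x0994}
CONSONANTS = (set(range(0x0995, 0x09A9)) | set(range(0x09AA, 0x09B1))
              | {0x09B2} | set(range(0x09B6, 0x09BA)) | {0x09F0, 0x09F1})
MATRAS = set(range(0x09BE, 0x09C5)) | {0x09C7, 0x09C8, 0x09CB, 0x09CC}
SIGNS = {0x0982: "B", 0x0981: "D", 0x0983: "X", 0x09CD: "H"}


def classes(text):
    """Map each code point to its class symbol ('?' if it has none here)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp in VOWELS:
            out.append("V")
        elif cp in CONSONANTS:
            out.append("C")
        elif cp in MATRAS:
            out.append("M")
        else:
            out.append(SIGNS.get(cp, "?"))
    return "".join(out)


# V, optionally followed by HCM, optionally followed by B, D, X, DB or DX.
VOWEL_SEQUENCE = re.compile(r"V(?:HCM)?(?:B|D|X|DB|DX)?")


def is_vowel_sequence(text):
    return VOWEL_SEQUENCE.fullmatch(classes(text)) is not None


print(is_vowel_sequence("\u0985\u0981\u0982"))  # VDB
print(is_vowel_sequence("\u0985\u0982\u0982"))  # VBB
```

Note that the pattern follows the generic VHCM shape given in the text; restricting H C M to the ya-phalā sequences S1 and S2 is left to the whole-label rules.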
5.4.3 The Consonant Sequence
5.4.3.1 A Single Consonant (C)
Example:
C: ক (क)
5.4.3.2 A Consonant with Conditions
A Consonant optionally followed by a dependent vowel sign kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
CM: কি (कि)
CB: কং (कं)
CD: কঁ (कँ)
CX: কঃ (कः)
CH: ক্ (क्) (pure consonant)
CDB: কঁং (कँं)
CDX: কঁঃ (कँः)
5.4.3.3 CM Sequence
A CM sequence can be optionally followed by B, D, X, DB or DX.
Examples:
CMB: কীং (कीं)
CMD: কাঁ (काँ)
CMX: বীঃ (वीः)
CMDB: কাঁং (काँं)
CMDX: কাঁঃ (काँः)
5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Examples:
CHC: ন্ত → ন + ্ + ত (न + ् + त)
CHCHC: ন্ত্র → ন + ্ + ত + ্ + র (न + ् + त + ् + र)
CHCHCHC: ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য (न + ् + त + ् + र + ् + य)
5.4.3.5 Subsets
While considering its subsets, as a representative example we will consider the combination CHC only; however, the same is equally applicable to CHCHC and CHCHCHC.
[A] The combination may be followed by M, B, D, X, DB or DX.
Examples:
CHCM: ক্কী → ক ্ ক ী (क्की → क ् क ी)
CHCB: ক্কং → ক ্ ক ং (क्कं → क ् क ं)
CHCD: ক্কঁ → ক ্ ক ঁ (क्कँ → क ् क ँ)
CHCX: ক্কঃ → ক ্ ক ঃ (क्कः → क ् क ः)
CHCDB: ক্কঁং → ক ্ ক ঁ ং (क्कँं → क ् क ँ ं)
CHCDX: ক্কঁঃ → ক ্ ক ঁ ঃ (क्कँः → क ् क ँ ः)
[B] *3(CH)CM may further be followed by B, D, X, DB or DX.
Examples:
CHCMB: ক্কীং → ক ্ ক ী ং (क्कीं → क ् क ी ं)
CHCMD: ক্কাঁ → ক ্ ক া ঁ (क्काँ → क ् क ा ँ)
CHCMX: ক্কীঃ → ক ্ ক ী ঃ (क्कीः → क ् क ी ः)
CHCMDB: ক্কাঁং → ক ্ ক া ঁ ং (क्काँं → क ् क ा ँ ं)
CHCMDX: ক্কাঁঃ → ক ্ ক া ঁ ঃ (क्काँः → क ् क ा ँ ः)
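The consonant-sequence grammar of Section 5.4.3 can be sketched in the same way, as a pattern over class symbols. The sketch below assumes the reading "up to three (CH) groups, then C, then an optional M and an optional B, D, X, DB or DX, with the bare CH form allowed for a single consonant"; the helper names and the classification ranges are illustrative assumptions, not part of the LGR.

```python
import re

CONSONANTS = (set(range(0x0995, 0x09A9)) | set(range(0x09AA, 0x09B1))
              | {0x09B2} | set(range(0x09B6, 0x09BA)) | {0x09F0, 0x09F1})
MATRAS = set(range(0x09BE, 0x09C5)) | {0x09C7, 0x09C8, 0x09CB, 0x09CC}
SIGNS = {0x0982: "B", 0x0981: "D", 0x0983: "X", 0x09CD: "H"}


def classes(text):
    """Map each code point to C, M, B, D, X or H ('?' for anything else)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp in CONSONANTS:
            out.append("C")
        elif cp in MATRAS:
            out.append("M")
        else:
            out.append(SIGNS.get(cp, "?"))
    return "".join(out)


# *3(CH) C [M] [B / D / X / DB / DX], or the pure-consonant form CH.
CONSONANT_SEQUENCE = re.compile(r"(?:CH){0,3}CM?(?:B|D|X|DB|DX)?|CH")


def is_consonant_sequence(text):
    return CONSONANT_SEQUENCE.fullmatch(classes(text)) is not None


print(is_consonant_sequence("\u09A8\u09CD\u09A4\u09CD\u09B0"))  # CHCHC
print(is_consonant_sequence("\u0995\u09BE\u09BE"))              # CMM
```

The upper bound of three (CH) groups reflects the stated limit of four consonants per cluster; a fifth joined consonant makes the match fail.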
5.4.4 The Khanda-Ta sequence
5.4.4.1 A single 'Khanda'-Ta (Z)
Example:
Z: ৎ
5.4.4.2 A Khanda Ta Combination (see footnote 10)
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) 'scolding'.
Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
Footnote 10: Refer to Rule P in Section 7, Table 16.
5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) could be described briefly here. Let us take up S in the first instance. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ - BANGLA LETTER A and U+098F এ - BANGLA LETTER E). To render Ya-phalā after অ or এ, it is necessary to type U+09CD plus U+09AF ya preceded by the said vowel. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English 'bat', 'fat', etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as 'vowel killer', although it actually indicates absence of a vowel after the marked consonant. Only consonants can be marked with the Hasanta. But as we see here, Bangla ends up with a deviant feature in the orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
The other case refers to the formation of repha and ra-phalā in the said script, mentioned in the table as P. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra "cycle"). The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or by Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.
6 Variants
This section talks about the variants in the Bangla script. The NBGP categorizes these confusing variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from the character formations normally perceived by the larger linguistic community
For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formations. These can cause confusion even to a careful observer and are hence being proposed as variants.
Variants are generated in a script when two or more forms are produced with different storage or code points. In Bangla, the e-kāra, ā-kāra and o-kāra have different code points. One can type o with a consonant at one go, or type e-kāra and ā-kāra as two separate keys, getting the same result. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered a case of variant, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants
However, we propose two cases of true in-script variants in the Bangla script.
CASE I
As far as true variants in Bangla are concerned, we may draw our attention to cases wherein Hasanta with (U+09A5) থ (tha) appears as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, could be a little confusing, as the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final position (as shown below in example no. 1). It could be typed wrongly as well, in the belief that it was a হ (ha) U+09B9, increasing the risks in label writing and identification.
Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)
The fonts which represent the traditional Bangla writing system could tend to create this problem. Therefore these may be taken as cases of variants in Bangla.
CASE II
Another interesting example of variants is encountered in ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra). 'Repha' could be formed by two sequences (mainly because both Assamese and Bangla find a place in the same Unicode block, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here the final ligatures look the same and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C
Where:
B_RA = U+09B0 BENGALI LETTER RA (র), or
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)
Example:
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12: Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. Addition of Repha does not make any difference.
'Ra-phala' could be formed by two sequences on similar grounds, and the final ligatures would look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
Where C1 = any consonant except Khanda-ta
Example:
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusability for end users. Hence the repha and ra-phala cases need to be defined as variants. NBGP concluded to define র and ৰ as variant code points, where only one variant set between র and ৰ could cover all cases. But this will create blocked variant labels: e.g. if someone registers "ররর", the variant label "ৰৰৰ" will be generated as a variant and will be blocked, and vice versa. However, it is only blocked at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, it is still possible.
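The blocking behaviour described above can be illustrated with a small sketch that enumerates the variant labels produced by the র / ৰ variant set. The names `RA_VARIANTS`, `variant_labels` and `conflicts` are hypothetical; the sketch also assumes that the variant mapping is applied independently at every position, so mixed forms are generated as well as the all-ৰ form.

```python
from itertools import product

# One variant set: Bangla ra (U+09B0) and Assamese ra (U+09F0) map to each other.
RA_VARIANTS = {
    "\u09B0": ("\u09B0", "\u09F0"),
    "\u09F0": ("\u09F0", "\u09B0"),
}


def variant_labels(label):
    """All labels obtained by swapping ra forms, excluding the label itself."""
    choices = [RA_VARIANTS.get(ch, (ch,)) for ch in label]
    return {"".join(combo) for combo in product(*choices)} - {label}


def conflicts(registered, applied_for):
    """True if `applied_for` is a blocked variant of `registered`."""
    return applied_for in variant_labels(registered)


registered = "\u09B0\u09B0\u09B0"                         # "ররর"
print(conflicts(registered, "\u09F0\u09F0\u09F0"))         # same length, all Assamese ra
print(conflicts(registered, "\u09F0\u09F0"))               # different length
```

Blocking applies per label: a label of a different length is not a variant of the registered one, which matches the remark that ৰৰ or ৰৰৰৰ remain available.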
6.2 Cross-Script Variants
A crisp cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia (formerly Oriya; see footnote 11), keeping in mind the visual and technical confusions they may cause as labels on the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain characters which are somewhat similar, they have not been included here. They have been provided in the Appendix (10.2) for reference.
Footnote 11: Unicode uses 'Oriya' for the script, although 'Odia' is now the official term used.
1. Bangla and Nāgarī / Devanāgarī Script
Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14: Bangla and Devanāgarī cross-script variant code points
2. Bangla and Gurmukhī Script
Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15: Bangla and Gurmukhī cross-script variant code points
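Tables 14 and 15 amount to a small mapping from Bangla code points to their look-alikes in the other two scripts. A minimal sketch of that mapping follows; `CROSS_SCRIPT_VARIANTS` and `cross_script_image` are hypothetical names, and the function simply reports the other-script spelling of a label when every code point in it has a listed variant.

```python
# Bangla code point -> cross-script variant per script (from Tables 14 and 15).
CROSS_SCRIPT_VARIANTS = {
    0x09AE: {"Devanagari": 0x092E, "Gurmukhi": 0x0A38},  # MA
    0x09BF: {"Devanagari": 0x093F, "Gurmukhi": 0x0A3F},  # VOWEL SIGN I
}


def cross_script_image(label, script):
    """Return the look-alike label in `script`, or None if any code point has no variant."""
    out = []
    for ch in label:
        entry = CROSS_SCRIPT_VARIANTS.get(ord(ch))
        if entry is None:
            return None
        out.append(chr(entry[script]))
    return "".join(out)


label = "\u09AE\u09BF"  # ম + ি
print(cross_script_image(label, "Devanagari"), cross_script_image(label, "Gurmukhi"))
```

Only labels built entirely from the listed code points have a cross-script twin, which is why the set of affected labels is small.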
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in section 3.2 when written in the Bangla script (see footnote 12). The rules have been drafted in such a way that they can be easily translated into the LGR specifications. Below are the symbols used in the WLE rules, for each of the Indic Syllabic Categories as mentioned in the table provided in the code point repertoire (Section 5.1).
C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), or (a/e) Ya-phalā (V1 H C1 M1), where V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E), H is 09CD (্ - BENGALI SIGN VIRAMA), C1 is 09AF (য - BENGALI LETTER YA) and M1 is 09BE (া - BENGALI VOWEL SIGN AA). S1 and S2 are valid even though they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA, Unicode name BENGALI LETTER RA WITH MIDDLE DIAGONAL) and H is 09CD (্ - BENGALI SIGN VIRAMA)
Table 16: Symbols used in WLE rules
Footnote 12: As used by Unicode, denoting and including both Assamese and Maṇipuri.
It is also perhaps ideal to mention here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters" that could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, Pg. 4]. More details of such combinations could be seen in Pabitra Sarkar (1993) [135].

7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়.
2. H must be preceded by C.
Example
3. M must be preceded by C. Example: কা
4. D must be preceded by either of V, C or M. Example: আখখা হা
5. X must be preceded by either of V, C, M or D. Example: উঃখঃবঃাঃ দ ঃ
6. B must be preceded by either of V, C, M or D. Example: আংইংকং
7. Z must be preceded by V, C, M, D, B, X or P. Example: ইৎকৎাৎাৎপ6ৎrৎ (S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V preceded by H.
9. S CANNOT be preceded by H.
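The nine context rules above lend themselves to a small checker. The following is a minimal sketch and not the normative LGR: the code point ranges in `TOKEN` are an assumed approximation of the repertoire in Section 5.1, S1 and S2 from Table 9 are not modelled, P is treated as "an H whose preceding consonant is র or ৰ", and the names `tokenize` and `broken_rules` are hypothetical. Rule 1 is covered by applying NFC first, so that ড়, ঢ়, য় become base letter plus nukta and are read as one consonant.

```python
import re
import unicodedata

# Assumed, partial category table; Section 5.1 holds the authoritative repertoire.
TOKEN = re.compile(
    "(?P<S>[\u0985\u098F]\u09CD\u09AF\u09BE)"       # S: V1 H C1 M1 (ae Ya-phala)
    "|(?P<C>[\u09A1\u09A2\u09AF]\u09BC"              # CN: normalized forms (rule 1)
    "|[\u0995-\u09A8\u09AA-\u09B0\u09B2\u09B6-\u09B9\u09F0\u09F1])"
    "|(?P<V>[\u0985-\u098B\u098F\u0990\u0993\u0994])"
    "|(?P<M>[\u09BE-\u09C3\u09C7\u09C8\u09CB\u09CC])"
    "|(?P<B>\u0982)|(?P<D>\u0981)|(?P<X>\u0983)"
    "|(?P<H>\u09CD)|(?P<Z>\u09CE)"
)
RA = {"\u09B0", "\u09F0"}  # C2 of the P (Ra-Hasanta) sequence

# category -> (rule number, categories allowed immediately before it)
MUST_FOLLOW = {
    "H": (2, {"C"}),
    "M": (3, {"C"}),
    "D": (4, {"V", "C", "M"}),
    "X": (5, {"V", "C", "M", "D"}),
    "B": (6, {"V", "C", "M", "D"}),
    "Z": (7, {"V", "C", "M", "D", "B", "X"}),  # P is handled in the loop
}
NOT_AFTER_H = {"V": 8, "S": 9}


def tokenize(label):
    """Split an NFC-normalized label into (category, text) pairs."""
    text = unicodedata.normalize("NFC", label)
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"U+{ord(text[pos]):04X} is outside this sketch's tables")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def broken_rules(label):
    """Return the numbers of the WLE rules (2 to 9) that the label breaks."""
    tokens = tokenize(label)
    broken = []
    for i, (cat, _) in enumerate(tokens):
        prev_cat = tokens[i - 1][0] if i else None
        if prev_cat == "S":
            prev_cat = "M"  # S ends with M (note under rule 7)
        if cat in MUST_FOLLOW:
            rule, allowed = MUST_FOLLOW[cat]
            after_p = (cat == "Z" and prev_cat == "H" and i >= 2
                       and tokens[i - 2][1] in RA)
            if prev_cat not in allowed and not after_p:
                broken.append(rule)
        elif cat in NOT_AFTER_H and prev_cat == "H":
            broken.append(NOT_AFTER_H[cat])
    return broken


print(broken_rules("\u0995\u09BE"))        # কা: consonant + matra
print(broken_rules("\u0995\u09CD\u0985"))  # vowel directly after hasanta
```

An empty list means that none of rules 2 to 9 is violated under these assumptions; it does not establish that a label is valid under the complete LGR.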
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in adding such ligatures, because any user can easily create them even though they may not be attested combinations.

Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া /bæŋk ʌv ɪndiə/ (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE) (meaning "Bank of India"). This is the case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic communities; hence this is explicitly prohibited by the NBGP.

In future, if required, depending on the prevailing requirements from the community, the future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF

Below are a few examples which help one understand some of the rules ABNF puts in place. These are just given for reference purposes and are not meant to be comprehensive.

1. H, M, B, D or X cannot occur in the beginning of a Bangla word.
Example:
ক क
াক ाक
ংক क
ক क
ঃক ःक
As can be seen, such a combination will automatically result in a "golu" or a dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.
2. H is not permitted after V, B, D, X, M, S.
Example
অ अ
অং क ক क কঃ कः ি )क
3. The number of B, D or X permitted after a Consonant, a Vowel or a Kāra (Mātrā) is restricted to one; thus the following combinations are invalidated:
Example
কংং क
ক क
কঃঃ कःः
কা का
কীঃঃ कःः
অংং अ
অ अ
অঃঃ अःः
4. The number of M permitted after a Consonant is restricted to one.
Example:
কীী की
5. M is not permitted after V.
Example:
ইাঈৗ ईाईौ
6. The combinations of Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Example
কংঃ कः
কঃং कः
8 Contributors
8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members
8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals' Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje; Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. 'Bangla Harapher Panch Parba'. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the M.S. University of Baroda, Social Science Number, 29.2, 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf accessed on 25/11/2017
[124] Ethnologue. Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm accessed on 25/11/2017
[125] Bengali alphabet for Manipuri, found in Ethnologue. Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm accessed on 20/10/2019
[126] Wikipedia. Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet accessed on 25/11/2017
[129] Omniglot. Sylheti. http://www.omniglot.com/writing/syloti.htm accessed on 10/5/2018
[130] Wikipedia. Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language accessed on 25/11/2017
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20 accessed on 10/5/2018
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696 accessed on 10/5/2018
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html accessed on 10/5/2018
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1.1, 23-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1.2, 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). 'Srīhatta-Nāgarī Lipir Utpatti o Bikāś'. Bangla Academy Patrika (Dhaka), Vol 12 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia. Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari accessed on 19/5/2018
[139] Furui, Ryosuke (2015). 'Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century', in Bhairabi Prasad Sahu & Hermann Kulke, eds. Interrogating Political Systems: Integrative Processes and States in Pre-Modern India. Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). 'Phones of Bengali'. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646. UNICODE. https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf accessed on May 21, 2018
[145] Wikipedia. Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block) accessed on May 21, 2018
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html accessed on May 21, 2018
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. 'Bangla Spelling Reform: the Long and Short of It'. Bangla Journal 19, 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. "What has happened So Far In terms of Script Reforms". Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka on 10/07/2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 - Core Specification, Chapter 12, P. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when applied to a specific language/script, certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions will help to fine-tune the ABNF. In case of Bangla13 in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold 'varga' (as defined under Table 5). Moreover, Bangla does not generally allow ya (য়) in the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem 'Lichuchor' written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) in the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) in the beginning of a word, for example ৎসেরিং (Tsering).

2. C H can come with Khanda Ta only in the case where C is ra (র) (09B0): র্ৎ, as in ভর্ৎসনা.

3. Only the following combinations with V H C M will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
10.2 'Sylheti Nagarī lipi' or 'Siloti'
This version of Bangla script resembles the 'Kaithī' script (ISO 15924) used by the Accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, widely in use during the 1880s. There were several other names of Sylheti Nagarī or Siloti (129), such as 'Jalalabada Nagarī', 'Fula (flower) Nagarī', 'Muslim Nagarī' or 'Muhammad Nagarī'. It is said that Shah Jalal had brought the script with him in the 13th-14th Century in Sylhet (138), although some suggested that it was an invention by the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here (138).

13 This section specifically takes up issues of restrictions pertaining to Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
Table 17 - The Script Table of Sylheti Nagarī or Siloti

10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī

Bangla        Devanāgarī    NBGP Decision
ঃ U+0983      ः U+0903      Confusable
ও U+0993      उ U+0909      Confusable
ঘ U+0998      घ U+0918      Confusable
ঁ U+0981      ॅ U+0945      Confusable

Table 18: Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhī

Bangla        Gurmukhī      NBGP decision
ঘ U+0998      ਬ U+0A2C      Confusable
ঁ U+0981      ੱ U+0A71      Confusable

Table 19: Bangla and Gurmukhī confusable code points

Bangla                    Gurmukhī      NBGP decision
ও U+0993                  ਤ U+0A24      Distinguishable
শ U+09B6                  ਅ U+0A05      Distinguishable
ম U+09AE                  ਮ U+0A2E      Distinguishable
বা U+09AC and U+09BE      ਗ U+0A17      Distinguishable

Table 20 - Bangla and Gurmukhī distinguishable code points
10.3.3 Bangla and Oriya (Odia)

Bangla        Oriya (Odia)    NBGP Decision
ও U+0993      ଓ U+0B13        Confusable

Table 21 - Bangla and Oriya confusable code points

Bangla        Oriya (Odia)    NBGP Decision
ঘ U+0998      ସ U+0B38        Distinguishable

Table 22 - Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs

Consonants    Phonetic Value    Allographs (Clusters)    Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value but is mostly pronounced as n
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒThesecondarysymbol(allograph)jɔ-phalahastwophoneticvaluesWhenaddedtotheinitialconsonantinaworditisavowelaelig(asinশ8ামলর 8াপারetc)Butafteranon-initialconsonantitjustdoublesitinpronunciation(asinকায6ধায6etc)The+যcombinationhastwophysicalmanifestationsmdashর 8andয6
ক8(p+য)স8(+য)র 8(+য)[Justর 8isneverusedinBanglaorthographyর 8াisbutthenitslasttwosymbolsYa-phalaa-karaconstituteavowelsignrepresentingthevowelঅ8া]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ hwordfinallywordmediallyitdoublesthepronunciationofthefollowingconsonant
অঃকঃ
অব
3 Background on Script & Principal Languages Using It

3.0 Introduction
'Bangla' (or Bengali) is historically and genealogically regarded as an eastern Indo-Aryan language with around 178.2 million speakers in Bangladesh (98% speakers) and 83.4 million speakers in the Indian states of West Bengal (68.37 million), Tripura (2.15 million), South Assam (7.3 million), Odisha (0.49 million) and Delhi (0.21 million), as well as in the Andaman and Nicobar Islands (close to a hundred thousand), accounting for 8.3% of India. It is a major language in Jharkhand (2.6 million) too, and a language with a sizable population in Bihar (0.44 million). Apart from these, there are a huge number of Bangla-speaking diasporas spread all over the world. It is the seventh largest spoken and written language in the world. Bangla is the national and official language of Bangladesh and one of the 22 Official languages in India (listed in the 8th Schedule of the Indian Constitution). It is also one of the official languages of Sierra Leone. The script is also called Bangla [102], which is an eastern variety of the 'Brāhmī' Writing System written from left to right. Historically, it derives from the Brāhmī alphabet as used in the Ashokan inscriptions (269-232 BC).
Bangla and its cognate languages as mentioned above together form a linguistic group known as the Eastern New Indo-Aryan (NIA). There is a gross inadequacy of the inscriptions and manuscripts in the Eastern Apabhranśa or 'Avahaṭṭha', except for small inscriptions and the manuscripts of the Tantric Buddhist text titled 'Caryyācaryyaviniścaya' or the Caryā-Pada [114], dating back to the 9th-11th century. As a result, there is not much epigraphic evidence for the development of its writing system. However, what evidence is available of the genesis of the Bangla writing system is discussed in section 3.1 [109].

Historically, the Bangla language is divided into three periods, as evident from various sources:
(i) Firstly, the Old Bangla Period (roughly 950/1000 to AD 1200/1350), of which three specimens are found: (a) 47 Caryā songs, the Dohākōṣa of Saraha and the Dohākōṣa of Kānha (mostly in Apabhranśa), and the Ḍākārṇava (in a variety of Prakrt); (b) Old Bangla specimens of over 300 words in a commentary [141].

(ii) Then there is the Middle Bangla Period, 1200-1800 AD, again divided into three stages: (a) Transitional Middle Bangla (1200-1300 AD, for which no genuine specimens are found) [147], (b) Early Middle Bangla (1300-1500 AD) and (c) Late Middle Bangla (1500-1800 AD).

(iii) Finally, after 1800 AD we find the Modern or New Bangla, marked by the introduction of written prose [109] in the books of Fort William College (established in 1800). The colloquial variety of Bangla based on the speech variety of Calcutta (called 'Kolkata' now) made its first appearance through the Hutōm Pẽcāra Nakśā (1862) by Peari Chand Mitra. The influence of English in the vocabulary, idioms and expressions, as well as in the writing styles of Bangla, is significant by this time. The fonts and types for Bangla developed during this time also spread to all parts of the Bangla speech community [101, 120]. The same fonts, with some extensions, were also used for the neighbouring languages deploying this writing system.
Bangla prose had developed two literary styles during the 19th-20th Century: the Sādhubhāṣā (সাধুভাষা - Elegant Language or Style) and the Calitabhāṣā (চলিতভাষা - Current Language or Modern Style). It is the latter style that is prevalent today in written prose.

The Language Movement in Bangladesh (the then East Pakistan) began in 1948, as civil society dissented to the elimination of the Bangla script from currency and stamps, which had been in use since the British Raj. The movement reached its pinnacle in 1952 when, on 21 February, the police fired on demonstrating students and civilians, triggering numerous injuries and deaths2. Later, following the Language Movement, on 27 April 1952 the All Party National Language Committee decided to demand the establishment of an organization for the promotion of the Bengali language. Bangla Academy, Dhaka, right from its inception in 1955, has been engaged in promoting and fostering Bangla as the lingua franca of the country before and after independence from Pakistan in 1971. Through the various commissions and committees constituted by the Government of Bangladesh (Bāṅlādeśa Jātīya Śikṣā Kamiśana in 1972, Jātīya Śikṣā Upadeṣṭā Pariṣad in 1979, Bāṅlā Bhāṣā Bāstabāyana Sela in 1982, Bāṅlā Bhāṣā Kamiṭi in 1983, etc.3) after independence in 1971, Bangla was made the primary medium of instruction/communication in all Governmental and educational activities. Through a great struggle and bloodshed, the Bengalis established Bangla as an official language of the state4.
2 The UN declared Ekuśe February (21st February) as the International Mother Language Day at the UNESCO General Conference in Paris on 17 November 1999, "in recognition of the sanctity and preservation of all vernacular languages in the world".
3 Bāṅlā Bhāṣā Kamiṭi. 1983. Bāṅlā Bhāṣā Kamiṭi Riporṭ (Report of the Bangla Bhasha Committee). Dhaka: Śikṣā, Dharma, Krīṛā O Saṅskṛti Mantraṇālaya, People's Republic of Bangladesh.
4 Chakraborty, Rajib. 2018. The Fishermen's Community: A Language-Culture Interplay (A Study of Post-1971 Select Bangla Novels). Unpublished PhD Dissertation, Visva-Bharati.
3.1 Written Bangla
The 'Bangla alphabet' (বাংলা লিপি - Bānglā lipi, ISO 15924) is derived from the Brāhmī writing system, which is related to the Nagarī (also known as Devanāgarī5) script [108] as well as to the Tirhutā writing system [106]. Considered to be the fifth most widely used writing system in the world, this combined Bangla-Asamiyā-Manipuri Script (showing some variations for Asamiyā and Meitei or Bishnupriya Manipuri) (130) was used in the eastern Indian Sanskrit manuscripts too. For Chakma in India and Bangladesh, and for Kokborok in Tripura, it was and still is one of the scripts used. A close variant called Tirhutā (123; now available also in UNICODE 10.0 as 11480-114DF; see 110) or Mithilākṣara was used for Maithili from the 14th Century until the early 20th century [106]. In this context, one finds a mention of 'Sylheti Nagarī lipi' or 'Siloti' (added to the Unicode Standard in March 2005 with the release of version 4.1), the details of which could be of interest only to historians and historical linguists (see 137 and 144). But Sylheti Bangla is now generally written by many in the modern-day Bangla script for all practical purposes.

Originally, during the reign of the Pāla dynasty (750-1154 AD) in eastern India, and even earlier, perhaps during the Malla period (694 AD onwards), the present-day Bangla writing system got a shape comparable to the modern-day ones [111, 119]. A pictorial description of Brāhmī to Modern Bangla Script could be presented here in a tabular form:
Modern ক জ ম র স অ
k j m r s a
Table 1: Pictorial depiction of Evolution of Brāhmī to Bangla
5 William Dwight Whitney in his Sanskrit Grammar unequivocally said, "This name (Devanāgarī) is of doubtful origin and value" (Whitney, William Dwight. 1994 reprint. Sanskrit Grammar. New Delhi: Motilal Banarasidass Publishers, p. 1).
The inscriptional evidence in Brāhmī is found in the Archaic Brāhmī from the 3rd century BC to the 1st century BC, in Middle Brāhmī soon after (1st-3rd Century AD), and then on in the Late Brāhmī (4th-6th Century AD). This evidence could be seen in both Bangladesh and West Bengal [108] in (1) the Mahasthanagara (Bogra district, Bangladesh; the ancient name being Pundranagara or Paundravardhanapura) inscriptions, (2) Brāhmī (and Kharoṣṭhī) inscriptions from the lower 'Gangetic Bengal', and (3) copper plate inscriptions of the Imperial Guptas from the northern part of West Bengal and North-West Bangladesh, in the areas under Dharmaditya, Gopachandra and Samācāradeva (about whom one only knows from five copper plates found in Kotalipara in the Faridpur district in Bangladesh, one in Mallasarul in the Burdwan district (West Bengal) and one in Jayramapura (Ballesvara district, now in Odisha)). These epigraphs from the eastern part of Undivided India (dating back to the 4th-6th Centuries AD) showed some characteristic features of letters (especially in ম 'ma', ল 'la', শ 'sa', স 'sa' and হ 'ha') which led to the development of the eastern variety of Gupta script.

Epigraphic records from Bangladesh demonstrate remarkable developments in Eastern Brāhmī. In this context, one may note the Tippera copper plate inscription of the 'Samatata' rulers (139, pp. 265) such as Lokanātha (dated 7th Century AD, during the latter half), the Kailan inscription of Śrīdharana Rāta, as well as the Astafpur copper plates. The letters seem to hang down from wedge-shaped solid triangles, with right-hand verticals bending down at the bottom, because of which the script was described by Prinsep and Fleet as Kuṭila-lipi (literally 'Cursive writing style'), whereas the term Siddhamātrikā (as a matra or bar is placed over each of the letters) was used by Al Biruni (973-1048) to designate the script of Northern India. The next stage of development is illustrated by the 9th Century copper plate inscriptions from Khalimpur of the reign of Dharmapāla, from Monghyr and Nalanda of the time of Devapāla in Bihar, and from Jagjīvanpura (Malda) of the reign of Mahendrapāla. The Siddhamātrikā (mentioned as 'Siddham' in Chinese sources) is said to have been prevalent also in this region up to the end of the tenth century. Also called the Gauri (i.e. Gandi) in Pūrvadeśā or the Eastern country, it was regarded as the same script to which the appellative Proto-Bangla is given, with characteristics in rudimentary forms in the period between AD 875 and AD 1025. In some epigraphs it is considered as belonging to the second quarter of the eleventh century AD. Flattening of head-marks becomes prominent in comparison to the wedge-shaped serifs. An important landmark in the development of the Bangla script is the Ramaganja copper plate inscription of Mahāmānḍalika in the last quarter of the eleventh century AD. It is the earliest document from this entire region which bears the letter m with a tick rising upwards. The full vowel i develops a tick at the right end of the upper horizontal bar above and a curved hook below. Initial e approaches the modern Bangla character. A mature form of Proto-Bangla, the immediate precursor of Bangla script, is illustrated in the inscriptions of the Varmana, Sena and Deva rulers of the twelfth and thirteenth centuries [104].

The evolution of the Bangla script (Cf. 136) is aligned with the story of the advancement of printing technology. The first "Movable type" scripts were technically created and used while printing Nathaniel Brassey Halhed's (1751-1830) 1778 book titled A Grammar of the Bengal Language. In 1785, Governor-General Warren Hastings (1732-1818) requested another civilian, Charles Wilkins (1749-1836), to cut punches for Bangla printing characters. The current printed form of Bangla script appeared soon after. It is generally agreed that Wilkins developed the Bangla print script [111]. He passed on this knowledge to Pancanana Karmakara (-1804), a renowned artist in Bengal. Later it was Karmakar and his family that became famous in Bangla printing technology. Shepherd was another assistant of Wilkins in this designing of the script, which became more angular, with sharper turns and edges [133]. A few archaic letters were modernized during the 19th century. It was standardized by Pandit Ishwar Chandra Vidyasagar when the Bangla type fonts were to be used to publish on a large scale under the Calcutta School Book Society [116, for several references].

Much later, the Linotype technique invented by Ottmar Mergenthaler (1854-1899) in 1886 was introduced into Bangla printing in 1935 by the efforts of Suresh Chandra Majumdar (1888-1954), Rajsekhar Basu (1880-1960), Jatindra Kumar Sen (1882-1966) and his disciple Sushil Kumar Bhattacharya, and began to be used by the Ānandabazara Patrika group, later followed by others. Within a few years, the more advanced monotype technology came to be used in Bangla printing. However, in Bangla printing culture monotype had a very limited acceptance, and linotype held the stage till eventually digital technology came in to replace all earlier techniques. All these could be presented in a table:
PERIOD: 3rd Century BC
DESCRIPTION: Use of Brāhmī and Kharoṣṭhī scripts begins in the subcontinent. Brāhmī was widely used during the time of the Mauryan King Aśoka. In one theory, Brāhmī is based on a North Semitic alphabet but suitably modified to fit the needs of local languages. It is currently believed to have been an independent development.
NAMES: Brāhmī

PERIOD: 1st-3rd Century AD
DESCRIPTION: The Kusana script, named after the Kusana royal dynasty.
NAMES: Kusana script

PERIOD: 4th-5th Century AD
DESCRIPTION: The next stage of its evolution was into the Gupta script, named after the Gupta royal dynasty.
NAMES: Gupta script

PERIOD: 7th Century AD
DESCRIPTION: Epigraphic records from Bangladesh demonstrate remarkable developments in Eastern Brāhmī, giving rise to the Kuṭila-lipi.
NAMES: Kuṭila-lipi

PERIOD: 8th Century AD
DESCRIPTION: Some copper plate inscriptions are found in Khalimpur, Bangladesh, during the reign of Dharmapāla, from Monghyr and Nālandā in Bihar of the time of Devapāla, and from Jagjīvanapura in West Bengal of the reign of Mahendrapāla.
NAMES: Siddhamātrikā

PERIOD: 9th Century AD until 1025 AD
DESCRIPTION: Proto-Bangla characteristics in rudimentary forms develop. An important landmark in the development of the Bangla script is the Ramaganja copper plate inscription of Mahāmāndalika, found in the last quarter of the eleventh century AD.
NAMES: Proto-Bangla Script & Language

PERIOD: 12th-13th Century AD
DESCRIPTION: A mature form of Proto-Bangla, the immediate precursor of Bangla script, is found in the inscriptions of the Varmana, Sena and Deva rulers of the twelfth and thirteenth centuries.
NAMES: Matured Proto-Bangla

PERIOD: 14th-15th Century AD
DESCRIPTION: The characteristics of typical Bangla script began to develop, as could be seen in the copper plate inscription of Vijayamānikya-I of Tripura dated 1478 AD, which also illustrates forms of Bangla letters in the fifteenth century AD.
NAMES: Modern Bangla Script era begins (See Ross 1999)

PERIOD: 16th-17th Century AD
DESCRIPTION: The chart of the Bangla alphabet appended to the China Monuments published from Amsterdam in 1667 and The code of Gentoo law published from London in 1776 both show a chart of the Bangla alphabet. They show 16 Vowel letters including the Long 'ৡ' 'l i', Anusvāra and Visarga, and 34 Consonants.
NAMES: Printed Charts of Bangla

PERIOD: 18th-19th Century AD
DESCRIPTION: Charles Wilkins develops printing in Bangla in 1778 and Vidyasagar reforms it.
NAMES: Bangla Type Fonts

Table 2: Development of the Bangla Writing System
The overall development of Bangla Script from the Kuṭila-lipi period to Modern Bangla can be seen in Table 3 ([102 and 146]; see also the web page in 147).
Table 3: Bangla Script in Different Centuries
3.2 Languages Considered
Below is the tabular representation of the languages using Bangla script that are placed on the EGIDS Scale 1-6 (see 117 for details). Some languages under EGIDS 5 and 6 have also developed their own scripts for printing and publishing. Some had used Bangla script earlier (such as Bodo) or used it in West Bengal at some point of time (Santali) but have later shifted to another writing system. Bodo is now written in Nāgarī or Devanāgarī, and for Santali one uses both Nāgarī/Devanāgarī and Ol-chiki (145). For the purposes of the Bangla LGR, only languages belonging to the EGIDS scale 1 to 4 have been considered. Consider the following table.
EGIDS Scale 1
EGIDS Scale 2
EGIDS Scale 3
EGIDS Scale 4
EGIDS Scale 5
EGIDS Scale 6
Bangla (Bengali)
Santali, Bodo, Riang, Khumi, Mru(ng), Asho
Lepcha, Pnar, Koda, Kora, Chak
Asamiyā (Assamese)
Koch or Rajabansī
Malto or Malpahariya
Manipuri or Meitei
Bisnupriya Manipuri, Kok-Borok (Tripura & Bangladesh)
Chakma, Hajong, Mundari & Kurux (of Bangladesh)
Toto, Rohingya, Tippera, Megam, Tanchangya
Usoi, Limbu, Sadri or Oraon
Bhumij or Mundari, Bawm, Chin
Table 4: Main languages in India and Bangladesh that use Bangla Script on the EGIDS Scale
3.3 Notable Features of Bangla Script [150]
The Bangla Writing System has certain features that show how it has to be written or how type-setting in Bangla could be done. This section is followed by a section that explains the code points (and fixed code-point sequences) which show certain distinctive characteristics of Bangla and which make up the Repertoire. The next sections will also cover the 'akshar'-formation rules (ABNF) showing character classes, Whole Label Evaluation (WLE) and Context Rules, as well as In-Script and Cross-Script Variants. Here we present some basic features of the script and its pronunciation.
The Bangla script is an alpha-syllabic writing system in which all consonants are assumed to contain an accompanying 'inherent' vowel (theoretically before or after each consonant). It varies between ɔ and o depending on the position of the consonant in the word. At times these 'assumed' or 'inherent' vowels are not pronounced at all [142].
Vowels can be written as independent letters, or by using a variety of diacritical marks which are written above, below, before, after, or in both of the last two positions around the consonant they follow in pronunciation [105].
All Bangla consonants, when pronounced in isolation, are uttered with an inherent vowel -ɔ; hence ক 'k', খ 'kh' or গ 'g' are usually pronounced as [kɔ], [khɔ] or [gɔ] etc. Phonologically, the Bangla vowel -ɔ corresponds to the Hindi schwa ə.
When consonants occur together in clusters, special conjunct letters are formed. In printed Bangla many of these consonantal clusters or conjoined consonants are in use. The letters for the consonants other than the final one in the group are generally reduced. But there are a few special conjunct characters which are compounds of the consonant characters, e.g. ক্ (k) + ষ (s) = ক্ষ (ks), ঞ্ (n) + জ (j) = ঞ্জ (nj), জ্ (j) + ঞ (n) = জ্ঞ (jn), হ্ (h) + ম (m) = হ্ম (hm). There are other issues also: র as the second member of a cluster is reduced to a secondary symbol, e.g. প্ (p) + র (r) = প্র (pr), ষ্ (s) + ট্ (t) + র (r) = ষ্ট্র (str) (as in উষ্ট্র ustra "camel"). য (y), when used as a primary symbol, represents jɔ in Bangla. But its secondary symbol (allograph), jɔ-phala, has two phonetic values. When added to the initial consonant in a word it is a vowel æ (as in শ্যামল (syamala) "green", র‍্যাপার (ryapara) "wrapper" etc.). But after a non-initial consonant it just doubles it in pronunciation (as in কার্য, ধার্য etc.). The র্ (r) + য (y) combination has two renderings: র‍্য (ry) and র্য (ry). In the case of দ্ (d) + ধ (dh), গ্ (g) + ধ (dh) and ন্ (n) + ধ (dh), the shape of the second member is changed, e.g. দ্ধ (ddh), গ্ধ (gdh) and ন্ধ (ndh) respectively. The solitary example of র্ (r) + ঋ (r) = র্ঋ (as in নৈর্ঋত nairrt "Southwest"), used mostly in cases of Classical borrowings, shows the use of the secondary symbol of a consonant followed by the primary symbol of a vowel. The inherent vowel only applies to the final consonant of the cluster.
In consonant clusters many consonants take a completely different form. Some typical examples are ক্ত (kt), ক্র (kr), ক্ষ (ks), গ্ধ (gdh), জ্ঞ (jn), ঞ্চ (nc), ঞ্জ (nj), ত্ত (tt), ন্ত (nt), ন্ধ (ndh), ব্ধ (bdh), ভ্র (bhr), ম্ব (mb), স্ত (st) etc. র has two allographs apart from its full shape: one is 'repha', as found in র্ক (rk) and র্প (rp), and the other is ra-phala, as in প্র (pr) and ক্র (kr). ষ্ণ (s+n) is another one, where the cerebral nasal consonant sign takes an unusual shape [151].
The Bangla script has at least fifty-two primary symbols and quite a few allographs (positional variants of them), corresponding to forty-four phonemes or functional speech sounds (7 oral and 7 nasal vowels and 30 consonants) (150), with some obvious redundancies, although in one of the first phonemic analyses the number was thought to be thirty-five phonemes [140].
As mentioned above, in Bangla several graphemic symbols have secondary shapes, technically called 'allographs', with a complementary distribution in each case. These graphs or markings are generally added to the primary symbol [113] in the following positions:
1) Below (e.g. কু (ku), ন্ত (nta), হ্র (hra) etc.)
2) Above (e.g. চঁ (ca), র্ক (rka) etc.)
3) Right side (e.g. কা (ka), কং (kan) etc.)
4) Left side (e.g. কে (ke))
5) Left side and above simultaneously (e.g. কৈ (kai), কি (ki) etc.)
6) Right side and above simultaneously (e.g. কী (kī))
7) Right side and left side simultaneously (e.g. কো (ko))
8) Right side, left side and above simultaneously (e.g. কৌ (kau))
As for the complementary distribution of vowel letters (word- or syllable-initial) and Vowel Matras, which are relevant for ABNF, let us consider the following. Besides some simple Vowel Modifiers called 'Kars' in Bangla (also referred to as Matra in the other LGR documents of Neo-Brāhmī), there are some combinatory modifiers of Bangla Vowels with certain consonants. For example, whereas
আ U+0986 BENGALI LETTER AA is substituted by া U+09BE BENGALI VOWEL SIGN AA,
ই U+0987 BENGALI LETTER I is substituted by pre-posed ি U+09BF BENGALI VOWEL SIGN I,
ঈ U+0988 BENGALI LETTER II is substituted by ী U+09C0 BENGALI VOWEL SIGN II, or
উ U+0989 BENGALI LETTER U is substituted by ু U+09C1 BENGALI VOWEL SIGN U by marking below the primary grapheme,
there are some special vowel modifiers of উ, as in the following combined letters:
গু gu, rather than writing as গ (g) + ু (u)
রু ru, rather than writing as র (r) + ু (u)
শু śu, rather than writing as শ (ś) + ু (u)
হু hu, rather than writing as হ (h) + ু (u)
ন্তু ntu, rather than writing as ন্ (n) + ত (t) + ু (u)
Similarly there could be vowel modifiers of ঊ or '(Long) ū' as well, e.g. ভ্ (bh) + র (r) (ভ্রূ bhrū "eyebrow"), শ্ (ś) + র (r) (শ্রূ śrū), and of ঋ (ṛ) after হ (h) (হৃ hṛ) etc.
There have been many notable contributions in simplifying and modifying Bangla spellings and combinatory techniques, especially by scholars such as Pabitra Sarkar (1992) [134]. In this there has been an attempt to reduce the number of allographs of both vowels and consonants in clusters, and it has been widely accepted in the printing of school texts in both Bangladesh and West Bengal [151, 152]. As of now, two systems, the old (traditional) and the new, go on side by side, operative in different domains.
However, in preparation of this LGR document the aim has been to consider the widely used and usable sequences and combinations, and their variations across the sister scripts belonging to the basket of Brāhmī writing systems. Bangla Academy, Dhaka, published Standard Bangla Spelling Rules in 1992, following the recommendations of a committee constituted through a workshop jointly organized by the Jatiya Siksakrama and Pathyapustaka Board in 1988. A thoroughly revised edition of the Rules was published in September 2012 (footnote 6). After the establishment of the Bangla Akademi of West Bengal in 1986, its first President, Annadasankar Ray (1904-2002), in his inaugural address gave a direction for standardization of the Bangla alphabet, script and spelling system, and clearly argued that they would not blindly follow the Sanskritic model of conventional grammar. A broad list of proposals was sent to experts on Bangla, and a broad agreement was reached for 'homogenization of Bangla spelling' by 1988. Based on opinions received from different quarters, a unanimous list of 'rules' was agreed upon. This was published as a 'Spelling Dictionary' titled Ākādemi Bānāna Abhidhāna (1997), which was obviously more comprehensive than 'The University of Calcutta proposals' made in 1936. Along with the 'rationalization' of spellings, another step was taken to make the writing system easier to read by making the symbols used, both single and combined ones, more 'transparent'. These reforms were originally suggested by Sarkar (1987, first published in 1978) [134] [153], where he used the terms Swaccha ('Transparent') and Aswaccha ('Opaque' or non-transparent), even adding Ardha Swaccha ('half transparent') in between the two. Some examples are:
6 Bangla Academy. 2012. Bāṅlā Ekaḍemī Pramita Bāṅlā Bānānera Niyama (Bangla Academy Standard Bangla Spelling Rules). Dhaka: Bangla Academy.
Transparent: ন্ন (nn), প্ত (pt), স্ত (st), where both members of the cluster can be recognized.
Opaque, where neither of the two could be (easily) recognized: ক্ষ ks (ক্ k + ষ s), জ্ঞ jn (জ্ j + ঞ n), ঙ্গ ng (ঙ্ n + গ g), হ্ম hm (হ্ h + ম m).
Semi-transparent: প্র (pr), র্প (rp), where one symbol is recognizable and the other is not. In the case of three-term clusters at least one symbol will not be transparent, e.g. স্ত্র str (স্ s + ত্ t + র r), ষ্ট্র str (ষ্ s + ট্ t + র r) etc.
There were in fact two types of proposals. One concerned the shape of the letters: those of consonant + vowel (CV) combinations, and conjuncts, which are consonant + consonant combinations. There were further complex shapes, i.e. those of consonant + consonant + (consonant +) vowel (CC(C)V) signs, as in প্রু (pru) or স্ক্রু (skru). Some decisions in this area were necessary because a few of the CC(C) symbols represented complexities that made learning them difficult for children. The other dealt with the spellings of words only, without any reference to the shapes of the letters in which they were written. The basic objective here was 'one word, one spelling' to the greatest extent possible [151].
Below we place a statement of the most salient changes that affect the consonant + vowel combinations [153].
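a. The variants of the short u (হ্রস্ব উ-কার hrasva u-kāra) vowel sign have been brought down to one, i.e. ু. So the special ligature shape for গু (gu) is now written as গ with the regular ু sign. The same applies to রু (ru), শু (śu) and হু (hu), and therefore also to cluster + short u sign: ন্তু (ntu) is written as ন্ত with ু (ন + ্ + ত + উ), and স্তু (stu) as স্ত with ু (স + ্ + ত + উ).
b. The variants of long u (দীর্ঘ ঊ-কার dīrgha ū-kāra) have also been reduced: রূ (rū), ভ্রূ (bhrū) (ভ bh + ্ + র r + ঊ ū), দ্রূ (drū) (দ d + ্ + র r + ঊ ū) and শ্রূ (śrū) (শ ś + ্ + র r + ঊ ū) are written with the regular ূ sign.
c. The variants of ঋ-কার (ṛ-kāra, secondary symbol of ṛ) have been brought down to one: হৃ (hṛ) is written as হ with the regular ৃ sign.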
Regarding consonant + consonant + (consonant) ... + (vowel) clusters, Paschimbanga Bangla Akademi proposed transparent or semi-transparent shapes for clusters to the extent admissible in the Bangla writing system. Some examples will clarify the proposal (a slash will mean that the traditional cluster shape precedes it while the Bangla Akademi innovation follows) [153]:
Xব ধ bdh ( b+ ধ dh) Mদ ধ ddh (J d+ধ dh) ন থ nth (L n+থ th) Uঞ চ ntildec
(9 ntilde+চ c) ঞ ছ $ ntildech (9+ছ) ঞ জ ntildej (9 ntilde+জ j) Sক ত amp kt (7 k+ত t) T
kr (7 k+র r) Nগ ধ ( gdh (K g+ধ dh) ) ṅk (u ṅ+ক k) t ṅg (u ṅ+গ g) +
ṣṇ (B ṣ+ণ ṇ) ন ndhr (L n+ dh+র r) - ṇḍr ( ṇ+ ḍ+র r) ktr (7 k+x
t+র r)
3.3.1 The Consonants
As per traditional classification, Bangla consonants are categorized according to their phonetic properties, especially in terms of place and manner of articulation [107]. There are five 'Varga' (pronounced as 'Barga' in Bangla) or groups (sets or classes), distinguished by place of articulation, and one non-'varga' group [105]. Each Varga, which corresponds to stops at a certain place of articulation, contains a series of five consonants classified as per their phonetic qualities (i.e. manner of articulation), beginning from Unvoiced and Unaspirated to Voiced and Aspirated (in the fourth column), finally ending with a homorganic or corresponding nasal [107]. Consider the following table.
'Varga' or Sets | Unvoiced -Asp | Unvoiced +Asp | Voiced -Asp | Voiced +Asp | Nasal
Velar | ক 'K' U+0995 | খ 'KH' U+0996 | গ 'G' U+0997 | ঘ 'GH' U+0998 | ঙ 'NG' U+0999
Palatal | চ 'C' U+099A | ছ 'CH' U+099B | জ 'J' U+099C | ঝ 'JH' U+099D | ঞ 'NY' U+099E
Retroflex | ট 'TT' U+099F | ঠ 'TTH' U+09A0 | ড 'DD' U+09A1 | ঢ 'DDH' U+09A2 | ণ 'NN' U+09A3
Dental | ত 'T' U+09A4 | থ 'TH' U+09A5 | দ 'D' U+09A6 | ধ 'DH' U+09A7 | ন 'N' U+09A8
Bilabial | প 'P' U+09AA | ফ 'PH' U+09AB | ব 'B' U+09AC | ভ 'BH' U+09AD | ম 'M' U+09AE
Table 5: Varga classification of Bangla consonants
(Falling into a pattern of five sets of Unvoiced Unaspirated, Unvoiced Aspirated, Voiced Unaspirated, Voiced Aspirated and Nasals, called the five 'Varga')
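The regular layout of Table 5 can be reproduced from the code points alone. The following is a minimal sketch, not part of the LGR itself: the names `VARGA_START`, `COLUMNS`, `varga_table` and `classify` are illustrative assumptions, and the only data used are the row-initial code points listed in the table.

```python
# Code point of the first (unvoiced, unaspirated) consonant of each varga,
# as listed in Table 5. Each row occupies five consecutive code points.
VARGA_START = {
    "Velar": 0x0995,
    "Palatal": 0x099A,
    "Retroflex": 0x099F,
    "Dental": 0x09A4,
    "Bilabial": 0x09AA,  # U+09A9 is unassigned, so this row starts one later
}
COLUMNS = ["unvoiced -asp", "unvoiced +asp", "voiced -asp", "voiced +asp", "nasal"]


def varga_table():
    """Return {row: {column: character}} for the 25 varga consonants."""
    return {
        row: {col: chr(start + i) for i, col in enumerate(COLUMNS)}
        for row, start in VARGA_START.items()
    }


def classify(ch):
    """Return (row, column) for a varga consonant, or None for anything else."""
    cp = ord(ch)
    for row, start in VARGA_START.items():
        if start <= cp < start + len(COLUMNS):
            return row, COLUMNS[cp - start]
    return None


if __name__ == "__main__":
    for row, cells in varga_table().items():
        print(row, " ".join(cells.values()))
```

Non-varga consonants such as য (U+09AF) fall outside every row and are reported as `None`.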
Non-Varga | য 'Y' U+09AF | য় 'YY' U+09DF | র 'R' U+09B0 | ড় 'RR' U+09DC | ঢ় 'RH' U+09DD
Non-Varga | ল 'L' U+09B2 | শ 'SH' U+09B6 | ষ 'SS' U+09B7 | স 'S' U+09B8 | হ 'H' U+09B9
Table 6: Non-Varga consonants (not falling into any of the five categories)
3.3.2 The Implicit Vowel Killer: Hasanta (called 'Halant' or 'Halanta' in other Brāhmī-based scripts)
As stated earlier, all consonants are pronounced in isolation with an implicit vowel (central back -ɔ in Bangla, as the neutral vowel) assumed to be associated with them [121]. The terms 'Hasanta' (= 'Halant' or 'Halanta' in other Brāhmī-based scripts) and 'Virāma' (footnote 7) (= 'Dari' in Bangla), the latter being preferred in UNICODE (cf. Unicode 3.0 and above), are used in this report to denote the character that marks the absence of this inherent vowel. It may be noted that the term virama has been adopted in UNICODE in a sense that is different from the traditional definition of grammar, and hence it requires some explanation here. Considering the importance of the document, this note should be a part of this LGR document so that anybody referring to it is able to know the proper grammatical explanation of the term. Because a special sign is needed whenever this implicit vowel is stripped off, the symbol is known as the Hasanta (= Halant) (U+09CD). By placing the Hasanta under the first consonant of a combination or cluster one could, in common parlance, "kill" its vowel and create conjuncts. In this manner conjunct characters can generally be written by joining two to four consonants. In rare cases this process can join up to five consonants. However, the maximum number of consonants joining to form one aksara (footnote 8) can only be bounded empirically. This is an observation based on the CIIL-Emille Corpora of Bangla words [132 & 133] as seen in print these days. Given the mixture of scripts and languages happening on the web, the possibility that one may want a generic Top Level Domain [gTLD] with more than the observed maximum cannot be ruled out. This can be the case when a foreign-language word which admits a large number of consonants is transliterated into Bangla. Hence in the Bangla LGR work this limit will not be enforced.
7 Virāma as used here is also a misnomer according to the Indian grammatical traditions. Nowhere is the mere absence of a vowel marked as virama. Hasanta just marks the absence of a vowel, nothing else (Abhyankar, Kashinath Vasudev & J. M. Shukla. 1961. A Dictionary of Sanskrit Grammar. Baroda: Oriental Institute).
8 This term needs to be disambiguated. Aksara also means 'syllable' in Indian grammatical traditions.
3.3.3 Vowels
Separate symbols exist for all 'Swara' or Vowels in Bangla, which are pronounced independently either at the beginning of the word or after another vowel or consonant sound. To indicate a Vowel sound other than the implicit one, a Vowel sign called 'kār' in Bangla, or Mātrā in Nāgarī (footnote 9), is attached to the consonant. Since the consonant has this built-in neutral vowel at the end, there are equivalent kāras (Mātrās) for all vowels except অ (pronounced -ɔ). The correlation is shown as follows:
Vowel | Corresponding vowel sign (kāras / Mātrās)
অ 'A' U+0985 | (none)
আ 'AA' U+0986 | া U+09BE
ই 'I' U+0987 | ি U+09BF
ঈ 'II' U+0988 | ী U+09C0
উ 'U' U+0989 | ু U+09C1
ঊ 'UU' U+098A | ূ U+09C2
ঋ Vocalic 'R' U+098B | ৃ U+09C3
ৠ Vocalic 'RR' U+09E0 | ৄ U+09C4
ঌ Vocalic 'L' U+098C | ৢ U+09E2
ৡ Vocalic 'LL' U+09E1 | ৣ U+09E3
এ 'E' U+098F | ে U+09C7
ঐ 'AI' U+0990 | ৈ U+09C8
ও 'O' U+0993 | ো U+09CB
ঔ 'AU' U+0994 | ৌ U+09CC
- | ৗ U+09D7
Could appear on top of অ 'A' U+0985 or any other vowel | ঁ U+0981 Candrabindu
Could appear after অ 'A' U+0985 or any other vowel | ং U+0982 Anusvara
Could appear after অ 'A' U+0985 or any other vowel | ঃ U+0983 Visarga
After any consonant | ্ U+09CD (Hasanta)
- | ঽ U+09BD Avagraha
Table 7: Bangla Vowels with corresponding kārs
9 Although the term 'Matra' in Bangla stands for an altogether different concept, viz. the top bar placed over a letter, typically available in Hindi and Bangla but missing in Gujarati.
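The vowel-to-kār correlation of Table 7 is a simple lookup. A minimal sketch follows, assuming hypothetical names (`VOWEL_TO_KAR`, `syllable`) that do not come from the LGR; the pairs are transcribed from the table, and অ maps to nothing because its vowel is inherent.

```python
# Independent vowel -> dependent vowel sign (kār), transcribed from Table 7.
VOWEL_TO_KAR = {
    "\u0986": "\u09BE",  # AA
    "\u0987": "\u09BF",  # I
    "\u0988": "\u09C0",  # II
    "\u0989": "\u09C1",  # U
    "\u098A": "\u09C2",  # UU
    "\u098B": "\u09C3",  # VOCALIC R
    "\u09E0": "\u09C4",  # VOCALIC RR
    "\u098C": "\u09E2",  # VOCALIC L
    "\u09E1": "\u09E3",  # VOCALIC LL
    "\u098F": "\u09C7",  # E
    "\u0990": "\u09C8",  # AI
    "\u0993": "\u09CB",  # O
    "\u0994": "\u09CC",  # AU
}
LETTER_A = "\u0985"  # অ: inherent vowel, no kār


def syllable(consonant, vowel):
    """Spell consonant + vowel: append the kār, or nothing for inherent অ."""
    if vowel == LETTER_A:
        return consonant
    return consonant + VOWEL_TO_KAR[vowel]


if __name__ == "__main__":
    ka = "\u0995"
    print([syllable(ka, v) for v in (LETTER_A, "\u0986", "\u0987")])
```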
3.3.4 The Anusvāra / onuʃʃār (ং - U+0982)
The Anusvāra or onuʃʃār in Bangla at times represents a homorganic nasal, but not always. It replaces a conjunct group of 'Nasal Consonant + Hasanta + Consonant' where the second consonant belongs to the Velar varga or set, as in লংকা. But it often appears also for such combinations involving non-velars as the last member of the combination, as in ল্যাংটা "naked" or ল্যাংচা "a kind of sweet; to limp". Before a non-varga consonant the Anusvāra represents a nasal sound that may have an alternative conjoined writing symbol representing the corresponding nasal consonant of the particular set. Although Modern Hindi, Marathi and Konkani prefer the anusvāra to the corresponding half-nasal, in Bangla it is clearly demarcated where one must use the Anusvāra and where it has to be a conjunct cluster with a nasal as the first or the second component.
3.3.5 Nasalization: Candrabindu (ঁ - U+0981)
Candrabindu denotes nasalization of the preceding vowel, as in চাঁদ cad 'moon' (U+099A U+09BE U+0981 U+09A6). This sign, with a dot inside the half-moon mark, is used as a nasalization marker in many Brāhmī-based scripts [143].
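Code point sequences such as the one quoted for চাঁদ can be checked mechanically. The sketch below is illustrative only (the helper name `code_points` is an assumption); it relies on the Python standard library `unicodedata` module for character names.

```python
import unicodedata


def code_points(text):
    """Return the U+XXXX notation for every code point in `text`."""
    return [f"U+{ord(ch):04X}" for ch in text]


# The word for 'moon' as given above: CA, VOWEL SIGN AA, CANDRABINDU, DA.
MOON = "\u099A\u09BE\u0981\u09A6"

if __name__ == "__main__":
    for ch in MOON:
        print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

Note that the candrabindu is stored after the vowel sign it nasalizes, even though it is drawn above the syllable.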
3.3.6 Nukta (় - U+09BC)
The nukta sign does not exist in Bangla orthography. It is predominantly used in many Brāhmī-derived scripts such as Devanāgarī (for Hindi, Bodo, Maithili, Santali, Kashmiri and Sindhi). The term and the concept of nukta are borrowed in Bangla.
The IDNA Protocol (RFC 5891) states that IDNs must be in Unicode Normalization Form C (NFC). RFC 7940 applies this requirement to LGRs. The definition of NFC in the Unicode Standard contains a number of composition exclusions. As a result, the Bangla letters য় YYA, ড় RRA and ঢ় RHA have to be represented in this LGR by using the sequences (YA + Nukta: U+09AF + U+09BC), (DDA + Nukta: U+09A1 + U+09BC) and (DDHA + Nukta: U+09A2 + U+09BC) instead of the single code points YYA (U+09DF), RRA (U+09DC) and RHA (U+09DD), although the use of 'Nukta' is otherwise completely unnatural in Bangla.
It is noted that in the current Unicode Standard chart these characters are listed as additional consonants. As per the LGR Procedure, however, these decisions depend on the IDNA Protocol, through a set of procedures developed by the IETF. Even though the Unicode Standard also provides these three characters as atomic characters (for example 09DC for ড় [r], 09DD for ঢ় [rh] and 09DF for য় [y], each as a single keystroke), the IDNA protocol requires that we treat them as composite characters built from code points already allocated in the Unicode Bengali block.
It may be noted that there could be sporadic attempts or cases of writing Muslim names, Urdu poetic words and Perso-Arabic loan words with a nukta under ক (k), খ (kh), গ (g), জ (j) and ফ (ph), only for the sake of correct pronunciation and for maintaining the sanctity of the loan word. These were also like using the Bangla writing system to work like the IPA script. It is, however, not in use in Bangla writing or printing.
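The effect of the composition exclusions can be observed directly with a standard NFC implementation. The following sketch uses Python's `unicodedata.normalize`; the dictionary and function names are illustrative assumptions, not part of the LGR.

```python
import unicodedata

# The three precomposed letters discussed above.
PRECOMPOSED = {
    "YYA": "\u09DF",
    "RRA": "\u09DC",
    "RHA": "\u09DD",
}


def nfc_sequence(ch):
    """Return the code points of `ch` after normalization to NFC."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFC", ch)]


if __name__ == "__main__":
    for name, ch in PRECOMPOSED.items():
        print(name, f"U+{ord(ch):04X}", "->", " ".join(nfc_sequence(ch)))
```

Because these letters are excluded from composition, NFC never produces the single code point: it yields base letter plus nukta, which is why the LGR lists the sequences.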
3.3.7 Visarga / biʃɔrgo (ঃ - U+0983) and Avagraha (ঽ - U+09BD)
The Visarga / biʃɔrgo U+0983 is frequently used in Bangla loanwords borrowed from Sanskrit and represents a sound very close to h. One could quote as an example দুঃখ duhkho "sorrow", "unhappiness" (U+09A6 U+09C1 U+0983 U+0996).
The Avagraha ঽ (U+09BD) is mainly used in Sanskrit, Pali, Prakrt or Maithili texts written in Bangla. It is gradually being replaced by an upper comma (e.g. নরোঽপরাণি re-written as নরো'পরাণি). It is rarely used now, even in other languages using Bangla script. In the case of the LGR, the Avagraha is not part of the repertoire: it has been decided not to retain Avagraha (ঽ) (U+09BD) because it is blocked in TLDs as per the Maximal Starting Repertoire (MSR).
Please see Appendix II in section 11 for a complete list of Bangla consonants and their allographs.
3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)
This note is pertinent to the use of Zero Width Joiner (ZWJ) and Zero Width Non-Joiner (ZWNJ) as used in Bangla. It needs to be noted that Nepali, Konkani and Hindi use these two signs in a different manner.
ZWJ (U+200D) and ZWNJ (U+200C) are code points that have been provided by the Unicode Standard to instruct the rendering of a string where the script has the option between joining and non-joining characters. Without the use of these control codes the string may be rendered in an alternate form from what is intended.
Use of ZWJ
• In so far as Bangla is concerned, ZWJ is used for the proper rendering of characters such as khaṇḍa-ta ৎ, as in সত্যজিৎ (satyajit) "Satyajit" and সৎ (sat) "honest". This is typed as follows: ta + Hasanta + ZWJ (U+200D).
• However, ZWJ is more important where the same combination of consonantal characters is represented differently depending upon the context. E.g. র + ্ + য has two representations in Bangla: as র্য and as র‍্য. To get the form র্য one has to type র + ্ + য, but for র‍্য the sequence would be র + ZWJ + ্ + য [154]. In other words, ZWJ is used in the rendering of words demanding ya-phalā after ra, which is otherwise not possible to type (render), because the same sequence ra + hasanta + antastha ja in medial and/or final position is used to type repha on the consonant antastha ja, as in কার্য (kaarjo). In order to get a ya-phalā after the consonant ra it is therefore obligatory to use ZWJ after ra, as in র‍্যাপার (wrapper), র‍্যাশ (rash), র‍্যালি (rally) etc. The typing sequence is given below:
ra (র) + ZWJ + hasanta (্) + antastha ja (য) = র‍্য
Use of ZWNJ
• ZWNJ is used in Bangla to represent the explicit Hasanta or Halant. In order to avoid conjunct formation in cases where there is an explicit hasanta before the succeeding consonant, the ZWNJ is used.
Consonant + hasanta + ZWNJ + consonant = explicit hasanta. Example: প্রাক্‌কথন (prakkathana, prakkɔtʰon)
The use of ZWJ and ZWNJ has been ruled out from the root zone by the [Procedure]. Used in Bangla to create alternate renderings, the insertion of these two signs can affect searching as well as NLP.
The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Hasanta) where default conjunct formation is to be explicitly restricted and the Hasanta joining the two consonants participating in the conjunct formation needs to be explicitly shown.
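The two ra + ya sequences described above, and the consequence of excluding the join controls from the root zone, can be sketched as follows. This is an illustration under stated assumptions: the constant and function names are hypothetical, and the code checks only for the two control characters, not for any other LGR rule.

```python
ZWNJ, ZWJ = "\u200C", "\u200D"
RA, YA, HASANTA = "\u09B0", "\u09AF", "\u09CD"

# ra + hasanta + ya: rendered as repha over ya.
REPHA_YA = RA + HASANTA + YA
# ra + ZWJ + hasanta + ya: rendered as ra with ya-phalā.
RA_YAPHALA = RA + ZWJ + HASANTA + YA


def has_join_control(label):
    """True if the label contains ZWJ or ZWNJ, which the root zone rules out."""
    return ZWJ in label or ZWNJ in label


def strip_join_controls(label):
    """Remove ZWJ and ZWNJ, leaving only the visible code points."""
    return label.replace(ZWJ, "").replace(ZWNJ, "")


if __name__ == "__main__":
    print(has_join_control(REPHA_YA), has_join_control(RA_YAPHALA))
    # Without the control character the two spellings become identical.
    print(strip_join_controls(RA_YAPHALA) == REPHA_YA)
```

Since the two spellings differ only by an invisible control character, a label containing ZWJ could not be told apart from its repha counterpart once the control is dropped, which illustrates the searching concern noted above.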
3.3.9 Use of Ya-phalaa
Ya-phalaa sequences are the two instances in Bangla where Hasanta is preceded by a full vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E):
• অ্যা 0985 09CD 09AF 09BE: BENGALI LETTER A + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
• এ্যা 098F 09CD 09AF 09BE: BENGALI LETTER E + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
For rendering ya-phalā following অ and এ it is necessary to type U+09CD Hasanta plus U+09AF ya, preceded by the said vowels. This is a purely ligatural entity, and the addition of ya-phalā and ā-kāra is used to elicit the æ sound, as in English 'acid' অ্যাসিড, 'association' অ্যাসোসিয়েশন, 'bat' ব্যাট, 'fat' ফ্যাট, 'mat' ম্যাট, 'cap' ক্যাপ etc.
The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as the 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in the orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (Cf. Unicode 10.0, p. 473 [100]).
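The deviant position of Hasanta in these sequences is easy to detect programmatically. A minimal sketch, assuming hypothetical names (`hasanta_after_vowel`, `ACID`, `BAT`) and treating U+0985..U+0994 as the range of independent vowel letters:

```python
HASANTA = "\u09CD"
# অ (U+0985) through ঔ (U+0994); unassigned slots in between are harmless here.
INDEPENDENT_VOWELS = range(0x0985, 0x0995)


def hasanta_after_vowel(text):
    """Return the indices of every Hasanta that directly follows an independent vowel."""
    return [
        i
        for i in range(1, len(text))
        if text[i] == HASANTA and ord(text[i - 1]) in INDEPENDENT_VOWELS
    ]


# 'acid': A + HASANTA + YA + AA + SA + I + DDA
ACID = "\u0985\u09CD\u09AF\u09BE\u09B8\u09BF\u09A1"
# 'bat': BA + HASANTA + YA + AA + TTA (ordinary ya-phalā after a consonant)
BAT = "\u09AC\u09CD\u09AF\u09BE\u099F"

if __name__ == "__main__":
    print(hasanta_after_vowel(ACID), hasanta_after_vowel(BAT))
```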
3.3.10 Formation of Ra-phalaa and Reph Sequences
This case refers to the formation of repha and ra-phalā as follows:
Ra-Hasanta = (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA, Unicode name BENGALI LETTER RA WITH MIDDLE DIAGONAL), and H is 09CD (্ - BENGALI SIGN VIRAMA).
Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র chakra "cycle"). The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or by Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases. The LGR makes a note of this point of concern with respect to the two RAs in disguise, as it would be completely impossible to distinguish between them with the naked eye in a label so generated, which may consequently lead to concerns related to spoofing and other kinds of cyber irregularities. The motive for classing these two code points as (blocking) variants is that fully rendered labels may mask the distinction between Bangla ra র (U+09B0) and Assamese ra ৰ (U+09F0). That provides the justification for Variant Set 4, though only in the context of a following Hasant. The difference between the RAs is only distinguishable if one looks into their Unicode values. Therefore labels such as অর্ক arka, শীর্ষ śīrṣa 'top, apex', অভ্র abhra 'cloud, the sky' and শ্রম śrama 'physical labour' could be extremely dangerous, as the web user may never verify the digital content (the labels) against its Unicode code points. This point is made explicitly with reference to Table 9 (of sequences, p. 36) and Table 16 (of WLE Symbols, p. 47) that are to follow. Moreover, it is noteworthy that the REPHA can also occur with KHANDA TA. The conditions in this context of KHANDA TA are liable to be such that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
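The collision between the two RAs can be illustrated with a simple comparison key. This is a sketch under an explicit assumption: it folds Assamese ra onto Bangla ra whenever the letter is directly adjacent to Hasanta (covering both repha and ra-phalā as described above). It is not the LGR's normative variant rule, and `variant_key` is a hypothetical helper.

```python
RA_BANGLA, RA_ASSAMESE, HASANTA = "\u09B0", "\u09F0", "\u09CD"


def variant_key(label):
    """Fold Assamese ra to Bangla ra wherever it touches a Hasanta."""
    out = []
    for i, ch in enumerate(label):
        prev_ch = label[i - 1] if i > 0 else ""
        next_ch = label[i + 1] if i + 1 < len(label) else ""
        if ch == RA_ASSAMESE and HASANTA in (prev_ch, next_ch):
            ch = RA_BANGLA
        out.append(ch)
    return "".join(out)


# arka "the sun", spelled with each of the two RAs (repha context).
ARKA_BN = "\u0985\u09B0\u09CD\u0995"
ARKA_AS = "\u0985\u09F0\u09CD\u0995"

if __name__ == "__main__":
    # Identical keys signal two labels that render alike.
    print(variant_key(ARKA_BN) == variant_key(ARKA_AS))
```

A registry could compare such keys to block a second label that renders identically to an existing one; a full-form ৰ that is not next to a Hasanta keeps its own identity under this assumption.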
4 Overall Development Process and Methodology
The Neo-Brāhmī Generation Panel (NBGP) has been formed by members having experience in Linguistics (especially in NLP and Computational Linguistics), Literature, Language History and Epigraphy. Under the Neo-Brāhmī Generation Panel, Bangla and eight other scripts belonging to separate Unicode blocks are being taken up to assign a separate LGR for each. However, an attempt is made to ensure that the fundamental philosophy behind building those LGRs is consistent with all other Brāhmī-derived scripts. The present LGR will cater to multiple languages belonging to EGIDS scale 1 to 4 (see Table 4) that use Bangla script. The following guiding principles are used in making decisions about Bangla LGR code points.
4.1 Guiding Principles
The NBGP adopts the following broad principles for selection of code points in the code-point repertoire, across the board, for all the Neo-Brāhmī scripts within its ambit.
4.1.1 Inclusion Principles
4.1.1.1 Modern Usage
Every character proposed should be in the everyday usage of a particular linguistic community. The characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code-point repertoire.
4.1.1.2 Unambiguous Use
Every character proposed should have an unambiguous understanding among linguists about its usage in the language.
4.2 Exclusion Principles
The main exclusion principle is that of External Limits on Scope. These consist of protocols or standards which are prerequisites to the Label Generation Rule-sets. All further principles are in fact subsumed under these limitations but have been spelt out separately for the sake of clarity.
4.2.1 External Limits of Scope
The code point repertoire for the root zone being a very special case at the top of the protocol hierarchies, the canvas of available characters for selection as a part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:
i. The Unicode Chart
Out of all the characters that are needed by the script in question, if a particular character is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, and especially so in the Bangla-Asamiyā-Manipuri Writing System, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.
ii. IDNA Protocol
Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded as far as possible all the characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters of the Unicode repertoire from being part of domain names.
iii. Maximal Starting Repertoire (MSR)
The Root-zone LGR being the repertoire of characters which are going to be used for creation of the Root-zone TLDs, which in turn constitute an even more specialized case of domain names, the ROOT LGR procedure introduces additional exclusions on IDNA's allowed set of characters.
Example: Bangla Sign Avagraha ঽ (U+09BD), even if allowed by the IDNA protocol, is not permitted in the Root Zone Repertoire as per the MSR.
To sum up, the restrictions start off with admitting only such characters as are part of the code block of the given script/language. The IDNA Protocol further narrows this down, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
4.2.1.1 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brāhmī-based scripts will not be included.
4.2.1.2 No Symbols and Abbreviations
Abbreviations, weights and measures, and other such iconic characters like BANGLA ISSHAR ৺ (U+09FA), BANGLA CURRENCY DENOMINATOR SIXTEEN ৹ (U+09F9) etc. will also not be included.
4.2.1.3 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, such as Sanskritic VOCALIC RR ৠ (U+09E0) and VOCALIC L ঌ (U+098C), as well as VOCALIC LL ৡ (U+09E1) and the allographic -kara forms of the latter two symbols, VOWEL SIGN VOCALIC L ৢ (U+09E2) and VOWEL SIGN VOCALIC LL ৣ (U+09E3). All such characters are excluded, which complies with the Conservatism principle as laid down in the Root Zone LGR procedure. However, in Bangla the -kara corresponding to VOCALIC RR ৠ (U+09E0), which is VOWEL SIGN VOCALIC RR ৄ (U+09C4), is still in active use in certain limited borrowed or Sanskritic words and is therefore retained.
4.2.1.4 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit will not be included. This is also in consonance with the Letter principle as laid down in the Root Zone LGR procedure.
4.2.1.5 ABNF
The Augmented Backus-Naur Formalism (ABNF) is described in Section 5.4.1 and Appendix (Section 10.1).
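The three successive filters of Section 4.2.1 can be sketched as a small pipeline. Everything below is an assumption made for illustration: the IDNA stage is approximated by "letters and marks only" via the Unicode general category (the real IDNA2008 derivation is more involved), and the MSR stage is a one-entry stand-in containing only the avagraha named in the text. The function name `first_failed_filter` is hypothetical.

```python
import unicodedata

BENGALI_BLOCK = range(0x0980, 0x0A00)

# Assumption: stand-in for the MSR exclusions. The text names only the
# avagraha (U+09BD) as allowed by IDNA but excluded by the MSR.
MSR_EXCLUDED = {0x09BD}


def first_failed_filter(cp):
    """Return the first filter rejecting `cp` ('unicode', 'idna', 'msr'), or None."""
    ch = chr(cp)
    # Filter i: must be an assigned character of the script's Unicode block.
    if cp not in BENGALI_BLOCK or unicodedata.name(ch, None) is None:
        return "unicode"
    # Filter ii: crude stand-in for IDNA eligibility (letters and marks only).
    if unicodedata.category(ch)[0] not in "LM":
        return "idna"
    # Filter iii: additional MSR exclusions on top of IDNA.
    if cp in MSR_EXCLUDED:
        return "msr"
    return None


if __name__ == "__main__":
    for cp in (0x0995, 0x09A9, 0x09E6, 0x09FA, 0x09BD):
        print(f"U+{cp:04X}", first_failed_filter(cp))
```

The ordering mirrors the text: a code point is only tested against a later filter if it survived the earlier ones.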
5 Repertoire
The Bangla Writing System is represented in UNICODE using the Bengali (Bangla) script name as enumerated in ISO 15924, corresponding to languages such as Asamiyā (Assamese), Bangla (Bengali) and Manipuri. The BENGALI block used for Bangla-Asamiyā-Manipuri in UNICODE has 93 entries. This section details the code-point repertoire that the Neo-Brāhmī Generation Panel [NBGP] proposes to be included in the Bangla LGR.
It may be mentioned here that the Government of Assam submitted a proposal to the Bureau of Indian Standards (BIS) on 26th February 2016 for dis-unification of the Bangla and Asamiyā scripts. The BIS, in the 8th Meeting of its Indian Language Technologies and Products Sectional Committee, LITD 20, held on 23rd Aug 2017, decided to refer the proposal for recognition of Assamese script in ISO/IEC 10646 to ISO. Until the UNICODE Consortium takes any further action, it will be assumed that the Code Point Repertoire under Table 11 will be valid for all the three languages as above.
For each of the code points, language references have been given in the last column, titled Reference, under Table 8, titled the "Code Point Repertoire". For entire coverage of Bangla code points, references for Bangla, Asamiyā (Assamese), Manipuri (Meitei) and Bishnupriya are given. Kokborok written in Bangla script is not known to have introduced many new complications, except for one particular character. Though only a few representative languages under EGIDS Scale 1-4 have been chosen for referencing, they together cover all the code points required for all the languages that NBGP has considered, as given under Bangla Unicode Points (as given in UNICODE 6.3).
However, before the details are presented, it is useful to look at the Bangla Code Point Chart from Maximal Starting Repertoire [MSR] Version 3. It may be noted that the shapes of the reference glyphs given below in the code charts are based on one of the many fonts designed and are not prescriptive, because there could be some variations in actual fonts, both UNICODE-compatible and True-Type ones. Consider the following code point table.
Figure 1: Bangla Code Page from [MSR] for Bangla-Asamiyā-Manipuri

Colour convention:
All characters that are included in the [MSR]: yellow background
PVALID in IDNA2008 but excluded from the [MSR]: pinkish background
Not PVALID in IDNA2008, or ineligible for the root zone (digits, hyphen): white background

Given the Bangla Unicode block as in Figure 1, among the code points that are included in the MSR the following symbols will need separate treatment:
ৎ  U+09CE  Bangla Letter Khanda-Ta
ৰ  U+09F0  Asamiyā-Bangla Letter Ra With Middle Diagonal
ৱ  U+09F1  Asamiyā-Bangla Letter Ra With Lower Diagonal
5.1 Code Point Repertoire Inclusion

No | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
1 | U+0981 | ঁ | BENGALI SIGN CANDRABINDU | Candra-bindu | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
2 | U+0982 | ং | BENGALI SIGN ANUSVARA | Onushshar (Anusvara) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
3 | U+0983 | ঃ | BENGALI SIGN VISARGA | Biśarga (Visarga) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
4 | U+0985 | অ | BENGALI LETTER A | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
5 | U+0986 | আ | BENGALI LETTER AA | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
6 | U+0987 | ই | BENGALI LETTER I | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
7 | U+0988 | ঈ | BENGALI LETTER II | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
8 | U+0989 | উ | BENGALI LETTER U | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
9 | U+098A | ঊ | BENGALI LETTER UU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
10 | U+098B | ঋ | BENGALI LETTER VOCALIC R | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
11 | U+098F | এ | BENGALI LETTER E | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
12 | U+0990 | ঐ | BENGALI LETTER AI | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
13 | U+0993 | ও | BENGALI LETTER O | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
14 | U+0994 | ঔ | BENGALI LETTER AU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
15 | U+0995 | ক | BENGALI LETTER KA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
16 | U+0996 | খ | BENGALI LETTER KHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
17 | U+0997 | গ | BENGALI LETTER GA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
18 | U+0998 | ঘ | BENGALI LETTER GHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
19 | U+0999 | ঙ | BENGALI LETTER NGA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
20 | U+099A | চ | BENGALI LETTER CA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
21 | U+099B | ছ | BENGALI LETTER CHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
22 | U+099C | জ | BENGALI LETTER JA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
23 | U+099D | ঝ | BENGALI LETTER JHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
24 | U+099E | ঞ | BENGALI LETTER NYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
25 | U+099F | ট | BENGALI LETTER TTA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
26 | U+09A0 | ঠ | BENGALI LETTER TTHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
27 | U+09A1 | ড | BENGALI LETTER DDA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
28 | U+09A1 U+09BC (U+09DC) | ড় | Normalized form of BENGALI LETTER RRA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. U+09DC is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development.
29 | U+09A2 | ঢ | BENGALI LETTER DDHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
30 | U+09A2 U+09BC (U+09DD) | ঢ় | Normalized form of BENGALI LETTER RHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. U+09DD is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development.
31 | U+09A3 | ণ | BENGALI LETTER NNA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
32 | U+09A4 | ত | BENGALI LETTER TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
33 | U+09A5 | থ | BENGALI LETTER THA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
34 | U+09A6 | দ | BENGALI LETTER DA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
35 | U+09A7 | ধ | BENGALI LETTER DHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
36 | U+09A8 | ন | BENGALI LETTER NA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
37 | U+09AA | প | BENGALI LETTER PA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
38 | U+09AB | ফ | BENGALI LETTER PHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
39 | U+09AC | ব | BENGALI LETTER BA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
43 | U+09AF U+09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. U+09DF is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development.
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112] [125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
50 | U+09BE | া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
51 | U+09BF | ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
52 | U+09C0 | ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
53 | U+09C1 | ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112] [122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (= Halant), Virama (= Da ri) | 1 Bangla, 2 Assamese, 2 Manipuri | [112] [122] [125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122] [125]

Table 8: Bangla Code-Point Repertoire
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes some specific sequences which enable conditional inclusion of Bangla LETTER A and LETTER E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, again followed by Bangla VOWEL SIGN AA, in the repertoire, to represent the æ sound as in English 'bat', 'cat', etc.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not an exhaustive list) | Reference
S1 | U+0985 U+09CD U+09AF U+09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112] [122]
S2 | U+098F U+09CD U+09AF U+09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]

Table 9: Sequences
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find place in Unicode but have not been included in the repertoire in the LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.

Sr. No. | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use

Table 10: Excluded Code Points
5.3 Code Point Not Used Alone
BENGALI SIGN NUKTA U+09BC (see Section 3.3.6) is excluded from the repertoire since it will never be used alone. It is used only as part of a sequence in three special characters, in the normalized forms of ড়, ঢ় and য়.

Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড়, ঢ় and য় respectively.

Table 10b: Excluded Code Points
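The relationship between the three precomposed letters and their normalized sequences can be checked mechanically. The following is a minimal Python sketch, assuming labels are handled in NFC (as IDNA2008 requires); the helper names `to_lgr_form` and `nukta_is_well_placed` are hypothetical illustrations and are not part of the LGR itself.

```python
import unicodedata

NUKTA = "\u09BC"
# Bases after which the nukta may appear (DDA, DDHA, YA), per Table 10b.
NUKTA_BASES = {"\u09A1", "\u09A2", "\u09AF"}


def to_lgr_form(label: str) -> str:
    """Normalize to NFC; U+09DC, U+09DD and U+09DF come out as base + nukta."""
    return unicodedata.normalize("NFC", label)


def nukta_is_well_placed(label: str) -> bool:
    """True if every nukta in the NFC form directly follows one of the three bases."""
    text = to_lgr_form(label)
    return all(
        i > 0 and text[i - 1] in NUKTA_BASES
        for i, ch in enumerate(text)
        if ch == NUKTA
    )
```

Because the precomposed letters are excluded from composition under NFC, a label typed with U+09DC, U+09DD or U+09DF and one typed with the explicit two-code-point sequence end up identical after normalization.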
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20/11/2009 and 18/07/2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) began with the following variables:

C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta

The Augmented Backus-Naur Formalism (ABNF) will use the following operators:

Sr. Number   Operator   Function
1            "|"        Alternative
2            "[ ]"      Optional
3            "*"        Variable Repetition
4            "( )"      Sequence Group

Table 11: The ABNF Formalism
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given to facilitate understanding.

5.4.2 The Vowel Sequence
To facilitate understanding for users of other Brāhmī scripts, equivalents in Devanāgarī are provided wherever necessary.

A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra/onuʃʃār (B), a Candrabindu (D) or a Visarga/biʃɔrgo (X). The number of D, B or X which can follow a V in Bangla is not necessarily restricted to one. Going by the rules illustrated in this document, it is clear that formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have sequences such as a candrabindu followed by an anusvāra on the one hand, and a candrabindu followed by a visarga on the other, as in হ্যাঁংচা 'hænchā' and হ্যাঁঃ 'hæn' respectively. In other words, the possibility of a Visarga or an Anusvāra (onuʃʃār) following a Candrabindu exists in Bangla.

A vowel can optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phala. Ya-phala is a presentation form of U+09AF য, the Bangla letter 'ya', represented by the sequence <U+09CD ্ BENGALI SIGN VIRAMA (Bangla sign Hasanta or Virama), U+09AF য BENGALI LETTER YA>. Ya-phala has a special form ্য. Again, when combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. 'aa' (ā)), it is used for transcribing [æ], as in the "a" in the English word "bat", written in Bangla as ব্যাট.

A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel

Example:
V     অ    अ
5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].

Examples:
VB     অং     अं
VD     অঁ     अँ
VX     অঃ     अः
VDB    অঁং    अँं
VDX    অঁঃ    अँः
VHCM   অ্যা, এ্যা
5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].

Examples:
VHCMB    অ্যাং, এ্যাং
VHCMD    অ্যাঁ, এ্যাঁ
VHCMX    অ্যাঃ, এ্যাঃ
VHCMDB   অ্যাঁং, এ্যাঁং
VHCMDX   অ্যাঁঃ, এ্যাঁঃ
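The vowel-sequence grammar of Sections 5.4.2.1 to 5.4.2.3 can be read as a single pattern, V [H C M] [B | D | X | DB | DX]. The sketch below is one possible rendering as a Python regular expression; the character classes are transcribed from the repertoire table, the function name `is_vowel_sequence` is hypothetical, and the further restriction of H C M to the two sequences S1 and S2 is deliberately not applied here.

```python
import re

# Character classes transcribed from the code point repertoire (an illustration only).
V = "[\u0985-\u098B\u098F\u0990\u0993\u0994]"
C = ("(?:[\u09A1\u09A2\u09AF]\u09BC"              # normalized RRA, RHA, YYA
     "|[\u0995-\u09A8\u09AA-\u09B0\u09B2\u09B6-\u09B9\u09F0\u09F1])")
M = "[\u09BE-\u09C4\u09C7\u09C8\u09CB\u09CC]"
H = "\u09CD"
# Optional tail: D, DB, DX, B or X (Candrabindu U+0981, Anusvara U+0982, Visarga U+0983).
TAIL = "(?:\u0981[\u0982\u0983]?|[\u0982\u0983])?"

VOWEL_SEQUENCE = re.compile(f"{V}(?:{H}{C}{M})?{TAIL}")


def is_vowel_sequence(text: str) -> bool:
    """True if the whole string is one vowel sequence in the sense of Section 5.4.2."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None
```

Writing the tail as "D optionally followed by B or X, or else a bare B or X" encodes the ordering constraint directly, so repeated signs such as VBB and reversed orders such as VBD fail to match.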
5.4.3 The Consonant Sequence

5.4.3.1 A Single Consonant (C)

Example:
C      ক      क

5.4.3.2 A Consonant with Conditions
A Consonant can optionally be followed by a dependent vowel sign, kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].

Examples:
CM     কি     कि
CB     কং     कं
CD     কঁ     कँ
CX     কঃ     कः
CH     ক্     क्  (pure consonant)
CDB    কঁং    कँं
CDX    কঁঃ    कँः
5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB or DX.
Example CMB কীংকং कक
CMD    কাঁ    काँ
CMX    বীঃ    वीः
CMDB   কাঁং   काँं
CMDX   কাঁঃ   काँः
5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):

*3(CH)C

Examples:
CHC        ন্ত      → ন + ্ + ত                    न + ् + त
CHCHC      ন্ত্র     → ন + ্ + ত + ্ + র             न + ् + त + ् + र
CHCHCHC    ন্ত্র্য    → ন + ্ + ত + ্ + র + ্ + য      न + ् + त + ् + र + ् + य
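The repetition rule *3(CH)C (at most three consonant-plus-Hasanta pairs followed by a final consonant) translates directly into a bounded repetition. The following Python sketch is only an illustration under that reading; the consonant class is transcribed from the repertoire table, Khanda Ta is treated as Z rather than C, and `is_consonant_cluster` is a hypothetical helper name.

```python
import re

# A consonant is either a normalized nukta form or a plain consonant letter.
CONS = ("(?:[\u09A1\u09A2\u09AF]\u09BC"
        "|[\u0995-\u09A8\u09AA-\u09B0\u09B2\u09B6-\u09B9\u09F0\u09F1])")
HASANTA = "\u09CD"

# *3(CH)C: zero to three (consonant + Hasanta) pairs, then one closing consonant.
CLUSTER = re.compile(f"(?:{CONS}{HASANTA}){{0,3}}{CONS}")


def is_consonant_cluster(text: str) -> bool:
    """True if the whole string is a cluster of one to four consonants joined by Hasanta."""
    return CLUSTER.fullmatch(text) is not None
```

A string ending in Hasanta (the CH case of Section 5.4.3.2) is intentionally rejected here, since this pattern covers only the bare cluster.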
5.4.3.5 Subsets
While considering its subsets, as a representative example we will consider the combination CHC only; however, the same is equally applicable to CHCHC and CHCHCHC.

[A] The combination may be followed by M, B, D, X, DB or DX.

Examples:
CHCM     ক্কী    → ক + ্ + ক + ী         क्की   → क + ् + क + ी
CHCB     ক্কং    → ক + ্ + ক + ং         क्कं   → क + ् + क + ं
CHCD     ক্কঁ    → ক + ্ + ক + ঁ         क्कँ   → क + ् + क + ँ
CHCX     ক্কঃ    → ক + ্ + ক + ঃ         क्कः   → क + ् + क + ः
CHCDB    ক্কঁং   → ক + ্ + ক + ঁ + ং     क्कँं  → क + ् + क + ँ + ं
CHCDX    ক্কঁঃ   → ক + ্ + ক + ঁ + ঃ     क्कँः  → क + ् + क + ँ + ः

[B] *3(CH)CM may further be followed by B, D, X, DB or DX.

Examples:
CHCMB    ক্কীং   → ক + ্ + ক + ী + ং     क्कीं  → क + ् + क + ी + ं
sup3ং rarr ক ক ং 4क rarr क क
CHCMD    ক্কাঁ   → ক + ্ + ক + া + ঁ         क्काँ   → क + ् + क + ा + ँ
CHCMX    ক্কীঃ   → ক + ্ + ক + ী + ঃ         क्कीः   → क + ् + क + ी + ः
CHCMDB   ক্কাঁং  → ক + ্ + ক + া + ঁ + ং     क्काँं  → क + ् + क + ा + ँ + ं
CHCMDX   ক্কাঁঃ  → ক + ্ + ক + া + ঁ + ঃ     क्काँः  → क + ् + क + ा + ँ + ः
5.4.4 The Khanda-Ta Sequence

5.4.4.1 A Single 'Khanda'-Ta (Z)

Example:
Z      ৎ

5.4.4.2 A Khanda Ta Combination (footnote 10)
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):

[CH]Z

Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) 'scolding'.

Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) can be described briefly here.

Let us take up S first. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ, BANGLA LETTER A, and U+098F এ, BANGLA LETTER E). For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF (ya) preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English 'bat', 'fat', etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).

The other case refers to the formation of repha and ra-phalā in the said script, mentioned in the table as P. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance:
repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun")
ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra "cycle")
The point is that in both cases the slot for ra could be filled by the Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.

10 Refer to Rule P in Section 7, Table 16.
6 Variants
This section talks about the variants in the Bangla script. The NBGP categorizes these confusing variants into two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community

For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formations. These can cause confusion even to a careful observer and are hence being proposed as variants.

Variants are generated in a script when two or more forms are formed with different storage or code points. In Bangla the e-kāra, ā-kāra and the o-kāra have different code points. One can type o with a consonant at one go, or type e-kāra and ā-kāra as two separate keys, and get the same result. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered a case of variant, because a kāra followed by a kāra is not allowed.

6.1 In-Script Variants
We do, however, propose two cases of true in-script variants in the Bangla script.

CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases wherein Hasanta with (U+09A5) থ (tha) appears as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, could be a little confusing, as the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final position (as shown below in example no. 1). It could also be typed wrongly, in the belief that it was a হ (ha) U+09B9, increasing the risks in label writing and identification.

Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)

The fonts which represent the traditional Bangla writing system could tend to create this problem. Therefore these may be taken as cases of variants in Bangla.

CASE II
Another interesting example of a variant is encountered in ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra). 'Repha' could be formed by two sequences (mainly because both Assamese and Bangla find place in the same Unicode points, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here the final ligatures look the same and will be as follows:

(1) B_RA + H + C
(2) A_RA + H + C

Where
B_RA = U+09B0 BENGALI LETTER RA (র), or
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)
Example:

Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ

Table 12: Example of Repha
Note: As the Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. Addition of Repha does not make any difference.

'Ra-phala' could be formed by two sequences on similar grounds, and the final ligatures would look the same:

(1) C1 + H + B_RA
(2) C1 + H + A_RA

Where C1 = any consonant except Khanda-ta

Example:

Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ

Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusability for end users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP concluded that র and ৰ should be defined as variant code points, so that a single variant set between র and ৰ covers all cases. This will, however, create blocked variant labels: e.g., if someone registers "ররর", the variant label "ৰৰৰ" will be generated as a variant and will be blocked, and vice versa. However, it is only blocked at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, it is still possible.
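The label-level blocking described above can be illustrated with a small Python sketch. It assumes, as is usual for code point variants, that every mixed spelling obtained by swapping র and ৰ at any position also counts as a variant of the registered label; that assumption and the helper names `variant_index` and `variant_labels` are illustrative and are not taken from the proposal.

```python
from itertools import product

# The single variant set discussed above: Bangla RA (U+09B0) and Assamese RA (U+09F0).
VARIANT_SET = ("\u09B0", "\u09F0")


def variant_index(label: str) -> str:
    """Map every member of the variant set to its first member; labels with
    the same index are variants of one another."""
    return "".join(VARIANT_SET[0] if ch in VARIANT_SET else ch for ch in label)


def variant_labels(label: str) -> set[str]:
    """All labels obtained by swapping RA forms position by position,
    excluding the label itself."""
    choices = [VARIANT_SET if ch in VARIANT_SET else (ch,) for ch in label]
    return {"".join(combo) for combo in product(*choices)} - {label}
```

Two labels collide only when their indexes are equal, which is why a label of a different length, such as ৰৰ against a registered ররর, remains available.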
6.2 Cross-Script Variants
A crisp cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia (footnote 11; formerly Oriya), keeping in mind the visual and technical confusions they may cause as labels in the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain other characters which are somewhat similar, they have not been included here. They have been provided in the Appendix (10.2) for reference.

11 Unicode uses Oriya for the script, although Odia is now the official term used.
1. Bangla and Nāgarī/Devanāgarī Script

Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F

Table 14: Bangla and Devanāgarī cross-script variant code points

2. Bangla and Gurmukhī Script

Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F

Table 15: Bangla and Gurmukhī cross-script variant code points
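Tables 14 and 15 amount to a small lookup from Bangla code points to their counterparts in the other two scripts. The Python sketch below shows one way such a table could be consulted; the dictionary layout and the helper `project` are hypothetical and only illustrate the mapping, not how the Root Zone LGR tooling processes it.

```python
# Cross-script variant code points transcribed from Tables 14 and 15.
CROSS_SCRIPT = {
    "\u09AE": {"Devanagari": "\u092E", "Gurmukhi": "\u0A38"},  # MA
    "\u09BF": {"Devanagari": "\u093F", "Gurmukhi": "\u0A3F"},  # VOWEL SIGN I
}


def project(label: str, script: str):
    """Return the label rewritten with the listed counterparts in `script`,
    or None if some code point has no listed counterpart."""
    out = []
    for ch in label:
        counterpart = CROSS_SCRIPT.get(ch, {}).get(script)
        if counterpart is None:
            return None
        out.append(counterpart)
    return "".join(out)
```

A whole-label cross-script collision is possible only for labels made up entirely of listed code points, which is why the function returns None as soon as one code point has no counterpart.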
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Bangla (footnote 12) script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications.

Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories as mentioned in the table provided in the code point repertoire (Section 5.1):

C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), or (æ) Ya-phalā (V1 H C1 M1), where
    V1 is either 0985 (অ, BENGALI LETTER A) or 098F (এ, BENGALI LETTER E)
    H is 09CD (্, BENGALI SIGN VIRAMA)
    C1 is 09AF (য, BENGALI LETTER YA)
    M1 is 09BE (া, BENGALI VOWEL SIGN AA)
    S1 and S2 are valid even if they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where
    C2 is either 09B0 (র, BENGALI LETTER RA) or 09F0 (ৰ, ASSAMESE LETTER RA; Unicode name BENGALI LETTER RA WITH MIDDLE DIAGONAL)
    H is 09CD (্, BENGALI SIGN VIRAMA)

Table 16: Symbols used in WLE rules

12 As used by Unicode, denoting and including both Assamese and Maṇipuri.
It is also worth mentioning here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters" that could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, p. 4]. More details of such combinations can be seen in Pabitra Sarkar (1993) [135].

7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ় and য়.
2. H must be preceded by C. Example: ক্
3. M must be preceded by C. Example: কা
4. D must be preceded by either of V, C or M. Examples: আঁ, খঁ, খাঁ, হাঁ
5. X must be preceded by either of V, C, M or D. Examples: উঃ, খঃ, বাঃ, দঁঃ
6. B must be preceded by either of V, C, M or D. Examples: আং, ইং, কং
7. Z must be preceded by V, C, M, D, B, X or P. Examples: ইৎ, কৎ, কাৎ, র্ৎ (S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V Preceded by H.
9. S CANNOT be preceded by H.
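The nine rules above are all "what may precede what" constraints, so they can be checked in a single left-to-right pass. The following Python sketch is one possible reading, not the normative XML ruleset: the character classes are transcribed from the repertoire table, S1/S2 are recognised as single units that behave like a trailing M, P is recognised as ra + Hasanta immediately before Z, and the names `tokenize` and `wle_valid` are hypothetical.

```python
def _span(first: int, last: int) -> set:
    return {chr(cp) for cp in range(first, last + 1)}

V = _span(0x0985, 0x098B) | {"\u098F", "\u0990", "\u0993", "\u0994"}
C = (_span(0x0995, 0x09A8) | _span(0x09AA, 0x09B0) | {"\u09B2"}
     | _span(0x09B6, 0x09B9) | {"\u09F0", "\u09F1"})
M = _span(0x09BE, 0x09C4) | {"\u09C7", "\u09C8", "\u09CB", "\u09CC"}
KIND = {ch: kind for kind, chars in (("V", V), ("C", C), ("M", M)) for ch in chars}
KIND.update({"\u0982": "B", "\u0981": "D", "\u0983": "X", "\u09CD": "H", "\u09CE": "Z"})

NUKTA, NUKTA_BASES = "\u09BC", {"\u09A1", "\u09A2", "\u09AF"}
RA = {"\u09B0", "\u09F0"}
S_SEQUENCES = ("\u0985\u09CD\u09AF\u09BE", "\u098F\u09CD\u09AF\u09BE")  # S1, S2

# Rules 2-7: the kinds allowed immediately before each dependent sign.
ALLOWED_BEFORE = {
    "H": {"C"}, "M": {"C"}, "D": {"V", "C", "M"},
    "X": {"V", "C", "M", "D"}, "B": {"V", "C", "M", "D"},
    "Z": {"V", "C", "M", "D", "B", "X", "P"},
}

def tokenize(label: str):
    """Split a label into (kind, text) units, or return None on an unknown code point."""
    tokens, i = [], 0
    while i < len(label):
        if label.startswith(S_SEQUENCES, i):
            tokens.append(("S", label[i:i + 4])); i += 4
        elif label[i] in NUKTA_BASES and label[i + 1:i + 2] == NUKTA:
            tokens.append(("C", label[i:i + 2])); i += 2   # rule 1: CN counts as C
        elif label[i] in KIND:
            tokens.append((KIND[label[i]], label[i])); i += 1
        else:
            return None
    return tokens

def wle_valid(label: str) -> bool:
    tokens = tokenize(label)
    if not tokens:
        return False
    prev = None
    for idx, (kind, _) in enumerate(tokens):
        if kind in ALLOWED_BEFORE:
            before = prev
            if kind == "Z" and prev == "H" and tokens[idx - 2][1] in RA:
                before = "P"                      # ra + Hasanta before Khanda Ta
            if before not in ALLOWED_BEFORE[kind]:
                return False
        elif kind in ("V", "S") and prev == "H":  # rules 8 and 9
            return False
        prev = "M" if kind == "S" else kind       # S ends with M
    return True
```

Recognising S before anything else is what lets অ্যা and এ্যা pass even though rule 2 would otherwise reject a Hasanta after a vowel; for example, `wle_valid("\u0985\u09CD")` is False while `wle_valid("\u0985\u09CD\u09AF\u09BE")` is True.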
Now let us elaborate the rules with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in admitting such ligatures: any user can easily create them, even though they may not be attested combinations.

7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া bæŋk ʌv ɪndiə (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE), meaning "Bank of India". This is a case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.

This is a unique situation necessitated by the lack of hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic communities; hence this is explicitly prohibited by the NBGP.

In future, if required, and depending on the prevailing requirements from the community, the future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules the ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.

1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Examples:
্ক     ्क
াক     ाक
ংক     ंक
ঁক     ँक
ঃক     ःक
As can be seen, such a combination will automatically result in a "golu", or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.

2. H is not permitted after V, B, D, X, M, S.
Examples:
অ্     अ्
অং্    कं्
কঁ্    कँ्
কঃ্    कः्
কি্    कि्

3. The number of B, D or X permitted after a Consonant or a Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalidated.
Examples:
কংং    कंं
কঁঁ    कँँ
কঃঃ    कःः
কাঁঁ   काँँ
কীঃঃ   कीःः
অংং    अंं
অঁঁ    अँँ
অঃঃ    अःः

4. The number of M permitted after a Consonant is restricted to one.
Example:
কীী    कीी

5. M is not permitted after V.
Examples:
ইা, ঈৌ    ईा, ईौ

6. The combinations Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Examples:
কংঃ    कंः
কঃং    कःं
8 Contributors

8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members

8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals' Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje & Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. 'Bangla Harapher Panch Parba'. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and Language Planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the MS University of Baroda, Social Science Number 292173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25/11/2017.
[124] Ethnologue: Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25/11/2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20/10/2019.
[126] Wikipedia: Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25/11/2017.
[129] Omniglot: Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10/5/2018.
[130] Wikipedia: Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25/11/2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20, accessed on 10/5/2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10/5/2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10/5/2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1123-45.
[136] DashNiladriShekharandBBChaudhuri1998BanglaScriptAStructuralStudyLinguisticsToday121-28Alsoavailableathttpswwwacademiaedu9967428Bangla_Script_A_Structural_Study
[137] DaniAhmedHasan(1957)lsquoSrīhatta-NāgarīLipirUtpattioBikāśrsquoBanglaAcademyPatrika(Dhaka)Vol12(Bhadra-Agrahayan1364BangabdaNumber)pg1
[138] WikipediaSylhetiNagari httpsenwikipediaorgwikiSylheti_Nagariaccessedon1952018
[139] FuruiRyosuke(2015)lsquoVariegatedAdaptationsStateFormationinBengalfromtheFifthtoSeventhCenturyrsquoinBhairabiPrasadSahuampHermannKulkeedsInterrogatingPoliticalSystemsIntegrativeProcessesandStatesinPre-ModernIndiaChapter9Pp255-73NewDelhiManohar
[140] FergusonCharesAandMunierChowdhury(1960)lsquoPhonesofBengalirsquoLanguageVol36No1pp22-59
[141] ShahidullahMuhammad(2007)BuddhistMysticSongsDhakaMowlaBrothers
[142] RayPunyaSloka(1966)BengaliLanguageHandbookWashington
[143] HaiMuhammadAbdul(1960)AphoneticandphonologicalstudyofnasalsandnasalizationinBengaliDhakaUniversityofDhaka
[144] UnicodeConsortiumProposalSummaryFormtoAccompanySubmissionsforAdditionstotheRepertoireofISOIEC10646UNICODEhttpswwwunicodeorgL2L200202387r-syloti-formpdfaccessedonMay212018
[145] WikipediaOlChiki(Unicodeblock)httpsenwikipediaorgwikiOl_Chiki_(Unicode_block)accessedonMay212018
[146]BanglaScripthttpwwwbangladesh2000combdbangla_scripthtmlaccessedonMay212018
[147] BhattacharyaAshutoshed(1942)GopichandrerGanCalcuttaCalcuttaUniversity
[149] DasSisirKumar(1975)SahibsandMunshisAnAccountoftheCollegeofFortWilliamCalcutta
[150] IslamRafiqulPabitraSarkarMahbubulHaqampRajibChakraborty(eds)(2014)BanglaAcademyPromitoBanglaByabaharikByakaran(AFunctionalGrammarofStandardBangla)DhakaBanglaAcademy
[151] SarkarPabitra[2013]lsquoBanglaSpellingReformtheLongandShortofItrsquoBanglaJournal19215-232
[152] BanglaAcademy(2012)BanglaAcademyPromitoBanglaBananerNiyam(StandardBanglaSpellingasadoptedbyBanglaAcademy)DhakaBanglaAcademy
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. “What has happened So Far: In terms of Script Reforms”. Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka, on 10.07.2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 – Core Specification, Chapter 12, p. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when applied to a specific language/script certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla¹³ in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold ‘varga’ (as defined under Table 5). Moreover, Bangla does not normally allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem ‘Lichuchor’ written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2. C H can come with Khaṇḍa ta only in the case where C is ra (র) (U+09B0): র্ৎ as in ভর্ৎসনা.
3. Only the following combinations with V H C M will be allowed:
→ অ্যা (together pronounced as æ) as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ) as in এ্যাসিড এ্যাসোসিয়েশান (acid association)
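The three restrictions above can be sketched as a small label check. This is only an illustration under stated assumptions: the function name `restriction_violations`, the rule labels, and the rough vowel test are hypothetical and are not taken from the LGR XML file; Khaṇḍa ta is assumed to be U+09CE.

```python
# Code points named in the restriction rules (Unicode Bengali block).
KHANDA_TA = "\u09CE"   # ৎ (assumed code point for Khanda Ta)
NGA = "\u0999"         # ঙ
NYA = "\u099E"         # ঞ
RA = "\u09B0"          # র
YA = "\u09AF"          # য
HASANTA = "\u09CD"     # ্
AA_SIGN = "\u09BE"     # া
LETTER_A = "\u0985"    # অ
LETTER_E = "\u098F"    # এ


def is_independent_vowel(ch: str) -> bool:
    """Rough test for an independent vowel letter (U+0985..U+0994)."""
    return "\u0985" <= ch <= "\u0994"


def restriction_violations(label: str) -> list[str]:
    """Return the (hypothetical) rule labels that `label` violates."""
    problems = []
    # Rule 1: ৎ, ঞ and ঙ may not start a label.
    if label[:1] in (KHANDA_TA, NYA, NGA):
        problems.append("rule 1")
    for i, ch in enumerate(label):
        if ch != HASANTA:
            continue
        prev = label[i - 1] if i > 0 else ""
        nxt = label[i + 1:i + 2]
        # Rule 2: C + H + ৎ is allowed only when C is র.
        if nxt == KHANDA_TA and prev != RA:
            problems.append("rule 2")
        # Rule 3: V + H is allowed only in অ্যা and এ্যা.
        if is_independent_vowel(prev):
            allowed = (prev in (LETTER_A, LETTER_E)
                       and label[i + 1:i + 3] == YA + AA_SIGN)
            if not allowed:
                problems.append("rule 3")
    return problems
```

For instance, ভর্ৎসনা and অ্যাসিড pass all three checks, while ৎসেরিং is flagged under rule 1, which matches the text's remark that such transliterated forms exist in practice but fall outside the rule.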
10.2 ‘Sylheti Nagarī lipi’ or ‘Siloti’
This version of Bangla script resembles the ‘Kaithī’ script (ISO 15924) used by the accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, and widely in use during the 1880s. There were several other names of Sylheti
13 This section specifically takes up restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri are not covered in this section.
Nagarī or Siloti (129), such as ‘Jalalabada Nagarī’, ‘Fula (flower) Nagarī’, ‘Muslim Nagarī’ or ‘Muhammad Nagarī’. It is said that Shah Jalala had brought the script with him in the 13th-14th Century to Sylhet (138), although some suggested that it was an invention by the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here (138).
Table 17 – The Script Table of Sylheti Nagarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī

Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable

Table 18: Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhi

Bangla | Gurmukhi | NBGP Decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable

Table 19: Bangla and Gurmukhi confusable code points

Bangla | Gurmukhi | NBGP Decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable

Table 20 – Bangla and Gurmukhī distinguishable code points
10.3.3 Bangla and Oriya (Odia)

Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable

Table 21 – Bangla and Oriya confusable code points

Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable

Table 22 – Bangla and Oriya distinguishable code points
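The decisions in Tables 18 to 22 can be held in a simple lookup, which makes it easy to cross-check the listed code points against their Unicode character names. This is a minimal sketch: the names `DECISIONS`, `script_of` and `decisions_for` are illustrative only, and the data simply restates the tables; none of these pairs is defined as a variant in the LGR.

```python
import unicodedata

# (Bangla text, other-script text) -> NBGP decision, restating Tables 18-22.
DECISIONS = {
    ("\u0983", "\u0903"): "Confusable",            # Devanagari
    ("\u0993", "\u0909"): "Confusable",
    ("\u0998", "\u0918"): "Confusable",
    ("\u0981", "\u0945"): "Confusable",
    ("\u0998", "\u0A2C"): "Confusable",            # Gurmukhi
    ("\u0981", "\u0A71"): "Confusable",
    ("\u0993", "\u0A24"): "Distinguishable",
    ("\u09B6", "\u0A05"): "Distinguishable",
    ("\u09AE", "\u0A2E"): "Distinguishable",
    ("\u09AC\u09BE", "\u0A17"): "Distinguishable",  # sequence বা
    ("\u0993", "\u0B13"): "Confusable",            # Oriya
    ("\u0998", "\u0B38"): "Distinguishable",
}


def script_of(text: str) -> str:
    """Script of the first code point, read off its Unicode character name."""
    return unicodedata.name(text[0]).split()[0].title()


def decisions_for(script: str) -> dict:
    """All table entries whose non-Bangla side belongs to `script`."""
    return {pair: verdict for pair, verdict in DECISIONS.items()
            if script_of(pair[1]) == script}
```

A call such as `decisions_for("Gurmukhi")` returns the six Gurmukhi rows, two confusable and four distinguishable.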
11 Appendix-II: Bengali consonants and their allographs
Consonants   Phonetic Value   Allographs: Clusters   Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ ট্ট (ট্+ট), ট্য (ট্+য), ট্ব (ট্+ব), ট্র (ট্+র), ক্ট (ক্+ট), ষ্ট (ষ্+ট)
ঠ ʈʰ ঠ্য (ঠ্+য), ণ্ঠ (ণ্+ঠ), ষ্ঠ (ষ্+ঠ)
ড ɖ ড্ড (ড্+ড), ড্য (ড্+য), ড্র (ড্+র)
ঢ ɖʱ ঢ্য (ঢ্+য), ণ্ঢ (ণ্+ঢ)
চ tʃ চ্চ (চ্+চ), চ্ছ (চ্+ছ), চ্ছ্র (চ্+ছ্+র), চ্ঞ (চ্+ঞ), চ্য (চ্+য), ঞ্চ (ঞ্+চ), শ্চ (শ্+চ)
(A+চ)
ছ tʃʰ ছ্র (ছ্+র), চ্ছ (চ্+ছ), ঞ্ছ (ঞ্+ছ), শ্ছ (শ্+ছ)
$ (A+ছ)
জ dʒ জ্জ (জ্+জ), জ্জ্ব (জ্+জ্+ব), জ্ঝ (জ্+ঝ), জ্ঞ (জ্+ঞ), জ্য (জ্+য), জ্র (জ্+র), ঞ্জ (ঞ্+জ)
(A+জ)
ঝ dʒʱ (not privileged enough to have clusters as a first member) জ্ঝ (জ্+ঝ), ঞ্ঝ (ঞ্+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (not privileged enough to have clusters as a first member) ঙ্খ (ঙ্+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ঘ্ন (ঘ্+ন), ঘ্য (ঘ্+য), ঘ্র (ঘ্+র), ঙ্ঘ (ঙ্+ঘ)
ঞ This letter does not have any particular phonetic value but is mostly pronounced as n
ঞ্চ (ঞ্+চ), ঞ্ছ (ঞ্+ছ), ঞ্জ (ঞ্+জ), ঞ্ঝ (ঞ্+ঝ), জ্ঞ (জ্+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ শ্চ (শ্+চ), শ্ছ (শ্+ছ), শ্ন (শ্+ন), শ্ম (শ্+ম), শ্র (শ্+র), শ্ল (শ্+ল), শ্য (শ্+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (not privileged enough to have clusters)
য dʒ The secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word it is a vowel æ (as in শ্যামল, র‍্যাপার, etc.). But after a non-initial consonant it just doubles it in pronunciation (as in কার্য, ধার্য, etc.). The র্+য combination has two physical manifestations: র‍্য and র্য.
ক্য (ক্+য), স্য (স্+য), র‍্য (র্+য) [Just র‍্য is never used in Bangla orthography; র‍্যা is, but then its last two symbols, ya-phala and ā-kāra, constitute a vowel sign representing the vowel অ্যা]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃকঃ
অব
variety of Calcutta (called ‘Kolkata’ now) made its first appearance through the Hutōm Pẽcāra Nakśā (1862) by Peari Chand Mitra. The influence of English in the vocabulary, idioms and expressions, as well as in the writing styles of Bangla, is significant by this time. The fonts and types for Bangla developed during this time also spread to all parts of the Bangla speech community [101, 120]. The same fonts, with some extensions, were also used for the neighbouring languages deploying this writing system.

Bangla prose had developed two literary styles during the 19th-20th Century: the Sādhubhāṣā (সাধুভাষা - Elegant Language or Style) and the Calitabhāṣā (চলিতভাষা - Current Language or Modern Style). It is the latter style that is prevalent today in written prose.

The Language Movement in Bangladesh (the then East Pakistan) began in 1948 as civil society dissented to the elimination of the Bangla script from currency and stamps, which had been in use since the British Raj. The movement reached its pinnacle in 1952 when, on 21 February, the police fired on demonstrating students and civilians, causing numerous injuries and deaths.² Later, following the Language Movement, on 27 April 1952 the All Party National Language Committee decided to demand the establishment of an organization for the promotion of the Bengali language. Bangla Academy, Dhaka, right from its inception in 1955, has been engaged in promoting and fostering Bangla as the lingua franca of the country, before and after independence from Pakistan in 1971. Through the various commissions and committees constituted by the Government of Bangladesh (Banladesa Jatīya Śiksa Kamisana in 1972, Jatīya Śiksa Upadesta Parisad in 1979, Banla Bhasa Bastabayana Sela in 1982, Bāṅlā Bhāṣā Kamiṭi in 1983, etc.³) after independence in 1971, Bangla was made the primary medium of instruction/communication in all Governmental and educational activities. Through a great struggle and bloodshed the Bengalis established Bangla as an official language of the state.⁴
2 The UN declared Ekuśe February (21st February) as the International Mother Language Day at the UNESCO General Conference in Paris on 17 November 1999, “in recognition of the sanctity and preservation of all vernacular languages in the world” 22
3 Bāṅlā Bhāṣā Kamiṭi. 1983. Bāṅlā Bhāṣā Kamiṭi Riporṭ (Report of the Bangla Bhasha Committee). Dhaka: Śikṣā, Dharma, Krīṛā O Saṅskṛti Mantraṇālaya, People's Republic of Bangladesh.
4 Chakraborty, Rajib. 2018. The Fishermen’s Community: A Language-Culture Interplay (A Study of Post-1971 Select Bangla Novels). Unpublished PhD Dissertation, Visva-Bharati.
3.1 Written Bangla
The ‘Bangla alphabet’ (বাংলা লিপি - Bānglā lipi, ISO 15924) is derived from the Brāhmī writing system, which is related to the Nāgarī (also known as Devanāgarī⁵) script [108] as well as to the Tirhutā writing system [106]. Considered to be the fifth most widely used writing system in the world, this combined Bangla-Asamiyā-Manipuri Script (showing some variations for Asamiyā and Meitei or Bisnupriya Manipuri) (130) was used in the eastern Indian Sanskrit manuscripts too. For Chakma in India and Bangladesh and for Kokborok in Tripura, it was and still is one of the scripts used. A close variant called Tirhutā (123; now available also in UNICODE 10.0 as 11480-114DF, see 110) or Mithilākṣara was used for Maithili from the 14th Century until the early 20th century [106]. In this context one finds a mention of ‘Sylheti Nagarī lipi’ or ‘Siloti’ (added to the Unicode Standard in March 2005 with the release of version 4.1), the details of which could be of interest only to historians and historical linguists (see 137 and 144). But Sylheti Bangla is now generally written by many in the modern-day Bangla script for all practical purposes.

Originally, during the reign of the Pāla dynasty (750-1154 AD) in eastern India, and perhaps even earlier during the Malla period (694 AD onwards), the present-day Bangla writing system got a shape comparable to the modern-day ones [111, 119]. A pictorial description of Brāhmī to Modern Bangla Script could be presented here in a tabular form:
Modern ক জ ম র স অ
k j m r s a
Table 1: Pictorial depiction of Evolution of Brāhmī to Bangla
5 William Dwight Whitney, in his Sanskrit Grammar, unequivocally said: “This name (Devanāgarī) is of doubtful origin and value” (Whitney, William Dwight. 1994 reprint. Sanskrit Grammar. New Delhi: Motilal Banarasidass Publishers, p. 1).
The inscriptional evidence in Brāhmī is found in the Archaic Brāhmī from the 3rd century BC to the 1st century BC, in Middle Brāhmī soon after (1st-3rd Century AD), and then in the Late Brāhmī (4th-6th Century AD). This evidence could be seen in both Bangladesh and West Bengal [108] in (1) the Mahasthanagara (Bogra district, Bangladesh; the ancient name being Pundranagara or Paundravardhanapura) inscriptions, (2) Brāhmī (and Kharoṣṭhī) inscriptions from the lower ‘Gangetic Bengal’, and (3) copper plate inscriptions of the Imperial Guptas from the northern part of West Bengal and North-West Bangladesh, in the areas under Dharmaditya, Gopachandra and Samācāradeva (about whom one only knows from five copper plates found in Kotalipara in the Faridpur district in Bangladesh, one in Mallasarul in the Burdwan district (West Bengal) and one in Jayramapura (Ballesvara district, now in Odisha)). These epigraphs from the eastern part of Undivided India (dating back to the 4th-6th Centuries AD) showed some characteristic features of letters (especially in ম ‘ma’, ল ‘la’, শ ‘sa’, স ‘sa’ and হ ‘ha’), which led to the development of the eastern variety of Gupta script.

Epigraphic records from Bangladesh demonstrate remarkable developments in Eastern Brāhmī. In this context one may note the Tippera copper plate inscription of the ‘Samatata’ rulers (139, pp. 265) such as Lokanātha (dated to the latter half of the 7th Century AD), the Kailan inscription of Śrīdharana Rāta, as well as the Astafpur copper plates. The letters seem to hang down from wedge-shaped solid triangles, with right-hand verticals bending down at the bottom, because of which it was described by Prinsep and Fleet as Kuṭila-lipi (literally ‘Cursive writing style’), whereas the term Siddhamātrikā (as a mātrā or bar is placed over each of the letters) was used by Al Biruni (973-1048) to designate the script of Northern India. The next stage of development is illustrated by the 9th Century copper plate inscriptions from Khalimpur of the reign of Dharmapāla, from Monghyr and Nalanda of the time of Devapāla in Bihar, and from Jagjīvanpura (Malda) of the reign of Mahendrapāla. The Siddhamātrikā (mentioned as ‘Siddham’ in Chinese sources) is said to have been prevalent also in this region up to the end of the tenth century. Also called the Gauri (i.e. Gandi) in Pūrvadeśā or the Eastern country, it was regarded as the same script to which is given the appellative Proto-Bangla characteristics in rudimentary forms in the period between AD 875 and AD 1025. In some epigraphs it is considered as belonging to the second quarter of the eleventh century AD. Flattening of head-marks becomes prominent in comparison to the wedge-shaped serifs. An important landmark in the development of the Bangla script is the Ramaganja copper plate inscription of Mahāmānḍalika in the last quarter of the eleventh century AD. It is the earliest document from this entire region which bears the letter m with a tick rising upwards. The full vowel i develops a tick at the right end of the upper horizontal bar above and a curved hook below. Initial e approaches the modern Bangla character. A mature form of Proto-Bangla, the immediate precursor of Bangla script, is illustrated in the inscriptions of the Varmana, Sena and Deva rulers of the twelfth and thirteenth centuries [104].

The evolution of the Bangla script (cf. 136) is aligned with the story of the advancement of printing technology. The first “movable type” scripts were technically created and used while printing Nathaniel Brassey Halhed's (1751-1830) 1778 book titled A Grammar of the Bengal Language. In 1785 Governor-General Warren Hastings (1732-1818) requested another civilian, Charles Wilkins (1749-1836), to cut punches for Bangla printing characters. The current printed form of Bangla script appeared soon after. It is generally agreed that Wilkins developed Bangla print script [111]. He passed on this knowledge to Pancanana Karmakara (-1804), a renowned artist in Bengal. Later it was Karmakar and his family that became famous in Bangla printing technology. Shepherd was another assistant of Wilkins in this designing of script, which became more angular with sharper turns and edges [133]. A few archaic letters were modernized during the 19th century. It was standardized by Pandit Ishwar Chandra Vidyasagar when the Bangla type fonts were to be used to publish on a large scale under the Calcutta School Book Society [116, for several references].

Much later, in 1935, the Linotype technique invented by Ottmar Mergenthaler (1854-1899) in 1886 was introduced into Bangla printing by the efforts of Suresh Chandra Majumdar (1888-1954), Rajsekhar Basu (1880-1960), Jatindra Kumar Sen (1882-1966) and his disciple Sushil Kumar Bhattacharya, and it began to be used by the Ānandabazara Patrika group, later followed by others. Within a few years the more advanced monotype technology came to be used in Bangla printing. However, in Bangla printing culture monotype had a very limited acceptance, and linotype held the stage till digital technology eventually came in to replace all earlier techniques. All these could be presented in a table:
PERIOD | DESCRIPTION | NAMES
3rd Century BC | Use of Brāhmī and Kharoṣṭhī scripts begins in the subcontinent. Brāhmī was widely used during the reign of the Mauryan King Aśoka. In one theory, Brāhmī is based on the North Semitic alphabet but suitably modified to fit the needs of local languages. It is currently believed to have been an independent development. | Brāhmī
1st-3rd Century AD | The Kusana script, named after the Kusana royal dynasty. | Kusana script
4th-5th Century AD | The next stage of its evolution was into the Gupta script, named after the Gupta royal dynasty. | Gupta script
7th Century AD | Epigraphic records from Bangladesh demonstrate remarkable developments in Eastern Brāhmī, giving rise to the Kuṭila-lipi. | Kuṭila-lipi
8th Century AD | Some copper plate inscriptions are found in Khalimpur, Bangladesh, during the reign of Dharmapāla, from Monghyr and Nālandā in Bihar of the time of Devapāla, and from Jagjīvanapura in West Bengal of the reign of Mahendrapāla. | Siddhamātrikā
9th Century AD until 1025 AD | Proto-Bangla characteristics in rudimentary forms develop. An important landmark in the development of the Bangla script is the Ramaganja copper plate inscription of Mahāmāndalika, found in the last quarter of the eleventh century AD. | Proto-Bangla Script & Language
12th-13th Century AD | A mature form of Proto-Bangla, the immediate precursor of Bangla script, is found in the inscriptions of the Varmana, Sena and Deva rulers of the twelfth and thirteenth centuries. | Matured Proto-Bangla
14th-15th Century AD | The characteristics of typical Bangla script began to develop, as could be seen in the copper plate inscription of Vijayamānikya-I of Tripura dated 1478 AD, which also illustrates forms of Bangla letters in the fifteenth century AD. | Modern Bangla Script era begins (See Ross 1999)
16th-17th Century AD | The China Monuments published from Amsterdam in 1667 and The Code of Gentoo Law published from London in 1776 both show a chart of the Bangla alphabet. They show 16 vowel letters, including the long ‘ৡ’ ‘l i’, Anusvāra and Visarga, and 34 consonants. | Printed Charts of Bangla
18th-19th Century AD | Charles Wilkins develops printing in Bangla in 1778 and Vidyasagar reforms it. | Bangla Type Fonts

Table 2: Development of the Bangla Writing System
The overall development of Bangla Script from the Kuṭila-lipi period to Modern Bangla could be seen here in Table 3 ([102 and 146]; also see the web-page in 147).
Table 3: Bangla Script in Different Centuries
3.2 Languages Considered
Below is the tabular representation of the languages using Bangla script that are placed on EGIDS Scale 1-6 (see 117 for details). Some languages under EGIDS 5 and 6 have also developed their own scripts for printing and publishing. Some had used Bangla script earlier (such as Bodo) or used it in West Bengal at some point of time (Santali), but have later shifted to another writing system. Bodo is now written in Nāgarī or Devanāgarī, and for Santali one uses both Nāgarī/Devanāgarī and Ol-chiki (145). For the purposes of the Bangla LGR, only languages belonging to EGIDS scale 1 to 4 have been considered. Consider the following table:
EGIDS Scale 1 | EGIDS Scale 2 | EGIDS Scale 3 | EGIDS Scale 4 | EGIDS Scale 5 | EGIDS 6
Bangla (Bengali)
Santali, Bodo, Riang, Khumi, Mru(ng), Asho
Lepcha, Pnar, Koda, Kora, Chak
Asamiyā (Assamese)
Koch or Rajabansī
Malto or Malpahariya
Manipuri or Meitei
Bisnupriya Manipuri, Kok-Borok (Tripura & Bangladesh)
Chakma, Hajong, Mundari & Kurux (of Bangladesh)
Toto, Rohingya, Tippera, Megam, Tanchangya
Usoi, Limbu, Sadri or Oraon
Bhumij or Mundari, Bawm, Chin
Table 4: Main languages in India and Bangladesh that use Bangla Script on the EGIDS Scale
3.3 Notable Features of Bangla Script [150]
The Bangla Writing System has certain features that show how it has to be written, or how type-setting in Bangla could be done. This section is followed by a section that explains the Code-points (and fixed Code-point sequences) which show certain distinctive characteristics of Bangla and which make up the Repertoire. The next sections will also cover the ‘akshar’-formation rules (ABNF), showing character classes, Whole Label Evaluation (WLE) and Context Rules, as well as In-Script and Cross-Script Variants. Here we present some basic features of the Script and Pronunciation:
• The Bangla script is an alpha-syllabic writing system in which all consonants are assumed to carry an accompanying ‘inherent’ vowel (theoretically before or after each consonant). It varies between ɔ and o depending on the position of the consonant in the word. At times these ‘assumed’ or ‘inherent’ vowels are not pronounced at all [142].
• Vowels can be written as independent letters, or by using a variety of diacritical marks which are written above, below, before or after the consonant they follow in pronunciation, or in both of the last two positions [105].
• All Bangla consonants, when pronounced in isolation, are uttered with an inherent vowel -ɔ; hence ক ‘k’, খ ‘kh’ or গ ‘g’ are usually pronounced as [kɔ], [khɔ] or [gɔ], etc. Phonologically, the Bangla vowel -ɔ corresponds to the Hindi schwa ə.
• When consonants occur together in clusters, special conjunct letters are formed. In printed Bangla many of these consonantal clusters or conjoined consonants are in use. The letters for the consonants other than the final one in the group are generally reduced. But there are a few special conjunct characters which are compounds of the consonant characters, e.g. ক্ (k) + ষ (s) = ক্ষ (ks), ঞ্ (n) + জ (j) = ঞ্জ (nj), জ্ (j) + ঞ (n) = জ্ঞ (jn), হ্ (h) + ম (m) = হ্ম (hm). There are other issues also: র as the second member of a cluster is reduced to a secondary symbol, e.g. প্ (p) + র (r) = প্র (pr), ষ্ (s) + ট্ (t) + র (r) = ষ্ট্র (str) (as in উষ্ট্র ustra “camel”). য (y), when used as a primary symbol, represents jɔ in Bangla. But its secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word it is a vowel æ (as in শ্যামল (syamala) “green”, র‍্যাপার (ryapara) “wrapper”, etc.). But after a non-initial consonant it just doubles it in pronunciation (as in কার্য, ধার্য, etc.). The র্ (r) + য (y) combination has two renderings: র‍্য (ry) and র্য (ry). In case of দ্ (d) + ধ (dh), গ্ (g) + ধ (dh), ন্ (n) + ধ (dh), the shape of the second member is changed, e.g. দ্ধ (ddh), গ্ধ (gdh) and ন্ধ (ndh) respectively. The solitary example of র্ (r) + ঋ (r) = র্ঋ (as in নৈর্ঋত nairrt, ‘Southwest’), used mostly in cases of Classical borrowings, shows the use of the secondary symbol of a consonant followed by the primary symbol of a vowel. The inherent vowel only applies to the final consonant of the cluster.
• In consonant clusters many consonants took a completely different form. Some typical examples are ক্ত (kt), ক্র (kr), ক্ষ (ks), গ্ধ (gdh), জ্ঞ (jn), ঞ্চ (nc), ঞ্জ (nj), ত্ত (tt), ন্ত (nt), ন্ধ (ndh), ব্ধ (bdh), ভ্র (bhr), ম্ব (mb), স্ত (st), etc. র has two allographs apart from its full shape: one is ‘repha’, as found in র্ক (rk), র্প (rp), and another is ra-phala, as in প্র (pr), ক্র (kr). ষ্ণ (s+n) is another one, where the cerebral nasal consonant sign takes a queer shape [151].
• The Bangla script has at least fifty-two primary symbols and quite a few allographs (positional variants of them) corresponding to forty-four phonemes (7 oral and 7 nasal vowels and 30 consonants) (150) or functional speech sounds, with some obvious redundancies, although in one of the first phonemic analyses the number was thought to be thirty-five phonemes [140].
• As mentioned above, in Bangla several graphemic symbols have secondary shapes, technically called ‘allographs’, with a complementary distribution in each case. These graphs or markings are generally added to the following positions of the primary symbol [113], in the following manner:
1) Below (e.g. কু (ku), ন্ত (nta), হ্র (hra), etc.)
2) Above (e.g. চ (ca), র্ক (rka), etc.)
3) Right side (e.g. কা (ka), কং (kan), etc.)
4) Left side (e.g. কে (ke))
5) Left side and above simultaneously (e.g. কৈ (kai), কি (ki), etc.)
6) Right side and above simultaneously (e.g. কী (kī))
7) Right side and left side simultaneously (e.g. কো (ko))
8) Right side, left side and above simultaneously (e.g. কৌ (kau))
• As for the complementary distribution of vowel letters (word- or syllable-initial) and Vowel Mātrās, which are relevant for ABNF, let us consider the following. Besides some simple Vowel Modifiers called ‘Kārs’ in Bangla (also referred to as Mātrā in the other LGR documents of Neo-Brāhmī), there are some combinatory modifiers of Bangla Vowels with certain consonants. For example, whereas
আ U+0986 BENGALI LETTER AA is substituted by া U+09BE BENGALI VOWEL SIGN AA,
ই U+0987 BENGALI LETTER I is substituted by pre-posed ি U+09BF BENGALI VOWEL SIGN I,
ঈ U+0988 BENGALI LETTER II is substituted by ী U+09C0 BENGALI VOWEL SIGN II, or
উ U+0989 BENGALI LETTER U is substituted by ু U+09C1 BENGALI VOWEL SIGN U by marking below the primary grapheme,
there are some special vowel modifiers of উ, as in the following combined letters:
গু gu, rather than writing as গ (g) + ু (u)
রু ru, rather than writing as র (r) + ু (u)
শু śu, rather than writing as শ (s) + ু (u)
হু hu, rather than writing as হ (h) + ু (u)
ন্তু ntu, rather than writing as ন্ (n) + ত (t) + ু (u)
Similarly, there could be vowel modifiers of ঊ or ‘(Long) ū’ as well, e.g. ভ্ (bh) + র (r) (ভ্রূ bhru “eyebrow”), শ্ (s) + র (r) (শ্রূ sru), ঋ (r) after হ (h) (হৃ hr), etc.
• There have been many notable contributions in simplifying and modifying Bangla spellings and combinatory techniques, especially by scholars such as Pabitra Sarkar (1992) [134]. In this there has been an attempt to reduce the number of allographs of both vowels and consonants in clusters, and it has been widely accepted in the printing of school texts in both Bangladesh and West Bengal [151, 152]. As of now, two systems, the old (traditional) and the new, go on side by side, operative in different domains.
However, in the preparation of this LGR document the aim has been to consider the widely used and usable sequences and combinations and their variations across the sister scripts belonging to the basket of Brāhmī writing systems.

Bangla Academy, Dhaka published Standard Bangla Spelling Rules in 1992, following the recommendations of a committee constituted through a workshop jointly organized by the Jatīya Śiksakrama and Pathyapustaka Board in 1988. A thoroughly revised edition of the Rules was published in September 2012.⁶ After the establishment of Banla Ākademi of West Bengal in 1986, its first President Annadasankar Ray (1904-2002), in his inaugural address, gave a direction for standardization of the Bangla alphabet/script and the spelling system, and clearly argued that they would not blindly follow the Sanskritic model of conventional grammar. A broad list of proposals was sent to experts on Bangla, and a broad agreement was reached for ‘homogenization of Bangla spelling’ by 1988. Based on opinions received from different quarters, a unanimous list of ‘rules’ was agreed upon. This was published as a ‘Spelling Dictionary’ titled Ākādemi Bānāna Abhidhāna (1997), which was obviously more comprehensive than ‘The University of Calcutta proposals’ made in 1936. Along with the ‘rationalization’ of spellings, another step was taken to make the writing system easier to read by making the symbols used, both single and combined ones, more ‘transparent’. These reforms were originally suggested by Sarkar (1987, first published in 1978) [134][153], where he used the terms Swaccha (‘Transparent’) and Aswaccha (‘Opaque’ or non-transparent), even adding Ardha Swaccha (‘half transparent’) in between the two. Some sample examples are:
6 Bangla Academy. 2012. Bāṅlā Ekaḍemī Pramita Bāṅlā Bānānera Niyama (Bangla Academy Standard Bangla Spelling Rules). Dhaka: Bangla Academy.
Transparent: ন্ন (nn), প্ত (pt), স্ত (st), where both members of the cluster can be recognized.
Opaque, where neither of the two could be (easily) recognized: ক্ষ ks (ক্ k + ষ s), জ্ঞ jn (জ্ j + ঞ n), ঙ্গ ng (ঙ্ n + গ g), হ্ম hm (হ্ h + ম m).
Semi-transparent: প্র (pr), র্প (rp), where one symbol is recognizable and the other is not. In case of three-term clusters at least one symbol will not be transparent, e.g. স্ত্র str (স্ s + ত্ t + র r), ষ্ট্র str (ষ্ s + ট্ t + র r), etc.
There were in fact two types of proposals. One concerned the shape of the letters: those of consonant + vowel (CV) combinations, and conjuncts, that is, consonant + consonant combinations. There were further complex shapes, i.e. those of consonant + consonant + (consonant +) vowel (CC(C)V) signs, as in প্রু (pru) or স্ক্রু (skru). Some decisions in this area were necessary because a few of the CC(C) symbols represented complexities that made learning them difficult for children. The other dealt with the spellings of words only, without any reference to the shapes of letters in which they were written. The basic objective here was ‘one word, one spelling’ to the greatest extent that was possible [151].
Below we place a statement of the most salient changes that affect the consonant + vowel combinations [153]:
a The variants of the short u (^ উ-কার hrasva u-kāra) vowel sign have been
brought down to one ie So zwnj (gu) is now গ Similarly h (ru) gt র zwnj
(śu)gt শ j (hu)gtহ and therefore cluster + short u sign k (ntu)gt W
(ন++ত+উ) (stu)gt[ (স++ত+উ)
b The variants of long u (দীঘH ঊ-কার dīrgha u-kāra) have also been reduced
(rū)gt র n (bhrū) gt Y (ভ bh++র r+ঊ ū) (drū)gt (দ d++র r+ঊ ū) p (śrū)gt
(শ ś++র r+ঊ ū)
c The variants of ঋ-কার (ṛ-kāra secondary symbol of ṛ) have been brought down to one q (hṛ) gt হ
Regarding consonant + consonant + (consonant)… + (vowel) clusters, Paschimbanga Bangla Akademi proposed transparent or semi-transparent shapes for clusters to the extent admissible in the Bangla writing system. Some examples will clarify the proposal. (A slash will mean that the traditional cluster-shape precedes it, while the Bangla Akademi innovation follows.) [153]
Xব ধ bdh ( b+ ধ dh) Mদ ধ ddh (J d+ধ dh) ন থ nth (L n+থ th) Uঞ চ ntildec
(9 ntilde+চ c) ঞ ছ $ ntildech (9+ছ) ঞ জ ntildej (9 ntilde+জ j) Sক ত amp kt (7 k+ত t) T
kr (7 k+র r) Nগ ধ ( gdh (K g+ধ dh) ) ṅk (u ṅ+ক k) t ṅg (u ṅ+গ g) +
ṣṇ (B ṣ+ণ ṇ) ন ndhr (L n+ dh+র r) - ṇḍr ( ṇ+ ḍ+র r) ktr (7 k+x
t+র r)
3.3.1 The Consonants
As per traditional classification, Bangla Consonants are categorized according to their phonetic properties, especially in terms of place and manner of articulation [107]. There are Five ‘Varga’ (pronounced as ‘Barga’ in Bangla) or Groups (sets or classes) distinguished by Place of Articulation, and one Non-‘varga’ group [105]. Each Varga, which corresponds to Stops at a certain place of articulation, contains a series of five consonants classified as per their phonetic qualities (i.e. manner of articulation), beginning from Unvoiced and Unaspirated to Voiced and Aspirated (in the fourth column), and finally ending with a Homorganic or Corresponding nasal [107]. Consider the following table:
‘Varga’ or Sets | Unvoiced -Asp | Unvoiced +Asp | Voiced -Asp | Voiced +Asp | Nasal
Velar | ক ‘K’ U+0995 | খ ‘KH’ U+0996 | গ ‘G’ U+0997 | ঘ ‘GH’ U+0998 | ঙ ‘NG’ U+0999
Palatal | চ ‘C’ U+099A | ছ ‘CH’ U+099B | জ ‘J’ U+099C | ঝ ‘JH’ U+099D | ঞ ‘NY’ U+099E
Retroflex | ট ‘TT’ U+099F | ঠ ‘TTH’ U+09A0 | ড ‘DD’ U+09A1 | ঢ ‘DDH’ U+09A2 | ণ ‘NN’ U+09A3
Dental | ত ‘T’ U+09A4 | থ ‘TH’ U+09A5 | দ ‘D’ U+09A6 | ধ ‘DH’ U+09A7 | ন ‘N’ U+09A8
Bilabial | প ‘P’ U+09AA | ফ ‘PH’ U+09AB | ব ‘B’ U+09AC | ভ ‘BH’ U+09AD | ম ‘M’ U+09AE

Table 5: Varga classification of Bangla consonants
(Falling into a pattern of five sets of Unvoiced Unaspirated, Unvoiced Aspirated, Voiced Unaspirated, Voiced Aspirated and Nasals, called five ‘Varga’)
Non-Varga | য ‘Y’ U+09AF | য় ‘YY’ U+09DF | র ‘R’ U+09B0 | ড় ‘RR’ U+09DC | ঢ় ‘RH’ U+09DD
Non-Varga | ল ‘L’ U+09B2 | শ ‘SH’ U+09B6 | ষ ‘SS’ U+09B7 | স ‘S’ U+09B8 | হ ‘H’ U+09B9

Table 6: Non-Varga consonants (not falling into any of the five categories)
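The classification in Tables 5 and 6 can be expressed as a small lookup. This is a minimal sketch, not part of the LGR: the names `VARGA_ROWS`, `COLUMNS`, `NON_VARGA` and `classify` are illustrative, and হ is taken as U+09B9, its code point in the Unicode Bengali block. An explicit table is used rather than code-point arithmetic because U+09A9 is unassigned, so the Bilabial row does not start five positions after the Dental row.

```python
# Rows follow Table 5: each string lists the five consonants of one varga.
VARGA_ROWS = {
    "Velar": "\u0995\u0996\u0997\u0998\u0999",
    "Palatal": "\u099A\u099B\u099C\u099D\u099E",
    "Retroflex": "\u099F\u09A0\u09A1\u09A2\u09A3",
    "Dental": "\u09A4\u09A5\u09A6\u09A7\u09A8",
    "Bilabial": "\u09AA\u09AB\u09AC\u09AD\u09AE",
}
COLUMNS = [
    "unvoiced unaspirated",
    "unvoiced aspirated",
    "voiced unaspirated",
    "voiced aspirated",
    "nasal",
]
# Table 6: consonants outside the five vargas.
NON_VARGA = "\u09AF\u09DF\u09B0\u09DC\u09DD\u09B2\u09B6\u09B7\u09B8\u09B9"


def classify(ch: str):
    """Return (varga, manner) for a consonant, or None if it is not listed."""
    if len(ch) != 1:
        return None
    for place, row in VARGA_ROWS.items():
        if ch in row:
            return (place, COLUMNS[row.index(ch)])
    if ch in NON_VARGA:
        return ("Non-Varga", None)
    return None
```

For example, `classify("ঘ")` yields `("Velar", "voiced aspirated")`, and any character outside the two tables, such as the vowel অ, yields `None`.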
3.3.2 The Implicit Vowel Killer: Hasanta (called ‘Halant’ or ‘Halanta’ in other Brāhmī-based scripts)
As stated earlier, all consonants are pronounced in isolation with an implicit vowel (central back -ɔ in Bangla as the neutral vowel) assumed to be associated with them [121]. The ‘Hasanta’ (= ‘Halant’ or ‘Halanta’ in other Brāhmī-based scripts) or the term ‘Virāma’⁷ (= ‘Dāṛi’ in Bangla), as preferred in UNICODE (cf. Unicode 3.0 and above), have been used in this report to denote the character that marks the absence of this inherent vowel. It may be noted that the term virāma has been adopted in UNICODE in a sense that is different from the traditional definition of grammar, and hence it requires some explanation here. Considering the importance of the document, this note should be a part of this LGR document so that anybody referring to it is able to know the proper grammatical explanation of the term. Because a special sign is needed whenever this implicit vowel is stripped off, the symbol is known as the Hasanta (= Halant) (U+09CD). By placing the Hasanta under the first consonant of a combination or cluster one could, in common parlance, “kill” its vowel and create conjuncts. In this manner conjunct characters can generally be written by joining two to four consonants. In rare cases this process can join up to five consonants. However, the notion of a maximum number of consonants joining to form one akṣara⁸ is to be bounded empirically. This is an observation based on the CIIL-Emille Corpora of Bangla words [132 & 133] as seen in print these days. Given the mixture of scripts and languages happening on the web, the possibility that one may want a generic Top Level Domain [gTLD] which has more than the observed maximum cannot be ruled out. This can be the case when a foreign language word which admits a large number of consonants is transliterated into Bangla. Hence in the Bangla LGR work this limit will not be enforced.
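To make the empirical observation concrete, the size of the largest conjunct in a label can be measured by counting consonants chained together by Hasanta (U+09CD). The following is an illustrative sketch only: `is_consonant` and `longest_conjunct` are hypothetical helper names, the consonant test is a rough range check, and, as stated above, the LGR itself does not enforce any such limit.

```python
import unicodedata

HASANTA = "\u09CD"


def is_consonant(ch: str) -> bool:
    """Rough test: assigned letters in U+0995..U+09B9 plus ড়, ঢ়, য়."""
    in_main_range = "\u0995" <= ch <= "\u09B9" and unicodedata.category(ch) == "Lo"
    return in_main_range or ch in ("\u09DC", "\u09DD", "\u09DF")


def longest_conjunct(label: str) -> int:
    """Largest number of consonants joined by Hasanta in `label`."""
    longest = 0
    run = 0
    for i, ch in enumerate(label):
        if is_consonant(ch):
            # Extend the run only when the previous character is a Hasanta
            # that links this consonant to the one before it.
            if run and label[i - 1] == HASANTA:
                run += 1
            else:
                run = 1
            longest = max(longest, run)
        elif ch != HASANTA:
            run = 0
    return longest
```

Applied to উষ্ট্র (ষ + ্ + ট + ্ + র after the initial vowel), the function reports a three-consonant conjunct; a word with no Hasanta, such as কলম, reports 1.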
3.3.3 Vowels
Separate symbols exist for all ‘Swara’ or vowels in Bangla, which are pronounced independently either at the beginning of a word or after another vowel or consonant sound. To indicate a vowel sound other than the implicit one, a vowel sign called ‘kār’
7 Virāma as used here is also a misnomer according to the Indian grammatical traditions. Nowhere is the mere absence of a vowel marked as virama. Hasanta just marks the absence of a vowel, nothing else. (Abhyankar, Kashinath Vasudev & J. M. Shukla. 1961. A Dictionary of Sanskrit Grammar. Baroda: Oriental Institute.)
8 This term needs to be disambiguated. Aksara also means ‘syllable’ in Indian grammatical traditions.
in Bangla, or Mātrā in Nāgarī⁹, is attached to the consonant. Since the consonant has this built-in neutral vowel at the end, there are equivalent kāras (Mātrās) for all vowels except অ (pronounced ɔ). The correlation is shown as follows:
Vowel | Corresponding vowel sign (kāra / Mātrā)
অ ‘A’ U+0985 | -
আ ‘AA’ U+0986 | া U+09BE
ই ‘I’ U+0987 | ি U+09BF
ঈ ‘II’ U+0988 | ী U+09C0
উ ‘U’ U+0989 | ু U+09C1
ঊ ‘UU’ U+098A | ূ U+09C2
ঋ Vocalic ‘R’ U+098B | ৃ U+09C3
ৠ Vocalic ‘RR’ U+09E0 | ৄ U+09C4
ঌ Vocalic ‘L’ U+098C | ৢ U+09E2
ৡ Vocalic ‘LL’ U+09E1 | ৣ U+09E3
এ ‘E’ U+098F | ে U+09C7
ঐ ‘AI’ U+0990 | ৈ U+09C8
ও ‘O’ U+0993 | ো U+09CB
ঔ ‘AU’ U+0994 | ৌ U+09CC
- | ৗ U+09D7
Could appear on top of অ ‘A’ U+0985 or any other vowel | ঁ U+0981 Candrabindu
Could appear after অ ‘A’ U+0985 or any other vowel | ং U+0982 Anusvara
Could appear after অ ‘A’ U+0985 or any other vowel | ঃ U+0983 Visarga
After any consonant | ্ U+09CD (Hasanta)
- | ঽ U+09BD Avagraha
Table 7: Bangla vowels with corresponding kārs
9 Although the term ‘Matra’ in Bangla stands for an altogether different concept, viz. the top bar placed over a letter, typically available in Hindi and Bangla but missing in Gujarati.
3.3.4 The Anusvāra / onuʃʃār (ং - U+0982)
The Anusvāra, or onuʃʃār in Bangla, at times represents a homorganic nasal, but not always. It replaces a conjunct group of ‘Nasal Consonant + Hasanta + Consonant’ where the second consonant belongs to the velar varga or set, as in লংকা. But it often also appears in such combinations involving non-velars as the last member of the combination, as in ল্যাংটা “naked” or ল্যাংচা “a kind of sweet; to limp”. Before a non-varga consonant, the Anusvāra represents a nasal sound that may have an alternative conjoined writing symbol representing the corresponding nasal consonant of the particular set. Although Modern Hindi, Marathi and Konkani prefer the anusvāra to the corresponding half-nasal, in Bangla it is clearly demarcated where one must use the Anusvāra and where it has to be a conjunct cluster with a nasal as the first or the second component.
3.3.5 Nasalization: Candrabindu (ঁ - U+0981)
Candrabindu denotes nasalization of the preceding vowel, as in চাঁদ cãd ‘moon’ (U+099A U+09BE U+0981 U+09A6). This sign, with a dot inside the half-moon mark, is used as a nasalization marker in many Brāhmī-based scripts [143].
3.3.6 Nukta (় - U+09BC)
The nukta sign does not exist in Bangla orthography. It is predominantly used in many Brāhmī-derived scripts such as Devanāgarī (for Hindi, Bodo, Maithili, Santali, Kashmiri and Sindhi). The term and the concept of nukta are borrowed in Bangla.
The IDNA Protocol (RFC 5891) states that IDNs must be in Unicode Normalization Form C (NFC). RFC 7940 applies this requirement to LGRs. The definition of NFC in the Unicode Standard contains a number of composition exclusions. As a result, the Bangla letters য় YYA, ড় RRA and ঢ় RHA have to be represented in this LGR by the sequences (YA + Nukta: U+09AF + U+09BC), (DDA + Nukta: U+09A1 + U+09BC) and (DDHA + Nukta: U+09A2 + U+09BC) instead of the single code points YYA (U+09DF), RRA (U+09DC) and RHA (U+09DD), although the use of ‘Nukta’ is otherwise completely unnatural in Bangla.
It is noted that in the current Unicode Standard chart these characters are listed as additional consonants. As per the LGR Procedure, however, these decisions depend on the IDNA Protocol, through a set of procedures developed by the IETF. Even though the Unicode Standard also provides these three characters as atomic characters (for example, 09DC for ড় [r], 09DD for ঢ় [rh] and 09DF for য় [y], each as a single keystroke), the IDNA protocol requires that we treat them as conjunct characters and use the codes allocated for their components in the Unicode Bengali block.
It may be noted that there could be sporadic attempts or cases of writing Muslim names, Urdu poetic words and Perso-Arabic loanwords with a nukta under ক (k), খ (kh), গ (g), জ (j) and ফ (ph), only for the sake of correct pronunciation and for maintaining the sanctity of the loanword. Such usage amounts to making the Bangla writing system work like the IPA script. It is, however, not in use in printed Bangla writing.
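The effect of the composition exclusions described above can be checked with a short sketch. It uses Python's standard `unicodedata` module; the helper name `to_nfc_codepoints` and the `NUKTA_FORMS` mapping are illustrative names introduced here, not part of the LGR.

```python
import unicodedata

# Precomposed letters and the Nukta sequences this LGR uses in their place.
NUKTA_FORMS = {
    "\u09DC": "\u09A1\u09BC",  # RRA -> DDA  + NUKTA
    "\u09DD": "\u09A2\u09BC",  # RHA -> DDHA + NUKTA
    "\u09DF": "\u09AF\u09BC",  # YYA -> YA   + NUKTA
}


def to_nfc_codepoints(text: str) -> list:
    """Return the code points of `text` after NFC normalization."""
    normalized = unicodedata.normalize("NFC", text)
    return [f"U+{ord(ch):04X}" for ch in normalized]


for single, sequence in NUKTA_FORMS.items():
    # NFC decomposes the precomposed letter and does not recompose it,
    # because the letter is a composition exclusion.
    assert unicodedata.normalize("NFC", single) == sequence
    print(f"U+{ord(single):04X} ->", to_nfc_codepoints(single))
```

Because a label must already be in NFC, the single code points can never appear in a valid label, which is why the repertoire lists the two-code-point sequences instead.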
3.3.7 Visarga / biʃɔrgo (ঃ - U+0983) and Avagraha (ঽ - U+09BD)
The Visarga / biʃɔrgo (U+0983) is frequently used in Bangla loanwords borrowed from Sanskrit and represents a sound very close to h. One could quote as an example দুঃখ duhkho “sorrow”, “unhappiness” (U+09A6 U+09C1 U+0983 U+0996).
The Avagraha ঽ (U+09BD) is mainly used in Sanskrit, Pali, Prakrit or Maithili texts written in Bangla. It is gradually being replaced by an upper comma (e.g. নরোঽপরাণি rewritten as নরো’পরাণি). It is rarely used now, even in other languages using the Bangla script. In the case of the LGR, the Avagraha is not part of the repertoire: it has been decided not to retain Avagraha (ঽ) (U+09BD) because it is blocked in TLDs as per the Maximal Starting Repertoire (MSR). Please see Appendix II in Section 11 for a complete list of Bangla consonants and their allographs.
3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)
This note is pertinent to the use of the Zero Width Joiner (ZWJ) and the Zero Width Non-Joiner (ZWNJ) in Bangla. It needs to be noted that Nepali, Konkani and Hindi use these two signs in a different manner.
ZWJ (U+200D) and ZWNJ (U+200C) are code points that have been provided by the Unicode Standard to instruct the rendering of a string where the script has the option between joining and non-joining characters. Without the use of these control codes, the string may be rendered in a form other than the one intended.
Use of ZWJ
• Insofar as Bangla is concerned, ZWJ is used for the proper rendering of characters such as khaṇḍa-ta ৎ, as in সত্যজিৎ (satyajit) “Satyajit” and সৎ (sat) “honest”. This is typed as follows: ta + Hasanta + ZWJ (U+200D).
• However, ZWJ is more important where the same combination of consonantal characters is represented differently depending upon the context. E.g. র + ্ + য has two representations in Bangla: as র্য (repha over ya) and as র‍্য (ra with ya-phalā). To get the form র্য one has to type in the following manner: র + ্ + য, but for র‍্য the sequence would be র + ZWJ + ্ + য [154]. In other words, ZWJ is used in the rendering of words demanding ya-phalā after ra, which is otherwise not possible to type (render) because the same order, ra + hasanta + antastha ja, is used in the medial and/or final position. Interestingly, ra + hasanta + antastha ja is used to type repha on the consonant antastha ja, as in কার্য (kaarjo). In order to get a ya-phalā after the consonant ra, it is therefore obligatory to use ZWJ after ra, as in র‍্যাপার (wrapper), র‍্যাশ (rash), র‍্যালি (rally), etc. The typing sequence is given below:
ra (র) + ZWJ + hasanta (্) + antastha ja (য) = র‍্য
Use of ZWNJ
• ZWNJ is used in Bangla to represent the explicit Hasanta or Halant. In order to avoid conjunct formation in cases where there is an explicit hasanta before the succeeding consonant, the ZWNJ is used:
Consonant + hasanta + ZWNJ + consonant = explicit hasanta
Example: প্রাক্‌কথন (prakkathana / prakkɔtʰon)
The use of ZWJ and ZWNJ has been ruled out from the root zone by the [Procedure]. Since they are used in Bangla to create alternate renderings, the insertion of these two signs can affect searching as well as NLP.
The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Hasanta) where default conjunct formation is to be explicitly restricted and the Hasanta joining the two consonants participating in the conjunct formation needs to be explicitly shown.
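The two typing sequences for ra + ya discussed in this section, and the consequence of excluding the join controls from the root zone, can be illustrated with a minimal sketch. The function and constant names are hypothetical and chosen only for this illustration.

```python
RA, YA, HASANTA = "\u09B0", "\u09AF", "\u09CD"
ZWNJ, ZWJ = "\u200C", "\u200D"

# Same consonants, two renderings: repha over ya versus ra with ya-phala.
REPHA_ON_YA = RA + HASANTA + YA
RA_WITH_YAPHALA = RA + ZWJ + HASANTA + YA


def has_join_controls(label: str) -> bool:
    """True if the label contains ZWJ or ZWNJ, which the root zone does not admit."""
    return ZWJ in label or ZWNJ in label


def strip_join_controls(label: str) -> str:
    """Remove ZWJ and ZWNJ; what remains is the only form a root-zone label can take."""
    return label.replace(ZWJ, "").replace(ZWNJ, "")


print(has_join_controls(RA_WITH_YAPHALA))
print(strip_join_controls(RA_WITH_YAPHALA) == REPHA_ON_YA)
```

Without the join controls, both renderings collapse to the same code point sequence, so a root-zone label containing ra + Hasanta + ya can only be displayed in its default form.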
3.3.9 Use of Ya-phalaa
Ya-phalaa sequences are the two instances in Bangla where Hasanta is preceded by a full vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E):
• অ্যা 0985 09CD 09AF 09BE: BENGALI LETTER A + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
• এ্যা 098F 09CD 09AF 09BE: BENGALI LETTER E + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD Hasanta plus U+09AF ya, preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound, as in English ‘acid’ অ্যাসিড, ‘association’ অ্যাসোসিয়েশন, ‘bat’ ব্যাট, ‘fat’ ফ্যাট, ‘mat’ ম্যাট, ‘cap’ ক্যাপ, etc.
The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a ‘vowel killer’, although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
3.3.10 Formation of Ra-phalaa and Ref Sequences
This case refers to the formation of repha and ra-phalā as follows:
Ra-Hasanta = (C2 H), where C2 is either
09B0 (র - BENGALI LETTER RA) or
09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL),
and H is 09CD (্ - BENGALI SIGN VIRAMA).
Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALĀ). For instance:
repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka “the sun”)
ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র chakra “cycle”)
The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases. The LGR makes a note of this point of concern with respect to the two RAs in disguise, as it would be completely impossible to distinguish between them with the naked eye in a label so generated, which may consequently lead to concerns related to spoofing and other kinds of cyber irregularities. The motive for classing these two code points as (blocking) variants is that fully rendered labels may mask the distinction between Bangla ra র (U+09B0) and Assamese ra ৰ (U+09F0). That provides the justification for Variant Set 4, though only in the context of a following Hasanta. The difference between the RAs is only distinguishable if one looks at their Unicode values. Therefore labels such as অর্ক arka, শীর্ষ śīrṣa ‘top, apex’, অভ্র abhra ‘cloud, the sky’ and শ্রম śrama ‘physical labour’ could be extremely dangerous, as the web user may never verify the digital content (the labels) against its Unicode code points. This point is made explicitly with reference to Table 9 (of sequences, p. 36) and Table 16 (of WLE Symbols, p. 47) that are to follow.
Moreover, it is noteworthy that the REPHA can also occur with KHANDA TA. The conditions in this context of KHANDA TA are such that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
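The concern about the two RAs can be made concrete with a small sketch that folds the Assamese ra onto the Bangla ra whenever it sits next to a Hasanta, and then compares labels. This is only an illustration under stated assumptions: the names `fold_ra` and `are_ra_variants` are hypothetical, and the context used here (Hasanta immediately after or before the ra) is a simplification that may be broader than the context rule of the normative LGR.

```python
import re

B_RA, A_RA, HASANTA = "\u09B0", "\u09F0", "\u09CD"

# Assamese ra followed by Hasanta (repha) or preceded by Hasanta (ra-phala).
_A_RA_NEXT_TO_HASANTA = re.compile(
    f"{A_RA}(?={HASANTA})|(?<={HASANTA}){A_RA}"
)


def fold_ra(label: str) -> str:
    """Replace Assamese ra by Bangla ra in repha and ra-phala contexts."""
    return _A_RA_NEXT_TO_HASANTA.sub(B_RA, label)


def are_ra_variants(a: str, b: str) -> bool:
    """True if two distinct labels differ only by the ra used next to a Hasanta."""
    return a != b and fold_ra(a) == fold_ra(b)


arka_bangla = "\u0985\u09B0\u09CD\u0995"    # a + ra (U+09B0) + hasanta + ka
arka_assamese = "\u0985\u09F0\u09CD\u0995"  # a + ra (U+09F0) + hasanta + ka
print(are_ra_variants(arka_bangla, arka_assamese))
```

The two `arka` labels differ in their code points but render with the same repha, which is the situation the blocking variant is meant to catch.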
4 Overall Development Process and Methodology
The Neo-Brāhmī Generation Panel (NBGP) has been formed by members having experience in Linguistics (especially in NLP and Computational Linguistics), Literature, Language History and Epigraphy. Under the Neo-Brāhmī Generation Panel, Bangla and eight other scripts belonging to separate Unicode blocks are being taken up to assign a separate LGR for each. However, an attempt is made to ensure that the fundamental philosophy behind building those LGRs is consistent across all Brāhmī-derived scripts. The present LGR will cater to multiple languages belonging to EGIDS scale 1 to 4 (see Table 4) that use the Bangla script. The following guiding principles are used in making decisions about Bangla LGR code points.
4.1 Guiding Principles
The NBGP adopts the following broad principles for selection of code points in the code-point repertoire across the board for all the Neo-Brāhmī scripts within its ambit.
4.1.1 Inclusion Principles
4.1.1.1 Modern Usage
Every character proposed should be in the everyday usage of a particular linguistic community. Characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code-point repertoire.
4.1.1.2 Unambiguous Use
Every character proposed should be understood unambiguously among linguists with respect to its usage in the language.
4.2 Exclusion Principles
The main exclusion principle is that of External Limits on Scope. These consist of protocols or standards which are prerequisites to the Label Generation Rulesets. All further principles are in fact subsumed under these limitations but have been spelt out separately for the sake of clarity.
4.2.1 External Limits of Scope
The code point repertoire for the root zone being a very special case at the top of the protocol hierarchies, the canvas of available characters for selection as part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:
i. The Unicode Chart
Out of all the characters that are needed by the script in question, if a particular character is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, and especially so in the Bangla-Asamiyā-Manipuri Writing System, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.
ii. IDNA Protocol
Unicode, being the character-encoding standard for providing the maximum possible representation of a given script/language, has encoded as far as possible all the characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters of the Unicode repertoire from being part of domain names.
iii. Maximal Starting Repertoire (MSR)
The Root Zone LGR being the repertoire of characters which are going to be used for the creation of Root Zone TLDs, which in turn constitute an even more specialized case of domain names, the Root LGR procedure introduces additional exclusions on IDNA’s allowed set of characters.
Example: the Bangla Sign Avagraha ঽ (U+09BD), even if allowed by the IDNA protocol, is not permitted in the Root Zone Repertoire as per the MSR.
To sum up, the restrictions start off with admitting only such characters as are part of the code block of the given script/language. The IDNA Protocol further narrows this down, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
4.2.1.1 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brāhmī-based scripts will not be included.
4.2.1.2 No Symbols and Abbreviations
Abbreviations, weights and measures and other such iconic characters like BANGLA ISSHAR (U+09FA), BANGLA CURRENCY DENOMINATOR SIXTEEN ৹ (U+09F9), etc. will also not be included.
4.2.1.3 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms such as Sanskritic VOCALIC RR ৠ (U+09E0) and VOCALIC L ঌ (U+098C), as well as VOCALIC LL ৡ (U+09E1), and the allographic kāra forms of the latter two symbols, VOWEL SIGN VOCALIC L ৢ (U+09E2) and VOWEL SIGN VOCALIC LL ৣ (U+09E3). All such characters are excluded, which complies with the Conservatism principle as laid down in the Root Zone LGR procedure. However, in Bangla the kāra corresponding to VOCALIC RR ৠ (U+09E0), which is VOWEL SIGN VOCALIC RR ৄ (U+09C4), is still in active use in certain limited borrowed or Sanskritic words and is therefore retained.
4.2.1.4 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit will not be included. This is also in consonance with the Letter principle as laid down in the Root Zone LGR procedure.
4.2.1.5 ABNF
The Augmented Backus-Naur Formalism (ABNF) is described in Section 5.4.1 and in the Appendix (Section 10.1).
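The three successive filters of Section 4.2.1 (Unicode, IDNA, MSR) can be pictured as a chain of membership checks. The sketch below is illustrative only: `IDNA_PVALID` and `MSR` are tiny hand-written subsets assumed for the example, not the real IDNA2008 or MSR tables, and `first_failed_filter` is a hypothetical helper. Only the Unicode check uses real data, through Python's `unicodedata` module.

```python
import unicodedata
from typing import Optional

# Assumed, illustrative subsets; the real sets come from IDNA2008 and the MSR.
IDNA_PVALID = {0x0995, 0x09BD}  # KA and AVAGRAHA; ISSHAR (a symbol) assumed excluded
MSR = {0x0995}                  # the text states that the MSR excludes U+09BD


def first_failed_filter(cp: int) -> Optional[str]:
    """Return the first filter that rejects `cp`, or None if it passes all three."""
    if unicodedata.name(chr(cp), None) is None:
        return "not encoded in Unicode"
    if cp not in IDNA_PVALID:
        return "excluded by IDNA"
    if cp not in MSR:
        return "excluded by the MSR"
    return None


for cp in (0x0995, 0x09BD, 0x09FA, 0x0984):
    print(f"U+{cp:04X}:", first_failed_filter(cp) or "eligible")
```

Each filter only narrows the set admitted by the previous one, so the order of the checks mirrors the order in which the text introduces the three layers.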
5 Repertoire
The Bangla Writing System is represented in Unicode using the Bengali (Bangla) script name as enumerated in ISO 15924, corresponding to languages such as Asamiyā (Assamese), Bangla (Bengali) and Manipuri. The BENGALI block used for Bangla-Asamiyā-Manipuri in Unicode has 93 entries. This section details the code-point repertoire that the Neo-Brāhmī Generation Panel [NBGP] proposes to be included in the Bangla LGR.
It may be mentioned here that the Government of Assam submitted a proposal to the Bureau of Indian Standards (BIS) on 26th February 2016 for dis-unification of the Bangla and Asamiyā scripts. The BIS, in the 8th Meeting of the Indian Language Technologies and Products Sectional Committee, LITD 20, held on 23rd Aug 2017, decided to refer the proposal for recognition of the Assamese script in ISO/IEC 10646 to ISO. Until the Unicode Consortium takes any further action, it will be assumed that the Code Point Repertoire under Table 8 will be valid for all three languages above.
For each of the code points, language references have been given in the last column, titled References, of Table 8, titled the “Code Point Repertoire”. For entire coverage of Bangla code points, references for Bangla, Asamiyā (Assamese), Manipuri (Meitei) and Bishnupriya are given. Kokborok, written in the Bangla script, is not known to have introduced many new complications, except for one particular character. Though only a few representative languages under EGIDS Scale 1-4 have been chosen for referencing, they together cover all the code points required for all the languages that the NBGP has considered, as given under Bangla Unicode Points (as given in Unicode 6.3).
However, before the details are presented, it is ideal to look at the Bangla Code Point Chart from the Maximal Starting Repertoire [MSR] Version 3. It may be noted that the shapes of the reference glyphs given below in the code charts are based on one of the many fonts designed and are not prescriptive, because there could be some variations in actual fonts, both Unicode-compatible and TrueType ones. Consider the following code point table:
Figure 1: Bangla Code Page from [MSR] for Bangla-Asamiyā-Manipuri
Colour convention:
All characters that are included in the [MSR] - Yellow background
PVALID in IDNA 2008 but excluded from the [MSR] - Pinkish background
Not PVALID in IDNA 2008, or ineligible for the root zone (digits, hyphen) - White background
Given the Bangla Unicode Block as in Figure 1, among the code points that are included in the MSR, the following symbols will need separate treatment:
ৎ U+09CE Bangla Letter Khanda-Ta
ৰ U+09F0 Asamiyā-Bangla Letter Ra With Middle Diagonal
ৱ U+09F1 Asamiyā-Bangla Letter Ra With Lower Diagonal
5.1 Code Point Repertoire Inclusion
No | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
1 | U+0981 | ঁ | BENGALI SIGN CANDRABINDU | Candra-bindu | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
2 | U+0982 | ং | BENGALI SIGN ANUSVARA | Onushshar (Anusvara) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
3 | U+0983 | ঃ | BENGALI SIGN VISARGA | Biśarga (Visarga) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
4 | U+0985 | অ | BENGALI LETTER A | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
5 | U+0986 | আ | BENGALI LETTER AA | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
6 | U+0987 | ই | BENGALI LETTER I | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
7 | U+0988 | ঈ | BENGALI LETTER II | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
8 | U+0989 | উ | BENGALI LETTER U | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
9 | U+098A | ঊ | BENGALI LETTER UU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
10 | U+098B | ঋ | BENGALI LETTER VOCALIC R | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
11 | U+098F | এ | BENGALI LETTER E | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
12 | U+0990 | ঐ | BENGALI LETTER AI | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
13 | U+0993 | ও | BENGALI LETTER O | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
14 | U+0994 | ঔ | BENGALI LETTER AU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
15 | U+0995 | ক | BENGALI LETTER KA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
16 | U+0996 | খ | BENGALI LETTER KHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
17 | U+0997 | গ | BENGALI LETTER GA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
18 | U+0998 | ঘ | BENGALI LETTER GHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
19 | U+0999 | ঙ | BENGALI LETTER NGA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
20 | U+099A | চ | BENGALI LETTER CA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
21 | U+099B | ছ | BENGALI LETTER CHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
22 | U+099C | জ | BENGALI LETTER JA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
23 | U+099D | ঝ | BENGALI LETTER JHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
24 | U+099E | ঞ | BENGALI LETTER NYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
25 | U+099F | ট | BENGALI LETTER TTA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
26 | U+09A0 | ঠ | BENGALI LETTER TTHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
27 | U+09A1 | ড | BENGALI LETTER DDA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
28 | 09A1 09BC (U+09DC) | ড় | Normalized form of BENGALI LETTER RRA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. 09DC is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
29 | U+09A2 | ঢ | BENGALI LETTER DDHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
30 | 09A2 09BC (U+09DD) | ঢ় | Normalized form of BENGALI LETTER RHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. 09DD is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
31 | U+09A3 | ণ | BENGALI LETTER NNA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
32 | U+09A4 | ত | BENGALI LETTER TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
33 | U+09A5 | থ | BENGALI LETTER THA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
34 | U+09A6 | দ | BENGALI LETTER DA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
35 | U+09A7 | ধ | BENGALI LETTER DHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
36 | U+09A8 | ন | BENGALI LETTER NA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
37 | U+09AA | প | BENGALI LETTER PA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
38 | U+09AB | ফ | BENGALI LETTER PHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
39 | U+09AC | ব | BENGALI LETTER BA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
43 | 09AF 09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. 09DF is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112] [125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
50 | U+09BE | া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
51 | U+09BF | ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
52 | U+09C0 | ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
53 | U+09C1 | ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112] [122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (= Halant), Virama (= Dāri) | 1 Bangla, 2 Assamese, 2 Manipuri | [112] [122] [125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122] [125]
Table 8: Bangla Code-Point Repertoire
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes some specific sequences, which enable conditional inclusion of the Bangla LETTER A and LETTER E followed by the Bangla SIGN VIRAMA and the Bangla LETTER YA, followed in turn by the Bangla VOWEL SIGN AA, in the repertoire, to represent the æ sound as in English ‘bat’, ‘cat’, etc.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
S1 | 0985 09CD 09AF 09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112] [122]
S2 | 098F 09CD 09AF 09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9: Sequences
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find a place in Unicode but have not been included in the repertoire in this LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.
Sr. No. | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10: Excluded Code Points
5.3 Code Point Not Used Alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire since it will never be used alone. It will be used only in sequences, in the normalized forms of the three special characters ড়, ঢ় and য়.
Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড়, ঢ় and য় respectively
Table 10b: Excluded Code Points
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20.11.2009 and 18.07.2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) begins with the following variables:
C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta
The Augmented Backus-Naur Formalism (ABNF) will use the following operators:
Sr. Number    Operator    Function
1             “|”         Alternative
2             “[ ]”       Optional
3             “*”         Variable Repetition
4             “( )”       Sequence Group
Table 11: The ABNF Formalism
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given.
5.4.2 The Vowel Sequence
To facilitate understanding for users of other Brāhmī scripts, equivalents in Devanāgarī are provided wherever necessary.
A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra / onuʃʃār (B), a Candrabindu (D) or a Visarga / biʃɔrgo (X). The number of D, B or X marks which can follow a V in Bangla is not restricted to one; however, going by the rules illustrated in this document, formations such as VDD, VBB and VXX are invalid orthographic units. It is valid and possible to have an anusvāra together with a candrabindu on the one hand, and a visarga together with a candrabindu on the other, as in হ্যাঁংচা ‘hænchā’ and হ্যাঁঃ ‘hæn’ respectively. The possibility of a Visarga or Anusvāra (onuʃʃār) following a Candrabindu thus exists in Bangla.
A vowel can optionally be followed by a combination of Hasanta / Virama [H] and Consonant [C] to form a Ya-phala. Ya-phala is a presentation form of U+09AF য BENGALI LETTER YA. Represented by the sequence <U+09CD ্ BENGALI SIGN VIRAMA, U+09AF য BENGALI LETTER YA>, Ya-phala has a special form ্য. When combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. ‘aa’ (ā)), it is used for transcribing [æ] as in the “a” in the English word “bat”, written in Bangla as ব্যাট.
A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example:
V: অ (अ)
5.4.2.2 A Vowel with Conditions
A vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples:
VB: অং (अं)
VD: অঁ (अँ)
VX: অঃ (अः)
VDB: অঁং (अँं)
VDX: অঁঃ (अँः)
VHCM: অ্যা, এ্যা
5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
VHCMB: অ্যাং, এ্যাং
VHCMD: অ্যাঁ, এ্যাঁ
VHCMX: অ্যাঃ, এ্যাঃ
VHCMDB: অ্যাঁং, এ্যাঁং
VHCMDX: অ্যাঁঃ, এ্যাঁঃ
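The vowel-sequence rules above can be approximated with a regular expression. This is a minimal sketch, not the normative LGR rule set: the vowel class is taken from Table 8, the VHCM part is restricted to the two sequences S1 and S2 of Table 9 (an assumption made here), and names such as `VOWEL_SEQUENCE` and `is_vowel_sequence` are introduced only for illustration.

```python
import re

V = "[\u0985-\u098B\u098F\u0990\u0993\u0994]"  # independent vowels of Table 8
B, D, X = "\u0982", "\u0981", "\u0983"         # anusvara, candrabindu, visarga
S1 = "\u0985\u09CD\u09AF\u09BE"                # A + VIRAMA + YA + AA
S2 = "\u098F\u09CD\u09AF\u09BE"                # E + VIRAMA + YA + AA

# Optional trailing modifier: B, D, X, DB or DX (never BB, DD, XX or BD).
MODIFIER = f"(?:{D}[{B}{X}]?|[{B}{X}])?"

VOWEL_SEQUENCE = re.compile(f"(?:{S1}|{S2}|{V}){MODIFIER}")


def is_vowel_sequence(text: str) -> bool:
    """True if `text` is V or VHCM (S1/S2 only), optionally followed by B, D, X, DB or DX."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None


print(is_vowel_sequence("\u0985\u0981\u0982"))  # VDB
print(is_vowel_sequence("\u0985\u0982\u0982"))  # VBB
```

Listing the five allowed modifier combinations explicitly, rather than allowing any repetition of B, D and X, is what rejects doubled marks such as VBB.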
5.4.3 The Consonant Sequence
5.4.3.1 A Single Consonant (C)
Example:
C: ক (क)
5.4.3.2 A Consonant with Conditions
A consonant optionally followed by a dependent vowel sign kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
CM: কি (कि)
CB: কং (कं)
CD: কঁ (कँ)
CX: কঃ (कः)
CH: ক্ (क्) (pure consonant)
CDB: কঁং (कँं)
CDX: কঁঃ (कँः)
5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB or DX.
Examples:
CMB: কীং (कीं)
CMD: কাঁ (काँ)
CMX: বীঃ (वीः)
CMDB: কাঁং (काँं)
CMDX: কাঁঃ (काँः)
5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Examples:
CHC: ন্ত → ন + ্ + ত (न्त → न + ् + त)
CHCHC: ন্ত্র → ন + ্ + ত + ্ + র (न्त्र → न + ् + त + ् + र)
CHCHCHC: ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য (न्त्र्य → न + ् + त + ् + र + ् + य)
5.4.3.5 Subsets
While considering its subsets, we take the combination CHC as a representative example; however, the same is equally applicable to CHCHC and CHCHCHC.
[A] The combination may be followed by M, B, D, X, DB or DX.
Examples:
CHCM: ক্কী → ক + ্ + ক + ী (क्की → क + ् + क + ी)
CHCB: ক্কং → ক + ্ + ক + ং (क्कं → क + ् + क + ं)
CHCD: ক্কঁ → ক + ্ + ক + ঁ (क्कँ → क + ् + क + ँ)
CHCX: ক্কঃ → ক + ্ + ক + ঃ (क्कः → क + ् + क + ः)
CHCDB: ক্কঁং → ক + ্ + ক + ঁ + ং (क्कँं → क + ् + क + ँ + ं)
CHCDX: ক্কঁঃ → ক + ্ + ক + ঁ + ঃ (क्कँः → क + ् + क + ँ + ः)
[B] *3(CH)CM may further be followed by B, D, X, DB or DX.
Examples:
CHCMB: ক্কীং → ক + ্ + ক + ী + ং (क्कीं → क + ् + क + ी + ं)
CHCMD: ক্কাঁ → ক + ্ + ক + া + ঁ (क्काँ → क + ् + क + ा + ँ)
CHCMX: ক্কীঃ → ক + ্ + ক + ী + ঃ (क्कीः → क + ् + क + ी + ः)
CHCMDB: ক্কাঁং → ক + ্ + ক + া + ঁ + ং (क्काँं → क + ् + क + ा + ँ + ं)
CHCMDX: ক্কাঁঃ → ক + ্ + ক + া + ঁ + ঃ (क्काँः → क + ् + क + ा + ँ + ः)
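The consonant-sequence patterns of Section 5.4.3 can likewise be sketched as a regular expression. The character classes are read off Table 8 (with the three Nukta sequences treated as single consonants), and the four-consonant bound follows the `*3(CH)C` rule quoted above. Names such as `CONSONANT_SEQUENCE` are hypothetical, and the sketch does not attempt to model the whole-label rules of Section 7.

```python
import re

H = "\u09CD"                                  # Hasanta / Virama
B, D, X = "\u0982", "\u0981", "\u0983"        # anusvara, candrabindu, visarga
M = "[\u09BE-\u09C4\u09C7\u09C8\u09CB\u09CC]"  # vowel signs of Table 8
# Consonants of Table 8; DDA/DDHA/YA + NUKTA count as one consonant each.
C = (
    "(?:[\u09A1\u09A2\u09AF]\u09BC"
    "|[\u0995-\u09A8\u09AA-\u09B0\u09B2\u09B6-\u09B9\u09F0\u09F1])"
)
MODIFIER = f"(?:{D}[{B}{X}]?|[{B}{X}])?"       # B, D, X, DB or DX

# *3(CH)C, ending either in a bare Hasanta or in [M] plus an optional modifier.
CONSONANT_SEQUENCE = re.compile(
    f"(?:{C}{H}){{0,3}}{C}(?:{H}|{M}?{MODIFIER})"
)


def is_consonant_sequence(text: str) -> bool:
    """True if `text` matches the consonant-sequence pattern sketched above."""
    return CONSONANT_SEQUENCE.fullmatch(text) is not None


print(is_consonant_sequence("\u09A8\u09CD\u09A4\u09CD\u09B0"))  # CHCHC
print(is_consonant_sequence("\u0995\u09C0\u09BE"))              # CMM
```

The second call is rejected because a kāra followed by another kāra is not allowed, which matches the restriction the document relies on when discussing o-kāra in Section 6.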
5.4.4 The Khanda-Ta Sequence
5.4.4.1 A Single ‘Khanda’-Ta (Z)
Example:
Z: ৎ
5.4.4.2 A Khanda Ta Combination¹⁰
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) ‘scolding’
Note: The conditions in this context of KHANDA TA are that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
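The Khanda Ta rule, including the note that the preceding consonant must be one of the two RAs, reduces to a very small pattern. The sketch below is an illustration only; `KHANDA_TA_SEQUENCE` and `is_khanda_ta_sequence` are hypothetical names.

```python
import re

# [CH]Z, where C is restricted to Bangla ra (U+09B0) or Assamese ra (U+09F0).
KHANDA_TA_SEQUENCE = re.compile("(?:[\u09B0\u09F0]\u09CD)?\u09CE")


def is_khanda_ta_sequence(text: str) -> bool:
    """True for a bare Khanda Ta or for ra + Hasanta + Khanda Ta."""
    return KHANDA_TA_SEQUENCE.fullmatch(text) is not None


print(is_khanda_ta_sequence("\u09B0\u09CD\u09CE"))  # ra + hasanta + khanda ta
print(is_khanda_ta_sequence("\u0995\u09CD\u09CE"))  # ka + hasanta + khanda ta
```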
5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) can be described briefly here. Let us take up S first. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ - BANGLA LETTER A and U+098F এ - BANGLA LETTER E). For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF ya, preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English ‘bat’, ‘fat’, etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a ‘vowel killer’, although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
Another case refers to the formation of repha and ra-phalā in the said script, mentioned in the table above as P. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALĀ). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka “the sun”); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra “cycle”). The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.
10 Refer to Rule P in Section 7, Table 16.
6 Variants
This section discusses the variants in the Bangla script. The NBGP categorizes these confusing variants into two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from the character formations normally perceived by the larger linguistic community
For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formation. These can cause confusion even to a careful observer and are hence being proposed as variants.
Variants are generated in a script when two or more forms are produced from different stored code points. In Bangla, the e-kāra, ā-kāra and o-kāra have different code points. One can type o with a consonant in one go, or obtain the same result by typing e-kāra and ā-kāra as two separate keys. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered a case of variants, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants
However, we propose two cases of true in-script variants in Bangla script.

CASE I
As far as true variants in Bangla are concerned, we may draw our attention to cases wherein Hasanta with (U+09A5) থ (tha) appears as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1 স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2 ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, could be a little confusing where the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final position (as shown below in e.g. no. 1). It could be typed wrongly as well, thinking it was a হ (ha) U+09B9, increasing the risks in label writing and identification.
Examples:
1 স্থ and স্হ (as in স্থান sthāna, স্থূল sthūla, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2 ন্থ and ন্হ (as in গ্রন্থ grantha)
The fonts which represent the traditional Bangla writing system could tend to create this problem. Therefore these may be taken as cases of variants in Bangla.

CASE II
Another interesting example of variant is encountered in ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra). 'Repha' could be formed by two sequences (mainly because both Assamese and Bangla share the same Unicode block, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here the final ligatures look the same and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C
Where
B_RA = U+09B0 BENGALI LETTER RA (র)
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)
Example
Sequence 1 (Using Bangla RA)          Ligature 1    Sequence 2 (Using Assamese RA)        Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক)      র্ক           U+09F0 (ৰ) U+09CD (্) U+0995 (ক)      ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ)      র্ঠ           U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ)      ৰ্ঠ
Table 12: Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. Addition of Repha does not make any difference.

'Ra-phala' could be formed by two sequences on similar grounds, and the final ligatures would look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
Where C1 = any consonant except Khanda-ta
Example
Sequence 1 (Using Bangla RA)          Ligature 1    Sequence 2 (Using Assamese RA)        Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র)      ক্র           U+0995 (ক) U+09CD (্) U+09F0 (ৰ)      ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র)      ন্র           U+09A8 (ন) U+09CD (্) U+09F0 (ৰ)      ন্ৰ
Table 13: Example of Ra-phala
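The two pairs of sequences in Tables 12 and 13 can be written out as a short Python sketch. The helper names `repha` and `ra_phala` are hypothetical and only illustrate the code point order described above; they are not part of the LGR itself.

```python
# Code points taken from the definitions of B_RA, A_RA and H above.
B_RA = "\u09B0"  # র  BENGALI LETTER RA
A_RA = "\u09F0"  # ৰ  BENGALI LETTER RA WITH MIDDLE DIAGONAL
H = "\u09CD"     # ্  BENGALI SIGN VIRAMA (Hasanta)


def repha(consonant, ra=B_RA):
    """Repha: ra + Hasanta + C (hypothetical helper)."""
    return ra + H + consonant


def ra_phala(consonant, ra=B_RA):
    """Ra-phala: C1 + Hasanta + ra (hypothetical helper)."""
    return consonant + H + ra


KA = "\u0995"  # ক
# Same rendered shape is expected, but the stored code points differ.
print([f"U+{ord(c):04X}" for c in repha(KA)])
print([f"U+{ord(c):04X}" for c in repha(KA, A_RA)])
```

The two printed lists differ only in their first entry (U+09B0 versus U+09F0), which is the storage-level difference that the variant definition has to account for.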
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusability to the end-users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP concluded to define র and ৰ as variant code points, where only one variant set between র and ৰ could cover all cases. But this will create blocked variant labels: e.g. if someone registers "ররর", the variant label "ৰৰৰ" will be generated as a variant and will be blocked, and vice versa. However, it is only blocked at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, it is still possible.
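The blocking behaviour described above can be sketched in Python as follows. This is a minimal illustration, assuming the usual convention that every mix of the two variant code points counts as a variant label; the function names are hypothetical and do not come from any LGR tool.

```python
from itertools import product

B_RA = "\u09B0"  # র
A_RA = "\u09F0"  # ৰ

# One variant set: each code point maps to itself and to the other one.
VARIANTS = {B_RA: (B_RA, A_RA), A_RA: (A_RA, B_RA)}


def variant_labels(label):
    """All variant labels of `label`, excluding the label itself."""
    choices = [VARIANTS.get(ch, (ch,)) for ch in label]
    return {"".join(combo) for combo in product(*choices)} - {label}


def is_blocked(candidate, registered):
    """True if `candidate` is a variant of any already registered label."""
    return any(candidate in variant_labels(r) for r in registered)


registered = {B_RA * 3}                   # someone registered "ররর"
print(is_blocked(A_RA * 3, registered))   # the variant label "ৰৰৰ"
print(is_blocked(A_RA * 2, registered))   # a different label, "ৰৰ"
```

Under these assumptions the first check reports the label as blocked and the second does not, matching the label-level blocking described in the text.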
6.2 Cross-Script Variants
A crisp cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia11 (formerly Oriya), keeping in mind the visual and technical confusions they may cause as labels on the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain characters which are somewhat similar, they have not been included here. They have been provided in the Appendix (10.3) for reference.
11 Unicode uses Oriya for the script although Odia is now the official term used
1 Bangla and Nāgarī/Devanāgarī Script
Bangla        Devanāgarī
ম U+09AE      म U+092E
ি U+09BF      ि U+093F
Table 14 - Bangla and Devanāgarī cross-script variant code points

2 Bangla and Gurmukhī Script
Bangla        Gurmukhī
ম U+09AE      ਸ U+0A38
ি U+09BF      ਿ U+0A3F
Table 15 - Bangla and Gurmukhī cross-script variant code points
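Tables 14 and 15 can be read as a small lookup from Bangla code points to their cross-script counterparts. The sketch below is only an illustration: the dictionary layout and the function name `cross_script_variant` are assumptions, and a label is treated as having a cross-script variant only when every one of its code points has a mapping.

```python
# Mappings copied from Tables 14 and 15.
CROSS_SCRIPT = {
    "\u09AE": {"Devanagari": "\u092E", "Gurmukhi": "\u0A38"},  # ম -> म / ਸ
    "\u09BF": {"Devanagari": "\u093F", "Gurmukhi": "\u0A3F"},  # ি -> ि / ਿ
}


def cross_script_variant(label, script):
    """Return the variant label in `script`, or None if any code point
    of `label` has no cross-script variant (hypothetical helper)."""
    mapped = []
    for ch in label:
        target = CROSS_SCRIPT.get(ch, {}).get(script)
        if target is None:
            return None
        mapped.append(target)
    return "".join(mapped)


label = "\u09AE\u09BF"  # মি
for script in ("Devanagari", "Gurmukhi"):
    variant = cross_script_variant(label, script)
    print(script, [f"U+{ord(c):04X}" for c in variant])
```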
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in section 3.2 when written in Bangla12 Script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications. Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories, as mentioned in the table provided in the Code point repertoire (Section 5.1).

C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9) or (æ) Ya-phalā (V1 H C1 M1), where V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E); H is 09CD (্ - BENGALI SIGN VIRAMA); C1 is 09AF (য - BENGALI LETTER YA); M1 is 09BE (া - BENGALI VOWEL SIGN AA). S1 and S2 are valid even if they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL); H is 09CD (্ - BENGALI SIGN VIRAMA)
Table 16 - Symbols used in WLE rules

12 As used by the Unicode, denoting and including both Assamese and Maṇipuri
It is also perhaps ideal to mention here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters" that could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, Pg 4]. More details of such combinations could be seen in Pabitra Sarkar (1993) [135].

7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and various restrictions, below are the specific WLE rules that need to be implemented:
1 C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়
2 H must be preceded by C
3 M must be preceded by C. Example: কা
4 D must be preceded by either of V, C or M. Example: আঁ, খঁ, খাঁ, হাঁ
5 X must be preceded by either of V, C, M or D. Example: উঃ, খঃ, বাঃ, দঁঃ
6 B must be preceded by either of V, C, M or D. Example: আং, ইং, কং
7 Z must be preceded by V, C, M, D, B, X or P. Example: ইৎ, কৎ (S is not listed because S ends with M; Z may also follow S)
8 V CANNOT be preceded by H. Details in 7.1.1: Case of V preceded by H
9 S CANNOT be preceded by H
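The nine rules above can be sketched as a small Python checker. This is only an illustration under stated assumptions: the character classes below are an assumed subset of the Bengali block (the normative repertoire is the one in Section 5.1), the S1 and S2 sequences from Table 9 are not modelled, and the names `classify` and `check_label` are hypothetical.

```python
import re

# Assumed character classes (illustrative subset, not the normative repertoire).
V = set("\u0985\u0986\u0987\u0988\u0989\u098A\u098B\u098F\u0990\u0993\u0994")
C = ({chr(c) for c in range(0x0995, 0x09A9)}      # ক .. ন
     | {chr(c) for c in range(0x09AA, 0x09B1)}    # প .. র
     | {"\u09B2"}                                 # ল
     | {chr(c) for c in range(0x09B6, 0x09BA)}    # শ ষ স হ
     | {"\u09F0", "\u09F1"})                      # ৰ ৱ
M = set("\u09BE\u09BF\u09C0\u09C1\u09C2\u09C3\u09C7\u09C8\u09CB\u09CC")
SINGLE = {"\u0982": "B", "\u0981": "D", "\u0983": "X",
          "\u09CD": "H", "\u09CE": "Z"}
RA = {"\u09B0", "\u09F0"}

# Tokens: S (অ্যা / এ্যা), CN (ড় ঢ় য় as base + nukta), else one code point.
TOKEN = re.compile(
    "[\u0985\u098F]\u09CD\u09AF\u09BE|[\u09A1\u09A2\u09AF]\u09BC|.", re.S)

# Rules 2-7: which classes may immediately precede each symbol.
ALLOWED_BEFORE = {
    "H": {"C"},
    "M": {"C"},
    "D": {"V", "C", "M"},
    "X": {"V", "C", "M", "D"},
    "B": {"V", "C", "M", "D"},
    "Z": {"V", "C", "M", "D", "B", "X"},  # P is handled separately below
}


def classify(tok):
    if len(tok) == 4:
        return "S"
    if len(tok) == 2 or tok in C:
        return "C"
    if tok in V:
        return "V"
    if tok in M:
        return "M"
    return SINGLE.get(tok, "?")


def check_label(label):
    """Return a list of rule violations (an empty list means none found)."""
    toks = TOKEN.findall(label)
    classes = [classify(t) for t in toks]
    errors = []
    for i, cls in enumerate(classes):
        prev = classes[i - 1] if i > 0 else None
        # S ends with M, so it is treated as M for what may follow it.
        prev_ctx = "M" if prev == "S" else prev
        if cls == "?":
            errors.append(f"position {i}: code point outside the sketch")
        elif cls in ALLOWED_BEFORE:
            ok = prev_ctx in ALLOWED_BEFORE[cls]
            # Rule 7: Z may follow P, i.e. ra + Hasanta.
            if cls == "Z" and prev == "H" and i >= 2 and toks[i - 2] in RA:
                ok = True
            if not ok:
                errors.append(f"position {i}: {cls} not allowed after {prev}")
        elif cls in ("V", "S") and prev == "H":
            errors.append(f"position {i}: {cls} preceded by H (rules 8-9)")
    return errors


print(check_label("\u0995\u09BE"))        # কা : C M
print(check_label("\u0995\u09CD\u0987"))  # C H V
```

Because H, M, D, X, B and Z all require a preceding symbol, the same loop also rejects labels that begin with any of them.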
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in adding such ligatures, because any user can easily create them even if they are not attested combinations.

Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া /bæŋk ʌv ɪndiə/ (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE), meaning "Bank of India". This is the case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic communities; hence this is explicitly prohibited by the NBGP.

In future, if required, depending on the prevailing requirements from the community, the future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules ABNF puts in place. These are just given for reference purposes and are not meant to be comprehensive.

1 H, M, B, D or X cannot occur in the beginning of a Bangla word. Example:
্ক  ्क
াক  ाक
ংক  ंक
ঁক  ँक
ঃক  ःक
As can be seen, such combinations will result automatically in a "golu" or a dotted circle, marking them as invalid formations. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.
2 H is not permitted after V, B, D, X, M, S
Example
অ अ
অং क ক क কঃ कः ি )क
3 Number of B, D or X permitted after Consonant or Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalidated:
Example
কংং क
ক क
কঃঃ कःः
কা का
কীঃঃ कःः
অংং अ
অ अ
অঃঃ अःः
4 Number of M permitted after Consonant is restricted to one. Example:
কীী  कीी
5 M is not permitted after V. Example:
ইা ঈৌ  ईा ईौ
6 The combinations of Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible
Example
কংঃ  कंः
কঃং  कःं
8 Contributors
8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members

8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals' Systems, Dhaka
9 References[100] UnicodeConsortium2017UnicodeStandard100MountainViewCA
[101] BandyopadhyayChittaranjan1981DuiShatakerBanglaMudranoPrakashanKolkataAnandaPublishers
[102] BanerjiRD1919TheOriginoftheBengaliScriptKolkataNewDelhiAsianEducationalServices2003reprint
[103] ChatterjiSK1926TheOriginandDevelopmentoftheBengaliLanguageCalcuttaCalcuttaUniversityPress
[104] -----1939Bhasha-prakashBangalaVyakaran(AGrammaroftheBengaliLanguage)CalcuttaUniversityofCalcutta
[105] HaiMuhammadAbdul1964DhvaniVijnanOBanglaDhvani-tattwa(PhoneticsandBengaliPhonology)DhakaBanglaAcademy
[106]JhaSubhadra1958TheFormationofMaithiliLondonLuzacampCo
[107] KosticDjordjeDasRheaS1972AShortOutlineofBengaliPhoneticsCalcuttaStatisticalPublishingCompany
[108] MajumdarRC1971HistoryofAncientBengalCalcuttaGBhardwaj
[109] MazumdarBijaychandra19202000TheHistoryoftheBengaliLanguage(ReprCalcutta1920ed)NewDelhiAsianEducationalServices
[110] PandeyAnshuman2001ProposaltoEncodetheTirhutaScriptinISOIEC10646
[111] PalPalashBaran2001DhwanimalaBarnamalaKolkataPapyrus
[112] -----2007lsquoBanglaHarapherPanchParbarsquoInSwapanChakrabortyedMudranerSanskritiOBanglaBoiKolkataAbabhas
[113] RossFiona1999ThePrintedBengaliCharacteranditsEvolutionLondonCurzon
[114] ShastriMahamahopadhyayHaraPrasad1916HājārBacharērPurāṇaBāṅgālāBhāṣāyBauddhaGānōDōhāCalcuttaBangiyaSahityaParishat
[115] SinghUdayaNarayana(JointlyManiruzzaman)1983DiglossiainBangladeshandlanguageplanningCalcuttaGyanBharati214pp
[116] -----1987ABibliographyofBengaliLinguisticsMysoreCIILxii+316pp
[117] -----2017(withRajibChakrabortyBidishaBhattacharjeeampArimardanKumarTripathy)LanguagesandCulturesontheMarginGuidelinesforFieldworkonEndangeredLanguagesMimeoCentreforEndangeredLanguagesVisva-Bharati
[118] -----1980ScriptalchoiceandspellingreformAnessayinlanguageandplanningJournaloftheMSUniversityofBarodaSocialScienceNumber292173-186AmodifiedversionreprintedEAnnamalaiBjornJernuddandJoanRubinedsLanguagePlanningProceedingsofanInstituteMysoreCIIL405-417
[119] Sripantha1996JakhanChapakhanaEloKolkataPaschim-BangaBanglaAcademy
[120] SurAtul1986BanglaMudranerDushoBacharKolkataJijnasa
[121] ScriptBehaviourforBengaliVersion11TDILandC-DACPune
[122] BoraMahendra1981TheEvolutionofAssameseScriptJorhatAssamSahityaSabha
[123] ProposaltoEncodetheTirhutaScriptinISOIEC10646httpwwwunicodeorgL2L201111175r-tirhutapdfaccessedon25112017
[124] EthnologueAssameseintheLanguageCloudhttpswwwethnologuecomcloudasmaccessedon25112017
[125] BengalialphabetforManipurifoundinEthnologueManipuri(MeeteilonMeithei)httpswwwomniglotcomwritingmanipurihtmaccessedon20102019
[126] WikipediaBengalialphabethttpsenwikipediaorgwikiBengali_alphabetaccessedon25112017
[129]OmniglotSlyhetihttpwwwomniglotcomwritingsylotihtmaccessedon1052018
[130]WikipediaBishnupriyaManipurilanguagehttpsenwikipediaorgwikiBishnupriya_Manipuri_languageaccessedon25112017
[131] TheEMILLECIILCorpushttpmetashareeldaorgrepositorybrowsethe-emilleciil-corpusabdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20accessedon1052018
[132]TheEMILLECIILCorpushttpcatalogelrainfoproduct_infophpproducts_id=696accessedon1052018
[133] BanglaLanguageampScript httpswwwisicalacin~rc_banglabanglahtmlaccessedon1052018
[134] SarkarPabitra1992BanglaBananSanskarSamasyaoSambhabanaKolkataChirayataPrakashan
[135] SarkarPabitra1993BanglaBhasharYuktabyanjanBhasha1123-45
[136] DashNiladriShekharandBBChaudhuri1998BanglaScriptAStructuralStudyLinguisticsToday121-28Alsoavailableathttpswwwacademiaedu9967428Bangla_Script_A_Structural_Study
[137] DaniAhmedHasan(1957)lsquoSrīhatta-NāgarīLipirUtpattioBikāśrsquoBanglaAcademyPatrika(Dhaka)Vol12(Bhadra-Agrahayan1364BangabdaNumber)pg1
[138] WikipediaSylhetiNagari httpsenwikipediaorgwikiSylheti_Nagariaccessedon1952018
[139] FuruiRyosuke(2015)lsquoVariegatedAdaptationsStateFormationinBengalfromtheFifthtoSeventhCenturyrsquoinBhairabiPrasadSahuampHermannKulkeedsInterrogatingPoliticalSystemsIntegrativeProcessesandStatesinPre-ModernIndiaChapter9Pp255-73NewDelhiManohar
[140] FergusonCharesAandMunierChowdhury(1960)lsquoPhonesofBengalirsquoLanguageVol36No1pp22-59
[141] ShahidullahMuhammad(2007)BuddhistMysticSongsDhakaMowlaBrothers
[142] RayPunyaSloka(1966)BengaliLanguageHandbookWashington
[143] HaiMuhammadAbdul(1960)AphoneticandphonologicalstudyofnasalsandnasalizationinBengaliDhakaUniversityofDhaka
[144] UnicodeConsortiumProposalSummaryFormtoAccompanySubmissionsforAdditionstotheRepertoireofISOIEC10646UNICODEhttpswwwunicodeorgL2L200202387r-syloti-formpdfaccessedonMay212018
[145] WikipediaOlChiki(Unicodeblock)httpsenwikipediaorgwikiOl_Chiki_(Unicode_block)accessedonMay212018
[146]BanglaScripthttpwwwbangladesh2000combdbangla_scripthtmlaccessedonMay212018
[147] BhattacharyaAshutoshed(1942)GopichandrerGanCalcuttaCalcuttaUniversity
[149] DasSisirKumar(1975)SahibsandMunshisAnAccountoftheCollegeofFortWilliamCalcutta
[150] IslamRafiqulPabitraSarkarMahbubulHaqampRajibChakraborty(eds)(2014)BanglaAcademyPromitoBanglaByabaharikByakaran(AFunctionalGrammarofStandardBangla)DhakaBanglaAcademy
[151] SarkarPabitra[2013]lsquoBanglaSpellingReformtheLongandShortofItrsquoBanglaJournal19215-232
[152] BanglaAcademy(2012)BanglaAcademyPromitoBanglaBananerNiyam(StandardBanglaSpellingasadoptedbyBanglaAcademy)DhakaBanglaAcademy
[153] SarkarPabitraampRajibChakraborty2018ldquoWhathashappenedSoFarIntermsof Script Reformsrdquo Paper presented at the Face to Face meeting jointly held by theBanglaAcademyDhakaampICANNatBanglaAcademyDhakaon10072018
[154] The Unicode Consortium 2018 The Unicodereg Standard Version 110 ndash CoreSpecificationChapter12P473
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when applied to a specific language or script, certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions will help to fine-tune the ABNF. In the case of Bangla13 in particular, the following rules apply:

1 Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold 'varga' (as defined under Table 5). Moreover, Bangla does not generally allow ya (য়) at the beginning of a word either, but we can cite a couple of native examples, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem 'Lichuchor' written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) in the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) in the beginning of a word, for example ৎসেরিং (Tsering).

2 C H can come with Khanda Ta only in the case where C is ra (র) (09B0): র্ৎ, as in ভর্ৎসনা

3 Only the following combinations with V H C M will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
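The label-initial restriction in rule 1 can be expressed as a simple first-code-point check. The sketch below covers only the three code points that rule 1 explicitly excludes from the beginning of a label; the name `may_start_label` is hypothetical, and the treatment of initial য় is deliberately left out because the text above allows it in some words.

```python
# Code points that rule 1 excludes from the start of an IDN label.
FORBIDDEN_INITIAL = {
    "\u09CE",  # ৎ Khanda Ta
    "\u099E",  # ঞ
    "\u0999",  # ঙ
}


def may_start_label(label):
    """True if the label is non-empty and its first code point is not
    one of the excluded initials (hypothetical helper)."""
    return bool(label) and label[0] not in FORBIDDEN_INITIAL


print(may_start_label("\u0995\u09BE"))  # কা
print(may_start_label("\u09CE\u09B8"))  # ৎস
```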
10.2 'Sylheti Nagarī lipi' or 'Siloti'
This version of Bangla script resembles the 'Kaithī' script (ISO 15924) used by the accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, widely in use during the 1880s. There were several other names of Sylheti Nagarī or Siloti (129), such as 'Jalalabada Nagarī', 'Fula (flower) Nagarī', 'Muslim Nagarī' or 'Muhammad Nagarī'. It is said that Shah Jalala had brought the script with him to Sylhet in the 13th-14th century (138), although some suggested that it was an invention by the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here (138).

Table 17 - The Script Table of Sylheti Nagarī or Siloti

13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable but not enough to be defined as variant code points.

10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla        Devanāgarī    NBGP Decision
ঃ U+0983      ः U+0903      Confusable
ও U+0993      उ U+0909      Confusable
ঘ U+0998      घ U+0918      Confusable
ঁ U+0981      ॅ U+0945      Confusable
Table 18: Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhi
Bangla        Gurmukhi      NBGP decision
ঘ U+0998      ਬ U+0A2C      Confusable
ঁ U+0981      ੱ U+0A71      Confusable
Table 19: Bangla and Gurmukhi confusable code points

Bangla                   Gurmukhi      NBGP decision
ও U+0993                 ਤ U+0A24      Distinguishable
শ U+09B6                 ਅ U+0A05      Distinguishable
ম U+09AE                 ਮ U+0A2E      Distinguishable
বা U+09AC and U+09BE     ਗ U+0A17      Distinguishable
Table 20 - Bangla and Gurmukhī distinguishable code points

10.3.3 Bangla and Oriya (Odia)
Bangla        Oriya (Odia)  NBGP Decision
ও U+0993      ଓ U+0B13      Confusable
Table 21 - Bangla and Oriya confusable code points

Bangla        Oriya (Odia)  NBGP Decision
ঘ U+0998      ସ U+0B38      Distinguishable
Table 22 - Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs

Consonants | Phonetic Value | Allographs: Clusters | Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ Thisletterdoesnothaveanyparticularphoneticvaluebutmostlypronouncedasn
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒThesecondarysymbol(allograph)jɔ-phalahastwophoneticvaluesWhenaddedtotheinitialconsonantinaworditisavowelaelig(asinশ8ামলর 8াপারetc)Butafteranon-initialconsonantitjustdoublesitinpronunciation(asinকায6ধায6etc)The+যcombinationhastwophysicalmanifestationsmdashর 8andয6
ক8(p+য)স8(+য)র 8(+য)[Justর 8isneverusedinBanglaorthographyর 8াisbutthenitslasttwosymbolsYa-phalaa-karaconstituteavowelsignrepresentingthevowelঅ8া]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ hwordfinallywordmediallyitdoublesthepronunciationofthefollowingconsonant
অঃকঃ
অব
3.1 Written Bangla
The 'Bangla alphabet' (বাংলা লিপি - Bānglā lipi, ISO 15924) is derived from the Brāhmī writing system, which is related to the Nagarī (also known as Devanāgarī5) script [108] as well as to the Tirhutā writing system [106]. Considered to be the fifth most widely used writing system in the world, this combined Bangla-Asamiyā-Manipuri script (showing some variations for Asamiyā and Meitei or Bisnupriya Manipuri) (130) was used in the eastern Indian Sanskrit manuscripts too. For Chakma in India and Bangladesh and for Kokborok in Tripura, it was and still is one of the scripts used. A close variant called Tirhutā (123; now available also in Unicode 10.0 as 11480-114DF; see 110) or Mithilākṣara was used for Maithili from the 14th century until the early 20th century [106]. In this context, one finds a mention of 'Sylheti Nagarī lipi' or 'Siloti' (added to the Unicode Standard in March 2005 with the release of version 4.1), the details of which could be of interest only to historians and historical linguists (see 137 and 144). But Sylheti Bangla is generally written by many in the modern-day Bangla script now, for all practical purposes.

Originally, during the reign of the Pāla dynasty (750-1154 AD) in eastern India, and even earlier, perhaps during the Malla period (694 AD onwards), the present-day Bangla writing system got a shape comparable to the modern-day ones [111, 119]. A pictorial description of Brāhmī to Modern Bangla script could be presented here in a tabular form:

Modern   ক   জ   ম   র   স   অ
         k   j   m   r   s   a
Table 1: Pictorial depiction of Evolution of Brāhmī to Bangla

5 William Dwight Whitney in his Sanskrit Grammar unequivocally said: "This name (Devanagarī) is of doubtful origin and value" (Whitney, William Dwight. 1994 reprint. Sanskrit Grammar. New Delhi: Motilal Banarasidass Publishers, p. 1).
The inscriptional evidence in Brāhmī is found in the Archaic Brāhmī from the 3rd century BC to the 1st century BC, in Middle Brāhmī soon after (1st-3rd century AD), and then on in the Late Brāhmī (4th-6th century AD). This evidence could be seen in both Bangladesh and West Bengal [108] in (1) the Mahasthanagara (Bogra district, Bangladesh; the ancient name being Pundranagara or Paundravardhanapura) inscriptions, (2) Brāhmī (and Kharoṣṭhī) inscriptions from the lower 'Gangetic Bengal', and (3) copper plate inscriptions of the Imperial Guptas from the northern part of West Bengal and North-West Bangladesh, in the areas under Dharmaditya, Gopachandra and Samācāradeva (about whom one only knows from five copper plates found in Kotalipara in the Faridpur district in Bangladesh, one in Mallasarul in the Burdwan district (West Bengal) and one in Jayramapura (Ballesvara district, now in Odisha)). These epigraphs from the eastern part of Undivided India (dating back to the 4th-6th centuries AD) showed some characteristic features of letters (especially in ম 'ma', ল 'la', শ 'sa', স 'sa' and হ 'ha') which led to the development of the eastern variety of Gupta script.

Epigraphic records from Bangladesh demonstrate remarkable developments in Eastern Brāhmī. Notable in this context are the Tippera copper plate inscription of the 'Samatata' rulers (139, pp 265) such as Lokanātha (dated to the latter half of the 7th century AD), the Kailan inscription of Śrīdharana Rāta, as well as the Astafpur copper plates. The letters seem to hang down from wedge-shaped solid triangles, with right-hand verticals bending down at the bottom, because of which it was described by Prinsep and Fleet as Kuṭila-lipi (literally 'cursive writing style'), whereas the term Siddhamātrikā (as a matra or bar is placed over each of the letters) was used by Al Biruni (973-1048) to designate the script of Northern India. The next stage of development is illustrated by the 9th century copper plate inscriptions from Khalimpur of the reign of Dharmapāla, from Monghyr and Nalanda of the time of Devapāla in Bihar, and from Jagjīvanpura (Malda) of the reign of Mahendrapāla. The Siddhamātrikā (mentioned as 'Siddham' in Chinese sources) is said to have been prevalent also in this region up to the end of the tenth century. Also called the Gauri (i.e. Gandi) in Pūrvadeśā or the Eastern country, it was regarded as the same script to which the appellative Proto-Bangla is given, showing its characteristics in rudimentary forms in the period between AD 875 and AD 1025. In some epigraphs it is considered as belonging to the second quarter of the eleventh century AD. Flattening of head-marks becomes prominent in comparison to the wedge-shaped serifs. An important landmark in the development of the Bangla script is the Ramaganja copper plate inscription of Mahāmānḍalika in the last quarter of the eleventh century AD. It is the earliest document from this entire region which bears the letter m with a tick rising upwards. The full vowel i develops a tick at the right end of the upper horizontal bar above and a curved hook below. Initial e approaches the modern Bangla character. A mature form of Proto-Bangla, the immediate precursor of Bangla script, is illustrated in the inscriptions of the Varmana, Sena and Deva rulers of the twelfth and thirteenth centuries [104].

The evolution of the Bangla script (cf. 136) is aligned with the story of the advancement of printing technology. The first "movable type" scripts were technically created and used while printing Nathaniel Brassey Halhed's (1751-1830) 1778 book titled A Grammar of the Bengal Language. In 1785 Governor-General Warren Hastings (1732-1818) requested another civilian, Charles Wilkins (1749-1836), to cut punches for Bangla printing characters. The current printed form of Bangla script appeared soon after. It is generally agreed that Wilkins developed the Bangla print script [111]. He passed on this knowledge to Pancanana Karmakara (-1804), a renowned artist in Bengal. Later it was Karmakar and his family that became famous in Bangla printing technology. Shepherd was another assistant of Wilkins in this designing of the script, which became more angular, with sharper turns and edges [133]. A few archaic letters were modernized during the 19th century. It was standardized by Pandit Ishwar Chandra Vidyasagar when the Bangla type fonts were to be used to publish on a large scale under the Calcutta School Book Society [116, for several references].

Much later, the Linotype technique, invented by Ottmar Mergenthaler (1854-1899) in 1886, was introduced into Bangla printing in 1935 by the efforts of Suresh Chandra Majumdar (1888-1954), Rajsekhar Basu (1880-1960), Jatindra Kumar Sen (1882-1966) and his disciple Sushil Kumar Bhattacharya, and began being used by the Ānandabazara Patrika group, later followed by others. Within a few years the more advanced monotype technology came to be used in Bangla printing. However, in Bangla printing culture monotype had a very limited acceptance, and linotype held the stage till eventually digital technology came in to replace all earlier techniques. All these could be presented in a table:
PERIOD | DESCRIPTION | NAMES
3rd Century BC | Use of Brāhmī and Kharosthī scripts begins in the subcontinent. Brāhmī was widely used during the time of the Mauryan King Aśoka. In one theory, Brāhmī is based on the North Semitic alphabet but suitably modified to fit the needs of local languages. It is currently believed to have been an independent development. | Brāhmī
1st-3rd Century AD | The Kusana script, named after the Kusana royal dynasty | Kusana script
4th-5th Century AD | The next stage of its evolution was into the Gupta script, named after the Gupta royal dynasty | Gupta script
7th Century AD | Epigraphic records from Bangladesh demonstrate remarkable developments in Eastern Brāhmī, giving rise to the Kuṭila-lipi | Kutila-lipi
8th Century AD | Some copper plate inscriptions are found in Khalimpur, Bangladesh, during the reign of Dharmapāla, from Monghyr and Nālandā in Bihar of the time of Devapāla, and from Jagjīvanapura in West Bengal of the reign of Mahendrapāla | Siddhamātikā
9th Century AD until 1025 AD | Proto-Bangla characteristics in rudimentary forms develop. An important landmark in the development of the Bangla script is the Ramaganja copper plate inscription of Mahāmāndalika, found in the last quarter of the eleventh century AD | Proto-Bangla Script & Language
12th-13th Century AD | A mature form of Proto-Bangla, the immediate precursor of Bangla script, is found in the inscriptions of the Varmana, Sena and Deva rulers of the twelfth and thirteenth centuries | Matured Proto-Bangla
14th-15th Century AD | The characteristics of typical Bangla script began to develop, as could be seen in the copper plate inscription of Vijayamānikya-I of Tripura, dated 1478 AD, which also illustrates forms of Bangla letters in the fifteenth century AD | Modern Bangla Script era begins (See Ross 1999)
16th-17th Century AD | The chart of the Bangla alphabet appended to the China Monuments, published from Amsterdam in 1667, and The Code of Gentoo Law, published from London in 1776, both show a chart of the Bangla alphabet. They show 16 vowel letters, including the long 'ৡ' (ḹ), Anusvāra and Visarga, and 34 consonants | Printed Charts of Bangla
18th-19th Century AD | Charles Wilkins develops printing in Bangla in 1778 and Vidyasagar reforms it | Bangla Type Fonts
Table 2: Development of the Bangla Writing System
The overall development of Bangla Script from the Kuṭila-lipi period to Modern Bangla could be seen here in Table 3 ([102 and 146]; also see the web page in 147).
Table 3: Bangla Script in Different Centuries
3.2 Languages Considered
Below is the tabular representation of the languages using Bangla script that are placed on EGIDS Scale 1-6 (see 117 for details). Some languages under EGIDS 5 and 6 have also developed their own scripts for printing and publishing. Some had used Bangla script earlier (such as Bodo) or used it in West Bengal at some point of time (Santali), but have later shifted to another writing system. Bodo is now written in Nāgarī or Devanāgarī, and for Santali one uses both Nāgarī/Devanāgarī and Ol-chiki (145). For the purposes of the Bangla LGR, only languages belonging to EGIDS scale 1 to 4 have been considered. Consider the following table.
EGIDS Scale 1
EGIDS Scale 2
EGIDS Scale 3
EGIDS Scale 4
EGIDS Scale 5
EGIDS Scale 6
Bangla (Bengali)
Santali Bodo Riang Khumi Mru(ng) Asho
Lepcha Pnar Koda Kora Chak
Asamiyā (Assamese)
Koch or Rajabansī
Malto or Malpahariya
Manipuri or Meitei
Bisnupriya Manipuri Kok-Borok (Tripura & Bangladesh)
Chakma Hajong Mundari & Kurux (of Bangladesh)
Toto Rohingya Tippera Megam Tanchangya
Usoi Limbu Sadri or Oraon
Bhumij or Mundari Bawm Chin
Table 4: Main languages in India and Bangladesh that use Bangla Script on the EGIDS Scale
3.3 Notable Features of Bangla Script [150]
The Bangla Writing System has certain features that show how it has to be written or how type-setting in Bangla could be done. This section is followed by a section that explains the Code-points (and fixed Code-point sequences) which show certain distinctive characteristics of Bangla and which make up the Repertoire. The next sections will also cover the ‘akshar’-formation rules (ABNF) showing character classes, Whole Label Evaluation (WLE) and Context Rules, as well as In-Script and Cross-Script Variants. Here we present some basic features of the Script and Pronunciation.
• The Bangla script is an alpha-syllabic writing system in which all consonants are assumed to contain an accompanying ‘inherent’ vowel (theoretically before or after each consonant). It varies between ɔ and o depending on the position of the consonant in the word. At times these ‘assumed’ or ‘inherent’ vowels are not pronounced at all [142].
• Vowels can be written as independent letters, or by using a variety of diacritical marks which are written above, below, before, after, or both before and after the consonant they follow in pronunciation [105].
• All Bangla consonants, when pronounced in isolation, are uttered with an inherent vowel -ɔ; hence ক ‘k’, খ ‘kh’ or গ ‘g’ are usually pronounced as [kɔ], [khɔ] or [gɔ], etc. Phonologically, the Bangla vowel -ɔ corresponds to the Hindi schwa ə.
• When consonants occur together in clusters, special conjunct letters are formed. In printed Bangla many of these consonantal clusters or conjoined consonants are in use. The letters for the consonants other than the final one in the group are generally reduced. But there are a few special conjunct characters which are compounds of the consonant characters, e.g. ক্ (k) + ষ (s) = ক্ষ (ks), ঞ্ (n) + জ (j) = ঞ্জ (nj), জ্ (j) + ঞ (n) = জ্ঞ (jn), হ্ (h) + ম (m) = হ্ম (hm). There are other issues also: র as the second member of a cluster is reduced to a secondary symbol, e.g. প্ (p) + র (r) = প্র (pr), ষ্ (s) + ট্ (t) + র (r) = ষ্ট্র (str) (as in উষ্ট্র ustra "camel"). য (y), when used as a primary symbol, represents jɔ in Bangla. But its secondary symbol (allograph), jɔ-phala, has two phonetic values. When added to the initial consonant in a word, it is a vowel æ (as in শ্যামল (syamala) "green", র‍্যাপার (ryapara) "wrapper", etc.). But after a non-initial consonant, it just doubles it in pronunciation (as in কার্য, ধার্য, etc.). The র্ (r) + য (y) combination has two renderings: র‍্য (ry) and র্য (ry). In the case of দ্ (d) + ধ (dh), গ্ (g) + ধ (dh) and ন্ (n) + ধ (dh), the shape of the second member is changed, e.g. দ্ধ (ddh), গ্ধ (gdh) and ন্ধ (ndh) respectively. The solitary example of র্ (r) + ঋ (r) = র্ঋ (as in নৈর্ঋত nairrt "Southwest"), used mostly in cases of Classical borrowings, shows the use of the secondary symbol of a consonant followed by the primary symbol of a vowel. The inherent vowel only applies to the final consonant of the cluster.
InconsonantclustersmanyconsonantstookacompletelydifferentformSometypicalexamplesareS(kt)T(kr)8(ks)N(gdh)=(jn)U(nc)(nj)V(tt)W(nt)O(ndh)X(bdh)Y(bhr)Z(mb)[(st)etcরhastwoallographsapartfromthisfullshapeoneislsquorepharsquoasfoundinকH(rk)পH(rp)andanotherisra-phalaasinA(pr)T(kr)(s+n)isanotheronewherethecerebralnasalconsonantsigntakesaqueershape[151]
• The Bangla script has at least fifty-two primary symbols and quite a few allographs (positional variants of them), corresponding to forty-four phonemes (7 oral and 7 nasal vowels and 30 consonants) (150), or functional speech sounds, with some obvious redundancies, although in one of the first phonemic analyses the number was thought to be thirty-five phonemes [140].
• As mentioned above, in Bangla several graphemic symbols have secondary shapes, technically called ‘allographs’, with a complementary distribution in each case. These graphs or markings are generally added to the following positions of the primary symbol [113], in the following manner:
1) Below(egক(ku)W(nta)ক(ku)^ (hra)etc)
2) Above(egচ (ca)কH (rka)etc)
3) Right side (e.g. কা (ka), কং (kan), etc.)
4) Left side (e.g. কে (ke))
5) Left side and above simultaneously (e.g. কৈ (kai), কি (ki), etc.)
6) Right side and above simultaneously (e.g. কী (kī))
7) Right side and left side simultaneously (e.g. কো (ko))
8) Right side, left side and above simultaneously (e.g. কৌ (kau))
• As for the complementary distribution of vowel letters (word- or syllable-initial) and Vowel Matras, which are relevant for ABNF, let us consider the following. Besides some simple Vowel Modifiers called ‘Kars’ in Bangla (also referred to as Matra in the other LGR documents of Neo-Brāhmī), there are some combinatory modifiers of Bangla Vowels with certain consonants. For example, whereas
আ U+0986 BENGALI LETTER AA is substituted by া U+09BE BENGALI VOWEL SIGN AA,
ই U+0987 BENGALI LETTER I is substituted by pre-posed ি U+09BF BENGALI VOWEL SIGN I,
ঈ U+0988 BENGALI LETTER II is substituted by ী U+09C0 BENGALI VOWEL SIGN II, or
উ U+0989 BENGALI LETTER U is substituted by ু U+09C1 BENGALI VOWEL SIGN U by marking below the primary grapheme,
there are some special vowel modifiers of উ, as in the following combined letters:
zwnj guratherthanwritingasগ(g)+ (u)
h ruratherthanwritingasর(r)+ (u)
zwnj śuratherthanwritingasশ (s)+ (u)
j huratherthanwritingasহ(h)+ (u)
knturatherthanwritingasL (n)+ত (t)+ (u)
Similarlytherecouldbevowelmodifiersofঊorlsquo(Long)ūrsquoaswelleg
m (bh)+র (r) (n bhru ldquoeyebrowrdquo)o (s)+র (r) (p sru)ঋ (r) afterহ (h) (q hr)etc
• There have been many notable contributions in simplifying and modifying Bangla spellings and combinatory techniques, especially by scholars such as Pabitra Sarkar (1992) [134]. In this there has been an attempt to reduce the number of allographs of both vowels and consonants in clusters, and it has been widely accepted in the printing of school texts in both Bangladesh and West Bengal [151, 152]. As of now, two systems, the old (traditional) and the new, go on side by side, operative in different domains.
However, in preparation of this LGR document, the aim has been to consider the widely used and usable sequences and combinations, and their variations across the sister scripts belonging to the basket of Brāhmī writing systems.
Bangla Academy, Dhaka published Standard Bangla Spelling Rules in 1992, following the recommendations of a committee constituted through a workshop jointly organized by the Jatīya Siksakrama and Pathyapustaka Board in 1988. A thoroughly revised edition of the Rules was published in September 2012.⁶
After the establishment of the Bāṅlā Ākādemi of West Bengal in 1986, its first President, Annadasankar Ray (1904-2002), in his inaugural address gave a direction for standardization of the Bangla alphabet, script and spelling system, and clearly argued that they would not blindly follow the Sanskritic model of conventional grammar. A broad list of proposals was sent to experts on Bangla, and a broad agreement was reached for ‘homogenization of Bangla spelling’ by 1988. Based on opinions received from different quarters, a unanimous list of ‘rules’ was agreed upon. This was published as a ‘Spelling Dictionary’ titled Ākādemi Bānāna Abhidhāna (1997), which was obviously more comprehensive than ‘The University of Calcutta proposals’ made in 1936. Along with the ‘rationalization’ of spellings, another step was taken to make the writing system easier to read by making the symbols used, both single and combined ones, more ‘transparent’. These reforms were originally suggested by Sarkar (1987, first published in 1978) [134] [153], where he used the terms Swaccha (‘Transparent’) and Aswaccha (‘Opaque’ or non-transparent), even adding Ardha Swaccha (‘half transparent’) in between the two. Some sample examples are:

6 Bangla Academy. 2012. Bāṅlā Ekaḍemī Pramita Bāṅlā Bānānera Niyama (Bangla Academy Standard Bangla Spelling Rules). Dhaka: Bangla Academy.
Transparent: ন্ন (nn), প্ত (pt), স্ত (st), where both members of the cluster can be recognized.
Opaque: where neither of the two could be (easily) recognized, e.g. ক্ষ ks (ক্ k + ষ s), জ্ঞ jn (জ্ j + ঞ n), ঙ্গ ng (ঙ্ n + গ g), হ্ম hm (হ্ h + ম m).
Semi-transparent: প্র (pr), র্প (rp), where one symbol is recognizable and the other is not. In the case of three-term clusters, at least one symbol will not be transparent, e.g. স্ত্র str (স্ s + ত্ t + র r), ষ্ট্র str (ষ্ s + ট্ t + র r), etc.
There were in fact two types of proposals. One concerned the shape of the letters: those of consonant + vowel (CV) combinations, and conjuncts, which are consonant + consonant combinations. There were further complex shapes, i.e. those of consonant + consonant + (consonant +) vowel (CC(C)V) signs, as in প্রু (pru) or স্ক্রু (skru). Some decisions in this area were necessary because a few of the CC(C) symbols represented complexities that made learning them difficult for children. The other dealt with the spellings of words only, without any reference to the shapes of the letters in which they were written. The basic objective here was ‘one word, one spelling’, to the greatest extent that was possible [151].
Belowwe place a statement of themost salient changes that affect the consonant +vowelcombinations[153]
a The variants of the short u (^ উ-কার hrasva u-kāra) vowel sign have been
brought down to one ie So zwnj (gu) is now গ Similarly h (ru) gt র zwnj
(śu)gt শ j (hu)gtহ and therefore cluster + short u sign k (ntu)gt W
(ন++ত+উ) (stu)gt[ (স++ত+উ)
b The variants of long u (দীঘH ঊ-কার dīrgha u-kāra) have also been reduced
(rū)gt র n (bhrū) gt Y (ভ bh++র r+ঊ ū) (drū)gt (দ d++র r+ঊ ū) p (śrū)gt
(শ ś++র r+ঊ ū)
c The variants of ঋ-কার (ṛ-kāra secondary symbol of ṛ) have been brought down to one q (hṛ) gt হ
Regarding consonant + consonant + (consonant) … + (vowel) clusters, Paschimbanga Bangla Akademi proposed transparent or semi-transparent shapes for clusters to the extent admissible in the Bangla writing system. Some examples will clarify the proposal. (A slash will mean that the traditional cluster-shape precedes it, while the Bangla Akademi innovation follows.) [153]
Xব ধ bdh ( b+ ধ dh) Mদ ধ ddh (J d+ধ dh) ন থ nth (L n+থ th) Uঞ চ ntildec
(9 ntilde+চ c) ঞ ছ $ ntildech (9+ছ) ঞ জ ntildej (9 ntilde+জ j) Sক ত amp kt (7 k+ত t) T
kr (7 k+র r) Nগ ধ ( gdh (K g+ধ dh) ) ṅk (u ṅ+ক k) t ṅg (u ṅ+গ g) +
ṣṇ (B ṣ+ণ ṇ) ন ndhr (L n+ dh+র r) - ṇḍr ( ṇ+ ḍ+র r) ktr (7 k+x
t+র r)
3.3.1 The Consonants
As per traditional classification, Bangla Consonants are categorized according to their phonetic properties, especially in terms of place and manner of articulation [107]. There are Five ‘Varga’ (pronounced as ‘Barga’ in Bangla) or Groups (sets or classes) distinguished by Place of Articulation, and one Non-‘varga’ group [105]. Each Varga, which corresponds to Stops at a certain place of articulation, contains a series of five consonants classified as per their phonetic qualities (i.e. manner of articulation), beginning from Unvoiced and Unaspirated to Voiced and Aspirated (in the fourth column), and finally ending with a Homorganic or Corresponding nasal [107]. Consider the following table:
‘Varga’ or Sets | Unvoiced -Asp | Unvoiced +Asp | Voiced -Asp | Voiced +Asp | Nasal
Velar | ক ‘K’ U+0995 | খ ‘KH’ U+0996 | গ ‘G’ U+0997 | ঘ ‘GH’ U+0998 | ঙ ‘NG’ U+0999
Palatal | চ ‘C’ U+099A | ছ ‘CH’ U+099B | জ ‘J’ U+099C | ঝ ‘JH’ U+099D | ঞ ‘NY’ U+099E
Retroflex | ট ‘TT’ U+099F | ঠ ‘TTH’ U+09A0 | ড ‘DD’ U+09A1 | ঢ ‘DDH’ U+09A2 | ণ ‘NN’ U+09A3
Dental | ত ‘T’ U+09A4 | থ ‘TH’ U+09A5 | দ ‘D’ U+09A6 | ধ ‘DH’ U+09A7 | ন ‘N’ U+09A8
Bilabial | প ‘P’ U+09AA | ফ ‘PH’ U+09AB | ব ‘B’ U+09AC | ভ ‘BH’ U+09AD | ম ‘M’ U+09AE
Table 5: Varga classification of Bangla consonants
(Falling into a Pattern of Five Sets of Unvoiced Unaspirated, Unvoiced Aspirated, Voiced Unaspirated, Voiced Aspirated and Nasals, called five ‘Varga’)
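The regular layout of Table 5 can be expressed as a small lookup. The following is only an illustrative sketch: the names `VARGA_START`, `COLUMNS` and `classify_varga` are hypothetical and not part of the LGR; the code points are the ones listed in the table.

```python
# Illustrative sketch only: names below are hypothetical, not part of the LGR.
# Each varga row of Table 5 occupies five consecutive code points.
VARGA_START = {
    "Velar": 0x0995,
    "Palatal": 0x099A,
    "Retroflex": 0x099F,
    "Dental": 0x09A4,
    "Bilabial": 0x09AA,  # U+09A9 is not used in Table 5, so this row starts at U+09AA
}

COLUMNS = [
    "unvoiced unaspirated",
    "unvoiced aspirated",
    "voiced unaspirated",
    "voiced aspirated",
    "nasal",
]


def classify_varga(ch):
    """Return (varga, column) for a varga consonant, or None for anything else."""
    cp = ord(ch)
    for varga, start in VARGA_START.items():
        if start <= cp < start + 5:
            return varga, COLUMNS[cp - start]
    return None


if __name__ == "__main__":
    for ch in "\u0995\u099D\u09AE\u09B0":
        print(f"U+{ord(ch):04X}", classify_varga(ch))
```

Non-varga consonants such as র (U+09B0) fall outside every row and return `None`.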
Non-Varga | য ‘Y’ U+09AF | য় ‘YY’ U+09DF | র ‘R’ U+09B0 | ড় ‘RR’ U+09DC | ঢ় ‘RH’ U+09DD | ল ‘L’ U+09B2 | শ ‘SH’ U+09B6 | ষ ‘SS’ U+09B7 | স ‘S’ U+09B8 | হ ‘H’ U+09B9
Table 6: Non-Varga consonants (Not falling into any of the five categories)
3.3.2 The Implicit Vowel Killer: Hasanta (called ‘Halant’ or ‘Halanta’ in other Brāhmī-based scripts)
As stated earlier, all consonants are pronounced in isolation with an implicit vowel (central back -ɔ in Bangla as the neutral vowel) assumed to be associated with them [121]. The ‘Hasanta’ (= ‘Halant’ or ‘Halanta’ in other Brāhmī-based scripts) or the term ‘Virāma’⁷ (= ‘Dāri’ in Bangla), as preferred in UNICODE (cf. Unicode 3.0 and above), have been used in this report as terms that denote the character that marks the absence of this inherent vowel. It may be noted that the term virama has been adopted in UNICODE in a sense that is different from the traditional definition of grammar, and hence it requires some explanation here. Considering the importance of the document, this note should be a part of this LGR document so that anybody referring to it should be able to know the proper grammatical explanation of the term. Because a special sign is needed whenever this implicit vowel is stripped off, the symbol is known as the Hasanta (= Halant) ্ (U+09CD). By placing the Hasanta under the first consonant of a combination or cluster, one could, in common parlance, "kill" its vowel and create conjuncts. In this manner, conjunct characters can generally be written by joining two to four consonants. In rare cases this process can join up to five consonants. However, the notion of a maximum number of consonants joining to form one aksara⁸ is to be bounded empirically. This is an observation based on the CIIL-Emille Corpora of Bangla words [132 & 133] as seen in print these days. Given the mixture of scripts and languages happening on the web, the possibility that one may want a generic Top Level Domain [gTLD] which may have more than the observed maximum cannot be ruled out. This can be the case when a foreign language word which admits a large number of consonants is transliterated into Bangla. Hence, in the Bangla LGR work this limit will not be enforced.
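To make the notion of "number of consonants joined by Hasanta" concrete, the sketch below measures the longest consonant + Hasanta chain in a string. It is an illustration under simplifying assumptions: the helper names are hypothetical, consonants are approximated by the range U+0995 to U+09B9, and nukta sequences and joiners are ignored. It only measures cluster size; in line with the text above, it does not enforce any limit.

```python
# Illustrative sketch only; helper names are hypothetical.
HASANTA = "\u09CD"


def is_consonant(ch):
    # Simplifying assumption: treat the basic range U+0995..U+09B9 as consonants.
    return 0x0995 <= ord(ch) <= 0x09B9


def longest_cluster(text):
    """Length (in consonants) of the longest consonant+Hasanta chain in text."""
    best = run = 0
    i = 0
    while i < len(text):
        if is_consonant(text[i]):
            run += 1
            best = max(best, run)
            # The chain continues only if a Hasanta and another consonant follow.
            if (i + 2 < len(text) and text[i + 1] == HASANTA
                    and is_consonant(text[i + 2])):
                i += 2
                continue
            run = 0
        i += 1
    return best


if __name__ == "__main__":
    # ka + hasanta + ssa (two consonants joined)
    print(longest_cluster("\u0995\u09CD\u09B7"))
    # u + ssa + hasanta + tta + hasanta + ra (three consonants joined)
    print(longest_cluster("\u0989\u09B7\u09CD\u099F\u09CD\u09B0"))
```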
3.3.3 Vowels
Separate symbols exist for all ‘Swara’ or Vowels in Bangla, which are pronounced independently either at the beginning of the word or after another vowel or consonant sound. To indicate a Vowel sound other than the implicit one, a Vowel sign called ‘kār’ in Bangla, or Mātrā in Nāgarī⁹, is attached to the consonant. Since the consonant has this built-in neutral vowel at the end, there are equivalent kāras (Mātrās) for all vowels except অ (pronounced -ɔ). The correlation is shown as follows:

7 Virāma as used here is also a misnomer according to the Indian grammatical traditions. Nowhere is the mere absence of a vowel marked as virama. Hasanta just marks the absence of a vowel, nothing else (Abhyankar, Kashinath Vasudev & J. M. Shukla. 1961. A Dictionary of Sanskrit Grammar. Baroda: Oriental Institute).
8 This term needs to be disambiguated. Aksara also means ‘syllable’ in Indian grammatical traditions.
Vowel | Corresponding vowel sign (kāras / Mātrās)
অ ‘A’ U+0985 | -
আ ‘AA’ U+0986 | া U+09BE
ই ‘I’ U+0987 | ি U+09BF
ঈ ‘II’ U+0988 | ী U+09C0
উ ‘U’ U+0989 | ু U+09C1
ঊ ‘UU’ U+098A | ূ U+09C2
ঋ Vocalic ‘R’ U+098B | ৃ U+09C3
ৠ Vocalic ‘RR’ U+09E0 | ৄ U+09C4
ঌ Vocalic ‘L’ U+098C | ৢ U+09E2
ৡ Vocalic ‘LL’ U+09E1 | ৣ U+09E3
এ ‘E’ U+098F | ে U+09C7
ঐ ‘AI’ U+0990 | ৈ U+09C8
ও ‘O’ U+0993 | ো U+09CB
ঔ ‘AU’ U+0994 | ৌ U+09CC
- | ৗ U+09D7
Could appear on top of অ ‘A’ U+0985 or any other vowel | ঁ U+0981 Candrabindu
Could appear after অ ‘A’ U+0985 or any other vowel | ং U+0982 Anusvara
Could appear after অ ‘A’ U+0985 or any other vowel | ঃ U+0983 Visarga
After any consonant | ্ U+09CD (Hasanta)
- | ঽ U+09BD Avagraha
Table 7: Bangla Vowels with corresponding kārs

9 Although the term ‘Matra’ in Bangla stands for an altogether different concept, viz. the top bar placed over a letter, typically available in Hindi and Bangla but missing in Gujarati.
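The vowel-to-kār correlation in Table 7 amounts to a simple mapping. The sketch below encodes it and uses it to build a consonant + vowel syllable; the names `VOWEL_TO_SIGN` and `syllable` are hypothetical and serve only to illustrate the table.

```python
# Illustrative sketch only; names are hypothetical. Code points follow Table 7.
VOWEL_TO_SIGN = {
    "\u0986": "\u09BE",  # AA
    "\u0987": "\u09BF",  # I
    "\u0988": "\u09C0",  # II
    "\u0989": "\u09C1",  # U
    "\u098A": "\u09C2",  # UU
    "\u098B": "\u09C3",  # VOCALIC R
    "\u09E0": "\u09C4",  # VOCALIC RR
    "\u098C": "\u09E2",  # VOCALIC L
    "\u09E1": "\u09E3",  # VOCALIC LL
    "\u098F": "\u09C7",  # E
    "\u0990": "\u09C8",  # AI
    "\u0993": "\u09CB",  # O
    "\u0994": "\u09CC",  # AU
}


def syllable(consonant, vowel):
    """Attach the kār of `vowel` to `consonant`; A (U+0985) is inherent, so no sign."""
    return consonant + VOWEL_TO_SIGN.get(vowel, "")


if __name__ == "__main__":
    ka = "\u0995"
    for vowel in ("\u0985", "\u0986", "\u0987", "\u0994"):
        s = syllable(ka, vowel)
        print(" ".join(f"U+{ord(c):04X}" for c in s))
```

Note that the pre-posed sign ি is still stored after the consonant in the code point sequence; only its rendering is to the left.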
3.3.4 The Anusvāra / onuʃʃār (ং - U+0982)
The Anusvāra or onuʃʃār in Bangla at times represents a homorganic nasal, but not always. It replaces a conjunct group of a ‘Nasal Consonant + Hasanta + Consonant’ where the second consonant belongs to the Velar varga or set, as in লংকা. But it often appears also for such combinations involving non-velars appearing as the last member of the combination, as in ল্যাংটা "naked" or ল্যাংচা "a kind of sweet; to limp". Before a non-varga consonant, the Anusvāra represents a nasal sound that may have an alternative conjoined writing symbol representing the corresponding nasal consonant of the particular set. Although Modern Hindi, Marathi and Konkani prefer the anusvāra to the corresponding Half-nasal, in Bangla it is clearly demarcated where one must use the Anusvāra and where it has to be a conjunct cluster with a nasal as the first or the second component.
3.3.5 Nasalization: Candrabindu (ঁ - U+0981)
Candrabindu denotes nasalization of the preceding vowel, as in চাঁদ cãd ‘moon’ (U+099A U+09BE U+0981 U+09A6). This sign, with a dot inside the half-moon mark, is used as a nasalization marker in many Brāhmī-based scripts [143].
3.3.6 Nukta (় - U+09BC)
The nukta sign does not exist in Bangla orthography. It is predominantly used in many Brāhmī-derived scripts such as Devanāgarī (for Hindi, Bodo, Maithili, Santali, Kashmiri and Sindhi). The term and the concept of nukta are borrowed in Bangla.
The IDNA Protocol (RFC 5891) states that IDNs must be in Unicode Normalization Form C (NFC). RFC 7940 applies this requirement to LGRs. The definition of NFC in the Unicode Standard contains a number of composition exclusions. As a result, the Bangla letters য় YYA, ড় RRA and ঢ় RHA have to be represented in this LGR by using the sequences (YA + Nukta: U+09AF + U+09BC), (DDA + Nukta: U+09A1 + U+09BC) and (DDHA + Nukta: U+09A2 + U+09BC) instead of the single code points YYA (U+09DF), RRA (U+09DC) and RHA (U+09DD), although the use of ‘Nukta’ is otherwise completely unnatural in Bangla.
It is noted that in the current Unicode Standard chart these characters are listed as additional consonants. As per the LGR Procedure, however, these decisions depend on the IDNA Protocol through a set of procedures developed by the IETF. Even though the Unicode Standard also prescribes methods to produce these three characters as atomic characters (for example 09DC for ড় [r], 09DD for ঢ় [rh] and 09DF for য় [y], each as a single keystroke), the IDNA protocol requires that we treat them as conjunct characters and then allocate codes for these in the Unicode Bengali Block.
It may be noted that there could be sporadic attempts or cases of writing Muslim names, Urdu poetic words and Perso-Arabic loan words with a nukta under ক (k), খ (kh), গ (g), জ (j) and ফ (ph), only for the sake of correct pronunciation and for maintaining the sanctity of the loan word. These were also like using the Bangla writing system to work like the IPA script. It is, however, not in use in Bangla writing or in printing.
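The effect of the composition exclusions can be observed with Python's standard `unicodedata` module. The short sketch below is illustrative: the dictionary name `ATOMIC_TO_SEQUENCE` is hypothetical, and the behaviour shown relies on the Unicode data shipped with the Python build in use.

```python
# Illustrative sketch only; ATOMIC_TO_SEQUENCE is a hypothetical name.
import unicodedata

ATOMIC_TO_SEQUENCE = {
    "\u09DC": "\u09A1\u09BC",  # RRA -> DDA + Nukta
    "\u09DD": "\u09A2\u09BC",  # RHA -> DDHA + Nukta
    "\u09DF": "\u09AF\u09BC",  # YYA -> YA + Nukta
}


def to_nfc(label):
    """Normalize a label to NFC, the form required for IDNs."""
    return unicodedata.normalize("NFC", label)


if __name__ == "__main__":
    for atomic, sequence in ATOMIC_TO_SEQUENCE.items():
        normalized = to_nfc(atomic)
        print(
            f"U+{ord(atomic):04X} ->",
            " ".join(f"U+{ord(c):04X}" for c in normalized),
            "| matches base + nukta:", normalized == sequence,
        )
```

Because the three atomic letters are excluded from recomposition, NFC leaves the base + nukta sequence in place, which is why the repertoire lists the sequences rather than the single code points.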
3.3.7 Visarga / biʃɔrgo (ঃ - U+0983) and Avagraha (ঽ - U+09BD)
The Visarga / biʃɔrgo U+0983 is frequently used in Bangla loanwords borrowed from Sanskrit and represents a sound very close to h. One could quote as an example দুঃখ duhkho "sorrow", "unhappiness" (U+09A6 U+09C1 U+0983 U+0996).
The Avagraha ঽ (U+09BD) is mainly used in Sanskrit, Pali, Prakrt or Maithili texts written in Bangla. It is gradually being replaced by an upper comma (e.g. নরোঽপরাণি re-written as নরো’পরাণি). It is rarely used now, even in other languages using Bangla script. In the case of the LGR, the Avagraha is not part of the repertoire. It has been decided, therefore, not to retain Avagraha (ঽ) (U+09BD), because it is blocked in TLDs as per the Maximal Starting Repertoire (MSR).
Please see Appendix II in section 11 for a complete list of Bangla consonants and their allographs.
3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)
This note is pertinent to the use of Zero Width Joiner (ZWJ) and Zero Width Non-Joiner (ZWNJ) as used in Bangla. It needs to be noted that Nepali, Konkani and Hindi use these two signs in a different manner.
ZWJ (U+200D) and ZWNJ (U+200C) are code points that have been provided by the Unicode standard to instruct the rendering of a string where the script has the option between joining and non-joining characters. Without the use of these control codes, the string may be rendered in an alternate form from what is intended.
Use of ZWJ
• In so far as Bangla is concerned, ZWJ is used for the proper rendering of characters such as khaṇḍa-ta ৎ, as in সত্যজিৎ (satyajit) "Satyajit" and সৎ (sat) "honest". This is typed as follows: ta + Hasanta + ZWJ (U+200D).
• However, ZWJ is more important where the same combination of consonantal characters is represented differently depending upon the context. E.g. র + ্ + য has two representations in Bangla: as র্য and as র‍্য. To get the form র্য one has to type in the following manner: র + ্ + য; but for র‍্য the sequence would be র + ZWJ + ্ + য [154]. In other words, ZWJ is used in the rendering of words demanding ya-phalā after ra, which is otherwise not possible to type (render) due to the same order of ra + hasanta + antastha ja in the medial and/or final position. Interestingly, ra + hasanta + antastha ja is used to type repha on the consonant antastha ja, as in কার্য (kaarjo). In order to get a ya-phalā after the consonant ra, it is therefore obligatory to use ZWJ after ra, as in র‍্যাপার (wrapper), র‍্যাশ (rash), র‍্যালি (rally), etc. The typing sequence is given below:
ra (র) + ZWJ + hasanta (্) + antastha ja (য) = র‍্য
Use of ZWNJ
• ZWNJ is used in Bangla to represent the explicit Hasanta or Halant. In order to avoid conjunct formation in cases where there is an explicit hasanta before the succeeding consonant, the ZWNJ is used.
Consonant + hasanta + ZWNJ + consonant = explicit hasanta
Example: প্রাক্‌কথন (prakkathana, prakkɔtʰon)
The use of ZWJ/ZWNJ has been ruled out from the root zone by the [Procedure]. Used in Bangla to create alternate renderings, the insertion of these two signs can affect searching as well as NLP.
The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Hasanta) where default conjunct formation is to be explicitly restricted and the Hasanta joining the two consonants participating in the conjunct formation needs to be explicitly shown.
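The sequences discussed in this section, and the consequence of excluding the joiners from the root zone, can be illustrated as follows. This is a sketch, not the LGR's implementation: the constant and function names are hypothetical.

```python
# Illustrative sketch only; names are hypothetical.
RA, YA, TA, KA = "\u09B0", "\u09AF", "\u09A4", "\u0995"
HASANTA = "\u09CD"
ZWJ, ZWNJ = "\u200D", "\u200C"

# Sequences described in the text above.
REPHA_ON_YA = RA + HASANTA + YA              # repha over ya
RA_WITH_YA_PHALA = RA + ZWJ + HASANTA + YA   # ra followed by ya-phala
EXPLICIT_HASANTA = KA + HASANTA + ZWNJ + KA  # visible hasanta, no conjunct


def contains_joiner(label):
    """True if the label uses ZWJ or ZWNJ, which the root zone does not admit."""
    return ZWJ in label or ZWNJ in label


if __name__ == "__main__":
    for name, seq in [
        ("repha on ya", REPHA_ON_YA),
        ("ra + ya-phala", RA_WITH_YA_PHALA),
        ("explicit hasanta", EXPLICIT_HASANTA),
    ]:
        cps = " ".join(f"U+{ord(c):04X}" for c in seq)
        print(f"{name}: {cps} | uses joiner: {contains_joiner(seq)}")
```

A practical consequence is that a form such as ra + ya-phala, which depends on ZWJ, cannot appear in a root-zone label, whereas the repha form can.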
3.3.9 Use of Ya-phalaa
Ya-Phalaa sequences are two instances in Bangla where Hasanta is preceded by a full vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E):
• অ্যা 0985 09CD 09AF 09BE: BENGALI LETTER A + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
• এ্যা 098F 09CD 09AF 09BE: BENGALI LETTER E + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD Hasanta plus U+09AF ya, preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ākara is used to elicit the æ sound, as in English ‘acid’ অ্যাসিড, ‘association’ অ্যাসোসিয়েশন, ‘bat’ ব্যাট, ‘fat’ ফ্যাট, ‘mat’ ম্যাট, ‘cap’ ক্যাপ, etc.
The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a ‘vowel killer’, although it actually indicates absence of a vowel after the marked consonant. Only the consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in the orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (Cf. Unicode 10.0, p. 473 [100]).
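The constraint described above (Hasanta normally follows a consonant, with the two vowel + Ya-phalā + AA sequences as the only exceptions) can be sketched as a context check. This is an illustration under assumptions, not the actual WLE rule of the LGR: the function name is hypothetical, consonants are approximated by the range U+0995 to U+09B9, and nukta sequences are not handled.

```python
# Illustrative sketch only; not the LGR's actual context rule.
A, E = "\u0985", "\u098F"
YA, AA = "\u09AF", "\u09BE"
HASANTA = "\u09CD"


def hasanta_positions_ok(text):
    """Check that each Hasanta follows a consonant, or sits in A/E + Hasanta + YA + AA."""
    for i, ch in enumerate(text):
        if ch != HASANTA:
            continue
        prev = text[i - 1] if i > 0 else ""
        # Simplifying assumption: basic consonant range only, no nukta sequences.
        if prev and 0x0995 <= ord(prev) <= 0x09B9:
            continue
        # The two vowel + ya-phalaa + aa sequences named in the text.
        if prev in (A, E) and text[i + 1:i + 3] == YA + AA:
            continue
        return False
    return True


if __name__ == "__main__":
    acid = A + HASANTA + YA + AA + "\u09B8\u09BF\u09A1"   # 'acid'
    bad = "\u0987" + HASANTA + YA                          # Hasanta after letter I
    print(hasanta_positions_ok(acid), hasanta_positions_ok(bad))
```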
3.3.10 Formation of Ra-phalaa and Reph Sequences
This case refers to the formation of repha and ra-phalā as follows:
Ra-Hasanta = (C2 H), where C2 is either
09B0 (র - BENGALI LETTER RA) or
09F0 (ৰ - ASSAMESE LETTER RA; Unicode name BENGALI LETTER RA WITH MIDDLE DIAGONAL),
and H is 09CD (্ - BENGALI SIGN VIRAMA).
Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALĀ). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র chakra "cycle"). The point is that in both cases the slot for ra could be Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed/preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā in both cases remain the same. The LGR makes a note of this point of concern with respect to the two RAs in disguise, as it would be completely impossible to distinguish between them with the naked eye in a label so generated, which may consequently lead to concerns related to spoofing and other kinds of cyber irregularities. The motive to class these two CPs as (blocking) variants is that fully rendered labels may mask the distinction between Bangla ra র (U+09B0) and the Assamese ra ৰ (U+09F0). That provides the justification for Variant Set 4, though only in the context of a following Hasanta. The difference between the RAs is only distinguishable if one looks into their Unicode values. Therefore labels such as অর্ক arka, শীর্ষ sīrsa ‘top, apex’, অভ্র abhra ‘cloud, the sky’, শ্রম śrama ‘physical labour’ could be extremely dangerous, as the web user may never verify the digital content (the labels) against its Unicode code points. This point is made explicitly with reference to Table 9 (of sequences, p. 36) and Table 16 (of WLE Symbols, p. 47) that are to follow. Moreover, it is noteworthy that the REPHA can also occur with KHANDA TA. The conditions in this context of KHANDA TA are liable to be such that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
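The confusability described here can be made concrete by folding the Assamese ra onto the Bangla ra wherever it touches a Hasanta, and then comparing labels. The sketch is an illustration only: the function names are hypothetical, and treating adjacency on either side of the Hasanta as the trigger is an assumption made for this example, not the LGR's definition of Variant Set 4.

```python
# Illustrative sketch only; names and the adjacency rule are assumptions.
BANGLA_RA = "\u09B0"
ASSAMESE_RA = "\u09F0"
HASANTA = "\u09CD"


def fold_ra_near_hasanta(label):
    """Replace U+09F0 by U+09B0 when it is directly next to a Hasanta."""
    out = []
    for i, ch in enumerate(label):
        if ch == ASSAMESE_RA:
            before = i > 0 and label[i - 1] == HASANTA
            after = i + 1 < len(label) and label[i + 1] == HASANTA
            if before or after:
                ch = BANGLA_RA
        out.append(ch)
    return "".join(out)


def look_alike(label_a, label_b):
    """True if two labels coincide once repha/ra-phala contexts are folded."""
    return fold_ra_near_hasanta(label_a) == fold_ra_near_hasanta(label_b)


if __name__ == "__main__":
    arka_bangla = "\u0985" + BANGLA_RA + HASANTA + "\u0995"
    arka_assamese = "\u0985" + ASSAMESE_RA + HASANTA + "\u0995"
    print(arka_bangla == arka_assamese, look_alike(arka_bangla, arka_assamese))
```

The two spellings of arka differ as code point sequences but compare equal after folding, which is the situation a blocking variant is meant to catch.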
4 Overall Development Process and Methodology
The Neo-Brāhmī Generation Panel (NBGP) has been formed by members having experience in Linguistics (especially in NLP and Computational Linguistics), Literature, Language History and Epigraphy. Under the Neo-Brāhmī Generation Panel, Bangla and eight other scripts belonging to separate Unicode blocks are being taken up to assign a separate LGR for each. However, an attempt is made to ensure that the fundamental philosophy behind building those LGRs is consistent with all other Brāhmī-derived scripts. The present LGR will cater to multiple languages belonging to EGIDS scale 1 to 4 (see Table 4) that use Bangla script. The following guiding principles are used in making decisions about Bangla LGR Code-points.
4.1 Guiding Principles
The NBGP adopts the following broad principles for selection of code-points in the code-point repertoire across the board for all the Neo-Brāhmī scripts within its ambit.
4.1.1 Inclusion Principles
4.1.1.1 Modern Usage
Every character proposed should be in the everyday usage of a particular linguistic community. The characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code-point repertoire.
4.1.1.2 Unambiguous Use
Every character proposed should have unambiguous understanding among linguists about its usage in the language.
4.2 Exclusion Principles
The main exclusion principle is that of External Limits on Scope. These consist of protocols or standards which are prerequisites to the Label Generation Rule-sets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.
4.2.1 External Limits of Scope
The code point repertoire for the root zone being a very special case at the top of protocol hierarchies, the canvas of available characters for selection as a part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:
i. The Unicode Chart
Out of all the characters that are needed by the script in question, if a particular character is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, and especially so in the Bangla-Asamiyā-Manipuri Writing System, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.
ii. IDNA Protocol
Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded, as far as possible, all the possible characters needed by the script. However, the Domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters out of the Unicode repertoire from being part of domain names.
iii. Maximal Starting Repertoire (MSR)
The Root-zone LGR being the repertoire of characters which are going to be used for creation of the Root-zone TLDs, which in turn constitute an even more specialized case of domain names, the ROOT LGR procedure introduces additional exclusions on IDNA's allowed set of characters.
Example: Bangla Sign Avagraha ঽ (U+09BD), even if allowed by the IDNA protocol, is not permitted in the Root Zone Repertoire as per the MSR.
To sum up, the restrictions start off with admitting only such characters as are part of the code-block of the given script/language. The IDNA Protocol further narrows this down, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
4.2.1.1 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brāhmī-based scripts will not be included.
4.2.1.2 No Symbols and Abbreviations
Abbreviations, weights and measures, and other such iconic characters like BANGLA ISSHAR ৺ (U+09FA), BANGLA CURRENCY DENOMINATOR SIXTEEN ৹ (U+09F9), etc. will also not be included.
4.2.1.3 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, such as the Sanskritic VOCALIC RR ৠ (U+09E0) and VOCALIC L ঌ (U+098C), as well as VOCALIC LL ৡ (U+09E1) and the allographic -kara forms of the latter two symbols, VOWEL SIGN VOCALIC L ৢ (U+09E2) and VOWEL SIGN VOCALIC LL ৣ (U+09E3). All such characters are excluded, which complies with the Conservatism principle as laid down in the Root Zone LGR procedure. However, in Bangla the -kara corresponding to VOCALIC RR ৠ (U+09E0), which is VOWEL SIGN VOCALIC RR ৄ (U+09C4), is still in active use in certain limited borrowed or Sanskritic words and is therefore retained.
4.2.1.4 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit will not be included. This is also in consonance with the Letter principle as laid down in the Root Zone LGR procedure.
4.2.1.5 ABNF
The Augmented Backus-Naur Formalism (ABNF) is described in Section 5.4.1 and Appendix (Section 10.1).
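The exclusions named in this section can be collected into a simple screening table. The sketch below is illustrative only: it lists just the code points that the text above explicitly mentions as excluded (it is not the full MSR or IDNA logic), and the names `EXCLUDED_EXAMPLES` and `excluded_in` are hypothetical.

```python
# Illustrative sketch only: covers the examples named in the text, nothing more.
EXCLUDED_EXAMPLES = {
    0x09BD: "Avagraha (not permitted by the MSR)",
    0x09FA: "Isshar (symbol/abbreviation)",
    0x09F9: "Currency denominator sixteen (symbol)",
    0x09E0: "Vocalic RR (rare/obsolete)",
    0x098C: "Vocalic L (rare/obsolete)",
    0x09E1: "Vocalic LL (rare/obsolete)",
    0x09E2: "Vowel sign vocalic L (rare/obsolete)",
    0x09E3: "Vowel sign vocalic LL (rare/obsolete)",
}


def excluded_in(label):
    """List the code points of `label` that this section names as excluded."""
    return [f"U+{ord(c):04X}" for c in label if ord(c) in EXCLUDED_EXAMPLES]


if __name__ == "__main__":
    print(excluded_in("\u0995\u09BD"))  # ka + avagraha
    print(excluded_in("\u0995\u09C4"))  # ka + vowel sign vocalic RR (retained)
```

VOWEL SIGN VOCALIC RR (U+09C4) is deliberately absent from the table, since the text states that it is retained.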
5 Repertoire
The Bangla Writing System is represented in UNICODE using the Bengali (Bangla) script name as enumerated in ISO 15924, corresponding to languages such as Asamiyā (Assamese), Bangla (Bengali) and Manipuri. The BENGALI block used for Bangla-Asamiyā-Manipuri in UNICODE has 93 entries. This section details the code-point repertoire that the Neo-Brāhmī Generation Panel [NBGP] proposes to be included in the Bangla LGR.
It may be mentioned here that the Government of Assam submitted a proposal to the Bureau of Indian Standards (BIS) on 26th February 2016 for dis-unification of the Bangla and Asamiyā Scripts. The BIS, in its 8th Meeting of the Indian Language Technologies and Products Sectional Committee, LITD 20, held on 23rd Aug 2017, decided to refer the proposal for recognition of the Assamese script in ISO/IEC 10646 to ISO. Until the UNICODE Consortium takes any further action, it will be assumed that the Code Point Repertoire under Table 11 will be valid for all the three languages as above.
For each of the code points, language references have been given in the last column titled Reference under Table 8, titled the "Code Point Repertoire". For entire coverage of Bangla code points, references of Bangla, Asamiyā (Assamese), Manipuri (Meitei) and Bishnupriya are given. Kokborok written in Bangla script is not known to have introduced many new complications except for one particular character. Though only a few representative languages under EGIDS Scale 1-4 have been chosen for referencing, they together cover all the code-points required for all the languages that NBGP has considered, as given under Bangla Unicode Points (as given in UNICODE 6.3).
However, before the details are presented, it is ideal to look at the Bangla Code Point Chart from Maximal Starting Repertoire [MSR] Version 3. It may be noted that the shapes of the reference glyphs given below in the code charts are based on one of the many fonts designed and are not prescriptive, because there could be some variations in actual fonts, both UNICODE-compatible and True-Type ones. Consider the following Code point table.
Figure 1: Bangla Code Page from [MSR] for Bangla-Asamiyā-Manipuri
Colour convention:
All characters that are included in the [MSR]: Yellow background
PVALID in IDNA 2008 but excluded from the [MSR]: Pinkish background
Not PVALID in IDNA 2008, or ineligible for the root zone (digits, hyphen): White background
Given the Bangla Unicode Block as in Figure 1, among the code points that are included in the MSR, the following symbols will need a separate treatment:
ৎ U+09CE Bangla Letter Khanda-Ta
ৰ U+09F0 Asamiyā-Bangla Letter Ra With Middle Diagonal
ৱ U+09F1 Asamiyā-Bangla Letter Ra With Lower Diagonal
26
51 CodePointRepertoireInclusion
No. | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
1 U+0981 ঁ BENGALISIGNCANDRABINDU
Candra-bindu
1Bangla2Manipuri2Assamese
[112][122][125]
2 U+0982 ং BENGALISIGNANUSVARA
Onushshar(Anusvara)
1Bangla2Manipuri2Assamese
[112][122][125]
3 U+0983 ঃ BENGALISIGNVISARGA
Biśarga(Visarga)
1Bangla2Manipuri2Assamese
[112][122][125]
4 U+0985 অ BENGALILETTERA
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
5 U+0986 আ BENGALILETTERAA
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
6 U+0987 ই BENGALILETTERI
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
7 U+0988 ঈ BENGALILETTERII
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
8 U+0989 উ BENGALILETTERU
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
9 U+098A ঊ BENGALILETTERUU
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
10 U+098B ঋ BENGALILETTERVOCALICR
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
11 U+098F এ BENGALILETTERE
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
12 U+0990 ঐ BENGALILETTERAI
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
13 U+0993 ও BENGALILETTERO
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
14 U+0994 ঔ BENGALILETTERAU
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
15 U+0995 ক BENGALILETTERKA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
16 U+0996 খ BENGALILETTERKHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
17 U+0997 গ BENGALILETTERGA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
18 U+0998 ঘ BENGALILETTERGHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
19 U+0999 ঙ BENGALILETTERNGA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
20 U+099A চ BENGALILETTERCA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
21 U+099B ছ BENGALILETTERCHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
22 U+099C জ BENGALILETTERJA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
23 U+099D ঝ BENGALILETTERJHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
24 U+099E ঞ BENGALILETTERNYA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
25 U+099F ট BENGALILETTERTTA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
26 U+09A0 ঠ BENGALILETTERTTHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
27 U+09A1 ড BENGALILETTERDDA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
28 U+09A1 U+09BC (U+09DC)
ড় Normalized form of BENGALI LETTER RRA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125] 09DC is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
29 U+09A2 ঢ BENGALILETTERDDHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
30 U+09A2 U+09BC (U+09DD)
ঢ় Normalized form of BENGALI LETTER RHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125] 09DD is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
31 U+09A3 ণ BENGALILETTERNNA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
32 U+09A4 ত BENGALILETTERTA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
33 U+09A5 থ BENGALILETTERTHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
34 U+09A6 দ BENGALILETTERDA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
35 U+09A7 ধ BENGALILETTERDHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
36 U+09A8 ন BENGALILETTERNA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
37 U+09AA প BENGALILETTERPA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
38 U+09AB ফ BENGALILETTERPHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
39 U+09AC ব BENGALILETTERBA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
40 U+09AD ভ BENGALILETTERBHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
41 U+09AE ম BENGALILETTERMA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
42 U+09AF য BENGALILETTERYA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
43 U+09AF U+09BC (U+09DF)
য় Normalized form of BENGALI LETTER YYA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125] 09DF is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
44 U+09B0 র BENGALILETTERRA
Consonant 1Bangla2Manipuri
[112][125]
45 U+09B2 ল BENGALILETTERLA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
46 U+09B6 শ BENGALILETTERSHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
47 U+09B7 ষ BENGALILETTERSSA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
48 U+09B8 স BENGALILETTERSA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
49 U+09B9 হ BENGALILETTERHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
50 U+09BE া BENGALIVOWELSIGNAA
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
51 U+09BF ি BENGALIVOWELSIGNI
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
52 U+09C0 ী BENGALIVOWELSIGNII
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
53 U+09C1 ু BENGALIVOWELSIGNU
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
54 U+09C2 ূ BENGALIVOWELSIGNUU
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
55 U+09C3 ৃ BENGALIVOWELSIGNVOCALICR
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
56 U+09C4 ৄ BENGALIVOWELSIGNVOCALICRR
Kāra(Mātrā)
1Bangla2Assamese
[112][122]
57 U+09C7 ে BENGALIVOWELSIGNE
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
58 U+09C8 ৈ BENGALIVOWELSIGNAI
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
59 U+09CB ো BENGALIVOWELSIGNO
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
60 U+09CC ৌ BENGALIVOWELSIGNAU
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
61 U+09CD ্ BENGALISIGNVIRAMA
Hasanta(=Halant)Virama(=Da ri)
1Bangla2Assamese2Manipuri
[112][122][125]
62 U+09CE ৎ BENGALILETTERKHANDATA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
63 U+09F0 ৰ BENGALILETTERRAWITHMIDDLEDIAGONAL
Consonant 2Assamese [122]
64 U+09F1 ৱ BENGALILETTERRAWITHLOWERDIAGONAL
Consonant 2Assamese2Manipuri
[122][125]
Table 8: Bangla Code-Point Repertoire

Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes some specific sequences. These enable conditional inclusion of Bangla LETTER A and LETTER E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, again followed by Bangla VOWEL SIGN AA, in the repertoire, so that the /æ/ sound as in English 'bat', 'cat', etc. can be written.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not an exhaustive list) | Reference
S1 | U+0985 U+09CD U+09AF U+09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112][122]
S2 | U+098F U+09CD U+09AF U+09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9: Sequences
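The two sequences in Table 9 can be checked mechanically. The following is a minimal sketch, not part of the proposal's XML; the names `SEQUENCES` and `describe` are illustrative. It builds S1 and S2 from the listed code points and looks up the Unicode character names with Python's standard `unicodedata` module.

```python
import unicodedata

# Code point sequences copied from Table 9 (S1 and S2).
SEQUENCES = {
    "S1": (0x0985, 0x09CD, 0x09AF, 0x09BE),
    "S2": (0x098F, 0x09CD, 0x09AF, 0x09BE),
}


def describe(code_points):
    """Return the string for a sequence and the Unicode name of each member."""
    text = "".join(chr(cp) for cp in code_points)
    names = [unicodedata.name(ch) for ch in text]
    return text, names


if __name__ == "__main__":
    for label, cps in SEQUENCES.items():
        text, names = describe(cps)
        print(label, [f"U+{cp:04X}" for cp in cps], names)
```

Both sequences differ only in their first member (LETTER A versus LETTER E); the remaining three code points are shared.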
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find a place in Unicode but have not been included in the repertoire in this LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.

Sr. No. | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10: Excluded Code Points
5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire since it will never be used alone. It is used only as part of a sequence, in the normalized forms of the three special characters ড়, ঢ় and য়.

Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড়, ঢ় and য় respectively.
Table 10b: Excluded Code Points
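The relationship between the precomposed letters and the nukta sequences can be illustrated with Unicode normalization. The sketch below uses Python's standard `unicodedata.normalize`; the dictionary name `EXPECTED` is illustrative. U+09DC, U+09DD and U+09DF are composition exclusions in Unicode, so NFC turns each of them into base letter plus U+09BC and never recomposes the pair.

```python
import unicodedata

# Precomposed letter -> normalized sequence, as listed in Table 8 (rows 28, 30, 43).
EXPECTED = {
    "\u09DC": "\u09A1\u09BC",  # RRA -> DDA + NUKTA
    "\u09DD": "\u09A2\u09BC",  # RHA -> DDHA + NUKTA
    "\u09DF": "\u09AF\u09BC",  # YYA -> YA + NUKTA
}


def nfc(text):
    """Return the NFC form, the normalization form required for IDN labels."""
    return unicodedata.normalize("NFC", text)


if __name__ == "__main__":
    for precomposed, sequence in EXPECTED.items():
        result = nfc(precomposed)
        print(f"U+{ord(precomposed):04X} ->",
              " ".join(f"U+{ord(ch):04X}" for ch in result),
              "matches table:", result == sequence)
```

This is why the repertoire lists the two-code-point sequences rather than the single precomposed code points.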
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for .भारत or .ভারত, drafted between 20.11.2009 and 18.07.2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) began with the following variables:
C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta

The Augmented Backus-Naur Formalism (ABNF) will use the following operators:

Sr. Number | Operator | Function
1 | "|" | Alternative
2 | "[ ]" | Optional
3 | "*" | Variable Repetition
4 | "( )" | Sequence Group
Table 11: The ABNF Formalism
5.4.2 The Vowel Sequence
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given. To help users of other Brāhmī scripts, equivalents in Devanāgarī are provided wherever necessary.

A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra/onuʃʃār (B), a Candrabindu (D) or a Visarga/biʃɔrgo (X). The number of D, B or X which can follow a V in Bangla may not be restricted to one. Going by the rules illustrated in this document, formations such as VDD, VBB and VXX are clearly invalid orthographic units. However, it is valid and possible to have sequences in which a candrabindu is followed by an anusvāra on the one hand, or by a visarga on the other, as in হ্যাঁংচা 'hænchā' and হ্যাঁঃ 'hæn' respectively. The possibility of a Visarga or an Anusvāra (onuʃʃār) following a Candrabindu thus exists in Bangla.

A vowel can also optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phala. Ya-phala is a presentation form of U+09AF, the Bangla letter য or 'ya', represented by the sequence <U+09CD BENGALI SIGN VIRAMA (Bangla sign Hasanta or Virama), U+09AF য BENGALI LETTER YA>. Ya-phala has a special form ্য. When combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. 'aa' (ā)), it is used for transcribing [æ], as in the "a" of the English word "bat", written in Bangla as ব্যাট. A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example:
V অ अ

5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples:
VB অং अं
VD অঁ अँ
VX অঃ अः
VDB অঁং अँं
VDX অঁঃ अँः
VHCM অ্যা এ্যা

5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
VHCMB অ্যাং এ্যাং
VHCMD অ্যাঁ এ্যাঁ
VHCMX অ্যাঃ এ্যাঃ
VHCMDB অ্যাঁং এ্যাঁং
VHCMDX অ্যাঁঃ এ্যাঁঃ
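The vowel-sequence combinations above can be expressed as one regular expression over the ABNF variable letters. The sketch below is an illustration rather than the panel's implementation: `char_class`, `classes` and `is_vowel_sequence` are hypothetical helper names, and the code point ranges are read off Table 8. It covers only the generic ABNF shape; the later restriction that VHCM is limited to the sequences S1 and S2 is not enforced here.

```python
import re


def char_class(ch):
    """Map one code point to its ABNF variable letter ('?' if not in Table 8)."""
    cp = ord(ch)
    if cp == 0x0981:
        return "D"  # Candrabindu
    if cp == 0x0982:
        return "B"  # Anusvara
    if cp == 0x0983:
        return "X"  # Visarga
    if 0x0985 <= cp <= 0x098B or cp in (0x098F, 0x0990, 0x0993, 0x0994):
        return "V"
    if (0x0995 <= cp <= 0x09A8 or 0x09AA <= cp <= 0x09B0 or cp == 0x09B2
            or 0x09B6 <= cp <= 0x09B9 or cp in (0x09F0, 0x09F1)):
        return "C"
    if 0x09BE <= cp <= 0x09C4 or cp in (0x09C7, 0x09C8, 0x09CB, 0x09CC):
        return "M"
    if cp == 0x09CD:
        return "H"  # Hasanta / Virama
    if cp == 0x09CE:
        return "Z"  # Khanda Ta
    return "?"


def classes(text):
    """Translate a string into its string of ABNF variable letters."""
    return "".join(char_class(ch) for ch in text)


# V, optionally followed by HCM, optionally followed by B, D, X, DB or DX.
VOWEL_SEQUENCE = re.compile(r"V(?:HCM)?(?:DB|DX|B|D|X)?")


def is_vowel_sequence(text):
    return VOWEL_SEQUENCE.fullmatch(classes(text)) is not None


if __name__ == "__main__":
    for sample in ("\u0985\u0982", "\u0985\u09CD\u09AF\u09BE\u0981\u0983", "\u0985\u0982\u0982"):
        print(classes(sample), is_vowel_sequence(sample))
```

Putting the two-letter alternatives DB and DX first in the alternation is only a readability choice; because `fullmatch` is used, the regex engine backtracks to the right alternative in any order.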
5.4.3 The Consonant Sequence

5.4.3.1 A Single Consonant (C)
Example:
C ক क

5.4.3.2 A Consonant with Conditions
A Consonant optionally followed by a dependent vowel sign kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Example:
CM কি कि
CB কং कं
CD কঁ कँ
CX কঃ कः
CH ক্ क् (pure consonant)
CDB কঁং कँं
CDX কঁঃ कँः
5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB or DX.
Example:
CMB কীং कीं
CMD কাঁ काँ
CMX বীঃ वीः
CMDB কাঁং काँं
CMDX কাঁঃ काँः

5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Example:
CHC ন্ত → ন + ্ + ত (न + ् + त)
CHCHC ন্ত্র → ন + ্ + ত + ্ + র (न + ् + त + ् + र)
CHCHCHC ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য (न + ् + त + ् + र + ् + य)
5.4.3.5 Subsets
While considering its subsets, we take the combination CHC as a representative example; the same applies equally to CHCHC and CHCHCHC.

[A] The combination may be followed by M, B, D, X, DB or DX.
Example:
CHCM ক্কী → ক ্ ক ী (क्की → क ् क ी)
CHCB ক্কং → ক ্ ক ং (क्कं → क ् क ं)
CHCD ক্কঁ → ক ্ ক ঁ (क्कँ → क ् क ँ)
CHCX ক্কঃ → ক ্ ক ঃ (क्कः → क ् क ः)
CHCDB ক্কঁং → ক ্ ক ঁ ং (क्कँं → क ् क ँ ं)
CHCDX ক্কঁঃ → ক ্ ক ঁ ঃ (क्कँः → क ् क ँ ः)

[B] *3(CH)CM may further be followed by B, D, X, DB or DX.
Example:
CHCMB ক্কীং → ক ্ ক ী ং (क्कीं → क ् क ी ं)
CHCMD ক্কাঁ → ক ্ ক া ঁ (क्काँ → क ् क ा ँ)
CHCMX ক্কীঃ → ক ্ ক ী ঃ (क्कीः → क ् क ी ः)
CHCMDB ক্কাঁং → ক ্ ক া ঁ ং (क्काँं → क ् क ा ँ ं)
CHCMDX ক্কাঁঃ → ক ্ ক া ঁ ঃ (क्काँः → क ् क ा ँ ः)
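The consonant-sequence shapes in 5.4.3 can likewise be written as a single pattern over the variable letters. The following is a sketch under stated assumptions: it works directly on strings of variable letters such as "CHCMDB" (the row labels used above), the name `is_consonant_sequence` is illustrative, and a cluster ending in Hasanta is accepted only in the explicitly listed form CH.

```python
import re

# Either the pure consonant CH, or up to three CH pairs followed by a final C,
# an optional M, and an optional B, D, X, DB or DX.
CONSONANT_SEQUENCE = re.compile(r"CH|(?:CH){0,3}CM?(?:DB|DX|B|D|X)?")


def is_consonant_sequence(class_string):
    """Check a string of ABNF variable letters against the shapes of 5.4.3."""
    return CONSONANT_SEQUENCE.fullmatch(class_string) is not None


if __name__ == "__main__":
    for shape in ("C", "CH", "CMDX", "CHCHCHC", "CHCMDB", "CHCHCHCHC", "CMM"):
        print(shape, is_consonant_sequence(shape))
```

The bound `{0,3}` mirrors the operator `*3(CH)C`: at most three consonant-plus-Hasanta pairs before the final consonant, which gives clusters of at most four consonants.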
5.4.4 The Khanda-Ta sequence

5.4.4.1 A single 'Khanda'-Ta (Z)
Example:
Z ৎ

5.4.4.2 A Khanda Ta Combination (see footnote 10)
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) 'scolding'.
Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).

10 Refer to Rule P in Section 7, Table 16.

5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) can be described briefly here.

Let us take up S first. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ - BANGLA LETTER A and U+098F এ - BANGLA LETTER E). For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF ya preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the /æ/ sound as in English 'bat', 'fat', etc. The Brāhmī script by nature does not have a Hasanta after a vowel. Hasanta is generally described as the 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But, as we see here, Bangla ends up with a deviant feature in the orthography, wherein Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).

The other case, mentioned in the table above as P, refers to the formation of repha and ra-phalā in the said script. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra "cycle"). The point is that in both cases the slot for ra could be filled by the Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.
6 Variants
This section discusses the variants in the Bangla script. The NBGP categorizes these confusable variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community

For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formations. They can cause confusion even to a careful observer and are hence being proposed as variants.

Variants are generated in a script when two or more forms are produced with different storage or code points. In Bangla, the e-kāra, ā-kāra and o-kāra have different code points. One can type o with a consonant in one go, or type e-kāra and ā-kāra as two separate keys, and get the same result. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered a case of variant, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants
However, we propose two cases of true in-script variants in the Bangla script.

CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases wherein Hasanta with (U+09A5) থ (tha) appears as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1 স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2 ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)

The above combinations, if written in traditional orthography, could be a little confusing, as the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final position (as shown below in e.g. no. 1). It could also be typed wrongly, on the assumption that it was a হ (ha) U+09B9, increasing the risks in label writing and identification.
Examples:
1 স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2 ন্থ and ন্হ (as in গ্রন্থ grantha)
Fonts which represent the traditional Bangla writing system could tend to create this problem. Therefore these may be taken as cases of variants in Bangla.

CASE II
Another interesting example of a variant is encountered in the ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra).

'Repha' could be formed by two sequences (mainly because both Assamese and Bangla find a place in the same UNICODE points, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here the final ligatures look the same and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C

Where
B_RA = U+09B0 BENGALI LETTER RA (র), or
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)

Example:
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12: Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. Addition of Repha does not make any difference.

'Ra-phala' could be formed by two sequences on similar grounds, and the final ligatures would look the same:

(1) C1 + H + B_RA
(2) C1 + H + A_RA

Where C1 = any consonant except Khanda-ta

Example:
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusion for end users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP concluded that র and ৰ should be defined as variant code points, since a single variant set between র and ৰ covers all cases. This will, however, create blocked variant labels: if someone registers "ররর", the variant label "ৰৰৰ" will be generated as a variant and will be blocked, and vice versa. The blocking applies only at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, that is still possible.
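The label-level effect of the র/ৰ variant set can be sketched in a few lines. This is an illustration only: the names `index_form`, `are_variants` and `variant_labels` are hypothetical, and enumerating every mixed substitution is an assumption about how variant labels are generated, not something the text above states (it mentions only the fully substituted label).

```python
from itertools import product

# Variant pair from Case II: Bangla RA (U+09B0) and Assamese RA (U+09F0).
VARIANTS = {"\u09B0": "\u09F0", "\u09F0": "\u09B0"}


def index_form(label):
    """Replace every Assamese RA by Bangla RA so variant labels compare equal."""
    return label.replace("\u09F0", "\u09B0")


def are_variants(a, b):
    """Two distinct labels are variants if they share the same index form."""
    return a != b and index_form(a) == index_form(b)


def variant_labels(label):
    """Enumerate all labels reachable by swapping RA letters (assumed blocked)."""
    choices = [(ch, VARIANTS[ch]) if ch in VARIANTS else (ch,) for ch in label]
    return {"".join(combo) for combo in product(*choices)} - {label}


if __name__ == "__main__":
    registered = "\u09B0\u09B0\u09B0"
    print(len(variant_labels(registered)), "variant labels for a three-RA label")
    print(are_variants(registered, "\u09F0\u09F0\u09F0"))  # same length, all swapped
    print(are_variants(registered, "\u09F0\u09F0"))        # different length
```

Labels of a different length never share an index form, which matches the observation that ৰৰ and ৰৰৰৰ remain available after "ররর" is registered.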
6.2 Cross-Script Variants
A brief cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia (11) (formerly Oriya), keeping in mind the visual and technical confusion they may cause as labels in the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain other characters which are somewhat similar, they have not been included here. They are provided in the Appendix (10.2) for reference.

11 Unicode uses Oriya for the script, although Odia is now the official term used.
1 Bangla and Nāgarī/Devanāgarī Script

Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14 - Bangla and Devanāgarī cross-script variant code points

2 Bangla and Gurmukhi Script

Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15 - Bangla and Gurmukhī cross-script variant code points
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in section 3.2 when written in the Bangla (12) script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications.

12 As used by Unicode, denoting and including both Assamese and Maṇipuri.

Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories, as mentioned in the table provided in the code point repertoire (Section 5.1):

C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9) or (ae) Ya-phalā (V1 H C1 M1), where V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E), H is 09CD (্ - BENGALI SIGN VIRAMA), C1 is 09AF (য - BENGALI LETTER YA) and M1 is 09BE (া - BENGALI VOWEL SIGN AA). S1 and S2 are valid even if they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name BENGALI LETTER RA WITH MIDDLE DIAGONAL) and H is 09CD (্ - BENGALI SIGN VIRAMA)

Table 16 - Symbols used in WLE rules
It is also perhaps ideal to mention here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters" that could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, p. 4]. More details of such combinations can be seen in Pabitra Sarkar (1993) [135].

7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1 C is the set of C and CN, where CN is the set of normalized forms of ড়, ঢ় and য়
2 H must be preceded by C
Example:
3 M must be preceded by C
Example: কা
4 D must be preceded by either of V, C or M
Example: আঁ, খঁ, খাঁ, হাঁ
5 X must be preceded by either of V, C, M or D
Example: উঃখঃবঃাঃ দ ঃ
6 B must be preceded by either of V, C, M or D
Example: আং, ইং, কং
7 Z must be preceded by V, C, M, D, B, X or P
Example: ইৎকৎাৎাৎপ6ৎrৎ (S is not listed because S ends with M; Z may also follow S.)
8 V CANNOT be preceded by H
Details in 7.1.1, Case of V preceded by H
9 S CANNOT be preceded by H
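The nine rules can be read as "what may stand immediately before each symbol". The sketch below is one possible reading in Python, not the normative XML: all function and variable names are illustrative, the code point ranges are taken from Table 8, S is treated as ending in M (as the note to rule 7 says), and P is checked as "Hasanta preceded by র or ৰ".

```python
S_SEQUENCES = ("\u0985\u09CD\u09AF\u09BE", "\u098F\u09CD\u09AF\u09BE")  # S1, S2
NUKTA = "\u09BC"
NUKTA_BASES = "\u09A1\u09A2\u09AF"  # letters that may carry the nukta (rule 1)
RA_LETTERS = "\u09B0\u09F0"         # C2 in symbol P


def char_class(ch):
    """Map one code point to its WLE symbol ('?' if outside the repertoire)."""
    cp = ord(ch)
    if cp in (0x0981, 0x0982, 0x0983):
        return {0x0981: "D", 0x0982: "B", 0x0983: "X"}[cp]
    if 0x0985 <= cp <= 0x098B or cp in (0x098F, 0x0990, 0x0993, 0x0994):
        return "V"
    if (0x0995 <= cp <= 0x09A8 or 0x09AA <= cp <= 0x09B0 or cp == 0x09B2
            or 0x09B6 <= cp <= 0x09B9 or cp in (0x09F0, 0x09F1)):
        return "C"
    if 0x09BE <= cp <= 0x09C4 or cp in (0x09C7, 0x09C8, 0x09CB, 0x09CC):
        return "M"
    return {0x09CD: "H", 0x09CE: "Z"}.get(cp, "?")


def tokenize(label):
    """Split a label into (symbol, text) pairs, grouping S1/S2 and nukta forms."""
    tokens, i = [], 0
    while i < len(label):
        seq = next((s for s in S_SEQUENCES if label.startswith(s, i)), None)
        if seq:
            tokens.append(("S", seq))
            i += len(seq)
        elif label[i] in NUKTA_BASES and label[i + 1:i + 2] == NUKTA:
            tokens.append(("C", label[i:i + 2]))
            i += 2
        else:
            tokens.append((char_class(label[i]), label[i]))
            i += 1
    return tokens


# Rules 2-7: symbols that may immediately precede each symbol.
ALLOWED_BEFORE = {
    "H": {"C"}, "M": {"C"}, "D": {"V", "C", "M"},
    "X": {"V", "C", "M", "D"}, "B": {"V", "C", "M", "D"},
    "Z": {"V", "C", "M", "D", "B", "X"},
}


def violations(label):
    """Return a list of human-readable rule violations (empty if none)."""
    tokens, problems = tokenize(label), []
    for k, (cls, text) in enumerate(tokens):
        prev = tokens[k - 1][0] if k else None
        if prev == "S":
            prev = "M"  # S ends with M
        if cls == "?":
            problems.append(f"position {k}: code point outside the repertoire")
        elif cls in ALLOWED_BEFORE:
            ok = prev in ALLOWED_BEFORE[cls]
            if cls == "Z" and prev == "H" and k >= 2:
                # Z after P: the Hasanta must follow one of the two RA letters.
                ok = tokens[k - 2][0] == "C" and tokens[k - 2][1] in RA_LETTERS
            if not ok:
                problems.append(f"position {k}: {cls} may not follow {prev}")
        elif cls in ("V", "S") and prev == "H":
            problems.append(f"position {k}: {cls} may not follow H")  # rules 8, 9
    return problems


if __name__ == "__main__":
    for label in ("\u0995\u09BE", "\u09AD\u09B0\u09CD\u09CE\u09B8\u09A8\u09BE", "\u0995\u09CD\u0985"):
        print([f"U+{ord(c):04X}" for c in label], violations(label) or "ok")
```

Tokenizing S1/S2 before applying the context rules is what lets the Hasanta inside অ্যা and এ্যা escape rule 2, in line with the statement that S1 and S2 are valid even where the other context rules would reject them.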
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in adding such ligatures: any user can easily create them, even though they may not be attested combinations.

7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া bæŋkʌv ɪndiə (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE) (meaning "Bank of India"). This is a case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.

This is a unique situation necessitated by the lack of the hyphen, the space or the Zero Width Non-Joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic communities; hence it is explicitly prohibited by the NBGP.

In future, if required, and depending on the prevailing requirements from the community, a future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF

Below are a few examples which help one understand some of the rules the ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.

1 H, M, B, D or X cannot occur at the beginning of a Bangla word
Example:
্ক ्क
াক ाक
ংক ंक
ঁক ँक
ঃক ःक
As can be seen, such a combination will automatically result in a "golu" or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.

2 H is not permitted after V, B, D, X, M, S
Example:
অ अ
অং क ক क কঃ कः ি )क
3 The number of B, D or X permitted after a Consonant, a Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalid
Example:
কংং कंं
কঁঁ कँँ
কঃঃ कःः
কাঁঁ काँँ
কীঃঃ कीःः
অংং अंं
অঁঁ अँँ
অঃঃ अःः

4 The number of M permitted after a Consonant is restricted to one
Example:
কীী कीी

5 M is not permitted after V
Example:
ইা ঈৗ ईा ईौ

6 The combinations Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible
Example:
কংঃ कंः
কঃং कःं
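The six restrictions illustrated in 7.2 can be summarized as patterns over strings of the symbols C, V, M, B, D, X, H, S. The following sketch is an illustration only; the table name `RESTRICTIONS` and the function `broken_restrictions` are hypothetical, and the input is assumed to be already translated into symbols.

```python
import re

# One pattern per numbered restriction in Section 7.2.
RESTRICTIONS = {
    1: re.compile(r"^[HMBDX]"),     # may not begin a word
    2: re.compile(r"[VBDXMS]H"),    # H not permitted after V, B, D, X, M, S
    3: re.compile(r"BB|DD|XX"),     # at most one B, D or X in a row
    4: re.compile(r"MM"),           # at most one M after a consonant
    5: re.compile(r"VM"),           # M not permitted after V
    6: re.compile(r"BX|XB"),        # no Anusvara + Visarga in either order
}


def broken_restrictions(symbols):
    """Return the numbers of the restrictions a symbol string violates."""
    return [n for n, pattern in RESTRICTIONS.items() if pattern.search(symbols)]


if __name__ == "__main__":
    for symbols in ("HC", "VH", "CBB", "CMM", "VM", "CBX", "CMDB"):
        print(symbols, broken_restrictions(symbols))
```

A string such as CMDB violates none of the six patterns, consistent with the CMDB examples given in 5.4.3.3.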
8 Contributors

8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members

8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals' Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] BandyopadhyayChittaranjan1981DuiShatakerBanglaMudranoPrakashanKolkataAnandaPublishers
[102] BanerjiRD1919TheOriginoftheBengaliScriptKolkataNewDelhiAsianEducationalServices2003reprint
[103] ChatterjiSK1926TheOriginandDevelopmentoftheBengaliLanguageCalcuttaCalcuttaUniversityPress
[104] -----1939Bhasha-prakashBangalaVyakaran(AGrammaroftheBengaliLanguage)CalcuttaUniversityofCalcutta
[105] HaiMuhammadAbdul1964DhvaniVijnanOBanglaDhvani-tattwa(PhoneticsandBengaliPhonology)DhakaBanglaAcademy
[106]JhaSubhadra1958TheFormationofMaithiliLondonLuzacampCo
[107] KosticDjordjeDasRheaS1972AShortOutlineofBengaliPhoneticsCalcuttaStatisticalPublishingCompany
[108] MajumdarRC1971HistoryofAncientBengalCalcuttaGBhardwaj
[109] MazumdarBijaychandra19202000TheHistoryoftheBengaliLanguage(ReprCalcutta1920ed)NewDelhiAsianEducationalServices
[110] PandeyAnshuman2001ProposaltoEncodetheTirhutaScriptinISOIEC10646
[111] PalPalashBaran2001DhwanimalaBarnamalaKolkataPapyrus
[112] -----2007lsquoBanglaHarapherPanchParbarsquoInSwapanChakrabortyedMudranerSanskritiOBanglaBoiKolkataAbabhas
[113] RossFiona1999ThePrintedBengaliCharacteranditsEvolutionLondonCurzon
[114] ShastriMahamahopadhyayHaraPrasad1916HājārBacharērPurāṇaBāṅgālāBhāṣāyBauddhaGānōDōhāCalcuttaBangiyaSahityaParishat
[115] SinghUdayaNarayana(JointlyManiruzzaman)1983DiglossiainBangladeshandlanguageplanningCalcuttaGyanBharati214pp
[116] -----1987ABibliographyofBengaliLinguisticsMysoreCIILxii+316pp
[117] -----2017(withRajibChakrabortyBidishaBhattacharjeeampArimardanKumarTripathy)LanguagesandCulturesontheMarginGuidelinesforFieldworkonEndangeredLanguagesMimeoCentreforEndangeredLanguagesVisva-Bharati
[118] -----1980ScriptalchoiceandspellingreformAnessayinlanguageandplanningJournaloftheMSUniversityofBarodaSocialScienceNumber292173-186AmodifiedversionreprintedEAnnamalaiBjornJernuddandJoanRubinedsLanguagePlanningProceedingsofanInstituteMysoreCIIL405-417
[119] Sripantha1996JakhanChapakhanaEloKolkataPaschim-BangaBanglaAcademy
[120] SurAtul1986BanglaMudranerDushoBacharKolkataJijnasa
[121] ScriptBehaviourforBengaliVersion11TDILandC-DACPune
[122] BoraMahendra1981TheEvolutionofAssameseScriptJorhatAssamSahityaSabha
[123] ProposaltoEncodetheTirhutaScriptinISOIEC10646httpwwwunicodeorgL2L201111175r-tirhutapdfaccessedon25112017
[124] EthnologueAssameseintheLanguageCloudhttpswwwethnologuecomcloudasmaccessedon25112017
[125] BengalialphabetforManipurifoundinEthnologueManipuri(MeeteilonMeithei)httpswwwomniglotcomwritingmanipurihtmaccessedon20102019
[126] WikipediaBengalialphabethttpsenwikipediaorgwikiBengali_alphabetaccessedon25112017
[129]OmniglotSlyhetihttpwwwomniglotcomwritingsylotihtmaccessedon1052018
[130]WikipediaBishnupriyaManipurilanguagehttpsenwikipediaorgwikiBishnupriya_Manipuri_languageaccessedon25112017
[131] TheEMILLECIILCorpushttpmetashareeldaorgrepositorybrowsethe-emilleciil-corpusabdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20accessedon1052018
[132]TheEMILLECIILCorpushttpcatalogelrainfoproduct_infophpproducts_id=696accessedon1052018
[133] BanglaLanguageampScript httpswwwisicalacin~rc_banglabanglahtmlaccessedon1052018
[134] SarkarPabitra1992BanglaBananSanskarSamasyaoSambhabanaKolkataChirayataPrakashan
[135] SarkarPabitra1993BanglaBhasharYuktabyanjanBhasha1123-45
[136] DashNiladriShekharandBBChaudhuri1998BanglaScriptAStructuralStudyLinguisticsToday121-28Alsoavailableathttpswwwacademiaedu9967428Bangla_Script_A_Structural_Study
[137] DaniAhmedHasan(1957)lsquoSrīhatta-NāgarīLipirUtpattioBikāśrsquoBanglaAcademyPatrika(Dhaka)Vol12(Bhadra-Agrahayan1364BangabdaNumber)pg1
[138] WikipediaSylhetiNagari httpsenwikipediaorgwikiSylheti_Nagariaccessedon1952018
[139] FuruiRyosuke(2015)lsquoVariegatedAdaptationsStateFormationinBengalfromtheFifthtoSeventhCenturyrsquoinBhairabiPrasadSahuampHermannKulkeedsInterrogatingPoliticalSystemsIntegrativeProcessesandStatesinPre-ModernIndiaChapter9Pp255-73NewDelhiManohar
[140] FergusonCharesAandMunierChowdhury(1960)lsquoPhonesofBengalirsquoLanguageVol36No1pp22-59
[141] ShahidullahMuhammad(2007)BuddhistMysticSongsDhakaMowlaBrothers
[142] RayPunyaSloka(1966)BengaliLanguageHandbookWashington
[143] HaiMuhammadAbdul(1960)AphoneticandphonologicalstudyofnasalsandnasalizationinBengaliDhakaUniversityofDhaka
[144] UnicodeConsortiumProposalSummaryFormtoAccompanySubmissionsforAdditionstotheRepertoireofISOIEC10646UNICODEhttpswwwunicodeorgL2L200202387r-syloti-formpdfaccessedonMay212018
[145] WikipediaOlChiki(Unicodeblock)httpsenwikipediaorgwikiOl_Chiki_(Unicode_block)accessedonMay212018
[146]BanglaScripthttpwwwbangladesh2000combdbangla_scripthtmlaccessedonMay212018
[147] BhattacharyaAshutoshed(1942)GopichandrerGanCalcuttaCalcuttaUniversity
[149] DasSisirKumar(1975)SahibsandMunshisAnAccountoftheCollegeofFortWilliamCalcutta
[150] IslamRafiqulPabitraSarkarMahbubulHaqampRajibChakraborty(eds)(2014)BanglaAcademyPromitoBanglaByabaharikByakaran(AFunctionalGrammarofStandardBangla)DhakaBanglaAcademy
[151] SarkarPabitra[2013]lsquoBanglaSpellingReformtheLongandShortofItrsquoBanglaJournal19215-232
[152] BanglaAcademy(2012)BanglaAcademyPromitoBanglaBananerNiyam(StandardBanglaSpellingasadoptedbyBanglaAcademy)DhakaBanglaAcademy
[153] SarkarPabitraampRajibChakraborty2018ldquoWhathashappenedSoFarIntermsof Script Reformsrdquo Paper presented at the Face to Face meeting jointly held by theBanglaAcademyDhakaampICANNatBanglaAcademyDhakaon10072018
[154] The Unicode Consortium 2018 The Unicodereg Standard Version 110 ndash CoreSpecificationChapter12P473
57
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when applied to a specific language or script certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla¹³ in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold ‘varga’ (as defined under Table 5). Moreover, Bangla does not allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem ‘Lichuchor’ written by Kazi Nazrul Islam. However, there are instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning, as in য়াকুব. In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2. CH can come with Khaṇḍa ta only in the case where C is ra (র) (U+09B0): র্ৎ, as in ভর্ৎসনা.
3. Only the following combinations with VHCM will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড এ্যাসোসিয়েশান (acid association)
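Rule 1 lends itself to a simple mechanical check. The following is only a minimal sketch in Python and is not part of the normative LGR: it assumes that Khaṇḍa ta is encoded as the single code point U+09CE, and that the three letters named in rule 1 are the only ones barred from label-initial position. The names FORBIDDEN_INITIAL and violates_initial_rule are hypothetical.

```python
# Code points named in rule 1. Treating this list as exhaustive is an
# assumption made for illustration only.
FORBIDDEN_INITIAL = {
    "\u09CE",  # BENGALI LETTER KHANDA TA
    "\u099E",  # BENGALI LETTER NYA
    "\u0999",  # BENGALI LETTER NGA
}


def violates_initial_rule(label: str) -> bool:
    """Return True if the label starts with a code point barred by rule 1."""
    return bool(label) and label[0] in FORBIDDEN_INITIAL
```

A label beginning with U+09CE would be flagged, while the same code point in a later position would not.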
10.2 ‘Sylheti Nagarī lipi’ or ‘Siloti’
This version of Bangla script resembles the ‘Kaithī’ script (ISO 15924) used by the accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, and was widely in use during the 1880s. There were several other names of Sylheti Nagarī or Siloti (129), such as ‘Jalalabada Nagarī’, ‘Fula (flower) Nagarī’, ‘Muslim Nagarī’ or ‘Muhammad Nagarī’. It is said that Shah Jalala had brought the script with him in the 13th-14th Century to Sylhet (138), although some suggested that it was an invention by the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here (138).
Table 17 – The Script Table of Sylheti Nagarī or Siloti
¹³ This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable
Table 18 – Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhi
Bangla | Gurmukhi | NBGP Decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable
Table 19 – Bangla and Gurmukhi confusable code points
Bangla | Gurmukhi | NBGP Decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable
Table 20 – Bangla and Gurmukhi distinguishable code points
10.3.3 Bangla and Oriya (Odia)
Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable
Table 21 – Bangla and Oriya confusable code points
Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable
Table 22 – Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs
Consonants   Phonetic Value   Allographs
Clusters   Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (not privileged enough to have clusters as a first member) জ্ঝ (জ্+ঝ) ঞ্ঝ (ঞ্+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (not privileged enough to have clusters as a first member) ঙ্খ (ঙ্+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value but is mostly pronounced as n
ঞ্চ (ঞ্+চ) ঞ্ছ (ঞ্+ছ) ঞ্জ (ঞ্+জ) ঞ্ঝ (ঞ্+ঝ) জ্ঞ (জ্+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (not privileged enough to have clusters)
য dʒ The secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word it is a vowel æ (as in শ্যামল, র‍্যাপার etc.). But after a non-initial consonant it just doubles it in pronunciation (as in কার্য, ধার্য etc.). The র্+য combination has two physical manifestations: র‍্য and র্য
ক্য (ক্+য) স্য (স্+য) র‍্য (র্+য) [Just র‍্য is never used in Bangla orthography; র‍্যা is, but then its last two symbols, Ya-phala and ā-kara, constitute a vowel sign representing the vowel অ্যা]
র r Two manifestations:
(i) রেফ repʰ as the first member of a cluster, e.g. র্প, র্ৎ, র্দ্র, র্য, র্ধ্ব (র্+ধ্+ব) (earlier র্দ্ধ্ব = র্+দ্+ধ্+ব, a four-term cluster) etc. (placed over the following consonant)
(ii) র-ফলা rɔ-pʰɔla as the second/third member of a cluster, e.g. … etc. (placed under the consonant it follows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃকঃ
অব
The inscriptional evidence in Brāhmī is found in the Archaic Brāhmī from the 3rd century BC to the 1st century BC, in Middle Brāhmī soon after (1st-3rd Century AD), and from then on in the Late Brāhmī (4th-6th Century AD). This evidence can be seen in both Bangladesh and West Bengal [108] in (1) the Mahasthanagara (Bogra district, Bangladesh; the ancient name being Pundranagara or Paundravardhanapura) inscriptions, (2) Brāhmī (and Kharoṣṭhī) inscriptions from the lower ‘Gangetic Bengal’, and (3) copper plate inscriptions of the Imperial Guptas from the northern part of West Bengal and North-West Bangladesh, in the areas under Dharmaditya, Gopachandra and Samācāradeva (about whom one only knows from five copper plates found in Kotalipara in the Faridpur district in Bangladesh, one in Mallasarul in the Burdwan district (West Bengal) and one in Jayramapura (Ballesvara district, now in Odisha)). These epigraphs from the eastern part of Undivided India (dating back to the 4th-6th Centuries AD) showed some characteristic features of letters (especially in ম ‘ma’, ল ‘la’, শ ‘sa’, স ‘sa’ and হ ‘ha’), which led to the development of the eastern variety of Gupta script.
Epigraphic records from Bangladesh demonstrate remarkable developments in Eastern Brāhmī. In this context may be cited the Tippera copper plate inscription of the ‘Samatata’ rulers (139, pp. 265) such as Lokanātha (dated 7th Century AD, during the latter half), the Kailan inscription of Śrīdharaṇa Rāta, as well as the Astafpur copper plates. The letters seem to hang down from wedge-shaped solid triangles, with right-hand verticals bending down at the bottom, because of which the script was described by Prinsep and Fleet as Kuṭila-lipi (literally ‘Cursive writing style’), whereas the term Siddhamātrikā (as a matra or bar is placed over each of the letters) was used by Al Biruni (973-1048) to designate the script of Northern India. The next stage of development is illustrated by the 9th Century copper plate inscriptions from Khalimpur of the reign of Dharmapāla, from Monghyr and Nalanda of the time of Devapāla in Bihar, and from Jagjīvanpura (Malda) of the reign of Mahendrapāla. The Siddhamātrikā (mentioned as ‘Siddham’ in Chinese sources) is said to have been prevalent also in this region up to the end of the tenth century. Also called the Gauri (i.e. Gandi) in Pūrvadeśā or the Eastern country, it was regarded as the same script to which the appellative Proto-Bangla is given, with characteristics in rudimentary forms in the period between AD 875 and AD 1025. In some epigraphs it is considered as belonging to the second quarter of the eleventh century AD. Flattening of head-marks becomes prominent in comparison to the wedge-shaped serifs. An important landmark in the development of the Bangla script is the Ramaganja copper plate inscription of Mahāmānḍalika in the last quarter of the eleventh century AD. It is the earliest document from this entire region which bears the letter m with a tick rising upwards. The full vowel i develops a tick at the right end of the upper horizontal bar above and a curved hook below. Initial e approaches the modern Bangla character. A mature form of Proto-Bangla, the immediate precursor of Bangla script, is illustrated in the inscriptions of the Varmana, Sena and Deva rulers of the twelfth and thirteenth centuries [104].
The evolution of the Bangla script (cf. 136) is aligned with the story of the advancement of printing technology. The first “Movable type” scripts were technically created and used while printing Nathaniel Brassey Halhed's (1751-1830) 1778 book titled A Grammar of the Bengal Language. In 1785 Governor-General Warren Hastings (1732-1818) requested another civilian, Charles Wilkins (1749-1836), to cut punches for Bangla printing characters. The current printed form of Bangla script appeared soon after. It is generally agreed that Wilkins developed the Bangla print script [111]. He passed on this knowledge to Pancanana Karmakara (-1804), a renowned artist in Bengal. Later it was Karmakar and his family that became famous in Bangla printing technology. Shepherd was another assistant of Wilkins in this designing of the script, which became more angular with sharper turns and edges [133]. A few archaic letters were modernized during the 19th century. It was standardized by Pandit Ishwar Chandra Vidyasagar when the Bangla type fonts were to be used to publish on a large scale under the Calcutta School Book Society [116, for several references].
Much later, the Linotype technique invented by Ottmar Mergenthaler (1854-1899) in 1886 was introduced into Bangla printing in 1935 by the efforts of Suresh Chandra Majumdar (1888-1954), Rajsekhar Basu (1880-1960), Jatindra Kumar Sen (1882-1966) and his disciple Sushil Kumar Bhattacharya, and it began to be used by the Ānandabazara Patrika group, later followed by others. Within a few years the more advanced monotype technology came to be used in Bangla printing. However, in Bangla printing culture monotype had a very limited acceptance, and linotype held the stage until digital technology eventually came in to replace all earlier techniques. All these could be presented in a table:
PERIOD | DESCRIPTION | NAMES
3rd Century BC | Use of Brāhmī and Kharosthī scripts begins in the subcontinent. Brāhmī was widely used during the reign of the Mauryan King Aśoka. In one theory, Brāhmī is based on the North Semitic alphabet but suitably modified to fit the needs of local languages. It is currently believed to have been an independent development. | Brāhmī
1st-3rd Century AD | The Kusana script, named after the Kusana royal dynasty | Kusana script
4th-5th Century AD | The next stage of its evolution was into the Gupta script, named after the Gupta royal dynasty | Gupta script
7th Century AD | Epigraphic records from Bangladesh demonstrate remarkable developments in Eastern Brāhmī, giving rise to the Kuṭila-lipi | Kuṭila-lipi
8th Century AD | Some copper plate inscriptions are found in Khalimpur, Bangladesh, during the reign of Dharmapāla, from Monghyr and Nālandā in Bihar of the time of Devapāla, and from Jagjīvanapura in West Bengal of the reign of Mahendrapāla | Siddhamātrikā
9th Century AD until 1025 AD | Proto-Bangla characteristics in rudimentary forms develop. An important landmark in the development of the Bangla script is the Ramaganja copper plate inscription of Mahāmāndalika, found in the last quarter of the eleventh century AD | Proto-Bangla Script & Language
12th-13th Century AD | A mature form of Proto-Bangla, the immediate precursor of Bangla script, is found in the inscriptions of the Varmana, Sena and Deva rulers of the twelfth and thirteenth centuries | Matured Proto-Bangla
14th-15th Century AD | The characteristics of typical Bangla script began to develop, as could be seen in the copper plate inscription of Vijayamānikya-I of Tripura dated 1478 AD, which also illustrates forms of Bangla letters in the fifteenth century AD | Modern Bangla Script era begins (See Ross 1999)
16th-17th Century AD | The chart of the Bangla alphabet appended to the China Monuments published from Amsterdam in 1667 and The code of Gentoo law published from London in 1776 both show a chart of the Bangla alphabet. They show 16 Vowel letters, including the Long ‘ৡ’ ‘ḹ’, Anusvāra and Visarga, and 34 Consonants | Printed Charts of Bangla
18th-19th Century AD | Charles Wilkins develops printing in Bangla in 1778 and Vidyasagar reforms it | Bangla Type Fonts
Table 2: Development of the Bangla Writing System
The overall development of Bangla Script from the Kuṭila-lipi period to Modern Bangla can be seen in Table 3 ([102 and 146]; also see the web-page in 147).
Table 3: Bangla Script in Different Centuries
3.2 Languages Considered
Below is the tabular representation of the languages using Bangla script that are placed on EGIDS Scale 1-6 (see 117 for details). Some languages under EGIDS 5 and 6 have also developed their own scripts for printing and publishing. Some had used Bangla script earlier (such as Bodo) or used it in West Bengal at some point of time (Santali), but have later shifted to another writing system. Bodo is now written in Nāgarī or Devanāgarī, and for Santali one uses both Nāgarī/Devanāgarī and Ol-chiki (145). For the purposes of the Bangla LGR, only languages belonging to the EGIDS scale 1 to 4 have been considered. Consider the following table:
EGIDS Scale 1 | EGIDS Scale 2 | EGIDS Scale 3 | EGIDS Scale 4 | EGIDS Scale 5 | EGIDS 6
Bangla (Bengali)
Santali Bodo Riang Khumi Mru(ng) Asho
Lepcha Pnar Koda Kora Chak
Asamiyā (Assamese)
Koch or Rajabansī
Malto or Malpahariya
Manipuri or Meitei
Bisnupriya Manipuri Kok-Borok (Tripura & Bangladesh)
Chakma Hajong Mundari & Kurux (of Bangladesh)
Toto Rohingya Tippera Megam Tanchangya
Usoi Limbu Sadri or Oraon
Bhumij or Mundari Bawm Chin
Table 4: Main languages in India and Bangladesh that use Bangla Script on the EGIDS Scale
3.3 Notable Features of Bangla Script [150]
The Bangla Writing System has certain features that show how it has to be written or how type-setting in Bangla could be done. This section is followed by a section that explains the Code-points (and fixed Code-point sequences) which show certain distinctive characteristics of Bangla and which make up the Repertoire. The next sections will also cover the ‘akshar’-formation rules (ABNF) showing character class, Whole Label Evaluation (WLE) and Context Rules, as well as In-Script and Cross-Script Variants. Here we present some basic features of the Script and Pronunciation:
• The Bangla script is an alpha-syllabic writing system in which all consonants are assumed to contain an accompanying ‘inherent’ vowel (theoretically before or after each consonant). It varies between ɔ and o depending on the position of the consonant in the word. At times these ‘assumed’ or ‘inherent’ vowels are not pronounced at all [142].
• Vowels can be written as independent letters, or by using a variety of diacritical marks which are written above, below, before, after, or in both of the last two positions around the consonant they follow in pronunciation [105].
• All Bangla consonants, when pronounced in isolation, are uttered with an inherent vowel -ɔ; hence ক ‘k’, খ ‘kh’ or গ ‘g’ are usually pronounced as [kɔ], [khɔ] or [gɔ] etc. Phonologically, the Bangla vowel -ɔ corresponds to the Hindi schwa ə.
• When consonants occur together in clusters, special conjunct letters are formed. In printed Bangla many of these consonantal clusters or conjoined consonants are in use. The letters for the consonants other than the final one in the group are generally reduced. But there are a few special conjunct characters which are compounds of the consonant characters, e.g. ক্ (k) + ষ (ṣ) = ক্ষ (kṣ), ঞ্ (ñ) + জ (j) = ঞ্জ (ñj), জ্ (j) + ঞ (ñ) = জ্ঞ (jñ), হ্ (h) + ম (m) = হ্ম (hm). There are other issues also: র as the second member of a cluster is reduced to a secondary symbol, e.g. প্ (p) + র (r) = প্র (pr), ষ্ (ṣ) + ট্ (ṭ) + র (r) = ষ্ট্র (ṣṭr) (as in উষ্ট্র uṣṭra “camel”). য (y), when used as a primary symbol, represents jɔ in Bangla. But its secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word it is a vowel æ (as in শ্যামল (syamala) “green”, র‍্যাপার (ryapara) “wrapper” etc.). But after a non-initial consonant it just doubles it in pronunciation (as in কার্য, ধার্য etc.). The র্ (r) + য (y) combination has two renderings: র‍্য (ry) and র্য (ry). In the case of দ্ (d) + ধ (dh), গ্ (g) + ধ (dh) and ন্ (n) + ধ (dh), the shape of the second member is changed, e.g. দ্ধ (ddh), গ্ধ (gdh) and ন্ধ (ndh) respectively. The solitary example of র্ (r) + ঋ (ṛ) = র্ঋ (as in নৈর্ঋত nairṛt “Southwest”), used mostly in cases of Classical borrowings, shows the use of the secondary symbol of a consonant followed by the primary symbol of a vowel. The inherent vowel only applies to the final consonant of the cluster.
• In consonant clusters many consonants take a completely different form. Some typical examples are ক্ত (kt), ক্র (kr), ক্ষ (kṣ), গ্ধ (gdh), জ্ঞ (jñ), ঞ্চ (ñc), ঞ্জ (ñj), ত্ত (tt), ন্ত (nt), ন্ধ (ndh), ব্ধ (bdh), ভ্র (bhr), ম্ব (mb), স্ত (st) etc. র has two allographs apart from its full shape: one is ‘repha’, as found in র্ক (rk), র্প (rp), and another is ra-phala, as in প্র (pr), ক্র (kr). ষ্ণ (ṣ+ṇ) is another one, where the cerebral nasal consonant sign takes a queer shape [151].
• The Bangla script has at least fifty-two primary symbols and quite a few allographs (positional variants of them) corresponding to forty-four phonemes (7 oral and 7 nasal vowels and 30 consonants) (150), or functional speech sounds, with some obvious redundancies, although in one of the first phonemic analyses the number was thought to be thirty-five phonemes [140].
• As mentioned above, in Bangla several graphemic symbols have secondary shapes, technically called ‘allographs’, with a complementary distribution in each case. These graphs or markings are generally added to the following positions of the primary symbol [113] in the following manner:
1) Below (e.g. কু (ku), ন্ত (nta), কূ (kū), হ্র (hra) etc.)
2) Above (e.g. চঁ (cã), র্ক (rka) etc.)
3) Right side (e.g. কা (ka), কং (kan) etc.)
4) Left side (e.g. কে (ke))
5) Left side and above simultaneously (e.g. কৈ (kai), কি (ki) etc.)
6) Right side and above simultaneously (e.g. কী (kī))
7) Right side and left side simultaneously (e.g. কো (ko))
8) Right side, left side and above simultaneously (e.g. কৌ (kau))
• As for the complementary distribution of vowel letters (word- or syllable-initial) and Vowel Matras, which are relevant for ABNF, let us consider the following. Besides some simple Vowel Modifiers called ‘Kars’ in Bangla (also referred to as Matra in the other LGR documents of Neo-Brāhmī), there are some combinatory modifiers of Bangla Vowels with certain consonants. For example, whereas
আ U+0986 BENGALI LETTER AA is substituted by া U+09BE BENGALI VOWEL SIGN AA,
ই U+0987 BENGALI LETTER I is substituted by pre-posed ি U+09BF BENGALI VOWEL SIGN I,
ঈ U+0988 BENGALI LETTER II is substituted by ী U+09C0 BENGALI VOWEL SIGN II, or
উ U+0989 BENGALI LETTER U is substituted by ু U+09C1 BENGALI VOWEL SIGN U by marking below the primary grapheme,
there are some special vowel modifiers of উ, as in the following combined letters:
গু gu, rather than writing as গ (g) + ু (u)
রু ru, rather than writing as র (r) + ু (u)
শু śu, rather than writing as শ (ś) + ু (u)
হু hu, rather than writing as হ (h) + ু (u)
ন্তু ntu, rather than writing as ন্ (n) + ত (t) + ু (u)
Similarly, there could be vowel modifiers of ঊ or ‘(Long) ū’ as well, e.g. ভ্ (bh) + র (r) (ভ্রূ bhrū “eyebrow”), শ্ (ś) + র (r) (শ্রূ śrū), ঋ (ṛ) after হ (h) (হৃ hṛ) etc.
• There have been many notable contributions in simplifying and modifying Bangla spellings and combinatory techniques, especially by scholars such as Pabitra Sarkar (1992) [134]. In this there has been an attempt to reduce the number of allographs of both vowels and consonants in clusters, and it has been widely accepted in the printing of school texts in both Bangladesh and West Bengal [151, 152]. As of now, two systems, the old (traditional) and the new, go on side by side, operative in different domains.
However, in the preparation of this LGR document the aim has been to consider the widely used and usable sequences and combinations and their variations across the sister scripts belonging to the basket of Brāhmī writing systems. Bangla Academy, Dhaka published Standard Bangla Spelling Rules in 1992, following the recommendations of a committee constituted through a workshop jointly organized by the Jatīya Śikṣākrama and Pathyapustaka Board in 1988. A thoroughly revised edition of the Rules was published in September 2012.⁶ After the establishment of the Bāṅlā Ākādemi of West Bengal in 1986, its first President Annadasankar Ray (1904-2002), in his inaugural address, gave a direction for standardization of the Bangla alphabet/script and the spelling system, and clearly argued that they would not blindly follow the Sanskritic model of conventional grammar. A broad list of proposals was sent to experts on Bangla, and a broad agreement was reached for ‘homogenization of Bangla spelling’ by 1988. Based on opinions received from different quarters, a unanimous list of ‘rules’ was agreed upon. This was published as a ‘Spelling Dictionary’ titled Ākādemi Bānāna Abhidhāna (1997), which was obviously more comprehensive than ‘The University of Calcutta proposals’ made in 1936. Along with the ‘rationalization’ of spellings, another step was taken to make the writing system easier to read by making the symbols used, both single and combined ones, more ‘transparent’. These reforms were originally suggested by Sarkar (1987, first published in 1978) [134] [153], where he used the terms Swaccha (‘Transparent’) and Aswaccha (‘Opaque’ or non-transparent), even adding Ardha Swaccha (‘half transparent’) in between the two. Some sample examples are:
Transparent: ন্ন (nn), প্ত (pt), স্ত (st), where both members of the cluster can be recognized.
Opaque, where neither of the two could be (easily) recognized: ক্ষ kṣ (ক্ k + ষ ṣ), জ্ঞ jñ (জ্ j + ঞ ñ), ঙ্গ ṅg (ঙ্ ṅ + গ g), হ্ম hm (হ্ h + ম m).
Semi-transparent: প্র (pr), র্প (rp), where one symbol is recognizable and the other is not. In the case of three-term clusters at least one symbol will not be transparent, e.g. স্ত্র str (স্ s + ত্ t + র r), ষ্ট্র ṣṭr (ষ্ ṣ + ট্ ṭ + র r) etc.
There were in fact two types of proposals. One concerned the shape of the letters, those of consonant + vowel (CV) combinations and conjuncts, that is, consonant + consonant combinations. There were further complex shapes, i.e. those of consonant + consonant + (consonant +) vowel (CC(C)V) signs, as in প্রু (pru) or স্ক্রু (skru). Some decisions in this area were necessary because a few of the CC(C) symbols represented complexities that made learning them difficult for children. The other dealt with the spellings of words only, without any reference to the shapes of letters in which they were written. The basic objective here was ‘one word, one spelling’ to the greatest extent that was possible [151].
⁶ Bangla Academy, 2012. Bāṅlā Ekaḍemī Pramita Bāṅlā Bānānera Niyama (Bangla Academy Standard Bangla Spelling Rules). Dhaka: Bangla Academy.
Below we place a statement of the most salient changes that affect the consonant + vowel combinations [153].
a. The variants of the short u (হ্রস্ব উ-কার hrasva u-kāra) vowel sign have been brought down to one, i.e. ু. So the ligature for গু (gu) is now written গ + ু. Similarly রু (ru) > র + ু, শু (śu) > শ + ু, হু (hu) > হ + ু, and therefore, for cluster + short u sign, ন্তু (ntu) > ন্ত + ু (ন + ্ + ত + উ), স্তু (stu) > স্ত + ু (স + ্ + ত + উ).
b. The variants of long u (দীর্ঘ ঊ-কার dīrgha u-kāra) have also been reduced: রূ (rū) > র + ূ, ভ্রূ (bhrū) > ভ্র + ূ (ভ bh + ্ + র r + ঊ ū), দ্রূ (drū) > দ্র + ূ (দ d + ্ + র r + ঊ ū), শ্রূ (śrū) > শ্র + ূ (শ ś + ্ + র r + ঊ ū).
c. The variants of ঋ-কার (ṛ-kāra, secondary symbol of ṛ) have been brought down to one: হৃ (hṛ) > হ + ৃ.
Regarding consonant + consonant + (consonant) … + (vowel) clusters, Paschimbanga Bangla Akademi proposed transparent or semi-transparent shapes for clusters to the extent admissible in the Bangla writing system. Some examples will clarify the proposal (a slash will mean that the traditional cluster-shape precedes it while the Bangla Akademi innovation follows) [153]:
ব্ধ bdh (ব্ b + ধ dh), দ্ধ ddh (দ্ d + ধ dh), ন্থ nth (ন্ n + থ th), ঞ্চ ñc (ঞ্ ñ + চ c), ঞ্ছ ñch (ঞ্ + ছ), ঞ্জ ñj (ঞ্ ñ + জ j), ক্ত kt (ক্ k + ত t), ক্র kr (ক্ k + র r), গ্ধ gdh (গ্ g + ধ dh), ঙ্ক ṅk (ঙ্ ṅ + ক k), ঙ্গ ṅg (ঙ্ ṅ + গ g), ষ্ণ ṣṇ (ষ্ ṣ + ণ ṇ), ন্ধ্র ndhr (ন্ n + ধ্ dh + র r), ণ্ড্র ṇḍr (ণ্ ṇ + ড্ ḍ + র r), ক্ত্র ktr (ক্ k + ত্ t + র r)
3.3.1 The Consonants
As per traditional classification, Bangla Consonants are categorized according to their phonetic properties, especially in terms of place and manner of articulation [107]. There are five ‘Varga’ (pronounced as ‘Barga’ in Bangla) or Groups (sets or classes) distinguished by Place of Articulation, and one Non-‘varga’ group [105]. Each Varga, which corresponds to Stops at a certain place of articulation, contains a series of five consonants classified as per their phonetic qualities (i.e. manner of articulation), beginning from Unvoiced and Unaspirated to Voiced and Aspirated (in the fourth column), finally ending with a Homorganic or Corresponding nasal [107]. Consider the following table:
‘Varga’ or Sets | Unvoiced -Asp | Unvoiced +Asp | Voiced -Asp | Voiced +Asp | Nasal
Velar | ক ‘K’ U+0995 | খ ‘KH’ U+0996 | গ ‘G’ U+0997 | ঘ ‘GH’ U+0998 | ঙ ‘NG’ U+0999
Palatal | চ ‘C’ U+099A | ছ ‘CH’ U+099B | জ ‘J’ U+099C | ঝ ‘JH’ U+099D | ঞ ‘NY’ U+099E
Retroflex | ট ‘TT’ U+099F | ঠ ‘TTH’ U+09A0 | ড ‘DD’ U+09A1 | ঢ ‘DDH’ U+09A2 | ণ ‘NN’ U+09A3
Dental | ত ‘T’ U+09A4 | থ ‘TH’ U+09A5 | দ ‘D’ U+09A6 | ধ ‘DH’ U+09A7 | ন ‘N’ U+09A8
Bilabial | প ‘P’ U+09AA | ফ ‘PH’ U+09AB | ব ‘B’ U+09AC | ভ ‘BH’ U+09AD | ম ‘M’ U+09AE
Table 5: Varga classification of Bangla consonants
(Falling into a Pattern of Five Sets of Unvoiced Unaspirated, Unvoiced Aspirated, Voiced Unaspirated, Voiced Aspirated and Nasals, called five ‘Varga’)
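Because the twenty-five varga consonants occupy nearly consecutive code points, the grid of Table 5 can be recovered from the code point value alone. The lookup below is a minimal sketch under that observation; VARGA_ROWS, COLUMNS and classify are hypothetical names introduced for illustration and do not appear in the LGR.

```python
# Each row of Table 5 covers five consecutive code points.
VARGA_ROWS = {
    "Velar": range(0x0995, 0x099A),
    "Palatal": range(0x099A, 0x099F),
    "Retroflex": range(0x099F, 0x09A4),
    "Dental": range(0x09A4, 0x09A9),
    # U+09A9 is unassigned in the Bengali block, so this row starts at U+09AA.
    "Bilabial": range(0x09AA, 0x09AF),
}

# Column order of Table 5, left to right.
COLUMNS = (
    "unvoiced unaspirated",
    "unvoiced aspirated",
    "voiced unaspirated",
    "voiced aspirated",
    "nasal",
)


def classify(ch):
    """Return (place, manner) for a varga consonant, or None for anything else."""
    cp = ord(ch)
    for place, code_points in VARGA_ROWS.items():
        if cp in code_points:
            return place, COLUMNS[cp - code_points.start]
    return None
```

Non-varga consonants such as র (U+09B0) fall outside every row and yield None.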
Non-Varga | য ‘Y’ U+09AF | য় ‘YY’ U+09DF | র ‘R’ U+09B0 | ড় ‘RR’ U+09DC | ঢ় ‘RH’ U+09DD
Non-Varga | ল ‘L’ U+09B2 | শ ‘SH’ U+09B6 | ষ ‘SS’ U+09B7 | স ‘S’ U+09B8 | হ ‘H’ U+09B9
Table 6: Non-Varga consonants (Not falling into any of the five categories)
3.3.2 The Implicit Vowel Killer: Hasanta (called ‘Halant’ or ‘Halanta’ in other Brahmī-based scripts)
As stated earlier, all consonants are pronounced in isolation with an implicit vowel (central back -ɔ in Bangla, as the neutral vowel) assumed to be associated with them [121]. The terms ‘Hasanta’ (= ‘Halant’ or ‘Halanta’ in other Brahmī-based scripts) and ‘Virāma’⁷ (= ‘Dāṛi’ in Bangla), as preferred in UNICODE (cf. Unicode 3.0 and above), have been used in this report to denote the character that marks the absence of this inherent vowel. It may be noted that the term virama has been adopted in UNICODE in a sense that is different from the traditional definition of grammar, and hence it requires some explanation here. Considering the importance of the document, this note should be a part of this LGR document so that anybody referring to it is able to know the proper grammatical explanation of the term. Because a special sign is needed whenever this implicit vowel is stripped off, the symbol is known as the Hasanta (= Halant) (U+09CD). By placing the Hasanta under the first consonant of a combination or cluster one could, in common parlance, “kill” its vowel and create conjuncts. In this manner conjunct characters can generally be written by joining two to four consonant combinations. In rare cases this process can join up to five consonants. However, the notion of a maximum number of consonants joining to form one aksara⁸ is to be bounded empirically. This is an observation based on the CIIL-Emille Corpora of Bangla words [132 & 133] as seen in print these days. Given the mixture of scripts and languages happening on the web, the possibility that one may want a generic Top Level Domain [gTLD] which has more than the observed maximum cannot be ruled out. This can be the case when a foreign language word which admits a large number of consonants is transliterated into Bangla. Hence in the Bangla LGR work this limit will not be enforced.
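The conjunct mechanism described above can be illustrated at the code point level. The sketch below is hedged: it simply joins consonants with U+09CD and counts the consonants in the longest such chain, and it treats every character in U+0995..U+09B9 as a consonant, which is a simplification (nukta sequences and Khaṇḍa ta are ignored). The helper names make_conjunct, is_consonant and longest_cluster are hypothetical.

```python
HASANTA = "\u09CD"  # BENGALI SIGN VIRAMA


def make_conjunct(*consonants: str) -> str:
    """Join consonants with Hasanta, e.g. ka + hasanta + ssa."""
    return HASANTA.join(consonants)


def is_consonant(ch: str) -> bool:
    # Simplification: all assigned code points in this range are consonant letters.
    return "\u0995" <= ch <= "\u09B9"


def longest_cluster(text: str) -> int:
    """Count the consonants in the longest C (Hasanta C)* chain in text."""
    best = run = 0
    prev_hasanta = False
    for ch in text:
        if ch == HASANTA:
            # A Hasanta only continues a chain if a consonant precedes it.
            prev_hasanta = run > 0
            continue
        if is_consonant(ch):
            run = run + 1 if prev_hasanta else 1
            best = max(best, run)
        else:
            run = 0
        prev_hasanta = False
    return best
```

Since the LGR does not enforce an upper bound, a count of this kind would serve only for inspecting test labels, not for rejecting them.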
3.3.3 Vowels
Separate symbols exist for all ‘Swara’ or Vowels in Bangla which are pronounced independently, either at the beginning of the word or after another vowel or consonant sound. To indicate a Vowel sound other than the implicit one, a Vowel sign, called ‘kār’ in Bangla or Mātrā in Nagarī⁹, is attached to the consonant. Since the consonant has this built-in neutral vowel at the end, there are equivalent kāras (Mātrās) for all vowels except the অ (pronounced -ɔ). The correlation is shown as follows:
⁷ Virāma as used here is also a misnomer according to the Indian grammatical traditions. Nowhere is the mere absence of a vowel marked as virama. Hasanta just marks the absence of a vowel, nothing else. (Abhyankar, Kashinath Vasudev & J. M. Shukla, 1961. A Dictionary of Sanskrit Grammar. Baroda: Oriental Institute.)
⁸ This term needs to be disambiguated. Aksara also means ‘syllable’ in Indian grammatical traditions.
⁹ The term ‘Matra’ in Bangla, however, stands for an altogether different concept, viz. the top bar placed over a letter, typically available in Hindi and Bangla but missing in Gujarati.
Vowel | Corresponding vowel sign (kāras (Mātrās))
অ ‘A’ U+0985 |
আ ‘AA’ U+0986 | া U+09BE
ই ‘I’ U+0987 | ি U+09BF
ঈ ‘II’ U+0988 | ী U+09C0
উ ‘U’ U+0989 | ু U+09C1
ঊ ‘UU’ U+098A | ূ U+09C2
ঋ Vocalic ‘R’ U+098B | ৃ U+09C3
ৠ Vocalic ‘RR’ U+09E0 | ৄ U+09C4
ঌ Vocalic ‘L’ U+098C | ৢ U+09E2
ৡ Vocalic ‘LL’ U+09E1 | ৣ U+09E3
এ ‘E’ U+098F | ে U+09C7
ঐ ‘AI’ U+0990 | ৈ U+09C8
ও ‘O’ U+0993 | ো U+09CB
ঔ ‘AU’ U+0994 | ৌ U+09CC
- | ৗ U+09D7
Could appear on top of অ ‘A’ U+0985 or any other vowel | ঁ U+0981 Candrabindu
Could appear after অ ‘A’ U+0985 or any other vowel | ং U+0982 Anusvara
Could appear after অ ‘A’ U+0985 or any other vowel | ঃ U+0983 Visarga
After any consonant | ্ U+09CD (Hasanta)
- | ঽ U+09BD Avagraha
Table 7: Bangla Vowels with corresponding kārs
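The vowel-to-kār correspondence of Table 7 is in effect a lookup table. The following sketch restates it as a Python mapping and shows how a dependent form would be attached to a consonant; VOWEL_TO_SIGN and attach_vowel are hypothetical names, and the function covers only the independent vowels that the table pairs with a sign.

```python
# Independent vowel -> dependent vowel sign (kār), as listed in Table 7.
VOWEL_TO_SIGN = {
    "\u0986": "\u09BE",  # AA
    "\u0987": "\u09BF",  # I
    "\u0988": "\u09C0",  # II
    "\u0989": "\u09C1",  # U
    "\u098A": "\u09C2",  # UU
    "\u098B": "\u09C3",  # VOCALIC R
    "\u09E0": "\u09C4",  # VOCALIC RR
    "\u098C": "\u09E2",  # VOCALIC L
    "\u09E1": "\u09E3",  # VOCALIC LL
    "\u098F": "\u09C7",  # E
    "\u0990": "\u09C8",  # AI
    "\u0993": "\u09CB",  # O
    "\u0994": "\u09CC",  # AU
}

INHERENT_VOWEL = "\u0985"  # BENGALI LETTER A has no sign of its own


def attach_vowel(consonant: str, vowel: str) -> str:
    """Return the consonant followed by the kār of the given independent vowel."""
    if vowel == INHERENT_VOWEL:
        return consonant  # the inherent vowel is left unmarked
    return consonant + VOWEL_TO_SIGN[vowel]
```

In logical (stored) order the sign always follows the consonant, even for pre-posed signs such as ি, which are only displayed to the left.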
3.3.4 The Anusvāra/onuʃʃār (ং - U+0982)
The Anusvāra or onuʃʃār in Bangla at times represents a homorganic nasal, but not always. It replaces a conjunct group of a ‘Nasal Consonant + Hasanta + Consonant’ where the second consonant belongs to the Velar varga or set, as in লংকা. But it often appears also for such combinations involving non-velars appearing as the last member of the combination, as in ল্যাংটা “naked” or ল্যাংচা “a kind of sweet; to limp”. Before a non-varga consonant, the Anusvara represents a nasal sound that may have an alternative conjoined writing symbol representing the corresponding nasal consonant of the particular set. Although Modern Hindi, Marathi and Konkani prefer the anusvāra to the corresponding half-nasal, in Bangla it is clearly demarcated as to where one must use the Anusvāra and where it has to be a conjunct cluster with a nasal as the first or the second component.
3.3.5 Nasalization: Candrabindu (ঁ - U+0981)
Candrabindu denotes nasalization of the preceding vowel, as in চাঁদ cãd ‘moon’ (U+099A U+09BE U+0981 U+09A6). This sign, with a dot inside the half-moon mark, is used as a nasalization marker in many Brahmī-based scripts [143].
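The code point sequence quoted for ‘moon’ can be inspected directly. The short sketch below is illustrative only; it relies on the standard unicodedata module, and the helper names code_points and names are hypothetical.

```python
import unicodedata

CANDRABINDU = "\u0981"

# The sequence given in the text: CA, vowel sign AA, candrabindu, DA.
chand = "\u099A\u09BE\u0981\u09A6"


def code_points(text: str) -> list:
    """List the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


def names(text: str) -> list:
    """List the Unicode character names of text."""
    return [unicodedata.name(ch) for ch in text]
```

Calling names(chand) shows that the Candrabindu is stored after the vowel sign it nasalizes, in line with the description above.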
3.3.6 Nukta (় - U+09BC)
The nukta sign does not exist in Bangla orthography. It is predominantly used in many Brahmī-derived scripts such as Devanagarī (for Hindi, Bodo, Maithili, Santali, Kashmiri and Sindhi). The term and the concept of nukta are borrowed in Bangla. The IDNA Protocol (RFC 5891) states that IDNs must be in Unicode Normalization Form C (NFC). RFC 7940 applies this requirement to LGRs. The definition of NFC in the Unicode Standard contains a number of composition exclusions. As a result, the Bangla letters য় YYA, ড় RRA and ঢ় RHA have to be represented in this LGR by using the sequences (YA + Nukta: U+09AF + U+09BC), (DDA + Nukta: U+09A1 + U+09BC) and (DDHA + Nukta: U+09A2 + U+09BC) instead of the single code points YYA (U+09DF), RRA (U+09DC) and RHA (U+09DD), although the use of ‘Nukta’ is otherwise completely unnatural in Bangla.
It is noted that in the current Unicode Standard chart these characters are listed as additional consonants. As per the LGR Procedure, however, these decisions depend on the IDNA Protocol through a set of procedures developed by the IETF. Even though the Unicode Standard also prescribes methods to produce these three characters as atomic characters (for example 09DC for ড় [r], 09DD for ঢ় [rh] and 09DF for য় [y], each as a single keystroke), the IDNA protocol requires that we treat them as conjunct characters and then allocate codes for these in the Unicode Bengali Block.
It may be noted that there could be sporadic attempts or cases of writing Muslim names, Urdu poetic words and Perso-Arabic loan words with nukta under ক (k), খ (kh), গ (g), জ (j) and ফ (ph), only for the sake of correct pronunciation and for maintaining the sanctity of the loan word. These were also like using the Bangla writing system to work like the IPA script. It is, however, not in use in Bangla writing in printing.
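The effect of the composition exclusions can be observed with any NFC implementation. The sketch below uses Python's standard unicodedata.normalize; the table PRECOMPOSED_TO_SEQUENCE and the helper to_lgr_form are hypothetical names used only for this illustration.

```python
import unicodedata

# Single code points and the nukta sequences that NFC yields for them.
PRECOMPOSED_TO_SEQUENCE = {
    "\u09DC": "\u09A1\u09BC",  # RRA -> DDA + NUKTA
    "\u09DD": "\u09A2\u09BC",  # RHA -> DDHA + NUKTA
    "\u09DF": "\u09AF\u09BC",  # YYA -> YA + NUKTA
}


def to_lgr_form(label: str) -> str:
    """Normalize a label to NFC, the form required by IDNA and RFC 7940."""
    return unicodedata.normalize("NFC", label)
```

Because the three letters are composition exclusions, NFC decomposes the single code points and never recomposes them, so a label typed with U+09DF and one typed with U+09AF U+09BC end up identical after normalization.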
3.3.7 Visarga / biʃɔrgo (ঃ - U+0983) and Avagraha (ঽ - U+09BD)
The Visarga / biʃɔrgo (U+0983) is frequently used in Bangla loan words borrowed from Sanskrit and represents a sound very close to h. One could quote as an example দুঃখ duhkho “sorrow”, “unhappiness” (U+09A6 U+09C1 U+0983 U+0996).
The Avagraha ঽ (U+09BD) is mainly used in Sanskrit, Pali, Prakrit or Maithili texts written in Bangla. It is gradually being replaced by an upper comma (e.g. নরোঽপরাণি re-written as নরো’পরাণি). It is rarely used now, even in other languages using the Bangla script. In the case of this LGR, the Avagraha is not part of the repertoire: it has been decided not to retain Avagraha (ঽ) (U+09BD) because it is blocked in TLDs as per the Maximal Starting Repertoire (MSR).
Please see Appendix II in section 11 for a complete list of Bangla consonants and their allographs.
3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)
This note is pertinent to the use of Zero Width Joiner (ZWJ) and Zero Width Non-Joiner (ZWNJ) as used in Bangla. It needs to be noted that Nepali, Konkani and Hindi use these two signs in a different manner.
ZWJ (U+200D) and ZWNJ (U+200C) are code points that have been provided by the Unicode Standard to instruct the rendering of a string where the script has the option between joining and non-joining characters. Without the use of these control codes, the string may be rendered in an alternate form from what is intended.

Use of ZWJ
• In so far as Bangla is concerned, ZWJ is used for the proper rendering of characters such as khaṇḍa-ta ৎ, as in সত্যজিৎ (satyajit) “Satyajit” and সৎ (sat) “honest”. This is typed as follows: ta + Hasanta + ZWJ (U+200D).
• However, ZWJ is more important where the same combination of consonantal characters is represented differently depending upon the context. E.g. র + ্ + য has two representations in Bangla: as র্য (repha over ya) and as র‍্য (ra with ya-phalā). To get the form র্য one has to type র + ্ + য, but for র‍্য the sequence would be র + ZWJ + ্ + য [154]. In other words, ZWJ is used in the rendering of words demanding ya-phalā after ra, which is otherwise not possible to type (render) because the same order ra + hasanta + antastha ja, in the medial and/or final position, is used to type repha on the consonant antastha ja, as in কার্য (kaarjo). In order to get a ya-phalā after the consonant ra, it is therefore obligatory to use ZWJ after ra, as in র‍্যাপার (wrapper), র‍্যাশ (rash), র‍্যালি (rally), etc. The typing sequence is given below:
ra (র) + ZWJ + hasanta (্) + antastha ja (য) = র‍্য

Use of ZWNJ
• ZWNJ is used in Bangla to represent the explicit Hasanta or Halant. In order to avoid conjunct formation in cases where there is an explicit hasanta before the succeeding consonant, the ZWNJ is used:
Consonant + hasanta + ZWNJ + consonant = explicit hasanta
Example: প্রাক্‌কথন (prakkathana / prakkɔtʰon)
The use of ZWJ/ZWNJ has been ruled out from the root zone by the [Procedure]. Used in Bangla to create alternate renderings, the insertion of these two signs can affect searching as well as NLP.
The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Hasanta) where default conjunct formation is to be explicitly restricted and the Hasanta joining the two consonants participating in the conjunct formation needs to be explicitly shown.
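Because ZWJ and ZWNJ are invisible and are ruled out of the root zone, a label check would need to detect them by code point. The sketch below is illustrative only: the helper names are hypothetical, and the two strings simply encode the typing sequences described above for ra + hasanta + ya.

```python
ZWNJ = "\u200C"      # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"       # ZERO WIDTH JOINER
HASANTA = "\u09CD"   # BENGALI SIGN VIRAMA
RA = "\u09B0"        # BENGALI LETTER RA
YA = "\u09AF"        # BENGALI LETTER YA

# ra + hasanta + ya: renders as repha over ya.
REPHA_ON_YA = RA + HASANTA + YA
# ra + ZWJ + hasanta + ya: renders as ra with ya-phala.
RA_WITH_YAPHALA = RA + ZWJ + HASANTA + YA

def has_join_control(label: str) -> bool:
    """True if the label contains ZWJ or ZWNJ."""
    return ZWJ in label or ZWNJ in label

def strip_join_controls(label: str) -> str:
    """Remove ZWJ and ZWNJ, leaving the visible code points only."""
    return label.replace(ZWJ, "").replace(ZWNJ, "")
```

Stripping the joiner from the ya-phalā form yields exactly the repha form, which illustrates why the two renderings cannot both be expressed once the join controls are disallowed.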
3.3.9 Use of Ya-phalaa
Ya-phalaa sequences are the two instances in Bangla where Hasanta is preceded by a full vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E):
• অ্যা 0985 09CD 09AF 09BE: BENGALI LETTER A + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
• এ্যা 098F 09CD 09AF 09BE: BENGALI LETTER E + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
To render Ya-phalā after অ and এ, it is necessary to type U+09CD Hasanta plus U+09AF ya, preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English: acid অ্যাসিড, association অ্যাসোসিয়েশন, ‘bat’ ব্যাট, ‘fat’ ফ্যাট, ‘mat’ ম্যাট, ‘cap’ ক্যাপ, etc.
The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a ‘vowel killer’, although it actually indicates the absence of a vowel after the marked consonant. Only consonants can be marked with the Hasanta. But as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (Cf. Unicode 10.0, p. 473 [100]).
3.3.10 Formation of Ra-phalaa and Ref Sequences
This case refers to the formation of repha and ra-phalā as follows:
Ra-Hasanta = (C2 H), where
C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL)
H is 09CD (্ - BENGALI SIGN VIRAMA)
Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka “the sun”); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র chakra “cycle”). The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or Assamese ra ৰ (U+09F0), followed/preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.
The LGR makes a note of this point of concern with respect to the two RAs in disguise, as it would be completely impossible to distinguish between them with the naked eye in a label so generated, which may consequently lead to concerns related to spoofing and other kinds of cyber irregularities. The motive for classing these two code points as (blocking) variants is that fully rendered labels may mask the distinction between Bangla ra র (U+09B0) and Assamese ra ৰ (U+09F0). That provides the justification for Variant Set 4, though only in the context of a following Hasanta. The difference between the RAs is distinguishable only if one looks into their Unicode values. Therefore labels such as অর্ক arka, শীর্ষ śīrṣa ‘top, apex’, অভ্র abhra ‘cloud, the sky’ and শ্রম śrama ‘physical labour’ could be extremely dangerous, as the web user may never verify the digital content (the labels) against its Unicode code points. This point is made explicitly with reference to Table 9 (of sequences, p. 36) and Table 16 (of WLE symbols, p. 47), which are to follow.
Moreover, it is noteworthy that the REPHA can also occur with KHANDA TA. The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
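Since the two RAs can be told apart only by their Unicode values, a small inspection helper makes the difference visible. This is a sketch using the standard `unicodedata` module; `describe` and the two sample strings are illustrative names, not part of the proposal.

```python
import unicodedata

def describe(label: str) -> list[str]:
    """List each code point of a label with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in label]

# "arka" spelled with Bangla RA and with Assamese RA; both render with
# the same repha over ka.
arka_bangla = "\u0985\u09B0\u09CD\u0995"
arka_assamese = "\u0985\u09F0\u09CD\u0995"

for line in describe(arka_bangla):
    print(line)
for line in describe(arka_assamese):
    print(line)
```

The two listings differ only in their second entry (BENGALI LETTER RA versus BENGALI LETTER RA WITH MIDDLE DIAGONAL), which is the distinction a rendered label hides.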
4 Overall Development Process and Methodology
The Neo-Brāhmī Generation Panel (NBGP) has been formed by members having experience in Linguistics (especially in NLP / Computational Linguistics), Literature, Language History and Epigraphy. Under the Neo-Brāhmī Generation Panel, Bangla and eight other scripts belonging to separate Unicode blocks are being taken up to assign a separate LGR for each. However, an attempt is made to ensure that the fundamental philosophy behind building those LGRs is consistent across all Brāhmī-derived scripts. The present LGR will cater to multiple languages belonging to EGIDS scale 1 to 4 (see Table 4) that use the Bangla script. The following guiding principles are used in making decisions about Bangla LGR code points.
4.1 Guiding Principles
The NBGP adopts the following broad principles for selection of code points in the code-point repertoire, across the board, for all the Neo-Brāhmī scripts within its ambit.

4.1.1 Inclusion Principles
4.1.1.1 Modern Usage
Every character proposed should be in the everyday usage of a particular linguistic community. Characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code-point repertoire.
4.1.1.2 Unambiguous Use
Every character proposed should have an unambiguous understanding among linguists about its usage in the language.
4.2 Exclusion Principles
The main exclusion principle is that of External Limits on Scope. These consist of protocols or standards which are prerequisites to the Label Generation Rule-sets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.

4.2.1 External Limits of Scope
The code point repertoire for the root zone being a very special case at the top of protocol hierarchies, the canvas of available characters for selection as a part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:

i. The Unicode Chart
Out of all the characters that are needed by the script in question, if a particular character is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, and especially so in the Bangla-Asamiyā-Manipuri Writing System, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.

ii. IDNA Protocol
Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded, as far as possible, all the possible characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters of the Unicode repertoire from being part of domain names.

iii. Maximal Starting Repertoire (MSR)
The Root-zone LGR being the repertoire of characters which are going to be used for creation of the Root-zone TLDs, which in turn constitute an even more specialized case of domain names, the Root LGR procedure introduces additional exclusions on the IDNA's allowed set of characters.

Example: Bangla Sign Avagraha ঽ (U+09BD), even if allowed by the IDNA protocol, is not permitted in the Root Zone Repertoire as per the MSR.

To sum up, the restrictions start off with admitting only such characters as are part of the code block of the given script/language. The IDNA Protocol further narrows this down, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.

4.2.1.1 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brāhmī-based scripts will not be included.

4.2.1.2 No Symbols and Abbreviations
Abbreviations, weights and measures, and other such iconic characters like BANGLA ISSHAR (U+09FA), BANGLA CURRENCY DENOMINATOR SIXTEEN ৹ (U+09F9), etc. will also not be included.

4.2.1.3 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, such as Sanskritic VOCALIC RR ৠ (U+09E0) and VOCALIC L ঌ (U+098C), as well as VOCALIC LL ৡ (U+09E1) and the allographic -kāra forms of the latter two symbols: VOWEL SIGN VOCALIC L (U+09E2) and VOWEL SIGN VOCALIC LL (U+09E3). All such characters are excluded, which complies with the Conservatism principle as laid down in the Root Zone LGR procedure. However, in Bangla the -kāra corresponding to VOCALIC RR ৠ (U+09E0), which is VOWEL SIGN VOCALIC RR (U+09C4), is still in active use in certain limited borrowed or Sanskritic words and is therefore retained.

4.2.1.4 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit will not be included. This is also in consonance with the Letter principle as laid down in the Root Zone LGR procedure.

4.2.1.5 ABNF
The Augmented Backus-Naur Formalism (ABNF) is described in Section 5.4.1 and the Appendix (Section 10.1).
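The three successive filters of Section 4.2.1 (Unicode block, IDNA, MSR) can be pictured as a chain of membership tests. The sketch below is only an illustration under stated assumptions: the two exclusion sets are tiny hand-picked samples rather than the real IDNA2008 or MSR data, and the assignment of U+09F9 and U+09FA to the IDNA stage is an assumption made for the example.

```python
# Stage i: the Bengali block of Unicode (U+0980..U+09FF).
BENGALI_BLOCK = set(range(0x0980, 0x0A00))

# Stage ii: sample of code points assumed NOT to be PVALID under IDNA2008
# (illustrative only; the real list is derived from the IDNA tables).
SAMPLE_NOT_PVALID = {0x09F9, 0x09FA}

# Stage iii: sample of PVALID code points excluded by the MSR
# (avagraha, per the example in the text).
SAMPLE_MSR_EXCLUDED = {0x09BD}

def filter_stage(cp: int) -> str:
    """Report the first filter that rejects a code point, if any."""
    if cp not in BENGALI_BLOCK:
        return "outside the Bengali block"
    if cp in SAMPLE_NOT_PVALID:
        return "rejected by IDNA"
    if cp in SAMPLE_MSR_EXCLUDED:
        return "rejected by MSR"
    return "candidate for the repertoire"
```

A code point that survives all three stages is still only a candidate: the inclusion and exclusion principles of this section are applied afterwards.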
5 Repertoire
The Bangla Writing System is represented in UNICODE using the Bengali (Bangla) script name as enumerated in ISO 15924, corresponding to languages such as Asamiyā (Assamese), Bangla (Bengali) and Manipuri. The BENGALI block used for Bangla-Asamiyā-Manipuri in UNICODE has 93 entries. This section details the code-point repertoire that the Neo-Brāhmī Generation Panel [NBGP] proposes to be included in the Bangla LGR.
It may be mentioned here that the Government of Assam submitted a proposal to the Bureau of Indian Standards (BIS) on 26th February 2016 for dis-unification of the Bangla and Asamiyā scripts. The BIS, in its 8th Meeting of the Indian Language Technologies and Products Sectional Committee, LITD 20, held on 23rd Aug 2017, decided to refer the proposal for recognition of Assamese script in ISO/IEC 10646 to ISO. Until the UNICODE Consortium takes any further action, it will be assumed that the Code Point Repertoire under Table 8 will be valid for all the three languages as above.
For each of the code points, language references have been given in the last column, titled Reference, under Table 8, titled the “Code Point Repertoire”. For entire coverage of Bangla code points, references of Bangla, Asamiyā (Assamese), Manipuri (Meitei) and Bishnupriya are given. Kokborok written in Bangla script is not known to have introduced many new complications, except for one particular character. Though only a few representative languages under EGIDS Scale 1-4 have been chosen for referencing, they together cover all the code points required for all the languages that NBGP has considered, as given under Bangla Unicode Points (as given in UNICODE 6.3).
However, before the details are presented, it is ideal to look at the Bangla Code Point Chart from the Maximal Starting Repertoire [MSR] Version 3. It may be noted that the shapes of the reference glyphs given below in the code charts are based on one of the many fonts designed and are not prescriptive, because there could be some variations in actual fonts, both UNICODE-compatible and True-Type ones. Consider the following code point table:
Figure 1: Bangla Code Page from [MSR] for Bangla-Asamiyā-Manipuri

Colour convention:
All characters that are included in the [MSR] - Yellow background
PVALID in IDNA2008 but excluded from the [MSR] - Pinkish background
Not PVALID in IDNA2008, or ineligible for the root zone (digits, hyphen) - White background

Given the Bangla Unicode Block as in Figure 1, among the code points that are included in the MSR, the following symbols will need a separate treatment:
ৎ U+09CE Bangla Letter Khanda-Ta
ৰ U+09F0 Asamiyā-Bangla Letter Ra With Middle Diagonal
ৱ U+09F1 Asamiyā-Bangla Letter Ra With Lower Diagonal
5.1 Code Point Repertoire Inclusion

No | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
1 | U+0981 | ঁ | BENGALI SIGN CANDRABINDU | Candra-bindu | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
2 | U+0982 | ং | BENGALI SIGN ANUSVARA | Onushshar (Anusvara) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
3 | U+0983 | ঃ | BENGALI SIGN VISARGA | Biśarga (Visarga) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
4 | U+0985 | অ | BENGALI LETTER A | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
5 | U+0986 | আ | BENGALI LETTER AA | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
6 | U+0987 | ই | BENGALI LETTER I | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
7 | U+0988 | ঈ | BENGALI LETTER II | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
8 | U+0989 | উ | BENGALI LETTER U | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
9 | U+098A | ঊ | BENGALI LETTER UU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
10 | U+098B | ঋ | BENGALI LETTER VOCALIC R | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
11 | U+098F | এ | BENGALI LETTER E | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
12 | U+0990 | ঐ | BENGALI LETTER AI | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
13 | U+0993 | ও | BENGALI LETTER O | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
14 | U+0994 | ঔ | BENGALI LETTER AU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
15 | U+0995 | ক | BENGALI LETTER KA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
16 | U+0996 | খ | BENGALI LETTER KHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
17 | U+0997 | গ | BENGALI LETTER GA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
18 | U+0998 | ঘ | BENGALI LETTER GHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
19 | U+0999 | ঙ | BENGALI LETTER NGA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
20 | U+099A | চ | BENGALI LETTER CA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
21 | U+099B | ছ | BENGALI LETTER CHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
22 | U+099C | জ | BENGALI LETTER JA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
23 | U+099D | ঝ | BENGALI LETTER JHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
24 | U+099E | ঞ | BENGALI LETTER NYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
25 | U+099F | ট | BENGALI LETTER TTA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
26 | U+09A0 | ঠ | BENGALI LETTER TTHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
27 | U+09A1 | ড | BENGALI LETTER DDA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
28 | 09A1 09BC (U+09DC) | ড় | Normalized form of BENGALI LETTER RRA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. 09DC is the preferred code point; however, it is not available for LGR as per the standards governing this LGR development
29 | U+09A2 | ঢ | BENGALI LETTER DDHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
30 | 09A2 09BC (U+09DD) | ঢ় | Normalized form of BENGALI LETTER RHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. 09DD is the preferred code point; however, it is not available for LGR as per the standards governing this LGR development
31 | U+09A3 | ণ | BENGALI LETTER NNA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
32 | U+09A4 | ত | BENGALI LETTER TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
33 | U+09A5 | থ | BENGALI LETTER THA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
34 | U+09A6 | দ | BENGALI LETTER DA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
35 | U+09A7 | ধ | BENGALI LETTER DHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
36 | U+09A8 | ন | BENGALI LETTER NA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
37 | U+09AA | প | BENGALI LETTER PA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
38 | U+09AB | ফ | BENGALI LETTER PHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
39 | U+09AC | ব | BENGALI LETTER BA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
43 | 09AF 09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. 09DF is the preferred code point; however, it is not available for LGR as per the standards governing this LGR development
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112] [125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
50 | U+09BE | া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
51 | U+09BF | ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
52 | U+09C0 | ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
53 | U+09C1 | ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112] [122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (= Halant) / Virama (= Da ri) | 1 Bangla, 2 Assamese, 2 Manipuri | [112] [122] [125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122] [125]

Table 8: Bangla Code-Point Repertoire
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes some specific sequences which enable conditional inclusion of the Bangla LETTER A and E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, again followed by Bangla VOWEL SIGN AA, in the repertoire, for enabling inclusion of the æ sound as in English ‘bat’, ‘cat’, etc.

Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not an exhaustive list) | Reference
S1 | 0985 09CD 09AF 09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112] [122]
S2 | 098F 09CD 09AF 09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]

Table 9: Sequences
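The two sequences of Table 9 are fixed four-code-point strings, so locating them in a label is a plain substring search. A minimal sketch follows; `SEQUENCES` and `find_sequences` are illustrative names.

```python
# The two ya-phala sequences from Table 9, written as code points.
S1 = "\u0985\u09CD\u09AF\u09BE"  # A + VIRAMA + YA + VOWEL SIGN AA
S2 = "\u098F\u09CD\u09AF\u09BE"  # E + VIRAMA + YA + VOWEL SIGN AA
SEQUENCES = {"S1": S1, "S2": S2}

def find_sequences(label: str) -> list[tuple[int, str]]:
    """Return (position, name) for every S1/S2 occurrence in the label."""
    hits = []
    for name, seq in SEQUENCES.items():
        start = label.find(seq)
        while start != -1:
            hits.append((start, name))
            start = label.find(seq, start + 1)
    return sorted(hits)

acid = "\u0985\u09CD\u09AF\u09BE\u09B8\u09BF\u09A1"  # the word for 'acid'
bat = "\u09AC\u09CD\u09AF\u09BE\u099F"               # the word for 'bat'
```

Only the first word contains a Table 9 sequence; in the second, the ya-phalā follows a consonant and is an ordinary conjunct.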
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find a place in Unicode but have not been included in the repertoire in the LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.

Sr. No. | Code Points | Glyph | Character Names | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use

Table 10: Excluded Code Points
5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire since it will never be used alone. It will be used only in sequences, in the normalized forms of the three special characters ড় ঢ় য়.

Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড় ঢ় য় respectively

Table 10b: Excluded Code Points
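The restriction on the nukta stated above (valid only directly after U+09A1, U+09A2 or U+09AF) can be expressed as a short check. This is a sketch, not the LGR's own rule syntax; `nukta_use_is_valid` is a hypothetical helper name.

```python
NUKTA = "\u09BC"
# The only base letters that may carry the nukta in this LGR.
NUKTA_BASES = {"\u09A1", "\u09A2", "\u09AF"}  # DDA, DDHA, YA

def nukta_use_is_valid(label: str) -> bool:
    """True if every nukta directly follows one of the three base letters."""
    for i, ch in enumerate(label):
        if ch == NUKTA and (i == 0 or label[i - 1] not in NUKTA_BASES):
            return False
    return True
```

A nukta under ক (as in some loan-word spellings mentioned in 3.3.6) or a nukta standing alone is rejected by this check.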
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20.11.2009 and 18.07.2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) began with the following variables:

C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta

The Augmented Backus-Naur Formalism (ABNF) will use the following operators:

Sr. Number | Operator | Function
1 | “|” | Alternative
2 | “[ ]” | Optional
3 | “*” | Variable Repetition
4 | “( )” | Sequence Group

Table 11: The ABNF Formalism
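To apply the ABNF variables to an actual label, each code point first has to be mapped to its symbol. The sketch below derives that mapping from the categories in Table 8; treating KHANDA TA as Z rather than C follows the variable list above, the nukta sequences are left out for brevity, and `CATEGORY` and `to_symbols` are illustrative names.

```python
# Code point -> ABNF variable, following the categories of Table 8.
CATEGORY = {0x0982: "B", 0x0981: "D", 0x0983: "X", 0x09CD: "H", 0x09CE: "Z"}
CATEGORY.update({cp: "V" for cp in
                 [*range(0x0985, 0x098C), 0x098F, 0x0990, 0x0993, 0x0994]})
CATEGORY.update({cp: "C" for cp in
                 [*range(0x0995, 0x09A9), *range(0x09AA, 0x09B1), 0x09B2,
                  *range(0x09B6, 0x09BA), 0x09F0, 0x09F1]})
CATEGORY.update({cp: "M" for cp in
                 [*range(0x09BE, 0x09C5), 0x09C7, 0x09C8, 0x09CB, 0x09CC]})

def to_symbols(label: str) -> str:
    """Translate a label into its string of ABNF variables ('?' if unknown)."""
    return "".join(CATEGORY.get(ord(ch), "?") for ch in label)
```

For example, the word for ‘moon’ given in 3.3.5 (U+099A U+09BE U+0981 U+09A6) maps to the symbol string CMDC.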
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given to facilitate understanding.

5.4.2 The Vowel Sequence
To facilitate understanding for users of other Brāhmī-derived scripts, equivalents in Devanāgarī are provided wherever necessary.
A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra / onuʃʃār (B), a Candrabindu (D) or a Visarga / biʃɔrgo (X). The number of D, B or X signs which can follow a V in Bangla is not restricted to one, but the same sign cannot be repeated: going by the rules illustrated in this document, formations such as VDD, VBB and VXX are invalid orthographic units. It is, however, valid and possible to have a Candrabindu followed by an Anusvāra on the one hand, and a Candrabindu followed by a Visarga on the other, as in হ্যাঁংচা ‘hænchā’ and হ্যাঁঃ ‘hæn’ respectively. The possibility of a Visarga or Anusvāra (onuʃʃār) following a Candrabindu thus exists in Bangla.
A vowel can optionally be followed by a combination of Hasanta / Virama [H] and Consonant [C] to form a Ya-phala. Ya-phala is a presentation form of U+09AF, the Bangla letter য or ‘ya’. Represented by the sequence <U+09CD BENGALI SIGN VIRAMA (Bangla sign Hasanta or Virama), U+09AF য BENGALI LETTER YA>, Ya-phala has the special form ্য. When combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. ‘aa’ (ā)), it is used for transcribing [æ], as in the “a” in the English word “bat”, written in Bangla as ব্যাট.
A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example: V অ अ

5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].

Examples:
VB অং अं
VD অঁ अँ
VX অঃ अः
VDB অঁং अँं
VDX অঁঃ अँः
VHCM অ্যা এ্যা

5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].

Examples:
VHCMB অ্যাং এ্যাং
VHCMD অ্যাঁ এ্যাঁ
VHCMX অ্যাঃ এ্যাঃ
VHCMDB অ্যাঁং এ্যাঁং
VHCMDX অ্যাঁঃ এ্যাঁঃ
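The vowel-sequence combinations listed in 5.4.2.1 to 5.4.2.3 can be condensed into a single pattern over the ABNF symbols. The regular expression below is one possible reading of those lists, offered as a sketch; it works on symbol strings such as "VDB", and it does not enforce that HCM must be one of the two sequences of Table 9.

```python
import re

# V, optionally followed by HCM, optionally followed by one of
# B, D, X, DB or DX (as listed in 5.4.2.2 and 5.4.2.3).
VOWEL_SEQUENCE = re.compile(r"V(?:HCM)?(?:DB|DX|B|D|X)?")

def is_vowel_sequence(symbols: str) -> bool:
    """True if the symbol string is one of the listed vowel sequences."""
    return VOWEL_SEQUENCE.fullmatch(symbols) is not None
```

Under this reading, VDD, VBB and VXX are rejected, in line with the statement that such formations are invalid orthographic units.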
5.4.3 The Consonant Sequence

5.4.3.1 A Single Consonant (C)
Example: C ক क

5.4.3.2 A Consonant with Conditions
A Consonant can optionally be followed by a dependent vowel sign, kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].

Examples:
CM কি कि
CB কং कं
CD কঁ कँ
CX কঃ कः
CH ক্ क् (pure consonant)
CDB কঁং कँं
CDX কঁঃ कँः

5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB or DX.

Examples:
CMB কীং कीं
CMD কাঁ काँ
CMX বীঃ वीः
CMDB কাঁং काँं
CMDX কাঁঃ काँः

5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C

Examples:
CHC ন্ত → ন + ্ + ত; न्त → न + ् + त
CHCHC ন্ত্র → ন + ্ + ত + ্ + র; न्त्र → न + ् + त + ् + र
CHCHCHC ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য; न्त्र्य → न + ् + त + ् + र + ् + य

5.4.3.5 Subsets
While considering its subsets, we will take the combination CHC only as a representative example; however, the same is equally applicable to CHCHC and CHCHCHC.

[A] The combination may be followed by M, B, D, X, DB or DX.

Examples:
CHCM ক্কী → ক ্ ক ী; क्की → क ् क ी
CHCB ক্কং → ক ্ ক ং; क्कं → क ् क ं
CHCD ক্কঁ → ক ্ ক ঁ; क्कँ → क ् क ँ
CHCX ক্কঃ → ক ্ ক ঃ; क्कः → क ् क ः
CHCDB ক্কঁং → ক ্ ক ঁ ং; क्कँं → क ् क ँ ं
CHCDX ক্কঁঃ → ক ্ ক ঁ ঃ; क्कँः → क ् क ँ ः

[B] *3(CH)CM may further be followed by B, D, X, DB or DX.

Examples:
CHCMB ক্কীং → ক ্ ক ী ং; क्कीं → क ् क ी ं
CHCMD ক্কাঁ → ক ্ ক া ঁ; क्काँ → क ् क ा ँ
CHCMX ক্কীঃ → ক ্ ক ী ঃ; क्कीः → क ् क ी ः
CHCMDB ক্কাঁং → ক ্ ক া ঁ ং; क्काँं → क ् क ा ँ ं
CHCMDX ক্কাঁঃ → ক ্ ক া ঁ ঃ; क्काँः → क ् क ा ँ ः
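The consonant-sequence combinations of 5.4.3 can likewise be summarised as one pattern over the ABNF symbols. The regular expression below is a sketch of one reading of the lists above: a lone consonant with Hasanta (CH), or up to four consonants joined by Hasanta, optionally followed by M and then by one of B, D, X, DB or DX.

```python
import re

# Either the pure consonant CH, or *3(CH)C with optional M and an
# optional final B, D, X, DB or DX.
CONSONANT_SEQUENCE = re.compile(r"CH|(?:CH){0,3}CM?(?:DB|DX|B|D|X)?")

def is_consonant_sequence(symbols: str) -> bool:
    """True if the symbol string is one of the listed consonant sequences."""
    return CONSONANT_SEQUENCE.fullmatch(symbols) is not None
```

The bound `{0,3}` mirrors the ABNF form *3(CH)C, so a cluster of five consonants is rejected while a cluster of four is accepted.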
5.4.4 The Khanda-Ta sequence

5.4.4.1 A single ‘Khanda’-Ta (Z)
Example: Z ৎ

5.4.4.2 A Khanda Ta Combination [10]
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) ‘scolding’
Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).

10 Refer to Rule P in Section 7, Table 16

5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) could be described briefly here. Let us take up S in the first instance. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ - BANGLA LETTER A and U+098F এ - BANGLA LETTER E). To render Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF ya, preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English ‘bat’, ‘fat’, etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a ‘vowel killer’, although it actually indicates the absence of a vowel after the marked consonant. Only consonants can be marked with the Hasanta. But as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (Cf. Unicode 10.0, p. 473 [100]).
Another case refers to the formation of repha and ra-phalā in the said script, mentioned in the table above as P. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka “the sun”); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra “cycle”). The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or Assamese ra ৰ (U+09F0), followed/preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.
6 Variants
This section talks about the variants in the Bangla script. The NBGP categorizes these confusable variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from the character formations normally perceived by the larger linguistic community
For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formation. These can cause confusion even to a careful observer and are hence being proposed as variants.
Variants are generated in a script when two or more forms are produced with different storage or code points. In Bangla, the e-kāra, ā-kāra and the o-kāra have different code points. One can type o with a consonant at one go, or obtain the same by typing e-kāra and ā-kāra as two separate keys, getting the same result. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. Moreover, this will not be considered as a case of variant, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants
However, we propose two cases of true in-script variants in the Bangla script.

CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases wherein (U+09A5) থ (tha), with Hasanta, appears as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, could be a little confusing, as the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final position (as shown below in e.g. no. 1). It could also be typed wrongly, thinking it was a হ (ha) U+09B9, increasing the chances of risk in label writing and identification.
Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)
Fonts which represent the traditional Bangla writing system could tend to create this problem. Therefore, these may be taken as cases of variants in Bangla.

CASE II
Another interesting example of a variant is encountered in ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing ‘repha’ (involving ra + Hasanta) and ‘ra-phalā’ (involving Hasanta + ra).
‘Repha’ could be formed by two sequences (mainly because both Assamese and Bangla find a place in the same UNICODE points, and ‘B_RA’ as well as ‘A_RA’ refer to the same phonetic element). Here the final ligatures look the same and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C

Where
B_RA = U+09B0 BENGALI LETTER RA (র), or
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)

Example:
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ

Table 12: Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. Addition of Repha does not make any difference.
‘Ra-phala’ could be formed by two sequences on similar grounds, and the final ligatures would look the same:

(1) C1 + H + B_RA
(2) C1 + H + A_RA

Where C1 = any consonant except Khanda-ta

Example:
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ

Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusability for end users. Hence the repha and ra-phala cases need to be defined as variants. NBGP concluded to define র and ৰ as variant code points, where a single variant set between র and ৰ could cover all cases. But this will create blocked variant labels: e.g. if someone registers “ররর”, the variant label “ৰৰৰ” will be generated as a variant and will be blocked, and vice versa. However, it is only blocked at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, it is still possible.
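The label-level blocking described above can be sketched by folding both RAs to one code point and comparing the results. This is an illustration only: `variant_key` and `is_blocked_by` are hypothetical helpers, and the choice of U+09B0 as the folding target is arbitrary.

```python
RA_BANGLA = "\u09B0"
RA_ASSAMESE = "\u09F0"

def variant_key(label: str) -> str:
    """Fold Assamese RA to Bangla RA so variant labels share one key."""
    return label.replace(RA_ASSAMESE, RA_BANGLA)

def is_blocked_by(candidate: str, registered: str) -> bool:
    """True if candidate is a distinct variant label of a registered label."""
    return candidate != registered and variant_key(candidate) == variant_key(registered)

registered = RA_BANGLA * 3
```

With three Bangla RAs registered, the label of three Assamese RAs is blocked, while labels of two or four RAs remain available. Under this sketch, mixed labels of the same length (for example Bangla, Assamese, Bangla) collide as well, which is the usual consequence of defining a variant pair at code point level.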
6.2 Cross-Script Variants
A crisp cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia [11] (formerly Oriya), keeping in mind the visual and technical confusion they may cause as labels in the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain characters which are somewhat similar, they have not been included here. They have been provided in the Appendix (10.2) for reference.
11 Unicode uses Oriya for the script although Odia is now the official term used
1. Bangla and Nāgarī / Devanāgarī Script

Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F

Table 14: Bangla and Devanāgarī cross-script variant code points

2. Bangla and Gurmukhī Script

Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F

Table 15: Bangla and Gurmukhī cross-script variant code points
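Tables 14 and 15 amount to a small symmetric lookup between Bangla code points and their look-alikes in the other two scripts. A minimal sketch follows; the dictionary layout and the function name are illustrative, and no transitive relation between the Devanāgarī and Gurmukhī code points is assumed, since the tables do not state one.

```python
# Bangla code point -> cross-script variants from Tables 14 and 15.
BANGLA_TO_OTHER = {
    0x09AE: {0x092E, 0x0A38},  # MA: Devanagari MA, Gurmukhi SA
    0x09BF: {0x093F, 0x0A3F},  # VOWEL SIGN I: Devanagari and Gurmukhi
}

# Build the reverse direction so the lookup works from either script.
VARIANTS = {cp: set(others) for cp, others in BANGLA_TO_OTHER.items()}
for bangla_cp, others in BANGLA_TO_OTHER.items():
    for other in others:
        VARIANTS.setdefault(other, set()).add(bangla_cp)

def cross_script_variants(cp: int) -> set[int]:
    """Return the cross-script variant code points listed for cp."""
    return set(VARIANTS.get(cp, set()))
```

A lookup for a code point that appears in neither table, such as U+0995, returns an empty set.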
7 WholeLabelEvaluationRules(WLE) ThissectionprovidestheWLEsthatarerequiredbyallthelanguagesmentionedinsection32whenwritten inBangla12ScriptThe ruleshavebeendrafted in suchawaythattheycanbeeasilytranslatedintotheLGRspecificationsBelow are the symbols used in the WLE rules for each of the Indic SyllabicCategoryasmentionedinthetableprovidedinCodepointrepertoire(Section51)
C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9) or (a/e) Ya-phalā (V1 H C1 M1), where
    V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E),
    H is 09CD (্ - BENGALI SIGN VIRAMA),
    C1 is 09AF (য - BENGALI LETTER YA),
    M1 is 09BE (া - BENGALI VOWEL SIGN AA).
    S1 and S2 are valid even if they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where
    C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - Assamese letter RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL),
    H is 09CD (্ - BENGALI SIGN VIRAMA).
Table 16: Symbols used in WLE rules

12 As used by Unicode, denoting and including both Assamese and Maṇipuri.
It is also worth mentioning here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters" that can theoretically conjoin two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, p. 4]. More details of such combinations can be seen in Pabitra Sarkar (1993) [135].

7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented.
1. C is the set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়.
2. H must be preceded by C.
3. M must be preceded by C. Example: কা
4. D must be preceded by either V, C or M. Example: আখখা হা
5. X must be preceded by either V, C, M or D. Example: উঃখঃবঃাঃ দ ঃ
6. B must be preceded by either V, C, M or D. Example: আং, ইং, কং
7. Z must be preceded by V, C, M, D, B, X or P. Example: ইৎকৎাৎাৎপ6ৎrৎ (S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V Preceded by H.
9. S CANNOT be preceded by H.
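Rules 2 to 8 above are "must be preceded by" constraints, which makes them straightforward to express as a lookup table. The Python sketch below illustrates that idea under stated assumptions: the code point classes are approximations chosen for the example (the normative repertoire is the one in Section 5.1), the function names are hypothetical, and rule 1 (nukta forms) and the special sequences S are deliberately left out.

```python
# Assumed, approximate classes; NOT the normative repertoire of Section 5.1.
CLASSES = {
    "V": set(range(0x0985, 0x098D)) | {0x098F, 0x0990, 0x0993, 0x0994},
    "C": (set(range(0x0995, 0x09A9)) | set(range(0x09AA, 0x09B1))
          | {0x09B2, 0x09B6, 0x09B7, 0x09B8, 0x09B9, 0x09F0, 0x09F1}),
    "M": set(range(0x09BE, 0x09C5)) | {0x09C7, 0x09C8, 0x09CB, 0x09CC},
    "B": {0x0982},  # Anusvara
    "D": {0x0981},  # Candrabindu
    "X": {0x0983},  # Visarga
    "H": {0x09CD},  # Hasanta
    "Z": {0x09CE},  # Khanda Ta
}
RA = {0x09B0, 0x09F0}  # C2 in P = Ra-Hasanta

# Rules 2-7: which class may immediately precede each symbol.
ALLOWED_BEFORE = {
    "H": {"C"},
    "M": {"C"},
    "D": {"V", "C", "M"},
    "X": {"V", "C", "M", "D"},
    "B": {"V", "C", "M", "D"},
    "Z": {"V", "C", "M", "D", "B", "X", "P"},
}


def classify(ch):
    """Return the class symbol of a character, or None if unclassified."""
    for name, members in CLASSES.items():
        if ord(ch) in members:
            return name
    return None


def check_label(label):
    """Check rules 2-8 on a label; S sequences and nukta forms are ignored."""
    cats = [classify(ch) for ch in label]
    if None in cats:
        return False
    for i, cat in enumerate(cats):
        prev = cats[i - 1] if i > 0 else None
        # A Hasanta that follows RA counts as P (Ra-Hasanta) before Z.
        if cat == "Z" and prev == "H" and i >= 2 and ord(label[i - 2]) in RA:
            prev = "P"
        if cat in ALLOWED_BEFORE and prev not in ALLOWED_BEFORE[cat]:
            return False
        if cat == "V" and prev == "H":  # rule 8
            return False
    return True
```

Because a symbol at the start of a label has no predecessor, the same table also rejects labels that begin with H, M, B, D, X or Z. Doubled signs such as Anusvāra followed by Anusvāra fail as well, since B is not in the set allowed before B.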
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in permitting such ligatures: any user can easily create them, even though they may not be attested combinations.

7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া /bæŋk ʌv ɪndiə/ (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE), meaning "Bank of India". This is the case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, a space or the Zero Width Non-Joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic communities; hence it is explicitly prohibited by the NBGP.
In future, depending on the prevailing requirements from the community, a future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules that the ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.

1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Example:
ক क
াক ाक
ংক क
ক क
ঃক ःक
As can be seen, such a combination will automatically result in a "golu", or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is applied quasi-automatically wherever supported by the OS.

2. H is not permitted after V, B, D, X, M, S.
Example
অ अ
অং क ক क কঃ कः ি )क
3. The number of B, D or X permitted after a consonant, a vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalid.
Example
কংং क
ক क
কঃঃ कःः
কা का
কীঃঃ कःः
অংং अ
অ अ
অঃঃ अःः
4. The number of M permitted after a consonant is restricted to one.
Example:
কীী की

5. M is not permitted after V.
Example:
ইাঈৗ ईाईौ

6. The combinations of Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Example
কংঃ कः
কঃং कः
8 Contributors

8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members

8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals' Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje & Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920 [2000]. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. 'Bangla Harapher Panch Parba'. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and Language Planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the M.S. University of Baroda, Social Science Number 29.2: 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25/11/2017.
[124] Ethnologue: Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25/11/2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20/10/2019.
[126] Wikipedia: Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25/11/2017.
[129] Omniglot: Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10/5/2018.
[130] Wikipedia: Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25/11/2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20, accessed on 10/5/2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10/5/2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10/5/2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1.1: 23-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1.2: 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). 'Srīhatta-Nāgarī Lipir Utpatti o Bikāś'. Bangla Academy Patrika (Dhaka), Vol. 1.2 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia: Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19/5/2018.
[139] Furui, Ryosuke (2015). 'Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century', in Bhairabi Prasad Sahu & Hermann Kulke, eds. Interrogating Political Systems: Integrative Processes and States in Pre-Modern India, Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). 'Phones of Bengali'. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646. UNICODE. https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia: Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. 'Bangla Spelling Reform: the Long and Short of It'. Bangla Journal 19: 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. "What has happened So Far In terms of Script Reforms". Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka, on 10/07/2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 – Core Specification, Chapter 12, p. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language or script certain restriction rules apply. In other words, in a given language some of the formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla13 in particular, the following rules apply:

1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold 'varga' (as defined under Table 5). Moreover, Bangla generally does not allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem 'Lichuchor' written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).

2. C H can come with Khanda Ta only in the case where C is ra (র) (09B0):
র্ৎ as in ভর্ৎসনা

3. Only the following combinations with V H C M will be allowed:
→ অ্যা (together pronounced as æ) as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ) as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
10.2 'Sylheti Nāgarī lipi' or 'Siloti'
This version of Bangla script resembles the 'Kaithī' script (ISO 15924), used by accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar and widely in use during the 1880s. There were several other names for Sylheti Nāgarī or Siloti [129], such as 'Jalalabada Nāgarī', 'Fula (flower) Nāgarī', 'Muslim Nāgarī' or 'Muhammad Nāgarī'. It is said that Shah Jalal brought the script with him to Sylhet in the 13th-14th century [138], although some have suggested that it was an invention of the Afghan rulers of Sylhet [137]. Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script, with 32 symbols, are reproduced here [138].

13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.

Table 17 – The Script Table of Sylheti Nāgarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough so to be defined as variant code points.

10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable
Table 18: Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhi
Bangla | Gurmukhi | NBGP Decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable
Table 19: Bangla and Gurmukhi confusable code points

Bangla | Gurmukhi | NBGP Decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable
Table 20: Bangla and Gurmukhi distinguishable code points
10.3.3 Bangla and Oriya (Odia)
Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable
Table 21: Bangla and Oriya confusable code points

Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable
Table 22: Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs

Consonants | Phonetic Value | Allographs: Clusters | Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value, but is mostly pronounced as n.
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒ The secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word it is a vowel, æ (as in শ্যামল, র‍্যাপার etc.). But after a non-initial consonant it just doubles it in pronunciation (as in কার্য, ধার্য etc.). The র + য combination has two physical manifestations: র‍্য and র্য.
ক্য (ক্ + য), স্য (স্ + য), র‍্য (র্ + য) [র‍্য alone is never used in Bangla orthography; র‍্যা is, but then its last two symbols, ya-phala and ā-kāra, constitute a vowel sign representing the vowel অ্যা.]
র r Two manifestations:
i. রেফ repʰ, as the first member of a cluster, e.g. র্প, র্ৎ, র্দ্র, র্য, র্ধ্ব (র্ + ধ্ + ব) (earlier র্দ্ধ্ব = র্ + দ্ + ধ্ + ব, a four-term cluster) etc. (placed over the following consonant)
ii. র-ফলা rɔ-pʰɔla, as the second or third member of a cluster (placed under the consonant it follows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃকঃ
অব
Bangla script is illustrated in the inscriptions of the Varmana, Sena and Deva rulers of the twelfth and thirteenth centuries [104]. The evolution of the Bangla script (cf. [136]) is aligned with the story of the advancement of printing technology. The first movable-type characters were created and used in printing Nathaniel Brassey Halhed's (1751-1830) 1778 book titled A Grammar of the Bengal Language. In 1785, Governor-General Warren Hastings (1732-1818) requested another civilian, Charles Wilkins (1749-1836), to cut punches for Bangla printing characters. The current printed form of Bangla script appeared soon after. It is generally agreed that Wilkins developed the Bangla print script [111]. He passed on this knowledge to Pancanana Karmakara (-1804), a renowned artist in Bengal. Later it was Karmakar and his family that became famous in Bangla printing technology. Shepherd was another assistant of Wilkins in this designing of the script, which became more angular, with sharper turns and edges [133]. A few archaic letters were modernized during the 19th century. The script was standardized by Pandit Ishwar Chandra Vidyasagar when the Bangla type fonts were to be used for publishing on a large scale under the Calcutta School Book Society [116, for several references].

Much later, in 1935, the Linotype technique invented by Ottmar Mergenthaler (1854-1899) in 1886 was introduced into Bangla printing through the efforts of Suresh Chandra Majumdar (1888-1954), Rajsekhar Basu (1880-1960), Jatindra Kumar Sen (1882-1966) and his disciple Sushil Kumar Bhattacharya, and began to be used by the Ānandabāzāra Patrikā group, later followed by others. Within a few years, the more advanced monotype technology came to be used in Bangla printing. However, in Bangla printing culture monotype had very limited acceptance, and linotype held the stage until digital technology eventually came in to replace all earlier techniques. All of this can be presented in a table:
PERIOD | DESCRIPTION | NAMES
3rd Century BC | Use of the Brāhmī and Kharosthī scripts begins in the subcontinent. Brāhmī was widely used during the reign of the Mauryan King Aśoka. In one theory, Brāhmī is based on a North Semitic alphabet, suitably modified to fit the needs of local languages. It is currently believed to have been an independent development. | Brāhmī
1st-3rd Century AD | The Kusana script, named after the Kusana royal dynasty. | Kusana script
4th-5th Century AD | The next stage of its evolution was into the Gupta script, named after the Gupta royal dynasty. | Gupta script
7th Century AD | Epigraphic records from Bangladesh demonstrate remarkable developments in Eastern Brāhmī, giving rise to the Kuṭila-lipi. | Kutila-lipi
8th Century AD | Some copper plate inscriptions are found in Khalimpur, Bangladesh, from the reign of Dharmapāla; from Monghyr and Nālandā in Bihar, of the time of Devapāla; and from Jagjīvanapura in West Bengal, of the reign of Mahendrapāla. | Siddhamātikā
9th Century AD until 1025 AD | Proto-Bangla characteristics develop in rudimentary forms. An important landmark in the development of the Bangla script is the Ramaganja copper plate inscription of Mahāmāndalika, found in the last quarter of the eleventh century AD. | Proto-Bangla Script & Language
12th-13th Century AD | A mature form of Proto-Bangla, the immediate precursor of Bangla script, is found in the inscriptions of the Varmana, Sena and Deva rulers of the twelfth and thirteenth centuries. | Matured Proto-Bangla
14th-15th Century AD | The characteristics of typical Bangla script began to develop, as can be seen in the copper plate inscription of Vijayamānikya-I of Tripura, dated 1478 AD, which also illustrates forms of Bangla letters in the fifteenth century AD. | Modern Bangla Script era begins (see Ross 1999)
16th-17th Century AD | The chart of the Bangla alphabet appended to the China Monuments, published from Amsterdam in 1667, and The code of Gentoo law, published from London in 1776, both show a chart of the Bangla alphabet. They show 16 vowel letters, including the long 'ৡ' ('lī'), Anusvāra and Visarga, and 34 consonants. | Printed Charts of Bangla
18th-19th Century AD | Charles Wilkins develops printing in Bangla in 1778 and Vidyasagar reforms it. | Bangla Type Fonts
Table 2: Development of the Bangla Writing System
The overall development of Bangla script from the Kuṭila-lipi period to Modern Bangla can be seen here in Table 3 ([102] and [146]; see also the web page in [147]).
Table 3: Bangla Script in Different Centuries
3.2 Languages Considered
Below is the tabular representation of the languages using Bangla script that are placed on EGIDS Scale 1-6 (see [117] for details). Some languages under EGIDS 5 and 6 have also developed their own scripts for printing and publishing. Some had used Bangla script earlier (such as Bodo), or used it in West Bengal at some point of time (Santali), but have later shifted to another writing system. Bodo is now written in Nāgarī or Devanāgarī, and for Santali one uses both Nāgarī/Devanāgarī and Ol-chiki [145]. For the purposes of the Bangla LGR, only languages belonging to EGIDS Scale 1 to 4 have been considered. Consider the following table:
EGIDS Scale 1
EGIDS Scale 2
EGIDS Scale 3
EGIDS Scale 4
EGIDS Scale 5
EGIDS Scale 6
Bangla (Bengali)
Santali, Bodo, Riang, Khumi, Mru(ng), Asho
Lepcha, Pnar, Koda, Kora, Chak
Asamiyā (Assamese)
Koch or Rajabansī
Malto or Malpahariya
Manipuri or Meitei
Bisnupriya Manipuri, Kok-Borok (Tripura & Bangladesh)
Chakma, Hajong, Mundari & Kurux (of Bangladesh)
Toto, Rohingya, Tippera, Megam, Tanchangya
Usoi, Limbu, Sadri or Oraon
Bhumij or Mundari, Bawm, Chin
Table 4: Main languages in India and Bangladesh that use Bangla Script, on the EGIDS Scale
3.3 Notable Features of Bangla Script [150]
The Bangla writing system has certain features that show how it has to be written, or how typesetting in Bangla could be done. This section is followed by a section that explains the code points (and fixed code point sequences) which show certain distinctive characteristics of Bangla and which make up the repertoire. The next sections will also cover the 'akshar'-formation rules (ABNF) showing character classes, Whole Label Evaluation (WLE) and context rules, as well as in-script and cross-script variants. Here we present some basic features of the script and its pronunciation.

The Bangla script is an alpha-syllabic writing system in which all consonants are assumed to contain an accompanying 'inherent' vowel (theoretically before or after each consonant). It varies between ɔ and o, depending on the position of the consonant in the word. At times these 'assumed' or 'inherent' vowels are not pronounced at all [142].

Vowels can be written as independent letters, or by using a variety of diacritical marks which are written above, below, before, after, or both before and after the consonant they follow in pronunciation [105].

All Bangla consonants, when pronounced in isolation, are uttered with an inherent vowel -ɔ; hence ক 'k', খ 'kh' or গ 'g' are usually pronounced as [kɔ], [khɔ] or [gɔ] etc. Phonologically, the Bangla vowel -ɔ corresponds to the Hindi schwa ə.
When consonants occur together in clusters, special conjunct letters are formed. In printed Bangla many of these consonantal clusters or conjoined consonants are in use. The letters for the consonants other than the final one in the group are generally reduced. But there are a few special conjunct characters which are compounds of the consonant characters, e.g. ক্ (k) + ষ (s) = ক্ষ (ks), ঞ্ (n) + জ (j) = ঞ্জ (nj), জ্ (j) + ঞ (n) = জ্ঞ (jn), হ্ (h) + ম (m) = হ্ম (hm). There are other issues also: র as the second member of a cluster is reduced to a secondary symbol, e.g. প্ (p) + র (r) = প্র (pr), ষ্ (s) + ট্ (t) + র (r) = ষ্ট্র (str) (as in উষ্ট্র ustra "camel"). য (y), when used as a primary symbol, represents jɔ in Bangla. But its secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word it is a vowel, æ (as in শ্যামল (syamala) "green", র‍্যাপার (ryapara) "wrapper" etc.). But after a non-initial consonant it just doubles it in pronunciation (as in কার্য, ধার্য etc.). The র্ (r) + য (y) combination has two renderings: র‍্য (ry) and র্য (ry). In the case of দ্ (d) + ধ (dh), গ্ (g) + ধ (dh) and ন্ (n) + ধ (dh), the shape of the second member is changed, e.g. দ্ধ (ddh), গ্ধ (gdh) and ন্ধ (ndh) respectively. The solitary example of র্ (r) + ঋ (r) = র্ঋ (as in নৈর্ঋত nairrt "Southwest"), used mostly in cases of Classical borrowings, shows the use of the secondary symbol of a consonant followed by the primary symbol of a vowel. The inherent vowel only applies to the final consonant of the cluster.
In consonant clusters, many consonants take a completely different form. Some typical examples are ক্ত (kt), ক্র (kr), ক্ষ (ks), গ্ধ (gdh), জ্ঞ (jn), ঞ্চ (nc), ঞ্জ (nj), ত্ত (tt), ন্ত (nt), ন্ধ (ndh), ব্ধ (bdh), ভ্র (bhr), ম্ব (mb), স্ত (st) etc. র has two allographs apart from its full shape: one is 'repha', as found in র্ক (rk), র্প (rp), and another is ra-phala, as in প্র (pr), ক্র (kr). ষ্ণ (s+n) is another one, where the cerebral nasal consonant sign takes a queer shape [151].
The Bangla script has at least fifty-two primary symbols and quite a few allographs (positional variants of them), corresponding to forty-four phonemes (7 oral and 7 nasal vowels and 30 consonants) [150], or functional speech sounds, with some obvious redundancies, although in one of the first phonemic analyses the number was thought to be thirty-five phonemes [140].
As mentioned above, in Bangla several graphemic symbols have secondary shapes, technically called 'allographs', with a complementary distribution in each case. These graphs or markings are generally added in the following positions relative to the primary symbol [113]:
1) Below(egক(ku)W(nta)ক(ku)^ (hra)etc)
2) Above(egচ (ca)কH (rka)etc)
3) Right side (e.g. কা (ka), কং (kan) etc.)
4) Left side (e.g. কে (ke))
5) Left side and above simultaneously (e.g. কৈ (kai), কি (ki) etc.)
6) Right side and above simultaneously (e.g. কী (kī))
7) Right side and left side simultaneously (e.g. কো (ko))
8) Right side, left side and above simultaneously (e.g. কৌ (kau))
As for the complementary distribution of vowel letters (word- or syllable-initial) and vowel Mātrās, which are relevant for the ABNF, let us consider the following. Besides some simple vowel modifiers called 'Kārs' in Bangla (also referred to as Mātrā in the other LGR documents of Neo-Brāhmī), there are some combinatory modifiers of Bangla vowels with certain consonants. For example, whereas

আ U+0986 BENGALI LETTER AA is substituted by া U+09BE BENGALI VOWEL SIGN AA,
ই U+0987 BENGALI LETTER I is substituted by pre-posed ি U+09BF BENGALI VOWEL SIGN I,
ঈ U+0988 BENGALI LETTER II is substituted by ী U+09C0 BENGALI VOWEL SIGN II, or
উ U+0989 BENGALI LETTER U is substituted by ু U+09C1 BENGALI VOWEL SIGN U, by marking below the primary grapheme,

there are some special vowel modifiers of উ, as in the following combined letters:

গু gu, rather than writing as গ (g) + ু (u)
রু ru, rather than writing as র (r) + ু (u)
শু śu, rather than writing as শ (s) + ু (u)
হু hu, rather than writing as হ (h) + ু (u)
ন্তু ntu, rather than writing as ন্ (n) + ত (t) + ু (u)
Similarly, there could be vowel modifiers of ঊ or '(long) ū' as well, e.g. ভ্ (bh) + র (r) (ভ্রূ bhru "eyebrow"), শ্ (s) + র (r) (শ্রূ sru), or of ঋ (r) after হ (h) (হৃ hr) etc.
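The substitution of an independent vowel letter by its dependent vowel sign, as listed above, amounts to a small lookup table. The sketch below is illustrative only: it covers just the four pairs named in the text, and the names `LETTER_TO_SIGN` and `attach_vowel` are hypothetical. The special combined shapes (for example gu or hu) are a matter of font rendering; in encoded text they remain the plain consonant + vowel sign sequence.

```python
import unicodedata

# Independent vowel letter -> dependent vowel sign (pairs named in the text).
LETTER_TO_SIGN = {
    "\u0986": "\u09BE",  # AA
    "\u0987": "\u09BF",  # I (the sign is displayed before the consonant)
    "\u0988": "\u09C0",  # II
    "\u0989": "\u09C1",  # U (the sign is displayed below the consonant)
}


def attach_vowel(consonant, vowel_letter):
    """Encode consonant + vowel as consonant followed by the vowel sign."""
    return consonant + LETTER_TO_SIGN[vowel_letter]


def describe(text):
    """List the Unicode character names of a string, in logical order."""
    return [unicodedata.name(ch) for ch in text]
```

Note that the pre-posed sign for I is still stored after the consonant in logical order; only its visual position changes.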
There have been many notable contributions in simplifying and modifying Bangla spellings and combinatory techniques, especially by scholars such as Pabitra Sarkar (1992) [134]. In this there has been an attempt to reduce the number of allographs of both vowels and consonants in clusters, and it has been widely accepted in the printing of school texts in both Bangladesh and West Bengal [151, 152]. As of now, two systems, the old (traditional) and the new, go on side by side, operative in different domains.
HoweverinpreparationofthisLGRdocumenttheaimhasbeentoconsiderthewidelyused and usable sequences and combinations and their variations across the sisterscriptsbelongingtothebasketofBrāhmīwritingsystemsBanglaAcademyDhakapublishedStandardBanglaSpellingRulesin1992followingtherecommendationsofacommitteeconstitutedthroughaworkshopjointlyorganizedbytheJatıyaSy iksakramaandPathyapustakaBoardin1988AthroughlyrevisededitionoftheRuleswaspublishedinSeptember20126After the establishment of Banla A kademi ofWestBengal in 1986 its first PresidentAnnadasankar Ray (1904-2002) in his inaugural address gave a direction forstandardizationofBanglaalphabetscript thespellingsystemandclearlyarguedthattheywouldnotblindly followtheSanskriticmodelofconventionalgrammarAbroadlistofproposalswassenttoexpertsonBanglaandabroadagreementwasreachedforlsquohomogenizationofBanglaspellingrsquoby1988BasedonopinionsreceivedfromdifferentquartersaunanimouslistoflsquorulesrsquowasagreeduponThiswaspublishedbyalsquoSpellingDictionaryrsquo titled Ākādemi Bānāna Abhidhāna (1997) which was obviously morecomprehensive than lsquoTheUniversityofCalcuttaproposalsrsquomade in1936Alongwiththe lsquorationalizationrsquo of spellings another stepwas taken tomake thewriting systemeasier to read by making the symbols used both single and combined ones morelsquotransparentrsquoThesereformswereoriginallysuggestedbySarkar(1987firstpublishedin1978)[134][153]whereheusedthetermsSwaccha (lsquoTransparentrsquo)andAswaccha(lsquoOpaquersquo or non-transparent) even adding Ardha Swaccha (lsquohalf transparent) inbetweenthetwoSomesampleexamplesare
6 Bangla Academy. 2012. Bāṅlā Ekaḍemī Pramita Bāṅlā Bānānera Niyama (Bangla Academy Standard Bangla Spelling Rules). Dhaka: Bangla Academy.
Transparent: ন্ন (nn), প্ত (pt), স্ত (st), where both members of the cluster can be recognized.
Opaque, where neither of the two can be (easily) recognized: ক্ষ kṣ (ক k + ষ ṣ), জ্ঞ jñ (জ j + ঞ ñ), ঙ্গ ṅg (ঙ ṅ + গ g), হ্ম hm (হ h + ম m).
Semi-transparent: প্র (pr), র্প (rp), where one symbol is recognizable and the other is not.
In the case of three-term clusters, at least one symbol will not be transparent, e.g., স্ত্র str (স s + ত t + র r), ষ্ট্র ṣṭr (ষ ṣ + ট ṭ + র r), etc.
There were in fact two types of proposals. One concerned the shape of the letters: those of consonant + vowel (CV) combinations, and conjuncts, which are consonant + consonant combinations. There were further complex shapes, i.e., those of consonant + consonant + (consonant +) vowel (CC(C)V) signs, as in প্রু (pru) or স্ক্রু (skru). Some decisions in this area were necessary because a few of the CC(C) symbols represented complexities that made learning them difficult for children. The other dealt with the spellings of words only, without any reference to the shapes of the letters in which they were written. The basic objective here was 'one word, one spelling' to the greatest extent possible [151].
Below we place a statement of the most salient changes that affect the consonant + vowel combinations [153]:
a. The variants of the short u (উ-কার, hrasva u-kāra) vowel sign have been brought down to one. So the special ligature for gu is now written as গ with the regular sign: গু. Similarly for রু (ru), শু (śu) and হু (hu), and therefore for cluster + short u sign: ন্তু ntu (ন + ্ + ত + উ), স্তু stu (স + ্ + ত + উ).
b. The variants of the long u (দীর্ঘ ঊ-কার, dīrgha ū-kāra) have also been reduced: রূ (rū), ভ্রূ bhrū (ভ bh + ্ + র r + ঊ ū), দ্রূ drū (দ d + ্ + র r + ঊ ū), শ্রূ śrū (শ ś + ্ + র r + ঊ ū).
c. The variants of ঋ-কার (ṛ-kāra, the secondary symbol of ṛ) have been brought down to one: হৃ (hṛ).
Regarding consonant + consonant + (consonant) … + (vowel) clusters, the Paschimbanga Bangla Akademi proposed transparent or semi-transparent shapes for clusters to the extent admissible in the Bangla writing system. Some examples will clarify the proposal. (A slash will mean that the traditional cluster shape precedes it, while the Bangla Akademi innovation follows.) [153]
ব্ধ bdh (ব b + ধ dh), দ্ধ ddh (দ d + ধ dh), ন্থ nth (ন n + থ th), ঞ্চ ñc (ঞ ñ + চ c), ঞ্ছ ñch (ঞ ñ + ছ ch), ঞ্জ ñj (ঞ ñ + জ j), ক্ত kt (ক k + ত t), ক্র kr (ক k + র r), গ্ধ gdh (গ g + ধ dh), ঙ্ক ṅk (ঙ ṅ + ক k), ঙ্গ ṅg (ঙ ṅ + গ g), ষ্ণ ṣṇ (ষ ṣ + ণ ṇ), ন্ধ্র ndhr (ন n + ধ dh + র r), ণ্ড্র ṇḍr (ণ ṇ + ড ḍ + র r), ক্ত্র ktr (ক k + ত t + র r).
3.3.1 The Consonants
As per the traditional classification, Bangla consonants are categorized according to their phonetic properties, especially in terms of place and manner of articulation [107]. There are five 'Varga' (pronounced 'Barga' in Bangla) or groups (sets or classes), distinguished by place of articulation, and one non-'Varga' group [105]. Each Varga, which corresponds to stops at a certain place of articulation, contains a series of five consonants classified by their phonetic qualities (i.e., manner of articulation): the series begins with the unvoiced unaspirated consonant, proceeds through unvoiced aspirated and voiced unaspirated to voiced aspirated (in the fourth column), and ends with the homorganic or corresponding nasal [107]. Consider the following table:
'Varga' or Sets | Unvoiced -Asp | Unvoiced +Asp | Voiced -Asp | Voiced +Asp | Nasal
Velar | ক 'K' U+0995 | খ 'KH' U+0996 | গ 'G' U+0997 | ঘ 'GH' U+0998 | ঙ 'NG' U+0999
Palatal | চ 'C' U+099A | ছ 'CH' U+099B | জ 'J' U+099C | ঝ 'JH' U+099D | ঞ 'NY' U+099E
Retroflex | ট 'TT' U+099F | ঠ 'TTH' U+09A0 | ড 'DD' U+09A1 | ঢ 'DDH' U+09A2 | ণ 'NN' U+09A3
Dental | ত 'T' U+09A4 | থ 'TH' U+09A5 | দ 'D' U+09A6 | ধ 'DH' U+09A7 | ন 'N' U+09A8
Bilabial | প 'P' U+09AA | ফ 'PH' U+09AB | ব 'B' U+09AC | ভ 'BH' U+09AD | ম 'M' U+09AE
Table 5: Varga classification of Bangla consonants
(Falling into a pattern of five sets of Unvoiced Unaspirated, Unvoiced Aspirated, Voiced Unaspirated, Voiced Aspirated and Nasals, called the five 'Varga')
Non-Varga | য 'Y' U+09AF | য় 'YY' U+09DF | র 'R' U+09B0 | ড় 'RR' U+09DC | ঢ় 'RH' U+09DD | ল 'L' U+09B2 | শ 'SH' U+09B6 | ষ 'SS' U+09B7 | স 'S' U+09B8 | হ 'H' U+09B9
Table 6: Non-Varga consonants (Not falling into any of the five categories)
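The regular layout of Table 5 can be read directly off the code points, since each Varga occupies five consecutive positions in the Bengali block. The following is a minimal sketch of such a lookup; the names VARGA_START, COLUMNS and classify_consonant are illustrative and are not part of the LGR itself.

```python
# Each varga of Table 5 is a run of five consecutive code points, ordered
# unvoiced -asp, unvoiced +asp, voiced -asp, voiced +asp, nasal.
VARGA_START = {
    "Velar": 0x0995,
    "Palatal": 0x099A,
    "Retroflex": 0x099F,
    "Dental": 0x09A4,
    "Bilabial": 0x09AA,  # U+09A9 is unassigned, so this row starts at U+09AA
}
COLUMNS = ("unvoiced -asp", "unvoiced +asp", "voiced -asp", "voiced +asp", "nasal")


def classify_consonant(ch):
    """Return (varga, column) for a varga consonant of Table 5, else None."""
    cp = ord(ch)
    for varga, start in VARGA_START.items():
        if start <= cp < start + 5:
            return varga, COLUMNS[cp - start]
    return None  # non-varga consonants (Table 6) and all other characters
```

For example, classify_consonant applied to ক (U+0995) gives the Velar row and the unvoiced unaspirated column, while র (U+09B0) from Table 6 gives None.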
3.3.2 The Implicit Vowel Killer: Hasanta (called 'Halant' or 'Halanta' in other Brāhmī-based scripts)
As stated earlier, all consonants are pronounced in isolation with an implicit vowel (the central back -ɔ in Bangla, as the neutral vowel) assumed to be associated with them [121]. The terms 'Hasanta' (= 'Halant' or 'Halanta' in other Brāhmī-based scripts) and 'Virāma' (see footnote 7) (= 'Dāṛi' in Bangla), the latter as preferred in UNICODE (cf. Unicode 3.0 and above), are used in this report to denote the character that marks the absence of this inherent vowel. It may be noted that the term virama has been adopted in UNICODE in a sense that is different from the traditional definition in grammar, and hence it requires some explanation here. Considering the importance of the document, this note should be a part of this LGR document, so that anybody referring to it can find the proper grammatical explanation of the term.
Because a special sign is needed whenever this implicit vowel is stripped off, the symbol is known as the Hasanta (= Halant) (U+09CD). By placing the Hasanta under the first consonant of a combination or cluster one could, in common parlance, "kill" its vowel and create conjuncts. In this manner, conjunct characters can generally be written by joining two to four consonants. In rare cases, this process can join up to five consonants. However, the notion of a maximum number of consonants joining to form one akṣara (see footnote 8) is to be bounded empirically. This is an observation based on the CIIL-Emille Corpora of Bangla words [132, 133] as seen in print these days. Given the mixture of scripts and languages happening on the web, the possibility that one may want a generic Top Level Domain [gTLD] with more than the observed maximum cannot be ruled out. This can be the case when a foreign-language word which admits a large number of consonants is transliterated into Bangla. Hence, in the Bangla LGR work this limit will not be enforced.
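The empirical observation about conjunct length can be checked on any label by counting how many consonants are chained together by Hasanta. The sketch below is illustrative only: the function names are hypothetical, the consonant test is a simplified range check over the Bengali block, and no upper limit is enforced, in line with the decision stated above.

```python
HASANTA = "\u09CD"
NUKTA = "\u09BC"


def is_consonant(ch):
    # Simplified assumption: U+0995..U+09B9 (this range also spans a few
    # unassigned positions) plus the two Assamese letters U+09F0 and U+09F1.
    return "\u0995" <= ch <= "\u09B9" or ch in ("\u09F0", "\u09F1")


def longest_conjunct(label):
    """Number of consonants in the longest C (H C)* run found in label."""
    best = 0
    i, n = 0, len(label)
    while i < n:
        if not is_consonant(label[i]):
            i += 1
            continue
        run = 1
        i += 1
        if i < n and label[i] == NUKTA:  # base + nukta counts as one consonant
            i += 1
        # Extend the run for every following "Hasanta + consonant" pair.
        while i + 1 < n and label[i] == HASANTA and is_consonant(label[i + 1]):
            run += 1
            i += 2
            if i < n and label[i] == NUKTA:
                i += 1
        best = max(best, run)
    return best
```

Applied to চক্র (U+099A U+0995 U+09CD U+09B0) the function reports a two-consonant conjunct; applied to স্ত্র it reports three.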
7 Virāma as used here is also a misnomer according to the Indian grammatical traditions. Nowhere is the mere absence of a vowel marked as virama. Hasanta just marks the absence of a vowel, nothing else. (Abhyankar, Kashinath Vasudev & J. M. Shukla. 1961. A Dictionary of Sanskrit Grammar. Baroda: Oriental Institute.)
8 This term needs to be disambiguated. Akṣara also means 'syllable' in Indian grammatical traditions.

3.3.3 Vowels
Separate symbols exist for all 'Swara' or vowels in Bangla, which are pronounced independently either at the beginning of the word or after another vowel or consonant sound. To indicate a vowel sound other than the implicit one, a vowel sign called 'kār' in Bangla, or Mātrā in Nāgarī (see footnote 9), is attached to the consonant. Since the consonant has this built-in neutral vowel at the end, there are equivalent kāras (Mātrās) for all vowels except অ (pronounced -ɔ). The correlation is shown as follows:
Vowel | Corresponding vowel sign (kāras (Mātrās))
অ 'A' U+0985 | -
আ 'AA' U+0986 | া U+09BE
ই 'I' U+0987 | ি U+09BF
ঈ 'II' U+0988 | ী U+09C0
উ 'U' U+0989 | ু U+09C1
ঊ 'UU' U+098A | ূ U+09C2
ঋ Vocalic 'R' U+098B | ৃ U+09C3
ৠ Vocalic 'RR' U+09E0 | ৄ U+09C4
ঌ Vocalic 'L' U+098C | ৢ U+09E2
ৡ Vocalic 'LL' U+09E1 | ৣ U+09E3
এ 'E' U+098F | ে U+09C7
ঐ 'AI' U+0990 | ৈ U+09C8
ও 'O' U+0993 | ো U+09CB
ঔ 'AU' U+0994 | ৌ U+09CC
- | ৗ U+09D7
Could appear on top of অ 'A' U+0985 or any other vowel | ঁ U+0981 Candrabindu
Could appear after অ 'A' U+0985 or any other vowel | ং U+0982 Anusvara
Could appear after অ 'A' U+0985 or any other vowel | ঃ U+0983 Visarga
After any consonant | ্ U+09CD (Hasanta)
- | ঽ U+09BD Avagraha
Table 7: Bangla vowels with corresponding kārs

9 The term 'Mātrā' in Bangla, however, stands for an altogether different concept, viz. the top bar placed over a letter, typically present in Hindi and Bangla but missing in Gujarati.
3.3.4 The Anusvāra / onuʃʃār (ং - U+0982)
The Anusvāra, or onuʃʃār in Bangla, at times represents a homorganic nasal, but not always. It replaces a conjunct group of 'Nasal Consonant + Hasanta + Consonant' where the second consonant belongs to the velar varga or set, as in লংকা. But it often appears also for such combinations involving non-velars as the last member of the combination, as in ল্যাংটা "naked" or ল্যাংচা "a kind of sweet; to limp". Before a non-varga consonant, the Anusvāra represents a nasal sound that may have an alternative conjoined writing symbol representing the corresponding nasal consonant of the particular set. Although Modern Hindi, Marathi and Konkani prefer the anusvāra to the corresponding half-nasal, in Bangla it is clearly demarcated where one must use the Anusvāra and where it has to be a conjunct cluster with a nasal as the first or the second component.
3.3.5 Nasalization: Candrabindu (ঁ - U+0981)
Candrabindu denotes nasalization of the preceding vowel, as in চাঁদ cãd 'moon' (U+099A U+09BE U+0981 U+09A6). This sign, with a dot inside the half-moon mark, is used as a nasalization marker in many Brāhmī-based scripts [143].
3.3.6 Nukta (় - U+09BC)
The nukta sign does not exist in Bangla orthography. It is predominantly used in many Brāhmī-derived scripts such as Devanāgarī (for Hindi, Bodo, Maithili, Santali, Kashmiri and Sindhi). The term and the concept of nukta are borrowed in Bangla.
The IDNA Protocol (RFC 5891) states that IDNs must be in Unicode Normalization Form C (NFC). RFC 7940 applies this requirement to LGRs. The definition of NFC in the Unicode Standard contains a number of composition exclusions. As a result, the Bangla letters য় YYA, ড় RRA and ঢ় RHA have to be represented in this LGR by the sequences (YA + Nukta: U+09AF + U+09BC), (DDA + Nukta: U+09A1 + U+09BC) and (DDHA + Nukta: U+09A2 + U+09BC) instead of the single code points YYA (U+09DF), RRA (U+09DC) and RHA (U+09DD), although the use of 'Nukta' is otherwise completely unnatural in Bangla.
It is noted that in the current Unicode Standard chart these characters are listed as additional consonants. As per the LGR Procedure, however, these decisions depend on the IDNA Protocol, through a set of procedures developed by the IETF. Even though the Unicode Standard also provides these three characters as atomic characters (for example, 09DC for ড় [r], 09DD for ঢ় [rh] and 09DF for য় [y], each as a single keystroke), the IDNA protocol requires that we treat them as conjunct characters built from code points of the Unicode Bengali block.
It may be noted that there could be sporadic attempts or cases of writing Muslim names, Urdu poetic words and Perso-Arabic loanwords with a nukta under ক (k), খ (kh), গ (g), জ (j) and ফ (ph), only for the sake of correct pronunciation and for maintaining the sanctity of the loanword. This amounts to using the Bangla writing system like the IPA script. It is, however, not in use in Bangla writing or printing.
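The effect of the composition exclusions can be observed with any Unicode normalization library. The sketch below uses Python's standard unicodedata module; the dictionary PRECOMPOSED and the helper to_nfc are illustrative names introduced here, not part of the LGR.

```python
import unicodedata

# Single code points (keys) and the base + nukta sequences (values) by which
# this LGR represents them.
PRECOMPOSED = {
    "\u09DC": "\u09A1\u09BC",  # RRA -> DDA + NUKTA
    "\u09DD": "\u09A2\u09BC",  # RHA -> DDHA + NUKTA
    "\u09DF": "\u09AF\u09BC",  # YYA -> YA + NUKTA
}


def to_nfc(label):
    """Normalize a label to NFC, as IDNA requires."""
    return unicodedata.normalize("NFC", label)


for single, sequence in PRECOMPOSED.items():
    # Because of the composition exclusions, NFC keeps the decomposed form.
    print(f"U+{ord(single):04X}", "->",
          " ".join(f"U+{ord(c):04X}" for c in to_nfc(single)))
```

A label typed with the atomic code points therefore reaches the LGR already rewritten into the two-code-point sequences, which is why only the sequences appear in the repertoire.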
3.3.7 Visarga / biʃɔrgo (ঃ - U+0983) and Avagraha (ঽ - U+09BD)
The Visarga / biʃɔrgo (U+0983) is frequently used in Bangla loanwords borrowed from Sanskrit and represents a sound very close to h. One could quote as an example দুঃখ duhkho "sorrow", "unhappiness" (U+09A6 U+09C1 U+0983 U+0996).
The Avagraha ঽ (U+09BD) is mainly used in Sanskrit, Pali, Prakrit or Maithili texts written in Bangla. It is gradually being replaced by an upper comma (e.g., নরোঽপরাণি re-written as নরো'পরাণি). It is rarely used now, even in other languages using the Bangla script. In the case of the LGR, the Avagraha is not part of the repertoire: it has been decided not to retain Avagraha (ঽ) (U+09BD) because it is blocked in TLDs as per the Maximal Starting Repertoire (MSR).
Please see Appendix II in section 11 for a complete list of Bangla consonants and their allographs.
3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)
This note is pertinent to the use of Zero Width Joiner (ZWJ) and Zero Width Non-Joiner (ZWNJ) as used in Bangla. It needs to be noted that Nepali, Konkani and Hindi use these two signs in a different manner.
ZWJ (U+200D) and ZWNJ (U+200C) are code points that have been provided by the Unicode Standard to instruct the rendering of a string where the script has the option between joining and non-joining characters. Without the use of these control codes, the string may be rendered in an alternate form from what is intended.
Use of ZWJ
• Insofar as Bangla is concerned, ZWJ is used for the proper rendering of characters such as khaṇḍa-ta ৎ, as in সত্যজিৎ (satyajit) "Satyajit" and সৎ (sat) "honest". This is typed as follows: ta + Hasanta + ZWJ (U+200D).
• However, ZWJ is more important where the same combination of consonantal characters is represented differently depending upon the context. E.g., র + ্ + য has two representations in Bangla: as র্য and as র‍্য. To get the form র্য, one has to type র + ্ + য, but for র‍্য the sequence would be র + ZWJ + ্ + য [154]. In other words, ZWJ is used in the rendering of words demanding ya-phalā after ra, which is otherwise not possible to type (render) because the same sequence ra + hasanta + antastha ja, in medial and/or final position, is used to type repha on the consonant antastha ja, as in কার্য (kaarjo). In order to get a ya-phalā after the consonant ra, it is therefore obligatory to use ZWJ after ra, as in র‍্যাপার (wrapper), র‍্যাশ (rash), র‍্যালি (rally), etc. The typing sequence is given below:
ra (র) + ZWJ + hasanta (্) + antastha ja (য) = র‍্য
Use of ZWNJ
• ZWNJ is used in Bangla to represent the explicit Hasanta or Halant. In order to avoid conjunct formation in cases where there is an explicit hasanta before the succeeding consonant, the ZWNJ is used.
Consonant + hasanta + ZWNJ + consonant = explicit hasanta
Example: প্রাক্‌কথন (prakkathana, prakkɔtʰon)
The use of ZWJ and ZWNJ has been ruled out from the root zone by the [Procedure]. Used in Bangla to create alternate renderings, the insertion of these two signs can affect searching as well as NLP.
The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Hasanta) where default conjunct formation is to be explicitly restricted and the Hasanta joining the two consonants participating in the conjunct formation needs to be explicitly shown.
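Since both joiners are invisible, a candidate label has to be inspected at the code point level to see whether it contains them. The following is a minimal sketch of such a check; find_joiners and is_root_zone_candidate are hypothetical helper names, and the check covers only this one restriction, not the full LGR.

```python
ZWNJ = "\u200C"
ZWJ = "\u200D"


def find_joiners(label):
    """Return (index, code point) pairs for every ZWJ or ZWNJ in label."""
    return [(i, f"U+{ord(ch):04X}")
            for i, ch in enumerate(label)
            if ch in (ZWNJ, ZWJ)]


def is_root_zone_candidate(label):
    """True if label is free of ZWJ and ZWNJ (this single rule only)."""
    return not find_joiners(label)
```

For the ya-phalā form র + ZWJ + ্ + য, find_joiners reports the joiner at index 1, so the label is rejected; the plain repha sequence র + ্ + য passes this particular check.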
3.3.9 Use of Ya-phalaa
Ya-phalaa sequences are the two instances in Bangla where Hasanta is preceded by a full vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E):
• অ্যা 0985 09CD 09AF 09BE: BENGALI LETTER A + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
• এ্যা 098F 09CD 09AF 09BE: BENGALI LETTER E + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD Hasanta plus U+09AF ya, preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound of English, as in 'acid' অ্যাসিড, 'association' অ্যাসোসিয়েশন, 'bat' ব্যাট, 'fat' ফ্যাট, 'mat' ম্যাট, 'cap' ক্যাপ, etc.
The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But, as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
3.3.10 Formation of Ra-phalaa and Ref Sequences
This case refers to the formation of repha and ra-phalā as follows:
Ra-Hasanta = (C2 H), where C2 is either
09B0 (র - BENGALI LETTER RA) or
09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL),
and H is 09CD (্ - BENGALI SIGN VIRAMA).
Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALĀ). For instance:
repha = ra + Hasanta + C (e.g., র্ক, i.e., ra + Hasanta + ka, as in অর্ক arka "the sun")
ra-phalā = C + Hasanta + ra (e.g., ক্র, i.e., ka + Hasanta + ra, as in চক্র chakra "cycle")
The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases. The LGR makes a note of this point of concern with respect to the two RAs in disguise, as it would be completely impossible to distinguish between them with the naked eye in a label so generated, which may consequently lead to concerns related to spoofing and other kinds of cyber irregularities. The motive for classing these two code points as (blocking) variants is that fully rendered labels may mask the distinction between Bangla ra র (U+09B0) and Assamese ra ৰ (U+09F0). That provides the justification for Variant Set 4, though only in the context of a following Hasanta. The difference between the RAs is only distinguishable if one looks at their Unicode values. Therefore labels such as অর্ক arka, শীর্ষ śīrṣa 'top, apex', অভ্র abhra 'cloud, the sky' and শ্রম śrama 'physical labour' could be extremely dangerous, as the web user may never verify the digital content (the labels) against its Unicode code point values. This point is made explicitly with reference to Table 9 (of sequences) and Table 16 (of WLE Symbols) that are to follow.
Moreover, it is noteworthy that the REPHA can also occur with KHANDA TA. The conditions in this context of KHANDA TA are liable to be such that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
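The confusability described above can be made concrete by enumerating, for a given label, the look-alike labels obtained by exchanging the two RAs wherever they sit next to a Hasanta. This is only an illustrative sketch under the assumption that adjacency to U+09CD (before or after) is the relevant context; the function names are hypothetical, and the normative variant rules are those of the LGR's XML file.

```python
from itertools import product

RAS = ("\u09B0", "\u09F0")  # Bangla RA, Assamese RA
HASANTA = "\u09CD"


def confusable_ra_positions(label):
    """Indices of a RA that is preceded or followed by Hasanta."""
    positions = []
    for i, ch in enumerate(label):
        if ch in RAS:
            before = i > 0 and label[i - 1] == HASANTA
            after = i + 1 < len(label) and label[i + 1] == HASANTA
            if before or after:
                positions.append(i)
    return positions


def ra_variants(label):
    """All other labels that render alike by swapping RAs in repha/ra-phala."""
    positions = confusable_ra_positions(label)
    out = set()
    for choice in product(RAS, repeat=len(positions)):
        chars = list(label)
        for pos, ra in zip(positions, choice):
            chars[pos] = ra
        out.add("".join(chars))
    out.discard(label)  # the label itself is not its own variant
    return sorted(out)
```

For অর্ক (U+0985 U+09B0 U+09CD U+0995) the sketch yields exactly one variant, with U+09F0 in place of U+09B0; a label in which RA carries its own vowel, such as রাত, yields none.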
4 Overall Development Process and Methodology
The Neo-Brāhmī Generation Panel (NBGP) has been formed by members having experience in Linguistics (especially in NLP and Computational Linguistics), Literature, Language History and Epigraphy. Under the Neo-Brāhmī Generation Panel, Bangla and eight other scripts belonging to separate Unicode blocks are being taken up, to assign a separate LGR to each. However, an attempt is made to ensure that the fundamental philosophy behind building those LGRs is consistent across all the Brāhmī-derived scripts. The present LGR will cater to multiple languages belonging to EGIDS scale 1 to 4 (see Table 4) that use the Bangla script. The following guiding principles are used in making decisions about Bangla LGR code points.
4.1 Guiding Principles
The NBGP adopts the following broad principles for the selection of code points in the code-point repertoire, across the board for all the Neo-Brāhmī scripts within its ambit.
4.1.1 Inclusion Principles
4.1.1.1 Modern Usage
Every character proposed should be in the everyday usage of a particular linguistic community. Characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code-point repertoire.
4.1.1.2 Unambiguous Use
Every character proposed should have an unambiguous understanding among linguists about its usage in the language.
4.2 Exclusion Principles
The main exclusion principle is that of External Limits on Scope. These consist of protocols or standards which are prerequisites to the Label Generation Rulesets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.
4.2.1 External Limits of Scope
The code point repertoire for the root zone being a very special case at the top of the protocol hierarchies, the canvas of characters available for selection as part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:
i. The Unicode Chart
Out of all the characters that are needed by the script in question, if a particular character is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, and especially so in the Bangla-Asamiyā-Manipuri writing system, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.
ii. IDNA Protocol
Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded as far as possible all the characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters of the Unicode repertoire from being part of domain names.
iii. Maximal Starting Repertoire (MSR)
The Root Zone LGR being the repertoire of characters which are going to be used for the creation of Root Zone TLDs, which in turn constitute an even more specialized case of domain names, the Root LGR procedure introduces additional exclusions on the set of characters allowed by IDNA.
Example: Bangla Sign Avagraha ঽ (U+09BD), even if allowed by the IDNA protocol, is not permitted in the Root Zone Repertoire as per the MSR.
To sum up, the restrictions start off by admitting only such characters as are part of the code block of the given script/language. The IDNA Protocol further narrows this down, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
4.2.1.1 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brāhmī-based scripts will not be included.
4.2.1.2 No Symbols and Abbreviations
Abbreviations, weights and measures, and other such iconic characters like BANGLA ISSHAR ৺ (U+09FA), BANGLA CURRENCY DENOMINATOR SIXTEEN ৹ (U+09F9), etc., will also not be included.
4.2.1.3 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, such as the Sanskritic VOCALIC RR ৠ (U+09E0) and VOCALIC L ঌ (U+098C), as well as VOCALIC LL ৡ (U+09E1), and the allographic -kāra forms of the latter two symbols: VOWEL SIGN VOCALIC L ৢ (U+09E2) and VOWEL SIGN VOCALIC LL ৣ (U+09E3). All such characters are excluded, which complies with the Conservatism principle as laid down in the Root Zone LGR procedure. However, in Bangla the -kāra corresponding to VOCALIC RR ৠ (U+09E0), which is VOWEL SIGN VOCALIC RR ৄ (U+09C4), is still in active use in certain limited borrowed or Sanskritic words and is therefore retained.
4.2.1.4 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit will not be included. This is also in consonance with the Letter principle as laid down in the Root Zone LGR procedure.
4.2.1.5 ABNF
The Augmented Backus-Naur Formalism (ABNF) is described in Section 5.4.1 and the Appendix (Section 10.1).
5 Repertoire
The Bangla writing system is represented in UNICODE under the Bengali (Bangla) script name as enumerated in ISO 15924, corresponding to languages such as Asamiyā (Assamese), Bangla (Bengali) and Manipuri. The BENGALI block used for Bangla-Asamiyā-Manipuri in UNICODE has 93 entries. This section details the code-point repertoire that the Neo-Brāhmī Generation Panel [NBGP] proposes to be included in the Bangla LGR.
It may be mentioned here that the Government of Assam submitted a proposal to the Bureau of Indian Standards (BIS) on 26th February 2016 for dis-unification of the Bangla and Asamiyā scripts. The BIS, in the 8th Meeting of its Indian Language Technologies and Products Sectional Committee, LITD 20, held on 23rd Aug 2017, decided to refer the proposal for recognition of the Assamese script in ISO/IEC 10646 to ISO. Until the UNICODE Consortium takes any further action, it will be assumed that the Code Point Repertoire under Table 8 will be valid for all three languages above.
For each of the code points, language references have been given in the last column, titled Reference, of Table 8, the "Code Point Repertoire". For entire coverage of Bangla code points, references for Bangla, Asamiyā (Assamese), Manipuri (Meitei) and Bishnupriya are given. Kokborok written in the Bangla script is not known to have introduced many new complications, except for one particular character. Though only a few representative languages under EGIDS Scale 1-4 have been chosen for referencing, together they cover all the code points required for all the languages that the NBGP has considered, as given under Bangla Unicode points (as given in UNICODE 6.3).
However, before the details are presented, it is useful to look at the Bangla Code Point Chart from the Maximal Starting Repertoire [MSR] Version 3. It may be noted that the shapes of the reference glyphs given below in the code charts are based on one of the many fonts designed and are not prescriptive, because there could be some variations in actual fonts, both UNICODE-compatible and TrueType ones. Consider the following code point table:
Figure 1: Bangla Code Page from [MSR] for Bangla-Asamiyā-Manipuri
Colour convention:
All characters that are included in the [MSR] - Yellow background
PVALID in IDNA2008 but excluded from the [MSR] - Pinkish background
Not PVALID in IDNA2008, or ineligible for the root zone (digits, hyphen) - White background

Given the Bangla Unicode block as in Figure 1, among the code points that are included in the MSR the following symbols will need a separate treatment:
ৎ U+09CE Bangla Letter Khanda-Ta
ৰ U+09F0 Asamiyā-Bangla Letter Ra With Middle Diagonal
ৱ U+09F1 Asamiyā-Bangla Letter Ra With Lower Diagonal
5.1 Code Point Repertoire Inclusion

No | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
1 | U+0981 | ঁ | BENGALI SIGN CANDRABINDU | Candra-bindu | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
2 | U+0982 | ং | BENGALI SIGN ANUSVARA | Onushshar (Anusvara) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
3 | U+0983 | ঃ | BENGALI SIGN VISARGA | Biśarga (Visarga) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
4 | U+0985 | অ | BENGALI LETTER A | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
5 | U+0986 | আ | BENGALI LETTER AA | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
6 | U+0987 | ই | BENGALI LETTER I | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
7 | U+0988 | ঈ | BENGALI LETTER II | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
8 | U+0989 | উ | BENGALI LETTER U | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
9 | U+098A | ঊ | BENGALI LETTER UU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
10 | U+098B | ঋ | BENGALI LETTER VOCALIC R | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
11 | U+098F | এ | BENGALI LETTER E | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
12 | U+0990 | ঐ | BENGALI LETTER AI | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
13 | U+0993 | ও | BENGALI LETTER O | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
14 | U+0994 | ঔ | BENGALI LETTER AU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
15 | U+0995 | ক | BENGALI LETTER KA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
16 | U+0996 | খ | BENGALI LETTER KHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
17 | U+0997 | গ | BENGALI LETTER GA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
18 | U+0998 | ঘ | BENGALI LETTER GHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
19 | U+0999 | ঙ | BENGALI LETTER NGA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
20 | U+099A | চ | BENGALI LETTER CA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
21 | U+099B | ছ | BENGALI LETTER CHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
22 | U+099C | জ | BENGALI LETTER JA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
23 | U+099D | ঝ | BENGALI LETTER JHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
24 | U+099E | ঞ | BENGALI LETTER NYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
25 | U+099F | ট | BENGALI LETTER TTA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
26 | U+09A0 | ঠ | BENGALI LETTER TTHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
27 | U+09A1 | ড | BENGALI LETTER DDA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
28 | 09A1 09BC (U+09DC) | ড় | Normalized form of BENGALI LETTER RRA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125] 09DC is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
29 | U+09A2 | ঢ | BENGALI LETTER DDHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
30 | 09A2 09BC (U+09DD) | ঢ় | Normalized form of BENGALI LETTER RHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125] 09DD is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
31 | U+09A3 | ণ | BENGALI LETTER NNA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
32 | U+09A4 | ত | BENGALI LETTER TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
33 | U+09A5 | থ | BENGALI LETTER THA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
34 | U+09A6 | দ | BENGALI LETTER DA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
35 | U+09A7 | ধ | BENGALI LETTER DHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
36 | U+09A8 | ন | BENGALI LETTER NA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
37 | U+09AA | প | BENGALI LETTER PA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
38 | U+09AB | ফ | BENGALI LETTER PHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
39 | U+09AC | ব | BENGALI LETTER BA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
43 | 09AF 09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125] 09DF is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112][125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
50 | U+09BE | া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
51 | U+09BF | ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
52 | U+09C0 | ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
53 | U+09C1 | ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112][122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (= Halant) / Virama (= Dāṛi) | 1 Bangla, 2 Assamese, 2 Manipuri | [112][122][125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122][125]
Table 8: Bangla Code-Point Repertoire
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes some specific sequences, which enable conditional inclusion of the Bangla LETTER A and LETTER E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, again followed by Bangla VOWEL SIGN AA, in the repertoire, in order to represent the æ sound as in English 'bat', 'cat', etc.
Sr No | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not an exhaustive list) | Reference
S1 | 0985 09CD 09AF 09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112][122]
S2 | 098F 09CD 09AF 09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9: Sequences
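The two sequences of Table 9 are the only contexts in which a Hasanta may directly follow an independent vowel. A minimal sketch of that check is given below; the names SEQUENCES, is_independent_vowel and vowel_hasanta_ok are hypothetical, and the vowel test is a simplified range over U+0985..U+0994 rather than the exact repertoire.

```python
HASANTA = "\u09CD"

# The two sequences of Table 9.
SEQUENCES = {
    "S1": "\u0985\u09CD\u09AF\u09BE",  # A + VIRAMA + YA + VOWEL SIGN AA
    "S2": "\u098F\u09CD\u09AF\u09BE",  # E + VIRAMA + YA + VOWEL SIGN AA
}


def is_independent_vowel(ch):
    # Simplified assumption: the independent-vowel range of the Bengali block.
    return "\u0985" <= ch <= "\u0994"


def vowel_hasanta_ok(label):
    """True if every 'vowel + Hasanta' in label is the start of S1 or S2."""
    for i, ch in enumerate(label):
        if ch == HASANTA and i > 0 and is_independent_vowel(label[i - 1]):
            if not any(label.startswith(seq, i - 1) for seq in SEQUENCES.values()):
                return False
    return True
```

With this check, অ্যাসিড is accepted because it begins with S1, whereas a Hasanta after আ, or a sequence such as অ + Hasanta + ক, is rejected.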
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find a place in Unicode but have not been included in the repertoire in the LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.

Sr No | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10: Excluded Code Points
5.3 Code Point Not Used Alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire since it will never be used alone. It will be used only within the sequences that represent three special characters in normalized form: ড়, ঢ় and য়.

Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য, to form ড়, ঢ় and য় respectively
Table 10b: Excluded Code Points
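The restriction on the nukta amounts to a context rule: U+09BC is acceptable only immediately after one of the three base letters named in Table 10b. A minimal sketch, with the hypothetical names NUKTA_BASES and nukta_ok, could look as follows.

```python
NUKTA = "\u09BC"
# The only letters that may carry a nukta in this LGR (Table 10b).
NUKTA_BASES = {"\u09A1", "\u09A2", "\u09AF"}  # DDA, DDHA, YA


def nukta_ok(label):
    """True if every nukta in label directly follows DDA, DDHA or YA."""
    return all(
        i > 0 and label[i - 1] in NUKTA_BASES
        for i, ch in enumerate(label)
        if ch == NUKTA
    )
```

Under this rule ড + nukta is accepted, while a nukta under ক (as sometimes seen in transcriptions of Perso-Arabic loanwords) or a nukta at the start of a label is rejected.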
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20.11.2009 and 18.07.2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) begins with the following variables:

C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta
TheAugmentedBackus-NaurFormalism(ABNF)willusethefollowingOperators
SrNumber Operator Function
1 ldquo|ldquo Alternative
2 ldquo[]rdquo Optional
3 ldquordquo VariableRepetition
38
4 ldquo()rdquo SequenceGroup
Table11TheABNFFormalism
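To apply these variables mechanically, each code point of a label first has to be mapped to its class symbol. The sketch below is one possible way to do this, under loud assumptions: the class boundaries are approximated from Unicode block ranges rather than taken from the normative repertoire of Section 5.1, and the names `classify` and `to_symbols` are hypothetical. A nukta following ড, ঢ or য is folded into the preceding consonant.

```python
import unicodedata

NUKTA = "\u09BC"
NUKTA_BASES = {"\u09A1", "\u09A2", "\u09AF"}  # DDA, DDHA, YA

# Code points that map to a class on their own.
SINGLE = {0x0981: "D", 0x0982: "B", 0x0983: "X", 0x09CD: "H", 0x09CE: "Z"}

def classify(ch: str) -> str:
    """Return the ABNF variable for one code point (approximate ranges)."""
    cp = ord(ch)
    if unicodedata.category(ch) == "Cn":  # unassigned code point
        raise ValueError(f"U+{cp:04X} is unassigned")
    if cp in SINGLE:
        return SINGLE[cp]
    if 0x0985 <= cp <= 0x0994:
        return "V"
    if 0x0995 <= cp <= 0x09B9 or cp in (0x09F0, 0x09F1):
        return "C"
    if 0x09BE <= cp <= 0x09CC:
        return "M"
    raise ValueError(f"U+{cp:04X} has no class in this sketch")

def to_symbols(label: str) -> str:
    """Translate a label into a string of class symbols such as 'CMBCM'."""
    symbols = []
    prev = ""
    for ch in label:
        if ch == NUKTA:
            # The nukta is only valid as part of a normalized consonant.
            if prev not in NUKTA_BASES:
                raise ValueError("nukta must follow DDA, DDHA or YA")
        else:
            symbols.append(classify(ch))
        prev = ch
    return "".join(symbols)

print(to_symbols("\u09AC\u09BE\u0982\u09B2\u09BE"))  # the word 'Bangla'
```

The symbol string produced this way is the form on which the sequence patterns of the following subsections operate.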
5.4.2 The Vowel Sequence
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given. To facilitate understanding for users of other Brāhmī-derived scripts, equivalents in Devanāgarī are provided wherever necessary.

A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra (onuʃʃār) (B), a Candrabindu (D) or a Visarga (biʃɔrgo) (X). The number of D, B or X which can follow a V in Bangla is not necessarily restricted to one. Going by the rules illustrated in this document, formations such as VDD, VBB and VXX are clearly invalid orthographic units. However, it is valid and possible to have sequences in which a candrabindu is followed by an anusvāra on the one hand, or by a visarga on the other, as in হ্যাঁংচা 'hænchā' and হ্যাঁঃ 'hæn' respectively. The possibility of a Visarga or Anusvāra (onuʃʃār) following a Candrabindu thus exists in Bangla.

A vowel can optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phala. "Ya-phala is a presentation form of U+09AF য BENGALI LETTER YA ('ya'). Represented by the sequence <U+09CD ্ BENGALI SIGN VIRAMA (Hasanta), U+09AF য BENGALI LETTER YA>, Ya-phala has a special form ্য. When combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. 'aa' (ā)), it is used for transcribing [æ], as in the 'a' in the English word 'bat', written in Bangla as ব্যাট." A vowel sequence admits the following combinations.
5.4.2.1 A Single Vowel
Example:
V অ अ

5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], Candrabindu [D], Visarga [X], Candrabindu + Anusvāra [DB], Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].

Examples:
VB অং अं
VD অঁ अँ
VX অঃ अः
VDB অঁং अँं
VDX অঁঃ अँः
VHCM অ্যা, এ্যা

5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], Candrabindu [D], Visarga [X], Candrabindu + Anusvāra [DB] or Candrabindu + Visarga [DX].

Examples:
VHCMB অ্যাং, এ্যাং
VHCMD অ্যাঁ, এ্যাঁ
VHCMX অ্যাঃ, এ্যাঃ
VHCMDB অ্যাঁং, এ্যাঁং
VHCMDX অ্যাঁঃ, এ্যাঁঃ
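The vowel-sequence combinations listed above can be summarised as a single pattern over class symbols. The sketch below is an illustration under assumptions, not the normative rule: it expects a string of symbols (V, H, C, M, B, D, X) rather than actual code points, it does not check that the H C M part is specifically Ya-phala plus ā-kāra, and the name `is_vowel_sequence` is hypothetical.

```python
import re

# V, optionally followed by H C M, optionally followed by exactly one of
# DB, DX, D, B or X.
VOWEL_SEQ = re.compile(r"V(HCM)?(DB|DX|D|B|X)?")

def is_vowel_sequence(symbols: str) -> bool:
    """True if the whole symbol string is one valid vowel sequence."""
    return VOWEL_SEQ.fullmatch(symbols) is not None

for candidate in ["V", "VB", "VDB", "VHCMDX", "VDD", "VBB", "VXX", "VBD"]:
    status = "valid" if is_vowel_sequence(candidate) else "invalid"
    print(candidate, status)
```

Listing DB and DX before the single symbols in the alternation is only a readability choice; `fullmatch` would backtrack to the longer alternatives in any case.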
5.4.3 The Consonant Sequence

5.4.3.1 A Single Consonant (C)
Example:
C ক क

5.4.3.2 A Consonant with Conditions
A Consonant can optionally be followed by a dependent vowel sign kāra (Mātrā) [M], Anusvāra [B], Candrabindu [D], Visarga [X], Hasanta (also known as Virama) [H], Candrabindu + Anusvāra [DB] or Candrabindu + Visarga [DX].

Examples:
CM কি कि
CB কং कं
CD কঁ कँ
CX কঃ कः
CH ক্ क् (pure consonant)
CDB কঁং कँं
CDX কঁঃ कँः

5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB or DX.

Examples:
CMB কীং कीं
CMD কাঁ काँ
CMX বীঃ वीः
CMDB কাঁং काँं
CMDX কাঁঃ काँः
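5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):

*3(CH)C

Examples:
CHC ন্ত → ন + ্ + ত न्त → न + ् + त
CHCHC ন্ত্র → ন + ্ + ত + ্ + র न्त्र → न + ् + त + ् + र
CHCHCHC ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য न्त्र्य → न + ् + त + ् + र + ् + य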
5.4.3.5 Subsets
While considering its subsets, we take the combination CHC as a representative example; the same applies equally to CHCHC and CHCHCHC.

[A] The combination may be followed by M, B, D, X, DB or DX.

Examples:
CHCM ক্কী → ক + ্ + ক + ী क्की → क + ् + क + ी
CHCB ক্কং → ক + ্ + ক + ং क्कं → क + ् + क + ं
CHCD ক্কঁ → ক + ্ + ক + ঁ क्कँ → क + ् + क + ँ
CHCX ক্কঃ → ক + ্ + ক + ঃ क्कः → क + ् + क + ः
CHCDB ক্কঁং → ক + ্ + ক + ঁ + ং क्कँं → क + ् + क + ँ + ं
CHCDX ক্কঁঃ → ক + ্ + ক + ঁ + ঃ क्कँः → क + ् + क + ँ + ः

[B] *3(CH)CM may further be followed by B, D, X, DB or DX.

Examples:
CHCMB ক্কীং → ক + ্ + ক + ী + ং क्कीं → क + ् + क + ी + ं
CHCMD ক্কাঁ → ক + ্ + ক + া + ঁ क्काँ → क + ् + क + ा + ँ
CHCMX ক্কীঃ → ক + ্ + ক + ী + ঃ क्कीः → क + ् + क + ी + ः
CHCMDB ক্কাঁং → ক + ্ + ক + া + ঁ + ং क्काँं → क + ् + क + ा + ँ + ं
CHCMDX ক্কাঁঃ → ক + ্ + ক + া + ঁ + ঃ क्काँः → क + ् + क + ा + ँ + ः
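The consonant-sequence rules of this subsection (a single consonant, a conjunct of up to four consonants, and the optional M, B, D, X, DB, DX or final H) can likewise be sketched as one pattern over class symbols. This is an illustrative reading of the rules rather than the normative definition: the input is assumed to be a string of class symbols, the name `is_consonant_sequence` is hypothetical, and allowing a final H after a conjunct is an assumption extrapolated from the CH case.

```python
import re

# Up to three "C H" pairs, then a consonant, then either a final Hasanta
# or an optional Matra followed by at most one of DB, DX, D, B or X.
CONSONANT_SEQ = re.compile(r"(CH){0,3}C(H|M?(DB|DX|D|B|X)?)")

def is_consonant_sequence(symbols: str) -> bool:
    """True if the whole symbol string is one valid consonant sequence."""
    return CONSONANT_SEQ.fullmatch(symbols) is not None

for candidate in ["C", "CH", "CM", "CHCHCHC", "CHCMDB", "CHCHCHCHC", "CMM", "CHM"]:
    status = "valid" if is_consonant_sequence(candidate) else "invalid"
    print(candidate, status)
```

The bound `{0,3}` mirrors the "*3(CH)C" notation: at most three joined pairs before the last consonant, giving at most four consonants in total.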
5.4.4 The Khanda-Ta Sequence

5.4.4.1 A Single 'Khanda'-Ta (Z)
Example:
Z ৎ (= ত + ্)

5.4.4.2 A Khanda Ta Combination
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama); refer to Rule P in Section 7, Table 16.

[CH]Z

Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) 'scolding'.
Note: In this context the C preceding KHANDA TA must be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).

5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) can be described briefly here. Let us take up S first. There are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ - BANGLA LETTER A and U+098F এ - BANGLA LETTER E). For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF ya preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound, as in English 'bat', 'fat', etc. The Brāhmī script by nature does not have a Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).

The other case concerns the formation of repha and ra-phalā in the script, referred to as P in Table 16. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra "cycle"). The point is that in both cases the slot for ra can be filled by the Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.
6 Variants
This section discusses the variants in the Bangla script. The NBGP categorizes confusable variants into two groups:

Group 1: Confusing due to pure visual similarity.
Group 2: Confusing due to deviation from the character formations normally perceived by the larger linguistic community.

For Group 1, any identical code points are defined as variants. Cases that are confusable but not identical are not proposed, as another panel (the String Similarity Assessment Panel) is entrusted with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These are not cases of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formation. They can confuse even a careful observer and are hence proposed as variants.

Variants arise in a script when two or more forms that look the same are stored with different code points. In Bangla, the e-kāra, ā-kāra and o-kāra have different code points. One can type o-kāra with a consonant in one go, or type e-kāra and ā-kāra as two separate keys, and get the same result. A reader cannot differentiate between the two forms of ko (কো), one typed with a single key and the other typed with two different keys. However, this is not considered a case of variant, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants
We propose two cases of true in-script variants in the Bangla script.

CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases in which (U+09A5) থ (tha) appears with Hasanta as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):

1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)

The above combinations, if written in traditional orthography, can be a little confusing, since the থ (tha) in the conjunct appears like a হ (ha). The conjunct may occur in initial, medial or final position (as shown below in example 1). It could also be typed wrongly, in the belief that it was a হ (ha) U+09B9, increasing the risks in label writing and identification.

Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)

Fonts that represent the traditional Bangla writing system tend to create this problem. Therefore these may be taken as cases of variants in Bangla.

CASE II
Another interesting example of a variant is encountered in the ra + Hasanta and Hasanta + ra combinations when writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra). 'Repha' can be formed by two sequences (mainly because both Assamese and Bangla are encoded in the same Unicode block, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here the final ligatures look the same and are as follows:
(1) B_RA + H + C
(2) A_RA + H + C

Where
B_RA = U+09B0 BENGALI LETTER RA (র)
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)

Example:

Sequence 1 (Using Bangla RA) | Ligature 1 | Sequence 2 (Using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12: Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. The addition of Repha does not make any difference.

'Ra-phala' can be formed by two sequences on similar grounds, and the final ligatures would look the same:

(1) C1 + H + B_RA
(2) C1 + H + A_RA

Where C1 = any consonant except Khanda-ta

Example:

Sequence 1 (Using Bangla RA) | Ligature 1 | Sequence 2 (Using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, they could be confusable to end users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP concluded to define র and ৰ as variant code points, so that a single variant set between র and ৰ covers all cases. This will, however, create blocked variant labels: e.g. if someone registers "ররর", the variant label "ৰৰৰ" will be generated as a variant and will be blocked, and vice versa. The blocking applies only at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, it is still possible.
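The label-level blocking described above can be illustrated with a small sketch. It is only an illustration under stated assumptions: it models the single variant set র (U+09B0) and ৰ (U+09F0), ignores all other variant rules of the LGR, and uses hypothetical helper names (`variant_labels`, `index_key`, `conflicts`).

```python
from itertools import product

VARIANT_SET = ("\u09B0", "\u09F0")  # Bangla RA and Assamese RA

def variant_labels(label: str) -> set:
    """All labels obtained by swapping RA variants, excluding the label itself."""
    choices = [VARIANT_SET if ch in VARIANT_SET else (ch,) for ch in label]
    return {"".join(combo) for combo in product(*choices)} - {label}

def index_key(label: str) -> str:
    """Map every variant to one representative so variant labels compare equal."""
    return label.replace("\u09F0", "\u09B0")

def conflicts(candidate: str, registered: str) -> bool:
    """True if the candidate is the registered label or one of its variants."""
    return index_key(candidate) == index_key(registered)

registered = "\u09B0\u09B0\u09B0"  # three Bangla RA
print(len(variant_labels(registered)), "variant labels would be blocked")
print(conflicts("\u09F0\u09F0\u09F0", registered))  # three Assamese RA
print(conflicts("\u09F0\u09F0", registered))        # a different label
```

Comparing index keys avoids enumerating every variant label, which grows exponentially with the number of RA code points in a label.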
6.2 Cross-Script Variants
A brief cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia (formerly Oriya; Unicode uses Oriya for the script, although Odia is now the official term), keeping in mind the visual and technical confusion they may cause as labels in web domains. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are proposed by the NBGP as variants. Certain other characters are somewhat similar but have not been included here; they are provided in the Appendix (Section 10.3) for reference.
1. Bangla and Nāgarī (Devanāgarī) Script

Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14: Bangla and Devanāgarī cross-script variant code points

2. Bangla and Gurmukhī Script

Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15: Bangla and Gurmukhī cross-script variant code points
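The two tables can be read as code point mappings. The sketch below illustrates how a Bangla label made up entirely of such code points would map to a look-alike label in the other script; it is an illustration only, and the dictionary and function names are hypothetical rather than taken from the LGR.

```python
# Cross-script variant code points from Table 14 and Table 15.
DEVANAGARI = {"\u09AE": "\u092E", "\u09BF": "\u093F"}
GURMUKHI = {"\u09AE": "\u0A38", "\u09BF": "\u0A3F"}

def cross_script_variant(label: str, table: dict):
    """Return the look-alike label in the other script, or None if any
    code point of the label has no cross-script variant in the table."""
    if label and all(ch in table for ch in label):
        return "".join(table[ch] for ch in label)
    return None

bangla_label = "\u09AE\u09BF"  # MA + VOWEL SIGN I
print(cross_script_variant(bangla_label, DEVANAGARI))
print(cross_script_variant(bangla_label, GURMUKHI))
```

A label containing even one code point outside the table has no whole-label cross-script variant, which is why the function returns None in that case.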
7 Whole Label Evaluation Rules (WLE)
This section provides the WLE rules that are required by all the languages mentioned in Section 3.2 when written in the Bangla script (the term is used as in Unicode, denoting and including both Assamese and Maṇipuri). The rules have been drafted in such a way that they can be easily translated into the LGR specification. Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories mentioned in the table provided in the code point repertoire (Section 5.1).

C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), or (a/e) Ya-phalā (V1 H C1 M1), where V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E), H is 09CD (্ - BENGALI SIGN VIRAMA), C1 is 09AF (য - BENGALI LETTER YA) and M1 is 09BE (া - BENGALI VOWEL SIGN AA). S1 and S2 are valid even though they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name BENGALI LETTER RA WITH MIDDLE DIAGONAL) and H is 09CD (্ - BENGALI SIGN VIRAMA).
Table 16: Symbols used in WLE rules
It is also worth mentioning here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters" that can theoretically conjoin two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, p. 4]. More details of such combinations can be found in Pabitra Sarkar (1993) [135].

7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ় and য়.
2. H must be preceded by C. Example: ক্
3. M must be preceded by C. Example: কা
4. D must be preceded by either of V, C or M. Example: আঁ, খঁ, খাঁ, হাঁ
5. X must be preceded by either of V, C, M or D. Example: উঃ, খঃ, বাঃ, দঁঃ
6. B must be preceded by either of V, C, M or D. Example: আং, ইং, কং
7. Z must be preceded by V, C, M, D, B, X or P. Example: ইৎ, কৎ (S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V Preceded by H.
9. S CANNOT be preceded by H.
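The rules above can be sketched as a simple left-context check. The following is a minimal illustration, not the normative LGR: character classes are approximated from Unicode ranges (an assumption, since the repertoire of Section 5.1 is authoritative), P is read as RA + Hasanta immediately before Z, the Hasanta inside S1/S2 is exempted from rule 2, and all names (`classify`, `units`, `wle_violations`) are hypothetical.

```python
NUKTA = "\u09BC"
NUKTA_BASES = ("\u09A1", "\u09A2", "\u09AF")   # rule 1: CN = base + nukta
RA = ("\u09B0", "\u09F0")                      # C2 of symbol P
S_SEQUENCES = ("\u0985\u09CD\u09AF\u09BE", "\u098F\u09CD\u09AF\u09BE")
SINGLE = {0x0981: "D", 0x0982: "B", 0x0983: "X", 0x09CD: "H", 0x09CE: "Z"}
# Classes allowed immediately before each symbol (rules 2 to 7).
ALLOWED_BEFORE = {"H": "C", "M": "C", "D": "VCM",
                  "X": "VCMD", "B": "VCMD", "Z": "VCMDBX"}

def classify(ch):
    cp = ord(ch)
    if cp in SINGLE:
        return SINGLE[cp]
    if 0x0985 <= cp <= 0x0994:
        return "V"
    if 0x0995 <= cp <= 0x09B9 or cp in (0x09F0, 0x09F1):
        return "C"
    if 0x09BE <= cp <= 0x09CC:
        return "M"
    return "?"  # outside the classes of this sketch

def units(label):
    """Split a label into (text, class) units, folding nukta into its base."""
    out = []
    for ch in label:
        if ch == NUKTA and out and out[-1][0] in NUKTA_BASES:
            out[-1] = (out[-1][0] + ch, "C")
        else:
            out.append((ch, classify(ch)))
    return out

def wle_violations(label):
    """Return a list of (index, message) pairs; empty if no rule is broken."""
    u = units(label)
    text = [t for t, _ in u]
    sym = [s for _, s in u]
    # Positions of the Hasanta inside S1/S2, which rule 2 must not reject.
    s_hasanta = {i + 1 for i in range(len(u) - 3)
                 if "".join(text[i:i + 4]) in S_SEQUENCES}
    errors = []
    for i, s in enumerate(sym):
        prev = sym[i - 1] if i > 0 else ""
        if s == "?":
            errors.append((i, "code point outside the classes of this sketch"))
        elif s == "V" and prev == "H":
            errors.append((i, "V (or S) preceded by H"))        # rules 8, 9
        elif s in ALLOWED_BEFORE and i not in s_hasanta:
            ok = prev != "" and prev in ALLOWED_BEFORE[s]
            if s == "Z" and prev == "H":
                ok = i >= 2 and text[i - 2] in RA                # P before Z
            if not ok:
                errors.append((i, f"{s} not allowed after {prev or 'label start'}"))
    return errors

print(wle_violations("\u09AC\u09BE\u0982\u09B2\u09BE"))  # 'Bangla'
print(wle_violations("\u0995\u09CD\u0985"))              # C H V
```

Returning the full list of violations, rather than stopping at the first one, makes the sketch easier to use when testing candidate labels against several rules at once.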
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in adding such ligatures, because any user can easily create them even though they may not be attested combinations.

7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া bæŋk ʌv ɪndiə (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE), meaning 'Bank of India'. This is a case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the intended sound. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.

This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-Joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic communities; hence it is explicitly prohibited by the NBGP. If required in future, depending on the prevailing requirements of the community, a future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF

Below are a few examples which help one understand some of the rules the ABNF puts in place. They are given for reference purposes only and are not meant to be comprehensive.

1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Example:
্ক ्क
াক ाक
ংক ंक
ঁক ँक
ঃক ःक
As can be seen, such a combination automatically results in a "golu" or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is applied quasi-automatically wherever supported by the OS.

2. H is not permitted after V, B, D, X, M or S.
Example:
অ্ अ्
অং্ कं्
কঁ্ कँ्
কঃ্ कः्
কি্ कि्

3. The number of B, D or X permitted after a Consonant, a Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalid.
Example:
কংং कंं
কঁঁ कँँ
কঃঃ कःः
কাঁঁ काँँ
কীঃঃ कीःः
অংং अंं
অঁঁ अँँ
অঃঃ अःः

4. The number of M permitted after a Consonant is restricted to one.
Example:
কীী कीी

5. M is not permitted after V.
Example:
ইা, ঈৗ ईा, ईौ

6. The combinations Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Example:
কংঃ कंः
কঃং कःं
8 Contributors
8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members

8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals' Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata; New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje & Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. 'Bangla Harapher Panch Parba'. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii + 316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the MS University of Baroda, Social Science Number, 29.2: 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25.11.2017.
[124] Ethnologue: Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25.11.2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20.10.2019.
[126] Wikipedia: Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25.11.2017.
[129] Omniglot: Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10.5.2018.
[130] Wikipedia: Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25.11.2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20/, accessed on 10.5.2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10.5.2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10.5.2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1.1: 23-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1.2: 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). 'Srīhatta-Nāgarī Lipir Utpatti o Bikāś'. Bangla Academy Patrika (Dhaka), Vol. 1.2 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia: Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19.5.2018.
[139] Furui, Ryosuke (2015). 'Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century', in Bhairabi Prasad Sahu & Hermann Kulke, eds. Interrogating Political Systems: Integrative Processes and States in Pre-Modern India, Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). 'Phones of Bengali'. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646 (UNICODE). https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia: Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. 'Bangla Spelling Reform: the Long and Short of It'. Bangla Journal 19: 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. "What has happened So Far in terms of Script Reforms". Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka on 10.07.2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 - Core Specification, Chapter 12, p. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language or script, certain restriction rules apply. In other words, in a given language some of the formalism's structures do not necessarily apply. To take care of such cases, restriction rules are set in place; these restrictions help to fine-tune the ABNF. This section specifically takes up restrictions pertaining to the Bangla (Bengali) language; Assamese and Maṇipuri are not covered here. In the case of Bangla in particular, the following rules apply:

1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold 'varga' (as defined under Table 5). Moreover, Bangla does not generally allow ya (য়) at the beginning of a word either, but a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem 'Lichuchor' written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).

2. C H can come with Khanda Ta only in the case where C is ra (র) (09B0): র্ৎ, as in ভর্ৎসনা.

3. Only the following combinations with VHCM are allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)

10.2 'Sylheti Nagarī lipi' or 'Siloti'
This version of the Bangla script resembles the 'Kaithī' script (ISO 15924) used by accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, and widely in use during the 1880s. There were several other names for Sylheti Nagarī or Siloti (129), such as 'Jalalabada Nagarī', 'Fula (flower) Nagarī', 'Muslim Nagarī' or 'Muhammad Nagarī'. It is said that Shah Jalal had brought the script with him to Sylhet in the 13th-14th century (138), although some have suggested that it was an invention of the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here (138).

Table 17: The Script Table of Sylheti Nagarī or Siloti
10.3 Confusable Code Points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.

10.3.1 Bangla and Nāgarī or Devanāgarī

Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
◌ঁ U+0981 | ◌ॅ U+0945 | Confusable
Table 18: Bangla and Devanāgarī confusable code points

10.3.2 Bangla and Gurmukhī

Bangla | Gurmukhī | NBGP Decision
ঘ U+0998 | ਬ U+0A2C | Confusable
◌ঁ U+0981 | ◌ੱ U+0A71 | Confusable
Table 19: Bangla and Gurmukhī confusable code points

Bangla | Gurmukhī | NBGP Decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable
Table 20: Bangla and Gurmukhī distinguishable code points

10.3.3 Bangla and Oriya (Odia)

Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable
Table 21: Bangla and Oriya confusable code points

Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable
Table 22: Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs

Consonants    Phonetic Value    Allographs
Clusters    Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ Thisletterdoesnothaveanyparticularphoneticvaluebutmostlypronouncedasn
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒThesecondarysymbol(allograph)jɔ-phalahastwophoneticvaluesWhenaddedtotheinitialconsonantinaworditisavowelaelig(asinশ8ামলর 8াপারetc)Butafteranon-initialconsonantitjustdoublesitinpronunciation(asinকায6ধায6etc)The+যcombinationhastwophysicalmanifestationsmdashর 8andয6
ক8(p+য)স8(+য)র 8(+য)[Justর 8isneverusedinBanglaorthographyর 8াisbutthenitslasttwosymbolsYa-phalaa-karaconstituteavowelsignrepresentingthevowelঅ8া]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ hwordfinallywordmediallyitdoublesthepronunciationofthefollowingconsonant
অঃকঃ
অব
PERIOD | DESCRIPTION | NAMES
4th-5th Century AD | The next stage of its evolution was into the Gupta script, named after the Gupta royal dynasty. | Gupta script
7th Century AD | Epigraphic records from Bangladesh demonstrate remarkable developments in Eastern Brāhmī, giving rise to the Kuṭila-lipi. | Kuṭila-lipi
8th Century AD | Some copper-plate inscriptions are found in the Khalimpur Bangladesh during the reign of Dharmapāla, from Monghyr and Nālandā in Bihar of the time of Devapāla, and from Jagjīvanapura in West Bengal of the reign of Mahendrapāla. | Siddhamātṛkā
9th Century AD until 1025 AD | Proto-Bangla characteristics in rudimentary forms develop. An important landmark in the development of the Bangla script is the Ramaganja copper-plate inscription of Mahāmāndalika, found in the last quarter of the eleventh century AD. | Proto-Bangla Script & Language
12th-13th Century AD | A mature form of Proto-Bangla, the immediate precursor of Bangla script, is found in the inscriptions of the Varmana, Sena and Deva rulers of the twelfth and thirteenth centuries. | Matured Proto-Bangla
14th-15th Century AD | The characteristics of typical Bangla script began to develop, as can be seen in the copper-plate inscription of Vijayamānikya-I of Tripura dated 1478 AD, which also illustrates forms of Bangla letters in the fifteenth century AD. | Modern Bangla Script era begins (see Ross 1999)
16th-17th Century AD | The chart of the Bangla alphabet appended to the China Monuments, published from Amsterdam in 1667, and The Code of Gentoo Law, published from London in 1776, both show a chart of the Bangla alphabet. They show 16 vowel letters, including the long ‘ৡ’ (‘ḹ’), Anusvāra and Visarga, and 34 consonants. | Printed Charts of Bangla
18th-19th Century AD | Charles Wilkins develops printing in Bangla in 1778 and Vidyasagar reforms it. | Bangla Type Fonts
Table 2: Development of the Bangla Writing System
The overall development of Bangla script from the Kuṭila-lipi period to Modern Bangla can be seen in Table 3 ([102] and [146]; see also the web page in [147]).
Table 3: Bangla Script in Different Centuries
3.2 Languages Considered
Below is the tabular representation of the languages using Bangla script that are placed on EGIDS Scale 1-6 (see [117] for details). Some languages under EGIDS 5 and 6 have also developed their own scripts for printing and publishing. Some had used Bangla script earlier (such as Bodo) or used it in West Bengal at some point of time (Santali), but have later shifted to another writing system. Bodo is now written in Nāgarī or Devanāgarī, and for Santali one uses both Nāgarī/Devanāgarī and Ol-chiki [145]. For the purposes of the Bangla LGR, only languages belonging to EGIDS scale 1 to 4 have been considered. Consider the following table:
EGIDS Scale 1
EGIDS Scale 2
EGIDS Scale 3
EGIDS Scale 4
EGIDS Scale 5
EGIDS Scale 6
Bangla (Bengali)
Santali, Bodo, Riang, Khumi, Mru(ng), Asho
Lepcha, Pnar, Koda, Kora, Chak
Asamiyā (Assamese)
Koch or Rajabansī
Malto or Malpahariya
Manipuri or Meitei
Bisnupriya Manipuri, Kok-Borok (Tripura & Bangladesh)
Chakma, Hajong, Mundari & Kurux (of Bangladesh)
Toto, Rohingya, Tippera, Megam, Tanchangya
Usoi, Limbu, Sadri or Oraon
Bhumij or Mundari, Bawm, Chin
Table 4: Main languages in India and Bangladesh that use Bangla Script on the EGIDS Scale
3.3 Notable Features of Bangla Script [150]
The Bangla Writing System has certain features that show how it has to be written, or how type-setting in Bangla could be done. This section is followed by a section that explains the code points (and fixed code-point sequences) which show certain distinctive characteristics of Bangla and which make up the Repertoire. The next sections will also cover the ‘akshar’-formation rules (ABNF) showing character classes, Whole Label Evaluation (WLE) and Context Rules, as well as In-Script and Cross-Script Variants. Here we present some basic features of the script and pronunciation.
The Bangla script is an alpha-syllabic writing system in which all consonants are assumed to contain an accompanying ‘inherent’ vowel (theoretically before or after each consonant). It varies between ɔ and o depending on the position of the consonant in the word. At times these ‘assumed’ or ‘inherent’ vowels are not pronounced at all [142].
Vowels can be written as independent letters, or by using a variety of diacritical marks which are written above, below, before, after, or in both of the last two positions relative to the consonant they follow in pronunciation [105].
All Bangla consonants, when pronounced in isolation, are uttered with an inherent vowel -ɔ; hence ক ‘k’, খ ‘kh’ or গ ‘g’ are usually pronounced as [kɔ], [khɔ] or [gɔ], etc. Phonologically, the Bangla vowel -ɔ corresponds to the Hindi schwa ə.
When consonants occur together in clusters, special conjunct letters are formed. In printed Bangla, many of these consonantal clusters or conjoined consonants are in use. The letters for the consonants other than the final one in the group are generally reduced. But there are a few special conjunct characters which are compounds of the consonant characters, e.g. ক (k) + ষ (s) = ক্ষ (ks), ঞ (n) + জ (j) = ঞ্জ (nj), জ (j) + ঞ (n) = জ্ঞ (jn), হ (h) + ম (m) = হ্ম (hm). There are other issues also: র as the second member of a cluster is reduced to a secondary symbol, e.g. প (p) + র (r) = প্র (pr), ষ (s) + ট (t) + র (r) = ষ্ট্র (str) (as in উষ্ট্র ustra "camel"). য (y), when used as a primary symbol, represents jɔ in Bangla. But its secondary symbol (allograph), jɔ-phala, has two phonetic values. When added to the initial consonant in a word, it is the vowel æ (as in শ্যামল (syamala) "green", র‍্যাপার (ryapara) "wrapper", etc.). But after a non-initial consonant, it just doubles that consonant in pronunciation (as in কার্য, ধার্য, etc.). The র (r) + য (y) combination has two renderings: র‍্য (ry) and র্য (ry). In the case of দ (d) + ধ (dh), গ (g) + ধ (dh) and ন (n) + ধ (dh), the shape of the second member is changed, e.g. দ্ধ (ddh), গ্ধ (gdh) and ন্ধ (ndh) respectively. The solitary example of র (r) + ঋ (r) = র্ঋ (as in নৈর্ঋত nairrt "Southwest"), used mostly in cases of Classical borrowings, shows the use of the secondary symbol of a consonant followed by the primary symbol of a vowel. The inherent vowel only applies to the final consonant of the cluster.
In consonant clusters, many consonants took a completely different form. Some typical examples are ক্ত (kt), ক্র (kr), ক্ষ (ks), গ্ধ (gdh), জ্ঞ (jn), ঞ্চ (nc), ঞ্জ (nj), ত্ত (tt), ন্ত (nt), ন্ধ (ndh), ব্ধ (bdh), ভ্র (bhr), ম্ব (mb), স্ত (st), etc. র has two allographs apart from its full shape: one is ‘repha’, as found in র্ক (rk) and র্প (rp), and another is ra-phala, as in প্র (pr) and ক্র (kr). ষ্ণ (s+n) is another one, where the cerebral nasal consonant sign takes a queer shape [151].
The Bangla script has at least fifty-two primary symbols and quite a few allographs (positional variants of them), corresponding to forty-four phonemes (7 oral and 7 nasal vowels, and 30 consonants) [150], or functional speech sounds, with some obvious redundancies, although in one of the first phonemic analyses the number was thought to be thirty-five phonemes [140].
As mentioned above, in Bangla several graphemic symbols have secondary shapes, technically called ‘allographs’, with a complementary distribution in each case. These graphs or markings are generally added at the following positions of the primary symbol [113]:
1) Below (e.g. কু (ku), ন্ত (nta), হ্র (hra), etc.)
2) Above (e.g. চঁ (ca), র্ক (rka), etc.)
3) Right side (e.g. কা (ka), কং (kan), etc.)
4) Left side (e.g. কে (ke))
5) Left side and above simultaneously (e.g. কৈ (kai), কি (ki), etc.)
6) Right side and above simultaneously (e.g. কী (kī))
7) Right side and left side simultaneously (e.g. কো (ko))
8) Right side, left side and above simultaneously (e.g. কৌ (kau))
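The two-sided placements in items 7 and 8 are visible in the Unicode data itself: the vowel signs O and AU have canonical decompositions into a left part and a right part. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `split_two_part_sign` is illustrative and not part of the proposal.

```python
import unicodedata

# Vowel signs that sit on both sides of the consonant (items 7 and 8 above).
TWO_PART_SIGNS = {
    "\u09CB": "BENGALI VOWEL SIGN O",
    "\u09CC": "BENGALI VOWEL SIGN AU",
}


def split_two_part_sign(sign: str) -> list[str]:
    """Return the canonical (NFD) parts of a vowel sign as 'U+XXXX' strings."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFD", sign)]


for sign, name in TWO_PART_SIGNS.items():
    # Each sign splits into the left-side sign E plus a right-side mark.
    print(name, "->", split_two_part_sign(sign))
```

Because NFC recomposes these pairs, a label typed as two parts and a label typed with the single code point end up identical after normalization.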
As for complementary distribution of vowel letters (word- or syllable-initial) and Vowel Matras, which are relevant for ABNF, let us consider the following. Besides some simple Vowel Modifiers called ‘Kars’ in Bangla (also referred to as Matra in the other LGR documents of Neo-Brāhmī), there are some combinatory modifiers of Bangla vowels with certain consonants. For example, whereas
আ U+0986 BENGALI LETTER AA is substituted by া U+09BE BENGALI VOWEL SIGN AA,
ই U+0987 BENGALI LETTER I is substituted by pre-posed ি U+09BF BENGALI VOWEL SIGN I,
ঈ U+0988 BENGALI LETTER II is substituted by ী U+09C0 BENGALI VOWEL SIGN II, or
উ U+0989 BENGALI LETTER U is substituted by ু U+09C1 BENGALI VOWEL SIGN U by marking below the primary grapheme,
there are some special vowel modifiers of উ, as in the following combined letters:
গু gu, written with a special combined form rather than as গ (g) + ু (u)
রু ru, rather than writing as র (r) + ু (u)
শু śu, rather than writing as শ (s) + ু (u)
হু hu, rather than writing as হ (h) + ু (u)
ন্তু ntu, rather than writing as ন (n) + ত (t) + ু (u)
Similarly, there could be vowel modifiers of ঊ, or ‘(Long) ū’, as well, e.g. ভ (bh) + র (r) (ভ্রূ bhru "eyebrow"), শ (s) + র (r) (শ্রূ sru), or ঋ (r) after হ (h) (হৃ hr), etc.
There have been many notable contributions in simplifying and modifying Bangla spellings and combinatory techniques, especially by scholars such as Pabitra Sarkar (1992) [134]. In this, there has been an attempt to reduce the number of allographs of both vowels and consonants in clusters, and it has been widely accepted in the printing of school texts in both Bangladesh and West Bengal [151, 152]. As of now, two systems, the old (traditional) and the new, go on side by side, operative in different domains.
However, in preparation of this LGR document, the aim has been to consider the widely used and usable sequences and combinations and their variations across the sister scripts belonging to the basket of Brāhmī writing systems. Bangla Academy, Dhaka published Standard Bangla Spelling Rules in 1992, following the recommendations of a committee constituted through a workshop jointly organized by the Jatīya Śikṣākrama and Pāṭhyapustaka Board in 1988. A thoroughly revised edition of the Rules was published in September 2012 (see footnote 6). After the establishment of Bāṅlā Ākādemi of West Bengal in 1986, its first President, Annadasankar Ray (1904-2002), in his inaugural address gave a direction for standardization of the Bangla alphabet/script and the spelling system, and clearly argued that they would not blindly follow the Sanskritic model of conventional grammar. A broad list of proposals was sent to experts on Bangla, and a broad agreement was reached for ‘homogenization of Bangla spelling’ by 1988. Based on opinions received from different quarters, a unanimous list of ‘rules’ was agreed upon. This was published as a ‘Spelling Dictionary’ titled Ākādemi Bānāna Abhidhāna (1997), which was obviously more comprehensive than ‘The University of Calcutta proposals’ made in 1936. Along with the ‘rationalization’ of spellings, another step was taken to make the writing system easier to read by making the symbols used, both single and combined ones, more ‘transparent’. These reforms were originally suggested by Sarkar (1987, first published in 1978) [134][153], where he used the terms Swaccha (‘Transparent’) and Aswaccha (‘Opaque’ or non-transparent), even adding Ardha Swaccha (‘half transparent’) in between the two. Some sample examples are:
6 Bangla Academy. 2012. Bāṅlā Ekaḍemī Pramita Bāṅlā Bānānera Niyama (Bangla Academy Standard Bangla Spelling Rules). Dhaka: Bangla Academy.
Transparent: ন্ন (nn), প্ত (pt), স্ত (st), where both members of the cluster can be recognized.
Opaque, where neither of the two can be (easily) recognized: ক্ষ ks (ক k + ষ s), জ্ঞ jn (জ j + ঞ n), ঙ্গ ng (ঙ n + গ g), হ্ম hm (হ h + ম m).
Semi-transparent: প্র (pr), র্প (rp), where one symbol is recognizable and the other is not. In the case of three-term clusters, at least one symbol will not be transparent, e.g. স্ত্র str (স s + ত t + র r), ষ্ট্র str (ষ s + ট t + র r), etc.
There were in fact two types of proposals. One concerned the shape of the letters: those of consonant + vowel (CV) combinations, and conjuncts, which are consonant + consonant combinations. There were further complex shapes, i.e. those of consonant + consonant + (consonant +) vowel (CC(C)V) signs, as in প্রু (pru) or স্ক্রু (skru). Some decisions in this area were necessary because a few of the CC(C) symbols represented complexities that made learning them difficult for children. The other dealt with the spellings of words only, without any reference to the shapes of letters in which they were written. The basic objective here was ‘one word, one spelling’ to the greatest extent that was possible [151].
Below we place a statement of the most salient changes that affect the consonant + vowel combinations [153].
a. The variants of the short u (হ্রস্ব উ-কার hrasva u-kāra) vowel sign have been brought down to one, i.e. ু. So gu is now written as গ with ু below (গু). Similarly ru > রু, śu > শু, hu > হু, and therefore, for cluster + short u sign, ntu > ন্তু (ন + ্ + ত + উ) and stu > স্তু (স + ্ + ত + উ).
b. The variants of long u (দীর্ঘ ঊ-কার dīrgha ū-kāra) have also been reduced: rū > রূ, bhrū > ভ্রূ (ভ bh + ্ + র r + ঊ ū), drū > দ্রূ (দ d + ্ + র r + ঊ ū), śrū > শ্রূ (শ ś + ্ + র r + ঊ ū).
c. The variants of ঋ-কার (ṛ-kāra, secondary symbol of ṛ) have been brought down to one: hṛ > হৃ.
Regarding consonant + consonant + (consonant) ... + (vowel) clusters, Paschimbanga Bangla Akademi proposed transparent or semi-transparent shapes for clusters to the extent admissible in the Bangla writing system. Some examples will clarify the proposal. (A slash will mean that the traditional cluster-shape precedes it, while the Bangla Akademi innovation follows.) [153]
ব্ধ bdh (ব b + ধ dh), দ্ধ ddh (দ d + ধ dh), ন্থ nth (ন n + থ th), ঞ্চ ñc (ঞ ñ + চ c), ঞ্ছ ñch (ঞ ñ + ছ ch), ঞ্জ ñj (ঞ ñ + জ j), ক্ত kt (ক k + ত t), ক্র kr (ক k + র r), গ্ধ gdh (গ g + ধ dh), ঙ্ক ṅk (ঙ ṅ + ক k), ঙ্গ ṅg (ঙ ṅ + গ g), ষ্ণ ṣṇ (ষ ṣ + ণ ṇ), ন্ধ্র ndhr (ন n + ধ dh + র r), ণ্ড্র ṇḍr (ণ ṇ + ড ḍ + র r), ক্ত্র ktr (ক k + ত t + র r)
3.3.1 The Consonants
As per traditional classification, Bangla consonants are categorized according to their phonetic properties, especially in terms of place and manner of articulation [107]. There are five ‘Varga’ (pronounced as ‘Barga’ in Bangla) or groups (sets or classes), distinguished by place of articulation, and one non-‘Varga’ group [105]. Each Varga, which corresponds to stops at a certain place of articulation, contains a series of five consonants classified as per their phonetic qualities (i.e. manner of articulation), beginning from unvoiced and unaspirated to voiced and aspirated (in the fourth column), and finally ending with a homorganic or corresponding nasal [107]. Consider the following table:
‘Varga’ or Sets | Unvoiced -Asp | Unvoiced +Asp | Voiced -Asp | Voiced +Asp | Nasal
Velar | ক ‘K’ U+0995 | খ ‘KH’ U+0996 | গ ‘G’ U+0997 | ঘ ‘GH’ U+0998 | ঙ ‘NG’ U+0999
Palatal | চ ‘C’ U+099A | ছ ‘CH’ U+099B | জ ‘J’ U+099C | ঝ ‘JH’ U+099D | ঞ ‘NY’ U+099E
Retroflex | ট ‘TT’ U+099F | ঠ ‘TTH’ U+09A0 | ড ‘DD’ U+09A1 | ঢ ‘DDH’ U+09A2 | ণ ‘NN’ U+09A3
Dental | ত ‘T’ U+09A4 | থ ‘TH’ U+09A5 | দ ‘D’ U+09A6 | ধ ‘DH’ U+09A7 | ন ‘N’ U+09A8
Bilabial | প ‘P’ U+09AA | ফ ‘PH’ U+09AB | ব ‘B’ U+09AC | ভ ‘BH’ U+09AD | ম ‘M’ U+09AE
Table 5: Varga classification of Bangla consonants
(Falling into a pattern of five sets of Unvoiced Unaspirated, Unvoiced Aspirated, Voiced Unaspirated, Voiced Aspirated and Nasals, called five ‘Varga’)
Non-Varga | য ‘Y’ U+09AF | য় ‘YY’ U+09DF | র ‘R’ U+09B0 | ড় ‘RR’ U+09DC | ঢ় ‘RH’ U+09DD
Non-Varga | ল ‘L’ U+09B2 | শ ‘SH’ U+09B6 | ষ ‘SS’ U+09B7 | স ‘S’ U+09B8 | হ ‘H’ U+09B9
Table 6: Non-Varga consonants (not falling into any of the five categories)
3.3.2 The Implicit Vowel Killer: Hasanta (called ‘Halant’ or ‘Halanta’ in other Brāhmī-based scripts)
As stated earlier, all consonants are pronounced in isolation with an implicit vowel (central back -ɔ in Bangla, as the neutral vowel) assumed to be associated with them [121]. The term ‘Hasanta’ (= ‘Halant’ or ‘Halanta’ in other Brāhmī-based scripts) and the term ‘Virāma’ (see footnote 7) (= ‘Dāṛi’ in Bangla), as preferred in UNICODE (cf. Unicode 3.0 and above), have been used in this report to denote the character that marks the absence of this inherent vowel. It may be noted that the term virāma has been adopted in UNICODE in a sense that is different from the traditional definition of grammar, and hence it requires some explanation here. Considering the importance of the document, this note should be a part of this LGR document so that anybody referring to it is able to know the proper grammatical explanation of the term. Because a special sign is needed whenever this implicit vowel is stripped off, the symbol is known as the Hasanta (= Halant) (U+09CD). By placing the Hasanta under the first consonant of a combination or cluster, one could, in common parlance, "kill" its vowel and create conjuncts. In this manner, conjunct characters can generally be written by joining two to four consonants. In rare cases, this process can join up to five consonants. However, the notion of a maximum number of consonants joining to form one akṣara (see footnote 8) is to be bounded empirically. This is an observation based on the CIIL-Emille Corpora of Bangla words [132 & 133] as seen in print these days. Given the mixture of scripts and languages happening on the web, the possibility that one may want a generic Top Level Domain [gTLD] which has more than the observed maximum cannot be ruled out. This can be the case when a foreign language word which admits a large number of consonants is transliterated into Bangla. Hence, in the Bangla LGR work, this limit will not be enforced.
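The empirical observation about cluster size can be checked on any word list by counting how many consonants are chained together by Hasanta. The sketch below is a simplification made for illustration: it treats every assigned letter in U+0995 to U+09B9 as a consonant and ignores nukta sequences and join controls; the function names are hypothetical.

```python
import unicodedata

HASANTA = "\u09CD"


def is_consonant(ch: str) -> bool:
    """Simplified test: an assigned letter in the range U+0995..U+09B9."""
    return 0x0995 <= ord(ch) <= 0x09B9 and unicodedata.category(ch) == "Lo"


def longest_cluster(word: str) -> int:
    """Largest number of consonants linked by Hasanta within the word."""
    best = run = 0
    i = 0
    while i < len(word):
        if is_consonant(word[i]):
            run += 1
            best = max(best, run)
            # The cluster continues only if Hasanta links to another consonant.
            if (i + 2 < len(word) and word[i + 1] == HASANTA
                    and is_consonant(word[i + 2])):
                i += 2
                continue
        run = 0
        i += 1
    return best


# U+0989 U+09B7 U+09CD U+099F U+09CD U+09B0 spells the word for "camel".
print(longest_cluster("\u0989\u09B7\u09CD\u099F\u09CD\u09B0"))
```

Since the LGR does not enforce a maximum, a function like this would serve only for corpus statistics, not for label validation.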
3.3.3 Vowels
Separate symbols exist for all ‘Swara’ or vowels in Bangla, which are pronounced independently, either at the beginning of the word or after another vowel or consonant sound. To indicate a vowel sound other than the implicit one, a vowel sign, called ‘kār’ in Bangla or Mātrā in Nāgarī (see footnote 9), is attached to the consonant. Since the consonant has this built-in neutral vowel at the end, there are equivalent kāras (Mātrās) for all vowels except অ (pronounced -ɔ). The correlation is shown as follows:
7 Virāma as used here is also a misnomer according to the Indian grammatical traditions. Nowhere is the mere absence of a vowel marked as virāma. Hasanta just marks the absence of a vowel, nothing else (Abhyankar, Kashinath Vasudev & J. M. Shukla. 1961. A Dictionary of Sanskrit Grammar. Baroda: Oriental Institute).
8 This term needs to be disambiguated. Akṣara also means ‘syllable’ in Indian grammatical traditions.
Vowel | Corresponding vowel sign (kāras/Mātrās)
অ ‘A’ U+0985 | -
আ ‘AA’ U+0986 | া U+09BE
ই ‘I’ U+0987 | ি U+09BF
ঈ ‘II’ U+0988 | ী U+09C0
উ ‘U’ U+0989 | ু U+09C1
ঊ ‘UU’ U+098A | ূ U+09C2
ঋ Vocalic ‘R’ U+098B | ৃ U+09C3
ৠ Vocalic ‘RR’ U+09E0 | ৄ U+09C4
ঌ Vocalic ‘L’ U+098C | ৢ U+09E2
ৡ Vocalic ‘LL’ U+09E1 | ৣ U+09E3
এ ‘E’ U+098F | ে U+09C7
ঐ ‘AI’ U+0990 | ৈ U+09C8
ও ‘O’ U+0993 | ো U+09CB
ঔ ‘AU’ U+0994 | ৌ U+09CC
- | ৗ U+09D7
Could appear on top of অ ‘A’ U+0985 or any other vowel | ঁ U+0981 Candrabindu
Could appear after অ ‘A’ U+0985 or any other vowel | ং U+0982 Anusvara
Could appear after অ ‘A’ U+0985 or any other vowel | ঃ U+0983 Visarga
After any consonant | ্ U+09CD (Hasanta)
- | ঽ U+09BD Avagraha
Table 7: Bangla Vowels with corresponding kārs
9 Although the term ‘Matra’ in Bangla stands for an altogether different concept, viz. the top bar placed over a letter, typically available in Hindi and Bangla but missing in Gujarati.
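The vowel-to-kār correlation of Table 7 can be held as a simple lookup table. The sketch below encodes only the fourteen independent vowels of the table and a hypothetical helper `attach`, which appends the dependent sign to a consonant; অ has no sign because it is the inherent vowel.

```python
import unicodedata

# Independent vowel -> dependent vowel sign (kār), as listed in Table 7.
VOWEL_TO_KAR = {
    "\u0986": "\u09BE", "\u0987": "\u09BF", "\u0988": "\u09C0",
    "\u0989": "\u09C1", "\u098A": "\u09C2", "\u098B": "\u09C3",
    "\u09E0": "\u09C4", "\u098C": "\u09E2", "\u09E1": "\u09E3",
    "\u098F": "\u09C7", "\u0990": "\u09C8", "\u0993": "\u09CB",
    "\u0994": "\u09CC",
}
INHERENT_VOWEL = "\u0985"  # A: carried by the bare consonant, no sign needed


def attach(consonant: str, vowel: str) -> str:
    """Return consonant + kār for the given independent vowel."""
    if vowel == INHERENT_VOWEL:
        return consonant
    return consonant + VOWEL_TO_KAR[vowel]


# The Unicode names of each pair differ only in LETTER vs VOWEL SIGN.
for letter, sign in VOWEL_TO_KAR.items():
    print(unicodedata.name(letter), "->", unicodedata.name(sign))
```

In memory the sign always follows the consonant, even for signs such as ি that are displayed to its left.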
3.3.4 The Anusvāra / onuʃʃār (ং - U+0982)
The Anusvāra, or onuʃʃār in Bangla, at times represents a homorganic nasal, but not always. It replaces a conjunct group of ‘Nasal Consonant + Hasanta + Consonant’ where the second consonant belongs to the Velar varga or set, as in লংকা. But it often appears also for such combinations involving non-velars appearing as the last member of the combination, as in ল্যাংটা "naked" or ল্যাংচা "a kind of sweet; to limp". Before a non-varga consonant, the Anusvāra represents a nasal sound that may have an alternative conjoined writing symbol representing the corresponding nasal consonant of the particular set. Although Modern Hindi, Marathi and Konkani prefer the anusvāra to the corresponding half-nasal, in Bangla it is clearly demarcated as to where one must use the Anusvāra and where it has to be a conjunct cluster with a nasal as the first or the second component.
3.3.5 Nasalization: Candrabindu (ঁ - U+0981)
Candrabindu denotes nasalization of the preceding vowel, as in চাঁদ cãd ‘moon’ (U+099A U+09BE U+0981 U+09A6). This sign, with a dot inside the half-moon mark, is used as a nasalization marker in many Brāhmī-based scripts [143].
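Code point sequences such as the one quoted for ‘moon’ are easy to verify mechanically. The following is a small sketch using the standard `unicodedata` module; the helper name `describe` is an assumption made for this example.

```python
import unicodedata


def describe(text: str) -> list[tuple[str, str]]:
    """Pair each character's code point with its Unicode character name."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


# CA + VOWEL SIGN AA + CANDRABINDU + DA, the sequence quoted above.
MOON = "\u099A\u09BE\u0981\u09A6"

for code_point, name in describe(MOON):
    print(code_point, name)
```

The listing shows that the candrabindu is stored after the vowel sign it nasalizes, matching the order given in the text.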
3.3.6 Nukta (় - U+09BC)
The nukta sign does not exist in Bangla orthography. It is predominantly used in many Brāhmī-derived scripts such as Devanāgarī (for Hindi, Bodo, Maithili, Santali, Kashmiri and Sindhi). The term and the concept of nukta are borrowed in Bangla. The IDNA Protocol (RFC 5891) states that IDNs must be in Unicode Normalization Form C (NFC). RFC 7940 applies this requirement to LGRs. The definition of NFC in the Unicode Standard contains a number of composition exclusions. As a result, the Bangla letters য় YYA, ড় RRA and ঢ় RHA have to be represented in this LGR by using the sequences (YA + Nukta: U+09AF + U+09BC), (DDA + Nukta: U+09A1 + U+09BC) and (DDHA + Nukta: U+09A2 + U+09BC) instead of the single code points YYA (U+09DF), RRA (U+09DC) and RHA (U+09DD), although the use of ‘Nukta’ is otherwise completely unnatural in Bangla.
It is noted that in the current Unicode Standard chart these characters are listed as additional consonants. As per the LGR Procedure, however, these decisions depend on the IDNA Protocol, through a set of procedures developed by the IETF. Even though the Unicode Standard also prescribes methods to produce these three characters as atomic characters (for example, 09DC for ড় [r], 09DD for ঢ় [rh] and 09DF for য় [y], each as a single keystroke), the IDNA protocol requires that we treat them as conjunct characters and then allocate codes for these in the Unicode Bengali Block.
It may be noted that there could be sporadic attempts or cases of writing Muslim names, Urdu poetic words and Perso-Arabic loan words with nukta under ক (k), খ (kh), গ (g), জ (j) and ফ (ph), only for the sake of correct pronunciation and for maintaining the sanctity of the loan word. These were also like using the Bangla writing system to work like the IPA script. It is, however, not in use in Bangla writing in printing.
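The effect of the composition exclusions can be observed with any conformant normalizer. The sketch below uses Python's standard `unicodedata` module; the helper name `to_lgr_form` is illustrative and simply applies NFC, which is the form required of labels.

```python
import unicodedata

# Single code point -> base letter + Nukta (U+09BC), as listed in the text.
NUKTA_SEQUENCES = {
    "\u09DC": "\u09A1\u09BC",  # RRA -> DDA + Nukta
    "\u09DD": "\u09A2\u09BC",  # RHA -> DDHA + Nukta
    "\u09DF": "\u09AF\u09BC",  # YYA -> YA + Nukta
}


def to_lgr_form(label: str) -> str:
    """Normalize a label to NFC, the form required of labels."""
    return unicodedata.normalize("NFC", label)


for single, sequence in NUKTA_SEQUENCES.items():
    # NFC decomposes the excluded code point and never recomposes it.
    print(f"U+{ord(single):04X}", "->",
          " ".join(f"U+{ord(ch):04X}" for ch in to_lgr_form(single)))
```

This is why the repertoire lists the two-code-point sequences: a label typed with the single code point is indistinguishable from the sequence once normalized.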
3.3.7 Visarga / biʃɔrgo (ঃ - U+0983) and Avagraha (ঽ - U+09BD)
The Visarga (biʃɔrgo), U+0983, is frequently used in Bangla loanwords borrowed from Sanskrit and represents a sound very close to h. One could quote as an example দুঃখ duhkho "sorrow", "unhappiness" (U+09A6 U+09C1 U+0983 U+0996).
The Avagraha ঽ (U+09BD) is mainly used in Sanskrit, Pali, Prakrit or Maithili texts written in Bangla. It is gradually being replaced by an upper comma (e.g. নরোঽপরাণি re-written as নরো’পরাণি). It is rarely used now, even in other languages using Bangla script. In the case of the LGR, the Avagraha is not part of the repertoire. It has been decided, therefore, not to retain Avagraha (ঽ) (U+09BD), because it is blocked in TLDs as per the Maximal Starting Repertoire (MSR).
Please see Appendix II in Section 11 for a complete list of Bangla consonants and their allographs.
3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)
This note is pertinent to the use of Zero Width Joiner (ZWJ) and Zero Width Non-Joiner (ZWNJ) as used in Bangla. It needs to be noted that Nepali, Konkani and Hindi use these two signs in a different manner.
ZWJ (U+200D) and ZWNJ (U+200C) are code points that have been provided by the Unicode Standard to instruct the rendering of a string where the script has the option between joining and non-joining characters. Without the use of these control codes, the string may be rendered in an alternate form from what is intended.
Use of ZWJ
• Insofar as Bangla is concerned, ZWJ is used for the proper rendering of characters such as khaṇḍa-ta ৎ, as in সত্যজিৎ (satyajit) "Satyajit" and সৎ (sat) "honest". This is typed as follows: ta + Hasanta + ZWJ (U+200D).
• However, ZWJ is more important where the same combination of consonantal characters is represented differently depending upon the context. E.g. র + ্ + য has two representations in Bangla: as র্য and as র‍্য. To get the form র্য, one has to type র + ্ + য, but for র‍্য the sequence would be র + ZWJ + ্ + য [154]. In other words, ZWJ is used in the rendering of words demanding ya-phalā after ra, which is otherwise not possible to type (render), because the same sequence ra + hasanta + antastha ja, in medial and/or final position, is used to type repha on the consonant antastha ja, as in কার্য (kaarjo). In order to get a ya-phalā after the consonant ra, it is therefore obligatory to use ZWJ after ra, as in র‍্যাপার (wrapper), র‍্যাশ (rash), র‍্যালি (rally), etc. The typing sequence is given below:
ra (র) + ZWJ + hasanta (্) + antastha ja (য) = র‍্য
Use of ZWNJ
• ZWNJ is used in Bangla to represent the explicit Hasanta or Halant. In order to avoid conjunct formation in cases where there is an explicit hasanta before the succeeding consonant, the ZWNJ is used:
Consonant + hasanta + ZWNJ + consonant = explicit hasanta. Example: প্রাক্‌কথন (prakkathana, prakkɔtʰon)
The use of ZWJ/ZWNJ has been ruled out from the root zone by the [Procedure]. Used in Bangla to create alternate renderings, the insertion of these two signs can affect searching as well as NLP.
The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Hasanta) where default conjunct formation is to be explicitly restricted and the Hasanta joining the two consonants participating in the conjunct formation needs to be explicitly shown.
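Because ZWJ and ZWNJ are invisible and are ruled out of the root zone, a label check can simply report their positions. The sketch below is illustrative; the function names are assumptions made for this example, and the two test strings are the র + ্ + য sequences discussed above, written with escapes so that the joiner is visible in the source.

```python
ZWNJ = "\u200C"
ZWJ = "\u200D"

REPH_FORM = "\u09B0\u09CD\u09AF"            # ra + hasanta + ya (repha on ya)
YA_PHALA_FORM = "\u09B0\u200D\u09CD\u09AF"  # ra + ZWJ + hasanta + ya


def join_controls(label: str) -> list[tuple[int, str]]:
    """List (index, code point) for every ZWJ or ZWNJ in the label."""
    return [(i, f"U+{ord(ch):04X}")
            for i, ch in enumerate(label) if ch in (ZWJ, ZWNJ)]


def is_free_of_join_controls(label: str) -> bool:
    """True if the label contains neither ZWJ nor ZWNJ."""
    return not join_controls(label)


print(is_free_of_join_controls(REPH_FORM))
print(join_controls(YA_PHALA_FORM))
```

A consequence is that words whose spelling depends on ZWJ, such as those requiring ya-phalā after ra, cannot be represented exactly in a root zone label.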
3.3.9 Use of Ya-phalaa
Ya-Phalaa sequences are two instances in Bangla where Hasanta is preceded by a full vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E).
• অ্যা 0985 09CD 09AF 09BE: BENGALI LETTER A + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
• এ্যা 098F 09CD 09AF 09BE: BENGALI LETTER E + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD Hasanta plus U+09AF ya, preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound, as in English ‘acid’ অ্যাসিড, ‘association’ অ্যাসোসিয়েশন, ‘bat’ ব্যাট, ‘fat’ ফ্যাট, ‘mat’ ম্যাট, ‘cap’ ক্যাপ, etc.
The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as ‘vowel killer’, although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in the orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
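The exception described above can be stated as a check: a Hasanta that directly follows an independent vowel is acceptable only as part of one of the two listed four-code-point sequences. The following is a minimal sketch under that reading; the function names and the simplified vowel range (U+0985 to U+0994) are assumptions for illustration and do not reproduce the LGR's actual rules.

```python
HASANTA = "\u09CD"

# The two fixed sequences listed above: vowel + VIRAMA + YA + VOWEL SIGN AA.
YA_PHALAA_SEQUENCES = (
    "\u0985\u09CD\u09AF\u09BE",
    "\u098F\u09CD\u09AF\u09BE",
)


def is_independent_vowel(ch: str) -> bool:
    """Simplified test covering the independent vowels U+0985..U+0994."""
    return 0x0985 <= ord(ch) <= 0x0994


def vowel_hasanta_ok(label: str) -> bool:
    """Accept Hasanta after a vowel only inside a listed sequence."""
    for i, ch in enumerate(label):
        if ch == HASANTA and i > 0 and is_independent_vowel(label[i - 1]):
            # Window: vowel, Hasanta, and the two code points that follow.
            if label[i - 1:i + 3] not in YA_PHALAA_SEQUENCES:
                return False
    return True


# The transliteration of "acid" starts with the A + VIRAMA + YA + AA sequence.
print(vowel_hasanta_ok("\u0985\u09CD\u09AF\u09BE\u09B8\u09BF\u09A1"))
```

Treating the sequences as indivisible units keeps the general rule (no Hasanta after a vowel) intact everywhere else.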
3.3.10 Formation of Ra-phalaa and Reph Sequences
This case refers to the formation of repha and ra-phalā as follows:
Ra-Hasanta = (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL), and H is 09CD (্ - BENGALI SIGN VIRAMA).
Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALĀ). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র chakra "cycle"). The point is that in both cases the slot for ra could be Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed/preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā in both cases remain the same. The LGR makes a note of this point of concern with respect to the two RAs in disguise, as it would be completely impossible to distinguish between them with the naked eye in a label so generated, which may consequently lead to concerns related to spoofing and other kinds of cyber irregularities. The motive to class these two CPs as (blocking) variants is that fully rendered labels may mask the distinction between Bangla ra র (U+09B0) and the Assamese ra ৰ (U+09F0). That provides the justification for Variant Set 4, though only in the context of following Hasant. The difference between the RAs is only distinguishable if one looks into their Unicode values. Therefore labels such as অর্ক arka, শীর্ষ śīrṣa ‘top, apex’, অভ্র abhra ‘cloud, the sky’ and শ্রম śrama ‘physical labour’ could be extremely dangerous, as the web user may never verify the digital content (the labels) against its Unicode code points. This point is made explicitly with reference to Table 9 (of sequences, p. 36) and Table 16 (of WLE Symbols, p. 47) that are to follow. Moreover, it is noteworthy that the REPHA can also occur with KHANDA TA. The conditions in this context of KHANDA TA are liable to be such that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
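The concern about the two RAs can be made concrete with a small comparison routine: fold U+09F0 to U+09B0 wherever it is adjacent to Hasanta, and treat two distinct labels as variants of each other if they fold to the same string. This is only a sketch of the idea; the function names are hypothetical, and treating both the "followed by" and the "preceded by" contexts as triggering is an assumption based on the description above, not a statement of the LGR's formal variant rules.

```python
BANGLA_RA = "\u09B0"
ASSAMESE_RA = "\u09F0"
HASANTA = "\u09CD"


def fold_ra(label: str) -> str:
    """Map Assamese RA to Bangla RA when it is next to a Hasanta."""
    out = []
    for i, ch in enumerate(label):
        prev = label[i - 1] if i > 0 else ""
        nxt = label[i + 1] if i + 1 < len(label) else ""
        if ch == ASSAMESE_RA and HASANTA in (prev, nxt):
            ch = BANGLA_RA  # repha / ra-phalā shapes are identical
        out.append(ch)
    return "".join(out)


def are_ra_variants(a: str, b: str) -> bool:
    """True if two different labels differ only by RA in Hasanta context."""
    return a != b and fold_ra(a) == fold_ra(b)


ARKA_BANGLA = "\u0985\u09B0\u09CD\u0995"    # A + RA + HASANTA + KA
ARKA_ASSAMESE = "\u0985\u09F0\u09CD\u0995"  # same, with U+09F0
print(are_ra_variants(ARKA_BANGLA, ARKA_ASSAMESE))
```

Outside the Hasanta context the two letters have visibly different full shapes, so the sketch leaves them distinct there.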
4 Overall Development Process and Methodology
The Neo-Brāhmī Generation Panel (NBGP) has been formed by members having experience in Linguistics (especially in NLP / Computational Linguistics), Literature, Language History and Epigraphy. Under the Neo-Brāhmī Generation Panel, Bangla and eight other scripts belonging to separate Unicode blocks are being taken up to assign a separate LGR for each. However, an attempt is made to ensure that the fundamental philosophy behind building those LGRs is consistent with all other Brāhmī-derived scripts. The present LGR will cater to multiple languages belonging to EGIDS scale 1 to 4 (see Table 4) that use Bangla script. The following guiding principles are used in making decisions about Bangla LGR code points.
4.1 Guiding Principles
The NBGP adopts the following broad principles for selection of code points in the code-point repertoire across the board for all the Neo-Brāhmī scripts within its ambit.
4.1.1 Inclusion Principles
4.1.1.1 Modern Usage
Every character proposed should be in the everyday usage of a particular linguistic community. The characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code-point repertoire.
4.1.1.2 Unambiguous Use
Every character proposed should have unambiguous understanding among linguists about its usage in the language.
4.2 Exclusion Principles
The main exclusion principle is that of External Limits on Scope. These consist of protocols or standards which are prerequisites to the Label Generation Rule-sets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.
4.2.1 External Limits of Scope
The code point repertoire for the root zone being a very special case at the top of protocol hierarchies, the canvas of available characters for selection as a part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:
i. The Unicode Chart
Out of all the characters that are needed by the script in question, if a particular character is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, and especially so in the Bangla-Asamiyā-Manipuri Writing System, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.
ii. IDNA Protocol
Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded, as far as possible, all the possible characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters of the Unicode repertoire from being part of domain names.
iii. Maximal Starting Repertoire (MSR)
The Root-zone LGR being the repertoire of characters which are going to be used for creation of the Root-zone TLDs, which in turn constitute an even more specialized case of domain names, the ROOT LGR procedure introduces additional exclusions on IDNA’s allowed set of characters.
Example: Bangla Sign Avagraha ঽ (U+09BD), even if allowed by the IDNA protocol, is not permitted in the Root Zone Repertoire as per the MSR.
To sum up, the restrictions start off with admitting only such characters as are part of the code block of the given script/language. The IDNA Protocol further narrows this down, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
4.2.1.1 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brāhmī-based scripts will not be included.
4.2.1.2 No Symbols and Abbreviations
Abbreviations, weights and measures and other such iconic characters, like BANGLA ISSHAR ৺ (U+09FA), BANGLA CURRENCY DENOMINATOR SIXTEEN ৹ (U+09F9), etc., will also not be included.
4.2.1.3 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, such as Sanskritic VOCALIC RR ৠ (U+09E0) and VOCALIC L ঌ (U+098C), as well as VOCALIC LL ৡ (U+09E1), and the allographic -kāra forms of the latter two symbols: VOWEL SIGN VOCALIC L ৢ (U+09E2) and VOWEL SIGN VOCALIC LL ৣ (U+09E3). All such characters are excluded, which complies with the Conservatism principle as laid down in the Root Zone LGR procedure. However, in Bangla the -kāra corresponding to VOCALIC RR ৠ (U+09E0), which is VOWEL SIGN VOCALIC RR ৄ (U+09C4), is still in active use in certain limited borrowed or Sanskritic words and is therefore retained.
4.2.1.4 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit will not be included. This is also in consonance with the Letter principle as laid down in the Root Zone LGR procedure.
4.2.1.5 ABNF
The Augmented Backus-Naur Formalism (ABNF) is described in Section 5.4.1 and Appendix (Section 10.1).
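Returning to the three successive filters of Section 4.2.1 (Unicode, IDNA, MSR), the cascade can be pictured as a chain of membership tests applied in order. The sketch below is purely illustrative: the two sample sets contain only code points that the text itself names as excluded, they stand in for the real IDNA2008 and MSR tables, and the function name `filter_stage` is hypothetical.

```python
import unicodedata

BENGALI_BLOCK = range(0x0980, 0x0A00)

# Illustrative samples only, NOT the real tables:
# symbols named in 4.2.1.2, assumed here to fail the IDNA filter ...
IDNA_EXCLUDED_SAMPLE = {0x09F9, 0x09FA}
# ... and the Avagraha, which the text says is excluded by the MSR.
MSR_EXCLUDED_SAMPLE = {0x09BD}


def filter_stage(cp: int) -> str:
    """Report the first filter that rejects a code point, if any."""
    if cp not in BENGALI_BLOCK:
        return "outside the Bengali block"
    if unicodedata.name(chr(cp), None) is None:
        return "not encoded in Unicode"
    if cp in IDNA_EXCLUDED_SAMPLE:
        return "excluded by IDNA"
    if cp in MSR_EXCLUDED_SAMPLE:
        return "excluded by MSR"
    return "passes all three filters"


for cp in (0x0995, 0x09A9, 0x09FA, 0x09BD):
    print(f"U+{cp:04X}", filter_stage(cp))
```

The order of the tests mirrors the text: each later filter only ever sees what the earlier ones let through, so the final repertoire is a subset of every layer beneath it.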
5 Repertoire
The Bangla Writing System is represented in UNICODE using the Bengali (Bangla) script name as enumerated in ISO 15924, corresponding to languages such as Asamiyā (Assamese), Bangla (Bengali) and Manipuri. The BENGALI block used for Bangla-Asamiyā-Manipuri in UNICODE has 93 entries. This section details the code-point repertoire that the Neo-Brāhmī Generation Panel [NBGP] proposes to be included in the Bangla LGR.
It may be mentioned here that the Government of Assam submitted a proposal to the Bureau of Indian Standards (BIS) on 26th February 2016 for dis-unification of Bangla and Asamiyā scripts. The BIS, in its 8th Meeting of the Indian Language Technologies and Products Sectional Committee, LITD 20, held on 23rd Aug 2017, decided to refer the proposal for recognition of Assamese script in ISO/IEC 10646 to ISO. Until the UNICODE Consortium takes any further action, it will be assumed that the Code Point Repertoire under Table 11 will be valid for all the three languages as above.
For each of the code points, language references have been given in the last column, titled Reference, under Table 8, titled the "Code Point Repertoire". For entire coverage of Bangla code points, references of Bangla, Asamiyā (Assamese), Manipuri (Meitei) and Bishnupriya are given. Kokborok, written in Bangla script, is not known to have introduced many new complications, except for one particular character. Though only a few representative languages under EGIDS Scale 1-4 have been chosen for referencing, they together cover all the code points required for all the languages that NBGP has considered, as given under Bangla Unicode Points (as given in UNICODE 6.3).
However, before the details are presented, it is ideal to look at the Bangla Code Point Chart from Maximal Starting Repertoire [MSR] Version 3. It may be noted that the shapes of the reference glyphs given below in the code charts are based on one of the many fonts designed and are not prescriptive, because there could be some variations in actual fonts, both UNICODE-compatible and True-Type ones. Consider the following code point table:
Figure 1: Bangla Code Page from [MSR] for Bangla-Asamiyā-Manipuri
Colour convention:
All characters that are included in the [MSR] - Yellow background
PVALID in IDNA2008 but excluded from the [MSR] - Pinkish background
Not PVALID in IDNA2008, or ineligible for the root zone (digits, hyphen) - White background
Given the Bangla Unicode Block as in Figure 1, among the code points that are included in the MSR, the following symbols will need a separate treatment:
ৎ U+09CE Bangla Letter Khanda-Ta
ৰ U+09F0 Asamiyā-Bangla Letter Ra With Middle Diagonal
ৱ U+09F1 Asamiyā-Bangla Letter Ra With Lower Diagonal
5.1 Code Point Repertoire Inclusion

No | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
1 | U+0981 | ঁ | BENGALI SIGN CANDRABINDU | Candra-bindu | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
2 | U+0982 | ং | BENGALI SIGN ANUSVARA | Onushshar (Anusvara) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
3 | U+0983 | ঃ | BENGALI SIGN VISARGA | Biśarga (Visarga) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
4 | U+0985 | অ | BENGALI LETTER A | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
5 | U+0986 | আ | BENGALI LETTER AA | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
6 | U+0987 | ই | BENGALI LETTER I | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
7 | U+0988 | ঈ | BENGALI LETTER II | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
8 | U+0989 | উ | BENGALI LETTER U | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
9 | U+098A | ঊ | BENGALI LETTER UU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
10 | U+098B | ঋ | BENGALI LETTER VOCALIC R | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
11 | U+098F | এ | BENGALI LETTER E | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
12 | U+0990 | ঐ | BENGALI LETTER AI | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
13 | U+0993 | ও | BENGALI LETTER O | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
14 | U+0994 | ঔ | BENGALI LETTER AU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
15 | U+0995 | ক | BENGALI LETTER KA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
16 | U+0996 | খ | BENGALI LETTER KHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
17 | U+0997 | গ | BENGALI LETTER GA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
18 | U+0998 | ঘ | BENGALI LETTER GHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
19 | U+0999 | ঙ | BENGALI LETTER NGA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
20 | U+099A | চ | BENGALI LETTER CA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
21 | U+099B | ছ | BENGALI LETTER CHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
22 | U+099C | জ | BENGALI LETTER JA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
23 | U+099D | ঝ | BENGALI LETTER JHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
24 | U+099E | ঞ | BENGALI LETTER NYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
25 | U+099F | ট | BENGALI LETTER TTA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
26 | U+09A0 | ঠ | BENGALI LETTER TTHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
27 | U+09A1 | ড | BENGALI LETTER DDA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
28 | 09A1 09BC (U+09DC) | ড় | Normalized form of BENGALI LETTER RRA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125] 09DC is the preferred code point; however, it is not available for LGR as per the standards governing this LGR development
29 | U+09A2 | ঢ | BENGALI LETTER DDHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
30 | 09A2 09BC (U+09DD) | ঢ় | Normalized form of BENGALI LETTER RHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125] 09DD is the preferred code point; however, it is not available for LGR as per the standards governing this LGR development
31 | U+09A3 | ণ | BENGALI LETTER NNA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
32 | U+09A4 | ত | BENGALI LETTER TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
33 | U+09A5 | থ | BENGALI LETTER THA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
34 | U+09A6 | দ | BENGALI LETTER DA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
35 | U+09A7 | ধ | BENGALI LETTER DHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
36 | U+09A8 | ন | BENGALI LETTER NA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
37 | U+09AA | প | BENGALI LETTER PA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
38 | U+09AB | ফ | BENGALI LETTER PHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
39 | U+09AC | ব | BENGALI LETTER BA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
43 | 09AF 09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125] 09DF is the preferred code point; however, it is not available for LGR as per the standards governing this LGR development
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112] [125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
50 | U+09BE | া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
51 | U+09BF | ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
52 | U+09C0 | ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
53 | U+09C1 | ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112] [122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (= Halant), Virama (= Da ri) | 1 Bangla, 2 Assamese, 2 Manipuri | [112] [122] [125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122] [125]
Table 8: Bangla Code-Point Repertoire
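The repertoire in Table 8 can be sketched as a simple membership test. The snippet below is a minimal illustration only, not the normative XML ruleset: the names `REPERTOIRE`, `NUKTA_BASES` and `in_repertoire` are hypothetical, and the set is transcribed from the 61 single code points and three nukta sequences of Table 8.

```python
# Single code points transcribed from Table 8 (illustrative, not normative).
REPERTOIRE = (
    set(range(0x0981, 0x0984))           # candrabindu, anusvara, visarga
    | set(range(0x0985, 0x098C))         # independent vowels A .. VOCALIC R
    | {0x098F, 0x0990, 0x0993, 0x0994}   # E, AI, O, AU
    | set(range(0x0995, 0x09A9))         # consonants KA .. NA
    | set(range(0x09AA, 0x09B1))         # consonants PA .. RA
    | {0x09B2}                           # LA
    | set(range(0x09B6, 0x09BA))         # SHA .. HA
    | set(range(0x09BE, 0x09C5))         # vowel signs AA .. VOCALIC RR
    | {0x09C7, 0x09C8, 0x09CB, 0x09CC}   # vowel signs E, AI, O, AU
    | {0x09CD, 0x09CE, 0x09F0, 0x09F1}   # virama, khanda ta, two RA-with-diagonal letters
)

NUKTA = 0x09BC
# Rows 28, 30 and 43: nukta is admitted only after DDA, DDHA or YA.
NUKTA_BASES = {0x09A1, 0x09A2, 0x09AF}


def in_repertoire(label: str) -> bool:
    """Return True if every code point of the label is covered by Table 8."""
    prev = None
    for ch in label:
        cp = ord(ch)
        if cp == NUKTA:
            if prev not in NUKTA_BASES:
                return False
        elif cp not in REPERTOIRE:
            return False
        prev = cp
    return True
```

This only checks which code points appear; it says nothing about whether their order forms a valid label, which is the job of the whole-label evaluation rules in Section 7.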
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes some specific sequences which enable conditional inclusion of the Bangla LETTER A and E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, again followed by Bangla VOWEL SIGN AA, in the repertoire, to represent the æ sound as in English 'bat', 'cat', etc.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
S1 | 0985 09CD 09AF 09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112] [122]
S2 | 098F 09CD 09AF 09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9: Sequences
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find a place in Unicode but have not been included in the repertoire in this LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.

Sr. No. | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10: Excluded Code Points
5.3 Code Point Not Used Alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire since it will never be used alone. It will be used only in sequences, in the normalized forms of three special characters: ড়, ঢ় and য়.

Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড়, ঢ় and য় respectively
Table 10b: Excluded Code Points
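Why the nukta appears only inside sequences can be illustrated with Unicode normalization. The sketch below assumes that the "standards governing this LGR development" refer to the NFC requirement of IDNA2008 (an assumption, not stated in this section); it uses only Python's standard `unicodedata` module, and the helper name `normalized_sequence` is hypothetical.

```python
import unicodedata

# The three precomposed letters named in rows 28, 30 and 43 of Table 8.
PRECOMPOSED = {"\u09DC": "RRA", "\u09DD": "RHA", "\u09DF": "YYA"}


def normalized_sequence(ch: str) -> str:
    """Return the NFC form of a character as space-separated hex code points."""
    return " ".join(f"{ord(c):04X}" for c in unicodedata.normalize("NFC", ch))


for ch, name in PRECOMPOSED.items():
    # Each precomposed letter normalizes to its base letter plus U+09BC.
    print(name, f"U+{ord(ch):04X}", "->", normalized_sequence(ch))
```

Because these three letters are excluded from recomposition under NFC, a normalized label can only ever contain the two-code-point form, which is why U+09BC is listed as part of a sequence rather than on its own.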
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for .भारत or .ভারত, drafted between 20.11.2009 and 18.07.2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) began with the following variables:

C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta

The Augmented Backus-Naur Formalism (ABNF) will use the following operators:

Sr. Number | Operator | Function
1 | "|" | Alternative
2 | "[ ]" | Optional
3 | "*" | Variable Repetition
4 | "( )" | Sequence Group
Table 11: The ABNF Formalism
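The variables above can be read as a classification of code points. The following is a minimal sketch of that reading, assuming the input has already been restricted to the repertoire of Table 8 (the ranges below would otherwise also accept excluded code points); `abnf_class` and `to_classes` are hypothetical helper names.

```python
def abnf_class(cp: int) -> str:
    """Map one repertoire code point to its ABNF variable (C, V, M, B, D, X, H, Z)."""
    if cp == 0x0981:
        return "D"  # candrabindu
    if cp == 0x0982:
        return "B"  # anusvara
    if cp == 0x0983:
        return "X"  # visarga
    if 0x0985 <= cp <= 0x0994:
        return "V"  # independent vowels
    if 0x0995 <= cp <= 0x09B9 or cp in (0x09F0, 0x09F1):
        return "C"  # consonants
    if 0x09BE <= cp <= 0x09CC:
        return "M"  # dependent vowel signs (kara / matra)
    if cp == 0x09CD:
        return "H"  # hasanta / virama
    if cp == 0x09CE:
        return "Z"  # khanda ta
    raise ValueError(f"U+{cp:04X} has no ABNF class in this sketch")


def to_classes(label: str) -> str:
    """Translate a label into a string of ABNF variables, e.g. 'CMBCM'."""
    out = []
    for ch in label:
        cp = ord(ch)
        if cp == 0x09BC:
            # Nukta folds into the preceding consonant (normalized RRA, RHA, YYA).
            if not out or out[-1] != "C":
                raise ValueError("nukta must follow a consonant")
            continue
        out.append(abnf_class(cp))
    return "".join(out)
```

For instance, the word বাংলা maps to the class string `CMBCM`, which can then be matched against the sequence patterns described in the following subsections.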
5.4.2 The Vowel Sequence
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given. To facilitate understanding for users of other Brāhmī scripts, equivalents in Devanāgarī are provided wherever necessary.

A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra / onuʃʃār (B), a Candrabindu (D) or a Visarga / biʃɔrgo (X). The number of D, B or X which can follow a V in Bangla is not necessarily restricted to one. Going by the rules illustrated in the document, it is clear that repeated formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have sequences that combine a candrabindu with an anusvāra on one hand, and a candrabindu with a visarga on the other, as in হ্যাংচা 'hænchā' and হ্যাঃ 'hæn' respectively. The possibility of a Visarga or Anusvāra (onuʃʃār) following a Candrabindu exists in Bangla.

A vowel can optionally be followed by a combination of Hasanta / Virama [H] and Consonant [C] to form a Ya-phala. Ya-phala is a presentation form of U+09AF, the Bangla letter য or 'ya'. Represented by the sequence <U+09CD ্ BENGALI SIGN VIRAMA (i.e. Bangla SIGN Hasanta or VIRAMA), U+09AF য BENGALI LETTER YA>, Ya-phala has a special form (্য). Again, when combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. 'aa' (ā)), it is used for transcribing [æ], as in the "a" in the English word "bat", written in Bangla as ব্যাট. A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example: V: অ (अ)

5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].

Examples:
VB    অং    अं
VD    অঁ    अँ
VX    অঃ    अः
VDB   অঁং   अँं
VDX   অঁঃ   अँः
VHCM  অ্যা, এ্যা

5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].

Examples:
VHCMB   অ্যাং, এ্যাং
VHCMD   অ্যাঁ, এ্যাঁ
VHCMX   অ্যাঃ, এ্যাঃ
VHCMDB  অ্যাঁং, এ্যাঁং
VHCMDX  অ্যাঁঃ, এ্যাঁঃ
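The vowel-sequence combinations listed above can be condensed into one pattern over ABNF class strings. This is a sketch under the assumption that a label has already been translated into variables (V, H, C, M, D, B, X); the names `VOWEL_SEQUENCE` and `is_vowel_sequence` are hypothetical. It does not check that the HCM part is specifically Ya-phalā plus ā-kāra, which the proposal requires.

```python
import re

# V, optionally followed by HCM, then optionally D, then optionally B or X.
# This covers V, VB, VD, VX, VDB, VDX and the same endings after VHCM.
VOWEL_SEQUENCE = re.compile(r"V(HCM)?D?[BX]?")


def is_vowel_sequence(classes: str) -> bool:
    """Return True if the class string is one of the listed vowel sequences."""
    return VOWEL_SEQUENCE.fullmatch(classes) is not None
```

Repeated signs such as `VBB` or a visarga before a candrabindu (`VXD`) fall outside the pattern, matching the combinations enumerated in 5.4.2.2 and 5.4.2.3.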
5.4.3 The Consonant Sequence

5.4.3.1 A Single Consonant (C)
Example: C: ক (क)

5.4.3.2 A Consonant with Conditions
A Consonant may optionally be followed by a dependent vowel sign, kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].

Examples:
CM    কি    कि
CB    কং    कं
CD    কঁ    कँ
CX    কঃ    कः
CH    ক্    क् (pure consonant)
CDB   কঁং   कँं
CDX   কঁঃ   कँः

5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB or DX.

Examples:
CMB   কীং   कीं
CMD   কাঁ   काँ
CMX   বীঃ   वीः
CMDB  কাঁং  काँं
CMDX  কাঁঃ  काँः
5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):

*3(CH)C

Examples:
CHC      ন্ত → ন + ্ + ত    (न + ् + त)
CHCHC    ন্ত্র → ন + ্ + ত + ্ + র    (न + ् + त + ् + र)
CHCHCHC  ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য    (न + ् + त + ् + र + ् + य)

5.4.3.5 Subsets
While considering its subsets, we will take the combination CHC only as a representative example; however, the same is equally applicable to CHCHC and CHCHCHC.

[A] The combination may be followed by M, B, D, X, DB or DX.

Examples:
CHCM   ক্কী → ক ্ ক ী      क्की → क ् क ी
CHCB   ক্কং → ক ্ ক ং      क्कं → क ् क ं
CHCD   ক্কঁ → ক ্ ক ঁ      क्कँ → क ् क ँ
CHCX   ক্কঃ → ক ্ ক ঃ      क्कः → क ् क ः
CHCDB  ক্কঁং → ক ্ ক ঁ ং   क्कँं → क ् क ँ ं
CHCDX  ক্কঁঃ → ক ্ ক ঁ ঃ   क्कँः → क ् क ँ ः

[B] *3(CH)CM may further be followed by B, D, X, DB or DX.

Examples:
CHCMB   ক্কীং → ক ্ ক ী ং      क्कीं → क ् क ी ं
CHCMD   ক্কাঁ → ক ্ ক া ঁ      क्काँ → क ् क ा ँ
CHCMX   ক্কীঃ → ক ্ ক ী ঃ      क्कीः → क ् क ी ः
CHCMDB  ক্কাঁং → ক ্ ক া ঁ ং   क्काँं → क ् क ा ँ ं
CHCMDX  ক্কাঁঃ → ক ্ ক া ঁ ঃ   क्काँः → क ् क ा ँ ः
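The consonant-sequence combinations of 5.4.3 can likewise be sketched as a single pattern over ABNF class strings. The names `CONSONANT_SEQUENCE` and `is_consonant_sequence` are hypothetical, and the pattern is one reading of the lists above: a bare consonant with Hasanta (CH), or up to four consonants joined by Hasanta, optionally followed by M, then D, then B or X.

```python
import re

# "CH" is the pure consonant of 5.4.3.2; "(CH){0,3}C" is *3(CH)C from 5.4.3.4,
# and "M?D?[BX]?" covers the endings M, B, D, X, DB, DX with or without M.
CONSONANT_SEQUENCE = re.compile(r"CH|(CH){0,3}CM?D?[BX]?")


def is_consonant_sequence(classes: str) -> bool:
    """Return True if the class string is one of the listed consonant sequences."""
    return CONSONANT_SEQUENCE.fullmatch(classes) is not None
```

A cluster of five consonants (`CHCHCHCHC`) or a doubled vowel sign (`CMM`) is rejected, in line with the "up to 4" limit and the single-M restriction.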
5.4.4 The Khanda-Ta Sequence

5.4.4.1 A Single 'Khanda'-Ta (Z)
Example: Z: ৎ

5.4.4.2 A Khanda Ta Combination¹⁰
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):

[CH]Z

Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā), 'scolding'.
Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
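The [CH]Z pattern with its restriction on C can be written directly over code points. This is a small sketch, with the hypothetical names `KHANDA_TA_SEQUENCE` and `is_khanda_ta_sequence`; it covers only the Khanda-Ta unit itself, not the surrounding label.

```python
import re

# Optional (RA + Hasanta), where RA is U+09B0 or U+09F0, followed by Khanda Ta U+09CE.
KHANDA_TA_SEQUENCE = re.compile("(?:[\u09B0\u09F0]\u09CD)?\u09CE")


def is_khanda_ta_sequence(text: str) -> bool:
    """Return True for a bare Khanda Ta or RA + Hasanta + Khanda Ta."""
    return KHANDA_TA_SEQUENCE.fullmatch(text) is not None
```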
5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) can be described briefly here. Let us take up S first. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E). To render Ya-phalā after অ or এ, it is necessary to type U+09CD plus U+09AF (ya) preceded by the said vowel. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English 'bat', 'fat', etc. The Brāhmī script by nature does not have a Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can be marked with a Hasanta. But as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).

The other case, mentioned in Table 16 as P, refers to the formation of repha and ra-phalā in the script. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra "cycle"). The point is that in both cases the slot for ra could be filled by the Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.

10 Refer to Rule P in Section 7, Table 16.
6 Variants
This section talks about the variants in the Bangla script. The NBGP categorizes these confusable variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community

For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These are not cases of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formation. They can cause confusion even to a careful observer and are hence being proposed as variants.

Variants are generated in a script when two or more forms are formed with different storage or code points. In Bangla, the e-kāra, ā-kāra and the o-kāra have different code points. One can type o with a consonant at one go, or type e-kāra and ā-kāra as two separate keys, getting the same result. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered a case of variant, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants
However, we propose two cases of true in-script variants in the Bangla script.

CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases wherein (U+09A5) থ (tha) appears with Hasanta as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)

The above combinations, if written in traditional orthography, could be a little confusing, since the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final position (as shown below in example no. 1). It could also be typed wrongly, thinking it was a হ (ha) U+09B9, increasing the risks in label writing and identification.

Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)

Fonts which represent the traditional Bangla writing system could tend to create this problem. Therefore, these may be taken as cases of variants in Bangla.

CASE II
Another interesting example of variants is encountered in the ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra). 'Repha' could be formed by two sequences (mainly because both Assamese and Bangla find a place in the same UNICODE points, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here the final ligatures look the same and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C

Where:
B_RA = U+09B0 BENGALI LETTER RA (র), or
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)

Example:
Sequence 1 (Using Bangla RA) | Ligature 1 | Sequence 2 (Using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12: Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. Addition of Repha does not make any difference.

'Ra-phala' could be formed by two sequences on similar grounds, and the final ligatures would look the same:

(1) C1 + H + B_RA
(2) C1 + H + A_RA

Where C1 = any consonant except Khanda-ta

Example:
Sequence 1 (Using Bangla RA) | Ligature 1 | Sequence 2 (Using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusability for end users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP concluded to define র and ৰ as variant code points, so that a single variant set between র and ৰ covers all cases. This will, however, create blocked variant labels: e.g. if someone registers "ররর", the variant label "ৰৰৰ" will be generated as a variant and will be blocked, and vice versa. However, it is only blocked at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, that is still possible.
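The effect of declaring র and ৰ as variant code points can be sketched as follows. This is an illustration of the general idea of code-point variants, not the LGR tooling itself: `variant_key` and `variant_labels` are hypothetical helpers, and folding every ৰ to র is just one convenient way to compare labels.

```python
from itertools import product

BANGLA_RA = "\u09B0"    # র
ASSAMESE_RA = "\u09F0"  # ৰ


def variant_key(label: str) -> str:
    """Fold Assamese RA to Bangla RA; two labels are variants if their keys match."""
    return label.replace(ASSAMESE_RA, BANGLA_RA)


def variant_labels(label: str) -> set:
    """Enumerate every other label obtained by swapping র and ৰ at any position."""
    choices = [
        (BANGLA_RA, ASSAMESE_RA) if ch in (BANGLA_RA, ASSAMESE_RA) else (ch,)
        for ch in label
    ]
    return {"".join(combo) for combo in product(*choices)} - {label}
```

Under this sketch, registering "ররর" yields seven variant labels (including "ৰৰৰ" and the mixed forms), while a label of a different length such as "ৰৰ" has a different key and remains available.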
6.2 Cross-Script Variants
A crisp cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia¹¹ (formerly Oriya), keeping in mind the visual and technical confusions they may cause as labels on the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain characters which are somewhat similar, they have not been included here. They have been provided in the Appendix (10.2) for reference.

11 Unicode uses Oriya for the script although Odia is now the official term used.

1. Bangla and Nāgarī / Devanāgarī Script

Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14 - Bangla and Devanāgarī cross-script variant code points

2. Bangla and Gurmukhī Script

Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15 - Bangla and Gurmukhī cross-script variant code points
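Tables 14 and 15 amount to a small mapping from Bangla code points to their counterparts in the other scripts. A minimal sketch of how such a mapping could be applied to a whole label is given below; the dictionary layout and the function name `cross_script_variants` are assumptions made for illustration.

```python
# Cross-script variant code points transcribed from Tables 14 and 15.
CROSS_SCRIPT_VARIANTS = {
    "Devanagari": {"\u09AE": "\u092E", "\u09BF": "\u093F"},
    "Gurmukhi": {"\u09AE": "\u0A38", "\u09BF": "\u0A3F"},
}


def cross_script_variants(label: str) -> dict:
    """Return, per script, the variant label if every code point has a variant there."""
    result = {}
    for script, mapping in CROSS_SCRIPT_VARIANTS.items():
        if label and all(ch in mapping for ch in label):
            result[script] = "".join(mapping[ch] for ch in label)
    return result
```

Only labels made up entirely of the listed code points (for example মি) have a whole-label variant in another script; any label containing another Bangla letter has none.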
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in Bangla¹² script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications. Below are the symbols used in the WLE rules, based on the Indic Syllabic Category mentioned in the table provided in the code point repertoire (Section 5.1).

C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), or (a/e) Ya-phalā (V1 H C1 M1), where:
    V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E)
    H is 09CD (্ - BENGALI SIGN VIRAMA)
    C1 is 09AF (য - BENGALI LETTER YA)
    M1 is 09BE (া - BENGALI VOWEL SIGN AA)
    S1 and S2 are valid even though they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where:
    C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL)
    H is 09CD (্ - BENGALI SIGN VIRAMA)
Table 16 - Symbols used in WLE rules

12 As used by the Unicode, denoting and including both Assamese and Maṇipuri.
It is also perhaps ideal to mention here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters" that could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, Pg 4]. More details of such combinations can be seen in Pabitra Sarkar (1993) [135].

7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ় and য়.
2. H must be preceded by C.
   Example: ক্
3. M must be preceded by C.
   Example: কা
4. D must be preceded by either of V, C or M.
   Examples: আঁ, খঁ, খাঁ, হাঁ
5. X must be preceded by either of V, C, M or D.
   Examples: উঃ, খঃ
6. B must be preceded by either of V, C, M or D.
   Examples: আং, ইং, কং
7. Z must be preceded by V, C, M, D, B, X or P.
   Examples: ইৎ, কৎ
   (S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V Preceded by H.
9. S CANNOT be preceded by H.
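The nine rules above can be sketched as a left-to-right check in which each code point is tested against the class of what precedes it. The code below is a simplified illustration, not the normative XML ruleset: it assumes the label has already passed the repertoire check (the class ranges would otherwise accept excluded code points), it treats S as ending in M for the purposes of the following sign, and all names (`tokenize`, `passes_wle`, `ALLOWED_BEFORE`) are hypothetical.

```python
S_SEQS = ("\u0985\u09CD\u09AF\u09BE", "\u098F\u09CD\u09AF\u09BE")  # S1, S2 (Table 9)
NUKTA_SEQS = ("\u09A1\u09BC", "\u09A2\u09BC", "\u09AF\u09BC")     # rule 1: CN forms
RA = {"\u09B0", "\u09F0"}                                          # C2 of rule P

# Rules 2-7: which class may immediately precede each sign.
ALLOWED_BEFORE = {
    "H": {"C"},
    "M": {"C"},
    "D": {"V", "C", "M"},
    "X": {"V", "C", "M", "D"},
    "B": {"V", "C", "M", "D"},
    "Z": {"V", "C", "M", "D", "B", "X"},
}


def classify(ch):
    cp = ord(ch)
    if cp in (0x0981, 0x0982, 0x0983):
        return "DBX"[cp - 0x0981]
    if 0x0985 <= cp <= 0x0994:
        return "V"
    if 0x0995 <= cp <= 0x09B9 or cp in (0x09F0, 0x09F1):
        return "C"
    if 0x09BE <= cp <= 0x09CC:
        return "M"
    return {0x09CD: "H", 0x09CE: "Z"}.get(cp)


def tokenize(label):
    """Split a label into (class, text) tokens; S and nukta forms are single tokens."""
    tokens, i = [], 0
    while i < len(label):
        for seq in S_SEQS:
            if label.startswith(seq, i):
                tokens.append(("S", seq))
                i += len(seq)
                break
        else:
            if label[i:i + 2] in NUKTA_SEQS:
                tokens.append(("C", label[i:i + 2]))
                i += 2
            else:
                cls = classify(label[i])
                if cls is None:
                    raise ValueError(f"U+{ord(label[i]):04X} is outside this sketch")
                tokens.append((cls, label[i]))
                i += 1
    return tokens


def passes_wle(label):
    tokens = tokenize(label)
    prev = None  # class of the previous token; None at the start of the label
    for idx, (cls, _text) in enumerate(tokens):
        if cls in ALLOWED_BEFORE:
            ok = prev in ALLOWED_BEFORE[cls]
            if cls == "Z" and prev == "H":
                # Rule 7 with P: Khanda Ta may follow RA + Hasanta.
                ok = idx >= 2 and tokens[idx - 2][1] in RA
            if not ok:
                return False
        elif cls in ("V", "S") and prev == "H":
            return False  # rules 8 and 9
        prev = "M" if cls == "S" else cls  # S ends with M
    return True
```

For example, ভর্ৎসনা passes because the Khanda Ta follows ra + Hasanta, whereas a Hasanta directly after a bare vowel fails unless it is part of S1 or S2.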
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in adding such ligatures: any user can easily create them, even though they may not be attested combinations.

7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া /bæŋk ʌv ɪndiə/ (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE) (meaning "Bank of India"). This is a case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.

This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-Joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise V is never required to be allowed to follow an H. Permitting this may create a perceptual similarity between two labels (with and without H) for the majority of the linguistic communities; hence it is explicitly prohibited by the NBGP. In future, if required, depending on the prevailing requirements from the community, the future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF

Below are a few examples which help one understand some of the rules the ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.

1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Examples:
্ক   ्क
াক   ाक
ংক   ंक
ঁক   ँक
ঃক   ःक
As can be seen, such a combination will automatically result in a "golu", or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is applied quasi-automatically wherever supported by the OS.

2. H is not permitted after V, B, D, X, M, S.
Examples:
অ্    अ्
অং্   कं्
কঁ্   कँ्
কঃ্   कः्
কি্   कि्

3. The number of B, D or X permitted after a Consonant, a Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalid.
Examples:
কংং    कंं
কঁঁ    कँँ
কঃঃ    कःः
কাঁঁ   काँँ
কীঃঃ   कीःः
অংং    अंं
অঁঁ    अँँ
অঃঃ    अःः

4. The number of M permitted after a Consonant is restricted to one.
Example:
কীী   कीी

5. M is not permitted after V.
Examples:
ইা, ঈৗ   ईा, ईौ

6. The combinations of Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Examples:
কংঃ   कंः
কঃং   कःं
8 Contributors

8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members

8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals' Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje; Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920, 2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. 'Bangla Harapher Panch Parba'. In Swapan Chakraborty, ed., Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (Jointly Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the M. S. University of Baroda, Social Science Number 292173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds., Language Planning: Proceedings of an Institute. Mysore: CIIL, 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25.11.2017.
[124] Ethnologue: Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25.11.2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20.10.2019.
[126] Wikipedia: Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25.11.2017.
[129] Omniglot: Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10.5.2018.
[130] Wikipedia: Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25.11.2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20, accessed on 10.5.2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10.5.2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10.5.2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1123-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 121-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). 'Srīhatta-Nāgarī Lipir Utpatti o Bikāś'. Bangla Academy Patrika (Dhaka), Vol 12 (Bhadra-Agrahayan 1364 Bangabda Number), pg 1.
[138] Wikipedia: Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19.5.2018.
[139] Furui, Ryosuke (2015). 'Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century', in Bhairabi Prasad Sahu & Hermann Kulke, eds., Interrogating Political Systems: Integrative Processes and States in Pre-Modern India, Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). 'Phones of Bengali'. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646. UNICODE. https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia: Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. 'Bangla Spelling Reform: the Long and Short of It'. Bangla Journal 19215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. "What has happened So Far In terms of Script Reforms". Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka, on 10.07.2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 – Core Specification, Chapter 12, p. 473.
10 Appendix-I

10.1 Augmented Backus-Naur Formalism (ABNF)

The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when applied to a specific language/script certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla¹³ in particular, the following rules apply:

1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold 'varga' (as defined under Table 5). Moreover, Bangla does not normally allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem 'Lichuchor' written by Kazi Nazrul Islam. There are, however, instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).

2. CH can come with Khaṇḍa ta only in the case where C is ra (র) (09B0): র্ৎ, as in ভর্ৎসনা.

3. Only the following combinations with VHCM will be allowed:
   → অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
   → এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
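Rule 1 above can be sketched as a simple label-initial check. This is only an illustrative sketch: the function name `starts_with_forbidden` and the constant names are hypothetical and are not taken from the LGR XML file.

```python
# Code points named in rule 1 of Appendix-I (hypothetical constant names).
KHANDA_TA = "\u09CE"  # ৎ BENGALI LETTER KHANDA TA
NYA = "\u099E"        # ঞ BENGALI LETTER NYA
NGA = "\u0999"        # ঙ BENGALI LETTER NGA
FORBIDDEN_INITIALS = {KHANDA_TA, NYA, NGA}


def starts_with_forbidden(label: str) -> bool:
    """Return True if the label begins with ৎ, ঞ or ঙ."""
    return bool(label) and label[0] in FORBIDDEN_INITIALS
```

A label such as ৎসেরিং would be flagged by this check, while বাংলা would pass.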
10.2 'Sylheti Nāgarī lipi' or 'Siloti'

This version of Bangla script resembles the 'Kaithī' script (ISO 15924) used by the Accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, widely in use during the 1880s. There were several other names of Sylheti
13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
Nāgarī or Siloti (129), such as 'Jalalabada Nāgarī', 'Fula (flower) Nāgarī', 'Muslim Nāgarī' or 'Muhammad Nāgarī'. It is said that Shah Jalal had brought the script with him in the 13th-14th century to Sylhet (138), although some suggested that it was an invention by the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script, with 32 symbols, are reproduced here (138).

Table 17: The Script Table of Sylheti Nāgarī or Siloti
10.3 Confusable code points

The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.

10.3.1 Bangla and Nāgarī or Devanāgarī

Bangla        Devanāgarī    NBGP Decision
ঃ U+0983      ः U+0903      Confusable
ও U+0993      उ U+0909      Confusable
ঘ U+0998      घ U+0918      Confusable
◌ঁ U+0981     ◌ॅ U+0945     Confusable

Table 18: Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhi

Bangla        Gurmukhi      NBGP Decision
ঘ U+0998      ਬ U+0A2C      Confusable
◌ঁ U+0981     ◌ੱ U+0A71     Confusable

Table 19: Bangla and Gurmukhi confusable code points

Bangla                    Gurmukhi      NBGP Decision
ও U+0993                  ਤ U+0A24      Distinguishable
শ U+09B6                  ਅ U+0A05      Distinguishable
ম U+09AE                  ਮ U+0A2E      Distinguishable
বা U+09AC and U+09BE      ਗ U+0A17      Distinguishable

Table 20: Bangla and Gurmukhi distinguishable code points

10.3.3 Bangla and Oriya (Odia)

Bangla        Oriya (Odia)  NBGP Decision
ও U+0993      ଓ U+0B13      Confusable

Table 21: Bangla and Oriya confusable code points

Bangla        Oriya (Odia)  NBGP Decision
ঘ U+0998      ସ U+0B38      Distinguishable

Table 22: Bangla and Oriya distinguishable code points
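The confusable pairs in Tables 18, 19 and 21 can be inspected programmatically. The sketch below uses Python's standard `unicodedata` module to print the official character names; the list name `CONFUSABLE_PAIRS` and the helpers `describe` and `script_of` are hypothetical. The first word of a Unicode character name is used here only as a rough stand-in for the script.

```python
import unicodedata

# (Bangla code point, other-script code point) pairs marked "Confusable"
# in Tables 18, 19 and 21.
CONFUSABLE_PAIRS = [
    (0x0983, 0x0903), (0x0993, 0x0909), (0x0998, 0x0918), (0x0981, 0x0945),
    (0x0998, 0x0A2C), (0x0981, 0x0A71),
    (0x0993, 0x0B13),
]


def describe(cp: int) -> str:
    """Format a code point as 'U+XXXX NAME'."""
    return f"U+{cp:04X} {unicodedata.name(chr(cp))}"


def script_of(cp: int) -> str:
    """Rough script label: the first word of the Unicode character name."""
    return unicodedata.name(chr(cp)).split()[0]


if __name__ == "__main__":
    for bangla, other in CONFUSABLE_PAIRS:
        print(describe(bangla), "<->", describe(other))
```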
11 Appendix-II: Bengali consonants and their allographs

[Columns "Allographs / Clusters" and "Transparent Form (Bangla Akademi font)": the cluster glyphs are set in a custom font and are not reproducible as text; the consonants, their phonetic values and the accompanying notes follow.]

Consonant   Phonetic value   Notes
প           p
ফ           pʰ
ব           b
ভ           bʱ
ত           t                There is a marked form of ত + ্ = ৎ, as in র্ৎ.
থ           tʰ
দ           d
ধ           dʱ
ট           ʈ
ঠ           ʈʰ
ড           ɖ
ঢ           ɖʱ
চ           tʃ
ছ           tʃʰ
জ           dʒ
ঝ           dʒʱ              Not privileged enough to have clusters as a first member.
ক           k
খ           kʰ               Not privileged enough to have clusters as a first member.
গ           g
ঘ           gʱ
ঞ           -                This letter does not have any particular phonetic value, but is mostly pronounced as n.
ণ           n
ঙ, ং        ŋ                In some contexts ঙ is replaced by ং, as in কং, অং.
ম           m
ন           n
শ           ʃ
ষ           ʃ
স           s & ʃ
হ           h
ড়           ɽ
ঢ়           ɽʱ               Not privileged enough to have clusters.
য           dʒ               The secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word, it is a vowel æ (as in শ্যামল, র‍্যাপার etc.). But after a non-initial consonant it just doubles it in pronunciation (as in কার্য, ধার্য etc.). The র + য combination has two physical manifestations: র‍্য and র্য. [Just র‍্য is never used in Bangla orthography; র‍্যা is, but then its last two symbols, ya-phala and ā-kāra, constitute a vowel sign representing the vowel অ্যা.]
র           r                Two manifestations: (i) রেফ repʰ as the first member of a cluster, e.g. র্প, র্ৎ, র্দ্র, র্য, র্ধ্ব (র + ধ + ব) (earlier র্দ্ধ্ব = র + দ + ধ + ব, a four-term cluster) etc. (placed over the following consonant); (ii) র-ফলা rɔ-pʰɔla as the second/third member of a cluster (placed under the consonant it follows).
ল           l
ঃ           h                h word-finally; word-medially it doubles the pronunciation of the following consonant. Examples: অঃ, কঃ.
অব
8
The overall development of Bangla Script from the Kuṭila-lipi period to Modern Bangla can be seen here in Table 3 ([102 and 146]; see also the web-page in 147).

Table 3: Bangla Script in Different Centuries
3.2 Languages Considered

Below is the tabular representation of the languages using Bangla script that are placed on the EGIDS Scale 1-6 (see 117 for details). Some languages under EGIDS 5 and 6 have also developed their own scripts for printing and publishing. Some had used Bangla script earlier (such as Bodo), or used it in West Bengal at some point of time (Santali), but have later shifted to another writing system. Bodo is now written in Nāgarī or Devanāgarī, and for Santali one uses both Nāgarī/Devanāgarī and Ol-chiki (145). For the purposes of the Bangla LGR, only languages belonging to the EGIDS scale 1 to 4 have been considered. Consider the following table:
EGIDS Scale 1
EGIDS Scale 2
EGIDS Scale 3
EGIDS Scale 4
EGIDS Scale 5
EGIDS Scale 6
Bangla (Bengali)
Santali, Bodo, Riang, Khumi, Mru(ng), Asho
Lepcha, Pnar, Koda, Kora, Chak
Asamiyā (Assamese)
Koch or Rajabansī
Malto or Malpahariya
Manipuri or Meitei
Bisnupriya Manipuri, Kok-Borok (Tripura & Bangladesh)
Chakma, Hajong, Mundari & Kurux (of Bangladesh)
Toto, Rohingya, Tippera, Megam, Tanchangya
Usoi, Limbu, Sadri or Oraon
Bhumij or Mundari, Bawm Chin

Table 4: Main languages in India and Bangladesh that use Bangla Script on the EGIDS Scale
3.3 Notable Features of Bangla Script [150]

The Bangla writing system has certain features that show how it has to be written, or how type-setting in Bangla could be done. This section is followed by a section that explains the code points (and fixed code-point sequences) which show certain distinctive characteristics of Bangla and which make up the Repertoire. The next sections will also cover the 'akshar'-formation rules (ABNF) showing character classes, Whole Label Evaluation (WLE) and Context Rules, as well as In-Script and Cross-Script Variants. Here we present some basic features of the Script and Pronunciation.

• The Bangla script is an alpha-syllabic writing system in which all consonants, as written, are assumed to contain an accompanying 'inherent' vowel (theoretically before or after each consonant). It varies between ɔ and o depending on the position of the consonant in the word. At times these 'assumed' or 'inherent' vowels are not pronounced at all [142].

• Vowels can be written as independent letters, or by using a variety of diacritical marks which are written above, below, before, after, or both before and after the consonant they follow in pronunciation [105].

• All Bangla consonants, when pronounced in isolation, are uttered with an inherent vowel -ɔ; hence ক 'k', খ 'kh' or গ 'g' are usually pronounced as [kɔ], [khɔ] or [gɔ] etc. Phonologically, the Bangla vowel -ɔ corresponds to the Hindi schwa ə.
• When consonants occur together in clusters, special conjunct letters are formed. In printed Bangla many of these consonantal clusters or conjoined consonants are in use. The letters for the consonants other than the final one in the group are generally reduced. But there are a few special conjunct characters which are compounds of the consonant characters, e.g. ক্ (k) + ষ (s) = ক্ষ (ks), ঞ্ (n) + জ (j) = ঞ্জ (nj), জ্ (j) + ঞ (n) = জ্ঞ (jn), হ্ (h) + ম (m) = হ্ম (hm). There are other issues also: র as the second member of a cluster is reduced to a secondary symbol, e.g. প্ (p) + র (r) = প্র (pr), ষ্ (s) + ট্ (t) + র (r) = ষ্ট্র (str) (as in উষ্ট্র ustra "camel"). য (y), when used as a primary symbol, represents jɔ in Bangla. But its secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word, it is a vowel æ (as in শ্যামল (syamala) "green", র‍্যাপার (ryapara) "wrapper" etc.). But after a non-initial consonant it just doubles it in pronunciation (as in কার্য, ধার্য etc.). The র্ (r) + য (y) combination has two renderings: র‍্য (ry) and র্য (ry). In the case of দ্ (d) + ধ (dh), গ্ (g) + ধ (dh) and ন্ (n) + ধ (dh), the shape of the second member is changed, e.g. দ্ধ (ddh), গ্ধ (gdh) and ন্ধ (ndh) respectively. The solitary example of র্ (r) + ঋ (r) = র্ঋ (as in নৈর্ঋত nairrt "south-west"), used mostly in cases of Classical borrowings, shows the use of the secondary symbol of a consonant followed by the primary symbol of a vowel. The inherent vowel only applies to the final consonant of the cluster.

• In consonant clusters many consonants take a completely different form. Some typical examples are ক্ত (kt), ক্র (kr), ক্ষ (ks), গ্ধ (gdh), জ্ঞ (jn), ঞ্চ (nc), ঞ্জ (nj), ত্ত (tt), ন্ত (nt), ন্ধ (ndh), ব্ধ (bdh), ভ্র (bhr), ম্ব (mb), স্ত (st) etc. র has two allographs apart from its full shape: one is 'repha', as found in র্ক (rk), র্প (rp), and another is ra-phala, as in প্র (pr), ক্র (kr). ষ্ণ (s+n) is another one, where the cerebral nasal consonant sign takes a queer shape [151].

• The Bangla script has at least fifty-two primary symbols and quite a few allographs (positional variants of them), corresponding to forty-four phonemes (7 oral and 7 nasal vowels and 30 consonants) (150) or functional speech sounds, with some obvious redundancies, although in one of the first phonemic analyses the number was thought to be thirty-five phonemes [140].
• As mentioned above, in Bangla several graphemic symbols have secondary shapes, technically called 'allographs', with a complementary distribution in each case. These graphs or markings are generally added in the following positions relative to the primary symbol [113]:

1) Below (e.g. কু (ku), ন্ত (nta), কূ (ku), হ্র (hra) etc.)
2) Above (e.g. চঁ (ca), র্ক (rka) etc.)
3) Right side (e.g. কা (ka), কং (kan) etc.)
4) Left side (e.g. কে (ke))
5) Left side and above simultaneously (e.g. কৈ (kai), কি (ki) etc.)
6) Right side and above simultaneously (e.g. কী (kī))
7) Right side and left side simultaneously (e.g. কো (ko))
8) Right side, left side and above simultaneously (e.g. কৌ (kau))
• As for the complementary distribution of vowel letters (word- or syllable-initial) and vowel mātrās, which are relevant for ABNF, let us consider the following. Besides some simple vowel modifiers called 'kārs' in Bangla (also referred to as Mātrā in the other LGR documents of Neo-Brāhmī), there are some combinatory modifiers of Bangla vowels with certain consonants. For example, whereas

আ U+0986 BENGALI LETTER AA is substituted by া U+09BE BENGALI VOWEL SIGN AA,
ই U+0987 BENGALI LETTER I is substituted by the pre-posed ি U+09BF BENGALI VOWEL SIGN I,
ঈ U+0988 BENGALI LETTER II is substituted by ী U+09C0 BENGALI VOWEL SIGN II, and
উ U+0989 BENGALI LETTER U is substituted by ু U+09C1 BENGALI VOWEL SIGN U, by marking below the primary grapheme,

there are some special vowel modifiers of উ, as in the following combined letters:

গু gu, rather than writing as গ (g) + ু (u)
রু ru, rather than writing as র (r) + ু (u)
শু śu, rather than writing as শ (s) + ু (u)
হু hu, rather than writing as হ (h) + ু (u)
ন্তু ntu, rather than writing as ন্ (n) + ত (t) + ু (u)

Similarly, there could be vowel modifiers of ঊ or '(long) ū' as well, e.g. ভ্ (bh) + র (r) (ভ্রূ bhru "eyebrow"), শ্ (s) + র (r) (শ্রূ sru), and of ঋ (r) after হ (h) (হৃ hr), etc.
There have been many notable contributions in simplifying and modifying Bangla spellings and combinatory techniques, especially by scholars such as Pabitra Sarkar (1992) [134]. In this there has been an attempt to reduce the number of allographs of both vowels and consonants in clusters, and it has been widely accepted in the printing of school texts in both Bangladesh and West Bengal [151, 152]. As of now, two systems, the old (traditional) and the new, go on side by side, operative in different domains.

However, in the preparation of this LGR document the aim has been to consider the widely used and usable sequences and combinations, and their variations across the sister scripts belonging to the basket of Brāhmī writing systems. Bangla Academy, Dhaka published Standard Bangla Spelling Rules in 1992, following the recommendations of a committee constituted through a workshop jointly organized by the Jātīya Śikṣākrama and Pāṭhyapustaka Board in 1988. A thoroughly revised edition of the Rules was published in September 2012.⁶

After the establishment of Bāṅlā Ākādemi of West Bengal in 1986, its first President, Annadasankar Ray (1904-2002), in his inaugural address gave a direction for the standardization of the Bangla alphabet, script and spelling system, and clearly argued that they would not blindly follow the Sanskritic model of conventional grammar. A broad list of proposals was sent to experts on Bangla, and a broad agreement was reached for 'homogenization of Bangla spelling' by 1988. Based on opinions received from different quarters, a unanimous list of 'rules' was agreed upon. This was published as a 'Spelling Dictionary' titled Ākādemi Bānāna Abhidhāna (1997), which was obviously more comprehensive than 'The University of Calcutta proposals' made in 1936. Along with the 'rationalization' of spellings, another step was taken to make the writing system easier to read, by making the symbols used, both single and combined ones, more 'transparent'. These reforms were originally suggested by Sarkar (1987, first published in 1978) [134] [153], where he used the terms Swaccha ('Transparent') and Aswaccha ('Opaque' or non-transparent), even adding Ardha Swaccha ('half transparent') in between the two. Some sample examples are:

6 Bangla Academy. 2012. Bāṅlā Ekaḍemī Pramita Bāṅlā Bānānera Niyama (Bangla Academy Standard Bangla Spelling Rules). Dhaka: Bangla Academy.
Transparent: ন্ন (nn), প্ত (pt), স্ত (st), where both members of the cluster can be recognized.

Opaque, where neither of the two can be (easily) recognized: ক্ষ ks (ক্ k + ষ s), জ্ঞ jn (জ্ j + ঞ n), ঙ্গ ng (ঙ্ n + গ g), হ্ম hm (হ্ h + ম m).

Semi-transparent: প্র (pr), র্প (rp), where one symbol is recognizable and the other is not. In the case of three-term clusters, at least one symbol will not be transparent, e.g. স্ত্র str (স্ s + ত্ t + র r), ষ্ট্র str (ষ্ s + ট্ t + র r) etc.
There were in fact two types of proposals. One concerned the shape of the letters: those of consonant + vowel (CV) combinations, and conjuncts, which are consonant + consonant combinations. There were further complex shapes, i.e. those of consonant + consonant + (consonant +) vowel (CC(C)V) signs, as in প্রু (pru) or স্ক্রু (skru). Some decisions in this area were necessary because a few of the CC(C) symbols represented complexities that made learning them difficult for children. The other dealt with the spellings of words only, without any reference to the shapes of the letters in which they were written. The basic objective here was 'one word, one spelling' to the greatest extent that was possible [151].

Below we place a statement of the most salient changes that affect the consonant + vowel combinations [153]:
a. The variants of the short u (হ্রস্ব উ-কার hrasva u-kāra) vowel sign have been brought down to one, i.e. ু. So the special ligature for gu is now written গু. Similarly ru > রু, śu > শু, hu > হু, and therefore, for cluster + short u sign, ntu > ন্তু (ন + ্ + ত + উ), stu > স্তু (স + ্ + ত + উ).

b. The variants of long u (দীর্ঘ ঊ-কার dīrgha ū-kāra) have also been reduced: rū > রূ, bhrū > ভ্রূ (ভ bh + ্ + র r + ঊ ū), drū > দ্রূ (দ d + ্ + র r + ঊ ū), śrū > শ্রূ (শ ś + ্ + র r + ঊ ū).

c. The variants of ঋ-কার (ṛ-kāra, secondary symbol of ṛ) have been brought down to one: hṛ > হৃ.
Regarding consonant + consonant + (consonant) … + (vowel) clusters, Paschimbanga Bangla Akademi proposed transparent or semi-transparent shapes for clusters, to the extent admissible in the Bangla writing system. The clusters concerned, with their components, are [153]:

ব্ধ bdh (ব্ b + ধ dh), দ্ধ ddh (দ্ d + ধ dh), ন্থ nth (ন্ n + থ th), ঞ্চ ñc (ঞ্ ñ + চ c), ঞ্ছ ñch (ঞ্ + ছ), ঞ্জ ñj (ঞ্ ñ + জ j), ক্ত kt (ক্ k + ত t), ক্র kr (ক্ k + র r), গ্ধ gdh (গ্ g + ধ dh), ঙ্ক ṅk (ঙ্ ṅ + ক k), ঙ্গ ṅg (ঙ্ ṅ + গ g), ষ্ণ ṣṇ (ষ্ ṣ + ণ ṇ), ন্ধ্র ndhr (ন্ n + ধ্ dh + র r), ণ্ড্র ṇḍr (ণ্ ṇ + ড্ ḍ + র r), ক্ত্র ktr (ক্ k + ত্ t + র r)
3.3.1 The Consonants

As per traditional classification, Bangla consonants are categorized according to their phonetic properties, especially in terms of place and manner of articulation [107]. There are five 'Varga' (pronounced as 'Barga' in Bangla) or groups (sets or classes) distinguished by place of articulation, and one non-'varga' group [105]. Each Varga, which corresponds to stops at a certain place of articulation, contains a series of five consonants classified as per their phonetic qualities (i.e. manner of articulation), beginning from Unvoiced and Unaspirated to Voiced and Aspirated (in the fourth column), finally ending with a homorganic or corresponding nasal [107]. Consider the following table:
'Varga' or Sets   Unvoiced -Asp     Unvoiced +Asp     Voiced -Asp       Voiced +Asp       Nasal
Velar             ক 'K' U+0995      খ 'KH' U+0996     গ 'G' U+0997      ঘ 'GH' U+0998     ঙ 'NG' U+0999
Palatal           চ 'C' U+099A      ছ 'CH' U+099B     জ 'J' U+099C      ঝ 'JH' U+099D     ঞ 'NY' U+099E
Retroflex         ট 'TT' U+099F     ঠ 'TTH' U+09A0    ড 'DD' U+09A1     ঢ 'DDH' U+09A2    ণ 'NN' U+09A3
Dental            ত 'T' U+09A4      থ 'TH' U+09A5     দ 'D' U+09A6      ধ 'DH' U+09A7     ন 'N' U+09A8
Bilabial          প 'P' U+09AA      ফ 'PH' U+09AB     ব 'B' U+09AC      ভ 'BH' U+09AD     ম 'M' U+09AE

Table 5: Varga classification of Bangla consonants
(Falling into a pattern of five sets of Unvoiced Unaspirated, Unvoiced Aspirated, Voiced Unaspirated, Voiced Aspirated and Nasals, called the five 'Varga')
Non-Varga   য 'Y' U+09AF
            য় 'YY' U+09DF
            র 'R' U+09B0
            ড় 'RR' U+09DC
            ঢ় 'RH' U+09DD
            ল 'L' U+09B2
            শ 'SH' U+09B6
            ষ 'SS' U+09B7
            স 'S' U+09B8
            হ 'H' U+09B9

Table 6: Non-Varga consonants (not falling into any of the five categories)
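The regular layout of Table 5 maps directly onto the code chart: each varga occupies five consecutive code points (with U+09A9 left unassigned between the dental and bilabial rows). The following is a minimal sketch of that classification; the names `PLACES`, `MANNERS`, `VARGA_ROWS` and `classify` are hypothetical and chosen only for illustration.

```python
PLACES = ["Velar", "Palatal", "Retroflex", "Dental", "Bilabial"]
MANNERS = [
    "unvoiced unaspirated", "unvoiced aspirated",
    "voiced unaspirated", "voiced aspirated", "nasal",
]

# First code point of each varga row in Table 5; each row has five letters.
ROW_STARTS = (0x0995, 0x099A, 0x099F, 0x09A4, 0x09AA)
VARGA_ROWS = ["".join(chr(start + i) for i in range(5)) for start in ROW_STARTS]


def classify(consonant: str):
    """Return (place, manner) for a varga consonant, or None otherwise."""
    for place, row in zip(PLACES, VARGA_ROWS):
        if consonant in row:
            return place, MANNERS[row.index(consonant)]
    return None
```

Non-varga consonants such as হ (U+09B9) fall outside all five rows, so `classify` returns `None` for them.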
3.3.2 The Implicit Vowel Killer: Hasanta (called 'Halant' or 'Halanta' in other Brāhmī-based scripts)

As stated earlier, all consonants are pronounced in isolation with an implicit vowel (the central back -ɔ in Bangla, as the neutral vowel) assumed to be associated with them [121]. The term 'Hasanta' (= 'Halant' or 'Halanta' in other Brāhmī-based scripts) and the term 'Virāma'⁷ (= 'Dāṛi' in Bangla), as preferred in UNICODE (cf. Unicode 3.0 and above), have both been used in this report to denote the character that marks the absence of this inherent vowel. It may be noted that the term virama has been adopted in UNICODE in a sense that is different from the traditional definition in grammar, and hence it requires some explanation here. Considering the importance of the document, this note should be a part of this LGR document, so that anybody referring to it is able to know the proper grammatical explanation of the term.

Because a special sign is needed whenever this implicit vowel is stripped off, the symbol is known as the Hasanta (= Halant) (U+09CD). By placing the Hasanta under the first consonant of a combination or cluster, one can, in common parlance, "kill" its vowel and create conjuncts. In this manner, conjunct characters can generally be written by joining two to four consonants. In rare cases this process can join up to five consonants. However, the notion of a maximum number of consonants joining to form one aksara⁸ is to be bounded empirically. This is an observation based on the CIIL-Emille Corpora of Bangla words [132 & 133] as seen in print these days. Given the mixture of scripts and languages happening on the web, the possibility that one may want a generic Top Level Domain [gTLD] which has more than the observed maximum cannot be ruled out. This can be the case when a foreign-language word which admits a large number of consonants is transliterated into Bangla. Hence, in the Bangla LGR work, this limit will not be enforced.
3.3.3 Vowels

Separate symbols exist for all 'Swara' or vowels in Bangla which are pronounced independently, either at the beginning of the word or after another vowel or consonant sound. To indicate a vowel sound other than the implicit one, a vowel sign called 'kār'
7 Virāma as used here is also a misnomer according to the Indian grammatical traditions. Nowhere is the mere absence of a vowel marked as virama. Hasanta just marks the absence of a vowel, nothing else. (Abhyankar, Kashinath Vasudev & J. M. Shukla. 1961. A Dictionary of Sanskrit Grammar. Baroda: Oriental Institute.)
8 This term needs to be disambiguated. Aksara also means 'syllable' in Indian grammatical traditions.
in Bangla, or Mātrā in Nāgarī,⁹ is attached to the consonant. Since the consonant has this built-in neutral vowel at the end, there are equivalent kāras (Mātrās) for all vowels except অ (pronounced -ɔ). The correlation is shown as follows:
Vowel                       Corresponding vowel sign (kāras / Mātrās)
অ 'A' U+0985                -
আ 'AA' U+0986               া U+09BE
ই 'I' U+0987                ি U+09BF
ঈ 'II' U+0988               ী U+09C0
উ 'U' U+0989                ু U+09C1
ঊ 'UU' U+098A               ূ U+09C2
ঋ Vocalic 'R' U+098B        ৃ U+09C3
ৠ Vocalic 'RR' U+09E0       ৄ U+09C4
ঌ Vocalic 'L' U+098C        ৢ U+09E2
ৡ Vocalic 'LL' U+09E1       ৣ U+09E3
এ 'E' U+098F                ে U+09C7
ঐ 'AI' U+0990               ৈ U+09C8
ও 'O' U+0993                ো U+09CB
ঔ 'AU' U+0994               ৌ U+09CC
-                           ৗ U+09D7
Could appear on top of অ 'A' U+0985 or any other vowel      ঁ U+0981 Candrabindu
Could appear after অ 'A' U+0985 or any other vowel          ং U+0982 Anusvara
Could appear after অ 'A' U+0985 or any other vowel          ঃ U+0983 Visarga
After any consonant                                          ্ U+09CD (Hasanta)
-                                                            ঽ U+09BD Avagraha

Table 7: Bangla Vowels with corresponding kārs

9 Although the term 'Mātrā' in Bangla stands for an altogether different concept, viz. the top bar placed over a letter, typically available in Hindi and Bangla but missing in Gujarati.
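The vowel-to-kār correlation of Table 7 can be written down as a lookup table. The sketch below is illustrative only; `VOWEL_TO_SIGN` and `sign_for` are hypothetical names. The Unicode character names follow the same pattern (for example "BENGALI LETTER AA" and "BENGALI VOWEL SIGN AA"), which gives a convenient cross-check of the table.

```python
import unicodedata

# Independent vowel -> dependent vowel sign, as listed in Table 7.
# অ (U+0985) has no sign because it is the inherent vowel.
VOWEL_TO_SIGN = {
    0x0986: 0x09BE, 0x0987: 0x09BF, 0x0988: 0x09C0, 0x0989: 0x09C1,
    0x098A: 0x09C2, 0x098B: 0x09C3, 0x09E0: 0x09C4, 0x098C: 0x09E2,
    0x09E1: 0x09E3, 0x098F: 0x09C7, 0x0990: 0x09C8, 0x0993: 0x09CB,
    0x0994: 0x09CC,
}


def sign_for(vowel: str):
    """Return the kār for an independent vowel, or None if it has none."""
    cp = VOWEL_TO_SIGN.get(ord(vowel))
    return chr(cp) if cp is not None else None


def names_agree(letter_cp: int, sign_cp: int) -> bool:
    """Check that 'LETTER X' and 'VOWEL SIGN X' name the same vowel."""
    letter = unicodedata.name(chr(letter_cp))
    sign = unicodedata.name(chr(sign_cp))
    return letter.replace("LETTER", "VOWEL SIGN") == sign
```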
3.3.4 The Anusvāra / onuʃʃār (ং - U+0982)

The Anusvāra or onuʃʃār in Bangla at times represents a homorganic nasal, but not always. It replaces a conjunct group of 'Nasal Consonant + Hasanta + Consonant' where the second consonant belongs to the Velar varga or set, as in লংকা. But it often appears also for such combinations involving non-velars appearing as the last member of the combination, as in ল্যাংটা "naked" or ল্যাংচা "a kind of sweet; to limp". Before a non-varga consonant, the Anusvāra represents a nasal sound that may have an alternative conjoined writing symbol representing the corresponding nasal consonant of the particular set. Although Modern Hindi, Marathi and Konkani prefer the anusvāra to the corresponding half-nasal, in Bangla it is clearly demarcated where one must use the Anusvāra and where it has to be a conjunct cluster with a nasal as the first or the second component.
3.3.5 Nasalization: Candrabindu (ঁ - U+0981)

Candrabindu denotes nasalization of the preceding vowel, as in চাঁদ cãd 'moon' (U+099A U+09BE U+0981 U+09A6). This sign, with a dot inside the half-moon mark, is used as a nasalization marker in many Brāhmī-based scripts [143].
3.3.6 Nukta (় - U+09BC)

The nukta sign does not exist in Bangla orthography. It is predominantly used in many Brāhmī-derived scripts such as Devanāgarī (for Hindi, Bodo, Maithili, Santali, Kashmiri and Sindhi). The term and the concept of nukta are borrowed in Bangla.

The IDNA Protocol (RFC 5891) states that IDNs must be in Unicode Normalization Form C (NFC). RFC 7940 applies this requirement to LGRs. The definition of NFC in the Unicode Standard contains a number of composition exclusions. As a result, the Bangla letters য় YYA, ড় RRA and ঢ় RHA have to be represented in this LGR by using the sequences (YA + Nukta: U+09AF + U+09BC), (DDA + Nukta: U+09A1 + U+09BC) and (DDHA + Nukta: U+09A2 + U+09BC) instead of the single code points YYA (U+09DF), RRA (U+09DC) and RHA (U+09DD), although the use of 'Nukta' is otherwise completely unnatural in Bangla.

It is noted that in the current Unicode Standard chart these characters are listed as additional consonants. As per the LGR Procedure, however, these decisions depend on the IDNA Protocol, through a set of procedures developed by the IETF. Even though the Unicode Standard also prescribes methods to produce these three characters as atomic characters (for example 09DC for ড় [r], 09DD for ঢ় [rh] and 09DF for য় [y], each as a single keystroke), the IDNA protocol requires that we treat them as conjunct characters and then allocate codes for these in the Unicode Bengali Block.

It may be noted that there could be sporadic attempts or cases of writing Muslim names, Urdu poetic words and Perso-Arabic loanwords with nukta under ক (k), খ (kh), গ (g), জ (j) and ফ (ph), only for the sake of correct pronunciation and for maintaining the sanctity of the loanword. These were also like using the Bangla writing system to work like the IPA script. It is, however, not in use in Bangla writing or printing.
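The effect of the composition exclusions can be observed directly with Python's standard `unicodedata.normalize`. In this sketch (the helper name `nfc_code_points` is hypothetical), each of the three atomic code points is normalized to NFC and comes out as a base letter followed by U+09BC:

```python
import unicodedata


def nfc_code_points(text: str) -> list:
    """Return the code points of the NFC form of text as 'U+XXXX' strings."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFC", text)]


if __name__ == "__main__":
    for name, atomic in (("YYA", "\u09DF"), ("RRA", "\u09DC"), ("RHA", "\u09DD")):
        print(name, f"U+{ord(atomic):04X}", "->", " ".join(nfc_code_points(atomic)))
```

Because NFC never recomposes these sequences, a label typed with the single code points and one typed with the sequences end up identical after normalization.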
3.3.7 Visarga / biʃɔrgo (ঃ - U+0983) and Avagraha (ঽ - U+09BD)

The Visarga / biʃɔrgo (U+0983) is frequently used in Bangla loanwords borrowed from Sanskrit and represents a sound very close to h. One could quote as an example দুঃখ duhkho "sorrow", "unhappiness" (U+09A6 U+09C1 U+0983 U+0996).

The Avagraha ঽ (U+09BD) is mainly used in Sanskrit, Pali, Prākṛt or Maithili texts written in Bangla. It is gradually being replaced by an upper comma (e.g. নরোঽপরাণি re-written as নরো'পরাণি). It is rarely used now, even in other languages using Bangla script. In the case of the LGR, the Avagraha is not part of the repertoire: it has been decided not to retain Avagraha (ঽ) (U+09BD) because it is blocked in TLDs as per the Maximal Starting Repertoire (MSR).

Please see Appendix II in section 11 for a complete list of Bangla consonants and their allographs.
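When quoting code point sequences for example words, it is easy to slip into the parallel positions of a neighbouring block (the Devanāgarī block U+0900-U+097F mirrors the Bengali layout). A small sanity check, sketched here with hypothetical helper names, lists the code points of a word and confirms they all lie in the Bengali block U+0980-U+09FF:

```python
BENGALI_BLOCK = range(0x0980, 0x0A00)  # U+0980..U+09FF


def code_points(text: str) -> list:
    """List the code points of text as 'U+XXXX' strings."""
    return [f"U+{ord(ch):04X}" for ch in text]


def in_bengali_block(text: str) -> bool:
    """True if every character of text lies in the Bengali block."""
    return all(ord(ch) in BENGALI_BLOCK for ch in text)


DUHKHA = "\u09A6\u09C1\u0983\u0996"  # দুঃখ: DA + VOWEL SIGN U + VISARGA + KHA
```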
3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)

This note is pertinent to the use of Zero Width Joiner (ZWJ) and Zero Width Non-Joiner (ZWNJ) as used in Bangla. It needs to be noted that Nepali, Konkani and Hindi use these two signs in a different manner.
ZWJ (U+200D) and ZWNJ (U+200C) are code points that have been provided by the Unicode Standard to instruct the rendering of a string where the script has the option between joining and non-joining characters. Without the use of these control codes, the string may be rendered in an alternate form from what is intended.

Use of ZWJ

• Insofar as Bangla is concerned, ZWJ is used for the proper rendering of characters such as khaṇḍa-ta ৎ, as in সত্যজিৎ (satyajit) "Satyajit" and সৎ (sat) "honest". This is typed as follows: ta + Hasanta + ZWJ (U+200D).

• However, ZWJ is more important where the same combination of consonantal characters is represented differently depending upon the context. E.g. র + ্ + য has two representations in Bangla: as র্য and as র‍্য. To get the form র্য, one has to type র + ্ + য, but for র‍্য the sequence would be র + ZWJ + ্ + য [154]. In other words, ZWJ is used in the rendering of words demanding ya-phalā after ra, which is otherwise not possible to type (render), because the same sequence ra + hasanta + antastha ja in medial and/or final position is used to type repha on the consonant antastha ja, as in কার্য (kaarjo). In order to get a ya-phalā after the consonant ra, it is therefore obligatory to use ZWJ after ra, as in র‍্যাপার (wrapper), র‍্যাশ (rash), র‍্যালি (rally) etc. The typing sequence is given below:

ra (র) + ZWJ + hasanta (্) + antastha ja (য) = র‍্য

Use of ZWNJ

• ZWNJ is used in Bangla to represent the explicit Hasanta or Halant. In order to avoid conjunct formation in cases where there is an explicit hasanta before the succeeding consonant, the ZWNJ is used.

Consonant + hasanta + ZWNJ + consonant = explicit hasanta
Example: প্রাক্‌কথন (prakkathana, prakkɔtʰon)

The use of ZWJ/ZWNJ has been ruled out from the root zone by the [Procedure]. Used in Bangla to create alternate renderings, the insertion of these two signs can affect searching as well as NLP.
The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Hasanta) where default conjunct formation is to be explicitly restricted and the Hasanta joining the two consonants participating in the conjunct formation needs to be explicitly shown.
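The two renderings of ra + ya discussed in this section differ by a single invisible code point, which is why the joiners matter for searching and why they are excluded from root zone labels. A minimal sketch, with hypothetical names (`has_joiner`, `strip_joiners`), builds both sequences and shows that they collapse to the same string once the joiners are removed:

```python
ZWJ = "\u200D"      # ZERO WIDTH JOINER
ZWNJ = "\u200C"     # ZERO WIDTH NON-JOINER
HASANTA = "\u09CD"  # BENGALI SIGN VIRAMA
RA = "\u09B0"
YA = "\u09AF"

REPHA_ON_YA = RA + HASANTA + YA             # renders as repha over ya
RA_WITH_YAPHALA = RA + ZWJ + HASANTA + YA   # renders as ra with ya-phalā


def has_joiner(label: str) -> bool:
    """True if the label contains ZWJ or ZWNJ."""
    return ZWJ in label or ZWNJ in label


def strip_joiners(label: str) -> str:
    """Remove ZWJ and ZWNJ, e.g. before comparing search keys."""
    return label.replace(ZWJ, "").replace(ZWNJ, "")
```

A candidate root zone label for which `has_joiner` returns `True` would be rejected under the restriction described above.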
3.3.9 Use of Ya-phalaa

Ya-phalaa sequences are the two instances in Bangla where Hasanta is preceded by a full vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E):

• অ্যা  0985 09CD 09AF 09BE  BENGALI LETTER A + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
• এ্যা  098F 09CD 09AF 09BE  BENGALI LETTER E + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA

For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD Hasanta plus U+09AF ya, preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound, as in English 'acid' অ্যাসিড, 'association' অ্যাসোসিয়েশন, 'bat' ব্যাট, 'fat' ফ্যাট, 'mat' ম্যাট, 'cap' ক্যাপ etc.

The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in the orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
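The restriction that Hasanta may follow a vowel only in these two sequences can be expressed as a small check. This is an illustrative sketch under stated assumptions, not the rule as encoded in the LGR XML: the names `ALLOWED_VHCM`, `INDEPENDENT_VOWELS` and `vowel_hasanta_ok` are hypothetical, and the vowel set is taken to be the independent vowel letters of Table 7.

```python
HASANTA = "\u09CD"
YA = "\u09AF"
AA_SIGN = "\u09BE"

# The only two vowel + hasanta sequences permitted: অ্যা and এ্যা.
ALLOWED_VHCM = {
    "\u0985" + HASANTA + YA + AA_SIGN,
    "\u098F" + HASANTA + YA + AA_SIGN,
}

# Independent vowel letters (assumption: those listed in Table 7).
INDEPENDENT_VOWELS = {
    chr(cp) for cp in (*range(0x0985, 0x098D), 0x098F, 0x0990, 0x0993, 0x0994)
}


def vowel_hasanta_ok(label: str) -> bool:
    """True if every vowel + hasanta in label starts an allowed sequence."""
    for i, ch in enumerate(label):
        if ch == HASANTA and i > 0 and label[i - 1] in INDEPENDENT_VOWELS:
            if label[i - 1:i + 3] not in ALLOWED_VHCM:
                return False
    return True
```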
3.3.10 Formation of Ra-phalaa and Reph Sequences

This case refers to the formation of repha and ra-phalā as follows:

Ra-Hasanta = (C2 H), where C2 is either
  09B0 (র - BENGALI LETTER RA) or
  09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL),
and H is 09CD (্ - BENGALI SIGN VIRAMA).

Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALĀ). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র chakra "cycle"). The point is that in both cases the slot for ra could be Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed/preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā in both cases remain the same.

The LGR makes a note of this point of concern with respect to the two RAs in disguise, as it would be completely impossible to distinguish between them with the naked eye in a label so generated, which may consequently lead to concerns related to spoofing and other kinds of cyber irregularities. The motive to class these two code points as (blocking) variants is that fully rendered labels may mask the distinction between Bangla ra র (U+09B0) and the Assamese ra ৰ (U+09F0). That provides the justification for Variant Set 4, though only in the context of a following Hasanta. The difference between the RAs is only distinguishable if one looks into their Unicode values. Therefore labels such as অর্ক arka, শীর্ষ śīrṣa 'top, apex', অভ্র abhra 'cloud, the sky', শ্রম śrama 'physical labour' could be extremely dangerous, as the web user may never verify the digital content (the labels) against its Unicode code points. This point is made explicitly with reference to Table 9 (of sequences, p. 36) and Table 16 (of WLE Symbols, p. 47) that are to follow.

Moreover, it is noteworthy that the REPHA can also occur with KHANDA TA. The conditions in this context of KHANDA TA are such that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
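The spoofing concern can be made concrete: two labels that render identically may differ only in which RA sits next to the Hasanta. The sketch below folds the Assamese ra onto the Bangla ra whenever it is adjacent to a Hasanta, and then compares labels. The function names are hypothetical, and treating both the preceding and the following Hasanta as the variant context is an assumption made for illustration; the normative context rule is the one defined in the LGR itself.

```python
HASANTA = "\u09CD"
BANGLA_RA = "\u09B0"
ASSAMESE_RA = "\u09F0"


def fold_ra(label: str) -> str:
    """Map Assamese ra to Bangla ra where it is adjacent to a Hasanta."""
    out = []
    for i, ch in enumerate(label):
        before = label[i - 1] if i > 0 else ""
        after = label[i + 1] if i + 1 < len(label) else ""
        if ch == ASSAMESE_RA and HASANTA in (before, after):
            ch = BANGLA_RA
        out.append(ch)
    return "".join(out)


def are_ra_variants(a: str, b: str) -> bool:
    """True if two labels are equal once the RA distinction is folded."""
    return fold_ra(a) == fold_ra(b)
```

For example, অর্ক typed with U+09B0 and with U+09F0 are distinct strings, but `are_ra_variants` treats them as the same label.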
4 Overall Development Process and Methodology
The Neo-Brāhmī Generation Panel (NBGP) has been formed by members having experience in Linguistics (especially in NLP and Computational Linguistics), Literature, Language History and Epigraphy. Under the Neo-Brāhmī Generation Panel, Bangla and eight other scripts belonging to separate Unicode blocks are being taken up, to assign a separate LGR for each. However, an attempt is made to ensure that the fundamental philosophy behind building those LGRs is consistent with all other Brāhmī-derived scripts. The present LGR will cater to multiple languages belonging to EGIDS scale 1 to 4 (see Table 4) that use the Bangla script. The following guiding principles are used in making decisions about Bangla LGR code points.
4.1 Guiding Principles
The NBGP adopts the following broad principles for selection of code points in the code-point repertoire, across the board, for all the Neo-Brāhmī scripts within its ambit.
4.1.1 Inclusion Principles
4.1.1.1 Modern Usage
Every character proposed should be in the everyday usage of a particular linguistic community. Characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code-point repertoire.
4.1.1.2 Unambiguous Use
Every character proposed should have unambiguous understanding among linguists about its usage in the language.
4.2 Exclusion Principles
The main exclusion principle is that of External Limits on Scope. These consist of protocols or standards which are prerequisites to the Label Generation Rule-sets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.
4.2.1 External Limits of Scope
The code point repertoire for the root zone being a very special case at the top of the protocol hierarchies, the canvas of available characters for selection as a part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:
i. The Unicode Chart
Out of all the characters that are needed by the script in question, if a particular character is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, and especially so in the Bangla-Asamiyā-Manipuri Writing System, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.
ii. IDNA Protocol
Unicode, being the character-encoding standard for providing the maximum possible representation of a given script/language, has encoded as far as possible all the characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters of the Unicode repertoire from being part of domain names.
iii. Maximal Starting Repertoire (MSR)
The Root Zone LGR being the repertoire of characters which are going to be used for creation of the Root Zone TLDs, which in turn constitute an even more specialized case of domain names, the Root LGR procedure introduces additional exclusions on the IDNA-allowed set of characters.
Example: Bangla Sign Avagraha ঽ (U+09BD), even if allowed by the IDNA protocol, is not permitted in the Root Zone Repertoire as per the MSR.
To sum up, the restrictions start off with admitting only such characters as are part of the code block of the given script/language. The IDNA Protocol further narrows this down, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
4.2.1.1 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brāhmī-based scripts will not be included.
4.2.1.2 No Symbols and Abbreviations
Abbreviations, weights and measures, and other such iconic characters like BANGLA ISSHAR ৺ (U+09FA), BANGLA CURRENCY DENOMINATOR SIXTEEN ৹ (U+09F9), etc. will also not be included.
4.2.1.3 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, such as the Sanskritic VOCALIC RR ৠ (U+09E0) and VOCALIC L ঌ (U+098C), as well as VOCALIC LL ৡ (U+09E1) and the allographic -kāra forms of the latter two symbols: VOWEL SIGN VOCALIC L ৢ (U+09E2) and VOWEL SIGN VOCALIC LL ৣ (U+09E3). All such characters are excluded, which complies with the Conservatism principle as laid down in the Root Zone LGR procedure. However, in Bangla the -kāra corresponding to VOCALIC RR ৠ (U+09E0), which is VOWEL SIGN VOCALIC RR ৄ (U+09C4), is still in active use in certain limited borrowed or Sanskritic words and is therefore retained.
4.2.1.4 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit will not be included. This is also in consonance with the Letter principle as laid down in the Root Zone LGR procedure.
4.2.1.5 ABNF
The Augmented Backus-Naur Formalism (ABNF) is described in Section 5.4.1 and the Appendix (Section 10.1).
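The three successive filters of Section 4.2.1 (Unicode, IDNA, MSR) can be pictured as nested sets. The sketch below is only an illustration under stated assumptions: the three sets are tiny hand-picked toy samples rather than the real Unicode, IDNA2008 or MSR tables, and the function name `first_failing_filter` is hypothetical. The Avagraha case follows the example given in the text.

```python
# Toy stand-ins for the three layers (assumed sample data, not the real tables).
UNICODE_ENCODED = {0x0995, 0x09BD, 0x09FA}  # KA, AVAGRAHA, ISSHAR
IDNA_PVALID = {0x0995, 0x09BD}              # symbols such as ISSHAR drop out here
MSR = {0x0995}                              # AVAGRAHA is excluded by the MSR

FILTERS = (("Unicode", UNICODE_ENCODED), ("IDNA", IDNA_PVALID), ("MSR", MSR))


def first_failing_filter(code_point):
    """Return the name of the first filter that rejects the code point, or None."""
    for name, allowed in FILTERS:
        if code_point not in allowed:
            return name
    return None


for cp in (0x0995, 0x09BD, 0x09FA):
    verdict = first_failing_filter(cp) or "admitted"
    print(f"U+{cp:04X}: {verdict}")
```

A code point reaches the LGR repertoire discussion only if it passes every layer; the further exclusion principles in Sections 4.2.1.1 to 4.2.1.4 then apply on top of this.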
5 Repertoire
The Bangla Writing System is represented in UNICODE using the Bengali (Bangla) script name as enumerated in ISO 15924, corresponding to languages such as Asamiyā (Assamese), Bangla (Bengali) and Manipuri. The BENGALI block used for Bangla-Asamiyā-Manipuri in UNICODE has 93 entries. This section details the code-point repertoire that the Neo-Brāhmī Generation Panel [NBGP] proposes to be included in the Bangla LGR.
It may be mentioned here that the Government of Assam submitted a proposal to the Bureau of Indian Standards (BIS) on 26th February 2016 for dis-unification of the Bangla and Asamiyā scripts. The BIS, in its 8th Meeting of the Indian Language Technologies and Products Sectional Committee, LITD 20, held on 23rd Aug 2017, decided to refer the proposal for recognition of the Assamese script in ISO/IEC 10646 to ISO. Until the UNICODE Consortium takes any further action, it will be assumed that the Code Point Repertoire under Table 8 will be valid for all three languages above.
For each of the code points, language references have been given in the last column, titled "Reference", of Table 8, titled the "Code Point Repertoire". For entire coverage of Bangla code points, references for Bangla, Asamiyā (Assamese), Manipuri (Meitei) and Bishnupriya are given. Kokborok written in the Bangla script is not known to have introduced many new complications, except for one particular character. Though only a few representative languages under EGIDS Scale 1-4 have been chosen for referencing, they together cover all the code points required for all the languages that the NBGP has considered, as given under Bangla Unicode Points (as given in UNICODE 6.3).
However, before the details are presented, it is ideal to look at the Bangla Code Point Chart from the Maximal Starting Repertoire [MSR] Version 3. It may be noted that the shapes of the reference glyphs given below in the code charts are based on one of the many fonts designed and are not prescriptive, because there could be some variations in actual fonts, both UNICODE-compatible and True-Type ones. Consider the following code point table.
Figure 1: Bangla Code Page from [MSR] for Bangla-Asamiyā-Manipuri
Colour convention:
All characters that are included in the [MSR]: yellow background
PVALID in IDNA2008 but excluded from the [MSR]: pinkish background
Not PVALID in IDNA2008, or ineligible for the root zone (digits, hyphen): white background
Given the Bangla Unicode Block as in Figure 1, among the code points that are included in the MSR the following symbols will need separate treatment:
ৎ U+09CE Bangla Letter Khanda-Ta
ৰ U+09F0 Asamiyā-Bangla Letter Ra With Middle Diagonal
ৱ U+09F1 Asamiyā-Bangla Letter Ra With Lower Diagonal
5.1 Code Point Repertoire Inclusion
No | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
1 | U+0981 | ঁ | BENGALI SIGN CANDRABINDU | Candra-bindu | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
2 | U+0982 | ং | BENGALI SIGN ANUSVARA | Onushshar (Anusvara) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
3 | U+0983 | ঃ | BENGALI SIGN VISARGA | Biśarga (Visarga) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
4 | U+0985 | অ | BENGALI LETTER A | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
5 | U+0986 | আ | BENGALI LETTER AA | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
6 | U+0987 | ই | BENGALI LETTER I | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
7 | U+0988 | ঈ | BENGALI LETTER II | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
8 | U+0989 | উ | BENGALI LETTER U | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
9 | U+098A | ঊ | BENGALI LETTER UU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
10 | U+098B | ঋ | BENGALI LETTER VOCALIC R | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
11 | U+098F | এ | BENGALI LETTER E | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
12 | U+0990 | ঐ | BENGALI LETTER AI | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
13 | U+0993 | ও | BENGALI LETTER O | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
14 | U+0994 | ঔ | BENGALI LETTER AU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
15 | U+0995 | ক | BENGALI LETTER KA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
16 | U+0996 | খ | BENGALI LETTER KHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
17 | U+0997 | গ | BENGALI LETTER GA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
18 | U+0998 | ঘ | BENGALI LETTER GHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
19 | U+0999 | ঙ | BENGALI LETTER NGA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
20 | U+099A | চ | BENGALI LETTER CA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
21 | U+099B | ছ | BENGALI LETTER CHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
22 | U+099C | জ | BENGALI LETTER JA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
23 | U+099D | ঝ | BENGALI LETTER JHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
24 | U+099E | ঞ | BENGALI LETTER NYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
25 | U+099F | ট | BENGALI LETTER TTA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
26 | U+09A0 | ঠ | BENGALI LETTER TTHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
27 | U+09A1 | ড | BENGALI LETTER DDA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
28 | 09A1 09BC (U+09DC) | ড় | Normalized form of BENGALI LETTER RRA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. 09DC is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
29 | U+09A2 | ঢ | BENGALI LETTER DDHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
30 | 09A2 09BC (U+09DD) | ঢ় | Normalized form of BENGALI LETTER RHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. 09DD is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
31 | U+09A3 | ণ | BENGALI LETTER NNA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
32 | U+09A4 | ত | BENGALI LETTER TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
33 | U+09A5 | থ | BENGALI LETTER THA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
34 | U+09A6 | দ | BENGALI LETTER DA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
35 | U+09A7 | ধ | BENGALI LETTER DHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
36 | U+09A8 | ন | BENGALI LETTER NA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
37 | U+09AA | প | BENGALI LETTER PA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
38 | U+09AB | ফ | BENGALI LETTER PHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
39 | U+09AC | ব | BENGALI LETTER BA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
43 | 09AF 09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. 09DF is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112] [125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
50 | U+09BE | া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
51 | U+09BF | ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
52 | U+09C0 | ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
53 | U+09C1 | ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112] [122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (= Halant), Virama (= Da ri) | 1 Bangla, 2 Assamese, 2 Manipuri | [112] [122] [125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122] [125]
Table 8: Bangla Code-Point Repertoire
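As a rough illustration of how Table 8 could be used, the sketch below encodes the 61 single code points of the repertoire as ranges and treats NUKTA (U+09BC) as admissible only directly after DDA, DDHA or YA, mirroring rows 28, 30 and 43. The names `REPERTOIRE_RANGES`, `in_repertoire` and `out_of_repertoire` are assumptions made for this example; the normative repertoire is the XML file accompanying the proposal.

```python
# Single code points of Table 8, written as inclusive ranges.
REPERTOIRE_RANGES = [
    (0x0981, 0x0983), (0x0985, 0x098B), (0x098F, 0x0990), (0x0993, 0x09A8),
    (0x09AA, 0x09B0), (0x09B2, 0x09B2), (0x09B6, 0x09B9), (0x09BE, 0x09C4),
    (0x09C7, 0x09C8), (0x09CB, 0x09CE), (0x09F0, 0x09F1),
]
NUKTA = 0x09BC
NUKTA_BASES = {0x09A1, 0x09A2, 0x09AF}  # bases of the normalized RRA, RHA, YYA


def in_repertoire(cp):
    """True if the code point is one of the single code points of Table 8."""
    return any(lo <= cp <= hi for lo, hi in REPERTOIRE_RANGES)


def out_of_repertoire(label):
    """Return the code points of a label that Table 8 does not admit."""
    rejected, prev = [], None
    for ch in label:
        cp = ord(ch)
        if cp == NUKTA:
            # NUKTA is never used alone; it must follow one of its three bases.
            if prev not in NUKTA_BASES:
                rejected.append(cp)
        elif not in_repertoire(cp):
            rejected.append(cp)
        prev = cp
    return rejected


print(out_of_repertoire("\u09AC\u09BE\u0982\u09B2\u09BE"))  # the word for Bangla
```

Membership in the repertoire is only the first check; a label must additionally satisfy the whole-label evaluation rules of Section 7.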
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes some specific sequences which enable conditional inclusion of the Bangla LETTER A and E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, again followed by Bangla VOWEL SIGN AA, in the repertoire, for enabling inclusion of the æ sound as in English 'bat', 'cat', etc.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
S1 | 0985 09CD 09AF 09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112] [122]
S2 | 098F 09CD 09AF 09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9: Sequences
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find a place in Unicode but have not been included in the repertoire in the LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.
Sr. No. | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10: Excluded Code Points
5.3 Code Point Not Used Alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire since it will never be used alone. It will be used only in sequence, in the normalized forms of the three special characters ড়, ঢ় and য়.
Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড়, ঢ় and য় respectively
Table 10b: Excluded Code Points
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20.11.2009 and 18.07.2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) begins with the following variables:
C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta
The Augmented Backus-Naur Formalism (ABNF) will use the following operators:
Sr. Number | Operator | Function
1 | "|" | Alternative
2 | "[ ]" | Optional
3 | "*" | Variable Repetition
4 | "( )" | Sequence Group
Table 11: The ABNF Formalism
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given to facilitate understanding.
5.4.2 The Vowel Sequence
To facilitate understanding for users of other Brāhmī scripts, equivalents in Devanāgarī are provided wherever necessary. A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra/onuʃʃār (B), a Candrabindu (D) or a Visarga/biʃɔrgo (X). The number of D, B or X marks which can follow a V in Bangla is not necessarily restricted to one. Going by the rules illustrated in this document, it is clear that formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have sequences such as a Candrabindu followed by an Anusvāra on the one hand, and a Candrabindu followed by a Visarga on the other, as in হ্যাঁংচা 'hænchā' and হ্যাঁঃ 'hæn' respectively. That is, the possibility of a Visarga or an Anusvāra (onuʃʃār) following a Candrabindu exists in Bangla.
A vowel can optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phalā. Ya-phalā is a presentation form of U+09AF Bangla letter য or 'ya'. Represented by the sequence <U+09CD ্ BENGALI SIGN VIRAMA (Bangla SIGN Hasanta or VIRAMA), U+09AF য BENGALI LETTER YA>, Ya-phalā has a special form ্য. Again, when combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. 'aa' (ā)), it is used for transcribing [æ], as in the "a" in the English word "bat", written in Bangla as ব্যাট. A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example: V: অ (अ)
5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples:
VB: অং (अं)
VD: অঁ (अँ)
VX: অঃ (अः)
VDB: অঁং (अँं)
VDX: অঁঃ (अँः)
VHCM: অ্যা, এ্যা
5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
VHCMB: অ্যাং, এ্যাং
VHCMD: অ্যাঁ, এ্যাঁ
VHCMX: অ্যাঃ, এ্যাঃ
VHCMDB: অ্যাঁং, এ্যাঁং
VHCMDX: অ্যাঁঃ, এ্যাঁঃ
5.4.3 The Consonant Sequence
5.4.3.1 A Single Consonant (C)
Example: C: ক (क)
5.4.3.2 A Consonant with Conditions
A Consonant can optionally be followed by a dependent vowel sign, kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
CM: কি (कि)
CB: কং (कं)
CD: কঁ (कँ)
CX: কঃ (कः)
CH: ক্ (क्) (pure consonant)
CDB: কঁং (कँं)
CDX: কঁঃ (कँः)
5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB or DX.
Examples:
CMB: কীং (कीं)
CMD: কাঁ (काँ)
CMX: বীঃ (वीः)
CMDB: কাঁং (काँं)
CMDX: কাঁঃ (काँः)
5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Examples:
CHC: ন্ত → ন + ্ + ত (न्त → न + ् + त)
CHCHC: ন্ত্র → ন + ্ + ত + ্ + র (न्त्र → न + ् + त + ् + र)
CHCHCHC: ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য (न्त्र्य → न + ् + त + ् + र + ् + य)
5.4.3.5 Subsets
While considering its subsets, we will take the combination CHC as a representative example; however, the same applies equally to CHCHC and CHCHCHC.
[A] The combination may be followed by M, B, D, X, DB or DX.
Examples:
CHCM: ক্কী → ক + ্ + ক + ী (क्की → क + ् + क + ी)
CHCB: ক্কং → ক + ্ + ক + ং (क्कं → क + ् + क + ं)
CHCD: ক্কঁ → ক + ্ + ক + ঁ (क्कँ → क + ् + क + ँ)
CHCX: ক্কঃ → ক + ্ + ক + ঃ (क्कः → क + ् + क + ः)
CHCDB: ক্কঁং → ক + ্ + ক + ঁ + ং (क्कँं → क + ् + क + ँ + ं)
CHCDX: ক্কঁঃ → ক + ্ + ক + ঁ + ঃ (क्कँः → क + ् + क + ँ + ः)
[B] *3(CH)CM may further be followed by B, D, X, DB or DX.
Examples:
CHCMB: ক্কীং → ক + ্ + ক + ী + ং (क्कीं → क + ् + क + ी + ं)
CHCMD: ক্কাঁ → ক + ্ + ক + া + ঁ (क्काँ → क + ् + क + ा + ँ)
CHCMX: ক্কীঃ → ক + ্ + ক + ী + ঃ (क्कीः → क + ् + क + ी + ः)
CHCMDB: ক্কাঁং → ক + ্ + ক + া + ঁ + ং (क्काँं → क + ् + क + ा + ँ + ं)
CHCMDX: ক্কাঁঃ → ক + ্ + ক + া + ঁ + ঃ (क्काँः → क + ् + क + ा + ँ + ः)
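The vowel and consonant sequences of Sections 5.4.2 and 5.4.3 can be approximated with two regular expressions. This is a sketch under assumptions, not the normative ABNF: the character classes are taken from Table 8, Khanda Ta is left out of the consonant class because it has its own sequence, and the names `VOWEL_SEQ` and `CONS_SEQ` are invented for this example.

```python
import re

V = "[\u0985-\u098B\u098F\u0990\u0993\u0994]"
# A consonant is either a nukta sequence (normalized RRA, RHA, YYA) or a plain letter.
C = ("(?:[\u09A1\u09A2\u09AF]\u09BC"
     "|[\u0995-\u09A8\u09AA-\u09B0\u09B2\u09B6-\u09B9\u09F0\u09F1])")
M = "[\u09BE-\u09C4\u09C7\u09C8\u09CB\u09CC]"
B, D, X, H = "\u0982", "\u0981", "\u0983", "\u09CD"

# Optional tail: B, D, X, DB or DX (at most one of each kind, D first).
TAIL = f"(?:{D}[{B}{X}]?|[{B}{X}])?"

# V, optionally HCM, optionally the tail.
VOWEL_SEQ = re.compile(f"{V}(?:{H}{C}{M})?{TAIL}")

# C followed by H alone, or up to three further HC pairs, an optional M and the tail.
CONS_SEQ = re.compile(f"{C}(?:{H}|(?:{H}{C}){{0,3}}{M}?{TAIL})")

samples = {
    "VHCMB": "\u0985\u09CD\u09AF\u09BE\u0982",
    "CHCHC": "\u09A8\u09CD\u09A4\u09CD\u09B0",
    "C+B+B (invalid)": "\u0995\u0982\u0982",
}
for name, text in samples.items():
    ok = bool(VOWEL_SEQ.fullmatch(text) or CONS_SEQ.fullmatch(text))
    print(name, ok)
```

`fullmatch` is used so that a trailing extra mark, such as a doubled Anusvāra, makes the whole unit fail rather than matching a prefix.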
5.4.4 The Khanda-Ta Sequence
5.4.4.1 A Single 'Khanda'-Ta (Z)
Example: Z: ৎ
5.4.4.2 A Khanda Ta Combination [footnote 10]
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) 'scolding'.
Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
Footnote 10: Refer to Rule P in Section 7, Table 16.
5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) can be described briefly here. Let us take up S first. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ - BANGLA LETTER A and U+098F এ - BANGLA LETTER E). For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF (ya), preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English 'bat', 'fat', etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can be marked with the Hasanta. But as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
Another case, mentioned in the table as P, refers to the formation of repha and ra-phalā in the said script. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALĀ). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"), and ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra "cycle"). The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.
6 Variants
This section talks about the variants in the Bangla script. The NBGP categorizes these confusing variants into two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from the character formations normally perceived by the larger linguistic community
For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formations. These can cause confusion even to a careful observer and are hence being proposed as variants.
Variants are generated in a script when two or more forms are formed with different storage or code points. In Bangla, the e-kāra, ā-kāra and the o-kāra have different code points. One can type the o-kāra with a consonant in one go, or type the e-kāra and ā-kāra as two separate keys, and get the same result. A reader cannot differentiate between the two forms of ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered a case of variants, because a kāra followed by a kāra is not allowed.
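The o-kāra case above can be checked with Python's standard `unicodedata` module. The sketch below is illustrative only; the variable names are assumptions. It shows that the one-key and two-key spellings are different code point sequences, and that Unicode NFC normalization composes the two-key form into the single o-kāra code point, while the precomposed RRA (U+09DC) decomposes into the DDA + NUKTA sequence listed in Table 8.

```python
import unicodedata

KA = "\u0995"
O_KARA = "\u09CB"   # BENGALI VOWEL SIGN O, a single code point
E_KARA = "\u09C7"   # BENGALI VOWEL SIGN E
AA_KARA = "\u09BE"  # BENGALI VOWEL SIGN AA

one_key = KA + O_KARA             # ko typed with the o-kāra
two_keys = KA + E_KARA + AA_KARA  # ko typed as e-kāra followed by ā-kāra


def nfc(text):
    """Normalize a string to Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text)


print(one_key == two_keys)       # raw sequences differ
print(nfc(two_keys) == one_key)  # they coincide after NFC
print([f"U+{ord(c):04X}" for c in nfc("\u09DC")])
```

Since the two spellings collapse under normalization, and a kāra followed by a kāra is disallowed by the rules in any case, no variant set is needed for this pair.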
6.1 In-Script Variants
However, we propose two cases of true in-script variants in the Bangla script.
CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases wherein (U+09A5) থ (tha) appears with Hasanta as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, could be a little confusing, as the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final position (as shown below in example no. 1). It could also be typed wrongly, in the belief that it was a হ (ha) U+09B9, increasing the risks in label writing and identification.
Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)
Fonts which represent the traditional Bangla writing system could tend to create this problem. Therefore, these may be taken as cases of variants in Bangla.
CASE II
Another interesting example of variants is encountered in ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra). 'Repha' could be formed by two sequences (mainly because both Assamese and Bangla find a place in the same UNICODE block, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here the final ligatures look the same and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C
Where
B_RA = U+09B0 BENGALI LETTER RA (র)
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)
Example:
Sequence 1 (Using Bangla RA) | Ligature 1 | Sequence 2 (Using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12: Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. Addition of Repha does not make any difference.
'Ra-phala' could be formed by two sequences on similar grounds, and the final ligatures would look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
Where C1 = any consonant except Khanda-ta
Example:
Sequence 1 (Using Bangla RA) | Ligature 1 | Sequence 2 (Using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusability for end users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP concluded to define র and ৰ as variant code points, so that a single variant set between র and ৰ covers all cases. However, this will create blocked variant labels: e.g., if someone registers "ররর", the variant label "ৰৰৰ" will be generated as a variant and will be blocked, and vice versa. However, it is blocked only at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, it is still possible.
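The blocking behaviour described above can be sketched as follows. This is a simplified illustration, not the LGR tooling: the names `VARIANTS`, `variant_labels` and `is_blocked` are hypothetical, and the sketch assumes that every combination obtained by swapping র and ৰ at any position (including mixed labels such as রৰর) counts as a variant label.

```python
from itertools import product

BANGLA_RA, ASSAMESE_RA = "\u09B0", "\u09F0"

# Single variant set between the two RAs (symmetric mapping).
VARIANTS = {
    BANGLA_RA: (BANGLA_RA, ASSAMESE_RA),
    ASSAMESE_RA: (ASSAMESE_RA, BANGLA_RA),
}


def variant_labels(label):
    """All labels obtained by swapping RAs at any positions, excluding the label itself."""
    options = [VARIANTS.get(ch, (ch,)) for ch in label]
    return {"".join(combo) for combo in product(*options)} - {label}


def is_blocked(candidate, registered):
    """True if the candidate is a variant of some already registered label."""
    return any(candidate in variant_labels(label) for label in registered)


registered = {BANGLA_RA * 3}
print(is_blocked(ASSAMESE_RA * 3, registered))  # same length, swapped RAs
print(is_blocked(ASSAMESE_RA * 2, registered))  # different label altogether
```

Because variants are computed position by position, labels of a different length are untouched, which matches the observation that ৰৰ and ৰৰৰৰ remain available after ররর is registered.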
6.2 Cross-Script Variants
A crisp cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia [footnote 11] (formerly Oriya), keeping in mind the visual and technical confusions they may cause as labels on the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain characters which are somewhat similar, they have not been included here. They have been provided in the Appendix (10.2) for reference.
Footnote 11: Unicode uses Oriya for the script, although Odia is now the official term used.
1. Bangla and Nāgarī (Devanāgarī) Script
Bangla | Devanāgarī
ম (U+09AE) | म (U+092E)
ি (U+09BF) | ि (U+093F)
Table 14: Bangla and Devanāgarī cross-script variant code points
2. Bangla and Gurmukhī Script
Bangla | Gurmukhī
ম (U+09AE) | ਸ (U+0A38)
ি (U+09BF) | ਿ (U+0A3F)
Table 15: Bangla and Gurmukhī cross-script variant code points
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Bangla [footnote 12] script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications. Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories, as mentioned in the table provided in the code point repertoire (Section 5.1).
C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), or (æ) Ya-phalā (V1 H C1 M1), where V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E), H is 09CD (্ - BENGALI SIGN VIRAMA), C1 is 09AF (য - BENGALI LETTER YA) and M1 is 09BE (া - BENGALI VOWEL SIGN AA). S1 and S2 are valid even if they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL) and H is 09CD (্ - BENGALI SIGN VIRAMA)
Table 16: Symbols used in WLE rules
Footnote 12: As used by Unicode, denoting and including both Assamese and Maṇipuri.
It is also perhaps ideal to mention here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters" that could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, p. 4]. More details of such combinations can be seen in Pabitra Sarkar (1993) [135].
7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়
2. H must be preceded by C
Example: ক্
3. M must be preceded by C
Example: কা
4. D must be preceded by either of V, C or M
Example: আঁ, খঁ, খাঁ, হাঁ
5. X must be preceded by either of V, C, M or D
Example: উঃ, খঃ, বঃ, দঁঃ
6. B must be preceded by either of V, C, M or D
Example: আং, ইং, কং
7. Z must be preceded by V, C, M, D, B, X or P
Example: ইৎ, কৎ
(S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V Preceded by H
9. S CANNOT be preceded by H
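The nine rules above lend themselves to a simple "allowed predecessor" check. The following Python sketch is one possible reading, not the normative XML ruleset: the tokenizer, the names `tokenize` and `wle_violations`, and the decision to let D, X, B and Z follow S (because S ends with M) are assumptions made for this illustration. The character classes come from Table 8 and the symbols from Table 16.

```python
V = set(range(0x0985, 0x098C)) | {0x098F, 0x0990, 0x0993, 0x0994}
C = (set(range(0x0995, 0x09A9)) | set(range(0x09AA, 0x09B1)) | {0x09B2}
     | set(range(0x09B6, 0x09BA)) | {0x09F0, 0x09F1})
M = set(range(0x09BE, 0x09C5)) | {0x09C7, 0x09C8, 0x09CB, 0x09CC}
MARKS = {0x0982: "B", 0x0981: "D", 0x0983: "X", 0x09CD: "H", 0x09CE: "Z"}
NUKTA = "\u09BC"
NUKTA_BASES = {0x09A1, 0x09A2, 0x09AF}
RA = {0x09B0, 0x09F0}
S_SEQUENCES = ("\u0985\u09CD\u09AF\u09BE", "\u098F\u09CD\u09AF\u09BE")

# Rules 2 to 7: which symbol may immediately precede each symbol.
ALLOWED_BEFORE = {
    "H": {"C"},
    "M": {"C"},
    "D": {"V", "C", "M", "S"},
    "X": {"V", "C", "M", "D", "S"},
    "B": {"V", "C", "M", "D", "S"},
    "Z": {"V", "C", "M", "D", "B", "X", "P", "S"},
}


def tokenize(label):
    """Turn a label into WLE symbols; S, P and nukta sequences become one token."""
    tokens, i = [], 0
    while i < len(label):
        cp = ord(label[i])
        if label.startswith(S_SEQUENCES, i):
            tokens.append("S"); i += 4
        elif cp in RA and label.startswith("\u09CD\u09CE", i + 1):
            tokens.append("P"); i += 2  # Ra-Hasanta directly before Khanda Ta
        elif cp in NUKTA_BASES and label.startswith(NUKTA, i + 1):
            tokens.append("C"); i += 2  # rule 1: normalized forms count as C
        elif cp in C:
            tokens.append("C"); i += 1
        elif cp in V:
            tokens.append("V"); i += 1
        elif cp in M:
            tokens.append("M"); i += 1
        elif cp in MARKS:
            tokens.append(MARKS[cp]); i += 1
        else:
            raise ValueError(f"U+{cp:04X} is outside the sketch's repertoire")
    return tokens


def wle_violations(label):
    """Return a list of human-readable rule violations (empty if none)."""
    problems, prev = [], None
    for tok in tokenize(label):
        allowed = ALLOWED_BEFORE.get(tok)
        if allowed is not None and prev not in allowed:
            problems.append(f"{tok} after {prev or 'start'}")
        if tok in ("V", "S") and prev == "H":  # rules 8 and 9
            problems.append(f"{tok} after H")
        prev = tok
    return problems


print(wle_violations("\u09AC\u09BE\u0982\u09B2\u09BE"))  # a well-formed label
print(wle_violations("\u0995\u09CD\u0987"))              # V preceded by H
```

Treating S and P as single tokens is what lets Hasanta appear after a vowel inside S1/S2 and before Khanda Ta after RA without weakening rules 2 and 7 for every other context.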
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in including such ligatures: any user can easily create them, even though they may not be attested combinations.
7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া /bæŋk ʌv ɪndiə/ (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE), meaning "Bank of India". This is a case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic communities; hence this is explicitly prohibited by the NBGP.
In future, if required, depending on the prevailing requirements from the community, a future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules the ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.
1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Example: ্ক (्क), াক (ाक), ংক (ंक), ঁক (ँक), ঃক (ःक)
As can be seen, such a combination automatically results in a "golu", or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.
2. H is not permitted after V, B, D, X, M, S.
Example: অ্ (अ्), অং্ (अं्), কঁ্ (कँ्), কঃ্ (कः्), কি্ (कि्)
3. The number of B, D or X permitted after a Consonant, a Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalid.
Example: কংং (कंं), কঁঁ (कँँ), কঃঃ (कःः), কাঁঁ (काँँ), কীঃঃ (कीःः), অংং (अंं), অঁঁ (अँँ), অঃঃ (अःः)
4. The number of M permitted after a Consonant is restricted to one.
Example: কীী (कीी)
5. M is not permitted after V.
Example: ইা, ঈৗ (ईा, ईौ)
6. The combinations Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Example: কংঃ (कंः), কঃং (कःं)
8 Contributors
8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members
8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals' Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje & Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. ‘Bangla Harapher Panch Parba’. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the MS University of Baroda, Social Science Number 29(2): 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25/11/2017.
[124] Ethnologue: Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25/11/2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20/10/2019.
[126] Wikipedia: Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25/11/2017.
[129] Omniglot: Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10/5/2018.
[130] Wikipedia: Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25/11/2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20/, accessed on 10/5/2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10/5/2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10/5/2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1(1): 23-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1(2): 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). ‘Srīhatta-Nāgarī Lipir Utpatti o Bikāś’. Bangla Academy Patrika (Dhaka), Vol. 1(2) (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia: Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19/5/2018.
[139] Furui, Ryosuke (2015). ‘Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century’, in Bhairabi Prasad Sahu & Hermann Kulke, eds. Interrogating Political Systems: Integrative Processes and States in Pre-Modern India, Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). ‘Phones of Bengali’. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646 / UNICODE. https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia: Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. ‘Bangla Spelling Reform: the Long and Short of It’. Bangla Journal 19: 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. “What has happened So Far In terms of Script Reforms”. Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka on 10/07/2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 – Core Specification, Chapter 12, p. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when applied to a specific language or script certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla¹³ in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold ‘varga’ (as defined under Table 5). Moreover, Bangla does not generally allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔṛo) from the poem ‘Lichuchor’ written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2. CH can come with Khaṇḍa ta only in the case where C is ra (র) (U+09B0): র্ৎ as in ভর্ৎসনা.
3. Only the following combinations with VHCM will be allowed:
→ অ্যা (together pronounced as æ) as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ) as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
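The first two restriction rules above can be sketched as simple checks on a candidate label. This is only an illustrative sketch, not the normative LGR: the function names are hypothetical, and the code point for Khaṇḍa ta (U+09CE) is taken from the Unicode Bengali block rather than from this section.

```python
# Code points for the characters named in rules 1 and 2.
# Assumption: Khanda ta is U+09CE (from the Unicode Bengali block).
KHANDA_TA = "\u09CE"  # ৎ
HASANTA = "\u09CD"    # ্
RA = "\u09B0"         # র
FORBIDDEN_INITIAL = {KHANDA_TA, "\u099E", "\u0999"}  # ৎ, ঞ, ঙ


def starts_with_forbidden(label: str) -> bool:
    """Rule 1: a label must not begin with ৎ, ঞ or ঙ."""
    return bool(label) and label[0] in FORBIDDEN_INITIAL


def khanda_ta_clusters_ok(label: str) -> bool:
    """Rule 2: consonant + Hasanta before ৎ is accepted only if the consonant is র."""
    for i, ch in enumerate(label):
        if ch == KHANDA_TA and i >= 1 and label[i - 1] == HASANTA:
            if i < 2 or label[i - 2] != RA:
                return False
    return True
```

A label would pass these two checks when `starts_with_forbidden` is false and `khanda_ta_clusters_ok` is true; rule 3 and the rest of the ABNF are not covered here.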
10.2 ‘Sylheti Nāgarī lipi’ or ‘Siloti’
This version of Bangla script resembles the ‘Kaithī’ script (ISO 15924) used by the accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, widely in use during the 1880s. There were several other names of Sylheti Nāgarī or Siloti (129), such as ‘Jalalabada Nāgarī’, ‘Fula (flower) Nāgarī’, ‘Muslim Nāgarī’ or ‘Muhammad Nāgarī’. It is said that Shah Jalala had brought the script with him to Sylhet in the 13th-14th century (138), although some have suggested that it was an invention by the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here (138).

13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
Table 17 – The Script Table of Sylheti Nāgarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.

10.3.1 Bangla and Nāgarī or Devanāgarī

Bangla      Devanāgarī    NBGP Decision
ঃ U+0983    ः U+0903      Confusable
ও U+0993    उ U+0909      Confusable
ঘ U+0998    घ U+0918      Confusable
ঁ U+0981    ॅ U+0945      Confusable
Table 18 Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhi

Bangla      Gurmukhi      NBGP decision
ঘ U+0998    ਬ U+0A2C      Confusable
ঁ U+0981    ੱ U+0A71      Confusable
Table 19 Bangla and Gurmukhi confusable code points

Bangla                    Gurmukhi      NBGP decision
ও U+0993                  ਤ U+0A24      Distinguishable
শ U+09B6                  ਅ U+0A05      Distinguishable
ম U+09AE                  ਮ U+0A2E      Distinguishable
বা U+09AC and U+09BE      ਗ U+0A17      Distinguishable
Table 20 – Bangla and Gurmukhī distinguishable code points
10.3.3 Bangla and Oriya (Odia)

Bangla      Oriya (Odia)    NBGP Decision
ও U+0993    ଓ U+0B13        Confusable
Table 21 – Bangla and Oriya confusable code points

Bangla      Oriya (Odia)    NBGP Decision
ঘ U+0998    ସ U+0B38        Distinguishable
Table 22 – Bangla and Oriya distinguishable code points
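The single-code-point rows of Tables 18 to 22 can be held in a small lookup structure, for instance when building test labels. The sketch below is only an illustration under stated assumptions: the names `DECISIONS` and `nbgp_decision` are hypothetical, and the two-code-point row (বা against ਗ) is left out to keep the key a simple pair.

```python
# (Bangla code point, other-script code point) -> NBGP decision,
# transcribed from Tables 18-22. The names here are illustrative only.
DECISIONS = {
    (0x0983, 0x0903): "confusable",       # Devanagari
    (0x0993, 0x0909): "confusable",
    (0x0998, 0x0918): "confusable",
    (0x0981, 0x0945): "confusable",
    (0x0998, 0x0A2C): "confusable",       # Gurmukhi
    (0x0981, 0x0A71): "confusable",
    (0x0993, 0x0A24): "distinguishable",
    (0x09B6, 0x0A05): "distinguishable",
    (0x09AE, 0x0A2E): "distinguishable",
    (0x0993, 0x0B13): "confusable",       # Oriya (Odia)
    (0x0998, 0x0B38): "distinguishable",
}


def nbgp_decision(bangla: str, other: str):
    """Return the recorded decision for a pair of single characters, or None."""
    if len(bangla) != 1 or len(other) != 1:
        return None
    return DECISIONS.get((ord(bangla), ord(other)))
```

Pairs that were not analysed in the tables return `None`, which keeps "not listed" distinct from "distinguishable".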
11 Appendix-II: Bengali consonants and their allographs
Consonants    Phonetic Value    Allographs
Clusters    Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value but is mostly pronounced as n
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒ The secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word, it is a vowel æ (as in শ্যামল, র‍্যাপার, etc.). But after a non-initial consonant, it just doubles it in pronunciation (as in কার্য, ধার্য, etc.). The র + য combination has two physical manifestations: র‍্য and র্য
ক্য (ক্ + য), স্য (স্ + য), র‍্য (র্ + য) [Just র‍্য is never used in Bangla orthography; র‍্যা is, but then its last two symbols, ya-phala and ā-kāra, constitute a vowel sign representing the vowel অ্যা]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃকঃ
অব
Devanāgarī, and for Santali one uses both Nāgarī/Devanāgarī and Ol-chiki (145). For the purposes of the Bangla LGR, only languages belonging to the EGIDS scale 1 to 4 have been considered. Consider the following table:
EGIDSScale1
EGIDSScale2
EGIDSScale3
EGIDSScale4
EGIDSScale5
EGIDS6
Bangla(Bengali)
SantaliBodoRiangKhumiMru(ng)Asho
LepchaPnarKodaKoraChak
Asamiyā(Assamese)
KochorRajabansı
MaltoorMalpahariya
ManipuriorMeitei
BisnupriyaManipuriKok-Borok(TripuraampBangladesh)
ChakmaHajongMundariampKurux(ofBangladesh)
TotoRohingyaTipperaMegamTanchangya
Usoi LimbuSadriorOraon
BhumijorMundariBawmChin
Table 4 Main languages in India and Bangladesh that use Bangla Script on the EGIDS Scale
3.3 Notable Features of Bangla Script [150]
The Bangla Writing System has certain features that show how it has to be written, or how type-setting in Bangla could be done. This section is followed by a section that explains the code points (and fixed code-point sequences) which show certain distinctive characteristics of Bangla and which make up the Repertoire. The next sections will also cover the ‘akshar’-formation rules (ABNF) showing character class, Word Level Evaluation (WLE) and Context Rules, as well as In-Script and Cross-Script Variants. Here we present some basic features of the script and pronunciation.
The Bangla script is an alpha-syllabic writing system in which all consonants are assumed to contain an accompanying ‘inherent’ vowel (theoretically before or after each consonant). It varies between ɔ and o depending on the position of the consonant in the word. At times these ‘assumed’ or ‘inherent’ vowels are not pronounced at all [142].
Vowels can be written as independent letters, or by using a variety of diacritical marks which are written above, below, before, after, or in both of the last two positions relative to the consonant they follow in pronunciation [105].
All Bangla consonants, when pronounced in isolation, are uttered with an inherent vowel -ɔ; hence ক ‘k’, খ ‘kh’ or গ ‘g’ are usually pronounced as [kɔ], [khɔ] or [gɔ], etc. Phonologically, the Bangla vowel -ɔ corresponds to the Hindi schwa ə.
When consonants occur together in clusters, special conjunct letters are formed. In printed Bangla many of these consonantal clusters or conjoined consonants are in use. The letters for the consonants other than the final one in the group are generally reduced. But there are a few special conjunct characters which are compounds of the consonant characters, e.g. ক্ (k) + ষ (s) = ক্ষ (ks), ঞ্ (n) + জ (j) = ঞ্জ (nj), জ্ (j) + ঞ (n) = জ্ঞ (jn), হ্ (h) + ম (m) = হ্ম (hm). There are other issues also: র as the second member of a cluster is reduced to a secondary symbol, e.g. প্ (p) + র (r) = প্র (pr), ষ্ (s) + ট্ (t) + র (r) = ষ্ট্র (str) (as in উষ্ট্র ustra “camel”). য (y) when used as a primary symbol represents jɔ in Bangla. But its secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word, it is a vowel æ (as in শ্যামল (syamala) “green”, র‍্যাপার (ryapara) “wrapper”, etc.). But after a non-initial consonant, it just doubles it in pronunciation (as in কার্য, ধার্য, etc.). The র্ (r) + য (y) combination has two renderings: র‍্য (ry) and র্য (ry). In the case of দ্ (d) + ধ (dh), গ্ (g) + ধ (dh) and ন্ (n) + ধ (dh), the shape of the second member is changed, e.g. দ্ধ (ddh), গ্ধ (gdh) and ন্ধ (ndh) respectively. The solitary example of র্ (r) + ঋ (r) = র্ঋ (as in নৈর্ঋত nairrt “Southwest”), used mostly in cases of Classical borrowings, shows the use of the secondary symbol of a consonant followed by the primary symbol of a vowel. The inherent vowel only applies to the final consonant of the cluster.
In consonant clusters, many consonants take a completely different form. Some typical examples are ক্ত (kt), ক্র (kr), ক্ষ (ks), গ্ধ (gdh), জ্ঞ (jn), ঞ্চ (nc), ঞ্জ (nj), ত্ত (tt), ন্ত (nt), ন্ধ (ndh), ব্ধ (bdh), ভ্র (bhr), ম্ব (mb), স্ত (st), etc. র has two allographs apart from its full shape: one is ‘repha’ as found in র্ক (rk), র্প (rp), and another is ra-phala as in প্র (pr), ক্র (kr). ষ্ণ (s+n) is another one where the cerebral nasal consonant sign takes a queer shape [151].
The Bangla script has at least fifty-two primary symbols and quite a few allographs (positional variants of them) corresponding to forty-four phonemes (7 oral and 7 nasal vowels, and 30 consonants) (150), or functional speech sounds, with some obvious redundancies, although in one of the first phonemic analyses the number was thought to be thirty-five phonemes [140].
As mentioned above, in Bangla several graphemic symbols have secondary shapes, technically called ‘allographs’, with a complementary distribution in each case. These graphs or markings are generally added at the following positions of the primary symbol [113], in the following manner:
1) Below (e.g. কু (ku), ন্ত (nta), কূ (kū), হ্র (hra), etc.)
2) Above (e.g. চঁ (cã), র্ক (rka), etc.)
3) Right side (e.g. কা (ka), কং (kan), etc.)
4) Left side (e.g. কে (ke))
5) Left side and above simultaneously (e.g. কৈ (kai), কি (ki), etc.)
6) Right side and above simultaneously (e.g. কী (kī))
7) Right side and left side simultaneously (e.g. কো (ko))
8) Right side, left side and above simultaneously (e.g. কৌ (kau))
As for the complementary distribution of vowel letters (word- or syllable-initial) and Vowel Matras, which are relevant for ABNF, let us consider the following. Besides some simple Vowel Modifiers called ‘Kars’ in Bangla (also referred to as Matra in the other LGR documents of Neo-Brāhmī), there are some combinatory modifiers of Bangla vowels with certain consonants. For example, whereas
আ U+0986 BENGALI LETTER AA is substituted by া U+09BE BENGALI VOWEL SIGN AA,
ই U+0987 BENGALI LETTER I is substituted by pre-posed ি U+09BF BENGALI VOWEL SIGN I,
ঈ U+0988 BENGALI LETTER II is substituted by ী U+09C0 BENGALI VOWEL SIGN II, or
উ U+0989 BENGALI LETTER U is substituted by ু U+09C1 BENGALI VOWEL SIGN U by marking below the primary grapheme,
there are some special vowel modifiers of উ, as in the following combined letters:
গু gu rather than writing as গ (g) + ু (u)
রু ru rather than writing as র (r) + ু (u)
শু śu rather than writing as শ (s) + ু (u)
হু hu rather than writing as হ (h) + ু (u)
ন্তু ntu rather than writing as ন্ (n) + ত (t) + ু (u)
Similarly, there could be vowel modifiers of ঊ or ‘(Long) ū’ as well, e.g. ভ্ (bh) + র (r) (ভ্রূ bhrū “eyebrow”), শ্ (s) + র (r) (শ্রূ srū), or of ঋ (r) after হ (h) (হৃ hr), etc.
There have been many notable contributions in simplifying and modifying Bangla spellings and combinatory techniques, especially by scholars such as Pabitra Sarkar (1992) [134]. In this, there has been an attempt to reduce the number of allographs of both vowels and consonants in clusters, and it has been widely accepted in the printing of school texts in both Bangladesh and West Bengal [151, 152]. As of now, two systems, the old (traditional) and the new, go on side by side, operative in different domains.
However, in preparation of this LGR document, the aim has been to consider the widely used and usable sequences and combinations, and their variations across the sister scripts belonging to the basket of Brāhmī writing systems. Bangla Academy, Dhaka published Standard Bangla Spelling Rules in 1992, following the recommendations of a committee constituted through a workshop jointly organized by the Jātīya Śikṣākrama and Pāṭhyapustaka Board in 1988. A thoroughly revised edition of the Rules was published in September 2012.⁶ After the establishment of the Bāṅlā Ākādemi of West Bengal in 1986, its first President Annadasankar Ray (1904-2002), in his inaugural address, gave a direction for standardization of the Bangla alphabet/script and the spelling system, and clearly argued that they would not blindly follow the Sanskritic model of conventional grammar. A broad list of proposals was sent to experts on Bangla, and a broad agreement was reached for ‘homogenization of Bangla spelling’ by 1988. Based on opinions received from different quarters, a unanimous list of ‘rules’ was agreed upon. This was published as a ‘Spelling Dictionary’ titled Ākādemi Bānāna Abhidhāna (1997), which was obviously more comprehensive than ‘The University of Calcutta proposals’ made in 1936. Along with the ‘rationalization’ of spellings, another step was taken to make the writing system easier to read by making the symbols used, both single and combined ones, more ‘transparent’. These reforms were originally suggested by Sarkar (1987, first published in 1978) [134] [153], where he used the terms Swaccha (‘Transparent’) and Aswaccha (‘Opaque’ or non-transparent), even adding Ardha Swaccha (‘half transparent’) in between the two. Some sample examples are:

6 Bangla Academy. 2012. Bāṅlā Ekaḍemī Pramita Bāṅlā Bānānera Niyama (Bangla Academy Standard Bangla Spelling Rules). Dhaka: Bangla Academy.
Transparent: ন্ন (nn), প্ত (pt), স্ত (st), where both members of the cluster can be recognized.
Opaque, where neither of the two could be (easily) recognized: ক্ষ ks (ক্ k + ষ s), জ্ঞ jn (জ্ j + ঞ n), ঙ্গ ng (ঙ্ n + গ g), হ্ম hm (হ্ h + ম m).
Semi-transparent: প্র (pr), র্প (rp), where one symbol is recognizable and the other is not. In the case of three-term clusters, at least one symbol will not be transparent, e.g. স্ত্র str (স্ s + ত্ t + র r), ষ্ট্র str (ষ্ s + ট্ t + র r), etc.
There were in fact two types of proposals. One concerned the shape of the letters: those of consonant + vowel (CV) combinations and conjuncts, which are consonant + consonant combinations. There were further complex shapes, i.e. those of consonant + consonant + (consonant +) vowel (CC(C)V) signs, as in প্রু (pru) or স্ক্রু (skru). Some decisions in this area were necessary because a few of the CC(C) symbols represented complexities that made learning them difficult for children. The other dealt with the spellings of words only, without any reference to the shapes of the letters in which they were written. The basic objective here was ‘one word, one spelling’, to the greatest extent that was possible [151].
Below we place a statement of the most salient changes that affect the consonant + vowel combinations [153].
a. The variants of the short u (হ্রস্ব উ-কার hrasva u-kāra) vowel sign have been brought down to one, i.e. ু. So গু (gu), রু (ru), শু (śu) and হু (hu) are now written with the regular sign below the consonant, and so are cluster + short u sign combinations: ন্তু (ntu) (ন + ্ + ত + উ), স্তু (stu) (স + ্ + ত + উ).
b. The variants of long u (দীর্ঘ ঊ-কার dīrgha u-kāra) have also been reduced to the regular sign ূ: রূ (rū), ভ্রূ (bhrū) (ভ bh + ্ + র r + ঊ ū), দ্রূ (drū) (দ d + ্ + র r + ঊ ū), শ্রূ (śrū) (শ ś + ্ + র r + ঊ ū).
c. The variants of ঋ-কার (ṛ-kāra, secondary symbol of ṛ) have been brought down to one: হৃ (hṛ).
Regarding consonant + consonant + (consonant) … + (vowel) clusters, Paschimbanga Bangla Akademi proposed transparent or semi-transparent shapes for clusters to the extent admissible in the Bangla writing system. Some examples will clarify the proposal. (A slash will mean that the traditional cluster-shape precedes it, while the Bangla Akademi innovation follows.) [153]
Xব ধ bdh ( b+ ধ dh) Mদ ধ ddh (J d+ধ dh) ন থ nth (L n+থ th) Uঞ চ ntildec
(9 ntilde+চ c) ঞ ছ $ ntildech (9+ছ) ঞ জ ntildej (9 ntilde+জ j) Sক ত amp kt (7 k+ত t) T
kr (7 k+র r) Nগ ধ ( gdh (K g+ধ dh) ) ṅk (u ṅ+ক k) t ṅg (u ṅ+গ g) +
ṣṇ (B ṣ+ণ ṇ) ন ndhr (L n+ dh+র r) - ṇḍr ( ṇ+ ḍ+র r) ktr (7 k+x
t+র r)
3.3.1 The Consonants
As per traditional classification, Bangla consonants are categorized according to their phonetic properties, especially in terms of place and manner of articulation [107]. There are five ‘Varga’ (pronounced as ‘Barga’ in Bangla) or groups (sets or classes) distinguished by place of articulation, and one non-‘varga’ group [105]. Each Varga, which corresponds to stops at a certain place of articulation, contains a series of five consonants classified as per their phonetic qualities (i.e. manner of articulation), beginning from Unvoiced and Unaspirated to Voiced and Aspirated (in the fourth column), finally ending with a homorganic or corresponding nasal [107]. Consider the following table:

‘Varga’ or Sets    Unvoiced -Asp     Unvoiced +Asp      Voiced -Asp       Voiced +Asp        Nasal
Velar              ক ‘K’ U+0995      খ ‘KH’ U+0996      গ ‘G’ U+0997      ঘ ‘GH’ U+0998      ঙ ‘NG’ U+0999
Palatal            চ ‘C’ U+099A      ছ ‘CH’ U+099B      জ ‘J’ U+099C      ঝ ‘JH’ U+099D      ঞ ‘NY’ U+099E
Retroflex          ট ‘TT’ U+099F     ঠ ‘TTH’ U+09A0     ড ‘DD’ U+09A1     ঢ ‘DDH’ U+09A2     ণ ‘NN’ U+09A3
Dental             ত ‘T’ U+09A4      থ ‘TH’ U+09A5      দ ‘D’ U+09A6      ধ ‘DH’ U+09A7      ন ‘N’ U+09A8
Bilabial           প ‘P’ U+09AA      ফ ‘PH’ U+09AB      ব ‘B’ U+09AC      ভ ‘BH’ U+09AD      ম ‘M’ U+09AE
Table 5 Varga classification of Bangla consonants
(Falling into a pattern of five sets of Unvoiced Unaspirated, Unvoiced Aspirated, Voiced Unaspirated, Voiced Aspirated and Nasals, called five ‘Varga’)
Non-Varga    য ‘Y’ U+09AF    য় ‘YY’ U+09DF    র ‘R’ U+09B0    ড় ‘RR’ U+09DC    ঢ় ‘RH’ U+09DD
             ল ‘L’ U+09B2    শ ‘SH’ U+09B6    ষ ‘SS’ U+09B7    স ‘S’ U+09B8     হ ‘H’ U+09B9
Table 6 Non-Varga consonants (not falling into any of the five categories)
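The regular five-by-five layout of Table 5 lends itself to a small lookup. The following is a minimal sketch, assuming only the code points listed in the table; the names `VARGAS`, `COLUMNS` and `classify` are hypothetical and not part of the LGR.

```python
# Rows of Table 5: each string lists the five consonants of one Varga in
# column order (unvoiced -asp, unvoiced +asp, voiced -asp, voiced +asp, nasal).
VARGAS = {
    "velar": "\u0995\u0996\u0997\u0998\u0999",
    "palatal": "\u099A\u099B\u099C\u099D\u099E",
    "retroflex": "\u099F\u09A0\u09A1\u09A2\u09A3",
    "dental": "\u09A4\u09A5\u09A6\u09A7\u09A8",
    "bilabial": "\u09AA\u09AB\u09AC\u09AD\u09AE",
}
COLUMNS = (
    "unvoiced unaspirated",
    "unvoiced aspirated",
    "voiced unaspirated",
    "voiced aspirated",
    "nasal",
)


def classify(ch: str):
    """Return (place, manner) for a Varga consonant, or None for anything else."""
    if len(ch) != 1:
        return None
    for place, letters in VARGAS.items():
        idx = letters.find(ch)
        if idx != -1:
            return place, COLUMNS[idx]
    return None
```

Non-Varga consonants such as র (U+09B0) deliberately fall through to `None`, mirroring the split between Table 5 and Table 6.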
3.3.2 The Implicit Vowel Killer: Hasanta (called ‘Halant’ or ‘Halanta’ in other Brāhmī-based scripts)
As stated earlier, all consonants are pronounced in isolation with an implicit vowel (central back -ɔ in Bangla as the neutral vowel) assumed to be associated with them [121]. The term ‘Hasanta’ (= ‘Halant’ or ‘Halanta’ in other Brāhmī-based scripts) and the term ‘Virāma’⁷ (= ‘Dā̃ṛi’ in Bangla), the latter as preferred in UNICODE (cf. Unicode 3.0 and above), have been used in this report to denote the character that marks the absence of this inherent vowel. It may be noted that the term virama has been adopted in UNICODE in a sense that is different from the traditional definition of grammar, and hence it requires some explanation here. Considering the importance of the document, this note should be a part of this LGR document so that anybody referring to it is able to know the proper grammatical explanation of the term. Because a special sign is needed whenever this implicit vowel is stripped off, the symbol is known as the Hasanta (= Halant) (U+09CD). By placing the Hasanta under the first consonant of a combination or cluster, one can, in common parlance, “kill” its vowel and create conjuncts. In this manner, conjunct characters can generally be written by joining two to four consonants. In rare cases, this process can join up to five consonants. However, the notion of a maximum number of consonants joining to form one akṣara⁸ is to be bounded empirically. This is an observation based on the CIIL-Emille Corpora of Bangla words [132 & 133] as seen in print these days. Given the mixture of scripts and languages happening on the web, the possibility that one may want a generic Top Level Domain [gTLD] which has more than the observed maximum cannot be ruled out. This can be the case when a foreign-language word which admits a large number of consonants is transliterated into Bangla. Hence, in the Bangla LGR work, this limit will not be enforced.
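To make the role of the Hasanta concrete, the sketch below joins consonants with U+09CD and measures the longest Hasanta-joined consonant run in a string. It is an illustration only: the helper names are hypothetical, and treating every assigned letter in U+0995..U+09B9 as a consonant is a simplifying assumption (nukta sequences and ৎ are ignored).

```python
import unicodedata

HASANTA = "\u09CD"  # ্


def is_consonant(ch: str) -> bool:
    # Assumption: assigned letters in U+0995..U+09B9 are the consonants.
    return 0x0995 <= ord(ch) <= 0x09B9 and unicodedata.category(ch) == "Lo"


def make_conjunct(*consonants: str) -> str:
    """Join consonants with Hasanta, e.g. ক + ষ gives ক্ষ."""
    return HASANTA.join(consonants)


def max_cluster_size(text: str) -> int:
    """Length (in consonants) of the longest Hasanta-joined run in text."""
    best = run = 0
    prev_hasanta = False
    for ch in text:
        if is_consonant(ch):
            run = run + 1 if prev_hasanta else 1
            best = max(best, run)
            prev_hasanta = False
        elif ch == HASANTA and run > 0:
            prev_hasanta = True
        else:
            run = 0
            prev_hasanta = False
    return best
```

A function like `max_cluster_size` could be used to survey a corpus for the empirical bound discussed above; the LGR itself does not enforce such a limit.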
3.3.3 Vowels
Separate symbols exist for all ‘Swara’ or vowels in Bangla which are pronounced independently, either at the beginning of the word or after another vowel or consonant sound. To indicate a vowel sound other than the implicit one, a vowel sign, called ‘kār’ in Bangla or Mātrā in Nāgarī,⁹ is attached to the consonant. Since the consonant has this built-in neutral vowel at the end, there are equivalent kāras (Mātrās) for all vowels except অ (pronounced -ɔ). The correlation is shown as follows:

7 Virāma as used here is also a misnomer according to the Indian grammatical traditions. Nowhere is mere absence of a vowel marked as virama. Hasanta just marks the absence of a vowel, nothing else (Abhyankar, Kashinath Vasudev & J. M. Shukla. 1961. A Dictionary of Sanskrit Grammar. Baroda: Oriental Institute).
8 This term needs to be disambiguated. Akṣara also means ‘syllable’ in Indian grammatical traditions.
Vowel                                                      Corresponding vowel sign (kāras (Mātrās))
অ ‘A’ U+0985
আ ‘AA’ U+0986                                              া U+09BE
ই ‘I’ U+0987                                               ি U+09BF
ঈ ‘II’ U+0988                                              ী U+09C0
উ ‘U’ U+0989                                               ু U+09C1
ঊ ‘UU’ U+098A                                              ূ U+09C2
ঋ Vocalic ‘R’ U+098B                                       ৃ U+09C3
ৠ Vocalic ‘RR’ U+09E0                                      ৄ U+09C4
ঌ Vocalic ‘L’ U+098C                                       ৢ U+09E2
ৡ Vocalic ‘LL’ U+09E1                                      ৣ U+09E3
এ ‘E’ U+098F                                               ে U+09C7
ঐ ‘AI’ U+0990                                              ৈ U+09C8
ও ‘O’ U+0993                                               ো U+09CB
ঔ ‘AU’ U+0994                                              ৌ U+09CC
-                                                          ৗ U+09D7
Could appear on top of অ ‘A’ U+0985 or any other vowel     ঁ U+0981 Candrabindu
Could appear after অ ‘A’ U+0985 or any other vowel         ং U+0982 Anusvara
Could appear after অ ‘A’ U+0985 or any other vowel         ঃ U+0983 Visarga
After any consonant                                        ্ U+09CD (Hasanta)
-                                                          ঽ U+09BD Avagraha
Table 7 Bangla vowels with corresponding kārs

9 Although the term ‘Matra’ in Bangla stands for an altogether different concept, viz. the top bar placed over a letter, typically available in Hindi and Bangla but missing in Gujarati.
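The vowel-to-kār correspondence of Table 7 can be sketched as a mapping from each independent vowel to its dependent sign. The dictionary below transcribes the code points of the table; the names `VOWEL_SIGN` and `syllable` are hypothetical and only serve the illustration.

```python
# Independent vowel -> dependent vowel sign (kar), as listed in Table 7.
VOWEL_SIGN = {
    "\u0986": "\u09BE",  # আ -> া
    "\u0987": "\u09BF",  # ই -> ি
    "\u0988": "\u09C0",  # ঈ -> ী
    "\u0989": "\u09C1",  # উ -> ু
    "\u098A": "\u09C2",  # ঊ -> ূ
    "\u098B": "\u09C3",  # ঋ -> ৃ
    "\u09E0": "\u09C4",  # ৠ -> ৄ
    "\u098C": "\u09E2",  # ঌ -> ৢ
    "\u09E1": "\u09E3",  # ৡ -> ৣ
    "\u098F": "\u09C7",  # এ -> ে
    "\u0990": "\u09C8",  # ঐ -> ৈ
    "\u0993": "\u09CB",  # ও -> ো
    "\u0994": "\u09CC",  # ঔ -> ৌ
}
INHERENT = "\u0985"  # অ has no sign: it is the inherent vowel


def syllable(consonant: str, vowel: str) -> str:
    """Attach the sign for `vowel` to `consonant`; অ leaves the consonant bare."""
    if vowel == INHERENT:
        return consonant
    return consonant + VOWEL_SIGN[vowel]
```

Note that the sign always follows the consonant in the encoded string, even for pre-posed signs such as ি, whose left-side placement is purely a rendering matter.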
3.3.4 The Anusvāra / onuʃʃār (ং - U+0982)
The Anusvāra or onuʃʃār in Bangla at times represents a homorganic nasal, but not always. It replaces a conjunct group of a ‘Nasal Consonant + Hasanta + Consonant’ where the second consonant belongs to the Velar varga or set, as in লংকা. But it often appears also for such combinations involving non-velars appearing as the last member of the combination, as in ল্যাংটা “naked” or ল্যাংচা “a kind of sweet; to limp”. Before a non-varga consonant, the Anusvāra represents a nasal sound that may have an alternative conjoined writing symbol representing the corresponding nasal consonant of the particular set. Although Modern Hindi, Marathi and Konkani prefer the anusvāra to the corresponding half-nasal, in Bangla it is clearly demarcated as to where one must use the Anusvāra and where it has to be a conjunct cluster with a nasal as the first or the second component.
3.3.5 Nasalization: Candrabindu (ঁ - U+0981)
Candrabindu denotes nasalization of the preceding vowel, as in চাঁদ cā̃d ‘moon’ (U+099A U+09BE U+0981 U+09A6). This sign, with a dot inside the half-moon mark, is used as a nasalization marker in many Brāhmī-based scripts [143].
3.3.6 Nukta (় - U+09BC)
The nukta sign does not exist in Bangla orthography. It is predominantly used in many Brāhmī-derived scripts such as Devanāgarī (for Hindi, Bodo, Maithili, Santali, Kashmiri and Sindhi). The term and the concept of nukta are borrowed in Bangla. The IDNA Protocol (RFC 5891) states that IDNs must be in Unicode Normalization Form C (NFC). RFC 7940 applies this requirement to LGRs. The definition of NFC in the Unicode Standard contains a number of composition exclusions. As a result, the Bangla letters য় YYA, ড় RRA and ঢ় RRHA have to be represented in this LGR by using the sequences (YA + Nukta: U+09AF + U+09BC), (DDA + Nukta: U+09A1 + U+09BC) and (DDHA + Nukta: U+09A2 + U+09BC) instead of the single code points YYA (U+09DF), RRA (U+09DC) and RRHA (U+09DD), although the use of ‘Nukta’ is otherwise completely unnatural in Bangla.
It is noted that in the current Unicode Standard chart these characters are listed as additional consonants. As per the LGR Procedure, however, these decisions depend on the IDNA Protocol, through a set of procedures developed by the IETF. Even though the Unicode Standard also prescribes methods to produce these three characters as atomic characters (for example 09DC for ড় [r], 09DD for ঢ় [rh] and 09DF for য় [y], each as a single keystroke), the IDNA protocol requires that we treat them as conjunct characters and then allocate codes for these in the Unicode Bengali Block.
It may be noted that there could be sporadic attempts or cases of writing Muslim names, Urdu poetic words and Perso-Arabic loan words with nukta under ক (k), খ (kh), গ (g), জ (j) and ফ (ph), only for the sake of correct pronunciation and for maintaining the sanctity of the loan word. These were also like using the Bangla writing system to work like the IPA script. It is, however, not in use in Bangla writing in printing.
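The effect of the composition exclusions described above can be observed with any NFC implementation. The following minimal sketch uses Python's standard `unicodedata` module; the names `PRECOMPOSED` and `to_lgr_form` are hypothetical and chosen only for this illustration.

```python
import unicodedata

# Single code point -> base letter + Nukta sequence that NFC produces for it.
PRECOMPOSED = {
    "\u09DF": "\u09AF\u09BC",  # YYA  -> YA + Nukta
    "\u09DC": "\u09A1\u09BC",  # RRA  -> DDA + Nukta
    "\u09DD": "\u09A2\u09BC",  # RRHA -> DDHA + Nukta
}


def to_lgr_form(label: str) -> str:
    """Normalize a label to NFC, the form required for IDN labels."""
    return unicodedata.normalize("NFC", label)


# Because these three letters are composition exclusions, NFC decomposes them
# and never recomposes them, so the two-code-point sequence is the stable form.
for single, sequence in PRECOMPOSED.items():
    assert to_lgr_form(single) == sequence
    assert to_lgr_form(sequence) == sequence
```

This is why the repertoire lists the sequences rather than the single code points: a label typed with U+09DF and one typed with U+09AF U+09BC end up identical after normalization.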
337Visargabiʃɔrgo(ঃ-U+0983)andAvagraha(ঽ-U+09BD)
TheVisargabiʃɔrgoU+0983 is frequentlyused inBangla loanwordsborrowed fromSanskritandrepresentsasoundveryclosetohOnecouldquoteasanexampleদঃখduhkholdquosorrowrsquorsquoldquounhappinessrsquorsquo(U+0926U+0941U+0983U+0916)The Avagraha ঽ (U+09BD) is mainly used in Sanskrit Pali Prakrt or Maithili textswritteninBanglaItisgraduallybeingreplacedbyanuppercomma(egনেরাঽপরািণre-writtenasনেরাrsquoপরািণ)ItisrarelyusednoweveninotherlanguagesusingBanglascriptIncaseofLGRtheAvagrahaisnotpartoftherepertoireIthasbeendecidedthereforenottoretainAvagraha(ঽ)(U+09BD)becauseitisblockedinTLDsaspertheMaximalStartingRepertoire(MSR)PleaseseeAppendixIIinsection11foracompletelistofBanglaconsonantsandtheirallographs
3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)
This note is pertinent to the use of the Zero Width Joiner (ZWJ) and Zero Width Non-Joiner (ZWNJ) in Bangla. It needs to be noted that Nepali, Konkani and Hindi use these two signs in a different manner.

ZWJ (U+200D) and ZWNJ (U+200C) are code points that have been provided by the Unicode Standard to instruct the rendering of a string where the script has the option between joining and non-joining characters. Without the use of these control codes, the string may be rendered in an alternate form from what is intended.

Use of ZWJ
• Insofar as Bangla is concerned, ZWJ is used for the proper rendering of characters such as khaṇḍa-ta ৎ, as in সত্যজিৎ (satyajit) "Satyajit" and সৎ (sat) "honest". This is typed as follows: ta + Hasanta + ZWJ (U+200D).
• However, ZWJ is more important where the same combination of consonantal characters is represented differently depending upon the context. E.g. র + ্ + য has two representations in Bangla: as র্য (repha on ya) and as র‍্য (ra with ya-phalā). To get the form র্য, one has to type র + ্ + য, but for র‍্য the sequence would be র + ZWJ + ্ + য [154]. In other words, ZWJ is used in the rendering of words demanding ya-phalā after ra, which is otherwise not possible to type (render) because the same order, ra + hasanta + antastha ja, is used in the medial and/or final position. Interestingly, ra + hasanta + antastha ja is used to type repha on the consonant antastha ja, as in কার্য (kaarjo). In order to get a ya-phalā after the consonant ra, it is therefore obligatory to use ZWJ after ra, as in র‍্যাপার (wrapper), র‍্যাশ (rash), র‍্যালি (rally), etc. The typing sequence is given below:

ra (র) + ZWJ + hasanta (্) + antastha ja (য) = র‍্য

Use of ZWNJ
• ZWNJ is used in Bangla to represent the explicit Hasanta or Halant. In order to avoid conjunct formation in cases where there is an explicit hasanta before the succeeding consonant, the ZWNJ is used:

Consonant + hasanta + ZWNJ + consonant = explicit hasanta
Example: প্রাক্‌কথন (prakkathana / prakkɔtʰon)

The use of ZWJ and ZWNJ has been ruled out from the root zone by the [Procedure]. They are used in Bangla to create alternate renderings, and the insertion of these two signs can affect searching as well as NLP.

The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Hasanta) where default conjunct formation is to be explicitly restricted and the Hasanta joining the two consonants participating in the conjunct formation needs to be explicitly shown.
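As a rough illustration of the sequences described above, the following Python sketch builds the two renderings of ra + hasanta + ya and shows how a label could be screened for the joiners. The function names `has_joiner` and `strip_joiners` are hypothetical helpers, not part of any LGR tooling.

```python
ZWNJ = "\u200C"     # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"      # ZERO WIDTH JOINER
HASANTA = "\u09CD"  # BENGALI SIGN VIRAMA
RA, YA = "\u09B0", "\u09AF"

# ra + hasanta + ya: rendered as repha on ya
repha_on_ya = RA + HASANTA + YA
# ra + ZWJ + hasanta + ya: rendered as ra with ya-phalā
ra_with_ya_phala = RA + ZWJ + HASANTA + YA


def has_joiner(label):
    """True if the label contains ZWJ or ZWNJ (both ruled out for the root zone)."""
    return ZWJ in label or ZWNJ in label


def strip_joiners(label):
    """Remove ZWJ/ZWNJ; the result may render differently from the input."""
    return label.replace(ZWJ, "").replace(ZWNJ, "")


print(has_joiner(ra_with_ya_phala))                    # True
print(strip_joiners(ra_with_ya_phala) == repha_on_ya)  # True
```

The second check shows why the joiners matter for searching: once the invisible code point is removed, two visually distinct words collapse into the same code point sequence.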
3.3.9 Use of Ya-phalā
The Ya-phalā sequences are two instances in Bangla where Hasanta is preceded by a full vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E):
• অ্যা 0985 09CD 09AF 09BE: BENGALI LETTER A + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
• এ্যা 098F 09CD 09AF 09BE: BENGALI LETTER E + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA

To render Ya-phalā after অ and এ, it is necessary to type U+09CD Hasanta plus U+09AF ya after the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound, as in English 'acid' অ্যাসিড, 'association' অ্যাসোসিয়েশন, 'bat' ব্যাট, 'fat' ফ্যাট, 'mat' ম্যাট, 'cap' ক্যাপ, etc.

The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can carry the Hasanta. But, as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
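The two sequences can be inspected code point by code point. The sketch below is illustrative only; it relies on the standard `unicodedata` module, and the names `S1`, `S2` and `describe` are chosen here to mirror the labels used for these sequences later in the document.

```python
import unicodedata

S1 = "\u0985\u09CD\u09AF\u09BE"  # A + VIRAMA + YA + VOWEL SIGN AA
S2 = "\u098F\u09CD\u09AF\u09BE"  # E + VIRAMA + YA + VOWEL SIGN AA


def describe(sequence):
    """List (code point, Unicode character name) pairs for a sequence."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in sequence]


for label, seq in (("S1", S1), ("S2", S2)):
    print(label)
    for codepoint, name in describe(seq):
        print("  ", codepoint, name)
```

In both sequences the second element is the Hasanta (BENGALI SIGN VIRAMA) and it directly follows an independent vowel, which is the deviant feature discussed above.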
3.3.10 Formation of Ra-phalā and Repha Sequences
This case refers to the formation of repha and ra-phalā as follows:

Ra-Hasanta = (C2 H), where
C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL)
H is 09CD (্ - BENGALI SIGN VIRAMA)

Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALĀ). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র chakra "cycle"). The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or by Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.

The LGR makes a note of this point of concern with respect to the two RAs in disguise, as it would be completely impossible to distinguish between them with the naked eye in a label so generated, which may consequently lead to concerns related to spoofing and other kinds of cyber irregularities. The motive for classing these two code points as (blocking) variants is that fully rendered labels may mask the distinction between Bangla ra র (U+09B0) and Assamese ra ৰ (U+09F0). That provides the justification for Variant Set 4, though only in the context of a following Hasanta. The difference between the RAs is distinguishable only if one looks at their Unicode values. Therefore labels such as অর্ক arka, শীর্ষ śīrṣa 'top, apex', অভ্র abhra 'cloud, the sky' and শ্রম śrama 'physical labour' could be extremely dangerous, as the web user may never verify the digital content (the labels) against its Unicode code points. This point is made explicitly with reference to Table 9 (of sequences) and Table 16 (of WLE symbols) that are to follow.

Moreover, it is noteworthy that the REPHA can also occur with KHANDA TA. The condition in this context is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
4 Overall Development Process and Methodology
The Neo-Brāhmī Generation Panel (NBGP) has been formed by members having experience in Linguistics (especially in NLP/Computational Linguistics), Literature, Language History and Epigraphy. Under the Neo-Brāhmī Generation Panel, Bangla and eight other scripts belonging to separate Unicode blocks are being taken up in order to assign a separate LGR to each. However, an attempt is made to ensure that the fundamental philosophy behind building those LGRs is consistent across all other Brāhmī-derived scripts. The present LGR will cater to multiple languages belonging to EGIDS scale 1 to 4 (see Table 4) that use the Bangla script. The following guiding principles are used in making decisions about Bangla LGR code points.

4.1 Guiding Principles
The NBGP adopts the following broad principles for the selection of code points in the code-point repertoire, across the board, for all the Neo-Brāhmī scripts within its ambit.

4.1.1 Inclusion Principles
4.1.1.1 Modern Usage
Every character proposed should be in the everyday usage of a particular linguistic community. Characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code-point repertoire.
4.1.1.2 Unambiguous Use
Every character proposed should have an unambiguous understanding among linguists about its usage in the language.
4.2 Exclusion Principles
The main exclusion principle is that of External Limits on Scope. These consist of protocols or standards which are prerequisites to the Label Generation Rule-sets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.

4.2.1 External Limits of Scope
The code point repertoire for the root zone being a very special case at the top of the protocol hierarchies, the canvas of available characters for selection as a part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:

i. The Unicode Chart
Out of all the characters that are needed by the script in question, if a particular character is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, and especially so in the Bangla-Asamiyā-Manipuri Writing System, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.

ii. IDNA Protocol
Unicode, being the character-encoding standard for providing the maximum possible representation of a given script/language, has encoded as far as possible all the characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters of the Unicode repertoire from being part of domain names.

iii. Maximal Starting Repertoire (MSR)
The Root-zone LGR being the repertoire of characters which are going to be used for the creation of Root-zone TLDs, which in turn constitute an even more specialized case of domain names, the ROOT LGR procedure introduces additional exclusions on IDNA's allowed set of characters.

Example: Bangla Sign Avagraha ঽ (U+09BD), even if allowed by the IDNA protocol, is not permitted in the Root Zone Repertoire as per the MSR.

To sum up, the restrictions start off by admitting only such characters as are part of the code block of the given script/language. The IDNA Protocol further narrows this down, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.

4.2.1.1 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brāhmī-based scripts will not be included.

4.2.1.2 No Symbols and Abbreviations
Abbreviations, weights and measures and other such iconic characters, like BANGLA ISSHAR ৺ (U+09FA), BANGLA CURRENCY DENOMINATOR SIXTEEN ৹ (U+09F9), etc., will also not be included.

4.2.1.3 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, such as the Sanskritic VOCALIC RR ৠ (U+09E0) and VOCALIC L ঌ (U+098C), as well as VOCALIC LL ৡ (U+09E1), and the allographic kāra forms of the latter two symbols: VOWEL SIGN VOCALIC L ৢ (U+09E2) and VOWEL SIGN VOCALIC LL ৣ (U+09E3). All such characters are excluded, which complies with the Conservatism principle as laid down in the Root Zone LGR procedure. However, in Bangla the kāra corresponding to VOCALIC RR ৠ (U+09E0), which is VOWEL SIGN VOCALIC RR ৄ (U+09C4), is still in active use in certain limited borrowed or Sanskritic words and is therefore retained.

4.2.1.4 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit will not be included. This is also in consonance with the Letter principle as laid down in the Root Zone LGR procedure.

4.2.1.5 ABNF
The Augmented Backus-Naur Formalism (ABNF) is described in Section 5.4.1 and in the Appendix (Section 10.1).
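The idea of successive filters can be sketched in Python as a chain of predicates. This is a toy model under stated assumptions, not the actual IDNA or MSR data: the Bengali block range U+0980..U+09FF is standard, the MSR exclusion of U+09BD is taken from the example above, and the entry in `ASSUMED_IDNA_DISALLOWED` is an illustrative assumption only. The name `first_rejecting_filter` is hypothetical.

```python
# Toy model of the three successive filters; the sets are illustrative samples.
BENGALI_BLOCK = range(0x0980, 0x0A00)   # Unicode Bengali block
ASSUMED_IDNA_DISALLOWED = {0x09FA}      # assumption for illustration (a symbol)
MSR_EXCLUDED = {0x09BD}                 # Avagraha, per the example in the text

FILTERS = [
    ("Unicode block", lambda cp: cp in BENGALI_BLOCK),
    ("IDNA", lambda cp: cp not in ASSUMED_IDNA_DISALLOWED),
    ("MSR", lambda cp: cp not in MSR_EXCLUDED),
]


def first_rejecting_filter(codepoint):
    """Return the name of the first filter that rejects the code point, or None."""
    for name, accepts in FILTERS:
        if not accepts(codepoint):
            return name
    return None


for cp in (0x0995, 0x09BD, 0x0915):
    print(f"U+{cp:04X}:", first_rejecting_filter(cp) or "passes all filters")
```

Ordering the predicates in a list mirrors the layering described in the text: a code point reaches the MSR check only if the earlier layers have already admitted it.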
5 Repertoire
The Bangla Writing System is represented in UNICODE using the Bengali (Bangla) script name as enumerated in ISO 15924, corresponding to languages such as Asamiyā (Assamese), Bangla (Bengali) and Manipuri. The BENGALI block used for Bangla-Asamiyā-Manipuri in UNICODE has 93 entries. This section details the code-point repertoire that the Neo-Brāhmī Generation Panel [NBGP] proposes to be included in the Bangla LGR.

It may be mentioned here that the Government of Assam submitted a proposal to the Bureau of Indian Standards (BIS) on 26th February 2016 for dis-unification of the Bangla and Asamiyā scripts. The BIS, in the 8th Meeting of the Indian Language Technologies and Products Sectional Committee, LITD 20, held on 23rd Aug 2017, decided to refer the proposal for recognition of the Assamese script in ISO/IEC 10646 to ISO. Until the UNICODE Consortium takes any further action, it will be assumed that the Code Point Repertoire under Table 8 will be valid for all three languages named above.

For each of the code points, language references have been given in the last column, titled Reference, of Table 8, the "Code Point Repertoire". For entire coverage of Bangla code points, references for Bangla, Asamiyā (Assamese), Manipuri (Meitei) and Bishnupriya are given. Kokborok written in the Bangla script is not known to have introduced many new complications, except for one particular character. Though only a few representative languages under EGIDS Scale 1-4 have been chosen for referencing, they together cover all the code points required for all the languages that the NBGP has considered, as given under Bangla Unicode Points (as given in UNICODE 6.3).

However, before the details are presented, it is ideal to look at the Bangla Code Point Chart from the Maximal Starting Repertoire [MSR] Version 3. It may be noted that the shapes of the reference glyphs given below in the code charts are based on one of the many fonts designed and are not prescriptive, because there could be some variations in actual fonts, both UNICODE-compatible and True-Type ones. Consider the following code point table:
Figure 1: Bangla Code Page from [MSR] for Bangla-Asamiyā-Manipuri

Colour convention:
All characters that are included in the [MSR]: yellow background
PVALID in IDNA2008 but excluded from the [MSR]: pinkish background
Not PVALID in IDNA2008, or ineligible for the root zone (digits, hyphen): white background

Given the Bangla Unicode Block as in Figure 1, among the code points that are included in the MSR, the following symbols will need separate treatment:
ৎ U+09CE Bangla Letter Khanda-Ta
ৰ U+09F0 Asamiyā-Bangla Letter Ra With Middle Diagonal
ৱ U+09F1 Asamiyā-Bangla Letter Ra With Lower Diagonal
5.1 Code Point Repertoire Inclusion

No | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
1 | U+0981 | ঁ | BENGALI SIGN CANDRABINDU | Candra-bindu | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
2 | U+0982 | ং | BENGALI SIGN ANUSVARA | Onushshar (Anusvara) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
3 | U+0983 | ঃ | BENGALI SIGN VISARGA | Biśarga (Visarga) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
4 | U+0985 | অ | BENGALI LETTER A | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
5 | U+0986 | আ | BENGALI LETTER AA | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
6 | U+0987 | ই | BENGALI LETTER I | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
7 | U+0988 | ঈ | BENGALI LETTER II | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
8 | U+0989 | উ | BENGALI LETTER U | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
9 | U+098A | ঊ | BENGALI LETTER UU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
10 | U+098B | ঋ | BENGALI LETTER VOCALIC R | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
11 | U+098F | এ | BENGALI LETTER E | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
12 | U+0990 | ঐ | BENGALI LETTER AI | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
13 | U+0993 | ও | BENGALI LETTER O | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
14 | U+0994 | ঔ | BENGALI LETTER AU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
15 | U+0995 | ক | BENGALI LETTER KA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
16 | U+0996 | খ | BENGALI LETTER KHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
17 | U+0997 | গ | BENGALI LETTER GA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
18 | U+0998 | ঘ | BENGALI LETTER GHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
19 | U+0999 | ঙ | BENGALI LETTER NGA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
20 | U+099A | চ | BENGALI LETTER CA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
21 | U+099B | ছ | BENGALI LETTER CHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
22 | U+099C | জ | BENGALI LETTER JA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
23 | U+099D | ঝ | BENGALI LETTER JHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
24 | U+099E | ঞ | BENGALI LETTER NYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
25 | U+099F | ট | BENGALI LETTER TTA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
26 | U+09A0 | ঠ | BENGALI LETTER TTHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
27 | U+09A1 | ড | BENGALI LETTER DDA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
28 | 09A1 09BC (U+09DC) | ড় | Normalized form of BENGALI LETTER RRA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125] 09DC is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
29 | U+09A2 | ঢ | BENGALI LETTER DDHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
30 | 09A2 09BC (U+09DD) | ঢ় | Normalized form of BENGALI LETTER RHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125] 09DD is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
31 | U+09A3 | ণ | BENGALI LETTER NNA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
32 | U+09A4 | ত | BENGALI LETTER TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
33 | U+09A5 | থ | BENGALI LETTER THA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
34 | U+09A6 | দ | BENGALI LETTER DA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
35 | U+09A7 | ধ | BENGALI LETTER DHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
36 | U+09A8 | ন | BENGALI LETTER NA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
37 | U+09AA | প | BENGALI LETTER PA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
38 | U+09AB | ফ | BENGALI LETTER PHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
39 | U+09AC | ব | BENGALI LETTER BA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
43 | 09AF 09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125] 09DF is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112] [125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
50 | U+09BE | া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
51 | U+09BF | ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
52 | U+09C0 | ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
53 | U+09C1 | ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112] [122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (= Halant), Virama (= Da ri) | 1 Bangla, 2 Assamese, 2 Manipuri | [112] [122] [125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122] [125]
Table 8: Bangla Code-Point Repertoire
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes some specific sequences. These enable conditional inclusion in the repertoire of Bangla LETTER A and LETTER E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, again followed by Bangla VOWEL SIGN AA, in order to represent the æ sound as in English 'bat', 'cat', etc.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not an exhaustive list) | Reference
S1 | 0985 09CD 09AF 09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112] [122]
S2 | 098F 09CD 09AF 09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9: Sequences
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find a place in Unicode but have not been included in the repertoire in this LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.

Sr. No. | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10: Excluded Code Points
5.3 Code Point Not Used Alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire since it will never be used alone. It is used only within the sequences that form the three special characters ড়, ঢ় and য় in normalized form.

Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড়, ঢ় and য় respectively
Table 10b: Excluded Code Points
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20.11.2009 and 18.07.2013.

5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) began with the following variables:

C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta/Virama
Z → Khanda Ta

The Augmented Backus-Naur Formalism (ABNF) will use the following operators:

Sr. Number | Operator | Function
1 | "|" | Alternative
2 | "[ ]" | Optional
3 | "*" | Variable Repetition
4 | "( )" | Sequence Group
Table 11: The ABNF Formalism
5.4.2 The Vowel Sequence
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given. To facilitate understanding by users of other Brāhmī scripts, equivalents in Devanāgarī are provided wherever necessary.

A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra/onuʃʃār (B), a Candrabindu (D) or a Visarga/biʃɔrgo (X). The number of D, B or X which can follow a V in Bangla may not be restricted to one. Going by the rules illustrated in this document, it is clear that formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have formations or sequences such as anusvara followed by a chandrabindu on the one hand and visarga followed by a chandrabindu on the other, as in হ্যাংচা 'hænchā' and হ্যাঃ 'hæn' respectively. The possibility of a Visarga or Anusvāra (onuʃʃār) following a Candrabindu exists in Bangla.

A vowel can optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phalā. Ya-phalā is a presentation form of U+09AF, the Bangla letter য or 'ya'. Represented by the sequence <U+09CD ্ BENGALI SIGN VIRAMA (Bangla sign Hasanta or Virama), U+09AF য BENGALI LETTER YA>, Ya-phalā has a special form ্য. Again, when combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. 'aa' (ā)), it is used for transcribing [æ], as in the "a" of the English word "bat", written in Bangla as ব্যাট. A vowel sequence admits the following combinations:

5.4.2.1 A Single Vowel
Example:
V: অ / अ

5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or by a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].

Examples:
VB: অং / अं
VD: অঁ / अँ
VX: অঃ / अः
VDB: অঁং / अँं
VDX: অঁঃ / अँः
VHCM: অ্যা, এ্যা

5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].

Examples:
VHCMB: অ্যাং, এ্যাং
VHCMD: অ্যাঁ, এ্যাঁ
VHCMX: অ্যাঃ, এ্যাঃ
VHCMDB: অ্যাঁং, এ্যাঁং
VHCMDX: অ্যাঁঃ, এ্যাঁঃ
5.4.3 The Consonant Sequence

5.4.3.1 A Single Consonant (C)
Example:
C: ক / क

5.4.3.2 A Consonant with Conditions
A Consonant can optionally be followed by a dependent vowel sign kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].

Examples:
CM: কি, কু / कि, कु
CB: কং / कं
CD: কঁ / कँ
CX: কঃ / कः
CH: ক্ / क् (pure consonant)
CDB: কঁং / कँं
CDX: কঁঃ / कँः

5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB or DX.

Examples:
CMB: কীং, কুং / कीं, कुं
CMD: কাঁ / काँ
CMX: বীঃ / वीः
CMDB: কাঁং / काँं
CMDX: কাঁঃ / काँः

5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):

*3(CH)C

Examples:
CHC: ন্ত → ন + ্ + ত / न्त → न + ् + त
CHCHC: ন্ত্র → ন + ্ + ত + ্ + র / न्त्र → न + ् + त + ् + र
CHCHCHC: ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য / न्त्र्य → न + ् + त + ् + र + ् + य

5.4.3.5 Subsets
While considering its subsets, we will take the combination CHC only as a representative example; however, the same is equally applicable to CHCHC and CHCHCHC.

[A] The combination may be followed by M, B, D, X, DB or DX.

Examples:
CHCM: ক্কী → ক ্ ক ী / क्की → क ् क ी
CHCB: ক্কং → ক ্ ক ং / क्कं → क ् क ं
CHCD: ক্কঁ → ক ্ ক ঁ / क्कँ → क ् क ँ
CHCX: ক্কঃ → ক ্ ক ঃ / क्कः → क ् क ः
CHCDB: ক্কঁং → ক ্ ক ঁ ং / क्कँं → क ् क ँ ं
CHCDX: ক্কঁঃ → ক ্ ক ঁ ঃ / क्कँः → क ् क ँ ः

[B] *3(CH)CM may further be followed by B, D, X, DB or DX.

Examples:
CHCMB: ক্কীং → ক ্ ক ী ং / क्कीं → क ् क ी ं
CHCMB: ক্কুং → ক ্ ক ু ং / क्कुं → क ् क ु ं
CHCMD: ক্কাঁ → ক ্ ক া ঁ / क्काँ → क ् क ा ँ
CHCMX: ক্কীঃ → ক ্ ক ী ঃ / क्कीः → क ् क ी ः
CHCMDB: ক্কাঁং → ক ্ ক া ঁ ং / क्काँं → क ् क ा ँ ं
CHCMDX: ক্কাঁঃ → ক ্ ক া ঁ ঃ / क्काँः → क ् क ा ँ ः
5.4.4 The Khanda-Ta Sequence

5.4.4.1 A Single 'Khanda'-Ta (Z)
Example:
Z: ৎ

5.4.4.2 A Khanda Ta Combination¹⁰
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):

[CH]Z

Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) 'scolding'
Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
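The vowel, consonant and Khanda-Ta patterns of Section 5.4 can be approximated with regular expressions over strings of category symbols (V, C, M, B, D, X, H, Z). The sketch below is one possible reading of those patterns, not the normative grammar: the regular expressions and function names are assumptions made here, and the restriction that the C before Z must be a RA is not expressible at the symbol level.

```python
import re

# Optional tail shared by the patterns: [D] followed by [B | X]
TAIL = r"D?[BX]?"

VOWEL_SEQ = re.compile(rf"V(?:HCM)?{TAIL}")                    # V, VB, VDX, VHCMDB, ...
CONSONANT_SEQ = re.compile(rf"(?:CH){{0,3}}C(?:H|M?{TAIL})")   # *3(CH)C plus endings
KHANDA_TA_SEQ = re.compile(r"(?:CH)?Z")                        # [CH]Z


def is_vowel_seq(symbols):
    return VOWEL_SEQ.fullmatch(symbols) is not None


def is_consonant_seq(symbols):
    return CONSONANT_SEQ.fullmatch(symbols) is not None


def is_khanda_ta_seq(symbols):
    return KHANDA_TA_SEQ.fullmatch(symbols) is not None


for s in ("VDB", "VDD", "VHCMDX"):
    print(s, is_vowel_seq(s))
for s in ("CHCHCHC", "CHCHCHCHC", "CMDX"):
    print(s, is_consonant_seq(s))
```

Working on symbol strings keeps the sketch independent of any particular code point table; mapping real labels to symbols is a separate step.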
10 Refer to Rule P in Section 7, Table 16.

5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) can be described briefly here.

Let us take up S in the first instance. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ - BANGLA LETTER A and U+098F এ - BANGLA LETTER E). To render Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF ya after the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound, as in English 'bat', 'fat', etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can carry the Hasanta. But, as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).

The other case refers to the formation of repha and ra-phalā in the said script, mentioned in the table as P. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALĀ). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra "cycle"). The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or by Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.
6 Variants
This section talks about the variants in the Bangla script. The NBGP categorizes these confusing variants into two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from the character formations normally perceived by the larger linguistic community

For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases.

However, cases which belong to Group 2 are proposed to be considered as variants. These are not cases of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formation. They can cause confusion even to a careful observer and are hence being proposed as variants.

Variants are generated in a script when two or more forms are formed with different storage or code points. In Bangla, the e-kāra, ā-kāra and o-kāra have different code points. One can type o with a consonant at one go, or type e-kāra and ā-kāra as two separate keys, getting the same result. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered a case of variants, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants
However, we propose two cases of true in-script variants in the Bangla script.

CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases wherein (U+09A5) থ (tha), with Hasanta, appears as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)

The above combinations, if written in traditional orthography, could be a little confusing, as the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final position (as shown below in example no. 1). It could also be typed wrongly, by thinking it was a হ (ha) U+09B9, increasing the risks in label writing and identification.

Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)

The fonts which represent the traditional Bangla writing system could tend to create this problem. Therefore these may be taken as cases of variants in Bangla.

CASE II
Another interesting example of variants is encountered in the ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra).

'Repha' could be formed by two sequences (mainly because both Assamese and Bangla find a place in the same UNICODE points, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here the final ligatures look the same and will be as follows:

(1) B_RA + H + C
(2) A_RA + H + C

Where
B_RA = U+09B0 BENGALI LETTER RA (র), or
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)

Example:

Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12: Example of Repha
Note: As the Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. The addition of Repha does not make any difference.

'Ra-phalā' could be formed by two sequences on similar grounds, and the final ligatures would look the same:

(1) C1 + H + B_RA
(2) C1 + H + A_RA

Where C1 = any consonant except Khanda-ta

Example:

Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13: Example of Ra-phalā

As the Assamese and Bangla Repha and Ra-phalā conjunct forms look the same, this could cause confusability for end users. Hence the repha and ra-phalā cases need to be defined as variants.

The NBGP concluded to define র and ৰ as variant code points, so that a single variant set of র and ৰ covers all cases. But this will create blocked variant labels: e.g. if someone registers "ররর", the variant label "ৰৰৰ" will be generated as a variant and will be blocked, and vice versa. However, it is blocked only at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, it is still possible.
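The label-level blocking described above can be illustrated with a minimal Python sketch. It assumes a single variant set {U+09B0, U+09F0} and folds the Assamese RA onto the Bangla RA before comparing labels; the function names are hypothetical, and the থ/হ conjunct cases of CASE I are deliberately left out.

```python
B_RA = "\u09B0"  # BENGALI LETTER RA
A_RA = "\u09F0"  # BENGALI LETTER RA WITH MIDDLE DIAGONAL


def fold_ra(label):
    """Map the Assamese RA onto the Bangla RA so variant labels compare equal."""
    return label.replace(A_RA, B_RA)


def are_variant_labels(label_a, label_b):
    """True if the two labels differ at most in their choice of RA."""
    return fold_ra(label_a) == fold_ra(label_b)


registered = B_RA * 3                            # "ররর"
print(are_variant_labels(registered, A_RA * 3))  # True: blocked as a variant
print(are_variant_labels(registered, A_RA * 2))  # False: a different label
```

Comparing folded forms is one simple way to detect that an applied-for label collides with an existing registration; a real implementation would follow the variant mappings defined in the LGR XML.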
6.2 Cross-Script Variants
A crisp cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia¹¹ (formerly Oriya), keeping in mind the visual and technical confusions they may cause as labels in the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain other characters which are somewhat similar, they have not been included here. They have been provided in the Appendix (10.2) for reference.

11 Unicode uses Oriya for the script, although Odia is now the official term used.

1. Bangla and Nāgarī/Devanāgarī Script

Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14: Bangla and Devanāgarī cross-script variant code points

2. Bangla and Gurmukhī Script

Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15: Bangla and Gurmukhī cross-script variant code points
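Tables 14 and 15 can be held in a small lookup structure. The following Python sketch is illustrative only: the dictionary records just the pairs listed in the two tables, keyed by the Bangla code point, and the helper name is an assumption.

```python
# Cross-script variant pairs from Tables 14 and 15, keyed by Bangla code point.
CROSS_SCRIPT_VARIANTS = {
    0x09AE: {"Devanagari": 0x092E, "Gurmukhi": 0x0A38},  # ম
    0x09BF: {"Devanagari": 0x093F, "Gurmukhi": 0x0A3F},  # ি
}


def cross_script_variants(bangla_codepoint):
    """Return the set of variant code points in other scripts (empty if none)."""
    return set(CROSS_SCRIPT_VARIANTS.get(bangla_codepoint, {}).values())


for cp in (0x09AE, 0x09BF, 0x0995):
    variants = sorted(f"U+{v:04X}" for v in cross_script_variants(cp))
    print(f"U+{cp:04X}:", variants or "no cross-script variants listed")
```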
7 WholeLabelEvaluationRules(WLE) ThissectionprovidestheWLEsthatarerequiredbyallthelanguagesmentionedinsection32whenwritten inBangla12ScriptThe ruleshavebeendrafted in suchawaythattheycanbeeasilytranslatedintotheLGRspecificationsBelow are the symbols used in the WLE rules for each of the Indic SyllabicCategoryasmentionedinthetableprovidedinCodepointrepertoire(Section51)
C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9) or (a/e) Ya-phalā (V1 H C1 M1), where
    V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E),
    H is 09CD (্ - BENGALI SIGN VIRAMA),
    C1 is 09AF (য - BENGALI LETTER YA),
    M1 is 09BE (া - BENGALI VOWEL SIGN AA).
    S1 and S2 are valid even if they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where
    C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL),
    H is 09CD (্ - BENGALI SIGN VIRAMA).

Table 16 - Symbols used in WLE rules

12 As used by Unicode, denoting and including both Assamese and Maṇipuri.
It is also perhaps ideal to mention here that in Bangla the consonant letters (or graphemes) are physically joined to form “clusters” that could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are “nearly 380 unique consonant clusters”, out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, Pg. 4]. More details of such combinations can be seen in Pabitra Sarkar (1993) [135].

7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, the following are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়
2. H must be preceded by C
   Example: ক্
3. M must be preceded by C
   Example: কা
4. D must be preceded by either of V, C or M
   Example: আখখা হা
5. X must be preceded by either of V, C, M or D
   Example: উঃখঃবঃাঃ দ ঃ
6. B must be preceded by either of V, C, M or D
   Example: আং ইং কং
7. Z must be preceded by V, C, M, D, B, X or P
   Example: ইৎকৎাৎাৎপ6ৎrৎ
   (S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H (details in 7.1.1, Case of V Preceded by H)
9. S CANNOT be preceded by H
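The context rules above can be sketched as a small checker. This is an illustrative sketch only, under several assumptions: the code point sets below are rough subsets rather than the normative repertoire of Section 5.1; CN is accepted both as precomposed letters and as base letter plus nukta (U+09BC); the sequences S1 and S2 from Table 9 are not modelled; and the names `tokenize` and `is_valid_label` are hypothetical.

```python
H, NUKTA = "\u09CD", "\u09BC"

def _chars(*ranges):
    return {chr(cp) for lo, hi in ranges for cp in range(lo, hi + 1)}

# Illustrative subsets only (assumption); see Section 5.1 for the real repertoire.
CONSONANTS = _chars((0x0995, 0x09A8), (0x09AA, 0x09B0), (0x09B2, 0x09B2),
                    (0x09B6, 0x09B9), (0x09DC, 0x09DD), (0x09DF, 0x09DF),
                    (0x09F0, 0x09F1))
VOWELS = _chars((0x0985, 0x098B), (0x098F, 0x0990), (0x0993, 0x0994))
MATRAS = _chars((0x09BE, 0x09C3), (0x09C7, 0x09C8), (0x09CB, 0x09CC))
SINGLE = {"\u0982": "B", "\u0981": "D", "\u0983": "X", H: "H", "\u09CE": "Z"}

def tokenize(label):
    """Map a label to the symbols of Table 16 (C, M, V, B, D, X, H, Z, S, P)."""
    tokens, i = [], 0
    while i < len(label):
        ch = label[i]
        if ch in "\u0985\u098F" and label[i + 1:i + 4] == H + "\u09AF\u09BE":
            tok, width = "S", 4          # (a/e) Ya-phala: V1 H C1 M1
        elif ch in "\u09B0\u09F0" and label[i + 1:i + 2] == H:
            tok, width = "P", 2          # Ra-Hasanta: C2 H
        elif ch in "\u09A1\u09A2\u09AF" and label[i + 1:i + 2] == NUKTA:
            tok, width = "C", 2          # CN written as base letter + nukta
        elif ch in CONSONANTS:
            tok, width = "C", 1
        elif ch in VOWELS:
            tok, width = "V", 1
        elif ch in MATRAS:
            tok, width = "M", 1
        elif ch in SINGLE:
            tok, width = SINGLE[ch], 1
        else:
            raise ValueError(f"U+{ord(ch):04X} is outside this sketch's repertoire")
        tokens.append(tok)
        i += width
    return tokens

# Rules 2-7: which symbols may immediately precede each symbol.
ALLOWED_BEFORE = {"H": "C", "M": "C", "D": "VCM",
                  "X": "VCMD", "B": "VCMD", "Z": "VCMDBX"}

def is_valid_label(label):
    prev = None
    for tok in tokenize(label):
        # S ends with M and P ends with H, so following symbols see them that way.
        tail = {"S": "M", "P": "H"}.get(prev, prev)
        if tok in ALLOWED_BEFORE:
            ok = prev is not None and (
                tail in ALLOWED_BEFORE[tok] or (tok == "Z" and prev == "P"))
            if not ok:
                return False
        if tok in ("V", "S") and tail == "H":   # rules 8 and 9
            return False
        prev = tok
    return True
```

Treating S as a symbol that ends with M follows the note under rule 7; treating P as ending with H is a design choice of this sketch so that rules 8 and 9 also apply after a Ra-Hasanta.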
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in allowing such ligatures: any user can create them easily, even though they may not be attested combinations.

7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া /bæŋkʌv ɪndiə/ (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE), meaning “Bank of India”. This is the case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.

This is a unique situation necessitated by the lack of a hyphen, a space or the Zero Width Non-Joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptual similarity between two labels (with and without H) for the majority of the linguistic communities; hence this is explicitly prohibited by the NBGP. In future, depending on the prevailing requirements from the community, a future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF

Below are a few examples which help one understand some of the rules that ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.

1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Example:
্ক    ्क
াক    ाक
ংক    ंक
ঁক    ँक
ঃক    ःक
As can be seen, such a combination will automatically result in a “golu” or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.
2. H is not permitted after V, B, D, X, M or S.
Example:
অ अ
অং क ক क কঃ कः ি )क
3. The number of B, D or X permitted after a consonant, a vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalid.
Example:
কংং क
ক क
কঃঃ कःः
কা का
কীঃঃ कःः
অংং अ
অ अ
অঃঃ अःः
4. The number of M permitted after a consonant is restricted to one.
Example:
কীী    की
5. M is not permitted after V.
Example:
ইা ঈৗ    ईा ईौ
6. The combinations of Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Example:
কংঃ    कंः
কঃং    कःं
8 Contributors

8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members
8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals’ Systems, Dhaka
9 References

[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata; New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje; Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. ‘Bangla Harapher Panch Parba’. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the MS University of Baroda, Social Science Number 29.2: 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25/11/2017.
[124] Ethnologue: Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25/11/2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20/10/2019.
[126] Wikipedia: Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25/11/2017.
[129] Omniglot: Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10/5/2018.
[130] Wikipedia: Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25/11/2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20/, accessed on 10/5/2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10/5/2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10/5/2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1123-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1.2: 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). ‘Srīhatta-Nāgarī Lipir Utpatti o Bikāś’. Bangla Academy Patrika (Dhaka), Vol. 12 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia: Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19/5/2018.
[139] Furui, Ryosuke (2015). ‘Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century’, in Bhairabi Prasad Sahu & Hermann Kulke, eds. Interrogating Political Systems: Integrative Processes and States in Pre-Modern India, Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). ‘Phones of Bengali’. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646. UNICODE. https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia: Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. ‘Bangla Spelling Reform: the Long and Short of It’. Bangla Journal 19: 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. “What has happened So Far In terms of Script Reforms”. Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka, on 10/07/2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 – Core Specification, Chapter 12, p. 473.
10 Appendix-I

10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when applied to a specific language or script certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla¹³ in particular, the following rules apply:
1. Khaṇḍa Ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold ‘varga’ (as defined under Table 5). Bangla does not normally allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem ‘Lichuchor’ written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa Ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).

2. C H can come with Khaṇḍa Ta only in the case where C is ra (র) (09B0):
   র্ৎ as in ভর্ৎসনা

3. Only the following combinations with V H C M will be allowed:
   → অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
   → এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
10.2 ‘Sylheti Nagarī lipi’ or ‘Siloti’
This version of the Bangla script resembles the ‘Kaithī’ script (ISO 12954) used by the accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, widely in use during the 1880s. There were several other names of Sylheti Nagarī or Siloti (129), such as ‘Jalalabada Nagarī’, ‘Fula (flower) Nagarī’, ‘Muslim Nagarī’ or ‘Muhammad Nagarī’. It is said that Shah Jalal had brought the script with him to Sylhet in the 13th-14th century (138), although some have suggested that it was an invention by the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here (138).

13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
Table 17 – The Script Table of Sylheti Nagarī or Siloti

10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī

Bangla        Devanāgarī    NBGP Decision
ঃ U+0983      ः U+0903      Confusable
ও U+0993      उ U+0909      Confusable
ঘ U+0998      घ U+0918      Confusable
ঁ U+0981      ॅ U+0945      Confusable

Table 18 - Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhi

Bangla        Gurmukhi      NBGP Decision
ঘ U+0998      ਬ U+0A2C      Confusable
ঁ U+0981      ੱ U+0A71      Confusable

Table 19 - Bangla and Gurmukhi confusable code points
Bangla                    Gurmukhi      NBGP Decision
ও U+0993                  ਤ U+0A24      Distinguishable
শ U+09B6                  ਅ U+0A05      Distinguishable
ম U+09AE                  ਮ U+0A2E      Distinguishable
বা U+09AC and U+09BE      ਗ U+0A17      Distinguishable

Table 20 – Bangla and Gurmukhī distinguishable code points
10.3.3 Bangla and Oriya (Odia)

Bangla        Oriya (Odia)  NBGP Decision
ও U+0993      ଓ U+0B13      Confusable

Table 21 – Bangla and Oriya confusable code points

Bangla        Oriya (Odia)  NBGP Decision
ঘ U+0998      ସ U+0B38      Distinguishable

Table 22 – Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs

Consonants    Phonetic Value    Allographs: Clusters    Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ Thisletterdoesnothaveanyparticularphoneticvaluebutmostlypronouncedasn
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒThesecondarysymbol(allograph)jɔ-phalahastwophoneticvaluesWhenaddedtotheinitialconsonantinaworditisavowelaelig(asinশ8ামলর 8াপারetc)Butafteranon-initialconsonantitjustdoublesitinpronunciation(asinকায6ধায6etc)The+যcombinationhastwophysicalmanifestationsmdashর 8andয6
ক8(p+য)স8(+য)র 8(+য)[Justর 8isneverusedinBanglaorthographyর 8াisbutthenitslasttwosymbolsYa-phalaa-karaconstituteavowelsignrepresentingthevowelঅ8া]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ hwordfinallywordmediallyitdoublesthepronunciationofthefollowingconsonant
অঃকঃ
অব
depending on the position of the consonant in the word. At times these ‘assumed’ or ‘inherent’ vowels are not pronounced at all [142].

Vowels can be written as independent letters, or by using a variety of diacritical marks which are written above, below, before, after, or both before and after the consonant they follow in pronunciation [105].

All Bangla consonants, when pronounced in isolation, are uttered with an inherent vowel -ɔ; hence ক ‘k’, খ ‘kh’ or গ ‘g’ are usually pronounced as [kɔ], [khɔ] or [gɔ], etc. Phonologically, the Bangla vowel -ɔ corresponds to the Hindi schwa ə.
When consonants occur together in clusters, special conjunct letters are formed. In printed Bangla many of these consonantal clusters or conjoined consonants are in use. The letters for the consonants other than the final one in the group are generally reduced. But there are a few special conjunct characters which are compounds of the consonant characters, e.g. ক্ (k) + ষ (s) = ক্ষ (ks), ঞ্ (n) + জ (j) = ঞ্জ (nj), জ্ (j) + ঞ (n) = জ্ঞ (jn), হ্ (h) + ম (m) = হ্ম (hm). There are other issues also: র as the second member of a cluster is reduced to a secondary symbol, e.g. প্ (p) + র (r) = প্র (pr), ষ্ (s) + ট্ (t) + র (r) = ষ্ট্র (str) (as in উষ্ট্র ustra “camel”). য (y), when used as a primary symbol, represents jɔ in Bangla. But its secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word it is a vowel æ (as in শ্যামল (syamala) “green”, র‍্যাপার (ryapara) “wrapper”, etc.). But after a non-initial consonant it just doubles it in pronunciation (as in কার্য, ধার্য, etc.). The র্ (r) + য (y) combination has two renderings: র‍্য (ry) and র্য (ry). In the case of দ্ (d) + ধ (dh), গ্ (g) + ধ (dh) and ন্ (n) + ধ (dh), the shape of the second member is changed, e.g. দ্ধ (ddh), গ্ধ (gdh) and ন্ধ (ndh) respectively. The solitary example of র্ (r) + ঋ (r) = র্ঋ (as in নৈর্ঋত nairrt “Southwest”), used mostly in cases of Classical borrowings, shows the use of the secondary symbol of a consonant followed by the primary symbol of a vowel. The inherent vowel only applies to the final consonant of the cluster.

In consonant clusters many consonants take a completely different form. Some typical examples are ক্ত (kt), ক্র (kr), ক্ষ (ks), গ্ধ (gdh), জ্ঞ (jn), ঞ্চ (nc), ঞ্জ (nj), ত্ত (tt), ন্ত (nt), ন্ধ (ndh), ব্ধ (bdh), ভ্র (bhr), ম্ব (mb), স্ত (st), etc. র has two allographs apart from its full shape: one is ‘repha’, as found in র্ক (rk), র্প (rp), and the other is ra-phala, as in প্র (pr), ক্র (kr). ষ্ণ (s+n) is another one, where the cerebral nasal consonant sign takes a queer shape [151].
The Bangla script has at least fifty-two primary symbols and quite a few allographs (positional variants of them), corresponding to forty-four phonemes (7 oral and 7 nasal vowels and 30 consonants) (150) or functional speech sounds, with some obvious redundancies, although in one of the first phonemic analyses the number was thought to be thirty-five phonemes [140].

As mentioned above, in Bangla several graphemic symbols have secondary shapes, technically called ‘allographs’, with a complementary distribution in each case. These graphs or markings are generally added to the following positions of the primary symbol [113], in the following manner:
1) Below (e.g. কু (ku), ন্ত (nta), ক (ku), হ্র (hra), etc.)
2) Above (e.g. চ (ca), র্ক (rka), etc.)
3) Right side (e.g. কা (ka), কং (kan), etc.)
4) Left side (e.g. কে (ke))
5) Left side and above simultaneously (e.g. কৈ (kai), কি (ki), etc.)
6) Right side and above simultaneously (e.g. কী (kī))
7) Right side and left side simultaneously (e.g. কো (ko))
8) Right side, left side and above simultaneously (e.g. কৌ (kau))
As for the complementary distribution of vowel letters (word- or syllable-initial) and vowel Mātrās, which are relevant for ABNF, let us consider the following. Besides some simple vowel modifiers, called ‘Kārs’ in Bangla (also referred to as Mātrā in the other LGR documents of Neo-Brāhmī), there are some combinatory modifiers of Bangla vowels with certain consonants. For example, whereas

আ U+0986 BENGALI LETTER AA is substituted by া U+09BE BENGALI VOWEL SIGN AA,
ই U+0987 BENGALI LETTER I is substituted by pre-posed ি U+09BF BENGALI VOWEL SIGN I,
ঈ U+0988 BENGALI LETTER II is substituted by ী U+09C0 BENGALI VOWEL SIGN II, or
উ U+0989 BENGALI LETTER U is substituted by ু U+09C1 BENGALI VOWEL SIGN U by marking below the primary grapheme,

there are some special vowel modifiers of উ, as in the following combined letters:

গু gu, rather than writing as গ (g) + ু (u)
রু ru, rather than writing as র (r) + ু (u)
শু śu, rather than writing as শ (s) + ু (u)
হু hu, rather than writing as হ (h) + ু (u)
ন্তু ntu, rather than writing as ন্ (n) + ত (t) + ু (u)

Similarly, there could be vowel modifiers of ঊ or ‘(Long) ū’ as well, e.g. ভ্ (bh) + র (r) (ভ্রূ bhru “eyebrow”), শ্ (s) + র (r) (শ্রূ sru), ঋ (r) after হ (h) (হৃ hr), etc.
There have been many notable contributions in simplifying and modifying Bangla spellings and combinatory techniques, especially by scholars such as Pabitra Sarkar (1992) [134]. In this there has been an attempt to reduce the number of allographs of both vowels and consonants in clusters, and it has been widely accepted in the printing of school texts in both Bangladesh and West Bengal [151, 152]. As of now, two systems, the old (traditional) and the new, go on side by side, operative in different domains.
However, in preparation of this LGR document, the aim has been to consider the widely used and usable sequences and combinations and their variations across the sister scripts belonging to the basket of Brāhmī writing systems. Bangla Academy, Dhaka, published Standard Bangla Spelling Rules in 1992, following the recommendations of a committee constituted through a workshop jointly organized by the Jatīya Śiksakrama and Pathyapustaka Board in 1988. A thoroughly revised edition of the Rules was published in September 2012.⁶

After the establishment of the Bāṅlā Ākādemi of West Bengal in 1986, its first President, Annadasankar Ray (1904-2002), in his inaugural address gave a direction for standardization of the Bangla alphabet, script and spelling system, and clearly argued that they would not blindly follow the Sanskritic model of conventional grammar. A broad list of proposals was sent to experts on Bangla, and a broad agreement was reached for ‘homogenization of Bangla spelling’ by 1988. Based on opinions received from different quarters, a unanimous list of ‘rules’ was agreed upon. This was published as a ‘Spelling Dictionary’ titled Ākādemi Bānāna Abhidhāna (1997), which was obviously more comprehensive than ‘The University of Calcutta proposals’ made in 1936. Along with the ‘rationalization’ of spellings, another step was taken to make the writing system easier to read by making the symbols used, both single and combined ones, more ‘transparent’. These reforms were originally suggested by Sarkar (1987, first published in 1978) [134] [153], where he used the terms Swaccha (‘Transparent’) and Aswaccha (‘Opaque’ or non-transparent), even adding Ardha Swaccha (‘half transparent’) in between the two. Some sample examples are:

6 Bangla Academy. 2012. Bāṅlā Ekaḍemī Pramita Bāṅlā Bānānera Niyama (Bangla Academy Standard Bangla Spelling Rules). Dhaka: Bangla Academy.
Transparent: ন্ন (nn), প্ত (pt), স্ত (st), where both members of the cluster can be recognized.

Opaque, where neither of the two could be (easily) recognized: ক্ষ ks (ক্ k + ষ s), জ্ঞ jn (জ্ j + ঞ n), ঙ্গ ng (ঙ্ n + গ g), হ্ম hm (হ্ h + ম m).

Semi-transparent: প্র (pr), র্প (rp), where one symbol is recognizable and the other is not. In the case of three-term clusters, at least one symbol will not be transparent, e.g. স্ত্র str (স্ s + ত্ t + র r), ষ্ট্র str (ষ্ s + ট্ t + র r), etc.
There were in fact two types of proposals. One concerned the shape of the letters: those of consonant + vowel (CV) combinations and of conjuncts, which are consonant + consonant combinations. There were further complex shapes, i.e. those of consonant + consonant + (consonant +) vowel (CC(C)V) signs, as in প্রু (pru) or স্ক্রু (skru). Some decisions in this area were necessary because a few of the CC(C) symbols represented complexities that made learning them difficult for children. The other dealt with the spellings of words only, without any reference to the shapes of the letters in which they were written. The basic objective here was ‘one word, one spelling’, to the greatest extent that was possible [151].

Below we place a statement of the most salient changes that affect the consonant + vowel combinations [153]:
a. The variants of the short u (হ্রস্ব উ-কার, hrasva u-kāra) vowel sign have been brought down to one, i.e. ু. So the special ligated shape of গু (gu) is now written as গ with the regular sign. Similarly for রু (ru), শু (śu) and হু (hu), and therefore for cluster + short u sign: ন্তু (ntu) (ন + ্ + ত + উ), স্তু (stu) (স + ্ + ত + উ).

b. The variants of long u (দীর্ঘ ঊ-কার, dīrgha ū-kāra) have also been reduced: রূ (rū), ভ্রূ (bhrū) (ভ bh + ্ + র r + ঊ ū), দ্রূ (drū) (দ d + ্ + র r + ঊ ū), শ্রূ (śrū) (শ ś + ্ + র r + ঊ ū).

c. The variants of ঋ-কার (ṛ-kāra, the secondary symbol of ṛ) have been brought down to one: হৃ (hṛ).
Regarding consonant + consonant + (consonant) … + (vowel) clusters, Paschimbanga Bangla Akademi proposed transparent or semi-transparent shapes for clusters to the extent admissible in the Bangla writing system. Some examples will clarify the proposal. (A slash will mean that the traditional cluster-shape precedes it, while the Bangla Akademi innovation follows.) [153]
Xব ধ bdh ( b+ ধ dh) Mদ ধ ddh (J d+ধ dh) ন থ nth (L n+থ th) Uঞ চ ntildec
(9 ntilde+চ c) ঞ ছ $ ntildech (9+ছ) ঞ জ ntildej (9 ntilde+জ j) Sক ত amp kt (7 k+ত t) T
kr (7 k+র r) Nগ ধ ( gdh (K g+ধ dh) ) ṅk (u ṅ+ক k) t ṅg (u ṅ+গ g) +
ṣṇ (B ṣ+ণ ṇ) ন ndhr (L n+ dh+র r) - ṇḍr ( ṇ+ ḍ+র r) ktr (7 k+x
t+র r)
3.3.1 The Consonants
As per traditional classification, Bangla consonants are categorized according to their phonetic properties, especially in terms of place and manner of articulation [107]. There are five ‘Varga’ (pronounced as ‘Barga’ in Bangla) or groups (sets or classes) distinguished by place of articulation, and one non-‘varga’ group [105]. Each Varga, which corresponds to stops at a certain place of articulation, contains a series of five consonants classified as per their phonetic qualities (i.e. manner of articulation), beginning from unvoiced and unaspirated, through to voiced and aspirated (in the fourth column), and finally ending with a homorganic or corresponding nasal [107]. Consider the following table:
‘Varga’ or Sets   Unvoiced -Asp    Unvoiced +Asp    Voiced -Asp      Voiced +Asp      Nasal
Velar             ক ‘K’ U+0995     খ ‘KH’ U+0996    গ ‘G’ U+0997     ঘ ‘GH’ U+0998    ঙ ‘NG’ U+0999
Palatal           চ ‘C’ U+099A     ছ ‘CH’ U+099B    জ ‘J’ U+099C     ঝ ‘JH’ U+099D    ঞ ‘NY’ U+099E
Retroflex         ট ‘TT’ U+099F    ঠ ‘TTH’ U+09A0   ড ‘DD’ U+09A1    ঢ ‘DDH’ U+09A2   ণ ‘NN’ U+09A3
Dental            ত ‘T’ U+09A4     থ ‘TH’ U+09A5    দ ‘D’ U+09A6     ধ ‘DH’ U+09A7    ন ‘N’ U+09A8
Bilabial          প ‘P’ U+09AA     ফ ‘PH’ U+09AB    ব ‘B’ U+09AC     ভ ‘BH’ U+09AD    ম ‘M’ U+09AE

Table 5 - Varga classification of Bangla consonants
(Falling into a pattern of five sets of Unvoiced Unaspirated, Unvoiced Aspirated, Voiced Unaspirated, Voiced Aspirated and Nasals, called five ‘Varga’)
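Because the varga consonants are encoded in the same order as the rows and columns of Table 5, the classification can be derived from the code point alone. The sketch below illustrates this; the function name `classify` and the label strings are illustrative choices, not terms defined by this proposal.

```python
VARGAS = ["Velar", "Palatal", "Retroflex", "Dental", "Bilabial"]
MANNERS = ["unvoiced unaspirated", "unvoiced aspirated",
           "voiced unaspirated", "voiced aspirated", "nasal"]
# First code point of each row in Table 5. U+09A9 is unassigned in the
# Bengali block, which is why the Bilabial row starts at U+09AA.
ROW_START = [0x0995, 0x099A, 0x099F, 0x09A4, 0x09AA]

def classify(ch):
    """Return (varga, manner) for a varga consonant, or None for anything else."""
    cp = ord(ch)
    for varga, start in zip(VARGAS, ROW_START):
        if start <= cp < start + 5:
            return varga, MANNERS[cp - start]
    return None
```

For instance, `classify("ধ")` gives `("Dental", "voiced aspirated")`, while a non-varga consonant such as য returns `None`.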
Non-Varga
য ‘Y’ U+09AF      য় ‘YY’ U+09DF     র ‘R’ U+09B0      ড় ‘RR’ U+09DC     ঢ় ‘RH’ U+09DD
ল ‘L’ U+09B2      শ ‘SH’ U+09B6     ষ ‘SS’ U+09B7     স ‘S’ U+09B8      হ ‘H’ U+09B9

Table 6 - Non-Varga consonants (not falling into any of the five categories)
3.3.2 The Implicit Vowel Killer: Hasanta (called ‘Halant’ or ‘Halanta’ in other Brāhmī-based scripts)
As stated earlier, all consonants are pronounced in isolation with an implicit vowel (central back -ɔ in Bangla as the neutral vowel) assumed to be associated with them [121]. The terms ‘Hasanta’ (= ‘Halant’ or ‘Halanta’ in other Brāhmī-based scripts) and ‘Virāma’⁷ (= ‘Dāṛi’ in Bangla), the latter as preferred in UNICODE (cf. Unicode 3.0 and above), have been used in this report to denote the character that marks the absence of this inherent vowel. It may be noted that the term virama has been adopted in UNICODE in a sense that is different from the traditional definition of grammar, and hence it requires some explanation here. Considering the importance of the document, this note should be a part of this LGR document, so that anybody referring to it is able to know the proper grammatical explanation of the term. Because a special sign is needed whenever this implicit vowel is stripped off, the symbol is known as the Hasanta (= Halant) (U+09CD). By placing the Hasanta under the first consonant of a combination or cluster, one could, in common parlance, “kill” its vowel and create conjuncts. In this manner, conjunct characters can generally be written by joining two to four consonants. In rare cases this process can join up to five consonants. However, the notion of a maximum number of consonants joining to form one aksara⁸ is to be bounded empirically. This is an observation based on the CIIL-Emille Corpora of Bangla words [132 & 133] as seen in print these days. Given the mixture of scripts and languages happening on the web, the possibility that one may want a generic Top Level Domain [gTLD] which has more than the observed maximum cannot be ruled out. This can be the case when a foreign-language word which admits a large number of consonants is transliterated into Bangla. Hence, in the Bangla LGR work, this limit will not be enforced.
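As an illustration of how the Hasanta joins consonants, the following sketch builds a conjunct and measures the longest cluster in a string. It is only a sketch: the consonant test is a rough code point range (an assumption that ignores a few unassigned positions and the letters outside U+0995 to U+09B9), and the names `conjunct` and `longest_cluster` are hypothetical.

```python
HASANTA = "\u09CD"  # BENGALI SIGN VIRAMA

def conjunct(*consonants):
    """Join consonants with Hasanta so that only the last keeps its inherent vowel."""
    return HASANTA.join(consonants)

def longest_cluster(text):
    """Number of consonants in the longest Hasanta-joined run found in `text`."""
    best = run = 0
    joined = False                      # True when the previous code point was a Hasanta
    for ch in text:
        if ch == HASANTA:
            joined = True
            continue
        if "\u0995" <= ch <= "\u09B9":  # rough consonant range (assumption)
            run = run + 1 if joined else 1
            best = max(best, run)
        else:
            run = 0
        joined = False
    return best
```

A function of this kind could be used to survey cluster lengths in a corpus; it does not impose an upper bound, in line with the decision above not to enforce one.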
3.3.3 Vowels
Separate symbols exist for all ‘Swara’ or vowels in Bangla, which are pronounced independently either at the beginning of the word or after another vowel or consonant sound. To indicate a vowel sound other than the implicit one, a vowel sign called ‘kār’ in Bangla or Mātrā in Nāgarī (footnote 9) is attached to the consonant. Since the consonant has this built-in neutral vowel at the end, there are equivalent kāras (Mātrās) for all vowels except অ (pronounced -ɔ). The correlation is shown as follows:

Footnote 7: Virāma as used here is also a misnomer according to the Indian grammatical traditions. Nowhere is the mere absence of a vowel marked as virama. Hasanta just marks the absence of a vowel, nothing else. (Abhyankar, Kashinath Vasudev & J. M. Shukla. 1961. A Dictionary of Sanskrit Grammar. Baroda: Oriental Institute.)
Footnote 8: This term needs to be disambiguated. Aksara also means ‘syllable’ in Indian grammatical traditions.
Vowel | Corresponding vowel sign (kāra / Mātrā)
অ ‘A’ U+0985 | (none)
আ ‘AA’ U+0986 | া U+09BE
ই ‘I’ U+0987 | ি U+09BF
ঈ ‘II’ U+0988 | ী U+09C0
উ ‘U’ U+0989 | ু U+09C1
ঊ ‘UU’ U+098A | ূ U+09C2
ঋ Vocalic ‘R’ U+098B | ৃ U+09C3
ৠ Vocalic ‘RR’ U+09E0 | ৄ U+09C4
ঌ Vocalic ‘L’ U+098C | ৢ U+09E2
ৡ Vocalic ‘LL’ U+09E1 | ৣ U+09E3
এ ‘E’ U+098F | ে U+09C7
ঐ ‘AI’ U+0990 | ৈ U+09C8
ও ‘O’ U+0993 | ো U+09CB
ঔ ‘AU’ U+0994 | ৌ U+09CC
- | ৗ U+09D7
Could appear on top of অ ‘A’ U+0985 or any other vowel | ঁ U+0981 Candrabindu
Could appear after অ ‘A’ U+0985 or any other vowel | ং U+0982 Anusvara
Could appear after অ ‘A’ U+0985 or any other vowel | ঃ U+0983 Visarga
After any consonant | ্ U+09CD (Hasanta)
- | ঽ U+09BD Avagraha
Table 7: Bangla vowels with corresponding kārs

Footnote 9: Although the term ‘Matra’ in Bangla stands for an altogether different concept, viz. the top bar placed over a letter, typically available in Hindi and Bangla but missing in Gujarati.
3.3.4 The Anusvāra / onuʃʃār (ং - U+0982)
The Anusvāra or onuʃʃār in Bangla at times represents a homorganic nasal, but not always. It replaces a conjunct group of ‘Nasal Consonant + Hasanta + Consonant’ where the second consonant belongs to the Velar varga or set, as in লংকা. But it often appears also for such combinations involving non-velars as the last member of the combination, as in ল্যাংটা “naked” or ল্যাংচা “a kind of sweet; to limp”. Before a non-varga consonant the Anusvāra represents a nasal sound that may have an alternative conjoined writing symbol representing the corresponding nasal consonant of the particular set. Although Modern Hindi, Marathi and Konkani prefer the anusvāra to the corresponding half-nasal, in Bangla it is clearly demarcated where one must use the Anusvāra and where it has to be a conjunct cluster with a nasal as the first or the second component.
3.3.5 Nasalization: Candrabindu (ঁ - U+0981)
Candrabindu denotes nasalization of the preceding vowel, as in চাঁদ cad ‘moon’ (U+099A U+09BE U+0981 U+09A6). This sign, with a dot inside the half-moon mark, is used as a nasalization marker in many Brāhmī-based scripts [143].
3.3.6 Nukta (় - U+09BC)
The nukta sign does not exist in Bangla orthography. It is predominantly used in many Brāhmī-derived scripts such as Devanāgarī (for Hindi, Bodo, Maithili, Santali, Kashmiri and Sindhi). The term and the concept of nukta are borrowed in Bangla.
The IDNA Protocol (RFC 5891) states that IDNs must be in Unicode Normalization Form C (NFC). RFC 7940 applies this requirement to LGRs. The definition of NFC in the Unicode Standard contains a number of composition exclusions. As a result, the Bangla letters য় YYA, ড় RRA and ঢ় RHA have to be represented in this LGR by the sequences (YA + Nukta: U+09AF + U+09BC), (DDA + Nukta: U+09A1 + U+09BC) and (DDHA + Nukta: U+09A2 + U+09BC) instead of the single code points YYA (U+09DF), RRA (U+09DC) and RHA (U+09DD), although the use of ‘Nukta’ is otherwise completely unnatural in Bangla.
It is noted that in the current Unicode Standard chart these characters are listed as additional consonants. As per the LGR Procedure, however, these decisions depend on the IDNA Protocol through a set of procedures developed by the IETF. Even though the Unicode Standard also provides these three characters as atomic characters (for example 09DC for ড় [r], 09DD for ঢ় [rh] and 09DF for য় [y], each entered as a single keystroke), the IDNA protocol requires that we treat them as composite sequences built from code points of the Unicode Bengali block.
It may be noted that there could be sporadic attempts or cases of writing Muslim names, Urdu poetic words and Perso-Arabic loan words with a nukta under ক (k), খ (kh), গ (g), জ (j) and ফ (ph), only for the sake of correct pronunciation and for maintaining the sanctity of the loan word. These were akin to using the Bangla writing system like the IPA script. The practice is, however, not in use in printed Bangla writing.
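The effect of the composition exclusions can be checked directly. The following is a minimal sketch using Python's standard `unicodedata` module; the helper names `to_nfc` and `code_points` are illustrative and not part of the LGR. It shows that NFC turns each precomposed letter into the base letter plus nukta.

```python
import unicodedata

# Precomposed Bengali letters (composition exclusions) mapped to the
# base-letter + nukta sequences named in the text above.
PRECOMPOSED_TO_SEQUENCE = {
    "\u09DC": "\u09A1\u09BC",  # RRA -> DDA  + NUKTA
    "\u09DD": "\u09A2\u09BC",  # RHA -> DDHA + NUKTA
    "\u09DF": "\u09AF\u09BC",  # YYA -> YA   + NUKTA
}


def to_nfc(label):
    """Return the NFC form of a label, the form IDNA requires."""
    return unicodedata.normalize("NFC", label)


def code_points(text):
    """List the code points of a string in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


# NFC of each precomposed letter, keyed by the precomposed letter.
nfc_results = {src: to_nfc(src) for src in PRECOMPOSED_TO_SEQUENCE}

if __name__ == "__main__":
    for src, dst in nfc_results.items():
        print(code_points(src), "->", code_points(dst))
```

Because a label is always normalized before evaluation, only the two-code-point sequences can ever reach the repertoire check.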
3.3.7 Visarga / biʃɔrgo (ঃ - U+0983) and Avagraha (ঽ - U+09BD)
The Visarga / biʃɔrgo U+0983 is frequently used in Bangla loan words borrowed from Sanskrit and represents a sound very close to h. One could quote as an example দুঃখ duhkho “sorrow”, “unhappiness” (U+09A6 U+09C1 U+0983 U+0996).
The Avagraha ঽ (U+09BD) is mainly used in Sanskrit, Pali, Prakrt or Maithili texts written in Bangla. It is gradually being replaced by an apostrophe (e.g. নরোঽপরাণি re-written as নরো’পরাণি). It is rarely used now, even in other languages using the Bangla script. In the case of the LGR, the Avagraha is not part of the repertoire: it has been decided not to retain Avagraha (ঽ) (U+09BD) because it is blocked in TLDs as per the Maximal Starting Repertoire (MSR).
Please see Appendix II in Section 11 for a complete list of Bangla consonants and their allographs.
3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)
This note is pertinent to the use of the Zero Width Joiner (ZWJ) and Zero Width Non-Joiner (ZWNJ) in Bangla. It needs to be noted that Nepali, Konkani and Hindi use these two signs in a different manner.
ZWJ (U+200D) and ZWNJ (U+200C) are code points that have been provided by the Unicode Standard to instruct the rendering of a string where the script has the option between joining and non-joining characters. Without the use of these control codes, the string may be rendered in an alternate form from what is intended.
Use of ZWJ
• Insofar as Bangla is concerned, ZWJ is used for the proper rendering of characters such as khaṇḍa-ta ৎ, as in সত্যজিৎ (satyajit) “Satyajit” and সৎ (sat) “honest”. This is typed as follows: ta + Hasanta + ZWJ (U+200D).
• However, ZWJ is more important where the same combination of consonantal characters is represented differently depending upon the context. For example, র + ্ + য has two representations in Bangla: as র্য (repha placed over ya) and as র‍্য (ra carrying a ya-phalā). To get the form র্য one has to type র + ্ + য, but for র‍্য the sequence would be র + ZWJ + ্ + য [154]. In other words, ZWJ is used in the rendering of words demanding ya-phalā after ra, which is otherwise not possible to type (render) because the same order, ra + hasanta + antastha ja, is already used in the medial and/or final position. Interestingly, ra + hasanta + antastha ja is used to type repha on the consonant antastha ja, as in কার্য (kaarjo). In order to get a ya-phalā after the consonant ra, it is therefore obligatory to use ZWJ after ra, as in র‍্যাপার (wrapper), র‍্যাশ (rash), র‍্যালি (rally), etc. The typing sequence is given below:
ra (র, U+09B0) + ZWJ (U+200D) + hasanta (্, U+09CD) + antastha ja (য, U+09AF) = র‍্য
Use of ZWNJ
• ZWNJ is used in Bangla to represent the explicit Hasanta or Halant. In order to avoid conjunct formation in cases where there is an explicit hasanta before the succeeding consonant, the ZWNJ is used:
Consonant + hasanta + ZWNJ + consonant = explicit hasanta
Example: প্রাক্‌কথন (prakkathana, prakkɔtʰon)
The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Hasanta) where default conjunct formation is to be explicitly restricted and the Hasanta joining the two consonants participating in the conjunct formation needs to be explicitly shown.
The use of ZWJ and ZWNJ has been ruled out from the root zone by the [Procedure]. Although they are used in Bangla to create alternate renderings, the insertion of these two signs can affect searching as well as NLP.
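Since neither joiner is permitted in the root zone, a candidate label can be screened for them before any other rule is applied. The sketch below is illustrative only; the function names `find_joiners` and `is_joiner_free` are assumptions, not part of the LGR tooling.

```python
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"   # ZERO WIDTH JOINER


def find_joiners(label):
    """Return (index, code point) pairs for every ZWJ or ZWNJ in the label."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(label)
        if ch in (ZWJ, ZWNJ)
    ]


def is_joiner_free(label):
    """True if the label contains neither ZWJ nor ZWNJ."""
    return not find_joiners(label)


# ra + ZWJ + hasanta + ya: the ya-phala rendering described in the text.
RA_YAPHALA = "\u09B0" + ZWJ + "\u09CD\u09AF"
# ra + hasanta + ya: the same letters without the joiner (repha on ya).
REPHA_YA = "\u09B0\u09CD\u09AF"

if __name__ == "__main__":
    print(find_joiners(RA_YAPHALA))
    print(is_joiner_free(REPHA_YA))
```

A consequence worth noting is that the ra + ya-phalā form cannot be requested in a root zone label, because the only sequence that produces it contains ZWJ.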
3.3.9 Use of Ya-phalaa
Ya-phalaa sequences are the two instances in Bangla where Hasanta is preceded by a full vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E):
• অ্যা: 0985 09CD 09AF 09BE: BENGALI LETTER A + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
• এ্যা: 098F 09CD 09AF 09BE: BENGALI LETTER E + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD Hasanta plus U+09AF ya, preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound, as in English ‘acid’ অ্যাসিড, ‘association’ অ্যাসোসিয়েশন, ‘bat’ ব্যাট, ‘fat’ ফ্যাট, ‘mat’ ম্যাট, ‘cap’ ক্যাপ, etc.
The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as ‘vowel killer’, although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in the orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
3.3.10 Formation of Ra-phalaa and Repha Sequences
This case refers to the formation of repha and ra-phalā as follows:
Ra-Hasanta = (C2 H), where C2 is either
09B0 (র - BENGALI LETTER RA) or
09F0 (ৰ - ASSAMESE LETTER RA; Unicode name BENGALI LETTER RA WITH MIDDLE DIAGONAL),
and H is 09CD (্ - BENGALI SIGN VIRAMA).
Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka “the sun”); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র chakra “cycle”). The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or by Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.
The LGR makes a note of this point of concern with respect to the two RAs in disguise, as it would be completely impossible to distinguish between them with the naked eye in a label so generated, which may consequently lead to concerns related to spoofing and other kinds of cyber irregularities. The motive for classing these two code points as (blocking) variants is that fully rendered labels may mask the distinction between Bangla ra র (U+09B0) and Assamese ra ৰ (U+09F0). That provides the justification for Variant Set 4, though only in the context of a following Hasanta. The difference between the RAs is distinguishable only if one looks at their Unicode values. Therefore labels such as অর্ক arka, শীর্ষ śīrṣa ‘top, apex’, অভ্র abhra ‘cloud, the sky’ and শ্রম śrama ‘physical labour’ could be extremely dangerous, as the web user may never verify the digital content (the labels) against its Unicode code points. This point is made explicitly with reference to Table 9 (of sequences, p. 36) and Table 16 (of WLE Symbols, p. 47) that are to follow.
Moreover, it is noteworthy that the REPHA can also occur with KHANDA TA. The conditions in this context of KHANDA TA are such that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
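The collision between the two RAs can be illustrated with a comparison key. This is a hedged sketch, not the LGR's variant mechanism: the helper names are hypothetical, and treating a RA as interchangeable whenever it is directly followed or directly preceded by Hasanta is an assumption drawn from the "followed or preceded by the common Hasanta" wording above.

```python
BANGLA_RA = "\u09B0"    # BENGALI LETTER RA
ASSAMESE_RA = "\u09F0"  # BENGALI LETTER RA WITH MIDDLE DIAGONAL
HASANTA = "\u09CD"      # BENGALI SIGN VIRAMA


def ra_variant_key(label):
    """Map Assamese RA to Bangla RA wherever it is adjacent to a Hasanta.

    Assumption for illustration: adjacency on either side (repha or
    ra-phala) makes the two RAs visually indistinguishable.
    """
    out = []
    for i, ch in enumerate(label):
        prev_ch = label[i - 1] if i > 0 else ""
        next_ch = label[i + 1] if i + 1 < len(label) else ""
        if ch == ASSAMESE_RA and HASANTA in (prev_ch, next_ch):
            out.append(BANGLA_RA)
        else:
            out.append(ch)
    return "".join(out)


def collide(label_a, label_b):
    """True if two different labels share the same comparison key."""
    return label_a != label_b and ra_variant_key(label_a) == ra_variant_key(label_b)


ARKA_BANGLA = "\u0985\u09B0\u09CD\u0995"    # a + ra + hasanta + ka
ARKA_ASSAMESE = "\u0985\u09F0\u09CD\u0995"  # same word typed with U+09F0

if __name__ == "__main__":
    print(collide(ARKA_BANGLA, ARKA_ASSAMESE))
```

Under this key, a standalone RA (for example ra followed by a vowel sign) keeps its identity, which matches the statement that the variant relation applies only in the Hasanta context.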
4 Overall Development Process and Methodology
The Neo-Brāhmī Generation Panel (NBGP) has been formed by members having experience in Linguistics (especially in NLP and Computational Linguistics), Literature, Language History and Epigraphy. Under the Neo-Brāhmī Generation Panel, Bangla and eight other scripts belonging to separate Unicode blocks are being taken up to assign a separate LGR for each. However, an attempt is made to ensure that the fundamental philosophy behind building those LGRs is consistent across all Brāhmī-derived scripts. The present LGR will cater to multiple languages belonging to EGIDS scale 1 to 4 (see Table 4) that use the Bangla script. The following guiding principles are used in making decisions about Bangla LGR code points.
4.1 Guiding Principles
The NBGP adopts the following broad principles for the selection of code points in the code-point repertoire, across the board, for all the Neo-Brāhmī scripts within its ambit.
4.1.1 Inclusion Principles
4.1.1.1 Modern Usage
Every character proposed should be in the everyday usage of a particular linguistic community. Characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code-point repertoire.
4.1.1.2 Unambiguous Use
Every character proposed should have an unambiguous understanding among linguists about its usage in the language.
4.2 Exclusion Principles
The main exclusion principle is that of External Limits on Scope. These consist of protocols or standards which are prerequisites to the Label Generation Rule-sets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.
4.2.1 External Limits of Scope
The code point repertoire for the root zone being a very special case at the top of the protocol hierarchies, the canvas of available characters for selection as a part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:
i. The Unicode Chart
Out of all the characters that are needed by the script in question, if a particular character is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, and especially so in the Bangla-Asamiyā-Manipuri Writing System, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.
ii. IDNA Protocol
Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded, as far as possible, all the characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters of the Unicode repertoire from being part of domain names.
iii. Maximal Starting Repertoire (MSR)
The Root-zone LGR being the repertoire of characters which are going to be used for the creation of Root-zone TLDs, which in turn constitute an even more specialized case of domain names, the ROOT LGR procedure introduces additional exclusions on IDNA’s allowed set of characters.
Example: Bangla Sign Avagraha ঽ (U+09BD), even if allowed by the IDNA protocol, is not permitted in the Root Zone Repertoire as per the MSR.
To sum up, the restrictions start off with admitting only such characters as are part of the code block of the given script/language. The IDNA Protocol further narrows this down, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
4.2.1.1 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brāhmī-based scripts will not be included.
4.2.1.2 No Symbols and Abbreviations
Abbreviations, weights and measures and other such iconic characters like BENGALI ISSHAR ৺ (U+09FA), BENGALI CURRENCY DENOMINATOR SIXTEEN ৹ (U+09F9) etc. will also not be included.
4.2.1.3 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, such as the Sanskritic VOCALIC RR ৠ (U+09E0) and VOCALIC L ঌ (U+098C) as well as VOCALIC LL ৡ (U+09E1), and the allographic kāra forms of the latter two symbols: VOWEL SIGN VOCALIC L ৢ (U+09E2) and VOWEL SIGN VOCALIC LL ৣ (U+09E3). All such characters are excluded, which complies with the Conservatism principle as laid down in the Root Zone LGR procedure. However, in Bangla the kāra corresponding to VOCALIC RR ৠ (U+09E0), which is VOWEL SIGN VOCALIC RR ৄ (U+09C4), is still in active use in certain limited borrowed or Sanskritic words and is therefore retained.
4.2.1.4 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit will not be included. This is also in consonance with the Letter principle as laid down in the Root Zone LGR procedure.
4.2.1.5 ABNF
The Augmented Backus-Naur Formalism (ABNF) is described in Section 5.4.1 and the Appendix (Section 10.1).
5 Repertoire
The Bangla Writing System is represented in UNICODE using the Bengali (Bangla) script name as enumerated in ISO 15924, corresponding to languages such as Asamiyā (Assamese), Bangla (Bengali) and Manipuri. The BENGALI block used for Bangla-Asamiyā-Manipuri in UNICODE has 93 entries. This section details the code-point repertoire that the Neo-Brāhmī Generation Panel [NBGP] proposes to be included in the Bangla LGR.
It may be mentioned here that the Government of Assam submitted a proposal to the Bureau of Indian Standards (BIS) on 26th February 2016 for dis-unification of the Bangla and Asamiyā scripts. The BIS, in its 8th Meeting of the Indian Language Technologies and Products Sectional Committee, LITD 20, held on 23rd Aug 2017, decided to refer the proposal for recognition of the Assamese script in ISO/IEC 10646 to ISO. Until the UNICODE Consortium takes any further action, it will be assumed that the Code Point Repertoire under Table 8 will be valid for all three languages named above.
For each of the code points, language references have been given in the last column, titled Reference, under Table 8, titled the “Code Point Repertoire”. For entire coverage of Bangla code points, references for Bangla, Asamiyā (Assamese), Manipuri (Meitei) and Bishnupriya are given. Kokborok written in the Bangla script is not known to have introduced many new complications, except for one particular character. Though only a few representative languages under EGIDS Scale 1-4 have been chosen for referencing, they together cover all the code points required for all the languages that the NBGP has considered, as given under Bangla Unicode Points (as given in UNICODE 6.3).
However, before the details are presented, it is ideal to look at the Bangla Code Point Chart from the Maximal Starting Repertoire [MSR] Version 3. It may be noted that the shapes of the reference glyphs given below in the code charts are based on one of the many fonts designed and are not prescriptive, because there could be some variations in actual fonts, both UNICODE-compatible and True-Type ones. Consider the following code point table:
Figure 1: Bangla Code Page from [MSR] for Bangla-Asamiyā-Manipuri

Colour convention:
All characters that are included in the [MSR]: yellow background
PVALID in IDNA2008 but excluded from the [MSR]: pinkish background
Not PVALID in IDNA2008, or ineligible for the root zone (digits, hyphen): white background

Given the Bangla Unicode Block as in Figure 1, among the code points that are included in the MSR, the following symbols will need a separate treatment:
ৎ U+09CE Bangla Letter Khanda-Ta
ৰ U+09F0 Asamiyā-Bangla Letter Ra With Middle Diagonal
ৱ U+09F1 Asamiyā-Bangla Letter Ra With Lower Diagonal
5.1 Code Point Repertoire Inclusion
No | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
1 | U+0981 | ঁ | BENGALI SIGN CANDRABINDU | Candra-bindu | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
2 | U+0982 | ং | BENGALI SIGN ANUSVARA | Onushshar (Anusvara) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
3 | U+0983 | ঃ | BENGALI SIGN VISARGA | Biśarga (Visarga) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
4 | U+0985 | অ | BENGALI LETTER A | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
5 | U+0986 | আ | BENGALI LETTER AA | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
6 | U+0987 | ই | BENGALI LETTER I | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
7 | U+0988 | ঈ | BENGALI LETTER II | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
8 | U+0989 | উ | BENGALI LETTER U | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
9 | U+098A | ঊ | BENGALI LETTER UU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
10 | U+098B | ঋ | BENGALI LETTER VOCALIC R | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
11 | U+098F | এ | BENGALI LETTER E | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
12 | U+0990 | ঐ | BENGALI LETTER AI | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
13 | U+0993 | ও | BENGALI LETTER O | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
14 | U+0994 | ঔ | BENGALI LETTER AU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
15 | U+0995 | ক | BENGALI LETTER KA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
16 | U+0996 | খ | BENGALI LETTER KHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
17 | U+0997 | গ | BENGALI LETTER GA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
18 | U+0998 | ঘ | BENGALI LETTER GHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
19 | U+0999 | ঙ | BENGALI LETTER NGA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
20 | U+099A | চ | BENGALI LETTER CA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
21 | U+099B | ছ | BENGALI LETTER CHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
22 | U+099C | জ | BENGALI LETTER JA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
23 | U+099D | ঝ | BENGALI LETTER JHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
24 | U+099E | ঞ | BENGALI LETTER NYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
25 | U+099F | ট | BENGALI LETTER TTA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
26 | U+09A0 | ঠ | BENGALI LETTER TTHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
27 | U+09A1 | ড | BENGALI LETTER DDA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
28 | 09A1 09BC (U+09DC) | ড় | Normalized form of BENGALI LETTER RRA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125] 09DC is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
29 | U+09A2 | ঢ | BENGALI LETTER DDHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
30 | 09A2 09BC (U+09DD) | ঢ় | Normalized form of BENGALI LETTER RHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125] 09DD is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
31 | U+09A3 | ণ | BENGALI LETTER NNA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
32 | U+09A4 | ত | BENGALI LETTER TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
33 | U+09A5 | থ | BENGALI LETTER THA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
34 | U+09A6 | দ | BENGALI LETTER DA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
35 | U+09A7 | ধ | BENGALI LETTER DHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
36 | U+09A8 | ন | BENGALI LETTER NA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
37 | U+09AA | প | BENGALI LETTER PA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
38 | U+09AB | ফ | BENGALI LETTER PHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
39 | U+09AC | ব | BENGALI LETTER BA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
43 | 09AF 09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125] 09DF is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112][125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
50 | U+09BE | া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
51 | U+09BF | ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
52 | U+09C0 | ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
53 | U+09C1 | ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112][122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (= Halant), Virama (= Da ri) | 1 Bangla, 2 Assamese, 2 Manipuri | [112][122][125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122][125]
Table 8: Bangla Code-Point Repertoire
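The repertoire of Table 8 can be written down compactly as a set of code point ranges. The following is a minimal sketch of a membership check, assuming the 61 single code points and the three nukta sequences listed in the table; the names `REPERTOIRE` and `invalid_code_points` are illustrative, and the check covers repertoire membership only, not the whole-label evaluation rules.

```python
def _span(first, last):
    """Inclusive range of code points as a set."""
    return set(range(first, last + 1))


# Single code points of Table 8 (entries 28, 30 and 43 are sequences).
REPERTOIRE = (
    _span(0x0981, 0x0983)                  # candrabindu, anusvara, visarga
    | _span(0x0985, 0x098B)                # vowels A .. VOCALIC R
    | {0x098F, 0x0990, 0x0993, 0x0994}     # E, AI, O, AU
    | _span(0x0995, 0x09A8)                # KA .. NA
    | _span(0x09AA, 0x09B0)                # PA .. RA
    | {0x09B2}                             # LA
    | _span(0x09B6, 0x09B9)                # SHA .. HA
    | _span(0x09BE, 0x09C4)                # vowel signs AA .. VOCALIC RR
    | {0x09C7, 0x09C8, 0x09CB, 0x09CC}     # vowel signs E, AI, O, AU
    | {0x09CD, 0x09CE, 0x09F0, 0x09F1}     # hasanta, khanda ta, two RAs
)

NUKTA = 0x09BC
NUKTA_BASES = {0x09A1, 0x09A2, 0x09AF}     # DDA, DDHA, YA


def invalid_code_points(label):
    """Return the code points of a label that fall outside the repertoire.

    NUKTA is accepted only directly after DDA, DDHA or YA.
    """
    bad = []
    for i, ch in enumerate(label):
        cp = ord(ch)
        if cp in REPERTOIRE:
            continue
        if cp == NUKTA and i > 0 and ord(label[i - 1]) in NUKTA_BASES:
            continue
        bad.append(f"U+{cp:04X}")
    return bad


if __name__ == "__main__":
    print(invalid_code_points("\u09AC\u09BE\u0982\u09B2\u09BE"))  # the word "Bangla"
```

Expressing the table as ranges makes it easy to see the gaps left by excluded code points such as U+098C.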
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes some specific sequences which enable conditional inclusion of the Bangla LETTER A and E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, again followed by Bangla VOWEL SIGN AA, in the repertoire, for enabling inclusion of the æ sound as in English ‘bat’, ‘cat’ etc.
Sr. No | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
S1 | 0985 09CD 09AF 09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112][122]
S2 | 098F 09CD 09AF 09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9: Sequences
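The two sequences of Table 9 are the only contexts in which a Hasanta may directly follow an independent vowel. A hedged sketch of that check follows; the function name `vowel_hasanta_ok` is hypothetical, and the vowel set is taken from the repertoire in Table 8.

```python
HASANTA = "\u09CD"

# Independent vowels of the repertoire (Table 8, entries 4 to 14).
VOWELS = set("\u0985\u0986\u0987\u0988\u0989\u098A\u098B\u098F\u0990\u0993\u0994")

# S1 and S2 from Table 9: A or E + VIRAMA + YA + VOWEL SIGN AA.
ALLOWED_VOWEL_SEQUENCES = (
    "\u0985\u09CD\u09AF\u09BE",
    "\u098F\u09CD\u09AF\u09BE",
)


def vowel_hasanta_ok(label):
    """True if every vowel + Hasanta pair begins sequence S1 or S2."""
    for i in range(len(label) - 1):
        if label[i] in VOWELS and label[i + 1] == HASANTA:
            # str.startswith accepts a tuple of prefixes and a start offset.
            if not label.startswith(ALLOWED_VOWEL_SEQUENCES, i):
                return False
    return True


if __name__ == "__main__":
    acid = "\u0985\u09CD\u09AF\u09BE\u09B8\u09BF\u09A1"  # the loan word "acid"
    print(vowel_hasanta_ok(acid))
```

Labels without any vowel + Hasanta pair pass trivially, so this check only constrains the special ya-phalā case.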
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find a place in Unicode but have not been included in the repertoire in the LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.

Sr. No | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10: Excluded Code Points
5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire as a standalone code point, since it will never be used alone. It will be used only in sequences, in the normalized forms of the three special characters ড়, ঢ় and য়.

Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড়, ঢ় and য় respectively
Table 10b: Excluded Code Points
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20.11.2009 and 18.07.2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) began with the following variables:
C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta

The Augmented Backus-Naur Formalism (ABNF) will use the following operators:

Sr. Number | Operator | Function
1 | “|” (vertical bar) | Alternative
2 | “[ ]” | Optional
3 | “*” | Variable Repetition
4 | “( )” | Sequence Group
Table 11: The ABNF Formalism

In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given to facilitate understanding.
5.4.2 The Vowel Sequence
To facilitate understanding for users of other Brāhmī scripts, equivalents in Devanāgarī are provided wherever necessary.
A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra / onuʃʃār (B), a Candrabindu (D) or a Visarga / biʃɔrgo (X). The number of D, B or X which can follow a V in Bangla need not be restricted to one. Going by the rules illustrated in the document, it is clear that formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have sequences that combine a candrabindu with an anusvāra on the one hand, and a candrabindu with a visarga on the other, as in হ্যাঁংচা ‘hænchā’ and হ্যাঁঃ ‘hæn’ respectively. The possibility of a Visarga or Anusvāra (onuʃʃār) following a Candrabindu exists in Bangla.
A vowel can optionally be followed by a combination of Hasanta / Virama [H] and Consonant [C] to form a Ya-phala. “Ya-phala is a presentation form of U+09AF য, Bangla letter ‘ya’. Represented by the sequence <U+09CD ্ BENGALI SIGN VIRAMA, U+09AF য BENGALI LETTER YA>, Ya-phala has a special form ্য. When combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. ‘aa’ (ā)), it is used for transcribing [æ] as in the ‘a’ in the English word ‘bat’”, written in Bangla as ব্যাট.
A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example (Devanāgarī equivalent in parentheses):
V: অ (अ)

5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples:
VB: অং (अं)
VD: অঁ (अँ)
VX: অঃ (अः)
VDB: অঁং (अँं)
VDX: অঁঃ (अँः)
VHCM: অ্যা, এ্যা

5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
VHCMB: অ্যাং, এ্যাং
VHCMD: অ্যাঁ, এ্যাঁ
VHCMX: অ্যাঃ, এ্যাঃ
VHCMDB: অ্যাঁং, এ্যাঁং
VHCMDX: অ্যাঁঃ, এ্যাঁঃ
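The vowel-sequence rules above translate naturally into a regular expression. The sketch below is one possible reading of the formalism, not the normative rule set: the character classes are assumptions derived from the code point repertoire of Section 5.1, and the optional tail is written as an optional D followed by an optional B or X, which yields exactly the combinations B, D, X, DB and DX.

```python
import re

# Character classes assumed from the repertoire (illustrative only).
V = "[\u0985-\u098B\u098F\u0990\u0993\u0994]"
C = ("(?:[\u09A1\u09A2\u09AF]\u09BC"          # DDA/DDHA/YA + nukta
     "|[\u0995-\u09A8\u09AA-\u09B0\u09B2\u09B6-\u09B9\u09F0\u09F1])")
M = "[\u09BE-\u09C4\u09C7\u09C8\u09CB\u09CC]"
B, D, X, H = "\u0982", "\u0981", "\u0983", "\u09CD"

# Optional tail: nothing, B, D, X, DB or DX (never BB, DD, XX or BD).
TAIL = f"(?:{D}?(?:{B}|{X})?)"

# V, optionally followed by H C M, optionally followed by the tail.
VOWEL_SEQUENCE = re.compile(f"{V}(?:{H}{C}{M})?{TAIL}")


def is_vowel_sequence(text):
    """True if the whole string is one vowel sequence under this sketch."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ("\u0985", "\u0985\u0981\u0983", "\u0985\u0982\u0982"):
        print([f"U+{ord(c):04X}" for c in sample], is_vowel_sequence(sample))
```

This pattern accepts any consonant and any kāra in the HCM slot, as the formalism is written; restricting it to the two ya-phalā sequences of Table 9 would require replacing `C` and `M` in that slot with U+09AF and U+09BE.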
5.4.3 The Consonant Sequence

5.4.3.1 A Single Consonant (C)
Example (Devanāgarī equivalent in parentheses):
C: ক (क)

5.4.3.2 A Consonant with Conditions
A Consonant optionally followed by a dependent vowel sign kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
CM: কি (कि)
CB: কং (कं)
CD: কঁ (कँ)
CX: কঃ (कः)
CH: ক্ (क्) (pure consonant)
CDB: কঁং (कँं)
CDX: কঁঃ (कँः)

5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB or DX.
Examples:
CMB: কীং (कीं)
CMD: কাঁ (काँ)
CMX: বীঃ (वीः)
CMDB: কাঁং (काँं)
CMDX: কাঁঃ (काँः)

5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Examples:
CHC: ন্ত → ন + ্ + ত (न + ् + त)
CHCHC: ন্ত্র → ন + ্ + ত + ্ + র (न + ् + त + ् + र)
CHCHCHC: ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য (न + ् + त + ् + र + ् + य)

5.4.3.5 Subsets
While considering its subsets, as a representative example we will consider the combination CHC only; however, the same is equally applicable to CHCHC and CHCHCHC.
[A] The combination may be followed by M, B, D, X, DB or DX.
Examples:
CHCM: ক্কী → ক ্ ক ী (क्की → क ् क ी)
CHCB: ক্কং → ক ্ ক ং (क्कं → क ् क ं)
CHCD: ক্কঁ → ক ্ ক ঁ (क्कँ → क ् क ँ)
CHCX: ক্কঃ → ক ্ ক ঃ (क्कः → क ् क ः)
CHCDB: ক্কঁং → ক ্ ক ঁ ং (क्कँं → क ् क ँ ं)
CHCDX: ক্কঁঃ → ক ্ ক ঁ ঃ (क्कँः → क ् क ँ ः)
[B] *3(CH)CM may further be followed by a B, D, X, DB or DX.
Examples:
CHCMB: ক্কীং → ক ্ ক ী ং (क्कीं → क ् क ी ं)
CHCMD: ক্কাঁ → ক ্ ক া ঁ (क्काँ → क ् क ा ँ)
CHCMX: ক্কীঃ → ক ্ ক ী ঃ (क्कीः → क ् क ी ः)
CHCMDB: ক্কাঁং → ক ্ ক া ঁ ং (क्काँं → क ् क ा ँ ं)
CHCMDX: ক্কাঁঃ → ক ্ ক া ঁ ঃ (क्काँः → क ् क ा ँ ः)
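The consonant-sequence rules can likewise be sketched as a regular expression. This is an illustrative reading rather than the normative rule set: the character classes are assumptions based on the repertoire of Section 5.1, the bound of three leading C+H pairs reflects the "up to 4" consonants stated in this section, and a trailing Hasanta is accepted only on a single consonant, since that is the only case listed.

```python
import re

# Character classes assumed from the repertoire (illustrative only).
C = ("(?:[\u09A1\u09A2\u09AF]\u09BC"          # DDA/DDHA/YA + nukta
     "|[\u0995-\u09A8\u09AA-\u09B0\u09B2\u09B6-\u09B9\u09F0\u09F1])")
M = "[\u09BE-\u09C4\u09C7\u09C8\u09CB\u09CC]"
B, D, X, H = "\u0982", "\u0981", "\u0983", "\u09CD"

# Optional tail: nothing, B, D, X, DB or DX.
TAIL = f"(?:{D}?(?:{B}|{X})?)"

# Either a pure consonant (C H), or up to three C+H pairs, a final
# consonant, an optional kara and the optional tail.
CONSONANT_SEQUENCE = re.compile(
    f"(?:{C}{H}|(?:{C}{H}){{0,3}}{C}{M}?{TAIL})"
)


def is_consonant_sequence(text):
    """True if the whole string is one consonant sequence under this sketch."""
    return CONSONANT_SEQUENCE.fullmatch(text) is not None


if __name__ == "__main__":
    cluster = "\u09A8\u09CD\u09A4\u09CD\u09B0\u09CD\u09AF"  # na-ta-ra-ya cluster
    print(is_consonant_sequence(cluster))
```

If the empirical cluster limit is not to be enforced, as stated in Section 3.3.2, the `{0,3}` bound could be relaxed to `*` without changing the rest of the pattern.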
5.4.4 The Khanda-Ta Sequence

5.4.4.1 A Single ‘Khanda’-Ta (Z)
Example:
Z: ৎ (= त्)

5.4.4.2 A Khanda Ta Combination (footnote 10)
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) ‘scolding’.
Note: The conditions in this context of KHANDA TA are that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
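The Khanda-Ta rule, with its restriction of C to the two RAs, is small enough to state as a single pattern. The following sketch is illustrative; the name `is_khanda_ta_sequence` is hypothetical.

```python
import re

RA = "[\u09B0\u09F0]"  # Bangla RA or Assamese RA
H = "\u09CD"           # Hasanta
Z = "\u09CE"           # Khanda Ta

# [CH]Z with C restricted to RA: an optional RA + Hasanta, then Khanda Ta.
KHANDA_TA_SEQUENCE = re.compile(f"(?:{RA}{H})?{Z}")


def is_khanda_ta_sequence(text):
    """True if the whole string is Z or RA + H + Z."""
    return KHANDA_TA_SEQUENCE.fullmatch(text) is not None


if __name__ == "__main__":
    print(is_khanda_ta_sequence("\u09B0\u09CD\u09CE"))  # ra + hasanta + khanda ta
```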
5.4.5 Special Cases S and P
Two special cases involving Sequences (referred to as S and P in Table 16 under Section 7) could be described briefly here. Let us take up S in the first instance. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E). For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF ya, preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English ‘bat’, ‘fat’ etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as ‘vowel killer’, although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in the orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
Another case refers to the formation of repha and ra-phalā in the said script, mentioned in the table above as P. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka “the sun”); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra “cycle”). The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or by Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.

Footnote 10: Refer to Rule P in Section 7, Table 16.
6 Variants
This section talks about the variants in the Bangla script. The NBGP categorizes these confusing variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from the character formations normally perceived by the larger linguistic community
For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formations. These can cause confusion even to a careful observer, and hence are being proposed as variants.
Variants are generated in a script when two or more visually identical forms are stored with different code points. In Bangla the e-kāra, the ā-kāra and the o-kāra have different code points. One can type o with a consonant in one go, or type e-kāra and ā-kāra as two separate keys, getting the same result. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered as a case of variant, because a kāra followed by a kāra is not allowed.
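The o-kāra case can be illustrated with Unicode normalization. The sketch below uses Python's standard `unicodedata` module and shows that the two-key spelling (e-kāra followed by ā-kāra) and the one-key spelling (o-kāra) of "ko" become the same string under NFC; the variable names are illustrative.

```python
import unicodedata

KA = "\u0995"      # BENGALI LETTER KA
E_KAR = "\u09C7"   # BENGALI VOWEL SIGN E
AA_KAR = "\u09BE"  # BENGALI VOWEL SIGN AA
O_KAR = "\u09CB"   # BENGALI VOWEL SIGN O

one_key = KA + O_KAR            # "ko" typed with the single o-kara
two_keys = KA + E_KAR + AA_KAR  # "ko" typed as e-kara + aa-kara

# NFC composes e-kara + aa-kara into the single o-kara code point.
normalized = unicodedata.normalize("NFC", two_keys)

if __name__ == "__main__":
    print([f"U+{ord(c):04X}" for c in two_keys])
    print([f"U+{ord(c):04X}" for c in normalized])
    print(normalized == one_key)
```

Since labels are required to be in NFC, the two spellings never coexist as distinct labels, which is consistent with not listing them as variants.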
6.1 In-Script Variants

However, we propose two cases of true in-script variants in the Bangla script.

CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases wherein থ (tha, U+09A5), joined by Hasanta, appears in a conjunct with স (sa, U+09B8) or ন (na, U+09A8):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, could be a little confusing, since the থ (tha) in the conjunct appears like a হ (ha). The conjunct could occur in initial, medial or final position (as shown below in example no. 1). It could also be typed wrongly, on the assumption that it was a হ (ha) U+09B9, increasing the risks in label writing and identification.
Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthūla, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)
Fonts that represent the traditional Bangla writing system tend to create this problem. Therefore, these may be taken as cases of variants in Bangla.

CASE II
Another interesting example of variants is encountered in the ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing ‘repha’ (involving ra + Hasanta) and ‘ra-phalā’ (involving Hasanta + ra).
‘Repha’ could be formed by two sequences (mainly because both Assamese and Bangla share the same Unicode code points, and ‘B_RA’ as well as ‘A_RA’ refer to the same phonetic element). Here the final ligatures look the same and are as follows:
(1) B_RA + H + C
(2) A_RA + H + C
Where
B_RA = U+09B0 BENGALI LETTER RA (র), or
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)
Example
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12: Example of Repha
Note: As the Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. The addition of Repha does not make any difference.
‘Ra-phalā’ could be formed by two sequences on similar grounds, and the final ligatures would look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
Where C1 = any consonant except Khaṇḍa-ta
Example
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13: Example of Ra-phalā
As the Assamese and Bangla Repha and Ra-phalā conjunct forms look the same, this could cause confusion for end users. Hence the repha and ra-phalā cases need to be defined as variants. The NBGP concluded that র and ৰ should be defined as variant code points, so that a single variant set containing র and ৰ covers all cases. This will, however, create blocked variant labels: e.g. if someone registers “ররর”, the label “ৰৰৰ” will be generated as its variant and will be blocked, and vice versa. The blocking applies only at the label level: if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, that is still possible.
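The label-level blocking described here can be illustrated with a small Python sketch. It assumes a single variant set containing only র (U+09B0) and ৰ (U+09F0), and it enumerates every substitution, including mixed labels; how the actual LGR disposes of each generated label is not modelled. The names `variant_labels`, `index_key` and `conflicts` are hypothetical.

```python
from itertools import product

B_RA = "\u09B0"  # BENGALI LETTER RA
A_RA = "\u09F0"  # BENGALI LETTER RA WITH MIDDLE DIAGONAL

# Assumed variant set: the two ra letters map to each other.
VARIANTS = {B_RA: (B_RA, A_RA), A_RA: (A_RA, B_RA)}


def variant_labels(label):
    """All labels reachable by swapping variant code points, minus the original."""
    choices = [VARIANTS.get(ch, (ch,)) for ch in label]
    return {"".join(combo) for combo in product(*choices)} - {label}


def index_key(label):
    """Collapse every member of the variant set to one representative."""
    return label.replace(A_RA, B_RA)


def conflicts(registered, applied_for):
    """Two labels collide when they are variants of each other."""
    return index_key(registered) == index_key(applied_for)


registered = B_RA * 3
print(A_RA * 3 in variant_labels(registered))  # the all-Assamese form is a variant
print(conflicts(registered, A_RA * 3))         # so it is blocked
print(conflicts(registered, A_RA * 2))         # a label of another length is unaffected
```

Comparing index keys avoids enumerating all variants when only a collision check is needed; the enumeration is kept to show what gets generated.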
6.2 Cross-Script Variants

A crisp cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia¹¹ (formerly Oriya), keeping in mind the visual and technical confusion they may cause as labels in the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Certain other characters are somewhat similar but have not been included here; they are provided in the Appendix (10.3) for reference.

¹¹ Unicode uses Oriya for the script, although Odia is now the official term used.
1. Bangla and Nāgarī (Devanāgarī) script

Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14 - Bangla and Devanāgarī cross-script variant code points

2. Bangla and Gurmukhī script

Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15 - Bangla and Gurmukhī cross-script variant code points
7 Whole Label Evaluation Rules (WLE)

This section provides the WLE rules that are required by all the languages mentioned in Section 3.2 when written in the Bangla¹² script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories, as mentioned in the table provided in the code point repertoire (Section 5.1).
C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khaṇḍa Ta
S → S1, S2 (from Table 9) or (ae) Ya-phalā (V1 H C1 M1), where
    V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E)
    H is 09CD (্ - BENGALI SIGN VIRAMA)
    C1 is 09AF (য - BENGALI LETTER YA)
    M1 is 09BE (া - BENGALI VOWEL SIGN AA)
    S1 and S2 are valid even if they are not allowed by the other context rules
P → Ra-Hasanta (C2 H), where
    C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL)
    H is 09CD (্ - BENGALI SIGN VIRAMA)
Table 16 - Symbols used in WLE rules

¹² As used by Unicode, denoting and including both Assamese and Maṇipuri.
It is also perhaps worth mentioning here that in Bangla the consonant letters (or graphemes) are physically joined to form “clusters” that could theoretically conjoin two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are “nearly 380 unique consonant clusters”, of which bi-consonantal combinations number 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, p. 4]. More details of such combinations can be found in Pabitra Sarkar (1993) [135].

7.1 Final Set of WLE Rules

Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ় and য়
2. H must be preceded by C
   Example: ক্
3. M must be preceded by C
   Example: কা
4. D must be preceded by either of V, C or M
   Example: আঁ, খঁ, খাঁ, হাঁ
5. X must be preceded by either of V, C, M or D
   Example: উঃ, খঃ, বঃ
6. B must be preceded by either of V, C, M or D
   Example: আং, ইং, কং
7. Z must be preceded by V, C, M, D, B, X or P
   Example: ইৎ, কৎ
   (S is not listed because S ends with M; Z may also follow S)
8. V CANNOT be preceded by H (details in 7.1.1, Case of V preceded by H)
9. S CANNOT be preceded by H
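The nine rules above can be read as "what may precede what". The following Python sketch checks them on a sequence of code points. It is a simplified illustration, not the normative LGR: the category sets are assumptions (the authoritative membership is in the repertoire table of Section 5.1), the S1 and S2 sequences of Table 9 are not modelled, and all names are hypothetical.

```python
HASANTA, NUKTA, YA, AA = "\u09CD", "\u09BC", "\u09AF", "\u09BE"
RA = {"\u09B0", "\u09F0"}                      # C2 in the definition of P
NUKTA_BASES = {"\u09A1", "\u09A2", "\u09AF"}   # bases of the normalized forms (CN)


def _chars(*ranges):
    return {chr(cp) for lo, hi in ranges for cp in range(lo, hi + 1)}


# Assumed category membership, for illustration only.
CATEGORY = {}
CATEGORY.update(dict.fromkeys(
    _chars((0x0985, 0x098C), (0x098F, 0x0990), (0x0993, 0x0994)), "V"))
CATEGORY.update(dict.fromkeys(
    _chars((0x0995, 0x09A8), (0x09AA, 0x09B0), (0x09B2, 0x09B2), (0x09B6, 0x09B9),
           (0x09DC, 0x09DD), (0x09DF, 0x09DF), (0x09F0, 0x09F1)), "C"))
CATEGORY.update(dict.fromkeys(
    _chars((0x09BE, 0x09C4), (0x09C7, 0x09C8), (0x09CB, 0x09CC)), "M"))
CATEGORY.update({"\u0982": "B", "\u0981": "D", "\u0983": "X",
                 HASANTA: "H", "\u09CE": "Z"})

# Rules 2-7: categories allowed immediately before each symbol (P is handled below).
ALLOWED_BEFORE = {
    "H": {"C"},
    "M": {"C"},
    "D": {"V", "C", "M"},
    "X": {"V", "C", "M", "D"},
    "B": {"V", "C", "M", "D"},
    "Z": {"V", "C", "M", "D", "B", "X", "S"},
}


def tokenize(label):
    """Split a label into (category, text) tokens, folding S and CN into one token."""
    tokens, i = [], 0
    while i < len(label):
        ch = label[i]
        if ch in "\u0985\u098F" and label[i + 1:i + 4] == HASANTA + YA + AA:
            tokens.append(("S", label[i:i + 4]))   # ae Ya-phala: V1 H C1 M1
            i += 4
        elif ch == NUKTA and tokens and tokens[-1][1] in NUKTA_BASES:
            tokens[-1] = ("C", tokens[-1][1] + ch)  # rule 1: normalized form counts as C
            i += 1
        else:
            tokens.append((CATEGORY[ch], ch))       # KeyError: outside the assumed sets
            i += 1
    return tokens


def violations(label):
    """Return a list of human-readable rule violations (empty if none were found)."""
    tokens, found = tokenize(label), []
    for i, (cat, _text) in enumerate(tokens):
        prev = tokens[i - 1][0] if i else None
        after_p = prev == "H" and i >= 2 and tokens[i - 2][1] in RA
        if cat in ("V", "S") and prev == "H":       # rules 8 and 9
            found.append(f"{cat} at token {i} cannot be preceded by H")
        elif (cat in ALLOWED_BEFORE and prev not in ALLOWED_BEFORE[cat]
              and not (cat == "Z" and after_p)):    # Z may follow P (ra + Hasanta)
            found.append(f"{cat} at token {i} cannot be preceded by {prev}")
    return found


print(violations("\u0985\u09B0\u09CD\u0995"))  # arka: no violations expected
print(violations("\u0985\u09AB\u09CD\u0987"))  # H followed by V: rule 8
```

Folding S and the nukta forms into single tokens before checking keeps the rule table itself identical to the list above; a production implementation would instead be derived from the machine-readable XML ruleset.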
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in permitting such ligatures: any user can easily create them, even though they may not be attested combinations.

7.1.1 Case of V Preceded by H

There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া /bæŋk ʌv ɪndiə/ (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE) (meaning ‘Bank of India’). This is a case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation, necessitated by the lack of a hyphen, a space or the Zero Width Non-Joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to follow an H. Permitting this may create a perceptual similarity between two labels (with and without H) for the majority of the linguistic communities; hence it is explicitly prohibited by the NBGP.
In future, depending on the prevailing requirements of the community, a future NBGP may consider revisiting this rule.
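The ‘Bank of India’ label can be checked mechanically. The sketch below rebuilds the label from the code points listed in the text and reports every position where an independent vowel directly follows Hasanta; the vowel set is an assumption made for this illustration, and the function name is hypothetical.

```python
HASANTA = "\u09CD"  # BENGALI SIGN VIRAMA

# Independent vowels, assumed here to make up category V.
VOWELS = {chr(cp) for cp in (0x0985, 0x0986, 0x0987, 0x0988, 0x0989, 0x098A,
                             0x098B, 0x098F, 0x0990, 0x0993, 0x0994)}

# Code points exactly as listed for the 'Bank of India' example.
BANK_OF_INDIA = "".join(chr(cp) for cp in (
    0x09AC, 0x09CD, 0x09AF, 0x09BE, 0x0999, 0x09CD, 0x0995, 0x0985, 0x09AB,
    0x09CD, 0x0987, 0x09A8, 0x09CD, 0x09A1, 0x09BF, 0x09DF, 0x09BE))


def vowel_after_hasanta(label):
    """Indices of independent vowels that directly follow a Hasanta."""
    return [i for i in range(1, len(label))
            if label[i - 1] == HASANTA and label[i] in VOWELS]


# With the explicit Hasanta after U+09AB, the vowel U+0987 violates rule 8.
print(vowel_after_hasanta(BANK_OF_INDIA))

# Dropping that Hasanta (index 9) gives the form the rules accept.
without_h = BANK_OF_INDIA[:9] + BANK_OF_INDIA[10:]
print(vowel_after_hasanta(without_h))
```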
7.2 Additional Examples from Bangla ABNF

Below are a few examples which help one understand some of the rules that the ABNF puts in place. These are given for reference only and are not meant to be comprehensive.
1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Example:
্ক ्क
াক ाक
ংক ंक
ঁক ँक
ঃক ःक
As can be seen, such a combination automatically results in a “golu”, or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is applied quasi-automatically wherever supported by the OS.
2. H is not permitted after V, B, D, X, M or S.
Example
অ अ
অং क ক क কঃ कः ি )क
3. The number of B, D or X permitted after a consonant, a vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalid:
Example
কংং क
ক क
কঃঃ कःः
কা का
কীঃঃ कःः
অংং अ
অ अ
অঃঃ अःः
4. The number of M permitted after a consonant is restricted to one.
Example:
কীী की
5. M is not permitted after V.
Example:
ইা ঈৗ ईा ईौ
6. The combinations Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Example
কংঃ कः
কঃং कः
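The six illustrative restrictions in this section can be expressed as patterns over category strings. The Python sketch below maps a handful of characters to the symbols of Table 16 and tests each restriction with a regular expression. The character table covers only what the examples need, S is left out, and the numbering simply follows the list above; none of this is taken from the machine-readable ruleset.

```python
import re

# Minimal character-to-category table, sufficient for the examples only.
CATEGORY = {
    "\u0995": "C",                               # ka
    "\u0985": "V", "\u0987": "V", "\u0988": "V",  # a, i, ii
    "\u09BE": "M", "\u09BF": "M", "\u09C0": "M",  # aa-kara, i-kara, ii-kara
    "\u0982": "B",                               # anusvara
    "\u0981": "D",                               # candrabindu
    "\u0983": "X",                               # visarga
    "\u09CD": "H",                               # hasanta
}

# One forbidden pattern per numbered example in this section.
FORBIDDEN = [
    r"^[HMBDX]",   # 1. H, M, B, D or X at the beginning
    r"[VBDXM]H",   # 2. H after V, B, D, X or M
    r"BB|DD|XX",   # 3. more than one B, D or X in a row
    r"MM",         # 4. more than one M after a consonant
    r"VM",         # 5. M after V
    r"BX|XB",      # 6. anusvara + visarga in either order
]


def broken_rules(label):
    """Return the numbers of the examples (1-6) whose pattern the label matches."""
    cats = "".join(CATEGORY[ch] for ch in label)
    return [n for n, pattern in enumerate(FORBIDDEN, start=1)
            if re.search(pattern, cats)]


print(broken_rules("\u09CD\u0995"))        # hasanta before ka
print(broken_rules("\u0995\u0982\u0982"))  # ka + anusvara + anusvara
print(broken_rules("\u0995\u09BE"))        # ka + aa-kara
```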
8 Contributors

8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members

8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals’ Systems, Dhaka
9 References[100] UnicodeConsortium2017UnicodeStandard100MountainViewCA
[101] BandyopadhyayChittaranjan1981DuiShatakerBanglaMudranoPrakashanKolkataAnandaPublishers
[102] BanerjiRD1919TheOriginoftheBengaliScriptKolkataNewDelhiAsianEducationalServices2003reprint
[103] ChatterjiSK1926TheOriginandDevelopmentoftheBengaliLanguageCalcuttaCalcuttaUniversityPress
[104] -----1939Bhasha-prakashBangalaVyakaran(AGrammaroftheBengaliLanguage)CalcuttaUniversityofCalcutta
[105] HaiMuhammadAbdul1964DhvaniVijnanOBanglaDhvani-tattwa(PhoneticsandBengaliPhonology)DhakaBanglaAcademy
[106]JhaSubhadra1958TheFormationofMaithiliLondonLuzacampCo
[107] KosticDjordjeDasRheaS1972AShortOutlineofBengaliPhoneticsCalcuttaStatisticalPublishingCompany
[108] MajumdarRC1971HistoryofAncientBengalCalcuttaGBhardwaj
[109] MazumdarBijaychandra19202000TheHistoryoftheBengaliLanguage(ReprCalcutta1920ed)NewDelhiAsianEducationalServices
[110] PandeyAnshuman2001ProposaltoEncodetheTirhutaScriptinISOIEC10646
[111] PalPalashBaran2001DhwanimalaBarnamalaKolkataPapyrus
[112] -----2007lsquoBanglaHarapherPanchParbarsquoInSwapanChakrabortyedMudranerSanskritiOBanglaBoiKolkataAbabhas
[113] RossFiona1999ThePrintedBengaliCharacteranditsEvolutionLondonCurzon
[114] ShastriMahamahopadhyayHaraPrasad1916HājārBacharērPurāṇaBāṅgālāBhāṣāyBauddhaGānōDōhāCalcuttaBangiyaSahityaParishat
[115] SinghUdayaNarayana(JointlyManiruzzaman)1983DiglossiainBangladeshandlanguageplanningCalcuttaGyanBharati214pp
[116] -----1987ABibliographyofBengaliLinguisticsMysoreCIILxii+316pp
[117] -----2017(withRajibChakrabortyBidishaBhattacharjeeampArimardanKumarTripathy)LanguagesandCulturesontheMarginGuidelinesforFieldworkonEndangeredLanguagesMimeoCentreforEndangeredLanguagesVisva-Bharati
[118] -----1980ScriptalchoiceandspellingreformAnessayinlanguageandplanningJournaloftheMSUniversityofBarodaSocialScienceNumber292173-186AmodifiedversionreprintedEAnnamalaiBjornJernuddandJoanRubinedsLanguagePlanningProceedingsofanInstituteMysoreCIIL405-417
[119] Sripantha1996JakhanChapakhanaEloKolkataPaschim-BangaBanglaAcademy
[120] SurAtul1986BanglaMudranerDushoBacharKolkataJijnasa
[121] ScriptBehaviourforBengaliVersion11TDILandC-DACPune
[122] BoraMahendra1981TheEvolutionofAssameseScriptJorhatAssamSahityaSabha
[123] ProposaltoEncodetheTirhutaScriptinISOIEC10646httpwwwunicodeorgL2L201111175r-tirhutapdfaccessedon25112017
[124] EthnologueAssameseintheLanguageCloudhttpswwwethnologuecomcloudasmaccessedon25112017
[125] BengalialphabetforManipurifoundinEthnologueManipuri(MeeteilonMeithei)httpswwwomniglotcomwritingmanipurihtmaccessedon20102019
[126] WikipediaBengalialphabethttpsenwikipediaorgwikiBengali_alphabetaccessedon25112017
[129]OmniglotSlyhetihttpwwwomniglotcomwritingsylotihtmaccessedon1052018
[130]WikipediaBishnupriyaManipurilanguagehttpsenwikipediaorgwikiBishnupriya_Manipuri_languageaccessedon25112017
[131] TheEMILLECIILCorpushttpmetashareeldaorgrepositorybrowsethe-emilleciil-corpusabdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20accessedon1052018
[132]TheEMILLECIILCorpushttpcatalogelrainfoproduct_infophpproducts_id=696accessedon1052018
[133] BanglaLanguageampScript httpswwwisicalacin~rc_banglabanglahtmlaccessedon1052018
[134] SarkarPabitra1992BanglaBananSanskarSamasyaoSambhabanaKolkataChirayataPrakashan
[135] SarkarPabitra1993BanglaBhasharYuktabyanjanBhasha1123-45
[136] DashNiladriShekharandBBChaudhuri1998BanglaScriptAStructuralStudyLinguisticsToday121-28Alsoavailableathttpswwwacademiaedu9967428Bangla_Script_A_Structural_Study
[137] DaniAhmedHasan(1957)lsquoSrīhatta-NāgarīLipirUtpattioBikāśrsquoBanglaAcademyPatrika(Dhaka)Vol12(Bhadra-Agrahayan1364BangabdaNumber)pg1
[138] WikipediaSylhetiNagari httpsenwikipediaorgwikiSylheti_Nagariaccessedon1952018
[139] FuruiRyosuke(2015)lsquoVariegatedAdaptationsStateFormationinBengalfromtheFifthtoSeventhCenturyrsquoinBhairabiPrasadSahuampHermannKulkeedsInterrogatingPoliticalSystemsIntegrativeProcessesandStatesinPre-ModernIndiaChapter9Pp255-73NewDelhiManohar
[140] FergusonCharesAandMunierChowdhury(1960)lsquoPhonesofBengalirsquoLanguageVol36No1pp22-59
[141] ShahidullahMuhammad(2007)BuddhistMysticSongsDhakaMowlaBrothers
[142] RayPunyaSloka(1966)BengaliLanguageHandbookWashington
[143] HaiMuhammadAbdul(1960)AphoneticandphonologicalstudyofnasalsandnasalizationinBengaliDhakaUniversityofDhaka
[144] UnicodeConsortiumProposalSummaryFormtoAccompanySubmissionsforAdditionstotheRepertoireofISOIEC10646UNICODEhttpswwwunicodeorgL2L200202387r-syloti-formpdfaccessedonMay212018
[145] WikipediaOlChiki(Unicodeblock)httpsenwikipediaorgwikiOl_Chiki_(Unicode_block)accessedonMay212018
[146]BanglaScripthttpwwwbangladesh2000combdbangla_scripthtmlaccessedonMay212018
[147] BhattacharyaAshutoshed(1942)GopichandrerGanCalcuttaCalcuttaUniversity
[149] DasSisirKumar(1975)SahibsandMunshisAnAccountoftheCollegeofFortWilliamCalcutta
[150] IslamRafiqulPabitraSarkarMahbubulHaqampRajibChakraborty(eds)(2014)BanglaAcademyPromitoBanglaByabaharikByakaran(AFunctionalGrammarofStandardBangla)DhakaBanglaAcademy
[151] SarkarPabitra[2013]lsquoBanglaSpellingReformtheLongandShortofItrsquoBanglaJournal19215-232
[152] BanglaAcademy(2012)BanglaAcademyPromitoBanglaBananerNiyam(StandardBanglaSpellingasadoptedbyBanglaAcademy)DhakaBanglaAcademy
[153] SarkarPabitraampRajibChakraborty2018ldquoWhathashappenedSoFarIntermsof Script Reformsrdquo Paper presented at the Face to Face meeting jointly held by theBanglaAcademyDhakaampICANNatBanglaAcademyDhakaon10072018
[154] The Unicode Consortium 2018 The Unicodereg Standard Version 110 ndash CoreSpecificationChapter12P473
10 Appendix-I

10.1 Augmented Backus-Naur Formalism (ABNF)

The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language or script certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are put in place. These restrictions help to fine-tune the ABNF. In the case of Bangla¹³ in particular, the following rules apply:

1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold ‘varga’ (as defined under Table 5). Moreover, Bangla does not normally allow ya (য়) at the beginning of a word either, but we can cite a couple of native examples, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem ‘Lichu Chor’ written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).

2. C H can come with Khaṇḍa Ta only in the case where C is ra (র) (09B0):
   র্ৎ, as in ভর্ৎসনা

3. Only the following combinations with V H C M will be allowed:
   → অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
   → এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
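Restrictions 1 and 2 above lend themselves to a short check. The Python sketch below tests whether a label starts with one of the three disallowed characters and whether every Khaṇḍa Ta that follows a Hasanta has র before that Hasanta. Since this section covers Bangla only, the Assamese ra is deliberately not included; the function names are hypothetical.

```python
HASANTA = "\u09CD"    # BENGALI SIGN VIRAMA
KHANDA_TA = "\u09CE"  # BENGALI LETTER KHANDA TA
RA = "\u09B0"         # BENGALI LETTER RA

# Restriction 1: khanda ta, nya and nga may not begin a label.
FORBIDDEN_INITIAL = {KHANDA_TA, "\u099E", "\u0999"}


def starts_with_forbidden(label):
    """True if the first character is one of the disallowed initials."""
    return label[:1] in FORBIDDEN_INITIAL


def khanda_ta_context_ok(label):
    """Restriction 2: 'C + Hasanta + khanda ta' is accepted only when C is ra."""
    for i, ch in enumerate(label):
        if ch == KHANDA_TA and i >= 1 and label[i - 1] == HASANTA:
            if i < 2 or label[i - 2] != RA:
                return False
    return True


TSERING = "\u09CE\u09B8\u09C7\u09B0\u09BF\u0982"           # begins with khanda ta
BHARTSANA = "\u09AD\u09B0\u09CD\u09CE\u09B8\u09A8\u09BE"   # ra + Hasanta + khanda ta

print(starts_with_forbidden(TSERING))     # rejected by restriction 1
print(starts_with_forbidden(BHARTSANA))
print(khanda_ta_context_ok(BHARTSANA))
```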
¹³ This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.

10.2 ‘Sylheti Nāgarī lipi’ or ‘Siloti’

This version of the Bangla script resembles the ‘Kaithī’ script (ISO 12954) used by accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, and was widely in use during the 1880s. There were several other names for Sylheti Nāgarī or Siloti (129), such as ‘Jalalabada Nāgarī’, ‘Fula (flower) Nāgarī’, ‘Muslim Nāgarī’ or ‘Muhammad Nāgarī’. It is said that Shah Jalal had brought the script with him to Sylhet in the 13th-14th century (138), although some have suggested that it was an invention of the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script, with 32 symbols, are reproduced here (138).

Table 17 – The Script Table of Sylheti Nāgarī or Siloti
10.3 Confusable code points

The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough so to be defined as variant code points.

10.3.1 Bangla and Nāgarī or Devanāgarī

Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable
Table 18: Bangla and Devanāgarī confusable code points

10.3.2 Bangla and Gurmukhi

Bangla | Gurmukhi | NBGP Decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable
Table 19: Bangla and Gurmukhi confusable code points

Bangla | Gurmukhi | NBGP Decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable
Table 20 – Bangla and Gurmukhī distinguishable code points

10.3.3 Bangla and Oriya (Odia)

Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable
Table 21 – Bangla and Oriya confusable code points

Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable
Table 22 – Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs

Consonants | Phonetic Value | Allographs: Clusters | Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value, but it is mostly pronounced as n
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (not privileged enough to have clusters)
য dʒThesecondarysymbol(allograph)jɔ-phalahastwophoneticvaluesWhenaddedtotheinitialconsonantinaworditisavowelaelig(asinশ8ামলর 8াপারetc)Butafteranon-initialconsonantitjustdoublesitinpronunciation(asinকায6ধায6etc)The+যcombinationhastwophysicalmanifestationsmdashর 8andয6
ক8(p+য)স8(+য)র 8(+য)[Justর 8isneverusedinBanglaorthographyর 8াisbutthenitslasttwosymbolsYa-phalaa-karaconstituteavowelsignrepresentingthevowelঅ8া]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃকঃ
অব
As mentioned above, in Bangla several graphemic symbols have secondary shapes, technically called ‘allographs’, with a complementary distribution in each case. These graphs or markings are generally added at the following positions of the primary symbol [113], in the following manner:
1) Below(egক(ku)W(nta)ক(ku)^ (hra)etc)
2) Above(egচ (ca)কH (rka)etc)
3) Right side (e.g. কা (ka), কং (kan), etc.)

4) Left side (e.g. কে (ke))

5) Left side and above simultaneously (e.g. কৈ (kai), কি (ki), etc.)

6) Right side and above simultaneously (e.g. কী (kī))

7) Right side and left side simultaneously (e.g. কো (ko))

8) Right side, left side and above simultaneously (e.g. কৌ (kau))
As for the complementary distribution of vowel letters (word- or syllable-initial) and vowel mātrās, which is relevant for the ABNF, let us consider the following. Besides some simple vowel modifiers, called ‘Kārs’ in Bangla (also referred to as Mātrā in the other LGR documents of Neo-Brāhmī), there are some combinatory modifiers of Bangla vowels with certain consonants. For example:

আ U+0986 BENGALI LETTER AA is substituted by া U+09BE BENGALI VOWEL SIGN AA
ই U+0987 BENGALI LETTER I is substituted by the pre-posed ি U+09BF BENGALI VOWEL SIGN I
ঈ U+0988 BENGALI LETTER II is substituted by ী U+09C0 BENGALI VOWEL SIGN II
উ U+0989 BENGALI LETTER U is substituted by ু U+09C1 BENGALI VOWEL SIGN U, marked below the primary grapheme

In addition, there are some special vowel modifiers of উ, as in the following combined letters:

গু gu, written with a special ligature shape rather than as গ (g) + ু (u)
রু ru, rather than as র (r) + ু (u)
শু śu, rather than as শ (ś) + ু (u)
হু hu, rather than as হ (h) + ু (u)
ন্তু ntu, rather than as ন্ (n) + ত (t) + ু (u)

Similarly, there could be special vowel modifiers of ঊ or ‘(long) ū’ as well, e.g. ভ (bh) + র (r) in ভ্রূ (bhrū, “eyebrow”) and শ (ś) + র (r) in শ্রূ (śrū); likewise ঋ (ṛ) after হ (h), as in হৃ (hṛ), etc.
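The substitution of vowel letters by vowel signs can be written as a simple lookup. The Python sketch below pairs each independent vowel mentioned above with its dependent sign and uses the standard `unicodedata` module to confirm the character names; the table and function names are illustrative. Note that the special ligature shapes discussed above are a matter of font rendering: the stored code points are the same whichever shape is drawn.

```python
import unicodedata

# Independent vowel letter -> dependent vowel sign (kara / matra).
SIGN_FOR_LETTER = {
    "\u0986": "\u09BE",  # AA
    "\u0987": "\u09BF",  # I (pre-posed sign)
    "\u0988": "\u09C0",  # II
    "\u0989": "\u09C1",  # U (marked below)
    "\u098A": "\u09C2",  # UU
    "\u098B": "\u09C3",  # VOCALIC R
}


def attach(consonant, vowel_letter):
    """Return the consonant followed by the sign that replaces the vowel letter."""
    return consonant + SIGN_FOR_LETTER[vowel_letter]


GA = "\u0997"
gu = attach(GA, "\u0989")
print(" ".join(f"U+{ord(ch):04X}" for ch in gu))

for letter, sign in SIGN_FOR_LETTER.items():
    print(unicodedata.name(letter), "->", unicodedata.name(sign))
```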
There have been many notable contributions towards simplifying and modifying Bangla spellings and combinatory techniques, especially by scholars such as Pabitra Sarkar (1992) [134]. This work attempted to reduce the number of allographs of both vowels and consonants in clusters, and it has been widely accepted in the printing of school texts in both Bangladesh and West Bengal [151, 152]. As of now, two systems, the old (traditional) and the new, go on side by side, operative in different domains.

However, in the preparation of this LGR document, the aim has been to consider the widely used and usable sequences and combinations, and their variations across the sister scripts belonging to the basket of Brāhmī writing systems.

Bangla Academy, Dhaka, published Standard Bangla Spelling Rules in 1992, following the recommendations of a committee constituted through a workshop jointly organized by the Jātīya Śikṣākrama and Pāṭhyapustaka Board in 1988. A thoroughly revised edition of the Rules was published in September 2012.⁶ After the establishment of the Bāṅlā Ākādemi of West Bengal in 1986, its first President, Annadasankar Ray (1904-2002), in his inaugural address gave a direction for the standardization of the Bangla alphabet, script and spelling system, and clearly argued that they would not blindly follow the Sanskritic model of conventional grammar. A broad list of proposals was sent to experts on Bangla, and a broad agreement was reached on the ‘homogenization of Bangla spelling’ by 1988. Based on opinions received from different quarters, a unanimous list of ‘rules’ was agreed upon. This was published as a ‘Spelling Dictionary’ titled Ākādemi Bānāna Abhidhāna (1997), which was obviously more comprehensive than ‘The University of Calcutta proposals’ made in 1936. Along with the ‘rationalization’ of spellings, another step was taken to make the writing system easier to read by making the symbols used, both single and combined ones, more ‘transparent’. These reforms were originally suggested by Sarkar (1987, first published in 1978) [134][153], where he used the terms Swaccha (‘Transparent’) and Aswaccha (‘Opaque’ or non-transparent), even adding Ardha Swaccha (‘half transparent’) in between the two. Some sample examples are:

⁶ Bangla Academy. 2012. Bāṅlā Ekaḍemī Pramita Bāṅlā Bānānera Niyama (Bangla Academy Standard Bangla Spelling Rules). Dhaka: Bangla Academy.
Transparent: ন্ন (nn), প্ত (pt), স্ত (st), where both members of the cluster can be recognized.

Opaque: where neither of the two can be (easily) recognized: ক্ষ kṣ (ক্ k + ষ ṣ), জ্ঞ jñ (জ্ j + ঞ ñ), ঙ্গ ṅg (ঙ্ ṅ + গ g), হ্ম hm (হ্ h + ম m).

Semi-transparent: প্র (pr), র্প (rp), where one symbol is recognizable and the other is not. In the case of three-term clusters, at least one symbol will not be transparent, e.g. স্ত্র str (স্ s + ত্ t + র r), ষ্ট্র ṣṭr (ষ্ ṣ + ট্ ṭ + র r), etc.

There were in fact two types of proposals. One concerned the shapes of the letters: those of consonant + vowel (CV) combinations and of conjuncts, i.e. consonant + consonant combinations. There were further complex shapes, i.e. those of consonant + consonant + (consonant +) vowel (CC(C)V) signs, as in প্রু (pru) or স্ক্রু (skru). Some decisions in this area were necessary because a few of the CC(C) symbols represented complexities that made learning them difficult for children. The other dealt with the spellings of words only, without any reference to the shapes of the letters in which they were written. The basic objective here was ‘one word, one spelling’, to the greatest extent possible [151].

Below we place a statement of the most salient changes that affect the consonant + vowel combinations [153]:
a. The variants of the short u vowel sign (হ্রস্ব উ-কার, hrasva u-kāra) have been brought down to one, i.e. ু. So the special ligature shapes of গু (gu), রু (ru), শু (śu) and হু (hu) are replaced by the base letter with the regular ু sign; the same holds for cluster + short u sign, as in ন্তু (ntu) (ন + ্ + ত + উ) and স্তু (stu) (স + ্ + ত + উ).

b. The variants of long u (দীর্ঘ ঊ-কার, dīrgha u-kāra) have also been reduced: রূ (rū), ভ্রূ (bhrū) (ভ bh + ্ + র r + ঊ ū), দ্রূ (drū) (দ d + ্ + র r + ঊ ū) and শ্রূ (śrū) (শ ś + ্ + র r + ঊ ū) are all written with the regular ূ sign.

c. The variants of ঋ-কার (ṛ-kāra, the secondary symbol of ṛ) have been brought down to one: হৃ (hṛ) is written as হ with the regular ৃ sign.
Regarding consonant + consonant + (consonant) … + (vowel) clusters, Paschimbanga Bangla Akademi proposed transparent or semi-transparent shapes for clusters, to the extent admissible in the Bangla writing system. Some examples will clarify the proposal (the traditional cluster shape and the Bangla Akademi innovation are two renderings of the same cluster, which is listed here with its components) [153]:

ব্ধ bdh (ব্ b + ধ dh)
দ্ধ ddh (দ্ d + ধ dh)
ন্থ nth (ন্ n + থ th)
ঞ্চ ñc (ঞ্ ñ + চ c)
ঞ্ছ ñch (ঞ্ ñ + ছ ch)
ঞ্জ ñj (ঞ্ ñ + জ j)
ক্ত kt (ক্ k + ত t)
ক্র kr (ক্ k + র r)
গ্ধ gdh (গ্ g + ধ dh)
ঙ্ক ṅk (ঙ্ ṅ + ক k)
ঙ্গ ṅg (ঙ্ ṅ + গ g)
ষ্ণ ṣṇ (ষ্ ṣ + ণ ṇ)
ন্ধ্র ndhr (ন্ n + ধ্ dh + র r)
ণ্ড্র ṇḍr (ণ্ ṇ + ড্ ḍ + র r)
ক্ত্র ktr (ক্ k + ত্ t + র r)
3.3.1 The Consonants

As per the traditional classification, Bangla consonants are categorized according to their phonetic properties, especially in terms of place and manner of articulation [107]. There are five ‘Varga’ (pronounced as ‘Barga’ in Bangla), or groups (sets or classes), distinguished by place of articulation, and one non-‘Varga’ group [105]. Each Varga, which corresponds to stops at a certain place of articulation, contains a series of five consonants classified as per their phonetic qualities (i.e. manner of articulation), beginning with unvoiced unaspirated, running to voiced aspirated (in the fourth column), and finally ending with a homorganic or corresponding nasal [107]. Consider the following table:
‘Varga’ or Sets | Unvoiced, -Asp | Unvoiced, +Asp | Voiced, -Asp | Voiced, +Asp | Nasal
Velar | ক ‘K’ U+0995 | খ ‘KH’ U+0996 | গ ‘G’ U+0997 | ঘ ‘GH’ U+0998 | ঙ ‘NG’ U+0999
Palatal | চ ‘C’ U+099A | ছ ‘CH’ U+099B | জ ‘J’ U+099C | ঝ ‘JH’ U+099D | ঞ ‘NY’ U+099E
Retroflex | ট ‘TT’ U+099F | ঠ ‘TTH’ U+09A0 | ড ‘DD’ U+09A1 | ঢ ‘DDH’ U+09A2 | ণ ‘NN’ U+09A3
Dental | ত ‘T’ U+09A4 | থ ‘TH’ U+09A5 | দ ‘D’ U+09A6 | ধ ‘DH’ U+09A7 | ন ‘N’ U+09A8
Bilabial | প ‘P’ U+09AA | ফ ‘PH’ U+09AB | ব ‘B’ U+09AC | ভ ‘BH’ U+09AD | ম ‘M’ U+09AE
Table 5: Varga classification of Bangla consonants
(Falling into a pattern of five sets of unvoiced unaspirated, unvoiced aspirated, voiced unaspirated, voiced aspirated and nasals, called the five ‘Varga’)
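Because each Varga occupies five consecutive code points in the table above, the classification can be computed from the code point alone. The following Python sketch does this; the row and column labels come from Table 5, while the function and variable names are illustrative.

```python
# First code point of each Varga row, as listed in Table 5.
VARGA_START = {
    "velar": 0x0995,
    "palatal": 0x099A,
    "retroflex": 0x099F,
    "dental": 0x09A4,
    "bilabial": 0x09AA,
}

# Column order within every row.
COLUMNS = [
    "unvoiced unaspirated",
    "unvoiced aspirated",
    "voiced unaspirated",
    "voiced aspirated",
    "nasal",
]


def classify(ch):
    """Return (place, manner) for a Varga consonant, or None for anything else."""
    cp = ord(ch)
    for place, start in VARGA_START.items():
        if start <= cp < start + 5:
            return place, COLUMNS[cp - start]
    return None


print(classify("\u0995"))  # first cell of the velar row
print(classify("\u09AE"))  # last cell of the bilabial row
print(classify("\u09B0"))  # ra is a non-Varga consonant
```

The dental row ends at U+09A8 and the bilabial row starts at U+09AA, so the gap at U+09A9 falls outside every row and is reported as `None`.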
Non-Varga | য ‘Y’ U+09AF | য় ‘YY’ U+09DF | র ‘R’ U+09B0 | ড় ‘RR’ U+09DC | ঢ় ‘RH’ U+09DD
Non-Varga | ল ‘L’ U+09B2 | শ ‘SH’ U+09B6 | ষ ‘SS’ U+09B7 | স ‘S’ U+09B8 | হ ‘H’ U+09B9
Table 6: Non-Varga consonants (not falling into any of the five categories)
3.3.2 The Implicit Vowel Killer: Hasanta (called ‘Halant’ or ‘Halanta’ in other Brāhmī-based scripts)
As stated earlier, all consonants are pronounced in isolation with an implicit vowel (central/back ɔ in Bangla, as the neutral vowel) assumed to be associated with them [121]. The terms ‘Hasanta’ (= ‘Halant’ or ‘Halanta’ in other Brāhmī-based scripts) and ‘Virāma’7 (= ‘Dari’ in Bangla), the latter being preferred in UNICODE (cf. Unicode 3.0 and above), are used in this report to denote the character that marks the absence of this inherent vowel. It may be noted that the term virama has been adopted in UNICODE in a sense that is different from the traditional definition of grammar, and hence it requires some explanation here. Considering the importance of the document, this note should be a part of this LGR document, so that anybody referring to it is able to know the proper grammatical explanation of the term. Because a special sign is needed whenever this implicit vowel is stripped off, the symbol is known as the Hasanta (= Halant) (U+09CD). By placing the Hasanta under the first consonant of a combination or cluster, one could, in common parlance, “kill” its vowel and create conjuncts. In this manner, conjunct characters can generally be written by joining two to four consonants. In rare cases, this process can join up to five consonants. However, the notion of a maximum number of consonants joining to form one aksara8 is to be bounded empirically. This is an observation based on the CIIL-Emille Corpora of Bangla words [132 & 133], as seen in print these days. Given the mixture of scripts and languages happening on the web, the possibility that one may want a generic Top Level Domain [gTLD] with more consonants than the observed maximum cannot be ruled out. This can be the case when a foreign-language word which admits a large number of consonants is transliterated into Bangla. Hence, in the Bangla LGR work, this limit will not be enforced.
3.3.3 Vowels
Separate symbols exist for all ‘Swara’ or vowels in Bangla which are pronounced independently, either at the beginning of the word or after another vowel or consonant sound. To indicate a vowel sound other than the implicit one, a vowel sign called ‘kār’ in Bangla, or Mātrā in Nagarī9, is attached to the consonant. Since the consonant has this built-in neutral vowel at the end, there are equivalent kāras (Mātrās) for all vowels except অ (pronounced ɔ). The correlation is shown as follows:

Vowel | Corresponding vowel sign (kāra / Mātrā)
অ ‘A’ U+0985 | (none)
আ ‘AA’ U+0986 | া U+09BE
ই ‘I’ U+0987 | ি U+09BF
ঈ ‘II’ U+0988 | ী U+09C0
উ ‘U’ U+0989 | ু U+09C1
ঊ ‘UU’ U+098A | ূ U+09C2
ঋ Vocalic ‘R’ U+098B | ৃ U+09C3
ৠ Vocalic ‘RR’ U+09E0 | ৄ U+09C4
ঌ Vocalic ‘L’ U+098C | ৢ U+09E2
ৡ Vocalic ‘LL’ U+09E1 | ৣ U+09E3
এ ‘E’ U+098F | ে U+09C7
ঐ ‘AI’ U+0990 | ৈ U+09C8
ও ‘O’ U+0993 | ো U+09CB
ঔ ‘AU’ U+0994 | ৌ U+09CC
- | ৗ U+09D7
Could appear on top of অ ‘A’ U+0985 or any other vowel | ঁ U+0981 Candrabindu
Could appear after অ ‘A’ U+0985 or any other vowel | ং U+0982 Anusvara
Could appear after অ ‘A’ U+0985 or any other vowel | ঃ U+0983 Visarga
After any consonant | ্ U+09CD (Hasanta)
- | ঽ U+09BD Avagraha
Table 7: Bangla vowels with corresponding kārs

7 Virāma as used here is also a misnomer according to the Indian grammatical traditions. Nowhere is the mere absence of a vowel marked as virama. Hasanta just marks the absence of a vowel, nothing else (Abhyankar, Kashinath Vasudev & J. M. Shukla, 1961. A Dictionary of Sanskrit Grammar. Baroda: Oriental Institute).
8 This term needs to be disambiguated. Aksara also means ‘syllable’ in Indian grammatical traditions.
9 Although the term ‘Matra’ in Bangla stands for an altogether different concept, viz. the top bar placed over a letter, typically available in Hindi and Bangla but missing in Gujarati.
3.3.4 The Anusvāra / onuʃʃār (ং - U+0982)
The Anusvāra or onuʃʃār in Bangla at times represents a homorganic nasal, but not always. It replaces a conjunct group of a ‘Nasal Consonant + Hasanta + Consonant’ where the second consonant belongs to the Velar varga or set, as in লংকা. But it often appears also for such combinations involving non-velars appearing as the last member of the combination, as in ল্যাংটা “naked” or ল্যাংচা “a kind of sweet; to limp”. Before a non-varga consonant, the Anusvāra represents a nasal sound that may have an alternative conjoined writing symbol representing the corresponding nasal consonant of the particular set. Although Modern Hindi, Marathi and Konkani prefer the anusvāra to the corresponding half-nasal, in Bangla it is clearly demarcated where one must use the Anusvāra and where it has to be a conjunct cluster with a nasal as the first or the second component.
3.3.5 Nasalization: Candrabindu (ঁ - U+0981)
Candrabindu denotes nasalization of the preceding vowel, as in চাঁদ cā̃d ‘moon’ (U+099A U+09BE U+0981 U+09A6). This sign, with a dot inside the half-moon mark, is used as a nasalization marker in many Brāhmī-based scripts [143].
3.3.6 Nukta (় - U+09BC)
The nukta sign does not exist in Bangla orthography. It is predominantly used in many Brāhmī-derived scripts such as Devanāgarī (for Hindi, Bodo, Maithili, Santali, Kashmiri and Sindhi). The term and the concept of nukta are borrowed in Bangla. The IDNA Protocol (RFC 5891) states that IDNs must be in Unicode Normalization Form C (NFC); RFC 7940 applies this requirement to LGRs. The definition of NFC in the Unicode Standard contains a number of composition exclusions. As a result, the Bangla letters য় YYA, ড় RRA and ঢ় RHA have to be represented in this LGR by using the sequences (YA + Nukta: U+09AF + U+09BC), (DDA + Nukta: U+09A1 + U+09BC) and (DDHA + Nukta: U+09A2 + U+09BC) instead of the single code points YYA (U+09DF), RRA (U+09DC) and RHA (U+09DD), although the use of ‘Nukta’ is otherwise completely unnatural in Bangla.
It is noted that in the current Unicode Standard chart these characters are listed as additional consonants. As per the LGR Procedure, however, these decisions depend on the IDNA Protocol, through a set of procedures developed by the IETF. Even though the Unicode Standard also prescribes methods to produce these three characters as atomic characters (for example, 09DC for ড় [r], 09DD for ঢ় [rh] and 09DF for য় [y], as a single keystroke), the IDNA protocol requires that we treat them as conjunct characters and then allocate codes for these in the Unicode Bengali Block. It may be noted that there could be sporadic attempts or cases of writing Muslim names, Urdu poetic words and Perso-Arabic loan words with nukta under ক (k), খ (kh), গ (g), জ (j) and ফ (ph), only for the sake of correct pronunciation and for maintaining the sanctity of the loan word. These were also like using the Bangla writing system to work like the IPA script. It is, however, not in use in Bangla writing in printing.
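The effect of the composition exclusions described above can be checked with a short Python sketch. It is only an illustration, not part of the LGR: it uses the standard `unicodedata` module, and the names `PRECOMPOSED`, `to_nfc` and `code_points` are hypothetical helpers introduced here.

```python
import unicodedata

# The three precomposed letters named in the text, mapped to the
# base + nukta sequences that the LGR uses instead (illustrative table).
PRECOMPOSED = {
    "\u09DC": "\u09A1\u09BC",  # RRA -> DDA + NUKTA
    "\u09DD": "\u09A2\u09BC",  # RHA -> DDHA + NUKTA
    "\u09DF": "\u09AF\u09BC",  # YYA -> YA + NUKTA
}


def to_nfc(text: str) -> str:
    """Return the NFC form of text, as required for IDN labels."""
    return unicodedata.normalize("NFC", text)


def code_points(text: str) -> str:
    """Format a string as a space-separated list of U+XXXX values."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


if __name__ == "__main__":
    for single, sequence in PRECOMPOSED.items():
        # NFC decomposes the single code point and, because of the
        # composition exclusion, does not recompose it.
        print(code_points(single), "->", code_points(to_nfc(single)))
```

A label typed with the single code points therefore ends up, after NFC, as the two-code-point sequences listed in the repertoire.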
3.3.7 Visarga / biʃɔrgo (ঃ - U+0983) and Avagraha (ঽ - U+09BD)
The Visarga / biʃɔrgo (U+0983) is frequently used in Bangla loan words borrowed from Sanskrit and represents a sound very close to h. One could quote as an example দুঃখ duhkho “sorrow”, “unhappiness” (U+09A6 U+09C1 U+0983 U+0996).
The Avagraha ঽ (U+09BD) is mainly used in Sanskrit, Pali, Prakrit or Maithili texts written in Bangla. It is gradually being replaced by an upper comma (e.g. নরোঽপরাণি re-written as নরো’পরাণি). It is rarely used now, even in other languages using Bangla script. In the case of the LGR, the Avagraha is not part of the repertoire. It has been decided, therefore, not to retain Avagraha (ঽ) (U+09BD), because it is blocked in TLDs as per the Maximal Starting Repertoire (MSR). Please see Appendix II in section 11 for a complete list of Bangla consonants and their allographs.
3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)
This note is pertinent to the use of the Zero Width Joiner (ZWJ) and Zero Width Non-Joiner (ZWNJ) as used in Bangla. It needs to be noted that Nepali, Konkani and Hindi use these two signs in a different manner.
ZWJ (U+200D) and ZWNJ (U+200C) are code points that have been provided by the Unicode standard to instruct the rendering of a string where the script has the option between joining and non-joining characters. Without the use of these control codes, the string may be rendered in an alternate form from what is intended.
Use of ZWJ
• Insofar as Bangla is concerned, ZWJ is used for the proper rendering of characters such as khaṇḍa-ta ৎ, as in সত্যজিৎ (satyajit) “Satyajit” and সৎ (sat) “honest”. This is typed as follows: ta + Hasanta + ZWJ (U+200D).
• However, ZWJ is more important where the same combination of consonantal characters is represented differently depending upon the context. E.g. র + ্ + য has two representations in Bangla: as র্য (repha on ya) and as র‍্য (ra with ya-phalā). To get the form র্য, one has to type in the following manner: র + ্ + য; but for র‍্য the sequence would be র + ZWJ + ্ + য [154]. In other words, ZWJ is used in the rendering of words demanding ya-phalā after ra, which is otherwise not possible to type (render), because the same sequence ra + hasanta + antastha ja in medial and/or final position is used to type repha on the consonant antastha ja, as in কার্য (kaarjo). In order to get a ya-phalā after the consonant ra, it is therefore obligatory to use ZWJ after ra, as in র‍্যাপার (wrapper), র‍্যাশ (rash), র‍্যালি (rally) etc. The typing sequence is given below:
ra (র) + ZWJ + hasanta (্) + antastha ja (য) = র‍্য
Use of ZWNJ
• ZWNJ is used in Bangla to represent the explicit Hasanta or Halant. In order to avoid conjunct formation in cases where there is an explicit hasanta before the succeeding consonant, the ZWNJ is used:
Consonant + hasanta + ZWNJ + consonant = explicit hasanta
Example: প্রাক্‌কথন (prakkathana / prakkɔtʰon)
The use of ZWJ/ZWNJ has been ruled out from the root zone by the [Procedure]. Used in Bangla to create alternate renderings, the insertion of these two signs can affect searching as well as NLP.
The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Hasanta) where default conjunct formation is to be explicitly restricted and the Hasanta joining the two consonants participating in the conjunct formation needs to be explicitly shown.
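Because the two join controls are invisible and are ruled out for root-zone labels, a pre-check for them is straightforward. The following Python sketch is a hedged illustration only; `has_join_controls` and `strip_join_controls` are hypothetical helper names, and nothing here is taken from the LGR tooling itself.

```python
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"   # ZERO WIDTH JOINER
JOIN_CONTROLS = {ZWNJ, ZWJ}


def has_join_controls(label: str) -> bool:
    """True if the label contains ZWJ or ZWNJ (not allowed in the root zone)."""
    return any(ch in JOIN_CONTROLS for ch in label)


def strip_join_controls(label: str) -> str:
    """Remove ZWJ/ZWNJ, e.g. to build a search key that ignores rendering hints."""
    return "".join(ch for ch in label if ch not in JOIN_CONTROLS)


if __name__ == "__main__":
    ra_ya_phala = "\u09B0\u200D\u09CD\u09AF"  # ra + ZWJ + hasanta + ya
    repha_on_ya = "\u09B0\u09CD\u09AF"        # ra + hasanta + ya
    print(has_join_controls(ra_ya_phala))                    # True
    print(strip_join_controls(ra_ya_phala) == repha_on_ya)   # True
```

The second print shows why these characters affect searching: two strings that render differently differ only by an invisible code point.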
3.3.9 Use of Ya-phalaa
Ya-phalaa sequences are two instances in Bangla where Hasanta is preceded by a full vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E):
• অ্যা 0985 09CD 09AF 09BE: BENGALI LETTER A + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
• এ্যা 098F 09CD 09AF 09BE: BENGALI LETTER E + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
For rendering Ya-phalā following অ and এ, it is necessary to type U+09CD Hasanta plus U+09AF ya, preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound, as in English ‘acid’ অ্যাসিড, ‘association’ অ্যাসোসিয়েশন, ‘bat’ ব্যাট, ‘fat’ ফ্যাট, ‘mat’ ম্যাট, ‘cap’ ক্যাপ etc.
The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as ‘vowel killer’, although it actually indicates absence of a vowel after the marked consonant. Only the consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in the orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
3.3.10 Formation of Ra-phalaa and Repha Sequences
This case refers to the formation of repha and ra-phalā as follows:
Ra-Hasanta = (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL), and H is 09CD (্ - BENGALI SIGN VIRAMA).
Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka “the sun”); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র chakra “cycle”). The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases. The LGR makes a note of this point of concern with respect to the two RAs in disguise, as it would be completely impossible to distinguish between them with the naked eye in a label so generated, which may consequently lead to concerns related to spoofing and other kinds of cyber irregularities. The motive for classing these two CPs as (blocking) variants is that fully rendered labels may mask the distinction between Bangla ra র (U+09B0) and the Assamese ra ৰ (U+09F0). That provides the justification for Variant Set 4, though only in the context of a following Hasant. The difference between the RAs is only distinguishable if one looks into their Unicode values. Therefore, labels such as অর্ক arka, শীর্ষ śīrṣa ‘top, apex’, অভ্র abhra ‘cloud, the sky’ and শ্রম śrama ‘physical labour’ could be extremely dangerous, as the web user may never verify the digital content (the labels) against its Unicode code points. This point is made explicitly with reference to Table 9 (of sequences) and Table 16 (of WLE Symbols) that are to follow. Moreover, it is noteworthy that the REPHA can also occur with KHANDA TA. The conditions in this context of KHANDA TA are liable to be such that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
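The concern about the two RAs can be made concrete with a small Python sketch. This is an assumption-laden illustration, not the LGR's variant mechanism: it folds Assamese RA to Bangla RA only when the letter is adjacent to a Hasanta (covering both the repha and ra-phalā positions discussed above), and the function names `ra_variant_key` and `are_ra_variants` are hypothetical. The exact context condition in the normative XML may differ.

```python
BANGLA_RA = "\u09B0"
ASSAMESE_RA = "\u09F0"
HASANTA = "\u09CD"


def ra_variant_key(label: str) -> str:
    """Fold Assamese RA to Bangla RA wherever it is adjacent to a Hasanta."""
    chars = list(label)
    for i, ch in enumerate(label):
        if ch != ASSAMESE_RA:
            continue
        preceded = i > 0 and label[i - 1] == HASANTA             # ra-phalā slot
        followed = i + 1 < len(label) and label[i + 1] == HASANTA  # repha slot
        if preceded or followed:
            chars[i] = BANGLA_RA
    return "".join(chars)


def are_ra_variants(a: str, b: str) -> bool:
    """True if two distinct labels differ only by RA choice next to a Hasanta."""
    return a != b and ra_variant_key(a) == ra_variant_key(b)


if __name__ == "__main__":
    arka_bangla = "\u0985\u09B0\u09CD\u0995"    # a + ra + hasanta + ka
    arka_assamese = "\u0985\u09F0\u09CD\u0995"  # same, with Assamese ra
    print(are_ra_variants(arka_bangla, arka_assamese))  # True
```

Outside the Hasanta context the two letters have visibly different shapes, so the sketch leaves them distinct there.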
4 Overall Development Process and Methodology
The Neo-Brāhmī Generation Panel (NBGP) has been formed by members having experience in Linguistics (especially in NLP and Computational Linguistics), Literature, Language History and Epigraphy. Under the Neo-Brāhmī Generation Panel, Bangla and eight other scripts belonging to separate Unicode blocks are being taken up, to assign a separate LGR for each. However, an attempt is made to ensure that the fundamental philosophy behind building those LGRs is consistent with all other Brāhmī-derived scripts. The present LGR will cater to multiple languages belonging to EGIDS scale 1 to 4 (see Table 4) that use Bangla script. The following guiding principles are used in making decisions about Bangla LGR code points.

4.1 Guiding Principles
The NBGP adopts the following broad principles for selection of code-points in the code-point repertoire, across the board, for all the Neo-Brāhmī scripts within its ambit.

4.1.1 Inclusion Principles
4.1.1.1 Modern Usage
Every character proposed should be in the everyday usage of a particular linguistic community. The characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code-point repertoire.
4.1.1.2 Unambiguous Use
Every character proposed should have unambiguous understanding among linguists about its usage in the language.

4.2 Exclusion Principles
The main exclusion principle is that of External Limits on Scope. These consist of protocols or standards which are prerequisites to the Label Generation Rule-sets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.
4.2.1 External Limits of Scope
The code point repertoire for the root zone being a very special case at the top of the protocol hierarchies, the canvas of available characters for selection as a part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:
i. The Unicode Chart
Out of all the characters that are needed by the script in question, if a particular character is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, and especially so in the Bangla-Asamiyā-Manipuri Writing System, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.
ii. IDNA Protocol
Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded, as far as possible, all the possible characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters out of the Unicode repertoire from being part of domain names.
iii. Maximal Starting Repertoire (MSR)
The Root-zone LGR being the repertoire of characters which are going to be used for creation of the Root-zone TLDs, which in turn constitute an even more specialized case of domain names, the ROOT LGR procedure introduces additional exclusions on the IDNA’s allowed set of characters.
Example: Bangla Sign Avagraha ঽ (U+09BD), even if allowed by the IDNA protocol, is not permitted in the Root Zone Repertoire as per the MSR.
To sum up, the restrictions start off with admitting only such characters as are part of the code-block of the given script/language. The IDNA Protocol further narrows this down, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
4.2.1.1 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brāhmī-based scripts will not be included.
4.2.1.2 No Symbols and Abbreviations
Abbreviations, weights and measures and other such iconic characters like BANGLA ISSHAR (U+09FA), BANGLA CURRENCY DENOMINATOR SIXTEEN ৹ (U+09F9) etc. will also not be included.
4.2.1.3 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, such as Sanskritic VOCALIC RR ৠ (U+09E0) and VOCALIC L ঌ (U+098C), as well as VOCALIC LL ৡ (U+09E1), and the allographic kāra forms of the latter two symbols: VOWEL SIGN VOCALIC L ৢ (U+09E2) and VOWEL SIGN VOCALIC LL ৣ (U+09E3). All such characters are excluded, which complies with the Conservatism principle as laid down in the Root Zone LGR procedure. However, in Bangla the kāra corresponding to VOCALIC RR ৠ (U+09E0), which is VOWEL SIGN VOCALIC RR ৄ (U+09C4), is still in active use in certain limited borrowed or Sanskritic words and is therefore retained.
4.2.1.4 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit will not be included. This is also in consonance with the Letter principle as laid down in the Root Zone LGR procedure.
4.2.1.5 ABNF
The Augmented Backus-Naur Formalism (ABNF) is described in Section 5.4.1 and the Appendix (Section 10.1).

5 Repertoire
The Bangla Writing System is represented in UNICODE using the Bengali (Bangla) script name as enumerated in ISO 15924, corresponding to languages such as Asamiyā (Assamese), Bangla (Bengali) and Manipuri. The BENGALI block used for Bangla-Asamiyā-Manipuri in UNICODE has 93 entries. This section details the code-point repertoire that the Neo-Brāhmī Generation Panel [NBGP] proposes to be included in the Bangla LGR.
It may be mentioned here that the Government of Assam submitted a proposal to the Bureau of Indian Standards (BIS) on 26th February 2016 for dis-unification of the Bangla and Asamiyā scripts. The BIS, in its 8th Meeting of the Indian Language Technologies and Products Sectional Committee, LITD 20, held on 23rd Aug 2017, decided to refer the proposal for recognition of the Assamese script in ISO/IEC 10646 to ISO. Until the UNICODE Consortium takes any further action, it will be assumed that the Code Point Repertoire under Table 8 will be valid for all the three languages as above.
For each of the code points, language references have been given in the last column, titled Reference, under Table 8, titled the “Code Point Repertoire”. For entire coverage of Bangla code points, references for Bangla, Asamiyā (Assamese), Manipuri (Meitei) and Bishnupriya are given. Kokborok written in Bangla script is not known to have introduced many new complications, except for one particular character. Though only a few representative languages under EGIDS Scale 1-4 have been chosen for referencing, they together cover all the code-points required for all the languages that the NBGP has considered, as given under Bangla Unicode Points (as given in UNICODE 6.3).
However, before the details are presented, it is ideal to look at the Bangla Code Point Chart from the Maximal Starting Repertoire [MSR] Version 3. It may be noted that the shapes of the reference glyphs given below in the code charts are based on one of the many fonts designed and are not prescriptive, because there could be some variations in actual fonts, both UNICODE-compatible and True-Type ones. Consider the following code point table:
Figure 1: Bangla Code Page from [MSR] for Bangla-Asamiyā-Manipuri

Colour convention:
All characters that are included in the [MSR]: yellow background
PVALID in IDNA2008 but excluded from the [MSR]: pinkish background
Not PVALID in IDNA2008, or ineligible for the root zone (digits, hyphen): white background

Given the Bangla Unicode Block as in Figure 1, among the code points that are included in the MSR, the following symbols will need a separate treatment:
ৎ U+09CE Bangla Letter Khanda-Ta
ৰ U+09F0 Asamiyā-Bangla Letter Ra With Middle Diagonal
ৱ U+09F1 Asamiyā-Bangla Letter Ra With Lower Diagonal
5.1 Code Point Repertoire Inclusion

No | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
1 | U+0981 | ঁ | BENGALI SIGN CANDRABINDU | Candra-bindu | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
2 | U+0982 | ং | BENGALI SIGN ANUSVARA | Onushshar (Anusvara) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
3 | U+0983 | ঃ | BENGALI SIGN VISARGA | Biśarga (Visarga) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
4 | U+0985 | অ | BENGALI LETTER A | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
5 | U+0986 | আ | BENGALI LETTER AA | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
6 | U+0987 | ই | BENGALI LETTER I | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
7 | U+0988 | ঈ | BENGALI LETTER II | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
8 | U+0989 | উ | BENGALI LETTER U | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
9 | U+098A | ঊ | BENGALI LETTER UU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
10 | U+098B | ঋ | BENGALI LETTER VOCALIC R | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
11 | U+098F | এ | BENGALI LETTER E | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
12 | U+0990 | ঐ | BENGALI LETTER AI | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
13 | U+0993 | ও | BENGALI LETTER O | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
14 | U+0994 | ঔ | BENGALI LETTER AU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
15 | U+0995 | ক | BENGALI LETTER KA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
16 | U+0996 | খ | BENGALI LETTER KHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
17 | U+0997 | গ | BENGALI LETTER GA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
18 | U+0998 | ঘ | BENGALI LETTER GHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
19 | U+0999 | ঙ | BENGALI LETTER NGA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
20 | U+099A | চ | BENGALI LETTER CA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
21 | U+099B | ছ | BENGALI LETTER CHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
22 | U+099C | জ | BENGALI LETTER JA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
23 | U+099D | ঝ | BENGALI LETTER JHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
24 | U+099E | ঞ | BENGALI LETTER NYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
25 | U+099F | ট | BENGALI LETTER TTA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
26 | U+09A0 | ঠ | BENGALI LETTER TTHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
27 | U+09A1 | ড | BENGALI LETTER DDA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
28 | 09A1 09BC (U+09DC) | ড় | Normalized form of BENGALI LETTER RRA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125] 09DC is the preferred code point; however, it is not available for LGR as per the standards governing this LGR development
29 | U+09A2 | ঢ | BENGALI LETTER DDHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
30 | 09A2 09BC (U+09DD) | ঢ় | Normalized form of BENGALI LETTER RHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125] 09DD is the preferred code point; however, it is not available for LGR as per the standards governing this LGR development
31 | U+09A3 | ণ | BENGALI LETTER NNA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
32 | U+09A4 | ত | BENGALI LETTER TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
33 | U+09A5 | থ | BENGALI LETTER THA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
34 | U+09A6 | দ | BENGALI LETTER DA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
35 | U+09A7 | ধ | BENGALI LETTER DHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
36 | U+09A8 | ন | BENGALI LETTER NA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
37 | U+09AA | প | BENGALI LETTER PA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
38 | U+09AB | ফ | BENGALI LETTER PHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
39 | U+09AC | ব | BENGALI LETTER BA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
43 | 09AF 09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125] 09DF is the preferred code point; however, it is not available for LGR as per the standards governing this LGR development
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112][125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
50 | U+09BE | া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
51 | U+09BF | ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
52 | U+09C0 | ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
53 | U+09C1 | ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112][122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (= Halant) / Virama (= Dari) | 1 Bangla, 2 Assamese, 2 Manipuri | [112][122][125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122][125]
Table8BanglaCode-PointRepertoire
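The repertoire in Table 8 can be transcribed into a simple membership check. The Python sketch below is an illustration under stated assumptions: the code point set is copied from the table rows, NUKTA (U+09BC) is accepted only directly after the three base letters that form the normalized sequences, and the names `REPERTOIRE`, `NUKTA_BASES` and `outside_repertoire` are hypothetical. It does not implement the whole-label evaluation rules.

```python
def _span(first: int, last: int) -> set:
    """Inclusive range of code points as a set."""
    return set(range(first, last + 1))


# Single code points listed in Table 8 (61 of the 64 rows).
REPERTOIRE = (
    _span(0x0981, 0x0983) | _span(0x0985, 0x098B) | {0x098F, 0x0990}
    | _span(0x0993, 0x09A8) | _span(0x09AA, 0x09B0) | {0x09B2}
    | _span(0x09B6, 0x09B9) | _span(0x09BE, 0x09C4) | {0x09C7, 0x09C8}
    | _span(0x09CB, 0x09CE) | {0x09F0, 0x09F1}
)

NUKTA = 0x09BC
# Rows 28, 30 and 43: DDA, DDHA and YA may be followed by NUKTA.
NUKTA_BASES = {0x09A1, 0x09A2, 0x09AF}


def outside_repertoire(label: str) -> list:
    """Return the characters of label that Table 8 does not admit."""
    rejected = []
    previous = None
    for ch in label:
        cp = ord(ch)
        if cp == NUKTA:
            if previous not in NUKTA_BASES:
                rejected.append(ch)
        elif cp not in REPERTOIRE:
            rejected.append(ch)
        previous = cp
    return rejected


if __name__ == "__main__":
    print(outside_repertoire("\u09AC\u09BE\u0982\u09B2\u09BE"))  # []
    print(outside_repertoire("\u098C"))  # VOCALIC L is excluded
```

An empty result only means every code point is in the repertoire; a label can still violate the sequence rules of Section 5.4.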
Apart from the above individual code-points, the Neo-Brāhmī Generation Panel also proposes some specific sequences which enable conditional inclusion of the Bangla LETTER A and E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, again followed by Bangla VOWEL SIGN AA, in the repertoire, for enabling inclusion of the æ sound as in English ‘bat’, ‘cat’ etc.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code-point (not exhaustive list) | Reference
S1 | 0985 09CD 09AF 09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112][122]
S2 | 098F 09CD 09AF 09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table9Sequences
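Since Hasanta after an independent vowel is admitted only through the two sequences in Table 9, that condition can be sketched as a check in Python. This is an illustrative reading of the table, not the normative rule: the vowel set is taken from the repertoire, and `SEQUENCES` and `vowel_hasanta_ok` are hypothetical names.

```python
HASANTA = "\u09CD"

# Independent vowels of the repertoire (U+0985..U+098B, U+098F, U+0990,
# U+0993, U+0994).
INDEPENDENT_VOWELS = {
    chr(cp)
    for cp in list(range(0x0985, 0x098C)) + [0x098F, 0x0990, 0x0993, 0x0994]
}

# The two sequences of Table 9.
SEQUENCES = {
    "S1": "\u0985\u09CD\u09AF\u09BE",  # A + VIRAMA + YA + VOWEL SIGN AA
    "S2": "\u098F\u09CD\u09AF\u09BE",  # E + VIRAMA + YA + VOWEL SIGN AA
}


def vowel_hasanta_ok(label: str) -> bool:
    """True if every vowel + Hasanta occurrence is part of S1 or S2."""
    for i, ch in enumerate(label):
        if ch == HASANTA and i > 0 and label[i - 1] in INDEPENDENT_VOWELS:
            start = i - 1
            if not any(label.startswith(seq, start) for seq in SEQUENCES.values()):
                return False
    return True


if __name__ == "__main__":
    acid = "\u0985\u09CD\u09AF\u09BE\u09B8\u09BF\u09A1"  # 'acid' in Bangla
    print(vowel_hasanta_ok(acid))  # True
```

Hasanta after a consonant is untouched by this check; it only guards the vowel context.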
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find a place in Unicode but have not been included in the repertoire in the LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.

Sr. No. | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10: Excluded Code Points
5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire since it will never be used alone. It will be used as part of a sequence in three special characters in normalized form, for ড় ঢ় য়.

Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড় ঢ় য় respectively
Table 10b: Excluded Code Points
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20.11.2009 and 18.07.2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) begins with the following variables:

C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta

The Augmented Backus-Naur Formalism (ABNF) will use the following operators:

Sr. Number | Operator | Function
1 | “|” | Alternative
2 | “[ ]” | Optional
3 | “*” | Variable Repetition
4 | “( )” | Sequence Group
Table 11: The ABNF Formalism
5.4.2 The Vowel Sequence
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given. To facilitate understanding by users of other Brāhmī scripts, equivalents in Devanāgarī are provided wherever necessary.
A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra / onuʃʃār (B), a Candrabindu (D) or a Visarga / biʃɔrgo (X). The number of D, B or X which can follow a V in Bangla is not restricted to one; however, going by the rules illustrated in this document, it is clear that formations such as VDD, VBB and VXX are invalid orthographic units. What is valid and possible is a sequence combining a candrabindu with an anusvāra on the one hand, and a candrabindu with a visarga on the other, as in হ্যাঁংচা ‘hæ̃ncā’ and হ্যাঁঃ ‘hæ̃h’ respectively. In other words, the possibility of a Visarga or Anusvāra (onuʃʃār) following a Candrabindu exists in Bangla.
A Vowel can optionally be followed by a combination of Hasanta / Virama [H] and Consonant [C] to form a Ya-phala. Ya-phala is a presentation form of U+09AF, Bangla letter য or ‘ya’, represented by the sequence <U+09CD ্ BENGALI SIGN VIRAMA (Bangla SIGN Hasanta or VIRAMA), U+09AF য BENGALI LETTER YA>. Ya-phala has a special form ্য. Again, when combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. ‘aa’ (ā)), it is used for transcribing [æ], as in the “a” in the English word “bat”, written in Bangla as ব্যাট. A Vowel-sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example:
V: অ (अ)

5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples:
VB: অং (अं)
VD: অঁ (अँ)
VX: অঃ (अः)
VDB: অঁং (अँं)
VDX: অঁঃ (अँः)
VHCM: অ্যা, এ্যা

5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
VHCMB: অ্যাং, এ্যাং
VHCMD: অ্যাঁ, এ্যাঁ
VHCMX: অ্যাঃ, এ্যাঃ
VHCMDB: অ্যাঁং, এ্যাঁং
VHCMDX: অ্যাঁঃ, এ্যাঁঃ
5.4.3 The Consonant Sequence

5.4.3.1 A Single Consonant (C)
Example:
C: ক (क)

5.4.3.2 A Consonant with Conditions
A Consonant may optionally be followed by a dependent vowel sign kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
CM: কি (कि)
CB: কং (कं)
CD: কঁ (कँ)
CX: কঃ (कः)
CH: ক্ (क्) (pure consonant)
CDB: কঁং (कँं)
CDX: কঁঃ (कँः)

5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB or DX.
Examples:
CMB: কীং (कीं)
CMD: কাঁ (काँ)
CMX: বীঃ (वीः)
CMDB: কাঁং (काँं)
CMDX: কাঁঃ (काँः)

5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Examples:
CHC: ন্ত → ন + ্ + ত (न + ् + त)
CHCHC: ন্ত্র → ন + ্ + ত + ্ + র (न + ् + त + ् + र)
CHCHCHC: ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য (न + ् + त + ् + र + ् + य)

5.4.3.5 Subsets
While considering its subsets, as a representative example we will consider the combination CHC only; however, the same is equally applicable to CHCHC and CHCHCHC.
[A] The combination may be followed by M, B, D, X, DB or DX.
Examples:
CHCM: ক্কী → ক ্ ক ী (क्की)
CHCB: ক্কং → ক ্ ক ং (क्कं)
CHCD: ক্কঁ → ক ্ ক ঁ (क्कँ)
CHCX: ক্কঃ → ক ্ ক ঃ (क्कः)
CHCDB: ক্কঁং → ক ্ ক ঁ ং (क्कँं)
CHCDX: ক্কঁঃ → ক ্ ক ঁ ঃ (क्कँः)
[B] *3(CH)CM may further be followed by B, D, X, DB or DX.
Examples:
CHCMB: ক্কীং → ক ্ ক ী ং (क्कीं)
CHCMD: ক্কাঁ → ক ্ ক া ঁ (क्काँ)
CHCMX: ক্কীঃ → ক ্ ক ী ঃ (क्कीः)
CHCMDB: ক্কাঁং → ক ্ ক া ঁ ং (क्काँं)
CHCMDX: ক্কাঁঃ → ক ্ ক া ঁ ঃ (क्काँः)
5.4.4 The Khanda-Ta Sequence

5.4.4.1 A Single ‘Khanda’-Ta (Z)
Example:
Z: ৎ

5.4.4.2 A Khanda Ta Combination10
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) ‘scolding’.
Note: The conditions in this context of KHANDA TA are that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
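The vowel, consonant and Khanda-Ta sequences of Sections 5.4.2 to 5.4.4 can be approximated by mapping each code point to its ABNF variable and matching the result against a regular expression. The Python sketch below is a simplified reading of those sections, not the normative rule set: the class assignments follow Table 8, the HCM tail after a vowel is not restricted to YA + AA, the consonant before Z is not restricted to RA, and because a bare CH is itself a valid unit the "up to 4" bound is not effectively enforced. All names (`CLASS_OF`, `classify`, `follows_sequences`) are hypothetical.

```python
import re

CLASS_OF = {}


def _assign(symbol: str, code_points: list) -> None:
    """Record the ABNF variable (C, V, M, ...) for each code point."""
    for cp in code_points:
        CLASS_OF[chr(cp)] = symbol


_assign("V", list(range(0x0985, 0x098C)) + [0x098F, 0x0990, 0x0993, 0x0994])
_assign("C", list(range(0x0995, 0x09A9)) + list(range(0x09AA, 0x09B1))
        + [0x09B2] + list(range(0x09B6, 0x09BA)) + [0x09F0, 0x09F1])
_assign("M", list(range(0x09BE, 0x09C5)) + [0x09C7, 0x09C8, 0x09CB, 0x09CC])
_assign("B", [0x0982])
_assign("D", [0x0981])
_assign("X", [0x0983])
_assign("H", [0x09CD])
_assign("Z", [0x09CE])

# Base + NUKTA sequences count as a single consonant.
NUKTA_SEQUENCES = ("\u09A1\u09BC", "\u09A2\u09BC", "\u09AF\u09BC")


def classify(label: str):
    """Map a label to a string of ABNF variables, or None if unmappable."""
    out = []
    i = 0
    while i < len(label):
        if label[i:i + 2] in NUKTA_SEQUENCES:
            out.append("C")
            i += 2
        elif label[i] in CLASS_OF:
            out.append(CLASS_OF[label[i]])
            i += 1
        else:
            return None
    return "".join(out)


TAIL = "(?:D?[BX]|D)?"                            # B, D, X, DB or DX
VOWEL_SEQ = "V(?:HCM)?" + TAIL                    # Section 5.4.2
CONS_SEQ = "(?:CH){0,3}C(?:H|M?" + TAIL + ")"     # Section 5.4.3
KHANDA_SEQ = "(?:CH)?Z"                           # Section 5.4.4
LABEL = re.compile("(?:" + "|".join([VOWEL_SEQ, CONS_SEQ, KHANDA_SEQ]) + ")+")


def follows_sequences(label: str) -> bool:
    """True if the label is a concatenation of the sequences sketched above."""
    classes = classify(label)
    return classes is not None and LABEL.fullmatch(classes) is not None


if __name__ == "__main__":
    print(classify("\u09AC\u09BE\u0982\u09B2\u09BE"))  # CMBCM
    print(follows_sequences("\u09BE\u0995"))           # False: starts with a kāra
```

Working on the class string rather than on raw code points keeps the pattern close to the ABNF notation of Table 11, which makes it easier to compare against the rules by eye.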
5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) could be described briefly here. Let us take up S in the first instance. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E). For rendering Ya-phalā following অ and এ, it is necessary to type U+09CD plus U+09AF ya, preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound, as in English ‘bat’, ‘fat’ etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as ‘vowel killer’, although it actually indicates absence of a vowel after the marked consonant. Only the consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in the orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
The other case, referred to as P, concerns the formation of repha and ra-phalā in the said script. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka “the sun”); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra “cycle”). The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.

10 Refer to Rule P in Section 7, Table 16.
6 Variants
This section discusses the variants in the Bangla script. The NBGP categorizes the confusable variants into two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from the character formations normally perceived by the larger linguistic community
For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as another panel (the String Similarity Assessment Panel) is entrusted with such cases. However, cases that belong to Group 2 are proposed as variants. These are not cases of mere visual similarity, as they involve deviations from the widely accepted norms of Bangla Akshar formation. They can confuse even a careful observer and are hence proposed as variants.
Variants arise in a script when two or more forms are produced from different stored sequences of code points. In Bangla, the e-kāra, ā-kāra and o-kāra have different code points. One can type o with a consonant in one go, or type e-kāra and ā-kāra as two separate keys, and get the same result. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. However, this is not considered a variant case, because a kāra followed by another kāra is not allowed.
6.1 In-Script Variants
However, we propose two cases of true in-script variants in the Bangla script.
CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases in which (U+09A5) থ (tha) appears, with Hasanta, as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, can be somewhat confusing, since the থ (tha) in the conjunct looks like a হ (ha). The conjunct can occur in initial, medial or final position (as shown below in example 1). It could also be typed wrongly, in the belief that it was a হ (ha) U+09B9, increasing the risk in label writing and identification.
Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)
Fonts that represent the traditional Bangla writing system tend to create this problem. Therefore, these may be taken as cases of variants in Bangla.
CASE II
Another interesting example of a variant is encountered in the ra + Hasanta and Hasanta + ra combinations when writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing ‘repha’ (involving ra + Hasanta) and ‘ra-phalā’ (involving Hasanta + ra).
‘Repha’ can be formed by two sequences (mainly because both Assamese and Bangla are encoded in the same Unicode block, and ‘B_RA’ as well as ‘A_RA’ refer to the same phonetic element). Here the final ligatures look the same and are as follows:
(1) B_RA + H + C
(2) A_RA + H + C
Where:
B_RA = U+09B0 BENGALI LETTER RA (র)
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)
Example:
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12: Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. The addition of Repha does not make any difference.
‘Ra-phala’ can be formed by two sequences on similar grounds, and the final ligatures would look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
Where C1 = any consonant except Khanda-ta
Example:
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, they could be confusable to end users. Hence the repha and ra-phala cases need to be defined as variants.
The NBGP concluded that র and ৰ should be defined as variant code points, so that a single variant set between র and ৰ covers all cases. This will, however, create blocked variant labels: e.g., if someone registers “ররর”, the variant label “ৰৰৰ” will be generated as a variant and will be blocked, and vice versa. However, the blocking applies only at the label level; if someone else needs to register other labels, e.g., ৰৰ or ৰৰৰৰ, that is still possible.
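The blocking behaviour described above can be illustrated with a short sketch. It assumes, as is usual for code point variants, that every mixed permutation of র and ৰ also counts as a variant label; the proposal itself only names the all-ৰ label as an example. The names `VARIANTS`, `variant_labels` and `is_blocked` are hypothetical.

```python
from itertools import product

B_RA, A_RA = "\u09B0", "\u09F0"  # Bangla RA and Assamese RA

# Single variant set between the two code points (symmetric).
VARIANTS = {B_RA: (B_RA, A_RA), A_RA: (A_RA, B_RA)}


def variant_labels(label: str) -> set:
    """All labels obtained by swapping RA variants, excluding the label itself."""
    choices = [VARIANTS.get(ch, (ch,)) for ch in label]
    return {"".join(combo) for combo in product(*choices)} - {label}


def is_blocked(candidate: str, registered: list) -> bool:
    """True if candidate equals, or is a variant of, an already registered label."""
    return any(candidate == r or candidate in variant_labels(r) for r in registered)
```

With “ররর” registered, `is_blocked` reports the three-letter label “ৰৰৰ” as blocked, while the shorter “ৰৰ” and the longer “ৰৰৰৰ” remain available, since blocking applies only to labels of the same length and shape.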
6.2 Cross-Script Variants
A crisp cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia¹¹ (formerly Oriya), keeping in mind the visual and technical confusion they may cause as labels in web domains. Apart from the cases described in Section 6.1, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are proposed by the NBGP as variants. Certain other characters are somewhat similar but have not been included here; they are provided in the Appendix (10.3) for reference.
¹¹ Unicode uses Oriya for the script, although Odia is now the official term used.
1. Bangla and Nāgarī (Devanāgarī) Script
Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14 - Bangla and Devanāgarī cross-script variant code points
2. Bangla and Gurmukhī Script
Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15 - Bangla and Gurmukhī cross-script variant code points
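Tables 14 and 15 can be held as a small lookup structure. The sketch below is illustrative only: it assumes that a whole-label cross-script variant exists only when every code point of the label has a counterpart in the other script, and the names `CROSS_SCRIPT` and `cross_script_variants` are hypothetical.

```python
# Cross-script variant code points taken from Tables 14 and 15.
CROSS_SCRIPT = {
    "Devanagari": {"\u09AE": "\u092E", "\u09BF": "\u093F"},
    "Gurmukhi": {"\u09AE": "\u0A38", "\u09BF": "\u0A3F"},
}


def cross_script_variants(label: str) -> dict:
    """Map a Bangla label to its look-alike in each script, where one exists."""
    result = {}
    for script, table in CROSS_SCRIPT.items():
        # Assumption: every code point must have a counterpart in the script.
        if label and all(ch in table for ch in label):
            result[script] = "".join(table[ch] for ch in label)
    return result
```

A label made only of ম and ি, such as মি, has a look-alike in both scripts, whereas a label containing any other Bangla letter has none.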
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Bangla¹² script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications.
Below are the symbols used in the WLE rules, one for each Indic Syllabic Category mentioned in the table provided in the code point repertoire (Section 5.1):
C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), or (a/e) Ya-phalā (V1 H C1 M1), where V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E); H is 09CD (্ - BENGALI SIGN VIRAMA); C1 is 09AF (য - BENGALI LETTER YA); M1 is 09BE (া - BENGALI VOWEL SIGN AA). S1 and S2 are valid even if they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL); H is 09CD (্ - BENGALI SIGN VIRAMA)
Table 16 - Symbols used in WLE rules
¹² As used by Unicode, denoting and including both Assamese and Maṇipuri.
It is also worth mentioning here that in Bangla the consonant letters (or graphemes) are physically joined to form “clusters”, which can theoretically conjoin two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are “nearly 380 unique consonant clusters”, of which bi-consonantal combinations number 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, p. 4]. More details of such combinations can be found in Pabitra Sarkar (1993) [135].
7.1 Final Set of WLE Rules
Based on the prevalent patterns in Bangla and the various restrictions, the following are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়
2. H must be preceded by C. Example: ক্
3. M must be preceded by C. Example: কা
4. D must be preceded by either V, C or M. Example: আঁ, খঁ, খাঁ, হাঁ
5. X must be preceded by either V, C, M or D. Example: উঃ, খঃ, বাঃ, দঁঃ
6. B must be preceded by either V, C, M or D. Example: আং, ইং, কং
7. Z must be preceded by V, C, M, D, B, X or P. Example: ইৎ, কৎ, পর্ৎ, প্রৎ (S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H (details in 7.1.1, Case of V Preceded by H)
9. S CANNOT be preceded by H
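Rules 2 to 9 are all statements about which class may precede which, so they can be sketched as a single left-to-right pass. The following is a minimal, non-normative illustration: the code point classes are assumptions (the authoritative repertoire is in Section 5.1), nukta sequences for rule 1 are not handled, only the Ya-phalā form of S is modelled, and all names (`category`, `wle_violations`, `ALLOWED_BEFORE`) are hypothetical.

```python
# Assumed code point classes; the normative repertoire is in Section 5.1.
V = set(range(0x0985, 0x098C)) | {0x098F, 0x0990, 0x0993, 0x0994}
C = (set(range(0x0995, 0x09A9)) | set(range(0x09AA, 0x09B1)) | {0x09B2}
     | set(range(0x09B6, 0x09BA)) | {0x09DC, 0x09DD, 0x09DF, 0x09F0, 0x09F1})
M = set(range(0x09BE, 0x09C5)) | {0x09C7, 0x09C8, 0x09CB, 0x09CC}
SINGLE = {0x0982: "B", 0x0981: "D", 0x0983: "X", 0x09CD: "H", 0x09CE: "Z"}
RA = {"\u09B0", "\u09F0"}        # C2 of sequence P
S_VOWELS = {"\u0985", "\u098F"}  # V1 of sequence S
YA_AA = "\u09AF\u09BE"           # C1 M1 of sequence S

# Rules 2-7: classes that may immediately precede each symbol.
ALLOWED_BEFORE = {"H": "C", "M": "C", "D": "VCM",
                  "X": "VCMD", "B": "VCMD", "Z": "VCMDBX"}


def category(ch):
    """Return the WLE symbol for a code point, or None if it is unknown."""
    cp = ord(ch)
    if cp in V:
        return "V"
    if cp in C:
        return "C"
    if cp in M:
        return "M"
    return SINGLE.get(cp)


def wle_violations(label):
    """Return a list of (index, message) pairs; an empty list means no violation."""
    cats = [category(ch) for ch in label]
    errors = []
    for i, cat in enumerate(cats):
        prev = cats[i - 1] if i > 0 else None
        if cat is None:
            errors.append((i, "code point outside the assumed repertoire"))
        elif cat == "V":
            if prev == "H":  # rules 8 and 9 (S begins with a vowel)
                errors.append((i, "V preceded by H"))
        elif cat in ALLOWED_BEFORE:
            ok = prev is not None and prev in ALLOWED_BEFORE[cat]
            if cat == "H" and not ok:
                # Exception S: V1 H C1 M1 (Ya-phala + a-kara after the vowel).
                ok = (i > 0 and label[i - 1] in S_VOWELS
                      and label[i + 1:i + 3] == YA_AA)
            if cat == "Z" and not ok:
                # Sequence P: Z may follow ra + Hasanta.
                ok = prev == "H" and i >= 2 and label[i - 2] in RA
            if not ok:
                errors.append((i, f"{cat} not allowed after {prev}"))
    return errors
```

Under these assumptions, ভর্ৎসনা and অ্যা produce no violations, while a label that begins with a kāra, or one in which Visarga follows Anusvāra, is reported.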
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in allowing such ligatures, because any user can easily create them even though they may not be attested combinations.
7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g., ব্যাঙ্কঅফ্ইন্ডিয়া /bæŋk ʌv ɪndiə/ (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE), meaning ‘Bank of India’. This is a case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the intended sound. However, by and large, the form of the first word without an H (U+09CD) is considered sufficient for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptual similarity between two labels (with and without H) for the majority of the linguistic communities; hence it is explicitly prohibited by the NBGP.
In future, depending on the prevailing requirements from the community, a future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF
Below are a few examples that help one understand some of the rules the ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.
1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Example:
্ক ्क
াক ाक
ংক ंक
ঁক ँक
ঃক ःक
As can be seen, such a combination automatically results in a “golu”, or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian-language syllable and is applied quasi-automatically wherever supported by the OS.
2. H is not permitted after V, B, D, X, M or S.
Example:
অ্ अ्
অং্ अं्
কঁ্ कँ्
কঃ্ कः्
কি্ कि्
3. The number of B, D or X permitted after a consonant, a vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalid.
Example:
কংং कंं
কঁঁ कँँ
কঃঃ कःः
কাঁঁ काँँ
কীঃঃ कीःः
অংং अंं
অঁঁ अँँ
অঃঃ अःः
4. The number of M permitted after a consonant is restricted to one.
Example: কীী कीी
5. M is not permitted after V.
Example: ইা, ঈৗ; ईा, ईौ
6. The combinations Anusvāra + Visarga and Visarga + Anusvāra are not permissible.
Example:
কংঃ कंः
কঃং कःं
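Rule 1 of the list above concerns label-initial H, M, B, D and X. In Unicode terms these are all combining marks (general category Mn or Mc), which is why a renderer attaches them to a dotted circle when no base letter precedes them. A generic check can therefore be sketched with the standard `unicodedata` module; the function name is hypothetical, and this is an illustration, not part of the ABNF.

```python
import unicodedata


def starts_with_combining_mark(label: str) -> bool:
    """True if the first code point is a nonspacing or spacing combining mark."""
    return bool(label) and unicodedata.category(label[0]) in ("Mn", "Mc")
```

The check is script-independent, so it also flags the Devanāgarī counterparts shown in the examples.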
8 Contributors
8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members
8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals’ Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje; Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. ‘Bangla Harapher Panch Parba’. In Swapan Chakraborty, ed., Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and Language Planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the M.S. University of Baroda, Social Science Number 292173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds., Language Planning: Proceedings of an Institute. Mysore: CIIL, 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25.11.2017.
[124] Ethnologue. Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25.11.2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20.10.2019.
[126] Wikipedia. Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25.11.2017.
[129] Omniglot. Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10.5.2018.
[130] Wikipedia. Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25.11.2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20, accessed on 10.5.2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10.5.2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10.5.2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1123-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 121-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). ‘Srīhatta-Nāgarī Lipir Utpatti o Bikāś’. Bangla Academy Patrika (Dhaka), Vol 12 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia. Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19.5.2018.
[139] Furui, Ryosuke (2015). ‘Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century’, in Bhairabi Prasad Sahu & Hermann Kulke, eds., Interrogating Political Systems: Integrative Processes and States in Pre-Modern India, Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). ‘Phones of Bengali’. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646 (UNICODE). https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia. Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. httpwwwbangladesh2000combdbangla_scripthtml, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. ‘Bangla Spelling Reform: the Long and Short of It’. Bangla Journal 19215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. “What has Happened So Far in terms of Script Reforms”. Paper presented at the face-to-face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka, on 10.07.2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 – Core Specification, Chapter 12, p. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language or script, certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help fine-tune the ABNF. In the case of Bangla¹³ in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold ‘varga’ (as defined under Table 5). Moreover, Bangla does not generally allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem ‘Lichuchor’ written by Kazi Nazrul Islam. There are, however, instances of its use in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning, as in য়াকুব. In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2. CH can come with Khanda Ta only in the case where C is ra (র) (09B0):
র্ৎ, as in ভর্ৎসনা
3. Only the following combinations with VHCM will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
10.2 ‘Sylheti Nagarī lipi’ or ‘Siloti’
This version of the Bangla script resembles the ‘Kaithī’ script (ISO 15924) used by accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, widely in use during the 1880s. There were several other names for Sylheti Nagarī or Siloti [129], such as ‘Jalalabada Nagarī’, ‘Fula (flower) Nagarī’, ‘Muslim Nagarī’ or ‘Muhammad Nagarī’. It is said that Shah Jalala brought the script with him to Sylhet in the 13th-14th century [138], although some have suggested that it was an invention of the Afghan rulers of Sylhet [137]. Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script, with 32 symbols, are reproduced here [138].
Table 17 – The Script Table of Sylheti Nagarī or Siloti
¹³ This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
10.3 Confusable Code Points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable but not enough to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable
Table 18: Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhi
Bangla | Gurmukhi | NBGP Decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable
Table 19: Bangla and Gurmukhi confusable code points
Bangla | Gurmukhi | NBGP Decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable
Table 20 – Bangla and Gurmukhī distinguishable code points
10.3.3 Bangla and Oriya (Odia)
Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable
Table 21 – Bangla and Oriya confusable code points
Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable
Table 22 – Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs
Consonants | Phonetic Value | Allographs: Clusters | Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value, but is mostly pronounced as n
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒThesecondarysymbol(allograph)jɔ-phalahastwophoneticvaluesWhenaddedtotheinitialconsonantinaworditisavowelaelig(asinশ8ামলর 8াপারetc)Butafteranon-initialconsonantitjustdoublesitinpronunciation(asinকায6ধায6etc)The+যcombinationhastwophysicalmanifestationsmdashর 8andয6
ক8(p+য)স8(+য)র 8(+য)[Justর 8isneverusedinBanglaorthographyর 8াisbutthenitslasttwosymbolsYa-phalaa-karaconstituteavowelsignrepresentingthevowelঅ8া]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃকঃ
অব
knturatherthanwritingasL (n)+ত (t)+ (u)
Similarlytherecouldbevowelmodifiersofঊorlsquo(Long)ūrsquoaswelleg
m (bh)+র (r) (n bhru ldquoeyebrowrdquo)o (s)+র (r) (p sru)ঋ (r) afterহ (h) (q hr)etc
There have been many notable contributions to simplifying and modifying Bangla spellings and combinatory techniques, especially by scholars such as Pabitra Sarkar (1992) [134]. In this work there has been an attempt to reduce the number of allographs of both vowels and consonants in clusters, and it has been widely accepted in the printing of school texts in both Bangladesh and West Bengal [151, 152]. As of now, two systems, the old (traditional) and the new, go on side by side, operative in different domains.
However, in the preparation of this LGR document, the aim has been to consider the widely used and usable sequences and combinations and their variations across the sister scripts belonging to the basket of Brāhmī writing systems.
Bangla Academy, Dhaka published Standard Bangla Spelling Rules in 1992, following the recommendations of a committee constituted through a workshop jointly organized by the Jatīya Śiksakrama and Pathyapustaka Board in 1988. A thoroughly revised edition of the Rules was published in September 2012.⁶ After the establishment of the Bangla Akademi of West Bengal in 1986, its first President, Annadasankar Ray (1904-2002), in his inaugural address gave a direction for standardization of the Bangla alphabet/script and the spelling system, and clearly argued that they would not blindly follow the Sanskritic model of conventional grammar. A broad list of proposals was sent to experts on Bangla, and a broad agreement was reached on ‘homogenization of Bangla spelling’ by 1988. Based on opinions received from different quarters, a unanimous list of ‘rules’ was agreed upon. This was published as a ‘Spelling Dictionary’ titled Ākādemi Bānāna Abhidhāna (1997), which was obviously more comprehensive than ‘The University of Calcutta proposals’ made in 1936. Along with the ‘rationalization’ of spellings, another step was taken to make the writing system easier to read by making the symbols used, both single and combined ones, more ‘transparent’. These reforms were originally suggested by Sarkar (1987, first published in 1978) [134] [153], where he used the terms Swaccha (‘Transparent’) and Aswaccha (‘Opaque’ or non-transparent), even adding Ardha Swaccha (‘half transparent’) in between the two. Some sample examples are:
⁶ Bangla Academy 2012. Bāṅlā Ekaḍemī Pramita Bāṅlā Bānānera Niyama (Bangla Academy Standard Bangla Spelling Rules). Dhaka: Bangla Academy.
Transparent: ন্ন (nn), প্ত (pt), স্ত (st), where both members of the cluster can be recognized.
Opaque, where neither of the two can be (easily) recognized: ক্ষ kṣ (ক k + ষ ṣ), জ্ঞ jñ (জ j + ঞ ñ), ঙ্গ ṅg (ঙ ṅ + গ g), হ্ম hm (হ h + ম m).
Semi-transparent: প্র (pr), র্প (rp), where one symbol is recognizable and the other is not. In the case of three-term clusters, at least one symbol will not be transparent, e.g., স্ত্র str (স s + ত t + র r), ষ্ট্র ṣṭr (ষ ṣ + ট ṭ + র r), etc.
There were, in fact, two types of proposals. One concerned the shapes of the letters: those of consonant + vowel (CV) combinations, and of conjuncts, i.e., consonant + consonant combinations. There were further complex shapes, i.e., those of consonant + consonant + (consonant +) vowel (CC(C)V) signs, as in প্রু (pru) or স্ক্রু (skru). Some decisions in this area were necessary because a few of the CC(C) symbols presented complexities that made learning them difficult for children. The other dealt with the spellings of words only, without any reference to the shapes of the letters in which they were written. The basic objective here was ‘one word, one spelling’, to the greatest extent possible [151].

Below we place a statement of the most salient changes that affect the consonant + vowel combinations [153].
a. The variants of the short u (হ্রস্ব উ-কার, hrasva u-kāra) vowel sign have been brought down to one. So the special ligature for gu is now written গু. Similarly, ru > রু, śu > শু, hu > হু; and therefore, for cluster + short u sign, ntu > ন্তু (ন + ্ + ত + উ), stu > স্তু (স + ্ + ত + উ).

b. The variants of the long u (দীর্ঘ ঊ-কার, dīrgha u-kāra) have also been reduced: rū > রূ, bhrū > ভ্রূ (ভ bh + ্ + র r + ঊ ū), drū > দ্রূ (দ d + ্ + র r + ঊ ū), śrū > শ্রূ (শ ś + ্ + র r + ঊ ū).

c. The variants of ঋ-কার (ṛ-kāra, the secondary symbol of ṛ) have been brought down to one: hṛ > হৃ.
Regarding consonant + consonant + (consonant) … + (vowel) clusters, Paschimbanga Bangla Akademi proposed transparent or semi-transparent shapes for clusters, to the extent admissible in the Bangla writing system. Some examples will clarify the proposal. (A slash will mean that the traditional cluster-shape precedes it, while the Bangla Akademi innovation follows.) [153]
Xব ধ bdh ( b+ ধ dh) Mদ ধ ddh (J d+ধ dh) ন থ nth (L n+থ th) Uঞ চ ntildec
(9 ntilde+চ c) ঞ ছ $ ntildech (9+ছ) ঞ জ ntildej (9 ntilde+জ j) Sক ত amp kt (7 k+ত t) T
kr (7 k+র r) Nগ ধ ( gdh (K g+ধ dh) ) ṅk (u ṅ+ক k) t ṅg (u ṅ+গ g) +
ṣṇ (B ṣ+ণ ṇ) ন ndhr (L n+ dh+র r) - ṇḍr ( ṇ+ ḍ+র r) ktr (7 k+x
t+র r)
331TheConsonantsAsper traditional classificationBangla Consonants are categorized according to theirphoneticpropertiesespeciallyintermsofplaceandmannerofarticulation[107]Thereare Five lsquoVargarsquo (pronounced as lsquoBargarsquo in Bangla) or Groups (sets or classes)distinguished by Place of Articulation and one Non-lsquovargarsquo group [105] Each Vargawhich corresponds toStopsat a certainplaceof articulation containsa seriesof fiveconsonants classified as per their phonetic qualities (ie manner of articulation)beginning from Unvoiced and Unaspirated to Voiced and Aspirated (in the fourthcolumn)finallyendingwithaHomorganicorCorrespondingnasal[107]Considerthefollowingtable
'Varga' or Sets | Unvoiced -Asp | Unvoiced +Asp | Voiced -Asp | Voiced +Asp | Nasal
Velar | ক 'K' U+0995 | খ 'KH' U+0996 | গ 'G' U+0997 | ঘ 'GH' U+0998 | ঙ 'NG' U+0999
Palatal | চ 'C' U+099A | ছ 'CH' U+099B | জ 'J' U+099C | ঝ 'JH' U+099D | ঞ 'NY' U+099E
Retroflex | ট 'TT' U+099F | ঠ 'TTH' U+09A0 | ড 'DD' U+09A1 | ঢ 'DDH' U+09A2 | ণ 'NN' U+09A3
Dental | ত 'T' U+09A4 | থ 'TH' U+09A5 | দ 'D' U+09A6 | ধ 'DH' U+09A7 | ন 'N' U+09A8
Bilabial | প 'P' U+09AA | ফ 'PH' U+09AB | ব 'B' U+09AC | ভ 'BH' U+09AD | ম 'M' U+09AE
Table 5: Varga classification of Bangla consonants
(Falling into a pattern of five sets of Unvoiced Unaspirated, Unvoiced Aspirated, Voiced Unaspirated, Voiced Aspirated and Nasals, called the five 'Varga')
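The regular layout of Table 5 can be illustrated with a small Python sketch. The function name `classify_varga` and the constants are hypothetical and are not part of the LGR; the sketch only assumes the code point values listed in the table, each row occupying five consecutive code points.

```python
# Hypothetical helper (not part of the LGR): locate a consonant in Table 5.
VARGAS = ["Velar", "Palatal", "Retroflex", "Dental", "Bilabial"]
COLUMNS = [
    "unvoiced unaspirated",
    "unvoiced aspirated",
    "voiced unaspirated",
    "voiced aspirated",
    "nasal",
]
# First code point of each row; U+09A9 is unassigned in the Bengali block,
# so the bilabial row starts at U+09AA rather than U+09A9.
ROW_STARTS = [0x0995, 0x099A, 0x099F, 0x09A4, 0x09AA]


def classify_varga(ch):
    """Return (varga, column) for a varga consonant, or None otherwise."""
    cp = ord(ch)
    for varga, start in zip(VARGAS, ROW_STARTS):
        if start <= cp < start + 5:
            return varga, COLUMNS[cp - start]
    return None  # non-varga consonant, vowel, sign, or another script
```

For instance, `classify_varga("\u099C")` places জ in the palatal row as a voiced unaspirated stop, while র (U+09B0) yields `None` because it belongs to the non-Varga group.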
Non-Varga | য 'Y' U+09AF | য় 'YY' U+09DF | র 'R' U+09B0 | ড় 'RR' U+09DC | ঢ় 'RH' U+09DD
Non-Varga | ল 'L' U+09B2 | শ 'SH' U+09B6 | ষ 'SS' U+09B7 | স 'S' U+09B8 | হ 'H' U+09B9
Table 6: Non-Varga consonants (not falling into any of the five categories)
3.3.2 The Implicit Vowel Killer: Hasanta (called 'Halant' or 'Halanta' in other Brāhmī-based scripts)
As stated earlier, all consonants are pronounced in isolation with an implicit vowel (central back -ɔ in Bangla, as the neutral vowel) assumed to be associated with them [121]. The terms 'Hasanta' (= 'Halant' or 'Halanta' in other Brāhmī-based scripts) and 'Virāma'⁷ (= 'Dā̃ṛi' in Bangla), the latter being the term preferred in Unicode (cf. Unicode 3.0 and above), are used in this report to denote the character that marks the absence of this inherent vowel. It may be noted that the term virama has been adopted in Unicode in a sense that differs from its traditional grammatical definition, and hence it requires some explanation here. Considering the importance of the document, this note should be part of this LGR document, so that anybody referring to it is able to know the proper grammatical explanation of the term. Because a special sign is needed whenever this implicit vowel is stripped off, the symbol is known as the Hasanta (= Halant) (U+09CD). By placing the Hasanta under the first consonant of a combination or cluster one can, in common parlance, "kill" its vowel and create conjuncts. In this manner conjunct characters can generally be written by joining two to four consonants. In rare cases this process can join up to five consonants. However, the notion of a maximum number of consonants joining to form one aksara⁸ is to be bounded empirically. This is an observation based on the CIIL-Emille corpora of Bangla words [132 & 133], as seen in print these days. Given the mixture of scripts and languages happening on the web, the possibility that one may want a generic Top Level Domain [gTLD] with more consonants than the observed maximum cannot be ruled out. This can be the case when a foreign-language word which admits a large number of consonants is transliterated into Bangla. Hence, in the Bangla LGR work, this limit will not be enforced.
3.3.3 Vowels
Separate symbols exist for all 'Swara', or vowels, in Bangla, which are pronounced independently either at the beginning of the word or after another vowel or consonant sound. To indicate a vowel sound other than the implicit one, a vowel sign, called 'kār' in Bangla or Mātrā in Nāgarī⁹, is attached to the consonant. Since the consonant has this built-in neutral vowel at the end, there are equivalent kāras (Mātrās) for all vowels except অ (pronounced -ɔ). The correlation is shown as follows:

7 Virāma as used here is also a misnomer according to the Indian grammatical traditions. Nowhere is the mere absence of a vowel marked as virama. Hasanta just marks the absence of a vowel, nothing else. (Abhyankar, Kashinath Vasudev & J. M. Shukla. 1961. A Dictionary of Sanskrit Grammar. Baroda: Oriental Institute.)
8 This term needs to be disambiguated. Aksara also means 'syllable' in Indian grammatical traditions.
Vowel | Corresponding vowel sign (kāra / Mātrā)
অ 'A' U+0985 | -
আ 'AA' U+0986 | া U+09BE
ই 'I' U+0987 | ি U+09BF
ঈ 'II' U+0988 | ী U+09C0
উ 'U' U+0989 | ু U+09C1
ঊ 'UU' U+098A | ূ U+09C2
ঋ Vocalic 'R' U+098B | ৃ U+09C3
ৠ Vocalic 'RR' U+09E0 | ৄ U+09C4
ঌ Vocalic 'L' U+098C | ৢ U+09E2
ৡ Vocalic 'LL' U+09E1 | ৣ U+09E3
এ 'E' U+098F | ে U+09C7
ঐ 'AI' U+0990 | ৈ U+09C8
ও 'O' U+0993 | ো U+09CB
ঔ 'AU' U+0994 | ৌ U+09CC
- | ৗ U+09D7
Could appear on top of অ 'A' U+0985 or any other vowel | ঁ U+0981 Candrabindu
Could appear after অ 'A' U+0985 or any other vowel | ং U+0982 Anusvara
Could appear after অ 'A' U+0985 or any other vowel | ঃ U+0983 Visarga
After any consonant | ্ U+09CD (Hasanta)
- | ঽ U+09BD Avagraha
Table 7: Bangla vowels with corresponding kārs

9 The term 'Matra' in Bangla, however, stands for an altogether different concept, viz. the top bar placed over a letter, typically available in Hindi and Bangla but missing in Gujarati.
3.3.4 The Anusvāra / onuʃʃār (ং - U+0982)
The Anusvāra, or onuʃʃār in Bangla, at times represents a homorganic nasal, but not always. It replaces a conjunct group of 'Nasal Consonant + Hasanta + Consonant' where the second consonant belongs to the velar varga or set, as in লংকা. But it often also appears for such combinations with a non-velar as the last member of the combination, as in ল্যাংটা "naked" or ল্যাংচা "a kind of sweet; to limp". Before a non-varga consonant, the Anusvāra represents a nasal sound that may have an alternative conjoined writing symbol representing the corresponding nasal consonant of the particular set. Although Modern Hindi, Marathi and Konkani prefer the anusvāra to the corresponding half-nasal, in Bangla it is clearly demarcated where one must use the Anusvāra and where a conjunct cluster with a nasal as the first or the second component is required.
3.3.5 Nasalization: Candrabindu (ঁ - U+0981)
Candrabindu denotes nasalization of the preceding vowel, as in চাঁদ cā̃d 'moon' (U+099A U+09BE U+0981 U+09A6). This sign, with a dot inside the half-moon mark, is used as a nasalization marker in many Brāhmī-based scripts [143].
3.3.6 Nukta (় - U+09BC)
The nukta sign does not exist in Bangla orthography. It is predominantly used in many Brāhmī-derived scripts such as Devanāgarī (for Hindi, Bodo, Maithili, Santali, Kashmiri and Sindhi). The term and the concept of nukta are borrowed in Bangla. The IDNA Protocol (RFC 5891) states that IDNs must be in Unicode Normalization Form C (NFC). RFC 7940 applies this requirement to LGRs. The definition of NFC in the Unicode Standard contains a number of composition exclusions. As a result, the Bangla letters য় YYA, ড় RRA and ঢ় RHA have to be represented in this LGR by the sequences (YA + Nukta: U+09AF + U+09BC), (DDA + Nukta: U+09A1 + U+09BC) and (DDHA + Nukta: U+09A2 + U+09BC) instead of the single code points YYA (U+09DF), RRA (U+09DC) and RHA (U+09DD), although the use of 'Nukta' is otherwise completely unnatural in Bangla.
It is noted that in the current Unicode Standard chart these characters are listed as additional consonants. As per the LGR Procedure, however, these decisions depend on the IDNA Protocol, through a set of procedures developed by the IETF. Even though the Unicode Standard also provides these three characters as atomic characters (for example 09DC for ড় [r], 09DD for ঢ় [rh] and 09DF for য় [y], each entered with a single keystroke), the IDNA protocol requires that we treat them as composite characters, represented by the code points of their components in the Unicode Bengali block.
It may be noted that there could be sporadic attempts at writing Muslim names, Urdu poetic words and Perso-Arabic loanwords with a nukta under ক (k), খ (kh), গ (g), জ (j) and ফ (ph), only for the sake of correct pronunciation and for maintaining the sanctity of the loanword. This amounts to making the Bangla writing system work like the IPA script. Such use is, however, not found in printed Bangla writing.
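The effect of the composition exclusions can be checked with Python's standard `unicodedata` module. The following is only an illustrative sketch; the dictionary `PRECOMPOSED` and the helper `nfc_code_points` are names introduced here and do not come from the LGR.

```python
import unicodedata

# The three precomposed letters discussed above (illustrative mapping).
PRECOMPOSED = {"YYA": "\u09DF", "RRA": "\u09DC", "RHA": "\u09DD"}


def nfc_code_points(text):
    """Return the code points of the NFC form of text as U+XXXX strings."""
    normalized = unicodedata.normalize("NFC", text)
    return [f"U+{ord(c):04X}" for c in normalized]


for name, ch in PRECOMPOSED.items():
    # Each single code point normalizes to base letter + U+09BC (Nukta),
    # because these characters are excluded from composition under NFC.
    print(name, nfc_code_points(ch))
```

A label typed with U+09DC therefore reaches the LGR as U+09A1 U+09BC, which is why the repertoire lists the sequences rather than the single code points.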
3.3.7 Visarga / biʃɔrgo (ঃ - U+0983) and Avagraha (ঽ - U+09BD)
The Visarga / biʃɔrgo (U+0983) is frequently used in Bangla loanwords borrowed from Sanskrit and represents a sound very close to h. One could quote as an example দুঃখ duhkho "sorrow", "unhappiness" (U+09A6 U+09C1 U+0983 U+0996).
The Avagraha ঽ (U+09BD) is mainly used in Sanskrit, Pali, Prakrit or Maithili texts written in Bangla. It is gradually being replaced by an upper comma (e.g. নরোঽপরাণি re-written as নরো'পরাণি). It is rarely used now, even in other languages using Bangla script. In the case of the LGR, the Avagraha is not part of the repertoire: it has been decided not to retain Avagraha (ঽ) (U+09BD) because it is blocked in TLDs as per the Maximal Starting Repertoire (MSR).
Please see Appendix II in section 11 for a complete list of Bangla consonants and their allographs.
3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)
This note is pertinent to the use of Zero Width Joiner (ZWJ) and Zero Width Non-Joiner (ZWNJ) in Bangla. It should be noted that Nepali, Konkani and Hindi use these two signs in a different manner.
ZWJ (U+200D) and ZWNJ (U+200C) are code points provided by the Unicode Standard to instruct the rendering of a string where the script has the option between joining and non-joining characters. Without these control codes, the string may be rendered in a form other than the one intended.
Use of ZWJ
• In so far as Bangla is concerned, ZWJ is used for the proper rendering of characters such as khaṇḍa-ta ৎ, as in সত্যজিৎ (satyajit) "Satyajit" and সৎ (sat) "honest". This is typed as follows: ta + Hasanta + ZWJ (U+200D).
• However, ZWJ is more important where the same combination of consonantal characters is represented differently depending upon the context. For example, র + ্ + য has two representations in Bangla: one with repha on ya (র্য) and one with ya-phalā on ra (র‍্য). To get the repha form one has to type র + ্ + য (U+09B0 U+09CD U+09AF), but for the ya-phalā form the sequence is র + ZWJ + ্ + য (U+09B0 U+200D U+09CD U+09AF) [154]. In other words, ZWJ is used in the rendering of words demanding ya-phalā after ra, which is otherwise not possible to type (render) because the same order ra + hasanta + antastha ja is used in medial and/or final position. Note that ra + hasanta + antastha ja is used to type repha on the consonant antastha ja, as in কার্য (kaarjo). In order to get a ya-phalā after the consonant ra it is therefore obligatory to use ZWJ after ra, as in র‍্যাপার (wrapper), র‍্যাশ (rash), র‍্যালি (rally) etc. The typing sequence is given below:
ra (র) + ZWJ + hasanta (্) + antastha ja (য) = র‍্য
Use of ZWNJ
• ZWNJ is used in Bangla to represent the explicit Hasanta or Halant. In order to avoid conjunct formation in cases where there is an explicit hasanta before the succeeding consonant, the ZWNJ is used:
Consonant + hasanta + ZWNJ + consonant = explicit hasanta
Example: প্রাক্‌কথন (prakkathana / prakkɔtʰon), i.e. U+09AA U+09CD U+09B0 U+09BE U+0995 U+09CD U+200C U+0995 U+09A5 U+09A8
The use of ZWJ and ZWNJ has been ruled out from the root zone by the [Procedure]. Since they are used in Bangla to create alternate renderings, the insertion of these two signs can affect searching as well as NLP.
The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Hasanta) where default conjunct formation is to be explicitly restricted and the Hasanta joining the two consonants participating in the conjunct formation needs to be explicitly shown.
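Because both joiners are invisible, their presence in a candidate label is easiest to see programmatically. The sketch below is purely illustrative: `find_joiners` and `is_joiner_free` are hypothetical helper names, and the check simply reflects the statement above that ZWJ and ZWNJ are ruled out of the root zone.

```python
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"   # ZERO WIDTH JOINER


def find_joiners(label):
    """Return (index, name) for every ZWJ or ZWNJ found in label."""
    names = {ZWJ: "ZWJ", ZWNJ: "ZWNJ"}
    return [(i, names[c]) for i, c in enumerate(label) if c in names]


def is_joiner_free(label):
    """True if the label contains neither ZWJ nor ZWNJ."""
    return not find_joiners(label)


# The two typing sequences for ra + hasanta + ya described above.
repha_form = "\u09B0\u09CD\u09AF"           # repha on ya, no joiner
ya_phala_form = "\u09B0\u200D\u09CD\u09AF"  # ya-phala on ra, needs ZWJ
```

Here `is_joiner_free(repha_form)` holds, whereas `find_joiners(ya_phala_form)` reports a ZWJ at index 1, so the ya-phalā form after ra cannot appear in a root zone label.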
3.3.9 Use of Ya-phalaa
There are two Ya-phalā sequences in Bangla in which Hasanta is preceded by a full vowel (U+0985 অ, BENGALI LETTER A, and U+098F এ, BENGALI LETTER E):
• অ্যা 0985 09CD 09AF 09BE: BENGALI LETTER A + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
• এ্যা 098F 09CD 09AF 09BE: BENGALI LETTER E + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
To render Ya-phalā after অ and এ, it is necessary to type the vowel followed by U+09CD Hasanta and U+09AF ya. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound, as in English 'acid' অ্যাসিড, 'association' অ্যাসোসিয়েশন, 'bat' ব্যাট, 'fat' ফ্যাট, 'mat' ম্যাট, 'cap' ক্যাপ etc.
The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But, as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
3.3.10 Formation of Ra-phalaa and Reph Sequences
This case refers to the formation of repha and ra-phalā as follows:
Ra-Hasanta = (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name BENGALI LETTER RA WITH MIDDLE DIAGONAL), and H is 09CD (্ - BENGALI SIGN VIRAMA).
Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র chakra "cycle"). The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or by Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases. The LGR makes a note of this point of concern with respect to the two RAs in disguise, as it would be completely impossible to distinguish between them with the naked eye in a label so generated, which may consequently lead to concerns related to spoofing and other kinds of cyber irregularities. The motive for classing these two code points as (blocking) variants is that fully rendered labels may mask the distinction between Bangla ra র (U+09B0) and Assamese ra ৰ (U+09F0). That provides the justification for Variant Set 4, though only in the context of a following Hasant. The difference between the RAs is distinguishable only if one looks at their Unicode values. Therefore labels such as অর্ক arka, শীর্ষ śīrṣa 'top, apex', অভ্র abhra 'cloud, the sky' and শ্রম śrama 'physical labour' could be extremely dangerous, as the web user may never verify the digital content (the labels) against its Unicode code point values. This point is made explicitly with reference to Table 9 (of sequences, p. 36) and Table 16 (of WLE symbols, p. 47) that are to follow. Moreover, it is noteworthy that the REPHA can also occur with KHANDA TA. The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
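The confusability described above can be made concrete with a short Python sketch. It is an assumption-laden illustration, not the variant mechanism of the LGR itself (which is defined in the machine-readable XML): the helpers `ra_comparison_key` and `may_be_confused` are hypothetical, and they treat the two RAs as equivalent whenever they stand next to a Hasanta, on either side, which may be broader than the context rule of Variant Set 4.

```python
HASANTA = "\u09CD"
BANGLA_RA = "\u09B0"
ASSAMESE_RA = "\u09F0"


def ra_comparison_key(label):
    """Fold Assamese RA into Bangla RA wherever it is adjacent to Hasanta.

    Assumption of this sketch: adjacency on either side counts, which
    covers both the repha and the ra-phala contexts.
    """
    chars = list(label)
    for i, c in enumerate(chars):
        if c != ASSAMESE_RA:
            continue
        before = i > 0 and chars[i - 1] == HASANTA
        after = i + 1 < len(chars) and chars[i + 1] == HASANTA
        if before or after:
            chars[i] = BANGLA_RA
    return "".join(chars)


def may_be_confused(a, b):
    """True if two distinct labels differ only by RA next to Hasanta."""
    return a != b and ra_comparison_key(a) == ra_comparison_key(b)
```

With this key, অর্ক spelled with U+09B0 and the visually identical spelling with U+09F0 compare as confusable, while two labels using the RAs in isolation (where their shapes differ) do not.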
4 Overall Development Process and Methodology
The Neo-Brāhmī Generation Panel (NBGP) has been formed by members having experience in Linguistics (especially in NLP and Computational Linguistics), Literature, Language History and Epigraphy. Under the Neo-Brāhmī Generation Panel, Bangla and eight other scripts belonging to separate Unicode blocks are being taken up, with a separate LGR to be assigned to each. However, an attempt is made to ensure that the fundamental philosophy behind building those LGRs is consistent across all Brāhmī-derived scripts. The present LGR will cater to multiple languages belonging to EGIDS scale 1 to 4 (see Table 4) that use the Bangla script. The following guiding principles are used in making decisions about Bangla LGR code points.
4.1 Guiding Principles
The NBGP adopts the following broad principles for the selection of code points in the code-point repertoire, across the board, for all the Neo-Brāhmī scripts within its ambit.
4.1.1 Inclusion Principles
4.1.1.1 Modern Usage
Every character proposed should be in the everyday usage of a particular linguistic community. Characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code-point repertoire.
4.1.1.2 Unambiguous Use
Every character proposed should have an unambiguous understanding among linguists about its usage in the language.
4.2 Exclusion Principles
The main exclusion principle is that of External Limits on Scope. These consist of protocols or standards which are prerequisites to the Label Generation Rulesets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.
4.2.1 External Limits of Scope
The code point repertoire for the root zone being a very special case at the top of the protocol hierarchies, the canvas of available characters for selection as part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:
i. The Unicode Chart
Out of all the characters that are needed by the script in question, if a particular character is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, and especially so in the Bangla-Asamiyā-Manipuri writing system, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.
ii. IDNA Protocol
Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded, as far as possible, all the characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters of the Unicode repertoire from being part of domain names.
iii. Maximal Starting Repertoire (MSR)
The Root Zone LGR being the repertoire of characters which are going to be used for the creation of Root Zone TLDs, which in turn constitute an even more specialized case of domain names, the Root LGR procedure introduces additional exclusions on IDNA's allowed set of characters.
Example: Bangla Sign Avagraha ঽ (U+09BD), even if allowed by the IDNA protocol, is not permitted in the Root Zone Repertoire as per the MSR.
To sum up, the restrictions start off by admitting only such characters as are part of the code block of the given script/language. The IDNA Protocol further narrows this down, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
4.2.1.1 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brāhmī-based scripts will not be included.
4.2.1.2 No Symbols and Abbreviations
Abbreviations, weights and measures and other such iconic characters like BANGLA ISSHAR (U+09FA), BANGLA CURRENCY DENOMINATOR SIXTEEN ৹ (U+09F9) etc. will also not be included.
4.2.1.3 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, such as Sanskritic VOCALIC RR ৠ (U+09E0) and VOCALIC L ঌ (U+098C), as well as VOCALIC LL ৡ (U+09E1) and the allographic -kāra forms of the latter two symbols, VOWEL SIGN VOCALIC L ৢ (U+09E2) and VOWEL SIGN VOCALIC LL ৣ (U+09E3). All such characters are excluded, which complies with the Conservatism principle as laid down in the Root Zone LGR procedure. However, in Bangla the -kāra corresponding to VOCALIC RR ৠ (U+09E0), which is VOWEL SIGN VOCALIC RR ৄ (U+09C4), is still in active use in certain limited borrowed or Sanskritic words and is therefore retained.
4.2.1.4 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit will not be included. This is also in consonance with the Letter principle as laid down in the Root Zone LGR procedure.
4.2.1.5 ABNF
The Augmented Backus-Naur Formalism (ABNF) is described in Section 5.4.1 and the Appendix (Section 10.1).
5 Repertoire
The Bangla writing system is represented in Unicode using the Bengali (Bangla) script name as enumerated in ISO 15924, corresponding to languages such as Asamiyā (Assamese), Bangla (Bengali) and Manipuri. The BENGALI block used for Bangla-Asamiyā-Manipuri in Unicode has 93 entries. This section details the code-point repertoire that the Neo-Brāhmī Generation Panel [NBGP] proposes to be included in the Bangla LGR.
It may be mentioned here that the Government of Assam submitted a proposal to the Bureau of Indian Standards (BIS) on 26th February 2016 for dis-unification of the Bangla and Asamiyā scripts. The BIS, in the 8th Meeting of its Indian Language Technologies and Products Sectional Committee, LITD 20, held on 23rd Aug 2017, decided to refer the proposal for recognition of the Assamese script in ISO/IEC 10646 to ISO. Until the Unicode Consortium takes any further action, it will be assumed that the Code Point Repertoire under Table 8 is valid for all three languages mentioned above.
For each of the code points, language references have been given in the last column, titled Reference, of Table 8, the "Code Point Repertoire". For entire coverage of Bangla code points, references for Bangla, Asamiyā (Assamese), Manipuri (Meitei) and Bishnupriya are given. Kokborok written in Bangla script is not known to have introduced many new complications, except for one particular character. Though only a few representative languages under EGIDS Scale 1-4 have been chosen for referencing, they together cover all the code points required for all the languages that the NBGP has considered, as given under the Bangla Unicode points (as given in Unicode 6.3).
However, before the details are presented, it is useful to look at the Bangla code point chart from the Maximal Starting Repertoire [MSR] Version 3. It may be noted that the shapes of the reference glyphs given below in the code charts are based on one of the many fonts designed and are not prescriptive, because there could be some variations in actual fonts, both Unicode-compatible and TrueType ones. Consider the following code point table:
Figure 1: Bangla Code Page from [MSR] for Bangla-Asamiyā-Manipuri

Colour convention:
All characters that are included in the [MSR]: yellow background
PVALID in IDNA2008 but excluded from the [MSR]: pinkish background
Not PVALID in IDNA2008, or ineligible for the root zone (digits, hyphen): white background
Given the Bangla Unicode block as in Figure 1, among the code points that are included in the MSR the following symbols will need separate treatment:
ৎ U+09CE Bangla Letter Khanda-Ta
ৰ U+09F0 Asamiyā-Bangla Letter Ra With Middle Diagonal
ৱ U+09F1 Asamiyā-Bangla Letter Ra With Lower Diagonal
5.1 Code Point Repertoire Inclusion

No | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
1 | U+0981 | ঁ | BENGALI SIGN CANDRABINDU | Candra-bindu | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
2 | U+0982 | ং | BENGALI SIGN ANUSVARA | Onushshar (Anusvara) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
3 | U+0983 | ঃ | BENGALI SIGN VISARGA | Biśarga (Visarga) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
4 | U+0985 | অ | BENGALI LETTER A | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
5 | U+0986 | আ | BENGALI LETTER AA | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
6 | U+0987 | ই | BENGALI LETTER I | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
7 | U+0988 | ঈ | BENGALI LETTER II | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
8 | U+0989 | উ | BENGALI LETTER U | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
9 | U+098A | ঊ | BENGALI LETTER UU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
10 | U+098B | ঋ | BENGALI LETTER VOCALIC R | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
11 | U+098F | এ | BENGALI LETTER E | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
12 | U+0990 | ঐ | BENGALI LETTER AI | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
13 | U+0993 | ও | BENGALI LETTER O | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
14 | U+0994 | ঔ | BENGALI LETTER AU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
15 | U+0995 | ক | BENGALI LETTER KA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
16 | U+0996 | খ | BENGALI LETTER KHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
17 | U+0997 | গ | BENGALI LETTER GA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
18 | U+0998 | ঘ | BENGALI LETTER GHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
19 | U+0999 | ঙ | BENGALI LETTER NGA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
20 | U+099A | চ | BENGALI LETTER CA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
21 | U+099B | ছ | BENGALI LETTER CHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
22 | U+099C | জ | BENGALI LETTER JA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
23 | U+099D | ঝ | BENGALI LETTER JHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
24 | U+099E | ঞ | BENGALI LETTER NYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
25 | U+099F | ট | BENGALI LETTER TTA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
26 | U+09A0 | ঠ | BENGALI LETTER TTHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
27 | U+09A1 | ড | BENGALI LETTER DDA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
28 | 09A1 09BC (U+09DC) | ড় | Normalized form of BENGALI LETTER RRA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125] 09DC is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
29 | U+09A2 | ঢ | BENGALI LETTER DDHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
30 | 09A2 09BC (U+09DD) | ঢ় | Normalized form of BENGALI LETTER RHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125] 09DD is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
31 | U+09A3 | ণ | BENGALI LETTER NNA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
32 | U+09A4 | ত | BENGALI LETTER TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
33 | U+09A5 | থ | BENGALI LETTER THA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
34 | U+09A6 | দ | BENGALI LETTER DA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
35 | U+09A7 | ধ | BENGALI LETTER DHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
36 | U+09A8 | ন | BENGALI LETTER NA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
37 | U+09AA | প | BENGALI LETTER PA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
38 | U+09AB | ফ | BENGALI LETTER PHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
39 | U+09AC | ব | BENGALI LETTER BA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
43 | 09AF 09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125] 09DF is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112][125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
50 | U+09BE | া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
51 | U+09BF | ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
52 | U+09C0 | ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
53 | U+09C1 | ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112][122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (= Halant), Virama (= Dā̃ṛi) | 1 Bangla, 2 Assamese, 2 Manipuri | [112][122][125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122][125]
Table 8: Bangla Code-Point Repertoire
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes some specific sequences. These enable the conditional inclusion in the repertoire of Bangla LETTER A and LETTER E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, again followed by Bangla VOWEL SIGN AA, so that the æ sound, as in English 'bat', 'cat' etc., can be represented.
Sr No | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not an exhaustive list) | Reference
S1 | 0985 09CD 09AF 09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112][122]
S2 | 098F 09CD 09AF 09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9: Sequences
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find a place in Unicode but have not been included in the repertoire in the LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.
Sr No | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10: Excluded Code Points
5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire since it will never be used alone. It will be used only as part of the sequences that give the three special characters ড়, ঢ় and য় in normalized form.
Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য, to form ড়, ঢ় and য় respectively
Table 10b: Excluded Code Points
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20.11.2009 and 18.07.2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) began with the following variables:

C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta
The Augmented Backus-Naur Formalism (ABNF) will use the following operators:

Sr Number | Operator | Function
1 | "|" | Alternative
2 | "[]" | Optional
3 | "*" | Variable Repetition
4 | "()" | Sequence Group
Table 11: The ABNF Formalism
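To connect the variables above with actual labels, the following Python sketch maps each code point of the proposed repertoire (Table 8) to its ABNF variable. It is an illustration under stated assumptions, not the normative rule set: the function `to_symbols` and the set names are hypothetical, and the three base-letter + Nukta sequences are treated as a single C.

```python
def _span(first, last):
    """Set of characters from code point first to last, inclusive."""
    return {chr(cp) for cp in range(first, last + 1)}


# Classes read off the code point repertoire in Table 8 (assumed mapping).
VOWELS = _span(0x0985, 0x098B) | {"\u098F", "\u0990", "\u0993", "\u0994"}
CONSONANTS = (_span(0x0995, 0x09A8) | _span(0x09AA, 0x09B0) | {"\u09B2"}
              | _span(0x09B6, 0x09B9) | {"\u09F0", "\u09F1"})
MATRAS = _span(0x09BE, 0x09C4) | {"\u09C7", "\u09C8", "\u09CB", "\u09CC"}
SINGLES = {"\u0982": "B", "\u0981": "D", "\u0983": "X",
           "\u09CD": "H", "\u09CE": "Z"}
NUKTA = "\u09BC"
NUKTA_BASES = {"\u09A1", "\u09A2", "\u09AF"}  # DDA, DDHA, YA


def to_symbols(label):
    """Translate a label into a string of ABNF variables (C, V, M, ...)."""
    out = []
    i = 0
    while i < len(label):
        ch = label[i]
        if ch in NUKTA_BASES and label[i + 1:i + 2] == NUKTA:
            out.append("C")  # base letter + Nukta counts as one consonant
            i += 2
            continue
        if ch in VOWELS:
            out.append("V")
        elif ch in CONSONANTS:
            out.append("C")
        elif ch in MATRAS:
            out.append("M")
        elif ch in SINGLES:
            out.append(SINGLES[ch])
        else:
            raise ValueError(f"U+{ord(ch):04X} is outside the repertoire")
        i += 1
    return "".join(out)
```

For example, `to_symbols` renders বাংলা as `CMBCM`, while a label containing the Avagraha (U+09BD) raises an error because that code point is outside the repertoire.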
5.4.2 The Vowel Sequence
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given. To facilitate understanding by users of other Brāhmī scripts, equivalents in Devanāgarī are provided wherever necessary.
A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra / onuʃʃār (B), a Candrabindu (D) or a Visarga / biʃɔrgo (X). More than one of D, B or X may follow a V in Bangla, but the same sign may not be repeated: going by the rules illustrated in this document, formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have sequences that combine a candrabindu with an anusvāra on the one hand, and a candrabindu with a visarga on the other, as in হ্যাঁংচা 'hænchā' and হ্যাঁঃ 'hæn' respectively. The possibility of a Visarga or an Anusvāra (onuʃʃār) following a Candrabindu thus exists in Bangla.
A vowel can optionally be followed by a combination of Hasanta / Virama [H] and Consonant [C] to form a Ya-phala. Ya-phala is a presentation form of U+09AF, the Bangla letter য or 'ya'. Represented by the sequence <U+09CD ্ BENGALI SIGN VIRAMA (Hasanta), U+09AF য BENGALI LETTER YA>, Ya-phala has the special form ্য. Again, when combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. 'aa' (ā)), it is used for transcribing [æ], as in the "a" in the English word "bat", written in Bangla as ব্যাট. A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example:
V: অ (Devanāgarī अ)
5.4.2.2 A Vowel with Conditions
A vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples:
VB: অং (अं)
VD: অঁ (अँ)
VX: অঃ (अः)
VDB: অঁং (अँं)
VDX: অঁঃ (अँः)
VHCM: অ্যা, এ্যা
5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
VHCMB: অ্যাং, এ্যাং
VHCMD: অ্যাঁ, এ্যাঁ
VHCMX: অ্যাঃ, এ্যাঃ
VHCMDB: অ্যাঁং, এ্যাঁং
VHCMDX: অ্যাঁঃ, এ্যাঁঃ
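The vowel sequence rules of 5.4.2 can be sketched as a regular expression over code points. This is a hedged illustration rather than the LGR's own whole-label rule: the names `VOWEL_SEQUENCE` and `is_vowel_sequence` are hypothetical, the vowel class is taken from Table 8, and the VHCM branch is assumed to be limited to the two sequences S1 and S2 of Table 9.

```python
import re

# V: independent vowels in the proposed repertoire (Table 8).
V = "[\u0985-\u098B\u098F\u0990\u0993\u0994]"
# Optional tail: D, DB, DX, B or X (D = U+0981, B = U+0982, X = U+0983).
TAIL = "(?:\u0981[\u0982\u0983]?|[\u0982\u0983])?"
# VHCM restricted to A/E + Hasanta + YA + AA (assumption: S1 and S2 only).
YA_PHALA_AA = "[\u0985\u098F]\u09CD\u09AF\u09BE"

VOWEL_SEQUENCE = re.compile(f"(?:{YA_PHALA_AA}|{V}){TAIL}")


def is_vowel_sequence(text):
    """True if text is exactly one vowel sequence as sketched above."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None
```

Under these assumptions অঁঃ (VDX) and অ্যাং (VHCMB) are accepted, while a doubled sign such as VBB, or a Ya-phalā after a vowel other than অ and এ, is rejected.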
5.4.3 The Consonant Sequence
5.4.3.1 A Single Consonant (C)
Example:
C: ক (क)
5.4.3.2 A Consonant with Conditions
A consonant may optionally be followed by a dependent vowel sign, kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
CM: কি (कि)
CB: কং (कं)
CD: কঁ (कँ)
CX: কঃ (कः)
CH: ক্ (क्) (pure consonant)
CDB: কঁং (कँं)
CDX: কঁঃ (कँः)
5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB or DX.
Examples:
CMB: কীং (कीं)
CMD: কাঁ (काँ)
CMX: বীঃ (वीः)
CMDB: কাঁং (काँं)
CMDX: কাঁঃ (काँः)
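5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Examples:
CHC: ন্ত → ন + ্ + ত (न्त → न + ् + त)
CHCHC: ন্ত্র → ন + ্ + ত + ্ + র (न्त्र → न + ् + त + ् + र)
CHCHCHC: ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য (न्त्र्य → न + ् + त + ् + र + ् + य)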
5.4.3.5 Subsets
While considering its subsets, as a representative example we will consider the combination CHC only; however, the same is equally applicable to CHCHC and CHCHCHC.

[A] The combination may be followed by M, B, D, X, DB or DX.
Examples:
  CHCM     ক্কী → ক + ্ + ক + ী          क्की → क + ् + क + ी
  CHCB     ক্কং → ক + ্ + ক + ং          क्कं → क + ् + क + ं
  CHCD     ক্কঁ → ক + ্ + ক + ঁ          क्कँ → क + ् + क + ँ
  CHCX     ক্কঃ → ক + ্ + ক + ঃ          क्कः → क + ् + क + ः
  CHCDB    ক্কঁং → ক + ্ + ক + ঁ + ং      क्कँं → क + ् + क + ँ + ं
  CHCDX    ক্কঁঃ → ক + ্ + ক + ঁ + ঃ      क्कँः → क + ् + क + ँ + ः

[B] 3(CH)CM may further be followed by B, D, X, DB or DX.
Examples:
  CHCMB    ক্কীং → ক + ্ + ক + ী + ং      क्कीं → क + ् + क + ी + ं
  CHCMD    ক্কাঁ → ক + ্ + ক + া + ঁ      क्काँ → क + ् + क + ा + ँ
  CHCMX    ক্কীঃ → ক + ্ + ক + ী + ঃ      क्कीः → क + ् + क + ी + ः
  CHCMDB   ক্কাঁং → ক + ্ + ক + া + ঁ + ং  क्काँं → क + ् + क + ा + ँ + ं
  CHCMDX   ক্কাঁঃ → ক + ্ + ক + া + ঁ + ঃ  क्काँः → क + ् + क + ा + ँ + ः
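One way to check the consonant-sequence patterns of Section 5.4.3 is to translate each code point into its category symbol (C, H, M, B, D, X) and then match the resulting symbol string against the patterns in the document's own notation. The sketch below assumes the consonant and kāra ranges of the Unicode Bengali block, and it assumes a cap of three `HC` repetitions (the "up to 4" consonants of 5.4.3.4); the helper names are hypothetical.

```python
import re

H, B, D, X = "\u09CD", "\u0982", "\u0981", "\u0983"
SIGNS = {H: "H", B: "B", D: "D", X: "X"}
UNASSIGNED = (0x09A9, 0x09B1, 0x09B3, 0x09B4, 0x09B5)
MATRAS = (*range(0x09BE, 0x09C5), 0x09C7, 0x09C8, 0x09CB, 0x09CC)


def symbol(ch: str) -> str:
    """Map one code point to its category letter ('?' if unknown)."""
    cp = ord(ch)
    if ch in SIGNS:
        return SIGNS[ch]
    if 0x0995 <= cp <= 0x09B9 and cp not in UNASSIGNED:
        return "C"
    if cp in MATRAS:
        return "M"
    return "?"


def pattern_of(text: str) -> str:
    """Spell a string in the document's notation, e.g. 'CHCMDB'."""
    return "".join(symbol(ch) for ch in text)


# CH (pure consonant), or C with up to three further HC links,
# an optional M, and an optional tail B / D / X / DB / DX.
CONSONANT_SEQ = re.compile(r"CH|C(?:HC){0,3}M?(?:D[BX]?|[BX])?")


def is_consonant_sequence(text: str) -> bool:
    return CONSONANT_SEQ.fullmatch(pattern_of(text)) is not None
```

Working on symbol strings keeps the regular expression readable and lets the same pattern text be compared directly with the tables above.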
5.4.4 The Khanda-Ta Sequence

5.4.4.1 A single ‘Khanda’-Ta (Z)
Example:
  Z       ৎ (= ত্)

5.4.4.2 A Khanda Ta Combination¹⁰
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
  [CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) ‘scolding’.
Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
5.4.5 Special Cases: S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) could be described briefly here. Let us take up S in the first instance. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ, BANGLA LETTER A, and U+098F এ, BANGLA LETTER E). For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF ya, preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English ‘bat’, ‘fat’, etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as ‘vowel killer’, although it actually indicates the absence of a vowel after the marked consonant. Only the consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in the orthography, wherein Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).

Another case refers to the formation of repha and ra-phalā in the said script, mentioned in the table above as P. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka “the sun”); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra “cycle”). The point is that in both cases the slot for ra could be the Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā in both cases remain the same.

¹⁰ Refer to Rule P in Section 7, Table 16.
6 Variants
This section talks about the variants in the Bangla script. The NBGP categorizes these confusing variants into two groups:
Group 1: Confusing due to pure visual similarity.
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community.

For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formations. These can cause confusion even to a careful observer and are hence being proposed as variants.

The variants are generated in a script when two or more forms are formed with different storage or code points. In Bangla, the e-kāra, ā-kāra and the o-kāra have different code points. One can type o with a consonant at one go, or produce the same by typing e-kāra and ā-kāra as two separate keys, getting the same result. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered as a case of variant, because a kāra followed by a kāra is not allowed.
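The two ways of typing কো can be compared with Python's standard `unicodedata` module. In Unicode, U+09CB (o-kāra) has the canonical decomposition U+09C7 U+09BE, so the two spellings become identical after NFC or NFD normalization. The snippet is a small illustration of that fact; the variable and function names are hypothetical.

```python
import unicodedata

KA, E_KARA, AA_KARA, O_KARA = "\u0995", "\u09C7", "\u09BE", "\u09CB"

one_key = KA + O_KARA               # ko typed with the single o-kāra
two_keys = KA + E_KARA + AA_KARA    # ko typed as e-kāra followed by ā-kāra


def same_after_nfc(a: str, b: str) -> bool:
    """Compare two strings after canonical composition (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)
```

The raw strings differ in storage, but `same_after_nfc(one_key, two_keys)` holds, which is consistent with the text's point that the two-key form is not treated as a separate variant.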
6.1 In-Script Variants
However, we propose two cases of true in-script variants in the Bangla script.

CASE I
As far as true variants in Bangla are concerned, we may draw our attention to cases wherein (U+09A5) থ (tha), with Hasanta, appears as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)

The above combinations, if written in traditional orthography, could be a little confusing, as the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final position (as shown below in example no. 1). It could also be typed wrongly, thinking it was a হ (ha) U+09B9, increasing the chances of risks in label writing and identification.
Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)

The fonts which represent the traditional Bangla writing system could tend to create this problem. Therefore, these may be taken as cases of variants in Bangla.

CASE II
Another interesting example of variant is encountered in ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing ‘repha’ (involving ra + Hasanta) and ‘ra-phalā’ (involving Hasanta + ra). ‘Repha’ could be formed by two sequences (mainly because both Assamese and Bangla find place in the same UNICODE points, and ‘B_RA’ as well as ‘A_RA’ refer to the same phonetic element). Here the final ligatures look the same and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C

Where
  B_RA = U+09B0 BENGALI LETTER RA (র), or
  A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
  H    = U+09CD BENGALI SIGN VIRAMA (্)
  C    = any consonant (theoretically)

Example:
  Sequence 1 (using Bangla RA)          Ligature 1   Sequence 2 (using Assamese RA)        Ligature 2
  U+09B0 (র) U+09CD (্) U+0995 (ক)      র্ক          U+09F0 (ৰ) U+09CD (্) U+0995 (ক)      ৰ্ক
  U+09B0 (র) U+09CD (্) U+09A0 (ঠ)      র্ঠ          U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ)      ৰ্ঠ
Table 12: Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. Addition of Repha does not make any difference.

‘Ra-phala’ could be formed by two sequences on similar grounds, and the final ligatures would look the same:

(1) C1 + H + B_RA
(2) C1 + H + A_RA

Where C1 = any consonant except Khanda-ta

Example:
  Sequence 1 (using Bangla RA)          Ligature 1   Sequence 2 (using Assamese RA)        Ligature 2
  U+0995 (ক) U+09CD (্) U+09B0 (র)      ক্র          U+0995 (ক) U+09CD (্) U+09F0 (ৰ)      ক্ৰ
  U+09A8 (ন) U+09CD (্) U+09B0 (র)      ন্র          U+09A8 (ন) U+09CD (্) U+09F0 (ৰ)      ন্ৰ
Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusability for end users. Hence, the repha and ra-phala cases need to be defined as variants. The NBGP concluded to define র and ৰ as variant code points, so that a single variant set between র and ৰ covers all cases. But this will create blocked variant labels: e.g. if someone registers “ররর”, the variant label “ৰৰৰ” will be generated as a variant and will be blocked, and vice versa. However, it is only blocked at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, it is still possible.
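The blocking behaviour described above can be sketched as follows. This is a minimal illustration, not the LGR engine: it assumes that every position holding র or ৰ may be swapped independently (so mixed labels are also enumerated, which the text does not spell out), and the function names are hypothetical.

```python
from itertools import product

B_RA, A_RA = "\u09B0", "\u09F0"   # Bangla RA, Assamese RA
VARIANTS = {B_RA: (B_RA, A_RA), A_RA: (A_RA, B_RA)}


def variant_labels(label: str) -> set[str]:
    """All labels reachable by swapping র/ৰ, excluding the label itself."""
    choices = [VARIANTS.get(ch, (ch,)) for ch in label]
    return {"".join(combo) for combo in product(*choices)} - {label}


def index_label(label: str) -> str:
    """Fold every ৰ to র so that variant labels share one index form."""
    return label.replace(A_RA, B_RA)


def is_blocked_by(candidate: str, registered: str) -> bool:
    """True if candidate is a variant of an already registered label."""
    return candidate != registered and index_label(candidate) == index_label(registered)
```

Under these assumptions, registering “ররর” blocks “ৰৰৰ”, while “ৰৰ” and “ৰৰৰৰ” stay available because their index forms differ in length.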
6.2 Cross-Script Variants
A crisp cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia¹¹ (formerly Oriya), keeping in mind the visual and technical confusion they may cause as labels in web domains. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain characters which are somewhat similar, they have not been included here. They have been provided in the Appendix (10.2) for reference.

¹¹ Unicode uses Oriya for the script, although Odia is now the official term used.
1. Bangla and Nāgarī/Devanāgarī Script

  Bangla        Devanāgarī
  ম U+09AE      म U+092E
  ি U+09BF      ि U+093F
Table 14: Bangla and Devanāgarī cross-script variant code points

2. Bangla and Gurmukhi Script

  Bangla        Gurmukhī
  ম U+09AE      ਸ U+0A38
  ি U+09BF      ਿ U+0A3F
Table 15: Bangla and Gurmukhī cross-script variant code points
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in Bangla¹² script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications. Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories, as mentioned in the table provided in the Code point repertoire (Section 5.1).

  C → Consonant
  M → Kāra (Mātrā)
  V → Vowel
  B → Anusvāra
  D → Candrabindu
  X → Visarga
  H → Hasanta
  Z → Khanda Ta
  S → S1, S2 (from Table 9), or (a/e) Ya-phalā (V1 H C1 M1), where
        V1 is either 0985 (অ, BENGALI LETTER A) or 098F (এ, BENGALI LETTER E)
        H is 09CD (্, BENGALI SIGN VIRAMA)
        C1 is 09AF (য, BENGALI LETTER YA)
        M1 is 09BE (া, BENGALI VOWEL SIGN AA)
      S1 and S2 are valid even if they are not allowed by the other context rules.
  P → Ra-Hasanta (C2 H), where
        C2 is either 09B0 (র, BENGALI LETTER RA) or 09F0 (ৰ, ASSAMESE LETTER RA; Unicode name BENGALI LETTER RA WITH MIDDLE DIAGONAL)
        H is 09CD (্, BENGALI SIGN VIRAMA)
Table 16: Symbols used in WLE rules

¹² As used by Unicode, denoting and including both Assamese and Maṇipuri.
It is also perhaps ideal to mention here that in Bangla the consonant letters (or graphemes) are physically joined to form “clusters” that could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are “nearly 380 unique consonant clusters”, out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, p. 4]. More details of such combinations could be seen in Pabitra Sarkar (1993) [135].

7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented.
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়
2. H must be preceded by C
   Example:
3. M must be preceded by C
   Example: কা
4. D must be preceded by either of V, C or M
   Example: আখখা হা
5. X must be preceded by either of V, C, M or D
   Example: উঃখঃবঃাঃ দ ঃ
6. B must be preceded by either of V, C, M or D
   Example: আং ইং কং
7. Z must be preceded by V, C, M, D, B, X or P
   Example: ইৎকৎাৎাৎপ6ৎrৎ
   (S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V preceded by H.
9. S CANNOT be preceded by H.
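Rules 2 to 9 are all "what may precede what" constraints, so they can be sketched as a table of allowed predecessors plus two exceptions (S and P). The code below is an illustrative reading of the rules, not the normative LGR: the sets `V`, `C` and `M` are assumptions based on the Unicode Bengali block, the CN forms of rule 1 and the S1/S2 sequences of Table 9 are left out, and all names are hypothetical.

```python
V = {chr(cp) for cp in (*range(0x0985, 0x098D), 0x098F, 0x0990, 0x0993, 0x0994)}
C = {chr(cp) for cp in (*range(0x0995, 0x09A9), *range(0x09AA, 0x09B1),
                        0x09B2, *range(0x09B6, 0x09BA), 0x09F0)}
M = {chr(cp) for cp in (*range(0x09BE, 0x09C5), 0x09C7, 0x09C8, 0x09CB, 0x09CC)}
SIGNS = {"\u09CD": "H", "\u0982": "B", "\u0981": "D", "\u0983": "X", "\u09CE": "Z"}
RA = {"\u09B0", "\u09F0"}                                   # C2 of rule P
S_SEQUENCES = {"\u0985\u09CD\u09AF\u09BE", "\u098F\u09CD\u09AF\u09BE"}  # অ্যা, এ্যা

ALLOWED_BEFORE = {                     # rules 2-7
    "H": {"C"},
    "M": {"C"},
    "D": {"V", "C", "M"},
    "X": {"V", "C", "M", "D"},
    "B": {"V", "C", "M", "D"},
    "Z": {"V", "C", "M", "D", "B", "X"},
}


def category(ch):
    if ch in V:
        return "V"
    if ch in C:
        return "C"
    if ch in M:
        return "M"
    return SIGNS.get(ch)               # None for anything else


def violations(label):
    """Return (index, message) pairs for every broken context rule."""
    cats = [category(ch) for ch in label]
    found = []
    for i, cat in enumerate(cats):
        prev = cats[i - 1] if i > 0 else None
        if cat is None:
            found.append((i, "code point outside this sketch's repertoire"))
        elif cat == "C":
            continue
        elif cat == "V":
            if prev == "H":
                found.append((i, "rule 8: V cannot be preceded by H"))
        elif prev in ALLOWED_BEFORE[cat]:
            continue
        elif cat == "H" and i > 0 and label[i - 1:i + 3] in S_SEQUENCES:
            continue                   # Hasanta inside the special sequence S
        elif cat == "Z" and prev == "H" and i >= 2 and label[i - 2] in RA:
            continue                   # Khanda Ta after P (Ra + Hasanta)
        else:
            found.append((i, f"{cat} cannot follow {prev or 'label start'}"))
    return found
```

Rule 9 needs no separate branch here because S begins with a vowel, so an S preceded by H is already caught by rule 8. An empty result means the label passes this simplified check only; it says nothing about variants or repertoire membership.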
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in adding such ligatures, because any user can easily create them even though they may not be attested combinations.

7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া /bæŋk ʌv ɪndiə/ (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE) (meaning Bank of India). This is the case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.

This is a unique situation necessitated by the lack of hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic communities; hence this is explicitly prohibited by the NBGP.

In future, if required, depending on the prevailing requirements from the community, the future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules the ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.

1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Example:
  ্ক     ्क
  াক     ाक
  ংক     ंक
  ঁক     ँक
  ঃক     ःक
As can be seen, such combinations will automatically result in a “golu” or dotted circle, marking them as invalid formations. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.

2. H is not permitted after V, B, D, X, M, S.
Example:
  অ্     अ्
  অং क ক क কঃ कः ি )क

3. The number of B, D or X permitted after a Consonant, a Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalidated.
Example:
  কংং    कंं
  কঁঁ    कँँ
  কঃঃ    कःः
  কাঁঁ   काँँ
  কীঃঃ   कीःः
  অংং    अंं
  অঁঁ    अँँ
  অঃঃ    अःः

4. The number of M permitted after a Consonant is restricted to one.
Example:
  কীী    कीी

5. M is not permitted after V.
Example:
  ইা, ঈৗ    ईा, ईौ

6. The combinations of Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Example:
  কংঃ    कंः
  কঃং    कःं
8 Contributors

8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members

8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals’ Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. The Unicode Standard 10.0. Mountain View, CA.
[101] BandyopadhyayChittaranjan1981DuiShatakerBanglaMudranoPrakashanKolkataAnandaPublishers
[102] BanerjiRD1919TheOriginoftheBengaliScriptKolkataNewDelhiAsianEducationalServices2003reprint
[103] ChatterjiSK1926TheOriginandDevelopmentoftheBengaliLanguageCalcuttaCalcuttaUniversityPress
[104] -----1939Bhasha-prakashBangalaVyakaran(AGrammaroftheBengaliLanguage)CalcuttaUniversityofCalcutta
[105] HaiMuhammadAbdul1964DhvaniVijnanOBanglaDhvani-tattwa(PhoneticsandBengaliPhonology)DhakaBanglaAcademy
[106]JhaSubhadra1958TheFormationofMaithiliLondonLuzacampCo
[107] KosticDjordjeDasRheaS1972AShortOutlineofBengaliPhoneticsCalcuttaStatisticalPublishingCompany
[108] MajumdarRC1971HistoryofAncientBengalCalcuttaGBhardwaj
[109] MazumdarBijaychandra19202000TheHistoryoftheBengaliLanguage(ReprCalcutta1920ed)NewDelhiAsianEducationalServices
[110] PandeyAnshuman2001ProposaltoEncodetheTirhutaScriptinISOIEC10646
[111] PalPalashBaran2001DhwanimalaBarnamalaKolkataPapyrus
[112] -----2007lsquoBanglaHarapherPanchParbarsquoInSwapanChakrabortyedMudranerSanskritiOBanglaBoiKolkataAbabhas
[113] RossFiona1999ThePrintedBengaliCharacteranditsEvolutionLondonCurzon
[114] ShastriMahamahopadhyayHaraPrasad1916HājārBacharērPurāṇaBāṅgālāBhāṣāyBauddhaGānōDōhāCalcuttaBangiyaSahityaParishat
[115] SinghUdayaNarayana(JointlyManiruzzaman)1983DiglossiainBangladeshandlanguageplanningCalcuttaGyanBharati214pp
[116] -----1987ABibliographyofBengaliLinguisticsMysoreCIILxii+316pp
[117] -----2017(withRajibChakrabortyBidishaBhattacharjeeampArimardanKumarTripathy)LanguagesandCulturesontheMarginGuidelinesforFieldworkonEndangeredLanguagesMimeoCentreforEndangeredLanguagesVisva-Bharati
[118] -----1980ScriptalchoiceandspellingreformAnessayinlanguageandplanningJournaloftheMSUniversityofBarodaSocialScienceNumber292173-186AmodifiedversionreprintedEAnnamalaiBjornJernuddandJoanRubinedsLanguagePlanningProceedingsofanInstituteMysoreCIIL405-417
[119] Sripantha1996JakhanChapakhanaEloKolkataPaschim-BangaBanglaAcademy
[120] SurAtul1986BanglaMudranerDushoBacharKolkataJijnasa
[121] ScriptBehaviourforBengaliVersion11TDILandC-DACPune
[122] BoraMahendra1981TheEvolutionofAssameseScriptJorhatAssamSahityaSabha
[123] ProposaltoEncodetheTirhutaScriptinISOIEC10646httpwwwunicodeorgL2L201111175r-tirhutapdfaccessedon25112017
[124] EthnologueAssameseintheLanguageCloudhttpswwwethnologuecomcloudasmaccessedon25112017
[125] BengalialphabetforManipurifoundinEthnologueManipuri(MeeteilonMeithei)httpswwwomniglotcomwritingmanipurihtmaccessedon20102019
[126] WikipediaBengalialphabethttpsenwikipediaorgwikiBengali_alphabetaccessedon25112017
[129]OmniglotSlyhetihttpwwwomniglotcomwritingsylotihtmaccessedon1052018
[130]WikipediaBishnupriyaManipurilanguagehttpsenwikipediaorgwikiBishnupriya_Manipuri_languageaccessedon25112017
[131] TheEMILLECIILCorpushttpmetashareeldaorgrepositorybrowsethe-emilleciil-corpusabdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20accessedon1052018
[132]TheEMILLECIILCorpushttpcatalogelrainfoproduct_infophpproducts_id=696accessedon1052018
[133] BanglaLanguageampScript httpswwwisicalacin~rc_banglabanglahtmlaccessedon1052018
[134] SarkarPabitra1992BanglaBananSanskarSamasyaoSambhabanaKolkataChirayataPrakashan
[135] SarkarPabitra1993BanglaBhasharYuktabyanjanBhasha1123-45
[136] DashNiladriShekharandBBChaudhuri1998BanglaScriptAStructuralStudyLinguisticsToday121-28Alsoavailableathttpswwwacademiaedu9967428Bangla_Script_A_Structural_Study
[137] DaniAhmedHasan(1957)lsquoSrīhatta-NāgarīLipirUtpattioBikāśrsquoBanglaAcademyPatrika(Dhaka)Vol12(Bhadra-Agrahayan1364BangabdaNumber)pg1
[138] WikipediaSylhetiNagari httpsenwikipediaorgwikiSylheti_Nagariaccessedon1952018
[139] FuruiRyosuke(2015)lsquoVariegatedAdaptationsStateFormationinBengalfromtheFifthtoSeventhCenturyrsquoinBhairabiPrasadSahuampHermannKulkeedsInterrogatingPoliticalSystemsIntegrativeProcessesandStatesinPre-ModernIndiaChapter9Pp255-73NewDelhiManohar
[140] FergusonCharesAandMunierChowdhury(1960)lsquoPhonesofBengalirsquoLanguageVol36No1pp22-59
[141] ShahidullahMuhammad(2007)BuddhistMysticSongsDhakaMowlaBrothers
[142] RayPunyaSloka(1966)BengaliLanguageHandbookWashington
[143] HaiMuhammadAbdul(1960)AphoneticandphonologicalstudyofnasalsandnasalizationinBengaliDhakaUniversityofDhaka
[144] UnicodeConsortiumProposalSummaryFormtoAccompanySubmissionsforAdditionstotheRepertoireofISOIEC10646UNICODEhttpswwwunicodeorgL2L200202387r-syloti-formpdfaccessedonMay212018
[145] WikipediaOlChiki(Unicodeblock)httpsenwikipediaorgwikiOl_Chiki_(Unicode_block)accessedonMay212018
[146]BanglaScripthttpwwwbangladesh2000combdbangla_scripthtmlaccessedonMay212018
[147] BhattacharyaAshutoshed(1942)GopichandrerGanCalcuttaCalcuttaUniversity
[149] DasSisirKumar(1975)SahibsandMunshisAnAccountoftheCollegeofFortWilliamCalcutta
[150] IslamRafiqulPabitraSarkarMahbubulHaqampRajibChakraborty(eds)(2014)BanglaAcademyPromitoBanglaByabaharikByakaran(AFunctionalGrammarofStandardBangla)DhakaBanglaAcademy
[151] SarkarPabitra[2013]lsquoBanglaSpellingReformtheLongandShortofItrsquoBanglaJournal19215-232
[152] BanglaAcademy(2012)BanglaAcademyPromitoBanglaBananerNiyam(StandardBanglaSpellingasadoptedbyBanglaAcademy)DhakaBanglaAcademy
[153] SarkarPabitraampRajibChakraborty2018ldquoWhathashappenedSoFarIntermsof Script Reformsrdquo Paper presented at the Face to Face meeting jointly held by theBanglaAcademyDhakaampICANNatBanglaAcademyDhakaon10072018
[154] The Unicode Consortium 2018 The Unicodereg Standard Version 110 ndash CoreSpecificationChapter12P473
10 Appendix-I

10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language script, certain restriction rules apply. In other words, in a given language, some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla¹³ in particular, the following rules apply:

1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold ‘varga’ (as defined under Table 5). Moreover, Bangla does not allow ya (য়) at the beginning of a word either, but we can cite a couple of native examples, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem ‘Lichuchor’ written by Kazi Nazrul Islam. However, there are instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).

2. CH can come with Khanda Ta only in the case where C is ra (র) (09B0):
   র্ৎ, as in ভর্ৎসনা

3. Only the following combinations with VHCM will be allowed:
   → অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
   → এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
10.2 ‘Sylheti Nagarī lipi’ or ‘Siloti’
This version of the Bangla script resembles the ‘Kaithī’ script (ISO 12954) used by accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, widely in use during the 1880s. There were several other names of Sylheti Nagarī or Siloti (129), such as ‘Jalalabada Nagarī’, ‘Fula (flower) Nagarī’, ‘Muslim Nagarī’ or ‘Muhammad Nagarī’. It is said that Shah Jalala had brought the script with him in the 13th-14th century to Sylhet (138), although some suggested that it was an invention by the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here (138).

Table 17: The Script Table of Sylheti Nagarī or Siloti

¹³ This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.

10.3.1 Bangla and Nāgarī or Devanāgarī

  Bangla        Devanāgarī    NBGP Decision
  ঃ U+0983      ः U+0903      Confusable
  ও U+0993      उ U+0909      Confusable
  ঘ U+0998      घ U+0918      Confusable
  ঁ U+0981      ॅ U+0945      Confusable
Table 18: Bangla and Devanāgarī confusable code points

10.3.2 Bangla and Gurmukhi

  Bangla        Gurmukhi      NBGP Decision
  ঘ U+0998      ਬ U+0A2C      Confusable
  ঁ U+0981      ੱ U+0A71      Confusable
Table 19: Bangla and Gurmukhi confusable code points

  Bangla                   Gurmukhi      NBGP Decision
  ও U+0993                 ਤ U+0A24      Distinguishable
  শ U+09B6                 ਅ U+0A05      Distinguishable
  ম U+09AE                 ਮ U+0A2E      Distinguishable
  বা U+09AC and U+09BE     ਗ U+0A17      Distinguishable
Table 20: Bangla and Gurmukhī distinguishable code points

10.3.3 Bangla and Oriya (Odia)

  Bangla        Oriya (Odia)  NBGP Decision
  ও U+0993      ଓ U+0B13      Confusable
Table 21: Bangla and Oriya confusable code points

  Bangla        Oriya (Odia)  NBGP Decision
  ঘ U+0998      ସ U+0B38      Distinguishable
Table 22: Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs

Consonants | Phonetic Value | Allographs: Clusters | Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ Thisletterdoesnothaveanyparticularphoneticvaluebutmostlypronouncedasn
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒThesecondarysymbol(allograph)jɔ-phalahastwophoneticvaluesWhenaddedtotheinitialconsonantinaworditisavowelaelig(asinশ8ামলর 8াপারetc)Butafteranon-initialconsonantitjustdoublesitinpronunciation(asinকায6ধায6etc)The+যcombinationhastwophysicalmanifestationsmdashর 8andয6
ক8(p+য)স8(+য)র 8(+য)[Justর 8isneverusedinBanglaorthographyর 8াisbutthenitslasttwosymbolsYa-phalaa-karaconstituteavowelsignrepresentingthevowelঅ8া]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ hwordfinallywordmediallyitdoublesthepronunciationofthefollowingconsonant
অঃকঃ
অব
Transparent r (nn) s (pt) [ (st) where both member of the cluster can berecognizedOpaquewhereneitherofthetwocouldbe(easily)recognizedmdash8 ks(7 k+ষ s)= jn
( j+ঞ n)tng(un+গg) hm(gt h+ম m)
Semi-transparent A (pr)পH (rp)whereone symbol is recognizable and theother is
notIncaseofthree-termclustersatleastonesymbolwillnotbetransparentegv str
(w s+x t+র r)D str(B s+C t+র r)etc
There were in fact two types of proposals. One concerned the shape of the letters: those of consonant + vowel (CV) combinations and of conjuncts, which are consonant + consonant combinations. There were further complex shapes, i.e. those of consonant + consonant + (consonant +) vowel (CC(C)V) signs, as in প্রু (pru) or স্ক্রু (skru). Some decisions in this area were necessary because a few of the CC(C) symbols represented complexities that made learning them difficult for children. The other dealt with the spellings of words only, without any reference to the shapes of the letters in which they were written. The basic objective here was ‘one word, one spelling’, to the greatest extent that was possible [151].

Below we place a statement of the most salient changes that affect the consonant + vowel combinations [153].
a The variants of the short u (^ উ-কার hrasva u-kāra) vowel sign have been
brought down to one ie So zwnj (gu) is now গ Similarly h (ru) gt র zwnj
(śu)gt শ j (hu)gtহ and therefore cluster + short u sign k (ntu)gt W
(ন++ত+উ) (stu)gt[ (স++ত+উ)
b The variants of long u (দীঘH ঊ-কার dīrgha u-kāra) have also been reduced
(rū)gt র n (bhrū) gt Y (ভ bh++র r+ঊ ū) (drū)gt (দ d++র r+ঊ ū) p (śrū)gt
(শ ś++র r+ঊ ū)
c The variants of ঋ-কার (ṛ-kāra secondary symbol of ṛ) have been brought down to one q (hṛ) gt হ
Regarding consonant + consonant + (consonant) … + (vowel) clusters, Paschimbanga Bangla Akademi proposed transparent or semi-transparent shapes for clusters, to the extent admissible in the Bangla writing system. Some examples will clarify the proposal. (A slash will mean that the traditional cluster shape precedes it, while the Bangla Akademi innovation follows.) [153]
Xব ধ bdh ( b+ ধ dh) Mদ ধ ddh (J d+ধ dh) ন থ nth (L n+থ th) Uঞ চ ntildec
(9 ntilde+চ c) ঞ ছ $ ntildech (9+ছ) ঞ জ ntildej (9 ntilde+জ j) Sক ত amp kt (7 k+ত t) T
kr (7 k+র r) Nগ ধ ( gdh (K g+ধ dh) ) ṅk (u ṅ+ক k) t ṅg (u ṅ+গ g) +
ṣṇ (B ṣ+ণ ṇ) ন ndhr (L n+ dh+র r) - ṇḍr ( ṇ+ ḍ+র r) ktr (7 k+x
t+র r)
3.3.1 The Consonants
As per traditional classification, Bangla consonants are categorized according to their phonetic properties, especially in terms of place and manner of articulation [107]. There are five ‘Varga’ (pronounced as ‘Barga’ in Bangla) or groups (sets or classes), distinguished by place of articulation, and one non-‘varga’ group [105]. Each Varga, which corresponds to stops at a certain place of articulation, contains a series of five consonants classified as per their phonetic qualities (i.e. manner of articulation), beginning from Unvoiced and Unaspirated to Voiced and Aspirated (in the fourth column), and finally ending with a homorganic or corresponding nasal [107]. Consider the following table:
  ‘Varga’ or Sets   Unvoiced -Asp     Unvoiced +Asp      Voiced -Asp       Voiced +Asp        Nasal
  Velar             ক ‘K’ U+0995      খ ‘KH’ U+0996      গ ‘G’ U+0997      ঘ ‘GH’ U+0998      ঙ ‘NG’ U+0999
  Palatal           চ ‘C’ U+099A      ছ ‘CH’ U+099B      জ ‘J’ U+099C      ঝ ‘JH’ U+099D      ঞ ‘NY’ U+099E
  Retroflex         ট ‘TT’ U+099F     ঠ ‘TTH’ U+09A0     ড ‘DD’ U+09A1     ঢ ‘DDH’ U+09A2     ণ ‘NN’ U+09A3
  Dental            ত ‘T’ U+09A4      থ ‘TH’ U+09A5      দ ‘D’ U+09A6      ধ ‘DH’ U+09A7      ন ‘N’ U+09A8
  Bilabial          প ‘P’ U+09AA      ফ ‘PH’ U+09AB      ব ‘B’ U+09AC      ভ ‘BH’ U+09AD      ম ‘M’ U+09AE
Table 5: Varga classification of Bangla consonants
(Falling into a pattern of five sets of Unvoiced Unaspirated, Unvoiced Aspirated, Voiced Unaspirated, Voiced Aspirated and Nasals, called the five ‘Varga’)
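Because the varga consonants in Table 5 occupy consecutive code points (with a single gap at the unassigned U+09A9), the row and column of a consonant can be computed from its code point alone. The sketch below relies only on the code points listed in Table 5; the function name and the column labels are illustrative.

```python
VARGAS = ["Velar", "Palatal", "Retroflex", "Dental", "Bilabial"]
COLUMNS = ["unvoiced unaspirated", "unvoiced aspirated",
           "voiced unaspirated", "voiced aspirated", "nasal"]


def varga_of(ch: str):
    """Return (varga, column) for a varga consonant, else None."""
    cp = ord(ch)
    if 0x0995 <= cp <= 0x09A8:          # velar .. dental (20 letters)
        idx = cp - 0x0995
    elif 0x09AA <= cp <= 0x09AE:        # bilabial, after the U+09A9 gap
        idx = cp - 0x09AA + 20
    else:
        return None                     # non-varga or not a consonant
    return VARGAS[idx // 5], COLUMNS[idx % 5]
```

For instance, `varga_of("ড")` yields the retroflex row and the voiced unaspirated column, while non-varga letters such as য return `None`.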
  Non-Varga   য ‘Y’ U+09AF      য় ‘YY’ U+09DF     র ‘R’ U+09B0      ড় ‘RR’ U+09DC     ঢ় ‘RH’ U+09DD
              ল ‘L’ U+09B2      শ ‘SH’ U+09B6     ষ ‘SS’ U+09B7     স ‘S’ U+09B8      হ ‘H’ U+09B9
Table 6: Non-Varga consonants (Not falling into any of the five categories)
3.3.2 The Implicit Vowel Killer: Hasanta (called ‘Halant’ or ‘Halanta’ in other Brāhmī-based scripts)
As stated earlier, all consonants are pronounced in isolation with an implicit vowel (central back -ɔ in Bangla, as the neutral vowel) assumed to be associated with them [121]. The ‘Hasanta’ (= ‘Halant’ or ‘Halanta’ in other Brāhmī-based scripts) or the term ‘Virāma’⁷ (= ‘Da ri’ in Bangla), as preferred in UNICODE (cf. Unicode 3.0 and above), have been used in this report as terms denoting the character that marks the absence of this inherent vowel. It may be noted that the term virama has been adopted in UNICODE in a sense that is different from the traditional definition in grammar, and hence it requires some explanation here. Considering the importance of the document, this note should be a part of this LGR document, so that anybody referring to it is able to know the proper grammatical explanation of the term.

Because a special sign is needed whenever this implicit vowel is stripped off, the symbol is known as the Hasanta (= Halant) (U+09CD). By placing the Hasanta under the first consonant of a combination or cluster, one could, in common parlance, “kill” its vowel and create conjuncts. In this manner, conjunct characters can generally be written by joining two to four consonants. In rare cases, this process can join up to five consonants. However, the notion of a maximum number of consonants joining to form one aksara⁸ is to be bounded empirically. This is an observation based on the CIIL-Emille Corpora of Bangla words [132 & 133] as seen in print these days. Given the mixture of scripts and languages happening on the web, the possibility that one may want a generic Top Level Domain [gTLD] which has more than the observed maximum cannot be ruled out. This can be the case when a foreign-language word which admits a large number of consonants is transliterated into Bangla. Hence, in the Bangla LGR work this limit will not be enforced.
3.3.3 Vowels
Separate symbols exist for all ‘Swara’ or vowels in Bangla, which are pronounced independently either at the beginning of the word or after another vowel or consonant sound. To indicate a vowel sound other than the implicit one, a vowel sign called ‘kār’ in Bangla or Mātrā in Nāgarī (footnote 9) is attached to the consonant. Since the consonant has this built-in neutral vowel at the end, there are equivalent kāras (Mātrās) for all vowels except অ (pronounced -ɔ). The correlation is shown as follows:

7 Virāma as used here is also a misnomer according to the Indian grammatical traditions. Nowhere is the mere absence of a vowel marked as virama. Hasanta just marks the absence of a vowel, nothing else (Abhyankar, Kashinath Vasudev & J. M. Shukla. 1961. A Dictionary of Sanskrit Grammar. Baroda: Oriental Institute).
8 This term needs to be disambiguated. Aksara also means ‘syllable’ in Indian grammatical traditions.
Vowel | Corresponding vowel sign (kāra / Mātrā)
অ ‘A’ U+0985 | (none)
আ ‘AA’ U+0986 | ◌া U+09BE
ই ‘I’ U+0987 | ◌ি U+09BF
ঈ ‘II’ U+0988 | ◌ী U+09C0
উ ‘U’ U+0989 | ◌ু U+09C1
ঊ ‘UU’ U+098A | ◌ূ U+09C2
ঋ Vocalic ‘R’ U+098B | ◌ৃ U+09C3
ৠ Vocalic ‘RR’ U+09E0 | ◌ৄ U+09C4
ঌ Vocalic ‘L’ U+098C | ◌ৢ U+09E2
ৡ Vocalic ‘LL’ U+09E1 | ◌ৣ U+09E3
এ ‘E’ U+098F | ◌ে U+09C7
ঐ ‘AI’ U+0990 | ◌ৈ U+09C8
ও ‘O’ U+0993 | ◌ো U+09CB
ঔ ‘AU’ U+0994 | ◌ৌ U+09CC
- | ◌ৗ U+09D7
Could appear on top of অ ‘A’ U+0985 or any other vowel | ◌ঁ U+0981 Candrabindu
Could appear after অ ‘A’ U+0985 or any other vowel | ◌ং U+0982 Anusvara
Could appear after অ ‘A’ U+0985 or any other vowel | ◌ঃ U+0983 Visarga
After any consonant | ◌্ U+09CD (Hasanta)
- | ঽ U+09BD Avagraha
Table 7: Bangla vowels with corresponding kārs

9 Although the term ‘Mātrā’ in Bangla stands for an altogether different concept, viz. the top bar placed over a letter, typically available in Hindi and Bangla but missing in Gujarati.
3.3.4 The Anusvāra / onuʃʃār (ং - U+0982)
The Anusvāra, or onuʃʃār in Bangla, at times represents a homorganic nasal, but not always. It replaces a conjunct group of ‘Nasal Consonant + Hasanta + Consonant’ where the second consonant belongs to the velar varga or set, as in লংকা. But it often appears also in such combinations where a non-velar is the last member of the combination, as in ল্যাংটা “naked” or ল্যাংচা “a kind of sweet; to limp”. Before a non-varga consonant, the Anusvāra represents a nasal sound that may have an alternative conjoined writing symbol representing the corresponding nasal consonant of the particular set. Although Modern Hindi, Marathi and Konkani prefer the anusvāra to the corresponding half-nasal, in Bangla it is clearly demarcated where one must use the Anusvāra and where it has to be a conjunct cluster with a nasal as the first or the second component.
3.3.5 Nasalization: Candrabindu (◌ঁ - U+0981)
Candrabindu denotes nasalization of the preceding vowel, as in চাঁদ cãd ‘moon’ (U+099A U+09BE U+0981 U+09A6). This sign, with a dot inside the half-moon mark, is used as a nasalization marker in many Brāhmī-based scripts [143].
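The code point sequence quoted for ‘moon’ can be checked with Python's standard unicodedata module. This is only an illustrative sketch; the variable names are not taken from the proposal.

```python
import unicodedata

# The four code points given in the text for the word meaning 'moon'
MOON = "\u099A\u09BE\u0981\u09A6"

names = [unicodedata.name(ch) for ch in MOON]
for ch, name in zip(MOON, names):
    print(f"U+{ord(ch):04X} {name}")

# Candrabindu is a non-spacing mark, so it attaches to what precedes it
candrabindu_category = unicodedata.category("\u0981")
```

The listing shows that the Candrabindu is stored after the vowel sign AA, i.e. after the vowel it nasalizes.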
3.3.6 Nukta (◌় - U+09BC)
The nukta sign does not exist in Bangla orthography. It is predominantly used in many Brāhmī-derived scripts such as Devanāgarī (for Hindi, Bodo, Maithili, Santali, Kashmiri and Sindhi). The term and the concept of nukta are borrowed in Bangla. The IDNA Protocol (RFC 5891) states that IDNs must be in Unicode Normalization Form C (NFC). RFC 7940 applies this requirement to LGRs. The definition of NFC in the Unicode Standard contains a number of composition exclusions. As a result, the Bangla letters য় YYA, ড় RRA and ঢ় RRHA have to be represented in this LGR by the sequences (YA + Nukta: U+09AF + U+09BC), (DDA + Nukta: U+09A1 + U+09BC) and (DDHA + Nukta: U+09A2 + U+09BC) instead of the single code points YYA (U+09DF), RRA (U+09DC) and RRHA (U+09DD), although the use of ‘Nukta’ is otherwise completely unnatural in Bangla.
It is noted that in the current Unicode Standard chart these characters are listed as additional consonants. As per the LGR Procedure, however, these decisions depend on the IDNA Protocol, through a set of procedures developed by the IETF. Even though the Unicode Standard also prescribes methods to produce these three characters as atomic characters (for example 09DC for ড় [r], 09DD for ঢ় [rh] and 09DF for য় [y], each as a single keystroke), the IDNA protocol requires that we treat them as conjunct characters and then allocate codes for these in the Unicode Bengali Block.
It may be noted that there could be sporadic attempts or cases of writing Muslim names, Urdu poetic words and Perso-Arabic loan words with a nukta under ক (k), খ (kh), গ (g), জ (j) and ফ (ph), only for the sake of correct pronunciation and for maintaining the sanctity of the loan word. This amounts to using the Bangla writing system like the IPA script. It is, however, not in use in printed Bangla.
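The effect of the composition exclusions can be observed with Python's standard unicodedata module. The following sketch (the helper name nfc_codepoints is hypothetical) normalizes the three single code points to NFC and lists the result.

```python
import unicodedata

def nfc_codepoints(text):
    """Return the code points of text after NFC normalization."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFC", text)]

# The three precomposed letters discussed in the text
for label, ch in (("RRA", "\u09DC"), ("RRHA", "\u09DD"), ("YYA", "\u09DF")):
    print(label, f"U+{ord(ch):04X}", "->", nfc_codepoints(ch))
```

Because the precomposed letters are excluded from composition, NFC leaves each of them as a base letter followed by U+09BC, which is why only the sequences can appear in an NFC-normalized label.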
3.3.7 Visarga / biʃɔrgo (ঃ - U+0983) and Avagraha (ঽ - U+09BD)
The Visarga (biʃɔrgo, U+0983) is frequently used in Bangla loan words borrowed from Sanskrit and represents a sound very close to h. One could quote as an example দুঃখ duhkho “sorrow”, “unhappiness” (U+09A6 U+09C1 U+0983 U+0996).
The Avagraha ঽ (U+09BD) is mainly used in Sanskrit, Pali, Prakrit or Maithili texts written in Bangla. It is gradually being replaced by an upper comma (e.g. নরোঽপরাণি re-written as নরো’পরাণি). It is rarely used now, even in other languages using the Bangla script. In the LGR the Avagraha is not part of the repertoire: it has been decided not to retain Avagraha (ঽ) (U+09BD) because it is blocked in TLDs as per the Maximal Starting Repertoire (MSR). Please see Appendix II in Section 11 for a complete list of Bangla consonants and their allographs.
3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)
This note is pertinent to the use of the Zero Width Joiner (ZWJ) and the Zero Width Non-joiner (ZWNJ) in Bangla. It needs to be noted that Nepali, Konkani and Hindi use these two signs in a different manner.
ZWJ (U+200D) and ZWNJ (U+200C) are code points that have been provided by the Unicode Standard to instruct the rendering of a string where the script has the option between joining and non-joining characters. Without the use of these control codes, the string may be rendered in an alternate form from what is intended.
Use of ZWJ
• Insofar as Bangla is concerned, ZWJ is used for the proper rendering of characters such as khaṇḍa-ta ৎ, as in সত্যজিৎ (satyajit) “Satyajit” and সৎ (sat) “honest”. This is typed as follows: ta + Hasanta + ZWJ (U+200D).
• However, ZWJ is more important where the same combination of consonantal characters is represented differently depending upon the context. E.g. র + ্ + য has two representations in Bangla: as র্য and as র‍্য. To get the form র্য one has to type র + ্ + য, but for র‍্য the sequence would be র + ZWJ + ্ + য [154]. In other words, ZWJ is used in the rendering of words demanding ya-phalā after ra, which is otherwise not possible to type (render), because the same order ra + hasanta + antastha ja is used in the medial and/or final position. Interestingly, ra + hasanta + antastha ja is used to type repha on the consonant antastha ja, as in কার্য (kaarjo). In order to get a ya-phalā after the consonant ra, it is therefore obligatory to use ZWJ after ra, as in র‍্যাপার (wrapper), র‍্যাশ (rash), র‍্যালি (rally) etc. The typing sequence is given below:
ra (র) + ZWJ + hasanta (্) + antastha ja (য) = র‍্য
Use of ZWNJ
• ZWNJ is used in Bangla to represent the explicit Hasanta or Halant. In order to avoid conjunct formation in cases where there is an explicit hasanta before the succeeding consonant, the ZWNJ is used:
Consonant + hasanta + ZWNJ + consonant = explicit hasanta
Example: প্রাক্‌কথন (prakkathana, prakkɔtʰon)
The use of ZWJ and ZWNJ has been ruled out from the root zone by the [Procedure]. Used in Bangla to create alternate renderings, the insertion of these two signs can affect searching as well as NLP.
The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Hasanta) where default conjunct formation is to be explicitly restricted and the Hasanta joining the two consonants participating in the conjunct formation needs to be explicitly shown.
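The typing sequences above, and the reason why joiners affect searching, can be sketched in Python as follows. The function names has_joiner and strip_joiners are hypothetical and serve only as an illustration; they are not part of the [Procedure].

```python
ZWNJ = "\u200C"     # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"      # ZERO WIDTH JOINER
HASANTA = "\u09CD"  # BENGALI SIGN VIRAMA
RA = "\u09B0"       # BENGALI LETTER RA
YA = "\u09AF"       # BENGALI LETTER YA

reph_on_ya = RA + HASANTA + YA        # ra + hasanta + ya: repha over ya
ra_yaphala = RA + ZWJ + HASANTA + YA  # ra + ZWJ + hasanta + ya: ra with ya-phala

def has_joiner(label):
    """True if the label contains ZWJ or ZWNJ."""
    return any(ch in (ZWJ, ZWNJ) for ch in label)

def strip_joiners(label):
    """Remove ZWJ and ZWNJ, as a naive search index might do."""
    return "".join(ch for ch in label if ch not in (ZWJ, ZWNJ))

print(has_joiner(ra_yaphala))                   # True
print(strip_joiners(ra_yaphala) == reph_on_ya)  # True
```

Once the invisible joiner is removed, the two visually distinct forms collapse into the same stored string, which is one way to see why these code points complicate searching.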
3.3.9 Use of Ya-phalaa
Ya-phalaa sequences are two instances in Bangla where Hasanta is preceded by a full vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E):
• অ্যা 0985 09CD 09AF 09BE: BENGALI LETTER A + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
• এ্যা 098F 09CD 09AF 09BE: BENGALI LETTER E + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD Hasanta plus U+09AF ya, preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English ‘acid’ অ্যাসিড, ‘association’ অ্যাসোসিয়েশন, ‘bat’ ব্যাট, ‘fat’ ফ্যাট, ‘mat’ ম্যাট, ‘cap’ ক্যাপ etc.
The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as ‘vowel killer’, although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But, as we see here, Bangla ends up with a deviant feature in the orthography in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
3.3.10 Formation of Ra-phalaa and Reph Sequences
This case refers to the formation of repha and ra-phalā as follows:
Ra-Hasanta = (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - Assamese letter RA; Unicode name BENGALI LETTER RA WITH MIDDLE DIAGONAL), and H is 09CD (◌্ - BENGALI SIGN VIRAMA).
Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALĀ). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka “the sun”); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra “cycle”). The point is that in both cases the slot for ra could be the Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed/preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā in both cases remain the same. The LGR makes a note of this point of concern with respect to the two RAs in disguise, as it would be completely impossible to distinguish between them with the naked eye in a label so generated, which may consequently lead to concerns related to spoofing and other kinds of cyber irregularities. The motive for classing these two code points as (blocking) variants is that fully rendered labels may mask the distinction between the Bangla ra র (U+09B0) and the Assamese ra ৰ (U+09F0). That provides the justification for Variant Set 4, though only in the context of a following Hasant. The difference between the RAs is only distinguishable if one looks into their Unicode values. Therefore labels such as অর্ক arka, শীর্ষ śīrṣa ‘top, apex’, অভ্র abhra ‘cloud, the sky’ and শ্রম śrama ‘physical labour’ could be extremely dangerous, as the web user may never verify the digital content (the labels) against its Unicode code points. This point is made explicitly with reference to Table 9 (Sequences) and Table 16 (WLE Symbols) that are to follow. Moreover, it is noteworthy that the REPHA can also occur with KHANDA TA. The conditions in this context of KHANDA TA are liable to be such that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
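The concern about the two RAs can be made concrete with a short Python sketch. The folding function below is a hypothetical illustration of a variant comparison and is not the rule defined in the LGR XML file; in particular, treating RA both before and after Hasanta as interchangeable is an assumption made here to cover repha and ra-phalā alike.

```python
B_RA = "\u09B0"     # BENGALI LETTER RA (Bangla ra)
A_RA = "\u09F0"     # BENGALI LETTER RA WITH MIDDLE DIAGONAL (Assamese ra)
HASANTA = "\u09CD"  # BENGALI SIGN VIRAMA

def fold_ra(label):
    """Replace Assamese RA by Bangla RA wherever it touches a Hasanta
    (assumed contexts: repha and ra-phala)."""
    chars = list(label)
    for i, ch in enumerate(chars):
        if ch != A_RA:
            continue
        before = i > 0 and chars[i - 1] == HASANTA
        after = i + 1 < len(chars) and chars[i + 1] == HASANTA
        if before or after:
            chars[i] = B_RA
    return "".join(chars)

def same_variant_set(a, b):
    """True if two labels differ only by the RA used next to a Hasanta."""
    return fold_ra(a) == fold_ra(b)

arka_bangla = "\u0985" + B_RA + HASANTA + "\u0995"    # a + ra + hasanta + ka
arka_assamese = "\u0985" + A_RA + HASANTA + "\u0995"  # same label with Assamese ra
print(arka_bangla == arka_assamese)                   # False
print(same_variant_set(arka_bangla, arka_assamese))   # True
```

The two labels differ as code point strings yet are reported as belonging together, which is the situation the blocking variant is meant to catch.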
4 Overall Development Process and Methodology
The Neo-Brāhmī Generation Panel (NBGP) has been formed by members having experience in Linguistics (especially in NLP and Computational Linguistics), Literature, Language History and Epigraphy. Under the Neo-Brāhmī Generation Panel, Bangla and eight other scripts belonging to separate Unicode blocks are being taken up to assign a separate LGR for each. However, an attempt is made to ensure that the fundamental philosophy behind building those LGRs is consistent with all other Brāhmī-derived scripts. The present LGR will cater to multiple languages belonging to EGIDS scale 1 to 4 (see Table 4) that use the Bangla script. The following guiding principles are used in making decisions about Bangla LGR code points.
4.1 Guiding Principles
The NBGP adopts the following broad principles for the selection of code points in the code-point repertoire, across the board, for all the Neo-Brāhmī scripts within its ambit.
4.1.1 Inclusion Principles
4.1.1.1 Modern Usage
Every character proposed should be in the everyday usage of a particular linguistic community. The characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code-point repertoire.
4.1.1.2 Unambiguous Use
Every character proposed should have an unambiguous understanding among linguists about its usage in the language.
4.2 Exclusion Principles
The main exclusion principle is that of External Limits on Scope. These consist of protocols or standards which are prerequisites to the Label Generation Rule-sets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.
4.2.1 External Limits of Scope
The code point repertoire for the root zone being a very special case at the top of the protocol hierarchies, the canvas of available characters for selection as a part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:
i. The Unicode Chart
Out of all the characters that are needed by the script in question, if a particular character is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, and especially so in the Bangla-Asamiyā-Manipuri Writing System, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.
ii. IDNA Protocol
Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded, as far as possible, all the characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters of the Unicode repertoire from being part of domain names.
iii. Maximal Starting Repertoire (MSR)
The Root Zone LGR being the repertoire of characters which are going to be used for the creation of the Root Zone TLDs, which in turn constitute an even more specialized case of domain names, the ROOT LGR procedure introduces additional exclusions on the IDNA's allowed set of characters.
Example: Bangla Sign Avagraha ঽ (U+09BD), even if allowed by the IDNA protocol, is not permitted in the Root Zone Repertoire as per the MSR.
To sum up, the restrictions start off with admitting only such characters as are part of the code block of the given script/language. The IDNA Protocol further narrows this down, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
4.2.1.1 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brāhmī-based scripts will not be included.
4.2.1.2 No Symbols and Abbreviations
Abbreviations, weights and measures and other such iconic characters like BANGLA ISSHAR (U+09FA), BANGLA CURRENCY DENOMINATOR SIXTEEN ৹ (U+09F9) etc. will also not be included.
4.2.1.3 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, such as the Sanskritic VOCALIC RR ৠ (U+09E0) and VOCALIC L ঌ (U+098C), as well as VOCALIC LL ৡ (U+09E1), and the allographic -kāra forms of the latter two symbols: VOWEL SIGN VOCALIC L ◌ৢ (U+09E2) and VOWEL SIGN VOCALIC LL ◌ৣ (U+09E3). All such characters are excluded, which complies with the Conservatism principle as laid down in the Root Zone LGR procedure. However, in Bangla the -kāra corresponding to VOCALIC RR ৠ (U+09E0), which is VOWEL SIGN VOCALIC RR ◌ৄ (U+09C4), is still in active use in certain limited borrowed or Sanskritic words and is therefore retained.
4.2.1.4 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit will not be included. This is also in consonance with the Letter principle as laid down in the Root Zone LGR procedure.
4.2.1.5 ABNF
The Augmented Backus-Naur Formalism (ABNF) is described in Section 5.4.1 and the Appendix (Section 10.1).
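The successive filters of Section 4.2.1 (Unicode block, IDNA Protocol, MSR) can be pictured as a chain of membership checks. The sketch below is purely illustrative: the two sample sets are hypothetical stand-ins containing only characters named in this section, not the actual IDNA2008 or MSR tables, and the function name filter_stage is an assumption.

```python
# Hypothetical sample data: these small sets only illustrate the order of
# the filters; the real IDNA2008 and MSR tables are far larger.
BENGALI_BLOCK = range(0x0980, 0x0A00)      # Unicode Bengali block
IDNA_DISALLOWED_SAMPLE = {0x09F9, 0x09FA}  # assumed: symbol-like characters
MSR_EXCLUDED_SAMPLE = {0x09BD}             # Avagraha, per the example in the text

def filter_stage(cp):
    """Report the first filter that rejects a code point, if any."""
    if cp not in BENGALI_BLOCK:
        return "outside the Bengali block"
    if cp in IDNA_DISALLOWED_SAMPLE:
        return "excluded by IDNA"
    if cp in MSR_EXCLUDED_SAMPLE:
        return "excluded by MSR"
    return "candidate for the LGR repertoire"

for cp in (0x0995, 0x09BD, 0x09FA, 0x0915):
    print(f"U+{cp:04X}: {filter_stage(cp)}")
```

A code point that survives all three checks is still only a candidate: the panel's own exclusion principles (4.2.1.1 to 4.2.1.4) apply afterwards.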
5 Repertoire
The Bangla Writing System is represented in UNICODE using the Bengali (Bangla) script name as enumerated in ISO 15924, corresponding to languages such as Asamiyā (Assamese), Bangla (Bengali) and Manipuri. The BENGALI block used for Bangla-Asamiyā-Manipuri in UNICODE has 93 entries. This section details the code-point repertoire that the Neo-Brāhmī Generation Panel [NBGP] proposes to be included in the Bangla LGR.
It may be mentioned here that the Government of Assam submitted a proposal to the Bureau of Indian Standards (BIS) on 26th February 2016 for dis-unification of the Bangla and Asamiyā scripts. The BIS, in its 8th Meeting of the Indian Language Technologies and Products Sectional Committee, LITD 20, held on 23rd Aug 2017, decided to refer the proposal for recognition of the Assamese script in ISO/IEC 10646 to ISO. Until the UNICODE Consortium takes any further action, it will be assumed that the Code Point Repertoire under Table 8 will be valid for all the three languages as above.
For each of the code points, language references have been given in the last column, titled Reference, under Table 8, titled “Code Point Repertoire”. For entire coverage of Bangla code points, references of Bangla, Asamiyā (Assamese), Manipuri (Meitei) and Bishnupriya are given. Kokborok written in Bangla script is not known to have introduced many new complications, except for one particular character. Though only a few representative languages under EGIDS Scale 1-4 have been chosen for referencing, they together cover all the code points required for all the languages that NBGP has considered, as given under Bangla Unicode Points (as given in UNICODE 6.3).
However, before the details are presented, it is ideal to look at the Bangla Code Point Chart from the Maximal Starting Repertoire [MSR] Version 3. It may be noted that the shapes of the reference glyphs given below in the code charts are based on one of the many fonts designed and are not prescriptive, because there could be some variations in actual fonts, both UNICODE-compatible and True-Type ones. Consider the following code point table:
Figure 1: Bangla Code Page from [MSR] for Bangla-Asamiyā-Manipuri

Colour convention:
All characters that are included in the [MSR] - Yellow background
PVALID in IDNA2008 but excluded from the [MSR] - Pinkish background
Not PVALID in IDNA2008, or ineligible for the root zone (digits, hyphen) - White background

Given the Bangla Unicode Block as in Figure 1, among the code points that are included in the MSR, the following symbols will need a separate treatment:
ৎ U+09CE Bangla Letter Khanda-Ta
ৰ U+09F0 Asamiyā-Bangla Letter Ra With Middle Diagonal
ৱ U+09F1 Asamiyā-Bangla Letter Ra With Lower Diagonal
5.1 Code Point Repertoire Inclusion

No | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
1 | U+0981 | ◌ঁ | BENGALI SIGN CANDRABINDU | Candra-bindu | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
2 | U+0982 | ◌ং | BENGALI SIGN ANUSVARA | Onushshar (Anusvara) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
3 | U+0983 | ◌ঃ | BENGALI SIGN VISARGA | Biśarga (Visarga) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
4 | U+0985 | অ | BENGALI LETTER A | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
5 | U+0986 | আ | BENGALI LETTER AA | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
6 | U+0987 | ই | BENGALI LETTER I | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
7 | U+0988 | ঈ | BENGALI LETTER II | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
8 | U+0989 | উ | BENGALI LETTER U | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
9 | U+098A | ঊ | BENGALI LETTER UU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
10 | U+098B | ঋ | BENGALI LETTER VOCALIC R | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
11 | U+098F | এ | BENGALI LETTER E | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
12 | U+0990 | ঐ | BENGALI LETTER AI | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
13 | U+0993 | ও | BENGALI LETTER O | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
14 | U+0994 | ঔ | BENGALI LETTER AU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
15 | U+0995 | ক | BENGALI LETTER KA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
16 | U+0996 | খ | BENGALI LETTER KHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
17 | U+0997 | গ | BENGALI LETTER GA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
18 | U+0998 | ঘ | BENGALI LETTER GHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
19 | U+0999 | ঙ | BENGALI LETTER NGA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
20 | U+099A | চ | BENGALI LETTER CA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
21 | U+099B | ছ | BENGALI LETTER CHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
22 | U+099C | জ | BENGALI LETTER JA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
23 | U+099D | ঝ | BENGALI LETTER JHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
24 | U+099E | ঞ | BENGALI LETTER NYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
25 | U+099F | ট | BENGALI LETTER TTA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
26 | U+09A0 | ঠ | BENGALI LETTER TTHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
27 | U+09A1 | ড | BENGALI LETTER DDA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
28 | 09A1 09BC (U+09DC) | ড় | Normalized form of BENGALI LETTER RRA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125] 09DC is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
29 | U+09A2 | ঢ | BENGALI LETTER DDHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
30 | 09A2 09BC (U+09DD) | ঢ় | Normalized form of BENGALI LETTER RHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125] 09DD is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
31 | U+09A3 | ণ | BENGALI LETTER NNA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
32 | U+09A4 | ত | BENGALI LETTER TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
33 | U+09A5 | থ | BENGALI LETTER THA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
34 | U+09A6 | দ | BENGALI LETTER DA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
35 | U+09A7 | ধ | BENGALI LETTER DHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
36 | U+09A8 | ন | BENGALI LETTER NA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
37 | U+09AA | প | BENGALI LETTER PA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
38 | U+09AB | ফ | BENGALI LETTER PHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
39 | U+09AC | ব | BENGALI LETTER BA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
43 | 09AF 09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125] 09DF is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112][125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
50 | U+09BE | ◌া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
51 | U+09BF | ◌ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
52 | U+09C0 | ◌ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
53 | U+09C1 | ◌ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
54 | U+09C2 | ◌ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
55 | U+09C3 | ◌ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
56 | U+09C4 | ◌ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112][122]
57 | U+09C7 | ◌ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
58 | U+09C8 | ◌ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
59 | U+09CB | ◌ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
60 | U+09CC | ◌ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
61 | U+09CD | ◌্ | BENGALI SIGN VIRAMA | Hasanta (= Halant), Virama (= Dāṛi) | 1 Bangla, 2 Assamese, 2 Manipuri | [112][122][125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122][125]
Table 8: Bangla Code-Point Repertoire
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes some specific sequences which enable conditional inclusion of the Bangla LETTER A and LETTER E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, again followed by Bangla VOWEL SIGN AA, in the repertoire, for enabling inclusion of the æ sound as in English ‘bat’, ‘cat’ etc.
Sr No | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
S1 | 0985 09CD 09AF 09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112][122]
S2 | 098F 09CD 09AF 09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9: Sequences
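The role of the two sequences in Table 9 can be sketched as a simple check: a Hasanta directly after an independent vowel is tolerated only when it is part of S1 or S2. The code below is an illustration under that assumption; the function name vowel_hasanta_ok is hypothetical, and the normative rule is the one expressed in the LGR XML file.

```python
HASANTA = "\u09CD"

# The two sequences of Table 9
S1 = "\u0985\u09CD\u09AF\u09BE"  # A + VIRAMA + YA + VOWEL SIGN AA
S2 = "\u098F\u09CD\u09AF\u09BE"  # E + VIRAMA + YA + VOWEL SIGN AA
SEQUENCES = {"S1": S1, "S2": S2}

# Independent vowels listed in Table 8 (rows 4 to 14)
INDEPENDENT_VOWELS = set(
    "\u0985\u0986\u0987\u0988\u0989\u098A\u098B\u098F\u0990\u0993\u0994"
)

def vowel_hasanta_ok(label):
    """True if every Hasanta that directly follows an independent vowel
    belongs to one of the sequences S1 or S2."""
    for i, ch in enumerate(label):
        if ch == HASANTA and i > 0 and label[i - 1] in INDEPENDENT_VOWELS:
            if label[i - 1:i + 3] not in SEQUENCES.values():
                return False
    return True

acid = S1 + "\u09B8\u09BF\u09A1"  # S1 followed by sa, vowel sign i, dda
print(vowel_hasanta_ok(acid))     # True
```

A label in which a different vowel, or a different consonant after the Hasanta, takes the place of the S1/S2 pattern is rejected by this check.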
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find a place in Unicode but have not been included in the repertoire in the LGR proposal. The reason for excluding ঌ (U+098C) and ◌ৗ (U+09D7) is that they are rare and obsolete characters.

Sr No | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ◌ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10: Excluded Code Points
5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire since it will never be used alone. It will be used only as part of a sequence in three special characters, in normalized form, for ড় ঢ় য়.

Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ◌় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড় ঢ় য় respectively
Table 10b: Excluded Code Points
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for .भारत or .ভারত, drafted between 20.11.2009 and 18.07.2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) began with the following variables:

C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta

The Augmented Backus-Naur Formalism (ABNF) will use the following operators:

Sr Number | Operator | Function
1 | “|” | Alternative
2 | “[ ]” | Optional
3 | “*” | Variable Repetition
4 | “( )” | Sequence Group
Table 11: The ABNF Formalism
5.4.2 The Vowel Sequence
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given. To facilitate understanding by users of other Brāhmī scripts, equivalents in Devanāgarī are provided wherever necessary.
A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra/onuʃʃār (B), a Candrabindu (D) or a Visarga/biʃɔrgo (X). The number of D, B or X signs which can follow a V in Bangla is not necessarily restricted to one. Going by the rules illustrated in the document, it is clear that repeated formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have sequences such as a candrabindu combined with an anusvāra on the one hand, and a candrabindu combined with a visarga on the other, as in হ্যাঁংচা ‘hænchā’ and হ্যাঁঃ ‘hæn’ respectively. The possibility of a Visarga or Anusvāra (onuʃʃār) following a Candrabindu exists in Bangla.
A vowel can optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phala. Ya-phala is a presentation form of U+09AF, the Bangla letter য or ‘ya’, represented by the sequence <U+09CD BENGALI SIGN VIRAMA (Bangla sign Hasanta or Virama), U+09AF য BENGALI LETTER YA>. Ya-phala has a special form ◌্য. Again, when combined with U+09BE ◌া BENGALI VOWEL SIGN AA (i.e. ‘aa’ (ā)), it is used for transcribing [æ], as in the “a” in the English word “bat”, written in Bangla as ব্যাট. A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example:
V: অ अ

5.4.2.2 A Vowel with Conditions
A vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples:
VB: অং अं
VD: অঁ अँ
VX: অঃ अः
VDB: অঁং अँं
VDX: অঁঃ अँः
VHCM: অ্যা, এ্যা

5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
VHCMB: অ্যাং, এ্যাং
VHCMD: অ্যাঁ, এ্যাঁ
VHCMX: অ্যাঃ, এ্যাঃ
VHCMDB: অ্যাঁং, এ্যাঁং
VHCMDX: অ্যাঁঃ, এ্যাঁঃ
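The vowel-sequence combinations listed above can be restated as a regular expression over the ABNF variables. The sketch below is an approximation offered for illustration: the character sets are taken from Table 8, while the names categories, VOWEL_SEQUENCE and is_vowel_sequence are hypothetical. It also leaves H, C and M generic, although in the proposal only S1 and S2 of Table 9 fill the VHCM slot.

```python
import re

VOWELS = set("\u0985\u0986\u0987\u0988\u0989\u098A\u098B\u098F\u0990\u0993\u0994")
# Consonants U+0995..U+09B9, skipping the unassigned code points in that range
CONSONANTS = {chr(cp) for cp in range(0x0995, 0x09BA)
              if cp not in (0x09A9, 0x09B1, 0x09B3, 0x09B4, 0x09B5)}
MATRAS = set("\u09BE\u09BF\u09C0\u09C1\u09C2\u09C3\u09C4\u09C7\u09C8\u09CB\u09CC")
SIGNS = {"\u0982": "B", "\u0981": "D", "\u0983": "X", "\u09CD": "H"}

def categories(label):
    """Map each character to its ABNF variable; '?' marks anything else."""
    out = []
    for ch in label:
        if ch in VOWELS:
            out.append("V")
        elif ch in CONSONANTS:
            out.append("C")
        elif ch in MATRAS:
            out.append("M")
        else:
            out.append(SIGNS.get(ch, "?"))
    return "".join(out)

# V, optionally HCM, optionally one of B, D, X, DB, DX
VOWEL_SEQUENCE = re.compile(r"V(HCM)?(DB|DX|B|D|X)?")

def is_vowel_sequence(label):
    return VOWEL_SEQUENCE.fullmatch(categories(label)) is not None

print(categories("\u0985\u0981\u0982"))         # VDB
print(is_vowel_sequence("\u0985\u0981\u0982"))  # True
print(is_vowel_sequence("\u0985\u0982\u0982"))  # False (VBB)
```

Working on the category string rather than on raw code points keeps the pattern as close as possible to the notation used in the text.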
5.4.3 The Consonant Sequence

5.4.3.1 A Single Consonant (C)
Example:
C: ক क

5.4.3.2 A Consonant with Conditions
A consonant can optionally be followed by a dependent vowel sign kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
CM: কি कि
CB: কং कं
CD: কঁ कँ
CX: কঃ कः
CH: ক্ क् (pure consonant)
CDB: কঁং कँं
CDX: কঁঃ कँः

5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB or DX.
Examples:
CMB: কীং कीं
CMD: কাঁ काँ
CMX: বীঃ वीः
CMDB: কাঁং काँं
CMDX: কাঁঃ काँः

5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Examples:
CHC: ন্ত → ন + ্ + ত; न्त → न + ् + त
CHCHC: ন্ত্র → ন + ্ + ত + ্ + র; न्त्र → न + ् + त + ् + र
CHCHCHC: ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য; न्त्र्य → न + ् + त + ् + र + ् + य
5.4.3.5 Subsets
While considering its subsets, as a representative example we will consider the combination CHC only; however, the same is equally applicable to CHCHC and CHCHCHC.
[A] The combination may be followed by M, B, D, X, DB or DX.
Examples:
CHCM: ক্কী → ক + ্ + ক + ী; क्की → क + ् + क + ी
CHCB: ক্কং → ক + ্ + ক + ং; क्कं → क + ् + क + ं
CHCD: ক্কঁ → ক + ্ + ক + ঁ; क्कँ → क + ् + क + ँ
CHCX: ক্কঃ → ক + ্ + ক + ঃ; क्कः → क + ् + क + ः
CHCDB: ক্কঁং → ক + ্ + ক + ঁ + ং; क्कँं → क + ् + क + ँ + ं
CHCDX: ক্কঁঃ → ক + ্ + ক + ঁ + ঃ; क्कँः → क + ् + क + ँ + ः
[B] *3(CH)CM may further be followed by a B, D, X, DB or DX.
Examples:
CHCMB: ক্কীং → ক + ্ + ক + ী + ং; क्कीं → क + ् + क + ी + ं
CHCMD: ক্কাঁ → ক + ্ + ক + া + ঁ; क्काँ → क + ् + क + ा + ँ
CHCMX: ক্কীঃ → ক + ্ + ক + ী + ঃ; क्कीः → क + ् + क + ी + ः
CHCMDB: ক্কাঁং → ক + ্ + ক + া + ঁ + ং; क्काँं → क + ् + क + ा + ँ + ं
CHCMDX: ক্কাঁঃ → ক + ্ + ক + া + ঁ + ঃ; क्काँः → क + ् + क + ा + ँ + ः
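The consonant-sequence rules of Section 5.4.3 can likewise be sketched as one pattern over the ABNF variables. This is an illustrative approximation only: the names used are hypothetical, the character sets come from Table 8, nukta sequences and Khanda Ta are left out, and the repetition bound of three (CH) groups mirrors the “up to 4” consonants stated in 5.4.3.4, whereas Section 3.3.2 states that the LGR itself does not enforce this limit.

```python
import re

# Consonants U+0995..U+09B9 (unassigned code points skipped) plus the two
# additional RA letters of Table 8
CONSONANTS = {chr(cp) for cp in range(0x0995, 0x09BA)
              if cp not in (0x09A9, 0x09B1, 0x09B3, 0x09B4, 0x09B5)}
CONSONANTS |= {"\u09F0", "\u09F1"}
MATRAS = set("\u09BE\u09BF\u09C0\u09C1\u09C2\u09C3\u09C4\u09C7\u09C8\u09CB\u09CC")
SIGNS = {"\u0982": "B", "\u0981": "D", "\u0983": "X", "\u09CD": "H"}

def categories(label):
    """Map each character to its ABNF variable; '?' marks anything else."""
    out = []
    for ch in label:
        if ch in CONSONANTS:
            out.append("C")
        elif ch in MATRAS:
            out.append("M")
        else:
            out.append(SIGNS.get(ch, "?"))
    return "".join(out)

# *3(CH)C, then either a final H, or an optional M plus optional B/D/X/DB/DX
CONSONANT_SEQUENCE = re.compile(r"(CH){0,3}C(H|M?(DB|DX|B|D|X)?)")

def is_consonant_sequence(label):
    return CONSONANT_SEQUENCE.fullmatch(categories(label)) is not None

ntra = "\u09A8\u09CD\u09A4\u09CD\u09B0"  # na + H + ta + H + ra (CHCHC)
print(categories(ntra))                  # CHCHC
print(is_consonant_sequence(ntra))       # True
```

Raising the bound in `{0,3}` or replacing it with `*` would model the unrestricted behaviour described in Section 3.3.2.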
5.4.4 The Khanda-Ta Sequence

5.4.4.1 A Single ‘Khanda’-Ta (Z)
Example:
Z: ৎ

5.4.4.2 A Khanda Ta Combination (see footnote 10)
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) ‘scolding’.
Note: The conditions in this context of KHANDA TA are that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) can be described briefly here. Let us take up S in the first instance. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ - BANGLA LETTER A and U+098F এ - BANGLA LETTER E). For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF ya, preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English ‘bat’, ‘fat’ etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as ‘vowel killer’, although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But, as we see here, Bangla ends up with a deviant feature in the orthography in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
Another case refers to the formation of repha and ra-phalā in the said script, mentioned in the table above as P. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALĀ). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka “the sun”); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra “cycle”). The point is that in both cases the slot for ra could be the Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed/preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā in both cases remain the same.

10 Refer to Rule P in Section 7, Table 16.
6 Variants
This section talks about the variants in the Bangla script. The NBGP categorizes these confusing variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from character formations as normally perceived by the larger linguistic community
For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formations. These can cause confusion even to a careful observer and are hence being proposed as variants.
Variants are generated in a script when two or more forms are formed with different storage or code points. In Bangla, the e-kāra, the ā-kāra and the o-kāra have different code points. One can type o with a consonant at one go, or obtain the same result by typing e-kāra and ā-kāra as two separate keys. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered as a case of variant, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants
However, we propose two cases of true in-script variants in the Bangla script.
CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases wherein (U+09A5) থ (tha), joined by Hasanta, appears as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1 স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus
  স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2 ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus
  ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, could be a little confusing, as the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final position (as shown below in example no. 1). It could also be typed wrongly, thinking it was a হ (ha) U+09B9, increasing the risks in label writing and identification.
Examples
1 স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2 ন্থ and ন্হ (as in গ্রন্থ grantha)
Fonts which represent the traditional Bangla writing system tend to create this problem. Therefore, these may be taken as cases of variants in Bangla.
CASE II
Another interesting example of variants is encountered in ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra).
'Repha' could be formed by two sequences (mainly because both Assamese and Bangla are encoded in the same Unicode block, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here the final ligatures look the same and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C
Where
B_RA = U+09B0 BENGALI LETTER RA (র)
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H    = U+09CD BENGALI SIGN VIRAMA (্)
C    = any consonant (theoretically)
Example
Sequence 1 (using Bangla RA)          Ligature 1    Sequence 2 (using Assamese RA)        Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক)      র্ক            U+09F0 (ৰ) U+09CD (্) U+0995 (ক)      ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ)      র্ঠ            U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ)      ৰ্ঠ
Table 12: Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. The addition of Repha does not make any difference.
'Ra-phalā' could be formed by two sequences on similar grounds, and the final ligatures would look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
Where C1 = any consonant except Khaṇḍa-ta
Example
Sequence 1 (using Bangla RA)          Ligature 1    Sequence 2 (using Assamese RA)        Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র)      ক্র            U+0995 (ক) U+09CD (্) U+09F0 (ৰ)      ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র)      ন্র            U+09A8 (ন) U+09CD (্) U+09F0 (ৰ)      ন্ৰ
Table 13: Example of Ra-phalā
As the Assamese and Bangla Repha and Ra-phalā conjunct forms look the same, this could cause confusability for end-users. Hence, the repha and ra-phalā cases need to be defined as variants. The NBGP concluded that র and ৰ should be defined as variant code points, so that a single variant set between র and ৰ covers all cases. This will, however, create blocked variant labels: e.g. if someone registers "ররর", the variant label "ৰৰৰ" will be generated as a variant and will be blocked, and vice versa. However, it is only blocked at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, it is still possible.
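The label-level blocking described above can be illustrated with a small sketch. The function name variant_labels is hypothetical, and expanding every position independently (so that mixed forms are also produced) is an assumption about how the single variant set would be applied, not a statement of the LGR tooling.

```python
from itertools import product

B_RA = "\u09B0"  # র BENGALI LETTER RA
A_RA = "\u09F0"  # ৰ BENGALI LETTER RA WITH MIDDLE DIAGONAL

# One variant set covering both directions of the mapping.
VARIANT_SET = {B_RA: (B_RA, A_RA), A_RA: (A_RA, B_RA)}


def variant_labels(label: str) -> set:
    """Return every variant label obtained by swapping RA forms.

    Each code point is expanded independently, so mixed forms are
    included; the applied-for label itself is left out of the result.
    """
    choices = [VARIANT_SET.get(ch, (ch,)) for ch in label]
    return {"".join(combo) for combo in product(*choices)} - {label}


blocked = variant_labels(B_RA * 3)  # variants of the label "ররর"
```

Labels of a different length, such as ৰৰ or ৰৰৰৰ, never appear in the result, which matches the observation that blocking applies only at the label level.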
6.2 Cross-Script Variants
A crisp cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia¹¹ (formerly Oriya), keeping in mind the visual and technical confusions they may cause as labels in the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain other characters which are somewhat similar, they have not been included here. They have been provided in the Appendix (10.3) for reference.
11 Unicode uses Oriya for the script, although Odia is now the official term used.
1 Bangla and Nāgarī/Devanāgarī Script
Bangla        Devanāgarī
ম U+09AE      म U+092E
ি U+09BF      ि U+093F
Table 14: Bangla and Devanāgarī cross-script variant code points
2 Bangla and Gurmukhī Script
Bangla        Gurmukhī
ম U+09AE      ਸ U+0A38
ি U+09BF      ਿ U+0A3F
Table 15: Bangla and Gurmukhī cross-script variant code points
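Tables 14 and 15 can be read as lookup tables. The sketch below is an illustration only: the dictionary and function names are hypothetical, and the choice to report a cross-script variant only when every code point of the label has a counterpart in the other script is an assumption of this example.

```python
# Cross-script variant code points taken from Tables 14 and 15.
CROSS_SCRIPT = {
    "Devanagari": {"\u09AE": "\u092E", "\u09BF": "\u093F"},
    "Gurmukhi": {"\u09AE": "\u0A38", "\u09BF": "\u0A3F"},
}


def cross_script_variants(label: str) -> dict:
    """Map a Bangla label to its whole-label variants in other scripts."""
    found = {}
    for script, table in CROSS_SCRIPT.items():
        # A whole-label variant exists only if every code point maps.
        if label and all(ch in table for ch in label):
            found[script] = "".join(table[ch] for ch in label)
    return found


mi = "\u09AE\u09BF"  # মি
variants_of_mi = cross_script_variants(mi)
```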
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in section 3.2 when written in the Bangla¹² script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications.
Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories, as mentioned in the table provided in the code point repertoire (Section 5.1).
C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khaṇḍa Ta
S → S1, S2 (from Table 9) or (æ) Ya-phalā (V1 H C1 M1), where
    V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E)
    H is 09CD (্ - BENGALI SIGN VIRAMA)
    C1 is 09AF (য - BENGALI LETTER YA)
    M1 is 09BE (া - BENGALI VOWEL SIGN AA)
    S1 and S2 are valid even if they are not allowed by the other context rules
P → Ra-Hasanta (C2 H), where
    C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL)
    H is 09CD (্ - BENGALI SIGN VIRAMA)
Table 16: Symbols used in WLE rules
12 As used by Unicode, denoting and including both Assamese and Maṇipuri.
It is also perhaps ideal to mention here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters" that could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, p. 4]. More details of such combinations can be seen in Pabitra Sarkar (1993) [135].
7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1 C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়
2 H must be preceded by C
  Example: ক্
3 M must be preceded by C
  Example: কা
4 D must be preceded by either of V, C or M
  Example: আঁ খঁ খাঁ হাঁ
5 X must be preceded by either of V, C, M or D
  Example: উঃখঃবঃাঃ দ ঃ
6 B must be preceded by either of V, C, M or D
  Example: আং ইং কং
7 Z must be preceded by V, C, M, D, B, X or P
  Example: ইৎকৎাৎাৎপ6ৎrৎ
  (S is not listed because S ends with M; Z may also follow S.)
8 V CANNOT be preceded by H (details in 7.1.1, Case of V Preceded by H)
9 S CANNOT be preceded by H
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in adding such ligatures, because any user can easily create them even though they may not be attested combinations.
7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া bæŋkʌv ɪndiə (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE) (meaning Bank of India). This is the case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic communities; hence this is explicitly prohibited by the NBGP.
In future, if required, depending on the prevailing requirements from the community, the future NBGP may consider revisiting this rule.
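Rules 2 to 9 of Section 7.1 can be sketched as a predecessor check over the symbols of Table 16. What follows is a simplified illustration and not the normative XML ruleset: the code point ranges assigned to each symbol, the names CLASSES, tokenize and violated_rules, and the omission of S1 and S2 (whose definition in Table 9 is not reproduced here) are all assumptions of this example.

```python
# Symbol classes from Table 16; the ranges are assumed for illustration.
CLASSES = {}


def _add(symbol, *ranges):
    for lo, hi in ranges:
        for cp in range(lo, hi + 1):
            CLASSES[chr(cp)] = symbol


_add("C", (0x0995, 0x09A8), (0x09AA, 0x09B0), (0x09B2, 0x09B2),
     (0x09B6, 0x09B9), (0x09F0, 0x09F1))
_add("V", (0x0985, 0x098B), (0x098F, 0x0990), (0x0993, 0x0994))
_add("M", (0x09BE, 0x09C3), (0x09C7, 0x09C8), (0x09CB, 0x09CC))
_add("D", (0x0981, 0x0981))
_add("B", (0x0982, 0x0982))
_add("X", (0x0983, 0x0983))
_add("H", (0x09CD, 0x09CD))
_add("Z", (0x09CE, 0x09CE))

NUKTA = "\u09BC"
NUKTA_BASES = {"\u09A1", "\u09A2", "\u09AF"}  # bases of the CN forms (rule 1)
RA = {"\u09B0", "\u09F0"}                     # C2 in P = Ra-Hasanta
S_FORMS = ("\u0985\u09CD\u09AF\u09BE", "\u098F\u09CD\u09AF\u09BE")  # V1 H C1 M1

ALLOWED_BEFORE = {"H": set("C"), "M": set("C"), "D": set("VCM"),
                  "X": set("VCMD"), "B": set("VCMD"), "Z": set("VCMDBXS")}
RULE_NO = {"H": 2, "M": 3, "D": 4, "X": 5, "B": 6, "Z": 7}


def tokenize(label):
    """Split a label into (symbol, text) pairs, merging S and CN units."""
    tokens, i = [], 0
    while i < len(label):
        if label.startswith(S_FORMS, i):
            tokens.append(("S", label[i:i + 4]))
            i += 4
            continue
        ch = label[i]
        if ch == NUKTA and tokens and tokens[-1][1] in NUKTA_BASES:
            tokens[-1] = ("C", tokens[-1][1] + ch)  # normalized consonant CN
        elif ch in CLASSES:
            tokens.append((CLASSES[ch], ch))
        else:
            raise ValueError(f"U+{ord(ch):04X} is outside the assumed repertoire")
        i += 1
    return tokens


def violated_rules(label):
    """Return the numbers of the WLE rules (2 to 9) that the label breaks."""
    tokens, broken = tokenize(label), []
    for i, (sym, _) in enumerate(tokens):
        prev = tokens[i - 1][0] if i else None
        if sym in ALLOWED_BEFORE:
            ok = prev in ALLOWED_BEFORE[sym]
            if sym == "Z" and prev == "H":
                # Z may follow P, i.e. a Hasanta that itself follows RA.
                ok = i >= 2 and tokens[i - 2][1] in RA
            if not ok:
                broken.append(RULE_NO[sym])
        elif sym in ("V", "S") and prev == "H":
            broken.append(8 if sym == "V" else 9)
    return broken
```

For instance, violated_rules applied to অফ্ই reports rule 8, the case discussed in 7.1.1, while ভর্ৎসনা passes because its Khaṇḍa Ta follows Ra-Hasanta.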
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules the ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.
1 H, M, B, D or X cannot occur at the beginning of a Bangla word
Example
ক क
াক ाक
ংক क
ক क
ঃক ःक
As can be seen, such a combination will automatically result in a "golu" or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.
2 H is not permitted after V, B, D, X, M, S
Example
অ अ
অং क ক क কঃ कः ি )क
3 The number of B, D or X permitted after a Consonant, Vowel or kāra (Mātrā) is restricted to one; thus the following combinations are invalid
Example
কংং क
ক क
কঃঃ कःः
কা का
কীঃঃ कःः
অংং अ
অ अ
অঃঃ अःः
4 The number of M permitted after a Consonant is restricted to one
Example
কীী की
5 M is not permitted after V
Example
ইা ঈৗ ईा ईौ
6 The combinations Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible
Example
কংঃ कः
কঃং कः
8 Contributors
8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members
8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals' Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] BandyopadhyayChittaranjan1981DuiShatakerBanglaMudranoPrakashanKolkataAnandaPublishers
[102] BanerjiRD1919TheOriginoftheBengaliScriptKolkataNewDelhiAsianEducationalServices2003reprint
[103] ChatterjiSK1926TheOriginandDevelopmentoftheBengaliLanguageCalcuttaCalcuttaUniversityPress
[104] -----1939Bhasha-prakashBangalaVyakaran(AGrammaroftheBengaliLanguage)CalcuttaUniversityofCalcutta
[105] HaiMuhammadAbdul1964DhvaniVijnanOBanglaDhvani-tattwa(PhoneticsandBengaliPhonology)DhakaBanglaAcademy
[106]JhaSubhadra1958TheFormationofMaithiliLondonLuzacampCo
[107] KosticDjordjeDasRheaS1972AShortOutlineofBengaliPhoneticsCalcuttaStatisticalPublishingCompany
[108] MajumdarRC1971HistoryofAncientBengalCalcuttaGBhardwaj
[109] MazumdarBijaychandra19202000TheHistoryoftheBengaliLanguage(ReprCalcutta1920ed)NewDelhiAsianEducationalServices
[110] PandeyAnshuman2001ProposaltoEncodetheTirhutaScriptinISOIEC10646
[111] PalPalashBaran2001DhwanimalaBarnamalaKolkataPapyrus
[112] -----2007lsquoBanglaHarapherPanchParbarsquoInSwapanChakrabortyedMudranerSanskritiOBanglaBoiKolkataAbabhas
[113] RossFiona1999ThePrintedBengaliCharacteranditsEvolutionLondonCurzon
[114] ShastriMahamahopadhyayHaraPrasad1916HājārBacharērPurāṇaBāṅgālāBhāṣāyBauddhaGānōDōhāCalcuttaBangiyaSahityaParishat
[115] SinghUdayaNarayana(JointlyManiruzzaman)1983DiglossiainBangladeshandlanguageplanningCalcuttaGyanBharati214pp
[116] -----1987ABibliographyofBengaliLinguisticsMysoreCIILxii+316pp
[117] -----2017(withRajibChakrabortyBidishaBhattacharjeeampArimardanKumarTripathy)LanguagesandCulturesontheMarginGuidelinesforFieldworkonEndangeredLanguagesMimeoCentreforEndangeredLanguagesVisva-Bharati
[118] -----1980ScriptalchoiceandspellingreformAnessayinlanguageandplanningJournaloftheMSUniversityofBarodaSocialScienceNumber292173-186AmodifiedversionreprintedEAnnamalaiBjornJernuddandJoanRubinedsLanguagePlanningProceedingsofanInstituteMysoreCIIL405-417
[119] Sripantha1996JakhanChapakhanaEloKolkataPaschim-BangaBanglaAcademy
[120] SurAtul1986BanglaMudranerDushoBacharKolkataJijnasa
[121] ScriptBehaviourforBengaliVersion11TDILandC-DACPune
[122] BoraMahendra1981TheEvolutionofAssameseScriptJorhatAssamSahityaSabha
[123] ProposaltoEncodetheTirhutaScriptinISOIEC10646httpwwwunicodeorgL2L201111175r-tirhutapdfaccessedon25112017
[124] EthnologueAssameseintheLanguageCloudhttpswwwethnologuecomcloudasmaccessedon25112017
[125] BengalialphabetforManipurifoundinEthnologueManipuri(MeeteilonMeithei)httpswwwomniglotcomwritingmanipurihtmaccessedon20102019
[126] WikipediaBengalialphabethttpsenwikipediaorgwikiBengali_alphabetaccessedon25112017
[129]OmniglotSlyhetihttpwwwomniglotcomwritingsylotihtmaccessedon1052018
[130]WikipediaBishnupriyaManipurilanguagehttpsenwikipediaorgwikiBishnupriya_Manipuri_languageaccessedon25112017
[131] TheEMILLECIILCorpushttpmetashareeldaorgrepositorybrowsethe-emilleciil-corpusabdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20accessedon1052018
[132]TheEMILLECIILCorpushttpcatalogelrainfoproduct_infophpproducts_id=696accessedon1052018
[133] BanglaLanguageampScript httpswwwisicalacin~rc_banglabanglahtmlaccessedon1052018
[134] SarkarPabitra1992BanglaBananSanskarSamasyaoSambhabanaKolkataChirayataPrakashan
[135] SarkarPabitra1993BanglaBhasharYuktabyanjanBhasha1123-45
[136] DashNiladriShekharandBBChaudhuri1998BanglaScriptAStructuralStudyLinguisticsToday121-28Alsoavailableathttpswwwacademiaedu9967428Bangla_Script_A_Structural_Study
[137] DaniAhmedHasan(1957)lsquoSrīhatta-NāgarīLipirUtpattioBikāśrsquoBanglaAcademyPatrika(Dhaka)Vol12(Bhadra-Agrahayan1364BangabdaNumber)pg1
[138] WikipediaSylhetiNagari httpsenwikipediaorgwikiSylheti_Nagariaccessedon1952018
[139] FuruiRyosuke(2015)lsquoVariegatedAdaptationsStateFormationinBengalfromtheFifthtoSeventhCenturyrsquoinBhairabiPrasadSahuampHermannKulkeedsInterrogatingPoliticalSystemsIntegrativeProcessesandStatesinPre-ModernIndiaChapter9Pp255-73NewDelhiManohar
[140] FergusonCharesAandMunierChowdhury(1960)lsquoPhonesofBengalirsquoLanguageVol36No1pp22-59
[141] ShahidullahMuhammad(2007)BuddhistMysticSongsDhakaMowlaBrothers
[142] RayPunyaSloka(1966)BengaliLanguageHandbookWashington
[143] HaiMuhammadAbdul(1960)AphoneticandphonologicalstudyofnasalsandnasalizationinBengaliDhakaUniversityofDhaka
[144] UnicodeConsortiumProposalSummaryFormtoAccompanySubmissionsforAdditionstotheRepertoireofISOIEC10646UNICODEhttpswwwunicodeorgL2L200202387r-syloti-formpdfaccessedonMay212018
[145] WikipediaOlChiki(Unicodeblock)httpsenwikipediaorgwikiOl_Chiki_(Unicode_block)accessedonMay212018
[146]BanglaScripthttpwwwbangladesh2000combdbangla_scripthtmlaccessedonMay212018
[147] BhattacharyaAshutoshed(1942)GopichandrerGanCalcuttaCalcuttaUniversity
[149] DasSisirKumar(1975)SahibsandMunshisAnAccountoftheCollegeofFortWilliamCalcutta
[150] IslamRafiqulPabitraSarkarMahbubulHaqampRajibChakraborty(eds)(2014)BanglaAcademyPromitoBanglaByabaharikByakaran(AFunctionalGrammarofStandardBangla)DhakaBanglaAcademy
[151] SarkarPabitra[2013]lsquoBanglaSpellingReformtheLongandShortofItrsquoBanglaJournal19215-232
[152] BanglaAcademy(2012)BanglaAcademyPromitoBanglaBananerNiyam(StandardBanglaSpellingasadoptedbyBanglaAcademy)DhakaBanglaAcademy
[153] SarkarPabitraampRajibChakraborty2018ldquoWhathashappenedSoFarIntermsof Script Reformsrdquo Paper presented at the Face to Face meeting jointly held by theBanglaAcademyDhakaampICANNatBanglaAcademyDhakaon10072018
[154] The Unicode Consortium 2018 The Unicodereg Standard Version 110 ndash CoreSpecificationChapter12P473
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when applied to a specific language/script, certain restriction rules apply. In other words, in a given language, some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla¹³ in particular, the following rules apply:
1 Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of the five-fold 'varga' (as defined under Table 5). Moreover, Bangla does not normally allow ya (য়) at the beginning of a word either, but we can cite a couple of native examples, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem 'Lichuchor' written by Kazi Nazrul Islam. There are, however, instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2 C H can come with Khaṇḍa Ta only in the case where C is ra (র) (09B0):
  র্ৎ as in ভর্ৎসনা
3 Only the following combinations with V H C M will be allowed:
  → অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
  → এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
10.2 'Sylheti Nāgarī lipi' or 'Siloti'
This version of the Bangla script resembles the 'Kaithī' script (ISO 15924) used by the accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, widely in use during the 1880s. There were several other names of Sylheti Nāgarī or Siloti (129), such as 'Jalalabada Nāgarī', 'Fula (flower) Nāgarī', 'Muslim Nāgarī' or 'Muhammad Nāgarī'. It is said that Shah Jalal had brought the script with him in the 13th-14th century to Sylhet (138), although some have suggested that it was an invention by the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script, with 32 symbols, are reproduced here (138).
Table 17: The Script Table of Sylheti Nāgarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable but not enough to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla        Devanāgarī     NBGP Decision
ঃ U+0983      ः U+0903       Confusable
ও U+0993      उ U+0909       Confusable
ঘ U+0998      घ U+0918       Confusable
ঁ U+0981      ॅ U+0945       Confusable
Table 18: Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhī
Bangla        Gurmukhī       NBGP Decision
ঘ U+0998      ਬ U+0A2C       Confusable
ঁ U+0981      ੱ U+0A71       Confusable
Table 19: Bangla and Gurmukhī confusable code points
Bangla                     Gurmukhī       NBGP Decision
ও U+0993                   ਤ U+0A24       Distinguishable
শ U+09B6                   ਅ U+0A05       Distinguishable
ম U+09AE                   ਮ U+0A2E       Distinguishable
বা U+09AC and U+09BE       ਗ U+0A17       Distinguishable
Table 20: Bangla and Gurmukhī distinguishable code points
10.3.3 Bangla and Oriya (Odia)
Bangla        Oriya (Odia)   NBGP Decision
ও U+0993      ଓ U+0B13       Confusable
Table 21: Bangla and Oriya confusable code points
Bangla        Oriya (Odia)   NBGP Decision
ঘ U+0998      ସ U+0B38       Distinguishable
Table 22: Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs
Consonants    Phonetic Value    Allographs: Clusters    Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ Thisletterdoesnothaveanyparticularphoneticvaluebutmostlypronouncedasn
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒThesecondarysymbol(allograph)jɔ-phalahastwophoneticvaluesWhenaddedtotheinitialconsonantinaworditisavowelaelig(asinশ8ামলর 8াপারetc)Butafteranon-initialconsonantitjustdoublesitinpronunciation(asinকায6ধায6etc)The+যcombinationhastwophysicalmanifestationsmdashর 8andয6
ক8(p+য)স8(+য)র 8(+য)[Justর 8isneverusedinBanglaorthographyর 8াisbutthenitslasttwosymbolsYa-phalaa-karaconstituteavowelsignrepresentingthevowelঅ8া]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ hwordfinallywordmediallyitdoublesthepronunciationofthefollowingconsonant
অঃকঃ
অব
ক্র kr (ক্ k + র r), গ্ধ gdh (গ্ g + ধ dh), ঙ্ক ṅk (ঙ্ ṅ + ক k), ঙ্গ ṅg (ঙ্ ṅ + গ g), ষ্ণ ṣṇ (ষ্ ṣ + ণ ṇ), ন্ধ্র ndhr (ন্ n + ধ্ dh + র r), ণ্ড্র ṇḍr (ণ্ ṇ + ড্ ḍ + র r), ক্ত্র ktr (ক্ k + ত্ t + র r)
3.3.1 The Consonants
As per traditional classification, Bangla consonants are categorized according to their phonetic properties, especially in terms of place and manner of articulation [107]. There are five 'Varga' (pronounced as 'Barga' in Bangla) or groups (sets or classes), distinguished by place of articulation, and one non-'varga' group [105]. Each Varga, which corresponds to stops at a certain place of articulation, contains a series of five consonants classified as per their phonetic qualities (i.e. manner of articulation), beginning from Unvoiced and Unaspirated to Voiced and Aspirated (in the fourth column), and finally ending with a homorganic or corresponding nasal [107]. Consider the following table:
'Varga' or Sets    Unvoiced -Asp     Unvoiced +Asp      Voiced -Asp       Voiced +Asp        Nasal
Velar              ক 'K' U+0995      খ 'KH' U+0996      গ 'G' U+0997      ঘ 'GH' U+0998      ঙ 'NG' U+0999
Palatal            চ 'C' U+099A      ছ 'CH' U+099B      জ 'J' U+099C      ঝ 'JH' U+099D      ঞ 'NY' U+099E
Retroflex          ট 'TT' U+099F     ঠ 'TTH' U+09A0     ড 'DD' U+09A1     ঢ 'DDH' U+09A2     ণ 'NN' U+09A3
Dental             ত 'T' U+09A4      থ 'TH' U+09A5      দ 'D' U+09A6      ধ 'DH' U+09A7      ন 'N' U+09A8
Bilabial           প 'P' U+09AA      ফ 'PH' U+09AB      ব 'B' U+09AC      ভ 'BH' U+09AD      ম 'M' U+09AE
Table 5: Varga classification of Bangla consonants
(Falling into a pattern of five sets of Unvoiced Unaspirated, Unvoiced Aspirated, Voiced Unaspirated, Voiced Aspirated and Nasals, called the five 'Varga')
Non-Varga          য 'Y' U+09AF      য় 'YY' U+09DF      র 'R' U+09B0      ড় 'RR' U+09DC      ঢ় 'RH' U+09DD
                   ল 'L' U+09B2      শ 'SH' U+09B6      ষ 'SS' U+09B7     স 'S' U+09B8       হ 'H' U+09B9
Table 6: Non-Varga consonants (not falling into any of the five categories)
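The regular layout of Table 5 is mirrored in the code chart: each Varga occupies five consecutive code points. The sketch below is only an illustration of that observation; the names VARGA_START, varga_row and classify are hypothetical.

```python
# First code point of each Varga row in Table 5.
VARGA_START = {
    "Velar": 0x0995,
    "Palatal": 0x099A,
    "Retroflex": 0x099F,
    "Dental": 0x09A4,
    "Bilabial": 0x09AA,  # U+09A9 is skipped between Dental and Bilabial
}
COLUMNS = ("Unvoiced -Asp", "Unvoiced +Asp", "Voiced -Asp",
           "Voiced +Asp", "Nasal")


def varga_row(name: str) -> list:
    """Return the five consonants of one Varga in table order."""
    start = VARGA_START[name]
    return [chr(start + offset) for offset in range(5)]


def classify(consonant: str):
    """Return (varga, column) for a Varga consonant, else None."""
    for name in VARGA_START:
        row = varga_row(name)
        if consonant in row:
            return name, COLUMNS[row.index(consonant)]
    return None  # non-Varga consonant or not a consonant at all
```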
3.3.2 The Implicit Vowel Killer: Hasanta (called 'Halant' or 'Halanta' in other Brāhmī-based scripts)
As stated earlier, all consonants are pronounced in isolation with an implicit vowel (central back -ɔ in Bangla, as the neutral vowel) assumed to be associated with them [121]. The term 'Hasanta' (= 'Halant' or 'Halanta' in other Brāhmī-based scripts) and the term 'Virāma'⁷ (= 'Da ri' in Bangla), as preferred in Unicode (cf. Unicode 3.0 and above), are both used in this report to denote the character that marks the absence of this inherent vowel. It may be noted that the term virāma has been adopted in Unicode in a sense that is different from the traditional definition in grammar, and hence it requires some explanation here. Considering the importance of the document, this note should be a part of this LGR document, so that anybody referring to it is able to know the proper grammatical explanation of the term.
Because a special sign is needed whenever this implicit vowel is stripped off, the symbol is known as the Hasanta (= Halant) (U+09CD). By placing the Hasanta under the first consonant of a combination or cluster, one could, in common parlance, "kill" its vowel and create conjuncts. In this manner, conjunct characters can generally be written by joining two to four consonants. In rare cases, this process can join up to five consonants. However, the notion of a maximum number of consonants joining to form one akṣara⁸ is to be bounded empirically. This is an observation based on the CIIL-EMILLE corpora of Bangla words [132 & 133] as seen in print these days. Given the mixture of scripts and languages happening on the web, the possibility that one may want a generic Top Level Domain [gTLD] which has more than the observed maximum cannot be ruled out. This can be the case when a foreign-language word which admits a large number of consonants is transliterated into Bangla. Hence, in the Bangla LGR work this limit will not be enforced.
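The way Hasanta joins consonants into a conjunct can be sketched as follows. This is an illustration only: conjunct and cluster_sizes are hypothetical helper names, and treating every code point from U+0995 to U+09B9 (plus U+09F0 and U+09F1) as a consonant is a simplifying assumption.

```python
HASANTA = "\u09CD"  # ্ BENGALI SIGN VIRAMA


def is_consonant(ch: str) -> bool:
    # Simplified test; the range also spans a few unassigned code points.
    return "\u0995" <= ch <= "\u09B9" or ch in "\u09F0\u09F1"


def conjunct(*consonants: str) -> str:
    """Join consonants with Hasanta, 'killing' all but the last vowel."""
    return HASANTA.join(consonants)


def cluster_sizes(word: str) -> list:
    """Count how many consonants are joined in each cluster of a word."""
    sizes, joined = [], False
    for ch in word:
        if is_consonant(ch):
            if joined and sizes:
                sizes[-1] += 1   # Hasanta linked this consonant to the last
            else:
                sizes.append(1)  # start of a new cluster
            joined = False
        else:
            joined = ch == HASANTA
    return sizes


ktr = conjunct("\u0995", "\u09A4", "\u09B0")  # ক + ত + র
```

Because no upper limit is enforced in the LGR, a check such as max(cluster_sizes(label)) would serve only for observation, not for validation.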
3.3.3 Vowels
Separate symbols exist for all 'Swara' or vowels in Bangla which are pronounced independently, either at the beginning of the word or after another vowel or consonant sound. To indicate a vowel sound other than the implicit one, a vowel sign called 'kār' in Bangla, or Mātrā in Nāgarī⁹, is attached to the consonant. Since the consonant has this built-in neutral vowel at the end, there are equivalent kāras (Mātrās) for all vowels except অ (pronounced -ɔ). The correlation is shown as follows:
7 Virāma as used here is also a misnomer according to the Indian grammatical traditions. Nowhere is the mere absence of a vowel marked as virāma. Hasanta just marks the absence of a vowel, nothing else (Abhyankar, Kashinath Vasudev & J. M. Shukla. 1961. A Dictionary of Sanskrit Grammar. Baroda: Oriental Institute).
8 This term needs to be disambiguated. Akṣara also means 'syllable' in Indian grammatical traditions.
Vowel                                                       Corresponding vowel sign (kāras (Mātrās))
অ 'A' U+0985
আ 'AA' U+0986                                               া U+09BE
ই 'I' U+0987                                                ি U+09BF
ঈ 'II' U+0988                                               ী U+09C0
উ 'U' U+0989                                                ু U+09C1
ঊ 'UU' U+098A                                               ূ U+09C2
ঋ Vocalic 'R' U+098B                                        ৃ U+09C3
ৠ Vocalic 'RR' U+09E0                                       ৄ U+09C4
ঌ Vocalic 'L' U+098C                                        ৢ U+09E2
ৡ Vocalic 'LL' U+09E1                                       ৣ U+09E3
এ 'E' U+098F                                                ে U+09C7
ঐ 'AI' U+0990                                               ৈ U+09C8
ও 'O' U+0993                                                ো U+09CB
ঔ 'AU' U+0994                                               ৌ U+09CC
-                                                           ৗ U+09D7
Could appear on top of অ 'A' U+0985 or any other vowel      ঁ U+0981 Candrabindu
Could appear after অ 'A' U+0985 or any other vowel          ং U+0982 Anusvāra
Could appear after অ 'A' U+0985 or any other vowel          ঃ U+0983 Visarga
After any consonant                                         ্ U+09CD (Hasanta)
-                                                           ঽ U+09BD Avagraha
Table 7: Bangla vowels with corresponding kārs
9 Although the term 'Mātrā' in Bangla stands for an altogether different concept, viz. the top bar placed over a letter, typically available in Hindi and Bangla but missing in Gujarati.
3.3.4 The Anusvāra/onuʃʃār (ং - U+0982)
The Anusvāra or onuʃʃār in Bangla at times represents a homorganic nasal, but not always. It replaces a conjunct group of 'Nasal Consonant + Hasanta + Consonant' where the second consonant belongs to the velar varga or set, as in লংকা. But it often also appears for such combinations involving non-velars as the last member of the combination, as in ল্যাংটা "naked" or ল্যাংচা "a kind of sweet; to limp". Before a non-varga consonant, the Anusvāra represents a nasal sound that may have an alternative conjoined writing symbol representing the corresponding nasal consonant of the particular set. Although Modern Hindi, Marathi and Konkani prefer the anusvāra to the corresponding half-nasal, in Bangla it is clearly demarcated as to where one must use the Anusvāra and where it has to be a conjunct cluster with a nasal as the first or the second component.
3.3.5 Nasalization: Candrabindu (ঁ - U+0981)
Candrabindu denotes nasalization of the preceding vowel, as in চাঁদ cãd 'moon' (U+099A U+09BE U+0981 U+09A6). This sign, with a dot inside the half-moon mark, is used as a nasalization marker in many Brāhmī-based scripts [143].
3.3.6 Nukta (় - U+09BC)
The nukta sign does not exist in Bangla orthography. It is predominantly used in many Brāhmī-derived scripts such as Devanāgarī (for Hindi, Bodo, Maithili, Santali, Kashmiri and Sindhi). The term and the concept of nukta are borrowed in Bangla.
The IDNA Protocol (RFC 5891) states that IDNs must be in Unicode Normalization Form C (NFC). RFC 7940 applies this requirement to LGRs. The definition of NFC in the Unicode Standard contains a number of composition exclusions. As a result, the Bangla letters য় YYA, ড় RRA and ঢ় RHA have to be represented in this LGR by using the sequences (YA + Nukta: U+09AF + U+09BC), (DDA + Nukta: U+09A1 + U+09BC) and (DDHA + Nukta: U+09A2 + U+09BC) instead of the single code points YYA (U+09DF), RRA (U+09DC) and RHA (U+09DD), although the use of 'Nukta' is otherwise completely unnatural in Bangla.
It is noted that in the current Unicode Standard chart these characters are listed as additional consonants. As per the LGR Procedure, however, these decisions depend on the IDNA Protocol through a set of procedures developed by the IETF. Even though the Unicode Standard also prescribes methods to produce these three characters as atomic characters (for example, 09DC for ড় [r], 09DD for ঢ় [rh] and 09DF for য় [y], each as a single keystroke), the IDNA protocol requires that we treat them as conjunct characters and then allocate codes for these in the Unicode Bengali Block.
It may be noted that there could be sporadic attempts or cases of writing Muslim names, Urdu poetic words and Perso-Arabic loanwords with a nukta under ক (k), খ (kh), গ (g), জ (j) and ফ (ph), only for the sake of correct pronunciation and for maintaining the sanctity of the loanword. These were also like using the Bangla writing system to work like the IPA script. This is, however, not in use in Bangla writing or printing.
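The effect of the composition exclusions can be observed with any conformant normalizer. The following is a minimal sketch using Python's standard unicodedata module; the mapping name ATOMIC_TO_SEQUENCE and the function to_lgr_form are illustrative only.

```python
import unicodedata

# Atomic code point -> sequence required in the LGR (base letter + Nukta).
ATOMIC_TO_SEQUENCE = {
    "\u09DC": "\u09A1\u09BC",  # ড় RRA = DDA + Nukta
    "\u09DD": "\u09A2\u09BC",  # ঢ় RHA = DDHA + Nukta
    "\u09DF": "\u09AF\u09BC",  # য় YYA = YA + Nukta
}


def to_lgr_form(label: str) -> str:
    """Normalize a label to NFC, the form IDNA requires."""
    return unicodedata.normalize("NFC", label)


# Each atomic letter decomposes under NFC and is never recomposed,
# because it is listed among the composition exclusions.
all_excluded = all(
    to_lgr_form(atomic) == sequence and to_lgr_form(sequence) == sequence
    for atomic, sequence in ATOMIC_TO_SEQUENCE.items()
)
```

A label typed with the single-keystroke letters therefore reaches the registry as the Nukta sequences, which is why the repertoire lists the sequences and not the atomic code points.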
3.3.7 Visarga / biʃɔrgo (ঃ - U+0983) and Avagraha (ঽ - U+09BD)
The Visarga / biʃɔrgo U+0983 is frequently used in Bangla loanwords borrowed from Sanskrit and represents a sound very close to h. One could quote as an example দুঃখ duhkho "sorrow", "unhappiness" (U+09A6 U+09C1 U+0983 U+0996).
The Avagraha ঽ (U+09BD) is mainly used in Sanskrit, Pali, Prakrit or Maithili texts written in Bangla. It is gradually being replaced by an upper comma (e.g. নেরাঽপরািণ re-written as নেরা'পরািণ). It is rarely used now, even in other languages using Bangla script. In the case of the LGR, the Avagraha is not part of the repertoire. It has been decided, therefore, not to retain Avagraha (ঽ) (U+09BD) because it is blocked in TLDs as per the Maximal Starting Repertoire (MSR). Please see Appendix II in section 11 for a complete list of Bangla consonants and their allographs.
3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)
This note is pertinent to the use of Zero Width Joiner (ZWJ) and Zero Width Non-Joiner (ZWNJ) as used in Bangla. It needs to be noted that Nepali, Konkani and Hindi use these two signs in a different manner.
ZWJ (U+200D) and ZWNJ (U+200C) are code points that have been provided by the Unicode Standard to instruct the rendering of a string where the script has the option between joining and non-joining characters. Without the use of these control codes, the string may be rendered in an alternate form from what is intended.
Use of ZWJ
• Insofar as Bangla is concerned, ZWJ is used for the proper rendering of characters such as khaṇḍa-ta ৎ, as in সত্যজিৎ (satyajit) "Satyajit" and সৎ (sat) "honest". This is typed as follows: ta + Hasanta + ZWJ (U+200D).
• However, ZWJ is more important where the same combination of consonantal characters is represented differently depending upon the context. E.g. র + ্ + য has two representations in Bangla: as র্য and as র‍্য. To get the form র্য one has to type in the following manner: র + ্ + য, but for র‍্য the sequence would be র + ZWJ + ্ + য [154]. In other words, ZWJ is used in the rendering of words demanding ya-phalā after ra, which is otherwise not possible to type (render) due to the same order of ra + hasanta + antastha ja in the medial and/or final position. Interestingly, ra + hasanta + antastha ja is used to type repha on the consonant antastha ja, as in কার্য (kaarjo). In order to get a ya-phalā after the consonant ra, it is therefore obligatory to use ZWJ after ra, as in র‍্যাপার (wrapper), র‍্যাশ (rash), র‍্যালি (rally), etc. The typing sequence is given below:
ra (র) + ZWJ + hasanta (্) + antastha ja (য) = র‍্য
Use of ZWNJ
• ZWNJ is used in Bangla to represent the explicit Hasanta or Halant. In order to avoid conjunct formation in cases where there is an explicit hasanta before the succeeding consonant, the ZWNJ is used.
Consonant + hasanta + ZWNJ + consonant = explicit hasanta. Example: প্রাক্‌কথন (prakkathana / prakkɔtʰon)
The use of ZWJ and ZWNJ has been ruled out from the root zone by the [Procedure]. Used in Bangla to create alternate renderings, the insertion of these two signs can affect searching as well as NLP.
The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Hasanta) where default conjunct formation is to be explicitly restricted and the Hasanta joining the two consonants participating in the conjunct formation needs to be explicitly shown.
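Because both joiner controls are invisible and are ruled out of the root zone, a label can be screened for them before any other check. The sketch below is only an illustration; the function names are hypothetical and are not taken from the LGR or from any LGR tool.

```python
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"   # ZERO WIDTH JOINER


def find_joiner_controls(label: str) -> list[tuple[int, str]]:
    """Return (position, name) for every ZWJ or ZWNJ found in `label`."""
    found = []
    for index, ch in enumerate(label):
        if ch == ZWJ:
            found.append((index, "ZWJ"))
        elif ch == ZWNJ:
            found.append((index, "ZWNJ"))
    return found


def is_free_of_joiner_controls(label: str) -> bool:
    """True when the label contains neither ZWJ nor ZWNJ."""
    return not find_joiner_controls(label)


RA, HASANTA, YA = "\u09B0", "\u09CD", "\u09AF"
repha_form = RA + HASANTA + YA           # ra + hasanta + ya (repha on ya)
ya_phala_form = RA + ZWJ + HASANTA + YA  # ra + ZWJ + hasanta + ya (ya-phalā after ra)

print(is_free_of_joiner_controls(repha_form))
print(find_joiner_controls(ya_phala_form))
```

The two forms differ only by the invisible ZWJ, so the second one would be rejected by such a screen even though it renders as a legitimate Bangla conjunct.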
3.3.9 Use of Ya-phalaa
Ya-phalaa sequences are two instances in Bangla where Hasanta is preceded by a full vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E).
• অ্যা 0985 09CD 09AF 09BE: BENGALI LETTER A + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
• এ্যা 098F 09CD 09AF 09BE: BENGALI LETTER E + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD Hasanta plus U+09AF ya preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English 'acid' অ্যাসিড, 'association' অ্যাসোসিয়েশন, 'bat' ব্যাট, 'fat' ফ্যাট, 'mat' ম্যাট, 'cap' ক্যাপ, etc.
The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as 'vowel killer', although it actually indicates absence of a vowel after the marked consonant. Only the consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in the orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
3.3.10 Formation of Ra-phalaa and Ref Sequences
This case refers to the formation of repha and ra-phalā as follows:
Ra-Hasanta = (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name BENGALI LETTER RA WITH MIDDLE DIAGONAL) and H is 09CD (্ - BENGALI SIGN VIRAMA).
Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র chakra "cycle"). The point is that, in both cases, the slot for ra could be filled by Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā in both cases remain the same.
The LGR makes a note of this point of concern with respect to the two RAs in disguise, as it would be completely impossible to distinguish between them with the naked eye in a label so generated, which may consequently lead to concerns related to spoofing and other kinds of cyber irregularities. The motive for classing these two code points as (blocking) variants is that fully rendered labels may mask the distinction between Bangla ra র (U+09B0) and the Assamese ra ৰ (U+09F0). That provides the justification for Variant Set 4, though only in the context of a following Hasant. The difference between the RAs is only distinguishable if one looks into their Unicode values. Therefore labels such as অর্ক arka, শীর্ষ śīrṣa 'top, apex', অভ্র abhra 'cloud, the sky' and শ্রম śrama 'physical labour' could be extremely dangerous, as the web user may never verify the digital content (the labels) against its Unicode code points. This point is made explicitly with reference to Table 9 (of sequences, p. 36) and Table 16 (of WLE Symbols, p. 47) that are to follow.
Moreover, it is noteworthy that the REPHA can also occur with KHANDA TA. The conditions in this context of KHANDA TA are such that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
4 Overall Development Process and Methodology
The Neo-Brāhmī Generation Panel (NBGP) has been formed by members having experience in Linguistics (especially in NLP and Computational Linguistics), Literature, Language History and Epigraphy. Under the Neo-Brāhmī Generation Panel, Bangla and eight other scripts belonging to separate Unicode blocks are being taken up to assign a separate LGR for each. However, an attempt is made to ensure that the fundamental philosophy behind building those LGRs is consistent with all other Brāhmī-derived scripts. The present LGR will cater to multiple languages belonging to EGIDS scale 1 to 4 (see Table 4) that use Bangla script. The following guiding principles are used in making decisions about Bangla LGR code points.
4.1 Guiding Principles
The NBGP adopts the following broad principles for selection of code points in the code-point repertoire across the board for all the Neo-Brāhmī scripts within its ambit.
4.1.1 Inclusion Principles
4.1.1.1 Modern Usage
Every character proposed should be in the everyday usage of a particular linguistic community. The characters which have been encoded in Unicode for transcription purposes only or for archival purposes will not be considered for inclusion in the code-point repertoire.
4.1.1.2 Unambiguous Use
Every character proposed should have unambiguous understanding among linguists about its usage in the language.
4.2 Exclusion Principles
The main exclusion principle is that of External Limits on Scope. These consist of protocols or standards which are prerequisites to the Label Generation Rule-sets. All further principles are in fact subsumed under these limitations but have been spelt out separately for the sake of clarity.
4.2.1 External Limits of Scope
The code point repertoire for the root zone being a very special case at the top of protocol hierarchies, the canvas of available characters for selection as a part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:
i. The Unicode Chart
Out of all the characters that are needed by the script in question, if a particular character is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, and especially so in the Bangla-Asamiyā-Manipuri Writing System, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.
ii. IDNA Protocol
Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded as far as possible all the possible characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters of the Unicode repertoire from being part of domain names.
iii. Maximal Starting Repertoire (MSR)
The Root-zone LGR being the repertoire of characters which are going to be used for creation of the Root-zone TLDs, which in turn constitute an even more specialized case of domain names, the ROOT LGR procedure introduces additional exclusions on IDNA's allowed set of characters.
Example: Bangla Sign Avagraha ঽ (U+09BD), even if allowed by the IDNA protocol, is not permitted in the Root Zone Repertoire as per the MSR.
To sum up, the restrictions start off with admitting only such characters as are part of the code-block of the given script/language. The IDNA Protocol further narrows this down, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
4.2.1.1 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brāhmī-based scripts will not be included.
4.2.1.2 No Symbols and Abbreviations
Abbreviations, weights and measures and other such iconic characters like BANGLA ISSHAR (U+09FA), BANGLA CURRENCY DENOMINATOR SIXTEEN ৹ (U+09F9), etc. will also not be included.
4.2.1.3 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms such as Sanskritic VOCALIC RR ৠ (U+09E0) and VOCALIC L ঌ (U+098C), as well as VOCALIC LL ৡ (U+09E1) and the allographic -kāra forms of the latter two symbols, VOWEL SIGN VOCALIC L (U+09E2) and VOWEL SIGN VOCALIC LL (U+09E3). All such characters are excluded, which complies with the Conservatism principle as laid down in the Root Zone LGR procedure. However, in Bangla the -kāra corresponding to VOCALIC RR ৠ (U+09E0), which is VOWEL SIGN VOCALIC RR (U+09C4), is still in active use in certain limited borrowed or Sanskritic words and is therefore retained.
4.2.1.4 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit will not be included. This is also in consonance with the Letter principle as laid down in the Root Zone LGR procedure.
4.2.1.5 ABNF
The Augmented Backus-Naur Formalism (ABNF) is described in Section 5.4.1 and the Appendix (Section 10.1).
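The three successive filters can be pictured as a small pipeline. The sketch below is a rough illustration only: the real IDNA 2008 PVALID set and the real MSR are published tables, whereas here the IDNA step is approximated by the Unicode general category (letters and marks only) and the MSR step by a one-element, assumed exclusion set taken from the Avagraha example above. The output is therefore a list of candidates, not the LGR repertoire.

```python
import unicodedata

BENGALI_BLOCK = range(0x0980, 0x0A00)

# Assumption for illustration: only the exclusion named in the text is listed.
ASSUMED_MSR_EXCLUSIONS = {0x09BD}  # BENGALI SIGN AVAGRAHA


def encoded_in_unicode(cp: int) -> bool:
    """Filter i: the code point must be assigned in the Bengali block."""
    return cp in BENGALI_BLOCK and unicodedata.name(chr(cp), "") != ""


def passes_idna_proxy(cp: int) -> bool:
    """Filter ii (crude proxy, not the real PVALID table): letters and marks."""
    return unicodedata.category(chr(cp))[0] in ("L", "M")


def passes_msr_proxy(cp: int) -> bool:
    """Filter iii (assumed subset of the MSR exclusions)."""
    return cp not in ASSUMED_MSR_EXCLUSIONS


def candidate_code_points() -> list[int]:
    """Apply the three filters in order and return the surviving code points."""
    return [
        cp
        for cp in BENGALI_BLOCK
        if encoded_in_unicode(cp) and passes_idna_proxy(cp) and passes_msr_proxy(cp)
    ]


survivors = candidate_code_points()
print([f"U+{cp:04X}" for cp in survivors[:5]])
```

Digits, currency signs and the Isshar symbol drop out at the second step, and the Avagraha at the third; the remaining exclusions described in 4.2.1.3 (rare and obsolete characters) are editorial decisions that no such mechanical filter captures.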
5 Repertoire
The Bangla Writing System is represented in UNICODE using the Bengali (Bangla) script name as enumerated in ISO 15924, corresponding to languages such as Asamiyā (Assamese), Bangla (Bengali) and Manipuri. The BENGALI block used for Bangla-Asamiyā-Manipuri in UNICODE has 93 entries. This section details the code-point repertoire that the Neo-Brāhmī Generation Panel [NBGP] proposes to be included in the Bangla LGR.
It may be mentioned here that the Government of Assam submitted a proposal to the Bureau of Indian Standards (BIS) on 26th February 2016 for dis-unification of the Bangla and Asamiyā scripts. The BIS, in its 8th Meeting of the Indian Language Technologies and Products Sectional Committee, LITD 20, held on 23rd Aug 2017, decided to refer the proposal for recognition of Assamese script in ISO/IEC 10646 to ISO. Until the UNICODE Consortium takes any further action, it will be assumed that the Code Point Repertoire under Table 8 will be valid for all the three languages as above.
For each of the code points, language references have been given in the last column, titled Reference, under Table 8, titled "Code Point Repertoire". For entire coverage of Bangla code points, references for Bangla, Asamiyā (Assamese), Manipuri (Meitei) and Bishnupriya are given. Kokborok written in Bangla script is not known to have introduced many new complications, except for one particular character. Though only a few representative languages under EGIDS Scale 1-4 have been chosen for referencing, they together cover all the code-points required for all the languages that NBGP has considered, as given under Bangla Unicode Points (as given in UNICODE 6.3).
However, before the details are presented, it is ideal to look at the Bangla Code Point Chart from Maximal Starting Repertoire [MSR] Version 3. It may be noted that the shapes of the reference glyphs given below in the code charts are based on one of the many fonts designed and are not prescriptive, because there could be some variations in actual fonts, both UNICODE-compatible and True-Type ones. Consider the following code point table:
Figure 1 Bangla Code Page from [MSR] for Bangla-Asamiyā-Manipuri
Colour convention:
All characters that are included in the [MSR]: Yellow background
PVALID in IDNA 2008 but excluded from the [MSR]: Pinkish background
Not PVALID in IDNA 2008, or ineligible for the root zone (digits, hyphen): White background
Given the Bangla Unicode Block as in Figure 1, for the code points that are included in the MSR, the following symbols will need a separate treatment:
ৎ U+09CE Bangla Letter Khanda-Ta
ৰ U+09F0 Asamiyā-Bangla Letter Ra With Middle Diagonal
ৱ U+09F1 Asamiyā-Bangla Letter Ra With Lower Diagonal
5.1 Code Point Repertoire Inclusion
No | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
1 | U+0981 | ঁ | BENGALI SIGN CANDRABINDU | Candra-bindu | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
2 | U+0982 | ং | BENGALI SIGN ANUSVARA | Onushshar (Anusvara) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
3 | U+0983 | ঃ | BENGALI SIGN VISARGA | Biśarga (Visarga) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
4 | U+0985 | অ | BENGALI LETTER A | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
5 | U+0986 | আ | BENGALI LETTER AA | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
6 | U+0987 | ই | BENGALI LETTER I | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
7 | U+0988 | ঈ | BENGALI LETTER II | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
8 | U+0989 | উ | BENGALI LETTER U | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
9 | U+098A | ঊ | BENGALI LETTER UU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
10 | U+098B | ঋ | BENGALI LETTER VOCALIC R | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
11 | U+098F | এ | BENGALI LETTER E | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
12 | U+0990 | ঐ | BENGALI LETTER AI | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
13 | U+0993 | ও | BENGALI LETTER O | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
14 | U+0994 | ঔ | BENGALI LETTER AU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
15 | U+0995 | ক | BENGALI LETTER KA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
16 | U+0996 | খ | BENGALI LETTER KHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
17 | U+0997 | গ | BENGALI LETTER GA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
18 | U+0998 | ঘ | BENGALI LETTER GHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
19 | U+0999 | ঙ | BENGALI LETTER NGA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
20 | U+099A | চ | BENGALI LETTER CA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
21 | U+099B | ছ | BENGALI LETTER CHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
22 | U+099C | জ | BENGALI LETTER JA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
23 | U+099D | ঝ | BENGALI LETTER JHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
24 | U+099E | ঞ | BENGALI LETTER NYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
25 | U+099F | ট | BENGALI LETTER TTA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
26 | U+09A0 | ঠ | BENGALI LETTER TTHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
27 | U+09A1 | ড | BENGALI LETTER DDA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
28 | 09A1 09BC (U+09DC) | ড় | Normalized form of BENGALI LETTER RRA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125] 09DC is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
29 | U+09A2 | ঢ | BENGALI LETTER DDHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
30 | 09A2 09BC (U+09DD) | ঢ় | Normalized form of BENGALI LETTER RHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125] 09DD is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
31 | U+09A3 | ণ | BENGALI LETTER NNA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
32 | U+09A4 | ত | BENGALI LETTER TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
33 | U+09A5 | থ | BENGALI LETTER THA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
34 | U+09A6 | দ | BENGALI LETTER DA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
35 | U+09A7 | ধ | BENGALI LETTER DHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
36 | U+09A8 | ন | BENGALI LETTER NA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
37 | U+09AA | প | BENGALI LETTER PA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
38 | U+09AB | ফ | BENGALI LETTER PHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
39 | U+09AC | ব | BENGALI LETTER BA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
43 | 09AF 09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125] 09DF is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112] [125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
50 | U+09BE | া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
51 | U+09BF | ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
52 | U+09C0 | ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
53 | U+09C1 | ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112] [122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (= Halant) Virama (= Da ri) | 1 Bangla, 2 Assamese, 2 Manipuri | [112] [122] [125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122] [125]
Table 8 Bangla Code-Point Repertoire
Apart from the above individual code-points, the Neo-Brāhmī Generation Panel also proposes some specific sequences which enable conditional inclusion of the Bangla LETTER A and E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, again followed by Bangla VOWEL SIGN AA, in the repertoire for enabling inclusion of the æ sound as in English 'bat', 'cat', etc.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code-point (not an exhaustive list) | Reference
S1 | 0985 09CD 09AF 09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112] [122]
S2 | 098F 09CD 09AF 09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9 Sequences
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find a place in Unicode but have not been included in the repertoire in the LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.
Sr. No. | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10 Excluded Code Points
5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire since it will never be used alone. It will be used as part of a sequence in three special characters in normalized form, for ড়, ঢ় and য়.
Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড়, ঢ় and য় respectively
Table 10b Excluded Code Points
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20.11.2009 and 18.07.2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) begins with the following variables:
C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta
The Augmented Backus-Naur Formalism (ABNF) will use the following operators:
Sr. Number | Operator | Function
1 | "|" | Alternative
2 | "[ ]" | Optional
3 | "*" | Variable Repetition
4 | "( )" | Sequence Group
Table 11 The ABNF Formalism
5.4.2 The Vowel Sequence
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given. To facilitate understanding for users of other Brāhmī scripts, equivalents in Devanāgarī are provided wherever necessary.
A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra / onuʃʃār (B), a Candrabindu (D) or a Visarga / biʃɔrgo (X). The number of D, B or X signs which can follow a V in Bangla may not be restricted to one. Going by the rules illustrated in the document, it is clear that formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have formations or sequences such as a chandrabindu followed by an anusvāra on the one hand and a chandrabindu followed by a visarga on the other, as in হ্যাঁংচা 'hænchā' and হ্যাঁঃ 'hæn' respectively. The possibility of a Visarga or Anusvāra (onuʃʃār) following a Candrabindu exists in Bangla.
A Vowel can optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phala. "Ya-phala is a presentation form of U+09AF, Bangla letter য or 'ya'. Represented by the sequence <U+09CD ্ BENGALI SIGN VIRAMA (Bangla SIGN Hasanta or VIRAMA), U+09AF য BENGALI LETTER YA>, Ya-phala has a special form ্য. Again, when combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. 'aa' (ā)), it is used for transcribing [æ] as in the "a" in the English word "bat", written in Bangla as ব্যাট."
A Vowel-sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example: V অ अ
5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B] or Candrabindu [D] or Visarga [X] or Candrabindu + Anusvāra [DB] or Candrabindu + Visarga [DX] or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples:
VB অং अं
VD অঁ अँ
VX অঃ अः
VDB অঁং अँं
VDX অঁঃ अँः
VHCM অ্যা, এ্যা
5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B] or Candrabindu [D] or Visarga [X] or Candrabindu + Anusvāra [DB] or Candrabindu + Visarga [DX].
Examples:
VHCMB অ্যাং, এ্যাং
VHCMD অ্যাঁ, এ্যাঁ
VHCMX অ্যাঃ, এ্যাঃ
VHCMDB অ্যাঁং, এ্যাঁং
VHCMDX অ্যাঁঃ, এ্যাঁঃ
5.4.3 The Consonant Sequence
5.4.3.1 A Single Consonant (C)
Example: C ক क
5.4.3.2 A Consonant with Conditions
A Consonant is optionally followed by a dependent vowel sign kāra (Mātrā) [M] or Anusvāra [B] or Candrabindu [D] or Visarga [X] or Hasanta (also known as Virama) [H] or Candrabindu + Anusvāra [DB] or Candrabindu + Visarga [DX].
Example
CB কং क
CD কঁ कँ
CX কঃ कः
CH ক্ क् (Pure consonant)
CDB কঁং कँं
CDX কঁঃ कँः
5.4.3.3 CM Sequence
A CM sequence can be optionally followed by B, D, X, DB or DX.
Example CMB কীংকং कक
CMD কাঁ काँ
CMX বীঃ वीः
CMDB কাঁং काँं
CMDX কাঁঃ काँः
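5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Example
CHC ন্ত → ন + ্ + ত; न्त → न + ् + त
CHCHC ন্ত্র → ন + ্ + ত + ্ + র; न्त्र → न + ् + त + ् + र
CHCHCHC ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য; न्त्र्य → न + ् + त + ् + र + ् + य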
5.4.3.5 Subsets
While considering its subsets, as a representative example we will consider the combination CHC only; however, the same is equally applicable to CHCHC and CHCHCHC.
[A] The combination may be followed by M, B, D, X, DB or DX.
Example
CHCM ক্কী → ক ্ ক ী; क्की → क ् क ी
CHCB ক্কং → ক ্ ক ং; क्कं → क ् क ं
CHCD ক্কঁ → ক ্ ক ঁ; क्कँ → क ् क ँ
CHCX ক্কঃ → ক ্ ক ঃ; क्कः → क ् क ः
CHCDB ক্কঁং → ক ্ ক ঁ ং; क्कँं → क ् क ँ ं
CHCDX ক্কঁঃ → ক ্ ক ঁ ঃ; क्कँः → क ् क ँ ः
[B] *3(CH)CM may further be followed by B, D, X, DB or DX.
Example
CHCMB ক্কীং → ক ্ ক ী ং; क्कीं → क ् क ी ं
sup3ং rarr ক ক ং 4क rarr क क
CHCMD ক্কাঁ → ক ্ ক া ঁ; क्काँ → क ् क ा ँ
CHCMX ক্কীঃ → ক ্ ক ী ঃ; क्कीः → क ् क ी ः
CHCMDB ক্কাঁং → ক ্ ক া ঁ ং; क्काँं → क ् क ा ँ ं
CHCMDX ক্কাঁঃ → ক ্ ক া ঁ ঃ; क्काँः → क ् क ा ँ ः
5.4.4 The Khanda-Ta sequence
5.4.4.1 A single 'Khanda'-Ta (Z)
Example: Z ৎ
5.4.4.2 A Khanda Ta Combination (see footnote 10)
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) 'scolding'.
Note: The conditions in this context of KHANDA TA are that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
10 Refer to Rule P in Section 7, Table 16
5.4.5 Special Cases S and P
Two special cases involving Sequences (referred to as S and P in Table 16 under Section 7) could be described briefly here. Let us take up S in the first instance. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ - BANGLA LETTER A and U+098F এ - BANGLA LETTER E). For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF ya preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English 'bat', 'fat', etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as 'vowel killer', although it actually indicates absence of a vowel after the marked consonant. Only the consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in the orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
Another case refers to the formation of repha and ra-phalā in the said script, mentioned in the table above as P. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra "cycle"). The point is that, in both cases, the slot for ra could be filled by Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā in both cases remain the same.
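The sequence grammar of Sections 5.4.2 to 5.4.4 can be sketched as regular expressions over the class symbols (V, C, M, B, D, X, H, Z) rather than over actual code points. This is one possible reading of the prose, not the normative ABNF of the Appendix; in particular, it assumes that a trailing H is allowed only after a single consonant, that at most three "CH" pairs precede the last consonant, and it does not check that the C before Z is a RA. The pattern names are illustrative.

```python
import re

# Optional trailing signs: B, D, X, or D followed by B or X.
TRAILER = r"(?:DB|DX|B|D|X)?"

# 5.4.2: a vowel, optionally the Ya-phala tail HCM, then an optional trailer.
VOWEL_SEQUENCE = re.compile(rf"V(?:HCM)?{TRAILER}")

# 5.4.3: either a pure consonant "CH", or *3(CH)C with optional M and trailer.
CONSONANT_SEQUENCE = re.compile(rf"(?:CH|(?:CH){{0,3}}CM?{TRAILER})")

# 5.4.4: a Khanda Ta alone, or preceded by consonant + Hasanta.
KHANDA_TA_SEQUENCE = re.compile(r"(?:CH)?Z")


def matches_some_sequence(symbols: str) -> bool:
    """True if the whole symbol string matches one of the three sequence types."""
    patterns = (VOWEL_SEQUENCE, CONSONANT_SEQUENCE, KHANDA_TA_SEQUENCE)
    return any(pattern.fullmatch(symbols) for pattern in patterns)


for symbols in ("VDB", "VHCMDX", "CHCHCHC", "CMDB", "CHZ", "VDD", "CHCHCHCHC"):
    print(symbols, matches_some_sequence(symbols))
```

Working on class symbols keeps the patterns readable and mirrors the examples in the text (VB, CHCM, CHCMDX and so on); a real implementation would first map each code point of a label to its class.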
6 Variants
This section talks about the variants in the Bangla script. The NBGP categorizes these confusing variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases.
However, cases which belong to Group 2 are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formations. These can cause confusion even to a careful observer and hence are being proposed as variants.
Variants are generated in a script when two or more forms are formed with different storage or code points. In Bangla, the e-kāra, ā-kāra and the o-kāra have different code points. One can type o with a consonant in one go, or type e-kāra and ā-kāra as two separate keys, getting the same result. A reader cannot differentiate between the two forms of ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered as a case of variant, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants
However, we propose two cases of true in-script variants in Bangla script.
CASE I
As far as true variants in Bangla are concerned, we may draw our attention to cases wherein Hasanta with (U+09A5) থ (tha) appears as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, could be a little confusing where the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final positions (as shown below in example no. 1). It could be typed wrongly as well, thinking it was a হ (ha) U+09B9, increasing the chances of risks in label writing and identification.
Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)
The fonts which represent the traditional Bangla writing system could tend to create this problem. Therefore these may be taken as cases of variants in Bangla.
CASE II
Another interesting example of a variant is encountered in ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra). 'Repha' could be formed by two sequences (mainly because both Assamese and Bangla find a place in the same UNICODE block, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here the final ligatures look the same and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C
Where
B_RA = U+09B0 BENGALI LETTER RA (র), or
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)
Example
Sequence 1 (Using Bangla RA) | Ligature 1 | Sequence 2 (Using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12 Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. Addition of Repha does not make any difference.
'Ra-phala' could be formed by two sequences on similar grounds, and the final ligatures would look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
Where C1 = any consonant except Khanda-ta
Example
Sequence 1 (Using Bangla RA) | Ligature 1 | Sequence 2 (Using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13 Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusion for end-users. Hence the repha and ra-phala cases need to be defined as variants. NBGP concluded to define র and ৰ as variant code points, so that a single variant set between র and ৰ covers all cases. But this will create blocked variant labels: e.g. if someone registers "ররর", the variant label "ৰৰৰ" will be generated as a variant and will be blocked, and vice versa. However, it is only blocked at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, it is still possible.
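The label-level blocking described above can be sketched as follows. This is an illustration under the assumption, usual for code-point variants, that every position holding র or ৰ may vary independently, so mixed labels are generated as well as the all-Assamese one; the function names are hypothetical.

```python
from itertools import product

BANGLA_RA = "\u09B0"    # র
ASSAMESE_RA = "\u09F0"  # ৰ
VARIANT_SET = (BANGLA_RA, ASSAMESE_RA)


def variant_labels(label: str) -> set[str]:
    """All labels obtained by swapping র and ৰ at any positions, minus the original."""
    choices = [VARIANT_SET if ch in VARIANT_SET else (ch,) for ch in label]
    return {"".join(combo) for combo in product(*choices)} - {label}


def index_label(label: str) -> str:
    """Canonical form used for comparison: every ৰ is mapped to র."""
    return label.replace(ASSAMESE_RA, BANGLA_RA)


def collide(first: str, second: str) -> bool:
    """Two labels block each other when their index labels are identical."""
    return index_label(first) == index_label(second)


registered = BANGLA_RA * 3
print(len(variant_labels(registered)))
print(collide(registered, ASSAMESE_RA * 3))
print(collide(registered, ASSAMESE_RA * 2))
```

Comparing index labels avoids enumerating every variant: labels of different lengths never share an index label, which matches the observation that ৰৰ and ৰৰৰৰ remain available after ররর is registered.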
6.2 Cross-Script Variants
A crisp cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia (formerly Oriya; see footnote 11), keeping in mind the visual and technical confusions they may cause as labels on the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain characters which are somewhat similar, they have not been included here. They have been provided in the Appendix (10.2) for reference.
11 Unicode uses Oriya for the script although Odia is now the official term used
1 Bangla and Nāgarī/Devanāgarī Script
Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14 - Bangla and Devanāgarī cross-script variant code points
2 Bangla and Gurmukhī Script
Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15 - Bangla and Gurmukhī cross-script variant code points
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in section 3.2 when written in Bangla Script (see footnote 12). The rules have been drafted in such a way that they can be easily translated into the LGR specifications.
Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories, as mentioned in the table provided in the code point repertoire (Section 5.1).
C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), or (a/e) Ya-phalā (V1 H C1 M1), where V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E); H is 09CD (্ - BENGALI SIGN VIRAMA); C1 is 09AF (য - BENGALI LETTER YA); M1 is 09BE (া - BENGALI VOWEL SIGN AA). S1 and S2 are valid even if they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name BENGALI LETTER RA WITH MIDDLE DIAGONAL); H is 09CD (্ - BENGALI SIGN VIRAMA).
Table 16 - Symbols used in WLE rules
12 As used by Unicode, denoting and including both Assamese and Maṇipuri
It is also perhaps ideal to mention here that in Bangla the consonant letters (or graphemes) are physically joined to form “clusters” that could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are “nearly 380 unique consonant clusters”, out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, p. 4]. More details of such combinations can be seen in Pabitra Sarkar (1993) [135].

7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়.
2. H must be preceded by C. Example: ক্
3. M must be preceded by C. Example: কা
4. D must be preceded by either of V, C or M. Example: আঁ, খঁ, খাঁ, হাঁ
5. X must be preceded by either of V, C, M or D. Example: উঃ, খঃ, বঃ
6. B must be preceded by either of V, C, M or D. Example: আং, ইং, কং
7. Z must be preceded by V, C, M, D, B, X or P. Example: ইৎ, কৎ (S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V preceded by H.
9. S CANNOT be preceded by H.
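The rules above can be read as a table of permitted predecessors for each symbol. The following is a minimal sketch of such a check, not the normative LGR: the category sets are illustrative assumptions (the authoritative repertoire is the one in Section 5.1), only the two Ya-phalā sequences are treated as S, and P is recognised only directly before Khanda Ta. All names (`CATEGORY`, `tokenize`, `is_valid_label`) are hypothetical.

```python
def chars(*ranges):
    """Expand inclusive (first, last) code point ranges into a set of characters."""
    return {chr(cp) for first, last in ranges for cp in range(first, last + 1)}


# Assumed, simplified category table (see Section 5.1 for the real repertoire).
CATEGORY = {}
for symbol, members in {
    "V": chars((0x0985, 0x098B), (0x098F, 0x0990), (0x0993, 0x0994)),
    "C": chars((0x0995, 0x09A8), (0x09AA, 0x09B0), (0x09B2, 0x09B2),
               (0x09B6, 0x09B9), (0x09F0, 0x09F1)),
    "M": chars((0x09BE, 0x09C3), (0x09C7, 0x09C8), (0x09CB, 0x09CC)),
    "B": {"\u0982"}, "D": {"\u0981"}, "X": {"\u0983"},
    "H": {"\u09CD"}, "Z": {"\u09CE"},
}.items():
    CATEGORY.update(dict.fromkeys(members, symbol))

HASANTA, NUKTA, KHANDA_TA = "\u09CD", "\u09BC", "\u09CE"
RA = "\u09B0\u09F0"                  # Bangla RA and Assamese RA
NUKTA_BASES = "\u09A1\u09A2\u09AF"   # bases of the normalized forms (rule 1)
YA_PHALA = ("\u0985\u09CD\u09AF\u09BE", "\u098F\u09CD\u09AF\u09BE")  # S sequences

# Rules 2-7: each symbol must be preceded by one of the listed symbols.
MUST_FOLLOW = {
    "H": set("C"), "M": set("C"), "D": set("VCM"),
    "X": set("VCMD"), "B": set("VCMD"), "Z": set("VCMDBXP"),
}


def tokenize(label):
    """Turn a label into WLE symbols; raise ValueError on unknown code points."""
    symbols, i = [], 0
    while i < len(label):
        if label.startswith(YA_PHALA, i):                      # S = V1 H C1 M1
            symbols.append("S"); i += 4
        elif label[i] in RA and label[i + 1:i + 3] == HASANTA + KHANDA_TA:
            symbols.append("P"); i += 2                        # P = C2 H before Z
        elif label[i] in NUKTA_BASES and label[i + 1:i + 2] == NUKTA:
            symbols.append("C"); i += 2                        # CN counts as C
        else:
            symbol = CATEGORY.get(label[i])
            if symbol is None:
                raise ValueError(f"unexpected code point U+{ord(label[i]):04X}")
            symbols.append(symbol); i += 1
    return symbols


def is_valid_label(label):
    """Check rules 1-9 on the symbol sequence of a label."""
    try:
        symbols = tokenize(label)
    except ValueError:
        return False
    previous = None
    for symbol in symbols:
        if symbol in MUST_FOLLOW and previous not in MUST_FOLLOW[symbol]:
            return False                                       # rules 2-7
        if symbol in ("V", "S") and previous == "H":
            return False                                       # rules 8 and 9
        previous = "M" if symbol == "S" else symbol            # S ends with M
    return bool(symbols)


print(is_valid_label("\u0995\u09BE"))        # কা: C M
print(is_valid_label("\u0995\u09CD\u0985"))  # C H V: V after H
```

Treating S as M for the purpose of what may follow it mirrors the remark under rule 7; a production implementation would instead be expressed in the LGR XML format.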
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in adding such ligatures, because any user can easily create them even though they may not be attested combinations.

7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া bæŋk ʌv ɪndiə (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE) (meaning Bank of India). This is the case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.

This is a unique situation necessitated by the lack of hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptual similarity between two labels (with and without H) for the majority of the linguistic communities; hence this is explicitly prohibited by the NBGP.
In future, if required, depending on the prevailing requirements from the community, the future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules the ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.
1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Example (Bangla, Devanāgarī):
্ক    ्क
াক    ाक
ংক    ंक
ঁক    ँक
ঃক    ःक
As can be seen, such a combination will automatically result in a “golu” or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.

2. H is not permitted after V, B, D, X, M, S.
Example (Bangla, Devanāgarī):
অ্     अ्
অং্    कं्
কঁ্    कँ्
কঃ্    कः्
কি্    कि्

3. The number of B, D or X permitted after a Consonant or Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalidated.
Example (Bangla, Devanāgarī):
কংং     कंं
কঁঁ     कँँ
কঃঃ     कःः
কাঁঁ    काँँ
কীঃঃ    कीःः
অংং     अंं
অঁঁ     अँँ
অঃঃ     अःः

4. The number of M permitted after a Consonant is restricted to one.
Example (Bangla, Devanāgarī):
কীী    कीी

5. M is not permitted after V.
Example (Bangla, Devanāgarī):
ইা, ঈৗ    ईा, ईौ

6. The combinations of Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Example (Bangla, Devanāgarī):
কংঃ    कंः
কঃং    कःं
8 Contributors

8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members

8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals’ Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] BandyopadhyayChittaranjan1981DuiShatakerBanglaMudranoPrakashanKolkataAnandaPublishers
[102] BanerjiRD1919TheOriginoftheBengaliScriptKolkataNewDelhiAsianEducationalServices2003reprint
[103] ChatterjiSK1926TheOriginandDevelopmentoftheBengaliLanguageCalcuttaCalcuttaUniversityPress
[104] -----1939Bhasha-prakashBangalaVyakaran(AGrammaroftheBengaliLanguage)CalcuttaUniversityofCalcutta
[105] HaiMuhammadAbdul1964DhvaniVijnanOBanglaDhvani-tattwa(PhoneticsandBengaliPhonology)DhakaBanglaAcademy
[106]JhaSubhadra1958TheFormationofMaithiliLondonLuzacampCo
[107] KosticDjordjeDasRheaS1972AShortOutlineofBengaliPhoneticsCalcuttaStatisticalPublishingCompany
[108] MajumdarRC1971HistoryofAncientBengalCalcuttaGBhardwaj
[109] MazumdarBijaychandra19202000TheHistoryoftheBengaliLanguage(ReprCalcutta1920ed)NewDelhiAsianEducationalServices
[110] PandeyAnshuman2001ProposaltoEncodetheTirhutaScriptinISOIEC10646
[111] PalPalashBaran2001DhwanimalaBarnamalaKolkataPapyrus
[112] -----2007lsquoBanglaHarapherPanchParbarsquoInSwapanChakrabortyedMudranerSanskritiOBanglaBoiKolkataAbabhas
[113] RossFiona1999ThePrintedBengaliCharacteranditsEvolutionLondonCurzon
[114] ShastriMahamahopadhyayHaraPrasad1916HājārBacharērPurāṇaBāṅgālāBhāṣāyBauddhaGānōDōhāCalcuttaBangiyaSahityaParishat
[115] SinghUdayaNarayana(JointlyManiruzzaman)1983DiglossiainBangladeshandlanguageplanningCalcuttaGyanBharati214pp
[116] -----1987ABibliographyofBengaliLinguisticsMysoreCIILxii+316pp
[117] -----2017(withRajibChakrabortyBidishaBhattacharjeeampArimardanKumarTripathy)LanguagesandCulturesontheMarginGuidelinesforFieldworkonEndangeredLanguagesMimeoCentreforEndangeredLanguagesVisva-Bharati
[118] -----1980ScriptalchoiceandspellingreformAnessayinlanguageandplanningJournaloftheMSUniversityofBarodaSocialScienceNumber292173-186AmodifiedversionreprintedEAnnamalaiBjornJernuddandJoanRubinedsLanguagePlanningProceedingsofanInstituteMysoreCIIL405-417
[119] Sripantha1996JakhanChapakhanaEloKolkataPaschim-BangaBanglaAcademy
[120] SurAtul1986BanglaMudranerDushoBacharKolkataJijnasa
[121] ScriptBehaviourforBengaliVersion11TDILandC-DACPune
[122] BoraMahendra1981TheEvolutionofAssameseScriptJorhatAssamSahityaSabha
[123] ProposaltoEncodetheTirhutaScriptinISOIEC10646httpwwwunicodeorgL2L201111175r-tirhutapdfaccessedon25112017
[124] EthnologueAssameseintheLanguageCloudhttpswwwethnologuecomcloudasmaccessedon25112017
[125] BengalialphabetforManipurifoundinEthnologueManipuri(MeeteilonMeithei)httpswwwomniglotcomwritingmanipurihtmaccessedon20102019
[126] WikipediaBengalialphabethttpsenwikipediaorgwikiBengali_alphabetaccessedon25112017
[129]OmniglotSlyhetihttpwwwomniglotcomwritingsylotihtmaccessedon1052018
[130]WikipediaBishnupriyaManipurilanguagehttpsenwikipediaorgwikiBishnupriya_Manipuri_languageaccessedon25112017
[131] TheEMILLECIILCorpushttpmetashareeldaorgrepositorybrowsethe-emilleciil-corpusabdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20accessedon1052018
[132]TheEMILLECIILCorpushttpcatalogelrainfoproduct_infophpproducts_id=696accessedon1052018
[133] BanglaLanguageampScript httpswwwisicalacin~rc_banglabanglahtmlaccessedon1052018
[134] SarkarPabitra1992BanglaBananSanskarSamasyaoSambhabanaKolkataChirayataPrakashan
[135] SarkarPabitra1993BanglaBhasharYuktabyanjanBhasha1123-45
[136] DashNiladriShekharandBBChaudhuri1998BanglaScriptAStructuralStudyLinguisticsToday121-28Alsoavailableathttpswwwacademiaedu9967428Bangla_Script_A_Structural_Study
[137] DaniAhmedHasan(1957)lsquoSrīhatta-NāgarīLipirUtpattioBikāśrsquoBanglaAcademyPatrika(Dhaka)Vol12(Bhadra-Agrahayan1364BangabdaNumber)pg1
[138] WikipediaSylhetiNagari httpsenwikipediaorgwikiSylheti_Nagariaccessedon1952018
[139] FuruiRyosuke(2015)lsquoVariegatedAdaptationsStateFormationinBengalfromtheFifthtoSeventhCenturyrsquoinBhairabiPrasadSahuampHermannKulkeedsInterrogatingPoliticalSystemsIntegrativeProcessesandStatesinPre-ModernIndiaChapter9Pp255-73NewDelhiManohar
[140] FergusonCharesAandMunierChowdhury(1960)lsquoPhonesofBengalirsquoLanguageVol36No1pp22-59
[141] ShahidullahMuhammad(2007)BuddhistMysticSongsDhakaMowlaBrothers
[142] RayPunyaSloka(1966)BengaliLanguageHandbookWashington
[143] HaiMuhammadAbdul(1960)AphoneticandphonologicalstudyofnasalsandnasalizationinBengaliDhakaUniversityofDhaka
[144] UnicodeConsortiumProposalSummaryFormtoAccompanySubmissionsforAdditionstotheRepertoireofISOIEC10646UNICODEhttpswwwunicodeorgL2L200202387r-syloti-formpdfaccessedonMay212018
[145] WikipediaOlChiki(Unicodeblock)httpsenwikipediaorgwikiOl_Chiki_(Unicode_block)accessedonMay212018
[146]BanglaScripthttpwwwbangladesh2000combdbangla_scripthtmlaccessedonMay212018
[147] BhattacharyaAshutoshed(1942)GopichandrerGanCalcuttaCalcuttaUniversity
[149] DasSisirKumar(1975)SahibsandMunshisAnAccountoftheCollegeofFortWilliamCalcutta
[150] IslamRafiqulPabitraSarkarMahbubulHaqampRajibChakraborty(eds)(2014)BanglaAcademyPromitoBanglaByabaharikByakaran(AFunctionalGrammarofStandardBangla)DhakaBanglaAcademy
[151] SarkarPabitra[2013]lsquoBanglaSpellingReformtheLongandShortofItrsquoBanglaJournal19215-232
[152] BanglaAcademy(2012)BanglaAcademyPromitoBanglaBananerNiyam(StandardBanglaSpellingasadoptedbyBanglaAcademy)DhakaBanglaAcademy
[153] SarkarPabitraampRajibChakraborty2018ldquoWhathashappenedSoFarIntermsof Script Reformsrdquo Paper presented at the Face to Face meeting jointly held by theBanglaAcademyDhakaampICANNatBanglaAcademyDhakaon10072018
[154] The Unicode Consortium 2018 The Unicodereg Standard Version 110 ndash CoreSpecificationChapter12P473
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when applied to a specific language or script, certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla¹³ in particular, the following rules apply:

1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold ‘varga’ (as defined under Table 5). Moreover, Bangla does not allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem ‘Lichu Chor’ written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).

2. C H can come with Khanda Ta only in the case where C is ra (র) (09B0):
   র্ৎ, as in ভর্ৎসনা

3. Only the following combinations with V H C M will be allowed:
   → অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
   → এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.

10.2 ‘Sylheti Nagarī lipi’ or ‘Siloti’
This version of Bangla script resembles the ‘Kaithī’ script (ISO 15924) used by accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, and widely in use during the 1880s. There were several other names of Sylheti Nagarī or Siloti [129], such as ‘Jalalabada Nagarī’, ‘Fula (flower) Nagarī’, ‘Muslim Nagarī’ or ‘Muhammad Nagarī’. It is said that Shah Jalal had brought the script with him in the 13th-14th century to Sylhet [138], although some have suggested that it was an invention of the Afghan rulers of Sylhet [137]. Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here [138].
Table 17: The Script Table of Sylheti Nagarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.

10.3.1 Bangla and Nāgarī or Devanāgarī

Bangla       Devanāgarī    NBGP Decision
ঃ U+0983     ः U+0903      Confusable
ও U+0993     उ U+0909      Confusable
ঘ U+0998     घ U+0918      Confusable
ঁ U+0981     ॅ U+0945      Confusable

Table 18: Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhi

Bangla       Gurmukhi      NBGP Decision
ঘ U+0998     ਬ U+0A2C      Confusable
ঁ U+0981     ੱ U+0A71      Confusable

Table 19: Bangla and Gurmukhi confusable code points

Bangla                  Gurmukhi      NBGP Decision
ও U+0993                ਤ U+0A24      Distinguishable
শ U+09B6                ਅ U+0A05      Distinguishable
ম U+09AE                ਮ U+0A2E      Distinguishable
বা U+09AC and U+09BE    ਗ U+0A17      Distinguishable

Table 20: Bangla and Gurmukhī distinguishable code points

10.3.3 Bangla and Oriya (Odia)

Bangla       Oriya (Odia)  NBGP Decision
ও U+0993     ଓ U+0B13      Confusable

Table 21: Bangla and Oriya confusable code points

Bangla       Oriya (Odia)  NBGP Decision
ঘ U+0998     ସ U+0B38      Distinguishable

Table 22: Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs

Consonants   Phonetic Value   Allographs: Clusters / Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ Thisletterdoesnothaveanyparticularphoneticvaluebutmostlypronouncedasn
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒThesecondarysymbol(allograph)jɔ-phalahastwophoneticvaluesWhenaddedtotheinitialconsonantinaworditisavowelaelig(asinশ8ামলর 8াপারetc)Butafteranon-initialconsonantitjustdoublesitinpronunciation(asinকায6ধায6etc)The+যcombinationhastwophysicalmanifestationsmdashর 8andয6
ক8(p+য)স8(+য)র 8(+য)[Justর 8isneverusedinBanglaorthographyর 8াisbutthenitslasttwosymbolsYa-phalaa-karaconstituteavowelsignrepresentingthevowelঅ8া]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ hwordfinallywordmediallyitdoublesthepronunciationofthefollowingconsonant
অঃকঃ
অব
Varga ল ‘L’ U+09B2
শ ‘SH’ U+09B6
ষ ‘SS’ U+09B7
স ‘S’ U+09B8
হ ‘H’ U+09B9
Table 6: Non-Varga consonants (not falling into any of the five categories)
3.3.2 The Implicit Vowel Killer: Hasanta (called ‘Halant’ or ‘Halanta’ in other Brāhmī-based scripts)
As stated earlier, all consonants are pronounced in isolation with an implicit vowel (central back -ɔ in Bangla as the neutral vowel) assumed to be associated with them [121]. The ‘Hasanta’ (= ‘Halant’ or ‘Halanta’ in other Brāhmī-based scripts) or the term ‘Virāma’⁷ (= ‘Da ri’ in Bangla), as preferred in UNICODE (cf. Unicode 3.0 and above), have been used in this report as terms that denote the character that marks the absence of this inherent vowel. It may be noted that the term virama has been adopted in UNICODE in a sense that is different from the traditional definition of grammar, and hence it requires some explanation here. Considering the importance of the document, this note should be a part of this LGR document, so that anybody referring to it is able to know the proper grammatical explanation of the term. Because a special sign is needed whenever this implicit vowel is stripped off, the symbol is known as the Hasanta (= Halant) (U+09CD). By placing the Hasanta under the first consonant of a combination or cluster one could, in common parlance, “kill” its vowel and create conjuncts. In this manner, conjunct characters can generally be written by joining two to four consonant combinations. In rare cases, this process can join up to five consonants. However, the notion of a maximum number of consonants joining to form one aksara⁸ is to be bounded empirically. This is an observation based on the CIIL-Emille Corpora of Bangla words [132 & 133] as seen in print these days. Given the mixture of scripts and languages happening on the web, the possibility that one may want a generic Top Level Domain [gTLD] which has more than the observed maximum cannot be ruled out. This can be the case when a foreign language word which admits a large number of consonants is transliterated into Bangla. Hence, in the Bangla LGR work this limit will not be enforced.
3.3.3 Vowels
Separate symbols exist for all ‘Swara’ or vowels in Bangla, which are pronounced independently either at the beginning of the word or after another vowel or consonant sound. To indicate a vowel sound other than the implicit one, a vowel sign called ‘kār’ in Bangla, or Mātrā in Nāgarī⁹, is attached to the consonant. Since the consonant has this built-in neutral vowel at the end, there are equivalent kāras (Mātrās) for all vowels except অ (pronounced -ɔ). The correlation is shown as follows:

7 Virāma as used here is also a misnomer according to the Indian grammatical traditions. Nowhere is the mere absence of a vowel marked as virama. Hasanta just marks the absence of a vowel, nothing else (Abhyankar, Kashinath Vasudev & J. M. Shukla. 1961. A Dictionary of Sanskrit Grammar. Baroda: Oriental Institute).
8 This term needs to be disambiguated. Aksara also means ‘syllable’ in Indian grammatical traditions.
Vowel                                                      Corresponding vowel sign (kāras / Mātrās)
অ ‘A’ U+0985
আ ‘AA’ U+0986                                              া U+09BE
ই ‘I’ U+0987                                               ি U+09BF
ঈ ‘II’ U+0988                                              ী U+09C0
উ ‘U’ U+0989                                               ু U+09C1
ঊ ‘UU’ U+098A                                              ূ U+09C2
ঋ Vocalic ‘R’ U+098B                                       ৃ U+09C3
ৠ Vocalic ‘RR’ U+09E0                                      ৄ U+09C4
ঌ Vocalic ‘L’ U+098C                                       ৢ U+09E2
ৡ Vocalic ‘LL’ U+09E1                                      ৣ U+09E3
এ ‘E’ U+098F                                               ে U+09C7
ঐ ‘AI’ U+0990                                              ৈ U+09C8
ও ‘O’ U+0993                                               ো U+09CB
ঔ ‘AU’ U+0994                                              ৌ U+09CC
-                                                          ৗ U+09D7
Could appear on top of অ ‘A’ U+0985 or any other vowel     ঁ U+0981 Candrabindu
Could appear after অ ‘A’ U+0985 or any other vowel         ং U+0982 Anusvara
Could appear after অ ‘A’ U+0985 or any other vowel         ঃ U+0983 Visarga
After any consonant                                        ্ U+09CD (Hasanta)
-                                                          ঽ U+09BD Avagraha

Table 7: Bangla vowels with corresponding kārs

9 Although the term ‘Matra’ in Bangla stands for an altogether different concept, viz. the top bar placed over a letter, typically available in Hindi and Bangla but missing in Gujarati.
3.3.4 The Anusvāra / onuʃʃār (ং - U+0982)
The Anusvāra or onuʃʃār in Bangla at times represents a homorganic nasal, but not always. It replaces a conjunct group of ‘Nasal Consonant + Hasanta + Consonant’ where the second consonant belongs to the velar varga or set, as in লংকা. But it often also appears for such combinations involving non-velars as the last member of the combination, as in ল্যাংটা “naked” or ল্যাংচা “a kind of sweet; to limp”. Before a non-varga consonant, the Anusvāra represents a nasal sound that may have an alternative conjoined writing symbol representing the corresponding nasal consonant of the particular set. Although Modern Hindi, Marathi and Konkani prefer the anusvāra to the corresponding half-nasal, in Bangla it is clearly demarcated where one must use the Anusvāra and where it has to be a conjunct cluster with a nasal as the first or the second component.
3.3.5 Nasalization: Candrabindu (ঁ - U+0981)

Candrabindu denotes nasalization of the preceding vowel, as in চাঁদ cad ‘moon’ (U+099A U+09BE U+0981 U+09A6). This sign, with a dot inside the half-moon mark, is used as a nasalization marker in many Brāhmī-based scripts [143].
3.3.6 Nukta (় - U+09BC)
The nukta sign does not exist in Bangla orthography. It is predominantly used in many Brāhmī-derived scripts such as Devanāgarī (for Hindi, Bodo, Maithili, Santali, Kashmiri and Sindhi). The term and the concept of nukta are borrowed in Bangla. The IDNA Protocol (RFC 5891) states that IDNs must be in Unicode Normalization Form C (NFC); RFC 7940 applies this requirement to LGRs. The definition of NFC in the Unicode Standard contains a number of composition exclusions. As a result, the Bangla letters য় YYA, ড় RRA and ঢ় RRHA have to be represented in this LGR by the sequences (YA + Nukta, U+09AF + U+09BC), (DDA + Nukta, U+09A1 + U+09BC) and (DDHA + Nukta, U+09A2 + U+09BC) instead of the single code points YYA (U+09DF), RRA (U+09DC) and RRHA (U+09DD), although the use of ‘Nukta’ is otherwise completely unnatural in Bangla.
It is noted that in the current Unicode Standard chart these characters are listed as additional consonants. As per the LGR Procedure, however, these decisions depend on the IDNA Protocol through a set of procedures developed by the IETF. Even though the Unicode Standard also provides these three characters as atomic characters (for example 09DC for ড় [r], 09DD for ঢ় [rh] and 09DF for য় [y], each as a single keystroke), the IDNA protocol requires that we treat them as conjunct characters (base letter + Nukta) rather than use the single codes allocated for them in the Unicode Bengali block.
It may be noted that there could be sporadic attempts or cases of writing Muslim names, Urdu poetic words and Perso-Arabic loan words with a nukta under ক (k), খ (kh), গ (g), জ (j) and ফ (ph), only for the sake of correct pronunciation and for maintaining the sanctity of the loan word. These amount to using the Bangla writing system to work like the IPA script. The nukta is, however, not in use in Bangla writing or printing.
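The effect of the composition exclusions can be observed with Python's standard `unicodedata` module. The sketch below is only an illustration (the helper name `nfc_code_points` is hypothetical): it normalizes each precomposed letter to NFC and lists the resulting code points, which are the base letter followed by Nukta.

```python
import unicodedata


def nfc_code_points(text: str) -> list:
    """Return the code points of `text` after NFC normalization, as U+XXXX strings."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFC", text)]


# The three precomposed letters discussed above: U+09DC, U+09DD, U+09DF.
for single in ("\u09DC", "\u09DD", "\u09DF"):
    print(f"U+{ord(single):04X} ->", " ".join(nfc_code_points(single)))
```

Because these letters are composition exclusions, NFC leaves them decomposed, so a label can only contain the two-code-point sequences.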
3.3.7 Visarga / biʃɔrgo (ঃ - U+0983) and Avagraha (ঽ - U+09BD)

The Visarga / biʃɔrgo (U+0983) is frequently used in Bangla loanwords borrowed from Sanskrit and represents a sound very close to h. One could quote as an example দুঃখ duhkho “sorrow”, “unhappiness” (U+09A6 U+09C1 U+0983 U+0996).
The Avagraha ঽ (U+09BD) is mainly used in Sanskrit, Pali, Prakrit or Maithili texts written in Bangla. It is gradually being replaced by an upper comma (e.g. নরোঽপরাণি re-written as নরো’পরাণি). It is rarely used now, even in other languages using Bangla script. In the case of the LGR, the Avagraha is not part of the repertoire: it has been decided not to retain Avagraha (ঽ) (U+09BD) because it is blocked in TLDs as per the Maximal Starting Repertoire (MSR). Please see Appendix II in Section 11 for a complete list of Bangla consonants and their allographs.
3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)
This note is pertinent to the use of Zero Width Joiner (ZWJ) and Zero Width Non-Joiner (ZWNJ) as used in Bangla. It needs to be noted that Nepali, Konkani and Hindi use these two signs in a different manner.
ZWJ (U+200D) and ZWNJ (U+200C) are code points that have been provided by the Unicode Standard to instruct the rendering of a string where the script has the option between joining and non-joining characters. Without the use of these control codes, the string may be rendered in an alternate form from what is intended.

Use of ZWJ
• Insofar as Bangla is concerned, ZWJ is used for the proper rendering of characters such as khaṇḍa-ta ৎ, as in সত্যজিৎ (satyajit) “Satyajit” and সৎ (sat) “honest”. This is typed as follows: ta + Hasanta + ZWJ (U+200D).
• However, ZWJ is more important where the same combination of consonantal characters is represented differently depending upon the context. E.g. র + ্ + য has two representations in Bangla: as র্য (repha over ya) and as র‍্য (ra followed by ya-phalā). To get the form র্য one has to type র + ্ + য, but for র‍্য the sequence would be র + ZWJ + ্ + য [154]. In other words, ZWJ is used in the rendering of words demanding ya-phalā after ra, which is otherwise not possible to type (render) because ra + hasanta + antastha ja has the same order in the medial and/or final position. Interestingly, ra + hasanta + antastha ja is used to type repha on the consonant antastha ja, as in কার্য (kaarjo). In order to get a ya-phalā after the consonant ra, it is therefore obligatory to use ZWJ after ra, as in র‍্যাপার (wrapper), র‍্যাশ (rash), র‍্যালি (rally) etc. The typing sequence is given below:
  ra (র) + ZWJ + hasanta (্) + antastha ja (য) = র‍্য

Use of ZWNJ
• ZWNJ is used in Bangla to represent the explicit Hasanta or Halant. In order to avoid conjunct formation in cases where there is an explicit hasanta before the succeeding consonant, the ZWNJ is used:
  Consonant + hasanta + ZWNJ + consonant = explicit hasanta
  Example: প্রাক্‌কথন (prakkathana / prakkɔtʰon)

The use of ZWJ/ZWNJ has been ruled out from the root zone by the [Procedure]. Because they are used in Bangla to create alternate renderings, the insertion of these two signs can affect searching as well as NLP.

The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Hasanta) where default conjunct formation is to be explicitly restricted and the Hasanta joining the two consonants participating in the conjunct formation needs to be explicitly shown.
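Since neither control character is permitted in the root zone, a candidate label can be screened for them before any other check. The sketch below is a minimal illustration with hypothetical helper names; it also shows that, once ZWJ is removed, the ra + ya-phalā spelling collapses into the same code point sequence as the repha form.

```python
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"   # ZERO WIDTH JOINER


def contains_join_controls(label: str) -> bool:
    """True if the label contains ZWJ or ZWNJ, which the root zone excludes."""
    return ZWJ in label or ZWNJ in label


def strip_join_controls(label: str) -> str:
    """Remove ZWJ/ZWNJ to show which plain sequence remains."""
    return label.replace(ZWJ, "").replace(ZWNJ, "")


ra_ya_phala = "\u09B0" + ZWJ + "\u09CD\u09AF"  # ra + ZWJ + hasanta + ya
repha_ya = "\u09B0\u09CD\u09AF"                # ra + hasanta + ya

print(contains_join_controls(ra_ya_phala))
print(strip_join_controls(ra_ya_phala) == repha_ya)
```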
3.3.9 Use of Ya-phalaa
Ya-phalaa sequences are two instances in Bangla where Hasanta is preceded by a full vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E):

• অ্যা  0985 09CD 09AF 09BE  BENGALI LETTER A + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
• এ্যা  098F 09CD 09AF 09BE  BENGALI LETTER E + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA

For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD Hasanta plus U+09AF ya, preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound, as in English acid অ্যাসিড, association অ্যাসোসিয়েশন, ‘bat’ ব্যাট, ‘fat’ ফ্যাট, ‘mat’ ম্যাট, ‘cap’ ক্যাপ, etc.
The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as ‘vowel killer’, although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But, as we see here, Bangla ends up with a deviant feature in the orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
3.3.10 Formation of Ra-phalaa and Reph Sequences
This case refers to the formation of repha and ra-phalā as follows:

Ra-Hasanta = (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL) and H is 09CD (্ - BENGALI SIGN VIRAMA).

Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka “the sun”); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র chakra “cycle”). The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or Assamese ra ৰ (U+09F0), followed/preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases. The LGR makes a note of this point of concern with respect to the two RAs in disguise, as it would be completely impossible to distinguish between them with the naked eye in a label so generated, which may consequently lead to concerns related to spoofing and other kinds of cyber irregularities. The motive for classing these two code points as (blocking) variants is that fully rendered labels may mask the distinction between Bangla ra র (U+09B0) and Assamese ra ৰ (U+09F0). That provides the justification for Variant Set 4, though only in the context of a following Hasanta. The difference between the RAs is distinguishable only if one looks into their Unicode values. Therefore, labels such as অর্ক arka, শীর্ষ sīrsa ‘top, apex’, অভ্র abhra ‘cloud, the sky’, শ্রম śrama ‘physical labour’ could be extremely dangerous, as the web user may never verify the digital content (the labels) against its Unicode code points. This point is made explicitly with reference to Table 9 (of sequences, p. 36) and Table 16 (of WLE symbols, p. 47) that are to follow. Moreover, it is noteworthy that the REPHA can also occur with KHANDA TA. The conditions in this context of KHANDA TA are such that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
4 Overall Development Process and Methodology
The Neo-Brāhmī Generation Panel (NBGP) has been formed by members having experience in Linguistics (especially in NLP/Computational Linguistics), Literature, Language History and Epigraphy. Under the Neo-Brāhmī Generation Panel, Bangla and eight other scripts belonging to separate Unicode blocks are being taken up to assign a separate LGR for each. However, an attempt is made to ensure that the fundamental philosophy behind building those LGRs is consistent with all other Brāhmī-derived scripts. The present LGR will cater to multiple languages belonging to EGIDS scale 1 to 4 (see Table 4) that use the Bangla script. The following guiding principles are used in making decisions about Bangla LGR code points.
4.1 Guiding Principles
The NBGP adopts the following broad principles for the selection of code points in the code-point repertoire, across the board, for all the Neo-Brāhmī scripts within its ambit.

4.1.1 Inclusion Principles

4.1.1.1 Modern Usage
Every character proposed should be in the everyday usage of a particular linguistic community. Characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code-point repertoire.

4.1.1.2 Unambiguous Use
Every character proposed should have an unambiguous understanding among linguists about its usage in the language.
4.2 Exclusion Principles
The main exclusion principle is that of External Limits on Scope. These consist of protocols or standards which are prerequisites to the Label Generation Rule-sets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.

4.2.1 External Limits of Scope
The code point repertoire for the root zone being a very special case at the top of the protocol hierarchies, the canvas of available characters for selection as a part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:

i. The Unicode Chart
Out of all the characters that are needed by the script in question, if a particular character is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, and especially so in the Bangla-Asamiyā-Manipuri Writing System, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.

ii. IDNA Protocol
Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded, as far as possible, all the possible characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters of the Unicode repertoire from being part of domain names.

iii. Maximal Starting Repertoire (MSR)
The Root-zone LGR being the repertoire of characters which are going to be used for the creation of Root-zone TLDs, which in turn constitute an even more specialized case of domain names, the Root LGR procedure introduces additional exclusions on the IDNA's allowed set of characters.

Example: Bangla Sign Avagraha ঽ (U+09BD), even if allowed by the IDNA protocol, is not permitted in the Root Zone Repertoire as per the MSR.

To sum up, the restrictions start off with admitting only such characters as are part of the code block of the given script/language. The IDNA Protocol further narrows this down, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.

4.2.1.1 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brāhmī-based scripts will not be included.

4.2.1.2 No Symbols and Abbreviations
Abbreviations, weights and measures, and other such iconic characters like BANGLA ISSHAR ৺ (U+09FA), BANGLA CURRENCY DENOMINATOR SIXTEEN ৹ (U+09F9) etc. will also not be included.

4.2.1.3 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, such as the Sanskritic VOCALIC RR ৠ (U+09E0) and VOCALIC L ঌ (U+098C), as well as VOCALIC LL ৡ (U+09E1), and the allographic -kāra forms of the latter two symbols, VOWEL SIGN VOCALIC L ৢ (U+09E2) and VOWEL SIGN VOCALIC LL ৣ (U+09E3). All such characters are excluded, which complies with the Conservatism principle as laid down in the Root Zone LGR procedure. However, in Bangla the -kāra corresponding to VOCALIC RR ৠ (U+09E0), which is VOWEL SIGN VOCALIC RR ৄ (U+09C4), is still in active use in certain limited borrowed or Sanskritic words and is therefore retained.

4.2.1.4 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit will not be included. This is also in consonance with the Letter principle as laid down in the Root Zone LGR procedure.

4.2.1.5 ABNF
The Augmented Backus-Naur Formalism (ABNF) is described in Section 5.4.1 and the Appendix (Section 10.1).
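The three successive filters of Section 4.2.1 (Unicode chart, IDNA protocol, MSR) can be pictured as a chain of membership tests. The sketch below is only an illustration under stated assumptions: `IDNA_PVALID_SAMPLE` and `MSR_SAMPLE` are tiny hand-picked stand-ins chosen to mirror the Avagraha example in the text, not the real IDNA2008 or MSR tables, and `root_zone_candidates` is a hypothetical helper name.

```python
import unicodedata

# The Bengali block of Unicode (U+0980..U+09FF).
BENGALI_BLOCK = range(0x0980, 0x0A00)

# ASSUMPTION: illustrative subsets only. The real lists come from the
# IDNA2008 derived-property tables and from the MSR itself.
IDNA_PVALID_SAMPLE = {0x0995, 0x09BD, 0x09CD}  # ক, ঽ (Avagraha), ্
MSR_SAMPLE = {0x0995, 0x09CD}                  # Avagraha is not in the MSR


def is_encoded(cp: int) -> bool:
    """Filter i: the code point must be an assigned Unicode character."""
    return unicodedata.name(chr(cp), "") != ""


def root_zone_candidates(block, pvalid, msr) -> list:
    """Apply the Unicode, IDNA and MSR filters in succession."""
    return [cp for cp in block if is_encoded(cp) and cp in pvalid and cp in msr]


candidates = root_zone_candidates(BENGALI_BLOCK, IDNA_PVALID_SAMPLE, MSR_SAMPLE)
```

With these sample sets, U+09BD passes the first two filters and is dropped by the third, which is the behaviour the Avagraha example describes.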
5 Repertoire
The Bangla Writing System is represented in Unicode using the Bengali (Bangla) script name as enumerated in ISO 15924, corresponding to languages such as Asamiyā (Assamese), Bangla (Bengali) and Manipuri. The BENGALI block used for Bangla-Asamiyā-Manipuri in Unicode has 93 entries. This section details the code-point repertoire that the Neo-Brāhmī Generation Panel [NBGP] proposes to be included in the Bangla LGR.

It may be mentioned here that the Government of Assam submitted a proposal to the Bureau of Indian Standards (BIS) on 26th February 2016 for dis-unification of the Bangla and Asamiyā scripts. The BIS, in its 8th Meeting of the Indian Language Technologies and Products Sectional Committee, LITD 20, held on 23rd Aug 2017, decided to refer the proposal for recognition of the Assamese script in ISO/IEC 10646 to ISO. Until the Unicode Consortium takes any further action, it will be assumed that the Code Point Repertoire under Table 8 will be valid for all the three languages as above.

For each of the code points, language references have been given in the last column, titled Reference, under Table 8, titled the “Code Point Repertoire”. For entire coverage of Bangla code points, references of Bangla, Asamiyā (Assamese), Manipuri (Meitei) and Bishnupriya are given. Kokborok, written in the Bangla script, is not known to have introduced many new complications, except for one particular character. Though only a few representative languages under EGIDS Scale 1-4 have been chosen for referencing, they together cover all the code points required for all the languages that the NBGP has considered, as given under Bangla Unicode Points (as given in Unicode 6.3).

However, before the details are presented, it is ideal to look at the Bangla Code Point Chart from the Maximal Starting Repertoire [MSR] Version 3. It may be noted that the shapes of the reference glyphs given below in the code charts are based on one of the many fonts designed and are not prescriptive, because there could be some variations in actual fonts, both Unicode-compatible and TrueType ones. Consider the following code point table.
Figure 1: Bangla Code Page from [MSR] for Bangla-Asamiyā-Manipuri

Colour convention:
All characters that are included in the [MSR]: yellow background
PVALID in IDNA2008 but excluded from the [MSR]: pinkish background
Not PVALID in IDNA2008, or ineligible for the root zone (digits, hyphen): white background

Given the Bangla Unicode block as in Figure 1, among the code points that are included in the MSR the following symbols will need separate treatment:
ৎ U+09CE Bangla Letter Khanda-Ta
ৰ U+09F0 Asamiyā-Bangla Letter Ra With Middle Diagonal
ৱ U+09F1 Asamiyā-Bangla Letter Ra With Lower Diagonal
5.1 Code Point Repertoire Inclusion

No. | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
1 | U+0981 | ঁ | BENGALI SIGN CANDRABINDU | Candra-bindu | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
2 | U+0982 | ং | BENGALI SIGN ANUSVARA | Onushshar (Anusvara) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
3 | U+0983 | ঃ | BENGALI SIGN VISARGA | Biśarga (Visarga) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
4 | U+0985 | অ | BENGALI LETTER A | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
5 | U+0986 | আ | BENGALI LETTER AA | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
6 | U+0987 | ই | BENGALI LETTER I | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
7 | U+0988 | ঈ | BENGALI LETTER II | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
8 | U+0989 | উ | BENGALI LETTER U | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
9 | U+098A | ঊ | BENGALI LETTER UU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
10 | U+098B | ঋ | BENGALI LETTER VOCALIC R | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
11 | U+098F | এ | BENGALI LETTER E | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
12 | U+0990 | ঐ | BENGALI LETTER AI | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
13 | U+0993 | ও | BENGALI LETTER O | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
14 | U+0994 | ঔ | BENGALI LETTER AU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
15 | U+0995 | ক | BENGALI LETTER KA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
16 | U+0996 | খ | BENGALI LETTER KHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
17 | U+0997 | গ | BENGALI LETTER GA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
18 | U+0998 | ঘ | BENGALI LETTER GHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
19 | U+0999 | ঙ | BENGALI LETTER NGA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
20 | U+099A | চ | BENGALI LETTER CA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
21 | U+099B | ছ | BENGALI LETTER CHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
22 | U+099C | জ | BENGALI LETTER JA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
23 | U+099D | ঝ | BENGALI LETTER JHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
24 | U+099E | ঞ | BENGALI LETTER NYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
25 | U+099F | ট | BENGALI LETTER TTA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
26 | U+09A0 | ঠ | BENGALI LETTER TTHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
27 | U+09A1 | ড | BENGALI LETTER DDA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
28 | U+09A1 U+09BC (U+09DC) | ড় | Normalized form of BENGALI LETTER RRA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. 09DC is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development.
29 | U+09A2 | ঢ | BENGALI LETTER DDHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
30 | U+09A2 U+09BC (U+09DD) | ঢ় | Normalized form of BENGALI LETTER RHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. 09DD is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development.
31 | U+09A3 | ণ | BENGALI LETTER NNA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
32 | U+09A4 | ত | BENGALI LETTER TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
33 | U+09A5 | থ | BENGALI LETTER THA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
34 | U+09A6 | দ | BENGALI LETTER DA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
35 | U+09A7 | ধ | BENGALI LETTER DHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
36 | U+09A8 | ন | BENGALI LETTER NA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
37 | U+09AA | প | BENGALI LETTER PA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
38 | U+09AB | ফ | BENGALI LETTER PHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
39 | U+09AC | ব | BENGALI LETTER BA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
43 | U+09AF U+09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. 09DF is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development.
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112] [125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
50 | U+09BE | া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
51 | U+09BF | ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
52 | U+09C0 | ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
53 | U+09C1 | ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112] [122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (= Halant), Virama (= Da ri) | 1 Bangla, 2 Assamese, 2 Manipuri | [112] [122] [125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122] [125]

Table 8: Bangla Code-Point Repertoire
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes some specific sequences which enable conditional inclusion of Bangla LETTER A and LETTER E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, again followed by Bangla VOWEL SIGN AA, in the repertoire, for enabling inclusion of the [æ] sound as in English ‘bat’, ‘cat’ etc.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not an exhaustive list) | Reference
S1 | U+0985 U+09CD U+09AF U+09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112] [122]
S2 | U+098F U+09CD U+09AF U+09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]

Table 9: Sequences
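The two sequences of Table 9 can be spelled out code point by code point. The following is a minimal sketch, assuming nothing beyond the table itself; the helper names `describe` and `find_sequences` are hypothetical and introduced only for illustration.

```python
import unicodedata

S1 = "\u0985\u09CD\u09AF\u09BE"  # অ্যা: LETTER A + VIRAMA + LETTER YA + VOWEL SIGN AA
S2 = "\u098F\u09CD\u09AF\u09BE"  # এ্যা: LETTER E + VIRAMA + LETTER YA + VOWEL SIGN AA


def describe(sequence: str) -> list:
    """Pair every code point of a sequence with its Unicode character name."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in sequence]


def find_sequences(label: str) -> list:
    """Return (offset, name) for every occurrence of S1 or S2 in the label."""
    hits = []
    for name, seq in (("S1", S1), ("S2", S2)):
        start = label.find(seq)
        while start != -1:
            hits.append((start, name))
            start = label.find(seq, start + 1)
    return sorted(hits)
```

Treating S1 and S2 as whole units, rather than as four independent code points, is what lets the Hasanta follow a vowel in exactly these two cases and nowhere else.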
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find a place in Unicode but have not been included in the repertoire in the LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.

Sr. No. | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use

Table 10: Excluded Code Points
5.3 Code Point Not Used Alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire since it will never be used alone. It will be used in sequence in three special characters, in normalized form, for ড় ঢ় য়.

Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড় ঢ় য় respectively.

Table 10b: Excluded Code Points
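The "normalized form" referred to above can be observed directly with Unicode normalization. In the sketch below, the only library call is Python's standard `unicodedata.normalize`; the helper name `to_lgr_form` is hypothetical. Because U+09DC, U+09DD and U+09DF are composition exclusions, NFC turns each of them into the base letter followed by the Nukta, and it leaves the two-code-point form unchanged.

```python
import unicodedata

# Precomposed letter -> base letter + BENGALI SIGN NUKTA (U+09BC)
NUKTA_FORMS = {
    "\u09DC": "\u09A1\u09BC",  # ড় RRA
    "\u09DD": "\u09A2\u09BC",  # ঢ় RHA
    "\u09DF": "\u09AF\u09BC",  # য় YYA
}


def to_lgr_form(label: str) -> str:
    """Normalize a label to NFC, the form in which the repertoire lists these letters."""
    return unicodedata.normalize("NFC", label)
```

This is why Table 8 lists rows 28, 30 and 43 as two-code-point sequences while still naming the single code points in parentheses.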
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20.11.2009 and 18.07.2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) began with the following variables:

C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta

The Augmented Backus-Naur Formalism (ABNF) will use the following operators:

Sr. Number | Operator | Function
1 | “|” | Alternative
2 | “[ ]” | Optional
3 | “*” | Variable Repetition
4 | “( )” | Sequence Group

Table 11: The ABNF Formalism
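As a reading aid, the ABNF variables can be attached to the code points of Table 8 so that any label can be rewritten as a string of symbols. This is a sketch under assumptions: the names `build_symbol_table` and `symbols` are hypothetical, a Nukta after ড, ঢ or য is folded into the preceding consonant, and anything outside the repertoire is shown as `?`.

```python
NUKTA = "\u09BC"
NUKTA_BASES = {"\u09A1", "\u09A2", "\u09AF"}  # ড, ঢ, য


def build_symbol_table() -> dict:
    """Map every single code point of Table 8 to its ABNF variable."""
    table = {}

    def put(symbol, *cps):
        for cp in cps:
            table[chr(cp)] = symbol

    put("D", 0x0981)
    put("B", 0x0982)
    put("X", 0x0983)
    put("V", *range(0x0985, 0x098C), 0x098F, 0x0990, 0x0993, 0x0994)
    put("C", *range(0x0995, 0x09A9), *range(0x09AA, 0x09B1), 0x09B2,
        *range(0x09B6, 0x09BA), 0x09F0, 0x09F1)
    put("M", *range(0x09BE, 0x09C5), 0x09C7, 0x09C8, 0x09CB, 0x09CC)
    put("H", 0x09CD)
    put("Z", 0x09CE)
    return table


SYMBOLS = build_symbol_table()


def symbols(label: str) -> str:
    """Rewrite a label as a string of ABNF variables, e.g. ব্যাট -> CHCMC."""
    out = []
    for i, ch in enumerate(label):
        if ch == NUKTA and i > 0 and label[i - 1] in NUKTA_BASES:
            continue  # the Nukta belongs to the normalized consonant before it
        out.append(SYMBOLS.get(ch, "?"))
    return "".join(out)
```

For example, `symbols("\u09AC\u09CD\u09AF\u09BE\u099F")` rewrites ব্যাট as `CHCMC`, which can then be compared with the sequence patterns of the following subsections.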
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given to facilitate understanding.

5.4.2 The Vowel Sequence
To facilitate understanding by users of other Brāhmī scripts, equivalents in Devanāgarī are provided wherever necessary.

A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra/onuʃʃār (B), a Candrabindu (D) or a Visarga/biʃɔrgo (X). The number of D, B or X which can follow a V in Bangla may not be restricted to one. Going by the rules illustrated in the document, it is clear that formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have sequences such as a Candrabindu followed by an Anusvāra on the one hand, and a Candrabindu followed by a Visarga on the other, as in হ্যাঁংচা ‘hænchā’ and হ্যাঁঃ ‘hæn’ respectively. The possibility of a Visarga or Anusvāra (onuʃʃār) following a Candrabindu thus exists in Bangla.

A vowel can optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phalā. Ya-phalā is a presentation form of U+09AF য BENGALI LETTER YA. Represented by the sequence <U+09CD ্ BENGALI SIGN VIRAMA, U+09AF য BENGALI LETTER YA>, Ya-phalā has a special form ্য. Again, when combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. ‘aa’ (ā)), it is used for transcribing [æ], as in the “a” in the English word “bat”, written in Bangla as ব্যাট. A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example:
V অ अ

5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].

Examples:
VB অং अं
VD অঁ अँ
VX অঃ अः
VDB অঁং अँं
VDX অঁঃ अँः
VHCM অ্যা, এ্যা

5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].

Examples:
VHCMB অ্যাং, এ্যাং
VHCMD অ্যাঁ, এ্যাঁ
VHCMX অ্যাঃ, এ্যাঃ
VHCMDB অ্যাঁং, এ্যাঁং
VHCMDX অ্যাঁঃ, এ্যাঁঃ
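The vowel-sequence combinations listed above can be captured in one regular expression. This is a hedged sketch rather than the normative rule set: it assumes that the VHCM part is limited to the two sequences S1 and S2 of Table 9 (A or E + Hasanta + YA + AA), and the names `VOWEL_SEQUENCE` and `is_vowel_sequence` are hypothetical.

```python
import re

V = "[\u0985-\u098B\u098F\u0990\u0993\u0994]"        # independent vowels of Table 8
S = "[\u0985\u098F]\u09CD\u09AF\u09BE"                # VHCM, restricted to S1 / S2
TAIL = "(?:\u0981[\u0982\u0983]?|[\u0982\u0983])"     # D, DB, DX, B or X

# (S | V) optionally followed by one of B, D, X, DB, DX
VOWEL_SEQUENCE = re.compile(f"(?:{S}|{V}){TAIL}?")


def is_vowel_sequence(text: str) -> bool:
    """True if the whole text is one vowel sequence in the sense of Section 5.4.2."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None
```

The tail is written so that a Candrabindu may be followed by an Anusvāra or a Visarga, but never the other way round and never doubled, which matches the VDD, VBB and VXX exclusions stated in the text.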
5.4.3 The Consonant Sequence

5.4.3.1 A Single Consonant (C)
Example:
C ক क

5.4.3.2 A Consonant with Conditions
A Consonant is optionally followed by a dependent vowel sign, kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].

Examples:
CM কি, কু कि, कु
CB কং कं
CD কঁ कँ
CX কঃ कः
CH ক্ क् (pure consonant)
CDB কঁং कँं
CDX কঁঃ कँः

5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB or DX.

Examples:
CMB কীং, কুং कीं, कुं
CMD কাঁ काँ
CMX বীঃ वीः
CMDB কাঁং काँं
CMDX কাঁঃ काँः

5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):

*3(CH)C

Examples:
CHC ন্ত → ন + ্ + ত; न + ् + त
CHCHC ন্ত্র → ন + ্ + ত + ্ + র; न + ् + त + ् + र
CHCHCHC ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য; न + ् + त + ् + र + ् + य

5.4.3.5 Subsets
While considering its subsets, we will take the combination CHC only as a representative example; however, the same is equally applicable to CHCHC and CHCHCHC.

[A] The combination may be followed by M, B, D, X, DB or DX.

Examples:
CHCM ক্কী → ক ্ ক ী; क्की → क ् क ी
CHCB ক্কং → ক ্ ক ং; क्कं → क ् क ं
CHCD ক্কঁ → ক ্ ক ঁ; क्कँ → क ् क ँ
CHCX ক্কঃ → ক ্ ক ঃ; क्कः → क ् क ः
CHCDB ক্কঁং → ক ্ ক ঁ ং; क्कँं → क ् क ँ ं
CHCDX ক্কঁঃ → ক ্ ক ঁ ঃ; क्कँः → क ् क ँ ः

[B] *3(CH)CM may further be followed by a B, D, X, DB or DX.

Examples:
CHCMB ক্কীং → ক ্ ক ী ং; क्कीं → क ् क ी ं
CHCMB ক্কুং → ক ্ ক ু ং; क्कुं → क ् क ु ं
CHCMD ক্কাঁ → ক ্ ক া ঁ; क्काँ → क ् क ा ँ
CHCMX ক্কীঃ → ক ্ ক ী ঃ; क्कीः → क ् क ी ः
CHCMDB ক্কাঁং → ক ্ ক া ঁ ং; क्काँं → क ् क ा ँ ं
CHCMDX ক্কাঁঃ → ক ্ ক া ঁ ঃ; क्काँः → क ् क ा ँ ः
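The consonant-sequence patterns above, `*3(CH)C` optionally followed by M and by one of B, D, X, DB or DX, plus the pure consonant CH, can be sketched as a single regular expression. The names `CONSONANT_SEQUENCE` and `is_consonant_sequence` are hypothetical, and the sketch assumes that a trailing Hasanta is only intended for the single-consonant case CH given in 5.4.3.2.

```python
import re

# A consonant is either a normalized Nukta form (ড়, ঢ়, য়) or a plain consonant of Table 8.
C = "(?:[\u09A1\u09A2\u09AF]\u09BC|[\u0995-\u09A8\u09AA-\u09B0\u09B2\u09B6-\u09B9\u09F0\u09F1])"
M = "[\u09BE-\u09C4\u09C7\u09C8\u09CB\u09CC]"         # dependent vowel signs
H = "\u09CD"                                          # Hasanta
TAIL = "(?:\u0981[\u0982\u0983]?|[\u0982\u0983])"     # D, DB, DX, B or X

# Either the pure consonant CH, or *3(CH) C [M] [B / D / X / DB / DX]
CONSONANT_SEQUENCE = re.compile(f"(?:{C}{H}|(?:{C}{H}){{0,3}}{C}{M}?{TAIL}?)")


def is_consonant_sequence(text: str) -> bool:
    """True if the whole text is one consonant sequence in the sense of Section 5.4.3."""
    return CONSONANT_SEQUENCE.fullmatch(text) is not None
```

The `{0,3}` bound corresponds to the `*3` repetition, so at most four consonants can be joined by Hasanta, in line with the "up to 4" wording of 5.4.3.4.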
5.4.4 The Khanda-Ta Sequence

5.4.4.1 A Single ‘Khanda’-Ta (Z)
Example:
Z ৎ = त्

5.4.4.2 A Khanda Ta Combination [footnote 10]
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):

[CH]Z

Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) ‘scolding’.
Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
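The note on the Khanda Ta combination amounts to a simple context check: a Hasanta may stand directly before ৎ only when that Hasanta follows one of the two RAs. A minimal sketch, with the hypothetical helper name `khanda_ta_context_ok`, and checking only this one condition (not the other rules for Z):

```python
KHANDA_TA = "\u09CE"            # ৎ
HASANTA = "\u09CD"              # ্
RAS = {"\u09B0", "\u09F0"}      # র (Bangla RA), ৰ (Assamese RA)


def khanda_ta_context_ok(label: str) -> bool:
    """Check that every Hasanta directly before a Khanda Ta is preceded by a RA."""
    for i, ch in enumerate(label):
        if ch != KHANDA_TA:
            continue
        if i > 0 and label[i - 1] == HASANTA:
            if i < 2 or label[i - 2] not in RAS:
                return False
    return True
```

For ভর্ৎসনা the check passes because the Hasanta before ৎ follows র; replacing র by any other consonant makes it fail.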
5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) could be described briefly here. Let us take up S in the first instance. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ - BANGLA LETTER A and U+098F এ - BANGLA LETTER E). For rendering a Ya-phalā after অ or এ, it is necessary to type U+09CD plus U+09AF ya, preceded by the said vowel. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the [æ] sound as in English ‘bat’, ‘fat’ etc. The Brāhmī script by nature does not have a Hasanta after a vowel. Hasanta is generally described as a ‘vowel killer’, although it actually indicates the absence of a vowel after the marked consonant. Only consonants can be marked with the Hasanta. As we see here, however, Bangla ends up with a deviant orthographic feature in which the Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).

Another case, mentioned in Table 16 as P, refers to the formation of repha and ra-phalā in the said script. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALĀ). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka “the sun”); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra “cycle”). The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.

Footnote 10: Refer to Rule P in Section 7, Table 16.
6 Variants
This section talks about the variants in the Bangla script. The NBGP categorizes these confusing variants into two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community

For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formations. These can cause confusion even to a careful observer and hence are being proposed as variants.

Variants are generated in a script when two or more forms are formed with different storage or code points. In Bangla, the e-kāra, ā-kāra and the o-kāra have different code points. One can type o with a consonant at one go, or obtain the same result by typing e-kāra and ā-kāra as two separate keys. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered a case of variant, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants
However, we propose two cases of true in-script variants in the Bangla script.

CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases wherein Hasanta with (U+09A5) থ (tha) appears as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)

The above combinations, if written in traditional orthography, could be a little confusing, as the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final position (as shown below in example no. 1). It could also be typed wrongly, thinking it was a হ (ha) U+09B9, increasing the chances of risk in label writing and identification.

Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)

The fonts which represent the traditional Bangla writing system could tend to create this problem. Therefore these may be taken as cases of variants in Bangla.

CASE II
Another interesting example of a variant is encountered in ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing ‘repha’ (involving ra + Hasanta) and ‘ra-phalā’ (involving Hasanta + ra).

‘Repha’ could be formed by two sequences (mainly because both Assamese and Bangla find a place in the same Unicode points, and ‘B_RA’ as well as ‘A_RA’ refer to the same phonetic element). Here the final ligatures look the same and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C

Where
B_RA = U+09B0 BENGALI LETTER RA (র), or
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)

Example:

Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ

Table 12: Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. Addition of Repha does not make any difference.

‘Ra-phalā’ could be formed by two sequences on similar grounds, and the final ligatures would look the same:

(1) C1 + H + B_RA
(2) C1 + H + A_RA

Where C1 = any consonant except Khanda-Ta

Example:

Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ

Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phalā conjunct forms look the same, this could cause confusion for end users. Hence the repha and ra-phalā cases need to be defined as variants. The NBGP concluded to define র and ৰ as variant code points, so that a single variant set between র and ৰ covers all cases. But this will create blocked variant labels: e.g. if someone registers “ররর”, the variant label “ৰৰৰ” will be generated as a variant and will be blocked, and vice versa. However, it is only blocked at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, that is still possible.
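The two in-script variant cases can be illustrated with an "index label" comparison: each label is reduced to one representative spelling, and two different labels that reduce to the same spelling block each other. This is a sketch under assumptions, not the LGR's normative mechanism: the function names are hypothetical, and the sketch also treats mixed spellings such as রৰর as variants, which the text above does not spell out.

```python
RA_BANGLA, RA_ASSAMESE = "\u09B0", "\u09F0"   # র, ৰ
SA, NA = "\u09B8", "\u09A8"                   # স, ন
THA, HA = "\u09A5", "\u09B9"                  # থ, হ
HASANTA = "\u09CD"                            # ্


def index_label(label: str) -> str:
    """Reduce a label to one representative spelling of its variant set."""
    out = label.replace(RA_ASSAMESE, RA_BANGLA)                        # Case II
    for base in (SA, NA):                                              # Case I
        out = out.replace(base + HASANTA + HA, base + HASANTA + THA)
    return out


def are_blocked_variants(a: str, b: str) -> bool:
    """Two distinct labels are variants if they share the same index label."""
    return a != b and index_label(a) == index_label(b)
```

With this check, ররর and ৰৰৰ block each other while ররর and ৰৰ do not, which matches the observation that blocking happens only at the level of the whole label.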
6.2 Cross-Script Variants
A crisp cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia [footnote 11] (formerly Oriya), keeping in mind the visual and technical confusion they may cause as labels in web domains. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain characters which are somewhat similar, they have not been included here. They have been provided in the Appendix (10.2) for reference.

Footnote 11: Unicode uses Oriya for the script, although Odia is now the official term used.

1. Bangla and Nāgarī/Devanāgarī Script

Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F

Table 14: Bangla and Devanāgarī cross-script variant code points

2. Bangla and Gurmukhī Script

Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F

Table 15: Bangla and Gurmukhī cross-script variant code points
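Tables 14 and 15 can be read as two small lookup maps. The sketch below assumes, for illustration only, that a Bangla label has a cross-script variant in a target script when every one of its code points has a counterpart in that script's map; the name `cross_script_variants` is hypothetical.

```python
# Bangla code point -> variant code point, per target script (Tables 14 and 15).
CROSS_SCRIPT_VARIANTS = {
    "Devanagari": {"\u09AE": "\u092E", "\u09BF": "\u093F"},
    "Gurmukhi": {"\u09AE": "\u0A38", "\u09BF": "\u0A3F"},
}


def cross_script_variants(label: str) -> dict:
    """Return {script: variant label} for scripts in which the whole label can be mirrored."""
    result = {}
    for script, mapping in CROSS_SCRIPT_VARIANTS.items():
        if label and all(ch in mapping for ch in label):
            result[script] = "".join(mapping[ch] for ch in label)
    return result
```

A label built only from ম and ি therefore has one variant label in each of the two scripts, while any label containing another Bangla letter has none.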
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Bangla [footnote 12] script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications.

Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories, as mentioned in the table provided in the Code Point Repertoire (Section 5.1).

C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), or the (a/e) Ya-phalā sequence (V1 H C1 M1), where V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E), H is 09CD (্ - BENGALI SIGN VIRAMA), C1 is 09AF (য - BENGALI LETTER YA) and M1 is 09BE (া - BENGALI VOWEL SIGN AA). S1 and S2 are valid even though they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL) and H is 09CD (্ - BENGALI SIGN VIRAMA).

Table 16: Symbols used in WLE rules

Footnote 12: As used by Unicode, denoting and including both Assamese and Maṇipuri.
It is also perhaps ideal to mention here that in Bangla the consonant letters (or graphemes) are physically joined to form “clusters” that could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are “nearly 380 unique consonant clusters”, out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, p. 4]. More details of such combinations can be seen in Pabitra Sarkar (1993) [135].

7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1. C is the set of C and CN, where CN is the set of normalized forms of ড় ঢ় য়.
2. H must be preceded by C. Example: ক্
3. M must be preceded by C. Example: কা
4. D must be preceded by either of V, C or M. Example: আঁ, খঁ, খাঁ, হাঁ
5. X must be preceded by either of V, C, M or D. Example: উঃ, খঃ, বঃ, াঃ, দঁঃ
6. B must be preceded by either of V, C, M or D. Example: আং, ইং, কং
7. Z must be preceded by V, C, M, D, B, X or P. Example: ইৎ, কৎ, াৎ, াৎ, প6ৎ, র্ৎ. (S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V Preceded by H.
9. S CANNOT be preceded by H.
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in adding such ligatures, because any user can easily create them even though they may not be attested combinations.

7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া /bæŋkʌv ɪndiə/ (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE), meaning ‘Bank of India’. This is the case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.

This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-Joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise V is never required to be allowed to follow an H. Permitting this may create a perceived similarity between two labels (with and without H) for the majority of the linguistic communities; hence this is explicitly prohibited by the NBGP.

In future, if required, depending on the prevailing requirements from the community, the future NBGP may consider revisiting this rule.
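The nine WLE rules of Section 7.1 can be sketched as a "what may come before this symbol" check over a tokenized label. This is an illustration under assumptions, not the machine-readable LGR: the names `TOKEN`, `tokenize` and `is_valid_label` are hypothetical, S is recognized as a single unit and treated like an M for the symbols that follow it, and a label containing any code point outside the repertoire is rejected.

```python
import re

# One token per ABNF symbol; S is tried first so that it is read as a unit.
TOKEN = re.compile(
    "(?P<S>[\u0985\u098F]\u09CD\u09AF\u09BE)"
    "|(?P<C>[\u09A1\u09A2\u09AF]\u09BC"
    "|[\u0995-\u09A8\u09AA-\u09B0\u09B2\u09B6-\u09B9\u09F0\u09F1])"
    "|(?P<V>[\u0985-\u098B\u098F\u0990\u0993\u0994])"
    "|(?P<M>[\u09BE-\u09C4\u09C7\u09C8\u09CB\u09CC])"
    "|(?P<B>\u0982)|(?P<D>\u0981)|(?P<X>\u0983)|(?P<H>\u09CD)|(?P<Z>\u09CE)"
)
RAS = {"\u09B0", "\u09F0"}

# Rules 2 to 7: symbols that may directly precede each symbol.
ALLOWED_BEFORE = {
    "H": {"C"},
    "M": {"C"},
    "D": {"V", "C", "M", "S"},
    "X": {"V", "C", "M", "D", "S"},
    "B": {"V", "C", "M", "D", "S"},
    "Z": {"V", "C", "M", "D", "B", "X", "S"},  # P (RA + H) is handled below
}


def tokenize(label: str):
    """Split a label into (symbol, text) pairs, or return None if a code point is unknown."""
    tokens, pos = [], 0
    while pos < len(label):
        match = TOKEN.match(label, pos)
        if match is None:
            return None
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def is_valid_label(label: str) -> bool:
    """Apply WLE rules 2 to 9 to a label (rule 1 is built into the C token)."""
    tokens = tokenize(label)
    if not tokens:
        return False
    for i, (symbol, _text) in enumerate(tokens):
        prev = tokens[i - 1][0] if i else None
        if symbol in ALLOWED_BEFORE:
            ok = prev in ALLOWED_BEFORE[symbol]
            if symbol == "Z" and prev == "H":        # rule 7, case P
                ok = i >= 2 and tokens[i - 2][1] in RAS
            if not ok:
                return False
        elif symbol in ("V", "S") and prev == "H":   # rules 8 and 9
            return False
    return True
```

Under this sketch ব্যাঙ্ক is accepted, while the joined form অফ্ই is rejected because a vowel follows a Hasanta, which is the situation discussed in Section 7.1.1.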
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules the ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.

1. H, M, B, D or X cannot occur in the beginning of a Bangla word.
Example:
্ক ्क
াক ाक
ংক ंक
ঁক ँक
ঃক ःक
As can be seen, such a combination will automatically result in a “golu” or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.

2. H is not permitted after V, B, D, X, M, S.
Example:
অ্ अ्
অং্ कं्
কঁ্ कँ्
কঃ্ कः्
কি্ कि्

3. The number of B, D or X permitted after a Consonant, a Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalidated.
Example:
কংং कंं
কঁঁ कँँ
কঃঃ कःः
কাঁঁ काँँ
কীঃঃ कीःः
অংং अंं
অঁঁ अँँ
অঃঃ अःः

4. The number of M permitted after a Consonant is restricted to one.
Example:
কীী कीी

5. M is not permitted after V.
Example:
ইা, ঈৌ ईा, ईौ

6. The combinations of Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Example:
কংঃ कंः
কঃং कःं
8 Contributors
8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members

8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals' Systems, Dhaka
53
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje & Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920, 2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. 'Bangla Harapher Panch Parba'. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the MS University of Baroda, Social Science Number, 29:2, 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf (accessed on 25/11/2017).
[124] Ethnologue: Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm (accessed on 25/11/2017).
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm (accessed on 20/10/2019).
[126] Wikipedia: Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet (accessed on 25/11/2017).
[129] Omniglot: Sylheti. http://www.omniglot.com/writing/syloti.htm (accessed on 10/5/2018).
[130] Wikipedia: Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language (accessed on 25/11/2017).
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20 (accessed on 10/5/2018).
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696 (accessed on 10/5/2018).
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html (accessed on 10/5/2018).
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1123-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1(2): 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). 'Srīhatta-Nāgarī Lipir Utpatti o Bikāś'. Bangla Academy Patrika (Dhaka), Vol. 1.2 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia: Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari (accessed on 19/5/2018).
[139] Furui, Ryosuke (2015). 'Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century', in Bhairabi Prasad Sahu & Hermann Kulke, eds. Interrogating Political Systems: Integrative Processes and States in Pre-Modern India. Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). 'Phones of Bengali'. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646 (UNICODE). https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf (accessed on May 21, 2018).
[145] Wikipedia: Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block) (accessed on May 21, 2018).
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html (accessed on May 21, 2018).
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. 'Bangla Spelling Reform: the Long and Short of It'. Bangla Journal 19: 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. "What has happened So Far In terms of Script Reforms". Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka, on 10/07/2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 - Core Specification. Chapter 12, p. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when applied to a specific language or script certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla (see footnote 13) in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold 'varga' (as defined under Table 5). Bangla generally does not allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem 'Lichu chor' written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2. CH can come with Khaṇḍa ta only in the case where C is ra (র) (U+09B0): র্ৎ, as in ভর্ৎসনা.
3. Only the following combinations with VHCM will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
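The three restrictions above can be sketched as a small label check. This is only an illustration under stated assumptions: the function name `restriction_violations`, the message strings and the set of independent vowels are hypothetical choices made here, and the actual rules are expressed in the LGR's XML file, not in Python.

```python
# Code points named in the restriction rules above.
KHANDA_TA = "\u09CE"
NYA, NGA = "\u099E", "\u0999"
HASANTA = "\u09CD"
RA, YA, AA_KARA = "\u09B0", "\u09AF", "\u09BE"

FORBIDDEN_INITIALS = {KHANDA_TA, NYA, NGA}
# Independent vowels of the Bengali block (assumed list for this sketch).
INDEPENDENT_VOWELS = {chr(cp) for cp in (
    0x0985, 0x0986, 0x0987, 0x0988, 0x0989, 0x098A, 0x098B,
    0x098F, 0x0990, 0x0993, 0x0994)}
YA_PHALA_VOWELS = {"\u0985", "\u098F"}  # only A and E may take H + YA + AA


def restriction_violations(label: str) -> list[str]:
    """Return human-readable descriptions of violated restriction rules."""
    problems = []
    # Rule 1: khanda ta, NYA and NGA may not start a label.
    if label[:1] in FORBIDDEN_INITIALS:
        problems.append("forbidden initial letter")
    for i, ch in enumerate(label):
        # Rule 2: C + H + khanda ta is allowed only when C is ra.
        if ch == KHANDA_TA and i >= 1 and label[i - 1] == HASANTA:
            if i < 2 or label[i - 2] != RA:
                problems.append(f"khanda ta after non-ra cluster at index {i}")
        # Rule 3: a vowel may be followed by hasanta only in A/E + H + YA + AA.
        if ch == HASANTA and i >= 1 and label[i - 1] in INDEPENDENT_VOWELS:
            allowed = (label[i - 1] in YA_PHALA_VOWELS
                       and label[i + 1:i + 3] == YA + AA_KARA)
            if not allowed:
                problems.append(f"hasanta after vowel at index {i}")
    return problems


print(restriction_violations("\u09AD\u09B0\u09CD\u09CE\u09B8\u09A8\u09BE"))  # bhartsana
print(restriction_violations("\u09CE\u09B8"))  # starts with khanda ta
```

The check is deliberately positional and stateless, so each rule can be read off against the numbered list; a real implementation would follow the whole-label evaluation rules of the LGR itself.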
13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
10.2 'Sylheti Nāgarī lipi' or 'Siloti'
This version of Bangla script resembles the 'Kaithī' script (ISO 15924) used by the accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, widely in use during the 1880s. There were several other names of Sylheti Nāgarī or Siloti (129), such as 'Jalalabada Nāgarī', 'Fula (flower) Nāgarī', 'Muslim Nāgarī' or 'Muhammad Nāgarī'. It is said that Shah Jalal had brought the script with him to Sylhet in the 13th-14th century (138), although some have suggested that it was an invention of the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here (138).
Table 17 - The Script Table of Sylheti Nāgarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable
Table 18: Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhi
Bangla | Gurmukhi | NBGP Decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable
Table 19: Bangla and Gurmukhi confusable code points
Bangla | Gurmukhi | NBGP Decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable
Table 20: Bangla and Gurmukhī distinguishable code points
10.3.3 Bangla and Oriya (Odia)
Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable
Table 21: Bangla and Oriya confusable code points
Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable
Table 22: Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs
Consonants | Phonetic Value | Allographs: Clusters | Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ | This letter does not have any particular phonetic value, but is mostly pronounced as n
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য | dʒ. The secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word, it is a vowel æ (as in শ্যামল, র‍্যাপার, etc.). But after a non-initial consonant, it just doubles it in pronunciation (as in কার্য, ধার্য, etc.). The র + য combination has two physical manifestations: র‍্য (ra with ya-phalā) and র্য (repha over ya).
ক্য (ক্ + য), স্য (স্ + য), র‍্য (র্ + য) [র‍্য on its own is never used in Bangla orthography; র‍্যা is, but then its last two symbols, ya-phalā and ā-kāra, constitute a vowel sign representing the vowel অ্যা]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ | h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃকঃ
অব
in Bangla or Mātrā in Nāgarī (see footnote 9) is attached to the consonant. Since the consonant has this built-in neutral vowel at the end, there are equivalent kāras (Mātrās) for all vowels except the অ (pronounced -ɔ). The correlation is shown as follows:
Vowel | Corresponding vowel sign (kāras / Mātrās)
অ 'A' U+0985 | (none)
আ 'AA' U+0986 | া U+09BE
ই 'I' U+0987 | ি U+09BF
ঈ 'II' U+0988 | ী U+09C0
উ 'U' U+0989 | ু U+09C1
ঊ 'UU' U+098A | ূ U+09C2
ঋ Vocalic 'R' U+098B | ৃ U+09C3
ৠ Vocalic 'RR' U+09E0 | ৄ U+09C4
ঌ Vocalic 'L' U+098C | ৢ U+09E2
ৡ Vocalic 'LL' U+09E1 | ৣ U+09E3
এ 'E' U+098F | ে U+09C7
ঐ 'AI' U+0990 | ৈ U+09C8
ও 'O' U+0993 | ো U+09CB
ঔ 'AU' U+0994 | ৌ U+09CC
- | ৗ U+09D7
Could appear on top of অ 'A' U+0985 or any other vowel | ঁ U+0981 Candrabindu
Could appear after অ 'A' U+0985 or any other vowel | ং U+0982 Anusvara
Could appear after অ 'A' U+0985 or any other vowel | ঃ U+0983 Visarga
After any consonant | ্ U+09CD (Hasanta)
- | ঽ U+09BD Avagraha
Table 7: Bangla Vowels with corresponding kāras
9 Although the term 'Matra' in Bangla stands for an altogether different concept, viz. the top bar placed over a letter, typically available in Hindi and Bangla but missing in Gujarati.
3.3.4 The Anusvāra / onuʃʃār (ং - U+0982)
The Anusvāra or onuʃʃār in Bangla at times represents a homorganic nasal, but not always. It replaces a conjunct group of a 'Nasal Consonant + Hasanta + Consonant' where the second consonant belongs to the Velar varga or set, as in লংকা. But it often also appears for such combinations involving non-velars appearing as the last member of the combination, as in ল্যাংটা "naked" or ল্যাংচা "a kind of sweet; to limp". Before a non-varga consonant, the Anusvāra represents a nasal sound that may have an alternative conjoined writing symbol representing the corresponding nasal consonant of the particular set. Although Modern Hindi, Marathi and Konkani prefer the anusvāra to the corresponding half-nasal, in Bangla it is clearly demarcated where one must use the Anusvāra and where it has to be a conjunct cluster with a nasal as the first or the second component.
3.3.5 Nasalization: Candrabindu (ঁ - U+0981)
Candrabindu denotes nasalization of the preceding vowel, as in চাঁদ cad 'moon' (U+099A U+09BE U+0981 U+09A6). This sign, with a dot inside the half-moon mark, is used as a nasalization marker in many Brāhmī-based scripts [143].
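Code point sequences such as the one quoted for 'moon' are easy to check mechanically. The following is a minimal sketch using only the Python standard library; the helper name `code_points` is a hypothetical choice made for this illustration.

```python
import unicodedata


def code_points(text: str) -> list[str]:
    """List the code points of a string in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


# The word for 'moon' as spelled out above: ca + aa-kara + candrabindu + da.
moon = "\u099A\u09BE\u0981\u09A6"

print(code_points(moon))
print(unicodedata.name(moon[2]))  # name of the nasalization sign
```

Writing the string with escapes rather than pasted glyphs keeps the order of the combining marks explicit, which matters because the candrabindu follows the vowel sign in the encoded sequence.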
3.3.6 Nukta (় - U+09BC)
The nukta sign does not exist in Bangla orthography. It is predominantly used in many Brāhmī-derived scripts such as Devanāgarī (for Hindi, Bodo, Maithili, Santali, Kashmiri and Sindhi). The term and the concept of nukta are borrowed in Bangla. The IDNA Protocol (RFC 5891) states that IDNs must be in Unicode Normalization Form C (NFC). RFC 7940 applies this requirement to LGRs. The definition of NFC in the Unicode Standard contains a number of composition exclusions. As a result, the Bangla letters য় YYA, ড় RRA and ঢ় RHA have to be represented in this LGR by using the sequences (YA + Nukta: U+09AF + U+09BC), (DDA + Nukta: U+09A1 + U+09BC) and (DDHA + Nukta: U+09A2 + U+09BC) instead of the single code points YYA (U+09DF), RRA (U+09DC) and RHA (U+09DD), although the use of 'Nukta' is otherwise completely unnatural in Bangla.
It is noted that in the current Unicode Standard chart these characters are listed as additional consonants. As per the LGR Procedure, however, these decisions depend on the IDNA Protocol through a set of procedures developed by the IETF. Even though the Unicode Standard also prescribes methods to produce these three characters as atomic characters (for example 09DC for ড় [r], 09DD for ঢ় [rh] and 09DF for য় [y], each as a single keystroke), the IDNA protocol requires that we treat them as conjunct characters and then allocate codes for these in the Unicode Bengali Block.
It may be noted that there could be sporadic attempts or cases of writing Muslim names, Urdu poetic words and Perso-Arabic loan words with nukta under ক (k), খ (kh), গ (g), জ (j) and ফ (ph), only for the sake of correct pronunciation and for maintaining the sanctity of the loan word. Such usage amounted to making the Bangla writing system work like the IPA script. It is, however, not in use in printed Bangla writing.
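The effect of the composition exclusions can be observed directly with a normalization library. The sketch below uses Python's standard `unicodedata` module; the dictionary and helper names are hypothetical and exist only for this illustration.

```python
import unicodedata

# The three precomposed letters discussed above.
PRECOMPOSED = {"RRA": "\u09DC", "RHA": "\u09DD", "YYA": "\u09DF"}


def as_code_points(text: str) -> str:
    """Render a string as space-separated U+XXXX values."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def nfc_form(text: str) -> str:
    """Normalize to NFC, the form required for IDN labels."""
    return unicodedata.normalize("NFC", text)


for name, ch in PRECOMPOSED.items():
    print(name, as_code_points(ch), "->", as_code_points(nfc_form(ch)))
# RRA U+09DC -> U+09A1 U+09BC
# RHA U+09DD -> U+09A2 U+09BC
# YYA U+09DF -> U+09AF U+09BC
```

Because NFC never recomposes these letters, a label typed with the single code points and one typed with base letter plus nukta normalize to the same sequence, which is why the repertoire lists only the sequences.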
3.3.7 Visarga / biʃɔrgo (ঃ - U+0983) and Avagraha (ঽ - U+09BD)
The Visarga / biʃɔrgo (U+0983) is frequently used in Bangla loan words borrowed from Sanskrit and represents a sound very close to h. One could quote as an example দুঃখ duhkho "sorrow", "unhappiness" (U+09A6 U+09C1 U+0983 U+0996). The Avagraha ঽ (U+09BD) is mainly used in Sanskrit, Pali, Prakrit or Maithili texts written in Bangla. It is gradually being replaced by an upper comma (e.g., নরোঽপরাণি re-written as নরো'পরাণি). It is rarely used now, even in other languages using Bangla script. In the case of the LGR, the Avagraha is not part of the repertoire: it has been decided not to retain Avagraha (ঽ) (U+09BD) because it is blocked in TLDs as per the Maximal Starting Repertoire (MSR). Please see Appendix II in Section 11 for a complete list of Bangla consonants and their allographs.
3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)
This note is pertinent to the use of Zero Width Joiner (ZWJ) and Zero Width Non-Joiner (ZWNJ) as used in Bangla. It needs to be noted that Nepali, Konkani and Hindi use these two signs in a different manner.
ZWJ (U+200D) and ZWNJ (U+200C) are code points that have been provided by the Unicode Standard to instruct the rendering of a string where the script has the option between joining and non-joining characters. Without the use of these control codes, the string may be rendered in an alternate form from what is intended.
Use of ZWJ
• Insofar as Bangla is concerned, ZWJ is used for the proper rendering of characters such as khaṇḍa-ta ৎ, as in সত্যজিৎ (satyajit) "Satyajit" and সৎ (sat) "honest". This is typed as follows: ta + Hasanta + ZWJ (U+200D).
• However, ZWJ is more important where the same combination of consonantal characters is represented differently depending upon the context. E.g., র + ্ + য has two representations in Bangla: as র্য (repha over ya) and as র‍্য (ra with ya-phalā). To get the form র্য one has to type র + ্ + য, but for র‍্য the sequence would be র + ZWJ + ্ + য [154]. In other words, ZWJ is used in the rendering of words demanding ya-phalā after ra, which is otherwise not possible to type (render) because the same order ra + hasanta + antastha ja is already used in the medial and/or final position. Interestingly, ra + hasanta + antastha ja is used to type repha on the consonant antastha ja, as in কার্য (kaarjo). In order to get a ya-phalā after the consonant ra, it is therefore obligatory to use ZWJ after ra, as in র‍্যাপার (wrapper), র‍্যাশ (rash), র‍্যালি (rally), etc. The typing sequence is given below:
ra (র) + ZWJ + hasanta (্) + antastha ja (য) = র‍্য
Use of ZWNJ
• ZWNJ is used in Bangla to represent the explicit Hasanta or Halant. In order to avoid conjunct formation in cases where there is an explicit hasanta before the succeeding consonant, the ZWNJ is used.
Consonant + hasanta + ZWNJ + consonant = explicit hasanta. Example: প্রাক্‌কথন (prakkathana / prakkɔtʰon)
The use of ZWJ and ZWNJ has been ruled out from the root zone by the [Procedure]. Because they are used in Bangla to create alternate renderings, the insertion of these two signs can affect searching as well as NLP.
The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Hasanta) where default conjunct formation is to be explicitly restricted and the Hasanta joining the two consonants participating in the conjunct formation needs to be explicitly shown.
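Since ZWJ and ZWNJ are invisible, their presence in a candidate label is best detected by code point rather than by eye. The following sketch builds the two ra + ya sequences described above and reports any joiners found; the constant and function names are hypothetical and serve only this illustration.

```python
ZWJ, ZWNJ = "\u200D", "\u200C"
RA, YA, HASANTA = "\u09B0", "\u09AF", "\u09CD"

# ra + hasanta + ya: rendered as repha over ya.
REPHA_ON_YA = RA + HASANTA + YA
# ra + ZWJ + hasanta + ya: rendered as ra with ya-phala.
YA_PHALA_ON_RA = RA + ZWJ + HASANTA + YA


def joiners_in(label: str) -> list[tuple[int, str]]:
    """Return (index, code point) for every ZWJ or ZWNJ in the label."""
    return [(i, f"U+{ord(ch):04X}")
            for i, ch in enumerate(label) if ch in (ZWJ, ZWNJ)]


def is_joiner_free(label: str) -> bool:
    """True when the label contains neither ZWJ nor ZWNJ."""
    return not joiners_in(label)


print(is_joiner_free(REPHA_ON_YA))     # True
print(joiners_in(YA_PHALA_ON_RA))      # [(1, 'U+200D')]
```

The two strings differ only by the invisible joiner, so a search or comparison that ignores it would treat them as the same text even though they render differently.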
3.3.9 Use of Ya-phalaa
Ya-phalaa sequences are two instances in Bangla where Hasanta is preceded by a full vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E):
• অ্যা 0985 09CD 09AF 09BE: BENGALI LETTER A + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
• এ্যা 098F 09CD 09AF 09BE: BENGALI LETTER E + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
For rendering ya-phalā after অ and এ, it is necessary to type U+09CD Hasanta plus U+09AF ya, preceded by the said vowels. This is a purely ligatural entity, and the addition of ya-phalā and ā-kāra is used to elicit the æ sound, as in English 'acid' অ্যাসিড, 'association' অ্যাসোসিয়েশন, 'bat' ব্যাট, 'fat' ফ্যাট, 'mat' ম্যাট, 'cap' ক্যাপ, etc.
The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in the orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
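The two vowel-initial ya-phalā sequences listed above have a fixed shape, so they can be located with a simple pattern. This is a sketch only; the pattern name `VOWEL_YA_PHALA_AA` and the helper are hypothetical, and the regular expression covers just the two sequences named in this section.

```python
import re

# A or E, then hasanta (virama), then ya, then the vowel sign aa.
VOWEL_YA_PHALA_AA = re.compile("[\u0985\u098F]\u09CD\u09AF\u09BE")


def find_vowel_ya_phala(text: str) -> list[str]:
    """Return every A/E + hasanta + ya + aa sequence found in the text."""
    return VOWEL_YA_PHALA_AA.findall(text)


acid = "\u0985\u09CD\u09AF\u09BE\u09B8\u09BF\u09A1"   # 'acid', vowel-initial
bat = "\u09AC\u09CD\u09AF\u09BE\u099F"                # 'bat', consonant-initial

print(len(find_vowel_ya_phala(acid)))  # 1
print(len(find_vowel_ya_phala(bat)))   # 0
```

The second example shows the ordinary case: ya-phalā after a consonant does not match, because only the two full vowels are exceptions to the rule that Hasanta follows a consonant.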
3.3.10 Formation of Ra-phalaa and Reph Sequences
This case refers to the formation of repha and ra-phalā as follows:
Ra-Hasanta = (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name BENGALI LETTER RA WITH MIDDLE DIAGONAL), and H is 09CD (্ - BENGALI SIGN VIRAMA).
Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALĀ). For instance:
repha = ra + Hasanta + C (e.g., র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun")
ra-phalā = C + Hasanta + ra (e.g., ক্র, i.e. ka + Hasanta + ra, as in চক্র chakra "cycle")
The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or by Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases. The LGR makes a note of this point of concern with respect to the two RAs in disguise, as it would be completely impossible to distinguish between them with the naked eye in a label so generated, which may consequently lead to concerns related to spoofing and other kinds of cyber irregularities. The motive for classing these two code points as (blocking) variants is that fully rendered labels may mask the distinction between Bangla ra র (U+09B0) and Assamese ra ৰ (U+09F0). That provides the justification for Variant Set 4, though only in the context of a following Hasanta. The difference between the RAs is only distinguishable if one looks at their Unicode values. Therefore labels such as অর্ক arka, শীর্ষ śīrṣa 'top, apex', অভ্র abhra 'cloud, the sky' and শ্রম śrama 'physical labour' could be extremely dangerous, as the web user may never verify the digital content (the labels) against its Unicode code points. This point is made explicitly with reference to Table 9 (of sequences, p. 36) and Table 16 (of WLE symbols, p. 47) that are to follow. Moreover, it is noteworthy that the REPHA can also occur with KHANDA TA. The condition in this context is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
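The blocking-variant relationship between the two RAs can be sketched as a comparison on a folded form of the label. This is an illustrative assumption, not the LGR's own mechanism: the folding function `variant_key`, the choice to fold Assamese ra onto Bangla ra, and the treatment of a Hasanta on either side of ra as the triggering context are all choices made for this sketch.

```python
BANGLA_RA = "\u09B0"
ASSAMESE_RA = "\u09F0"
HASANTA = "\u09CD"


def variant_key(label: str) -> str:
    """Fold Assamese ra onto Bangla ra wherever it touches a hasanta."""
    chars = list(label)
    for i, ch in enumerate(label):
        if ch != ASSAMESE_RA:
            continue
        before = i > 0 and label[i - 1] == HASANTA
        after = i + 1 < len(label) and label[i + 1] == HASANTA
        if before or after:
            chars[i] = BANGLA_RA
    return "".join(chars)


def are_blocked_variants(a: str, b: str) -> bool:
    """True when two distinct labels collapse to the same folded form."""
    return a != b and variant_key(a) == variant_key(b)


arka_bn = "\u0985\u09B0\u09CD\u0995"   # a + Bangla ra + hasanta + ka
arka_as = "\u0985\u09F0\u09CD\u0995"   # a + Assamese ra + hasanta + ka
print(are_blocked_variants(arka_bn, arka_as))            # True
print(are_blocked_variants("\u0995\u09B0", "\u0995\u09F0"))  # False: full ra forms differ visibly
```

The second call shows why the context matters: a free-standing ra keeps its distinctive diagonal stroke in the Assamese form, so only the repha and ra-phalā shapes are treated as indistinguishable here.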
4 Overall Development Process and Methodology
The Neo-Brāhmī Generation Panel (NBGP) has been formed by members having experience in Linguistics (especially in NLP and Computational Linguistics), Literature, Language History and Epigraphy. Under the Neo-Brāhmī Generation Panel, Bangla and eight other scripts belonging to separate Unicode blocks are being taken up to assign a separate LGR for each. However, an attempt is made to ensure that the fundamental philosophy behind building those LGRs is consistent with all other Brāhmī-derived scripts. The present LGR will cater to multiple languages belonging to EGIDS scale 1 to 4 (see Table 4) that use Bangla script. The following guiding principles are used in making decisions about Bangla LGR code points.
4.1 Guiding Principles
The NBGP adopts the following broad principles for the selection of code points in the code-point repertoire across the board for all the Neo-Brāhmī scripts within its ambit.
4.1.1 Inclusion Principles
4.1.1.1 Modern Usage
Every character proposed should be in the everyday usage of a particular linguistic community. The characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code-point repertoire.
4.1.1.2 Unambiguous Use
Every character proposed should have unambiguous understanding among linguists about its usage in the language.
4.2 Exclusion Principles
The main exclusion principle is that of External Limits on Scope. These consist of protocols or standards which are prerequisites to the Label Generation Rule-sets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.
4.2.1 External Limits of Scope
The code point repertoire for the root zone being a very special case at the top of the protocol hierarchies, the canvas of available characters for selection as part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:
i. The Unicode Chart
Out of all the characters that are needed by the script in question, if a particular character is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, and especially so in the Bangla-Asamiyā-Manipuri Writing System, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.
ii. IDNA Protocol
Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded, as far as possible, all the possible characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters of the Unicode repertoire from being part of domain names.
iii. Maximal Starting Repertoire (MSR)
The Root Zone LGR being the repertoire of characters which are going to be used for the creation of Root Zone TLDs, which in turn constitute an even more specialized case of domain names, the Root LGR procedure introduces additional exclusions on IDNA's allowed set of characters.
Example: Bangla Sign Avagraha ঽ (U+09BD), even if allowed by the IDNA protocol, is not permitted in the Root Zone Repertoire as per the MSR.
To sum up, the restrictions start off with admitting only such characters as are part of the code block of the given script/language. The IDNA Protocol further narrows this down, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
4.2.1.1 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brāhmī-based scripts will not be included.
4.2.1.2 No Symbols and Abbreviations
Abbreviations, weights and measures and other such iconic characters like BANGLA ISSHAR ৺ (U+09FA), BANGLA CURRENCY DENOMINATOR SIXTEEN ৹ (U+09F9), etc. will also not be included.
4.2.1.3 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, such as Sanskritic VOCALIC RR ৠ (U+09E0) and VOCALIC L ঌ (U+098C), as well as VOCALIC LL ৡ (U+09E1) and the allographic -kāra forms of the latter two symbols: VOWEL SIGN VOCALIC L ৢ (U+09E2) and VOWEL SIGN VOCALIC LL ৣ (U+09E3). All such characters are excluded, which complies with the Conservatism principle as laid down in the Root Zone LGR procedure. However, in Bangla the -kāra corresponding to VOCALIC RR ৠ (U+09E0), which is VOWEL SIGN VOCALIC RR ৄ (U+09C4), is still in active use in certain limited borrowed or Sanskritic words and is therefore retained.
4.2.1.4 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit will not be included. This is also in consonance with the Letter principle as laid down in the Root Zone LGR procedure.
4.2.1.5 ABNF
The Augmented Backus-Naur Formalism (ABNF) is described in Section 5.4.1 and the Appendix (Section 10.1).
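The three successive filters of Section 4.2.1 can be sketched as a small classification function. Only the first filter below consults real data (Python's bundled Unicode database); the two exclusion sets are tiny illustrative samples taken from examples in this document, not the actual IDNA or MSR tables, and all names are hypothetical.

```python
import unicodedata

BENGALI_BLOCK = range(0x0980, 0x0A00)

# Illustrative samples only, NOT the real tables:
# precomposed nukta letters are unavailable under IDNA (see Section 3.3.6),
# and the avagraha is the MSR example given above.
SAMPLE_IDNA_EXCLUDED = {0x09DC, 0x09DD, 0x09DF}
SAMPLE_MSR_EXCLUDED = {0x09BD}


def classify(cp: int,
             idna_excluded=SAMPLE_IDNA_EXCLUDED,
             msr_excluded=SAMPLE_MSR_EXCLUDED) -> str:
    """Report the first filter, if any, that removes a code point."""
    if cp not in BENGALI_BLOCK:
        return "outside the Bengali block"
    if unicodedata.name(chr(cp), None) is None:
        return "not encoded in Unicode"
    if cp in idna_excluded:
        return "excluded by IDNA"
    if cp in msr_excluded:
        return "excluded by the MSR"
    return "candidate for the repertoire"


for cp in (0x0984, 0x09DC, 0x09BD, 0x0995):
    print(f"U+{cp:04X}: {classify(cp)}")
```

A code point that passes all three filters is only a candidate: the further exclusion principles (punctuation, symbols, rare and obsolete characters) still apply before it enters the repertoire.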
5 Repertoire
The Bangla Writing System is represented in UNICODE using the Bengali (Bangla) script name as enumerated in ISO 15924, corresponding to languages such as Asamiyā (Assamese), Bangla (Bengali) and Manipuri. The BENGALI block used for Bangla-Asamiyā-Manipuri in UNICODE has 93 entries. This section details the code-point repertoire that the Neo-Brāhmī Generation Panel [NBGP] proposes to be included in the Bangla LGR.
It may be mentioned here that the Government of Assam submitted a proposal to the Bureau of Indian Standards (BIS) on 26th February 2016 for dis-unification of the Bangla and Asamiyā scripts. The BIS, in its 8th Meeting of the Indian Language Technologies and Products Sectional Committee, LITD 20, held on 23rd Aug 2017, decided to refer the proposal for recognition of the Assamese script in ISO/IEC 10646 to ISO. Until the UNICODE Consortium takes any further action, it will be assumed that the Code Point Repertoire under Table 11 will be valid for all the three languages as above.
For each of the code points, language references have been given in the last column, titled Reference, under Table 8 titled "Code Point Repertoire". For entire coverage of Bangla code points, references of Bangla, Asamiyā (Assamese), Manipuri (Meitei) and Bishnupriya are given. Kokborok written in Bangla script is not known to have introduced many new complications, except for one particular character. Though only a few representative languages under EGIDS Scale 1-4 have been chosen for referencing, they together cover all the code points required for all the languages that NBGP has considered, as given under Bangla Unicode Points (as given in UNICODE 6.3).
However, before the details are presented, it is ideal to look at the Bangla Code Point Chart from the Maximal Starting Repertoire [MSR] Version 3. It may be noted that the shapes of the reference glyphs given below in the code charts are based on one of the many fonts designed and are not prescriptive, because there could be some variations in actual fonts, both UNICODE-compatible and True-Type ones. Consider the following code point table:
Figure 1: Bangla Code Page from [MSR] for Bangla-Asamiyā-Manipuri
Colour convention:
All characters that are included in the [MSR] - Yellow background
PVALID in IDNA 2008 but excluded from the [MSR] - Pinkish background
Not PVALID in IDNA 2008, or ineligible for the root zone (digits, hyphen) - White background
Given the Bangla Unicode Block as in Figure 1, of the code points that are included in the MSR, the following symbols will need a separate treatment:
ৎ U+09CE Bangla Letter Khanda-Ta
ৰ U+09F0 Asamiyā-Bangla Letter Ra With Middle Diagonal
ৱ U+09F1 Asamiyā-Bangla Letter Ra With Lower Diagonal
51 CodePointRepertoireInclusion
No UnicodeCodePoint
Glyph
CharacterName
Category Language(s)withEGIDSValue
ReferencesandComment
1 U+0981 BENGALISIGNCANDRABINDU
Candra-bindu
1Bangla2Manipuri2Assamese
[112][122][125]
2 U+0982 ং BENGALISIGNANUSVARA
Onushshar(Anusvara)
1Bangla2Manipuri2Assamese
[112][122][125]
3 U+0983 ঃ BENGALISIGNVISARGA
Biśarga(Visarga)
1Bangla2Manipuri2Assamese
[112][122][125]
4 U+0985 অ BENGALILETTERA
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
5 U+0986 আ BENGALILETTERAA
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
6 U+0987 ই BENGALILETTERI
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
7 U+0988 ঈ BENGALILETTERII
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
27
No UnicodeCodePoint
Glyph
CharacterName
Category Language(s)withEGIDSValue
ReferencesandComment
8 U+0989 উ BENGALILETTERU
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
9 U+098A ঊ BENGALILETTERUU
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
10 U+098B ঋ BENGALILETTERVOCALICR
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
11 U+098F এ BENGALILETTERE
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
12 U+0990 ঐ BENGALILETTERAI
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
13 U+0993 ও BENGALILETTERO
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
14 U+0994 ঔ BENGALILETTERAU
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
15 | U+0995 | ক | BENGALI LETTER KA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
16 | U+0996 | খ | BENGALI LETTER KHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
17 | U+0997 | গ | BENGALI LETTER GA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
18 | U+0998 | ঘ | BENGALI LETTER GHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
19 | U+0999 | ঙ | BENGALI LETTER NGA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
20 | U+099A | চ | BENGALI LETTER CA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
21 | U+099B | ছ | BENGALI LETTER CHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
22 | U+099C | জ | BENGALI LETTER JA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
23 | U+099D | ঝ | BENGALI LETTER JHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
24 | U+099E | ঞ | BENGALI LETTER NYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
25 | U+099F | ট | BENGALI LETTER TTA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
26 | U+09A0 | ঠ | BENGALI LETTER TTHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
27 | U+09A1 | ড | BENGALI LETTER DDA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
28 | U+09A1 U+09BC (U+09DC) | ড় | Normalized form of BENGALI LETTER RRA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]. U+09DC is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development.
29 | U+09A2 | ঢ | BENGALI LETTER DDHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
30 | U+09A2 U+09BC (U+09DD) | ঢ় | Normalized form of BENGALI LETTER RHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]. U+09DD is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development.
31 | U+09A3 | ণ | BENGALI LETTER NNA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
32 | U+09A4 | ত | BENGALI LETTER TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
33 | U+09A5 | থ | BENGALI LETTER THA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
34 | U+09A6 | দ | BENGALI LETTER DA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
35 | U+09A7 | ধ | BENGALI LETTER DHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
36 | U+09A8 | ন | BENGALI LETTER NA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
37 | U+09AA | প | BENGALI LETTER PA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
38 | U+09AB | ফ | BENGALI LETTER PHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
39 | U+09AC | ব | BENGALI LETTER BA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
43 | U+09AF U+09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]. U+09DF is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development.
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112], [125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
50 | U+09BE | া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
51 | U+09BF | ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
52 | U+09C0 | ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
53 | U+09C1 | ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112], [122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (= Halant), Virama (= Da ri) | 1 Bangla, 2 Assamese, 2 Manipuri | [112], [122], [125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122], [125]
Table 8: Bangla Code-Point Repertoire
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes some specific sequences. These enable conditional inclusion of BENGALI LETTER A and BENGALI LETTER E followed by BENGALI SIGN VIRAMA and BENGALI LETTER YA, again followed by BENGALI VOWEL SIGN AA, in the repertoire, so that the æ sound, as in English 'bat', 'cat', etc., can be represented.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not an exhaustive list) | Reference
S1 | U+0985 U+09CD U+09AF U+09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112], [122]
S2 | U+098F U+09CD U+09AF U+09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9: Sequences
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find a place in Unicode but have not been included in the repertoire in the LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.
Sr. No. | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10: Excluded Code Points
5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire since it will never be used alone. It will be used only as part of a sequence, in the normalized forms of the three special characters ড়, ঢ় and য়.
Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য, to form ড়, ঢ় and য় respectively.
Table 10b: Excluded Code Points
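The relationship between the three precomposed letters and their base + nukta sequences can be checked with a short sketch. It assumes Python's standard `unicodedata` module and Unicode NFC normalization; the helper name `to_normalized_form` is hypothetical and is not part of the LGR.

```python
import unicodedata

# Precomposed letters and the base + NUKTA sequences listed in Table 8.
PRECOMPOSED_TO_SEQUENCE = {
    "\u09DC": "\u09A1\u09BC",  # RRA -> DDA  + NUKTA
    "\u09DD": "\u09A2\u09BC",  # RHA -> DDHA + NUKTA
    "\u09DF": "\u09AF\u09BC",  # YYA -> YA   + NUKTA
}


def to_normalized_form(text: str) -> str:
    """Return the NFC form of text (hypothetical helper name)."""
    return unicodedata.normalize("NFC", text)


if __name__ == "__main__":
    for precomposed, sequence in PRECOMPOSED_TO_SEQUENCE.items():
        normalized = to_normalized_form(precomposed)
        print(f"U+{ord(precomposed):04X} ->",
              " ".join(f"U+{ord(ch):04X}" for ch in normalized),
              "| matches Table 8:", normalized == sequence)
```

Because these three letters are excluded from recomposition under NFC, a label typed with U+09DC, U+09DD or U+09DF normalizes to the two-code-point form, which is why only the sequences appear in the repertoire.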
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for .भारत or .ভারত, drafted between 20 November 2009 and 18 July 2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) began with the following variables:
C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta
The Augmented Backus-Naur Formalism (ABNF) will use the following operators:
Sr. Number | Operator | Function
1 | "|" | Alternative
2 | "[]" | Optional
3 | "*" | Variable Repetition
4 | "()" | Sequence Group
Table 11: The ABNF Formalism
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given to facilitate understanding.
5.4.2 The Vowel Sequence
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given. To facilitate understanding for users of other Brāhmī scripts, equivalents in Devanāgarī are provided wherever necessary.
A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra/onuʃʃār (B), a Candrabindu (D) or a Visarga/biʃɔrgo (X). The number of D, B or X which can follow a V in Bangla may not be restricted to one. Going by the rules illustrated in the document, it is clear that formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have sequences that combine a candrabindu with an anusvāra on the one hand, and a candrabindu with a visarga on the other, as in হ্যাঁংচা 'hænchā' and হ্যাঁঃ 'hæn' respectively. The possibility of a Visarga or Anusvāra (onuʃʃār) following a Candrabindu exists in Bangla.
A vowel can optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phala. Ya-phala is a presentation form of U+09AF, the Bangla letter য or 'ya'. Represented by the sequence <U+09CD ্ BENGALI SIGN VIRAMA (Bangla SIGN Hasanta or VIRAMA), U+09AF য BENGALI LETTER YA>, Ya-phala has a special form ্য. Again, when combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. 'aa' (ā)), it is used for transcribing [æ], as in the "a" in the English word "bat", written in Bangla as ব্যাট.
A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example: V অ अ
5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples:
VB অং अं
VD অঁ अँ
VX অঃ अः
VDB অঁং अँं
VDX অঁঃ अँः
VHCM অ্যা, এ্যা
5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
VHCMB অ্যাং, এ্যাং
VHCMD অ্যাঁ, এ্যাঁ
VHCMX অ্যাঃ, এ্যাঃ
VHCMDB অ্যাঁং, এ্যাঁং
VHCMDX অ্যাঁঃ, এ্যাঁঃ
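The vowel-sequence combinations of 5.4.2.1 to 5.4.2.3 can be sketched as one regular expression. This is only an illustration under stated assumptions: the character classes are taken from Table 8 (base consonants only, without the nukta forms), and the names `VOWEL_SEQUENCE` and `is_vowel_sequence` are hypothetical.

```python
import re

# Character classes from Table 8 (assumption: base code points only).
V = "[\u0985-\u098B\u098F\u0990\u0993\u0994]"      # independent vowels
C = "[\u0995-\u09A8\u09AA-\u09B0\u09B2\u09B6-\u09B9]"  # consonants
M = "[\u09BE-\u09C4\u09C7\u09C8\u09CB\u09CC]"      # kara (matra)
H, D, B, X = "\u09CD", "\u0981", "\u0982", "\u0983"

# V [H C M] [D | B | X | DB | DX]
VOWEL_SEQUENCE = re.compile(f"{V}(?:{H}{C}{M})?(?:{D}[{B}{X}]?|[{B}{X}])?")


def is_vowel_sequence(text: str) -> bool:
    """True if text is exactly one vowel sequence in the sense of 5.4.2."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["\u0985", "\u0985\u0981\u0982", "\u0985\u0982\u0982"]:
        print([f"U+{ord(ch):04X}" for ch in sample], is_vowel_sequence(sample))
```

The pattern accepts the generic VHCM shape; restricting it to the two sequences S1 and S2 of Table 9 would be a further step that this sketch leaves out.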
5.4.3 The Consonant Sequence
5.4.3.1 A Single Consonant (C)
Example: C ক क
5.4.3.2 A Consonant with Conditions
A Consonant can optionally be followed by a dependent vowel sign, kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
CM কি कि
CB কং कं
CD কঁ कँ
CX কঃ कः
CH ক্ क् (pure consonant)
CDB কঁং कँं
CDX কঁঃ कँः
5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB or DX.
Examples:
CMB কীং कीं
CMD কাঁ काँ
CMX বীঃ वीः
CMDB কাঁং काँं
CMDX কাঁঃ काँः
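5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Examples:
CHC ন্ত → ন + ্ + ত (Devanāgarī: न + ् + त)
CHCHC ন্ত্র → ন + ্ + ত + ্ + র (Devanāgarī: न + ् + त + ् + र)
CHCHCHC ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য (Devanāgarī: न + ् + त + ् + र + ् + य)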
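The `*3(CH)C` production above can be sketched as a bounded repetition in a regular expression. The consonant class is an assumption drawn from Table 8 (base consonants only), and the names `CONSONANT_SEQUENCE` and `is_consonant_sequence` are hypothetical.

```python
import re

# Consonant class from Table 8 (assumption: base code points only).
C = "[\u0995-\u09A8\u09AA-\u09B0\u09B2\u09B6-\u09B9]"
H = "\u09CD"  # Hasanta / Virama

# *3(CH)C: zero to three consonant + Hasanta pairs, then a final consonant.
CONSONANT_SEQUENCE = re.compile(f"(?:{C}{H}){{0,3}}{C}")


def is_consonant_sequence(text: str) -> bool:
    """True if text is one to four consonants joined by Hasanta."""
    return CONSONANT_SEQUENCE.fullmatch(text) is not None


if __name__ == "__main__":
    cluster = "\u09A8\u09CD\u09A4\u09CD\u09B0"  # NA + H + TA + H + RA
    print(is_consonant_sequence(cluster))
```

The upper bound of three CH pairs mirrors the "up to 4" consonants stated in the text; a fifth consonant makes the match fail.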
5.4.3.5 Subsets
While considering its subsets, as a representative example we will consider the combination CHC only; however, the same is equally applicable to CHCHC and CHCHCHC.
[A] The combination may be followed by M, B, D, X, DB or DX.
Examples:
CHCM ক্কী → ক ্ ক ী (Devanāgarī: क्की → क ् क ी)
CHCB ক্কং → ক ্ ক ং (Devanāgarī: क्कं → क ् क ं)
CHCD ক্কঁ → ক ্ ক ঁ (Devanāgarī: क्कँ → क ् क ँ)
CHCX ক্কঃ → ক ্ ক ঃ (Devanāgarī: क्कः → क ् क ः)
CHCDB ক্কঁং → ক ্ ক ঁ ং (Devanāgarī: क्कँं → क ् क ँ ं)
CHCDX ক্কঁঃ → ক ্ ক ঁ ঃ (Devanāgarī: क्कँः → क ् क ँ ः)
[B] *3(CH)CM may further be followed by a B, D, X, DB or DX.
Examples:
CHCMB ক্কীং → ক ্ ক ী ং (Devanāgarī: क्कीं → क ् क ी ं)
sup3ং rarr ক ক ং 4क rarr क क
CHCMD ক্কাঁ → ক ্ ক া ঁ (Devanāgarī: क्काँ → क ् क ा ँ)
CHCMX ক্কীঃ → ক ্ ক ী ঃ (Devanāgarī: क्कीः → क ् क ी ः)
CHCMDB ক্কাঁং → ক ্ ক া ঁ ং (Devanāgarī: क्काँं → क ् क ा ँ ं)
CHCMDX ক্কাঁঃ → ক ্ ক া ঁ ঃ (Devanāgarī: क्काँः → क ् क ा ँ ः)
5.4.4 The Khanda-Ta Sequence
5.4.4.1 A single 'Khanda'-Ta (Z)
Example: Z ৎ
5.4.4.2 A Khanda Ta Combination¹⁰
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) 'scolding'.
Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) could be described briefly here.
Let us take up S in the first instance. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ - BANGLA LETTER A and U+098F এ - BANGLA LETTER E). To render Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF (ya) after the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English 'bat', 'fat', etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But, as we see here, Bangla ends up with a deviant feature in the orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
The other case refers to the formation of repha and ra-phalā in the said script, mentioned in the table above as P. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance:
repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun")
ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra "cycle")
The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.
10 Refer to Rule P in Section 7, Table 16.
6 Variants
This section discusses the variants in the Bangla script. The NBGP categorizes these confusing variants in two groups:
Group 1: Confusing due to pure visual similarity.
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community.
For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These are not cases of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formation. They can cause confusion even to a careful observer, and hence are being proposed as variants.
Variants are generated in a script when two or more forms are formed with different storage or code points. In Bangla, the e-kāra, ā-kāra and o-kāra have different code points. One can type o with a consonant in one go, or type e-kāra and ā-kāra as two separate keys, and get the same result. A reader cannot differentiate between the two forms of ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered a case of variant, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants
However, we propose two cases of true in-script variants in the Bangla script.
CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases wherein (U+09A5) থ (tha) appears with Hasanta as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, could be a little confusing, as the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final position (as shown below in example no. 1). It could be typed wrongly as well, thinking it was a হ (ha) U+09B9, increasing the chances of risk in label writing and identification.
Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)
The fonts which represent the traditional Bangla writing system could tend to create this problem. Therefore these may be taken as cases of variants in Bangla.
CASE II
Another interesting example of variants is encountered in ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra).
'Repha' could be formed by two sequences (mainly because both Assamese and Bangla find a place in the same UNICODE points, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here the final ligatures look the same and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C
Where
B_RA = U+09B0 BENGALI LETTER RA (র)
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)
Example:
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12: Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. Addition of Repha does not make any difference.
'Ra-phala' could be formed by two sequences on similar grounds, and the final ligatures would look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
Where C1 = any consonant except Khanda-ta
Example:
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusability for end users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP concluded that র and ৰ should be defined as variant code points, since a single variant set between র and ৰ covers all cases. But this will create blocked variant labels: for example, if someone registers "ররর", the variant label "ৰৰৰ" will be generated as a variant and will be blocked, and vice versa. However, it is only blocked at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, that is still possible.
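How one variant set between র and ৰ produces blocked labels can be sketched as follows. This is an illustration only: it assumes that every mixed combination of the two code points also counts as a variant label, and the function name `variant_labels` is hypothetical.

```python
from itertools import product

B_RA = "\u09B0"  # BENGALI LETTER RA
A_RA = "\u09F0"  # BENGALI LETTER RA WITH MIDDLE DIAGONAL


def variant_labels(label: str) -> set[str]:
    """Return every label obtained by swapping B_RA and A_RA in any position.

    Assumption: all mixed combinations are treated as variants; the
    applied-for label itself is excluded from the result.
    """
    options = [(B_RA, A_RA) if ch in (B_RA, A_RA) else (ch,) for ch in label]
    return {"".join(combo) for combo in product(*options)} - {label}


if __name__ == "__main__":
    blocked = variant_labels(B_RA * 3)
    print(len(blocked), (A_RA * 3) in blocked)
```

Labels of a different length, such as ৰৰ or ৰৰৰৰ, never appear in the result for "ররর", which matches the remark that blocking applies at the label level only.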
6.2 Cross-Script Variants
A crisp cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia¹¹ (formerly Oriya), keeping in mind the visual and technical confusion they may cause as labels in the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain characters which are somewhat similar, they have not been included here. They have been provided in the Appendix (10.3) for reference.
11 Unicode uses Oriya for the script, although Odia is now the official term used.
1. Bangla and Nāgarī/Devanāgarī Script
Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14: Bangla and Devanāgarī cross-script variant code points
2. Bangla and Gurmukhī Script
Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15: Bangla and Gurmukhī cross-script variant code points
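Tables 14 and 15 can be held in a small lookup, sketched below. The mapping covers only the Bangla-to-other-script pairs given in the two tables; whether the Devanāgarī and Gurmukhī code points are also variants of each other is not stated here, so the sketch does not assume it. The names `BANGLA_CROSS_SCRIPT` and `are_cross_script_variants` are hypothetical.

```python
# Bangla code point -> cross-script variant code points (Tables 14 and 15).
BANGLA_CROSS_SCRIPT = {
    "\u09AE": ("\u092E", "\u0A38"),  # MA: Devanagari MA, Gurmukhi SA
    "\u09BF": ("\u093F", "\u0A3F"),  # VOWEL SIGN I: Devanagari, Gurmukhi
}


def are_cross_script_variants(a: str, b: str) -> bool:
    """True if a and b form one of the Bangla/other-script pairs, in either order."""
    return (b in BANGLA_CROSS_SCRIPT.get(a, ())
            or a in BANGLA_CROSS_SCRIPT.get(b, ()))


if __name__ == "__main__":
    print(are_cross_script_variants("\u09AE", "\u0A38"))
    print(are_cross_script_variants("\u09AE", "\u0A2E"))
```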
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in Bangla¹² script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications.
Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories, as mentioned in the table provided in the Code point repertoire (Section 5.1).
C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), or (æ) Ya-phalā (V1 H C1 M1), where V1 is either U+0985 (অ - BENGALI LETTER A) or U+098F (এ - BENGALI LETTER E); H is U+09CD (্ - BENGALI SIGN VIRAMA); C1 is U+09AF (য - BENGALI LETTER YA); M1 is U+09BE (া - BENGALI VOWEL SIGN AA). S1 and S2 are valid even if they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either U+09B0 (র - BENGALI LETTER RA) or U+09F0 (ৰ - ASSAMESE LETTER RA; Unicode name BENGALI LETTER RA WITH MIDDLE DIAGONAL); H is U+09CD (্ - BENGALI SIGN VIRAMA).
Table 16: Symbols used in WLE rules
12 As used by Unicode, denoting and including both Assamese and Maṇipuri.
It is also perhaps ideal to mention here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters" that could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, p. 4]. More details of such combinations can be seen in Pabitra Sarkar (1993) [135].
7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়
2. H must be preceded by C
Example:
3. M must be preceded by C
Example: কা
4. D must be preceded by either of V, C or M
Example: আঁ, খঁ, খাঁ, হাঁ
5. X must be preceded by either of V, C, M or D
Example: উঃখঃবঃাঃ দ ঃ
6. B must be preceded by either of V, C, M or D
Example: আং, ইং, কং
7. Z must be preceded by V, C, M, D, B, X or P
Example: ইৎকৎাৎাৎপ6ৎrৎ
(S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H
Details in 7.1.1, Case of V preceded by H
9. S CANNOT be preceded by H
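The nine rules above can be sketched as a small validator. This is not the normative LGR (which is the machine-readable XML); it is a minimal illustration that assumes the input is already in NFC, takes the code point sets from Table 8, and uses hypothetical names (`tokenize`, `is_valid_label`).

```python
NUKTA, H, Z = 0x09BC, 0x09CD, 0x09CE
D, B, X = 0x0981, 0x0982, 0x0983
VOWELS = set(range(0x0985, 0x098C)) | {0x098F, 0x0990, 0x0993, 0x0994}
CONSONANTS = (set(range(0x0995, 0x09A9)) | set(range(0x09AA, 0x09B1))
              | {0x09B2, 0x09B6, 0x09B7, 0x09B8, 0x09B9, 0x09F0, 0x09F1})
MATRAS = set(range(0x09BE, 0x09C5)) | {0x09C7, 0x09C8, 0x09CB, 0x09CC}
NUKTA_BASES = {0x09A1, 0x09A2, 0x09AF}   # rule 1: CN = base + NUKTA
RA = {0x09B0, 0x09F0}                    # C2 in symbol P
S_SEQUENCES = {(0x0985, 0x09CD, 0x09AF, 0x09BE),
               (0x098F, 0x09CD, 0x09AF, 0x09BE)}
SIMPLE = {D: "D", B: "B", X: "X", H: "H", Z: "Z"}

# Rules 2-7: categories allowed immediately before each symbol.
ALLOWED_BEFORE = {
    "H": {"C"}, "M": {"C"}, "D": {"V", "C", "M"},
    "X": {"V", "C", "M", "D"}, "B": {"V", "C", "M", "D"},
    "Z": {"V", "C", "M", "D", "B", "X"},
}


def tokenize(label):
    """Map a label to (category, first code point) pairs.

    Returns None if a code point falls outside the repertoire.
    """
    cps = [ord(ch) for ch in label]
    tokens, i = [], 0
    while i < len(cps):
        cp = cps[i]
        if tuple(cps[i:i + 4]) in S_SEQUENCES:
            tokens.append(("S", cp)); i += 4
        elif cp in NUKTA_BASES and cps[i + 1:i + 2] == [NUKTA]:
            tokens.append(("C", cp)); i += 2
        elif cp in CONSONANTS:
            tokens.append(("C", cp)); i += 1
        elif cp in VOWELS:
            tokens.append(("V", cp)); i += 1
        elif cp in MATRAS:
            tokens.append(("M", cp)); i += 1
        elif cp in SIMPLE:
            tokens.append((SIMPLE[cp], cp)); i += 1
        else:
            return None
    return tokens


def is_valid_label(label):
    """Check rules 1-9 on an NFC label; True if no rule is violated."""
    tokens = tokenize(label)
    if not tokens:
        return False
    prev, prev_cp, prev_is_p = None, None, False
    for cat, cp in tokens:
        if cat in ALLOWED_BEFORE:
            # Rule 7 also admits P (RA + Hasanta) before Z.
            if prev not in ALLOWED_BEFORE[cat] and not (cat == "Z" and prev_is_p):
                return False
        elif cat in ("V", "S") and prev == "H":   # rules 8 and 9
            return False
        prev_is_p = cat == "H" and prev == "C" and prev_cp in RA
        # S ends with a matra, so whatever follows sees it as M.
        prev, prev_cp = ("M" if cat == "S" else cat), cp
    return True
```

Treating S1 and S2 as single tokens is what lets Hasanta follow a vowel inside them while rule 2 still rejects a bare vowel + Hasanta elsewhere.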
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in admitting such ligatures: any user can easily create them, even though they may not be attested combinations.
7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া bæŋk ʌv ɪndiə (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE) (meaning "Bank of India"). This is a case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation, necessitated by the lack of a hyphen, space or the Zero Width Non-Joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic communities; hence this is explicitly prohibited by the NBGP.
In future, if required, depending on the prevailing requirements from the community, a future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules the ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.
1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Example:
্ক ्क
াক ाक
ংক ंक
ঁক ँक
ঃক ःक
As can be seen, such a combination will automatically result in a "golu", or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.
2. H is not permitted after V, B, D, X, M, S.
Example:
অ্ अ्
অং্ कं्
কঁ্ कँ्
কঃ্ कः्
কি্ कि्
3. The number of B, D or X permitted after a Consonant, a Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalid.
Example:
কংং कंं
কঁঁ कँँ
কঃঃ कःः
কাঁঁ काँँ
কীঃঃ कीःः
অংং अंं
অঁঁ अँँ
অঃঃ अःः
4. The number of M permitted after a Consonant is restricted to one.
Example:
কীী कीी
5. M is not permitted after V.
Example:
ইা, ঈৌ ईा, ईौ
6. The combinations Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Example:
কংঃ कंः
কঃং कःं
8 Contributors
8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members
8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof. Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof. Rafiqul Islam, National Professor of Humanities, Dhaka
Prof. Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof. Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof. Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md. Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md. Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md. Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof. Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md. Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md. Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of the Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals' Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata; New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje & Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. 'Bangla Harapher Panch Parba'. In Swapan Chakraborty, ed., Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and Language Planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the MS University of Baroda, Social Science Number, 29(2): 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds., Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25 November 2017.
[124] Ethnologue: Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25 November 2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20 October 2019.
[126] Wikipedia: Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25 November 2017.
[129] Omniglot: Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10 May 2018.
[130] Wikipedia: Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25 November 2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20, accessed on 10 May 2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10 May 2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10 May 2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1(1): 23-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1(2): 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). 'Srīhatta-Nāgarī Lipir Utpatti o Bikāś'. Bangla Academy Patrika (Dhaka), Vol. 1(2) (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia: Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19 May 2018.
[139] Furui, Ryosuke (2015). 'Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century', in Bhairabi Prasad Sahu & Hermann Kulke, eds., Interrogating Political Systems: Integrative Processes and States in Pre-Modern India, Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). 'Phones of Bengali'. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646 (UNICODE). https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on 21 May 2018.
[145] Wikipedia: Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on 21 May 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on 21 May 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. 'Bangla Spelling Reform: the Long and Short of It'. Bangla Journal 19: 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. "What has happened So Far In terms of Script Reforms". Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka, on 10 July 2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 – Core Specification, Chapter 12, p. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when applied to a specific language or script, certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla¹³ in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold 'varga' (as defined under Table 5). Moreover, Bangla does not normally allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, such as the word য়্যাব্বড়ো (yæbbɔRo) from the poem 'Lichuchor' written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2. CH can come with Khanda Ta only in the case where C is ra (র) (U+09B0):
র্ৎ, as in ভর্ৎসনা
3. Only the following combinations with VHCM will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
10.2 'Sylheti Nagarī lipi' or 'Siloti'
This version of the Bangla script resembles the 'Kaithī' script (ISO 15924) used by accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, and was widely in use during the 1880s. There were several other names for Sylheti Nagarī or Siloti [129], such as 'Jalalabada Nagarī', 'Fula (flower) Nagarī', 'Muslim Nagarī' or 'Muhammad Nagarī'. It is said that Shah Jalala brought the script with him to Sylhet in the 13th-14th century [138], although some have suggested that it was an invention of the Afghan rulers of Sylhet [137]. Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here [138].
Table 17: The Script Table of Sylheti Nagarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not confusable enough to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable
Table 18 – Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhi
Bangla | Gurmukhi | NBGP Decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable
Table 19 – Bangla and Gurmukhi confusable code points
Bangla | Gurmukhi | NBGP Decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable
Table 20 – Bangla and Gurmukhi distinguishable code points
10.3.3 Bangla and Oriya (Odia)
Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable
Table 21 – Bangla and Oriya confusable code points
Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable
Table 22 – Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs
Consonants | Phonetic Value | Allographs: Clusters, Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (not privileged enough to have clusters as a first member) জ্ঝ (জ+ঝ), ঞ্ঝ (ঞ+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (not privileged enough to have clusters as a first member) ঙ্খ (ঙ+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value of its own, but is mostly pronounced as n
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒ The secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word, it is a vowel æ (as in শ্যামল, র‍্যাপার etc.). But after a non-initial consonant it just doubles that consonant in pronunciation (as in কার্য, ধার্য etc.). The র+য combination has two physical manifestations: র‍্য and র্য.
ক্য (ক+য), স্য (স+য), র‍্য (র+য) [র‍্য on its own is never used in Bangla orthography; র‍্যা is, but then its last two symbols, Ya-phala and ā-kāra, constitute a vowel sign representing the vowel অ্যা]
র r Two manifestations:
i. রেফ (repʰ) as the first member of a cluster, e.g. র্প, র্ৎ, র্দ্র, র্য, র্ধ্ব (র+ধ+ব) (earlier র্দ্ধ্ব = র+দ+ধ+ব, a four-term cluster) etc. (placed over the following consonant)
ii. র-ফলা (rɔ-pʰɔla) as the second or third member of a cluster (placed under the consonant it follows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃকঃ
অব
Vowel | Corresponding vowel sign (kāras / Mātrās)
Could appear after অ ‘A’ (U+0985) or any other vowel | ং U+0982 Anusvara
Could appear after অ ‘A’ (U+0985) or any other vowel | ঃ U+0983 Visarga
After any consonant | ্ U+09CD (Hasanta)
- | ঽ U+09BD Avagraha
Table 7 – Bangla Vowels with corresponding kāras
3.3.4 The Anusvāra (onuʃʃār) (ং - U+0982)
The Anusvāra or onuʃʃār in Bangla at times represents a homorganic nasal, but not always. It replaces a conjunct group of a ‘Nasal Consonant + Hasanta + Consonant’ where the second consonant belongs to the velar varga or set, as in লংকা. But it often also appears for such combinations involving non-velars as the last member of the combination, as in ল্যাংটা “naked” or ল্যাংচা “a kind of sweet; to limp”. Before a non-varga consonant, the Anusvāra represents a nasal sound that may have an alternative conjoined written symbol representing the corresponding nasal consonant of the particular set. Although Modern Hindi, Marathi and Konkani prefer the anusvāra to the corresponding half-nasal, in Bangla it is clearly demarcated where one must use the Anusvāra and where it has to be a conjunct cluster with a nasal as the first or the second component.
3.3.5 Nasalization: Candrabindu (ঁ - U+0981)
Candrabindu denotes nasalization of the preceding vowel, as in চাঁদ cad ‘moon’ (U+099A U+09BE U+0981 U+09A6). This sign, with a dot inside the half-moon mark, is used as a nasalization marker in many Brāhmī-based scripts [143].
3.3.6 Nukta (় - U+09BC)
The nukta sign does not exist in Bangla orthography. It is predominantly used in many Brāhmī-derived scripts such as Devanāgarī (for Hindi, Bodo, Maithili, Santali, Kashmiri and Sindhi). The term and the concept of nukta are borrowed in Bangla. The IDNA Protocol (RFC 5891) states that IDNs must be in Unicode Normalization Form C (NFC). RFC 7940 applies this requirement to LGRs. The definition of NFC in the Unicode Standard contains a number of composition exclusions. As a result, the Bangla letters য় YYA, ড় RRA and ঢ় RHA have to be represented in this LGR by the sequences (YA + Nukta: U+09AF + U+09BC), (DDA + Nukta: U+09A1 + U+09BC) and (DDHA + Nukta: U+09A2 + U+09BC) instead of the single code points YYA (U+09DF), RRA (U+09DC) and RHA (U+09DD), although the use of ‘Nukta’ is otherwise completely unnatural in Bangla.
It is noted that in the current Unicode Standard chart these characters are listed as additional consonants. As per the LGR Procedure, however, these decisions depend on the IDNA Protocol, through a set of procedures developed by the IETF. Even though the Unicode Standard also provides these three characters as atomic characters (for example 09DC for ড় [r], 09DD for ঢ় [rh] and 09DF for য় [y], each as a single keystroke), the IDNA protocol requires that they be treated as composite sequences built from code points of the Unicode Bengali block.
It may be noted that there could be sporadic attempts at, or cases of, writing Muslim names, Urdu poetic words and Perso-Arabic loan words with a nukta under ক (k), খ (kh), গ (g), জ (j) and ফ (ph), only for the sake of correct pronunciation and for maintaining the sanctity of the loan word. This amounts to making the Bangla writing system work like the IPA script. It is, however, not in use in Bangla writing or printing.
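The effect of the composition exclusions can be checked directly with a standard Unicode library. The following is a minimal sketch using Python's `unicodedata` module; the helper name `nfc_code_points` is hypothetical and is used only for illustration.

```python
import unicodedata

# Precomposed letters discussed above, written as escapes so that no editor
# silently normalizes them.
PRECOMPOSED = {"RRA": "\u09DC", "RHA": "\u09DD", "YYA": "\u09DF"}


def nfc_code_points(text):
    """Return the NFC form of `text` as a list of U+XXXX strings."""
    normalized = unicodedata.normalize("NFC", text)
    return [f"U+{ord(ch):04X}" for ch in normalized]


for name, char in PRECOMPOSED.items():
    # Each single code point comes back as base letter + U+09BC (nukta).
    print(name, nfc_code_points(char))
```

Because NFC never recomposes these three letters, a label that is required to be in NFC can only contain the two-code-point sequences, which is why the repertoire lists them as sequences.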
3.3.7 Visarga (biʃɔrgo) (ঃ - U+0983) and Avagraha (ঽ - U+09BD)
The Visarga (biʃɔrgo, U+0983) is frequently used in Bangla loan words borrowed from Sanskrit and represents a sound very close to h. One could quote as an example দুঃখ duhkho “sorrow”, “unhappiness” (U+09A6 U+09C1 U+0983 U+0996).
The Avagraha ঽ (U+09BD) is mainly used in Sanskrit, Pali, Prakrt or Maithili texts written in Bangla. It is gradually being replaced by an upper comma (e.g. নরোঽপরাণি rewritten as নরো’পরাণি). It is rarely used now, even in other languages using the Bangla script. In the case of the LGR, the Avagraha is not part of the repertoire: it has been decided not to retain Avagraha (ঽ) (U+09BD) because it is blocked in TLDs as per the Maximal Starting Repertoire (MSR). Please see Appendix II in Section 11 for a complete list of Bangla consonants and their allographs.
3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)
This note is pertinent to the use of the Zero Width Joiner (ZWJ) and the Zero Width Non-Joiner (ZWNJ) in Bangla. It needs to be noted that Nepali, Konkani and Hindi use these two signs in a different manner.
ZWJ (U+200D) and ZWNJ (U+200C) are code points provided by the Unicode Standard to instruct the rendering of a string where the script has the option between joining and non-joining characters. Without these control codes, the string may be rendered in an alternate form from what is intended.
Use of ZWJ
• Insofar as Bangla is concerned, ZWJ is used for the proper rendering of characters such as khaṇḍa-ta ৎ, as in সত্যজিৎ (satyajit) “Satyajit” and সৎ (sat) “honest”. This is typed as follows: ta + Hasanta + ZWJ (U+200D).
• However, ZWJ is more important where the same combination of consonantal characters is represented differently depending upon the context. E.g. র + ্ + য has two representations in Bangla: as র্য and as র‍্য. To get the form র্য, one has to type র + ্ + য, but for র‍্য the sequence would be র + ZWJ + ্ + য [154]. In other words, ZWJ is used in rendering words that demand a ya-phalā after ra, which is otherwise not possible to type (render), because the same order ra + hasanta + antastha ja is used in the medial and/or final position to type the repha on the consonant antastha ja, as in কার্য (kaarjo). In order to get a ya-phalā after the consonant ra, it is therefore obligatory to use ZWJ after ra, as in র‍্যাপার (wrapper), র‍্যাশ (rash), র‍্যালি (rally) etc. The typing sequence is given below:
ra (র) + ZWJ + hasanta (্) + antastha ja (য) = র‍্য
Use of ZWNJ
• ZWNJ is used in Bangla to represent the explicit Hasanta or Halant. In order to avoid conjunct formation in cases where there is an explicit hasanta before the succeeding consonant, the ZWNJ is used:
Consonant + hasanta + ZWNJ + consonant = explicit hasanta
Example: প্রাক্‌কথন (prakkathana, prakkɔtʰon)
The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Hasanta) where default conjunct formation is to be explicitly restricted and the Hasanta joining the two consonants participating in the conjunct formation needs to be explicitly shown.
The use of ZWJ and ZWNJ has been ruled out from the root zone by the [Procedure]. Used in Bangla to create alternate renderings, the insertion of these two signs can affect searching as well as NLP.
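The two typing sequences for ra + ya, and the consequence of excluding the join controls, can be sketched as follows. The helper names are hypothetical; the sketch only manipulates code point sequences and says nothing about how a particular font renders them.

```python
ZWJ, ZWNJ = "\u200D", "\u200C"
RA, HASANTA, YA = "\u09B0", "\u09CD", "\u09AF"

# ra + hasanta + ya: repha placed over ya.
REPH_OVER_YA = RA + HASANTA + YA
# ra + ZWJ + hasanta + ya: ra followed by ya-phala.
RA_WITH_YAPHALA = RA + ZWJ + HASANTA + YA


def contains_join_control(label):
    """True if the label uses ZWJ or ZWNJ, which the root zone excludes."""
    return any(ch in (ZWJ, ZWNJ) for ch in label)


def strip_join_controls(label):
    """Remove ZWJ/ZWNJ, leaving only the visible code points."""
    return "".join(ch for ch in label if ch not in (ZWJ, ZWNJ))
```

Stripping the join controls collapses both spellings into the same sequence, which illustrates why a form that depends on ZWJ cannot be distinguished in a root-zone label and why the presence of these signs affects searching.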
3.3.9 Use of Ya-phalaa
Ya-phalaa sequences are the two instances in Bangla where Hasanta is preceded by a full vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E):
• অ্যা 0985 09CD 09AF 09BE BENGALI LETTER A + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
• এ্যা 098F 09CD 09AF 09BE BENGALI LETTER E + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD Hasanta plus U+09AF ya preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound, as in English ‘acid’ অ্যাসিড, ‘association’ অ্যাসোসিয়েশন, ‘bat’ ব্যাট, ‘fat’ ফ্যাট, ‘mat’ ম্যাট, ‘cap’ ক্যাপ etc.
The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a ‘vowel killer’, although it actually indicates the absence of a vowel after the marked consonant. Only consonants can be marked with the Hasanta. But, as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
3.3.10 Formation of Ra-phalaa and Reph Sequences
This case refers to the formation of repha and ra-phalā as follows:
Ra-Hasanta = (C2 H), where C2 is either
09B0 (র - BENGALI LETTER RA) or
09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL),
and H is 09CD (্ - BENGALI SIGN VIRAMA).
Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALĀ). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka “the sun”); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র chakra “cycle”). The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases. The LGR makes a note of this point of concern with respect to the two RAs in disguise, as it would be completely impossible to distinguish between them with the naked eye in a label so generated, which may consequently lead to concerns related to spoofing and other kinds of cyber irregularities. The motive for classing these two code points as (blocking) variants is that fully rendered labels may mask the distinction between Bangla ra র (U+09B0) and Assamese ra ৰ (U+09F0). That provides the justification for Variant Set 4, though only in the context of a following Hasant. The difference between the RAs is distinguishable only if one looks at their Unicode values. Therefore labels such as অর্ক arka, শীর্ষ sīrsa ‘top, apex’, অভ্র abhra ‘cloud, the sky’ and শ্রম śrama ‘physical labour’ could be extremely dangerous, as the web user may never verify the digital content (the labels) against its Unicode code points. This point is made explicitly with reference to Table 9 (of sequences, p. 36) and Table 16 (of WLE symbols, p. 47), which are to follow. Moreover, it is noteworthy that the REPHA can also occur with KHANDA TA. The condition in this context is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
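The context-dependent variant relation between the two RAs can be sketched as a comparison key. This is an illustration under stated assumptions, not the LGR's normative variant definition: the names `collision_key` and `are_blocked_variants` are hypothetical, and "context of Hasanta" is taken here to mean that Hasanta immediately follows or precedes the ra, as described above for repha and ra-phalā.

```python
BANGLA_RA, ASSAMESE_RA, HASANTA = "\u09B0", "\u09F0", "\u09CD"


def collision_key(label):
    """Map Assamese ra to Bangla ra wherever it is adjacent to Hasanta."""
    out = []
    for i, ch in enumerate(label):
        if ch == ASSAMESE_RA:
            preceded = i > 0 and label[i - 1] == HASANTA
            followed = i + 1 < len(label) and label[i + 1] == HASANTA
            if preceded or followed:
                ch = BANGLA_RA
        out.append(ch)
    return "".join(out)


def are_blocked_variants(label_a, label_b):
    """True if two distinct labels differ only by ra/ra in a Hasanta context."""
    return label_a != label_b and collision_key(label_a) == collision_key(label_b)
```

Under this sketch, অর্ক spelled with U+09B0 and with U+09F0 collide, whereas a standalone র and ৰ (with no adjacent Hasanta) remain distinct.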
4 Overall Development Process and Methodology
The Neo-Brāhmī Generation Panel (NBGP) has been formed by members having experience in Linguistics (especially in NLP and Computational Linguistics), Literature, Language History and Epigraphy. Under the Neo-Brāhmī Generation Panel, Bangla and eight other scripts belonging to separate Unicode blocks are being taken up, with a separate LGR assigned to each. However, an attempt is made to ensure that the fundamental philosophy behind building those LGRs is consistent across all Brāhmī-derived scripts. The present LGR will cater to multiple languages belonging to EGIDS scale 1 to 4 (see Table 4) that use the Bangla script. The following guiding principles are used in making decisions about Bangla LGR code points.
4.1 Guiding Principles
The NBGP adopts the following broad principles for the selection of code points in the code-point repertoire, across the board, for all the Neo-Brāhmī scripts within its ambit.
4.1.1 Inclusion Principles
4.1.1.1 Modern Usage
Every character proposed should be in the everyday usage of a particular linguistic community. Characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code-point repertoire.
4.1.1.2 Unambiguous Use
Every character proposed should have an unambiguous understanding among linguists about its usage in the language.
4.2 Exclusion Principles
The main exclusion principle is that of External Limits on Scope. These consist of protocols or standards which are prerequisites to the Label Generation Rule-sets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.
4.2.1 External Limits of Scope
The code point repertoire for the root zone being a very special case at the top of the protocol hierarchies, the canvas of characters available for selection as part of the Root Zone code point repertoire is already constrained by the various protocol layers beneath it. The following three main protocols/standards act as successive filters:
i. The Unicode Chart
Out of all the characters that are needed by the script in question, if a particular character is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, and especially so in the Bangla-Asamiyā-Manipuri Writing System, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.
ii. IDNA Protocol
Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded, as far as possible, all the characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters of the Unicode repertoire from being part of domain names.
iii. Maximal Starting Repertoire (MSR)
The Root Zone LGR being the repertoire of characters which are going to be used for the creation of Root Zone TLDs, which in turn constitute an even more specialized case of domain names, the Root LGR procedure introduces additional exclusions on IDNA’s allowed set of characters.
Example: Bangla Sign Avagraha ঽ (U+09BD), even if allowed by the IDNA protocol, is not permitted in the Root Zone Repertoire as per the MSR.
To sum up, the restrictions start off by admitting only such characters as are part of the code block of the given script/language. The IDNA Protocol further narrows this down, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
4.2.1.1 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brāhmī-based scripts will not be included.
4.2.1.2 No Symbols and Abbreviations
Abbreviations, weights and measures and other such iconic characters like BANGLA ISSHAR ৺ (U+09FA), BANGLA CURRENCY DENOMINATOR SIXTEEN ৹ (U+09F9) etc. will also not be included.
4.2.1.3 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, such as the Sanskritic VOCALIC RR ৠ (U+09E0) and VOCALIC L “ঌ” (U+098C), as well as VOCALIC LL ৡ (U+09E1) and the allographic -kāra forms of the latter two symbols: VOWEL SIGN VOCALIC L (ৢ, U+09E2) and VOWEL SIGN VOCALIC LL (ৣ, U+09E3). All such characters are excluded, which complies with the Conservatism principle as laid down in the Root Zone LGR procedure. However, in Bangla the -kāra corresponding to VOCALIC RR ৠ (U+09E0), which is VOWEL SIGN VOCALIC RR (ৄ, U+09C4), is still in active use in certain limited borrowed or Sanskritic words and is therefore retained.
4.2.1.4 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit will not be included. This is also in consonance with the Letter principle as laid down in the Root Zone LGR procedure.
4.2.1.5 ABNF
The Augmented Backus-Naur Formalism (ABNF) is described in Section 5.4.1 and in the Appendix (Section 10.1).
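The three successive filters described in Section 4.2.1 amount to repeated set intersection. The sketch below uses tiny toy sets chosen only for illustration (they are not the real Unicode block, IDNA2008 PVALID set or MSR), and the function name `root_zone_candidates` is hypothetical.

```python
def root_zone_candidates(script_block, idna_pvalid, msr):
    """Apply the filters in order; each step can only remove code points."""
    after_unicode = set(script_block)          # i. encoded in Unicode
    after_idna = after_unicode & set(idna_pvalid)  # ii. allowed by IDNA
    after_msr = after_idna & set(msr)          # iii. allowed by the MSR
    return after_msr


# Toy data: KA, AVAGRAHA and ISSHAR from the Bengali block.
toy_block = {0x0995, 0x09BD, 0x09FA}
toy_pvalid = {0x0995, 0x09BD}   # assumption: the symbol U+09FA is not PVALID
toy_msr = {0x0995}              # Avagraha is excluded by the MSR (see above)

survivors = root_zone_candidates(toy_block, toy_pvalid, toy_msr)
print(sorted(f"U+{cp:04X}" for cp in survivors))
```

The Avagraha example from the text corresponds to a code point that survives the second filter but is removed by the third.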
5 Repertoire
The Bangla Writing System is represented in Unicode using the Bengali (Bangla) script name as enumerated in ISO 15924, corresponding to languages such as Asamiyā (Assamese), Bangla (Bengali) and Manipuri. The BENGALI block used for Bangla-Asamiyā-Manipuri in Unicode has 93 entries. This section details the code-point repertoire that the Neo-Brāhmī Generation Panel [NBGP] proposes to be included in the Bangla LGR.
It may be mentioned here that the Government of Assam submitted a proposal to the Bureau of Indian Standards (BIS) on 26th February 2016 for dis-unification of the Bangla and Asamiyā scripts. The BIS, in the 8th Meeting of its Indian Language Technologies and Products Sectional Committee, LITD 20, held on 23rd Aug 2017, decided to refer the proposal for recognition of the Assamese script in ISO/IEC 10646 to ISO. Until the Unicode Consortium takes any further action, it will be assumed that the Code Point Repertoire under Table 8 will be valid for all three languages named above.
For each of the code points, language references have been given in the last column, titled Reference, under Table 8, titled the “Code Point Repertoire”. For entire coverage of Bangla code points, references for Bangla, Asamiyā (Assamese), Manipuri (Meitei) and Bishnupriya are given. Kokborok written in Bangla script is not known to have introduced many new complications, except for one particular character. Though only a few representative languages under EGIDS Scale 1-4 have been chosen for referencing, they together cover all the code points required for all the languages that the NBGP has considered, as given under Bangla Unicode Points (as given in Unicode 6.3).
However, before the details are presented, it is ideal to look at the Bangla Code Point Chart from the Maximal Starting Repertoire [MSR] Version 3. It may be noted that the shapes of the reference glyphs given below in the code charts are based on one of the many fonts designed and are not prescriptive, because there could be some variations in actual fonts, both Unicode-compatible and True-Type ones. Consider the following code point table:
Figure 1 – Bangla Code Page from [MSR] for Bangla-Asamiyā-Manipuri
Colour convention:
• All characters that are included in the [MSR]: yellow background
• PVALID in IDNA2008 but excluded from the [MSR]: pinkish background
• Not PVALID in IDNA2008, or ineligible for the root zone (digits, hyphen): white background
Given the Bangla Unicode Block as in Figure 1, among the code points that are included in the MSR, the following symbols will need separate treatment:
• ৎ U+09CE Bangla Letter Khanda-Ta
• ৰ U+09F0 Asamiyā-Bangla Letter Ra With Middle Diagonal
• ৱ U+09F1 Asamiyā-Bangla Letter Ra With Lower Diagonal
5.1 Code Point Repertoire Inclusion
No | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
1 | U+0981 | ঁ | BENGALI SIGN CANDRABINDU | Candra-bindu | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
2 | U+0982 | ং | BENGALI SIGN ANUSVARA | Onushshar (Anusvara) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
3 | U+0983 | ঃ | BENGALI SIGN VISARGA | Biśarga (Visarga) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
4 | U+0985 | অ | BENGALI LETTER A | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
5 | U+0986 | আ | BENGALI LETTER AA | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
6 | U+0987 | ই | BENGALI LETTER I | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
7 | U+0988 | ঈ | BENGALI LETTER II | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
8 | U+0989 | উ | BENGALI LETTER U | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
9 | U+098A | ঊ | BENGALI LETTER UU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
10 | U+098B | ঋ | BENGALI LETTER VOCALIC R | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
11 | U+098F | এ | BENGALI LETTER E | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
12 | U+0990 | ঐ | BENGALI LETTER AI | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
13 | U+0993 | ও | BENGALI LETTER O | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
14 | U+0994 | ঔ | BENGALI LETTER AU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
15 | U+0995 | ক | BENGALI LETTER KA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
16 | U+0996 | খ | BENGALI LETTER KHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
17 | U+0997 | গ | BENGALI LETTER GA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
18 | U+0998 | ঘ | BENGALI LETTER GHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
19 | U+0999 | ঙ | BENGALI LETTER NGA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
20 | U+099A | চ | BENGALI LETTER CA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
21 | U+099B | ছ | BENGALI LETTER CHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
22 | U+099C | জ | BENGALI LETTER JA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
23 | U+099D | ঝ | BENGALI LETTER JHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
24 | U+099E | ঞ | BENGALI LETTER NYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
25 | U+099F | ট | BENGALI LETTER TTA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
26 | U+09A0 | ঠ | BENGALI LETTER TTHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
27 | U+09A1 | ড | BENGALI LETTER DDA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
28 | 09A1 09BC (U+09DC) | ড় | Normalized form of BENGALI LETTER RRA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]; 09DC is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
29 | U+09A2 | ঢ | BENGALI LETTER DDHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
30 | 09A2 09BC (U+09DD) | ঢ় | Normalized form of BENGALI LETTER RHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]; 09DD is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
31 | U+09A3 | ণ | BENGALI LETTER NNA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
32 | U+09A4 | ত | BENGALI LETTER TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
33 | U+09A5 | থ | BENGALI LETTER THA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
34 | U+09A6 | দ | BENGALI LETTER DA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
35 | U+09A7 | ধ | BENGALI LETTER DHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
36 | U+09A8 | ন | BENGALI LETTER NA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
37 | U+09AA | প | BENGALI LETTER PA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
38 | U+09AB | ফ | BENGALI LETTER PHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
39 | U+09AC | ব | BENGALI LETTER BA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
43 | 09AF 09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]; 09DF is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112] [125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
50 | U+09BE | া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
51 | U+09BF | ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
52 | U+09C0 | ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
53 | U+09C1 | ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112] [122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (= Halant), Virama (= Da ri) | 1 Bangla, 2 Assamese, 2 Manipuri | [112] [122] [125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122] [125]
Table 8 – Bangla Code-Point Repertoire
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes some specific sequences, which enable conditional inclusion of Bangla LETTER A and LETTER E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, again followed by Bangla VOWEL SIGN AA, in the repertoire, so as to include the æ sound as in English ‘bat’, ‘cat’ etc.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
S1 | 0985 09CD 09AF 09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112] [122]
S2 | 098F 09CD 09AF 09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9 – Sequences
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find a place in Unicode but have not been included in the repertoire in this LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.
Sr. No. | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10 – Excluded Code Points
5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire since it will never be used alone. It will be used only in sequences, in the normalized form of the three special characters ড়, ঢ় and য়.
Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড়, ঢ় and য় respectively
Table 10b – Excluded Code Points
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20-11-2009 and 18-07-2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) begins with the following variables:
C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta
The Augmented Backus-Naur Formalism (ABNF) will use the following operators:
Sr. Number | Operator | Function
1 | “|” | Alternative
2 | “[ ]” | Optional
3 | “*” | Variable Repetition
4 | “( )” | Sequence Group
Table 11 – The ABNF Formalism
5.4.2 The Vowel Sequence
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given. To facilitate understanding by users of other Brāhmī scripts, equivalents in Devanāgarī are provided wherever necessary.
A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra/onuʃʃār (B), a Candrabindu (D) or a Visarga/biʃɔrgo (X). The number of D, B or X signs which can follow a V in Bangla need not be restricted to one. Going by the rules illustrated in this document, it is clear that formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have sequences such as a chandrabindu followed by an anusvāra on the one hand and a chandrabindu followed by a visarga on the other, as in হ্যাঁংচা ‘hænchā’ and হ্যাঁঃ ‘hæn’ respectively. That is, the possibility of a Visarga or an Anusvāra (onuʃʃār) following a Candrabindu exists in Bangla.
A vowel can optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phala. “Ya-phala is a presentation form of U+09AF, the Bangla letter য or ‘ya’. Represented by the sequence <U+09CD ্, i.e. BENGALI SIGN VIRAMA (Bangla sign Hasanta or Virama), U+09AF য, BENGALI LETTER YA>, Ya-phala has a special form ্য. Again, when combined with U+09BE া, BENGALI VOWEL SIGN AA (i.e. ‘aa’ (ā)), it is used for transcribing [æ], as in the ‘a’ in the English word ‘bat’”, written in Bangla as ব্যাট.
A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel

Example:
V      অ      अ

5.4.2.2 A Vowel with Conditions

A Vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].

Examples:
VB     অং     अं
VD     অঁ     अँ
VX     অঃ     अः
VDB    অঁং    अँं
VDX    অঁঃ    अँः
VHCM   অ্যা, এ্যা

5.4.2.3 VHCM Sequence

A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].

Examples:
VHCMB    অ্যাং, এ্যাং
VHCMD    অ্যাঁ, এ্যাঁ
VHCMX    অ্যাঃ, এ্যাঃ
VHCMDB   অ্যাঁং, এ্যাঁং
VHCMDX   অ্যাঁঃ, এ্যাঁঃ
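The vowel-sequence patterns above can be sketched as a single regular expression over the class symbols. This is only an illustrative sketch: the names `VOWEL_SEQUENCE` and `is_vowel_sequence` are hypothetical, the input is assumed to be already converted to class symbols, and the further restriction that VHCM is limited to অ/এ + Hasanta + য + া is not checked at this level.

```python
import re

# Class symbols as defined in this section:
# V vowel, H hasanta, C consonant, M kāra (mātrā),
# B anusvāra, D candrabindu, X visarga.
# Assumption: the optional tail is at most one of B, D, X, DB or DX.
VOWEL_SEQUENCE = re.compile(r"V(HCM)?(DB|DX|B|D|X)?")


def is_vowel_sequence(symbols: str) -> bool:
    """Return True if a string of class symbols is a valid vowel sequence."""
    return VOWEL_SEQUENCE.fullmatch(symbols) is not None


if __name__ == "__main__":
    for candidate in ["V", "VB", "VDB", "VHCM", "VHCMDX", "VDD", "VBB", "VXX"]:
        print(candidate, is_vowel_sequence(candidate))
```

Under these assumptions, V, VB, VDB, VHCM and VHCMDX are accepted, while VDD, VBB and VXX are rejected, matching the statement in 5.4.2 that the latter are invalid orthographic units.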
5.4.3 The Consonant Sequence

5.4.3.1 A Single Consonant (C)

Example:
C      ক      क

5.4.3.2 A Consonant with Conditions

A Consonant can optionally be followed by a dependent vowel sign kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].

Examples:
CM     কি     कि
CB     কং     कं
CD     কঁ     कँ
CX     কঃ     कः
CH     ক্     क्   (Pure consonant)
CDB    কঁং    कँं
CDX    কঁঃ    कँः

5.4.3.3 CM Sequence

A CM sequence can optionally be followed by B, D, X, DB or DX.

Examples:
CMB    কীং    कीं
CMD    কাঁ    काँ
CMX    বীঃ    वीः
CMDB   কাঁং   काँं
CMDX   কাঁঃ   काँः

5.4.3.4 Sequence of Consonants

A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):

*3(CH)C

Examples:
CHC       ন্ত     → ন + ্ + ত                  न + ् + त
CHCHC     ন্ত্র    → ন + ্ + ত + ্ + র          न + ् + त + ् + र
CHCHCHC   ন্ত্র্য   → ন + ্ + ত + ্ + র + ্ + য   न + ् + त + ् + र + ् + य
5.4.3.5 Subsets

While considering the subsets, we will take the combination CHC only as a representative example; the same is equally applicable to CHCHC and CHCHCHC.

[A] The combination may be followed by M, B, D, X, DB or DX.

Examples:
CHCM    ক্কী    → ক ্ ক ী       क्की   → क ् क ी
CHCB    ক্কং    → ক ্ ক ং       क्कं   → क ् क ं
CHCD    ক্কঁ    → ক ্ ক ঁ       क्कँ   → क ् क ँ
CHCX    ক্কঃ    → ক ্ ক ঃ       क्कः   → क ् क ः
CHCDB   ক্কঁং   → ক ্ ক ঁ ং     क्कँं  → क ् क ँ ं
CHCDX   ক্কঁঃ   → ক ্ ক ঁ ঃ     क्कँः  → क ् क ँ ः

[B] *3(CH)CM may further be followed by B, D, X, DB or DX.

Examples:
CHCMB    ক্কীং    → ক ্ ক ী ং      क्कीं   → क ् क ी ं
CHCMD    ক্কাঁ    → ক ্ ক া ঁ      क्काँ   → क ् क ा ँ
CHCMX    ক্কীঃ    → ক ্ ক ী ঃ      क्कीः   → क ् क ी ः
CHCMDB   ক্কাঁং   → ক ্ ক া ঁ ং    क्काँं  → क ् क ा ँ ं
CHCMDX   ক্কাঁঃ   → ক ্ ক া ঁ ঃ    क्काँः  → क ् क ा ँ ः
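The consonant-sequence patterns of 5.4.3 can be sketched in the same way. The names below are hypothetical, and the reading of the ABNF is an assumption: up to three leading "CH" pairs, a final consonant, an optional M, an optional tail (B, D, X, DB or DX), plus the bare "CH" pure-consonant form of 5.4.3.2.

```python
import re

# Assumed reading of 5.4.3: *3(CH) C [M] [B / D / X / DB / DX], or a bare CH.
CONSONANT_SEQUENCE = re.compile(r"(CH){0,3}CM?(DB|DX|B|D|X)?|CH")


def is_consonant_sequence(symbols: str) -> bool:
    """Return True if a string of class symbols is a valid consonant sequence."""
    return CONSONANT_SEQUENCE.fullmatch(symbols) is not None


if __name__ == "__main__":
    for candidate in ["C", "CM", "CH", "CHC", "CHCHCHC", "CHCMDB", "CHCHCHCHC", "CMM"]:
        print(candidate, is_consonant_sequence(candidate))
```

With this reading, a cluster of five consonants (CHCHCHCHC) and a doubled Mātrā (CMM) are rejected, while every pattern listed in 5.4.3.1 to 5.4.3.5 is accepted.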
5.4.4 The Khanda-Ta Sequence

5.4.4.1 A Single 'Khanda'-Ta (Z)

Example:
Z      ৎ   (a marked form of ত + ্)

5.4.4.2 A Khanda Ta Combination (see footnote 10)

A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):

[CH]Z

Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) 'scolding'.

Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
5.4.5 Special Cases S and P

Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) can be described briefly here.

Let us take up S first. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ - BANGLA LETTER A and U+098F এ - BANGLA LETTER E). For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF ya, preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound, as in English 'bat', 'fat' etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in its orthography, wherein Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).

The other case, mentioned in the table as P, refers to the formation of repha and ra-phalā in the script. Owing to its co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance:

repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun")
ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra "cycle")

The point is that in both cases the slot for ra could be filled by the Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.

10 Refer to Rule P in Section 7, Table 16.
6 Variants

This section discusses the variants in the Bangla script. The NBGP categorizes these confusing variants into two groups:

Group 1: Confusing due to pure visual similarity.
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community.

For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These are not cases of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formation. They can cause confusion even to a careful observer and are hence proposed as variants.

Variants arise in a script when two or more identical-looking forms are produced from different stored code point sequences. In Bangla, the e-kāra, ā-kāra and o-kāra have different code points. One can type o-kāra with a consonant in one go, or obtain the same result by typing e-kāra and ā-kāra as two separate keys. A reader cannot differentiate between the two forms of ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered a case of variants, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants

However, we propose two cases of true in-script variants in the Bangla script.

CASE I

As far as true variants in Bangla are concerned, we may draw attention to cases wherein (U+09A5) থ (tha) appears with Hasanta as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):

1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus
   স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus
   ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)

The above combinations, if written in traditional orthography, can be a little confusing, since the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final position (as shown below in example 1). It could also be typed wrongly, on the assumption that it was a হ (ha) U+09B9, increasing the risks in label writing and identification.

Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)

Fonts which represent the traditional Bangla writing system tend to create this problem. Therefore these may be taken as cases of variants in Bangla.

CASE II

Another interesting example of variants is encountered in the ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra).

'Repha' could be formed by two sequences (mainly because both Assamese and Bangla are encoded in the same Unicode block, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here the final ligatures look the same and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C

Where:
B_RA = U+09B0 BENGALI LETTER RA (র)
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H    = U+09CD BENGALI SIGN VIRAMA (্)
C    = any consonant (theoretically)

Example:

Sequence 1 (Using Bangla RA)          Ligature 1   Sequence 2 (Using Assamese RA)        Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক)      র্ক          U+09F0 (ৰ) U+09CD (্) U+0995 (ক)      ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ)      র্ঠ          U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ)      ৰ্ঠ

Table 12: Example of Repha
Note: As the Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. The addition of Repha does not make any difference.

'Ra-phala' could be formed by two sequences on similar grounds, and the final ligatures would look the same:

(1) C1 + H + B_RA
(2) C1 + H + A_RA

Where C1 = any consonant except Khanda-ta.

Example:

Sequence 1 (Using Bangla RA)          Ligature 1   Sequence 2 (Using Assamese RA)        Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র)      ক্র          U+0995 (ক) U+09CD (্) U+09F0 (ৰ)      ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র)      ন্র          U+09A8 (ন) U+09CD (্) U+09F0 (ৰ)      ন্ৰ

Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, they could be confusable for end-users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP concluded to define র and ৰ as variant code points, so that a single variant set between র and ৰ covers all cases. This will, however, create blocked variant labels: for example, if someone registers "ররর", the variant label "ৰৰৰ" will be generated as a variant and will be blocked, and vice versa. The blocking applies only at the label level: if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, that is still possible.
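The label-level blocking described above can be illustrated with a small sketch. It is not the LGR tooling itself: `variant_key` and `is_blocked` are hypothetical helper names, and the sketch assumes that every label differing from a registered label only in the choice between র (U+09B0) and ৰ (U+09F0) is a blocked variant.

```python
from typing import Iterable

BANGLA_RA = "\u09B0"    # র BENGALI LETTER RA
ASSAMESE_RA = "\u09F0"  # ৰ BENGALI LETTER RA WITH MIDDLE DIAGONAL


def variant_key(label: str) -> str:
    """Fold both RA code points to one, so that variant labels share a key."""
    return label.replace(ASSAMESE_RA, BANGLA_RA)


def is_blocked(candidate: str, registered: Iterable[str]) -> bool:
    """Return True if candidate is a variant of a different registered label."""
    key = variant_key(candidate)
    return any(r != candidate and variant_key(r) == key for r in registered)


if __name__ == "__main__":
    registered = {BANGLA_RA * 3}                       # "ররর" is registered
    print(is_blocked(ASSAMESE_RA * 3, registered))     # variant of "ররর"
    print(is_blocked(ASSAMESE_RA * 2, registered))     # different label
    print(is_blocked(ASSAMESE_RA * 4, registered))     # different label
```

Because the comparison is made on whole labels, ৰৰ and ৰৰৰৰ remain available even after ররর has been registered, as the text notes.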
6.2 Cross-Script Variants

A brief cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia (formerly Oriya; see footnote 11), keeping in mind the visual and technical confusion they may cause as labels in the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. There are certain other characters which are somewhat similar, but they have not been included here. They are provided in the Appendix (10.3) for reference.
11 Unicode uses Oriya for the script although Odia is now the official term used
1. Bangla and Nāgarī/Devanāgarī Script

Bangla        Devanāgarī
ম U+09AE      म U+092E
ি U+09BF      ि U+093F

Table 14: Bangla and Devanāgarī cross-script variant code points

2. Bangla and Gurmukhī Script

Bangla        Gurmukhī
ম U+09AE      ਸ U+0A38
ি U+09BF      ਿ U+0A3F

Table 15: Bangla and Gurmukhī cross-script variant code points
7 Whole Label Evaluation Rules (WLE)

This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Bangla script (see footnote 12). The rules have been drafted in such a way that they can be easily translated into the LGR specifications. Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories, as mentioned in the table provided under Code point repertoire (Section 5.1).

C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1 S2 (from Table 9) or (æ) Ya-phalā (V1 H C1 M1), where
    V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E)
    H is 09CD (্ - BENGALI SIGN VIRAMA)
    C1 is 09AF (য - BENGALI LETTER YA)
    M1 is 09BE (া - BENGALI VOWEL SIGN AA)
    S1 and S2 are valid even if they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where
    C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL)
    H is 09CD (্ - BENGALI SIGN VIRAMA)

Table 16: Symbols used in WLE rules

12 As used by Unicode, denoting and including both Assamese and Maṇipuri.
It is perhaps also worth mentioning here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters" that could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", out of which bi-consonantal combinations number 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, p. 4]. More details of such combinations can be found in Pabitra Sarkar (1993) [135].

7.1 Final Set of WLE Rules

Based on the prevalent patterns in Bangla and the various restrictions, the following are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়.
2. H must be preceded by C.
   Example: ক্
3. M must be preceded by C.
   Example: কা
4. D must be preceded by either of V, C or M.
   Example: আঁ, খঁ, খাঁ, হাঁ
5. X must be preceded by either of V, C, M or D.
   Example: উঃ, খঃ
6. B must be preceded by either of V, C, M or D.
   Example: আং, ইং, কং
7. Z must be preceded by V, C, M, D, B, X or P.
   Example: ইৎ, কৎ
   (S is not listed because S ends with M. Z may also follow S.)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V preceded by H.
9. S CANNOT be preceded by H.
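The context rules above lend themselves to a simple "what may precede what" check. The sketch below is an assumption-laden illustration, not the normative XML rule set: it works on a string of class symbols (one character per symbol, with S and P assumed to have been recognised as single units already), and the names `ALLOWED_BEFORE`, `FORBIDDEN_BEFORE` and `check_wle` are hypothetical. Treating P like H for rules 8 and 9 (because P ends in a Hasanta) is likewise an interpretation.

```python
# For each symbol, the set of symbols that may immediately precede it
# (rules 2 to 7). A symbol listed here can never start a label.
ALLOWED_BEFORE = {
    "H": {"C"},
    "M": {"C"},
    "D": {"V", "C", "M"},
    "X": {"V", "C", "M", "D"},
    "B": {"V", "C", "M", "D"},
    "Z": {"V", "C", "M", "D", "B", "X", "P", "S"},  # "Z may also follow S"
}

# Rules 8 and 9; P is included on the assumption that it ends in H.
FORBIDDEN_BEFORE = {
    "V": {"H", "P"},
    "S": {"H", "P"},
}


def check_wle(symbols):
    """Return a list of (index, symbol, reason) violations; empty means valid."""
    violations = []
    prev = None
    for index, sym in enumerate(symbols):
        if sym in ALLOWED_BEFORE and prev not in ALLOWED_BEFORE[sym]:
            violations.append((index, sym, "not allowed after %r" % (prev,)))
        if prev in FORBIDDEN_BEFORE.get(sym, ()):
            violations.append((index, sym, "cannot be preceded by %r" % (prev,)))
        prev = sym
    return violations


if __name__ == "__main__":
    for label in ["CHC", "CMDB", "PZ", "MC", "CHV", "CBX", "CMM"]:
        print(label, check_wle(label))
```

Under these assumptions, CHC, CMDB and PZ pass, while a label starting with a Mātrā (MC), a vowel after Hasanta (CHV), Anusvāra followed by Visarga (CBX) and a doubled Mātrā (CMM) are all reported as violations.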
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in permitting such ligatures: any user can easily create them, even though they may not be attested combinations.

7.1.1 Case of V Preceded by H

There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া bæŋk ʌv ɪndiə (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE) (meaning "Bank of India"). This is the case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered sufficient for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of the hyphen, space or Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic communities; hence this is explicitly prohibited by the NBGP.

In future, if required, and depending on the prevailing requirements from the community, a future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF

Below are a few examples which help one understand some of the rules the ABNF puts in place. They are given for reference purposes only and are not meant to be comprehensive.

1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Example:
্ক     ्क
াক     ाक
ংক     ंक
ঁক     ँक
ঃক     ःक
As can be seen, such a combination automatically results in a "golu", or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is applied quasi-automatically wherever supported by the OS.

2. H is not permitted after V, B, D, X, M, S.
Example:
অ্     अ्
অং্    कं्
কঁ্    कँ्
কঃ্    कः्
কি্    कि्

3. The number of B, D or X permitted after a Consonant, a Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalid.
Example:
কংং    कंं
কঁঁ    कँँ
কঃঃ    कःः
কাঁঁ   काँँ
কীঃঃ   कीःः
অংং    अंं
অঁঁ    अँँ
অঃঃ    अःः

4. The number of M permitted after a Consonant is restricted to one.
Example:
কীী    कीी

5. M is not permitted after V.
Example:
ইা, ঈৗ   ईा, ईौ

6. The combinations Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Example:
কংঃ    कंः
কঃং    कःं
8 Contributors

8.1 Experts from India

Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members

8.2 Contributors from Bangladesh

Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals' Systems, Dhaka
9 References

[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje & Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. 'Bangla Harapher Panch Parba'. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the M.S. University of Baroda, Social Science Number, 29.2, 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25/11/2017.
[124] Ethnologue: Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25/11/2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20/10/2019.
[126] Wikipedia: Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25/11/2017.
[129] Omniglot: Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10/5/2018.
[130] Wikipedia: Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25/11/2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20/, accessed on 10/5/2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10/5/2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10/5/2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1.1: 23-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1.2: 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). 'Srīhatta-Nāgarī Lipir Utpatti o Bikāś'. Bangla Academy Patrika (Dhaka), Vol. 1:2 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia: Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19/5/2018.
[139] Furui, Ryosuke (2015). 'Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century', in Bhairabi Prasad Sahu & Hermann Kulke, eds. Interrogating Political Systems: Integrative Processes and States in Pre-Modern India. Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). 'The Phonemes of Bengali'. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646 (UNICODE). https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia: Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. 'Bangla Spelling Reform: the Long and Short of It'. Bangla Journal 19: 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. "What has happened So Far In terms of Script Reforms". Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka on 10/07/2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 – Core Specification, Chapter 12, p. 473.
10 Appendix-I

10.1 Augmented Backus-Naur Formalism (ABNF)

The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language or script, certain restriction rules apply. In other words, in a given language some of the formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla (see footnote 13) in particular, the following rules apply:

1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold 'varga' (as defined under Table 5). Bangla does not generally allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem 'Lichuchor' written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).

2. CH can come with Khanda Ta only in the case where C is ra (র) (09B0):
   র্ৎ, as in ভর্ৎসনা

3. Only the following combinations with VHCM will be allowed:
   → অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
   → এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)

13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.

10.2 'Sylheti Nagarī lipi' or 'Siloti'

This version of the Bangla script resembles the 'Kaithī' script (ISO 15924) used by accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, widely in use during the 1880s. There were several other names of Sylheti Nagarī or Siloti (129), such as 'Jalalabada Nagarī', 'Fula (flower) Nagarī', 'Muslim Nagarī' or 'Muhammad Nagarī'. It is said that Shah Jalal had brought the script with him to Sylhet in the 13th-14th century (138), although some have suggested that it was an invention of the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here (138).

Table 17 – The Script Table of Sylheti Nagarī or Siloti
10.3 Confusable code points

The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.

10.3.1 Bangla and Nāgarī or Devanāgarī

Bangla        Devanāgarī    NBGP Decision
ঃ U+0983      ः U+0903      Confusable
ও U+0993      उ U+0909      Confusable
ঘ U+0998      घ U+0918      Confusable
ঁ U+0981      ॅ U+0945      Confusable

Table 18: Bangla and Devanāgarī confusable code points

10.3.2 Bangla and Gurmukhi

Bangla        Gurmukhi      NBGP Decision
ঘ U+0998      ਬ U+0A2C      Confusable
ঁ U+0981      ੱ U+0A71      Confusable

Table 19: Bangla and Gurmukhi confusable code points

Bangla                    Gurmukhi      NBGP Decision
ও U+0993                  ਤ U+0A24      Distinguishable
শ U+09B6                  ਅ U+0A05      Distinguishable
ম U+09AE                  ਮ U+0A2E      Distinguishable
বা U+09AC and U+09BE      ਗ U+0A17      Distinguishable

Table 20 – Bangla and Gurmukhī distinguishable code points

10.3.3 Bangla and Oriya (Odia)

Bangla        Oriya (Odia)  NBGP Decision
ও U+0993      ଓ U+0B13      Confusable

Table 21 – Bangla and Oriya confusable code points

Bangla        Oriya (Odia)  NBGP Decision
ঘ U+0998      ସ U+0B38      Distinguishable

Table 22 – Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs

Consonants    Phonetic Value    Allographs: Clusters    Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value, but it is mostly pronounced as n
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (not privileged enough to have clusters)
য dʒ The secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word, it is a vowel, æ (as in শ্যামল, র‍্যাপার etc.). But after a non-initial consonant it just doubles it in pronunciation (as in কার্য, ধার্য etc.). The র + য combination has two physical manifestations: র‍্য and র্য.
ক্য (ক্ + য), স্য (স্ + য), র‍্য (র্ + য) [র‍্য on its own is never used in Bangla orthography; র‍্যা is, but then its last two symbols, Ya-phala and ā-kāra, constitute a vowel sign representing the vowel অ্যা]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃকঃ
অব
The Unicode Standard contains a number of composition exclusions. As a result, the Bangla letters য় YYA, ড় RRA and ঢ় RRHA have to be represented in this LGR by the sequences (YA + Nukta: U+09AF + U+09BC), (DDA + Nukta: U+09A1 + U+09BC) and (DDHA + Nukta: U+09A2 + U+09BC) instead of the single code points YYA (U+09DF), RRA (U+09DC) and RRHA (U+09DD), although the use of 'Nukta' is otherwise completely unnatural in Bangla.

It is noted that in the current Unicode Standard chart these characters are listed as additional consonants. As per the LGR Procedure, however, these decisions depend on the IDNA protocol, through a set of procedures developed by the IETF. Even though the Unicode Standard also prescribes methods to produce these three characters as atomic characters (for example 09DC for ড় [r], 09DD for ঢ় [rh] and 09DF for য় [y], each as a single keystroke), the IDNA protocol requires that we treat them as conjunct characters and then allocate codes for these in the Unicode Bengali Block.

It may be noted that there could be sporadic attempts or cases of writing Muslim names, Urdu poetic words and Perso-Arabic loan words with nukta under ক (k), খ (kh), গ (g), জ (j) and ফ (ph), only for the sake of correct pronunciation and for maintaining the sanctity of the loan word. These were akin to using the Bangla writing system to work like the IPA script. It is, however, not in use in Bangla writing or printing.
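The effect of the composition exclusions can be observed with Python's standard `unicodedata` module: under NFC normalization, each precomposed letter is replaced by its base letter plus Nukta and is not recomposed. This is a small sketch for illustration only; the helper name `to_normalized_form` and the two dictionaries are hypothetical.

```python
import unicodedata

# Precomposed code points and the base + Nukta sequences named in the text.
PRECOMPOSED = {
    "YYA": "\u09DF",   # য়
    "RRA": "\u09DC",   # ড়
    "RRHA": "\u09DD",  # ঢ়
}
SEQUENCES = {
    "YYA": "\u09AF\u09BC",   # YA + Nukta
    "RRA": "\u09A1\u09BC",   # DDA + Nukta
    "RRHA": "\u09A2\u09BC",  # DDHA + Nukta
}


def to_normalized_form(label: str) -> str:
    """Return the NFC form of a label."""
    return unicodedata.normalize("NFC", label)


if __name__ == "__main__":
    for name, single in PRECOMPOSED.items():
        normalized = to_normalized_form(single)
        code_points = " ".join("U+%04X" % ord(ch) for ch in normalized)
        print(name, code_points, normalized == SEQUENCES[name])
```

Each of the three single code points normalizes to a two-code-point sequence, which is why the repertoire has to list the sequences rather than the atomic letters.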
3.3.7 Visarga/biʃɔrgo (ঃ - U+0983) and Avagraha (ঽ - U+09BD)

The Visarga/biʃɔrgo U+0983 is frequently used in Bangla loan words borrowed from Sanskrit and represents a sound very close to h. One could quote as an example দুঃখ duhkho "sorrow", "unhappiness" (U+09A6 U+09C1 U+0983 U+0996).

The Avagraha ঽ (U+09BD) is mainly used in Sanskrit, Pali, Prakrt or Maithili texts written in Bangla. It is gradually being replaced by an upper comma (e.g. নরোঽপরাণি re-written as নরো'পরাণি). It is now rarely used even in other languages using the Bangla script. In the case of the LGR, the Avagraha is not part of the repertoire. It has therefore been decided not to retain Avagraha (ঽ) (U+09BD), because it is blocked in TLDs as per the Maximal Starting Repertoire (MSR).

Please see Appendix II in Section 11 for a complete list of Bangla consonants and their allographs.
338ZeroWidthNon-joiner(U+200C)andZeroWidthJoiner(U+200D)ThisnoteispertinenttotheuseofZeroWidthJoiner(ZWJ)andZeroWidthNonJoiner(ZWNJ)asusedinBanglaItneedstobenotedthatNepaliKonkaniandHindiusethesetwosignsinadifferentmanner
19
ZWJ(U+0200D)andZWNJ(U+0200C)arecodepointsthathavebeenprovidedbytheUnicodestandardto instructtherenderingofastringwherethescripthastheoptionbetweenjoiningandnon-joiningcharactersWithouttheuseofthesecontrolcodesthestringmayberenderedinanalternateformfromwhatisintendedUseofZWJ
• Insofar as Bangla is concerned, ZWJ is used for the proper rendering of characters such as khaṇḍa-ta ৎ, as in সত্যজিৎ (satyajit) "Satyajit" and সৎ (sat) "honest". This is typed as follows: ta + Hasanta + ZWJ (U+200D).
• However, ZWJ is more important where the same combination of consonantal characters is represented differently depending upon the context. E.g. র + ্ + য has two representations in Bangla: as র্য (repha on ya) and as র‍্য (ra with ya-phalā). To get the form র্য, one has to type in the following manner: র + ্ + য; but for র‍্য the sequence would be র + ZWJ + ্ + য [154]. In other words, ZWJ is used in the rendering of words demanding ya-phalā after ra, which is otherwise not possible to type (render) because the same order ra + hasanta + antastha ja is used in the medial and/or final position. Interestingly, ra + hasanta + antastha ja is used to type repha on the consonant antastha ja, as in কার্য (kaarjo). In order to get a ya-phalā after the consonant ra, it is therefore obligatory to use ZWJ after ra, as in র‍্যাপার (wrapper), র‍্যাশ (rash), র‍্যালি (rally), etc. The typing sequence is given below:
ra (র) + ZWJ + hasanta (্) + antastha ja (য) = র‍্য

Use of ZWNJ
• ZWNJ is used in Bangla to represent the explicit Hasanta or Halant. In order to avoid conjunct formation in cases where there is an explicit hasanta before the succeeding consonant, the ZWNJ is used:
Consonant + hasanta + ZWNJ + consonant = explicit hasanta
Example: প্রাক্‌কথন (prakkathana /prakkɔtʰon/)

The use of ZWJ/ZWNJ has been ruled out from the root zone by the [Procedure]. Since they are used in Bangla to create alternate renderings, the insertion of these two signs can affect searching as well as NLP.

The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Hasanta) where default conjunct formation is to be explicitly restricted and the Hasanta joining the two consonants participating in the conjunct formation needs to be explicitly shown.
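Since both joiners are ruled out of the root zone, a candidate label can be screened for them before any other check. The following is a minimal Python sketch; the function name `joiner_positions` is a hypothetical helper, not part of any LGR tool.

```python
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"   # ZERO WIDTH JOINER


def joiner_positions(label):
    """Return the indexes at which a label contains ZWJ or ZWNJ."""
    return [i for i, ch in enumerate(label) if ch in (ZWNJ, ZWJ)]


# ra + ZWJ + hasanta + ya (ya-phalā after ra) carries a joiner at index 1.
print(joiner_positions("\u09B0\u200D\u09CD\u09AF"))
# The same letters without the joiner (repha on ya) carry none.
print(joiner_positions("\u09B0\u09CD\u09AF"))
```

An empty result means the label is free of the two invisible code points; any other result pinpoints where the label would have to be retyped.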
3.3.9 Use of Ya-phalaa
Ya-phalaa sequences are two instances in Bangla where Hasanta is preceded by a full vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E):
• অ্যা 0985 09CD 09AF 09BE BENGALI LETTER A + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
• এ্যা 098F 09CD 09AF 09BE BENGALI LETTER E + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD Hasanta plus U+09AF ya preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the /æ/ sound, as in English 'acid' অ্যাসিড, 'association' অ্যাসোসিয়েশন, 'bat' ব্যাট, 'fat' ফ্যাট, 'mat' ম্যাট, 'cap' ক্যাপ, etc.

The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
3.3.10 Formation of Ra-phalaa and Repha Sequences
This case refers to the formation of repha and ra-phalā as follows:

Ra-Hasanta = (C2 H), where
C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL)
H is 09CD (্ - BENGALI SIGN VIRAMA)

Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance:
repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun")
ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র chakra "cycle")

The point is that in both cases the slot for ra could be Bangla ra র (U+09B0) or Assamese ra ৰ (U+09F0), followed/preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā in both cases remain the same. The LGR makes a note of this point of concern with respect to the two RAs in disguise, as it would be completely impossible to distinguish between them with the naked eye in a label so generated, which may consequently lead to concerns related to spoofing and other kinds of cyber irregularities. The motive for classing these two CPs as (blocking) variants is that fully rendered labels may mask the distinction between Bangla ra র (U+09B0) and Assamese ra ৰ (U+09F0). That provides the justification for Variant Set 4, though only in the context of a following Hasant. The difference between the RAs is only distinguishable if one looks into their Unicode values. Therefore labels such as অর্ক arka, শীর্ষ śīrṣa 'top, apex', অভ্র abhra 'cloud, the sky' and শ্রম śrama 'physical labour' could be extremely dangerous, as the web user may never verify the digital content (the labels) against its Unicode code points. This point is made explicitly with reference to Table 9 (of sequences) and Table 16 (of WLE symbols) that are to follow.

Moreover, it is noteworthy that the REPHA can also occur with KHANDA TA. The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
4 Overall Development Process and Methodology
The Neo-Brāhmī Generation Panel (NBGP) has been formed by members having experience in Linguistics (especially in NLP/Computational Linguistics), Literature, Language History and Epigraphy. Under the Neo-Brāhmī Generation Panel, Bangla and eight other scripts belonging to separate Unicode blocks are being taken up to assign a separate LGR for each. However, an attempt is made to ensure that the fundamental philosophy behind building those LGRs is consistent across all Brāhmī-derived scripts. The present LGR will cater to multiple languages belonging to EGIDS scale 1 to 4 (see Table 4) that use the Bangla script. The following guiding principles are used in making decisions about Bangla LGR code points.

4.1 Guiding Principles
The NBGP adopts the following broad principles for the selection of code points in the code-point repertoire across the board for all the Neo-Brāhmī scripts within its ambit.
4.1.1 Inclusion Principles
4.1.1.1 Modern Usage
Every character proposed should be in the everyday usage of a particular linguistic community. Characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code-point repertoire.
4.1.1.2 Unambiguous Use
Every character proposed should have unambiguous understanding among linguists about its usage in the language.
4.2 Exclusion Principles
The main exclusion principle is that of External Limits on Scope. These consist of protocols or standards which are prerequisites to the Label Generation Rulesets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.
4.2.1 External Limits of Scope
The code point repertoire for the root zone being a very special case at the top of the protocol hierarchies, the canvas of available characters for selection as part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:
i. The Unicode Chart
Out of all the characters that are needed by the script in question, if a particular character is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, especially so in the Bangla-Asamiyā-Manipuri Writing System, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.
ii. IDNA Protocol
Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded as far as possible all the characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters of the Unicode repertoire from being part of domain names.
iii. Maximal Starting Repertoire (MSR)
The Root Zone LGR being the repertoire of characters which are going to be used for the creation of Root Zone TLDs, which in turn constitute an even more specialized case of domain names, the Root LGR procedure introduces additional exclusions on IDNA's allowed set of characters.
Example: Bangla Sign Avagraha ঽ (U+09BD), even if allowed by the IDNA protocol, is not permitted in the Root Zone Repertoire as per the MSR.
To sum up, the restrictions start off by admitting only such characters as are part of the code block of the given script/language. The IDNA Protocol further narrows this down, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
4.2.1.1 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brāhmī-based scripts will not be included.
4.2.1.2 No Symbols and Abbreviations
Abbreviations, weights and measures and other such iconic characters, like BANGLA ISSHAR (U+09FA), BANGLA CURRENCY DENOMINATOR SIXTEEN ৹ (U+09F9), etc., will also not be included.
4.2.1.3 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, such as the Sanskritic VOCALIC RR ৠ (U+09E0) and VOCALIC L ঌ (U+098C), as well as VOCALIC LL ৡ (U+09E1) and the allographic -kāra forms of the latter two symbols: VOWEL SIGN VOCALIC L ৢ (U+09E2) and VOWEL SIGN VOCALIC LL ৣ (U+09E3). All such characters are excluded, which complies with the Conservatism principle as laid down in the Root Zone LGR procedure. However, in Bangla the -kāra corresponding to VOCALIC RR ৠ (U+09E0), which is VOWEL SIGN VOCALIC RR ৄ (U+09C4), is still in active use in certain limited borrowed or Sanskritic words and is therefore retained.
4.2.1.4 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit will not be included. This is also in consonance with the Letter principle as laid down in the Root Zone LGR procedure.
4.2.1.5 ABNF
The Augmented Backus-Naur Formalism (ABNF) is described in Section 5.4.1 and Appendix (Section 10.1).
5 Repertoire
The Bangla Writing System is represented in Unicode using the Bengali (Bangla) script name as enumerated in ISO 15924, corresponding to languages such as Asamiyā (Assamese), Bangla (Bengali) and Manipuri. The BENGALI block used for Bangla-Asamiyā-Manipuri in Unicode has 93 entries. This section details the code-point repertoire that the Neo-Brāhmī Generation Panel [NBGP] proposes to be included in the Bangla LGR.

It may be mentioned here that the Government of Assam submitted a proposal to the Bureau of Indian Standards (BIS) on 26th February 2016 for dis-unification of the Bangla and Asamiyā scripts. The BIS, in the 8th Meeting of its Indian Language Technologies and Products Sectional Committee, LITD 20, held on 23rd Aug 2017, decided to refer the proposal for recognition of the Assamese script in ISO/IEC 10646 to ISO. Until the Unicode Consortium takes any further action, it will be assumed that the Code Point Repertoire under Table 8 will be valid for all the three languages as above.

For each of the code points, language references have been given in the last column, titled Reference, of Table 8, the "Code Point Repertoire". For entire coverage of Bangla code points, references for Bangla, Asamiyā (Assamese), Manipuri (Meitei) and Bishnupriya are given. Kokborok written in the Bangla script is not known to have introduced many new complications, except for one particular character. Though only a few representative languages under EGIDS Scale 1-4 have been chosen for referencing, they together cover all the code points required for all the languages that the NBGP has considered, as given under the Bangla Unicode points (as given in Unicode 6.3).

However, before the details are presented, it is ideal to look at the Bangla Code Point Chart from the Maximal Starting Repertoire [MSR] Version 3. It may be noted that the shapes of the reference glyphs given below in the code charts are based on one of the many fonts designed and are not prescriptive, because there could be some variations in actual fonts, both Unicode-compatible and TrueType ones. Consider the following code point table:
Figure 1: Bangla Code Page from [MSR] for Bangla-Asamiyā-Manipuri

Colour convention:
All characters that are included in the [MSR] - Yellow background
PVALID in IDNA 2008 but excluded from the [MSR] - Pinkish background
Not PVALID in IDNA 2008, or ineligible for the root zone (digits, hyphen) - White background

Given the Bangla Unicode Block as in Figure 1, among the code points that are included in the MSR the following symbols will need a separate treatment:
ৎ U+09CE Bangla Letter Khanda-Ta
ৰ U+09F0 Asamiyā-Bangla Letter Ra With Middle Diagonal
ৱ U+09F1 Asamiyā-Bangla Letter Ra With Lower Diagonal
5.1 Code Point Repertoire Inclusion

No | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
1 | U+0981 | ঁ | BENGALI SIGN CANDRABINDU | Candra-bindu | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
2 | U+0982 | ং | BENGALI SIGN ANUSVARA | Onushshar (Anusvara) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
3 | U+0983 | ঃ | BENGALI SIGN VISARGA | Biśarga (Visarga) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
4 | U+0985 | অ | BENGALI LETTER A | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
5 | U+0986 | আ | BENGALI LETTER AA | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
6 | U+0987 | ই | BENGALI LETTER I | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
7 | U+0988 | ঈ | BENGALI LETTER II | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
8 | U+0989 | উ | BENGALI LETTER U | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
9 | U+098A | ঊ | BENGALI LETTER UU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
10 | U+098B | ঋ | BENGALI LETTER VOCALIC R | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
11 | U+098F | এ | BENGALI LETTER E | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
12 | U+0990 | ঐ | BENGALI LETTER AI | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
13 | U+0993 | ও | BENGALI LETTER O | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
14 | U+0994 | ঔ | BENGALI LETTER AU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
15 | U+0995 | ক | BENGALI LETTER KA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
16 | U+0996 | খ | BENGALI LETTER KHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
17 | U+0997 | গ | BENGALI LETTER GA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
18 | U+0998 | ঘ | BENGALI LETTER GHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
19 | U+0999 | ঙ | BENGALI LETTER NGA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
20 | U+099A | চ | BENGALI LETTER CA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
21 | U+099B | ছ | BENGALI LETTER CHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
22 | U+099C | জ | BENGALI LETTER JA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
23 | U+099D | ঝ | BENGALI LETTER JHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
24 | U+099E | ঞ | BENGALI LETTER NYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
25 | U+099F | ট | BENGALI LETTER TTA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
26 | U+09A0 | ঠ | BENGALI LETTER TTHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
27 | U+09A1 | ড | BENGALI LETTER DDA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
28 | 09A1 09BC (U+09DC) | ড় | Normalized form of BENGALI LETTER RRA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. 09DC is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
29 | U+09A2 | ঢ | BENGALI LETTER DDHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
30 | 09A2 09BC (U+09DD) | ঢ় | Normalized form of BENGALI LETTER RHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. 09DD is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
31 | U+09A3 | ণ | BENGALI LETTER NNA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
32 | U+09A4 | ত | BENGALI LETTER TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
33 | U+09A5 | থ | BENGALI LETTER THA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
34 | U+09A6 | দ | BENGALI LETTER DA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
35 | U+09A7 | ধ | BENGALI LETTER DHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
36 | U+09A8 | ন | BENGALI LETTER NA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
37 | U+09AA | প | BENGALI LETTER PA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
38 | U+09AB | ফ | BENGALI LETTER PHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
39 | U+09AC | ব | BENGALI LETTER BA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
43 | 09AF 09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. 09DF is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112] [125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
50 | U+09BE | া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
51 | U+09BF | ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
52 | U+09C0 | ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
53 | U+09C1 | ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112] [122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (=Halant), Virama (=Da ri) | 1 Bangla, 2 Assamese, 2 Manipuri | [112] [122] [125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122] [125]
Table 8: Bangla Code-Point Repertoire
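The single code points of Table 8 can be written down compactly as ranges. The Python sketch below is an illustrative transcription, not the normative XML; `TABLE8_RANGES` and `outside_repertoire` are hypothetical names. The three Nukta sequences (rows 28, 30 and 43) are deliberately left out, so U+09BC is reported here and would need separate sequence handling.

```python
# Inclusive code point ranges transcribed from Table 8 (single code points only).
TABLE8_RANGES = [
    (0x0981, 0x0983),  # candrabindu, anusvara, visarga
    (0x0985, 0x098B),  # vowels A .. VOCALIC R
    (0x098F, 0x0990),  # vowels E, AI
    (0x0993, 0x09A8),  # vowels O, AU and consonants KA .. NA
    (0x09AA, 0x09B0),  # consonants PA .. RA
    (0x09B2, 0x09B2),  # LA
    (0x09B6, 0x09B9),  # SHA .. HA
    (0x09BE, 0x09C4),  # vowel signs AA .. VOCALIC RR
    (0x09C7, 0x09C8),  # vowel signs E, AI
    (0x09CB, 0x09CE),  # vowel signs O, AU, virama, khanda ta
    (0x09F0, 0x09F1),  # RA with middle / lower diagonal
]

REPERTOIRE = {cp for first, last in TABLE8_RANGES for cp in range(first, last + 1)}


def outside_repertoire(label):
    """List the code points of a label that are not single entries of Table 8."""
    return [f"U+{ord(ch):04X}" for ch in label if ord(ch) not in REPERTOIRE]


print(len(REPERTOIRE))                                      # 61 single code points
print(outside_repertoire("\u09AC\u09BE\u0982\u09B2\u09BE"))  # the word for 'Bangla'
print(outside_repertoire("\u09BD"))                          # avagraha is excluded
```

The 61 single code points plus the three sequences account for the 64 rows of the table.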
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes some specific sequences which enable conditional inclusion of Bangla LETTER A and E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, again followed by Bangla VOWEL SIGN AA, in the repertoire, to enable the /æ/ sound as in English 'bat', 'cat', etc.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
S1 | 0985 09CD 09AF 09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112] [122]
S2 | 098F 09CD 09AF 09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9: Sequences
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find a place in Unicode but have not been included in the repertoire in this LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.

Sr. No. | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10: Excluded Code Points
5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire since it will never be used alone. It will be used only in sequences, in the normalized forms of the three special characters ড়, ঢ় and য়.

Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড়, ঢ় and য় respectively
Table 10b: Excluded Code Points
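The restriction on Nukta can be expressed as a simple context check. The following Python sketch is illustrative only; `nukta_in_context` is a hypothetical name, and the check covers nothing beyond the rule stated above.

```python
NUKTA = "\u09BC"
# The only letters after which Nukta is admitted: DDA, DDHA and YA.
NUKTA_BASES = {"\u09A1", "\u09A2", "\u09AF"}


def nukta_in_context(label):
    """True if every Nukta in the label directly follows DDA, DDHA or YA."""
    for i, ch in enumerate(label):
        if ch == NUKTA and (i == 0 or label[i - 1] not in NUKTA_BASES):
            return False
    return True


print(nukta_in_context("\u09AF\u09BC"))  # YA + NUKTA, the sequence for YYA
print(nukta_in_context("\u0995\u09BC"))  # KA + NUKTA is not one of the three sequences
```

A label without any Nukta passes trivially, which matches the intent that the check only constrains where the sign may appear.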
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20.11.2009 and 18.07.2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) begins with the following variables:

C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta/Virama
Z → Khanda Ta

The Augmented Backus-Naur Formalism (ABNF) will use the following operators:

Sr. Number    Operator    Function
1             "|"         Alternative
2             "[ ]"       Optional
3             "*"         Variable Repetition
4             "( )"       Sequence Group
Table 11: The ABNF Formalism
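The variables above become easier to work with once each code point is mapped to its symbol. The Python sketch below does this for the single code points of Table 8; `CATEGORY` and `to_symbols` are hypothetical names, and anything not listed (including a bare Nukta) is shown as "?".

```python
def _build_categories():
    """Map every single code point of Table 8 to its ABNF variable."""
    spans = {
        "D": [(0x0981, 0x0981)],
        "B": [(0x0982, 0x0982)],
        "X": [(0x0983, 0x0983)],
        "V": [(0x0985, 0x098B), (0x098F, 0x0990), (0x0993, 0x0994)],
        "C": [(0x0995, 0x09A8), (0x09AA, 0x09B0), (0x09B2, 0x09B2),
              (0x09B6, 0x09B9), (0x09F0, 0x09F1)],
        "M": [(0x09BE, 0x09C4), (0x09C7, 0x09C8), (0x09CB, 0x09CC)],
        "H": [(0x09CD, 0x09CD)],
        "Z": [(0x09CE, 0x09CE)],
    }
    return {chr(cp): symbol
            for symbol, ranges in spans.items()
            for first, last in ranges
            for cp in range(first, last + 1)}


CATEGORY = _build_categories()


def to_symbols(label):
    """Spell a label as a string of ABNF variables; '?' marks unlisted code points."""
    return "".join(CATEGORY.get(ch, "?") for ch in label)


print(to_symbols("\u09AC\u09BE\u0982\u09B2\u09BE"))  # CMBCM
print(to_symbols("\u0985\u09CD\u09AF\u09BE"))        # VHCM
```

Working on symbol strings such as `CMBCM` keeps the later sequence rules independent of the individual letters involved.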
5.4.2 The Vowel Sequence
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given. To facilitate understanding for users of other Brāhmī scripts, equivalents in Devanāgarī are provided wherever necessary.

A vowel sequence is made up of a single vowel. It may be followed, but not necessarily (optionally), by an Anusvāra/onuʃʃār (B), a Candrabindu (D) or a Visarga/biʃɔrgo (X). The number of D, B or X which can follow a V in Bangla may not be restricted to one. Going by the rules illustrated in the document, it is clear that formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have formations or sequences such as a candrabindu followed by an anusvāra on the one hand, and a candrabindu followed by a visarga on the other, as in হ্যাঁংচা 'hænchā' and হ্যাঁঃ 'hæn' respectively. The possibility of a Visarga or Anusvāra (onuʃʃār) following a Candrabindu exists in Bangla.

A vowel can optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phala. Ya-phala is a presentation form of U+09AF, Bangla letter য or 'ya'. Represented by the sequence <U+09CD ্ BENGALI SIGN VIRAMA (Bangla SIGN Hasanta or VIRAMA), U+09AF য BENGALI LETTER YA>, Ya-phala has a special form ্য. Again, when combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. 'aa' (ā)), it is used for transcribing [æ], as in the 'a' in the English word 'bat', written in Bangla as ব্যাট.

A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example:
V অ अ

5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples:
VB অং अं
VD অঁ अँ
VX অঃ अः
VDB অঁং अँं
VDX অঁঃ अँः
VHCM অ্যা এ্যা

5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
VHCMB অ্যাং এ্যাং
VHCMD অ্যাঁ এ্যাঁ
VHCMX অ্যাঃ এ্যাঃ
VHCMDB অ্যাঁং এ্যাঁং
VHCMDX অ্যাঁঃ এ্যাঁঃ
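The vowel-sequence combinations listed above can be captured in one regular expression over symbol strings. This Python sketch is an illustration of sections 5.4.2.1 to 5.4.2.3 only: the names `VOWEL_SEQUENCE` and `is_vowel_sequence` are hypothetical, and the sketch treats HCM generically, without restricting it to the two Ya-phala sequences of Table 9.

```python
import re

# V, optionally followed by HCM, optionally followed by one of B, D, X, DB, DX.
VOWEL_SEQUENCE = re.compile(r"V(?:HCM)?(?:DB|DX|B|D|X)?")


def is_vowel_sequence(symbols):
    """True if a string of ABNF variables is one of the listed vowel sequences."""
    return VOWEL_SEQUENCE.fullmatch(symbols) is not None


for candidate in ("V", "VDB", "VHCMDX", "VDD", "VBD"):
    print(candidate, is_vowel_sequence(candidate))
```

The doubled forms VDD, VBB and VXX fail because the trailing group may match at most once.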
5.4.3 The Consonant Sequence
5.4.3.1 A Single Consonant (C)
Example:
C ক क

5.4.3.2 A Consonant with Conditions
A Consonant optionally followed by a dependent vowel sign kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Example:
CM কি কু कि कु
CB কং कं
CD কঁ कँ
CX কঃ कः
CH ক্ क् (Pure consonant)
CDB কঁং कँं
CDX কঁঃ कँः

5.4.3.3 CM Sequence
A CM sequence can be optionally followed by B, D, X, DB or DX.
Example:
CMB কীং কুং कीं कुं
CMD কাঁ काँ
CMX বীঃ वीः
CMDB কাঁং काँं
CMDX কাঁঃ काँः
5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Example:
CHC ন্ত → ন + ্ + ত; न्त → न + ् + त
CHCHC ন্ত্র → ন + ্ + ত + ্ + র; न्त्र → न + ् + त + ् + र
CHCHCHC ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য; न्त्र्य → न + ् + त + ् + र + ् + य

5.4.3.5 Subsets
While considering its subsets, as a representative example we will consider the combination CHC only; however, the same is equally applicable to CHCHC and CHCHCHC.
[A] The combination may be followed by M, B, D, X, DB or DX.
Example:
CHCM ক্কী → ক ্ ক ী; क्की → क ् क ी
CHCB ক্কং → ক ্ ক ং; क्कं → क ् क ं
CHCD ক্কঁ → ক ্ ক ঁ; क्कँ → क ् क ँ
CHCX ক্কঃ → ক ্ ক ঃ; क्कः → क ् क ः
CHCDB ক্কঁং → ক ্ ক ঁ ং; क्कँं → क ् क ँ ं
CHCDX ক্কঁঃ → ক ্ ক ঁ ঃ; क्कँः → क ् क ँ ः
[B] *3(CH)CM may further be followed by a B, D, X, DB or DX.
Example:
CHCMB ক্কীং → ক ্ ক ী ং; क्कीं → क ् क ी ं
CHCMB ক্কুং → ক ্ ক ু ং; क्कुं → क ् क ु ं
CHCMD ক্কাঁ → ক ্ ক া ঁ; क्काँ → क ् क ा ँ
CHCMX ক্কীঃ → ক ্ ক ী ঃ; क्कीः → क ् क ी ः
CHCMDB ক্কাঁং → ক ্ ক া ঁ ং; क्काँं → क ् क ा ँ ं
CHCMDX ক্কাঁঃ → ক ্ ক া ঁ ঃ; क्काँः → क ् क ा ँ ः
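The consonant-sequence forms in sections 5.4.3.1 to 5.4.3.5 can likewise be sketched as one pattern over symbol strings. The Python below is an illustration under the assumption that a final Hasanta is admitted only in the single-consonant form CH, since that is the only place the text lists it; `CONSONANT_SEQUENCE` and `is_consonant_sequence` are hypothetical names.

```python
import re

# Either the pure consonant CH, or up to four consonants joined by H
# (*3(CH)C), optionally followed by M and then one of B, D, X, DB, DX.
CONSONANT_SEQUENCE = re.compile(r"CH|(?:CH){0,3}CM?(?:DB|DX|B|D|X)?")


def is_consonant_sequence(symbols):
    """True if a string of ABNF variables is one of the listed consonant sequences."""
    return CONSONANT_SEQUENCE.fullmatch(symbols) is not None


for candidate in ("C", "CH", "CHCHCHC", "CHCMDB", "CHCHCHCHC", "CMM"):
    print(candidate, is_consonant_sequence(candidate))
```

The bound `{0,3}` on the `CH` group is what limits a cluster to four consonants, so a five-consonant string such as CHCHCHCHC is rejected.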
5.4.4 The Khanda-Ta sequence
5.4.4.1 A single 'Khanda'-Ta (Z)
Example:
Z ৎ

5.4.4.2 A Khanda Ta Combination (see footnote 10)
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) 'scolding'
Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) could be described briefly here.

Let us take up S in the first instance. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ - BANGLA LETTER A and U+098F এ - BANGLA LETTER E). For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF ya preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the /æ/ sound, as in English 'bat', 'fat', etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).

Another case refers to the formation of repha and ra-phalā in the said script, mentioned in the table above as P. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance:
repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun")
ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra "cycle")
The point is that in both cases the slot for ra could be Bangla ra র (U+09B0) or Assamese ra ৰ (U+09F0), followed/preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā in both cases remain the same.

10 Refer to Rule P in Section 7, Table 16.
6 Variants
This section talks about the variants in the Bangla script. The NBGP categorizes these confusing variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from the character formations normally perceived by the larger linguistic community

For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formations. These can cause confusion even to a careful observer and hence are being proposed as variants.

Variants are generated in a script when two or more forms are formed with different storage or code points. In Bangla, the e-kāra, ā-kāra and o-kāra have different code points. One can type o with a consonant at one go, or type e-kāra and ā-kāra as two separate keys, getting the same result. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered as a case of variant, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants
However, we propose two cases of true in-script variants in the Bangla script.

CASE I
As far as true variants in Bangla are concerned, we may draw our attention to cases wherein Hasanta with (U+09A5) থ (tha) appears as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1 স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2 ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)

The above combinations, if written in traditional orthography, could be a little confusing where the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final position (as shown below in example no. 1). It could be typed wrongly as well, thinking it was a হ (ha) U+09B9, increasing the chances of risk in label writing and identification.
Examples:
1 স্থ and স্হ (as in স্থান sthāna, স্থূল sthūla, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2 ন্থ and ন্হ (as in গ্রন্থ grantha)
The fonts which represent the traditional Bangla writing system could tend to create this problem. Therefore these may be taken as cases of variants in Bangla.

CASE II
Another interesting example of variants is encountered in ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra).

'Repha' could be formed by two sequences (mainly because both Assamese and Bangla find a place in the same Unicode code block, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here the final ligatures look the same and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C

Where
B_RA = U+09B0 BENGALI LETTER RA (র)
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)

Example:
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12: Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. Addition of Repha does not make any difference.

'Ra-phala' could be formed by two sequences on similar grounds, and the final ligatures would look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
Where C1 = any consonant except Khanda-ta

Example:
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13: Example of Ra-phala

As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusability for end users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP decided to define র and ৰ as variant code points, so that a single variant set between র and ৰ covers all cases. But this will create blocked variant labels: e.g., if someone registers "ররর", the variant label "ৰৰৰ" will be generated as a variant and will be blocked, and vice versa. However, it is only blocked at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, it is still possible.
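The blocking behaviour described above can be sketched by folding both RAs onto one of them and comparing the results. This Python example is a simplification that models only the র/ৰ variant set; `variant_index` and `is_blocked_by` are hypothetical names, and the treatment of mixed labels (some র, some ৰ) as variants is an assumption that follows from defining the two code points as variants of each other.

```python
BANGLA_RA = "\u09B0"
ASSAMESE_RA = "\u09F0"


def variant_index(label):
    """Fold Assamese RA onto Bangla RA so that variant labels compare equal."""
    return label.replace(ASSAMESE_RA, BANGLA_RA)


def is_blocked_by(registered, applied_for):
    """True if applied_for is a distinct variant label of registered."""
    return (registered != applied_for
            and variant_index(registered) == variant_index(applied_for))


print(is_blocked_by(BANGLA_RA * 3, ASSAMESE_RA * 3))  # three RAs vs three RAs
print(is_blocked_by(BANGLA_RA * 3, ASSAMESE_RA * 2))  # different length, not a variant
```

Labels of different length never share an index, which is why registering the three-letter label leaves the two-letter and four-letter labels available.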
6.2 Cross-Script Variants
A crisp cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia (formerly Oriya; see footnote 11), keeping in mind the visual and technical confusions they may cause as labels on the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain characters which are somewhat similar, they have not been included here. They have been provided in the Appendix (10.2) for reference.

11 Unicode uses Oriya for the script, although Odia is now the official term used.
1 Bangla and Nāgarī/Devanāgarī Script

Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14: Bangla and Devanāgarī cross-script variant code points

2 Bangla and Gurmukhī Script

Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15: Bangla and Gurmukhī cross-script variant code points
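The four pairs of Tables 14 and 15 can be held in a small lookup. The Python sketch below records only the pairs as listed and makes no assumption about transitivity between the Devanāgarī and Gurmukhī code points; `CROSS_SCRIPT_PAIRS` and `is_listed_variant_pair` are hypothetical names.

```python
# Cross-script variant pairs exactly as listed in Tables 14 and 15.
CROSS_SCRIPT_PAIRS = {
    frozenset({"\u09AE", "\u092E"}),  # Bangla MA / Devanagari MA
    frozenset({"\u09BF", "\u093F"}),  # Bangla vowel sign I / Devanagari vowel sign I
    frozenset({"\u09AE", "\u0A38"}),  # Bangla MA / Gurmukhi SA
    frozenset({"\u09BF", "\u0A3F"}),  # Bangla vowel sign I / Gurmukhi vowel sign I
}


def is_listed_variant_pair(a, b):
    """True if the two code points form one of the listed cross-script pairs."""
    return frozenset({a, b}) in CROSS_SCRIPT_PAIRS


print(is_listed_variant_pair("\u09AE", "\u092E"))  # listed in Table 14
print(is_listed_variant_pair("\u09AE", "\u09AF"))  # two Bangla letters, not listed
```

Storing each pair as a `frozenset` makes the lookup symmetric, so the order of the two arguments does not matter.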
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in section 3.2 when written in the Bangla script (see footnote 12). The rules have been drafted in such a way that they can be easily translated into the LGR specifications.

Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories, as mentioned in the table provided in the code point repertoire (Section 5.1):

C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), or (a/e) Ya-phalā (V1 H C1 M1), where
    V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E)
    H is 09CD (্ - BENGALI SIGN VIRAMA)
    C1 is 09AF (য - BENGALI LETTER YA)
    M1 is 09BE (া - BENGALI VOWEL SIGN AA)
    S1 and S2 are valid even if they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where
    C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL)
    H is 09CD (্ - BENGALI SIGN VIRAMA)
Table 16: Symbols used in WLE rules

12 As used by Unicode, denoting and including both Assamese and Maṇipuri.
It is also perhaps ideal to mention here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters" that could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, Pg 4]. More details of such combinations could be seen in Pabitra Sarkar (1993) [135].

7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1 CisasetofCandCNwhereCNisthesetofnormalizedformsofড়ঢ়য়2 HmustbeprecededbyC
Example
3 MmustbeprecededbyCExampleকা
4 DmustbeprecededbyeitherofVCorMExampleআখখা হা
5 XmustbeprecededbyeitherofVCMorDExampleউঃখঃবঃাঃ দ ঃ
6 BmustbeprecededbyeitherofVCMorDExampleআংইংকং
7 ZmustbeprecededbyVCMDBXorPExampleইৎকৎাৎাৎপ6ৎrৎ (S is not listed because S ends with M Z may also follow S)rdquo
8 VCANNOTbeprecededbyHDetailsin711CaseofVprecededbyH
9 SCANNOTbeprecededbyH
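The rules above can be read as "what may come immediately before each symbol" checks. The following is a minimal sketch of that reading, not the normative XML ruleset. The membership of the classes C, V and M is an assumption based on the Bengali block, the names `tokenize`, `first_violation` and `is_valid` are hypothetical, and only the two Ya-phalā sequences are treated as S (S1 and S2 from Table 9 are not modelled). Labels are normalized to NFC first, which yields the two-code-point forms of ড়, ঢ়, য় that rule 1 refers to.

```python
import unicodedata

H = "\u09CD"                                   # Hasanta
RA = {"\u09B0", "\u09F0"}                      # the two RAs that may form P (Ra-Hasanta)
YA_PHALA_TAIL = H + "\u09AF\u09BE"             # H + YA + AA after A or E gives S
NUKTA_FORMS = {"\u09A1\u09BC", "\u09A2\u09BC", "\u09AF\u09BC"}  # CN of rule 1

# Assumed class membership (illustrative, not copied from the repertoire table).
CLASS_OF = {"\u0982": "B", "\u0981": "D", "\u0983": "X", H: "H", "\u09CE": "Z"}
for cp in list(range(0x0985, 0x098C)) + [0x098F, 0x0990, 0x0993, 0x0994]:
    CLASS_OF[chr(cp)] = "V"
for cp in (list(range(0x0995, 0x09A9)) + list(range(0x09AA, 0x09B1)) + [0x09B2]
           + list(range(0x09B6, 0x09BA)) + [0x09F0, 0x09F1]):
    CLASS_OF[chr(cp)] = "C"
for cp in list(range(0x09BE, 0x09C5)) + [0x09C7, 0x09C8, 0x09CB, 0x09CC]:
    CLASS_OF[chr(cp)] = "M"

# Rules 2 to 7: the symbols that may immediately precede each symbol.
ALLOWED_BEFORE = {
    "H": {"C"},
    "M": {"C"},
    "D": {"V", "C", "M"},
    "X": {"V", "C", "M", "D"},
    "B": {"V", "C", "M", "D"},
    "Z": {"V", "C", "M", "D", "B", "X"},       # P is handled separately below
}


def tokenize(label):
    """Split an NFC-normalized label into (symbol, text) pairs."""
    text = unicodedata.normalize("NFC", label)
    tokens, i = [], 0
    while i < len(text):
        if text[i] in "\u0985\u098F" and text[i + 1:i + 4] == YA_PHALA_TAIL:
            tokens.append(("S", text[i:i + 4]))
            i += 4
        elif text[i:i + 2] in NUKTA_FORMS:
            tokens.append(("C", text[i:i + 2]))
            i += 2
        else:
            tokens.append((CLASS_OF.get(text[i], "?"), text[i]))
            i += 1
    return tokens


def first_violation(label):
    """Return a description of the first rule broken, or None if none is."""
    tokens = tokenize(label)
    prev = None
    for idx, (cat, text) in enumerate(tokens):
        if cat == "?":
            return f"U+{ord(text):04X} is outside the assumed repertoire"
        seen = "M" if prev == "S" else prev    # S ends with M (note to rule 7)
        if cat in ALLOWED_BEFORE:
            ok = seen in ALLOWED_BEFORE[cat]
            if cat == "Z" and prev == "H":     # Z after P, i.e. after RA + H
                ok = idx >= 2 and tokens[idx - 2][1] in RA
            if not ok:
                return f"{cat} at position {idx} may not follow {prev}"
        elif cat in ("V", "S") and prev == "H":  # rules 8 and 9
            return f"{cat} at position {idx} may not follow H"
        prev = cat
    return None


def is_valid(label):
    return first_violation(label) is None
```

Because every symbol in rules 2 to 7 needs a predecessor, the same check also rejects a label that starts with H, M, B, D, X or Z.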
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in adding such ligatures because any user can easily create them, even though they may not be attested combinations.
Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া /bæŋkʌv ɪndiə/ (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE) (meaning “Bank of India”). This is the case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceived similarity between two labels (with and without H) for the majority of the linguistic communities; hence this is explicitly prohibited by the NBGP.
In future, if required, depending on the prevailing requirements from the community, the future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules the ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.
1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Example
ক क
াক ाक
ংক क
ক क
ঃক ःक
As can be seen, such a combination will automatically result in a “golu”, or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.
2. H is not permitted after V, B, D, X, M, S
Example
অ अ
অং क ক क কঃ कः ি )क
3. The number of B, D or X permitted after a Consonant, a Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalid:
Example
কংং क
ক क
কঃঃ कःः
কা का
কীঃঃ कःः
অংং अ
অ अ
অঃঃ अःः
4. The number of M permitted after a Consonant is restricted to one.
Example
কীী की
5. M is not permitted after V.
Example
ইাঈৗ ईाईौ
6. The combinations Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Example
কংঃ कः
কঃং कः
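The invalid combinations listed in this section can also be expressed as a small set of "forbidden pattern" searches. The sketch below is one possible rendering under stated assumptions: the character classes are illustrative subsets of the Bengali block, the pattern names and the function `broken_rules` are hypothetical, and the two Ya-phalā sequences (অ্যা and এ্যা) are exempted from the "H after V" pattern because they are permitted elsewhere in this proposal.

```python
import re

H, B, D, X = "\u09CD", "\u0982", "\u0981", "\u0983"
M = "\u09BE-\u09C4\u09C7\u09C8\u09CB\u09CC"        # vowel signs (assumed set)
V = "\u0985-\u098B\u098F\u0990\u0993\u0994"        # independent vowels (assumed set)

# One pattern per numbered restriction; a match means the label breaks it.
PATTERNS = {
    "sign at start of label": re.compile(f"^[{H}{M}{B}{D}{X}]"),
    "H after V, B, D, X or M": re.compile(
        f"(?![\u0985\u098F]{H}\u09AF\u09BE)[{V}{B}{D}{X}{M}]{H}"),
    "repeated B, D or X": re.compile(f"{B}{{2,}}|{D}{{2,}}|{X}{{2,}}"),
    "more than one M": re.compile(f"[{M}]{{2,}}"),
    "M after V": re.compile(f"[{V}][{M}]"),
    "B with X": re.compile(f"{B}{X}|{X}{B}"),
}


def broken_rules(label):
    """Return the names of all restrictions that the label violates."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(label)]
```

A deny-list of this kind is easy to compare against the examples, but it only catches the listed combinations; the predecessor-based WLE rules in Section 7.1 remain the fuller statement.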
8 Contributors
8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members
8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals’ Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje; Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. ‘Bangla Harapher Panch Parba’. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the MS University of Baroda, Social Science Number, 29.2, 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25/11/2017.
[124] Ethnologue: Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25/11/2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20/10/2019.
[126] Wikipedia: Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25/11/2017.
[129] Omniglot: Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10/5/2018.
[130] Wikipedia: Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25/11/2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20/, accessed on 10/5/2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10/5/2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10/5/2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1.1: 23-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1.2: 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). ‘Srīhatta-Nāgarī Lipir Utpatti o Bikāś’. Bangla Academy Patrika (Dhaka), Vol. 1.2 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia: Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19/5/2018.
[139] Furui, Ryosuke (2015). ‘Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century’, in Bhairabi Prasad Sahu & Hermann Kulke, eds. Interrogating Political Systems: Integrative Processes and States in Pre-Modern India, Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). ‘Phones of Bengali’. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646 (UNICODE). https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia: Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. ‘Bangla Spelling Reform: the Long and Short of It’. Bangla Journal 19: 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. “What has happened So Far In terms of Script Reforms”. Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka on 10/07/2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 - Core Specification, Chapter 12, p. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when applied to a specific language/script certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla¹³ in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold ‘varga’ (as defined under Table 5). Moreover, Bangla does not normally allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem ‘Lichuchor’ written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2. CH can come with Khanda Ta only in the case where C is ra (র) (09B0):
র্ৎ as in ভর্ৎসনা
3. Only the following combinations with VHCM will be allowed:
→ অ্যা (together pronounced as æ) as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ) as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
10.2 ‘Sylheti Nāgarī lipi’ or ‘Siloti’
This version of the Bangla script resembles the ‘Kaithī’ script (ISO 15924) used by accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, widely in use during the 1880s. There were several other names for Sylheti Nāgarī or Siloti (129), such as ‘Jalalabada Nāgarī’, ‘Fula (flower) Nāgarī’, ‘Muslim Nāgarī’ or ‘Muhammad Nāgarī’. It is said that Shah Jalal had brought the script with him to Sylhet in the 13th-14th century (138), although some have suggested that it was an invention of the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script, with 32 symbols, are reproduced here (138).
Table 17 – The Script Table of Sylheti Nāgarī or Siloti
¹³ This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable but not enough to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable
Table 18 - Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhi
Bangla | Gurmukhi | NBGP Decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable
Table 19 - Bangla and Gurmukhi confusable code points
Bangla | Gurmukhi | NBGP Decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable
Table 20 – Bangla and Gurmukhī distinguishable code points
10.3.3 Bangla and Oriya (Odia)
Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable
Table 21 – Bangla and Oriya confusable code points
Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable
Table 22 – Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs
Consonants | Phonetic Value | Allographs: Clusters | Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (not privileged enough to have clusters as a first member) Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (not privileged enough to have clusters as a first member) egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value but is mostly pronounced as n
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (not privileged enough to have clusters)
য dʒ The secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word it is a vowel, æ (as in শ্যামল, র‍্যাপার etc.). But after a non-initial consonant it just doubles it in pronunciation (as in কার্য, ধার্য etc.). The র + য combination has two physical manifestations: র‍্য and র্য
ক8(p+য)স8(+য)র 8(+য) [Just র‍্য is never used in Bangla orthography; র‍্যা is, but then its last two symbols, Ya-phalā and ā-kāra, constitute a vowel sign representing the vowel অ্যা]
র r Two manifestations:
i. রেফ repʰ as the first member of a cluster, e.g. প6ৎ6 not6 য6D6(+deg+ব) (earlier E6=+brvbar+deg+ব, a four-term cluster) etc. (placed over the following consonant)
ii. র-ফলা rɔ-pʰɔla as the second/third member of a cluster, e.g. etc. (placed under the consonant it follows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃকঃ
অব
ZWJ (U+200D) and ZWNJ (U+200C) are code points that have been provided by the Unicode standard to instruct the rendering of a string where the script has the option between joining and non-joining characters. Without the use of these control codes, the string may be rendered in an alternate form from what is intended.
Use of ZWJ
• Insofar as Bangla is concerned, ZWJ is used for the proper rendering of characters such as khaṇḍa-ta ৎ, as in সত্যজিৎ (satyajit) “Satyajit” and সৎ (sat) “honest”. This is typed as follows: ta + Hasanta + ZWJ (U+200D).
• However, ZWJ is more important where the same combination of consonantal characters is represented differently depending upon the context. E.g. র + ্ + য has two representations in Bangla: as র্য and as র‍্য. To get the form র্য one has to type র + ্ + য, but for র‍্য the sequence would be র + ZWJ + ্ + য [154]. In other words, ZWJ is used in the rendering of words demanding ya-phalā after ra, which is otherwise not possible to type (render) because the same order, ra + hasanta + antastha ja, is already in use in the medial and/or final position. Interestingly, ra + hasanta + antastha ja is used to type repha on the consonant antastha ja, as in কার্য (kaarjo). In order to get a ya-phalā after the consonant ra, it is therefore obligatory to use ZWJ after ra, as in র‍্যাপার (wrapper), র‍্যাশ (rash), র‍্যালি (rally) etc. The typing sequence is given below:
ra (র) + ZWJ + hasanta (্) + antastha ja (য) = র‍্য
Use of ZWNJ
• ZWNJ is used in Bangla to represent the explicit Hasanta or Halant. In order to avoid conjunct formation in cases where there is an explicit hasanta before the succeeding consonant, the ZWNJ is used.
Consonant + hasanta + ZWNJ + consonant = explicit hasanta
Example: প্রাক্‌কথন (prakkathana, prakkɔtʰon)
The use of ZWJ/ZWNJ has been ruled out from the root zone by the [Procedure]. Used in Bangla to create alternate renderings, the insertion of these two signs can affect searching as well as NLP.
The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Hasanta) where default conjunct formation is to be explicitly restricted and the Hasanta joining the two consonants participating in the conjunct formation needs to be explicitly shown.
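The typing sequences described above, and the effect of the two join controls on searching, can be illustrated with a short sketch. The function names `join_controls` and `search_key` are hypothetical; the code only builds the sequences spelled out in the text and shows one simple way a search routine might ignore the invisible characters. It is not a statement of how any particular registry or search engine behaves.

```python
ZWJ, ZWNJ = "\u200D", "\u200C"
HASANTA = "\u09CD"

# Sequences as described in the text.
khanda_ta_typed = "\u09A4" + HASANTA + ZWJ            # ta + Hasanta + ZWJ
ra_ya_phala = "\u09B0" + ZWJ + HASANTA + "\u09AF"     # ra + ZWJ + hasanta + ya
ra_repha_ya = "\u09B0" + HASANTA + "\u09AF"           # ra + hasanta + ya (repha on ya)
explicit_hasanta = "\u0995" + HASANTA + ZWNJ + "\u0995"  # consonant + hasanta + ZWNJ + consonant


def join_controls(label):
    """List (index, code point) for every ZWJ or ZWNJ found in the label."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(label)
            if ch in (ZWJ, ZWNJ)]


def search_key(text):
    """Drop the invisible join controls so that otherwise equal strings compare equal."""
    return text.replace(ZWJ, "").replace(ZWNJ, "")
```

With these definitions, `join_controls(ra_ya_phala)` reports the ZWJ at index 1, and `search_key` maps both spellings of ra + ya onto the same string. That collision is the kind of ambiguity that makes these characters unsuitable for root zone labels.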
3.3.9 Use of Ya-phalaa
Ya-phalaa sequences are two instances in Bangla where Hasanta is preceded by a full vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E):
• অ্যা 0985 09CD 09AF 09BE: BENGALI LETTER A + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
• এ্যা 098F 09CD 09AF 09BE: BENGALI LETTER E + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD Hasanta plus U+09AF ya, preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound, as in English ‘acid’ অ্যাসিড, ‘association’ অ্যাসোসিয়েশন, ‘bat’ ব্যাট, ‘fat’ ফ্যাট, ‘mat’ ম্যাট, ‘cap’ ক্যাপ etc.
The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a ‘vowel killer’, although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But, as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
3.3.10 Formation of Ra-phalaa and Ref Sequences
This case refers to the formation of repha and ra-phalā as follows:
Ra-Hasanta = (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL) and H is 09CD (্ - BENGALI SIGN VIRAMA)
Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance:
repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka “the sun”)
ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র chakra “cycle”)
The point is that in both cases the slot for ra could be the Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed/preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases. The LGR makes a note of this point of concern with respect to the two RAs in disguise, as it would be completely impossible to distinguish between them with the naked eye in a label so generated, which may consequently lead to concerns related to spoofing and other kinds of cyber irregularities. The motive for classing these two CPs as (blocking) variants is that fully rendered labels may mask the distinction between the Bangla ra র (U+09B0) and the Assamese ra ৰ (U+09F0). That provides the justification for Variant Set 4, though only in the context of a following Hasant. The difference between the RAs is only distinguishable if one looks into their Unicode values. Therefore labels such as অর্ক arka, শীর্ষ śīrṣa ‘top, apex’, অভ্র abhra ‘cloud, the sky’ and শ্রম śrama ‘physical labour’ could be extremely dangerous, as the web user may never verify the digital content (the labels) against its Unicode code points. This point is made explicitly with reference to Table 9 (of sequences, p. 36) and Table 16 (of WLE symbols, p. 47) that are to follow.
Moreover, it is noteworthy that the REPHA can also occur with KHANDA TA. The conditions in this context of KHANDA TA are such that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
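The spoofing concern can be made concrete with a comparison key that folds the Assamese ra onto the Bangla ra wherever it is adjacent to a Hasanta. This is only a sketch: the function name `variant_key` is hypothetical, and treating both "ra followed by Hasanta" (repha) and "ra preceded by Hasanta" (ra-phalā) as the relevant context is an assumption drawn from the "followed/preceded" wording above, not the normative definition of Variant Set 4.

```python
RA_BANGLA, RA_ASSAMESE, HASANTA = "\u09B0", "\u09F0", "\u09CD"


def variant_key(label):
    """Fold U+09F0 to U+09B0 where the ra sits next to a Hasanta."""
    out = []
    for i, ch in enumerate(label):
        before_hasanta = i + 1 < len(label) and label[i + 1] == HASANTA
        after_hasanta = i > 0 and label[i - 1] == HASANTA
        if ch == RA_ASSAMESE and (before_hasanta or after_hasanta):
            ch = RA_BANGLA
        out.append(ch)
    return "".join(out)


# arka written with the Bangla ra and with the Assamese ra (repha in both cases).
arka_bn = "\u0985\u09B0\u09CD\u0995"
arka_as = "\u0985\u09F0\u09CD\u0995"
# chakra written with either ra as ra-phalā.
chakra_bn = "\u099A\u0995\u09CD\u09B0"
chakra_as = "\u099A\u0995\u09CD\u09F0"
```

Two labels whose keys are equal would render identically in the repha or ra-phalā position, so a registry following this reading would allocate at most one of them and block the other.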
4 Overall Development Process and Methodology
The Neo-Brāhmī Generation Panel (NBGP) has been formed by members having experience in Linguistics (especially in NLP and Computational Linguistics), Literature, Language History and Epigraphy. Under the Neo-Brāhmī Generation Panel, Bangla and eight other scripts belonging to separate Unicode blocks are being taken up to assign a separate LGR for each. However, an attempt is made to ensure that the fundamental philosophy behind building those LGRs is consistent with all other Brāhmī-derived scripts. The present LGR will cater to multiple languages belonging to EGIDS scale 1 to 4 (see Table 4) that use the Bangla script. The following guiding principles are used in making decisions about Bangla LGR code points.
4.1 Guiding Principles
The NBGP adopts the following broad principles for selection of code points in the code-point repertoire across the board for all the Neo-Brāhmī scripts within its ambit.
4.1.1 Inclusion Principles
4.1.1.1 Modern Usage
Every character proposed should be in the everyday usage of a particular linguistic community. Characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code-point repertoire.
4.1.1.2 Unambiguous Use
Every character proposed should have unambiguous understanding among linguists about its usage in the language.
4.2 Exclusion Principles
The main exclusion principle is that of External Limits on Scope. These consist of protocols or standards which are prerequisites to the Label Generation Rule-sets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.
4.2.1 External Limits of Scope
The code point repertoire for the root zone being a very special case at the top of protocol hierarchies, the canvas of available characters for selection as a part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:
i. The Unicode Chart
Out of all the characters that are needed by the script in question, if a particular character is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, and especially so in the Bangla-Asamiyā-Manipuri Writing System, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.
ii. IDNA Protocol
Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded as far as possible all the possible characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters out of the Unicode repertoire from being part of domain names.
iii. Maximal Starting Repertoire (MSR)
The Root Zone LGR being the repertoire of characters which are going to be used for creation of the Root Zone TLDs, which in turn constitute an even more specialized case of domain names, the Root LGR procedure introduces additional exclusions on IDNA’s allowed set of characters.
Example: Bangla Sign Avagraha ঽ (U+09BD), even if allowed by the IDNA protocol, is not permitted in the Root Zone Repertoire as per the MSR.
To sum up, the restrictions start off with admitting only such characters as are part of the code block of the given script/language. The IDNA Protocol further narrows this down, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
4.2.1.1 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brāhmī-based scripts will not be included.
4.2.1.2 No Symbols and Abbreviations
Abbreviations, weights and measures and other such iconic characters like BANGLA ISSHAR ৺ (U+09FA), BANGLA CURRENCY DENOMINATOR SIXTEEN ৹ (U+09F9) etc. will also not be included.
4.2.1.3 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms such as Sanskritic VOCALIC RR ৠ (U+09E0) and VOCALIC L ঌ (U+098C), as well as VOCALIC LL ৡ (U+09E1), and the allographic -kara forms of the latter two symbols: VOWEL SIGN VOCALIC L ৢ (U+09E2) and VOWEL SIGN VOCALIC LL ৣ (U+09E3). All such characters are excluded, which complies with the Conservatism principle as laid down in the Root Zone LGR procedure. However, in Bangla the -kara corresponding to VOCALIC RR ৠ (U+09E0), which is VOWEL SIGN VOCALIC RR ৄ (U+09C4), is still in active use in certain limited borrowed or Sanskritic words and is therefore retained.
4.2.1.4 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit will not be included. This is also in consonance with the Letter principle as laid down in the Root Zone LGR procedure.
4.2.1.5 ABNF
The Augmented Backus-Naur Formalism (ABNF) is described in Section 5.4.1 and Appendix (Section 10.1).
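The successive exclusions described in Section 4.2 can be pictured as a simple screening step over a candidate label. The sketch below is illustrative only: the table contains just the code points named as examples in the text, not the full set of characters that IDNA or the MSR exclude, and the names `EXCLUDED` and `screen` are hypothetical.

```python
BENGALI_BLOCK = range(0x0980, 0x0A00)

# Only the examples named in Section 4.2; the real exclusion lists are longer.
EXCLUDED = {
    0x09BD: "excluded by the MSR (avagraha)",
    0x09FA: "symbol (isshar)",
    0x09F9: "symbol (currency denominator sixteen)",
    0x09E0: "rare or obsolete (vocalic RR)",
    0x098C: "rare or obsolete (vocalic L)",
    0x09E1: "rare or obsolete (vocalic LL)",
    0x09E2: "rare or obsolete (vowel sign vocalic L)",
    0x09E3: "rare or obsolete (vowel sign vocalic LL)",
}


def screen(label):
    """Return (code point, reason) pairs for characters the principles exclude."""
    problems = []
    for ch in label:
        cp = ord(ch)
        if cp not in BENGALI_BLOCK:
            problems.append((f"U+{cp:04X}", "outside the Bengali block"))
        elif cp in EXCLUDED:
            problems.append((f"U+{cp:04X}", EXCLUDED[cp]))
    return problems
```

An empty result from `screen` means only that none of these example exclusions applies; a real check would consult the published MSR and IDNA tables instead.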
5 Repertoire
The Bangla Writing System is represented in Unicode using the Bengali (Bangla) script name as enumerated in ISO 15924, corresponding to languages such as Asamiyā (Assamese), Bangla (Bengali) and Manipuri. The BENGALI block used for Bangla-Asamiyā-Manipuri in Unicode has 93 entries. This section details the code-point repertoire that the Neo-Brāhmī Generation Panel [NBGP] proposes to be included in the Bangla LGR.
It may be mentioned here that the Government of Assam submitted a proposal to the Bureau of Indian Standards (BIS) on 26th February 2016 for dis-unification of the Bangla and Asamiyā scripts. The BIS, in its 8th Meeting of the Indian Language Technologies and Products Sectional Committee, LITD 20, held on 23rd Aug 2017, decided to refer the proposal for recognition of the Assamese script in ISO/IEC 10646 to ISO. Until the Unicode Consortium takes any further action, it will be assumed that the Code Point Repertoire under Table 11 will be valid for all three languages as above.
For each of the code points, language references have been given in the last column, titled Reference, under Table 8, titled “Code Point Repertoire”. For entire coverage of Bangla code points, references for Bangla, Asamiyā (Assamese), Manipuri (Meitei) and Bishnupriya are given. Kokborok written in the Bangla script is not known to have introduced many new complications, except for one particular character. Though only a few representative languages under EGIDS Scale 1-4 have been chosen for referencing, they together cover all the code points required for all the languages that the NBGP has considered, as given under Bangla Unicode Points (as given in Unicode 6.3).
However, before the details are presented, it is ideal to look at the Bangla Code Point Chart from the Maximal Starting Repertoire [MSR] Version 3. It may be noted that the shapes of the reference glyphs given below in the code charts are based on one of the many fonts designed and are not prescriptive, because there could be some variations in actual fonts, both Unicode-compatible and TrueType ones. Consider the following code point table:
Figure 1: Bangla Code Page from [MSR] for Bangla-Asamiyā-Manipuri
Colour convention:
All characters that are included in the [MSR]: yellow background
PVALID in IDNA2008 but excluded from the [MSR]: pinkish background
Not PVALID in IDNA2008, or ineligible for the root zone (digits, hyphen): white background
Given the Bangla Unicode Block as in Figure 1, among the code points that are included in the MSR, the following symbols will need separate treatment:
ৎ U+09CE Bangla Letter Khanda-Ta
ৰ U+09F0 Asamiyā-Bangla Letter Ra With Middle Diagonal
ৱ U+09F1 Asamiyā-Bangla Letter Ra With Lower Diagonal
26
51 CodePointRepertoireInclusion
No UnicodeCodePoint
Glyph
CharacterName
Category Language(s)withEGIDSValue
ReferencesandComment
1 U+0981 BENGALISIGNCANDRABINDU
Candra-bindu
1Bangla2Manipuri2Assamese
[112][122][125]
2 U+0982 ং BENGALISIGNANUSVARA
Onushshar(Anusvara)
1Bangla2Manipuri2Assamese
[112][122][125]
3 U+0983 ঃ BENGALISIGNVISARGA
Biśarga(Visarga)
1Bangla2Manipuri2Assamese
[112][122][125]
4 U+0985 অ BENGALILETTERA
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
5 U+0986 আ BENGALILETTERAA
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
6 U+0987 ই BENGALILETTERI
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
7 U+0988 ঈ BENGALILETTERII
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
8 | U+0989 | উ | BENGALI LETTER U | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
9 | U+098A | ঊ | BENGALI LETTER UU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
10 | U+098B | ঋ | BENGALI LETTER VOCALIC R | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
11 | U+098F | এ | BENGALI LETTER E | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
12 | U+0990 | ঐ | BENGALI LETTER AI | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
13 | U+0993 | ও | BENGALI LETTER O | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
14 | U+0994 | ঔ | BENGALI LETTER AU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
15 | U+0995 | ক | BENGALI LETTER KA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
16 | U+0996 | খ | BENGALI LETTER KHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
17 | U+0997 | গ | BENGALI LETTER GA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
18 | U+0998 | ঘ | BENGALI LETTER GHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
19 | U+0999 | ঙ | BENGALI LETTER NGA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
20 | U+099A | চ | BENGALI LETTER CA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
21 | U+099B | ছ | BENGALI LETTER CHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
22 | U+099C | জ | BENGALI LETTER JA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
23 | U+099D | ঝ | BENGALI LETTER JHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
24 | U+099E | ঞ | BENGALI LETTER NYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
25 | U+099F | ট | BENGALI LETTER TTA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
26 | U+09A0 | ঠ | BENGALI LETTER TTHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
27 | U+09A1 | ড | BENGALI LETTER DDA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
28 | U+09A1 U+09BC (U+09DC) | ড় | Normalized form of BENGALI LETTER RRA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125] U+09DC is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
29 | U+09A2 | ঢ | BENGALI LETTER DDHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
30 | U+09A2 U+09BC (U+09DD) | ঢ় | Normalized form of BENGALI LETTER RHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125] U+09DD is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
31 | U+09A3 | ণ | BENGALI LETTER NNA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
32 | U+09A4 | ত | BENGALI LETTER TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
33 | U+09A5 | থ | BENGALI LETTER THA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
34 | U+09A6 | দ | BENGALI LETTER DA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
35 | U+09A7 | ধ | BENGALI LETTER DHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
36 | U+09A8 | ন | BENGALI LETTER NA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
37 | U+09AA | প | BENGALI LETTER PA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
38 | U+09AB | ফ | BENGALI LETTER PHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
39 | U+09AC | ব | BENGALI LETTER BA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
43 | U+09AF U+09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125] U+09DF is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112][125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
50 | U+09BE | া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
51 | U+09BF | ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
52 | U+09C0 | ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
53 | U+09C1 | ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112][122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (= Halant), Virama (= Da ri) | 1 Bangla, 2 Assamese, 2 Manipuri | [112][122][125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122][125]
Table 8: Bangla Code-Point Repertoire
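Rows 28, 30 and 43 of Table 8 list two-code-point sequences rather than the single code points U+09DC, U+09DD and U+09DF. The following minimal Python sketch illustrates why: under Unicode NFC normalization these three letters are replaced by the base letter followed by U+09BC (nukta). The names PRECOMPOSED and to_nfc are illustrative only and are not part of the proposal.

```python
import unicodedata

# Single code points named in Table 8 and the sequences listed in their place.
PRECOMPOSED = {
    "\u09DC": "\u09A1\u09BC",  # RRA -> DDA + NUKTA
    "\u09DD": "\u09A2\u09BC",  # RHA -> DDHA + NUKTA
    "\u09DF": "\u09AF\u09BC",  # YYA -> YA + NUKTA
}


def to_nfc(label: str) -> str:
    """Return the NFC-normalized form of a label."""
    return unicodedata.normalize("NFC", label)


for single, sequence in PRECOMPOSED.items():
    # NFC maps the single code point to the two-code-point sequence ...
    assert to_nfc(single) == sequence
    # ... and leaves the sequence itself unchanged.
    assert to_nfc(sequence) == sequence
```

A label checker built on this repertoire would therefore normalize its input first and then look for the sequences, never for U+09DC, U+09DD or U+09DF themselves.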
Apart from the above individual code-points, the Neo-Brāhmī Generation Panel also proposes some specific sequences which enable conditional inclusion of the Bangla LETTER A and E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, again followed by Bangla VOWEL SIGN AA, in the repertoire, for enabling inclusion of the æ sound as in English ‘bat’, ‘cat’ etc.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code-point (not exhaustive list) | Reference
S1 | U+0985 U+09CD U+09AF U+09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112][122]
S2 | U+098F U+09CD U+09AF U+09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9: Sequences
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find place in the Unicode but have not been included in the repertoire in the LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.
Sr. No. | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10: Excluded Code Points
5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire since it will never be used alone. It will be used only as part of a sequence, in three special characters in normalized form: ড়, ঢ়, য়.
Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য so as to form ড়, ঢ় and য় respectively
Table 10b: Excluded Code Points
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20 November 2009 and 18 July 2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) began with the following variables:
C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta
The Augmented Backus-Naur Formalism (ABNF) will use the following operators:
Sr. Number | Operator | Function
1 | “|” | Alternative
2 | “[ ]” | Optional
3 | “*” | Variable Repetition
4 | “( )” | Sequence Group
Table 11: The ABNF Formalism
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given to facilitate understanding.
5.4.2 The Vowel Sequence
To facilitate understanding for users of other Brahmi scripts, equivalents in Devanāgarī are provided wherever necessary.
A vowel sequence is made up of a single vowel. It may be followed, but not necessarily (optionally), by an Anusvāra/onuʃʃār (B), Candrabindu (D) or a Visarga/biʃɔrgo (X). The number of D, B or X signs which can follow a V in Bangla need not be restricted to one. Going by the rules illustrated in the document, it is clear that formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have formations or sequences such as a candrabindu followed by an anusvāra on the one hand, and a candrabindu followed by a visarga on the other, as in হ্যাঁংচা ‘hænchā’ and হ্যাঁঃ ‘hæn’ respectively. The possibility of a Visarga or Anusvāra (onuʃʃār) following a Candrabindu exists in Bangla.
A vowel can optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phala. “Ya-phala is a presentation form of U+09AF Bangla letter য or ‘ya’. Represented by the sequence <U+09CD BENGALI SIGN VIRAMA (i.e. Bangla SIGN Hasanta or VIRAMA), U+09AF য BENGALI LETTER YA>, Ya-phala has a special form ্য. Again, when combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. ‘aa’ (ā)), it is used for transcribing [æ] as in the ‘a’ in the English word ‘bat’”, written in Bangla as ব্যাট.
A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example:
V | অ | अ
5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B] or Candrabindu [D] or Visarga [X] or Candrabindu + Anusvāra [DB] or Candrabindu + Visarga [DX] or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples:
VB | অং | अं
VD | অঁ | अँ
VX | অঃ | अः
VDB | অঁং | अँं
VDX | অঁঃ | अँः
VHCM | অ্যা, এ্যা
5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B] or Candrabindu [D] or Visarga [X] or Candrabindu + Anusvāra [DB] or Candrabindu + Visarga [DX].
Examples:
VHCMB | অ্যাং, এ্যাং
VHCMD | অ্যাঁ, এ্যাঁ
VHCMX | অ্যাঃ, এ্যাঃ
VHCMDB | অ্যাঁং, এ্যাঁং
VHCMDX | অ্যাঁঃ, এ্যাঁঃ
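The vowel-sequence patterns above can be sketched as a regular expression. The Python fragment below is only an illustration under stated assumptions: the vowel set is taken from Table 8, the VHCM part is restricted to the two sequences S1 and S2 of Table 9, and the names VOWEL_SEQUENCE and is_vowel_sequence are hypothetical.

```python
import re

# Independent vowels in the repertoire (Table 8): U+0985..U+098B, U+098F, U+0990, U+0993, U+0994
V = "[\u0985-\u098B\u098F\u0990\u0993\u0994]"
B, D, X = "\u0982", "\u0981", "\u0983"     # Anusvara, Candrabindu, Visarga
H, YA, AA = "\u09CD", "\u09AF", "\u09BE"   # Hasanta, letter YA, vowel sign AA

# Optional tail: B, D, X, DB or DX (at most one such group)
TAIL = f"(?:{D}{B}|{D}{X}|{B}|{D}|{X})?"

# VHCM is assumed here to be limited to S1 and S2: (A or E) + Hasanta + YA + AA
S = f"[\u0985\u098F]{H}{YA}{AA}"

VOWEL_SEQUENCE = re.compile(f"(?:{S}|{V}){TAIL}")


def is_vowel_sequence(text: str) -> bool:
    """True if the whole string is one vowel sequence as sketched above."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None
```

For instance, is_vowel_sequence("\u0985\u0981\u0982") accepts the VDB pattern, while "\u0985\u0982\u0982" (VBB) is rejected.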
5.4.3 The Consonant Sequence
5.4.3.1 A Single Consonant (C)
Example:
C | ক | क
5.4.3.2 A Consonant with Conditions
A Consonant optionally followed by dependent vowel sign kāra (Mātrā) [M] or Anusvāra [B] or Candrabindu [D] or Visarga [X] or Hasanta (also known as Virama) [H] or Candrabindu + Anusvāra [DB] or Candrabindu + Visarga [DX].
Example:
CM | কি | कि
CB | কং | कं
CD | কঁ | कँ
CX | কঃ | कः
CH | ক্ | क् (Pure consonant)
CDB | কঁং | कँं
CDX | কঁঃ | कँः
5.4.3.3 CM Sequence
A CM sequence can be optionally followed by B, D, X, DB or DX.
Example:
CMB | কীং | कीं
CMD | কাঁ | काँ
CMX | বীঃ | वीः
CMDB | কাঁং | काँं
CMDX | কাঁঃ | काँः
5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Example:
CHC | ন্ত → ন + ্ + ত | न + ् + त
CHCHC | ন্ত্র → ন + ্ + ত + ্ + র | न + ् + त + ् + र
CHCHCHC | ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য | न + ् + त + ् + र + ् + य
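The pattern *3(CH)C, i.e. up to three Consonant + Hasanta pairs followed by a final Consonant, can be sketched in Python as follows. This is an illustration, not the normative rule: the consonant set is read off Table 8, Khanda Ta is left out because it is treated separately as Z, and the names CLUSTER and is_consonant_cluster are hypothetical.

```python
import re

H = "\u09CD"  # Hasanta (Virama)

# Consonants of Table 8. The three nukta forms are two-code-point sequences
# and are listed first so that each is consumed as a single consonant.
CONSONANT = (
    "(?:[\u09A1\u09A2\u09AF]\u09BC"
    "|[\u0995-\u09A8\u09AA-\u09B0\u09B2\u09B6-\u09B9\u09F0\u09F1])"
)

# *3(CH)C : zero to three (Consonant + Hasanta) pairs, then one Consonant
CLUSTER = re.compile(f"(?:{CONSONANT}{H}){{0,3}}{CONSONANT}")


def is_consonant_cluster(text: str) -> bool:
    """True if the whole string is a run of one to four consonants joined by Hasanta."""
    return CLUSTER.fullmatch(text) is not None
```

With this sketch the three examples above (CHC, CHCHC and CHCHCHC) are accepted, while a run of five consonants joined by Hasanta is not.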
5.4.3.5 Subsets
While considering its subsets, as a representative example we will consider the combination CHC only; however, the same is equally applicable to CHCHC and CHCHCHC.
[A] The combination may be followed by M, B, D, X, DB or DX.
Example:
CHCM | ক্কী → ক ্ ক ী | क्की → क ् क ी
CHCB | ক্কং → ক ্ ক ং | क्कं → क ् क ं
CHCD | ক্কঁ → ক ্ ক ঁ | क्कँ → क ् क ँ
CHCX | ক্কঃ → ক ্ ক ঃ | क्कः → क ् क ः
CHCDB | ক্কঁং → ক ্ ক ঁ ং | क्कँं → क ् क ँ ं
CHCDX | ক্কঁঃ → ক ্ ক ঁ ঃ | क्कँः → क ् क ँ ः
[B] *3(CH)CM may further be followed by a B, D, X, DB or DX.
Example:
CHCMB | ক্কীং → ক ্ ক ী ং | क्कीं → क ् क ी ं
CHCMD | ক্কাঁ → ক ্ ক া ঁ | क्काँ → क ् क ा ँ
CHCMX | ক্কীঃ → ক ্ ক ী ঃ | क्कीः → क ् क ी ः
CHCMDB | ক্কাঁং → ক ্ ক া ঁ ং | क्काँं → क ् क ा ँ ं
CHCMDX | ক্কাঁঃ → ক ্ ক া ঁ ঃ | क्काँः → क ् क ा ँ ः
5.4.4 The Khanda-Ta sequence
5.4.4.1 A single ‘Khanda’-Ta (Z)
Example:
Z | ৎ
5.4.4.2 A Khanda Ta Combination10
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ as in ভর্ৎসনা (bhartsanā) ‘scolding’
Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
10 Refer to Rule P in Section 7, Table 16
5.4.5 Special Cases S and P
Two special cases involving Sequences (referred to as S and P in Table 16 under Section 7) could be described briefly here. Let us take up S in the first instance. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ - BANGLA LETTER A and U+098F এ - BANGLA LETTER E). For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF ya preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English ‘bat’, ‘fat’ etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as ‘vowel killer’, although it actually indicates absence of a vowel after the marked consonant. Only the consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in the orthography, wherein Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (Cf. Unicode 10.0, p. 473 [100]).
Another case refers to the formation of repha and ra-phalā in the said script, mentioned in the table above as P. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka “the sun”); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra “cycle”). The point is that in both the cases the slot for ra could be filled by Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā in both the cases remain the same.
6 Variants
This section talks about the variants in the Bangla script. The NBGP categorizes these confusing variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formations. These can cause confusion even to a careful observer and are hence being proposed as variants.
The variants are generated in a script when two or more forms are formed with different storage or code points. In Bangla, the e-kāra, ā-kāra and the o-kāra have different code points. One can type o with a consonant at one go (U+09CB), and the same by typing e-kāra and ā-kāra as two separate keys (U+09C7 followed by U+09BE), getting the same result. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other one typed with two different keys. However, this will not be considered as a case of variant, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants
However, we propose two cases of true in-script variants in the Bangla script.
CASE I
As far as true variants in Bangla are concerned, we may draw our attention to cases wherein Hasanta with (U+09A5) থ (tha) appears as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, could be a little confusing, where the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final positions (as shown below in e.g. no. 1). It could be typed wrongly as well, thinking it was a হ (ha) U+09B9, increasing the chances of risks in label writing and identification.
Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)
The fonts which represent the traditional Bangla writing system could tend to create this problem. Therefore, these may be taken as cases of variants in Bangla.
CASE II
Another interesting example of variant is encountered in ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing ‘repha’ (involving ra + Hasanta) and ‘ra-phalā’ (involving Hasanta + ra).
‘Repha’ could be formed by two sequences (mainly because both Assamese and Bangla find place in the same UNICODE points and ‘B_RA’ as well as ‘A_RA’ refer to the same phonetic element). Here the final ligatures look the same and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C
Where
B_RA = U+09B0 BENGALI LETTER RA (র), or
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)
Example:
Sequence 1 (Using Bangla RA) | Ligature 1 | Sequence 2 (Using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12: Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. Addition of Repha does not make any difference.
‘Ra-phala’ could be formed by two sequences on similar grounds, and the final ligatures would look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
Where
C1 = any consonant except Khanda-ta
Example:
Sequence 1 (Using Bangla RA) | Ligature 1 | Sequence 2 (Using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusability to the end-users. Hence the repha and ra-phala cases need to be defined as variants. NBGP concluded to define র and ৰ as variant code points, whereby only one variant set between র and ৰ could cover all cases. But this will create blocked variant labels: e.g. if someone registers “ররর”, the variant label “ৰৰৰ” will be generated as a variant and will be blocked, and vice versa. However, it is only blocked at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, it is still possible.
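The effect of defining র and ৰ as variant code points can be illustrated with a short Python sketch. It assumes that this single variant set is the only one in play and uses hypothetical helper names (variant_key, variant_labels, conflicts_with); an actual LGR implementation would work from the XML rule set instead.

```python
from itertools import product

B_RA = "\u09B0"  # Bangla ra
A_RA = "\u09F0"  # Assamese ra (RA WITH MIDDLE DIAGONAL)


def variant_key(label: str) -> str:
    """Map every member of the variant set to one representative (here Bangla ra)."""
    return label.replace(A_RA, B_RA)


def variant_labels(label: str) -> set[str]:
    """All other labels obtained by swapping the two ra letters at any position."""
    options = [(B_RA, A_RA) if ch in (B_RA, A_RA) else (ch,) for ch in label]
    return {"".join(combo) for combo in product(*options)} - {label}


def conflicts_with(candidate: str, registered: set[str]) -> bool:
    """True if the candidate is a variant of (or identical to) a registered label."""
    key = variant_key(candidate)
    return any(variant_key(existing) == key for existing in registered)
```

Under these assumptions, registering “ররর” blocks “ৰৰৰ” and the mixed forms, while labels of a different length such as “ৰৰ” have a different key and remain available.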
6.2 Cross-Script Variants
A crisp cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia11 (formerly Oriya), keeping in mind the visual and technical confusions they may cause as labels on the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain characters which are somewhat similar, they have not been included here. They have been provided in the Appendix (10.3) for reference.
11 Unicode uses Oriya for the script, although Odia is now the official term used
1. Bangla and Nāgarī / Devanāgarī Script
Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14 - Bangla and Devanāgarī cross-script variant code points
2. Bangla and Gurmukhī Script
Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15 - Bangla and Gurmukhī cross-script variant code points
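Tables 14 and 15 amount to a small lookup from a Bangla code point to its cross-script variants. The Python sketch below simply records that data; the dictionary name CROSS_SCRIPT_VARIANTS and the helper cross_script_variants are illustrative and not part of the proposal.

```python
# Bangla code point -> variants in other scripts, as listed in Tables 14 and 15
CROSS_SCRIPT_VARIANTS = {
    "\u09AE": {"Devanagari": "\u092E", "Gurmukhi": "\u0A38"},  # letter MA
    "\u09BF": {"Devanagari": "\u093F", "Gurmukhi": "\u0A3F"},  # vowel sign I
}


def cross_script_variants(code_point: str) -> dict[str, str]:
    """Return the cross-script variants of one Bangla code point (empty if none listed)."""
    return dict(CROSS_SCRIPT_VARIANTS.get(code_point, {}))
```

For example, cross_script_variants("\u09AE") returns the Devanāgarī and Gurmukhī entries for ম, while a code point such as ক yields an empty dictionary.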
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in section 3.2 when written in Bangla12 Script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications.
Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories, as mentioned in the table provided in the Code point repertoire (Section 5.1):
C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), or (a/e) Ya-phalā (V1 H C1 M1), where V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E); H is 09CD (্ - BENGALI SIGN VIRAMA); C1 is 09AF (য - BENGALI LETTER YA); M1 is 09BE (া - BENGALI VOWEL SIGN AA). S1 and S2 are valid even if they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL); H is 09CD (্ - BENGALI SIGN VIRAMA)
Table 16 - Symbols used in WLE rules
12 As used by the Unicode, denoting and including both Assamese and Maṇipuri
It is also perhaps ideal to mention here that in Bangla the consonant letters (or graphemes) are physically joined to form “clusters” that could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are “nearly 380 unique consonant clusters”, out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, Pg. 4]. More details of such combinations could be seen in Pabitra Sarkar (1993) [135].
7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়
2. H must be preceded by C
Example: ক্
3. M must be preceded by C
Example: কা
4. D must be preceded by either of V, C or M
Example: আঁ, খঁ, খাঁ, হাঁ
5. X must be preceded by either of V, C, M or D
Example: উঃ, খঃ, বঃ
6. B must be preceded by either of V, C, M or D
Example: আং, ইং, কং
7. Z must be preceded by V, C, M, D, B, X or P
Example: ইৎ, কৎ, র্ৎ
(S is not listed because S ends with M. Z may also follow S.)
8. V CANNOT be preceded by H
Details in 7.1.1, Case of V preceded by H
9. S CANNOT be preceded by H
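The nine rules above can be sketched as a small checker. The Python code below is a hedged illustration, not the normative LGR: it assumes NFC-normalized input, takes the code-point classes from Table 8, treats S as ending in M when judging what may follow it, and uses an internal class "R" (the two ra letters) so that rule P can be tested. The names classify, tokenize and is_valid_label are hypothetical, and the extra restrictions of Appendix 10.1 are not covered.

```python
VOWELS = set("\u0985\u0986\u0987\u0988\u0989\u098A\u098B\u098F\u0990\u0993\u0994")
RA = {"\u09B0", "\u09F0"}  # Bangla ra and Assamese ra (needed for rule P)
CONSONANTS = {chr(c) for c in (*range(0x0995, 0x09A9), *range(0x09AA, 0x09B0),
                               0x09B2, *range(0x09B6, 0x09BA), 0x09F1)}
MATRAS = {chr(c) for c in (*range(0x09BE, 0x09C5), 0x09C7, 0x09C8, 0x09CB, 0x09CC)}
SIGNS = {"\u0982": "B", "\u0981": "D", "\u0983": "X", "\u09CD": "H", "\u09CE": "Z"}

# What may immediately precede each class (rules 2 to 7); S counts as ending in M.
ALLOWED_BEFORE = {
    "H": {"C", "R"},
    "M": {"C", "R"},
    "D": {"V", "C", "R", "M", "S"},
    "X": {"V", "C", "R", "M", "D", "S"},
    "B": {"V", "C", "R", "M", "D", "S"},
    "Z": {"V", "C", "R", "M", "D", "B", "X", "S"},
}


def classify(ch):
    """Return the class letter of one code point, or None if outside the repertoire."""
    if ch in VOWELS:
        return "V"
    if ch in RA:
        return "R"
    if ch in CONSONANTS:
        return "C"
    if ch in MATRAS:
        return "M"
    return SIGNS.get(ch)


def tokenize(label):
    """Split a label into class tokens; S1/S2 and nukta forms become single tokens."""
    tokens, i = [], 0
    while i < len(label):
        if label[i] in "\u0985\u098F" and label[i + 1:i + 4] == "\u09CD\u09AF\u09BE":
            tokens.append("S")  # S1 or S2 from Table 9
            i += 4
        elif label[i] in "\u09A1\u09A2\u09AF" and label[i + 1:i + 2] == "\u09BC":
            tokens.append("C")  # normalized RRA, RHA or YYA (rule 1)
            i += 2
        else:
            cls = classify(label[i])
            if cls is None:
                return None
            tokens.append(cls)
            i += 1
    return tokens


def is_valid_label(label):
    """Apply rules 2 to 9 to every token of the label."""
    tokens = tokenize(label)
    if not tokens:
        return False
    for i, tok in enumerate(tokens):
        prev = tokens[i - 1] if i else None
        if tok in ("V", "S"):
            if prev == "H":  # rules 8 and 9
                return False
        elif tok in ("C", "R"):
            continue  # no restriction on what precedes a consonant
        elif tok == "Z" and prev == "H":
            if i < 2 or tokens[i - 2] != "R":  # rule 7: only P (ra + Hasanta)
                return False
        elif prev not in ALLOWED_BEFORE[tok]:
            return False
    return True
```

As a usage illustration, the label ভর্ৎসনা passes because its Khanda Ta follows ra + Hasanta, whereas ক followed by Hasanta and Khanda Ta is rejected, as are labels that begin with a mātrā, Hasanta, Anusvāra, Candrabindu or Visarga.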
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in adding such ligatures, because any user can easily create them even though they may not be attested combinations.
7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া bæŋkʌv ɪndiə (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE) (meaning Bank of India). This is the case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic communities; hence this is explicitly prohibited by the NBGP.
In future, if required, depending on the prevailing requirements from the community, the future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules ABNF puts in place. These are just given for reference purposes and are not meant to be comprehensive.
1. H, M, B, D or X cannot occur in the beginning of a Bangla word
Example:
্ক | ्क
াক | ाक
ংক | ंक
ঁক | ँक
ঃক | ःक
As can be seen, such a combination will result automatically in a “golu” or a dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.
2. H is not permitted after V, B, D, X, M, S
Example:
অ্ | अ्
কং্ | कं्
কঁ্ | कँ्
কঃ্ | कः्
কি্ | कि्
3. Number of B, D or X permitted after Consonant or Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalidated
Example:
কংং | कंं
কঁঁ | कँँ
কঃঃ | कःः
কাঁঁ | काँँ
কীঃঃ | कीःः
অংং | अंं
অঁঁ | अँँ
অঃঃ | अःः
4. Number of M permitted after Consonant is restricted to one
Example:
কীী | कीी
5. M is not permitted after V
Example:
ইা, ঈৌ | ईा, ईौ
6. The combinations of Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible
Example:
কংঃ | कंः
কঃং | कःं
8 Contributors
8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members
8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals’ Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje; Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920 (2000). The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. ‘Bangla Harapher Panch Parba’. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the MS University of Baroda, Social Science Number, 29.2, 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL, 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25 November 2017.
[124] Ethnologue: Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25 November 2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20 October 2019.
[126] Wikipedia: Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25 November 2017.
[129] Omniglot: Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 1052018.
[130] Wikipedia: Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25 November 2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20, accessed on 1052018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 1052018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 1052018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1123-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1.2, 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). ‘Srīhatta-Nāgarī Lipir Utpatti o Bikāś’. Bangla Academy Patrika (Dhaka), Vol 12 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia: Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19 May 2018.
[139] Furui, Ryosuke (2015). ‘Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century’, in Bhairabi Prasad Sahu & Hermann Kulke, eds. Interrogating Political Systems: Integrative Processes and States in Pre-Modern India, Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). ‘Phones of Bengali’. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646 (UNICODE). https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia: Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. ‘Bangla Spelling Reform: the Long and Short of It’. Bangla Journal 19, 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. “What has happened So Far In terms of Script Reforms”. Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka on 10 July 2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 - Core Specification, Chapter 12, p. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when applied to a specific language/script certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions will help to fine-tune the ABNF. In the case of Bangla13 in particular, the following rules apply:
1. Khaṇḍata (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold ‘varga’ (as defined under Table 5). Moreover, Bangla does not allow ya (য়) in the beginning of a word either, but we can cite a couple of native examples, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem ‘Lichuchor’ written by Kazi Nazrul Islam. However, there are instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) in the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍata (ৎ) followed by sa (স) in the beginning of a word, for example ৎসেরিং (Tsering).
2. CH can come with Khanda Ta in only the case where C is ra (র) (09B0):
র্ৎ as in ভর্ৎসনা
3. Only the following combinations with VHCM will be allowed:
→ অ্যা (together pronounced as æ) as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ) as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
102 lsquoSylhetiNagarılipirsquoorlsquoSilotirsquoThisversionofBanglascriptresemblesthe lsquoKaithīrsquoscript(ISO12954)usedbytheAccountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh andBiharndashwidelyinuseduringthe1880sTherewereseveralothernamesofSylheti
13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
Nagarī or Siloti (129), such as 'Jalalabada Nagarī', 'Fula (flower) Nagarī', 'Muslim Nagarī' or 'Muhammad Nagarī'. It is said that Shah Jalala had brought the script with him to Sylhet in the 13th-14th century (138), although some have suggested that it was an invention by the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script, with 32 symbols, are reproduced here (138).
Table 17 – The Script Table of Sylheti Nagarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable
Table 18 – Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhi
Bangla | Gurmukhi | NBGP Decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable
Table 19 – Bangla and Gurmukhi confusable code points
Bangla | Gurmukhi | NBGP Decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable
Table 20 – Bangla and Gurmukhī distinguishable code points
10.3.3 Bangla and Oriya (Odia)
Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable
Table 21 – Bangla and Oriya confusable code points
Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable
Table 22 – Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs
Consonants | Phonetic Value | Allographs: Clusters | Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (not privileged enough to have clusters as a first member) Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (not privileged enough to have clusters as a first member) egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value, but is mostly pronounced as n
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (not privileged enough to have clusters)
য dʒ The secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word, it is a vowel æ (as in শ8ামল, র 8াপার, etc.). But after a non-initial consonant, it just doubles it in pronunciation (as in কায6, ধায6, etc.). The র+য combination has two physical manifestations: র 8 and য6
ক8(p+য) স8(+য) র 8(+য) [Just র 8 is never used in Bangla orthography; র 8া is, but then its last two symbols, Ya-phala and ā-kara, constitute a vowel sign representing the vowel অ8া]
র r Two manifestations:
i. রেফ repʰ, as the first member of a cluster, e.g. প6 ৎ6 not6 য6 D6(+deg+ব) (earlier E6=+brvbar+deg+ব, a four-term cluster), etc. (placed over the following consonant)
ii. র-ফলা rɔ-pʰɔla, as the second or third member of a cluster (placed under the consonant it follows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃকঃ
অব
The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Hasanta) where default conjunct formation is to be explicitly restricted and the Hasanta joining the two consonants participating in the conjunct formation needs to be explicitly shown.
3.3.9 Use of Ya-phalaa
Ya-phalaa sequences are two instances in Bangla where Hasanta is preceded by a full vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E):
• অ্যা 0985 09CD 09AF 09BE: BENGALI LETTER A + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
• এ্যা 098F 09CD 09AF 09BE: BENGALI LETTER E + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA
For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD Hasanta plus U+09AF ya, preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English: 'acid' অ্যাসিড, 'association' অ্যাসোসিয়েশন, 'bat' ব্যাট, 'fat' ফ্যাট, 'mat' ম্যাট, 'cap' ক্যাপ, etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But, as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
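The restriction described above can be sketched in a few lines of Python. This is only an illustration, not part of the normative LGR: the function name hasanta_after_vowel_ok is hypothetical, and the vowel set is taken from the independent vowels listed in the repertoire of this proposal.

```python
HASANTA = "\u09CD"  # BENGALI SIGN VIRAMA

# Independent vowels of the proposed repertoire (assumed from Table 8):
# U+0985..U+098B, U+098F, U+0990, U+0993, U+0994.
VOWELS = {chr(cp) for cp in (*range(0x0985, 0x098C), 0x098F, 0x0990, 0x0993, 0x0994)}

# The only two sequences in which Hasanta may directly follow a vowel.
YA_PHALAA_SEQUENCES = {
    "\u0985\u09CD\u09AF\u09BE",  # A + VIRAMA + YA + VOWEL SIGN AA
    "\u098F\u09CD\u09AF\u09BE",  # E + VIRAMA + YA + VOWEL SIGN AA
}


def hasanta_after_vowel_ok(label: str) -> bool:
    """Return False if a Hasanta follows a vowel outside the two allowed sequences."""
    for i, ch in enumerate(label):
        if ch == HASANTA and i > 0 and label[i - 1] in VOWELS:
            # The four code points starting at the vowel must be one of the sequences.
            if label[i - 1:i + 3] not in YA_PHALAA_SEQUENCES:
                return False
    return True
```

For instance, the code points of অ্যাসিড pass the check, while a Hasanta placed after আ (U+0986) does not.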
3.3.10 Formation of Ra-phalaa and Ref Sequences
This case refers to the formation of repha and ra-phalā as follows:
Ra-Hasanta = (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL), and H is 09CD (্ - BENGALI SIGN VIRAMA).
Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র chakra "cycle"). The point is, in
both the cases the slot for ra could be filled by Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases. The LGR makes a note of this point of concern with respect to the two RAs in disguise, as it would be completely impossible to distinguish between them with the naked eye in a label so generated, which may consequently lead to concerns related to spoofing and other kinds of cyber irregularities. The motive for classing these two code points as (blocking) variants is that fully rendered labels may mask the distinction between Bangla ra র (U+09B0) and the Assamese ra ৰ (U+09F0). That provides the justification for Variant Set 4, though only in the context of a following Hasanta. The difference between the RAs is only distinguishable if one looks into their Unicode values. Therefore labels such as অর্ক arka, শীর্ষ śīrṣa 'top, apex', অভ্র abhra 'cloud, the sky' and শ্রম śrama 'physical labour' could be extremely dangerous, as the web user may never verify the digital content (the labels) against its Unicode code points. This point is made explicitly with reference to Table 9 (of sequences, p. 36) and Table 16 (of WLE Symbols, p. 47) that are to follow. Moreover, it is noteworthy that the REPHA can also occur with KHANDA TA. The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
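To make the concern concrete, the following Python sketch computes a comparison key under which the two RAs collapse whenever they sit next to a Hasanta. It is a hedged illustration only: variant_key is a hypothetical helper, and treating both the repha context (RA followed by Hasanta) and the ra-phalā context (RA preceded by Hasanta) as variant contexts is an assumption drawn from the description above, not a statement of how the XML ruleset encodes it.

```python
B_RA = "\u09B0"     # BENGALI LETTER RA
A_RA = "\u09F0"     # BENGALI LETTER RA WITH MIDDLE DIAGONAL (Assamese ra)
HASANTA = "\u09CD"  # BENGALI SIGN VIRAMA


def variant_key(label: str) -> str:
    """Map Assamese ra to Bangla ra wherever it is adjacent to a Hasanta."""
    out = []
    for i, ch in enumerate(label):
        if ch == A_RA:
            followed = i + 1 < len(label) and label[i + 1] == HASANTA  # repha
            preceded = i > 0 and label[i - 1] == HASANTA               # ra-phala
            if followed or preceded:
                ch = B_RA
        out.append(ch)
    return "".join(out)


def are_ra_variants(a: str, b: str) -> bool:
    """Two labels collide if they differ only by RA choice in a Hasanta context."""
    return variant_key(a) == variant_key(b)
```

Two encodings of arka that differ only in the RA used then compare as equal, whereas a standalone ৰ and a standalone র remain distinct.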
4 Overall Development Process and Methodology
The Neo-Brāhmī Generation Panel (NBGP) has been formed by members having experience in Linguistics (especially in NLP and Computational Linguistics), Literature, Language History and Epigraphy. Under the Neo-Brāhmī Generation Panel, Bangla and eight other scripts belonging to separate Unicode blocks are being taken up, so as to assign a separate LGR to each. However, an attempt is made to ensure that the fundamental philosophy behind building those LGRs is consistent with all other Brāhmī-derived scripts. The present LGR will cater to multiple languages belonging to EGIDS scale 1 to 4 (see Table 4) that use the Bangla script. The following guiding principles are used in making decisions about Bangla LGR code points.
4.1 Guiding Principles
The NBGP adopts the following broad principles for selection of code points in the code-point repertoire, across the board, for all the Neo-Brāhmī scripts within its ambit.
4.1.1 Inclusion Principles
4.1.1.1 Modern Usage
Every character proposed should be in the everyday usage of a particular linguistic community. The characters which have been encoded in Unicode for transcription
purposes only, or for archival purposes, will not be considered for inclusion in the code-point repertoire.
4.1.1.2 Unambiguous Use
Every character proposed should have an unambiguous understanding among linguists about its usage in the language.
4.2 Exclusion Principles
The main exclusion principle is that of External Limits on Scope. These consist of protocols or standards which are prerequisites to the Label Generation Rule-sets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.
4.2.1 External Limits of Scope
The code point repertoire for the root zone being a very special case at the top of the protocol hierarchies, the canvas of available characters for selection as part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:
i. The Unicode Chart
Out of all the characters that are needed by the script in question, if a particular character is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, and especially so in the Bangla-Asamiyā-Manipuri Writing System, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.
ii. IDNA Protocol
Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded, as far as possible, all the characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters of the Unicode repertoire from being part of domain names.
iii. Maximal Starting Repertoire (MSR)
The Root Zone LGR being the repertoire of characters which are going to be used for creation of the Root Zone TLDs, which in turn constitute an even more specialized case of domain names, the ROOT LGR procedure introduces additional exclusions on IDNA's allowed set of characters.
Example: Bangla Sign Avagraha ঽ (U+09BD), even if allowed by the IDNA protocol, is not permitted in the Root Zone Repertoire as per the MSR.
To sum up, the restrictions start off with admitting only such characters as are part of the code block of the given script/language. The IDNA Protocol further narrows this down, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
4.2.1.1 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brāhmī-based scripts will not be included.
4.2.1.2 No Symbols and Abbreviations
Abbreviations, weights and measures, and other such iconic characters like BANGLA ISSHAR ৺ (U+09FA), BANGLA CURRENCY DENOMINATOR SIXTEEN ৹ (U+09F9), etc. will also not be included.
4.2.1.3 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, such as the Sanskritic VOCALIC RR ৠ (U+09E0) and VOCALIC L ঌ (U+098C), as well as VOCALIC LL ৡ (U+09E1), and the allographic -kāra forms of the latter two symbols: VOWEL SIGN VOCALIC L ৢ (U+09E2) and VOWEL SIGN VOCALIC LL ৣ (U+09E3). All such characters are excluded, which complies with the Conservatism principle as laid down in the Root Zone LGR procedure. However, in Bangla the -kāra corresponding to VOCALIC RR ৠ (U+09E0), which is VOWEL SIGN VOCALIC RR ৄ (U+09C4), is still in active use in certain limited borrowed or Sanskritic words and is therefore retained.
4.2.1.4 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit will not be included. This is also in consonance with the Letter principle as laid down in the Root Zone LGR procedure.
4.2.1.5 ABNF
The Augmented Backus-Naur Formalism (ABNF) is described in Section 5.4.1 and the Appendix (Section 10.1).
5 Repertoire
The Bangla Writing System is represented in UNICODE using the Bengali (Bangla) script name as enumerated in ISO 15924, corresponding to languages such as Asamiyā (Assamese), Bangla (Bengali) and Manipuri. The BENGALI block used for Bangla-Asamiyā-Manipuri in UNICODE has 93 entries. This section details the code-point repertoire that the Neo-Brāhmī Generation Panel [NBGP] proposes to be included in the Bangla LGR.
It may be mentioned here that the Government of Assam submitted a proposal to the Bureau of Indian Standards (BIS) on 26th February 2016 for dis-unification of the Bangla and Asamiyā scripts. The BIS, in the 8th Meeting of the Indian Language Technologies and Products Sectional Committee, LITD 20, held on 23rd Aug 2017, decided to refer the proposal for recognition of the Assamese script in ISO/IEC 10646 to ISO. Until the UNICODE Consortium takes any further action, it will be assumed that the Code Point Repertoire under Table 8 will be valid for all the three languages as above.
For each of the code points, language references have been given in the last column, titled Reference, under Table 8, titled the "Code Point Repertoire". For entire coverage of Bangla code points, references for Bangla, Asamiyā (Assamese), Manipuri (Meitei) and Bishnupriya are given. Kokborok, written in the Bangla script, is not known to have introduced many new complications, except for one particular character. Though only a few representative languages under EGIDS Scale 1-4 have been chosen for referencing, they together cover all the code points required for all the languages that the NBGP has considered, as given under Bangla Unicode points (as given in UNICODE 6.3).
However, before the details are presented, it is useful to look at the Bangla code point chart from the Maximal Starting Repertoire [MSR] Version 3. It may be noted that the shapes of the reference glyphs given below in the code charts are based on one of the many fonts designed and are not prescriptive, because there could be some variations in actual fonts, both UNICODE-compatible and TrueType ones. Consider the following code point table:
Figure 1 – Bangla Code Page from [MSR] for Bangla-Asamiyā-Manipuri
Colour convention:
All characters that are included in the [MSR] - yellow background
PVALID in IDNA2008 but excluded from the [MSR] - pinkish background
Not PVALID in IDNA2008, or ineligible for the root zone (digits, hyphen) - white background
Given the Bangla Unicode block as in Figure 1, among the code points that are included in the MSR, the following symbols will need separate treatment:
ৎ U+09CE Bangla Letter Khanda-Ta
ৰ U+09F0 Asamiyā-Bangla Letter Ra With Middle Diagonal
ৱ U+09F1 Asamiyā-Bangla Letter Ra With Lower Diagonal
5.1 Code Point Repertoire Inclusion
No | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
1 | U+0981 | ঁ | BENGALI SIGN CANDRABINDU | Candra-bindu | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
2 | U+0982 | ং | BENGALI SIGN ANUSVARA | Onushshar (Anusvara) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
3 | U+0983 | ঃ | BENGALI SIGN VISARGA | Biśarga (Visarga) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
4 | U+0985 | অ | BENGALI LETTER A | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
5 | U+0986 | আ | BENGALI LETTER AA | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
6 | U+0987 | ই | BENGALI LETTER I | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
7 | U+0988 | ঈ | BENGALI LETTER II | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
8 | U+0989 | উ | BENGALI LETTER U | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
9 | U+098A | ঊ | BENGALI LETTER UU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
10 | U+098B | ঋ | BENGALI LETTER VOCALIC R | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
11 | U+098F | এ | BENGALI LETTER E | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
12 | U+0990 | ঐ | BENGALI LETTER AI | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
13 | U+0993 | ও | BENGALI LETTER O | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
14 | U+0994 | ঔ | BENGALI LETTER AU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
15 | U+0995 | ক | BENGALI LETTER KA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
16 | U+0996 | খ | BENGALI LETTER KHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
17 | U+0997 | গ | BENGALI LETTER GA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
18 | U+0998 | ঘ | BENGALI LETTER GHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
19 | U+0999 | ঙ | BENGALI LETTER NGA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
20 | U+099A | চ | BENGALI LETTER CA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
21 | U+099B | ছ | BENGALI LETTER CHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
22 | U+099C | জ | BENGALI LETTER JA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
23 | U+099D | ঝ | BENGALI LETTER JHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
24 | U+099E | ঞ | BENGALI LETTER NYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
25 | U+099F | ট | BENGALI LETTER TTA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
26 | U+09A0 | ঠ | BENGALI LETTER TTHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
27 | U+09A1 | ড | BENGALI LETTER DDA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
28 | 09A1 09BC (U+09DC) | ড় | Normalized form of BENGALI LETTER RRA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. 09DC is the preferred code point; however, it is not available for LGR as per the standards governing this LGR development
29 | U+09A2 | ঢ | BENGALI LETTER DDHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
30 | 09A2 09BC (U+09DD) | ঢ় | Normalized form of BENGALI LETTER RHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. 09DD is the preferred code point; however, it is not available for LGR as per the standards governing this LGR development
31 | U+09A3 | ণ | BENGALI LETTER NNA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
32 | U+09A4 | ত | BENGALI LETTER TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
33 | U+09A5 | থ | BENGALI LETTER THA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
34 | U+09A6 | দ | BENGALI LETTER DA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
35 | U+09A7 | ধ | BENGALI LETTER DHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
36 | U+09A8 | ন | BENGALI LETTER NA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
37 | U+09AA | প | BENGALI LETTER PA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
38 | U+09AB | ফ | BENGALI LETTER PHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
39 | U+09AC | ব | BENGALI LETTER BA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
43 | 09AF 09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. 09DF is the preferred code point; however, it is not available for LGR as per the standards governing this LGR development
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112] [125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
50 | U+09BE | া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
51 | U+09BF | ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
52 | U+09C0 | ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
53 | U+09C1 | ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112] [122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (= Halant), Virama (= Da ri) | 1 Bangla, 2 Assamese, 2 Manipuri | [112] [122] [125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122] [125]
Table 8 – Bangla Code-Point Repertoire
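The repertoire in Table 8 can be expressed compactly as data. The Python sketch below is an illustration only, not the machine-readable XML that accompanies the proposal: the names SINGLE_CODE_POINTS, NUKTA_SEQUENCES and in_repertoire are hypothetical, and the check covers repertoire membership alone, not the whole-label evaluation rules.

```python
# Single code points of Table 8 (61 entries).
SINGLE_CODE_POINTS = [
    *range(0x0981, 0x0984),          # candrabindu, anusvara, visarga
    *range(0x0985, 0x098C),          # A .. VOCALIC R
    0x098F, 0x0990, 0x0993, 0x0994,  # E, AI, O, AU
    *range(0x0995, 0x09A9),          # KA .. NA
    *range(0x09AA, 0x09B1),          # PA .. RA
    0x09B2,                          # LA
    *range(0x09B6, 0x09BA),          # SHA .. HA
    *range(0x09BE, 0x09C5),          # vowel signs AA .. VOCALIC RR
    0x09C7, 0x09C8, 0x09CB, 0x09CC,  # vowel signs E, AI, O, AU
    0x09CD, 0x09CE, 0x09F0, 0x09F1,  # virama, khanda ta, two Assamese letters
]
# Rows 28, 30 and 43: consonant + nukta, admitted only as sequences.
NUKTA_SEQUENCES = ["\u09A1\u09BC", "\u09A2\u09BC", "\u09AF\u09BC"]

SINGLES = {chr(cp) for cp in SINGLE_CODE_POINTS}
REPERTOIRE = [chr(cp) for cp in SINGLE_CODE_POINTS] + NUKTA_SEQUENCES


def in_repertoire(label: str) -> bool:
    """True if the label can be split entirely into repertoire entries."""
    i = 0
    while i < len(label):
        if label[i:i + 2] in NUKTA_SEQUENCES:
            i += 2
        elif label[i] in SINGLES:
            i += 1
        else:
            return False
    return True
```

With this data the list has 64 entries, matching the 64 rows of the table, and a bare nukta (U+09BC) or the precomposed U+09DC is rejected.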
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes some specific sequences, which enable conditional inclusion of the Bangla LETTER A and E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, again followed by Bangla VOWEL SIGN AA, in the repertoire, for enabling inclusion of the æ sound as in English 'bat', 'cat', etc.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
S1 | 0985 09CD 09AF 09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112] [122]
S2 | 098F 09CD 09AF 09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9 – Sequences
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find a place in Unicode but have not been included in the repertoire in the LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.
Sr. No. | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10 – Excluded Code Points
5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire, since it will never be used alone. It will be used only in sequence, in the normalized form of three special characters: ড়, ঢ় and য়.
Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড়, ঢ় and য় respectively
Table 10b – Excluded Code Points
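The reason the three letters appear as consonant + nukta sequences can be checked with Python's standard unicodedata module: the precomposed code points U+09DC, U+09DD and U+09DF are excluded from composition, so Unicode Normalization Form C yields the two-code-point sequences. The sketch below assumes, as IDNA2008 requires, that labels are handled in NFC; the function name normalize_label is hypothetical.

```python
import unicodedata

# Precomposed letters and the sequences they normalize to under NFC.
PRECOMPOSED_TO_SEQUENCE = {
    "\u09DC": "\u09A1\u09BC",  # RRA  -> DDA + NUKTA
    "\u09DD": "\u09A2\u09BC",  # RHA  -> DDHA + NUKTA
    "\u09DF": "\u09AF\u09BC",  # YYA  -> YA + NUKTA
}


def normalize_label(label: str) -> str:
    """Return the NFC form of a label, as expected of IDN labels."""
    return unicodedata.normalize("NFC", label)


for precomposed, sequence in PRECOMPOSED_TO_SEQUENCE.items():
    # NFC decomposes the precomposed letter and does not recompose it.
    assert normalize_label(precomposed) == sequence
    assert normalize_label(sequence) == sequence
```

This is why the repertoire lists 09A1 09BC, 09A2 09BC and 09AF 09BC rather than the single code points, while U+09BC on its own stays outside the repertoire.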
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20.11.2009 and 18.07.2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) begins with the following variables:
C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta
The Augmented Backus-Naur Formalism (ABNF) will use the following operators:
Sr. Number | Operator | Function
1 | "|" | Alternative
2 | "[ ]" | Optional
3 | "*" | Variable Repetition
4 | "( )" | Sequence Group
Table 11 – The ABNF Formalism
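As an aid to reading the sequences that follow, the ABNF variables listed above can be mapped onto code points so that any label is reduced to a string of class letters. The Python sketch below is an assumption-laden illustration: the class assignments follow the Category column of Table 8, class_string is a hypothetical helper, a nukta is simply folded into a preceding consonant, and anything outside the mapping is marked "?".

```python
def _build_classes() -> dict:
    """Map each repertoire code point to its ABNF variable letter."""
    table = {}

    def put(letter, code_points):
        for cp in code_points:
            table[chr(cp)] = letter

    put("V", [*range(0x0985, 0x098C), 0x098F, 0x0990, 0x0993, 0x0994])
    put("C", [*range(0x0995, 0x09A9), *range(0x09AA, 0x09B1), 0x09B2,
              *range(0x09B6, 0x09BA), 0x09F0, 0x09F1])
    put("M", [*range(0x09BE, 0x09C5), 0x09C7, 0x09C8, 0x09CB, 0x09CC])
    put("B", [0x0982])  # Anusvara
    put("D", [0x0981])  # Candrabindu
    put("X", [0x0983])  # Visarga
    put("H", [0x09CD])  # Hasanta / Virama
    put("Z", [0x09CE])  # Khanda Ta
    return table


CLASS_OF = _build_classes()
NUKTA = "\u09BC"


def class_string(label: str) -> str:
    """Reduce a label to its sequence of ABNF variable letters."""
    out = []
    for ch in label:
        if ch == NUKTA and out and out[-1] == "C":
            continue  # simplification: nukta merges into the preceding consonant
        out.append(CLASS_OF.get(ch, "?"))
    return "".join(out)
```

For example, কং reduces to "CB" and the sequence 0985 09CD 09AF 09BE reduces to "VHCM".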
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given to facilitate understanding.
5.4.2 The Vowel Sequence
To facilitate understanding for users of other Brāhmī scripts, equivalents in Devanāgarī are provided wherever necessary. A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra / onuʃʃār (B), a Candrabindu (D) or a Visarga / biʃɔrgo (X). The number of D, B or X signs which can follow a V in Bangla is not restricted to one. Going by the rules illustrated in the document, it is clear that formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have sequences in which a Candrabindu is followed by an Anusvāra on the one hand, or by a Visarga on the other, as in হ্যাঁংচা 'hænchā' and হ্যাঁঃ 'hæn' respectively. The possibility of a Visarga or an Anusvāra (onuʃʃār) following a Candrabindu thus exists in Bangla.
A vowel can also optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phala. Ya-phala is a presentation form of U+09AF, Bangla letter য or 'ya', represented by the sequence <U+09CD ্ BENGALI SIGN VIRAMA (Bangla SIGN Hasanta or VIRAMA), U+09AF য BENGALI LETTER YA>. Ya-phala has a special form ্য. Again, when combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. 'aa' (ā)), it is used for transcribing [æ], as in the "a" in the English word "bat", written in Bangla as ব্যাট. A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel
Examples:
V | অ | अ
5.4.2.2 A Vowel with Conditions
A vowel can optionally be followed by Anusvāra [B], Candrabindu [D], Visarga [X], Candrabindu + Anusvāra [DB] or Candrabindu + Visarga [DX], or by a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples:
VB | অং | अं
VD | অঁ | अँ
VX | অঃ | अः
VDB | অঁং | अँं
VDX | অঁঃ | अँः
VHCM | অ্যা, এ্যা
5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], Candrabindu [D], Visarga [X], Candrabindu + Anusvāra [DB] or Candrabindu + Visarga [DX].
Examples:
VHCMB | অ্যাং, এ্যাং
VHCMD | অ্যাঁ, এ্যাঁ
VHCMX | অ্যাঃ, এ্যাঃ
VHCMDB | অ্যাঁং, এ্যাঁং
VHCMDX | অ্যাঁঃ, এ্যাঁঃ
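The vowel-sequence combinations listed above can be summarised as a single pattern over the ABNF variable letters. The Python sketch below is one possible reading, not the normative rule set: VOWEL_SEQUENCE and is_vowel_sequence are hypothetical names, and the pattern works on class letters only, so it cannot enforce that the VHCM part is limited to the two sequences অ্যা and এ্যা.

```python
import re

# V, optionally followed by HCM (Ya-phalaa + AA), optionally followed by
# one of B, D, X, DB or DX.
VOWEL_SEQUENCE = re.compile(r"V(?:HCM)?(?:DB|DX|B|D|X)?")


def is_vowel_sequence(classes: str) -> bool:
    """Check a string of ABNF variable letters against the vowel-sequence pattern."""
    return VOWEL_SEQUENCE.fullmatch(classes) is not None
```

Under this reading "VDB" and "VHCMDX" are accepted, while the doubled forms "VDD", "VBB" and "VXX" are rejected, in line with the text.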
5.4.3 The Consonant Sequence
5.4.3.1 A Single Consonant (C)
Example:
C | ক | क
5.4.3.2 A Consonant with Conditions
A consonant can optionally be followed by a dependent vowel sign, kāra (Mātrā) [M], or by Anusvāra [B], Candrabindu [D], Visarga [X], Hasanta (also known as Virama) [H], Candrabindu + Anusvāra [DB] or Candrabindu + Visarga [DX].
Example:
CM | কি | कि
CB | কং | कं
CD | কঁ | कँ
CX | কঃ | कः
CH | ক্ | क् (pure consonant)
CDB | কঁং | कँं
CDX | কঁঃ | कँः
5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB or DX.
Example:
CMB | কীং | कीं
CMD | কাঁ | काँ
CMX | বীঃ | वीः
CMDB | কাঁং | काँं
CMDX | কাঁঃ | काँः
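5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Example:
CHC | ন্ত → ন + ্ + ত | न्त → न + ् + त
CHCHC | ন্ত্র → ন + ্ + ত + ্ + র | न्त्र → न + ् + त + ् + र
CHCHCHC | ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য | न्त्र्य → न + ् + त + ् + र + ् + य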
5.4.3.5 Subsets
While considering its subsets, as a representative example we will consider the combination CHC only; however, the same is equally applicable to CHCHC and CHCHCHC.
[A] The combination may be followed by M, B, D, X, DB or DX.
Example:
CHCM | ক্কী → ক ্ ক ী | क्की → क ् क ी
CHCB | ক্কং → ক ্ ক ং | क्कं → क ् क ं
CHCD | ক্কঁ → ক ্ ক ঁ | क्कँ → क ् क ँ
CHCX | ক্কঃ → ক ্ ক ঃ | क्कः → क ् क ः
CHCDB | ক্কঁং → ক ্ ক ঁ ং | क्कँं → क ् क ँ ं
CHCDX | ক্কঁঃ → ক ্ ক ঁ ঃ | क्कँः → क ् क ँ ः
[B] *3(CH)CM may further be followed by B, D, X, DB or DX.
Example:
CHCMB | ক্কীং → ক ্ ক ী ং | क्कीं → क ् क ी ं
CHCMD | ক্কাঁ → ক ্ ক া ঁ | क्काँ → क ् क ा ँ
CHCMX | ক্কীঃ → ক ্ ক ী ঃ | क्कीः → क ् क ी ः
CHCMDB | ক্কাঁং → ক ্ ক া ঁ ং | क्काँं → क ् क ा ँ ं
CHCMDX | ক্কাঁঃ → ক ্ ক া ঁ ঃ | क्काँः → क ् क ा ँ ः
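The consonant-sequence rules of Sections 5.4.3.1 to 5.4.3.5 can likewise be condensed into one pattern over the ABNF variable letters. The following Python sketch is a hedged reading of those rules: CONSONANT_SEQUENCE and is_consonant_sequence are hypothetical names, and a final Hasanta is accepted only after a single consonant (CH), because that is the only such case the text lists.

```python
import re

# Either a pure consonant (CH), or up to three CH pairs followed by C
# (that is, *3(CH)C), an optional M, and an optional B, D, X, DB or DX.
CONSONANT_SEQUENCE = re.compile(r"CH|(?:CH){0,3}CM?(?:DB|DX|B|D|X)?")


def is_consonant_sequence(classes: str) -> bool:
    """Check a string of ABNF variable letters against the consonant-sequence pattern."""
    return CONSONANT_SEQUENCE.fullmatch(classes) is not None
```

Clusters of up to four consonants such as "CHCHCHC" match, while a five-consonant cluster or a doubled sign such as "CBB" does not.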
5.4.4 The Khanda-Ta sequence
5.4.4.1 A single 'Khanda'-Ta (Z)
Example:
Z | ৎ (= ত্)
5.4.4.2 A Khanda Ta Combination (see note 10)
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) 'scolding'.
Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
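The context condition on Khanda Ta can be sketched as a small check on code points. This Python fragment is illustrative only: khanda_ta_context_ok is a hypothetical name, and it tests just the one condition stated above (a Hasanta before Khanda Ta must itself be preceded by one of the two RAs), not the other restrictions that apply to Khanda Ta.

```python
KHANDA_TA = "\u09CE"                # BENGALI LETTER KHANDA TA
HASANTA = "\u09CD"                  # BENGALI SIGN VIRAMA
RA_LETTERS = {"\u09B0", "\u09F0"}   # Bangla ra and Assamese ra


def khanda_ta_context_ok(label: str) -> bool:
    """A Khanda Ta preceded by Hasanta is accepted only in RA + Hasanta + Khanda Ta."""
    for i, ch in enumerate(label):
        if ch != KHANDA_TA:
            continue
        if i >= 1 and label[i - 1] == HASANTA:
            if i < 2 or label[i - 2] not in RA_LETTERS:
                return False
    return True
```

The code points of ভর্ৎসনা satisfy the condition, whereas ka + Hasanta + Khanda Ta does not.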
5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) can be described briefly here. Let us take up S first. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full
10 Refer to Rule P in Section 7 Table 16
vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E). For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF ya, preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English 'bat', 'fat', etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But, as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
Another case refers to the formation of repha and ra-phalā in the said script, mentioned in the table above as P. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra "cycle"). The point is, in both the cases the slot for ra could be filled by Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.
6 Variants
This section talks about the variants in the Bangla script. The NBGP categorizes these confusing variants into two groups:
Group 1: Confusing due to pure visual similarity.
Group 2: Confusing due to deviation from the character formations normally perceived by the larger linguistic community.
For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formation. They can cause confusion even to a careful observer and are hence being proposed as variants.
Variants are generated in a script when two or more forms are produced with different storage or code points. In Bangla, the e-kāra, ā-kāra and o-kāra have different code points. One can type o with a consonant at one go, or produce the same by typing e-kāra and ā-kāra as two separate keys, getting the same result. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered a case of variant, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants
However, we propose two cases of true in-script variants in the Bangla script.
CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases wherein থ (tha, U+09A5) appears with Hasanta as a conjunct with স (sa, U+09B8) and ন (na, U+09A8):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, could be a little confusing, as the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final position (as shown below in example no. 1). It could also be typed wrongly by someone who takes it for a হ (ha) U+09B9, increasing the risks in label writing and identification.

Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)

Fonts which represent the traditional Bangla writing system tend to create this problem. Therefore, these may be taken as cases of variants in Bangla.

CASE II

Another interesting example of variants is encountered in the ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra). 'Repha' could be formed by two sequences (mainly because both Assamese and Bangla find place in the same Unicode block, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here the final ligatures look the same, and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C

Where
B_RA = U+09B0 BENGALI LETTER RA (র), or
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)
Example

Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ

Table 12: Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. Addition of Repha does not make any difference.

'Ra-phalā' could be formed by two sequences on similar grounds, and the final ligatures would look the same:

(1) C1 + H + B_RA
(2) C1 + H + A_RA

Where C1 = any consonant except Khanda-ta
Example

Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ

Table 13: Example of Ra-phalā
As the Assamese and Bangla Repha and Ra-phalā conjunct forms look the same, they could be confused by end users. Hence the repha and ra-phalā cases need to be defined as variants. The NBGP concluded to define র and ৰ as variant code points, so that a single variant set of র and ৰ covers all cases. This will, however, create blocked variant labels: e.g. if someone registers "ররর", the variant label "ৰৰৰ" will be generated as a variant and will be blocked, and vice versa. The blocking applies only at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, that is still possible.
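
The blocking behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `variant_labels` and `is_blocked` are hypothetical, the sketch assumes that every permutation of র/ৰ in a label counts as a variant label, and the normative behaviour is whatever the machine-readable LGR file defines.

```python
from itertools import product

BANGLA_RA = "\u09B0"    # র  BENGALI LETTER RA
ASSAMESE_RA = "\u09F0"  # ৰ  BENGALI LETTER RA WITH MIDDLE DIAGONAL
VARIANT_SET = (BANGLA_RA, ASSAMESE_RA)


def variant_labels(label):
    """Return every other label obtained by swapping র and ৰ at any position.

    Assumption: all permutations are treated as (blocked) variant labels.
    """
    choices = [VARIANT_SET if ch in VARIANT_SET else (ch,) for ch in label]
    return {"".join(combo) for combo in product(*choices)} - {label}


def is_blocked(candidate, registered):
    """A candidate is blocked if it is a variant of an already registered label."""
    return any(candidate in variant_labels(reg) for reg in registered)


registered = {BANGLA_RA * 3}                      # someone registered "ররর"
print(is_blocked(ASSAMESE_RA * 3, registered))    # the variant "ৰৰৰ"
print(is_blocked(ASSAMESE_RA * 2, registered))    # a different label, "ৰৰ"
```

Because blocking works on whole labels, a label of a different length is never a variant of the registered one, which matches the remark that ৰৰ or ৰৰৰৰ remain available.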
6.2 Cross-Script Variants

A concise cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia11 (formerly Oriya), keeping in mind the visual and technical confusion they may cause as labels in the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although certain other characters are somewhat similar, they have not been included here. They are provided in the Appendix (10.2) for reference.

11 Unicode uses Oriya for the script, although Odia is now the official term used.
1. Bangla and Nāgarī/Devanāgarī Script

Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F

Table 14 - Bangla and Devanāgarī cross-script variant code points

2. Bangla and Gurmukhī Script

Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F

Table 15 - Bangla and Gurmukhī cross-script variant code points
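
As a rough illustration of how the pairs in Tables 14 and 15 could be applied to whole labels, the sketch below maps each code point to a representative of its variant set and compares the results. The names `VARIANT_SETS`, `variant_key` and `are_variants` are hypothetical, and merging the Devanāgarī and Gurmukhī partners of one Bangla code point into a single set (transitivity) is an assumption of this sketch, not something the tables state.

```python
# Variant sets assembled from Tables 14 and 15 (transitivity assumed).
VARIANT_SETS = [
    {"\u09AE", "\u092E", "\u0A38"},  # Bangla ম, Devanāgarī म, Gurmukhī ਸ
    {"\u09BF", "\u093F", "\u0A3F"},  # Bangla ি, Devanāgarī ि, Gurmukhī ਿ
]

# Map every member of a set to one representative (the lowest code point).
CANONICAL = {ch: min(group) for group in VARIANT_SETS for ch in group}


def variant_key(label):
    """Replace each code point by the representative of its variant set."""
    return "".join(CANONICAL.get(ch, ch) for ch in label)


def are_variants(a, b):
    """Two distinct labels are variants if their variant keys coincide."""
    return a != b and variant_key(a) == variant_key(b)


print(are_variants("\u09AE\u09BF", "\u092E\u093F"))  # Bangla মি vs Devanāgarī मि
```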
7 Whole Label Evaluation Rules (WLE)

This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Bangla12 script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications. Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories, as mentioned in the table provided in the code point repertoire (Section 5.1).

C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), or (æ) Ya-phalā (V1 H C1 M1), where V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E), H is 09CD (্ - BENGALI SIGN VIRAMA), C1 is 09AF (য - BENGALI LETTER YA) and M1 is 09BE (া - BENGALI VOWEL SIGN AA). S1 and S2 are valid even if they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL) and H is 09CD (্ - BENGALI SIGN VIRAMA)

Table 16 - Symbols used in WLE rules

12 As used by Unicode, denoting and including both Assamese and Maṇipuri.
It is also perhaps worth mentioning here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters" that could theoretically conjoin two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, p. 4]. More details of such combinations can be seen in Pabitra Sarkar (1993) [135].

7.1 Final Set of WLE Rules

Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়
2. H must be preceded by C
   Example: ক্
3. M must be preceded by C
   Example: কা
4. D must be preceded by either of V, C or M
   Example: আঁ খঁ খাঁ হাঁ
5. X must be preceded by either of V, C, M or D
   Example: উঃ খঃ বঃ াঃ দ ঃ
6. B must be preceded by either of V, C, M or D
   Example: আং ইং কং
7. Z must be preceded by V, C, M, D, B, X or P
   Example: ইৎ কৎ াৎ াৎ প6ৎ rৎ (S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V preceded by H
9. S CANNOT be preceded by H
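
To make the rule list concrete, here is a minimal Python sketch of a checker for rules 1 to 8. It is an illustration, not the normative LGR: the category sets are assumptions drawn from the Unicode Bengali block rather than the exact repertoire table, the names `tokenize`, `is_valid` and `ALLOWED_BEFORE` are hypothetical, and the S sequences (Table 9 and the æ Ya-phalā) are deliberately left out, so rule 9 and the S exemptions are not modelled.

```python
def _chars(*ranges):
    """Expand inclusive (lo, hi) code point ranges into a set of characters."""
    return {chr(cp) for lo, hi in ranges for cp in range(lo, hi + 1)}


# Assumed category sets (illustrative only, not the normative repertoire).
_SETS = {
    "V": _chars((0x0985, 0x098B), (0x098F, 0x0990), (0x0993, 0x0994)),
    "C": _chars((0x0995, 0x09A8), (0x09AA, 0x09B0), (0x09B2, 0x09B2),
                (0x09B6, 0x09B9), (0x09F0, 0x09F1)),
    "M": _chars((0x09BE, 0x09C4), (0x09C7, 0x09C8), (0x09CB, 0x09CC)),
    "D": {"\u0981"}, "B": {"\u0982"}, "X": {"\u0983"},
    "H": {"\u09CD"}, "Z": {"\u09CE"},
}
CATEGORY = {ch: cat for cat, chars in _SETS.items() for ch in chars}

NUKTA = "\u09BC"
NUKTA_BASES = {"\u09A1", "\u09A2", "\u09AF"}  # bases of ড়, ঢ়, য় (rule 1, CN)
RA = {"\u09B0", "\u09F0"}                     # র and ৰ, used for P = RA + H

# Rules 2-7: categories allowed immediately before each symbol.
ALLOWED_BEFORE = {
    "H": {"C"},
    "M": {"C"},
    "D": {"V", "C", "M"},
    "X": {"V", "C", "M", "D"},
    "B": {"V", "C", "M", "D"},
    "Z": {"V", "C", "M", "D", "B", "X"},
}


def tokenize(label):
    """Split a label into (category, text) tokens; base + nukta is one C."""
    tokens, i = [], 0
    while i < len(label):
        ch = label[i]
        if ch in NUKTA_BASES and label[i + 1:i + 2] == NUKTA:
            tokens.append(("C", ch + NUKTA))
            i += 2
        else:
            tokens.append((CATEGORY.get(ch, "?"), ch))
            i += 1
    return tokens


def is_valid(label):
    """Check rules 1-8 on a label (S sequences are not modelled)."""
    tokens = tokenize(label)
    for idx, (cat, _text) in enumerate(tokens):
        if cat == "?":
            return False
        prev_cat = tokens[idx - 1][0] if idx > 0 else None
        if cat in ALLOWED_BEFORE:
            ok = prev_cat in ALLOWED_BEFORE[cat]
            # Rule 7: Z may also follow P, i.e. RA + Hasanta.
            if cat == "Z" and prev_cat == "H" and idx >= 2 \
                    and tokens[idx - 2][1] in RA:
                ok = True
            if not ok:
                return False
        elif cat == "V" and prev_cat == "H":  # rule 8
            return False
    return bool(tokens)


print(is_valid("\u0995\u09BE"))        # কা: M after C
print(is_valid("\u0995\u09CD\u0987"))  # V after H
```

A table of allowed predecessors keeps each rule visible as one dictionary entry, which is close to how the rules are stated in the list; a production implementation would instead rely on the context rules of the XML LGR.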
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in admitting such ligatures, because any user can easily create them even though they may not be attested combinations.

Case of V Preceded by H

There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া bæŋk ʌv ɪndiə (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE) (meaning Bank of India). This is the case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.

This is a unique situation necessitated by the lack of the hyphen, the space or the Zero Width Non-Joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic communities; hence this is explicitly prohibited by the NBGP. In future, if required and depending on the prevailing requirements from the community, a future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF

Below are a few examples which help one understand some of the rules the ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.

1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Example
ক क
াক ाक
ংক क
ক क
ঃক ःक

As can be seen, such combinations automatically result in a "golu" or dotted circle, marking them as invalid formations. This is an intrinsic property of the Indian-language syllable and is applied quasi-automatically wherever supported by the OS.

2. H is not permitted after V, B, D, X, M, S.
Example
অ अ
অং क ক क কঃ कः ি )क
3. The number of B, D or X permitted after a consonant, a vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalid.
Example
কংং क
ক क
কঃঃ कःः
কা का
কীঃঃ कःः
অংং अ
অ अ
অঃঃ अःः
4. The number of M permitted after a consonant is restricted to one.
Example
কীী की
5. M is not permitted after V.
Example
ইাঈৗ ईाईौ
6. The combinations Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Example
কংঃ कः
কঃং कः
8 Contributors

8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members

8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals' Systems, Dhaka
9 References

[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata; New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje & Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. 'Bangla Harapher Panch Parba'. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the M. S. University of Baroda, Social Science Number 29.2, 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25.11.2017.
[124] Ethnologue, Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25.11.2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue, Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20.10.2019.
[126] Wikipedia, Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25.11.2017.
[129] Omniglot, Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10.5.2018.
[130] Wikipedia, Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25.11.2017.
[131] The EMILLE/CIIL Corpus. httpmetashareeldaorgrepositorybrowsethe-emilleciil-corpusabdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20, accessed on 10.5.2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10.5.2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10.5.2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1.1, 23-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1.2, 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). 'Srīhatta-Nāgarī Lipir Utpatti o Bikāś'. Bangla Academy Patrika (Dhaka), Vol 12 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia, Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19.5.2018.
[139] Furui, Ryosuke (2015). 'Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century', in Bhairabi Prasad Sahu & Hermann Kulke, eds. Interrogating Political Systems: Integrative Processes and States in Pre-Modern India, Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). 'Phones of Bengali'. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646, UNICODE. https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia, Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. httpwwwbangladesh2000combdbangla_scripthtml, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. 'Bangla Spelling Reform: the Long and Short of It'. Bangla Journal 19, 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. "What has happened So Far In terms of Script Reforms". Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka on 10.07.2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 - Core Specification, Chapter 12, p. 473.
10 Appendix-I

10.1 Augmented Backus-Naur Formalism (ABNF)

The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language or script, certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla13 in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold 'varga' (as defined under Table 5). Moreover, Bangla does not normally allow ya (য়) at the beginning of a word either, but we can cite a couple of native examples, for example the word য়্যাব্বড়ো (yæbbɔɽo) from the poem 'Lichuchor' written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).

2. C H can come with Khanda Ta only in the case where C is ra (র) (09B0):
   র্ৎ as in ভর্ৎসনা

3. Only the following combinations with V H C M will be allowed:
   → অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
   → এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
10.2 'Sylheti Nagarī lipi' or 'Siloti'

This version of the Bangla script resembles the 'Kaithī' script (ISO 12954) used by the accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, widely in use during the 1880s. There were several other names of Sylheti Nagarī or Siloti (129), such as 'Jalalabada Nagarī', 'Fula (flower) Nagarī', 'Muslim Nagarī' or 'Muhammad Nagarī'. It is said that Shah Jalal had brought the script with him in the 13th-14th century to Sylhet (138), although some have suggested that it was an invention of the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here (138).

Table 17 - The Script Table of Sylheti Nagarī or Siloti

13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
10.3 Confusable code points

The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.

10.3.1 Bangla and Nāgarī or Devanāgarī

Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable

Table 18: Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhi

Bangla | Gurmukhi | NBGP Decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable

Table 19: Bangla and Gurmukhi confusable code points

Bangla | Gurmukhi | NBGP Decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable

Table 20 - Bangla and Gurmukhī distinguishable code points
10.3.3 Bangla and Oriya (Odia)

Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable

Table 21 - Bangla and Oriya confusable code points

Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable

Table 22 - Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs
Consonants PhoneticValue Allographs
Clusters TransparentForm(BanglaAkademifont)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ Thisletterdoesnothaveanyparticularphoneticvaluebutmostlypronouncedasn
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒThesecondarysymbol(allograph)jɔ-phalahastwophoneticvaluesWhenaddedtotheinitialconsonantinaworditisavowelaelig(asinশ8ামলর 8াপারetc)Butafteranon-initialconsonantitjustdoublesitinpronunciation(asinকায6ধায6etc)The+যcombinationhastwophysicalmanifestationsmdashর 8andয6
ক8(p+য)স8(+য)র 8(+য)[Justর 8isneverusedinBanglaorthographyর 8াisbutthenitslasttwosymbolsYa-phalaa-karaconstituteavowelsignrepresentingthevowelঅ8া]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ hwordfinallywordmediallyitdoublesthepronunciationofthefollowingconsonant
অঃকঃ
অব
both the cases the slot for ra could be Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed/preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā in both the cases remain the same. The LGR makes a note of this point of concern with respect to the two RAs in disguise, as it would be completely impossible to distinguish between them with the naked eye in a label so generated, which may consequently lead to concerns related to spoofing and other kinds of cyber irregularities. The motive for classing these two CPs as (blocking) variants is that fully rendered labels may mask the distinction between Bangla ra র (U+09B0) and Assamese ra ৰ (U+09F0). That provides the justification for Variant Set 4, though only in the context of a following Hasanta. The difference between the RAs is only distinguishable if one looks at their Unicode values. Therefore labels such as অর্ক arka, শীর্ষ śīrṣa 'top, apex', অভ্র abhra 'cloud, the sky' and শ্রম śrama 'physical labour' could be extremely dangerous, as the web user may never verify the digital content (the labels) against its Unicode code points. This point is made explicitly with reference to Table 9 (of sequences, p. 36) and Table 16 (of WLE symbols, p. 47) that are to follow. Moreover, it is noteworthy that the REPHA can also occur with KHANDA TA. The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
4 Overall Development Process and Methodology

The Neo-Brāhmī Generation Panel (NBGP) has been formed by members having experience in Linguistics (especially in NLP, Computational Linguistics), Literature, Language History and Epigraphy. Under the Neo-Brāhmī Generation Panel, Bangla and eight other scripts belonging to separate Unicode blocks are being taken up to assign a separate LGR for each. However, an attempt is made to ensure that the fundamental philosophy behind building those LGRs is consistent across all Brāhmī-derived scripts. The present LGR will cater to multiple languages belonging to EGIDS scale 1 to 4 (see Table 4) that use the Bangla script. The following guiding principles are used in making decisions about Bangla LGR code points.

4.1 Guiding Principles

The NBGP adopts the following broad principles for selection of code points in the code point repertoire across the board for all the Neo-Brāhmī scripts within its ambit.
4.1.1 Inclusion Principles

4.1.1.1 Modern Usage
Every character proposed should be in the everyday usage of a particular linguistic community. Characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code point repertoire.

4.1.1.2 Unambiguous Use
Every character proposed should have an unambiguous understanding among linguists about its usage in the language.
4.2 Exclusion Principles

The main exclusion principle is that of External Limits on Scope. These consist of protocols or standards which are prerequisites to the Label Generation Rulesets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.

4.2.1 External Limits of Scope

The code point repertoire for the root zone being a very special case at the top of the protocol hierarchies, the canvas of available characters for selection as part of the Root Zone code point repertoire is already constrained by the various protocol layers beneath it. The following three main protocols/standards act as successive filters:

i. The Unicode Chart
Out of all the characters that are needed by the script in question, if a particular character is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, and especially so in the Bangla-Asamiyā-Manipuri writing system, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.

ii. IDNA Protocol
Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded as far as possible all the characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters of the Unicode repertoire from being part of domain names.

iii. Maximal Starting Repertoire (MSR)
The Root Zone LGR being the repertoire of characters which are going to be used for creation of the Root Zone TLDs, which in turn constitute an even more specialized case of domain names, the Root LGR procedure introduces additional exclusions on the set of characters allowed by IDNA.

Example: Bangla Sign Avagraha ঽ (U+09BD), even if allowed by the IDNA protocol, is not permitted in the Root Zone repertoire as per the MSR.

To sum up, the restrictions start off by admitting only such characters as are part of the code block of the given script/language. The IDNA protocol further narrows this down, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.

4.2.1.1 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brāhmī-based scripts will not be included.

4.2.1.2 No Symbols and Abbreviations
Abbreviations, weights and measures, and other such iconic characters like BANGLA ISSHAR ৺ (U+09FA), BANGLA CURRENCY DENOMINATOR SIXTEEN ৹ (U+09F9), etc. will also not be included.

4.2.1.3 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, such as the Sanskritic VOCALIC RR ৠ (U+09E0) and VOCALIC L ঌ (U+098C), as well as VOCALIC LL ৡ (U+09E1) and the allographic kāra forms of the latter two symbols, VOWEL SIGN VOCALIC L ৢ (U+09E2) and VOWEL SIGN VOCALIC LL ৣ (U+09E3). All such characters are excluded, which complies with the Conservatism principle as laid down in the Root Zone LGR procedure. However, in Bangla the kāra corresponding to VOCALIC RR ৠ (U+09E0), which is VOWEL SIGN VOCALIC RR ৄ (U+09C4), is still in active use in certain limited borrowed or Sanskritic words and is therefore retained.

4.2.1.4 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit will not be included. This is also in consonance with the Letter principle as laid down in the Root Zone LGR procedure.

4.2.1.5 ABNF
The Augmented Backus-Naur Formalism (ABNF) is described in Section 5.4.1 and the Appendix (Section 10.1).
5 Repertoire

The Bangla writing system is represented in Unicode using the Bengali (Bangla) script name as enumerated in ISO 15924, corresponding to languages such as Asamiyā (Assamese), Bangla (Bengali) and Manipuri. The BENGALI block used for Bangla-Asamiyā-Manipuri in Unicode has 93 entries. This section details the code point repertoire that the Neo-Brāhmī Generation Panel [NBGP] proposes to be included in the Bangla LGR.

It may be mentioned here that the Government of Assam submitted a proposal to the Bureau of Indian Standards (BIS) on 26th February 2016 for dis-unification of the Bangla and Asamiyā scripts. The BIS, in the 8th Meeting of the Indian Language Technologies and Products Sectional Committee, LITD 20, held on 23rd Aug 2017, decided to refer the proposal for recognition of the Assamese script in ISO/IEC 10646 to ISO. Until the Unicode Consortium takes any further action, it will be assumed that the Code Point Repertoire under Table 11 will be valid for all the three languages as above.

For each of the code points, language references have been given in the last column, titled "Reference", of Table 8, titled "Code Point Repertoire". For entire coverage of Bangla code points, references for Bangla, Asamiyā (Assamese), Manipuri (Meitei) and Bishnupriya are given. Kokborok written in the Bangla script is not known to have introduced many new complications, except for one particular character. Though only a few representative languages under EGIDS scale 1-4 have been chosen for referencing, together they cover all the code points required for all the languages that the NBGP has considered, as given under the Bangla Unicode points (as given in Unicode 6.3).

However, before the details are presented, it is useful to look at the Bangla code point chart from the Maximal Starting Repertoire [MSR] Version 3. It may be noted that the shapes of the reference glyphs given below in the code charts are based on one of the many fonts designed and are not prescriptive, because there could be some variation in actual fonts, both Unicode-compatible and TrueType ones. Consider the following code point table.
Figure 1: Bangla Code Page from [MSR] for Bangla-Asamiyā-Manipuri

Colour convention:
All characters that are included in the [MSR] - Yellow background
PVALID in IDNA 2008 but excluded from the [MSR] - Pinkish background
Not PVALID in IDNA 2008, or ineligible for the root zone (digits, hyphen) - White background

Given the Bangla Unicode block as in Figure 1, among the code points that are included in the MSR, the following symbols will need separate treatment:
ৎ U+09CE Bangla Letter Khanda-Ta
ৰ U+09F0 Asamiyā-Bangla Letter Ra With Middle Diagonal
ৱ U+09F1 Asamiyā-Bangla Letter Ra With Lower Diagonal
5.1 Code Point Repertoire Inclusion
No | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
1 | U+0981 | ঁ | BENGALI SIGN CANDRABINDU | Candra-bindu | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
2 | U+0982 | ং | BENGALI SIGN ANUSVARA | Onushshar (Anusvara) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
3 | U+0983 | ঃ | BENGALI SIGN VISARGA | Biśarga (Visarga) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
4 | U+0985 | অ | BENGALI LETTER A | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
5 | U+0986 | আ | BENGALI LETTER AA | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
6 | U+0987 | ই | BENGALI LETTER I | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
7 | U+0988 | ঈ | BENGALI LETTER II | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
8 | U+0989 | উ | BENGALI LETTER U | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
9 | U+098A | ঊ | BENGALI LETTER UU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
10 | U+098B | ঋ | BENGALI LETTER VOCALIC R | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
11 | U+098F | এ | BENGALI LETTER E | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
12 | U+0990 | ঐ | BENGALI LETTER AI | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
13 | U+0993 | ও | BENGALI LETTER O | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
14 | U+0994 | ঔ | BENGALI LETTER AU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
15 | U+0995 | ক | BENGALI LETTER KA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
16 | U+0996 | খ | BENGALI LETTER KHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
17 | U+0997 | গ | BENGALI LETTER GA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
18 | U+0998 | ঘ | BENGALI LETTER GHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
19 | U+0999 | ঙ | BENGALI LETTER NGA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
20 | U+099A | চ | BENGALI LETTER CA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
21 | U+099B | ছ | BENGALI LETTER CHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
22 | U+099C | জ | BENGALI LETTER JA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
23 | U+099D | ঝ | BENGALI LETTER JHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
24 | U+099E | ঞ | BENGALI LETTER NYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
25 | U+099F | ট | BENGALI LETTER TTA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
26 | U+09A0 | ঠ | BENGALI LETTER TTHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
27 | U+09A1 | ড | BENGALI LETTER DDA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
28 | 09A1 09BC (U+09DC) | ড় | Normalized form of BENGALI LETTER RRA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. 09DC is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development.
29 | U+09A2 | ঢ | BENGALI LETTER DDHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
30 | 09A2 09BC (U+09DD) | ঢ় | Normalized form of BENGALI LETTER RHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. 09DD is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development.
31 | U+09A3 | ণ | BENGALI LETTER NNA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
32 | U+09A4 | ত | BENGALI LETTER TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
33 | U+09A5 | থ | BENGALI LETTER THA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
34 | U+09A6 | দ | BENGALI LETTER DA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
35 | U+09A7 | ধ | BENGALI LETTER DHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
36 | U+09A8 | ন | BENGALI LETTER NA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
37 | U+09AA | প | BENGALI LETTER PA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
38 | U+09AB | ফ | BENGALI LETTER PHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
39 | U+09AC | ব | BENGALI LETTER BA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
43 | 09AF 09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. 09DF is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development.
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112] [125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
50 | U+09BE | া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
51 | U+09BF | ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
52 | U+09C0 | ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
53 | U+09C1 | ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112] [122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (= Halant), Virama (= Da ri) | 1 Bangla, 2 Assamese, 2 Manipuri | [112] [122] [125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122] [125]
Table 8: Bangla Code-Point Repertoire
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes two specific sequences. They enable the conditional inclusion of BENGALI LETTER A or BENGALI LETTER E followed by BENGALI SIGN VIRAMA, BENGALI LETTER YA and BENGALI VOWEL SIGN AA in the repertoire, so that the æ sound, as in English 'bat', 'cat' etc., can be written.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
S1 | 0985 09CD 09AF 09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112] [122]
S2 | 098F 09CD 09AF 09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9: Sequences
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find a place in Unicode but have not been included in the repertoire of this LGR proposal. ঌ (U+098C) and ৗ (U+09D7) are excluded because they are rare and obsolete characters.
Sr. No. | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10: Excluded Code Points
5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire as a standalone code point, since it will never be used alone. It is used only as part of a sequence, in the normalized forms of the three special characters ড়, ঢ় and য়.
Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড়, ঢ় and য় respectively.
Table 10b: Excluded Code Points
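The relationship between the precomposed letters U+09DC, U+09DD and U+09DF and the base-plus-nukta sequences listed in Table 8 can be checked with a short sketch. It assumes only Python's standard `unicodedata` module; the helper name `to_lgr_form` and the `PRECOMPOSED` table are hypothetical and are not part of the proposal or its XML file. Because these three letters are composition exclusions in Unicode, NFC normalization yields the two-code-point sequence, which is the form the repertoire lists.

```python
import unicodedata

# Precomposed letter -> sequence listed in Table 8 (rows 28, 30 and 43).
PRECOMPOSED = {
    "\u09DC": "\u09A1\u09BC",  # RRA  -> DDA  + NUKTA
    "\u09DD": "\u09A2\u09BC",  # RHA  -> DDHA + NUKTA
    "\u09DF": "\u09AF\u09BC",  # YYA  -> YA   + NUKTA
}

def to_lgr_form(label: str) -> str:
    """Return the NFC form of a label (hypothetical helper name).

    NFC never recomposes U+09DC, U+09DD or U+09DF, so the nukta
    sequences come out as two code points each.
    """
    return unicodedata.normalize("NFC", label)

for precomposed, sequence in PRECOMPOSED.items():
    # Both spellings normalize to the same base + nukta sequence.
    print(f"U+{ord(precomposed):04X} ->",
          " ".join(f"U+{ord(ch):04X}" for ch in to_lgr_form(precomposed)))
```

Normalizing a label before it is checked against the repertoire means a user may type either spelling, while only the sequence form is ever evaluated.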
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for .भारत or .ভারত, drafted between 20.11.2009 and 18.07.2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) begins with the following variables:
C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta

The ABNF uses the following operators:
Sr. Number   Operator   Function
1            "|"        Alternative
2            "[ ]"      Optional
3            "*"        Variable Repetition
4            "( )"      Sequence Group
Table 11: The ABNF Formalism

In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given to facilitate understanding.
5.4.2 The Vowel Sequence
To facilitate understanding by users of other Brāhmī-derived scripts, equivalents in Devanāgarī are provided wherever necessary.

A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra/onuʃʃār (B), a Candrabindu (D) or a Visarga/biʃɔrgo (X). The number of D, B or X which can follow a V in Bangla may not be restricted to one. Going by the rules illustrated in this document, formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have sequences such as a candrabindu followed by an anusvāra on the one hand, and a candrabindu followed by a visarga on the other, as in হ্যাঁংচা 'hænchā' and হ্যাঁঃ 'hæn' respectively. The possibility of a Visarga or an Anusvāra (onuʃʃār) following a Candrabindu thus exists in Bangla.

A vowel can optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phala. Ya-phala is a presentation form of U+09AF, the Bangla letter য or 'ya', represented by the sequence <U+09CD BENGALI SIGN VIRAMA (Hasanta), U+09AF য BENGALI LETTER YA>. Ya-phala has the special form ্য. When combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. 'aa' (ā)), it is used for transcribing [æ], as in the "a" of the English word "bat", written in Bangla as ব্যাট. A vowel sequence admits the following combinations.
5.4.2.1 A Single Vowel
Example:
V অ अ

5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], Candrabindu [D], Visarga [X], Candrabindu + Anusvāra [DB], Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples:
VB অং अं
VD অঁ अँ
VX অঃ अः
VDB অঁং अँं
VDX অঁঃ अँः
VHCM অ্যা এ্যা

5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], Candrabindu [D], Visarga [X], Candrabindu + Anusvāra [DB] or Candrabindu + Visarga [DX].
Examples:
VHCMB অ্যাং এ্যাং
VHCMD অ্যাঁ এ্যাঁ
VHCMX অ্যাঃ এ্যাঃ
VHCMDB অ্যাঁং এ্যাঁং
VHCMDX অ্যাঁঃ এ্যাঁঃ
5.4.3 The Consonant Sequence

5.4.3.1 A Single Consonant (C)
Example:
C ক क

5.4.3.2 A Consonant with Conditions
A Consonant can optionally be followed by a dependent vowel sign, kāra (Mātrā) [M], or by Anusvāra [B], Candrabindu [D], Visarga [X], Hasanta (also known as Virama) [H], Candrabindu + Anusvāra [DB] or Candrabindu + Visarga [DX].
Examples:
CM কি कि
CB কং कं
CD কঁ कँ
CX কঃ कः
CH ক্ क् (Pure consonant)
CDB কঁং कँं
CDX কঁঃ कँः

5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB or DX.
Examples:
CMB কীং कीं
CMD কাঁ काँ
CMX বীঃ वीः
CMDB কাঁং काँं
CMDX কাঁঃ काँः

5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama): *3(CH)C
Examples:
CHC ন্ত → ন + ্ + ত   न + ् + त
CHCHC ন্ত্র → ন + ্ + ত + ্ + র   न + ् + त + ् + र
CHCHCHC ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য   न + ् + त + ् + र + ् + य

5.4.3.5 Subsets
While considering its subsets, we take the combination CHC as a representative example; however, the same is equally applicable to CHCHC and CHCHCHC.
[A] The combination may be followed by M, B, D, X, DB or DX.
Examples:
CHCM ক্কী → ক ্ ক ী   क्की → क ् क ी
CHCB ক্কং → ক ্ ক ং   क्कं → क ् क ं
CHCD ক্কঁ → ক ্ ক ঁ   क्कँ → क ् क ँ
CHCX ক্কঃ → ক ্ ক ঃ   क्कः → क ् क ः
CHCDB ক্কঁং → ক ্ ক ঁ ং   क्कँं → क ् क ँ ं
CHCDX ক্কঁঃ → ক ্ ক ঁ ঃ   क्कँः → क ् क ँ ः
[B] *3(CH)CM may further be followed by B, D, X, DB or DX.
Examples:
CHCMB ক্কীং → ক ্ ক ী ং   क्कीं → क ् क ी ं
CHCMD ক্কাঁ → ক ্ ক া ঁ   क्काँ → क ् क ा ँ
CHCMX ক্কীঃ → ক ্ ক ী ঃ   क्कीः → क ् क ी ः
CHCMDB ক্কাঁং → ক ্ ক া ঁ ং   क्काँं → क ् क ा ँ ं
CHCMDX ক্কাঁঃ → ক ্ ক া ঁ ঃ   क्काँः → क ् क ा ँ ः
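The vowel and consonant sequences of 5.4.2 and 5.4.3 can be written compactly as patterns over the category letters C, V, M, B, D, X and H. The following is a minimal sketch, assuming the input is already a string of category letters (for example "CHCM") rather than Bangla text; the names `VOWEL_SEQ`, `CONSONANT_SEQ`, `is_vowel_sequence` and `is_consonant_sequence` are illustrative and do not come from the proposal. It covers only these two sequence types and leaves out the Khanda-Ta sequence and the whole-label rules.

```python
import re

# V, optionally extended by HCM (the ya-phala + aa-kara case), then an
# optional Candrabindu and an optional Anusvara or Visarga.
VOWEL_SEQ = re.compile(r"V(?:HCM)?D?[BX]?")

# A single consonant may take H, or an optional M followed by D?[BX]?.
# A cluster of up to four consonants joined by H ("*3(CH)C") may take
# an optional M followed by D?[BX]?.
CONSONANT_SEQ = re.compile(r"C(?:H|M?D?[BX]?)|(?:CH){1,3}CM?D?[BX]?")

def is_vowel_sequence(categories: str) -> bool:
    """True if the category string is a vowel sequence as in 5.4.2."""
    return VOWEL_SEQ.fullmatch(categories) is not None

def is_consonant_sequence(categories: str) -> bool:
    """True if the category string is a consonant sequence as in 5.4.3."""
    return CONSONANT_SEQ.fullmatch(categories) is not None

if __name__ == "__main__":
    for cats in ["V", "VDB", "VHCMDX", "CH", "CMDB", "CHCHCHC", "CBD", "CHCHCHCHC"]:
        print(cats, is_vowel_sequence(cats) or is_consonant_sequence(cats))
```

Working on category letters keeps the patterns readable next to the ABNF notation; mapping actual code points to categories is a separate step.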
5.4.4 The Khanda-Ta sequence

5.4.4.1 A single 'Khanda'-Ta (Z)
Example:
Z ৎ

5.4.4.2 A Khanda Ta Combination (see footnote 10)
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama): [CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) 'scolding'.
Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) can be described briefly here. Let us take up S first. There are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ BANGLA LETTER A and U+098F এ BANGLA LETTER E). For rendering Ya-phalā with অ and এ, it is necessary to type U+09CD plus U+09AF (ya) preceded by the said vowel. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English 'bat', 'fat' etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But, as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).

The other case, referred to as P in the table mentioned above, concerns the formation of repha and ra-phalā in the said script. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALĀ). For instance:
repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun")
ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra "cycle")
The point is that in both cases the slot for ra can be filled by the Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.

10 Refer to Rule P in Section 7, Table 16.
6 Variants
This section discusses the variants in the Bangla script. The NBGP categorizes these confusing variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community

For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These are not cases of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formation. They can cause confusion even to a careful observer and are hence being proposed as variants.

Variants arise in a script when the same form can be produced from different stored code points. In Bangla, the e-kāra, the ā-kāra and the o-kāra have different code points. One can type the o-kāra with a consonant in one go, or type e-kāra and ā-kāra as two separate keys, and get the same result. A reader cannot differentiate between the two forms of ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered a case of variants, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants
We propose two cases of true in-script variants in the Bangla script.

CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases in which থ (tha, U+09A5), joined by Hasanta, appears as a conjunct with স (sa, U+09B8) and ন (na, U+09A8):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, can be a little confusing, because the থ (tha) in the conjunct appears like a হ (ha). The conjunct can occur in initial, medial or final position (as shown in example 1 below). It could also be typed wrongly by someone who takes it for a হ (ha) U+09B9, which increases the risk in label writing and identification.
Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)
Fonts which represent the traditional Bangla writing system tend to create this problem. Therefore these may be taken as cases of variants in Bangla.

CASE II
Another interesting example of variants is encountered in ra + Hasanta and Hasanta + ra combinations when writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra). 'Repha' can be formed by two sequences (mainly because both Assamese and Bangla are encoded in the same Unicode block, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here the final ligatures look the same and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C
where
B_RA = U+09B0 BENGALI LETTER RA (র)
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)

Example:
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12: Example of Repha

Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. Addition of Repha does not make any difference.

'Ra-phala' can be formed by two sequences on similar grounds, and the final ligatures would look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
where C1 = any consonant except Khanda-ta

Example:
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, they could cause confusability for end users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP concluded that র and ৰ should be defined as variant code points, so that a single variant set between র and ৰ covers all cases. This will, however, create blocked variant labels: e.g. if someone registers "ররর", the variant label "ৰৰৰ" will be generated as a variant and will be blocked, and vice versa. However, it is only blocked at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, it is still possible.
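The label-level effect of these variants can be illustrated with a small sketch. It assumes a simple "index label" approach in which every label is mapped to one canonical spelling and two labels are treated as variants of each other when their canonical spellings coincide; the function names `variant_key` and `are_variants`, and the choice of র and থ as the canonical members, are assumptions made for illustration and are not taken from the proposal's XML.

```python
BANGLA_RA = "\u09B0"      # র
ASSAMESE_RA = "\u09F0"    # ৰ
HASANTA = "\u09CD"

# CASE I: sa/na + Hasanta + ha is folded onto sa/na + Hasanta + tha.
CASE_I = {
    "\u09B8" + HASANTA + "\u09B9": "\u09B8" + HASANTA + "\u09A5",
    "\u09A8" + HASANTA + "\u09B9": "\u09A8" + HASANTA + "\u09A5",
}

def variant_key(label: str) -> str:
    """Map a label to a canonical spelling (illustrative choice of canon)."""
    key = label.replace(ASSAMESE_RA, BANGLA_RA)   # CASE II: one ra variant set
    for confusable, canonical in CASE_I.items():
        key = key.replace(confusable, canonical)
    return key

def are_variants(a: str, b: str) -> bool:
    """True if two distinct labels collapse to the same canonical spelling."""
    return a != b and variant_key(a) == variant_key(b)

if __name__ == "__main__":
    print(are_variants(BANGLA_RA * 3, ASSAMESE_RA * 3))  # same length, ra swapped
    print(are_variants(BANGLA_RA * 3, ASSAMESE_RA * 2))  # different length
```

In this sketch a label with three র and one with three ৰ share a key, so registering one would block the other, while labels of other lengths have different keys and remain available.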
6.2 Cross-Script Variants
A brief cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia (see footnote 11; formerly Oriya), keeping in mind the visual and technical confusion they may cause as labels in the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. There are certain other characters which are somewhat similar, but they have not been included here; they are provided in the Appendix (10.2) for reference.

11 Unicode uses Oriya for the script, although Odia is now the official term used.

1. Bangla and Nāgarī/Devanāgarī Script
Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14: Bangla and Devanāgarī cross-script variant code points

2. Bangla and Gurmukhī Script
Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15: Bangla and Gurmukhī cross-script variant code points
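The two cross-script tables can be read as simple code point mappings. The sketch below assumes that a Bangla label has a whole-label counterpart in another script only when every one of its code points has an entry in that script's table; the names `DEVANAGARI`, `GURMUKHI` and `cross_script_variant` are illustrative and not part of the proposal.

```python
from typing import Optional

# Bangla code point -> counterpart, taken from Tables 14 and 15.
DEVANAGARI = {"\u09AE": "\u092E", "\u09BF": "\u093F"}
GURMUKHI = {"\u09AE": "\u0A38", "\u09BF": "\u0A3F"}

def cross_script_variant(label: str, table: dict) -> Optional[str]:
    """Return the counterpart label in the other script, or None.

    A counterpart exists only if the label is non-empty and every code
    point in it is listed in the given table.
    """
    if label and all(ch in table for ch in label):
        return "".join(table[ch] for ch in label)
    return None

if __name__ == "__main__":
    label = "\u09AE\u09BF"  # MA + VOWEL SIGN I
    for name, table in (("Devanagari", DEVANAGARI), ("Gurmukhi", GURMUKHI)):
        variant = cross_script_variant(label, table)
        print(name, [f"U+{ord(ch):04X}" for ch in variant])
```

Because only two code points per script are listed, very few Bangla labels have a complete counterpart, which keeps the set of cross-script blocked labels small.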
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in section 3.2 when written in Bangla (see footnote 12) script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications. Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories, as mentioned in the table provided in the code point repertoire (Section 5.1).

C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), or (a/e) Ya-phalā (V1 H C1 M1), where V1 is either 0985 (অ BENGALI LETTER A) or 098F (এ BENGALI LETTER E); H is 09CD (্ BENGALI SIGN VIRAMA); C1 is 09AF (য BENGALI LETTER YA); M1 is 09BE (া BENGALI VOWEL SIGN AA). S1 and S2 are valid even if they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either 09B0 (র BENGALI LETTER RA) or 09F0 (ৰ ASSAMESE LETTER RA; Unicode name BENGALI LETTER RA WITH MIDDLE DIAGONAL); H is 09CD (্ BENGALI SIGN VIRAMA)
Table 16: Symbols used in WLE rules

12 As used by Unicode, denoting and including both Assamese and Maṇipuri.
It is also worth mentioning here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters" that can theoretically conjoin two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, p. 4]. More details of such combinations can be seen in Pabitra Sarkar (1993) [135].

7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented.
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ় and য়.
2. H must be preceded by C. Example: ক্
3. M must be preceded by C. Example: কা
4. D must be preceded by either of V, C or M. Examples: আঁ, খঁ, খাঁ, হাঁ
5. X must be preceded by either of V, C, M or D. Examples: উঃ, খঃ, বাঃ, দঁঃ
6. B must be preceded by either of V, C, M or D. Examples: আং, ইং, কং
7. Z must be preceded by V, C, M, D, B, X or P. Examples: ইৎ, কৎ, পর্ৎ (S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V preceded by H.
9. S CANNOT be preceded by H.
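The context rules 2 to 9 above can be sketched as a small label checker. This is an illustration under stated assumptions, not the normative machine-readable ruleset: the function names `tokenize` and `is_valid_label` are hypothetical, the code point sets are read off Table 8, the S sequences are matched greedily before single code points, and a Hasanta that follows র or ৰ is treated as the end of a P sequence. Anything outside the repertoire makes the label invalid.

```python
VOWELS = set(range(0x0985, 0x098C)) | {0x098F, 0x0990, 0x0993, 0x0994}
CONSONANTS = (set(range(0x0995, 0x09A9)) | set(range(0x09AA, 0x09B1))
              | {0x09B2, 0x09B6, 0x09B7, 0x09B8, 0x09B9, 0x09F0, 0x09F1})
MATRAS = set(range(0x09BE, 0x09C5)) | {0x09C7, 0x09C8, 0x09CB, 0x09CC}
SINGLE = {0x0981: "D", 0x0982: "B", 0x0983: "X", 0x09CD: "H", 0x09CE: "Z"}
NUKTA, NUKTA_BASES = 0x09BC, {0x09A1, 0x09A2, 0x09AF}
S_SEQUENCES = ("\u0985\u09CD\u09AF\u09BE", "\u098F\u09CD\u09AF\u09BE")
RA = {"\u09B0", "\u09F0"}

# Rules 2-7: category -> categories allowed immediately before it.
ALLOWED_BEFORE = {
    "H": {"C"}, "M": {"C"},
    "D": {"V", "C", "M"},
    "X": {"V", "C", "M", "D"},
    "B": {"V", "C", "M", "D"},
    "Z": {"V", "C", "M", "D", "B", "X", "P"},
}

def tokenize(label):
    """Split a label into (category, text) tokens; raise on unknown input."""
    tokens, i = [], 0
    while i < len(label):
        seq = next((s for s in S_SEQUENCES if label.startswith(s, i)), None)
        if seq:                                   # S1 or S2 from Table 9
            tokens.append(("S", seq))
            i += len(seq)
            continue
        cp = ord(label[i])
        if cp in NUKTA_BASES and label[i + 1:i + 2] == chr(NUKTA):
            tokens.append(("C", label[i:i + 2]))  # rule 1: CN counts as C
            i += 2
            continue
        if cp in VOWELS:
            cat = "V"
        elif cp in CONSONANTS:
            cat = "C"
        elif cp in MATRAS:
            cat = "M"
        elif cp in SINGLE:
            cat = SINGLE[cp]
        else:
            raise ValueError(f"U+{cp:04X} is outside the repertoire")
        tokens.append((cat, label[i]))
        i += 1
    return tokens

def is_valid_label(label):
    """Apply rules 2-9 to every token of the label."""
    try:
        tokens = tokenize(label)
    except ValueError:
        return False
    prev_cat, prev_text = None, ""
    for cat, text in tokens:
        if cat in ("V", "S"):
            if prev_cat in ("H", "P"):            # rules 8 and 9
                return False
        elif cat != "C" and prev_cat not in ALLOWED_BEFORE[cat]:
            return False                          # rules 2-7
        if cat == "S":
            prev_cat = "M"                        # S ends with M
        elif cat == "H" and prev_text in RA:
            prev_cat = "P"                        # ra + Hasanta
        else:
            prev_cat = cat
        prev_text = text
    return bool(tokens)

if __name__ == "__main__":
    for label in ["\u09AC\u09BE\u0982\u09B2\u09BE",   # BA AA ANUSVARA LA AA
                  "\u09BE\u0995"]:                    # AA sign + KA
        print([f"{ord(c):04X}" for c in label], is_valid_label(label))
```

Keeping the "allowed before" sets in one table mirrors the way the rules are stated, so a change to a rule would touch a single entry.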
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in permitting such ligatures, because any user can easily create them even though they may not be attested combinations.

7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া /bæŋk ʌv ɪndiə/ (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE), meaning "Bank of India". This is a case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, a space or the Zero Width Non-Joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise V is never required to be allowed to follow an H. Permitting this may create a perceptual similarity between two labels (with and without H) for the majority of the linguistic communities; hence it is explicitly prohibited by the NBGP. In future, depending on the prevailing requirements of the community, a future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules the ABNF puts in place. They are given for reference purposes only and are not meant to be comprehensive.

1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Examples:
্ক ्क
াক ाक
ংক ंक
ঁক ँक
ঃক ःक
As can be seen, such a combination automatically results in a "golu", or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian-language syllable and is quasi-automatically applied wherever supported by the OS.

2. H is not permitted after V, B, D, X, M or S.
Examples:
অ্ अ्
কং্ कं्
কঁ্ कँ्
কঃ্ कः्
কি্ कि्

3. The number of B, D or X permitted after a Consonant, a Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalid.
Examples:
কংং कंं
কঁঁ कँँ
কঃঃ कःः
কাঁঁ काँँ
কীঃঃ कीःः
অংং अंं
অঁঁ अँँ
অঃঃ अःः

4. The number of M permitted after a Consonant is restricted to one.
Example:
কীী कीी

5. M is not permitted after V.
Examples:
ইা ঈৗ ईा ईौ

6. The combinations Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Examples:
কংঃ कंः
কঃং कःं
8 Contributors
81ExpertsfromIndia ProfessorUdayaNarayanaSinghChair-ProfessorofLinguisticsampDeanFacultyofArtsAmityUniversityHaryanaGurgaonPachgaonManesarPIN122431(Haryana)India
ProfessorPabitraSarkarformerlyVice-ChancellorRabindraBharatiUniversityKolkataDrAtiurRahmanKhanPrincipalTechnicalOfficerGISTGroupC-DACPunePIN411008(Maharashtra)IndiaMrRajibChakrabortyLinguistSocietyforNaturalLanguageTechnologyResearch(SNLTR)Module114amp130SDFBuildingSaltLakeSector-VKolkata-700091(WestBengal)IndiaMrAkshatJoshiProjectEngineerGISTGroupC-DACPunePIN411008(Maharashtra)IndiaMsMoumitaChowdhurySeniorTechnicalOfficerGISTGroupC-DACPunePIN411008(Maharashtra)IndiaMrChandrakantaMurasinghAgartalaTripuraSomeotherNBGPmembers
82ContributorsfromBangladeshJanabMustafaJabbarHonorableMinisterMinistryofPostsTelecommunicationsampInformationTechnologyGovtofBangladeshProfShamsuzzamanKhanFormerDirector-GeneralBanglaAcademyDhakaProfRafiqulIslamNationalProfessorofHumanitiesDhakaProfSwarochisSarkarDirectorInstituteofBangladeshStudiesRajshahiUniversityRajshahiBangladeshProfJinnatImtiazAliDirector-GeneralInternationalMotherLanguageInstituteDhakaMrMohammadMamunOrRashidDepartmentofBanglaJahangirnagarUniversityampMemberBangladeshComputerCouncilProfManiruzzamanformerlyProfessorChittagongUniversityChattagramBangladeshMrShyamSunderSikderSecretarySecretaryPostampTelecommunicationsDivisionGovtofBangladesh
MrMdMustafaKamalFormerDirectorGeneralBangladeshTelecommunicationsRegulatoryAuthorityGovernmentofBangladeshDhakaBrigadierGeneralMdMahfuzulKarimMajumderDirector-GeneralEngineeringampOperationsDivisionBangladeshTelecommunicationsRegulatoryAuthorityGovernmentofBangladeshDhakaMdZiarulIslamProgrammerPostsampTelecommunicationsDivisionGovernmentofBangladeshDhakaProfSyedShahriyarRahmanDepartmentofLinguisticsUniversityofDhakaDrMizanurRahmanDirector(In-Charge)TranslationTextBookandInternationalRelationsDivisionBanglaAcademyDhakaDrApareshBandyopadhyayDirectorBanglaAcademyDhakaMrMdMobarakHossainDirectorBanglaAcademyDhakaDrJalalAhmedDirectorBanglaAcademyDhakaMrJahangirHossainInternetSocietyBangladesh(ICANNALS)JanabSarwarMostafaChoudhuryBangladeshComputerCouncilDhakaJanabMdRashidWasifBangladeshComputerCouncilDhakaJanabIstiaqueArifSeniorAssistantDirectorBangladeshTelecommunicationsRegulatoryAuthorityDhakaMsAfifaAbbasInformationSecurityandGovernanceLeadEngineeratBanglalinkandICANNFellowMr Mohammad Abdul Haque Secretary General Bangladesh Internet GovernanceForumMrImranHossenCEOEyeSoftandkeymemberofBangladeshAssociationofSoftwareampInformationServices(BASIS)MsShahidaKhatunDirectorFolkloreMuseumampArchiveDivisionBanglaAcademyDhakaMrSyedAshikRehmanCEOBengalMediaCorporationDhakaMrHaseebRahmanCEOProfessionalsrsquoSystemsDhaka
9 References[100] UnicodeConsortium2017UnicodeStandard100MountainViewCA
[101] BandyopadhyayChittaranjan1981DuiShatakerBanglaMudranoPrakashanKolkataAnandaPublishers
[102] BanerjiRD1919TheOriginoftheBengaliScriptKolkataNewDelhiAsianEducationalServices2003reprint
[103] ChatterjiSK1926TheOriginandDevelopmentoftheBengaliLanguageCalcuttaCalcuttaUniversityPress
[104] -----1939Bhasha-prakashBangalaVyakaran(AGrammaroftheBengaliLanguage)CalcuttaUniversityofCalcutta
[105] HaiMuhammadAbdul1964DhvaniVijnanOBanglaDhvani-tattwa(PhoneticsandBengaliPhonology)DhakaBanglaAcademy
[106]JhaSubhadra1958TheFormationofMaithiliLondonLuzacampCo
[107] KosticDjordjeDasRheaS1972AShortOutlineofBengaliPhoneticsCalcuttaStatisticalPublishingCompany
[108] MajumdarRC1971HistoryofAncientBengalCalcuttaGBhardwaj
[109] MazumdarBijaychandra19202000TheHistoryoftheBengaliLanguage(ReprCalcutta1920ed)NewDelhiAsianEducationalServices
[110] PandeyAnshuman2001ProposaltoEncodetheTirhutaScriptinISOIEC10646
[111] PalPalashBaran2001DhwanimalaBarnamalaKolkataPapyrus
[112] -----2007lsquoBanglaHarapherPanchParbarsquoInSwapanChakrabortyedMudranerSanskritiOBanglaBoiKolkataAbabhas
[113] RossFiona1999ThePrintedBengaliCharacteranditsEvolutionLondonCurzon
[114] ShastriMahamahopadhyayHaraPrasad1916HājārBacharērPurāṇaBāṅgālāBhāṣāyBauddhaGānōDōhāCalcuttaBangiyaSahityaParishat
[115] SinghUdayaNarayana(JointlyManiruzzaman)1983DiglossiainBangladeshandlanguageplanningCalcuttaGyanBharati214pp
[116] -----1987ABibliographyofBengaliLinguisticsMysoreCIILxii+316pp
[117] -----2017(withRajibChakrabortyBidishaBhattacharjeeampArimardanKumarTripathy)LanguagesandCulturesontheMarginGuidelinesforFieldworkonEndangeredLanguagesMimeoCentreforEndangeredLanguagesVisva-Bharati
[118] -----1980ScriptalchoiceandspellingreformAnessayinlanguageandplanningJournaloftheMSUniversityofBarodaSocialScienceNumber292173-186AmodifiedversionreprintedEAnnamalaiBjornJernuddandJoanRubinedsLanguagePlanningProceedingsofanInstituteMysoreCIIL405-417
[119] Sripantha1996JakhanChapakhanaEloKolkataPaschim-BangaBanglaAcademy
[120] SurAtul1986BanglaMudranerDushoBacharKolkataJijnasa
[121] ScriptBehaviourforBengaliVersion11TDILandC-DACPune
[122] BoraMahendra1981TheEvolutionofAssameseScriptJorhatAssamSahityaSabha
[123] ProposaltoEncodetheTirhutaScriptinISOIEC10646httpwwwunicodeorgL2L201111175r-tirhutapdfaccessedon25112017
[124] EthnologueAssameseintheLanguageCloudhttpswwwethnologuecomcloudasmaccessedon25112017
[125] BengalialphabetforManipurifoundinEthnologueManipuri(MeeteilonMeithei)httpswwwomniglotcomwritingmanipurihtmaccessedon20102019
[126] WikipediaBengalialphabethttpsenwikipediaorgwikiBengali_alphabetaccessedon25112017
[129]OmniglotSlyhetihttpwwwomniglotcomwritingsylotihtmaccessedon1052018
[130]WikipediaBishnupriyaManipurilanguagehttpsenwikipediaorgwikiBishnupriya_Manipuri_languageaccessedon25112017
[131] TheEMILLECIILCorpushttpmetashareeldaorgrepositorybrowsethe-emilleciil-corpusabdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20accessedon1052018
[132]TheEMILLECIILCorpushttpcatalogelrainfoproduct_infophpproducts_id=696accessedon1052018
[133] BanglaLanguageampScript httpswwwisicalacin~rc_banglabanglahtmlaccessedon1052018
[134] SarkarPabitra1992BanglaBananSanskarSamasyaoSambhabanaKolkataChirayataPrakashan
[135] SarkarPabitra1993BanglaBhasharYuktabyanjanBhasha1123-45
[136] DashNiladriShekharandBBChaudhuri1998BanglaScriptAStructuralStudyLinguisticsToday121-28Alsoavailableathttpswwwacademiaedu9967428Bangla_Script_A_Structural_Study
[137] DaniAhmedHasan(1957)lsquoSrīhatta-NāgarīLipirUtpattioBikāśrsquoBanglaAcademyPatrika(Dhaka)Vol12(Bhadra-Agrahayan1364BangabdaNumber)pg1
[138] WikipediaSylhetiNagari httpsenwikipediaorgwikiSylheti_Nagariaccessedon1952018
[139] FuruiRyosuke(2015)lsquoVariegatedAdaptationsStateFormationinBengalfromtheFifthtoSeventhCenturyrsquoinBhairabiPrasadSahuampHermannKulkeedsInterrogatingPoliticalSystemsIntegrativeProcessesandStatesinPre-ModernIndiaChapter9Pp255-73NewDelhiManohar
[140] FergusonCharesAandMunierChowdhury(1960)lsquoPhonesofBengalirsquoLanguageVol36No1pp22-59
[141] ShahidullahMuhammad(2007)BuddhistMysticSongsDhakaMowlaBrothers
[142] RayPunyaSloka(1966)BengaliLanguageHandbookWashington
[143] HaiMuhammadAbdul(1960)AphoneticandphonologicalstudyofnasalsandnasalizationinBengaliDhakaUniversityofDhaka
[144] UnicodeConsortiumProposalSummaryFormtoAccompanySubmissionsforAdditionstotheRepertoireofISOIEC10646UNICODEhttpswwwunicodeorgL2L200202387r-syloti-formpdfaccessedonMay212018
[145] WikipediaOlChiki(Unicodeblock)httpsenwikipediaorgwikiOl_Chiki_(Unicode_block)accessedonMay212018
[146]BanglaScripthttpwwwbangladesh2000combdbangla_scripthtmlaccessedonMay212018
[147] BhattacharyaAshutoshed(1942)GopichandrerGanCalcuttaCalcuttaUniversity
[149] DasSisirKumar(1975)SahibsandMunshisAnAccountoftheCollegeofFortWilliamCalcutta
[150] IslamRafiqulPabitraSarkarMahbubulHaqampRajibChakraborty(eds)(2014)BanglaAcademyPromitoBanglaByabaharikByakaran(AFunctionalGrammarofStandardBangla)DhakaBanglaAcademy
[151] SarkarPabitra[2013]lsquoBanglaSpellingReformtheLongandShortofItrsquoBanglaJournal19215-232
[152] BanglaAcademy(2012)BanglaAcademyPromitoBanglaBananerNiyam(StandardBanglaSpellingasadoptedbyBanglaAcademy)DhakaBanglaAcademy
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. "What has happened So Far In terms of Script Reforms". Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka, on 10.07.2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 - Core Specification, Chapter 12, p. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language or script, certain restriction rules apply. In other words, some of the structures that the formalism permits do not necessarily occur in a given language. Restriction rules are set in place to take care of such cases, and they help to fine-tune the ABNF. In the case of Bangla (see footnote 13) in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold 'varga' (as defined under Table 5). Moreover, Bangla does not normally allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem 'Lichuchor' written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2. CH can come with Khanda Ta only in the case where C is ra (র) (U+09B0): র্ৎ, as in ভর্ৎসনা.
3. Only the following combinations with VHCM will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
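Rules 1 and 2 above can be sketched as two small checks. This is only an illustration under stated assumptions: the function names are hypothetical, the set of barred initial letters is read off the prose of rule 1 (ya is left out because the text itself cites exceptions), and rule 2 is taken literally as allowing only U+09B0 before Hasanta + Khanda Ta. The normative rules are the ones in the LGR XML file.

```python
# Assumed from rule 1: letters that may not open a label.
DISALLOWED_INITIAL = {
    "\u09CE",  # BENGALI LETTER KHANDA TA
    "\u099E",  # BENGALI LETTER NYA
    "\u0999",  # BENGALI LETTER NGA
}
HASANTA = "\u09CD"
KHANDA_TA = "\u09CE"
RA = "\u09B0"


def starts_legally(label: str) -> bool:
    """Rule 1: a non-empty label must not begin with a barred letter."""
    return bool(label) and label[0] not in DISALLOWED_INITIAL


def khanda_ta_context_ok(label: str) -> bool:
    """Rule 2: Hasanta + Khanda Ta is accepted only directly after ra."""
    for i, ch in enumerate(label):
        if ch == KHANDA_TA and i >= 1 and label[i - 1] == HASANTA:
            if i < 2 or label[i - 2] != RA:
                return False
    return True
```

For instance, ভর্ৎসনা passes the second check because its Khanda Ta follows র + Hasanta, while a hypothetical ক + Hasanta + ৎ would be rejected.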
10.2 'Sylheti Nagarī lipi' or 'Siloti'
This version of Bangla script resembles the 'Kaithī' script (ISO12954) used by the accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, which was widely in use during the 1880s. There were several other names of Sylheti Nagarī or Siloti (129), such as 'Jalalabada Nagarī', 'Fula (flower) Nagarī', 'Muslim Nagarī' or 'Muhammad Nagarī'. It is said that Shah Jalal had brought the script with him to Sylhet in the 13th-14th century (138), although some have suggested that it was an invention of the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script, with 32 symbols, are reproduced here (138).

13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
Table 17 - The Script Table of Sylheti Nagarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not to the extent that they need to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī

Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable
Table 18: Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhi

Bangla | Gurmukhi | NBGP decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable
Table 19: Bangla and Gurmukhi confusable code points

Bangla | Gurmukhi | NBGP decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable
Table 20 - Bangla and Gurmukhī distinguishable code points
10.3.3 Bangla and Oriya (Odia)

Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable
Table 21 - Bangla and Oriya confusable code points

Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable
Table 22 - Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs

Consonants | Phonetic Value | Allographs: Clusters, Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (not privileged enough to have clusters as a first member) Ouml(Ocirc+ঝ) Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (not privileged enough to have clusters as a first member) egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value, but it is mostly pronounced as n.
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒThesecondarysymbol(allograph)jɔ-phalahastwophoneticvaluesWhenaddedtotheinitialconsonantinaworditisavowelaelig(asinশ8ামলর 8াপারetc)Butafteranon-initialconsonantitjustdoublesitinpronunciation(asinকায6ধায6etc)The+যcombinationhastwophysicalmanifestationsmdashর 8andয6
ক8(p+য)স8(+য)র 8(+য)[Justর 8isneverusedinBanglaorthographyর 8াisbutthenitslasttwosymbolsYa-phalaa-karaconstituteavowelsignrepresentingthevowelঅ8া]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃকঃ
অব
purposes only or for archival purposes will not be considered for inclusion in the code-point repertoire.

4.1.1.2 Unambiguous Use
Every character proposed should have unambiguous understanding among linguists about its usage in the language.
4.2 Exclusion Principles
The main exclusion principle is that of External Limits of Scope. These consist of protocols or standards which are prerequisites to the Label Generation Rule-sets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.

4.2.1 External Limits of Scope
The code point repertoire for the root zone being a very special case at the top of the protocol hierarchies, the canvas of characters available for selection as part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:

i. The Unicode Chart
Out of all the characters that are needed by the script in question, if a particular character is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, especially so in the Bangla-Asamiyā-Manipuri Writing System, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.

ii. IDNA Protocol
Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded, as far as possible, all the characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters of the Unicode repertoire from being part of domain names.

iii. Maximal Starting Repertoire (MSR)
The Root Zone LGR being the repertoire of characters which are going to be used for the creation of the Root Zone TLDs, which in turn constitute an even more specialized case of domain names, the ROOT LGR procedure introduces additional exclusions on IDNA's allowed set of characters.
Example: Bangla Sign Avagraha ঽ (U+09BD), even if allowed by the IDNA protocol, is not permitted in the Root Zone Repertoire as per the MSR.
To sum up, the restrictions start off by admitting only such characters as are part of the code block of the given script/language. The IDNA Protocol further narrows this down, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.

4.2.1.1 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brāhmī-based scripts will not be included.

4.2.1.2 No Symbols and Abbreviations
Abbreviations, weights and measures, and other such iconic characters like BANGLA ISSHAR (U+09FA), BANGLA CURRENCY DENOMINATOR SIXTEEN ৹ (U+09F9) etc. will also not be included.

4.2.1.3 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, such as the Sanskritic VOCALIC RR ৠ (U+09E0), VOCALIC L ঌ (U+098C) and VOCALIC LL ৡ (U+09E1), and the allographic -kāra forms of the latter two symbols, VOWEL SIGN VOCALIC L (U+09E2) and VOWEL SIGN VOCALIC LL (U+09E3). All such characters are excluded, which complies with the Conservatism principle as laid down in the Root Zone LGR procedure. However, in Bangla the -kāra corresponding to VOCALIC RR ৠ (U+09E0), which is VOWEL SIGN VOCALIC RR (U+09C4), is still in active use in certain limited borrowed or Sanskritic words and is therefore retained.

4.2.1.4 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit will not be included. This is also in consonance with the Letter principle as laid down in the Root Zone LGR procedure.

4.2.1.5 ABNF
The Augmented Backus-Naur Formalism (ABNF) is described in Section 5.4.1 and in the Appendix (Section 10.1).
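The three successive filters of Section 4.2.1 can be pictured as a simple chain of membership tests. The sketch below is illustrative only: the function name is hypothetical, and the three toy sets merely echo the examples given in the text (a consonant that survives, the avagraha that IDNA allows but the MSR does not, and a symbol that is stopped earlier); they are not the real IDNA or MSR tables.

```python
def root_zone_candidates(script_block, idna_pvalid, msr):
    """Apply the filters in order: Unicode block, then IDNA, then MSR."""
    in_unicode = list(script_block)                        # filter i
    after_idna = [cp for cp in in_unicode if cp in idna_pvalid]  # filter ii
    return [cp for cp in after_idna if cp in msr]          # filter iii


# Toy data (assumed, for illustration only).
TOY_BLOCK = [0x0995, 0x09BD, 0x09FA]   # KA, AVAGRAHA, ISSHAR
TOY_IDNA_PVALID = {0x0995, 0x09BD}     # the symbol U+09FA is not admitted
TOY_MSR = {0x0995}                     # the avagraha is dropped by the MSR

survivors = root_zone_candidates(TOY_BLOCK, TOY_IDNA_PVALID, TOY_MSR)
```

With these toy sets only U+0995 reaches the end of the chain, which mirrors the narrowing described above.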
5 Repertoire
The Bangla Writing System is represented in UNICODE using the Bengali (Bangla) script name as enumerated in ISO 15924, corresponding to languages such as Asamiyā (Assamese), Bangla (Bengali) and Manipuri. The BENGALI block used for Bangla-Asamiyā-Manipuri in UNICODE has 93 entries. This section details the code-point repertoire that the Neo-Brāhmī Generation Panel [NBGP] proposes to be included in the Bangla LGR.

It may be mentioned here that the Government of Assam submitted a proposal to the Bureau of Indian Standards (BIS) on 26th February 2016 for dis-unification of the Bangla and Asamiyā scripts. The BIS, in the 8th Meeting of its Indian Language Technologies and Products Sectional Committee, LITD 20, held on 23rd Aug 2017, decided to refer the proposal for recognition of the Assamese script in ISO/IEC 10646 to ISO. Until the UNICODE Consortium takes any further action, it will be assumed that the Code Point Repertoire under Table 8 is valid for all three languages above.

For each of the code points, language references have been given in the last column, titled "References and Comment", of Table 8, the "Code Point Repertoire". For entire coverage of Bangla code points, references for Bangla, Asamiyā (Assamese), Manipuri (Meitei) and Bishnupriya are given. Kokborok written in Bangla script is not known to have introduced many new complications, except for one particular character. Though only a few representative languages under EGIDS Scale 1-4 have been chosen for referencing, together they cover all the code points required for all the languages that the NBGP has considered, as given under Bangla Unicode Points (as given in UNICODE 6.3).

However, before the details are presented, it is useful to look at the Bangla Code Point Chart from the Maximal Starting Repertoire [MSR] Version 3. It may be noted that the shapes of the reference glyphs given below in the code charts are based on one of the many fonts designed and are not prescriptive, because there could be some variations in actual fonts, both UNICODE-compatible and True-Type ones. Consider the following code point table:
Figure 1: Bangla Code Page from [MSR] for Bangla-Asamiyā-Manipuri

Colour convention:
All characters that are included in the [MSR]: yellow background
PVALID in IDNA2008 but excluded from the [MSR]: pinkish background
Not PVALID in IDNA2008, or ineligible for the root zone (digits, hyphen): white background

Given the Bangla Unicode block as in Figure 1, among the code points that are included in the MSR, the following symbols will need separate treatment:
ৎ U+09CE Bangla Letter Khanda-Ta
ৰ U+09F0 Asamiyā-Bangla Letter Ra With Middle Diagonal
ৱ U+09F1 Asamiyā-Bangla Letter Ra With Lower Diagonal
5.1 Code Point Repertoire Inclusion

No | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
1 | U+0981 | ঁ | BENGALI SIGN CANDRABINDU | Candra-bindu | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
2 | U+0982 | ং | BENGALI SIGN ANUSVARA | Onushshar (Anusvara) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
3 | U+0983 | ঃ | BENGALI SIGN VISARGA | Biśarga (Visarga) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
4 | U+0985 | অ | BENGALI LETTER A | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
5 | U+0986 | আ | BENGALI LETTER AA | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
6 | U+0987 | ই | BENGALI LETTER I | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
7 | U+0988 | ঈ | BENGALI LETTER II | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
8 | U+0989 | উ | BENGALI LETTER U | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
9 | U+098A | ঊ | BENGALI LETTER UU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
10 | U+098B | ঋ | BENGALI LETTER VOCALIC R | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
11 | U+098F | এ | BENGALI LETTER E | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
12 | U+0990 | ঐ | BENGALI LETTER AI | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
13 | U+0993 | ও | BENGALI LETTER O | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
14 | U+0994 | ঔ | BENGALI LETTER AU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
15 | U+0995 | ক | BENGALI LETTER KA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
16 | U+0996 | খ | BENGALI LETTER KHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
17 | U+0997 | গ | BENGALI LETTER GA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
18 | U+0998 | ঘ | BENGALI LETTER GHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
19 | U+0999 | ঙ | BENGALI LETTER NGA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
20 | U+099A | চ | BENGALI LETTER CA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
21 | U+099B | ছ | BENGALI LETTER CHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
22 | U+099C | জ | BENGALI LETTER JA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
23 | U+099D | ঝ | BENGALI LETTER JHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
24 | U+099E | ঞ | BENGALI LETTER NYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
25 | U+099F | ট | BENGALI LETTER TTA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
26 | U+09A0 | ঠ | BENGALI LETTER TTHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
27 | U+09A1 | ড | BENGALI LETTER DDA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
28 | 09A1 09BC (U+09DC) | ড় | Normalized form of BENGALI LETTER RRA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125] 09DC is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
29 | U+09A2 | ঢ | BENGALI LETTER DDHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
30 | 09A2 09BC (U+09DD) | ঢ় | Normalized form of BENGALI LETTER RHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125] 09DD is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
31 | U+09A3 | ণ | BENGALI LETTER NNA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
32 | U+09A4 | ত | BENGALI LETTER TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
33 | U+09A5 | থ | BENGALI LETTER THA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
34 | U+09A6 | দ | BENGALI LETTER DA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
35 | U+09A7 | ধ | BENGALI LETTER DHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
36 | U+09A8 | ন | BENGALI LETTER NA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
37 | U+09AA | প | BENGALI LETTER PA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
38 | U+09AB | ফ | BENGALI LETTER PHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
39 | U+09AC | ব | BENGALI LETTER BA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
43 | 09AF 09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125] 09DF is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112] [125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
50 | U+09BE | া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
51 | U+09BF | ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
52 | U+09C0 | ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
53 | U+09C1 | ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112] [122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (=Halant) Virama (=Da ri) | 1 Bangla, 2 Assamese, 2 Manipuri | [112] [122] [125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122] [125]
Table 8: Bangla Code-Point Repertoire
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes some specific sequences. These allow the conditional inclusion in the repertoire of Bangla LETTER A or LETTER E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, followed in turn by Bangla VOWEL SIGN AA, in order to represent the æ sound as in English 'bat', 'cat' etc.
Sr No | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
S1 | 0985 09CD 09AF 09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112] [122]
S2 | 098F 09CD 09AF 09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9: Sequences
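As a quick cross-check of Table 9, the two sequences can be spelled out from their code points and each code point looked up in the Unicode character database. This is a minimal sketch; the names `SEQUENCES` and `describe` are illustrative and not part of the LGR.

```python
import unicodedata

# The two sequences of Table 9, written as code point escapes.
SEQUENCES = {
    "S1": "\u0985\u09CD\u09AF\u09BE",
    "S2": "\u098F\u09CD\u09AF\u09BE",
}


def describe(sequence: str) -> list:
    """Return 'U+XXXX NAME' for every code point of the sequence."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in sequence]


for key, seq in SEQUENCES.items():
    print(key, seq, describe(seq))
```

The printed names should match the "Character Names" column above, one name per code point.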
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find a place in Unicode but have not been included in the repertoire in the LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.

Sr No | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10: Excluded Code Points
5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire since it will never be used alone. It will be used only in sequence, in the normalized forms of three special characters: ড়, ঢ় and য়.

Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য, to form ড়, ঢ় and য় respectively
Table 10b: Excluded Code Points
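The reason the nukta letters appear in the repertoire as two-code-point sequences can be observed with Unicode normalization: U+09DC, U+09DD and U+09DF are excluded from composition, so NFC turns each of them into base letter plus U+09BC. The following is a small sketch using Python's standard `unicodedata` module; the dictionary and function names are illustrative.

```python
import unicodedata

# Precomposed letter -> sequence listed in Table 8 (rows 28, 30 and 43).
NUKTA_LETTERS = {
    "\u09DC": "\u09A1\u09BC",  # RRA = DDA + NUKTA
    "\u09DD": "\u09A2\u09BC",  # RHA = DDHA + NUKTA
    "\u09DF": "\u09AF\u09BC",  # YYA = YA + NUKTA
}


def nfc(text: str) -> str:
    """Normalize to NFC, the form in which labels are compared."""
    return unicodedata.normalize("NFC", text)


for precomposed, sequence in NUKTA_LETTERS.items():
    # Both spellings normalize to the same base + nukta sequence.
    print(f"U+{ord(precomposed):04X}", nfc(precomposed) == sequence)
```

Because the normalized form always contains U+09BC, the nukta has to be reachable through these sequences even though it is not listed as a stand-alone code point.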
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for .भारत or .ভারত, drafted between 20.11.2009 and 18.07.2013.

5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) began with the following variables:

C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta

The Augmented Backus-Naur Formalism (ABNF) will use the following operators:

Sr Number | Operator | Function
1 | "|" | Alternative
2 | "[ ]" | Optional
3 | "*" | Variable Repetition
4 | "( )" | Sequence Group
Table 11: The ABNF Formalism

In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given to facilitate understanding.
5.4.2 The Vowel Sequence
To help users of other Brāhmī-derived scripts, equivalents in Devanāgarī are provided wherever necessary.

A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra/onuʃʃār (B), a Candrabindu (D) or a Visarga/biʃɔrgo (X). The number of D, B or X which can follow a V in Bangla may not be restricted to one. Going by the rules illustrated in the document, it is clear that formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have sequences such as a candrabindu followed by an anusvāra on the one hand, and a candrabindu followed by a visarga on the other, as in হ্যাঁংচা 'hæncha' and হ্যাঁঃ 'hæn' respectively. The possibility of a Visarga or Anusvāra (onuʃʃār) following a Candrabindu thus exists in Bangla.

A vowel can optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phala. Ya-phala is a presentation form of U+09AF য BENGALI LETTER YA ('ya'), represented by the sequence <U+09CD ্ BENGALI SIGN VIRAMA (Hasanta), U+09AF য BENGALI LETTER YA>. Ya-phala has a special form (্য). When combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. 'aa' (ā)), it is used for transcribing [æ], as in the "a" of the English word "bat", written in Bangla as ব্যাট. A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example:
V       অ       अ

5.4.2.2 A Vowel with Conditions
A vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples:
VB      অং      अं
VD      অঁ      अँ
VX      অঃ      अः
VDB     অঁং     अँं
VDX     অঁঃ     अँः
VHCM    অ্যা, এ্যা

5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
VHCMB   অ্যাং, এ্যাং
VHCMD   অ্যাঁ, এ্যাঁ
VHCMX   অ্যাঃ, এ্যাঃ
VHCMDB  অ্যাঁং, এ্যাঁং
VHCMDX  অ্যাঁঃ, এ্যাঁঃ
5.4.3 The Consonant Sequence

5.4.3.1 A Single Consonant (C)
Example:
C       ক       क

5.4.3.2 A Consonant with Conditions
A consonant may optionally be followed by a dependent vowel sign kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
CM      কি      कि
CB      কং      कं
CD      কঁ      कँ
CX      কঃ      कः
CH      ক্      क् (pure consonant)
CDB     কঁং     कँं
CDX     কঁঃ     कँः

5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB or DX.
Examples:
CMB     কীং     कीं
CMD     কাঁ     काँ
CMX     বীঃ     वीः
CMDB    কাঁং    काँं
CMDX    কাঁঃ    काँः

5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Examples:
CHC       ন্ত → ন + ্ + ত                     न्त → न + ् + त
CHCHC     ন্ত্র → ন + ্ + ত + ্ + র             न्त्र → न + ् + त + ् + र
CHCHCHC   ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য     न्त्र्य → न + ् + त + ् + र + ् + य

5.4.3.5 Subsets
As a representative example of the subsets, only the combination CHC is considered here; the same applies equally to CHCHC and CHCHCHC.

[A] The combination may be followed by M, B, D, X, DB or DX.
Examples:
CHCM    ক্কী → ক + ্ + ক + ী            क्की → क + ् + क + ी
CHCB    ক্কং → ক + ্ + ক + ং            क्कं → क + ् + क + ं
CHCD    ক্কঁ → ক + ্ + ক + ঁ            क्कँ → क + ् + क + ँ
CHCX    ক্কঃ → ক + ্ + ক + ঃ            क्कः → क + ् + क + ः
CHCDB   ক্কঁং → ক + ্ + ক + ঁ + ং        क्कँं → क + ् + क + ँ + ं
CHCDX   ক্কঁঃ → ক + ্ + ক + ঁ + ঃ        क्कँः → क + ् + क + ँ + ः

[B] *3(CH)CM may further be followed by B, D, X, DB or DX.
Examples:
CHCMB   ক্কীং → ক + ্ + ক + ী + ং        क्कीं → क + ् + क + ी + ं
CHCMD   ক্কাঁ → ক + ্ + ক + া + ঁ        क्काँ → क + ् + क + ा + ँ
CHCMX   ক্কীঃ → ক + ্ + ক + ী + ঃ        क्कीः → क + ् + क + ी + ः
CHCMDB  ক্কাঁং → ক + ্ + ক + া + ঁ + ং    क्काँं → क + ् + क + ा + ँ + ं
CHCMDX  ক্কাঁঃ → ক + ্ + ক + া + ঁ + ঃ    क्काँः → क + ् + क + ा + ँ + ः
5.4.4 The Khanda-Ta Sequence

5.4.4.1 A Single 'Khanda'-Ta (Z)
Example:
Z       ৎ

5.4.4.2 A Khanda Ta Combination (see footnote 10)
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) 'scolding'.
Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
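The vowel, consonant and Khanda-Ta sequences of Sections 5.4.2 to 5.4.4 can be approximated with regular expressions. The sketch below is one possible reading, not the normative rule set: the character classes are copied from Table 8, VHCM is restricted to the two sequences of Table 9, a bare CH is accepted only for a single consonant, and all names (`classify`, `TAIL`, and so on) are illustrative assumptions.

```python
import re

# Character classes taken from Table 8 (assumed complete for this sketch).
V = "[\u0985-\u098B\u098F\u0990\u0993\u0994]"
C = ("(?:[\u09A1\u09A2\u09AF]\u09BC"            # nukta sequences first
     "|[\u0995-\u09A8\u09AA-\u09B0\u09B2\u09B6-\u09B9\u09F0\u09F1])")
M = "[\u09BE-\u09C4\u09C7\u09C8\u09CB\u09CC]"
H = "\u09CD"
Z = "\u09CE"
# Optional ending: B, D, X, DB or DX (D = U+0981, B = U+0982, X = U+0983).
TAIL = "(?:\u0981[\u0982\u0983]?|[\u0982\u0983])"
# VHCM is only allowed as S1 or S2 from Table 9.
S = "[\u0985\u098F]" + H + "\u09AF\u09BE"

VOWEL_SEQ = re.compile("(?:" + S + "|" + V + ")" + TAIL + "?")
CONSONANT_SEQ = re.compile(
    "(?:" + C + H                                # CH: pure consonant
    + "|(?:" + C + H + "){0,3}" + C + M + "?" + TAIL + "?)"
)
KHANDA_SEQ = re.compile("(?:[\u09B0\u09F0]" + H + ")?" + Z)


def classify(unit: str):
    """Return the kind of sequence that matches the whole unit, or None."""
    for kind, pattern in (("vowel", VOWEL_SEQ),
                          ("consonant", CONSONANT_SEQ),
                          ("khanda-ta", KHANDA_SEQ)):
        if pattern.fullmatch(unit):
            return kind
    return None
```

Under these assumptions `classify` accepts units such as অঁং (VDB), ক্কীং (CHCMB) and র্ৎ ([CH]Z), and rejects VBB or a cluster of five consonants. A whole label would then be checked by splitting it into such units, which this sketch does not attempt.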
5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) can be described briefly here. Let us take up S first. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ BANGLA LETTER A and U+098F এ BANGLA LETTER E). To render a Ya-phalā after অ or এ, it is necessary to type U+09CD plus U+09AF ya after the said vowel. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English 'bat', 'fat' etc. The Brāhmī script by nature does not have a Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can carry the Hasanta mark. But, as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).

The other case, listed in Table 16 as P, refers to the formation of repha and ra-phalā in the said script. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALĀ). For instance:
repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun")
ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra "cycle")
The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.

10 Refer to Rule P in Section 7, Table 16.
6 Variants
This section talks about the variants in the Bangla script. The NBGP categorizes these confusable variants into two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from the character formations normally perceived by the larger linguistic community

For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These are not cases of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formation. They can cause confusion even to a careful observer and are hence being proposed as variants.

Variants are generated in a script when two or more identical-looking forms are produced from different stored sequences of code points. In Bangla, the e-kāra, the ā-kāra and the o-kāra have different code points. One can type o-kāra with a consonant in one go, or type e-kāra and ā-kāra as two separate keys, and get the same result. A reader cannot differentiate between the two forms of ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered as a case of variant, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants
However, we propose two cases of true in-script variants in Bangla script.

CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases wherein (U+09A5) থ (tha) appears with Hasanta as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, could be a little confusing, since the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final position (as shown below in example no. 1). It could also be typed wrongly, on the assumption that it was a হ (ha) U+09B9, which increases the risks in label writing and identification.
Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)
The fonts which represent the traditional Bangla writing system tend to create this problem. Therefore these may be taken as cases of variants in Bangla.

CASE II
Another interesting example of variants is encountered in the ra + Hasanta and Hasanta + ra combinations when writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra). 'Repha' could be formed by two sequences (mainly because both Assamese and Bangla share the same UNICODE block, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here the final ligatures look the same and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C

Where
B_RA = U+09B0 BENGALI LETTER RA (র), or
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)

Example:
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12: Example of Repha
Note: As the Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. The addition of Repha does not make any difference.

'Ra-phala' could be formed by two sequences on similar grounds, and the final ligatures would look the same:

(1) C1 + H + B_RA
(2) C1 + H + A_RA

Where C1 = any consonant except Khanda-ta

Example:
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusion for end users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP decided to define র and ৰ as variant code points, so that a single variant set containing র and ৰ covers all cases. This will, however, create blocked variant labels: for example, if someone registers "ররর", the label "ৰৰৰ" will be generated as a variant and will be blocked, and vice versa. The blocking applies only at the label level: if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, that is still possible.
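The effect of treating র and ৰ as variant code points can be sketched by enumerating, for a given label, every label obtained by swapping one ra for the other at any position. This is a simplified illustration, assuming that this single variant set is the only one in play; the function names are hypothetical, and the real computation is performed by an LGR processing tool from the XML rules.

```python
from itertools import product

BANGLA_RA = "\u09B0"     # BENGALI LETTER RA
ASSAMESE_RA = "\u09F0"   # BENGALI LETTER RA WITH MIDDLE DIAGONAL
RA_SET = (BANGLA_RA, ASSAMESE_RA)


def variant_labels(label: str) -> set:
    """All labels differing from `label` only in the choice of ra."""
    choices = [RA_SET if ch in RA_SET else (ch,) for ch in label]
    return {"".join(combo) for combo in product(*choices)} - {label}


def are_variants(a: str, b: str) -> bool:
    """True when b would be blocked by a registration of a."""
    return b in variant_labels(a)
```

For a label of three ra letters this yields seven other labels, the all-Assamese spelling among them, whereas a label of different length is never a variant, in line with the remark that blocking applies only at the label level.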
6.2 Cross-Script Variants
A concise cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia¹¹ (formerly Oriya), keeping in mind the visual and technical confusion they may cause as labels in the domain-name space. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain other characters which are somewhat similar, they have not been included here. They are provided in the Appendix (Section 10.3) for reference.
11 Unicode uses Oriya for the script although Odia is now the official term used
1. Bangla and Nāgarī/Devanāgarī Script
Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14: Bangla and Devanāgarī cross-script variant code points
2. Bangla and Gurmukhī Script
Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15: Bangla and Gurmukhī cross-script variant code points
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Bangla¹² script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications. Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories, as mentioned in the table provided in the Code Point Repertoire (Section 5.1).
C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
12 As used by the Unicode denoting and including both Assamese and Maṇipuri
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), i.e. the (æ) Ya-phalā sequences (V1 H C1 M1), where V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E), H is 09CD (্ - BENGALI SIGN VIRAMA), C1 is 09AF (য - BENGALI LETTER YA) and M1 is 09BE (া - BENGALI VOWEL SIGN AA). S1 and S2 are valid even though they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - Assamese letter RA; Unicode name BENGALI LETTER RA WITH MIDDLE DIAGONAL) and H is 09CD (্ - BENGALI SIGN VIRAMA)
Table 16: Symbols used in WLE rules
It is also worth mentioning here that in Bangla the consonant letters (or graphemes) are physically joined to form “clusters” that could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are “nearly 380 unique consonant clusters”, out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, Pg. 4]. More details of such combinations can be seen in Pabitra Sarkar (1993) [135].
7.1 Final Set of WLE Rules
Considering the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়.
2. H must be preceded by C. Example: ক্
3. M must be preceded by C. Example: কা
4. D must be preceded by either of V, C or M. Example: আঁ খঁ খাঁ হাঁ
5. X must be preceded by either of V, C, M or D. Example: উঃ খঃ বঃ াঃ দঁঃ
6. B must be preceded by either of V, C, M or D. Example: আং ইং কং
7. Z must be preceded by V, C, M, D, B, X or P. Example: ইৎ কৎ াৎ াৎ প6ৎ rৎ (S is not listed because S ends with M. Z may also follow S.)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V preceded by H.
9. S CANNOT be preceded by H.
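The context rules listed above can be sketched as a small Python checker. This is an illustrative reading of rules 1 to 9 only, not the normative XML ruleset: the names `tokenize` and `is_valid_label` are hypothetical, the category sets are transcribed from Table 8, and the sketch assumes that S counts as ending in M for rules 4 to 7 (the text states this only for Z).

```python
def _chars(*codes):
    return {chr(c) for c in codes}

VOWELS = _chars(*range(0x0985, 0x098C), 0x098F, 0x0990, 0x0993, 0x0994)
CONSONANTS = _chars(*range(0x0995, 0x09A9), *range(0x09AA, 0x09B1), 0x09B2,
                    *range(0x09B6, 0x09BA), 0x09F0, 0x09F1)
MATRAS = _chars(*range(0x09BE, 0x09C5), 0x09C7, 0x09C8, 0x09CB, 0x09CC)
CATEGORY = {ch: "V" for ch in VOWELS}
CATEGORY.update({ch: "C" for ch in CONSONANTS})
CATEGORY.update({ch: "M" for ch in MATRAS})
CATEGORY.update({"\u0982": "B", "\u0981": "D", "\u0983": "X",
                 "\u09CD": "H", "\u09CE": "Z"})

S_SEQUENCES = {"\u0985\u09CD\u09AF\u09BE", "\u098F\u09CD\u09AF\u09BE"}  # S1, S2
NUKTA_FORMS = {"\u09A1\u09BC", "\u09A2\u09BC", "\u09AF\u09BC"}          # CN (rule 1)
RA = {"\u09B0", "\u09F0"}

# Rules 2-7: which categories may immediately precede each symbol.
ALLOWED_BEFORE = {
    "H": {"C"}, "M": {"C"}, "D": {"V", "C", "M"},
    "X": {"V", "C", "M", "D"}, "B": {"V", "C", "M", "D"},
    "Z": {"V", "C", "M", "D", "B", "X"},
}


def tokenize(label):
    """Split a label into (category, text) tokens; None if a code point is unknown."""
    tokens, i = [], 0
    while i < len(label):
        if label[i:i + 4] in S_SEQUENCES:
            tokens.append(("S", label[i:i + 4])); i += 4
        elif label[i:i + 2] in NUKTA_FORMS:
            tokens.append(("C", label[i:i + 2])); i += 2
        elif label[i] in CATEGORY:
            tokens.append((CATEGORY[label[i]], label[i])); i += 1
        else:
            return None
    return tokens


def is_valid_label(label):
    tokens = tokenize(label)
    if not tokens:
        return False
    for idx, (cat, _) in enumerate(tokens):
        prev_cat = tokens[idx - 1][0] if idx > 0 else None
        if prev_cat == "S":          # assumption: S behaves like its final M
            prev_cat = "M"
        if cat in ALLOWED_BEFORE:
            ok = prev_cat in ALLOWED_BEFORE[cat]
            # Rule 7: Z may also follow P (ra + Hasanta).
            if cat == "Z" and prev_cat == "H" and idx >= 2 and tokens[idx - 2][1] in RA:
                ok = True
            if not ok:
                return False
        elif cat in ("V", "S") and prev_cat == "H":   # rules 8 and 9
            return False
    return True
```

Matching S1 and S2 before single code points is what lets অ্যা pass even though its internal V + H pair would otherwise violate rule 2.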
Now let us elaborate on the rules with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in allowing such ligatures, because any user can easily create them even though they may not be attested combinations.
7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া /bæŋkʌv ɪndiə/ (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE), meaning “Bank of India”. This is a case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, a space or the Zero Width Non-Joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptual similarity between two labels (with and without H) for the majority of the linguistic communities; hence this is explicitly prohibited by the NBGP.
In future, if required, depending on the prevailing requirements from the community, a future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules the ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.
1. H, M, B, D or X cannot occur at the beginning of a Bangla word. Example:
্ক ्क
াক ाक
ংক ंक
ঁক ँक
ঃক ःक
As can be seen, such a combination will automatically result in a “golu” or a dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.
2. H is not permitted after V, B, D, X, M or S.
Example:
অ अ
অং क ক क কঃ कः ি )क
3. The number of B, D or X permitted after a Consonant, a Vowel or a Kāra (Mātrā) is restricted to one; thus the following combinations are invalidated:
Example
কংং क
ক क
কঃঃ कःः
কা का
কীঃঃ कःः
অংং अ
অ अ
অঃঃ अःः
4. The number of M permitted after a Consonant is restricted to one. Example:
কীী कीी
5. M is not permitted after V. Example:
ইা ঈৗ ईा ईौ
6. The combinations of Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Example:
কংঃ कंः
কঃং कःं
8 Contributors
8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members
8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals’ Systems, Dhaka
9 References
[100] Unicode Consortium 2017 Unicode Standard 10.0 Mountain View CA
[101] BandyopadhyayChittaranjan1981DuiShatakerBanglaMudranoPrakashanKolkataAnandaPublishers
[102] BanerjiRD1919TheOriginoftheBengaliScriptKolkataNewDelhiAsianEducationalServices2003reprint
[103] ChatterjiSK1926TheOriginandDevelopmentoftheBengaliLanguageCalcuttaCalcuttaUniversityPress
[104] -----1939Bhasha-prakashBangalaVyakaran(AGrammaroftheBengaliLanguage)CalcuttaUniversityofCalcutta
[105] HaiMuhammadAbdul1964DhvaniVijnanOBanglaDhvani-tattwa(PhoneticsandBengaliPhonology)DhakaBanglaAcademy
[106]JhaSubhadra1958TheFormationofMaithiliLondonLuzacampCo
[107] KosticDjordjeDasRheaS1972AShortOutlineofBengaliPhoneticsCalcuttaStatisticalPublishingCompany
[108] MajumdarRC1971HistoryofAncientBengalCalcuttaGBhardwaj
[109] MazumdarBijaychandra19202000TheHistoryoftheBengaliLanguage(ReprCalcutta1920ed)NewDelhiAsianEducationalServices
[110] PandeyAnshuman2001ProposaltoEncodetheTirhutaScriptinISOIEC10646
[111] PalPalashBaran2001DhwanimalaBarnamalaKolkataPapyrus
[112] -----2007lsquoBanglaHarapherPanchParbarsquoInSwapanChakrabortyedMudranerSanskritiOBanglaBoiKolkataAbabhas
[113] RossFiona1999ThePrintedBengaliCharacteranditsEvolutionLondonCurzon
[114] ShastriMahamahopadhyayHaraPrasad1916HājārBacharērPurāṇaBāṅgālāBhāṣāyBauddhaGānōDōhāCalcuttaBangiyaSahityaParishat
[115] SinghUdayaNarayana(JointlyManiruzzaman)1983DiglossiainBangladeshandlanguageplanningCalcuttaGyanBharati214pp
[116] -----1987ABibliographyofBengaliLinguisticsMysoreCIILxii+316pp
[117] -----2017(withRajibChakrabortyBidishaBhattacharjeeampArimardanKumarTripathy)LanguagesandCulturesontheMarginGuidelinesforFieldworkonEndangeredLanguagesMimeoCentreforEndangeredLanguagesVisva-Bharati
[118] -----1980ScriptalchoiceandspellingreformAnessayinlanguageandplanningJournaloftheMSUniversityofBarodaSocialScienceNumber292173-186AmodifiedversionreprintedEAnnamalaiBjornJernuddandJoanRubinedsLanguagePlanningProceedingsofanInstituteMysoreCIIL405-417
[119] Sripantha1996JakhanChapakhanaEloKolkataPaschim-BangaBanglaAcademy
[120] SurAtul1986BanglaMudranerDushoBacharKolkataJijnasa
[121] ScriptBehaviourforBengaliVersion11TDILandC-DACPune
[122] BoraMahendra1981TheEvolutionofAssameseScriptJorhatAssamSahityaSabha
[123] ProposaltoEncodetheTirhutaScriptinISOIEC10646httpwwwunicodeorgL2L201111175r-tirhutapdfaccessedon25112017
[124] EthnologueAssameseintheLanguageCloudhttpswwwethnologuecomcloudasmaccessedon25112017
[125] BengalialphabetforManipurifoundinEthnologueManipuri(MeeteilonMeithei)httpswwwomniglotcomwritingmanipurihtmaccessedon20102019
[126] WikipediaBengalialphabethttpsenwikipediaorgwikiBengali_alphabetaccessedon25112017
[129]OmniglotSlyhetihttpwwwomniglotcomwritingsylotihtmaccessedon1052018
[130]WikipediaBishnupriyaManipurilanguagehttpsenwikipediaorgwikiBishnupriya_Manipuri_languageaccessedon25112017
[131] TheEMILLECIILCorpushttpmetashareeldaorgrepositorybrowsethe-emilleciil-corpusabdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20accessedon1052018
[132]TheEMILLECIILCorpushttpcatalogelrainfoproduct_infophpproducts_id=696accessedon1052018
[133] BanglaLanguageampScript httpswwwisicalacin~rc_banglabanglahtmlaccessedon1052018
[134] SarkarPabitra1992BanglaBananSanskarSamasyaoSambhabanaKolkataChirayataPrakashan
[135] SarkarPabitra1993BanglaBhasharYuktabyanjanBhasha1123-45
[136] DashNiladriShekharandBBChaudhuri1998BanglaScriptAStructuralStudyLinguisticsToday121-28Alsoavailableathttpswwwacademiaedu9967428Bangla_Script_A_Structural_Study
[137] DaniAhmedHasan(1957)lsquoSrīhatta-NāgarīLipirUtpattioBikāśrsquoBanglaAcademyPatrika(Dhaka)Vol12(Bhadra-Agrahayan1364BangabdaNumber)pg1
[138] WikipediaSylhetiNagari httpsenwikipediaorgwikiSylheti_Nagariaccessedon1952018
[139] FuruiRyosuke(2015)lsquoVariegatedAdaptationsStateFormationinBengalfromtheFifthtoSeventhCenturyrsquoinBhairabiPrasadSahuampHermannKulkeedsInterrogatingPoliticalSystemsIntegrativeProcessesandStatesinPre-ModernIndiaChapter9Pp255-73NewDelhiManohar
[140] FergusonCharesAandMunierChowdhury(1960)lsquoPhonesofBengalirsquoLanguageVol36No1pp22-59
[141] ShahidullahMuhammad(2007)BuddhistMysticSongsDhakaMowlaBrothers
[142] RayPunyaSloka(1966)BengaliLanguageHandbookWashington
[143] HaiMuhammadAbdul(1960)AphoneticandphonologicalstudyofnasalsandnasalizationinBengaliDhakaUniversityofDhaka
[144] UnicodeConsortiumProposalSummaryFormtoAccompanySubmissionsforAdditionstotheRepertoireofISOIEC10646UNICODEhttpswwwunicodeorgL2L200202387r-syloti-formpdfaccessedonMay212018
[145] WikipediaOlChiki(Unicodeblock)httpsenwikipediaorgwikiOl_Chiki_(Unicode_block)accessedonMay212018
[146]BanglaScripthttpwwwbangladesh2000combdbangla_scripthtmlaccessedonMay212018
[147] BhattacharyaAshutoshed(1942)GopichandrerGanCalcuttaCalcuttaUniversity
[149] DasSisirKumar(1975)SahibsandMunshisAnAccountoftheCollegeofFortWilliamCalcutta
[150] IslamRafiqulPabitraSarkarMahbubulHaqampRajibChakraborty(eds)(2014)BanglaAcademyPromitoBanglaByabaharikByakaran(AFunctionalGrammarofStandardBangla)DhakaBanglaAcademy
[151] SarkarPabitra[2013]lsquoBanglaSpellingReformtheLongandShortofItrsquoBanglaJournal19215-232
[152] BanglaAcademy(2012)BanglaAcademyPromitoBanglaBananerNiyam(StandardBanglaSpellingasadoptedbyBanglaAcademy)DhakaBanglaAcademy
[153] SarkarPabitraampRajibChakraborty2018ldquoWhathashappenedSoFarIntermsof Script Reformsrdquo Paper presented at the Face to Face meeting jointly held by theBanglaAcademyDhakaampICANNatBanglaAcademyDhakaon10072018
[154] The Unicode Consortium 2018 The Unicodereg Standard Version 110 ndash CoreSpecificationChapter12P473
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language/script, certain restriction rules apply. In other words, in a given language some of the formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla¹³ in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold ‘varga’ (as defined under Table 5). Bangla does not normally allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem ‘Lichuchor’ written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2. C H can come with Khanda Ta only in the case where C is ra (র) (09B0):
র্ৎ as in ভর্ৎসনা
3. Only the following combinations with V H C M will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
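The label-initial restriction in rule 1 above can be expressed as a one-line check. The Python sketch below is only illustrative: `may_start_label` is a hypothetical name, and the sketch covers just the three code points that rule 1 explicitly bars from the first position (it does not model the য় discussion or the transliteration cases mentioned in the same rule).

```python
# Code points that rule 1 does not allow at the start of a label.
FORBIDDEN_INITIAL = {
    "\u09CE",  # BENGALI LETTER KHANDA TA
    "\u099E",  # BENGALI LETTER NYA
    "\u0999",  # BENGALI LETTER NGA
}


def may_start_label(label: str) -> bool:
    """True if the first code point of a non-empty label is permitted initially."""
    return bool(label) and label[0] not in FORBIDDEN_INITIAL


print(may_start_label("\u0995\u09BE"))                          # begins with KA
print(may_start_label("\u09CE\u09B8\u09C7\u09B0\u09BF\u0982"))  # begins with KHANDA TA
```

Under this reading, a transliteration such as the Tsering example would be rejected, since the rule takes precedence over the attested spelling.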
10.2 ‘Sylheti Nagarī lipi’ or ‘Siloti’
This version of the Bangla script resembles the ‘Kaithī’ script (ISO 15924) used by the accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, widely in use during the 1880s. There were several other names of Sylheti
13 This section specifically takes up issues of restrictions pertaining to Bangla (Bengali) language Assamese and Maṇipuri have not been covered in this section
Nagarī or Siloti (129), such as ‘Jalalabada Nagarī’, ‘Fula (flower) Nagarī’, ‘Muslim Nagarī’ or ‘Muhammad Nagarī’. It is said that Shah Jalal had brought the script with him to Sylhet in the 13th-14th century (138), although some suggested that it was an invention of the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here (138).
Table 17: The Script Table of Sylheti Nagarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable but not enough to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable
Table 18: Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhī
Bangla | Gurmukhī | NBGP Decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable
Table 19: Bangla and Gurmukhī confusable code points
Bangla | Gurmukhī | NBGP Decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable
Table 20: Bangla and Gurmukhī distinguishable code points
10.3.3 Bangla and Oriya (Odia)
Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable
Table 21: Bangla and Oriya confusable code points
Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable
Table 22: Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs
Consonants | Phonetic Value | Allographs: Clusters | Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ Thisletterdoesnothaveanyparticularphoneticvaluebutmostlypronouncedasn
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒThesecondarysymbol(allograph)jɔ-phalahastwophoneticvaluesWhenaddedtotheinitialconsonantinaworditisavowelaelig(asinশ8ামলর 8াপারetc)Butafteranon-initialconsonantitjustdoublesitinpronunciation(asinকায6ধায6etc)The+যcombinationhastwophysicalmanifestationsmdashর 8andয6
ক8(p+য)স8(+য)র 8(+য)[Justর 8isneverusedinBanglaorthographyর 8াisbutthenitslasttwosymbolsYa-phalaa-karaconstituteavowelsignrepresentingthevowelঅ8া]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ hwordfinallywordmediallyitdoublesthepronunciationofthefollowingconsonant
অঃকঃ
অব
Example: the Bangla sign Avagraha ঽ (U+09BD), even if allowed by the IDNA protocol, is not permitted in the Root Zone repertoire as per the MSR.
To sum up, the restrictions start off with admitting only such characters as are part of the code block of the given script/language. The IDNA protocol further narrows this down, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
4.2.1.1 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brāhmī-based scripts will not be included.
4.2.1.2 No Symbols and Abbreviations
Abbreviations, weights and measures, and other such iconic characters like BANGLA ISSHAR ৺ (U+09FA), BANGLA CURRENCY DENOMINATOR SIXTEEN ৹ (U+09F9), etc. will also not be included.
4.2.1.3 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, such as the Sanskritic VOCALIC RR ৠ (U+09E0) and VOCALIC L ঌ (U+098C), as well as VOCALIC LL ৡ (U+09E1), and the allographic -kāra forms of the latter two symbols: VOWEL SIGN VOCALIC L ৢ (U+09E2) and VOWEL SIGN VOCALIC LL ৣ (U+09E3). All such characters are excluded, which complies with the Conservatism principle as laid down in the Root Zone LGR procedure. However, in Bangla the -kāra corresponding to VOCALIC RR ৠ (U+09E0), which is VOWEL SIGN VOCALIC RR ৄ (U+09C4), is still in active use in certain limited borrowed or Sanskritic words and is therefore retained.
4.2.1.4 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit will not be included. This is also in consonance with the Letter principle as laid down in the Root Zone LGR procedure.
4.2.1.5 ABNF
The Augmented Backus-Naur Formalism (ABNF) is described in Section 5.4.1 and in the Appendix (Section 10.1).
5 Repertoire
The Bangla writing system is represented in UNICODE using the Bengali (Bangla) script name as enumerated in ISO 15924, corresponding to languages such as Asamiyā (Assamese), Bangla (Bengali) and Manipuri. The BENGALI block used for Bangla-Asamiyā-Manipuri in UNICODE has 93 entries. This section details the code-point repertoire that the Neo-Brāhmī Generation Panel [NBGP] proposes to be included in the Bangla LGR.
It may be mentioned here that the Government of Assam submitted a proposal to the Bureau of Indian Standards (BIS) on 26th February 2016 for dis-unification of the Bangla and Asamiyā scripts. The BIS, in its 8th Meeting of the Indian Language Technologies and Products Sectional Committee, LITD 20, held on 23rd Aug 2017, decided to refer the proposal for recognition of the Assamese script in ISO/IEC 10646 to ISO. Until the UNICODE Consortium takes any further action, it will be assumed that the Code Point Repertoire under Table 11 will be valid for all three languages as above.
For each of the code points, language references have been given in the last column, titled Reference, under Table 8, titled the “Code Point Repertoire”. For entire coverage of Bangla code points, references for Bangla, Asamiyā (Assamese), Manipuri (Meitei) and Bishnupriya are given. Kokborok written in Bangla script is not known to have introduced many new complications, except for one particular character. Though only a few representative languages under EGIDS Scale 1-4 have been chosen for referencing, they together cover all the code points required for all the languages that the NBGP has considered, as given under Bangla Unicode Points (as given in UNICODE 6.3).
However, before the details are presented, it is ideal to look at the Bangla Code Point Chart from the Maximal Starting Repertoire [MSR] Version 3. It may be noted that the shapes of the reference glyphs given below in the code charts are based on one of the many fonts designed and are not prescriptive, because there could be some variations in actual fonts, both UNICODE-compatible and True-Type ones. Consider the following code point table:
Figure 1: Bangla Code Page from [MSR] for Bangla-Asamiyā-Manipuri
Colour convention:
All characters that are included in the [MSR]: yellow background
PVALID in IDNA2008 but excluded from the [MSR]: pinkish background
Not PVALID in IDNA2008, or ineligible for the root zone (digits, hyphen): white background
Given the Bangla Unicode block as in Figure 1, among the code points that are included in the MSR, the following symbols will need separate treatment:
ৎ U+09CE Bangla Letter Khanda-Ta
ৰ U+09F0 Asamiyā-Bangla Letter Ra With Middle Diagonal
ৱ U+09F1 Asamiyā-Bangla Letter Ra With Lower Diagonal
5.1 Code Point Repertoire Inclusion
No | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
1 U+0981 ঁ BENGALI SIGN CANDRABINDU
Candra-bindu
1Bangla2Manipuri2Assamese
[112][122][125]
2 U+0982 ং BENGALISIGNANUSVARA
Onushshar(Anusvara)
1Bangla2Manipuri2Assamese
[112][122][125]
3 U+0983 ঃ BENGALISIGNVISARGA
Biśarga(Visarga)
1Bangla2Manipuri2Assamese
[112][122][125]
4 U+0985 অ BENGALILETTERA
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
5 U+0986 আ BENGALILETTERAA
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
6 U+0987 ই BENGALILETTERI
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
7 U+0988 ঈ BENGALILETTERII
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
8 U+0989 উ BENGALILETTERU
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
9 U+098A ঊ BENGALILETTERUU
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
10 U+098B ঋ BENGALILETTERVOCALICR
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
11 U+098F এ BENGALILETTERE
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
12 U+0990 ঐ BENGALILETTERAI
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
13 U+0993 ও BENGALILETTERO
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
14 U+0994 ঔ BENGALILETTERAU
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
15 U+0995 ক BENGALILETTERKA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
16 U+0996 খ BENGALILETTERKHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
17 U+0997 গ BENGALILETTERGA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
18 U+0998 ঘ BENGALILETTERGHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
19 U+0999 ঙ BENGALILETTERNGA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
20 U+099A চ BENGALILETTERCA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
21 U+099B ছ BENGALILETTERCHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
22 U+099C জ BENGALILETTERJA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
23 U+099D ঝ BENGALILETTERJHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
24 U+099E ঞ BENGALILETTERNYA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
25 U+099F ট BENGALILETTERTTA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
26 U+09A0 ঠ BENGALILETTERTTHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
27 U+09A1 ড BENGALILETTERDDA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
28 09A109BC(U+09DC)
ড় NormalizedformofBENGALILETTERRRA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]09DCisthepreferredcodepointhoweveritisnotavailableforLGRasperthestandardsgoverningthisLGRdevelopment
29 U+09A2 ঢ BENGALILETTERDDHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
30 09A209BC(U+09DD)
ঢ় NormalizedformofBENGALILETTERRHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]09DDisthepreferredcodepointhoweveritisnotavailableforLGRasperthestandardsgoverningthisLGRdevelopment
31 U+09A3 ণ BENGALILETTERNNA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
32 U+09A4 ত BENGALILETTERTA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
33 U+09A5 থ BENGALILETTERTHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
34 U+09A6 দ BENGALILETTERDA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
35 U+09A7 ধ BENGALILETTERDHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
36 U+09A8 ন BENGALILETTERNA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
37 U+09AA প BENGALILETTERPA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
38 U+09AB ফ BENGALILETTERPHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
39 U+09AC ব BENGALILETTERBA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
40 U+09AD ভ BENGALILETTERBHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
41 U+09AE ম BENGALILETTERMA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
42 U+09AF য BENGALILETTERYA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
43 09AF09BC(U+09DF)
য় NormalizedformofBENGALILETTERYYA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]09DFisthepreferredcodepointhoweveritisnotavailableforLGRasperthestandardsgoverningthisLGRdevelopment
44 U+09B0 র BENGALILETTERRA
Consonant 1Bangla2Manipuri
[112][125]
45 U+09B2 ল BENGALILETTERLA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
46 U+09B6 শ BENGALILETTERSHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
47 U+09B7 ষ BENGALILETTERSSA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
48 U+09B8 স BENGALILETTERSA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
49 U+09B9 হ BENGALILETTERHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
50 U+09BE া BENGALIVOWELSIGNAA
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
51 U+09BF ি BENGALIVOWELSIGNI
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
52 U+09C0 ী BENGALIVOWELSIGNII
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
53 U+09C1 ু BENGALI VOWEL SIGN U
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
54 U+09C2 ূ BENGALI VOWEL SIGN UU
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
55 U+09C3 ৃ BENGALI VOWEL SIGN VOCALIC R
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
56 U+09C4 ৄ BENGALI VOWEL SIGN VOCALIC RR
Kāra(Mātrā)
1Bangla2Assamese
[112][122]
57 U+09C7 ে BENGALI VOWEL SIGN E
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
58 U+09C8 ৈ BENGALI VOWEL SIGN AI
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
59 U+09CB ো BENGALI VOWEL SIGN O
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
60 U+09CC ৌ BENGALI VOWEL SIGN AU
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
61 U+09CD ্ BENGALI SIGN VIRAMA
Hasanta(=Halant)Virama(=Da ri)
1Bangla2Assamese2Manipuri
[112][122][125]
62 U+09CE ৎ BENGALILETTERKHANDATA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
63 U+09F0 ৰ BENGALILETTERRAWITHMIDDLEDIAGONAL
Consonant 2Assamese [122]
64 U+09F1 ৱ BENGALILETTERRAWITHLOWERDIAGONAL
Consonant 2Assamese2Manipuri
[122][125]
Table8BanglaCode-PointRepertoire
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes two specific sequences. These conditionally include BENGALI LETTER A or BENGALI LETTER E, followed by BENGALI SIGN VIRAMA and BENGALI LETTER YA, again followed by BENGALI VOWEL SIGN AA, in the repertoire, so that the æ sound, as in English 'bat', 'cat', etc., can be represented.
Sr No | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
S1 | U+0985 U+09CD U+09AF U+09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112][122]
S2 | U+098F U+09CD U+09AF U+09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9: Sequences
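The two sequences of Table 9 can be spelled out code point by code point. The following is a minimal sketch, assuming Python's standard `unicodedata` module; the names `S1`, `S2` and `describe` are illustrative and not part of the proposal.

```python
import unicodedata

# Code points copied from Table 9; the variable names are illustrative only.
S1 = "\u0985\u09CD\u09AF\u09BE"  # LETTER A + SIGN VIRAMA + LETTER YA + VOWEL SIGN AA
S2 = "\u098F\u09CD\u09AF\u09BE"  # LETTER E + SIGN VIRAMA + LETTER YA + VOWEL SIGN AA


def describe(sequence: str) -> list[str]:
    """Return one 'U+XXXX NAME' entry per code point of the sequence."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in sequence]


for label, sequence in (("S1", S1), ("S2", S2)):
    print(label)
    for entry in describe(sequence):
        print("   ", entry)
```

Both sequences share the same three-code-point tail and differ only in the initial vowel.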
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find a place in Unicode but have not been included in the repertoire of this LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.
Sr No | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10: Excluded Code Points
5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire since it will never be used alone. It will be used only as part of a sequence, in the normalized forms of three special characters: ড়, ঢ় and য়.
Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড়, ঢ় and য় respectively.
Table 10b: Excluded Code Points
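The "normalized form" referred to above can be observed directly. This is a small sketch, assuming that the normalization in question is Unicode NFC as implemented by Python's standard `unicodedata` module; the dictionary name `PRECOMPOSED` is illustrative. Under NFC the precomposed letters U+09DC, U+09DD and U+09DF are replaced by their base letter followed by U+09BC.

```python
import unicodedata

NUKTA = "\u09BC"  # BENGALI SIGN NUKTA

# Precomposed letter -> base letter it decomposes to (illustrative table).
PRECOMPOSED = {
    "\u09DC": "\u09A1",  # RRA  -> DDA  + NUKTA
    "\u09DD": "\u09A2",  # RHA  -> DDHA + NUKTA
    "\u09DF": "\u09AF",  # YYA  -> YA   + NUKTA
}


def normalized_form(text: str) -> str:
    """Return the NFC form of the text."""
    return unicodedata.normalize("NFC", text)


for precomposed, base in PRECOMPOSED.items():
    form = normalized_form(precomposed)
    print(f"U+{ord(precomposed):04X} ->", " ".join(f"U+{ord(c):04X}" for c in form))
```

This is why a label in normalized form contains the nukta only directly after one of the three base letters.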
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20/11/2009 and 18/07/2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) begins with the following variables:
C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta
The Augmented Backus-Naur Formalism (ABNF) will use the following operators:
Sr Number | Operator | Function
1 | "|" | Alternative
2 | "[ ]" | Optional
3 | "*" | Variable Repetition
4 | "( )" | Sequence Group
Table 11: The ABNF Formalism
5.4.2 The Vowel Sequence
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given. To facilitate understanding for users of other Brāhmī scripts, equivalents in Devanāgarī are provided wherever necessary.
A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra / onuʃʃār (B), a Candrabindu (D) or a Visarga / biʃɔrgo (X). The number of D, B or X that can follow a V in Bangla need not be restricted to one. Going by the rules illustrated in this document, formations such as VDD, VBB and VXX are clearly invalid orthographic units. However, it is valid and possible to have a Candrabindu followed by an Anusvāra on the one hand, and a Candrabindu followed by a Visarga on the other, as in হ্যাঁংচা 'hænchā' and হ্যাঁঃ 'hæn' respectively. In other words, the possibility of a Visarga or an Anusvāra (onuʃʃār) following a Candrabindu exists in Bangla.
A vowel can also optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phala. Ya-phala is a presentation form of U+09AF য BENGALI LETTER YA ('ya'), represented by the sequence <U+09CD ্ BENGALI SIGN VIRAMA (Hasanta), U+09AF য BENGALI LETTER YA>. Ya-phala has a special form, ্য. When combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. 'aa' (ā)), it is used for transcribing [æ], as in the "a" of the English word "bat", written in Bangla as ব্যাট.
A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example:
V | অ | अ
5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples:
VB | অং | अं
VD | অঁ | अँ
VX | অঃ | अः
VDB | অঁং | अँं
VDX | অঁঃ | अँः
VHCM | অ্যা, এ্যা
5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
VHCMB | অ্যাং, এ্যাং
VHCMD | অ্যাঁ, এ্যাঁ
VHCMX | অ্যাঃ, এ্যাঃ
VHCMDB | অ্যাঁং, এ্যাঁং
VHCMDX | অ্যাঁঃ, এ্যাঁঃ
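The vowel-sequence combinations of 5.4.2.1 to 5.4.2.3 can be summarised as one pattern over the class letters of Section 5.4.1. The sketch below is an illustration only: it works on strings of class letters such as "VDB" rather than on code points, and the names `TAIL`, `VOWEL_SEQUENCE` and `is_vowel_sequence` are assumptions of this example. It also accepts any C and M inside VHCM, whereas Table 9 admits only YA and AA there.

```python
import re

# Class letters as in Section 5.4.1: V vowel, H hasanta, C consonant,
# M kara (matra), B anusvara, D candrabindu, X visarga.
TAIL = r"(?:DB|DX|B|D|X)?"  # at most one of B, D, X, DB, DX
VOWEL_SEQUENCE = re.compile(rf"V(?:HCM)?{TAIL}")


def is_vowel_sequence(classes: str) -> bool:
    """True if the whole class string is a vowel sequence as sketched here."""
    return VOWEL_SEQUENCE.fullmatch(classes) is not None


for candidate in ("V", "VDB", "VHCMDX", "VDD", "VXX"):
    print(candidate, is_vowel_sequence(candidate))
```

Doubled marks such as VDD, VBB and VXX are rejected because the tail may occur only once.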
5.4.3 The Consonant Sequence
5.4.3.1 A Single Consonant (C)
Example:
C | ক | क
5.4.3.2 A Consonant with Conditions
A Consonant is optionally followed by a dependent vowel sign kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
CM | কি | कि
CB | কং | कं
CD | কঁ | कँ
CX | কঃ | कः
CH | ক্ | क् (pure consonant)
CDB | কঁং | कँं
CDX | কঁঃ | कँः
5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB or DX.
Examples:
CMB | কীং | कीं
CMD | কাঁ | काँ
CMX | বীঃ | वीः
CMDB | কাঁং | काँं
CMDX | কাঁঃ | काँः
5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Examples:
CHC | ন্ত → ন + ্ + ত | न + ् + त
CHCHC | ন্ত্র → ন + ্ + ত + ্ + র | न + ् + त + ् + र
CHCHCHC | ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য | न + ् + त + ् + र + ् + य
5.4.3.5 Subsets
While considering its subsets, we will take the combination CHC as a representative example; however, the same is equally applicable to CHCHC and CHCHCHC.
[A] The combination may be followed by M, B, D, X, DB or DX.
Examples:
CHCM | ক্কী → ক ্ ক ী | क्की → क ् क ी
CHCB | ক্কং → ক ্ ক ং | क्कं → क ् क ं
CHCD | ক্কঁ → ক ্ ক ঁ | क्कँ → क ् क ँ
CHCX | ক্কঃ → ক ্ ক ঃ | क्कः → क ् क ः
CHCDB | ক্কঁং → ক ্ ক ঁ ং | क्कँं → क ् क ँ ं
CHCDX | ক্কঁঃ → ক ্ ক ঁ ঃ | क्कँः → क ् क ँ ः
[B] *3(CH)CM may further be followed by a B, D, X, DB or DX.
Examples:
CHCMB | ক্কীং → ক ্ ক ী ং | क्कीं → क ् क ी ं
CHCMD | ক্কাঁ → ক ্ ক া ঁ | क्काँ → क ् क ा ँ
CHCMX | ক্কীঃ → ক ্ ক ী ঃ | क्कीः → क ् क ी ः
CHCMDB | ক্কাঁং → ক ্ ক া ঁ ং | क्काँं → क ् क ा ँ ं
CHCMDX | ক্কাঁঃ → ক ্ ক া ঁ ঃ | क्काँः → क ् क ा ँ ः
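The consonant-sequence combinations of 5.4.3.1 to 5.4.3.5 can be condensed in the same way. This is a hedged sketch over class-letter strings, not a normative grammar: the names `CONSONANT_SEQUENCE` and `is_consonant_sequence` are illustrative, and allowing a final Hasanta after a cluster (as in CHCH) is an assumption, since the text lists the pure-consonant form only for a single C.

```python
import re

TAIL = r"(?:DB|DX|B|D|X)?"  # at most one of B, D, X, DB, DX

# Up to three "CH" joins before the last consonant gives clusters of at most
# four consonants; the cluster then ends in H, or in an optional M plus tail.
CONSONANT_SEQUENCE = re.compile(rf"(?:CH){{0,3}}C(?:H|M?{TAIL})")


def is_consonant_sequence(classes: str) -> bool:
    """True if the whole class string is a consonant sequence as sketched here."""
    return CONSONANT_SEQUENCE.fullmatch(classes) is not None


for candidate in ("C", "CH", "CMDX", "CHCHCHC", "CHCHCHCHC", "CMM"):
    print(candidate, is_consonant_sequence(candidate))
```

A five-consonant cluster such as CHCHCHCHC fails because the repetition count is capped at three.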
5.4.4 The Khanda-Ta sequence
5.4.4.1 A single 'Khanda'-Ta (Z)
Example:
Z | ৎ
5.4.4.2 A Khanda Ta Combination [footnote 10]
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) 'scolding'.
Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
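The condition in the note above can be checked on code points. The following is a minimal sketch under the assumption that a label is a plain Python string of code points; the function name `khanda_ta_context_ok` is hypothetical and not taken from the LGR.

```python
KHANDA_TA = "\u09CE"
HASANTA = "\u09CD"
RA_LETTERS = {"\u09B0", "\u09F0"}  # Bangla ra and Assamese ra


def khanda_ta_context_ok(label: str, index: int) -> bool:
    """Check the [CH]Z condition for the Khanda Ta at label[index].

    If the Khanda Ta directly follows a Hasanta, the consonant before that
    Hasanta must be one of the two ra letters.
    """
    if label[index] != KHANDA_TA:
        raise ValueError("label[index] is not KHANDA TA")
    if index == 0 or label[index - 1] != HASANTA:
        return True  # no CH prefix, so this particular condition does not apply
    return index >= 2 and label[index - 2] in RA_LETTERS


bhartsana = "\u09AD\u09B0\u09CD\u09CE\u09B8\u09A8\u09BE"  # the example word above
print(khanda_ta_context_ok(bhartsana, 3))
```

The check covers only this one condition; the other context rules for Z are separate.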
5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) can be described briefly here. Let us take up S first. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ BENGALI LETTER A and U+098F এ BENGALI LETTER E). To render Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF (ya) after the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound, as in English 'bat', 'fat', etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can be marked with Hasanta. But as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
The other case, mentioned as P in Table 16, refers to the formation of repha and ra-phalā in the said script. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance:
repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun")
ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra "cycle")
The point is that in both cases the slot for ra could be filled by the Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.
10 Refer to Rule P in Section 7, Table 16.
6 Variants
This section talks about the variants in the Bangla script. The NBGP categorizes these confusing variants into two groups:
Group 1: Confusing due to pure visual similarity.
Group 2: Confusing due to deviation from the character formations normally perceived by the larger linguistic community.
For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formations. These can cause confusion even to a careful observer and are hence being proposed as variants.
Variants arise in a script when two or more forms that look the same are stored with different code points. In Bangla, the e-kāra, the ā-kāra and the o-kāra have different code points. One can type o-kāra with a consonant in one go, or obtain the same result by typing e-kāra and ā-kāra as two separate keys. A reader cannot differentiate between the two forms of ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered a case of variant, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants
However, we propose two cases of true in-script variants in the Bangla script.
CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases wherein Hasanta with থ (tha, U+09A5) appears as a conjunct with স (sa, U+09B8) and ন (na, U+09A8):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, could be a little confusing, since the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final position (as shown below in example no. 1). It could also be typed wrongly, in the belief that it was a হ (ha, U+09B9), increasing the risks in label writing and identification.
Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)
Fonts which represent the traditional Bangla writing system tend to create this problem. Therefore, these may be taken as cases of variants in Bangla.
CASE II
Another interesting example of variants is encountered in the ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra).
'Repha' could be formed by two sequences (mainly because both Assamese and Bangla are encoded in the same Unicode block, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here the final ligatures look the same and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C
Where
B_RA = U+09B0 BENGALI LETTER RA (র)
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)
Example:
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12: Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. Addition of Repha does not make any difference.
'Ra-phala' could be formed by two sequences on similar grounds, and the final ligatures would look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
Where C1 = any consonant except Khanda-ta
Example:
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusion for end users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP concluded that র and ৰ should be defined as variant code points, so that a single variant set containing র and ৰ covers all cases. This will, however, create blocked variant labels: e.g. if someone registers "ররর", the label "ৰৰৰ" will be generated as a variant and will be blocked, and vice versa. The blocking applies only at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, that is still possible.
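The label-level blocking described above can be illustrated with a short sketch. It assumes that the only variant relation considered is the pair র (U+09B0) and ৰ (U+09F0), and the names `VARIANTS`, `variant_labels` and `is_blocked` are hypothetical; a real registry would apply the full LGR.

```python
from itertools import product

# The single variant set proposed in this section: Bangla ra and Assamese ra.
VARIANTS = {
    "\u09B0": ("\u09B0", "\u09F0"),
    "\u09F0": ("\u09F0", "\u09B0"),
}


def variant_labels(label: str) -> set[str]:
    """All labels obtained by swapping ra letters, excluding the label itself."""
    choices = [VARIANTS.get(ch, (ch,)) for ch in label]
    return {"".join(combo) for combo in product(*choices)} - {label}


def is_blocked(candidate: str, registered: set[str]) -> bool:
    """True if the candidate is a variant of some already registered label."""
    return any(candidate in variant_labels(label) for label in registered)


registered = {"\u09B0" * 3}  # the three-ra label from the example above
print(is_blocked("\u09F0" * 3, registered))  # same length, ra letters swapped
print(is_blocked("\u09F0" * 2, registered))  # a different label, not a variant
```

Because variants are computed position by position, labels of a different length are never blocked by this relation.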
6.2 Cross-Script Variants
A crisp cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia [footnote 11] (formerly Oriya), keeping in mind the visual and technical confusion they may cause as labels in the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. There are certain other characters which are somewhat similar, but they have not been included here. They have been provided in the Appendix (10.3) for reference.
11 Unicode uses Oriya for the script, although Odia is now the official term used.
1. Bangla and Nāgarī / Devanāgarī Script
Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14: Bangla and Devanāgarī cross-script variant code points
2. Bangla and Gurmukhī Script
Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15: Bangla and Gurmukhī cross-script variant code points
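Tables 14 and 15 amount to a small lookup from a Bangla code point to its cross-script variants. The sketch below records that lookup and prints the Unicode names as a cross-check, assuming Python's standard `unicodedata` module; the table name `CROSS_SCRIPT_VARIANTS` and the helper `names_for` are illustrative only.

```python
import unicodedata

# Bangla code point -> (Devanagari variant, Gurmukhi variant), per Tables 14 and 15.
CROSS_SCRIPT_VARIANTS = {
    "\u09AE": ("\u092E", "\u0A38"),
    "\u09BF": ("\u093F", "\u0A3F"),
}


def names_for(bangla: str) -> list[str]:
    """Unicode names of a Bangla code point and of its cross-script variants."""
    members = (bangla, *CROSS_SCRIPT_VARIANTS[bangla])
    return [unicodedata.name(ch) for ch in members]


for code_point in CROSS_SCRIPT_VARIANTS:
    print(f"U+{ord(code_point):04X}:", "; ".join(names_for(code_point)))
```

The names make it visible that the Gurmukhī variant of ম is a different letter (SA), so the relation is one of shape, not of sound.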
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Bangla [footnote 12] script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications.
Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories mentioned in the table provided in the code point repertoire (Section 5.1):
C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), or (a/e) Ya-phalā (V1 H C1 M1), where V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E), H is 09CD (্ - BENGALI SIGN VIRAMA), C1 is 09AF (য - BENGALI LETTER YA) and M1 is 09BE (া - BENGALI VOWEL SIGN AA). S1 and S2 are valid even though they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name BENGALI LETTER RA WITH MIDDLE DIAGONAL) and H is 09CD (্ - BENGALI SIGN VIRAMA)
Table 16: Symbols used in WLE rules
12 As used by Unicode, denoting and including both Assamese and Maṇipuri.
It is also perhaps ideal to mention here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters" that could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, pg. 4]. More details of such combinations can be seen in Pabitra Sarkar (1993) [135].
7.1 Final Set of WLE Rules
In view of the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1. C is the set of C and CN, where CN is the set of normalized forms of ড়, ঢ় and য়.
2. H must be preceded by C. Example: ক্
3. M must be preceded by C. Example: কা
4. D must be preceded by either V, C or M. Example: আঁ, খঁ, খাঁ, হাঁ
5. X must be preceded by either V, C, M or D. Example: উঃখঃবঃাঃ দ ঃ
6. B must be preceded by either V, C, M or D. Example: আং, ইং, কং
7. Z must be preceded by V, C, M, D, B, X or P. Example: ইৎকৎাৎাৎপ6ৎrৎ (S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V preceded by H.
9. S CANNOT be preceded by H.
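The nine rules above are all statements about what may come immediately before a symbol, so they can be sketched as a predecessor table. The code below is an illustration under stated assumptions, not the normative LGR: the class lookup in `classify` uses coarse Unicode block ranges instead of the exact repertoire of Table 8, and all names (`REQUIRED_BEFORE`, `tokenize`, `wle_violations`) are hypothetical.

```python
HASANTA, NUKTA = "\u09CD", "\u09BC"
RA = {"\u09B0", "\u09F0"}                     # Bangla ra and Assamese ra (symbol P)
NUKTA_BASES = {"\u09A1", "\u09A2", "\u09AF"}  # bases of the CN forms (rule 1)
S_SEQUENCES = ("\u0985\u09CD\u09AF\u09BE", "\u098F\u09CD\u09AF\u09BE")  # S1, S2

# Rules 2-7: classes allowed immediately before each symbol.
REQUIRED_BEFORE = {"H": "C", "M": "C", "D": "VCM",
                   "X": "VCMD", "B": "VCMD", "Z": "VCMDBX"}


def classify(ch: str) -> str:
    """Coarse class lookup by code point range (a simplification of Table 8)."""
    cp = ord(ch)
    if 0x0981 <= cp <= 0x0983:
        return "DBX"[cp - 0x0981]
    if 0x0985 <= cp <= 0x0994:
        return "V"
    if 0x0995 <= cp <= 0x09B9 or cp in (0x09F0, 0x09F1):
        return "C"
    if 0x09BE <= cp <= 0x09CC:
        return "M"
    if ch == HASANTA:
        return "H"
    if cp == 0x09CE:
        return "Z"
    raise ValueError(f"U+{cp:04X} is outside this sketch's repertoire")


def tokenize(label: str) -> list[tuple[str, str]]:
    """Split a label into (class, text) tokens; S1/S2 and CN forms are atomic."""
    tokens, i = [], 0
    while i < len(label):
        if label.startswith(S_SEQUENCES, i):
            tokens.append(("S", label[i:i + 4]))
            i += 4
        elif label[i] == NUKTA:
            if not tokens or tokens[-1][1] not in NUKTA_BASES:
                raise ValueError("nukta outside a normalized CN form")
            tokens[-1] = ("C", tokens[-1][1] + NUKTA)
            i += 1
        else:
            tokens.append((classify(label[i]), label[i]))
            i += 1
    return tokens


def wle_violations(label: str) -> list[str]:
    """Return a message for every token that breaks rules 2-9."""
    tokens, problems = tokenize(label), []
    for i, (cls, _) in enumerate(tokens):
        prev = tokens[i - 1][0] if i else ""
        prev = "M" if prev == "S" else prev   # S ends with M (note under rule 7)
        if cls == "Z" and prev == "H" and i >= 2 and tokens[i - 2][1] in RA:
            continue                          # Z preceded by P (ra + hasanta)
        if cls in REQUIRED_BEFORE and (not prev or prev not in REQUIRED_BEFORE[cls]):
            problems.append(f"{cls} at token {i} cannot follow {prev or 'label start'}")
        elif cls in "VS" and prev == "H":     # rules 8 and 9
            problems.append(f"{cls} at token {i} cannot follow H")
    return problems


print(wle_violations("\u0995\u09BE"))         # consonant + matra
print(wle_violations("\u0995\u09CD\u0987"))   # consonant + hasanta + vowel
```

Treating S1 and S2 as single tokens is what lets them pass even though a Hasanta directly after a vowel would otherwise break rule 2.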
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in admitting such ligatures: any user can easily create them, even though they may not be attested combinations.
7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া bæŋkʌv ɪndiə (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE), meaning "Bank of India". This is a case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of the hyphen, the space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptual similarity between two labels (with and without H) for the majority of the linguistic communities; hence this is explicitly prohibited by the NBGP.
In future, if required, and depending on the prevailing requirements of the community, a future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules that the ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.
1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Examples:
্ক | ्क
াক | ाक
ংক | ंक
ঁক | ँक
ঃক | ःक
As can be seen, such a combination will automatically result in a "golu", or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is applied quasi-automatically wherever supported by the OS.
2. H is not permitted after V, B, D, X, M or S.
Examples:
অ্ | अ्
কং্ | कं्
কঁ্ | कँ्
কঃ্ | कः्
কি্ | कि्
3. The number of B, D or X permitted after a Consonant, a Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalidated.
Examples:
কংং | कंं
কঁঁ | कँँ
কঃঃ | कःः
কাঁঁ | काँँ
কীঃঃ | कीःः
অংং | अंं
অঁঁ | अँँ
অঃঃ | अःः
4. The number of M permitted after a Consonant is restricted to one.
Example:
কীী | कीी
5. M is not permitted after V.
Examples:
ইা, ঈৗ | ईा, ईौ
6. The combinations Anusvāra + Visarga and Visarga + Anusvāra are not permissible.
Examples:
কংঃ | कंः
কঃং | कःं
8 Contributors
8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members
8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals' Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje & Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. 'Bangla Harapher Panch Parba'. In Swapan Chakraborty, ed., Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and Language Planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii + 316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the M.S. University of Baroda, Social Science Number, 29.2: 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds., Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25/11/2017.
[124] Ethnologue: Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25/11/2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20/10/2019.
[126] Wikipedia: Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25/11/2017.
[129] Omniglot: Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10/5/2018.
[130] Wikipedia: Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25/11/2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20/, accessed on 10/5/2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10/5/2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10/5/2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1.1: 23-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1.2: 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). 'Srīhatta-Nāgarī Lipir Utpatti o Bikāś'. Bangla Academy Patrika (Dhaka), Vol. 1.2 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia: Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19/5/2018.
[139] Furui, Ryosuke (2015). 'Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century', in Bhairabi Prasad Sahu & Hermann Kulke, eds., Interrogating Political Systems: Integrative Processes and States in Pre-Modern India, Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). 'The Phonemes of Bengali'. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A Phonetic and Phonological Study of Nasals and Nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646 (UNICODE). https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia: Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. 'Bangla Spelling Reform: the Long and Short of It'. Bangla Journal 19: 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. "What has happened So Far In terms of Script Reforms". Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka, on 10/07/2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 – Core Specification, Chapter 12, p. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when applied to a specific language or script, certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla [footnote 13] in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold 'varga' (as defined under Table 5). Bangla does not generally allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem 'Lichuchor' written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2. CH can come with Khanda Ta only in the case where C is ra (র) (09B0): র্ৎ, as in ভর্ৎসনা.
3. Only the following combinations with VHCM will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
10.2 'Sylheti Nagarī lipi' or 'Siloti'
This version of the Bangla script resembles the 'Kaithī' script (ISO 12954) used by accountants (perhaps from the Kāyastha community) in Eastern Uttar Pradesh and Bihar, which was widely in use during the 1880s. There were several other names for Sylheti Nagarī or Siloti [129], such as 'Jalalabada Nagarī', 'Fula (flower) Nagarī', 'Muslim Nagarī' or 'Muhammad Nagarī'. It is said that Shah Jalal had brought the script with him to Sylhet in the 13th-14th century [138], although some have suggested that it was an invention of the Afghan rulers of Sylhet [137]. Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here [138].
Table 17 – The Script Table of Sylheti Nagarī or Siloti
13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable
Table 18: Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhi
Bangla | Gurmukhi | NBGP Decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable
Table 19: Bangla and Gurmukhi confusable code points
Bangla | Gurmukhi | NBGP Decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable
Table 20 – Bangla and Gurmukhī distinguishable code points
10.3.3 Bangla and Oriya (Odia)
Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable
Table 21 – Bangla and Oriya confusable code points
Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable
Table 22 – Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs
Consonants | Phonetic Value | Allographs / Clusters | Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ | This letter does not have any particular phonetic value, but is mostly pronounced as n
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (not privileged enough to have clusters)
য dʒ The secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word, it is a vowel æ (as in শ8ামল, র 8াপার, etc.). But after a non-initial consonant, it just doubles it in pronunciation (as in কায6, ধায6, etc.). The +য combination has two physical manifestations: র 8 and য6
ক8(p+য)স8(+য)র 8(+য)[Justর 8isneverusedinBanglaorthographyর 8াisbutthenitslasttwosymbolsYa-phalaa-karaconstituteavowelsignrepresentingthevowelঅ8া]
র r Two manifestations:
(i) রেফ repʰ, as the first member of a cluster, e.g. প6ৎ6 not6 য6D6(+deg+ব) (earlier E6=+brvbar+deg+ব, a four-term cluster), etc. (placed over the following consonant)
(ii) র-ফলা rɔ-pʰɔla, as the second/third member of a cluster, e.g. etc. (placed under the consonant it follows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃকঃ
অব
Asamiyā-Manipuri in the UNICODE has 93 entries. This section details the code-point repertoire that the Neo-Brāhmī Generation Panel [NBGP] proposes to be included in the Bangla LGR. It may be mentioned here that the Government of Assam submitted a proposal to the Bureau of Indian Standards (BIS) on 26th February 2016 for dis-unification of the Bangla and Asamiyā scripts. The BIS, in the 8th Meeting of the Indian Language Technologies and Products Sectional Committee, LITD 20, held on 23rd Aug 2017, decided to refer the proposal for recognition of the Assamese script in ISO/IEC 10646 to ISO. Until the UNICODE Consortium takes any further action, it will be assumed that the Code Point Repertoire under Table 8 will be valid for all the three languages as above.

For each of the code points, language references have been given in the last column, titled "Reference", under Table 8, titled the "Code Point Repertoire". For entire coverage of Bangla code points, references of Bangla, Asamiyā (Assamese), Manipuri (Meitei) and Bishnupriya are given. Kokborok written in Bangla script is not known to have introduced many new complications, except for one particular character. Though only a few representative languages under EGIDS Scale 1-4 have been chosen for referencing, they together cover all the code-points required for all the languages that NBGP has considered, as given under Bangla Unicode Points (as given in UNICODE 6.3).

However, before the details are presented, it is ideal to look at the Bangla Code Point Chart from the Maximal Starting Repertoire [MSR] Version 3. It may be noted that the shapes of the reference glyphs given below in the code charts are based on one of the many fonts designed and are not prescriptive, because there could be some variations in actual fonts, both UNICODE-compatible and True-Type ones. Consider the following Code point table:
Figure 1: Bangla Code Page from [MSR] for Bangla-Asamiyā-Manipuri

Colour convention:
- All characters that are included in the [MSR]: Yellow background
- PVALID in IDNA2008 but excluded from the [MSR]: Pinkish background
- Not PVALID in IDNA2008 or ineligible for the root zone (digits, hyphen): White background

Given the Bangla Unicode Block as in Figure 1, for the code points that are included in the MSR, the following symbols will need a separate treatment:
ৎ U+09CE Bangla Letter Khanda-Ta
ৰ U+09F0 Asamiyā-Bangla Letter Ra With Middle Diagonal
ৱ U+09F1 Asamiyā-Bangla Letter Ra With Lower Diagonal
5.1 Code Point Repertoire Inclusion

No. | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
1 | U+0981 | ◌ঁ | BENGALI SIGN CANDRABINDU | Candra-bindu | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
2 | U+0982 | ং | BENGALI SIGN ANUSVARA | Onushshar (Anusvara) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
3 | U+0983 | ঃ | BENGALI SIGN VISARGA | Biśarga (Visarga) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
4 | U+0985 | অ | BENGALI LETTER A | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
5 | U+0986 | আ | BENGALI LETTER AA | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
6 | U+0987 | ই | BENGALI LETTER I | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
7 | U+0988 | ঈ | BENGALI LETTER II | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
8 | U+0989 | উ | BENGALI LETTER U | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
9 | U+098A | ঊ | BENGALI LETTER UU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
10 | U+098B | ঋ | BENGALI LETTER VOCALIC R | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
11 | U+098F | এ | BENGALI LETTER E | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
12 | U+0990 | ঐ | BENGALI LETTER AI | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
13 | U+0993 | ও | BENGALI LETTER O | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
14 | U+0994 | ঔ | BENGALI LETTER AU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
15 | U+0995 | ক | BENGALI LETTER KA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
16 | U+0996 | খ | BENGALI LETTER KHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
17 | U+0997 | গ | BENGALI LETTER GA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
18 | U+0998 | ঘ | BENGALI LETTER GHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
19 | U+0999 | ঙ | BENGALI LETTER NGA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
20 | U+099A | চ | BENGALI LETTER CA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
21 | U+099B | ছ | BENGALI LETTER CHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
22 | U+099C | জ | BENGALI LETTER JA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
23 | U+099D | ঝ | BENGALI LETTER JHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
24 | U+099E | ঞ | BENGALI LETTER NYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
25 | U+099F | ট | BENGALI LETTER TTA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
26 | U+09A0 | ঠ | BENGALI LETTER TTHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
27 | U+09A1 | ড | BENGALI LETTER DDA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
28 | 09A1 09BC (U+09DC) | ড় | Normalized form of BENGALI LETTER RRA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]. 09DC is the preferred code point; however, it is not available for LGR as per the standards governing this LGR development
29 | U+09A2 | ঢ | BENGALI LETTER DDHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
30 | 09A2 09BC (U+09DD) | ঢ় | Normalized form of BENGALI LETTER RHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]. 09DD is the preferred code point; however, it is not available for LGR as per the standards governing this LGR development
31 | U+09A3 | ণ | BENGALI LETTER NNA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
32 | U+09A4 | ত | BENGALI LETTER TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
33 | U+09A5 | থ | BENGALI LETTER THA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
34 | U+09A6 | দ | BENGALI LETTER DA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
35 | U+09A7 | ধ | BENGALI LETTER DHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
36 | U+09A8 | ন | BENGALI LETTER NA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
37 | U+09AA | প | BENGALI LETTER PA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
38 | U+09AB | ফ | BENGALI LETTER PHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
39 | U+09AC | ব | BENGALI LETTER BA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
43 | 09AF 09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]. 09DF is the preferred code point; however, it is not available for LGR as per the standards governing this LGR development
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112], [125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
50 | U+09BE | ◌া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
51 | U+09BF | ◌ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
52 | U+09C0 | ◌ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
53 | U+09C1 | ◌ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
54 | U+09C2 | ◌ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
55 | U+09C3 | ◌ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
56 | U+09C4 | ◌ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112], [122]
57 | U+09C7 | ◌ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
58 | U+09C8 | ◌ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
59 | U+09CB | ◌ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
60 | U+09CC | ◌ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
61 | U+09CD | ◌্ | BENGALI SIGN VIRAMA | Hasanta (=Halant) / Virama (=Da ri) | 1 Bangla, 2 Assamese, 2 Manipuri | [112], [122], [125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122], [125]

Table 8: Bangla Code-Point Repertoire
Apart from the above individual code-points, the Neo-Brāhmī Generation Panel also proposes some specific sequences which enable conditional inclusion of the Bangla LETTER A and E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, again followed by Bangla VOWEL SIGN AA, in the repertoire, for enabling inclusion of the æ sound as in English 'bat', 'cat', etc.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code-point (Not exhaustive list) | Reference
S1 | 0985 09CD 09AF 09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112], [122]
S2 | 098F 09CD 09AF 09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]

Table 9: Sequences
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find place in the Unicode but have not been included in the repertoire in the LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.

Sr. No. | Code Points | Glyph | Character Names | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use

Table 10: Excluded Code Points
5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (See 3.3.6) is excluded from the repertoire since it will never be used alone. It will be used as a sequence in three special characters in normalized form, for ড়, ঢ় and য়.

Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ◌় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য, so as to form ড়, ঢ় and য় respectively

Table 10b: Excluded Code Points
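
The relationship between the precomposed letters and the base + NUKTA sequences can be checked with a short sketch. It assumes only Python's standard `unicodedata` module; the helper name `to_lgr_form` is hypothetical and used purely for illustration. Under Unicode NFC normalization the three precomposed letters are not retained, so a normalized label carries the two-code-point sequences shown in Table 8.

```python
import unicodedata

# Precomposed letters and the base + nukta sequences listed in Table 8.
NUKTA_FORMS = {
    "\u09DC": "\u09A1\u09BC",  # RRA -> DDA  + NUKTA
    "\u09DD": "\u09A2\u09BC",  # RHA -> DDHA + NUKTA
    "\u09DF": "\u09AF\u09BC",  # YYA -> YA   + NUKTA
}


def to_lgr_form(label: str) -> str:
    """Return the NFC form of a label (hypothetical helper name)."""
    return unicodedata.normalize("NFC", label)


for precomposed, sequence in NUKTA_FORMS.items():
    # NFC does not recompose these letters; it yields the sequence.
    print(f"U+{ord(precomposed):04X} ->",
          " ".join(f"U+{ord(c):04X}" for c in to_lgr_form(precomposed)))
    assert to_lgr_form(precomposed) == sequence
    assert to_lgr_form(sequence) == sequence
```

This is why U+09BC has to be reachable inside these three sequences even though it is excluded as a stand-alone code point.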
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20.11.2009 and 18.07.2013.

5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) began with the following variables:

C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta

The Augmented Backus-Naur Formalism (ABNF) will use the following Operators:
Sr. Number | Operator | Function
1 | "|" | Alternative
2 | "[ ]" | Optional
3 | "*" | Variable Repetition
4 | "( )" | Sequence Group

Table 11: The ABNF Formalism
5.4.2 The Vowel Sequence
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given. To facilitate understanding for users of other Brahmi scripts, equivalents in Devanāgarī are provided wherever necessary.

A vowel sequence is made up of a single vowel. It may be followed, but not necessarily (optionally), by an Anusvāra / onuʃʃār (B), Candrabindu (D) or a Visarga / biʃɔrgo (X). The number of D, B or X which can follow a V in Bangla may not be restricted to one. Going by the rules illustrated in the document, it is clear that formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have formations or sequences such as anusvara followed by a chandrabindu on one hand and visarga followed by a chandrabindu on the other, as in হ্যাঁংচা 'hænchā' and হ্যাঁঃ 'hæn' respectively. The possibility of a Visarga or Anusvāra (onuʃʃār) following a Candrabindu exists in Bangla.

A Vowel can optionally be followed by a combination of Hasanta / Virama [H] and Consonant [C] to form a Ya-phala. Ya-phala is a presentation form of U+09AF, Bangla letter য or 'ya'. It is represented by the sequence <U+09CD BENGALI SIGN VIRAMA (i.e. Bangla SIGN Hasanta or VIRAMA), U+09AF য BENGALI LETTER YA>. Ya-phala has a special form য়. Again, when combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. 'aa' (ā)), it is used for transcribing [æ], as in the "a" in the English word "bat", written in Bangla as ব্যাট. A Vowel-sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example: V অ अ

5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples:
VB অং अं
VD অঁ अँ
VX অঃ अः
VDB অঁং अँं
VDX অঁঃ अँः
VHCM অ্যা, এ্যা

5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].

Examples:
VHCMB অ্যাং, এ্যাং
VHCMD অ্যাঁ, এ্যাঁ
VHCMX অ্যাঃ, এ্যাঃ
VHCMDB অ্যাঁং, এ্যাঁং
VHCMDX অ্যাঁঃ, এ্যাঁঃ
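
The vowel-sequence combinations listed in 5.4.2.1 to 5.4.2.3 can be summarized as a pattern over the ABNF variable names. The following is a minimal sketch, not part of the proposal: it works on strings of category letters (V, H, C, M, B, D, X) rather than on real code points, and the names `VOWEL_SEQUENCE` and `is_vowel_sequence` are hypothetical.

```python
import re

# V, optionally followed by HCM (the Ya-phala + aa-kara case),
# optionally followed by one of B, D, X, DB or DX.
VOWEL_SEQUENCE = re.compile(r"V(?:HCM)?(?:DB|DX|B|D|X)?")


def is_vowel_sequence(symbols: str) -> bool:
    """Check a string of ABNF category letters against the vowel sequence."""
    return VOWEL_SEQUENCE.fullmatch(symbols) is not None


for candidate in ["V", "VB", "VDB", "VHCM", "VHCMDX", "VBB", "VXX", "VH", "VM"]:
    print(candidate, is_vowel_sequence(candidate))
```

Note that the pattern accepts any C and M after VH; the restriction to YA and AA for the sequences S1 and S2 is a separate constraint that this sketch does not model.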
5.4.3 The Consonant Sequence

5.4.3.1 A Single Consonant (C)
Example: C ক क

5.4.3.2 A Consonant with Conditions
A Consonant optionally followed by a dependent vowel sign kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Example:
CM িকক -कक
CB কং कं
CD কঁ कँ
CX কঃ कः
CH ক্ क् (Pure consonant)
CDB কঁং कँं
CDX কঁঃ कँः

5.4.3.3 CM Sequence
A CM sequence can be optionally followed by B, D, X, DB or DX.
Example CMB কীংকং कक
CMD কাঁ काँ
CMX বীঃ वीः
CMDB কাঁং काँं
CMDX কাঁঃ काँः
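5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):

*3(CH)C

Example:
CHC ন্ত → ন + ্ + ত न्त → न + ् + त
CHCHC ন্ত্র → ন + ্ + ত + ্ + র न्त्र → न + ् + त + ् + र
CHCHCHC ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য न्त्र्य → न + ् + त + ् + र + ् + य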
5.4.3.5 Subsets
While considering its subsets, as a representative example we will consider the combination CHC only; however, the same is equally applicable to CHCHC and CHCHCHC.

[A] The combination may be followed by M, B, D, X, DB or DX.

Example:
CHCM ক্কী → ক + ্ + ক + ী क्की → क + ् + क + ी
CHCB ক্কং → ক + ্ + ক + ং क्कं → क + ् + क + ं
CHCD ক্কঁ → ক + ্ + ক + ঁ क्कँ → क + ् + क + ँ
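CHCX ক্কঃ → ক + ্ + ক + ঃ क्कः → क + ् + क + ः
CHCDB ক্কঁং → ক + ্ + ক + ঁ + ং क्कँं → क + ् + क + ँ + ं
CHCDX ক্কঁঃ → ক + ্ + ক + ঁ + ঃ क्कँः → क + ् + क + ँ + ः

[B] *3(CH)CM may further be followed by a B, D, X, DB or DX.

Example:
CHCMB ক্কীং → ক + ্ + ক + ী + ং क्कीं → क + ् + क + ी + ं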
sup3ং rarr ক ক ং 4क rarr क क
CHCMD ক্কাঁ → ক + ্ + ক + া + ঁ क्काँ → क + ् + क + ा + ँ
CHCMX ক্কীঃ → ক + ্ + ক + ী + ঃ क्कीः → क + ् + क + ी + ः
CHCMDB ক্কাঁং → ক + ্ + ক + া + ঁ + ং क्काँं → क + ् + क + ा + ँ + ं
CHCMDX ক্কাঁঃ → ক + ্ + ক + া + ঁ + ঃ क्काँः → क + ् + क + ा + ँ + ः
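
The consonant-sequence rules of 5.4.3.1 to 5.4.3.5 can likewise be sketched as a pattern over category letters. This is an illustrative reading of the ABNF, not the panel's own implementation; in particular, it assumes that a trailing H is only allowed directly after the last consonant. The names `CONSONANT_SEQUENCE` and `is_consonant_sequence` are hypothetical.

```python
import re

# Up to three "CH" pairs, then a final C: *3(CH)C, i.e. at most four consonants.
# The cluster may end in H, or take an optional M followed by an optional
# B, D, X, DB or DX.
CONSONANT_SEQUENCE = re.compile(r"(?:CH){0,3}C(?:H|M?(?:DB|DX|B|D|X)?)")


def is_consonant_sequence(symbols: str) -> bool:
    """Check a string of ABNF category letters against the consonant sequence."""
    return CONSONANT_SEQUENCE.fullmatch(symbols) is not None


for candidate in ["C", "CH", "CM", "CMDB", "CHCHCHC", "CHCMDX",
                  "CHCHCHCHC", "CMM", "CBB", "CXB"]:
    print(candidate, is_consonant_sequence(candidate))
```

The bound `{0,3}` is what limits a cluster to four consonants; a fifth consonant, as in `CHCHCHCHC`, no longer matches.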
5.4.4 The Khanda-Ta sequence

5.4.4.1 A single 'Khanda'-Ta (Z)
Example: Z ৎ = ত্

5.4.4.2 A Khanda Ta Combination (see footnote 10)
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):

[CH]Z

Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) 'scolding'.
Note: The conditions in this context of KHANDA TA are that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
5.4.5 Special Cases S and P
Two special cases involving Sequences (referred to as S and P in Table 16 under Section 7) could be described briefly here. Let us take up S in the first instance. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ - BANGLA LETTER A and U+098F এ - BANGLA LETTER E). For rendering Ya-phalā followed by অ and এ, it is necessary to type U+09CD plus U+09AF ya preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English 'bat', 'fat', etc. The Brahmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as 'vowel killer', although it actually indicates absence of a vowel after the marked consonant. Only the consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in the orthography, wherein Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (Cf. Unicode 10.0, p. 473 [100]).

Another case refers to the formation of repha and ra-phalā in the said script, mentioned in the table above as P. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra "cycle"). The point is that in both the cases the slot for ra could be the Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed/preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā in both the cases remain the same.

10 Refer to Rule P in Section 7, Table 16
6 Variants
This section talks about the variants in the Bangla script. The NBGP categorizes these confusing variants into two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community

For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (String similarity assessment panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formations. These can cause confusion even to a careful observer and hence are being proposed as variants.

The variants are generated in a script when two or more forms are formed with different storage or code points. In Bangla, the e-kāra, ā-kāra and the o-kāra have different code points. One can type o with a consonant at one go, and the same by typing e-kāra and ā-kāra as two separate keys, getting the same results. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered as a case of variant, because a kāra followed by a kāra is not allowed.
6.1 In Script Variants
However, we propose two cases of true in-script variants in Bangla script.

CASE I
As far as true variants in Bangla are concerned, we may draw our attention to cases wherein Hasanta with (U+09A5) থ (tha) appears as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, could be a little confusing where the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final positions (as shown below in e.g. no. 1). It could be typed wrongly as well, thinking it was a হ (ha) U+09B9, increasing the chances of risks in label writing and identification.

Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)

The fonts which represent the traditional Bangla writing system could tend to create this problem. Therefore these may be taken as cases of variants in Bangla.

CASE II
Another interesting example of a variant is encountered in ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra). 'Repha' could be formed by two sequences (mainly because both Assamese and Bangla find place in the same UNICODE points, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here the final ligatures look the same and will be as follows:

(1) B_RA + H + C
(2) A_RA + H + C

Where
B_RA = U+09B0 BENGALI LETTER RA (র), or
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (◌্)
C = any consonant (theoretically)
Example:

Sequence 1 (Using Bangla RA) | Ligature 1 | Sequence 2 (Using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (◌্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (◌্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (◌্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (◌্) U+09A0 (ঠ) | ৰ্ঠ

Table 12: Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. Addition of Repha does not make any difference.

'Ra-phala' could be formed by two sequences on similar grounds, and the final ligatures would look the same:

(1) C1 + H + B_RA
(2) C1 + H + A_RA

Where C1 = any consonant except Khanda-ta
Example:

Sequence 1 (Using Bangla RA) | Ligature 1 | Sequence 2 (Using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (◌্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (◌্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (◌্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (◌্) U+09F0 (ৰ) | ন্ৰ

Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusability to the end-users. Hence the repha and ra-phala cases need to be defined as variants. NBGP concluded to define র and ৰ as variant code points, where only one variant set between র and ৰ could cover all cases. But this will create blocked variant labels; e.g. if someone registers "ররর", the variant label "ৰৰৰ" will be generated as a variant and will be blocked, and vice versa. However, it is only blocked at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, it is still possible.
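
The blocking behaviour described above can be illustrated with a small sketch. It assumes that the variant relation is applied code point by code point, so that two labels collide when they become identical after every ৰ (U+09F0) is mapped to র (U+09B0). The function names `index_label` and `is_blocked` are hypothetical, and the Case I conjunct variants are not modelled here.

```python
from typing import Iterable

BANGLA_RA = "\u09B0"    # র
ASSAMESE_RA = "\u09F0"  # ৰ


def index_label(label: str) -> str:
    """Map both RA variants to one representative so variant labels compare equal."""
    return label.replace(ASSAMESE_RA, BANGLA_RA)


def is_blocked(candidate: str, registered: Iterable[str]) -> bool:
    """True if the candidate is a variant of an already registered label."""
    key = index_label(candidate)
    return any(candidate != label and index_label(label) == key
               for label in registered)


registered = [BANGLA_RA * 3]                     # "ররর"
print(is_blocked(ASSAMESE_RA * 3, registered))   # variant of the registered label
print(is_blocked(ASSAMESE_RA * 2, registered))   # different length, no collision
print(is_blocked(ASSAMESE_RA * 4, registered))   # different length, no collision
```

Because the comparison is made on whole labels, labels of other lengths built from the same letters remain available, which matches the example given in the text.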
6.2 Cross Script Variants
A crisp cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia (footnote 11; formerly Oriya), keeping in mind the visual and technical confusions they may cause as labels on the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain characters which are somewhat similar, they have not been included here. They have been provided in the Appendix (10.2) for reference.
11 Unicode uses Oriya for the script although Odia is now the official term used
1. Bangla and Nāgarī / Devanāgarī Script

Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F

Table 14: Bangla and Devanāgarī cross-script variant code points

2. Bangla and Gurmukhī Script

Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F

Table 15: Bangla and Gurmukhī cross-script variant code points
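
Tables 14 and 15 can be read as a small lookup from Bangla code points to their cross-script counterparts. The sketch below is only an illustration of how such a lookup might be used to find a label's whole-label variant in another script; the names `CROSS_SCRIPT` and `cross_script_variant` are hypothetical.

```python
from typing import Optional

# Cross-script variant code points from Tables 14 and 15.
CROSS_SCRIPT = {
    "\u09AE": {"Devanagari": "\u092E", "Gurmukhi": "\u0A38"},  # ম
    "\u09BF": {"Devanagari": "\u093F", "Gurmukhi": "\u0A3F"},  # ি
}


def cross_script_variant(label: str, script: str) -> Optional[str]:
    """Return the variant label in `script`, or None if any code point has none."""
    out = []
    for ch in label:
        if ch not in CROSS_SCRIPT:
            return None
        out.append(CROSS_SCRIPT[ch][script])
    return "".join(out)


label = "\u09AE\u09BF"  # মি
print(cross_script_variant(label, "Devanagari"))
print(cross_script_variant(label, "Gurmukhi"))
print(cross_script_variant("\u0995", "Devanagari"))  # ক has no listed variant
```

A whole-label variant only exists when every code point of the label has a counterpart in the other script, which is why the function returns None as soon as one is missing.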
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in section 3.2 when written in Bangla (footnote 12) Script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications.

Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories as mentioned in the table provided in the Code point repertoire (Section 5.1):

C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), or (a/e) Ya-phalā (V1 H C1 M1), where V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E); H is 09CD (◌্ - BENGALI SIGN VIRAMA); C1 is 09AF (য - BENGALI LETTER YA); M1 is 09BE (◌া - BENGALI VOWEL SIGN AA). S1 and S2 are valid even if they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL); H is 09CD (◌্ - BENGALI SIGN VIRAMA)

Table 16: Symbols used in WLE rules

12 As used by the Unicode, denoting and including both Assamese and Maṇipuri
It is also perhaps ideal to mention here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters" that could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, Pg. 4]. More details of such combinations could be seen in Pabitra Sarkar (1993) [135].

7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and various restrictions, below are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়
2. H must be preceded by C. Example: ক্
3. M must be preceded by C. Example: কা
4. D must be preceded by either of V, C or M. Example: আঁ, খঁ, খাঁ, হাঁ
5. X must be preceded by either of V, C, M or D. Example: উঃখঃবঃাঃ দ ঃ
6. B must be preceded by either of V, C, M or D. Example: আং, ইং, কং
7. Z must be preceded by V, C, M, D, B, X or P. Example: ইৎকৎাৎাৎপ6ৎrৎ (S is not listed because S ends with M. Z may also follow S.)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V preceded by H
9. S CANNOT be preceded by H
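
As an illustration of how rules 1 to 9 interact, the following sketch checks a label against them. It is a simplified reading of the rules, not the machine-readable LGR: categories are assigned from Table 8, S1/S2 (Table 9) and the nukta sequences are grouped into single tokens first, and S is treated like M when it is the predecessor of D, X, B or Z (an assumption based on the remark that S ends with M). All function and constant names are hypothetical.

```python
from typing import List, Optional, Tuple

VOWELS = {0x0985, 0x0986, 0x0987, 0x0988, 0x0989, 0x098A, 0x098B,
          0x098F, 0x0990, 0x0993, 0x0994}
CONSONANTS = (set(range(0x0995, 0x09A9)) | set(range(0x09AA, 0x09B1))
              | {0x09B2, 0x09B6, 0x09B7, 0x09B8, 0x09B9, 0x09F0, 0x09F1})
MATRAS = set(range(0x09BE, 0x09C5)) | {0x09C7, 0x09C8, 0x09CB, 0x09CC}
SIGNS = {0x0981: "D", 0x0982: "B", 0x0983: "X", 0x09CD: "H", 0x09CE: "Z"}
NUKTA = "\u09BC"
NUKTA_BASES = {0x09A1, 0x09A2, 0x09AF}
RA = {"\u09B0", "\u09F0"}
S_SEQUENCES = {"\u0985\u09CD\u09AF\u09BE", "\u098F\u09CD\u09AF\u09BE"}

# Rules 2-7: which categories may immediately precede each symbol.
ALLOWED_BEFORE = {
    "H": {"C"},
    "M": {"C"},
    "D": {"V", "C", "M", "S"},
    "X": {"V", "C", "M", "D", "S"},
    "B": {"V", "C", "M", "D", "S"},
    "Z": {"V", "C", "M", "D", "B", "X", "S"},
}

Token = Tuple[str, str]  # (category, text)


def tokenize(label: str) -> Optional[List[Token]]:
    """Split a label into (category, text) tokens; None if outside the repertoire."""
    tokens: List[Token] = []
    i = 0
    while i < len(label):
        if label[i:i + 4] in S_SEQUENCES:            # S1 / S2 from Table 9
            tokens.append(("S", label[i:i + 4]))
            i += 4
            continue
        cp = ord(label[i])
        if cp in NUKTA_BASES and label[i + 1:i + 2] == NUKTA:  # rule 1: CN
            tokens.append(("C", label[i:i + 2]))
            i += 2
            continue
        if cp in VOWELS:
            cat = "V"
        elif cp in CONSONANTS:
            cat = "C"
        elif cp in MATRAS:
            cat = "M"
        elif cp in SIGNS:
            cat = SIGNS[cp]
        else:
            return None
        tokens.append((cat, label[i]))
        i += 1
    return tokens


def is_valid(label: str) -> bool:
    """Apply the context rules 2-9 to a tokenized label."""
    tokens = tokenize(label)
    if not tokens:
        return False
    for i, (cat, _) in enumerate(tokens):
        prev_cat = tokens[i - 1][0] if i > 0 else None
        if cat in ALLOWED_BEFORE:
            if prev_cat in ALLOWED_BEFORE[cat]:
                continue
            # Rule 7: Z may also follow P, i.e. RA + Hasanta.
            if cat == "Z" and prev_cat == "H" and i >= 2 and tokens[i - 2][1] in RA:
                continue
            return False
        if cat in ("V", "S") and prev_cat == "H":    # rules 8 and 9
            return False
    return True


print(is_valid("\u09AC\u09BE\u0982\u09B2\u09BE"))  # বাংলা
print(is_valid("\u0995\u0982\u0982"))              # কংং, doubled anusvāra
```

Checking only the immediate predecessor is enough for these rules, with the single exception of Z after P, where the token two positions back has to be one of the two RA letters.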
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in adding such ligatures, because any user can easily create them even though they may not be attested combinations.

7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া bæŋk ʌv ɪndiə (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE) (meaning "Bank of India"). This is the case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root zone repertoire. Otherwise V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic communities; hence this is explicitly prohibited by the NBGP. In future, if required, depending on the prevailing requirements from the community, the future NBGP may consider revisiting this rule.

7.2 Additional Examples from Bangla ABNF

Below are a few examples which help one understand some of the rules ABNF puts in place. These are just given for reference purposes and are not meant to be comprehensive.
1. H, M, B, D or X cannot occur in the beginning of a Bangla word.
Example:
্ক ्क
াক ाक
ংক ंक
ঁক ँक
ঃক ःक
As can be seen, such a combination will result automatically in a "golu", or a dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.
2. H is not permitted after V, B, D, X, M, S.
Example:
অ अ
অং क ক क কঃ कः ি )क
3. The number of B, D or X permitted after a Consonant or Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalidated:
Example:
কংং कंं
কঁঁ कँँ
কঃঃ कःः
কাঁঁ काँँ
কীঃঃ कीःः
অংং अंं
অঁঁ अँँ
অঃঃ अःः
4. The number of M permitted after a Consonant is restricted to one.
Example:
কীী कीी

5. M is not permitted after V.
Example:
ইা, ঈৗ ईा, ईौ

6. The combinations of Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Example:
কংঃ कंः
কঃং कःं
8 Contributors

8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members

8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals' Systems, Dhaka
53
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje & Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. 'Bangla Harapher Panch Parba'. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the MS University of Baroda, Social Science Number, 29.2, 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25.11.2017.
[124] Ethnologue. Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25.11.2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue. Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20.10.2019.
[126] Wikipedia. Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25.11.2017.
[129] Omniglot. Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10.5.2018.
[130] Wikipedia. Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25.11.2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20, accessed on 10.5.2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10.5.2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10.5.2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1.1, 23-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 121-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). 'Srīhatta-Nāgarī Lipir Utpatti o Bikāś'. Bangla Academy Patrika (Dhaka), Vol 12 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia. Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19/5/2018.
[139] Furui, Ryosuke (2015). 'Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century', in Bhairabi Prasad Sahu & Hermann Kulke, eds., Interrogating Political Systems: Integrative Processes and States in Pre-Modern India, Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). 'The Phonemes of Bengali'. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646. UNICODE. https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia. Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. 'Bangla Spelling Reform: the Long and Short of It'. Bangla Journal 19215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. "What has happened So Far In terms of Script Reforms". Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka, on 10/07/2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 - Core Specification. Chapter 12, p. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language or script, certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla¹³ in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold 'varga' (as defined under Table 5). Bangla does not normally allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem 'Lichuchor' written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2. CH can come with Khanda Ta only in the case where C is ra (র) (U+09B0): র্ৎ, as in ভর্ৎসনা.
3. Only the following combinations with VHCM will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
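Restriction 1 above can be sketched as a simple first-code-point check. This is only an illustration under stated assumptions: the set name `FORBIDDEN_INITIAL` and the function `may_start_label` are hypothetical, and the set contains only the three letters that the text states outright are not allowed label-initially (ৎ, ঞ, ঙ); the rarer case of initial য় is deliberately left unrestricted because the text cites attested uses.

```python
# Hypothetical helper illustrating restriction 1; names are not from the LGR itself.
FORBIDDEN_INITIAL = {
    "\u09CE",  # ৎ BENGALI LETTER KHANDA TA
    "\u099E",  # ঞ BENGALI LETTER NYA
    "\u0999",  # ঙ BENGALI LETTER NGA
}


def may_start_label(label: str) -> bool:
    """Return True if the label is non-empty and its first code point
    is not one of the letters barred from label-initial position."""
    return bool(label) and label[0] not in FORBIDDEN_INITIAL
```

For instance, a label beginning with ক passes the check, while one beginning with ৎ does not.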
13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.

10.2 'Sylheti Nāgarī lipi' or 'Siloti'
This version of the Bangla script resembles the 'Kaithī' script (ISO 15924) used by accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, widely in use during the 1880s. There were several other names for Sylheti Nāgarī or Siloti (129), such as 'Jalalabada Nāgarī', 'Fula (flower) Nāgarī', 'Muslim Nāgarī' or 'Muhammad Nāgarī'. It is said that Shah Jalal brought the script with him to Sylhet in the 13th-14th century (138), although some have suggested that it was an invention of the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here (138).
Table 17 - The Script Table of Sylheti Nāgarī or Siloti
10.3 Confusable code points
The following code points were analysed, and the NBGP concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.

10.3.1 Bangla and Nāgarī or Devanāgarī

Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable
Table 18 Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhī

Bangla | Gurmukhī | NBGP Decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable
Table 19 Bangla and Gurmukhī confusable code points

Bangla | Gurmukhī | NBGP Decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable
Table 20 - Bangla and Gurmukhī distinguishable code points
10.3.3 Bangla and Oriya (Odia)

Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable
Table 21 - Bangla and Oriya confusable code points

Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable
Table 22 - Bangla and Oriya distinguishable code points
11 Appendix-II Bengali consonants and their allographs
Consonants | Phonetic Value | Allographs
Clusters | Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (does not occur as the first member of a cluster) জ্ঝ (জ+ঝ), ঞ্ঝ (ঞ+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (does not occur as the first member of a cluster) ঙ্খ (ঙ+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value of its own, but is mostly pronounced as n.
ঞ্চ (ঞ+চ), ঞ্ছ (ঞ+ছ), ঞ্জ (ঞ+জ), ঞ্ঝ (ঞ+ঝ), জ্ঞ (জ+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒ. The secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word, it is a vowel æ (as in শ্যামল, র‍্যাপার etc.). But after a non-initial consonant, it just doubles it in pronunciation (as in কার্য, ধার্য etc.). The র+য combination has two physical manifestations: র‍্য and র্য.
ক্য (ক+য), স্য (স+য), র‍্য (র+য) [Just র‍্য is never used in Bangla orthography; র‍্যা is, but then its last two symbols, Ya-phala and a-kara, constitute a vowel sign representing the vowel অ্যা]
র r Two manifestations:
i. রেফ repʰ as the first member of a cluster, e.g. র্প, র্ৎ, র্দ্র, র্য, র্ধ্ব (র+ধ+ব) (earlier র্দ্ধ্ব = র+দ+ধ+ব, a four-term cluster) etc. (placed over the following consonant)
ii. র-ফলা rɔ-pʰɔla as the second or third member of a cluster (placed under the consonant it follows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃকঃ
অব
Figure 1 Bangla Code Page from [MSR] for Bangla-Asamiyā-Manipuri
Colour convention:
All characters that are included in the [MSR] - Yellow background
PVALID in IDNA2008 but excluded from the [MSR] - Pinkish background
Not PVALID in IDNA2008, or ineligible for the root zone (digits, hyphen) - White background

Given the Bangla Unicode Block as in Figure 1, among the code points that are included in the MSR, the following symbols will need a separate treatment:
ৎ U+09CE Bangla Letter Khanda-Ta
ৰ U+09F0 Asamiyā-Bangla Letter Ra With Middle Diagonal
ৱ U+09F1 Asamiyā-Bangla Letter Ra With Lower Diagonal
5.1 Code Point Repertoire Inclusion

No | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
1 | U+0981 | ঁ | BENGALI SIGN CANDRABINDU | Candra-bindu | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
2 | U+0982 | ং | BENGALI SIGN ANUSVARA | Onushshar (Anusvara) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
3 | U+0983 | ঃ | BENGALI SIGN VISARGA | Biśarga (Visarga) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
4 | U+0985 | অ | BENGALI LETTER A | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
5 | U+0986 | আ | BENGALI LETTER AA | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
6 | U+0987 | ই | BENGALI LETTER I | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
7 | U+0988 | ঈ | BENGALI LETTER II | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
8 | U+0989 | উ | BENGALI LETTER U | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
9 | U+098A | ঊ | BENGALI LETTER UU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
10 | U+098B | ঋ | BENGALI LETTER VOCALIC R | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
11 | U+098F | এ | BENGALI LETTER E | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
12 | U+0990 | ঐ | BENGALI LETTER AI | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
13 | U+0993 | ও | BENGALI LETTER O | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
14 | U+0994 | ঔ | BENGALI LETTER AU | Vowel | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
15 | U+0995 | ক | BENGALI LETTER KA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
16 | U+0996 | খ | BENGALI LETTER KHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
17 | U+0997 | গ | BENGALI LETTER GA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
18 | U+0998 | ঘ | BENGALI LETTER GHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
19 | U+0999 | ঙ | BENGALI LETTER NGA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
20 | U+099A | চ | BENGALI LETTER CA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
21 | U+099B | ছ | BENGALI LETTER CHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
22 | U+099C | জ | BENGALI LETTER JA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
23 | U+099D | ঝ | BENGALI LETTER JHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
24 | U+099E | ঞ | BENGALI LETTER NYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
25 | U+099F | ট | BENGALI LETTER TTA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
26 | U+09A0 | ঠ | BENGALI LETTER TTHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
27 | U+09A1 | ড | BENGALI LETTER DDA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
28 | U+09A1 U+09BC (U+09DC) | ড় | Normalized form of BENGALI LETTER RRA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125] 09DC is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
29 | U+09A2 | ঢ | BENGALI LETTER DDHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
30 | U+09A2 U+09BC (U+09DD) | ঢ় | Normalized form of BENGALI LETTER RHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125] 09DD is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
31 | U+09A3 | ণ | BENGALI LETTER NNA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
32 | U+09A4 | ত | BENGALI LETTER TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
33 | U+09A5 | থ | BENGALI LETTER THA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
34 | U+09A6 | দ | BENGALI LETTER DA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
35 | U+09A7 | ধ | BENGALI LETTER DHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
36 | U+09A8 | ন | BENGALI LETTER NA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
37 | U+09AA | প | BENGALI LETTER PA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
38 | U+09AB | ফ | BENGALI LETTER PHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
39 | U+09AC | ব | BENGALI LETTER BA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
43 | U+09AF U+09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125] 09DF is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112][125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
50 | U+09BE | া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
51 | U+09BF | ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
52 | U+09C0 | ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
53 | U+09C1 | ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112][122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (= Halant), Virama (= Da ri) | 1 Bangla, 2 Assamese, 2 Manipuri | [112][122][125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122][125]
Table 8 Bangla Code-Point Repertoire
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes some specific sequences which enable conditional inclusion of the Bangla LETTER A and E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, again followed by Bangla VOWEL SIGN AA, in the repertoire, for enabling inclusion of the æ sound as in English 'bat', 'cat' etc.

Sr No | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
S1 | U+0985 U+09CD U+09AF U+09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112][122]
S2 | U+098F U+09CD U+09AF U+09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9 Sequences
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find a place in Unicode but have not been included in the repertoire in the LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.

Sr No | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10 Excluded Code Points
5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire since it will never be used alone. It will be used as part of a sequence in three special characters in normalized form: ড়, ঢ় and য়.

Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড়, ঢ় and য় respectively
Table 10b Excluded Code Points
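The relationship between the precomposed letters and the nukta sequences listed in Table 8 can be checked with Unicode normalization. The sketch below uses Python's standard `unicodedata` module; the names `NUKTA_SEQUENCES` and `to_repertoire_form` are illustrative only and do not come from the LGR. It relies on the fact that U+09DC, U+09DD and U+09DF are excluded from composition, so NFC leaves them in the two-code-point form that the repertoire uses.

```python
import unicodedata

# Precomposed letter -> base letter + U+09BC NUKTA, as listed in rows 28, 30 and 43 of Table 8.
NUKTA_SEQUENCES = {
    "\u09DC": "\u09A1\u09BC",  # ড় RRA
    "\u09DD": "\u09A2\u09BC",  # ঢ় RHA
    "\u09DF": "\u09AF\u09BC",  # য় YYA
}


def to_repertoire_form(text: str) -> str:
    """Normalize to NFC, which rewrites the three precomposed letters
    as base letter + nukta (they are composition exclusions)."""
    return unicodedata.normalize("NFC", text)
```

A label typed with U+09DF therefore ends up as U+09AF U+09BC after normalization, which is the form in which it appears in the repertoire.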
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20/11/2009 and 18/07/2013.

5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) began with the following variables:

C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta

The Augmented Backus-Naur Formalism (ABNF) will use the following operators:

Sr Number | Operator | Function
1 | "|" | Alternative
2 | "[ ]" | Optional
3 | "*" | Variable Repetition
4 | "( )" | Sequence Group
Table 11 The ABNF Formalism
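As an illustration of how the ABNF variables relate to the repertoire of Table 8, the sketch below maps a label to a string of class symbols. It is an assumption-laden sketch rather than the panel's implementation: the function name `to_classes` and the set names are hypothetical, and the nukta (U+09BC) is folded into the preceding consonant so that ড়, ঢ় and য় each count as a single C.

```python
# Class membership derived from Table 8 (code point ranges are inclusive of the first
# value and exclusive of the last, as usual for range()).
V = set(range(0x0985, 0x098C)) | {0x098F, 0x0990, 0x0993, 0x0994}
C = (set(range(0x0995, 0x09A9)) | set(range(0x09AA, 0x09B1)) | {0x09B2}
     | set(range(0x09B6, 0x09BA)) | {0x09F0, 0x09F1})
M = set(range(0x09BE, 0x09C5)) | {0x09C7, 0x09C8, 0x09CB, 0x09CC}
SINGLE = {0x0982: "B", 0x0981: "D", 0x0983: "X", 0x09CD: "H", 0x09CE: "Z"}
NUKTA = 0x09BC
NUKTA_BASES = {0x09A1, 0x09A2, 0x09AF}  # ড, ঢ, য


def to_classes(label: str) -> str:
    """Translate a label into ABNF class symbols, e.g. 'CMBCM'.
    Raises ValueError for code points outside the repertoire."""
    symbols = []
    previous = None
    for ch in label:
        cp = ord(ch)
        if cp == NUKTA:
            # The nukta is only valid directly after one of its three base letters;
            # it adds no symbol of its own because the pair counts as one consonant.
            if previous not in NUKTA_BASES:
                raise ValueError("nukta without a valid base letter")
        elif cp in V:
            symbols.append("V")
        elif cp in C:
            symbols.append("C")
        elif cp in M:
            symbols.append("M")
        elif cp in SINGLE:
            symbols.append(SINGLE[cp])
        else:
            raise ValueError(f"U+{cp:04X} is not in the repertoire")
        previous = cp
    return "".join(symbols)
```

Under these assumptions, the word বাংলা (U+09AC U+09BE U+0982 U+09B2 U+09BE) maps to the class string CMBCM.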
5.4.2 The Vowel Sequence
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given. To facilitate understanding for users of other Brāhmī scripts, equivalents in Devanāgarī are provided wherever necessary.
A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra/onuʃʃār (B), a Candrabindu (D) or a Visarga/biʃɔrgo (X). The number of D, B or X which can follow a V in Bangla may not be restricted to one. Going by the rules illustrated in the document, it is clear that formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have sequences such as a candrabindu followed by an anusvāra on the one hand, and a candrabindu followed by a visarga on the other, as in হ্যাঁংচা 'hænchā' and হ্যাঁঃ 'hæn' respectively. The possibility of a Visarga or Anusvāra (onuʃʃār) following a Candrabindu exists in Bangla.
A Vowel can optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phala. Ya-phala is a presentation form of U+09AF, the Bangla letter য or 'ya', represented by the sequence <U+09CD ্ BENGALI SIGN VIRAMA (Bangla SIGN Hasanta or VIRAMA), U+09AF য BENGALI LETTER YA>. Ya-phala has a special form ্য. Again, when combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. 'aa' (ā)), it is used for transcribing [æ], as in the "a" in the English word "bat", written in Bangla as ব্যাট.
A Vowel-sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example:
V    অ    अ

5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples:
VB    অং    अं
VD    অঁ    अँ
VX    অঃ    अः
VDB    অঁং    अँं
VDX    অঁঃ    अँः
VHCM    অ্যা, এ্যা

5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
VHCMB    অ্যাং, এ্যাং
VHCMD    অ্যাঁ, এ্যাঁ
VHCMX    অ্যাঃ, এ্যাঃ
VHCMDB    অ্যাঁং, এ্যাঁং
VHCMDX    অ্যাঁঃ, এ্যাঁঃ
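The vowel-sequence combinations listed in 5.4.2.1 to 5.4.2.3 can be summarised as one pattern over class symbols. The following is a minimal sketch, assuming the label has already been translated into the symbols V, H, C, M, B, D and X; the names `VOWEL_SEQ` and `is_vowel_sequence` are hypothetical. Note that the pattern only checks the shape VHCM; restricting it to the two permitted sequences (অ্যা and এ্যা) has to be done at the code point level.

```python
import re

# V, optionally followed by HCM, optionally followed by one of B, D, X, DB or DX.
VOWEL_SEQ = re.compile(r"V(HCM)?(DB|DX|B|D|X)?")


def is_vowel_sequence(classes: str) -> bool:
    """Return True if the class string is a well-formed vowel sequence."""
    return VOWEL_SEQ.fullmatch(classes) is not None
```

With this pattern, VDB and VHCMDX are accepted, while VBB and VBD are rejected, in line with the statement that formations such as VDD, VBB and VXX are invalid.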
5.4.3 The Consonant Sequence

5.4.3.1 A Single Consonant (C)
Example:
C    ক    क

5.4.3.2 A Consonant with Conditions
A Consonant may optionally be followed by a dependent vowel sign kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
CM    কি    कि
CB    কং    कं
CD    কঁ    कँ
CX    কঃ    कः
CH    ক্    क् (pure consonant)
CDB    কঁং    कँं
CDX    কঁঃ    कँः

5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB or DX.
Examples:
CMB    কীং    कीं
CMD    কাঁ    काँ
CMX    বীঃ    वीः
CMDB    কাঁং    काँं
CMDX    কাঁঃ    काँः
5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Examples:
CHC    ন্ত → ন + ্ + ত    न्त → न + ् + त
CHCHC    ন্ত্র → ন + ্ + ত + ্ + র    न्त्र → न + ् + त + ् + र
CHCHCHC    ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য    न्त्र्य → न + ् + त + ् + र + ् + य

5.4.3.5 Subsets
While considering its subsets, as a representative example we will consider the combination CHC only; however, the same is equally applicable to CHCHC and CHCHCHC.
[A] The combination may be followed by M, B, D, X, DB or DX.
Examples:
CHCM    ক্কী → ক + ্ + ক + ী    क्की → क + ् + क + ी
CHCB    ক্কং → ক + ্ + ক + ং    क्कं → क + ् + क + ं
CHCD    ক্কঁ → ক + ্ + ক + ঁ    क्कँ → क + ् + क + ँ
CHCX    ক্কঃ → ক + ্ + ক + ঃ    क्कः → क + ् + क + ः
CHCDB    ক্কঁং → ক + ্ + ক + ঁ + ং    क्कँं → क + ् + क + ँ + ं
CHCDX    ক্কঁঃ → ক + ্ + ক + ঁ + ঃ    क्कँः → क + ् + क + ँ + ः
[B] *3(CH)CM may further be followed by B, D, X, DB or DX.
Examples:
CHCMB    ক্কীং → ক + ্ + ক + ী + ং    क्कीं → क + ् + क + ी + ं
CHCMD    ক্কাঁ → ক + ্ + ক + া + ঁ    क्काँ → क + ् + क + ा + ँ
CHCMX    ক্কীঃ → ক + ্ + ক + ী + ঃ    क्कीः → क + ् + क + ी + ः
CHCMDB    ক্কাঁং → ক + ্ + ক + া + ঁ + ং    क्काँं → क + ् + क + ा + ँ + ं
CHCMDX    ক্কাঁঃ → ক + ্ + ক + া + ঁ + ঃ    क्काँः → क + ् + क + ा + ँ + ः
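The consonant-sequence combinations of 5.4.3.1 to 5.4.3.5 can likewise be written as one pattern over class symbols. This is a sketch under stated assumptions: the names `CONSONANT_SEQ` and `is_consonant_sequence` are hypothetical, the input is assumed to be a class string over C, H, M, B, D and X, and allowing a final H after a multi-consonant cluster (for example CHCH) is an interpretation that the text itself only states for a single consonant.

```python
import re

# Up to three (CH) pairs, then a consonant, then either a final Hasanta
# or an optional Mātrā followed by an optional B, D, X, DB or DX.
CONSONANT_SEQ = re.compile(r"(CH){0,3}C(H|M?(DB|DX|B|D|X)?)")


def is_consonant_sequence(classes: str) -> bool:
    """Return True if the class string is a well-formed consonant sequence."""
    return CONSONANT_SEQ.fullmatch(classes) is not None
```

The bound `{0,3}` corresponds to the ABNF expression `*3(CH)C`, so a cluster of five consonants (CHCHCHCHC) is rejected.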
5.4.4 The Khanda-Ta sequence

5.4.4.1 A single 'Khanda'-Ta (Z)
Example:
Z    ৎ

5.4.4.2 A Khanda Ta Combination¹⁰
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) 'scolding'.
Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
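The `[CH]Z` pattern together with the note on RA can be expressed directly on code points. The sketch below is illustrative only; `is_khanda_ta_sequence` and the constant names are hypothetical, and the function tests an isolated sequence rather than a whole label.

```python
RA_LETTERS = {"\u09B0", "\u09F0"}  # র (Bangla RA) and ৰ (Assamese RA)
HASANTA = "\u09CD"
KHANDA_TA = "\u09CE"


def is_khanda_ta_sequence(seq: str) -> bool:
    """Accept a bare Khanda Ta, or RA + Hasanta + Khanda Ta."""
    if seq == KHANDA_TA:
        return True
    return (len(seq) == 3 and seq[0] in RA_LETTERS
            and seq[1] == HASANTA and seq[2] == KHANDA_TA)
```

For example, র্ৎ (U+09B0 U+09CD U+09CE) is accepted, whereas ক + ্ + ৎ is not, because the consonant before the Hasanta is not RA.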
10 Refer to Rule P in Section 7, Table 16

5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) can be described briefly here. Let us take up S in the first instance. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ - BANGLA LETTER A and U+098F এ - BANGLA LETTER E). To render Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF (ya) after the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English 'bat', 'fat' etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
The other case refers to the formation of repha and ra-phalā in the said script, mentioned in the table above as P. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra "cycle"). The point is that in both cases the slot for ra could be filled by the Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.
6 Variants
This section talks about the variants in the Bangla script. The NBGP categorizes these confusing variants into two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String similarity assessment panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These are not cases of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formation. They can cause confusion even to a careful observer and hence are being proposed as variants.
Variants are generated in a script when two or more identical forms are produced with different storage or code points. In Bangla, the e-kāra, ā-kāra and o-kāra have different code points. One can type o with a consonant at one go, or type e-kāra and ā-kāra as two separate keys, getting the same result. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered a case of variants, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants
However, we propose two cases of true in-script variants in the Bangla script.
CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases wherein (U+09A5) থ (tha) appears with Hasanta as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, could be a little confusing, as the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final position (as shown below in example no. 1). It could be typed wrongly as well, thinking it was a হ (ha) U+09B9, increasing the risks in label writing and identification.
Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)
Fonts which represent the traditional Bangla writing system tend to create this problem. Therefore, these may be taken as cases of variants in Bangla.
CASE II
Another interesting example of variants is encountered in ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra). 'Repha' could be formed by two sequences (mainly because both Assamese and Bangla find a place in the same Unicode block, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here the final ligatures look the same and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C

Where
B_RA = U+09B0 BENGALI LETTER RA (র) or
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)

Example:
Sequence 1 (Using Bangla RA) | Ligature 1 | Sequence 2 (Using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12 Example of Repha
Note: As the Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. The addition of Repha does not make any difference.
'Ra-phala' could be formed by two sequences on similar grounds, and the final ligatures would look the same:

(1) C1 + H + B_RA
(2) C1 + H + A_RA

Where C1 = any consonant except Khanda-ta

Example:
Sequence 1 (Using Bangla RA) | Ligature 1 | Sequence 2 (Using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13 Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusion for end users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP concluded that র and ৰ should be defined as variant code points, so that a single variant set between র and ৰ covers all cases. This will, however, create blocked variant labels: e.g. if someone registers "ররর", the variant label "ৰৰৰ" will be generated as a variant and will be blocked, and vice versa. However, it is only blocked at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, that is still possible.
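The blocking behaviour described above can be sketched as follows. This is an illustration, not the LGR tooling: `variant_labels` and `index_label` are hypothetical names, and the sketch assumes that every position holding র or ৰ may independently take either letter, which yields the fully substituted label mentioned in the text along with the mixed ones.

```python
from itertools import product

RA_VARIANTS = ("\u09B0", "\u09F0")  # র and ৰ form one variant set


def variant_labels(label: str) -> set:
    """Return every label obtained by swapping র and ৰ in any positions,
    excluding the original label itself."""
    choices = [RA_VARIANTS if ch in RA_VARIANTS else (ch,) for ch in label]
    return {"".join(combo) for combo in product(*choices)} - {label}


def index_label(label: str) -> str:
    """Map a label to a canonical key; two labels collide (one blocks the other)
    exactly when their keys are equal."""
    return label.replace("\u09F0", "\u09B0")
```

Under these assumptions, registering ররর blocks ৰৰৰ (their keys are equal), while ৰৰ and ৰৰৰৰ have different keys and remain available.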
6.2 Cross Script Variants
A crisp cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia¹¹ (formerly Oriya), keeping in mind the visual and technical confusion they may cause as labels in web domains. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain other characters which are somewhat similar, they have not been included here. They have been provided in the Appendix (10.3) for reference.

11 Unicode uses Oriya for the script, although Odia is now the official term used
1. Bangla and Nāgarī / Devanāgarī Script

Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14 - Bangla and Devanāgarī cross-script variant code points

2. Bangla and Gurmukhī Script

Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15 - Bangla and Gurmukhī cross-script variant code points
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in section 3.2 when written in the Bangla¹² script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications.
Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories as mentioned in the table provided in the Code point repertoire (Section 5.1):

C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), or (a/e) Ya-phalā (V1 H C1 M1), where
    V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E)
    H is 09CD (্ - BENGALI SIGN VIRAMA)
    C1 is 09AF (য - BENGALI LETTER YA)
    M1 is 09BE (া - BENGALI VOWEL SIGN AA)
    S1 and S2 are valid even though they are not allowed by the other context rules
P → Ra-Hasanta (C2 H), where
    C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name BENGALI LETTER RA WITH MIDDLE DIAGONAL)
    H is 09CD (্ - BENGALI SIGN VIRAMA)
Table 16 - Symbols used in WLE rules

12 As used by Unicode, denoting and including both Assamese and Maṇipuri
It is also worth mentioning here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters" that can theoretically conjoin two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, Pg. 4]. More details of such combinations can be seen in Pabitra Sarkar (1993) [135].

7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়
2. H must be preceded by C
Example
3. M must be preceded by C
Example: কা
4. D must be preceded by either of V, C or M
Example: আখখা হা
5. X must be preceded by either of V, C, M or D
Example: উঃখঃবঃাঃ দ ঃ
6. B must be preceded by either of V, C, M or D
Example: আং ইং কং
7. Z must be preceded by V, C, M, D, B, X or P
Example: ইৎকৎাৎাৎপ6ৎrৎ
(S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H (details in 7.1.1, Case of V Preceded by H)
9. S CANNOT be preceded by H
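The nine rules above can be sketched as a small checker over code points. This is a minimal illustration under stated assumptions, not the normative LGR XML: the function and constant names are hypothetical, the code point classes are transcribed from the repertoire table in Section 5.1, and S is assumed to behave like M on the left of D, X and B because S ends with M (the rules state this explicitly only for Z).

```python
# Symbol classes follow Table 16; code point sets follow Section 5.1.
VOWELS = set(range(0x0985, 0x098C)) | {0x098F, 0x0990, 0x0993, 0x0994}
CONSONANTS = (set(range(0x0995, 0x09A9)) | set(range(0x09AA, 0x09B1))
              | {0x09B2, 0x09B6, 0x09B7, 0x09B8, 0x09B9, 0x09F0, 0x09F1})
MATRAS = {0x09BE, 0x09BF, 0x09C0, 0x09C1, 0x09C2, 0x09C3, 0x09C4,
          0x09C7, 0x09C8, 0x09CB, 0x09CC}
SIGNS = {0x0982: "B", 0x0981: "D", 0x0983: "X", 0x09CD: "H", 0x09CE: "Z"}
NUKTA = 0x09BC
NUKTA_BASES = {0x09A1, 0x09A2, 0x09AF}  # CN (rule 1): base letter + nukta
RA = {0x09B0, 0x09F0}                   # C2 in the definition of P
S_SEQUENCES = {(0x0985, 0x09CD, 0x09AF, 0x09BE),   # S1
               (0x098F, 0x09CD, 0x09AF, 0x09BE)}   # S2

# Rules 2-7: symbol -> symbols allowed immediately before it.
# Assumption: S is accepted before D, X and B because S ends with M.
ALLOWED_BEFORE = {
    "H": {"C"},
    "M": {"C"},
    "D": {"V", "C", "M", "S"},
    "X": {"V", "C", "M", "D", "S"},
    "B": {"V", "C", "M", "D", "S"},
    "Z": {"V", "C", "M", "D", "B", "X", "P", "S"},
}


def tokenize(label):
    """Turn a label into WLE symbols, or return None for an unknown code point."""
    cps = [ord(ch) for ch in label]
    tokens, i = [], 0
    while i < len(cps):
        cp = cps[i]
        if tuple(cps[i:i + 4]) in S_SEQUENCES:
            tokens.append("S")
            i += 4
        elif cp in NUKTA_BASES and cps[i + 1:i + 2] == [NUKTA]:
            tokens.append("C")
            i += 2
        elif cp in CONSONANTS:
            tokens.append("C")
            i += 1
            if cp in RA and cps[i:i + 1] == [0x09CD]:
                tokens.append("P")  # Ra-Hasanta: the H that follows RA
                i += 1
        elif cp in VOWELS:
            tokens.append("V")
            i += 1
        elif cp in MATRAS:
            tokens.append("M")
            i += 1
        elif cp in SIGNS:
            tokens.append(SIGNS[cp])
            i += 1
        else:
            return None
    return tokens


def is_valid(label):
    """Apply rules 2-9 to a label; the empty label is rejected."""
    tokens = tokenize(label)
    if not tokens:
        return False
    prev = None
    for tok in tokens:
        if tok in ALLOWED_BEFORE and prev not in ALLOWED_BEFORE[tok]:
            return False
        if tok in ("V", "S") and prev in ("H", "P"):  # rules 8 and 9
            return False
        prev = tok
    return True
```

Because every "must be preceded by" rule fails when nothing precedes the symbol, the same table also rejects labels that begin with H, M, B, D, X or Z. A label containing a vowel directly after a Hasanta is rejected by rule 8, while the same label without that Hasanta passes.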
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in adding such ligatures, because any user can easily create them even though they may not be attested combinations.
7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া /bæŋkʌv ɪndiə/ (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE), meaning “Bank of India”. This is a case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-Joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise V is never required to be allowed to follow an H. Permitting this may create a perceptual similarity between two labels (with and without H) for the majority of the linguistic communities; hence it is explicitly prohibited by the NBGP.
In future, if required and depending on the prevailing requirements from the community, a future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules the ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.
1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Example
্ক ्क
াক ाक
ংক ंक
ঁক ँक
ঃক ःक
As can be seen, such combinations automatically result in a “golu” or dotted circle, marking them as invalid formations. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.
2. H is not permitted after V, B, D, X, M or S.
Example
অ अ
অং क ক क কঃ कः ি )क
3. The number of B, D or X permitted after a Consonant, a Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalid.
Example
কংং क
ক क
কঃঃ कःः
কা का
কীঃঃ कःः
অংং अ
অ अ
অঃঃ अःः
4. The number of M permitted after a Consonant is restricted to one.
Example
কীী की
5. M is not permitted after V.
Example
ইাঈৗ ईाईौ
6. The combinations Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Example
কংঃ कः
কঃং कः
8 Contributors
8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members
8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals’ Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje & Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. ‘Bangla Harapher Panch Parba’. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and Language Planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the MS University of Baroda, Social Science Number, 29(2): 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25/11/2017.
[124] Ethnologue. Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25/11/2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20/10/2019.
[126] Wikipedia. Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25/11/2017.
[129] Omniglot. Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10/5/2018.
[130] Wikipedia. Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25/11/2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20, accessed on 10/5/2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10/5/2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10/5/2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1123-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 121-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). ‘Srīhatta-Nāgarī Lipir Utpatti o Bikāś’. Bangla Academy Patrika (Dhaka), Vol 12 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia. Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19/5/2018.
[139] Furui, Ryosuke (2015). ‘Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century’, in Bhairabi Prasad Sahu & Hermann Kulke, eds. Interrogating Political Systems: Integrative Processes and States in Pre-Modern India, Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). ‘Phones of Bengali’. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A Phonetic and Phonological Study of Nasals and Nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646. UNICODE. https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia. Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. ‘Bangla Spelling Reform: the Long and Short of It’. Bangla Journal 19: 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. “What has happened So Far in terms of Script Reforms”. Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka on 10/07/2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 - Core Specification, Chapter 12, p. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language or script certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla¹³ in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold ‘varga’ (as defined under Table 5). Moreover, Bangla does not generally allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem ‘Lichuchor’ written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2. CH can come with Khanda Ta only in the case where C is ra (র) (09B0):
র্ৎ, as in ভর্ৎসনা
3. Only the following combinations with VHCM will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
10.2 ‘Sylheti Nāgarī lipi’ or ‘Siloti’
This version of the Bangla script resembles the ‘Kaithī’ script (ISO 15924) used by the accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, which was widely in use during the 1880s. There were several other names for Sylheti Nāgarī or Siloti (129), such as ‘Jalalabada Nāgarī’, ‘Fula (flower) Nāgarī’, ‘Muslim Nāgarī’ or ‘Muhammad Nāgarī’. It is said that Shah Jalal had brought the script with him to Sylhet in the 13th-14th century (138), although some have suggested that it was an invention of the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here (138).
13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
Table 17 - The Script Table of Sylheti Nāgarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla        Devanāgarī    NBGP Decision
ঃ U+0983      ः U+0903      Confusable
ও U+0993      उ U+0909      Confusable
ঘ U+0998      घ U+0918      Confusable
ঁ U+0981      ॅ U+0945      Confusable
Table 18 - Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhī
Bangla        Gurmukhī      NBGP decision
ঘ U+0998      ਬ U+0A2C      Confusable
ঁ U+0981      ੱ U+0A71      Confusable
Table 19 - Bangla and Gurmukhī confusable code points
Bangla                    Gurmukhī      NBGP decision
ও U+0993                  ਤ U+0A24      Distinguishable
শ U+09B6                  ਅ U+0A05      Distinguishable
ম U+09AE                  ਮ U+0A2E      Distinguishable
বা U+09AC and U+09BE      ਗ U+0A17      Distinguishable
Table 20 - Bangla and Gurmukhī distinguishable code points
10.3.3 Bangla and Oriya (Odia)
Bangla        Oriya (Odia)  NBGP Decision
ও U+0993      ଓ U+0B13      Confusable
Table 21 - Bangla and Oriya confusable code points
Bangla        Oriya (Odia)  NBGP Decision
ঘ U+0998      ସ U+0B38      Distinguishable
Table 22 - Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs
Consonants    Phonetic Value    Allographs: Clusters    Allographs: Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ Thisletterdoesnothaveanyparticularphoneticvaluebutmostlypronouncedasn
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒThesecondarysymbol(allograph)jɔ-phalahastwophoneticvaluesWhenaddedtotheinitialconsonantinaworditisavowelaelig(asinশ8ামলর 8াপারetc)Butafteranon-initialconsonantitjustdoublesitinpronunciation(asinকায6ধায6etc)The+যcombinationhastwophysicalmanifestationsmdashর 8andয6
ক8(p+য)স8(+য)র 8(+য)[Justর 8isneverusedinBanglaorthographyর 8াisbutthenitslasttwosymbolsYa-phalaa-karaconstituteavowelsignrepresentingthevowelঅ8া]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ hwordfinallywordmediallyitdoublesthepronunciationofthefollowingconsonant
অঃকঃ
অব
5.1 Code Point Repertoire Inclusion
No    Unicode Code Point    Glyph    Character Name    Category    Language(s) with EGIDS Value    References and Comment
1 U+0981 BENGALISIGNCANDRABINDU
Candra-bindu
1Bangla2Manipuri2Assamese
[112][122][125]
2 U+0982 ং BENGALISIGNANUSVARA
Onushshar(Anusvara)
1Bangla2Manipuri2Assamese
[112][122][125]
3 U+0983 ঃ BENGALISIGNVISARGA
Biśarga(Visarga)
1Bangla2Manipuri2Assamese
[112][122][125]
4 U+0985 অ BENGALILETTERA
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
5 U+0986 আ BENGALILETTERAA
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
6 U+0987 ই BENGALILETTERI
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
7 U+0988 ঈ BENGALILETTERII
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
8 U+0989 উ BENGALILETTERU
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
9 U+098A ঊ BENGALILETTERUU
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
10 U+098B ঋ BENGALILETTERVOCALICR
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
11 U+098F এ BENGALILETTERE
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
12 U+0990 ঐ BENGALILETTERAI
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
13 U+0993 ও BENGALILETTERO
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
14 U+0994 ঔ BENGALILETTERAU
Vowel 1Bangla2Manipuri2Assamese
[112][122][125]
15 U+0995 ক BENGALILETTERKA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
16 U+0996 খ BENGALILETTERKHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
17 U+0997 গ BENGALILETTERGA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
18 U+0998 ঘ BENGALILETTERGHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
19 U+0999 ঙ BENGALILETTERNGA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
20 U+099A চ BENGALILETTERCA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
21 U+099B ছ BENGALILETTERCHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
22 U+099C জ BENGALILETTERJA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
23 U+099D ঝ BENGALILETTERJHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
24 U+099E ঞ BENGALILETTERNYA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
25 U+099F ট BENGALILETTERTTA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
26 U+09A0 ঠ BENGALILETTERTTHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
27 U+09A1 ড BENGALILETTERDDA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
28 09A109BC(U+09DC)
ড় NormalizedformofBENGALILETTERRRA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]09DCisthepreferredcodepointhoweveritisnotavailableforLGRasperthestandardsgoverningthisLGRdevelopment
29 U+09A2 ঢ BENGALILETTERDDHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
30 09A209BC(U+09DD)
ঢ় NormalizedformofBENGALILETTERRHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]09DDisthepreferredcodepointhoweveritisnotavailableforLGRasperthestandardsgoverningthisLGRdevelopment
31 U+09A3 ণ BENGALILETTERNNA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
32 U+09A4 ত BENGALILETTERTA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
33 U+09A5 থ BENGALILETTERTHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
34 U+09A6 দ BENGALILETTERDA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
35 U+09A7 ধ BENGALILETTERDHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
36 U+09A8 ন BENGALILETTERNA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
37 U+09AA প BENGALILETTERPA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
38 U+09AB ফ BENGALILETTERPHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
39 U+09AC ব BENGALILETTERBA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
40 U+09AD ভ BENGALILETTERBHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
41 U+09AE ম BENGALILETTERMA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
42 U+09AF য BENGALILETTERYA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
43 09AF09BC(U+09DF)
য় NormalizedformofBENGALILETTERYYA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]09DFisthepreferredcodepointhoweveritisnotavailableforLGRasperthestandardsgoverningthisLGRdevelopment
44 U+09B0 র BENGALILETTERRA
Consonant 1Bangla2Manipuri
[112][125]
45 U+09B2 ল BENGALILETTERLA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
46 U+09B6 শ BENGALILETTERSHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
47 U+09B7 ষ BENGALILETTERSSA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
48 U+09B8 স BENGALILETTERSA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
49 U+09B9 হ BENGALILETTERHA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
50 U+09BE া BENGALIVOWELSIGNAA
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
51 U+09BF ি BENGALIVOWELSIGNI
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
52 U+09C0 ী BENGALIVOWELSIGNII
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
53 U+09C1 BENGALIVOWELSIGNU
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
54 U+09C2 BENGALIVOWELSIGNUU
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
55 U+09C3 BENGALIVOWELSIGNVOCALICR
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
56 U+09C4 BENGALIVOWELSIGNVOCALICRR
Kāra(Mātrā)
1Bangla2Assamese
[112][122]
57 U+09C7 l BENGALIVOWELSIGNE
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
58 U+09C8 m BENGALIVOWELSIGNAI
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
59 U+09CB lা BENGALIVOWELSIGNO
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
60 U+09CC lৗ BENGALIVOWELSIGNAU
Kāra(Mātrā)
1Bangla2Manipuri2Assamese
[112][122][125]
61 U+09CD BENGALISIGNVIRAMA
Hasanta(=Halant)Virama(=Da ri)
1Bangla2Assamese2Manipuri
[112][122][125]
62 U+09CE ৎ BENGALILETTERKHANDATA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
63 U+09F0 ৰ BENGALILETTERRAWITHMIDDLEDIAGONAL
Consonant 2Assamese [122]
64 U+09F1 ৱ BENGALILETTERRAWITHLOWERDIAGONAL
Consonant 2Assamese2Manipuri
[122][125]
Table 8 - Bangla Code Point Repertoire
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes some specific sequences which enable conditional inclusion of the Bangla LETTER A and E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, again followed by Bangla VOWEL SIGN AA, in the repertoire, for enabling inclusion of the æ sound as in English ‘bat’, ‘cat’ etc.
Sr No    Unicode Code Points    Sequence    Character Names    Example languages using the code point (not exhaustive list)    Reference
S1    0985 09CD 09AF 09BE    অ্যা    BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA    Bangla, Assamese    [112][122]
S2    098F 09CD 09AF 09BE    এ্যা    BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA    Bangla    [112]
Table 9 - Sequences
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find a place in Unicode but have not been included in the repertoire in the LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.
Sr No    Code Point    Glyph    Character Name    Note
1    U+098C    ঌ    BENGALI LETTER VOCALIC L    Limited or declining use
2    U+09D7    ৗ    BENGALI AU LENGTH MARK    Limited or declining use
Table 10 - Excluded Code Points
5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (see Section 3.3.6) is excluded from the repertoire since it will never be used alone. It will be used only as part of a sequence in the normalized forms of three special characters: ড়, ঢ় and য়.
Unicode Code Point    Glyph    Character Name    Reason for exclusion
U+09BC    ়    BENGALI SIGN NUKTA    Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড়, ঢ় and য় respectively
Table 10b - Excluded Code Points
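The relationship between the single code points U+09DC, U+09DD and U+09DF and the nukta sequences listed in the repertoire can be checked with Unicode normalization. The sketch below assumes that labels are handled in NFC, which is the form used for IDNA labels; the helper name to_normalized_form and the NUKTA_FORMS mapping are illustrative only.

```python
import unicodedata

# Precomposed letter -> base letter + U+09BC, as listed in repertoire
# rows 28, 30 and 43. The mapping name is illustrative.
NUKTA_FORMS = {
    "\u09DC": "\u09A1\u09BC",  # BENGALI LETTER RRA
    "\u09DD": "\u09A2\u09BC",  # BENGALI LETTER RHA
    "\u09DF": "\u09AF\u09BC",  # BENGALI LETTER YYA
}


def to_normalized_form(label):
    """Return the NFC form of a label."""
    return unicodedata.normalize("NFC", label)


for single, sequence in NUKTA_FORMS.items():
    # The three letters are excluded from composition, so NFC yields the
    # two-code-point sequence and never recomposes it.
    assert to_normalized_form(single) == sequence
    assert to_normalized_form(sequence) == sequence
```

This is why the repertoire lists the two-code-point sequences rather than the single code points: after normalization only the sequences can occur in a label.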
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20/11/2009 and 18/07/2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) began with the following variables:
C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta
The Augmented Backus-Naur Formalism (ABNF) will use the following operators:
Sr Number    Operator    Function
1            “|”         Alternative
2            “[]”        Optional
3            “*”         Variable Repetition
4            “()”        Sequence Group
Table 11 - The ABNF Formalism
5.4.2 The Vowel Sequence
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given. To facilitate understanding for users of other Brāhmī scripts, equivalents in Devanāgarī are provided wherever necessary.
A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra/onuʃʃār (B), a Candrabindu (D) or a Visarga/biʃɔrgo (X). The number of D, B or X which can follow a V in Bangla may not be restricted to one. Going by the rules illustrated in this document, it is clear that formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have sequences such as a candrabindu followed by an anusvāra on the one hand, and a candrabindu followed by a visarga on the other, as in হ্যাঁংচা ‘hænchā’ and হ্যাঁঃ ‘hæn’ respectively. The possibility of a Visarga or Anusvāra (onuʃʃār) following a Candrabindu thus exists in Bangla.
A vowel can optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phalā. Ya-phalā is a presentation form of U+09AF, the Bangla letter য or ‘ya’, represented by the sequence <U+09CD BENGALI SIGN VIRAMA (Hasanta), U+09AF য BENGALI LETTER YA>. Ya-phalā has a special form, ্য. When combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. ‘aa’ (ā)), it is used for transcribing [æ], as in the “a” in the English word “bat”, written in Bangla as ব্যাট.
A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example
V        অ        अ
5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples
VB       অং       अं
VD       অঁ       अँ
VX       অঃ       अः
VDB      অঁং      अँं
VDX      অঁঃ      अँः
VHCM     অ্যা এ্যা
5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples
VHCMB    অ্যাং এ্যাং
VHCMD    অ্যাঁ এ্যাঁ
VHCMX    অ্যাঃ এ্যাঃ
VHCMDB   অ্যাঁং এ্যাঁং
VHCMDX   অ্যাঁঃ এ্যাঁঃ
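The vowel-sequence patterns listed in 5.4.2.1 to 5.4.2.3 can be summarized as one regular expression over the ABNF symbols. This is a minimal sketch that works on symbol strings such as "VDB" rather than on code points; the names VOWEL_SEQUENCE and is_vowel_sequence are hypothetical, and the restriction of the HCM tail to য + া after অ or এ is not modelled here.

```python
import re

# V, optionally followed by the Ya-phala + a-kara tail (HCM), optionally
# followed by exactly one of B, D, X, DB or DX.
VOWEL_SEQUENCE = re.compile(r"V(?:HCM)?(?:D[BX]?|B|X)?")


def is_vowel_sequence(symbols):
    """Return True if the symbol string is one of the listed vowel sequences."""
    return VOWEL_SEQUENCE.fullmatch(symbols) is not None
```

Writing the trailing signs as `D[BX]?|B|X` keeps Candrabindu first when two signs occur and leaves no room for doubled signs or for Anusvāra and Visarga together.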
5.4.3 The Consonant Sequence
5.4.3.1 A Single Consonant (C)
Example
C        ক        क
5.4.3.2 A Consonant with Conditions
A Consonant is optionally followed by a dependent vowel sign kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Example
CM       কি ক     कि क
CB       কং       कं
CD       কঁ       कँ
CX       কঃ       कः
CH       ক্       क् (pure consonant)
CDB      কঁং      कँं
CDX      কঁঃ      कँः
5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB or DX.
Example
CMB      কীং কং    कीं कं
CMD      কাঁ       काँ
CMX      বীঃ       वीः
CMDB     কাঁং      काँं
CMDX     কাঁঃ      काँः
5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Example
CHC        ন্ত → ন + ্ + ত                          न + ् + त
CHCHC      ন্ত্র → ন + ্ + ত + ্ + র                 न + ् + त + ् + र
CHCHCHC    ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য        न + ् + त + ् + र + ् + य
5.4.3.5 Subsets
While considering its subsets, we will take the combination CHC as a representative example; however, the same is equally applicable to CHCHC and CHCHCHC.
[A] The combination may be followed by M, B, D, X, DB or DX.
Example
CHCM     ক্কী → ক ্ ক ী          क्की → क ् क ी
CHCB     ক্কং → ক ্ ক ং          क्कं → क ् क ं
CHCD     ক্কঁ → ক ্ ক ঁ          क्कँ → क ् क ँ
CHCX     ক্কঃ → ক ্ ক ঃ          क्कः → क ् क ः
CHCDB    ক্কঁং → ক ্ ক ঁ ং        क्कँं → क ् क ँ ं
CHCDX    ক্কঁঃ → ক ্ ক ঁ ঃ        क्कँः → क ् क ँ ः
[B] *3(CH)CM may further be followed by a B, D, X, DB or DX.
Example
CHCMB    ক্কীং → ক ্ ক ী ং        क्कीं → क ् क ी ं
         ক্কং → ক ্ ক ং          क्कं → क ् क ं
CHCMD    ক্কাঁ → ক ্ ক া ঁ        क्काँ → क ् क ा ँ
CHCMX    ক্কীঃ → ক ্ ক ী ঃ        क्कीः → क ् क ी ः
CHCMDB   ক্কাঁং → ক ্ ক া ঁ ং      क्काँं → क ् क ा ँ ं
CHCMDX   ক্কাঁঃ → ক ্ ক া ঁ ঃ      क्काँः → क ् क ा ँ ः
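The consonant-sequence patterns of 5.4.3.1 to 5.4.3.5 can likewise be written as one regular expression over the ABNF symbols. This is a sketch under assumptions: it covers only the patterns explicitly listed above (the pure consonant CH, and up to four consonants joined by H with an optional M and optional trailing signs), and the names CONSONANT_SEQUENCE and is_consonant_sequence are hypothetical.

```python
import re

# Either the pure consonant CH, or *3(CH)C followed by an optional M and
# at most one of B, D, X, DB or DX.
CONSONANT_SEQUENCE = re.compile(r"CH|(?:CH){0,3}CM?(?:D[BX]?|B|X)?")


def is_consonant_sequence(symbols):
    """Return True if the symbol string is one of the listed consonant sequences."""
    return CONSONANT_SEQUENCE.fullmatch(symbols) is not None
```

The `{0,3}` bound is the regular-expression counterpart of the `*3` repetition, so a cluster of five consonants is rejected.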
5.4.4 The Khanda Ta Sequence
5.4.4.1 A single ‘Khanda’ Ta (Z)
Example
Z        ৎ
5.4.4.2 A Khanda Ta Combination¹⁰
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example
র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā), ‘scolding’
Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
5.4.5 Special Cases: S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) can be described briefly here. Let us take up S first. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ, BENGALI LETTER A, and U+098F এ, BENGALI LETTER E). For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF (ya) preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English ‘bat’, ‘fat’ etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a ‘vowel killer’, although it actually indicates the absence of a vowel after the marked consonant. Only consonants can be marked with the Hasanta. But as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
The other case refers to the formation of repha and ra-phalā in the said script, mentioned in the table above as P. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka “the sun”); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra “cycle”). The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.
10 Refer to Rule P in Section 7, Table 16
43
6 Variants
This section discusses the variants in the Bangla script. The NBGP categorizes these confusing variants into two groups:
Group 1: Confusing due to pure visual similarity.
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community.
For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These are not cases of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formation. They can cause confusion even to a careful observer and are hence proposed as variants.
Variants are generated in a script when two or more forms are produced with different storage or code points. In Bangla, the e-kāra, the ā-kāra and the o-kāra have different code points. One can type the o-kāra with a consonant in one go, or type the e-kāra and the ā-kāra as two separate keys, and get the same result. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered a case of variant, because a kāra followed by a kāra is not allowed.
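The o-kāra case can be illustrated with a small sketch. It assumes Python's standard `unicodedata` module and relies on Unicode normalization (NFC), which composes e-kāra + ā-kāra into the single o-kāra code point; the helper name `same_after_nfc` is purely illustrative and is not part of the proposal.

```python
import unicodedata

KA = "\u0995"       # BENGALI LETTER KA
E_KARA = "\u09c7"   # BENGALI VOWEL SIGN E
AA_KARA = "\u09be"  # BENGALI VOWEL SIGN AA
O_KARA = "\u09cb"   # BENGALI VOWEL SIGN O

one_key = KA + O_KARA              # "ko" typed with a single o-kāra
two_keys = KA + E_KARA + AA_KARA   # "ko" typed as e-kāra followed by ā-kāra

def same_after_nfc(a: str, b: str) -> bool:
    """Compare two strings after NFC normalization (illustrative helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# The raw code point sequences differ, but NFC maps both to the same string,
# so no separate variant mapping is needed for this pair.
print(one_key != two_keys, same_after_nfc(one_key, two_keys))
```

Because the two-key spelling is also a kāra followed by a kāra, it would in any case be rejected by the whole-label rule that M must be preceded by C.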
6.1 In-Script Variants
We propose two cases of true in-script variants in the Bangla script.
CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases wherein থ (tha, U+09A5), joined by Hasanta, appears as a conjunct with স (sa, U+09B8) and ন (na, U+09A8):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, can be a little confusing, since the থ (tha) in the conjunct looks like a হ (ha). The conjunct can occur in initial, medial or final position (as shown below in example no. 1). It can also be typed wrongly, in the belief that it is a হ (ha, U+09B9), which increases the risk in label writing and identification.
Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)
Fonts which represent the traditional Bangla writing system tend to create this problem. Therefore these may be taken as cases of variants in Bangla.
CASE II
Another interesting example of variants is encountered in the ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing ‘repha’ (involving ra + Hasanta) and ‘ra-phalā’ (involving Hasanta + ra). ‘Repha’ can be formed by two sequences (mainly because both Assamese and Bangla are encoded in the same Unicode block, and ‘B_RA’ as well as ‘A_RA’ refer to the same phonetic element). Here the final ligatures look the same and are as follows:
(1) B_RA + H + C
(2) A_RA + H + C
Where
B_RA = U+09B0 BENGALI LETTER RA (র), or
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)
Example
Sequence 1 (using Bangla RA)    Ligature 1    Sequence 2 (using Assamese RA)    Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক)    র্ক    U+09F0 (ৰ) U+09CD (্) U+0995 (ক)    ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ)    র্ঠ    U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ)    ৰ্ঠ
Table 12: Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. The addition of Repha does not make any difference.
‘Ra-phalā’ can be formed by two sequences on similar grounds, and the final ligatures would look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
Where C1 = any consonant except Khanda-Ta
Example
Sequence 1 (using Bangla RA)    Ligature 1    Sequence 2 (using Assamese RA)    Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র)    ক্র    U+0995 (ক) U+09CD (্) U+09F0 (ৰ)    ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র)    ন্র    U+09A8 (ন) U+09CD (্) U+09F0 (ৰ)    ন্ৰ
Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusion for end users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP decided to define র and ৰ as variant code points, so that a single variant set between র and ৰ covers all cases. This will, however, create blocked variant labels: e.g. if someone registers “ররর”, the variant label “ৰৰৰ” will be generated as a variant and will be blocked, and vice versa. The blocking applies only at the label level: if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, that is still possible.
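The effect of the র/ৰ variant set on labels can be sketched as follows. This is a minimal illustration, not the LGR implementation: the function names `variant_labels` and `conflicts` are hypothetical, and the sketch assumes that each occurrence of র or ৰ may vary independently, whereas the text above names only the fully swapped label.

```python
from itertools import product

# The single in-script variant set discussed above: র (U+09B0) <-> ৰ (U+09F0).
VARIANTS = {"\u09b0": "\u09f0", "\u09f0": "\u09b0"}

def variant_labels(label: str) -> set:
    """All labels obtained by swapping র/ৰ at any positions, excluding the label itself."""
    choices = [(ch, VARIANTS[ch]) if ch in VARIANTS else (ch,) for ch in label]
    return {"".join(combo) for combo in product(*choices)} - {label}

def conflicts(registered: str, applied_for: str) -> bool:
    """True if the applied-for label is the registered label or one of its blocked variants."""
    return applied_for == registered or applied_for in variant_labels(registered)

registered = "\u09b0" * 3                      # ররর
print(conflicts(registered, "\u09f0" * 3))     # ৰৰৰ is a blocked variant
print(conflicts(registered, "\u09f0" * 2))     # ৰৰ has a different length and stays available
```

Variants always have the same length as the registered label, which is why shorter or longer labels such as ৰৰ or ৰৰৰৰ remain available.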
6.2 Cross-Script Variants
A careful cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia (footnote 11; formerly Oriya), keeping in mind the visual and technical confusion they may cause as labels in the web domain. Apart from the cases proposed in Section 6.1, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are proposed by the NBGP as cross-script variants. There are certain other characters which are somewhat similar, but they have not been included here; they are provided in the Appendix (Section 10.3) for reference.
11 Unicode uses Oriya for the script although Odia is now the official term used
1. Bangla and Nāgarī (Devanāgarī) Script
Bangla    Devanāgarī
ম U+09AE    म U+092E
ি U+09BF    ि U+093F
Table 14 - Bangla and Devanāgarī cross-script variant code points
2. Bangla and Gurmukhī Script
Bangla    Gurmukhī
ম U+09AE    ਸ U+0A38
ি U+09BF    ਿ U+0A3F
Table 15 - Bangla and Gurmukhī cross-script variant code points
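Tables 14 and 15 can be captured as a small lookup, sketched below. The dictionary and function names are illustrative only; the code points are exactly those listed in the two tables, and `unicodedata.name` is used just to show which characters are involved.

```python
import unicodedata

# Cross-script variant code points from Tables 14 and 15, keyed by the Bangla code point.
CROSS_SCRIPT_VARIANTS = {
    "\u09ae": {"\u092e", "\u0a38"},  # ম  <-> Devanāgarī म, Gurmukhī ਸ
    "\u09bf": {"\u093f", "\u0a3f"},  # ি <-> Devanāgarī ि, Gurmukhī ਿ
}

def cross_script_variants(code_point: str) -> set:
    """Return the cross-script variants of a Bangla code point (empty set if none)."""
    return CROSS_SCRIPT_VARIANTS.get(code_point, set())

for bangla, others in CROSS_SCRIPT_VARIANTS.items():
    names = sorted(unicodedata.name(ch) for ch in others)
    print(unicodedata.name(bangla), "->", names)
```

Note that the Gurmukhī counterpart of ম is the letter SA (U+0A38), which resembles it in shape, not the Gurmukhī MA.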
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Bangla (footnote 12) script. The rules have been drafted in such a way that they can be easily translated into the LGR specification. Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories mentioned in the table provided in the code point repertoire (Section 5.1).
C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1 or S2 (from Table 9), or (a/e) Ya-phalā (V1 H C1 M1), where
    V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E)
    H is 09CD (্ - BENGALI SIGN VIRAMA)
    C1 is 09AF (য - BENGALI LETTER YA)
    M1 is 09BE (া - BENGALI VOWEL SIGN AA)
    S1 and S2 are valid even though they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where
    C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - Assamese letter RA; Unicode name BENGALI LETTER RA WITH MIDDLE DIAGONAL)
    H is 09CD (্ - BENGALI SIGN VIRAMA)
Table 16 - Symbols used in WLE rules
12 As used by Unicode, denoting and including both Assamese and Maṇipuri.
It is also perhaps appropriate to mention here that in Bangla the consonant letters (or graphemes) are physically joined to form “clusters”, which can theoretically conjoin two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are “nearly 380 unique consonant clusters”, of which bi-consonantal combinations number 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, pg. 4]. More details of such combinations can be seen in Pabitra Sarkar (1993) [135].
7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড় ঢ় য়
2. H must be preceded by C
   Example: ক্
3. M must be preceded by C
   Example: কা
4. D must be preceded by either of V, C or M
   Example: আঁ খঁ খাঁ হাঁ
5. X must be preceded by either of V, C, M or D
   Example: উঃ খঃ বাঃ দঁঃ
6. B must be preceded by either of V, C, M or D
   Example: আং ইং কং
7. Z must be preceded by V, C, M, D, B, X or P
   Example: ইৎকৎাৎাৎপ6ৎrৎ (S is not listed because S ends with M; Z may also follow S)
8. V CANNOT be preceded by H (details in 7.1.1, Case of V preceded by H)
9. S CANNOT be preceded by H
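The nine rules above can be approximated by a small checker, sketched below. It is not the normative LGR XML. The category table is built from the code point ranges of the repertoire (the code points for Anusvāra U+0982, Candrabindu U+0981 and Visarga U+0983 are taken from Unicode), and the function names, the treatment of S as behaving like M for whatever follows it, and the rejection of Z at the start of a label are assumptions made for illustration.

```python
NUKTA_FORMS = {"\u09a1\u09bc", "\u09a2\u09bc", "\u09af\u09bc"}          # CN: ড় ঢ় য়
S_SEQUENCES = {"\u0985\u09cd\u09af\u09be", "\u098f\u09cd\u09af\u09be"}  # S1, S2
RA = {"\u09b0", "\u09f0"}                                               # C2 of rule P

CATEGORY = {}

def _add(cat, *spans):
    """Register every code point in the given inclusive ranges under one category."""
    for lo, hi in spans:
        for cp in range(lo, hi + 1):
            CATEGORY[chr(cp)] = cat

_add("V", (0x0985, 0x098B), (0x098F, 0x0990), (0x0993, 0x0994))
_add("C", (0x0995, 0x09A8), (0x09AA, 0x09B0), (0x09B2, 0x09B2),
     (0x09B6, 0x09B9), (0x09F0, 0x09F1))
_add("M", (0x09BE, 0x09C4), (0x09C7, 0x09C8), (0x09CB, 0x09CC))
_add("D", (0x0981, 0x0981))
_add("B", (0x0982, 0x0982))
_add("X", (0x0983, 0x0983))
_add("H", (0x09CD, 0x09CD))
_add("Z", (0x09CE, 0x09CE))

# Rules 2-7: which category may immediately precede each dependent sign.
ALLOWED_BEFORE = {
    "H": {"C"},
    "M": {"C"},
    "D": {"V", "C", "M", "S"},
    "X": {"V", "C", "M", "D", "S"},
    "B": {"V", "C", "M", "D", "S"},
    "Z": {"V", "C", "M", "D", "B", "X", "S"},
}

def tokenize(label):
    """Split a label into (category, text) tokens; None if a code point is outside the repertoire."""
    tokens, i = [], 0
    while i < len(label):
        if label[i:i + 4] in S_SEQUENCES:          # S1 / S2 are taken as one unit
            tokens.append(("S", label[i:i + 4])); i += 4
        elif label[i:i + 2] in NUKTA_FORMS:        # rule 1: normalized nukta forms count as C
            tokens.append(("C", label[i:i + 2])); i += 2
        elif label[i] in CATEGORY:
            tokens.append((CATEGORY[label[i]], label[i])); i += 1
        else:
            return None
    return tokens

def is_valid(label):
    tokens = tokenize(label)
    if not tokens:
        return False
    prev = None
    for idx, (cat, _text) in enumerate(tokens):
        if cat in ALLOWED_BEFORE:
            ok = prev in ALLOWED_BEFORE[cat]
            if cat == "Z" and prev == "H":         # rule 7 with P: only ra + Hasanta before Z
                ok = tokens[idx - 2][1] in RA
            if not ok:
                return False
        elif cat in ("V", "S") and prev == "H":    # rules 8 and 9
            return False
        prev = cat
    return True

print(is_valid("\u09ad\u09b0\u09cd\u09ce\u09b8\u09a8\u09be"))  # ভর্ৎসনা
print(is_valid("\u0995\u0982\u0983"))                          # কংঃ
```

Tokenizing S1 and S2 first is what lets them pass even though, taken code point by code point, they would violate rule 2.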
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in allowing such ligatures, because any user can easily create them even though they may not be attested combinations.
Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া bæŋkʌv ɪndiə (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE) (meaning ‘Bank of India’). This is a case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the intended sound. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, a space or the Zero Width Non-Joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise V is never required to be allowed to follow an H. Permitting this may create a perceptual similarity between two labels (with and without H) for the majority of the linguistic communities; hence it is explicitly prohibited by the NBGP.
In future, if required, depending on the prevailing requirements from the community, a future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules the ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.
1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Example:
্ক ्क
াক ाक
ংক ंक
ঁক ँक
ঃক ःक
As can be seen, such a combination automatically results in a “golu”, or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian-language syllable and is quasi-automatically applied wherever supported by the OS.
2. H is not permitted after V, B, D, X, M, S.
Example:
অ্ अ्
অং্ कं्
কঁ্ कँ्
কঃ্ कः्
কি্ कि्
3. The number of B, D or X permitted after a Consonant, a Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalid.
Example:
কংং कंं
কঁঁ कँँ
কঃঃ कःः
কাঁঁ काँँ
কীঃঃ कीःः
অংং अंं
অঁঁ अँँ
অঃঃ अःः
4. The number of M permitted after a Consonant is restricted to one.
Example:
কীী कीी
5. M is not permitted after V.
Example:
ইা ঈৌ ईा ईौ
6. The combinations Anusvāra + Visarga and Visarga + Anusvāra are not permissible.
Example:
কংঃ कंः
কঃং कःं
8 Contributors
8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members
8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof. Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof. Rafiqul Islam, National Professor of Humanities, Dhaka
Prof. Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof. Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof. Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof. Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals’ Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje; Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. ‘Bangla Harapher Panch Parba’. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the MS University of Baroda, Social Science Number 292173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25/11/2017.
[124] Ethnologue: Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25/11/2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20/10/2019.
[126] Wikipedia: Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25/11/2017.
[129] Omniglot: Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10/5/2018.
[130] Wikipedia: Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25/11/2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20, accessed on 10/5/2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10/5/2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10/5/2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1123-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 121-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). ‘Srīhatta-Nāgarī Lipir Utpatti o Bikāś’. Bangla Academy Patrika (Dhaka), Vol 12 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia: Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19/5/2018.
[139] Furui, Ryosuke (2015). ‘Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century’, in Bhairabi Prasad Sahu & Hermann Kulke, eds. Interrogating Political Systems: Integrative Processes and States in Pre-Modern India. Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). ‘Phones of Bengali’. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646. UNICODE. https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia: Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. ‘Bangla Spelling Reform: the Long and Short of It’. Bangla Journal 19215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. “What has happened So Far In terms of Script Reforms”. Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka on 10/07/2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 - Core Specification, Chapter 12, p. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language or script certain restriction rules apply. In other words, in a given language some of the structures of the formalism do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla (footnote 13) in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold ‘varga’ (as defined under Table 5). Bangla does not normally allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem ‘Lichu chor’ written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2. CH can come with Khanda Ta only in the case where C is ra (র) (09B0): র্ৎ, as in ভর্ৎসনা.
3. Only the following combinations with VHCM will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
10.2 ‘Sylheti Nagarī lipi’ or ‘Siloti’
This version of the Bangla script resembles the ‘Kaithī’ script (ISO 12954) used by accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, which was widely in use during the 1880s. There were several other names for Sylheti Nagarī or Siloti (129), such as ‘Jalalabada Nagarī’, ‘Fula (flower) Nagarī’, ‘Muslim Nagarī’ or ‘Muhammad Nagarī’. It is said that Shah Jalal brought the script with him to Sylhet in the 13th-14th century (138), although some have suggested that it was an invention of the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script, with 32 symbols, are reproduced here (138).
Table 17 - The Script Table of Sylheti Nagarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla    Devanāgarī    NBGP Decision
ঃ U+0983    ः U+0903    Confusable
ও U+0993    उ U+0909    Confusable
ঘ U+0998    घ U+0918    Confusable
ঁ U+0981    ॅ U+0945    Confusable
Table 18: Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhī
Bangla    Gurmukhī    NBGP Decision
ঘ U+0998    ਬ U+0A2C    Confusable
ঁ U+0981    ੱ U+0A71    Confusable
Table 19: Bangla and Gurmukhī confusable code points
Bangla    Gurmukhī    NBGP Decision
ও U+0993    ਤ U+0A24    Distinguishable
শ U+09B6    ਅ U+0A05    Distinguishable
ম U+09AE    ਮ U+0A2E    Distinguishable
বা U+09AC and U+09BE    ਗ U+0A17    Distinguishable
Table 20 - Bangla and Gurmukhī distinguishable code points
10.3.3 Bangla and Oriya (Odia)
Bangla    Oriya (Odia)    NBGP Decision
ও U+0993    ଓ U+0B13    Confusable
Table 21 - Bangla and Oriya confusable code points
Bangla    Oriya (Odia)    NBGP Decision
ঘ U+0998    ସ U+0B38    Distinguishable
Table 22 - Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs
Consonants    Phonetic Value    Allographs
Clusters    Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ Thisletterdoesnothaveanyparticularphoneticvaluebutmostlypronouncedasn
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒThesecondarysymbol(allograph)jɔ-phalahastwophoneticvaluesWhenaddedtotheinitialconsonantinaworditisavowelaelig(asinশ8ামলর 8াপারetc)Butafteranon-initialconsonantitjustdoublesitinpronunciation(asinকায6ধায6etc)The+যcombinationhastwophysicalmanifestationsmdashর 8andয6
ক8(p+য)স8(+য)র 8(+য)[Justর 8isneverusedinBanglaorthographyর 8াisbutthenitslasttwosymbolsYa-phalaa-karaconstituteavowelsignrepresentingthevowelঅ8া]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ hwordfinallywordmediallyitdoublesthepronunciationofthefollowingconsonant
অঃকঃ
অব
No    Unicode Code Point    Glyph    Character Name    Category    Language(s) with EGIDS Value    References and Comment
8    U+0989    উ    BENGALI LETTER U    Vowel    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
9    U+098A    ঊ    BENGALI LETTER UU    Vowel    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
10    U+098B    ঋ    BENGALI LETTER VOCALIC R    Vowel    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
11    U+098F    এ    BENGALI LETTER E    Vowel    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
12    U+0990    ঐ    BENGALI LETTER AI    Vowel    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
13    U+0993    ও    BENGALI LETTER O    Vowel    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
14    U+0994    ঔ    BENGALI LETTER AU    Vowel    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
15    U+0995    ক    BENGALI LETTER KA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
16    U+0996    খ    BENGALI LETTER KHA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
17    U+0997    গ    BENGALI LETTER GA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
18    U+0998    ঘ    BENGALI LETTER GHA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
19    U+0999    ঙ    BENGALI LETTER NGA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
20    U+099A    চ    BENGALI LETTER CA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
21    U+099B    ছ    BENGALI LETTER CHA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
22    U+099C    জ    BENGALI LETTER JA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
23    U+099D    ঝ    BENGALI LETTER JHA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
24    U+099E    ঞ    BENGALI LETTER NYA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
25    U+099F    ট    BENGALI LETTER TTA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
26    U+09A0    ঠ    BENGALI LETTER TTHA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
27    U+09A1    ড    BENGALI LETTER DDA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
28    09A1 09BC (U+09DC)    ড়    Normalized form of BENGALI LETTER RRA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]. 09DC is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development.
29    U+09A2    ঢ    BENGALI LETTER DDHA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
30    09A2 09BC (U+09DD)    ঢ়    Normalized form of BENGALI LETTER RHA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]. 09DD is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development.
31    U+09A3    ণ    BENGALI LETTER NNA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
32    U+09A4    ত    BENGALI LETTER TA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
33    U+09A5    থ    BENGALI LETTER THA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
34    U+09A6    দ    BENGALI LETTER DA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
35    U+09A7    ধ    BENGALI LETTER DHA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
36    U+09A8    ন    BENGALI LETTER NA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
37    U+09AA    প    BENGALI LETTER PA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
38    U+09AB    ফ    BENGALI LETTER PHA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
39    U+09AC    ব    BENGALI LETTER BA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
40    U+09AD    ভ    BENGALI LETTER BHA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
41    U+09AE    ম    BENGALI LETTER MA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
42    U+09AF    য    BENGALI LETTER YA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
43    09AF 09BC (U+09DF)    য়    Normalized form of BENGALI LETTER YYA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]. 09DF is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development.
44    U+09B0    র    BENGALI LETTER RA    Consonant    1 Bangla, 2 Manipuri    [112], [125]
45    U+09B2    ল    BENGALI LETTER LA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
46    U+09B6    শ    BENGALI LETTER SHA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
47    U+09B7    ষ    BENGALI LETTER SSA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
48    U+09B8    স    BENGALI LETTER SA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
49    U+09B9    হ    BENGALI LETTER HA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
50    U+09BE    া    BENGALI VOWEL SIGN AA    Kāra (Mātrā)    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
51    U+09BF    ি    BENGALI VOWEL SIGN I    Kāra (Mātrā)    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
52    U+09C0    ী    BENGALI VOWEL SIGN II    Kāra (Mātrā)    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
53    U+09C1    ু    BENGALI VOWEL SIGN U    Kāra (Mātrā)    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
54    U+09C2    ূ    BENGALI VOWEL SIGN UU    Kāra (Mātrā)    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
55    U+09C3    ৃ    BENGALI VOWEL SIGN VOCALIC R    Kāra (Mātrā)    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
56    U+09C4    ৄ    BENGALI VOWEL SIGN VOCALIC RR    Kāra (Mātrā)    1 Bangla, 2 Assamese    [112], [122]
57    U+09C7    ে    BENGALI VOWEL SIGN E    Kāra (Mātrā)    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
58    U+09C8    ৈ    BENGALI VOWEL SIGN AI    Kāra (Mātrā)    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
59    U+09CB    ো    BENGALI VOWEL SIGN O    Kāra (Mātrā)    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
60    U+09CC    ৌ    BENGALI VOWEL SIGN AU    Kāra (Mātrā)    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
61    U+09CD    ্    BENGALI SIGN VIRAMA    Hasanta (= Halant), Virama (= Da ri)    1 Bangla, 2 Assamese, 2 Manipuri    [112], [122], [125]
62    U+09CE    ৎ    BENGALI LETTER KHANDA TA    Consonant    1 Bangla, 2 Manipuri, 2 Assamese    [112], [122], [125]
63    U+09F0    ৰ    BENGALI LETTER RA WITH MIDDLE DIAGONAL    Consonant    2 Assamese    [122]
64    U+09F1    ৱ    BENGALI LETTER RA WITH LOWER DIAGONAL    Consonant    2 Assamese, 2 Manipuri    [122], [125]
Table 8: Bangla Code-Point Repertoire
Apart from the above individual code-points the Neo-Brāhmī Generation Panel alsoproposes some specific sequences which enable conditional inclusion of the BanglaLETTER A and E followed by Bangla SIGN VIRAMA and Bangla LETTER YA againfollowed by Bangla VOWEL SIGN AA in the repertoire for enabling inclusion of aeligsoundasinEnglishlsquobatrsquolsquocatrsquoetc
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
S1 | U+0985 U+09CD U+09AF U+09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112], [122]
S2 | U+098F U+09CD U+09AF U+09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9: Sequences
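The two sequences in Table 9 can be written down directly from their code points. The following is a minimal sketch, not part of the LGR itself; the names `S1`, `S2` and `describe` are illustrative assumptions. It lists the Unicode character names of each sequence so they can be compared with the table.

```python
import unicodedata

# Code points copied from Table 9; the variable names are illustrative only.
S1 = "\u0985\u09CD\u09AF\u09BE"  # LETTER A + SIGN VIRAMA + LETTER YA + VOWEL SIGN AA
S2 = "\u098F\u09CD\u09AF\u09BE"  # LETTER E + SIGN VIRAMA + LETTER YA + VOWEL SIGN AA


def describe(sequence: str) -> list:
    """Return 'U+XXXX NAME' for every code point in the sequence."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in sequence]


for label, sequence in (("S1", S1), ("S2", S2)):
    print(label, sequence)
    for line in describe(sequence):
        print("   ", line)
```

Both sequences are already in NFC, so normalizing a label does not change them.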
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that have a place in Unicode but have not been included in the repertoire of this LGR proposal. ঌ (U+098C) and ৗ (U+09D7) are excluded because they are rare and obsolete characters.
Sr. No. | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10: Excluded Code Points
5.3 Code Point Not Used Alone
BENGALI SIGN NUKTA, U+09BC (see 3.3.6), is excluded from the repertoire as a stand-alone code point since it will never be used alone. It is used only as part of a sequence, in the normalized forms of three special characters: ড়, ঢ়, and য়.
Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ, and U+09AF য to form ড়, ঢ়, and য় respectively
Table 10b: Excluded Code Points
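The "normalized form" mentioned above can be illustrated with Python's standard `unicodedata` module. The three precomposed letters U+09DC, U+09DD, and U+09DF are composition exclusions in Unicode, so NFC turns each of them into a base letter followed by U+09BC. The helper name `nukta_well_placed` is an assumption made for this sketch.

```python
import unicodedata

NUKTA = "\u09BC"
NUKTA_BASES = {"\u09A1", "\u09A2", "\u09AF"}  # DDA, DDHA, YA

# Precomposed letters and the base + nukta sequences they normalize to.
PRECOMPOSED = {
    "\u09DC": "\u09A1\u09BC",  # RRA
    "\u09DD": "\u09A2\u09BC",  # RHA
    "\u09DF": "\u09AF\u09BC",  # YYA
}


def nukta_well_placed(label: str) -> bool:
    """True if every nukta in an NFC label directly follows DDA, DDHA or YA."""
    nfc = unicodedata.normalize("NFC", label)
    return all(
        i > 0 and nfc[i - 1] in NUKTA_BASES
        for i, ch in enumerate(nfc)
        if ch == NUKTA
    )


for composed, expected in PRECOMPOSED.items():
    assert unicodedata.normalize("NFC", composed) == expected
```

A label that contains a bare nukta, or a nukta after any other letter, fails the check.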
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20.11.2009 and 18.07.2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) begins with the following variables:
C → Consonant
V → Vowel
M → Kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta
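As a rough illustration, the variables can be mapped onto code points. The sets below are an assumption derived from the Unicode Bengali block and the exclusions in Section 5.2; Table 8 remains the authoritative list. The names `classify` and `class_string` are hypothetical helpers, not part of the LGR.

```python
from typing import Optional


def _chars(*codepoints: int) -> frozenset:
    return frozenset(chr(cp) for cp in codepoints)


# Assumed membership; U+098C is left out because Section 5.2 excludes it.
VOWELS = _chars(*range(0x0985, 0x098C), 0x098F, 0x0990, 0x0993, 0x0994)
CONSONANTS = _chars(*range(0x0995, 0x09A9), *range(0x09AA, 0x09B1), 0x09B2,
                    *range(0x09B6, 0x09BA), 0x09F0, 0x09F1)
MATRAS = _chars(*range(0x09BE, 0x09C5), 0x09C7, 0x09C8, 0x09CB, 0x09CC)
SINGLES = {"\u0982": "B", "\u0981": "D", "\u0983": "X",
           "\u09CD": "H", "\u09CE": "Z"}


def classify(ch: str) -> Optional[str]:
    """Return the ABNF variable for one code point, or None if unknown."""
    if ch in VOWELS:
        return "V"
    if ch in CONSONANTS:
        return "C"
    if ch in MATRAS:
        return "M"
    return SINGLES.get(ch)


def class_string(label: str) -> str:
    """Map a label to its variables; unknown code points become '?'."""
    return "".join(classify(ch) or "?" for ch in label)
```

For example, `class_string("\u09AC\u09BE\u0982\u09B2\u09BE")` (the word বাংলা) gives `"CMBCM"`. The nukta is deliberately not classified here, since it only occurs inside the three normalized letters.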
The Augmented Backus-Naur Formalism (ABNF) will use the following operators:
Sr. Number | Operator | Function
1 | "|" | Alternative
2 | "[ ]" | Optional
3 | "*" | Variable Repetition
4 | "( )" | Sequence Group
Table 11: The ABNF Formalism
5.4.2 The Vowel Sequence
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given. To facilitate understanding by users of other Brāhmī-derived scripts, equivalents in Devanāgarī are provided wherever necessary.
A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra (onuʃʃār) (B), a Candrabindu (D), or a Visarga (biʃɔrgo) (X). The number of D, B, or X that can follow a V in Bangla may not be restricted to one. Going by the rules illustrated in this document, it is clear that formations such as VDD, VBB, and VXX are invalid orthographic units. However, it is valid and possible to have sequences that combine a Candrabindu with an Anusvāra on the one hand, and a Candrabindu with a Visarga on the other, as in হ্যাঁংচা 'hænchā' and হ্যাঁঃ 'hæn' respectively. In other words, the possibility of a Visarga or an Anusvāra (onuʃʃār) following a Candrabindu exists in Bangla.
A vowel can optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phala. "Ya-phala is a presentation form of U+09AF য BENGALI LETTER YA. Represented by the sequence <U+09CD ্ BENGALI SIGN VIRAMA, U+09AF য BENGALI LETTER YA>, Ya-phala has a special form ্য. Again, when combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. 'aa' (ā)), it is used for transcribing [æ], as in the "a" in the English word "bat"", written in Bangla as ব্যাট.
A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example:
V: অ (अ)
5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples:
VB: অং (अं)
VD: অঁ (अँ)
VX: অঃ (अः)
VDB: অঁং (अँं)
VDX: অঁঃ (अँः)
VHCM: অ্যা, এ্যা
5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
VHCMB: অ্যাং, এ্যাং
VHCMD: অ্যাঁ, এ্যাঁ
VHCMX: অ্যাঃ, এ্যাঃ
VHCMDB: অ্যাঁং, এ্যাঁং
VHCMDX: অ্যাঁঃ, এ্যাঁঃ
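The vowel-sequence combinations listed above can be summarized as one pattern over the ABNF variables. The sketch below is an assumption-laden reading of Sections 5.4.2.1 to 5.4.2.3, not the normative rule: it works on strings of variable letters (for instance `"VDB"`) and does not check that the HCM part is the specific Ya-phalā + ā-kāra combination.

```python
import re

# V, optionally HCM, optionally one of B, D, X, DB, DX.
VOWEL_SEQUENCE = re.compile(r"V(HCM)?(D[BX]?|[BX])?")


def is_vowel_sequence(classes: str) -> bool:
    """True if the class string is one of the listed vowel sequences."""
    return VOWEL_SEQUENCE.fullmatch(classes) is not None


valid = ["V", "VB", "VD", "VX", "VDB", "VDX", "VHCM", "VHCMDX"]
invalid = ["VDD", "VBB", "VXX", "VBD", "VH"]
assert all(is_vowel_sequence(s) for s in valid)
assert not any(is_vowel_sequence(s) for s in invalid)
```

Using `fullmatch` rather than `match` matters here, because a prefix such as `VB` inside `VBB` would otherwise be accepted.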
5.4.3 The Consonant Sequence
5.4.3.1 A Single Consonant (C)
Example:
C: ক (क)
5.4.3.2 A Consonant with Conditions
A Consonant can optionally be followed by a dependent vowel sign, kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
CM: কি (कि)
CB: কং (कं)
CD: কঁ (कँ)
CX: কঃ (कः)
CH: ক্ (क्) (pure consonant)
CDB: কঁং (कँं)
CDX: কঁঃ (कँः)
5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB, or DX.
Examples:
CMB: কীং (कीं)
CMD: কাঁ (काँ)
CMX: বীঃ (वीः)
CMDB: কাঁং (काँं)
CMDX: কাঁঃ (काँः)
5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Examples:
CHC: ন্ত → ন + ্ + ত (न् + त)
CHCHC: ন্ত্র → ন + ্ + ত + ্ + র (न् + त् + र)
CHCHCHC: ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য (न् + त् + र् + य)
5.4.3.5 Subsets
While considering the subsets, we take the combination CHC as a representative example; the same applies equally to CHCHC and CHCHCHC.
[A] The combination may be followed by M, B, D, X, DB, or DX.
Examples:
CHCM: ক্কী → ক + ্ + ক + ী (क्की)
CHCB: ক্কং → ক + ্ + ক + ং (क्कं)
CHCD: ক্কঁ → ক + ্ + ক + ঁ (क्कँ)
CHCX: ক্কঃ → ক + ্ + ক + ঃ (क्कः)
CHCDB: ক্কঁং → ক + ্ + ক + ঁ + ং (क्कँं)
CHCDX: ক্কঁঃ → ক + ্ + ক + ঁ + ঃ (क्कँः)
[B] *3(CH)CM may further be followed by B, D, X, DB, or DX.
Examples:
CHCMB: ক্কীং → ক + ্ + ক + ী + ং (क्कीं)
CHCMD: ক্কাঁ → ক + ্ + ক + া + ঁ (क्काँ)
CHCMX: ক্কীঃ → ক + ্ + ক + ী + ঃ (क्कीः)
CHCMDB: ক্কাঁং → ক + ্ + ক + া + ঁ + ং (क्काँं)
CHCMDX: ক্কাঁঃ → ক + ্ + ক + া + ঁ + ঃ (क्काँः)
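The consonant-sequence cases can likewise be collapsed into a single pattern over the ABNF variables. This is a sketch under two assumptions: the repetition prefix is read as "zero to three CH pairs", and a trailing Hasanta is accepted only after a single consonant, since Section 5.4.3.5 does not list H among the marks that may follow a cluster.

```python
import re

# Either a pure consonant (CH), or up to three CH pairs, a final C,
# an optional M, and optionally one of B, D, X, DB, DX.
CONSONANT_SEQUENCE = re.compile(r"CH|(CH){0,3}CM?(D[BX]?|[BX])?")


def is_consonant_sequence(classes: str) -> bool:
    """True if the class string matches the consonant-sequence pattern."""
    return CONSONANT_SEQUENCE.fullmatch(classes) is not None


valid = ["C", "CM", "CH", "CDB", "CMDX", "CHC", "CHCHC", "CHCHCHC", "CHCMDB"]
invalid = ["CMM", "CBB", "CXB", "CHCHCHCHC", "CHCH", "M", "HC"]
assert all(is_consonant_sequence(s) for s in valid)
assert not any(is_consonant_sequence(s) for s in invalid)
```

The five-consonant string `CHCHCHCHC` is rejected because the pattern allows at most four consonants in one cluster.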
5.4.4 The Khanda-Ta Sequence
5.4.4.1 A Single 'Khanda'-Ta (Z)
Example:
Z: ৎ
5.4.4.2 A Khanda Ta Combination (see footnote 10)
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) 'scolding'.
Note: In this Khanda Ta context, the C must be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
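The note above amounts to a small context check on code points. The function below is a hypothetical helper written for illustration; it only tests the [CH]Z condition and says nothing about the other rules.

```python
RA_LETTERS = {"\u09B0", "\u09F0"}  # Bangla RA and Assamese RA
HASANTA = "\u09CD"
KHANDA_TA = "\u09CE"


def khanda_ta_after_hasanta_ok(label: str) -> bool:
    """True unless some Hasanta + Khanda Ta pair lacks a preceding RA."""
    for i, ch in enumerate(label):
        if ch == KHANDA_TA and i >= 1 and label[i - 1] == HASANTA:
            if i < 2 or label[i - 2] not in RA_LETTERS:
                return False
    return True


bhartsana = "\u09AD\u09B0\u09CD\u09CE\u09B8\u09A8\u09BE"  # ভর্ৎসনা
assert khanda_ta_after_hasanta_ok(bhartsana)
assert not khanda_ta_after_hasanta_ok("\u0995\u09CD\u09CE")  # KA + Hasanta + Khanda Ta
```

A Khanda Ta that follows a vowel or a consonant directly, with no Hasanta in between, passes this particular check.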
5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) can be described briefly here. Let us take up S first. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ, BENGALI LETTER A, and U+098F এ, BENGALI LETTER E). To render Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF (ya) preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the /æ/ sound, as in English 'bat', 'fat', etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant; only consonants can be marked with Hasanta. But as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
The other case, mentioned in the table as P, refers to the formation of repha and ra-phalā in the script. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra "cycle"). The point is that, in both cases, the slot for ra could be filled by Bangla ra র (U+09B0) or Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.
10 Refer to Rule P in Section 7, Table 16.
6 Variants
This section discusses the variants in the Bangla script. The NBGP categorizes these confusing variants into two groups:
Group 1: Confusing due to pure visual similarity.
Group 2: Confusing due to deviation from the character formations normally perceived by the larger linguistic community.
For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These are not cases of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formation. They can cause confusion even to a careful observer and are hence proposed as variants.
Variants arise in a script when two or more visually identical forms are stored with different code points. In Bangla, the e-kāra, the ā-kāra, and the o-kāra have different code points. One can type an o-kāra with a consonant in one go, or obtain the same result by typing e-kāra and ā-kāra as two separate keys. A reader cannot differentiate between the two forms of ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered a case of variants, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants
However, we propose two cases of true in-script variants in the Bangla script.
CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases wherein Hasanta with থ (tha, U+09A5) appears as a conjunct with স (sa, U+09B8) and ন (na, U+09A8):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, could be a little confusing, since the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial, or final position (as shown below in example no. 1). It could also be typed wrongly, in the belief that it was a হ (ha, U+09B9), increasing the risks in label writing and identification.
Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthūla, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)
Fonts which represent the traditional Bangla writing system tend to create this problem. Therefore, these may be taken as cases of variants in Bangla.
CASE II
Another interesting example of variants is encountered in ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese, and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra). 'Repha' could be formed by two sequences (mainly because both Assamese and Bangla are encoded in the same Unicode block, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here, the final ligatures look the same, and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C
where
B_RA = U+09B0 BENGALI LETTER RA (র), or
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)
Example:
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12: Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. Addition of Repha does not make any difference.
'Ra-phala' could be formed by two sequences on similar grounds, and the final ligatures would look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
where C1 = any consonant except Khanda-ta
Example:
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusability for end users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP concluded that র and ৰ should be defined as variant code points, so that a single variant set between র and ৰ covers all cases. This will, however, create blocked variant labels: for example, if someone registers "ররর", the variant label "ৰৰৰ" will be generated as a variant and will be blocked, and vice versa. The blocking applies only at the label level: if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, this is still possible.
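The label-level blocking described above can be sketched with an "index label" in which every ৰ is replaced by র. This is only one possible way to compute variant collisions and is an assumption of this example; the function names are hypothetical. Under this reading, mixed labels such as রৰর also collide with "ররর".

```python
# Map Assamese RA to Bangla RA so that variant labels share one index form.
RA_INDEX = str.maketrans({"\u09F0": "\u09B0"})


def index_label(label: str) -> str:
    """Return the label with every variant code point folded to one form."""
    return label.translate(RA_INDEX)


def is_blocked(candidate: str, registered: set) -> bool:
    """True if the candidate is a variant of an already registered label."""
    index = index_label(candidate)
    return any(index_label(existing) == index for existing in registered)


registered = {"\u09B0\u09B0\u09B0"}                       # ররর
assert is_blocked("\u09F0\u09F0\u09F0", registered)        # ৰৰৰ is blocked
assert not is_blocked("\u09F0\u09F0", registered)          # ৰৰ is still free
assert not is_blocked("\u09F0\u09F0\u09F0\u09F0", registered)
```

Because the comparison is done on whole labels, registering a three-letter label has no effect on labels of any other length.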
6.2 Cross-Script Variants
A concise cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī, and Odia (formerly Oriya; see footnote 11), keeping in mind the visual and technical confusion they may cause as labels in the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain other characters which are somewhat similar, they have not been included here; they are provided in the Appendix (10.3) for reference.
11 Unicode uses 'Oriya' for the script, although 'Odia' is now the official term.
1. Bangla and Nāgarī (Devanāgarī) Script
Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14: Bangla and Devanāgarī cross-script variant code points
2. Bangla and Gurmukhī Script
Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15: Bangla and Gurmukhī cross-script variant code points
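Tables 14 and 15 can be held in a small lookup structure. The sketch below simply records the pairs listed in the two tables, keyed by the Bangla code point; the dictionary layout and the function name are assumptions made for illustration.

```python
# Cross-script variant pairs copied from Tables 14 and 15.
CROSS_SCRIPT_VARIANTS = {
    "\u09AE": {"Devanagari": "\u092E", "Gurmukhi": "\u0A38"},  # LETTER MA
    "\u09BF": {"Devanagari": "\u093F", "Gurmukhi": "\u0A3F"},  # VOWEL SIGN I
}


def cross_script_variants(ch: str) -> set:
    """Return the code points in other scripts listed as variants of ch."""
    return set(CROSS_SCRIPT_VARIANTS.get(ch, {}).values())


assert cross_script_variants("\u09AE") == {"\u092E", "\u0A38"}
assert cross_script_variants("\u0995") == set()
```

Any Bangla code point not present in the tables returns an empty set, meaning no cross-script variant is proposed for it here.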
7 Whole Label Evaluation Rules (WLE)
This section provides the WLE rules that are required by all the languages mentioned in Section 3.2 when written in the Bangla script (see footnote 12). The rules have been drafted in such a way that they can be easily translated into the LGR specifications.
Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories mentioned in the table provided under Code point repertoire (Section 5.1).
C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), or (a/e) Ya-phalā (V1 H C1 M1), where V1 is either U+0985 (অ, BENGALI LETTER A) or U+098F (এ, BENGALI LETTER E); H is U+09CD (্, BENGALI SIGN VIRAMA); C1 is U+09AF (য, BENGALI LETTER YA); M1 is U+09BE (া, BENGALI VOWEL SIGN AA). S1 and S2 are valid even though they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either U+09B0 (র, BENGALI LETTER RA) or U+09F0 (ৰ, Assamese letter RA; Unicode name BENGALI LETTER RA WITH MIDDLE DIAGONAL); H is U+09CD (্, BENGALI SIGN VIRAMA).
Table 16: Symbols used in WLE rules
12 As used by Unicode, denoting and including both Assamese and Maṇipuri.
It is also perhaps ideal to mention here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters" that could theoretically conjoin two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", of which bi-consonantal combinations number 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, p. 4]. More details of such combinations can be found in Pabitra Sarkar (1993) [135].
7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়.
2. H must be preceded by C. Example: ক্
3. M must be preceded by C. Example: কা
4. D must be preceded by either of V, C, or M. Example: আঁ, খঁ, খাঁ, হাঁ
5. X must be preceded by either of V, C, M, or D. Example: উঃ, খঃ
6. B must be preceded by either of V, C, M, or D. Example: আং, ইং, কং
7. Z must be preceded by V, C, M, D, B, X, or P. Example: ইৎ, কৎ (S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V Preceded by H.
9. S CANNOT be preceded by H.
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese, and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in admitting such ligatures: any user can easily create them, even though they may not be attested combinations.
7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where a V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া /bæŋk ʌv ɪndiə/ (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE), meaning 'Bank of India'. This is a case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the intended sound. By and large, however, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation, necessitated by the lack of a hyphen, space, or the Zero Width Non-Joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceived similarity between two labels (with and without H) for the majority of the linguistic communities; hence it is explicitly prohibited by the NBGP. In future, depending on the prevailing requirements from the community, a future NBGP may consider revisiting this rule.
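The nine rules of Section 7.1 can be read as "what may precede what". The following is a simplified sketch under stated assumptions, not the normative LGR: labels are assumed to be in NFC, the code point sets are derived from the Unicode Bengali block rather than copied from Table 8, and all names (`tokenize`, `is_valid`, and so on) are hypothetical.

```python
VOWELS = set("\u0985\u0986\u0987\u0988\u0989\u098A\u098B\u098F\u0990\u0993\u0994")
CONSONANTS = {chr(c) for c in range(0x0995, 0x09BA)
              if c not in (0x09A9, 0x09B1, 0x09B3, 0x09B4, 0x09B5)} | {"\u09F0", "\u09F1"}
MATRAS = set("\u09BE\u09BF\u09C0\u09C1\u09C2\u09C3\u09C4\u09C7\u09C8\u09CB\u09CC")
SINGLES = {"\u0982": "B", "\u0981": "D", "\u0983": "X", "\u09CD": "H", "\u09CE": "Z"}
NUKTA, NUKTA_BASES = "\u09BC", set("\u09A1\u09A2\u09AF")
RA = {"\u09B0", "\u09F0"}
S_SEQUENCES = ("\u0985\u09CD\u09AF\u09BE", "\u098F\u09CD\u09AF\u09BE")

# Rules 2 to 7: classes that may directly precede each mark (P is handled below).
ALLOWED_BEFORE = {"H": "C", "M": "C", "D": "VCM", "X": "VCMD", "B": "VCMD", "Z": "VCMDBX"}


def tokenize(label):
    """Split a label into (class, text) pairs; S1/S2 and C + nukta are single tokens."""
    tokens, i = [], 0
    while i < len(label):
        if label.startswith(S_SEQUENCES, i):
            tokens.append(("S", label[i:i + 4]))
            i += 4
        elif label[i] in NUKTA_BASES and label[i + 1:i + 2] == NUKTA:
            tokens.append(("C", label[i:i + 2]))  # rule 1: normalized forms count as C
            i += 2
        else:
            ch = label[i]
            cls = ("V" if ch in VOWELS else "C" if ch in CONSONANTS
                   else "M" if ch in MATRAS else SINGLES.get(ch, "?"))
            tokens.append((cls, ch))
            i += 1
    return tokens


def is_valid(label):
    """Apply rules 2 to 9 to an NFC label; returns False for an empty label."""
    tokens = tokenize(label)
    previous = None  # class of the previous token; S counts as M because it ends with M
    for k, (cls, _text) in enumerate(tokens):
        if cls == "?":
            return False
        if cls in ALLOWED_BEFORE:
            ok = previous is not None and previous in ALLOWED_BEFORE[cls]
            if cls == "Z" and previous == "H":  # rule 7, case P: RA + Hasanta
                ok = k >= 2 and tokens[k - 2][1] in RA
            if not ok:
                return False
        elif cls in ("V", "S") and previous == "H":  # rules 8 and 9
            return False
        previous = "M" if cls == "S" else cls
    return bool(tokens)
```

With these assumptions, `is_valid("\u09AC\u09BE\u0982\u09B2\u09BE")` (বাংলা) holds, while the 'Bank of India' label from Section 7.1.1 is rejected because a vowel follows a Hasanta.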
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules that the ABNF puts in place. They are given for reference purposes only and are not meant to be comprehensive.
1. H, M, B, D, or X cannot occur at the beginning of a Bangla word.
Examples:
্ক (्क)
াক (ाक)
ংক (ंक)
ঁক (ँक)
ঃক (ःक)
As can be seen, such combinations automatically result in a "golu" or dotted circle, marking them as invalid formations. This is an intrinsic property of the Indian-language syllable and is applied quasi-automatically wherever supported by the OS.
2. H is not permitted after V, B, D, X, M, S.
Examples:
অ্ (अ्)
অং্ (अं्)
কঁ্ (कँ्)
কঃ্ (कः्)
কি্ (कि्)
3. The number of B, D, or X permitted after a Consonant, a Vowel, or a kāra (Mātrā) is restricted to one; thus the following combinations are invalid.
Examples:
কংং (कंं)
কঁঁ (कँँ)
কঃঃ (कःः)
কাঁঁ (काँँ)
কীঃঃ (कीःः)
অংং (अंं)
অঁঁ (अँँ)
অঃঃ (अःः)
4. The number of M permitted after a Consonant is restricted to one.
Example:
কীী (कीी)
5. M is not permitted after V.
Examples:
ইা, ঈৌ (ईा, ईौ)
6. The combinations Anusvāra + Visarga and Visarga + Anusvāra are not permissible.
Examples:
কংঃ (कंः)
কঃং (कःं)
8 Contributors
8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members
8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals' Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. The Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata; New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje & Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. 'Bangla Harapher Panch Parba'. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and Language Planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii + 316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the M.S. University of Baroda, Social Science Number, 29.2: 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25.11.2017.
[124] Ethnologue: Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25.11.2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20.10.2019.
[126] Wikipedia: Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25.11.2017.
[129] Omniglot: Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10.5.2018.
[130] Wikipedia: Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25.11.2017.
[131] The EMILLE/CIIL Corpus. httpmetashareeldaorgrepositorybrowsethe-emilleciil-corpusabdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20, accessed on 10.5.2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10.5.2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10.5.2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1.1: 23-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1.2: 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). 'Srīhatta-Nāgarī Lipir Utpatti o Bikāś'. Bangla Academy Patrika (Dhaka), Vol. 1.2 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia: Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19.5.2018.
[139] Furui, Ryosuke (2015). 'Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century', in Bhairabi Prasad Sahu & Hermann Kulke, eds. Interrogating Political Systems: Integrative Processes and States in Pre-Modern India. Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). 'Phones of Bengali'. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A Phonetic and Phonological Study of Nasals and Nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646 (UNICODE). https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia: Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. 'Bangla Spelling Reform: the Long and Short of It'. Bangla Journal 19: 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. "What has happened So Far in terms of Script Reforms". Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka, on 10.07.2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 - Core Specification, Chapter 12, p. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language or script, certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla in particular (see footnote 13), the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold 'varga' (as defined under Table 5). Moreover, Bangla does not normally allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem 'Lichuchor' written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2. CH can come with Khanda Ta only in the case where C is ra (র) (U+09B0): র্ৎ, as in ভর্ৎসনা.
3. Only the following combinations with VHCM will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
10.2 'Sylheti Nāgarī lipi' or 'Siloti'
This version of the Bangla script resembles the 'Kaithī' script (ISO 12954) used by accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, which was widely in use during the 1880s. There were several other names for Sylheti Nāgarī or Siloti [129], such as 'Jalalabada Nāgarī', 'Fula (flower) Nāgarī', 'Muslim Nāgarī', or 'Muhammad Nāgarī'. It is said that Shah Jalal had brought the script with him to Sylhet in the 13th-14th century [138], although some have suggested that it was an invention of the Afghan rulers of Sylhet [137]. Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script, with 32 symbols, are reproduced here [138].
Table 17: The Script Table of Sylheti Nāgarī or Siloti
13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
10.3 Confusable Code Points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough so to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable
Table 18: Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhī
Bangla | Gurmukhī | NBGP Decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable
Table 19: Bangla and Gurmukhī confusable code points
Bangla | Gurmukhī | NBGP Decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable
Table 20: Bangla and Gurmukhī distinguishable code points
10.3.3 Bangla and Oriya (Odia)
Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable
Table 21: Bangla and Oriya confusable code points
Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable
Table 22: Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs
Consonants | Phonetic Value | Allographs: Clusters | Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ Thisletterdoesnothaveanyparticularphoneticvaluebutmostlypronouncedasn
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒThesecondarysymbol(allograph)jɔ-phalahastwophoneticvaluesWhenaddedtotheinitialconsonantinaworditisavowelaelig(asinশ8ামলর 8াপারetc)Butafteranon-initialconsonantitjustdoublesitinpronunciation(asinকায6ধায6etc)The+যcombinationhastwophysicalmanifestationsmdashর 8andয6
ক8(p+য)স8(+য)র 8(+য)[Justর 8isneverusedinBanglaorthographyর 8াisbutthenitslasttwosymbolsYa-phalaa-karaconstituteavowelsignrepresentingthevowelঅ8া]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ hwordfinallywordmediallyitdoublesthepronunciationofthefollowingconsonant
অঃকঃ
অব
No | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
15 | U+0995 | ক | BENGALI LETTER KA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
16 | U+0996 | খ | BENGALI LETTER KHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
17 | U+0997 | গ | BENGALI LETTER GA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
18 | U+0998 | ঘ | BENGALI LETTER GHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
19 | U+0999 | ঙ | BENGALI LETTER NGA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
20 | U+099A | চ | BENGALI LETTER CA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
21 | U+099B | ছ | BENGALI LETTER CHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
22 | U+099C | জ | BENGALI LETTER JA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
23 | U+099D | ঝ | BENGALI LETTER JHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
24 | U+099E | ঞ | BENGALI LETTER NYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
25 | U+099F | ট | BENGALI LETTER TTA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
26 | U+09A0 | ঠ | BENGALI LETTER TTHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
27 | U+09A1 | ড | BENGALI LETTER DDA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
28 | 09A1 09BC (U+09DC) | ড় | Normalized form of BENGALI LETTER RRA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]; 09DC is the preferred code point, however it is not available for the LGR as per the standards governing this LGR development
29 | U+09A2 | ঢ | BENGALI LETTER DDHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
30 | 09A2 09BC (U+09DD) | ঢ় | Normalized form of BENGALI LETTER RHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]; 09DD is the preferred code point, however it is not available for the LGR as per the standards governing this LGR development
31 | U+09A3 | ণ | BENGALI LETTER NNA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
32 | U+09A4 | ত | BENGALI LETTER TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
33 | U+09A5 | থ | BENGALI LETTER THA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
34 | U+09A6 | দ | BENGALI LETTER DA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
35 | U+09A7 | ধ | BENGALI LETTER DHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
36 | U+09A8 | ন | BENGALI LETTER NA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
37 | U+09AA | প | BENGALI LETTER PA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
38 | U+09AB | ফ | BENGALI LETTER PHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
39 | U+09AC | ব | BENGALI LETTER BA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
43 | 09AF 09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]; 09DF is the preferred code point, however it is not available for the LGR as per the standards governing this LGR development
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112][125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
50 | U+09BE | া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
51 | U+09BF | ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
52 | U+09C0 | ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
53 | U+09C1 | ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112][122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (= Halant), Virama (= Da ri) | 1 Bangla, 2 Assamese, 2 Manipuri | [112][122][125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112][122][125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122][125]
Table 8: Bangla Code-Point Repertoire
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes some specific sequences. These enable the conditional inclusion of BENGALI LETTER A and BENGALI LETTER E followed by BENGALI SIGN VIRAMA, BENGALI LETTER YA and BENGALI VOWEL SIGN AA in the repertoire, so that the æ sound (as in English 'bat', 'cat', etc.) can be represented.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not an exhaustive list) | Reference
S1 | 0985 09CD 09AF 09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112][122]
S2 | 098F 09CD 09AF 09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9: Sequences
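The two sequences in Table 9 can be assembled directly from their code points. The following is a minimal sketch, using only Python's standard `unicodedata` module; the names `SEQUENCES`, `sequence_text` and `sequence_names` are illustrative and not part of the proposal.

```python
import unicodedata

# Code point sequences copied from Table 9.
SEQUENCES = {
    "S1": (0x0985, 0x09CD, 0x09AF, 0x09BE),
    "S2": (0x098F, 0x09CD, 0x09AF, 0x09BE),
}

def sequence_text(code_points):
    """Return the string spelled by a tuple of code points."""
    return "".join(chr(cp) for cp in code_points)

def sequence_names(code_points):
    """Return the Unicode character name of every code point."""
    return [unicodedata.name(chr(cp)) for cp in code_points]

if __name__ == "__main__":
    for label, cps in SEQUENCES.items():
        print(label, sequence_text(cps), " + ".join(sequence_names(cps)))
```

Printing the names is a quick way to check that a sequence entered by hand matches the "Character Names" column of the table.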
5.2 Code Point Repertoire Exclusion
Some characters of the Bangla script are encoded in Unicode but have not been included in the repertoire of this LGR proposal. ঌ (U+098C) and ৗ (U+09D7) are excluded because they are rare and obsolete characters.
Sr. No. | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10: Excluded Code Points
5.3 Code Point Not Used Alone
BENGALI SIGN NUKTA, U+09BC (see Section 3.3.6), is excluded from the repertoire as an individual code point since it will never be used alone. It is used only within sequences, in the normalized forms of the three special characters ড়, ঢ় and য়.
Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ◌় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড়, ঢ় and য় respectively.
Table 10b: Excluded Code Points
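The statement that the nukta letters appear in the repertoire only as two-code-point sequences follows from Unicode normalization: U+09DC, U+09DD and U+09DF are excluded from composition, so NFC turns each of them into base letter plus U+09BC. A small sketch with the standard `unicodedata` module illustrates this; the dictionary name `NUKTA_LETTERS` is illustrative only.

```python
import unicodedata

# Precomposed letter -> normalized sequence, as listed in Table 8 (rows 28, 30, 43).
NUKTA_LETTERS = {
    "\u09DC": "\u09A1\u09BC",  # RRA  = DDA  + NUKTA
    "\u09DD": "\u09A2\u09BC",  # RHA  = DDHA + NUKTA
    "\u09DF": "\u09AF\u09BC",  # YYA  = YA   + NUKTA
}

def nfc(text):
    """Normalize text to Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text)

if __name__ == "__main__":
    for precomposed, sequence in NUKTA_LETTERS.items():
        code_points = " ".join(f"{ord(ch):04X}" for ch in nfc(precomposed))
        print(f"U+{ord(precomposed):04X} -> {code_points}")
```

Because the sequence form is already stable under NFC, a label processed in NFC only ever contains the decomposed forms.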
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20.11.2009 and 18.07.2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) begins with the following variables:
C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta
The ABNF uses the following operators:
Sr. Number   Operator   Function
1            "|"        Alternative
2            "[ ]"      Optional
3            "*"        Variable Repetition
4            "( )"      Sequence Group
Table 11: The ABNF Formalism
5.4.2 The Vowel Sequence
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given. To facilitate understanding by users of other Brāhmī scripts, equivalents in Devanāgarī are provided wherever necessary.
A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra (onuʃʃār) (B), a Candrabindu (D) or a Visarga (biʃɔrgo) (X). The number of D, B or X which can follow a V in Bangla may not be restricted to one. Going by the rules illustrated in this document, it is clear that formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have formations such as an anusvara combined with a chandrabindu on the one hand, and a visarga combined with a chandrabindu on the other, as in হ্যাংচা 'hænchā' and হ্যাঃ 'hæn' respectively. The possibility of a Visarga or Anusvāra (onuʃʃār) following a Candrabindu exists in Bangla.
A vowel can optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phala. Ya-phala is a presentation form of U+09AF, the Bangla letter য or 'ya', represented by the sequence <U+09CD ্ BENGALI SIGN VIRAMA (Hasanta), U+09AF য BENGALI LETTER YA>. Ya-phala has a special form ্য. When it is combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. 'aa' (ā)), it is used for transcribing [æ], as in the "a" in the English word "bat", written in Bangla as ব্যাট. A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example: V অ अ
5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], Candrabindu [D], Visarga [X], Candrabindu + Anusvāra [DB] or Candrabindu + Visarga [DX], or by a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples:
VB অং अं
VD অঁ अँ
VX অঃ अः
VDB অঁং अँं
VDX অঁঃ अँः
VHCM অ্যা এ্যা
5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], Candrabindu [D], Visarga [X], Candrabindu + Anusvāra [DB] or Candrabindu + Visarga [DX].
Examples:
VHCMB অ্যাং এ্যাং
VHCMD অ্যাঁ এ্যাঁ
VHCMX অ্যাঃ এ্যাঃ
VHCMDB অ্যাঁং এ্যাঁং
VHCMDX অ্যাঁঃ এ্যাঁঃ
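The vowel-sequence combinations of 5.4.2.1 to 5.4.2.3 can be sketched as a regular expression over the ABNF class symbols (V, H, C, M, B, D, X). This is only an illustration under stated assumptions: the input is a string of class symbols rather than Bangla text, the names `TAIL`, `VOWEL_SEQUENCE` and `is_vowel_sequence` are hypothetical, and the pattern encodes just the combinations listed in these subsections (at most one trailing group B, D, X, DB or DX).

```python
import re

# Optional trailing mark group: B, D, X, DB or DX (hypothetical helper name).
TAIL = r"(?:DB|DX|B|D|X)?"

# V, optionally followed by H C M (the VHCM case), optionally followed by TAIL.
VOWEL_SEQUENCE = re.compile(rf"V(?:HCM)?{TAIL}")

def is_vowel_sequence(symbols: str) -> bool:
    """Return True if the class-symbol string is one of the listed vowel sequences."""
    return VOWEL_SEQUENCE.fullmatch(symbols) is not None

if __name__ == "__main__":
    for candidate in ["V", "VB", "VDB", "VHCM", "VHCMDX", "VDD", "VBB", "VXX"]:
        print(candidate, is_vowel_sequence(candidate))
```

Working on class symbols keeps the sketch independent of how code points are assigned to classes, which the repertoire table in Section 5.1 defines.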
5.4.3 The Consonant Sequence
5.4.3.1 A Single Consonant (C)
Example: C ক क
5.4.3.2 A Consonant with Conditions
A Consonant can optionally be followed by a dependent vowel sign, kāra (Mātrā) [M], or by Anusvāra [B], Candrabindu [D], Visarga [X], Hasanta (also known as Virama) [H], Candrabindu + Anusvāra [DB] or Candrabindu + Visarga [DX].
Examples:
CM িকক -कक
CB কং कं
CD কঁ कँ
CX কঃ कः
CH ক্ क् (pure consonant)
CDB কঁং कँं
CDX কঁঃ कँः
5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB or DX.
Example CMB কীংকং कक
CMD কাঁ काँ
CMX বীঃ वीः
CMDB কাঁং काँं
CMDX কাঁঃ काँः
5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Examples:
CHC ন্ত → ন + ্ + ত न्त → न + ् + त
CHCHC ন্ত্র → ন + ্ + ত + ্ + র न्त्र → न + ् + त + ् + र
CHCHCHC ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য न्त्र्य → न + ् + त + ् + र + ् + य
5.4.3.5 Subsets
While considering the subsets, we take the combination CHC as a representative example; the same applies equally to CHCHC and CHCHCHC.
[A] The combination may be followed by M, B, D, X, DB or DX.
Examples:
CHCM ক্কী → ক ্ ক ী क्की → क ् क ी
CHCB ক্কং → ক ্ ক ং क्कं → क ् क ं
CHCD ক্কঁ → ক ্ ক ঁ क्कँ → क ् क ँ
CHCX ক্কঃ → ক ্ ক ঃ क्कः → क ् क ः
CHCDB ক্কঁং → ক ্ ক ঁ ং क्कँं → क ् क ँ ं
CHCDX ক্কঁঃ → ক ্ ক ঁ ঃ क्कँः → क ् क ँ ः
[B] *3(CH)CM may further be followed by B, D, X, DB or DX.
Examples:
CHCMB ক্কীং → ক ্ ক ী ং क्कीं → क ् क ी ं
sup3ং rarr ক ক ং 4क rarr क क
CHCMD ক্কাঁ → ক ্ ক া ঁ क्काँ → क ् क ा ँ
CHCMX ক্কীঃ → ক ্ ক ী ঃ क्कीः → क ् क ी ः
CHCMDB ক্কাঁং → ক ্ ক া ঁ ং क्काँं → क ् क ा ँ ं
CHCMDX ক্কাঁঃ → ক ্ ক া ঁ ঃ क्काँः → क ् क ा ँ ः
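The consonant-sequence combinations of 5.4.3.1 to 5.4.3.5 can be sketched in the same way, as a regular expression over class symbols. The pattern below is an assumption-laden illustration, not the normative ABNF: it accepts a single consonant with its listed followers, and a cluster of two to four consonants joined by H with an optional M and an optional trailing group. A trailing H after a cluster is not accepted, because these subsections do not list it. The names `TAIL`, `CONSONANT_SEQUENCE` and `is_consonant_sequence` are hypothetical.

```python
import re

# Optional trailing mark group: B, D, X, DB or DX (hypothetical helper name).
TAIL = r"(?:DB|DX|B|D|X)?"

CONSONANT_SEQUENCE = re.compile(
    rf"C(?:H|M?{TAIL})"            # single consonant: C, CH, CM, CB, ..., CMDX
    rf"|(?:CH){{1,3}}CM?{TAIL}"    # cluster of 2 to 4 consonants, then [M] and tail
)

def is_consonant_sequence(symbols: str) -> bool:
    """Return True if the class-symbol string is one of the listed consonant sequences."""
    return CONSONANT_SEQUENCE.fullmatch(symbols) is not None

if __name__ == "__main__":
    for candidate in ["C", "CH", "CMDB", "CHC", "CHCHCHC", "CHCMDX", "CHCHCHCHC", "CMM"]:
        print(candidate, is_consonant_sequence(candidate))
```

The upper bound of three `CH` groups is what limits a cluster to four consonants.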
5.4.4 The Khanda-Ta Sequence
5.4.4.1 A Single 'Khanda'-Ta (Z)
Example: Z ৎ
5.4.4.2 A Khanda Ta Combination (see footnote 10)
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) 'scolding'.
Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
10 Refer to Rule P in Section 7, Table 16.
5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) can be described briefly here. Let us take up S first. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ, BENGALI LETTER A, and U+098F এ, BENGALI LETTER E). For rendering Ya-phalā with অ and এ, it is necessary to type U+09CD plus U+09AF ya preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English 'bat', 'fat', etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can be marked with Hasanta. But as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
The other case, referred to as P in Table 16, concerns the formation of repha and ra-phalā in the said script. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALĀ). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra "cycle"). The point is that in both cases the slot for ra could be filled by the Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.
6 Variants
This section discusses the variants in the Bangla script. The NBGP categorizes confusable variants into two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formations. They can cause confusion even to a careful observer and are hence proposed as variants.
Variants are generated in a script when two or more forms are formed with different storage or code points. In Bangla, the e-kāra, ā-kāra and o-kāra have different code points. One can type o with a consonant at one go, or get the same result by typing e-kāra and ā-kāra as two separate keys. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered a case of variant, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants
We propose two cases of true in-script variants in the Bangla script.
CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases wherein Hasanta with (U+09A5) থ (tha) appears as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, could be a little confusing, as the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final position (as shown below in example 1). It could also be typed wrongly, in the belief that it was a হ (ha) U+09B9, increasing the risks in label writing and identification.
Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)
Fonts which represent the traditional Bangla writing system tend to create this problem. Therefore these may be taken as cases of variants in Bangla.
CASE II
Another interesting example of variants is encountered in ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra). 'Repha' could be formed by two sequences (mainly because both Assamese and Bangla are encoded in the same Unicode block, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here the final ligatures look the same, and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C
Where
B_RA = U+09B0 BENGALI LETTER RA (র)
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)
Example:
Sequence 1 (Using Bangla RA) | Ligature 1 | Sequence 2 (Using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12: Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. The addition of Repha does not make any difference.
'Ra-phala' could be formed by two sequences on similar grounds, and the final ligatures would look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
Where C1 = any consonant except Khanda-ta
Example:
Sequence 1 (Using Bangla RA) | Ligature 1 | Sequence 2 (Using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusability for end-users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP decided to define র and ৰ as variant code points, so that a single variant set between র and ৰ covers all cases. This will, however, create blocked variant labels: e.g. if someone registers "ররর", the variant label "ৰৰৰ" will be generated as a variant and will be blocked, and vice versa. The blocking applies only at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, that is still possible.
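The effect of treating র (U+09B0) and ৰ (U+09F0) as variant code points can be illustrated with a short sketch. It is not how any registry is known to implement the LGR; it assumes a simple "index label" approach in which every ৰ is mapped to র before comparison, and it covers only this one variant set (not the স্থ/স্হ and ন্থ/ন্হ cases of CASE I). All function names are hypothetical.

```python
from itertools import product

B_RA = "\u09B0"  # BENGALI LETTER RA
A_RA = "\u09F0"  # BENGALI LETTER RA WITH MIDDLE DIAGONAL

def index_label(label):
    """Map both RA forms to one representative so variant labels compare equal."""
    return label.replace(A_RA, B_RA)

def variant_labels(label):
    """All labels obtained by swapping RA forms in any position, excluding the label itself."""
    slots = [(B_RA, A_RA) if ch in (B_RA, A_RA) else (ch,) for ch in label]
    return {"".join(choice) for choice in product(*slots)} - {label}

def conflicts(candidate, registered):
    """True if the candidate is identical to, or a variant of, a registered label."""
    return index_label(candidate) in {index_label(r) for r in registered}

if __name__ == "__main__":
    registered = {B_RA * 3}                  # someone has registered "ররর"
    print(conflicts(A_RA * 3, registered))   # variant label is blocked
    print(conflicts(A_RA * 2, registered))   # a different label is still available
```

Note that mixed forms such as রৰর fall into the same variant set, which is why a three-letter label made only of RA has seven variant labels.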
6.2 Cross-Script Variants
A concise cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia (formerly Oriya; see footnote 11), keeping in mind the visual and technical confusion they may cause as labels in the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are proposed by the NBGP as variants. Certain other characters are somewhat similar, but they have not been included here; they are provided in the Appendix (10.2) for reference.
11 Unicode uses Oriya for the script, although Odia is now the official term used.
1. Bangla and Nāgarī/Devanāgarī Script
Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14: Bangla and Devanāgarī cross-script variant code points
2. Bangla and Gurmukhī Script
Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15: Bangla and Gurmukhī cross-script variant code points
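Because cross-script variants are defined by visual similarity rather than by sound, the paired code points need not be the "same" letter. A small sketch using the standard `unicodedata` module makes this visible by printing the character names for the pairs in Tables 14 and 15; the names `CROSS_SCRIPT_VARIANTS` and `describe` are illustrative only.

```python
import unicodedata

# (Bangla code point, other-script code point), copied from Tables 14 and 15.
CROSS_SCRIPT_VARIANTS = [
    (0x09AE, 0x092E),  # Table 14
    (0x09BF, 0x093F),  # Table 14
    (0x09AE, 0x0A38),  # Table 15
    (0x09BF, 0x0A3F),  # Table 15
]

def describe(pairs):
    """Return the Unicode names of both members of each variant pair."""
    return [(unicodedata.name(chr(a)), unicodedata.name(chr(b))) for a, b in pairs]

if __name__ == "__main__":
    for bangla_name, other_name in describe(CROSS_SCRIPT_VARIANTS):
        print(f"{bangla_name} <-> {other_name}")
```

The third pair shows the point: the Bangla letter MA is paired with the Gurmukhi letter SA, which it resembles, not with the Gurmukhi letter MA.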
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Bangla script (see footnote 12). The rules have been drafted in such a way that they can be easily translated into the LGR specifications.
Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories, as mentioned in the table provided in the code point repertoire (Section 5.1).
C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), or (ae) Ya-phalā (V1 H C1 M1), where V1 is either 0985 (অ, BENGALI LETTER A) or 098F (এ, BENGALI LETTER E); H is 09CD (্, BENGALI SIGN VIRAMA); C1 is 09AF (য, BENGALI LETTER YA); M1 is 09BE (া, BENGALI VOWEL SIGN AA). S1 and S2 are valid even if they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either 09B0 (র, BENGALI LETTER RA) or 09F0 (ৰ, ASSAMESE LETTER RA; Unicode name BENGALI LETTER RA WITH MIDDLE DIAGONAL); H is 09CD (্, BENGALI SIGN VIRAMA).
Table 16: Symbols used in WLE rules
12 As used by Unicode, denoting and including both Assamese and Maṇipuri.
It is also perhaps appropriate to mention here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters" that could theoretically conjoin two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", out of which bi-consonantal combinations number 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, p. 4]. More details of such combinations can be seen in Pabitra Sarkar (1993) [135].
7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1. C is the set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়.
2. H must be preceded by C. Example: ক্
3. M must be preceded by C. Example: কা
4. D must be preceded by either V, C or M. Example: আঁ খঁ খাঁ হাঁ
5. X must be preceded by either V, C, M or D. Example: উঃখঃবঃাঃ দ ঃ
6. B must be preceded by either V, C, M or D. Example: আং ইং কং
7. Z must be preceded by V, C, M, D, B, X or P. Example: ইৎকৎাৎাৎপ6ৎrৎ (S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V preceded by H.
9. S CANNOT be preceded by H.
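Rules 1 to 9 can be read as "what may stand immediately before each class". The sketch below checks a label against that reading. It is an illustration under stated assumptions, not the normative LGR: the consonant and kāra sets follow Table 8; the vowel set (U+0985 to U+098B, U+098F, U+0990, U+0993, U+0994) and the code points for B, D and X are assumed from the Unicode Bengali block, since those repertoire rows are listed earlier in the document; S is treated as ending in M; and all names are hypothetical.

```python
V = set(range(0x0985, 0x098C)) | {0x098F, 0x0990, 0x0993, 0x0994}   # assumed vowel set
C = (set(range(0x0995, 0x09A9)) | set(range(0x09AA, 0x09B1)) | {0x09B2}
     | set(range(0x09B6, 0x09BA)) | {0x09F0, 0x09F1})
M = set(range(0x09BE, 0x09C5)) | {0x09C7, 0x09C8, 0x09CB, 0x09CC}
SINGLE = {0x0982: "B", 0x0981: "D", 0x0983: "X", 0x09CD: "H", 0x09CE: "Z"}
NUKTA, NUKTA_BASES = 0x09BC, {0x09A1, 0x09A2, 0x09AF}                # rule 1 (CN)
RA = {0x09B0, 0x09F0}                                                # rule P
S_SEQUENCES = {(0x0985, 0x09CD, 0x09AF, 0x09BE), (0x098F, 0x09CD, 0x09AF, 0x09BE)}

# Rules 2 to 7: classes allowed immediately before each symbol.
ALLOWED_BEFORE = {"H": "C", "M": "C", "D": "VCM", "X": "VCMD", "B": "VCMD", "Z": "VCMDBX"}

def tokenise(label):
    """Split a label into (class symbol, first code point) tokens."""
    cps, tokens, i = [ord(ch) for ch in label], [], 0
    while i < len(cps):
        cp = cps[i]
        if tuple(cps[i:i + 4]) in S_SEQUENCES:
            tokens.append(("S", cp)); i += 4
        elif cp in NUKTA_BASES and cps[i + 1:i + 2] == [NUKTA]:
            tokens.append(("C", cp)); i += 2
        elif cp in C:
            tokens.append(("C", cp)); i += 1
        elif cp in V:
            tokens.append(("V", cp)); i += 1
        elif cp in M:
            tokens.append(("M", cp)); i += 1
        elif cp in SINGLE:
            tokens.append((SINGLE[cp], cp)); i += 1
        else:
            raise ValueError(f"U+{cp:04X} is not in the assumed repertoire")
    return tokens

def violations(label):
    """Return a list of human-readable rule violations (empty if none)."""
    problems, prev, prev2 = [], None, None
    for sym, cp in tokenise(label):
        before = prev[0] if prev else None
        before = "M" if before == "S" else before      # S ends with M
        if sym in ALLOWED_BEFORE:
            ok = before is not None and before in ALLOWED_BEFORE[sym]
            if sym == "Z" and before == "H":           # Z after P (RA + H)
                ok = prev2 is not None and prev2[0] == "C" and prev2[1] in RA
            if not ok:
                problems.append(f"{sym} may not follow {before or 'label start'}")
        elif sym in ("V", "S") and before == "H":      # rules 8 and 9
            problems.append(f"{sym} may not follow H")
        prev2, prev = prev, (sym, cp)
    return problems
```

For example, `violations("\u09AD\u09B0\u09CD\u09CE\u09B8\u09A8\u09BE")` (ভর্ৎসনা) returns an empty list, because the Khanda Ta follows RA + Hasanta, whereas the same check on ka + Hasanta + Khanda Ta reports one violation.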
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in allowing such ligatures, because any user can easily create them even though they may not be attested combinations.
7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া /bæŋk ʌv ɪndiə/ (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE) (meaning "Bank of India"). This is a case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of the hyphen, the space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceived similarity between two labels (with and without H) for the majority of the linguistic communities; hence it is explicitly prohibited by the NBGP.
In future, depending on the prevailing requirements from the community, a future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules the ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.
1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Example:
্ক ्क
াক ाक
ংক ंक
ঁক ँक
ঃক ःक
As can be seen, such a combination will automatically result in a "golu" or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.
2. H is not permitted after V, B, D, X, M, S.
Example:
অ अ
অং क ক क কঃ कः ি )क
3. The number of B, D or X permitted after a Consonant, a Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalid.
Example:
কংং कंं
কঁঁ कँँ
কঃঃ कःः
কা का
কীঃঃ कीःः
অংং अंं
অঁঁ अँँ
অঃঃ अःः
4. The number of M permitted after a Consonant is restricted to one.
Example:
কীী कीी
5. M is not permitted after V.
Example:
ইা ঈৗ ईा ईौ
6. The combinations Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Example:
কংঃ कंः
কঃং कःं
8 Contributors
8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members
8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals' Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje; Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920, 2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. 'Bangla Harapher Panch Parba'. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii + 316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the M.S. University of Baroda, Social Science Number, 29:2, 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL, 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25.11.2017.
[124] Ethnologue. Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25.11.2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20.10.2019.
[126] Wikipedia. Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25.11.2017.
[129] Omniglot. Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10.5.2018.
[130] Wikipedia. Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25.11.2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20/, accessed on 10.5.2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10.5.2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10.5.2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1(1): 23-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1(2): 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). 'Srīhatta-Nāgarī Lipir Utpatti o Bikāś'. Bangla Academy Patrika (Dhaka), Vol 12 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia. Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19.5.2018.
[139] Furui, Ryosuke (2015). 'Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century', in Bhairabi Prasad Sahu & Hermann Kulke, eds. Interrogating Political Systems: Integrative Processes and States in Pre-Modern India, Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). 'The Phonemes of Bengali'. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646 (UNICODE). https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia. Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. 'Bangla Spelling Reform: the Long and Short of It'. Bangla Journal 19: 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. "What has happened So Far In terms of Script Reforms". Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka on 10.07.2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 - Core Specification, Chapter 12, p. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language or script certain restriction rules apply. In other words, in a given language some of the formalism's structures do not necessarily apply. To take care of such cases, restriction rules are put in place. These restrictions help to fine-tune the ABNF. In the case of Bangla (see footnote 13) in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold 'varga' (as defined under Table 5). Moreover, Bangla generally does not allow ya (য়) at the beginning of a word either, but we can cite a couple of native examples, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem 'Lichuchor' written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning, as in য়াকুব. In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2. CH can come with Khanda Ta only in the case where C is ra (র) (09B0): র্ৎ, as in ভর্ৎসনা.
3. Only the following combinations with VHCM will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
10.2 'Sylheti Nagarī lipi' or 'Siloti'
This version of Bangla script resembles the 'Kaithī' script (ISO 15924) used by the Accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, widely in use during the 1880s. There were several other names of Sylheti Nagarī or Siloti (129), such as 'Jalalabada Nagarī', 'Fula (flower) Nagarī', 'Muslim Nagarī' or 'Muhammad Nagarī'. It is said that Shah Jalala had brought the script with him in the 13th-14th century to Sylhet (138), although some suggested that it was an invention by the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here (138).
Table 17 – The Script Table of Sylheti Nagarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable but not enough to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable
Table 18 Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhi
Bangla | Gurmukhi | NBGP decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable
Table 19 Bangla and Gurmukhi confusable code points
Bangla | Gurmukhi | NBGP decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable
Table 20 – Bangla and Gurmukhī distinguishable code points
10.3.3 Bangla and Oriya (Odia)
Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable
Table 21 – Bangla and Oriya confusable code points
Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable
Table 22 – Bangla and Oriya distinguishable code points
11 Appendix-II Bengali consonants and their allographs
Consonants | Phonetic Value | Allographs: Clusters | Allographs: Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (not privileged enough to have clusters as a first member) Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (not privileged enough to have clusters as a first member) egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value, but it is mostly pronounced as n
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒThesecondarysymbol(allograph)jɔ-phalahastwophoneticvaluesWhenaddedtotheinitialconsonantinaworditisavowelaelig(asinশ8ামলর 8াপারetc)Butafteranon-initialconsonantitjustdoublesitinpronunciation(asinকায6ধায6etc)The+যcombinationhastwophysicalmanifestationsmdashর 8andয6
ক8(p+য)স8(+য)র 8(+য)[Justর 8isneverusedinBanglaorthographyর 8াisbutthenitslasttwosymbolsYa-phalaa-karaconstituteavowelsignrepresentingthevowelঅ8া]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃকঃ
অব
No. | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
22 | U+099C | জ | BENGALI LETTER JA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
23 | U+099D | ঝ | BENGALI LETTER JHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
24 | U+099E | ঞ | BENGALI LETTER NYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
25 | U+099F | ট | BENGALI LETTER TTA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
26 | U+09A0 | ঠ | BENGALI LETTER TTHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
27 | U+09A1 | ড | BENGALI LETTER DDA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
28 | 09A1 09BC (U+09DC) | ড় | Normalized form of BENGALI LETTER RRA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125] 09DC is the preferred code point; however, it is not available for LGR as per the standards governing this LGR development
29 | U+09A2 | ঢ | BENGALI LETTER DDHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
30 | 09A2 09BC (U+09DD) | ঢ় | Normalized form of BENGALI LETTER RHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125] 09DD is the preferred code point; however, it is not available for LGR as per the standards governing this LGR development
31 | U+09A3 | ণ | BENGALI LETTER NNA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
32 | U+09A4 | ত | BENGALI LETTER TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
33 | U+09A5 | থ | BENGALI LETTER THA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
34 | U+09A6 | দ | BENGALI LETTER DA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
35 | U+09A7 | ধ | BENGALI LETTER DHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
36 | U+09A8 | ন | BENGALI LETTER NA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
37 | U+09AA | প | BENGALI LETTER PA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
38 | U+09AB | ফ | BENGALI LETTER PHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
39 | U+09AC | ব | BENGALI LETTER BA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
43 | 09AF 09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125] 09DF is the preferred code point; however, it is not available for LGR as per the standards governing this LGR development
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112] [125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
50 | U+09BE | া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
51 | U+09BF | ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
52 | U+09C0 | ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
53 | U+09C1 | ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112] [122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (= Halant), Virama (= Da ri) | 1 Bangla, 2 Assamese, 2 Manipuri | [112] [122] [125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122] [125]
Table 8 Bangla Code-Point Repertoire
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes some specific sequences which enable conditional inclusion of the Bangla LETTER A and E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, again followed by Bangla VOWEL SIGN AA, in the repertoire for enabling inclusion of the æ sound as in English 'bat', 'cat', etc.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
S1 | 0985 09CD 09AF 09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112] [122]
S2 | 098F 09CD 09AF 09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9 Sequences
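The two sequences of Table 9 can be spelled out code point by code point. The following is a minimal Python sketch, using only the standard `unicodedata` module, that builds S1 and S2 from the code points listed in the table and looks up the Unicode character name of each element. The helper name `describe` is illustrative and not part of the LGR.

```python
import unicodedata

# Code point sequences copied from Table 9.
S1 = "\u0985\u09CD\u09AF\u09BE"  # A + VIRAMA + YA + VOWEL SIGN AA
S2 = "\u098F\u09CD\u09AF\u09BE"  # E + VIRAMA + YA + VOWEL SIGN AA


def describe(sequence):
    """Return (code point, Unicode name) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in sequence]


if __name__ == "__main__":
    for label, seq in (("S1", S1), ("S2", S2)):
        print(label)
        for code_point, name in describe(seq):
            print("  ", code_point, name)
```

Both sequences differ only in their first element; the remaining three code points (Hasanta, YA, AA) are shared.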
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find a place in Unicode but have not been included in the repertoire in the LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.
Sr. No. | Code Points | Glyph | Character Names | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10 Excluded Code Points
5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire since it will never be used alone. It will be used as a sequence in three special characters in normalized form for ড় ঢ় য়.
Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ, U+09AF য to form ড় ঢ় য় respectively
Table 10b Excluded Code Points
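The effect of normalization on these three letters can be checked directly. The sketch below uses Python's standard `unicodedata` module; it relies on the fact that Unicode NFC leaves U+09DC, U+09DD and U+09DF in their decomposed (base letter + NUKTA) form. The function names `to_nfc` and `nukta_is_well_placed` are illustrative assumptions, not part of the LGR.

```python
import unicodedata

NUKTA = "\u09BC"  # BENGALI SIGN NUKTA

# Precomposed letter -> base letter that carries the NUKTA after normalization.
PRECOMPOSED_TO_BASE = {
    "\u09DC": "\u09A1",  # RRA -> DDA + NUKTA
    "\u09DD": "\u09A2",  # RHA -> DDHA + NUKTA
    "\u09DF": "\u09AF",  # YYA -> YA + NUKTA
}


def to_nfc(text):
    """Normalize text to NFC, the form in which the sequences are listed."""
    return unicodedata.normalize("NFC", text)


def nukta_is_well_placed(text):
    """True if every NUKTA directly follows DDA, DDHA or YA."""
    bases = set(PRECOMPOSED_TO_BASE.values())
    return all(
        i > 0 and text[i - 1] in bases
        for i, ch in enumerate(text)
        if ch == NUKTA
    )


if __name__ == "__main__":
    for precomposed, base in PRECOMPOSED_TO_BASE.items():
        print(f"U+{ord(precomposed):04X}", to_nfc(precomposed) == base + NUKTA)
```

A label typed with the precomposed letters therefore ends up containing the two-code-point sequences after normalization, which is why the repertoire lists the sequences rather than the single code points.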
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20.11.2009 and 18.07.2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) began with the following variables:
C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta
The Augmented Backus-Naur Formalism (ABNF) will use the following operators:
Sr. Number | Operator | Function
1 | "|" | Alternative
2 | "[]" | Optional
3 | "*" | Variable Repetition
4 | "()" | Sequence Group
Table 11 The ABNF Formalism
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given to facilitate understanding.
5.4.2 The Vowel Sequence
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given. To facilitate understanding for users of other Brāhmī scripts, equivalents in Devanāgarī are provided wherever necessary.
A vowel sequence is made up of a single vowel. It may be followed, but not necessarily (optionally), by an Anusvāra / onuʃʃār (B), Candrabindu (D) or a Visarga / biʃɔrgo (X). The number of D, B or X which can follow a V in Bangla may not be restricted to one. Going by the rules illustrated in the document, it is clear that formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have formations or sequences such as anusvara followed by a chandrabindu on one hand and visarga followed by a chandrabindu on the other, as in হ্যাংচা 'hænchā' and 'hæn' হ্যাঃ respectively. The possibility of a Visarga or Anusvāra (onuʃʃār) following a Candrabindu exists in Bangla.
A vowel can optionally be followed by a combination of Hasanta / Virama [H] and Consonant [C] to form a Ya-phala. Ya-phala is a presentation form of U+09AF Bangla letter য or 'ya', represented by the sequence <U+09CD BENGALI SIGN VIRAMA (Bangla SIGN Hasanta or VIRAMA), U+09AF য BENGALI LETTER YA>. Ya-phala has a special form য়. Again, when combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. 'aa' (ā)), it is used for transcribing [æ], as in the "a" in the English word "bat", written in Bangla as ব্যাট.
A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example: V অ अ
5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B] or Candrabindu [D] or Visarga [X] or Candrabindu + Anusvāra [DB] or Candrabindu + Visarga [DX], or by a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples:
VB অং अं
VD অঁ अँ
VX অঃ अः
VDB অঁং अँं
VDX অঁঃ अँः
VHCM অ্যা এ্যা
5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B] or Candrabindu [D] or Visarga [X] or Candrabindu + Anusvāra [DB] or Candrabindu + Visarga [DX].
Examples:
VHCMB অ্যাং এ্যাং
VHCMD অ্যাঁ এ্যাঁ
VHCMX অ্যাঃ এ্যাঃ
VHCMDB অ্যাঁং এ্যাঁং
VHCMDX অ্যাঁঃ এ্যাঁঃ
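The vowel sequence described in 5.4.2.1 to 5.4.2.3 can be written as a single regular expression. The following Python sketch is illustrative only: the vowel class is an assumption reconstructed from the Unicode Bengali block (U+0985 to U+098B, U+098F, U+0990, U+0993, U+0994), the VHCM part is limited to the two sequences S1 and S2 of Table 9, and the names `VOWEL_SEQUENCE` and `is_vowel_sequence` are hypothetical.

```python
import re

# Independent vowels (assumed class; U+098C is excluded from the repertoire).
V = "[\u0985-\u098B\u098F\u0990\u0993\u0994]"
# VHCM: LETTER A or E + Hasanta + YA + VOWEL SIGN AA (S1 and S2 of Table 9).
VHCM = "[\u0985\u098F]\u09CD\u09AF\u09BE"
# Optional Candrabindu (D), then optional Anusvara (B) or Visarga (X).
# This yields exactly: nothing, D, B, X, DB, DX.
TAIL = "\u0981?[\u0982\u0983]?"

VOWEL_SEQUENCE = re.compile(f"(?:{VHCM}|{V}){TAIL}")


def is_vowel_sequence(text):
    """True if the whole string is one vowel sequence as sketched above."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None


if __name__ == "__main__":
    samples = {
        "VB": "\u0985\u0982",
        "VDX": "\u0985\u0981\u0983",
        "VHCMX": "\u0985\u09CD\u09AF\u09BE\u0983",
        "VBB (invalid)": "\u0985\u0982\u0982",
    }
    for name, text in samples.items():
        print(name, is_vowel_sequence(text))
```

The `TAIL` pattern encodes the ordering constraint that Candrabindu comes before Anusvāra or Visarga, so doubled marks such as VBB or VXX never match.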
5.4.3 The Consonant Sequence
5.4.3.1 A Single Consonant (C)
Example: C ক क
5.4.3.2 A Consonant with Conditions
A Consonant optionally followed by dependent vowel sign kāra (Mātrā) [M] or Anusvāra [B] or Candrabindu [D] or Visarga [X] or Hasanta (also known as Virama) [H] or Candrabindu + Anusvāra [DB] or Candrabindu + Visarga [DX].
Example:
CM কি कि
CB কং कं
CD কঁ कँ
CX কঃ कः
CH ক্ क् (Pure consonant)
CDB কঁং कँं
CDX কঁঃ कँः
5.4.3.3 CM Sequence
A CM sequence can be optionally followed by B, D, X, DB or DX.
Example:
CMB কীং कीं
CMD কাঁ काँ
CMX বীঃ वीः
CMDB কাঁং काँं
CMDX কাঁঃ काँः
5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Example:
CHC ন্ত → ন + ্ + ত (न + ् + त)
CHCHC ন্ত্র → ন + ্ + ত + ্ + র (न + ् + त + ् + र)
CHCHCHC ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য (न + ् + त + ् + र + ् + य)
5.4.3.5 Subsets
While considering its subsets, as a representative example we will consider the combination CHC only; however, the same is equally applicable to CHCHC and CHCHCHC.
[A] The combination may be followed by M, B, D, X, DB or DX.
Example:
CHCM ক্কী → ক ্ ক ী (क्की → क ् क ी)
CHCB ক্কং → ক ্ ক ং (क्कं → क ् क ं)
CHCD ক্কঁ → ক ্ ক ঁ (क्कँ → क ् क ँ)
CHCX ক্কঃ → ক ্ ক ঃ (क्कः → क ् क ः)
CHCDB ক্কঁং → ক ্ ক ঁ ং (क्कँं → क ् क ँ ं)
CHCDX ক্কঁঃ → ক ্ ক ঁ ঃ (क्कँः → क ् क ँ ः)
[B] *3(CH)CM may further be followed by a B, D, X, DB or DX.
Example:
CHCMB ক্কীং → ক ্ ক ী ং (क्कीं → क ् क ी ं)
CHCMD ক্কাঁ → ক ্ ক া ঁ (क्काँ → क ् क ा ँ)
CHCMX ক্কীঃ → ক ্ ক ী ঃ (क्कीः → क ् क ी ः)
CHCMDB ক্কাঁং → ক ্ ক া ঁ ং (क्काँं → क ् क ा ँ ं)
CHCMDX ক্কাঁঃ → ক ্ ক া ঁ ঃ (क्काँः → क ् क ा ँ ः)
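The consonant sequence of 5.4.3 (up to four consonants joined by Hasanta, optionally followed by M and then by B, D, X, DB or DX) can likewise be sketched as a regular expression. The character classes below are assumptions reconstructed from the repertoire table and the Unicode Bengali block, the three NUKTA sequences are treated as consonants, and a trailing Hasanta is allowed after any cluster, which is a simplification of the text. The names used are hypothetical.

```python
import re

H = "\u09CD"  # Hasanta / Virama
# Normalized forms of RRA, RHA, YYA (base letter + NUKTA).
NUKTA_FORMS = "[\u09A1\u09A2\u09AF]\u09BC"
# Consonant letters in the repertoire (assumed ranges).
SIMPLE = "[\u0995-\u09A8\u09AA-\u09B0\u09B2\u09B6-\u09B9\u09F0\u09F1]"
C = f"(?:{NUKTA_FORMS}|{SIMPLE})"
# Dependent vowel signs (kara / matra).
M = "[\u09BE-\u09C4\u09C7\u09C8\u09CB\u09CC]"
# Optional Candrabindu, then optional Anusvara or Visarga.
TAIL = "\u0981?[\u0982\u0983]?"

# *3(CH) C, then either a final Hasanta or [M] [D] [B / X].
CONSONANT_SEQUENCE = re.compile(f"(?:{C}{H}){{0,3}}{C}(?:{H}|{M}?{TAIL})")


def is_consonant_sequence(text):
    """True if the whole string is one consonant sequence as sketched above."""
    return CONSONANT_SEQUENCE.fullmatch(text) is not None


if __name__ == "__main__":
    four = "\u09A8\u09CD\u09A4\u09CD\u09B0\u09CD\u09AF"  # NA H TA H RA H YA
    five = "\u0995\u09CD" * 4 + "\u0995"                 # five KAs joined by H
    print(is_consonant_sequence(four), is_consonant_sequence(five))
```

The `{0,3}` repetition is the regular-expression counterpart of the ABNF `*3(CH)`: at most three consonant-plus-Hasanta pairs before the final consonant.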
5.4.4 The Khanda-Ta sequence
5.4.4.1 A single 'Khanda'-Ta (Z)
Example: Z ৎ = ত্
5.4.4.2 A Khanda Ta Combination (see footnote 10)
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) 'scolding'.
Note: The conditions in this context of KHANDA TA are that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
10 Refer to Rule P in Section 7, Table 16.
5.4.5 Special Cases S and P
Two special cases involving Sequences (referred to as S and P in Table 16 under Section 7) could be described briefly here. Let us take up S in the first instance. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ - BANGLA LETTER A and U+098F এ - BANGLA LETTER E). For rendering Ya-phalā following অ and এ, it is necessary to type U+09CD plus U+09AF ya preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English 'bat', 'fat', etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as 'vowel killer', although it actually indicates absence of a vowel after the marked consonant. Only the consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in the orthography, wherein Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (Cf. Unicode 10.0, p. 473 [100]).
Another case refers to the formation of repha and ra-phalā in the said script, mentioned in the table above as P. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra "cycle"). The point is that in both cases the slot for ra could be Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā in both cases remain the same.
6 Variants
This section talks about the variants in the Bangla script. The NBGP categorizes these confusing variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (String similarity assessment panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formations. These can cause confusion even to a careful observer and hence are being proposed as variants.
The variants are generated in a script when two or more forms are formed with different storage or code points. In Bangla, the e-kāra, ā-kāra and the o-kāra have different code points. One can type o with a consonant at one go, or type e-kāra and ā-kāra as two separate keys, getting the same result. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered as a case of variant, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants
However, we propose two cases of true in-script variants in Bangla script.
CASE I
As far as true variants in Bangla are concerned, we may draw our attention to cases wherein Hasanta with (U+09A5) থ (tha) appears as conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, could be a little confusing where the থ (tha) in conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final positions (as shown below in example no. 1). It could be typed wrong as well, thinking it was a হ (ha) U+09B9, increasing the chances of risks in label writing and identification.
Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)
The fonts which represent the traditional Bangla writing system could tend to create this problem. Therefore, these may be taken as cases of variants in Bangla.
CASE II
Another interesting example of variant is encountered in ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra). 'Repha' could be formed by two sequences (mainly because both Assamese and Bangla find place in the same UNICODE points, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here the final ligatures look the same and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C
Where
B_RA = U+09B0 BENGALI LETTER RA (র)
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)
Example:
Sequence 1 (Using Bangla RA) | Ligature 1 | Sequence 2 (Using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12 Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. Addition of Repha does not make any difference.
'Ra-phala' could be formed by two sequences on similar grounds, and the final ligatures would look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
Where C1 = any consonant except Khanda-ta
Example:
Sequence 1 (Using Bangla RA) | Ligature 1 | Sequence 2 (Using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13 Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusability for end-users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP concluded to define র and ৰ as variant code points, where only one variant set between র and ৰ could cover all cases. But this will create blocked variant labels: e.g., if someone registers "ররর", the variant label "ৰৰৰ" will be generated as a variant and will be blocked, and vice versa. However, it is only blocked at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, it is still possible.
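The label-level blocking described above can be illustrated with a short Python sketch. It treats র (U+09B0) and ৰ (U+09F0) as one variant set and enumerates every label obtained by substituting one for the other at each position. Enumerating all mixed combinations is an assumption of this sketch (the text names only the fully substituted label), and the function names are hypothetical.

```python
from itertools import product

# The single variant set proposed for the two RA letters.
VARIANT_SET = ("\u09B0", "\u09F0")  # Bangla RA, Assamese RA


def variant_labels(label):
    """Return all variant labels of `label`, excluding the label itself."""
    choices = [VARIANT_SET if ch in VARIANT_SET else (ch,) for ch in label]
    return {"".join(combo) for combo in product(*choices)} - {label}


def conflicts(applied_for, registered):
    """True if `applied_for` is the registered label or one of its variants."""
    return applied_for == registered or applied_for in variant_labels(registered)


if __name__ == "__main__":
    registered = "\u09B0" * 3            # three Bangla RAs
    print(conflicts("\u09F0" * 3, registered))  # same length, all Assamese RA
    print(conflicts("\u09F0" * 2, registered))  # shorter label
    print(conflicts("\u09F0" * 4, registered))  # longer label
```

Because variants are computed position by position, a label of a different length can never be a variant of the registered one, which matches the remark that ৰৰ and ৰৰৰৰ remain available.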
6.2 Cross-Script Variants
A crisp cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia (formerly Oriya; see footnote 11), keeping in mind the visual and technical confusions they may cause as labels on the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain characters which are somewhat similar, they have not been included here. They have been provided in the Appendix (10.3) for reference.
11 Unicode uses Oriya for the script although Odia is now the official term used
1. Bangla and Nāgarī / Devanāgarī Script
Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14 - Bangla and Devanāgarī cross-script variant code points
2. Bangla and Gurmukhi Script
Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15 - Bangla and Gurmukhī cross-script variant code points
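Tables 14 and 15 amount to two small lookup tables. As a hedged illustration, the Python sketch below maps a Bangla label onto its cross-script look-alike when every code point in the label has a variant in the target script; the dictionary and function names are hypothetical, and only the four pairs listed in the tables are used.

```python
# Cross-script variant pairs copied from Tables 14 and 15.
TO_DEVANAGARI = {"\u09AE": "\u092E", "\u09BF": "\u093F"}  # MA, VOWEL SIGN I
TO_GURMUKHI = {"\u09AE": "\u0A38", "\u09BF": "\u0A3F"}    # MA -> SA, VOWEL SIGN I


def cross_script_variant(label, table):
    """Return the look-alike label in the other script, or None.

    A whole-label variant exists only if every code point of the
    label has a cross-script variant in `table`.
    """
    if label and all(ch in table for ch in label):
        return "".join(table[ch] for ch in label)
    return None


if __name__ == "__main__":
    label = "\u09AE\u09BF"  # Bangla MA + VOWEL SIGN I
    for name, table in (("Devanagari", TO_DEVANAGARI), ("Gurmukhi", TO_GURMUKHI)):
        variant = cross_script_variant(label, table)
        print(name, [f"U+{ord(ch):04X}" for ch in variant])
```

With only two code points per table, very few real labels are affected; a label containing any other Bangla code point has no cross-script variant under this sketch.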
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in Bangla (see footnote 12) Script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications.
Below are the symbols used in the WLE rules for each of the Indic Syllabic Category as mentioned in the table provided in Code point repertoire (Section 5.1):
C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), or (a/e) Ya-phalā (V1 H C1 M1), where V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E); H is 09CD (্ - BENGALI SIGN VIRAMA); C1 is 09AF (য - BENGALI LETTER YA); M1 is 09BE (া - BENGALI VOWEL SIGN AA). S1 and S2 are valid even if they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL); H is 09CD (্ - BENGALI SIGN VIRAMA).
Table 16 - Symbols used in WLE rules
12 As used by the Unicode, denoting and including both Assamese and Maṇipuri.
It is also perhaps ideal to mention here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters" that could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, Pg 4]. More details of such combinations could be seen in Pabitra Sarkar (1993) [135].
7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়.
2. H must be preceded by C. Example: ক্
3. M must be preceded by C. Example: কা
4. D must be preceded by either of V, C or M. Example: আঁ, খঁ, খাঁ
5. X must be preceded by either of V, C, M or D. Example: উঃ, খঃ
6. B must be preceded by either of V, C, M or D. Example: আং, ইং, কং
7. Z must be preceded by V, C, M, D, B, X or P. Example: ইৎ, কৎ (S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V preceded by H.
9. S CANNOT be preceded by H.
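The nine rules above are all statements about which class may come immediately before another, so they can be checked with a single left-to-right pass. The Python sketch below is a non-normative illustration: the class memberships are assumptions reconstructed from the repertoire table and the Unicode Bengali block, S and P are recognized greedily as multi-code-point tokens, and S is treated like M for whatever follows it (as the note under rule 7 suggests). All names are hypothetical.

```python
V = set("\u0985\u0986\u0987\u0988\u0989\u098A\u098B\u098F\u0990\u0993\u0994")
C = (
    {chr(cp) for cp in range(0x0995, 0x09A9)}
    | {chr(cp) for cp in range(0x09AA, 0x09B1)}
    | set("\u09B2\u09B6\u09B7\u09B8\u09B9\u09F0\u09F1")
)
M = set("\u09BE\u09BF\u09C0\u09C1\u09C2\u09C3\u09C4\u09C7\u09C8\u09CB\u09CC")
SINGLE = {"\u0982": "B", "\u0981": "D", "\u0983": "X", "\u09CD": "H", "\u09CE": "Z"}
NUKTA = "\u09BC"
NUKTA_BASES = set("\u09A1\u09A2\u09AF")      # rule 1: CN forms count as C
RA = set("\u09B0\u09F0")                     # C2 in symbol P
S_SEQUENCES = ("\u0985\u09CD\u09AF\u09BE", "\u098F\u09CD\u09AF\u09BE")

# Rules 2-7: classes allowed immediately before each symbol.
ALLOWED_BEFORE = {
    "H": {"C"},
    "M": {"C"},
    "D": {"V", "C", "M"},
    "X": {"V", "C", "M", "D"},
    "B": {"V", "C", "M", "D"},
    "Z": {"V", "C", "M", "D", "B", "X", "P"},
}


def tokenize(label):
    """Split a label into WLE symbols; '?' marks anything unclassified."""
    tokens, i = [], 0
    while i < len(label):
        ch = label[i]
        if label[i:i + 4] in S_SEQUENCES:
            tokens.append("S"); i += 4
        elif ch in NUKTA_BASES and label[i + 1:i + 2] == NUKTA:
            tokens.append("C"); i += 2
        elif ch in RA and label[i + 1:i + 2] == "\u09CD":
            tokens.append("P"); i += 2
        else:
            if ch in V:
                tokens.append("V")
            elif ch in C:
                tokens.append("C")
            elif ch in M:
                tokens.append("M")
            else:
                tokens.append(SINGLE.get(ch, "?"))
            i += 1
    return tokens


def is_valid(label):
    """Apply rules 2-9 to the token stream of a non-empty label."""
    prev = None
    for kind in tokenize(label):
        if kind == "?":
            return False
        if kind in ALLOWED_BEFORE and prev not in ALLOWED_BEFORE[kind]:
            return False
        if kind in ("V", "S") and prev in ("H", "P"):   # rules 8 and 9
            return False
        prev = "M" if kind == "S" else kind             # S ends with M
    return bool(label)
```

For example, `is_valid("\u0995\u09BE")` (ক + া) holds, while a label starting with a vowel sign or containing Visarga after Anusvāra is rejected. Because P is a token of its own, Khanda Ta after র্ or ৰ্ is accepted, while Khanda Ta after any other consonant plus Hasanta is not.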
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in adding such ligatures: any user can easily create them, even though they may not be attested combinations.
7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া bæŋk ʌv ɪndiə (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE) (meaning Bank of India). This is the case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic communities; hence this is explicitly prohibited by the NBGP.
In future, if required, depending on the prevailing requirements from the community, the future NBGP may consider revisiting this rule.
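The "Bank of India" example can be examined mechanically. The Python sketch below rebuilds the label from the code points quoted in the text and reports every position where an independent vowel directly follows Hasanta, which is the pattern rule 8 prohibits. The vowel set is an assumption taken from the Unicode Bengali block, and the function name is hypothetical.

```python
H = "\u09CD"  # BENGALI SIGN VIRAMA (Hasanta)
# Independent vowels (assumed class).
VOWELS = set("\u0985\u0986\u0987\u0988\u0989\u098A\u098B\u098F\u0990\u0993\u0994")

# Code points exactly as listed for the "Bank of India" example.
BANK_OF_INDIA = "".join(chr(cp) for cp in (
    0x09AC, 0x09CD, 0x09AF, 0x09BE, 0x0999, 0x09CD, 0x0995,
    0x0985, 0x09AB, 0x09CD,
    0x0987, 0x09A8, 0x09CD, 0x09A1, 0x09BF, 0x09DF, 0x09BE,
))


def vowels_after_hasanta(label):
    """Return the indices of vowels that directly follow a Hasanta."""
    return [
        i for i in range(1, len(label))
        if label[i] in VOWELS and label[i - 1] == H
    ]


if __name__ == "__main__":
    print(vowels_after_hasanta(BANK_OF_INDIA))
    # Dropping the Hasanta at index 9 gives the form that rule 8 accepts.
    without_h = BANK_OF_INDIA[:9] + BANK_OF_INDIA[10:]
    print(vowels_after_hasanta(without_h))
```

Only one position in the label is affected: the Hasanta that closes the middle word, immediately before the initial vowel of the last word.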
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules ABNF puts in place. These are just given for reference purposes and are not meant to be comprehensive.
1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Example:
্ক ्क
াক ाक
ংক ंक
ঁক ँक
ঃক ःक
As can be seen, such a combination will result automatically in a "golu" or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.
2. H is not permitted after V, B, D, X, M, S.
Example:
অ্ अ्
অং্ कं् কঁ্ कँ् কঃ্ कः् কি্ कि्
3. The number of B, D or X permitted after a Consonant or Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalidated.
Example:
কংং कंं
কঁঁ कँँ
কঃঃ कःः
কাঁঁ काँँ
কীঃঃ कीःः
অংং अंं
অঁঁ अँँ
অঃঃ अःः
4. The number of M permitted after a Consonant is restricted to one.
Example:
কীী कीी
5. M is not permitted after V.
Example:
ইা ঈৗ ईा ईौ
6. The combinations of Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Example:
কংঃ कंः
কঃং कःं
8 Contributors
8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members
8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals' Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje; Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. 'Bangla Harapher Panch Parba'. In Swapan Chakraborty, ed., Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the M.S. University of Baroda, Social Science Number, 29(2): 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds., Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25.11.2017
[124] Ethnologue. Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25.11.2017
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20.10.2019
[126] Wikipedia. Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25.11.2017
[129] Omniglot. Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10.5.2018
[130] Wikipedia. Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25.11.2017
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20/, accessed on 10.5.2018
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10.5.2018
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10.5.2018
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1(1): 23-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1(2): 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan. (1957). 'Srīhatta-Nāgarī Lipir Utpatti o Bikāś'. Bangla Academy Patrika (Dhaka), Vol. 1(2) (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia. Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19.5.2018
[139] Furui, Ryosuke. (2015). 'Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century', in Bhairabi Prasad Sahu & Hermann Kulke, eds., Interrogating Political Systems: Integrative Processes and States in Pre-Modern India, Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury. (1960). 'Phones of Bengali'. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad. (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka. (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul. (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646 (UNICODE). https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018
[145] Wikipedia. Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar. (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.). (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra. [2013]. 'Bangla Spelling Reform: the Long and Short of It'. Bangla Journal 19: 215-232.
[152] Bangla Academy. (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. "What has happened So Far In terms of Script Reforms". Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka, on 10.07.2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 – Core Specification, Chapter 12, p. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language or script certain restriction rules apply. In other words, some of the structures of the Formalism do not necessarily apply in a given language. Restriction rules are set in place to take care of such cases, and they help to fine-tune the ABNF. In the case of Bangla (see footnote 13) in particular, the following rules apply:
1. Khaṇḍa Ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold 'varga' (as defined under Table 5). Moreover, Bangla does not normally allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem 'Lichuchor' written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa Ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2. CH can combine with Khaṇḍa Ta only in the case where C is ra (র) (U+09B0): র্ৎ, as in ভর্ৎসনা.
3. Only the following combinations with VHCM will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
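The first restriction above can be illustrated with a short Python sketch. The names DISALLOWED_INITIAL and initial_restriction are hypothetical and are not part of the proposal; the sketch only covers the three letters that rule 1 bars from label-initial position and does not attempt the other rules.

```python
# Hypothetical helper illustrating rule 1 of this appendix only.
# The letters are those named in the rule: Khanda Ta, NYA and NGA.
DISALLOWED_INITIAL = {
    "\u09CE": "KHANDA TA",  # ৎ
    "\u099E": "NYA",        # ঞ
    "\u0999": "NGA",        # ঙ
}


def initial_restriction(label: str):
    """Return the name of the restricted letter that starts the label, else None."""
    if not label:
        return None
    return DISALLOWED_INITIAL.get(label[0])


# ৎসেরিং (Tsering) starts with Khanda Ta and is therefore flagged.
print(initial_restriction("\u09CE\u09B8\u09C7\u09B0\u09BF\u0982"))
# ভর্ৎসনা has Khanda Ta in medial position, which rule 1 does not restrict.
print(initial_restriction("\u09AD\u09B0\u09CD\u09CE\u09B8\u09A8\u09BE"))
```

A lookup on the first code point is sufficient here because the rule concerns only label-initial position.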
13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
10.2 'Sylheti Nagarī lipi' or 'Siloti'
This version of the Bangla script resembles the 'Kaithī' script (ISO 12954) used by accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, which was widely in use during the 1880s. There were several other names for Sylheti Nagarī or Siloti (129), such as 'Jalalabada Nagarī', 'Fula (flower) Nagarī', 'Muslim Nagarī' or 'Muhammad Nagarī'. It is said that Shah Jalala had brought the script with him to Sylhet in the 13th-14th century (138), although some have suggested that it was an invention of the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script, with 32 symbols, are reproduced here (138).
Table 17 – The Script Table of Sylheti Nagarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not confusable enough to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable
Table 18 – Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhi
Bangla | Gurmukhi | NBGP Decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable
Table 19 – Bangla and Gurmukhi confusable code points
Bangla | Gurmukhi | NBGP Decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable
Table 20 – Bangla and Gurmukhi distinguishable code points
10.3.3 Bangla and Oriya (Odia)
Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable
Table 21 – Bangla and Oriya confusable code points
Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable
Table 22 – Bangla and Oriya distinguishable code points
11 Appendix-II Bengali consonants and their allographs
Consonants | Phonetic Value | Allographs: Clusters | Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value, but is mostly pronounced as n
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒThesecondarysymbol(allograph)jɔ-phalahastwophoneticvaluesWhenaddedtotheinitialconsonantinaworditisavowelaelig(asinশ8ামলর 8াপারetc)Butafteranon-initialconsonantitjustdoublesitinpronunciation(asinকায6ধায6etc)The+যcombinationhastwophysicalmanifestationsmdashর 8andয6
ক8(p+য)স8(+য)র 8(+য)[Justর 8isneverusedinBanglaorthographyর 8াisbutthenitslasttwosymbolsYa-phalaa-karaconstituteavowelsignrepresentingthevowelঅ8া]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ hwordfinallywordmediallyitdoublesthepronunciationofthefollowingconsonant
অঃকঃ
অব
No | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
28 | U+09A1 U+09BC (U+09DC) | ড় | Normalized form of BENGALI LETTER RRA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. U+09DC is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
29 | U+09A2 | ঢ | BENGALI LETTER DDHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
30 | U+09A2 U+09BC (U+09DD) | ঢ় | Normalized form of BENGALI LETTER RHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. U+09DD is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
31 | U+09A3 | ণ | BENGALI LETTER NNA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
32 | U+09A4 | ত | BENGALI LETTER TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
33 | U+09A5 | থ | BENGALI LETTER THA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
34 | U+09A6 | দ | BENGALI LETTER DA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
35 | U+09A7 | ধ | BENGALI LETTER DHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
36 | U+09A8 | ন | BENGALI LETTER NA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
37 | U+09AA | প | BENGALI LETTER PA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
38 | U+09AB | ফ | BENGALI LETTER PHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
39 | U+09AC | ব | BENGALI LETTER BA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
43 | U+09AF U+09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. U+09DF is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112] [125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
50 | U+09BE | া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
51 | U+09BF | ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
52 | U+09C0 | ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
53 | U+09C1 | ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112] [122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (= Halant), Virama (= Da ri) | 1 Bangla, 2 Assamese, 2 Manipuri | [112] [122] [125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122] [125]
Table 8 – Bangla Code-Point Repertoire
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes some specific sequences. These enable the conditional inclusion in the repertoire of Bangla LETTER A and LETTER E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, again followed by Bangla VOWEL SIGN AA, in order to represent the æ sound as in English 'bat', 'cat' etc.
Sr No | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not an exhaustive list) | Reference
S1 | U+0985 U+09CD U+09AF U+09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112] [122]
S2 | U+098F U+09CD U+09AF U+09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9 – Sequences
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find a place in Unicode but have not been included in the repertoire of this LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.
Sr No | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10 – Excluded Code Points
5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire since it will never be used alone. It will be used only in sequences, in the normalized forms of the three special characters ড়, ঢ় and য়.
Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড়, ঢ় and য় respectively
Table 10b – Excluded Code Points
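The relationship between the precomposed letters U+09DC, U+09DD and U+09DF and the two-code-point sequences listed in the repertoire can be checked with Python's standard unicodedata module. This is a minimal sketch; the helper name to_lgr_form is hypothetical. It relies on the fact that these three letters are composition exclusions in Unicode, so NFC normalization yields the base letter followed by the nukta.

```python
import unicodedata

# Precomposed letters and the sequences that appear in the repertoire instead.
PRECOMPOSED_TO_SEQUENCE = {
    "\u09DC": "\u09A1\u09BC",  # RRA = DDA + NUKTA
    "\u09DD": "\u09A2\u09BC",  # RHA = DDHA + NUKTA
    "\u09DF": "\u09AF\u09BC",  # YYA = YA + NUKTA
}


def to_lgr_form(label: str) -> str:
    """Normalize a label to NFC, which splits the three precomposed letters."""
    return unicodedata.normalize("NFC", label)


for precomposed, sequence in PRECOMPOSED_TO_SEQUENCE.items():
    normalized = to_lgr_form(precomposed)
    code_points = " ".join(f"U+{ord(ch):04X}" for ch in normalized)
    print(f"U+{ord(precomposed):04X} -> {code_points}", normalized == sequence)
```

A label typed with the precomposed letters therefore has to be normalized before it is compared against the repertoire.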
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20.11.2009 and 18.07.2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) began with the following variables:
C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta
The Augmented Backus-Naur Formalism (ABNF) will use the following operators:
Sr Number | Operator | Function
1 | "|" | Alternative
2 | "[ ]" | Optional
3 | "*" | Variable Repetition
4 | "( )" | Sequence Group
Table 11 – The ABNF Formalism
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given to facilitate understanding.
5.4.2 The Vowel Sequence
To facilitate understanding by users of other Brahmi scripts, equivalents in Devanāgarī are provided wherever necessary. A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra (onuʃʃār) (B), a Candrabindu (D) or a Visarga (biʃɔrgo) (X). The number of D, B or X which can follow a V in Bangla need not be restricted to one, but going by the rules illustrated in this document it is clear that formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have sequences in which a Candrabindu is combined with an Anusvāra on the one hand, or with a Visarga on the other, as in হ্যাঁংচা 'hænchā' and হ্যাঁঃ 'hæn' respectively. That is, the possibility of a Visarga or an Anusvāra (onuʃʃār) following a Candrabindu exists in Bangla.
A vowel can optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phala. "Ya-phala is a presentation form of U+09AF Bangla letter য or 'ya', represented by the sequence <U+09CD ্ BENGALI SIGN VIRAMA (Bangla sign Hasanta or Virama), U+09AF য BENGALI LETTER YA>. Ya-phala has a special form ্য." Again, when combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. 'aa' (ā)), it is used for transcribing [æ], as in the "a" in the English word "bat", written in Bangla as ব্যাট. A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example: V অ अ
5.4.2.2 A Vowel with Conditions
A vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples
VB অং अं
VD অঁ अँ
VX অঃ अः
VDB অঁং अँं
VDX অঁঃ अँः
VHCM অ্যা এ্যা
5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples
VHCMB অ্যাং এ্যাং
VHCMD অ্যাঁ এ্যাঁ
VHCMX অ্যাঃ এ্যাঃ
VHCMDB অ্যাঁং এ্যাঁং
VHCMDX অ্যাঁঃ এ্যাঁঃ
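The vowel-sequence shapes of Sections 5.4.2.1 to 5.4.2.3 can be sketched as a single regular expression. This is only an illustration: the set of independent vowels used for V is an assumption taken from the Unicode Bengali block (U+0985 to U+098B, U+098F, U+0990, U+0993, U+0994), and the names VOWEL_SEQUENCE and is_vowel_sequence are hypothetical.

```python
import re

# Assumed vowel set (V); the authoritative list is the LGR repertoire itself.
V = "[\u0985-\u098B\u098F\u0990\u0993\u0994]"
D, B, X = "\u0981", "\u0982", "\u0983"  # Candrabindu, Anusvara, Visarga
H, YA, AA = "\u09CD", "\u09AF", "\u09BE"  # Hasanta, YA, vowel sign AA

# Either a VHCM sequence from Table 9 (A or E + Hasanta + YA + AA) or a bare V,
# optionally followed by exactly one of B, D, X, DB or DX.
VOWEL_SEQUENCE = re.compile(
    f"(?:[\u0985\u098F]{H}{YA}{AA}|{V})(?:{D}?[{B}{X}]|{D})?"
)


def is_vowel_sequence(text: str) -> bool:
    """True if the whole string is one vowel sequence in the sense of 5.4.2."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None


print(is_vowel_sequence("\u0985\u0981\u0982"))  # VDB
print(is_vowel_sequence("\u0985\u0982\u0982"))  # VBB, described as invalid
```

The suffix alternative places D before B or X, following the order given in 5.4.2.2.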
5.4.3 The Consonant Sequence
5.4.3.1 A Single Consonant (C)
Example: C ক क
5.4.3.2 A Consonant with Conditions
A consonant can optionally be followed by a dependent vowel sign kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Example
CM কি কু कि कु
CB কং कं
CD কঁ कँ
CX কঃ कः
CH ক্ क् (Pure consonant)
CDB কঁং कँं
CDX কঁঃ कँः
5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB or DX.
Example
CMB কীং কুং कीं कुं
CMD কাঁ काँ
CMX বীঃ वीः
CMDB কাঁং काँं
CMDX কাঁঃ काँः
5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Example
CHC ন্ত → ন + ্ + ত (न + ् + त)
CHCHC ন্ত্র → ন + ্ + ত + ্ + র (न + ् + त + ् + र)
CHCHCHC ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য (न + ् + त + ् + र + ् + य)
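The consonant-sequence pattern, at most three consonant + Hasanta pairs before a final consonant, can be sketched in Python as follows. The function names join_conjunct and split_conjunct are hypothetical, and the sketch does not check that its inputs really are consonants.

```python
HASANTA = "\u09CD"
MAX_CONSONANTS = 4  # up to three C+H pairs followed by one final C


def join_conjunct(*consonants: str) -> str:
    """Join one to four consonants with Hasanta, as in CHC, CHCHC, CHCHCHC."""
    if not 1 <= len(consonants) <= MAX_CONSONANTS:
        raise ValueError("a consonant sequence holds between 1 and 4 consonants")
    return HASANTA.join(consonants)


def split_conjunct(cluster: str) -> list:
    """Recover the member consonants of a Hasanta-joined cluster."""
    return cluster.split(HASANTA)


# CHCHC: na + Hasanta + ta + Hasanta + ra
cluster = join_conjunct("\u09A8", "\u09A4", "\u09B0")
print([f"U+{ord(ch):04X}" for ch in cluster])
print(split_conjunct(cluster))
```

Whether a given cluster actually renders as a ligature depends on the font; the sketch only builds the underlying code-point sequence.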
5.4.3.5 Subsets
While considering its subsets, we will take the combination CHC only as a representative example; however, the same applies equally to CHCHC and CHCHCHC.
[A] The combination may be followed by M, B, D, X, DB or DX.
Example
CHCM ক্কী → ক ্ ক ী (क्की → क ् क ी)
CHCB ক্কং → ক ্ ক ং (क्कं → क ् क ं)
CHCD ক্কঁ → ক ্ ক ঁ (क्कँ → क ् क ँ)
CHCX ক্কঃ → ক ্ ক ঃ (क्कः → क ् क ः)
CHCDB ক্কঁং → ক ্ ক ঁ ং (क्कँं → क ् क ँ ं)
CHCDX ক্কঁঃ → ক ্ ক ঁ ঃ (क्कँः → क ् क ँ ः)
[B] *3(CH)CM may further be followed by B, D, X, DB or DX.
Example
CHCMB ক্কীং → ক ্ ক ী ং (क्कीं → क ् क ी ं)
CHCMD ক্কাঁ → ক ্ ক া ঁ (क्काँ → क ् क ा ँ)
CHCMX ক্কীঃ → ক ্ ক ী ঃ (क्कीः → क ् क ी ः)
CHCMDB ক্কাঁং → ক ্ ক া ঁ ং (क्काँं → क ् क ा ँ ं)
CHCMDX ক্কাঁঃ → ক ্ ক া ঁ ঃ (क्काँः → क ् क ा ँ ः)
5.4.4 The Khanda Ta sequence
5.4.4.1 A single 'Khanda' Ta (Z)
Example: Z ৎ
5.4.4.2 A Khanda Ta Combination (see footnote 10)
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) 'scolding'.
Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) can be described briefly here. Let us take up S first. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ BANGLA LETTER A and U+098F এ BANGLA LETTER E). For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF (ya) preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English 'bat', 'fat' etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can be marked with Hasanta. But as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
The other case, mentioned in the table above as P, refers to the formation of repha and ra-phalā in the said script. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra "cycle"). The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or by the Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.
10 Refer to Rule P in Section 7, Table 16
6 Variants
This section discusses the variants in the Bangla script. The NBGP categorizes these confusable variants into two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from the character formations normally perceived by the larger linguistic community
For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These are not cases of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formation. They can cause confusion even to a careful observer and are hence being proposed as variants.
Variants are generated in a script when two or more forms are produced with different storage or code points. In Bangla, the e-kāra, ā-kāra and o-kāra have different code points. One can type o-kāra with a consonant in one go, or get the same result by typing e-kāra and ā-kāra as two separate keys. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered a case of variant, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants
However, we propose two cases of true in-script variants in the Bangla script.
CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases wherein (U+09A5) থ (tha), joined by Hasanta, appears as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, could be a little confusing, since the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final position (as shown below in example 1). It could also be typed wrongly, in the belief that it was a হ (ha) U+09B9, increasing the risks in label writing and identification.
Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)
Fonts which represent the traditional Bangla writing system tend to create this problem. Therefore, these may be taken as cases of variants in Bangla.
CASE II
Another interesting example of variants is encountered in the ra + Hasanta and Hasanta + ra combinations when writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra). 'Repha' could be formed by two sequences (mainly because both Assamese and Bangla are encoded in the same range of Unicode code points, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here the final ligatures look the same and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C
Where
B_RA = U+09B0 BENGALI LETTER RA (র)
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)
Example
Sequence 1 (Using Bangla RA) | Ligature 1 | Sequence 2 (Using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12 – Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. Addition of Repha does not make any difference.
'Ra-phala' could be formed by two sequences on similar grounds, and the final ligatures would look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
Where C1 = any consonant except Khanda Ta
Example
Sequence 1 (Using Bangla RA) | Ligature 1 | Sequence 2 (Using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13 – Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusability for end users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP concluded that র and ৰ should be defined as variant code points, so that a single variant set containing র and ৰ covers all cases. This will, however, create blocked variant labels: e.g. if someone registers "ররর", the variant label "ৰৰৰ" will be generated as a variant and will be blocked, and vice versa. The blocking applies only at the label level: if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, that is still possible.
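The effect of defining র and ৰ as variant code points can be sketched as follows. The sketch assumes that every occurrence of either letter may be swapped independently, which is how variant labels are usually enumerated for an LGR; the text above only gives the all-swapped example, so this is an assumption. The names variant_labels and blocks are hypothetical.

```python
from itertools import product

VARIANT_SET = ("\u09B0", "\u09F0")  # Bangla RA and Assamese RA


def variant_labels(label: str) -> set:
    """All labels obtained by swapping RA variants, excluding the label itself."""
    choices = [VARIANT_SET if ch in VARIANT_SET else (ch,) for ch in label]
    return {"".join(combo) for combo in product(*choices)} - {label}


def blocks(registered: str, applied_for: str) -> bool:
    """True if the applied-for label is a variant of the registered one."""
    return applied_for in variant_labels(registered)


registered = "\u09B0\u09B0\u09B0"
print(len(variant_labels(registered)))            # number of blocked variants
print(blocks(registered, "\u09F0\u09F0\u09F0"))   # all-swapped label
print(blocks(registered, "\u09F0\u09F0"))         # different length
```

Labels of a different length never collide, which matches the observation that blocking applies only at the label level.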
6.2 Cross-Script Variants
A brief cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia (see footnote 11; formerly Oriya), keeping in mind the visual and technical confusion they may cause as labels in the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain other characters which are somewhat similar, they have not been included here. They are provided in the Appendix (10.3) for reference.
11 Unicode uses Oriya for the script, although Odia is now the official term used
1. Bangla and Nāgarī/Devanāgarī Script
Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14 – Bangla and Devanāgarī cross-script variant code points
2. Bangla and Gurmukhi Script
Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15 – Bangla and Gurmukhī cross-script variant code points
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in section 3.2 when written in the Bangla (see footnote 12) script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications. Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories mentioned in the table provided under Code point repertoire (Section 5.1).
C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), or (a/e) Ya-phalā (V1 H C1 M1), where V1 is either 0985 (অ BENGALI LETTER A) or 098F (এ BENGALI LETTER E), H is 09CD (্ BENGALI SIGN VIRAMA), C1 is 09AF (য BENGALI LETTER YA) and M1 is 09BE (া BENGALI VOWEL SIGN AA). S1 and S2 are valid even if they are not allowed by the other context rules
P → Ra-Hasanta (C2 H), where C2 is either 09B0 (র BENGALI LETTER RA) or 09F0 (ৰ ASSAMESE LETTER RA; Unicode name BENGALI LETTER RA WITH MIDDLE DIAGONAL) and H is 09CD (্ BENGALI SIGN VIRAMA)
Table 16 – Symbols used in WLE rules
12 As used by Unicode, denoting and including both Assamese and Maṇipuri
It is also perhaps worth mentioning here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters" that could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, Pg. 4]. More details of such combinations can be seen in Pabitra Sarkar (1993) [135].
7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented.
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়
2. H must be preceded by C
3. M must be preceded by C. Example: কা
4. D must be preceded by either of V, C or M. Example: আঁ খঁ খাঁ হাঁ
5. X must be preceded by either of V, C, M or D. Example: উঃখঃবঃাঃ দ ঃ
6. B must be preceded by either of V, C, M or D. Example: আং ইং কং
7. Z must be preceded by V, C, M, D, B, X or P. Example: ইৎকৎাৎাৎপ6ৎrৎ (S is not listed because S ends with M; Z may also follow S)
8. V CANNOT be preceded by H. Details are given under "Case of V Preceded by H"
9. S CANNOT be preceded by H
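The nine rules above can be sketched as a small context checker. This is a hedged illustration, not the LGR implementation: the code point sets for V, C and M are assumptions derived from the Unicode Bengali block and the repertoire table, and the names CLASS_OF, tokenize and is_valid_label are hypothetical. The sequences S1 and S2 are recognised first, so that their internal Hasanta is not tested against rule 2.

```python
H, Z, NUKTA = "\u09CD", "\u09CE", "\u09BC"
RA = {"\u09B0", "\u09F0"}                      # C2 of symbol P
S_SEQUENCES = {"\u0985\u09CD\u09AF\u09BE", "\u098F\u09CD\u09AF\u09BE"}
NUKTA_BASES = {"\u09A1", "\u09A2", "\u09AF"}   # rule 1: CN = base + nukta


def _chars(*spans):
    return [chr(cp) for lo, hi in spans for cp in range(lo, hi + 1)]


# Assumed class membership; the authoritative source is the LGR XML file.
CLASS_OF = {"\u0981": "D", "\u0982": "B", "\u0983": "X", H: "H", Z: "Z"}
CLASS_OF.update({c: "V" for c in _chars((0x0985, 0x098B), (0x098F, 0x0990),
                                        (0x0993, 0x0994))})
CLASS_OF.update({c: "C" for c in _chars((0x0995, 0x09A8), (0x09AA, 0x09B0),
                                        (0x09B2, 0x09B2), (0x09B6, 0x09B9),
                                        (0x09F0, 0x09F1))})
CLASS_OF.update({c: "M" for c in _chars((0x09BE, 0x09C4), (0x09C7, 0x09C8),
                                        (0x09CB, 0x09CC))})

# Rules 2 to 7. S ends with M, so it is accepted wherever M is accepted.
ALLOWED_BEFORE = {
    "H": {"C"},
    "M": {"C"},
    "D": {"V", "C", "M", "S"},
    "X": {"V", "C", "M", "D", "S"},
    "B": {"V", "C", "M", "D", "S"},
    "Z": {"V", "C", "M", "D", "B", "X", "S"},
}


def tokenize(label):
    """Split a label into (class, text) tokens; raise ValueError if unknown."""
    tokens, i = [], 0
    while i < len(label):
        if label[i:i + 4] in S_SEQUENCES:
            tokens.append(("S", label[i:i + 4]))
            i += 4
        elif label[i] in NUKTA_BASES and label[i + 1:i + 2] == NUKTA:
            tokens.append(("C", label[i:i + 2]))
            i += 2
        elif label[i] in CLASS_OF:
            tokens.append((CLASS_OF[label[i]], label[i]))
            i += 1
        else:
            raise ValueError(f"U+{ord(label[i]):04X} is not in the assumed repertoire")
    return tokens


def is_valid_label(label):
    """Apply rules 2 to 9 to every token, looking at what precedes it."""
    try:
        tokens = tokenize(label)
    except ValueError:
        return False
    for idx, (cls, _) in enumerate(tokens):
        prev_cls = tokens[idx - 1][0] if idx > 0 else None
        if cls in ALLOWED_BEFORE:
            # Rule 7: Z may also follow P, i.e. RA followed by Hasanta.
            follows_p = (cls == "Z" and prev_cls == "H" and idx >= 2
                         and tokens[idx - 2][1] in RA)
            if prev_cls not in ALLOWED_BEFORE[cls] and not follows_p:
                return False
        elif cls in ("V", "S") and prev_cls == "H":  # rules 8 and 9
            return False
    return bool(tokens)


print(is_valid_label("\u09AD\u09B0\u09CD\u09CE\u09B8\u09A8\u09BE"))  # ভর্ৎসনা
print(is_valid_label("\u0995\u09CD\u0985"))                          # V after H
```

Labels containing the precomposed letters U+09DC, U+09DD or U+09DF are rejected by this sketch and would need NFC normalization first.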
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in including such ligatures: any user can create them easily, even though they may not be attested combinations.
Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া bæŋk ʌv ɪndiə (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE), meaning Bank of India. This is a case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptual similarity between two labels (with and without H) for the majority of the linguistic communities; hence it is explicitly prohibited by the NBGP. In future, depending on the prevailing requirements of the community, a future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules the ABNF puts in place. They are given for reference purposes only and are not meant to be comprehensive.
1 H, M, B, D or X cannot occur at the beginning of a Bangla word
Example:
ক क
াক ाक
ংক क
ক क
ঃক ःक
As can be seen, such a combination will automatically result in a "golu", or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian-language syllable and is applied quasi-automatically wherever supported by the OS.
2 H is not permitted after V, B, D, X, M or S
Example:
অ अ
অং क ক क কঃ कः ি )क
3 The number of B, D or X permitted after a Consonant, a Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalid
Example:
কংং क
ক क
কঃঃ कःः
কা का
কীঃঃ कःः
অংং अ
অ अ
অঃঃ अःः
4 The number of M permitted after a Consonant is restricted to one
Example:
কীী की
5 M is not permitted after V
Example:
ইা ঈৗ ईा ईौ
6 The combinations Anusvāra + Visarga and Visarga + Anusvāra are not permissible
Example:
কংঃ कः
কঃং कः
8 Contributors

8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members

8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals' Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R.D. 1919. The Origin of the Bengali Script. Kolkata; New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S.K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje & Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R.C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. 'Bangla Harapher Panch Parba'. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and Language Planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the M.S. University of Baroda, Social Science Number 29.2: 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25/11/2017.
[124] Ethnologue. Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25/11/2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20/10/2019.
[126] Wikipedia. Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25/11/2017.
[129] Omniglot. Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10/5/2018.
[130] Wikipedia. Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25/11/2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20/, accessed on 10/5/2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10/5/2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10/5/2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1123-45.
[136] Dash, Niladri Shekhar and B.B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 121-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). 'Srīhatta-Nāgarī Lipir Utpatti o Bikāś'. Bangla Academy Patrika (Dhaka), Vol 12 (Bhadra-Agrahayan 1364 Bangabda Number), pg 1.
[138] Wikipedia. Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19/5/2018.
[139] Furui, Ryosuke (2015). 'Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century', in Bhairabi Prasad Sahu & Hermann Kulke, eds. Interrogating Political Systems: Integrative Processes and States in Pre-Modern India. Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). 'Phones of Bengali'. Language, Vol 36, No 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646 (UNICODE). https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia. Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. 'Bangla Spelling Reform: the Long and Short of It'. Bangla Journal 19215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. "What has happened So Far In terms of Script Reforms". Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka, on 10/07/2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 - Core Specification. Chapter 12, p. 473.
10 Appendix-I

10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language or script certain restriction rules apply. In other words, in a given language some of the structures of the Formalism do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla¹³ in particular, the following rules apply:
1 Khaṇḍa Ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold 'varga' (as defined under Table 5). Bangla generally does not allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem 'Lichuchor' written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning, as in য়াকুব. In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa Ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2 CH can combine with Khaṇḍa Ta only in the case where C is ra (র, U+09B0):
র্ৎ, as in ভর্ৎসনা
3 Only the following combinations with VHCM will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
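The first and third restrictions above lend themselves to a simple lookup. The sketch below is one hypothetical way to express them in Python; the names `DISALLOWED_INITIAL`, `ALLOWED_VHCM`, `initial_ok` and `vhcm_ok` are illustrative only, and the attested exceptions mentioned in the text (such as ৎসেরিং) are deliberately not modelled.

```python
# Code points that the restriction list says may not begin a label.
DISALLOWED_INITIAL = {
    "\u09CE": "BENGALI LETTER KHANDA TA",
    "\u099E": "BENGALI LETTER NYA",
    "\u0999": "BENGALI LETTER NGA",
}

# The only two VHCM sequences admitted by restriction 3 (a/e + Ya-phala + aa-kara).
ALLOWED_VHCM = {
    "\u0985\u09CD\u09AF\u09BE",
    "\u098F\u09CD\u09AF\u09BE",
}


def initial_ok(label: str) -> bool:
    """True if the label is non-empty and does not start with a barred letter."""
    return bool(label) and label[0] not in DISALLOWED_INITIAL


def vhcm_ok(sequence: str) -> bool:
    """True if a four-code-point V+H+C+M sequence is one of the two allowed."""
    return sequence in ALLOWED_VHCM
```

Under this sketch a label such as ভর্ৎসনা passes `initial_ok`, whereas any label starting with ৎ does not.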
13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.

10.2 'Sylheti Nagarī lipi' or 'Siloti'
This version of the Bangla script resembles the 'Kaithī' script (ISO 15924) used by the accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, which was widely in use during the 1880s. There were several other names for Sylheti Nagarī or Siloti [129], such as 'Jalalabada Nagarī', 'Fula (flower) Nagarī', 'Muslim Nagarī' or 'Muhammad Nagarī'. It is said that Shah Jalal brought the script with him to Sylhet in the 13th-14th century [138], although some have suggested that it was an invention of the Afghan rulers of Sylhet [137]. Some ascribe the credit to Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script, with 32 symbols, are reproduced here [138].
Table 17 - The Script Table of Sylheti Nagarī or Siloti
10.3 Confusable code points
The following code points were analysed and found to be either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.

10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable
Table 18 Bangla and Devanāgarī confusable code points

10.3.2 Bangla and Gurmukhi
Bangla | Gurmukhi | NBGP Decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable
Table 19 Bangla and Gurmukhi confusable code points

Bangla | Gurmukhi | NBGP Decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable
Table 20 - Bangla and Gurmukhi distinguishable code points

10.3.3 Bangla and Oriya (Odia)
Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable
Table 21 - Bangla and Oriya confusable code points

Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable
Table 22 - Bangla and Oriya distinguishable code points
11 Appendix-II Bengali consonants and their allographs

Consonants | Phonetic Value | Allographs
Clusters | Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value, but is mostly pronounced as n
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒThesecondarysymbol(allograph)jɔ-phalahastwophoneticvaluesWhenaddedtotheinitialconsonantinaworditisavowelaelig(asinশ8ামলর 8াপারetc)Butafteranon-initialconsonantitjustdoublesitinpronunciation(asinকায6ধায6etc)The+যcombinationhastwophysicalmanifestationsmdashর 8andয6
ক8(p+য)স8(+য)র 8(+য)[Justর 8isneverusedinBanglaorthographyর 8াisbutthenitslasttwosymbolsYa-phalaa-karaconstituteavowelsignrepresentingthevowelঅ8া]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃকঃ
অব
No | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
33 | U+09A5 | থ | BENGALI LETTER THA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
34 | U+09A6 | দ | BENGALI LETTER DA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
35 | U+09A7 | ধ | BENGALI LETTER DHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
36 | U+09A8 | ন | BENGALI LETTER NA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
37 | U+09AA | প | BENGALI LETTER PA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
38 | U+09AB | ফ | BENGALI LETTER PHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
39 | U+09AC | ব | BENGALI LETTER BA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
43 | 09AF 09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]. 09DF is the preferred code point; however, it is not available for the LGR as per the standards governing this LGR development
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112] [125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
50 | U+09BE | া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
51 | U+09BF | ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
52 | U+09C0 | ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
53 | U+09C1 | ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112] [122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (= Halant), Virama (= Da ri) | 1 Bangla, 2 Assamese, 2 Manipuri | [112] [122] [125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122] [125]
Table 8 Bangla Code-Point Repertoire
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes some specific sequences which enable conditional inclusion of the Bangla LETTER A and LETTER E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, again followed by Bangla VOWEL SIGN AA, in the repertoire. These enable representation of the æ sound, as in English 'bat', 'cat' etc.

Sr No | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not an exhaustive list) | Reference
S1 | 0985 09CD 09AF 09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112] [122]
S2 | 098F 09CD 09AF 09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9 Sequences
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find a place in Unicode but have not been included in the repertoire in this LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.

Sr No | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10 Excluded Code Points

5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire since it will never be used alone. It is used only as part of a sequence in three special characters in normalized form: ড়, ঢ়, য়.

Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড়, ঢ় and য় respectively
Table 10b Excluded Code Points
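The "normalized form" mentioned above can be illustrated with Python's standard `unicodedata` module: under NFC the precomposed letters U+09DC, U+09DD and U+09DF are replaced by base letter + NUKTA, which is why the repertoire lists the sequences. This is a small sketch; the helper names `to_repertoire_form` and `has_stray_nukta` are hypothetical and not part of the proposal.

```python
import unicodedata

# Precomposed letters and the base + NUKTA sequences they normalize to.
PRECOMPOSED = {
    "\u09DC": "\u09A1\u09BC",  # RRA -> DDA  + NUKTA
    "\u09DD": "\u09A2\u09BC",  # RHA -> DDHA + NUKTA
    "\u09DF": "\u09AF\u09BC",  # YYA -> YA   + NUKTA
}


def to_repertoire_form(text: str) -> str:
    """Return the NFC form, in which the three letters appear as sequences."""
    return unicodedata.normalize("NFC", text)


def has_stray_nukta(text: str) -> bool:
    """True if U+09BC occurs anywhere except after U+09A1, U+09A2 or U+09AF."""
    bases = {"\u09A1", "\u09A2", "\u09AF"}
    return any(ch == "\u09BC" and (i == 0 or text[i - 1] not in bases)
               for i, ch in enumerate(text))
```

A check such as `has_stray_nukta` reflects the statement that NUKTA is never used alone; it would flag, for example, ক followed by NUKTA.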
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for .भारत or .ভারত, drafted between 20/11/2009 and 18/07/2013.

5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) began with the following variables:

C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta

The Augmented Backus-Naur Formalism (ABNF) will use the following operators:

Sr Number | Operator | Function
1 | "|" | Alternative
2 | "[]" | Optional
3 | "*" | Variable Repetition
4 | "()" | Sequence Group
Table 11 The ABNF Formalism
5.4.2 The Vowel Sequence
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given. To facilitate understanding by users of other Brāhmī scripts, equivalents in Devanāgarī are provided wherever necessary.
A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra / onuʃʃār (B), a Candrabindu (D) or a Visarga / biʃɔrgo (X). The number of D, B or X that can follow a V in Bangla is not strictly restricted to one. Going by the rules illustrated in this document, repeated formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have a candrabindu followed by an anusvāra on the one hand, and a candrabindu followed by a visarga on the other, as in হ্যাঁংচা 'hænchā' and হ্যাঁঃ 'hæn' respectively. The possibility of a Visarga or an Anusvāra (onuʃʃār) following a Candrabindu thus exists in Bangla.
A vowel can optionally be followed by a combination of Hasanta / Virama [H] and Consonant [C] to form a Ya-phala. Ya-phala is a presentation form of U+09AF য BENGALI LETTER YA, represented by the sequence <U+09CD ্ BENGALI SIGN VIRAMA, U+09AF য BENGALI LETTER YA>. Ya-phala has a special form, ্য. When combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. 'aa' (ā)), it is used for transcribing [æ], as in the "a" of the English word "bat", written in Bangla as ব্যাট.
A vowel sequence admits the following combinations:

5.4.2.1 A Single Vowel
Example:
V     অ    अ

5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples:
VB    অং    अं
VD    অঁ    अँ
VX    অঃ    अः
VDB   অঁং   अँं
VDX   অঁঃ   अँः
VHCM  অ্যা, এ্যা

5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
VHCMB    অ্যাং, এ্যাং
VHCMD    অ্যাঁ, এ্যাঁ
VHCMX    অ্যাঃ, এ্যাঃ
VHCMDB   অ্যাঁং, এ্যাঁং
VHCMDX   অ্যাঁঃ, এ্যাঁঃ
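The vowel-sequence grammar described in this subsection can be approximated with a regular expression. The following is a minimal sketch rather than the normative ABNF: the vowel class is an assumption taken from the Unicode Bengali block (U+098C excluded), VHCM is limited to the two sequences অ্যা and এ্যা, and `is_vowel_sequence` is a hypothetical helper name.

```python
import re

V = "[\u0985-\u098B\u098F\u0990\u0993\u0994]"   # independent vowels (assumed class)
VHCM = "[\u0985\u098F]\u09CD\u09AF\u09BE"        # a/e + Hasanta + ya + aa-kara
B, D, X = "\u0982", "\u0981", "\u0983"           # Anusvara, Candrabindu, Visarga

# Optional tail: B, D, X, DB or DX (at most one of each, D always first).
TAIL = f"(?:{D}?[{B}{X}]|{D})?"

VOWEL_SEQUENCE = re.compile(f"(?:{VHCM}|{V}){TAIL}")


def is_vowel_sequence(text: str) -> bool:
    """True if the whole string is one vowel sequence in the sense sketched here."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None
```

With this pattern, অং, অঁং and অ্যাঁঃ match, while repetitions such as অংং and the reversed order অংঁ do not.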
5.4.3 The Consonant Sequence

5.4.3.1 A Single Consonant (C)
Example:
C     ক    क

5.4.3.2 A Consonant with Conditions
A Consonant may optionally be followed by a dependent vowel sign, kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Example:
CM    কি    कि
CB    কং    कं
CD    কঁ    कँ
CX    কঃ    कः
CH    ক্    क् (pure consonant)
CDB   কঁং   कँं
CDX   কঁঃ   कँः

5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB or DX.
Example:
CMB   কীং   कीं
CMD   কাঁ   काँ
CMX   বীঃ   वीः
CMDB  কাঁং  काँं
CMDX  কাঁঃ  काँः

5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Example:
CHC       ন্ত → ন + ্ + ত                      न्त → न + ् + त
CHCHC     ন্ত্র → ন + ্ + ত + ্ + র              न्त्र → न + ् + त + ् + र
CHCHCHC   ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য     न्त्र्य → न + ् + त + ् + र + ् + य

5.4.3.5 Subsets
While considering its subsets, we take the combination CHC as a representative example; the same is equally applicable to CHCHC and CHCHCHC.
[A] The combination may be followed by M, B, D, X, DB or DX.
Example:
CHCM    ক্কী → ক + ্ + ক + ী        क्की → क + ् + क + ी
CHCB    ক্কং → ক + ্ + ক + ং        क्कं → क + ् + क + ं
CHCD    ক্কঁ → ক + ্ + ক + ঁ        क्कँ → क + ् + क + ँ
CHCX    ক্কঃ → ক + ্ + ক + ঃ        क्कः → क + ् + क + ः
CHCDB   ক্কঁং → ক + ্ + ক + ঁ + ং    क्कँं → क + ् + क + ँ + ं
CHCDX   ক্কঁঃ → ক + ্ + ক + ঁ + ঃ    क्कँः → क + ् + क + ँ + ः
[B] *3(CH)CM may further be followed by B, D, X, DB or DX.
Example:
CHCMB    ক্কীং → ক + ্ + ক + ী + ং        क्कीं → क + ् + क + ी + ं
CHCMD    ক্কাঁ → ক + ্ + ক + া + ঁ        क्काँ → क + ् + क + ा + ँ
CHCMX    ক্কীঃ → ক + ্ + ক + ী + ঃ        क्कीः → क + ् + क + ी + ः
CHCMDB   ক্কাঁং → ক + ্ + ক + া + ঁ + ং    क्काँं → क + ् + क + ा + ँ + ं
CHCMDX   ক্কাঁঃ → ক + ্ + ক + া + ঁ + ঃ    क्काँः → क + ् + क + ा + ँ + ः
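The consonant-sequence grammar of 5.4.3 can likewise be sketched as a regular expression. This is an illustration under stated assumptions, not the normative rule set: the consonant and kāra classes are filled in from the Unicode Bengali block, a trailing Hasanta is accepted only after a single consonant (the CH case listed in 5.4.3.2), and `is_consonant_sequence` is a hypothetical name.

```python
import re

# A consonant: base + NUKTA (normalized forms) or a plain consonant letter.
C = ("(?:[\u09A1\u09A2\u09AF]\u09BC"
     "|[\u0995-\u09A8\u09AA-\u09B0\u09B2\u09B6-\u09B9\u09F0\u09F1])")
H = "\u09CD"                                        # Hasanta / Virama
M = "[\u09BE-\u09C4\u09C7\u09C8\u09CB\u09CC]"       # kara (Matra) signs
B, D, X = "\u0982", "\u0981", "\u0983"

# Optional tail: B, D, X, DB or DX.
TAIL = f"(?:{D}?[{B}{X}]|{D})?"

# Either C H (pure consonant), or *3(C H) C [M] [tail].
CONSONANT_SEQUENCE = re.compile(
    f"(?:{C}{H}|(?:{C}{H}){{0,3}}{C}{M}?{TAIL})"
)


def is_consonant_sequence(text: str) -> bool:
    """True if the whole string is one consonant sequence in this sketch."""
    return CONSONANT_SEQUENCE.fullmatch(text) is not None
```

The `{0,3}` quantifier mirrors *3(CH)C, so clusters of up to four consonants match and a five-consonant cluster does not.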
5.4.4 The Khanda-Ta sequence

5.4.4.1 A single 'Khanda'-Ta (Z)
Example:
Z     ৎ

5.4.4.2 A Khanda Ta Combination¹⁰
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) 'scolding'.
Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).

10 Refer to Rule P in Section 7, Table 16

5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) can be described briefly here. Let us take up S first. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E). To render Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF ya, preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound, as in English 'bat', 'fat' etc. A Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can be marked with the Hasanta. But as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
The other case refers to the formation of repha and ra-phalā in the said script, mentioned in the table above as P. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra "cycle"). The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.
6 Variants
This section discusses the variants in the Bangla script. The NBGP categorizes these confusing variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These are not cases of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formation. They can cause confusion even to a careful observer and are hence proposed as variants.
Variants are generated in a script when two or more identical forms are produced with different storage or code points. In Bangla, the e-kāra, the ā-kāra and the o-kāra have different code points. One can type o with a consonant in one go, or obtain the same result by typing e-kāra and ā-kāra as two separate keys. A reader cannot differentiate between the two forms of ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered a case of variant, because a kāra followed by a kāra is not allowed.
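The o-kāra case can be checked with Python's standard `unicodedata` module. The sketch below only illustrates the two spellings of ko; it assumes, as an aside not stated in this section, that labels are handled in Unicode Normalization Form C, under which the two-key spelling composes into the single o-kāra code point.

```python
import unicodedata

KA = "\u0995"
E_KARA, AA_KARA, O_KARA = "\u09C7", "\u09BE", "\u09CB"

two_key_ko = KA + E_KARA + AA_KARA   # ka + e-kara + aa-kara (two kara signs)
one_key_ko = KA + O_KARA             # ka + o-kara (one kara sign)

# Before normalization the two spellings are different code point strings.
differ_raw = two_key_ko != one_key_ko

# After NFC the two-key spelling composes into the single o-kara.
same_after_nfc = unicodedata.normalize("NFC", two_key_ko) == one_key_ko
```

Both `differ_raw` and `same_after_nfc` evaluate to true, which is consistent with the text: the unnormalized two-kāra spelling is excluded by the rule that M must be preceded by C, so no variant mapping is needed for it.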
6.1 In-Script Variants
We propose two cases of true in-script variants in the Bangla script.

CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases wherein (U+09A5) থ (tha) appears with Hasanta as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1 স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus
  স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2 ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus
  ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, can be a little confusing, since the থ (tha) in the conjunct appears like a হ (ha). The conjunct can occur in initial, medial or final position (as shown below in example 1). It could also be typed wrongly, in the belief that it was a হ (ha) U+09B9, increasing the risk of errors in label writing and identification.
Examples:
1 স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2 ন্থ and ন্হ (as in গ্রন্থ grantha)
Fonts which represent the traditional Bangla writing system tend to create this problem. Therefore these may be taken as cases of variants in Bangla.

CASE II
Another interesting example of a variant is encountered in the ra + Hasanta and Hasanta + ra combinations when writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra).
'Repha' can be formed by two sequences (mainly because both Assamese and Bangla share the same Unicode block, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here the final ligatures look the same, and are as follows:
(1) B_RA + H + C
(2) A_RA + H + C
Where
B_RA = U+09B0 BENGALI LETTER RA (র)
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)
Example:
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12 Example of Repha
NoteAsBanglaandAssameseকandঠlookexactlythesametheresultantcombinationswithRephalookidenticalAdditionofRephadoesnotmakeanydifferencelsquoRa-phalarsquocouldbeformedbytwosequencesonsimilargroundsandthefinalligatureswouldlookthesame
(1) C1+H+B_RA(2) C1+H+A_RA
WhereC1 = anyconsonantsexceptKhanda-ta
Example
Sequence1(UsingBanglaRA)
Ligature1
Sequence2(UsingAssameseRA)
Ligature2
U+0995(ক)U+09CD()U+09B0(র) U+0995(ক)U+09CD()U+09F0(ৰ)
U+09A8(ন)U+09CD()U+09B0(র) ) U+09A8(ন)U+09CD()U+09F0(ৰ) )
Table13ExampleofRa-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, they could confuse end users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP concluded that র and ৰ should be defined as variant code points, so that a single variant set of র and ৰ covers all cases. This will, however, create blocked variant labels: e.g. if someone registers “ররর”, the variant label “ৰৰৰ” will be generated as a variant and will be blocked, and vice versa. The blocking applies only at the label level: if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, it is still possible.
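The label-level blocking described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `variant_key` and `is_blocked` are hypothetical, and a real implementation would be driven by the LGR XML file rather than by a hard-coded pair of code points.

```python
RA_BANGLA = "\u09B0"    # র BENGALI LETTER RA
RA_ASSAMESE = "\u09F0"  # ৰ BENGALI LETTER RA WITH MIDDLE DIAGONAL


def variant_key(label: str) -> str:
    """Fold both RA forms to one code point so that variant labels share a key."""
    return label.replace(RA_ASSAMESE, RA_BANGLA)


def is_blocked(candidate: str, registered: set) -> bool:
    """True if the candidate is a variant of a registered label (not the label itself)."""
    keys = {variant_key(label) for label in registered}
    return candidate not in registered and variant_key(candidate) in keys


registered = {RA_BANGLA * 3}                  # someone registers "ররর"
print(is_blocked(RA_ASSAMESE * 3, registered))  # variant of the same length
print(is_blocked(RA_ASSAMESE * 2, registered))  # a different label
```

Because the key is computed per label, a shorter or longer string of ৰ does not collide with the registered three-letter label, which matches the behaviour described in the paragraph above.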
6.2 Cross-Script Variants
A brief cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia (see footnote 11; formerly Oriya), keeping in mind the visual and technical confusion they may cause as labels in the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although certain other characters are somewhat similar, they have not been included here; they are provided in the Appendix (10.3) for reference.
11 Unicode uses Oriya for the script although Odia is now the official term used
1. Bangla and Nāgarī/Devanāgarī Script
Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14: Bangla and Devanāgarī cross-script variant code points
2. Bangla and Gurmukhī Script
Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15: Bangla and Gurmukhī cross-script variant code points
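To make Tables 14 and 15 concrete, the following Python sketch maps a Bangla label to its cross-script counterpart when every code point in the label has a variant in the target script. The table contents come from the two tables above; the dictionary layout, the ISO 15924 keys and the function name `cross_script_variant` are assumptions made for illustration only.

```python
# Cross-script variant code points, copied from Tables 14 and 15.
CROSS_SCRIPT = {
    "Deva": {"\u09AE": "\u092E", "\u09BF": "\u093F"},  # ম -> म, ি -> ि
    "Guru": {"\u09AE": "\u0A38", "\u09BF": "\u0A3F"},  # ম -> ਸ, ি -> ਿ
}


def cross_script_variant(label, script):
    """Return the whole-label variant in `script`, or None if any code point has no variant."""
    table = CROSS_SCRIPT[script]
    if label and all(ch in table for ch in label):
        return "".join(table[ch] for ch in label)
    return None


print(cross_script_variant("\u09AE\u09BF", "Deva"))  # label built only from ম and ি
print(cross_script_variant("\u0995\u09BF", "Deva"))  # ক has no cross-script variant
```

A label yields a cross-script variant only when it consists entirely of the listed code points, so in practice very few Bangla labels are affected.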
7 Whole Label Evaluation Rules (WLE)
This section provides the WLE rules that are required by all the languages mentioned in Section 3.2 when written in the Bangla (see footnote 12) script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications.
Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories, as mentioned in the table provided in Code point repertoire (Section 5.1).
C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), or (a/e) Ya-phalā (V1 H C1 M1), where
    V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E)
    H is 09CD (্ - BENGALI SIGN VIRAMA)
    C1 is 09AF (য - BENGALI LETTER YA)
    M1 is 09BE (া - BENGALI VOWEL SIGN AA)
    S1 and S2 are valid even if they are not allowed by the other context rules
P → Ra-Hasanta (C2 H), where
    C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL)
    H is 09CD (্ - BENGALI SIGN VIRAMA)
Table 16: Symbols used in WLE rules
12 As used by Unicode, denoting and including both Assamese and Maṇipuri
It is also worth mentioning here that in Bangla the consonant letters (or graphemes) are physically joined to form “clusters” that could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are “nearly 380 unique consonant clusters”, out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, Pg. 4]. More details of such combinations can be seen in Pabitra Sarkar (1993) [135].
7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়
2. H must be preceded by C
Example
3. M must be preceded by C
Example: কা
4. D must be preceded by either of V, C or M
Example: আখখা হা
5. X must be preceded by either of V, C, M or D
Example: উঃখঃবঃাঃ দ ঃ
6. B must be preceded by either of V, C, M or D
Example: আং ইং কং
7. Z must be preceded by V, C, M, D, B, X or P
Example: ইৎকৎাৎাৎপ6ৎrৎ (S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V preceded by H.
9. S CANNOT be preceded by H
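The rules above can be read as "which category may come immediately before which". The Python sketch below encodes rules 2 to 9 in that form. It is a simplified illustration, not the normative LGR: the coarse code-point ranges in `classify`, the treatment of S as ending in M, and all function names are assumptions introduced here.

```python
import re

S_SEQUENCES = ("\u0985\u09CD\u09AF\u09BE", "\u098F\u09CD\u09AF\u09BE")  # S1, S2
RA_LETTERS = {"\u09B0", "\u09F0"}  # first element of P (Ra-Hasanta)

# Rules 2-7: categories allowed immediately before each symbol.
ALLOWED_BEFORE = {
    "H": {"C"},
    "M": {"C"},
    "D": {"V", "C", "M"},
    "X": {"V", "C", "M", "D"},
    "B": {"V", "C", "M", "D"},
    "Z": {"V", "C", "M", "D", "B", "X"},
}


def classify(ch):
    """Coarse category lookup; the ranges are an approximation of Table 8."""
    cp = ord(ch)
    single = {0x09CD: "H", 0x09CE: "Z", 0x0981: "D", 0x0982: "B", 0x0983: "X"}
    if cp in single:
        return single[cp]
    if 0x0985 <= cp <= 0x0994:
        return "V"
    if 0x0995 <= cp <= 0x09B9 or cp in (0x09F0, 0x09F1):
        return "C"
    if 0x09BE <= cp <= 0x09CC:
        return "M"
    raise ValueError(f"U+{cp:04X} is outside this sketch's repertoire")


def tokenize(label):
    # Rule 1: consonant + NUKTA counts as a single consonant.
    label = re.sub("([\u09A1\u09A2\u09AF])\u09BC", r"\1", label)
    tokens, i = [], 0
    while i < len(label):
        if label.startswith(S_SEQUENCES, i):
            tokens.append(("S", label[i:i + 4]))
            i += 4
        else:
            tokens.append((classify(label[i]), label[i]))
            i += 1
    return tokens


def wle_violations(label):
    tokens = tokenize(label)
    problems = []
    for i, (cat, _) in enumerate(tokens):
        prev = tokens[i - 1][0] if i else None
        if prev == "S":
            prev = "M"  # S ends with M
        after_p = prev == "H" and i >= 2 and tokens[i - 2][1] in RA_LETTERS
        if cat in ALLOWED_BEFORE:
            if prev not in ALLOWED_BEFORE[cat] and not (cat == "Z" and after_p):
                problems.append(f"{cat} at position {i} follows {prev}")
        elif cat in ("V", "S") and prev == "H":  # rules 8 and 9
            problems.append(f"{cat} at position {i} follows H")
    return problems


print(wle_violations("\u09AD\u09B0\u09CD\u09CE\u09B8\u09A8\u09BE"))  # ভর্ৎসনা
print(wle_violations("\u0995\u09CD\u0985"))                          # V after H
```

Tokenising the S sequences first keeps the one sanctioned "vowel + Hasanta" case out of the general rule that H must follow a consonant.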
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in allowing such ligatures: any user can easily create them, even though they may not be attested combinations.
7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া /bæŋk ʌv ɪndiə/ (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE) (meaning Bank of India). This is the case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-Joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptual similarity between two labels (with and without H) for the majority of the linguistic communities; hence this is explicitly prohibited by the NBGP.
In future, if required, and depending on the prevailing requirements of the community, a future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules that the ABNF puts in place. These are given just for reference purposes and are not meant to be comprehensive.
1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Example
্ক ्क
াক ाक
ংক ंक
ঁক ँक
ঃক ःक
As can be seen, such a combination will automatically result in a “golu” or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.
2. H is not permitted after V, B, D, X, M or S.
Example
অ अ
অং क ক क কঃ कः ি )क
3. The number of B, D or X permitted after a Consonant, a Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalidated:
Example
কংং क
ক क
কঃঃ कःः
কা का
কীঃঃ कःः
অংং अ
অ अ
অঃঃ अःः
4. The number of M permitted after a Consonant is restricted to one.
Example
কীী की
5. M is not permitted after V.
Example
ইা ঈৗ ईा ईौ
6. The combinations of Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Example
কংঃ कः
কঃং कः
8 Contributors
8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members
8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals’ Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje & Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. ‘Bangla Harapher Panch Parba’. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the MS University of Baroda, Social Science Number 292173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25.11.2017
[124] Ethnologue. Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25.11.2017
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20.10.2019
[126] Wikipedia. Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25.11.2017
[129] Omniglot. Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10.5.2018
[130] Wikipedia. Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25.11.2017
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20, accessed on 10.5.2018
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10.5.2018
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10.5.2018
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1123-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 121-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). ‘Srīhatta-Nāgarī Lipir Utpatti o Bikāś’. Bangla Academy Patrika (Dhaka), Vol 12 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia. Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19.5.2018
[139] Furui, Ryosuke (2015). ‘Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century’, in Bhairabi Prasad Sahu & Hermann Kulke, eds. Interrogating Political Systems: Integrative Processes and States in Pre-Modern India. Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). ‘Phones of Bengali’. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646. UNICODE. https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018
[145] Wikipedia. Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018
[146] Bangla Script. httpwwwbangladesh2000combdbangla_scripthtml, accessed on May 21, 2018
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. ‘Bangla Spelling Reform: the Long and Short of It’. Bangla Journal 19: 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. “What has happened So Far in terms of Script Reforms”. Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka on 10.07.2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 - Core Specification. Chapter 12, p. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language or script certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla (see footnote 13) in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold ‘varga’ (as defined under Table 5). Moreover, Bangla does not normally allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem ‘Lichuchor’ written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names into Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2. CH can come with Khanda Ta only in the case where C is ra (র) (09B0):
র্ৎ as in ভর্ৎসনা
3. Only the following combinations with VHCM will be allowed:
→ অ্যা (together pronounced as æ) as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ) as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
10.2 ‘Sylheti Nagarī lipi’ or ‘Siloti’
This version of the Bangla script resembles the ‘Kaithī’ script (ISO 15924) used by the accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, widely in use during the 1880s. There were several other names of Sylheti Nagarī or Siloti (129), such as ‘Jalalabada Nagarī’, ‘Fula (flower) Nagarī’, ‘Muslim Nagarī’ or ‘Muhammad Nagarī’. It is said that Shah Jalal had brought the script with him to Sylhet in the 13th-14th century (138), although some have suggested that it was an invention of the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here (138).
13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
Table 17: The Script Table of Sylheti Nagarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable
Table 18: Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhi
Bangla | Gurmukhi | NBGP decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable
Table 19: Bangla and Gurmukhi confusable code points
Bangla | Gurmukhi | NBGP decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable
Table 20: Bangla and Gurmukhī distinguishable code points
10.3.3 Bangla and Oriya (Odia)
Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable
Table 21: Bangla and Oriya confusable code points
Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable
Table 22: Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs
Consonants | Phonetic Value | Allographs: Clusters | Allographs: Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (not privileged enough to have clusters as a first member) Ouml(Ocirc+ঝ) Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (not privileged enough to have clusters as a first member) egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value, but is mostly pronounced as n
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (not privileged enough to have clusters)
য dʒ The secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word, it is a vowel æ (as in শ8ামল, র 8াপার etc.). But after a non-initial consonant it just doubles it in pronunciation (as in কায6, ধায6 etc.). The +য combination has two physical manifestations: র 8 and য6
ক8(p+য) স8(+য) র 8(+য) [Just র 8 is never used in Bangla orthography; র 8া is, but then its last two symbols, Ya-phala and a-kara, constitute a vowel sign representing the vowel অ8া]
র r Two manifestations:
i. রেফ repʰ, as the first member of a cluster, e.g. প6 ৎ6 not6 য6 D6(+deg+ব) (earlier E6 = +brvbar+deg+ব, a four-term cluster) etc. (placed over the following consonant)
ii. র-ফলা rɔ-pʰɔla, as the second or third member of a cluster, e.g. etc. (placed under the consonant it follows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃকঃ
অব
No | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
40 | U+09AD | ভ | BENGALI LETTER BHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
41 | U+09AE | ম | BENGALI LETTER MA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
42 | U+09AF | য | BENGALI LETTER YA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
43 | 09AF 09BC (U+09DF) | য় | Normalized form of BENGALI LETTER YYA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]; 09DF is the preferred code point, however it is not available for LGR as per the standards governing this LGR development
44 | U+09B0 | র | BENGALI LETTER RA | Consonant | 1 Bangla, 2 Manipuri | [112] [125]
45 | U+09B2 | ল | BENGALI LETTER LA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
46 | U+09B6 | শ | BENGALI LETTER SHA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
50 | U+09BE | া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
51 | U+09BF | ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
52 | U+09C0 | ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
53 | U+09C1 | ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112] [122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (= Halant) Virama (= Da ri) | 1 Bangla, 2 Assamese, 2 Manipuri | [112] [122] [125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112] [122] [125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122] [125]
Table 8: Bangla Code-Point Repertoire
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes some specific sequences which enable conditional inclusion in the repertoire of the Bangla LETTER A and E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, again followed by Bangla VOWEL SIGN AA, so as to represent the æ sound as in English ‘bat’, ‘cat’ etc.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
S1 | 0985 09CD 09AF 09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112] [122]
S2 | 098F 09CD 09AF 09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9: Sequences
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that are encoded in Unicode but have not been included in the repertoire in the LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.
Sr. No. | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10: Excluded Code Points
5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire as a standalone code point since it will never be used alone. It is used only in sequences, in the normalized forms of the three special characters ড়, ঢ় and য়.
Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড়, ঢ় and য় respectively
Table 10b: Excluded Code Points
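The reason the nukta appears only inside sequences can be checked with Python's standard `unicodedata` module: the precomposed letters U+09DC, U+09DD and U+09DF are excluded from canonical composition, so NFC normalization always yields the two-code-point form. The sketch below only demonstrates that behaviour; the dictionary name `NORMALIZED_FORMS` and the helper `to_normalized` are illustrative names, not part of the proposal.

```python
import unicodedata

# Precomposed letter -> sequence listed in the repertoire (base letter + NUKTA).
NORMALIZED_FORMS = {
    "\u09DC": "\u09A1\u09BC",  # ড় = ড + nukta
    "\u09DD": "\u09A2\u09BC",  # ঢ় = ঢ + nukta
    "\u09DF": "\u09AF\u09BC",  # য় = য + nukta
}


def to_normalized(label: str) -> str:
    """Return the NFC form of a label, as would be expected before LGR evaluation."""
    return unicodedata.normalize("NFC", label)


for precomposed, sequence in NORMALIZED_FORMS.items():
    result = to_normalized(precomposed)
    print(f"U+{ord(precomposed):04X} ->", " ".join(f"U+{ord(c):04X}" for c in result))
    assert result == sequence
```

Since the two-code-point form is stable under NFC, a label typed with either form ends up as the same sequence after normalization.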
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20.11.2009 and 18.07.2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) began with the following variables:
C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta
The Augmented Backus-Naur Formalism (ABNF) will use the following operators:
Sr. Number | Operator | Function
1 | “|” | Alternative
2 | “[ ]” | Optional
3 | “*” | Variable Repetition
4 | “( )” | Sequence Group
Table 11: The ABNF Formalism
5.4.2 The Vowel Sequence
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given. To facilitate understanding by users of other Brahmi scripts, equivalents in Devanāgarī are provided wherever necessary.
A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra /onuʃʃār/ (B), a Candrabindu (D) or a Visarga /biʃɔrgo/ (X). The number of D, B or X which can follow a V in Bangla is not necessarily restricted to one. Going by the rules illustrated in the document, it is clear that formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have sequences in which a Candrabindu is followed by an Anusvāra on the one hand, or by a Visarga on the other, as in হ্যাঁংচা ‘hænchā’ and হ্যাঁঃ ‘hæn’ respectively. The possibility of a Visarga or Anusvāra (onuʃʃār) following a Candrabindu exists in Bangla.
A vowel can optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phala. “Ya-phala is a presentation form of U+09AF, Bangla letter য or ‘ya’. Represented by the sequence <U+09CD BENGALI SIGN VIRAMA (Hasanta), U+09AF য BENGALI LETTER YA>, Ya-phala has a special form ্য.” Again, when combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. ‘aa’ (ā)), it is used for transcribing [æ], as in the “a” in the English word “bat”, written in Bangla as ব্যাট. A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example: V অ अ
5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples
VB অং अं
VD অঁ अँ
VX অঃ अः
VDB অঁং अँं
VDX অঁঃ अँः
VHCM অ্যা এ্যা
5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples
VHCMB অ্যাং এ্যাং
VHCMD অ্যাঁ এ্যাঁ
VHCMX অ্যাঃ এ্যাঃ
VHCMDB অ্যাঁং এ্যাঁং
VHCMDX অ্যাঁঃ এ্যাঁঃ
5.4.3 The Consonant Sequence
5.4.3.1 A Single Consonant (C)
Example: C ক क
5.4.3.2 A Consonant with Conditions
A Consonant can optionally be followed by a dependent vowel sign kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Example
CM িকক -कक
CB কং कं
CD কঁ कँ
CX কঃ कः
CH ক্ क् (Pure consonant)
CDB কঁং कँं
CDX কঁঃ कँः
5.4.3.3 CM Sequence
A CM sequence can be optionally followed by B, D, X, DB or DX.
Example
CMB কীংকং कक
CMD কাঁ काँ
CMX বীঃ वीः
CMDB কাঁং काँं
CMDX কাঁঃ काँः
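5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Example
CHC ন্ত → ন + ্ + ত ; न्त → न + ् + त
CHCHC ন্ত্র → ন + ্ + ত + ্ + র ; न्त्र → न + ् + त + ् + र
CHCHCHC ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য ; न्त्र्य → न + ् + त + ् + र + ् + य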
5.4.3.5 Subsets
While considering its subsets, as a representative example we will consider the combination CHC only; however, the same is equally applicable to CHCHC and CHCHCHC.
[A] The combination may be followed by M, B, D, X, DB or DX.
Example
CHCM ক্কী → ক ্ ক ী ; क्की → क ् क ी
CHCB ক্কং → ক ্ ক ং ; क्कं → क ् क ं
CHCD ক্কঁ → ক ্ ক ঁ ; क्कँ → क ् क ँ
CHCX ক্কঃ → ক ্ ক ঃ ; क्कः → क ् क ः
CHCDB ক্কঁং → ক ্ ক ঁ ং ; क्कँं → क ् क ँ ं
CHCDX ক্কঁঃ → ক ্ ক ঁ ঃ ; क्कँः → क ् क ँ ः
[B] *3(CH)CM may further be followed by a B, D, X, DB or DX.
Example
CHCMB ক্কীং → ক ্ ক ী ং ; क्कीं → क ् क ी ं
sup3ং rarr ক ক ং 4क rarr क क
CHCMD ক্কাঁ → ক ্ ক া ঁ ; क्काँ → क ् क ा ँ
CHCMX ক্কীঃ → ক ্ ক ী ঃ ; क्कीः → क ् क ी ः
CHCMDB ক্কাঁং → ক ্ ক া ঁ ং ; क्काँं → क ् क ा ँ ं
CHCMDX ক্কাঁঃ → ক ্ ক া ঁ ঃ ; क्काँः → क ् क ा ँ ः
5.4.4 The Khanda-Ta sequence
5.4.4.1 A single ‘Khanda’-Ta (Z)
Example: Z ৎ
5.4.4.2 A Khanda Ta Combination (see footnote 10)
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) ‘scolding’
Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
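The vowel, consonant and Khanda-Ta sequences of Section 5.4 can be summarised as a single pattern over the category letters. The Python sketch below is one possible reading, under the assumption that the input is already a string of category symbols (for example "CHCM") rather than raw code points; the constant and function names are hypothetical, and the sketch does not apply the additional restrictions stated elsewhere (such as VHCM being limited to S1 and S2, or [CH]Z requiring C to be RA).

```python
import re

TAIL = r"D?[BX]?"                                # B, D, X, DB or DX, all optional
VOWEL_SEQ = rf"V(?:HCM)?{TAIL}"                  # 5.4.2: V, V + modifiers, VHCM + modifiers
CONS_SEQ = rf"C(?:HC){{0,3}}(?:H|M?{TAIL})"      # 5.4.3: up to four consonants joined by H
KHANDA_TA = r"(?:CH)?Z"                          # 5.4.4: [CH]Z

UNIT = re.compile(rf"(?:{VOWEL_SEQ}|{CONS_SEQ}|{KHANDA_TA})")


def matches_abnf(shape: str) -> bool:
    """True if a category string such as 'CHCMDB' is one well-formed unit."""
    return UNIT.fullmatch(shape) is not None


for shape in ("VDB", "VHCMDX", "CHCHCHC", "CHZ", "VDD", "CMM", "CHCHCHCHC"):
    print(shape, matches_abnf(shape))
```

Working on category strings keeps the pattern readable; mapping code points to categories is a separate step that would follow Table 8.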
5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) could be described briefly here. Let us take up S in the first instance. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ - BANGLA LETTER A and U+098F এ - BANGLA LETTER E). For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF ya preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English ‘bat’, ‘fat’ etc. The Brahmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a ‘vowel killer’, although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in the orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
Another case refers to the formation of repha and ra-phalā in the said script, mentioned in the table above as P. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka “the sun”); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra “cycle”). The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā in both cases remain the same.
10 Refer to Rule P in Section 7, Table 16
43
6 Variants
This section discusses the variants in the Bangla script. The NBGP categorizes these confusing variants into two groups:
Group 1: Confusing due to pure visual similarity.
Group 2: Confusing due to deviation from the character formations normally perceived by the larger linguistic community.
For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These are not cases of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formations. They can cause confusion even to a careful observer and are hence proposed as variants.
Variants arise in a script when two or more forms that look the same are stored with different code points. In Bangla, the e-kāra, the ā-kāra and the o-kāra have different code points. One can type o with a consonant in one go, or type e-kāra and ā-kāra as two separate keys, and get the same result. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered a case of variant, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants
However, we propose two cases of true in-script variants in the Bangla script.
CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases wherein থ (tha, U+09A5) appears with Hasanta as a conjunct with স (sa, U+09B8) and ন (na, U+09A8):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, can be a little confusing, because the থ (tha) in the conjunct appears like a হ (ha). The conjunct can occur in initial, medial or final position (as shown below in example 1). It could also be typed wrongly, thinking it was a হ (ha) U+09B9, which increases the risk in label writing and identification.
Examples
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)
Fonts which represent the traditional Bangla writing system tend to create this problem. Therefore these may be taken as cases of variants in Bangla.
CASE II
Another interesting example of variants is encountered in ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing ‘repha’ (involving ra + Hasanta) and ‘ra-phalā’ (involving Hasanta + ra). ‘Repha’ could be formed by two sequences (mainly because both Assamese and Bangla are encoded in the same Unicode code-point range, and ‘B_RA’ as well as ‘A_RA’ refer to the same phonetic element). Here the final ligatures look the same and are as follows:
(1) B_RA + H + C
(2) A_RA + H + C
Where
B_RA = U+09B0 BENGALI LETTER RA (র)
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)
Example
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12: Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. The addition of Repha does not make any difference.
‘Ra-phala’ could be formed by two sequences on similar grounds, and the final ligatures would look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
Where C1 = any consonant except Khanda-ta
Example
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could confuse end users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP decided to define র and ৰ as variant code points, so that a single variant set of র and ৰ covers all cases. However, this will create blocked variant labels: e.g., if someone registers “ররর”, the variant label “ৰৰৰ” will be generated as a variant and will be blocked, and vice versa. The blocking applies only at the label level: if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, it is still possible.
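The label-level blocking described above can be illustrated with a short Python sketch. It is not the LGR engine: the function names `variant_labels` and `conflicts` are hypothetical, and it assumes that every permutation of র and ৰ in a label (including mixed ones) counts as a blocked variant of that label.

```python
from itertools import product

BANGLA_RA = "\u09B0"      # র
ASSAMESE_RA = "\u09F0"    # ৰ
SWAP = {BANGLA_RA: ASSAMESE_RA, ASSAMESE_RA: BANGLA_RA}


def variant_labels(label: str) -> set[str]:
    """All labels obtained by swapping any subset of RA letters, minus the original."""
    choices = [(ch, SWAP[ch]) if ch in SWAP else (ch,) for ch in label]
    return {"".join(combo) for combo in product(*choices)} - {label}


def conflicts(applied_for: str, registered: str) -> bool:
    """True if `applied_for` equals `registered` or is one of its variants."""
    return applied_for == registered or applied_for in variant_labels(registered)
```

With “ররর” registered, `conflicts` reports a clash for “ৰৰৰ” but not for ৰৰ or ৰৰৰৰ, because variants are computed per label and always have the same length as the original.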
6.2 Cross-Script Variants
A concise cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia¹¹ (formerly Oriya), keeping in mind the visual and technical confusion they may cause as labels in the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are proposed by the NBGP as variants. Although certain other characters are somewhat similar, they have not been included here; they are provided in the Appendix (10.3) for reference.
¹¹ Unicode uses Oriya for the script, although Odia is now the official term used.
1. Bangla and Nāgarī/Devanāgarī Script
Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14: Bangla and Devanāgarī cross-script variant code points
2. Bangla and Gurmukhī Script
Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15: Bangla and Gurmukhī cross-script variant code points
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in section 3.2 when written in the Bangla¹² script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications.
Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories mentioned in the table provided in the Code point repertoire (Section 5.1).
C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9) or (a/e) Ya-phalā (V1 H C1 M1), where
  V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E)
  H is 09CD (্ - BENGALI SIGN VIRAMA)
  C1 is 09AF (য - BENGALI LETTER YA)
  M1 is 09BE (া - BENGALI VOWEL SIGN AA)
  S1 and S2 are valid even though they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where
  C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL)
  H is 09CD (্ - BENGALI SIGN VIRAMA)
Table 16: Symbols used in WLE rules
¹² As used by Unicode, denoting and including both Assamese and Maṇipuri.
It is also worth mentioning here that in Bangla the consonant letters (or graphemes) are physically joined to form “clusters” that could theoretically conjoin two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are “nearly 380 unique consonant clusters”, out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, Pg. 4]. More details of such combinations can be seen in Pabitra Sarkar (1993) [135].
7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, the following are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড় ঢ় য়
2. H must be preceded by C
   Example: ক্
3. M must be preceded by C
   Example: কা
4. D must be preceded by either of V, C or M
   Example: আখখা হা
5. X must be preceded by either of V, C, M or D
   Example: উঃখঃবঃাঃ দ ঃ
6. B must be preceded by either of V, C, M or D
   Example: আংইংকং
7. Z must be preceded by V, C, M, D, B, X or P
   Example: ইৎকৎাৎাৎপ6ৎrৎ
   (S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V preceded by H.
9. S CANNOT be preceded by H
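The nine rules above amount to a "what may come before this symbol" check. The following Python sketch shows one way to apply them; it is an illustration under stated assumptions, not the normative XML ruleset. The code-point sets are reconstructed from the Unicode Bengali block to match the repertoire, the names `symbols`, `is_valid_label` and `ALLOWED_BEFORE` are hypothetical, and S is treated as ending in M, so that B, D, X and Z may follow it.

```python
VOWELS = set("\u0985\u0986\u0987\u0988\u0989\u098A\u098B\u098F\u0990\u0993\u0994")
CONSONANTS = (
    {chr(c) for c in range(0x0995, 0x09A9)}       # KA .. NA
    | {chr(c) for c in range(0x09AA, 0x09B1)}     # PA .. RA
    | set("\u09B2\u09B6\u09B7\u09B8\u09B9\u09F0\u09F1")
)
MATRAS = {chr(c) for c in range(0x09BE, 0x09C5)} | set("\u09C7\u09C8\u09CB\u09CC")
SIGNS = {"\u0982": "B", "\u0981": "D", "\u0983": "X", "\u09CD": "H", "\u09CE": "Z"}
NUKTA = "\u09BC"
NUKTA_BASES = set("\u09A1\u09A2\u09AF")           # bases of ড় ঢ় য় (rule 1, CN)
RA = {"\u09B0", "\u09F0"}                         # C2 of rule P
S_SEQUENCES = ("\u0985\u09CD\u09AF\u09BE", "\u098F\u09CD\u09AF\u09BE")

# Rules 2-7: symbols that may immediately precede each dependent symbol.
ALLOWED_BEFORE = {
    "H": {"C"},
    "M": {"C"},
    "D": {"V", "C", "M", "S"},
    "X": {"V", "C", "M", "D", "S"},
    "B": {"V", "C", "M", "D", "S"},
    "Z": {"V", "C", "M", "D", "B", "X", "S"},
}


def symbols(label: str) -> list[tuple[str, str]]:
    """Split a label into (symbol, text) pairs; S sequences are matched first."""
    tokens, i = [], 0
    while i < len(label):
        ch = label[i]
        if label.startswith(S_SEQUENCES, i):
            tokens.append(("S", label[i:i + 4]))
            i += 4
        elif ch in NUKTA_BASES and label[i + 1:i + 2] == NUKTA:
            tokens.append(("C", label[i:i + 2]))
            i += 2
        elif ch in CONSONANTS:
            tokens.append(("C", ch))
            i += 1
        elif ch in VOWELS:
            tokens.append(("V", ch))
            i += 1
        elif ch in MATRAS:
            tokens.append(("M", ch))
            i += 1
        elif ch in SIGNS:
            tokens.append((SIGNS[ch], ch))
            i += 1
        else:
            raise ValueError(f"U+{ord(ch):04X} is outside the assumed repertoire")
    return tokens


def is_valid_label(label: str) -> bool:
    """Apply WLE rules 2-9 to a label; an empty or out-of-repertoire label fails."""
    try:
        tokens = symbols(label)
    except ValueError:
        return False
    if not tokens:
        return False
    for k, (sym, _) in enumerate(tokens):
        prev = tokens[k - 1][0] if k else None
        if sym in ("V", "S"):                     # rules 8 and 9
            if prev == "H":
                return False
        elif sym == "Z" and prev == "H":          # rule 7 via P: RA + Hasanta
            if k < 2 or tokens[k - 2][1] not in RA:
                return False
        elif sym in ALLOWED_BEFORE and prev not in ALLOWED_BEFORE[sym]:
            return False
    return True
```

As a usage example, `is_valid_label("\u09AD\u09B0\u09CD\u09CE\u09B8\u09A8\u09BE")` accepts ভর্ৎসনা, while a label that starts with a kāra or doubles an Anusvāra is rejected. Precomposed U+09DC, U+09DD and U+09DF are rejected here on purpose, since the sketch expects already-normalized input.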
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in adding such ligatures, because any user can easily create them even though they may not be attested combinations.
7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া /bæŋkʌv ɪndiə/ (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE) (meaning Bank of India). This is a case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, a space or the Zero Width Non-Joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic communities; hence it is explicitly prohibited by the NBGP.
In future, depending on the prevailing requirements from the community, a future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules the ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.
1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Example
ক क
াক ाक
ংক क
ক क
ঃক ःक
As can be seen, such a combination automatically results in a “golu” or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.
2. H is not permitted after V, B, D, X, M, S.
Example
অ अ
অং क ক क কঃ कः ি )क
3. The number of B, D or X permitted after a Consonant, a Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalid.
Example
কংং क
ক क
কঃঃ कःः
কা का
কীঃঃ कःः
অংং अ
অ अ
অঃঃ अःः
4. The number of M permitted after a Consonant is restricted to one.
Example
কীী की
5. M is not permitted after V.
Example
ইাঈৗ ईाईौ
6. The combinations Anusvāra + Visarga and Visarga + Anusvāra are not permissible.
Example
কংঃ कः
কঃং कः
8 Contributors
8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members
8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals’ Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran O Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata; New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje and Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. ‘Bangla Harapher Panch Parba’. In Swapan Chakraborty, ed., Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān Ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the MS University of Baroda, Social Science Number 292173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds., Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 11. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25/11/2017.
[124] Ethnologue: Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25/11/2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20/10/2019.
[126] Wikipedia: Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25/11/2017.
[129] Omniglot: Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10/5/2018.
[130] Wikipedia: Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25/11/2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20/, accessed on 10/5/2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10/5/2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10/5/2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1123-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 121-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). ‘Srīhatta-Nāgarī Lipir Utpatti o Bikāś’. Bangla Academy Patrika (Dhaka), Vol 12 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia: Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19/5/2018.
[139] Furui, Ryosuke (2015). ‘Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century’, in Bhairabi Prasad Sahu & Hermann Kulke, eds., Interrogating Political Systems: Integrative Processes and States in Pre-Modern India, Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). ‘Phones of Bengali’. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646 (UNICODE). https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia: Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. ‘Bangla Spelling Reform: the Long and Short of It’. Bangla Journal 19215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. “What has happened So Far In terms of Script Reforms”. Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka on 10/07/2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 – Core Specification, Chapter 12, p. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language or script certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla¹³ in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of the five-fold ‘varga’ (as defined under Table 5). Bangla does not normally allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem ‘Lichuchor’ written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2. CH can come with Khanda Ta only in the case where C is ra (র) (09B0):
র্ৎ as in ভর্ৎসনা
3. Only the following combinations with VHCM will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
10.2 ‘Sylheti Nagarī lipi’ or ‘Siloti’
This version of the Bangla script resembles the ‘Kaithī’ script (ISO 15924) used by accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, and widely in use during the 1880s. There were several other names for Sylheti Nagarī or Siloti [129], such as ‘Jalalabada Nagarī’, ‘Fula (flower) Nagarī’, ‘Muslim Nagarī’ or ‘Muhammad Nagarī’. It is said that Shah Jalal brought the script with him to Sylhet in the 13th-14th century [138], although some have suggested that it was an invention of the Afghan rulers of Sylhet [137]. Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here [138].
Table 17 – The Script Table of Sylheti Nagarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable but not enough to be defined as variant code points.
¹³ This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable
Table 18: Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhi
Bangla | Gurmukhi | NBGP Decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable
Table 19: Bangla and Gurmukhi confusable code points
Bangla | Gurmukhi | NBGP Decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable
Table 20 – Bangla and Gurmukhī distinguishable code points
10.3.3 Bangla and Oriya (Odia)
Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable
Table 21 – Bangla and Oriya confusable code points
Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable
Table 22 – Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs
Consonants | Phonetic Value | Allographs: Clusters | Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (not privileged enough to have clusters as a first member) Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (not privileged enough to have clusters as a first member) egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value but is mostly pronounced as n
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (not privileged enough to have clusters)
য dʒ The secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word, it is a vowel æ (as in শ8ামলর 8াপার etc.). But after a non-initial consonant it just doubles it in pronunciation (as in কায6ধায6 etc.). The +য combination has two physical manifestations: র 8 and য6
ক8(p+য)স8(+য)র 8(+য) [Just র 8 is never used in Bangla orthography; র 8া is, but then its last two symbols, Ya-phala and a-kara, constitute a vowel sign representing the vowel অ8া]
র r Two manifestations:
i. রেফ repʰ, as the first member of a cluster, e.g. প6ৎ6 not6 য6D6(+deg+ব) (earlier E6=+brvbar+deg+ব, a four-term cluster) etc. (placed over the following consonant)
ii. র-ফলা rɔ-pʰɔla, as the second/third member of a cluster, e.g. etc. (placed under the consonant it follows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃকঃ
অব
No | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
47 | U+09B7 | ষ | BENGALI LETTER SSA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
48 | U+09B8 | স | BENGALI LETTER SA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
49 | U+09B9 | হ | BENGALI LETTER HA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
50 | U+09BE | া | BENGALI VOWEL SIGN AA | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
51 | U+09BF | ি | BENGALI VOWEL SIGN I | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
52 | U+09C0 | ী | BENGALI VOWEL SIGN II | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
53 | U+09C1 | ু | BENGALI VOWEL SIGN U | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112], [122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (= Halant) / Virama (= Da ri) | 1 Bangla, 2 Assamese, 2 Manipuri | [112], [122], [125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122], [125]
Table 8: Bangla Code-Point Repertoire
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes some specific sequences which enable conditional inclusion of the Bangla LETTER A and E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, again followed by Bangla VOWEL SIGN AA, in the repertoire, to allow the æ sound as in English ‘bat’, ‘cat’, etc.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not an exhaustive list) | Reference
S1 | 0985 09CD 09AF 09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112], [122]
S2 | 098F 09CD 09AF 09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9: Sequences
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that are encoded in Unicode but have not been included in the repertoire in the LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.
Sr. No. | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10: Excluded Code Points
5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire since it will never be used alone. It will be used only in sequences, in the normalized forms of the three special characters ড় ঢ় য়.
Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড় ঢ় য় respectively
Table 10b: Excluded Code Points
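The reference to normalized forms can be made concrete with the standard library. The Python sketch below is illustrative; the helper name `normalize_label` is an assumption. It relies on the fact that U+09DC, U+09DD and U+09DF are composition exclusions in Unicode, so NFC turns each of them into its base letter followed by U+09BC and never recomposes the pair.

```python
import unicodedata

# Precomposed letter -> normalized (NFC) sequence of base letter + NUKTA.
NUKTA_FORMS = {
    "\u09DC": "\u09A1\u09BC",   # ড় = ড + ়
    "\u09DD": "\u09A2\u09BC",   # ঢ় = ঢ + ়
    "\u09DF": "\u09AF\u09BC",   # য় = য + ়
}


def normalize_label(label: str) -> str:
    """Return the NFC form of a label, as assumed for LGR processing."""
    return unicodedata.normalize("NFC", label)


if __name__ == "__main__":
    for single, sequence in NUKTA_FORMS.items():
        print(f"U+{ord(single):04X}", normalize_label(single) == sequence)
```

This is why U+09BC can stay out of the stand-alone repertoire while still appearing inside the three normalized sequences.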
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20/11/2009 and 18/07/2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) began with the following variables:
C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta
The Augmented Backus-Naur Formalism (ABNF) will use the following operators:
Sr. Number | Operator | Function
1 | “|” | Alternative
2 | “[ ]” | Optional
3 | “*” | Variable Repetition
4 | “( )” | Sequence Group
Table 11: The ABNF Formalism
5.4.2 The Vowel Sequence
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given. To facilitate understanding for users of other Brahmi scripts, equivalents in Devanāgarī are provided wherever necessary.
A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra /onuʃʃār/ (B), a Candrabindu (D) or a Visarga /biʃɔrgo/ (X). The number of D, B or X signs which can follow a V in Bangla is not strictly limited to one. Going by the rules illustrated in the document, it is clear that formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have formations or sequences such as anusvara followed by a chandrabindu on the one hand and visarga followed by a chandrabindu on the other, as in হ্যাংচা ‘hænchā’ and হ্যাঃ ‘hæn’ respectively. The possibility of a Visarga or Anusvāra (onuʃʃār) following a Candrabindu exists in Bangla.
A vowel can optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phala. Ya-phala is a presentation form of U+09AF, the Bangla letter য or ‘ya’, represented by the sequence <U+09CD ্ BENGALI SIGN VIRAMA (i.e. Bangla sign Hasanta or Virama), U+09AF য BENGALI LETTER YA>. Ya-phala has a special form ্য. Again, when combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. ‘aa’ (ā)), it is used for transcribing [æ] as in the “a” in the English word “bat”, written in Bangla as ব্যাট.
A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example: V অ अ
5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], Candrabindu [D], Visarga [X], Candrabindu + Anusvāra [DB], Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples
VB অং अं
VD অঁ अँ
VX অঃ अः
VDB অঁং अँं
VDX অঁঃ अँः
VHCM অ্যা এ্যা
5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], Candrabindu [D], Visarga [X], Candrabindu + Anusvāra [DB] or Candrabindu + Visarga [DX].
Examples
VHCMB অ্যাং এ্যাং
VHCMD অ্যাঁ এ্যাঁ
VHCMX অ্যাঃ এ্যাঃ
VHCMDB অ্যাঁং এ্যাঁং
VHCMDX অ্যাঁঃ এ্যাঁঃ
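The vowel-sequence combinations above can be written as one regular expression over code points. This is a hedged Python sketch rather than the proposal's ABNF: the names are illustrative, the list of independent vowels is taken from the Unicode Bengali block minus the excluded U+098C, and the VHCM part is restricted to the two sequences অ্যা and এ্যা.

```python
import re

INDEPENDENT_VOWELS = "\u0985-\u098B\u098F\u0990\u0993\u0994"   # V
YA_PHALA_AA = "\u09CD\u09AF\u09BE"                             # H + YA + AA
# Optional tail: D, DB, DX, B or X (Candrabindu U+0981, Anusvāra U+0982, Visarga U+0983).
TAIL = "(?:\u0981[\u0982\u0983]?|[\u0982\u0983])?"

VOWEL_SEQUENCE = re.compile(
    f"(?:[\u0985\u098F]{YA_PHALA_AA}|[{INDEPENDENT_VOWELS}]){TAIL}"
)


def is_vowel_sequence(text: str) -> bool:
    """Return True if `text` is V or VHCM with an optional B/D/X/DB/DX tail."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None
```

Under these assumptions অ, অং and অ্যাঃ are accepted, while a doubled Anusvāra or a Ya-phala sequence built on ই is not.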
5.4.3 The Consonant Sequence
5.4.3.1 A Single Consonant (C)
Example:
C  ক  क
5.4.3.2 A Consonant with Conditions
A Consonant can optionally be followed by a dependent vowel sign, kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
CM  কি  कि
CB  কং  कं
CD  কঁ  कँ
CX  কঃ  कः
CH  ক্  क् (pure consonant)
CDB  কঁং  कँं
CDX  কঁঃ  कँः
5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB or DX.
Examples:
CMB  কীং  कीं
CMD  কাঁ  काँ
CMX  বীঃ  वीः
CMDB  কাঁং  काँं
CMDX  কাঁঃ  काँः
5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Examples:
CHC  ন্ত → ন + ্ + ত  न् + त
CHCHC  ন্ত্র → ন + ্ + ত + ্ + র  न् + त् + र
CHCHCHC  ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য  न् + त् + र् + य
5.4.3.5 Subsets
While considering the subsets, we will take the combination CHC only as a representative example; however, the same applies equally to CHCHC and CHCHCHC.
[A] The combination may be followed by M, B, D, X, DB or DX.
Examples:
CHCM  ক্কী → ক ্ ক ী  क्की → क ् क ी
CHCB  ক্কং → ক ্ ক ং  क्कं → क ् क ं
CHCD  ক্কঁ → ক ্ ক ঁ  क्कँ → क ् क ँ
CHCX  ক্কঃ → ক ্ ক ঃ  क्कः → क ् क ः
CHCDB  ক্কঁং → ক ্ ক ঁ ং  क्कँं → क ् क ँ ं
CHCDX  ক্কঁঃ → ক ্ ক ঁ ঃ  क्कँः → क ् क ँ ः
[B] *3(CH)CM may further be followed by B, D, X, DB or DX.
Examples:
CHCMB  ক্কীং → ক ্ ক ী ং  क्कीं → क ् क ी ं
CHCMD  ক্কাঁ → ক ্ ক া ঁ  क्काँ → क ् क ा ँ
CHCMX  ক্কীঃ → ক ্ ক ী ঃ  क्कीः → क ् क ी ः
CHCMDB  ক্কাঁং → ক ্ ক া ঁ ং  क्काँं → क ् क ा ँ ं
CHCMDX  ক্কাঁঃ → ক ্ ক া ঁ ঃ  क्काँः → क ् क ा ँ ः
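As a rough cross-check of 5.4.3.1 to 5.4.3.5, the consonant sequence can be written as a single pattern over the same variable letters. The expression below is one assumed reading of the prose. In particular, it admits a trailing Hasanta only after a single consonant, because the subsets in 5.4.3.5 do not list H. The name `is_consonant_sequence` is hypothetical.

```python
import re

# *3(CH)C from 5.4.3.4 becomes (?:CH){0,3}C: one to four consonants
# joined by Hasanta. An optional kara (M) and at most one of
# B, D, X, DB, DX may follow. "CH" alone is the pure consonant of 5.4.3.2.
CONSONANT_SEQUENCE = re.compile(r"CH|(?:CH){0,3}CM?(?:DB|DX|B|D|X)?")


def is_consonant_sequence(pattern: str) -> bool:
    """Return True if the category string matches the sketch of 5.4.3."""
    return CONSONANT_SEQUENCE.fullmatch(pattern) is not None
```

The bound `{0,3}` is what limits a cluster to four consonants; a five-consonant string such as `"CHCHCHCHC"` is rejected.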
5.4.4 The Khanda-Ta Sequence
5.4.4.1 A Single 'Khanda'-Ta (Z)
Example:
Z  ৎ (= ত্)
5.4.4.2 A Khanda Ta Combination (see footnote 10)
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) 'scolding'.
Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
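The note on [CH]Z can be checked at code point level. The following is a minimal sketch that tests only this one condition (a Khanda Ta directly after Hasanta requires the consonant before the Hasanta to be one of the two RA letters). The function name is an assumption introduced here.

```python
RA_LETTERS = {"\u09B0", "\u09F0"}  # Bangla RA, Assamese RA
HASANTA = "\u09CD"
KHANDA_TA = "\u09CE"


def khanda_ta_context_ok(label: str) -> bool:
    """Check that every Hasanta + Khanda Ta pair is preceded by a RA letter."""
    for i, ch in enumerate(label):
        if ch == KHANDA_TA and i >= 1 and label[i - 1] == HASANTA:
            if i < 2 or label[i - 2] not in RA_LETTERS:
                return False
    return True
```

With this sketch, the code points of ভর্ৎসনা pass, while ka + Hasanta + Khanda Ta fails.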
5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) may be described briefly here. Let us take up S first. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ BENGALI LETTER A and U+098F এ BENGALI LETTER E). For rendering Ya-phalā following অ and এ, it is necessary to type U+09CD plus U+09AF ya, preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound, as in English 'bat', 'fat', etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can be marked with the Hasanta. But as we see here, Bangla ends up with a deviant feature in its orthography, wherein Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
The other case, mentioned in Table 16 as P, refers to the formation of repha and ra-phalā in the said script. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra "cycle"). The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or by Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.
10 Refer to Rule P in Section 7, Table 16.
6 Variants
This section discusses the variants in the Bangla script. The NBGP categorizes these confusing variants into two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from the character formations normally perceived by the larger linguistic community
For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These are not cases of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formation. They can cause confusion even to a careful observer and are hence being proposed as variants.
Variants arise in a script when two or more identical-looking forms are stored with different code points. In Bangla, the e-kāra, the ā-kāra and the o-kāra have different code points. One can type o-kāra with a consonant in one go, or obtain the same result by typing e-kāra and ā-kāra as two separate keys. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered a case of variant, because a kāra followed by a kāra is not allowed.
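The two ways of typing ko can be compared directly. The sketch below uses Python's standard `unicodedata` module to show that the two-key sequence and the single o-kāra are distinct strings that become identical under Unicode normalization form NFC. It illustrates the encoding point only and is not part of the LGR.

```python
import unicodedata

KA = "\u0995"       # BENGALI LETTER KA
E_KARA = "\u09C7"   # BENGALI VOWEL SIGN E
AA_KARA = "\u09BE"  # BENGALI VOWEL SIGN AA
O_KARA = "\u09CB"   # BENGALI VOWEL SIGN O

two_key_ko = KA + E_KARA + AA_KARA  # e-kara and aa-kara typed separately
one_key_ko = KA + O_KARA            # o-kara typed in one go


def nfc(text: str) -> str:
    """Apply Unicode normalization form NFC."""
    return unicodedata.normalize("NFC", text)


same_before = two_key_ko == one_key_ko           # False: different code points
same_after = nfc(two_key_ko) == nfc(one_key_ko)  # True: NFC composes to U+09CB
```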
6.1 In-Script Variants
However, we propose two cases of true in-script variants in the Bangla script.
CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases wherein (U+09A5) থ (tha) appears with Hasanta as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, could be a little confusing, as the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final position (as shown below in example no. 1). It could also be typed wrongly, on the assumption that it was a হ (ha) U+09B9, increasing the risks in label writing and identification.
Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)
Fonts which represent the traditional Bangla writing system tend to create this problem. Therefore, these may be taken as cases of variants in Bangla.
CASE II
Another interesting example of variants is encountered in ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra). 'Repha' could be formed by two sequences (mainly because both Assamese and Bangla share the same Unicode code points, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here the final ligatures look the same and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C
Where
B_RA = U+09B0 BENGALI LETTER RA (র)
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)
Example
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12: Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. Addition of Repha does not make any difference.
'Ra-phala' could be formed by two sequences on similar grounds, and the final ligatures would look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
Where C1 = any consonant except Khanda-ta
Example
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusion for end-users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP concluded that র and ৰ should be defined as variant code points, so that a single variant set containing র and ৰ covers all cases. However, this will create blocked variant labels: e.g. if someone registers "ররর", the variant label "ৰৰৰ" will be generated as a variant and will be blocked, and vice versa. The blocking applies only at the label level: if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, it is still possible.
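The label-level effect of the র / ৰ variant set can be illustrated with a small enumeration. The sketch assumes that every mix of the two RA letters in a label counts as a variant label, as is usual for code point variants. The text above only names the fully swapped label, so this reading, the function name `variant_labels` and the omission of allocation dispositions are all assumptions.

```python
from itertools import product

BANGLA_RA = "\u09B0"
ASSAMESE_RA = "\u09F0"

# Each RA letter may stand for itself or for its variant.
RA_VARIANTS = {
    BANGLA_RA: (BANGLA_RA, ASSAMESE_RA),
    ASSAMESE_RA: (ASSAMESE_RA, BANGLA_RA),
}


def variant_labels(label: str) -> set[str]:
    """Return all labels differing from `label` only by RA substitutions."""
    choices = [RA_VARIANTS.get(ch, (ch,)) for ch in label]
    return {"".join(combo) for combo in product(*choices)} - {label}
```

Under this reading, a label of three Bangla RA letters has seven variant labels, all of length three. Labels of two or four RA letters are not among them, which matches the remark that they remain registrable.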
6.2 Cross-Script Variants
A concise cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia (formerly Oriya; see footnote 11), keeping in mind the visual and technical confusion they may cause as labels in the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain other characters which are somewhat similar, they have not been included here; they are provided in the Appendix (Section 10.3) for reference.
11 Unicode uses Oriya for the script, although Odia is now the official term used.
1. Bangla and Nāgarī/Devanāgarī Script
Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14: Bangla and Devanāgarī cross-script variant code points
2. Bangla and Gurmukhī Script
Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15: Bangla and Gurmukhī cross-script variant code points
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Bangla script (see footnote 12). The rules have been drafted in such a way that they can be easily translated into the LGR specifications. Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories, as mentioned in the table provided in the Code point repertoire (Section 5.1).
C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1 | S2 (from Table 9), i.e. (a/e) Ya-phalā ā-kāra (V1 H C1 M1), where V1 is either 0985 (অ BENGALI LETTER A) or 098F (এ BENGALI LETTER E), H is 09CD (্ BENGALI SIGN VIRAMA), C1 is 09AF (য BENGALI LETTER YA) and M1 is 09BE (া BENGALI VOWEL SIGN AA). S1 and S2 are valid even though they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either 09B0 (র BENGALI LETTER RA) or 09F0 (ৰ ASSAMESE LETTER RA; Unicode name BENGALI LETTER RA WITH MIDDLE DIAGONAL) and H is 09CD (্ BENGALI SIGN VIRAMA).
Table 16: Symbols used in WLE rules
12 As used by Unicode, denoting and including both Assamese and Maṇipuri.
It is also perhaps worth mentioning here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters", which could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, pg. 4]. More details of such combinations can be seen in Pabitra Sarkar (1993) [135].
7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1. C is the set of C and CN, where CN is the set of normalized forms of ড় ঢ় য়
2. H must be preceded by C
Example: ক্
3. M must be preceded by C
Example: কা
4. D must be preceded by either of V, C or M
Example: আঁ, খঁ, খাঁ
5. X must be preceded by either of V, C, M or D
Example: উঃ, খঃ
6. B must be preceded by either of V, C, M or D
Example: আং, ইং, কং
7. Z must be preceded by V, C, M, D, B, X or P
Example: ইৎ, কৎ, র্ৎ
(S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H (details in 7.1.1, Case of V preceded by H)
9. S CANNOT be preceded by H
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, and may not be attested, but there is no harm in admitting such ligatures because any user can easily create them.
7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া /bæŋkʌv ɪndiə/ (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE) (meaning 'Bank of India'). This is a case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, space or Zero Width Non-Joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceived similarity between two labels (with and without H) for the majority of the linguistic communities; hence it is explicitly prohibited by the NBGP.
In future, if required, and depending on the prevailing requirements of the community, a future NBGP may consider revisiting this rule.
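Rules 2 to 9 are all "what may precede X" conditions, so they can be sketched as one pass over a label with a one-token look-behind. The code below is illustrative only. The category sets are approximated from the Unicode Bengali block (Table 8 remains the authoritative repertoire). S is treated like a kāra for whatever follows it, in line with the note under rule 7. The names `tokenize` and `wle_violations` are hypothetical.

```python
def _span(lo: int, hi: int) -> set[str]:
    return {chr(cp) for cp in range(lo, hi + 1)}


# Assumed category sets; Table 8 is the authoritative repertoire.
VOWELS = _span(0x0985, 0x098B) | {"\u098F", "\u0990", "\u0993", "\u0994"}
CONSONANTS = (_span(0x0995, 0x09A8) | _span(0x09AA, 0x09B0) | {"\u09B2"}
              | _span(0x09B6, 0x09B9) | {"\u09F0", "\u09F1"})
MATRAS = _span(0x09BE, 0x09C4) | {"\u09C7", "\u09C8", "\u09CB", "\u09CC"}
SINGLE = {"\u0982": "B", "\u0981": "D", "\u0983": "X",
          "\u09CD": "H", "\u09CE": "Z"}
NUKTA = "\u09BC"
NUKTA_BASES = {"\u09A1", "\u09A2", "\u09AF"}  # rule 1: CN forms
RA_LETTERS = {"\u09B0", "\u09F0"}
S_SEQUENCES = ("\u0985\u09CD\u09AF\u09BE", "\u098F\u09CD\u09AF\u09BE")

# Rules 2-6: categories allowed immediately before each symbol.
PREV_ALLOWED = {"H": "C", "M": "C", "D": "VCMS", "X": "VCMDS", "B": "VCMDS"}


def tokenize(label: str) -> list[tuple[str, str]]:
    """Split a label into (category, text) tokens; S and CN are one token."""
    tokens, i = [], 0
    while i < len(label):
        seq = next((s for s in S_SEQUENCES if label.startswith(s, i)), None)
        if seq:
            tokens.append(("S", seq))
            i += len(seq)
            continue
        ch = label[i]
        if ch in NUKTA_BASES and label[i + 1:i + 2] == NUKTA:
            tokens.append(("C", label[i:i + 2]))
            i += 2
            continue
        if ch in VOWELS:
            cat = "V"
        elif ch in CONSONANTS:
            cat = "C"
        elif ch in MATRAS:
            cat = "M"
        else:
            cat = SINGLE.get(ch, "?")
        tokens.append((cat, ch))
        i += 1
    return tokens


def wle_violations(label: str) -> list[str]:
    """Return a list of rule violations; an empty list means none found."""
    tokens = tokenize(label)
    problems = []
    for i, (cat, _text) in enumerate(tokens):
        prev = tokens[i - 1][0] if i else None
        if cat == "?":
            problems.append(f"token {i}: outside the assumed repertoire")
        elif cat in PREV_ALLOWED:
            if prev is None or prev not in PREV_ALLOWED[cat]:
                problems.append(f"token {i}: {cat} not allowed after {prev}")
        elif cat == "Z":  # rule 7, with P = RA + Hasanta
            after_p = (prev == "H" and i >= 2
                       and tokens[i - 2][1] in RA_LETTERS)
            if not after_p and (prev is None or prev not in "VCMDBXS"):
                problems.append(f"token {i}: Z not allowed after {prev}")
        elif cat in ("V", "S") and prev == "H":  # rules 8 and 9
            problems.append(f"token {i}: {cat} not allowed after H")
    return problems
```

Recognizing S1 and S2 before the per-code-point lookup is what lets অ্যা pass even though its internal Hasanta follows a vowel. A real implementation would express the same conditions as context rules in the LGR XML rather than in procedural code.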
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules the ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.
1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Example:
্ক  ्क
াক  ाक
ংক  ंक
ঁক  ँक
ঃক  ःक
As can be seen, such a combination automatically results in a "golu", or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is applied quasi-automatically wherever supported by the OS.
2. H is not permitted after V, B, D, X, M, S.
Example:
অ্  अ्
কং্  कं्
কঁ্  कँ्
কঃ্  कः्
কি্  कि्
3. The number of B, D or X permitted after a Consonant, a Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalid:
Example:
কংং  कंं
কঁঁ  कँँ
কঃঃ  कःः
কাঁঁ  काँँ
কীঃঃ  कीःः
অংং  अंं
অঁঁ  अँँ
অঃঃ  अःः
4. The number of M permitted after a Consonant is restricted to one.
Example:
কীী  कीी
5. M is not permitted after V.
Example:
ইা, ঈৌ  ईा, ईौ
6. The combinations Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Example:
কংঃ  कंः
কঃং  कःं
8 Contributors
8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members
8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals' Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. The Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje & Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920 (2000). The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. 'Bangla Harapher Panch Parba'. In Swapan Chakraborty, ed., Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii + 316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the M.S. University of Baroda, Social Science Number, 29.2: 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds., Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25.11.2017.
[124] Ethnologue: Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25.11.2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20.10.2019.
[126] Wikipedia: Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25.11.2017.
[129] Omniglot: Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10.5.2018.
[130] Wikipedia: Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25.11.2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20, accessed on 10.5.2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10.5.2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10.5.2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1.1: 23-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1.2: 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). 'Srīhatta-Nāgarī Lipir Utpatti o Bikāś'. Bangla Academy Patrika (Dhaka), Vol. 1.2 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia: Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19.5.2018.
[139] Furui, Ryosuke (2015). 'Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century', in Bhairabi Prasad Sahu & Hermann Kulke, eds., Interrogating Political Systems: Integrative Processes and States in Pre-Modern India, Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). 'Phones of Bengali'. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646. UNICODE. https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia: Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. 'Bangla Spelling Reform: the Long and Short of It'. Bangla Journal 19: 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. "What has happened So Far In terms of Script Reforms". Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka, on 10.07.2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 - Core Specification, Chapter 12, p. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language or script, certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla (see footnote 13) in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold 'varga' (as defined under Table 5). Bangla generally does not allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem 'Lichuchor' written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2. CH can come with Khanda Ta only in the case where C is ra (র) (09B0): র্ৎ, as in ভর্ৎসনা.
3. Only the following combinations with VHCM will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
10.2 'Sylheti Nagarī lipi' or 'Siloti'
This version of the Bangla script resembles the 'Kaithī' script (ISO 15924) used by accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, widely in use during the 1880s. There were several other names for Sylheti Nagarī or Siloti (129), such as 'Jalalabada Nagarī', 'Fula (flower) Nagarī', 'Muslim Nagarī' or 'Muhammad Nagarī'. It is said that Shah Jalal had brought the script with him to Sylhet in the 13th-14th century (138), although some have suggested that it was an invention of the Afghan rulers of Sylhet (137). Some ascribe the credit to Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script, with 32 symbols, are reproduced here (138).
Table 17: The Script Table of Sylheti Nagarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable, or (b) confusable but not enough so to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable
Table 18: Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhī
Bangla | Gurmukhī | NBGP Decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable
Table 19: Bangla and Gurmukhī confusable code points
Bangla | Gurmukhī | NBGP Decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable
Table 20: Bangla and Gurmukhī distinguishable code points
10.3.3 Bangla and Oriya (Odia)
Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable
Table 21: Bangla and Oriya confusable code points
Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable
Table 22: Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs
Consonants PhoneticValue Allographs
Clusters TransparentForm(BanglaAkademifont)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ Thisletterdoesnothaveanyparticularphoneticvaluebutmostlypronouncedasn
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒThesecondarysymbol(allograph)jɔ-phalahastwophoneticvaluesWhenaddedtotheinitialconsonantinaworditisavowelaelig(asinশ8ামলর 8াপারetc)Butafteranon-initialconsonantitjustdoublesitinpronunciation(asinকায6ধায6etc)The+যcombinationhastwophysicalmanifestationsmdashর 8andয6
ক8(p+য)স8(+য)র 8(+য)[Justর 8isneverusedinBanglaorthographyর 8াisbutthenitslasttwosymbolsYa-phalaa-karaconstituteavowelsignrepresentingthevowelঅ8া]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ hwordfinallywordmediallyitdoublesthepronunciationofthefollowingconsonant
অঃকঃ
অব
No | Unicode Code Point | Glyph | Character Name | Category | Language(s) with EGIDS Value | References and Comment
54 | U+09C2 | ূ | BENGALI VOWEL SIGN UU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
55 | U+09C3 | ৃ | BENGALI VOWEL SIGN VOCALIC R | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
56 | U+09C4 | ৄ | BENGALI VOWEL SIGN VOCALIC RR | Kāra (Mātrā) | 1 Bangla, 2 Assamese | [112], [122]
57 | U+09C7 | ে | BENGALI VOWEL SIGN E | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
58 | U+09C8 | ৈ | BENGALI VOWEL SIGN AI | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
59 | U+09CB | ো | BENGALI VOWEL SIGN O | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
60 | U+09CC | ৌ | BENGALI VOWEL SIGN AU | Kāra (Mātrā) | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
61 | U+09CD | ্ | BENGALI SIGN VIRAMA | Hasanta (= Halant) / Virama (= Dāri) | 1 Bangla, 2 Assamese, 2 Manipuri | [112], [122], [125]
62 | U+09CE | ৎ | BENGALI LETTER KHANDA TA | Consonant | 1 Bangla, 2 Manipuri, 2 Assamese | [112], [122], [125]
63 | U+09F0 | ৰ | BENGALI LETTER RA WITH MIDDLE DIAGONAL | Consonant | 2 Assamese | [122]
64 | U+09F1 | ৱ | BENGALI LETTER RA WITH LOWER DIAGONAL | Consonant | 2 Assamese, 2 Manipuri | [122], [125]
Table 8: Bangla Code-Point Repertoire
Apart from the above individual code points, the Neo-Brāhmī Generation Panel also proposes some specific sequences. These enable conditional inclusion in the repertoire of the Bangla LETTER A and LETTER E followed by Bangla SIGN VIRAMA and Bangla LETTER YA, in turn followed by Bangla VOWEL SIGN AA, in order to represent the æ sound as in English 'bat', 'cat', etc.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not an exhaustive list) | Reference
S1 | 0985 09CD 09AF 09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112], [122]
S2 | 098F 09CD 09AF 09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9: Sequences
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find a place in Unicode but have not been included in the repertoire in the LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.
Sr. No. | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10: Excluded Code Points
5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire since it will never be used alone. It will be used only within the sequences that make up the normalized forms of the three special characters ড় ঢ় য়.
Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড় ঢ় য় respectively
Table 10b: Excluded Code Points
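The "normalized form" of these three letters can be observed with Python's standard `unicodedata` module: under NFC, the precomposed code points U+09DC, U+09DD and U+09DF are replaced by base letter plus NUKTA. The dictionary and function names below are illustrative only.

```python
import unicodedata

# Precomposed letter -> base letter + U+09BC BENGALI SIGN NUKTA
PRECOMPOSED_TO_SEQUENCE = {
    "\u09DC": "\u09A1\u09BC",  # RRA -> DDA + NUKTA
    "\u09DD": "\u09A2\u09BC",  # RHA -> DDHA + NUKTA
    "\u09DF": "\u09AF\u09BC",  # YYA -> YA + NUKTA
}


def normalized(text: str) -> str:
    """Return the NFC form, as expected for labels."""
    return unicodedata.normalize("NFC", text)


all_decomposed = all(
    normalized(pre) == seq for pre, seq in PRECOMPOSED_TO_SEQUENCE.items()
)
```

This is why the nukta appears in the LGR only inside two-code-point sequences and never as a standalone repertoire entry.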
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20-11-2009 and 18-07-2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) began with the following variables:
C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta
The Augmented Backus-Naur Formalism (ABNF) will use the following operators:
Sr. Number | Operator | Function
1 | “|” | Alternative
2 | “[ ]” | Optional
3 | “*” | Variable Repetition
4 | “( )” | Sequence Group
Table 11: The ABNF Formalism
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given to facilitate understanding.

5.4.2 The Vowel Sequence
To help users of other Brāhmī-derived scripts, equivalents in Devanāgarī are provided wherever necessary. A vowel sequence is made up of a single vowel (V). It may optionally be followed by an Anusvāra (onuʃʃār) (B), a Candrabindu (D) or a Visarga (biʃɔrgo) (X). The number of D, B or X marks that can follow a V in Bangla is not always restricted to one. Going by the rules illustrated in this document, repetitions of the same mark, such as VDD, VBB and VXX, are invalid orthographic units. However, it is valid and possible to combine an Anusvāra with a Candrabindu on the one hand, and a Visarga with a Candrabindu on the other, as in হ্যাংচা 'hænchā' and হ্যাঃ 'hæn' respectively. In other words, a Visarga or an Anusvāra (onuʃʃār) may follow a Candrabindu in Bangla (DB, DX).
A vowel can also optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phala. "Ya-phala is a presentation form of U+09AF Bangla letter য or 'ya'. Represented by the sequence <U+09CD BENGALI SIGN VIRAMA (i.e. Bangla sign Hasanta or Virama), U+09AF য BENGALI LETTER YA>, Ya-phala has a special form ্য. Again, when combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. 'aa' (ā)), it is used for transcribing [æ], as in the "a" in the English word "bat", written in Bangla as ব্যাট." A vowel sequence admits the following combinations.
5.4.2.1 A Single Vowel
Example:
V: অ (अ)

5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or by a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples:
VB: অং (अं)
VD: অঁ (अँ)
VX: অঃ (अः)
VDB: অঁং (अँं)
VDX: অঁঃ (अँः)
VHCM: অ্যা, এ্যা

5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
VHCMB: অ্যাং, এ্যাং
VHCMD: অ্যাঁ, এ্যাঁ
VHCMX: অ্যাঃ, এ্যাঃ
VHCMDB: অ্যাঁং, এ্যাঁং
VHCMDX: অ্যাঁঃ, এ্যাঁঃ
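The vowel-sequence combinations listed above can be summarised as one pattern over the ABNF symbols. The sketch below is an illustration only: it works on symbol strings such as "VDB" rather than on Bangla text, and `is_vowel_sequence` is a hypothetical helper name. It encodes the generic formalism and does not apply the additional restriction that only the ya-phala sequences are allowed for HCM.

```python
import re

# Optional trailing mark: B, D, X, DB or DX (never a repeated mark).
TAIL = r"(?:DB|DX|B|D|X)?"
# V, optionally followed by HCM, optionally followed by one trailing mark group.
VOWEL_SEQUENCE = re.compile(rf"V(?:HCM)?{TAIL}")


def is_vowel_sequence(symbols):
    """True if a symbol string such as 'VHCMDX' is a valid vowel sequence."""
    return VOWEL_SEQUENCE.fullmatch(symbols) is not None
```

For instance, "VDB" and "VHCMX" are accepted, while "VDD", "VBB" and "VXX" are rejected, matching the invalid formations named in 5.4.2.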
5.4.3 The Consonant Sequence

5.4.3.1 A Single Consonant (C)
Example:
C: ক (क)

5.4.3.2 A Consonant with Conditions
A Consonant can optionally be followed by a dependent vowel sign kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
CM: কি (कि)
CB: কং (कं)
CD: কঁ (कँ)
CX: কঃ (कः)
CH: ক্ (क्) (pure consonant)
CDB: কঁং (कँं)
CDX: কঁঃ (कँः)

5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB or DX.
Examples:
CMB: কীং (कीं)
CMD: কাঁ (काँ)
CMX: বীঃ (वीः)
CMDB: কাঁং (काँं)
CMDX: কাঁঃ (काँः)

5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Examples:
CHC: ন্ত → ন + ্ + ত (न + ् + त)
CHCHC: ন্ত্র → ন + ্ + ত + ্ + র (न + ् + त + ् + र)
CHCHCHC: ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য (न + ् + त + ् + र + ् + य)
5.4.3.5 Subsets
While considering the subsets, we take the combination CHC as a representative example; the same applies equally to CHCHC and CHCHCHC.
[A] The combination may be followed by M, B, D, X, DB or DX.
Examples:
CHCM: ক্কী → ক + ্ + ক + ী (क्की)
CHCB: ক্কং → ক + ্ + ক + ং (क्कं)
CHCD: ক্কঁ → ক + ্ + ক + ঁ (क्कँ)
CHCX: ক্কঃ → ক + ্ + ক + ঃ (क्कः)
CHCDB: ক্কঁং → ক + ্ + ক + ঁ + ং (क्कँं)
CHCDX: ক্কঁঃ → ক + ্ + ক + ঁ + ঃ (क्कँः)
[B] *3(CH)CM may further be followed by B, D, X, DB or DX.
Examples:
CHCMB: ক্কীং → ক + ্ + ক + ী + ং (क्कीं)
CHCMD: ক্কাঁ → ক + ্ + ক + া + ঁ (क्काँ)
CHCMX: ক্কীঃ → ক + ্ + ক + ী + ঃ (क्कीः)
CHCMDB: ক্কাঁং → ক + ্ + ক + া + ঁ + ং (क्काँं)
CHCMDX: ক্কাঁঃ → ক + ্ + ক + া + ঁ + ঃ (क्काँः)
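The consonant-sequence rules of 5.4.3 can likewise be sketched as a single pattern over the ABNF symbols. This is a hedged illustration, not the normative rule set: `is_consonant_sequence` is a hypothetical helper, it operates on symbol strings rather than code points, and it assumes that a final Hasanta (the "pure consonant" case CH) applies only to a single consonant, since that is the only case the text lists.

```python
import re

# One optional group of trailing marks: B, D, X, DB or DX.
TAIL = r"(?:DB|DX|B|D|X)?"
# Either a pure consonant (CH), or up to four consonants joined by H
# (*3(CH)C), optionally followed by one M and one trailing mark group.
CONSONANT_SEQUENCE = re.compile(rf"CH|(?:CH){{0,3}}CM?{TAIL}")


def is_consonant_sequence(symbols):
    """True if a symbol string such as 'CHCMDB' is a valid consonant sequence."""
    return CONSONANT_SEQUENCE.fullmatch(symbols) is not None
```

Under this sketch a cluster of five consonants ("CHCHCHCHC") is rejected, as are doubled marks such as "CMM" or "CBB".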
5.4.4 The Khanda-Ta Sequence

5.4.4.1 A Single 'Khanda'-Ta (Z)
Example:
Z: ৎ

5.4.4.2 A Khanda Ta Combination (footnote 10)
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) 'scolding'.
Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
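The Khanda-Ta condition is small enough to check directly on code points. A minimal sketch, assuming the pattern [CH]Z with C limited to the two RA letters named in the note; the function name is hypothetical.

```python
KHANDA_TA = "\u09CE"
HASANTA = "\u09CD"
RA_LETTERS = {"\u09B0", "\u09F0"}  # Bangla RA and Assamese RA


def is_khanda_ta_sequence(text):
    """True for Z alone, or for RA + Hasanta + Z (the [CH]Z pattern)."""
    if text == KHANDA_TA:
        return True
    return (
        len(text) == 3
        and text[0] in RA_LETTERS
        and text[1] == HASANTA
        and text[2] == KHANDA_TA
    )
```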
5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) can be described briefly here. Let us take up S first. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ BENGALI LETTER A and U+098F এ BENGALI LETTER E). To render a Ya-phalā after অ or এ, it is necessary to type U+09CD plus U+09AF (ya) after the said vowel. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound, as in English 'bat', 'fat', etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can be marked with Hasanta. But, as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
The other case, mentioned in the table as P, concerns the formation of repha and ra-phalā in the script. Owing to its co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra "cycle"). The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or by Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.
Footnote 10: Refer to Rule P in Section 7, Table 16.
6 Variants
This section discusses the variants in the Bangla script. The NBGP categorizes these confusing variants into two groups:
Group 1: Confusing due to pure visual similarity.
Group 2: Confusing due to deviation from the character formations normally perceived by the larger linguistic community.
For Group 1, only identical code points are defined as variants. Confusable but not identical cases are not proposed, as another panel (the String Similarity Assessment Panel) is entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These are not cases of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formation. They can cause confusion even to a careful observer and are hence being proposed as variants.
Variants arise in a script when two or more forms that look the same are stored with different code points. In Bangla, the e-kāra, the ā-kāra and the o-kāra have different code points. One can type an o-kāra with a consonant in one go, or type an e-kāra and an ā-kāra as two separate keys, and get the same result. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered a case of variant, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants
We propose two cases of true in-script variants in the Bangla script.
CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases wherein থ (tha, U+09A5) appears after Hasanta as a conjunct with স (sa, U+09B8) and ন (na, U+09A8):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, could be a little confusing, because the থ (tha) in the conjunct looks like a হ (ha). The conjunct could be in the initial, medial or final position (as shown below in example no. 1). It could also be typed wrongly by someone who takes it for a হ (ha, U+09B9), increasing the risk in label writing and identification.
Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)
Fonts which represent the traditional Bangla writing system tend to create this problem. Therefore these may be taken as cases of variants in Bangla.
CASE II
Another interesting example of variants is encountered in the ra + Hasanta and Hasanta + ra combinations when writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra). 'Repha' could be formed by two sequences (mainly because both Assamese and Bangla are encoded in the same Unicode block, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here the final ligatures look the same and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C
Where:
B_RA = U+09B0 BENGALI LETTER RA (র)
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)
Example:
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12: Example of Repha
Note: As the Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. Addition of Repha does not make any difference.
'Ra-phala' could be formed by two sequences on similar grounds, and the final ligatures would look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
Where C1 = any consonant except Khanda-ta.
Example:
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusion for end users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP concluded that র and ৰ should be defined as variant code points, so that a single variant set between র and ৰ covers all cases. This will, however, create blocked variant labels: e.g. if someone registers "ররর", the variant label "ৰৰৰ" will be generated as a variant and will be blocked, and vice versa. However, blocking applies only at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, that is still possible.
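The blocking behaviour described above can be sketched by reducing every label to an index form in which ৰ is replaced by র. This is a simplified illustration under stated assumptions, not the LGR's own variant-generation procedure: the helper names are hypothetical, only the র/ৰ variant set is modelled, and labels that mix both letters are assumed to be variants of each other as well.

```python
# Assamese RA (U+09F0) is treated as a variant of Bangla RA (U+09B0).
VARIANT_OF = {"\u09F0": "\u09B0"}


def index_label(label):
    """Map a label to its index form by replacing each variant code point."""
    return "".join(VARIANT_OF.get(ch, ch) for ch in label)


def conflicts_with_registered(candidate, registered):
    """True if the candidate equals, or is a variant of, a registered label."""
    taken = {index_label(label) for label in registered}
    return index_label(candidate) in taken
```

With "ররর" registered, the all-Assamese label "ৰৰৰ" conflicts, while the shorter "ৰৰ" does not.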
6.2 Cross-Script Variants
A brief cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia (footnote 11; formerly Oriya), keeping in mind the visual and technical confusion they may cause as labels in the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Certain other characters are somewhat similar, but they have not been included here. They are provided in the Appendix (10.3) for reference.
Footnote 11: Unicode uses Oriya for the script, although Odia is now the official term used.

1. Bangla and Nāgarī / Devanāgarī Script
Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14: Bangla and Devanāgarī cross-script variant code points

2. Bangla and Gurmukhi Script
Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15: Bangla and Gurmukhī cross-script variant code points
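Tables 14 and 15 can be expressed as a small lookup table. The sketch below is illustrative only; the dictionary layout and the function `cross_script_variants` are assumptions introduced here, and the data is limited to the four code point pairs in the two tables.

```python
# Bangla code point -> variant code point per script (Tables 14 and 15).
CROSS_SCRIPT_VARIANTS = {
    0x09AE: {"Devanagari": 0x092E, "Gurmukhi": 0x0A38},
    0x09BF: {"Devanagari": 0x093F, "Gurmukhi": 0x0A3F},
}


def cross_script_variants(label):
    """Return {script: variant label} for scripts that can mimic the whole label."""
    result = {}
    for script in ("Devanagari", "Gurmukhi"):
        if label and all(ord(ch) in CROSS_SCRIPT_VARIANTS for ch in label):
            result[script] = "".join(
                chr(CROSS_SCRIPT_VARIANTS[ord(ch)][script]) for ch in label
            )
    return result
```

A label only has a cross-script variant label when every one of its code points has a variant in the other script, which is why the lookup requires a match for the whole label.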
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Bangla (footnote 12) script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications.
Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories mentioned in the table provided under Code point repertoire (Section 5.1).
Footnote 12: As used by Unicode, denoting and including both Assamese and Maṇipuri.
C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), or (a/e) Ya-phalā ā (V1 H C1 M1), where V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E), H is 09CD (্ - BENGALI SIGN VIRAMA), C1 is 09AF (য - BENGALI LETTER YA) and M1 is 09BE (া - BENGALI VOWEL SIGN AA). S1 and S2 are valid even if they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL), and H is 09CD (্ - BENGALI SIGN VIRAMA).
Table 16: Symbols used in WLE rules
It is also worth mentioning here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters", which could theoretically conjoin two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", of which bi-consonantal combinations number 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, p. 4]. More details of such combinations can be seen in Pabitra Sarkar (1993) [135].

7.1 Final Set of WLE Rules
Based on the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1. C is the set of C and CN, where CN is the set of normalized forms of ড়, ঢ় and য়.
2. H must be preceded by C. Example: ক্
3. M must be preceded by C. Example: কা
4. D must be preceded by either V, C or M. Examples: আঁ, খঁ, খাঁ, হাঁ
5. X must be preceded by either V, C, M or D. Examples: উঃ, খঃ, বাঃ
6. B must be preceded by either V, C, M or D. Examples: আং, ইং, কং
7. Z must be preceded by V, C, M, D, B, X or P. Examples: ইৎ, কৎ, পর্ৎ. (S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V preceded by H.
9. S CANNOT be preceded by H.
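The nine rules above can be sketched as a small checker that first tokenizes a label into the symbols of Table 16 and then tests each symbol against what precedes it. This is an illustrative sketch, not the normative XML rule set. The code point ranges used for V, M and C are assumptions made here for the example (the authoritative repertoire is Table 8), and the names `tokenize` and `violations` are hypothetical.

```python
S_SEQUENCES = ("\u0985\u09CD\u09AF\u09BE", "\u098F\u09CD\u09AF\u09BE")  # S1, S2
NUKTA_FORMS = ("\u09A1\u09BC", "\u09A2\u09BC", "\u09AF\u09BC")  # CN (rule 1)
RA = {"\u09B0", "\u09F0"}
MARKS = {"\u0982": "B", "\u0981": "D", "\u0983": "X", "\u09CD": "H", "\u09CE": "Z"}
# Assumed class ranges, for illustration only.
VOWELS = set(range(0x0985, 0x098C)) | {0x098F, 0x0990, 0x0993, 0x0994}
MATRAS = set(range(0x09BE, 0x09C4)) | {0x09C7, 0x09C8, 0x09CB, 0x09CC}
CONSONANTS = (set(range(0x0995, 0x09A9)) | set(range(0x09AA, 0x09B1))
              | set(range(0x09B6, 0x09BA)) | {0x09B2, 0x09F0, 0x09F1})
# Symbols allowed immediately before each mark (S behaves like M: it ends in M).
ALLOWED_BEFORE = {"H": "C", "M": "C", "D": "VCMS",
                  "X": "VCMDS", "B": "VCMDS", "Z": "VCMDBXS"}


def tokenize(label):
    """Split a label into (symbol, text) pairs, matching S and CN sequences first."""
    tokens, i = [], 0
    while i < len(label):
        for seqs, sym in ((S_SEQUENCES, "S"), (NUKTA_FORMS, "C")):
            hit = next((s for s in seqs if label.startswith(s, i)), None)
            if hit:
                break
        if hit:
            tokens.append((sym, hit))
            i += len(hit)
            continue
        ch, cp = label[i], ord(label[i])
        sym = MARKS.get(ch) or ("V" if cp in VOWELS else "M" if cp in MATRAS
                                else "C" if cp in CONSONANTS else "?")
        tokens.append((sym, ch))
        i += 1
    return tokens


def violations(label):
    """Return a list of human-readable rule violations (empty if none)."""
    tokens, errors = tokenize(label), []
    for i, (sym, _) in enumerate(tokens):
        prev = tokens[i - 1][0] if i else ""
        if sym == "?":
            errors.append(f"{i}: code point outside the assumed repertoire")
        elif sym == "Z" and prev == "H":
            # Rule 7, case P: the H before Z must itself follow a RA letter.
            if i < 2 or tokens[i - 2][1] not in RA:
                errors.append(f"{i}: Z after H requires RA before the H")
        elif sym in ALLOWED_BEFORE:
            if not prev or prev not in ALLOWED_BEFORE[sym]:
                errors.append(f"{i}: {sym} cannot follow {prev or 'label start'}")
        elif sym in ("V", "S") and prev == "H":
            errors.append(f"{i}: {sym} cannot be preceded by H")
    return errors
```

Matching S1/S2 as single tokens before applying the context rules reflects the statement in Table 16 that these sequences are valid even where the other rules would reject their internal Hasanta.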
Now let us elaborate the rules with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in admitting such ligatures: any user can create them easily, even though they may not be attested combinations.

7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া bæŋk ʌv ɪndiə (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE), meaning 'Bank of India'. This is a case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, a space or the Zero Width Non-Joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceived similarity between two labels (with and without H) for the majority of the linguistic communities; hence it is explicitly prohibited by the NBGP.
In future, depending on the prevailing requirements of the community, a future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules the ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.

1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Examples:
্ক (्क)
াক (ाक)
ংক (ंक)
ঁক (ँक)
ঃক (ःक)
As can be seen, such a combination automatically results in a "golu", or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian-language syllable and is applied quasi-automatically wherever supported by the OS.

2. H is not permitted after V, B, D, X, M, S.
Examples:
অ্ (अ्)
কং্ (कं्)
কঁ্ (कँ्)
কঃ্ (कः्)
কি্ (कि्)

3. The number of B, D or X permitted after a Consonant, Vowel or kāra (Mātrā) is restricted to one; thus the following combinations are invalid.
Examples:
কংং (कंं)
কঁঁ (कँँ)
কঃঃ (कःः)
কাঁঁ (काँँ)
কীঃঃ (कीःः)
অংং (अंं)
অঁঁ (अँँ)
অঃঃ (अःः)

4. The number of M permitted after a Consonant is restricted to one.
Example:
কীী (कीी)

5. M is not permitted after V.
Examples:
ইা, ঈৌ (ईा, ईौ)

6. The combinations Anusvāra + Visarga and Visarga + Anusvāra are not permissible.
Examples:
কংঃ (कंः)
কঃং (कःं)
8 Contributors

8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members

8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof. Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof. Rafiqul Islam, National Professor of Humanities, Dhaka
Prof. Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof. Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof. Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md. Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md. Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md. Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof. Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md. Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md. Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals' Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje & Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. 'Bangla Harapher Panch Parba'. In Swapan Chakraborty, ed., Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and Language Planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the MS University of Baroda, Social Science Number, 29:2, 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds., Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25 November 2017.
[124] Ethnologue: Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25 November 2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20 October 2019.
[126] Wikipedia: Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25 November 2017.
[129] Omniglot: Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10 May 2018.
[130] Wikipedia: Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25 November 2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20/, accessed on 10 May 2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10 May 2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10 May 2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1:1, 23-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1:2, 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). 'Srīhatta-Nāgarī Lipir Utpatti o Bikāś'. Bangla Academy Patrika (Dhaka), Vol. 1:2 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia: Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19 May 2018.
[139] Furui, Ryosuke (2015). 'Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century'. In Bhairabi Prasad Sahu & Hermann Kulke, eds., Interrogating Political Systems: Integrative Processes and States in Pre-Modern India, Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). 'Phones of Bengali'. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646 (UNICODE). https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia: Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. 'Bangla Spelling Reform: the Long and Short of It'. Bangla Journal 19, 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. "What has happened So Far In terms of Script Reforms". Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka, on 10 July 2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 - Core Specification, Chapter 12, p. 473.
10 Appendix-I

10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language or script certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla (footnote 13) in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold 'varga' (as defined under Table 5). Moreover, Bangla generally does not allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem 'Lichuchor' written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2. CH can come with Khanda Ta only in the case where C is ra (র) (09B0): র্ৎ, as in ভর্ৎসনা.
3. Only the following combinations with VHCM will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
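Rule 1 above amounts to a check on the first code point of a label. The following minimal sketch assumes that only the three letters named in the rule (ৎ, ঞ, ঙ) are restricted; ya (য়) is left out because the text itself cites attested word-initial uses. The dictionary and function names are hypothetical.

```python
# Letters that rule 1 does not allow at the start of a label.
FORBIDDEN_INITIAL = {
    "\u09CE": "KHANDA TA",
    "\u099E": "NYA",
    "\u0999": "NGA",
}


def initial_restriction(label):
    """Return the name of the forbidden initial letter, or None if allowed."""
    return FORBIDDEN_INITIAL.get(label[:1])
```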
Footnote 13: This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.

10.2 'Sylheti Nagarī lipi' or 'Siloti'
This version of the Bangla script resembles the 'Kaithī' script (ISO 15924) used by accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, which was widely in use during the 1880s. There were several other names for Sylheti Nagarī or Siloti [129], such as 'Jalalabada Nagarī', 'Fula (flower) Nagarī', 'Muslim Nagarī' or 'Muhammad Nagarī'. It is said that Shah Jalal had brought the script with him to Sylhet in the 13th-14th century [138], although some have suggested that it was an invention of the Afghan rulers of Sylhet [137]. Some ascribe the credit to Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script, with 32 symbols, are reproduced here [138].
Table 17: The Script Table of Sylheti Nagarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.

10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable
Table 18: Bangla and Devanāgarī confusable code points

10.3.2 Bangla and Gurmukhi
Bangla | Gurmukhi | NBGP Decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable
Table 19: Bangla and Gurmukhi confusable code points

Bangla | Gurmukhi | NBGP Decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable
Table 20: Bangla and Gurmukhī distinguishable code points

10.3.3 Bangla and Oriya (Odia)
Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable
Table 21: Bangla and Oriya confusable code points

Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable
Table 22: Bangla and Oriya distinguishable code points
11 Appendix-IIBengaliconsonantsandtheirallographs
Consonants PhoneticValue Allographs
Clusters TransparentForm(BanglaAkademifont)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ Thisletterdoesnothaveanyparticularphoneticvaluebutmostlypronouncedasn
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒThesecondarysymbol(allograph)jɔ-phalahastwophoneticvaluesWhenaddedtotheinitialconsonantinaworditisavowelaelig(asinশ8ামলর 8াপারetc)Butafteranon-initialconsonantitjustdoublesitinpronunciation(asinকায6ধায6etc)The+যcombinationhastwophysicalmanifestationsmdashর 8andয6
ক8(p+য)স8(+য)র 8(+য)[Justর 8isneverusedinBanglaorthographyর 8াisbutthenitslasttwosymbolsYa-phalaa-karaconstituteavowelsignrepresentingthevowelঅ8া]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ hwordfinallywordmediallyitdoublesthepronunciationofthefollowingconsonant
অঃকঃ
অব
No UnicodeCodePoint
Glyph
CharacterName
Category Language(s)withEGIDSValue
ReferencesandComment
61 U+09CD BENGALISIGNVIRAMA
Hasanta(=Halant)Virama(=Da ri)
1Bangla2Assamese2Manipuri
[112][122][125]
62 U+09CE ৎ BENGALILETTERKHANDATA
Consonant 1Bangla2Manipuri2Assamese
[112][122][125]
63 U+09F0 ৰ BENGALILETTERRAWITHMIDDLEDIAGONAL
Consonant 2Assamese [122]
64 U+09F1 ৱ BENGALILETTERRAWITHLOWERDIAGONAL
Consonant 2Assamese2Manipuri
[122][125]
Table8BanglaCode-PointRepertoire
Apart from the above individual code-points the Neo-Brāhmī Generation Panel alsoproposes some specific sequences which enable conditional inclusion of the BanglaLETTER A and E followed by Bangla SIGN VIRAMA and Bangla LETTER YA againfollowed by Bangla VOWEL SIGN AA in the repertoire for enabling inclusion of aeligsoundasinEnglishlsquobatrsquolsquocatrsquoetc
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code-point (Not exhaustive list) | Reference
S1 | 0985 09CD 09AF 09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112], [122]
S2 | 098F 09CD 09AF 09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9: Sequences
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find place in the Unicode but have not been included in the repertoire in the LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.

Sr. No. | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10: Excluded Code Points

5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire since it will never be used alone. It will be used in sequence in three special characters, in normalized form, for ড়, ঢ়, য়.

Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ, U+09AF য to form ড়, ঢ়, য় respectively.
Table 10b: Excluded Code Points
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20.11.2009 and 18.07.2013.
5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) began with the following variables:
C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta

The Augmented Backus-Naur Formalism (ABNF) will use the following operators:

Sr. Number | Operator | Function
1 | "|" | Alternative
2 | "[ ]" | Optional
3 | "*" | Variable Repetition
4 | "( )" | Sequence Group
Table 11: The ABNF Formalism
5.4.2 The Vowel Sequence
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given. To facilitate understanding for users of other Brahmi scripts, equivalents in Devanāgarī are provided wherever necessary.

A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra (onuʃʃār) (B), a Candrabindu (D) or a Visarga (biʃɔrgo) (X). The number of D, B or X marks which can follow a V in Bangla is not always restricted to one. Going by the rules illustrated in this document, it is clear that repeated formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have sequences that combine an Anusvāra with a Candrabindu on the one hand, and a Visarga with a Candrabindu on the other, as in হ্যাংচা 'hænchā' and হ্যাঃ 'hæn' respectively. In other words, the possibility of a Visarga or an Anusvāra (onuʃʃār) following a Candrabindu exists in Bangla.

A vowel can optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phala. "Ya-phala is a presentation form of U+09AF, Bangla letter য or 'ya'. Represented by the sequence <U+09CD ্ BENGALI SIGN VIRAMA (Bangla SIGN Hasanta or VIRAMA), U+09AF য BENGALI LETTER YA>, Ya-phala has a special form ্য. Again, when combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. 'aa' (ā)), it is used for transcribing [æ], as in the "a" in the English word "bat"", written in Bangla as ব্যাট.

A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel
Examples:
V অ अ

5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples:
VB অং अं
VD অঁ अँ
VX অঃ अः
VDB অঁং अँं
VDX অঁঃ अँः
VHCM অ্যা এ্যা

5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
VHCMB অ্যাং এ্যাং
VHCMD অ্যাঁ এ্যাঁ
VHCMX অ্যাঃ এ্যাঃ
VHCMDB অ্যাঁং এ্যাঁং
VHCMDX অ্যাঁঃ এ্যাঁঃ
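The vowel-sequence combinations listed above can be sketched as a regular expression over the class letters of Section 5.4.1. This is only an illustrative sketch: the `CLASSES` table covers a handful of code points rather than the full repertoire, the names `to_classes` and `is_vowel_sequence` are hypothetical, and the sketch does not enforce the restriction that the HCM part must be Ya-phala plus ā-kāra (sequences S1 and S2).

```python
import re

# Class letters follow Section 5.4.1. The code point sets are a small
# illustrative subset (an assumption), not the full repertoire.
CLASSES = {
    "V": "\u0985\u0986\u0987\u0989\u098F",  # a few independent vowels
    "C": "\u0995\u09AF\u09B0",              # a few consonants (ka, ya, ra)
    "M": "\u09BE\u09BF\u09C0",              # a few vowel signs (kara)
    "B": "\u0982",                          # anusvara
    "D": "\u0981",                          # candrabindu
    "X": "\u0983",                          # visarga
    "H": "\u09CD",                          # hasanta / virama
}
CLASS_OF = {ch: name for name, chars in CLASSES.items() for ch in chars}


def to_classes(label: str) -> str:
    """Map each code point to its class letter ('?' if unknown)."""
    return "".join(CLASS_OF.get(ch, "?") for ch in label)


# V, optionally followed by HCM, optionally followed by one of
# B, D, X, DB or DX (5.4.2.1 to 5.4.2.3).
VOWEL_SEQ = re.compile(r"V(?:HCM)?(?:D?[BX]|D)?")


def is_vowel_sequence(label: str) -> bool:
    return VOWEL_SEQ.fullmatch(to_classes(label)) is not None
```

Under these assumptions, V + Candrabindu + Visarga (VDX) is accepted, while a doubled Anusvāra (VBB) is rejected, matching the examples in the text.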
5.4.3 The Consonant Sequence

5.4.3.1 A Single Consonant (C)
Example:
C ক क

5.4.3.2 A Consonant with Conditions
A Consonant optionally followed by a dependent vowel sign kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Example:
CM িকক -कक
CB কং कं
CD কঁ कँ
CX কঃ कः
CH ক্ क् (Pure consonant)
CDB কঁং कँं
CDX কঁঃ कँः

5.4.3.3 CM Sequence
A CM sequence can be optionally followed by B, D, X, DB or DX.
Example:
CMB কীংকং कक
CMD কাঁ काँ
CMX বীঃ वीः
CMDB কাঁং काँं
CMDX কাঁঃ काँः

5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Example:
CHC ন্ত → ন + ্ + ত; न्त → न + ् + त
CHCHC ন্ত্র → ন + ্ + ত + ্ + র; न्त्र → न + ् + त + ् + र
CHCHCHC ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য; न्त्र्य → न + ् + त + ् + र + ् + य
5.4.3.5 Subsets
While considering its subsets, as a representative example we will consider the combination CHC only; however, the same is equally applicable to CHCHC and CHCHCHC.

[A] The combination may be followed by M, B, D, X, DB or DX.
Example:
CHCM ক্কী → ক + ্ + ক + ী; क्की → क + ् + क + ी
CHCB ক্কং → ক + ্ + ক + ং; क्कं → क + ् + क + ं
CHCD ক্কঁ → ক + ্ + ক + ঁ; क्कँ → क + ् + क + ँ
CHCX ক্কঃ → ক + ্ + ক + ঃ; क्कः → क + ् + क + ः
CHCDB ক্কঁং → ক + ্ + ক + ঁ + ং; क्कँं → क + ् + क + ँ + ं
CHCDX ক্কঁঃ → ক + ্ + ক + ঁ + ঃ; क्कँः → क + ् + क + ँ + ः

[B] *3(CH)CM may further be followed by a B, D, X, DB or DX.
Example:
CHCMB ক্কীং → ক + ্ + ক + ী + ং; क्कीं → क + ् + क + ी + ं
sup3ং rarr ক ক ং 4क rarr क क
CHCMD ক্কাঁ → ক + ্ + ক + া + ঁ; क्काँ → क + ् + क + ा + ँ
CHCMX ক্কীঃ → ক + ্ + ক + ী + ঃ; क्कीः → क + ् + क + ी + ः
CHCMDB ক্কাঁং → ক + ্ + ক + া + ঁ + ং; क्काँं → क + ् + क + ा + ँ + ं
CHCMDX ক্কাঁঃ → ক + ্ + ক + া + ঁ + ঃ; क्काँः → क + ् + क + ा + ँ + ः
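As a rough illustration of the consonant-sequence grammar, the sketch below expresses `*3(CH)C` and its permitted tails as a regular expression over class letters. It assumes the label has already been converted to a string of class letters (C, H, M, B, D, X); the function name `is_consonant_sequence` is hypothetical and the pattern is one possible reading of Sections 5.4.3.1 to 5.4.3.5, not the normative LGR rule set.

```python
import re

# One reading of Section 5.4.3 over class letters:
#   "CH"                         a pure consonant (consonant + Hasanta), or
#   *3(CH) C [M] [B|D|X|DB|DX]   up to four consonants joined by Hasanta,
#                                with an optional kara and optional marks.
CONSONANT_SEQ = re.compile(r"CH|(?:CH){0,3}CM?(?:D?[BX]|D)?")


def is_consonant_sequence(classes: str) -> bool:
    """Return True if the class-letter string is a valid consonant sequence."""
    return CONSONANT_SEQ.fullmatch(classes) is not None
```

For instance, `is_consonant_sequence("CHCHCHC")` accepts a four-consonant cluster, while a fifth consonant (`"CHCHCHCHC"`) or a doubled kāra (`"CMM"`) is rejected.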
5.4.4 The Khanda-Ta Sequence

5.4.4.1 A Single 'Khanda'-Ta (Z)
Example:
Z ৎ (= ত্)

5.4.4.2 A Khanda Ta Combination (10)
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) 'scolding'.
Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) could be described briefly here. Let us take up S in the first instance. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E). For rendering Ya-phalā following অ and এ, it is necessary to type U+09CD plus U+09AF (ya) preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English 'bat', 'fat', etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in its orthography, wherein Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).

Another case refers to the formation of repha and ra-phalā in the said script, mentioned in the table above as P. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra "cycle"). The point is that in both cases the slot for ra could be the Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.

10 Refer to Rule P in Section 7, Table 16.
6 Variants
This section talks about the variants in the Bangla script. The NBGP categorizes these confusing variants into two groups:
Group 1: Confusing due to pure visual similarity.
Group 2: Confusing due to deviation from the character formations normally perceived by the larger linguistic community.

For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formations. These can cause confusion even to a careful observer and are hence being proposed as variants.

Variants are generated in a script when two or more forms are formed with different storage or code points. In Bangla, the e-kāra, the ā-kāra and the o-kāra have different code points. One can type o with a consonant in one go, or type the e-kāra and the ā-kāra as two separate keys, getting the same result. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered a case of variant, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants
However, we propose two cases of true in-script variants in the Bangla script.

CASE I
As far as true variants in Bangla are concerned, we may draw our attention to cases wherein Hasanta with (U+09A5) থ (tha) appears as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)

The above combinations, if written in traditional orthography, could be a little confusing, as the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final position (as shown below in example no. 1). It could also be typed wrongly, in the belief that it was a হ (ha) U+09B9, increasing the risks in label writing and identification.
Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)
The fonts which represent the traditional Bangla writing system could tend to create this problem. Therefore, these may be taken as cases of variants in Bangla.

CASE II
Another interesting example of a variant is encountered in the ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra). 'Repha' could be formed by two sequences (mainly because both Assamese and Bangla find place in the same UNICODE points, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here the final ligatures look the same and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C

Where
B_RA = U+09B0 BENGALI LETTER RA (র), or
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)

Example:
Sequence 1 (Using Bangla RA) | Ligature 1 | Sequence 2 (Using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12: Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. Addition of Repha does not make any difference.

'Ra-phala' could be formed by two sequences on similar grounds, and the final ligatures would look the same:

(1) C1 + H + B_RA
(2) C1 + H + A_RA

Where C1 = any consonant except Khanda-ta

Example:
Sequence 1 (Using Bangla RA) | Ligature 1 | Sequence 2 (Using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusability for end users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP concluded to define র and ৰ as variant code points, so that a single variant set between র and ৰ covers all cases. But this will create blocked variant labels: e.g. if someone registers "ররর", the variant label "ৰৰৰ" will be generated as a variant and will be blocked, and vice versa. However, it is only blocked at the label level: if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, it is still possible.
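The label-level blocking described above can be illustrated with a small sketch. It assumes a single variant set containing only র (U+09B0) and ৰ (U+09F0) and ignores the cross-script variants; the names `VARIANTS`, `variant_labels` and `conflicts` are hypothetical and do not come from the LGR XML.

```python
from itertools import product

# Assumed single variant set: Bangla RA and Assamese RA are variants of
# each other. Every other code point has only itself as an alternative.
VARIANTS = {
    "\u09B0": "\u09B0\u09F0",
    "\u09F0": "\u09B0\u09F0",
}


def variant_labels(label: str) -> set:
    """All labels obtained by swapping RA variants, excluding the label itself."""
    choices = [VARIANTS.get(ch, ch) for ch in label]
    return {"".join(combo) for combo in product(*choices)} - {label}


def conflicts(registered: str, candidate: str) -> bool:
    """True if the candidate is a (blocked) variant of the registered label."""
    return candidate in variant_labels(registered)
```

In this sketch, registering three Bangla RAs blocks the label with three Assamese RAs (and the mixed forms), while a label of a different length is unaffected, which mirrors the example in the text.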
6.2 Cross-Script Variants
A crisp cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia (11) (formerly Oriya), keeping in mind the visual and technical confusions they may cause as labels on the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain characters which are somewhat similar, they have not been included here. They have been provided in the Appendix (10.3) for reference.

11 Unicode uses Oriya for the script, although Odia is now the official term used.

1. Bangla and Nāgarī (Devanāgarī) Script

Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14: Bangla and Devanāgarī cross-script variant code points

2. Bangla and Gurmukhī Script

Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15: Bangla and Gurmukhī cross-script variant code points
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Bangla (12) script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications. Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories, as mentioned in the table provided in the Code point repertoire (Section 5.1).

12 As used by Unicode, denoting and including both Assamese and Maṇipuri.

C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), or (a/e) Ya-phalā (V1 H C1 M1), where V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E); H is 09CD (্ - BENGALI SIGN VIRAMA); C1 is 09AF (য - BENGALI LETTER YA); M1 is 09BE (া - BENGALI VOWEL SIGN AA). S1 and S2 are valid even if they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL); H is 09CD (্ - BENGALI SIGN VIRAMA).
Table 16: Symbols used in WLE rules
It is also perhaps ideal to mention here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters" that could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, Pg. 4]. More details of such combinations could be seen in Pabitra Sarkar (1993) [135].

7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:

1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়.
2. H must be preceded by C.
Example:
3. M must be preceded by C.
Example: কা
4. D must be preceded by either of V, C or M.
Example: আখখা হা
5. X must be preceded by either of V, C, M or D.
Example: উঃখঃবঃাঃ দ ঃ
6. B must be preceded by either of V, C, M or D.
Example: আংইংকং
7. Z must be preceded by V, C, M, D, B, X or P.
Example: ইৎকৎাৎাৎপ6ৎrৎ
(S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V preceded by H.
9. S CANNOT be preceded by H.
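The predecessor rules above lend themselves to a simple left-to-right check. The following is a minimal sketch, not the LGR implementation: the `CLASSES` table is a small illustrative subset of the repertoire, rule 1 (normalized nukta forms) is left out, S is treated like M when it acts as a predecessor (since S ends with M), and the names `tokenize` and `is_valid_label` are hypothetical.

```python
# Illustrative subset of the repertoire (an assumption, not Table 8 in full).
CLASSES = {
    "V": "\u0985\u0986\u0987\u098F",
    "C": "\u0995\u09A4\u09A8\u09AF\u09B0\u09F0",
    "M": "\u09BE\u09BF\u09C0",
    "B": "\u0982", "D": "\u0981", "X": "\u0983",
    "H": "\u09CD", "Z": "\u09CE",
}
CLASS_OF = {ch: cls for cls, chars in CLASSES.items() for ch in chars}
RA = {"\u09B0", "\u09F0"}                       # C2 of rule P
S_SEQUENCES = ("\u0985\u09CD\u09AF\u09BE",      # S1
               "\u098F\u09CD\u09AF\u09BE")      # S2
# Rules 2 to 7: classes that may immediately precede each symbol.
MUST_FOLLOW = {"H": "C", "M": "C", "D": "VCM",
               "X": "VCMD", "B": "VCMD", "Z": "VCMDBX"}


def tokenize(label):
    """Split a label into (class, text) tokens, matching S1/S2 first."""
    tokens, i = [], 0
    while i < len(label):
        seq = next((s for s in S_SEQUENCES if label.startswith(s, i)), None)
        if seq:
            tokens.append(("S", seq))
            i += len(seq)
        else:
            tokens.append((CLASS_OF.get(label[i], "?"), label[i]))
            i += 1
    return tokens


def is_valid_label(label):
    tokens = tokenize(label)
    if not tokens:
        return False
    for k, (cls, _) in enumerate(tokens):
        prev = tokens[k - 1][0] if k > 0 else None
        prev = "M" if prev == "S" else prev      # S ends with M
        if cls == "?":
            return False                         # outside the sketch repertoire
        if cls in ("V", "S") and prev == "H":
            return False                         # rules 8 and 9
        if cls == "Z" and prev == "H":
            # Rule 7 via P: only Ra + Hasanta may precede Khanda Ta.
            if k < 2 or tokens[k - 2][1] not in RA:
                return False
            continue
        if cls in MUST_FOLLOW and (prev is None or prev not in MUST_FOLLOW[cls]):
            return False
    return True
```

The sketch checks only immediate predecessors; limits such as the maximum cluster length from Section 5.4.3.4 would need a separate check.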
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in adding such ligatures, because any user can easily create them even though they may not be attested combinations.

Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া bæŋkʌv ɪndiə (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE), meaning "Bank of India". This is a case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.

This is a unique situation, necessitated by the lack of a hyphen, space or the Zero Width Non-Joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptual similarity between two labels (with and without H) for the majority of the linguistic communities; hence this is explicitly prohibited by the NBGP.

In future, if required and depending on the prevailing requirements from the community, the future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules the ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.

1. H, M, B, D or X cannot occur in the beginning of a Bangla word.
Example:
্ক ्क
াক ाक
ংক ंक
ঁক ँक
ঃক ःक
As can be seen, such a combination will result automatically in a "golu" or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.

2. H is not permitted after V, B, D, X, M, S.
Example:
অ্ अ्
অং क ক क কঃ कः ি )क

3. The number of B, D or X permitted after a Consonant, a Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalidated.
Example:
কংং कंं
কঁঁ कँँ
কঃঃ कःः
কাঁঁ काँँ
কীঃঃ कीःः
অংং अंं
অঁঁ अँँ
অঃঃ अःः

4. The number of M permitted after a Consonant is restricted to one.
Example:
কীী कीी

5. M is not permitted after V.
Example:
ইা ঈৗ ईा ईौ

6. The combinations of Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Example:
কংঃ कंः
কঃং कःं
8 Contributors

8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members

8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals' Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata; New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje; Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. 'Bangla Harapher Panch Parba'. In Swapan Chakraborty, ed., Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the M.S. University of Baroda, Social Science Number, 29.2, 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds., Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25.11.2017.
[124] Ethnologue: Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25.11.2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20.10.2019.
[126] Wikipedia: Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25.11.2017.
[129] Omniglot: Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10.5.2018.
[130] Wikipedia: Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25.11.2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20, accessed on 10.5.2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10.5.2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10.5.2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1.1, 23-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1.2, 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). 'Srīhatta-Nāgarī Lipir Utpatti o Bikāś'. Bangla Academy Patrika (Dhaka), Vol. 1.2 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia: Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19.5.2018.
[139] Furui, Ryosuke (2015). 'Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century'. In Bhairabi Prasad Sahu & Hermann Kulke, eds., Interrogating Political Systems: Integrative Processes and States in Pre-Modern India, Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). 'Phones of Bengali'. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646 (UNICODE). https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia: Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. 'Bangla Spelling Reform: the Long and Short of It'. Bangla Journal 19, 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. "What has happened So Far In terms of Script Reforms". Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka, on 10.07.2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 – Core Specification, Chapter 12, p. 473.
10 Appendix-I

10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language or script, certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla (13) in particular, the following rules apply:

1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold 'varga' (as defined under Table 5). Moreover, Bangla does not allow ya (য়) in the beginning of a word either, but we can cite a couple of native examples, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem 'Lichuchor' written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) in the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) in the beginning of a word, for example ৎসেরিং (Tsering).

2. CH can come with Khanda Ta only in the case where C is ra (র) (09B0): র্ৎ, as in ভর্ৎসনা.

3. Only the following combinations with VHCM will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.

10.2 'Sylheti Nāgarī lipi' or 'Siloti'
This version of the Bangla script resembles the 'Kaithī' script (ISO 15924) used by the accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, widely in use during the 1880s. There were several other names for Sylheti Nāgarī or Siloti (129), such as 'Jalalabada Nāgarī', 'Fula (flower) Nāgarī', 'Muslim Nāgarī' or 'Muhammad Nāgarī'. It is said that Shah Jalal had brought the script with him to Sylhet in the 13th-14th century (138), although some have suggested that it was an invention of the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script, with 32 symbols, are reproduced here (138).

Table 17: The Script Table of Sylheti Nāgarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.

10.3.1 Bangla and Nāgarī or Devanāgarī

Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable
Table 18: Bangla and Devanāgarī confusable code points

10.3.2 Bangla and Gurmukhī

Bangla | Gurmukhī | NBGP Decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable
Table 19: Bangla and Gurmukhī confusable code points

Bangla | Gurmukhī | NBGP Decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable
Table 20: Bangla and Gurmukhī distinguishable code points

10.3.3 Bangla and Oriya (Odia)

Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable
Table 21: Bangla and Oriya confusable code points

Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable
Table 22: Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs

Consonants | Phonetic Value | Allographs / Clusters | Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (not privileged enough to have clusters as a first member) Ouml(Ocirc+ঝ) Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (not privileged enough to have clusters as a first member) egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value, but it is mostly pronounced as n.
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (not privileged enough to have clusters)
য dʒ The secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word, it is a vowel æ (as in শ্যামল, র‍্যাপার, etc.). But after a non-initial consonant it just doubles it in pronunciation (as in কার্য, ধার্য, etc.). The র + য combination has two physical manifestations: র‍্য and র্য.
ক্য (ক্ + য), স্য (স্ + য), র‍্য (র + য) [Just র‍্য is never used in Bangla orthography; র‍্যা is, but then its last two symbols, Ya-phala and a-kara, constitute a vowel sign representing the vowel অ্যা.]
66
Consonants PhoneticValue Allographs
Clusters TransparentForm(BanglaAkademifont)
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ hwordfinallywordmediallyitdoublesthepronunciationofthefollowingconsonant
অঃকঃ
অব
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
S1 | 0985 09CD 09AF 09BE | অ্যা | BENGALI LETTER A, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla, Assamese | [112], [122]
S2 | 098F 09CD 09AF 09BE | এ্যা | BENGALI LETTER E, BENGALI SIGN VIRAMA, BENGALI LETTER YA, BENGALI VOWEL SIGN AA | Bangla | [112]
Table 9: Sequences
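The two sequences in Table 9 can be checked mechanically against their character names. The following is a minimal sketch, assuming Python's standard `unicodedata` module; the names `S1`, `S2` and `describe` are illustrative only and are not part of the proposal.

```python
import unicodedata

# Code point sequences S1 and S2 as listed in Table 9.
S1 = "\u0985\u09CD\u09AF\u09BE"  # A + VIRAMA + YA + AA
S2 = "\u098F\u09CD\u09AF\u09BE"  # E + VIRAMA + YA + AA


def describe(sequence: str) -> list[str]:
    """Return 'U+XXXX NAME' for every code point in the sequence."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in sequence]


if __name__ == "__main__":
    for label, seq in (("S1", S1), ("S2", S2)):
        print(label, seq)
        for line in describe(seq):
            print("   ", line)
```

Both sequences differ only in their first code point; the remaining three code points (VIRAMA, YA, AA) are shared.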
5.2 Code Point Repertoire Exclusion
There are some characters of the Bangla script that find place in Unicode but have not been included in the repertoire in the LGR proposal. The reason for excluding ঌ (U+098C) and ৗ (U+09D7) is that they are rare and obsolete characters.
Sr. No. | Code Point | Glyph | Character Name | Note
1 | U+098C | ঌ | BENGALI LETTER VOCALIC L | Limited or declining use
2 | U+09D7 | ৗ | BENGALI AU LENGTH MARK | Limited or declining use
Table 10: Excluded Code Points
5.3 Code Point Not Used Alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire since it will never be used alone. It will be used only as part of a sequence, in the normalized forms of the three special characters ড়, ঢ় and য়.
Unicode Code Point | Glyph | Character Name | Reason for exclusion
U+09BC | ় | BENGALI SIGN NUKTA | Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড়, ঢ় and য় respectively.
Table 10b: Excluded Code Points
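The "normalized form" mentioned for ড়, ঢ় and য় can be illustrated with Unicode normalization. The sketch below assumes that labels are handled in NFC and relies on the fact that U+09DC, U+09DD and U+09DF are composition exclusions, so NFC yields the base letter followed by U+09BC. The helper name `normalize_label` is hypothetical.

```python
import unicodedata

# Precomposed letters and the base + NUKTA sequences they normalize to.
NUKTA_FORMS = {
    "\u09DC": "\u09A1\u09BC",  # RRA  -> DDA  + NUKTA
    "\u09DD": "\u09A2\u09BC",  # RHA  -> DDHA + NUKTA
    "\u09DF": "\u09AF\u09BC",  # YYA  -> YA   + NUKTA
}


def normalize_label(label: str) -> str:
    """Return the NFC form of a label (assumed normalization for this sketch)."""
    return unicodedata.normalize("NFC", label)


if __name__ == "__main__":
    for precomposed, expected in NUKTA_FORMS.items():
        result = normalize_label(precomposed)
        print(f"U+{ord(precomposed):04X} ->",
              " ".join(f"U+{ord(c):04X}" for c in result),
              "(matches table)" if result == expected else "(differs)")
```

This is why U+09BC only ever appears in the LGR as the second member of one of these three sequences.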
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20.11.2009 and 18.07.2013.

5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) began with the following variables:
C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta

The Augmented Backus-Naur Formalism (ABNF) will use the following operators:
Sr. Number | Operator | Function
1 | "|" | Alternative
2 | "[ ]" | Optional
3 | "*" | Variable Repetition
4 | "( )" | Sequence Group
Table 11: The ABNF Formalism

In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given to facilitate understanding.
5.4.2 The Vowel Sequence
To facilitate understanding for users of other Brāhmī scripts, equivalents in Devanāgarī are provided wherever necessary.
A vowel sequence is made up of a single vowel. It may be followed, but not necessarily (optionally), by an Anusvāra/onuʃʃār (B), Candrabindu (D) or a Visarga/biʃɔrgo (X). The number of D, B or X which can follow a V in Bangla may not be restricted to one. Going by the rules illustrated in the document, it is clear that formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have formations or sequences such as anusvara followed by a chandrabindu on one hand, and visarga followed by a chandrabindu on the other, as in হ্যাঁংচা 'hænchā' and হ্যাঁঃ 'hæn' respectively. The possibility of a Visarga or Anusvāra (onuʃʃār) following a Candrabindu exists in Bangla.
A vowel can optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phala. Ya-phala is a presentation form of U+09AF Bangla letter য or 'ya', represented by the sequence <U+09CD (BENGALI SIGN VIRAMA, i.e. Bangla sign Hasanta or Virama), U+09AF য (BENGALI LETTER YA)>. Ya-phala has a special form, ্য. Again, when combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. 'aa' (ā)), it is used for transcribing [æ], as in the "a" in the English word "bat", written in Bangla as ব্যাট.
A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example:
V    অ    अ

5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples:
VB     অং     अं
VD     অঁ     अँ
VX     অঃ     अः
VDB    অঁং    अँं
VDX    অঁঃ    अँः
VHCM   অ্যা, এ্যা

5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], or Candrabindu [D], or Visarga [X], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
VHCMB    অ্যাং, এ্যাং
VHCMD    অ্যাঁ, এ্যাঁ
VHCMX    অ্যাঃ, এ্যাঃ
VHCMDB   অ্যাঁং, এ্যাঁং
VHCMDX   অ্যাঁঃ, এ্যাঁঃ
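The vowel-sequence combinations listed in 5.4.2.1 to 5.4.2.3 can be expressed as a single regular expression. This is only a sketch under stated assumptions: the set of independent vowels used for V is an approximation of the repertoire (the authoritative list is in Section 5.1), the VHCM form is restricted to the two sequences of Table 9, and the names `VOWEL_SEQ` and `is_vowel_sequence` are hypothetical.

```python
import re

# Assumed character classes (approximation of the repertoire in Section 5.1).
V = "\u0985-\u098B\u098F\u0990\u0993\u0994"  # independent vowels
B, D, X = "\u0982", "\u0981", "\u0983"       # anusvara, candrabindu, visarga
H, YA, AA = "\u09CD", "\u09AF", "\u09BE"     # hasanta, ya, vowel sign aa

# Optional tail: B | D | X | DB | DX
TAIL = f"(?:{D}[{B}{X}]?|[{B}{X}])?"

# Either the VHCM sequence (A or E + hasanta + ya + aa) or a single vowel,
# followed by the optional tail.
VOWEL_SEQ = re.compile(f"(?:[\u0985\u098F]{H}{YA}{AA}|[{V}]){TAIL}")


def is_vowel_sequence(text: str) -> bool:
    """True if the whole string is one vowel sequence under the rules above."""
    return VOWEL_SEQ.fullmatch(text) is not None


if __name__ == "__main__":
    samples = ["\u0985", "\u0985\u0982", "\u0985\u0981\u0983",
               "\u0985\u09CD\u09AF\u09BE\u0981\u0982", "\u0985\u0982\u0982"]
    for s in samples:
        print([f"U+{ord(c):04X}" for c in s], is_vowel_sequence(s))
```

Repeated modifiers such as VBB fail to match because the tail admits at most one of B or X, optionally preceded by a single D.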
5.4.3 The Consonant Sequence

5.4.3.1 A Single Consonant (C)
Example:
C    ক    क

5.4.3.2 A Consonant with Conditions
A Consonant is optionally followed by a dependent vowel sign kāra (Mātrā) [M], or Anusvāra [B], or Candrabindu [D], or Visarga [X], or Hasanta (also known as Virama) [H], or Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples:
CM     কি     कि
CB     কং     कं
CD     কঁ     कँ
CX     কঃ     कः
CH     ক্     क् (pure consonant)
CDB    কঁং    कँं
CDX    কঁঃ    कँः

5.4.3.3 CM Sequence
A CM sequence can be optionally followed by B, D, X, DB or DX.
Examples:
CMB     কীং     कीं
CMD     কাঁ     काँ
CMX     বীঃ     वीः
CMDB    কাঁং    काँं
CMDX    কাঁঃ    काँः

5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Examples:
CHC        ন্ত → ন + ্ + ত                    न + ् + त
CHCHC      ন্ত্র → ন + ্ + ত + ্ + র            न + ् + त + ् + र
CHCHCHC    ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য    न + ् + त + ् + र + ् + य
5.4.3.5 Subsets
While considering its subsets, as a representative example we will consider the combination CHC only; however, the same is equally applicable to CHCHC and CHCHCHC.
[A] The combination may be followed by M, B, D, X, DB or DX.
Examples:
CHCM     ক্কী → ক + ্ + ক + ী          क्की → क + ् + क + ी
CHCB     ক্কং → ক + ্ + ক + ং          क्कं → क + ् + क + ं
CHCD     ক্কঁ → ক + ্ + ক + ঁ          क्कँ → क + ् + क + ँ
CHCX     ক্কঃ → ক + ্ + ক + ঃ          क्कः → क + ् + क + ः
CHCDB    ক্কঁং → ক + ্ + ক + ঁ + ং      क्कँं → क + ् + क + ँ + ं
CHCDX    ক্কঁঃ → ক + ্ + ক + ঁ + ঃ      क्कँः → क + ् + क + ँ + ः
[B] *3(CH)CM may further be followed by a B, D, X, DB or DX.
Examples:
CHCMB     ক্কীং → ক + ্ + ক + ী + ং        क्कीं → क + ् + क + ी + ं
CHCMD     ক্কাঁ → ক + ্ + ক + া + ঁ        क्काँ → क + ् + क + ा + ँ
CHCMX     ক্কীঃ → ক + ্ + ক + ী + ঃ        क्कीः → क + ् + क + ी + ः
CHCMDB    ক্কাঁং → ক + ্ + ক + া + ঁ + ং    क्काँं → क + ् + क + ा + ँ + ं
CHCMDX    ক্কাঁঃ → ক + ্ + ক + া + ঁ + ঃ    क्काँः → क + ् + क + ा + ँ + ः
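The consonant-sequence forms of 5.4.3.1 to 5.4.3.5 can likewise be sketched as a regular expression built around `*3(CH)C`. The character classes for C and M below are assumptions that approximate the repertoire (the normalized nukta forms and Khanda Ta are left out for brevity), and `is_consonant_sequence` is a hypothetical helper name.

```python
import re

# Assumed character classes (approximation of the repertoire in Section 5.1).
C = "\u0995-\u09A8\u09AA-\u09B0\u09B2\u09B6-\u09B9"  # consonants
M = "\u09BE-\u09C3\u09C7\u09C8\u09CB\u09CC"          # dependent vowel signs
B, D, X, H = "\u0982", "\u0981", "\u0983", "\u09CD"

# Optional tail: B | D | X | DB | DX
TAIL = f"(?:{D}[{B}{X}]?|[{B}{X}])?"

# *3(CH)C : one consonant, preceded by at most three "consonant + hasanta" pairs.
CLUSTER = f"[{C}](?:{H}[{C}]){{0,3}}"

# Either a pure consonant (CH), or a cluster with optional M and optional tail.
CONSONANT_SEQ = re.compile(f"(?:[{C}]{H}|{CLUSTER}[{M}]?{TAIL})")


def is_consonant_sequence(text: str) -> bool:
    """True if the whole string is one consonant sequence under the rules above."""
    return CONSONANT_SEQ.fullmatch(text) is not None


if __name__ == "__main__":
    four = "\u09A8\u09CD\u09A4\u09CD\u09B0\u09CD\u09AF"  # na, ta, ra, ya joined by hasanta
    five = "\u0995" + "\u09CD\u0995" * 4                 # five consonants: too many
    print(is_consonant_sequence(four), is_consonant_sequence(five))
```

The `{0,3}` bound is what limits a cluster to four consonants; a second vowel sign or a doubled modifier simply has no place left in the pattern.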
5.4.4 The Khanda-Ta Sequence

5.4.4.1 A Single 'Khanda'-Ta (Z)
Example:
Z    ৎ

5.4.4.2 A Khanda Ta Combination¹⁰
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) 'scolding'.
Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) could be described briefly here. Let us take up S in the first instance. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E). For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF ya, preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English 'bat', 'fat', etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only the consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in the orthography, wherein Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
Another case refers to the formation of repha and ra-phalā in the said script, mentioned in Table 16 as P. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra "cycle"). The point is that in both the cases the slot for ra could be filled by Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā in both the cases remain the same.

10 Refer to Rule P in Section 7, Table 16.
6 Variants
This section talks about the variants in the Bangla script. The NBGP categorizes these confusing variants into two groups:
Group 1: Confusing due to pure visual similarity.
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community.
For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formations. These can cause confusion even to a careful observer and hence are being proposed as variants.
Variants are generated in a script when two or more forms are formed with different storage or code points. In Bangla, the e-kāra, ā-kāra and the o-kāra have different code points. One can type o with a consonant at one go, and get the same result by typing e-kāra and ā-kāra as two separate keys. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered as a case of variant, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants
However, we propose two cases of true in-script variants in Bangla script.

CASE I
As far as true variants in Bangla are concerned, we may draw our attention to cases wherein Hasanta with (U+09A5) থ (tha) appears as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, could be a little confusing, as the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final position (as shown below in example no. 1). It could be typed wrongly as well, thinking it was a হ (ha) U+09B9, increasing the chances of risks in label writing and identification.
Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)
The fonts which represent the traditional Bangla writing system could tend to create this problem. Therefore these may be taken as cases of variants in Bangla.

CASE II
Another interesting example of a variant is encountered in ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra). 'Repha' could be formed by two sequences (mainly because both Assamese and Bangla find place in the same Unicode points, and 'B_RA' as well as 'A_RA' refer to the same phonetic element). Here the final ligatures look the same and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C
Where
B_RA = U+09B0 BENGALI LETTER RA (র)
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)

Example:
Sequence 1 (Using Bangla RA) | Ligature 1 | Sequence 2 (Using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12: Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. Addition of Repha does not make any difference.
'Ra-phala' could be formed by two sequences on similar grounds, and the final ligatures would look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
Where C1 = any consonant except Khanda-ta

Example:
Sequence 1 (Using Bangla RA) | Ligature 1 | Sequence 2 (Using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13: Example of Ra-phala

As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusability to the end-users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP concluded to define র and ৰ as variant code points, where only one variant set between র and ৰ could cover all cases. But this will create blocked variant labels: e.g. if someone registers "ররর", the variant label "ৰৰৰ" will be generated as a variant and will be blocked, and vice versa. However, it is only blocked at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, it is still possible.
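The effect of treating র and ৰ as variant code points can be sketched as follows. This is an illustrative assumption about how variant labels might be enumerated, not the LGR tooling itself: it substitutes the two code points position by position, so it also lists mixed forms that the text above does not spell out. The names `variant_labels` and `conflicts` are hypothetical.

```python
from itertools import product

# The single in-script variant set discussed above: Bangla RA and Assamese RA.
RA_VARIANTS = ("\u09B0", "\u09F0")


def variant_labels(label: str) -> set[str]:
    """All labels obtained by swapping RA variants, excluding the label itself."""
    options = [RA_VARIANTS if ch in RA_VARIANTS else (ch,) for ch in label]
    return {"".join(choice) for choice in product(*options)} - {label}


def conflicts(registered: str, candidate: str) -> bool:
    """True if the candidate is the registered label or one of its variants."""
    return candidate == registered or candidate in variant_labels(registered)


if __name__ == "__main__":
    registered = "\u09B0" * 3                         # three Bangla RA
    print(conflicts(registered, "\u09F0" * 3))        # same length, all Assamese RA
    print(conflicts(registered, "\u09F0" * 2))        # different length
```

Because substitution never changes the length of a label, a two-letter or four-letter label is unaffected by the registration of a three-letter one, which matches the remark that blocking applies only at the label level.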
6.2 Cross-Script Variants
A concise cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia¹¹ (formerly Oriya), keeping in mind the visual and technical confusions they may cause as labels on the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain characters which are somewhat similar, they have not been included here. They have been provided in the Appendix (10.2) for reference.

1. Bangla and Nāgarī/Devanāgarī Script
Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14: Bangla and Devanāgarī cross-script variant code points

2. Bangla and Gurmukhī Script
Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15: Bangla and Gurmukhī cross-script variant code points

11 Unicode uses Oriya for the script, although Odia is now the official term used.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in Bangla¹² script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications.
Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories as mentioned in the table provided in the Code point repertoire (Section 5.1).

C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), or (a/e) Ya-phalā (V1 H C1 M1), where V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E), H is 09CD (্ - BENGALI SIGN VIRAMA), C1 is 09AF (য - BENGALI LETTER YA) and M1 is 09BE (া - BENGALI VOWEL SIGN AA). S1 and S2 are valid even though they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL) and H is 09CD (্ - BENGALI SIGN VIRAMA).
Table 16: Symbols used in WLE rules

12 As used by Unicode, denoting and including both Assamese and Maṇipuri.

It is also perhaps ideal to mention here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters" that could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, p. 4]. More details of such combinations could be seen in Pabitra Sarkar (1993) [135].

7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and various restrictions, below are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়.
2. H must be preceded by C.
   Example: ক্
3. M must be preceded by C.
   Example: কা
4. D must be preceded by either of V, C or M.
   Example: আঁ, খঁ, খাঁ, হাঁ
5. X must be preceded by either of V, C, M or D.
   Example: উঃ, খঃ, বাঃ, দঁঃ
6. B must be preceded by either of V, C, M or D.
   Example: আং, ইং, কং
7. Z must be preceded by V, C, M, D, B, X or P.
   Example: ইৎ, কৎ, র্ৎ
   (S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V preceded by H.
9. S CANNOT be preceded by H.
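The nine rules above amount to a "what may precede what" check over a tokenized label. The sketch below is one possible reading, not the normative XML ruleset: the character classes approximate the repertoire of Section 5.1, input is assumed to be NFC-normalized, S is treated like M for what may follow it (as the note under rule 7 suggests), and all names (`tokenize`, `is_valid_label`, `ALLOWED_BEFORE`) are hypothetical.

```python
import re


def _chars(*ranges):
    """Expand inclusive (low, high) code point ranges into a set of characters."""
    return {chr(cp) for low, high in ranges for cp in range(low, high + 1)}


# Assumed classes; the authoritative repertoire is in Section 5.1.
CLASS = {}
for _kind, _members in {
    "V": _chars((0x0985, 0x098B), (0x098F, 0x0990), (0x0993, 0x0994)),
    "C": _chars((0x0995, 0x09A8), (0x09AA, 0x09B0), (0x09B2, 0x09B2),
                (0x09B6, 0x09B9), (0x09F0, 0x09F1)),
    "M": _chars((0x09BE, 0x09C3), (0x09C7, 0x09C8), (0x09CB, 0x09CC)),
    "B": {"\u0982"}, "D": {"\u0981"}, "X": {"\u0983"},
    "H": {"\u09CD"}, "Z": {"\u09CE"},
}.items():
    for _ch in _members:
        CLASS[_ch] = _kind

NUKTA = "\u09BC"
NUKTA_BASES = {"\u09A1", "\u09A2", "\u09AF"}              # rule 1: CN = base + nukta
RA = {"\u09B0", "\u09F0"}                                 # C2 in symbol P
S_RE = re.compile("[\u0985\u098F]\u09CD\u09AF\u09BE")     # sequences S1 and S2

# Rules 2-7: symbol -> set of symbols allowed immediately before it.
ALLOWED_BEFORE = {
    "H": {"C"},
    "M": {"C"},
    "D": {"V", "C", "M"},
    "X": {"V", "C", "M", "D"},
    "B": {"V", "C", "M", "D"},
    "Z": {"V", "C", "M", "D", "B", "X"},   # P is checked separately
}


def tokenize(label):
    """Split a label into (symbol, text) pairs; raise ValueError if unknown."""
    tokens, i = [], 0
    while i < len(label):
        if S_RE.match(label, i):
            tokens.append(("S", label[i:i + 4])); i += 4
        elif label[i] in NUKTA_BASES and label[i + 1:i + 2] == NUKTA:
            tokens.append(("C", label[i:i + 2])); i += 2
        elif label[i] in CLASS:
            tokens.append((CLASS[label[i]], label[i])); i += 1
        else:
            raise ValueError(f"code point outside assumed repertoire: U+{ord(label[i]):04X}")
    return tokens


def is_valid_label(label):
    """Apply rules 2-9 to every token in the label."""
    try:
        tokens = tokenize(label)
    except ValueError:
        return False
    prev = prev2 = None
    for kind, text in tokens:
        # S ends with M, so whatever may follow M may follow S.
        prev_kind = None if prev is None else ("M" if prev[0] == "S" else prev[0])
        if kind in ALLOWED_BEFORE:
            ok = prev_kind in ALLOWED_BEFORE[kind]
            if kind == "Z" and prev_kind == "H":          # rule 7, case P = RA + H
                ok = prev2 is not None and prev2[1] in RA
            if not ok:
                return False
        elif kind in ("V", "S") and prev_kind == "H":      # rules 8 and 9
            return False
        prev2, prev = prev, (kind, text)
    return bool(tokens)
```

A table-driven check of this kind keeps each rule visible as data; the only special cases are the sequences S (recognized before single code points) and the P context for Khanda Ta.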
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in adding such ligatures, because any user can create them easily even though they may not be attested combinations.

7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া /bæŋk ʌv ɪndiə/ (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE) (meaning Bank of India). This is a case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of hyphen, space or the Zero Width Non-Joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic communities; hence this is explicitly prohibited by the NBGP.
In future, if required, depending on the prevailing requirements from the community, the future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules the ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.

1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Examples:
্ক    ्क
াক    ाक
ংক    ंक
ঁক    ँक
ঃক    ःक
As can be seen, such a combination will automatically result in a "golu", or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.

2. H is not permitted after V, B, D, X, M, S.
Examples:
অ্    अ्
কং্    कं्
কঁ্    कँ्
কঃ্    कः्
কি্    कि्

3. The number of B, D or X permitted after a Consonant or Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalid.
Examples:
কংং    कंं
কঁঁ    कँँ
কঃঃ    कःः
কাঁঁ    काँँ
কীঃঃ    कीःः
অংং    अंं
অঁঁ    अँँ
অঃঃ    अःः

4. The number of M permitted after a Consonant is restricted to one.
Example:
কীী    कीी

5. M is not permitted after V.
Examples:
ইা, ঈৌ    ईा, ईौ

6. The combinations of Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Examples:
কংঃ    कंः
কঃং    कःं
8 Contributors

8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members

8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof. Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof. Rafiqul Islam, National Professor of Humanities, Dhaka
Prof. Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof. Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof. Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof. Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals' Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje & Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. 'Bangla Harapher Panch Parba'. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the MS University of Baroda, Social Science Number, 29.2, 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25.11.2017.
[124] Ethnologue: Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25.11.2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20.10.2019.
[126] Wikipedia: Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25.11.2017.
[129] Omniglot: Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10.5.2018.
[130] Wikipedia: Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25.11.2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20/, accessed on 10.5.2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10.5.2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10.5.2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1.1, 23-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1.2, 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). 'Srīhatta-Nāgarī Lipir Utpatti o Bikāś'. Bangla Academy Patrika (Dhaka), Vol. 1.2 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia: Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19.5.2018.
[139] Furui, Ryosuke (2015). 'Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century', in Bhairabi Prasad Sahu & Hermann Kulke, eds. Interrogating Political Systems: Integrative Processes and States in Pre-Modern India, Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). 'Phones of Bengali'. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646 (UNICODE). https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia: Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. 'Bangla Spelling Reform: the Long and Short of It'. Bangla Journal 19, 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. "What has happened So Far In terms of Script Reforms". Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka, on 10.07.2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 - Core Specification, Chapter 12, p. 473.
10 Appendix-I

10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when applied to a specific language/script, certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions will help to fine-tune the ABNF. In the case of Bangla¹³ in particular, the following rules apply:

1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold 'varga' (as defined under Table 5). Moreover, Bangla does not allow ya (য়) at the beginning of a word either, but we can cite a couple of native examples, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem 'Lichuchor' written by Kazi Nazrul Islam. However, there are instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning, as in য়াকুব. In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).

2. CH can come with Khanda Ta only in the case where C is ra (র) (09B0):
র্ৎ, as in ভর্ৎসনা

3. Only the following combinations with VHCM will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
10.2 'Sylheti Nagarī lipi' or 'Siloti'
This version of Bangla script resembles the 'Kaithī' script (ISO 12954) used by the accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, widely in use during the 1880s. There were several other names of Sylheti Nagarī or Siloti (129), such as 'Jalalabada Nagarī', 'Fula (flower) Nagarī', 'Muslim Nagarī' or 'Muhammad Nagarī'. It is said that Shah Jalala had brought the script with him in the 13th-14th century to Sylhet (138), although some suggested that it was an invention by the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here (138).

Table 17 - The Script Table of Sylheti Nagarī or Siloti

13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
10.3 Confusable Code Points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable but not enough to be defined as variant code points.

10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable
Table 18: Bangla and Devanāgarī confusable code points

10.3.2 Bangla and Gurmukhī
Bangla | Gurmukhī | NBGP Decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable
Table 19: Bangla and Gurmukhī confusable code points

Bangla | Gurmukhī | NBGP Decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable
Table 20: Bangla and Gurmukhī distinguishable code points

10.3.3 Bangla and Oriya (Odia)
Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable
Table 21: Bangla and Oriya confusable code points

Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable
Table 22: Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs

Consonants | Phonetic Value | Allographs: Clusters and Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (not privileged enough to have clusters as a first member) Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (not privileged enough to have clusters as a first member) egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value but is mostly pronounced as n
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
65
Consonants PhoneticValue Allographs
Clusters TransparentForm(BanglaAkademifont)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒThesecondarysymbol(allograph)jɔ-phalahastwophoneticvaluesWhenaddedtotheinitialconsonantinaworditisavowelaelig(asinশ8ামলর 8াপারetc)Butafteranon-initialconsonantitjustdoublesitinpronunciation(asinকায6ধায6etc)The+যcombinationhastwophysicalmanifestationsmdashর 8andয6
ক8(p+য)স8(+য)র 8(+য)[Justর 8isneverusedinBanglaorthographyর 8াisbutthenitslasttwosymbolsYa-phalaa-karaconstituteavowelsignrepresentingthevowelঅ8া]
66
Consonants PhoneticValue Allographs
Clusters TransparentForm(BanglaAkademifont)
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ hwordfinallywordmediallyitdoublesthepronunciationofthefollowingconsonant
অঃকঃ
অব
37
5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire because it is never used alone. It is used only as part of a sequence, in the normalized forms of three special characters: ড়, ঢ় and য়.

Unicode Code Point   Glyph   Character Name       Reason for exclusion
U+09BC               ়       BENGALI SIGN NUKTA   Never used alone. Only used together with U+09A1 ড, U+09A2 ঢ and U+09AF য to form ড়, ঢ় and য় respectively.
Table 10b: Excluded Code Points
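The normalization behaviour described above can be checked with a short sketch. It assumes that labels are processed in Unicode Normalization Form C, under which the three precomposed letters are stored as base letter plus nukta. The helper names `normalized_sequence` and `nukta_is_well_placed` are illustrative only and are not part of the proposal.

```python
import unicodedata

NUKTA = "\u09BC"  # BENGALI SIGN NUKTA

# Precomposed letters and the base letters they decompose to.
PRECOMPOSED = {
    "\u09DC": "\u09A1",  # ড় -> ড + nukta
    "\u09DD": "\u09A2",  # ঢ় -> ঢ + nukta
    "\u09DF": "\u09AF",  # য় -> য + nukta
}


def normalized_sequence(text: str) -> str:
    """Return the NFC form; the three letters above become base + nukta."""
    return unicodedata.normalize("NFC", text)


def nukta_is_well_placed(label: str) -> bool:
    """True if every nukta directly follows ড, ঢ or য (hypothetical check)."""
    allowed_bases = set(PRECOMPOSED.values())
    for i, ch in enumerate(label):
        if ch == NUKTA and (i == 0 or label[i - 1] not in allowed_bases):
            return False
    return True
```

Under this reading, a nukta attached to any other letter, or standing at the start of a label, would be rejected, which matches the intent of excluding U+09BC as a stand-alone code point.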
5.4 The Basis of Present IDN
The present LGR has also benefited from the earlier work on IDN for Bangla (different versions) done for भारत or ভারত, drafted between 20/11/2009 and 18/07/2013.

5.4.1 The ABNF Variables
The Augmented Backus-Naur Formalism (ABNF) began with the following variables:

C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (onuʃʃār)
D → Candrabindu
X → Visarga (biʃɔrgo)
H → Hasanta / Virama
Z → Khanda Ta

The Augmented Backus-Naur Formalism (ABNF) will use the following operators:

Sr. Number   Operator   Function
1            “|”        Alternative
2            “[ ]”      Optional
3            “*”        Variable Repetition
4            “( )”      Sequence Group
Table 11: The ABNF Formalism
5.4.2 The Vowel Sequence
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given. To facilitate understanding by users of other Brāhmī scripts, equivalents in Devanāgarī are provided wherever necessary.
A vowel sequence is made up of a single vowel. It may optionally be followed by an Anusvāra (onuʃʃār) (B), a Candrabindu (D) or a Visarga (biʃɔrgo) (X). The number of D, B or X which can follow a V in Bangla is not always restricted to one. Going by the rules illustrated in this document, formations such as VDD, VBB and VXX are invalid orthographic units. However, it is valid and possible to have sequences in which a candrabindu is followed by an anusvāra on the one hand, or by a visarga on the other, as in হ্যাঁংচা ‘hænchā’ and হ্যাঁঃ ‘hæn’ respectively. The possibility of a Visarga or Anusvāra (onuʃʃār) following a Candrabindu thus exists in Bangla.
A vowel can optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phala. “Ya-phala is a presentation form of U+09AF য BENGALI LETTER YA. Represented by the sequence <U+09CD ্ BENGALI SIGN VIRAMA (Hasanta), U+09AF য BENGALI LETTER YA>, Ya-phala has a special form (্য). When combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. ‘aa’ (ā)), it is used for transcribing [æ], as in the “a” in the English word “bat”, written in Bangla as ব্যাট.”
A vowel sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example:
V       অ       अ

5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], Candrabindu [D], Visarga [X], Candrabindu + Anusvāra [DB] or Candrabindu + Visarga [DX], or by the combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples:
VB      অং      अं
VD      অঁ      अँ
VX      অঃ      अः
VDB     অঁং     अँं
VDX     অঁঃ     अँः
VHCM    অ্যা, এ্যা

5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], Candrabindu [D], Visarga [X], Candrabindu + Anusvāra [DB] or Candrabindu + Visarga [DX].
Examples:
VHCMB     অ্যাং, এ্যাং
VHCMD     অ্যাঁ, এ্যাঁ
VHCMX     অ্যাঃ, এ্যাঃ
VHCMDB    অ্যাঁং, এ্যাঁং
VHCMDX    অ্যাঁঃ, এ্যাঁঃ
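The vowel-sequence grammar of 5.4.2 can be sketched as a regular expression over code points. The character ranges below are an assumption taken from the Unicode Bengali block, not the normative repertoire of this proposal, and `is_vowel_sequence` is a hypothetical helper name. The sketch follows the generic VHCM form of 5.4.2.2 rather than the narrower set of permitted VHCM combinations.

```python
import re

# Assumed class membership (Unicode Bengali block), for illustration only.
V = "[\u0985-\u098C\u098F\u0990\u0993\u0994]"          # independent vowels
C = "[\u0995-\u09A8\u09AA-\u09B0\u09B2\u09B6-\u09B9]"  # consonants
M = "[\u09BE-\u09C4\u09C7\u09C8\u09CB\u09CC]"          # kāra (mātrā)
B, D, X, H = "\u0982", "\u0981", "\u0983", "\u09CD"

# V, optionally HCM, then optionally one of B, D, X, DB or DX.
VOWEL_SEQUENCE = re.compile(f"{V}(?:{H}{C}{M})?(?:{D}?[{B}{X}]|{D})?")


def is_vowel_sequence(text: str) -> bool:
    """True if the whole string is one vowel sequence as sketched above."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None
```

The tail `(?:D?[BX]|D)?` admits exactly B, D, X, DB and DX, so doubled signs such as VBB or VXX, and the orders BD and XB, are rejected.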
5.4.3 The Consonant Sequence

5.4.3.1 A Single Consonant (C)
Example:
C       ক       क

5.4.3.2 A Consonant with Conditions
A Consonant is optionally followed by a dependent vowel sign kāra (Mātrā) [M], Anusvāra [B], Candrabindu [D], Visarga [X], Hasanta (also known as Virama) [H], Candrabindu + Anusvāra [DB] or Candrabindu + Visarga [DX].
Examples:
CM      কি      कि
CB      কং      कं
CD      কঁ      कँ
CX      কঃ      कः
CH      ক্      क्   (pure consonant)
CDB     কঁং     कँं
CDX     কঁঃ     कँः

5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB or DX.
Examples:
CMB     কীং     कीं
CMD     কাঁ     काँ
CMX     বীঃ     वीः
CMDB    কাঁং    काँं
CMDX    কাঁঃ    काँः

5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Examples:
CHC       ন্ত → ন + ্ + ত                      न + ् + त
CHCHC     ন্ত্র → ন + ্ + ত + ্ + র              न + ् + त + ् + र
CHCHCHC   ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য    न + ् + त + ् + र + ् + य
5.4.3.5 Subsets
While considering the subsets, we take the combination CHC as a representative example; however, the same is equally applicable to CHCHC and CHCHCHC.
[A] The combination may be followed by M, B, D, X, DB or DX.
Examples:
CHCM     ক্কী → ক ্ ক ী        क्की → क ् क ी
CHCB     ক্কং → ক ্ ক ং        क्कं → क ् क ं
CHCD     ক্কঁ → ক ্ ক ঁ        क्कँ → क ् क ँ
CHCX     ক্কঃ → ক ্ ক ঃ        क्कः → क ् क ः
CHCDB    ক্কঁং → ক ্ ক ঁ ং     क्कँं → क ् क ँ ं
CHCDX    ক্কঁঃ → ক ্ ক ঁ ঃ     क्कँः → क ् क ँ ः
[B] *3(CH)CM may further be followed by a B, D, X, DB or DX.
Examples:
CHCMB    ক্কীং → ক ্ ক ী ং       क्कीं → क ् क ी ं
CHCMD    ক্কাঁ → ক ্ ক া ঁ       क्काँ → क ् क ा ँ
CHCMX    ক্কীঃ → ক ্ ক ী ঃ       क्कीः → क ् क ी ः
CHCMDB   ক্কাঁং → ক ্ ক া ঁ ং    क्काँं → क ् क ा ँ ं
CHCMDX   ক্কাঁঃ → ক ্ ক া ঁ ঃ    क्काँः → क ् क ा ँ ः
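The consonant-sequence patterns of 5.4.3 can be summarised in one expression over class strings (for example "CHCM"). This is a minimal sketch: it works on the class letters C, H, M, B, D and X rather than on code points, and it assumes that a trailing Hasanta is admitted only in the pure-consonant form CH of 5.4.3.2. The name `is_consonant_sequence` is hypothetical.

```python
import re

# Optional trailing signs: B, D, X, DB or DX (never BB, XD, BX and so on).
TAIL = r"(?:D?[BX]|D)?"

CONSONANT_SEQUENCE = re.compile(
    r"(?:CH"              # pure consonant, e.g. ক্
    r"|(?:CH){0,3}C"      # one to four consonants joined by Hasanta
    r"M?" + TAIL + r")"   # optional kāra, then optional trailing signs
)


def is_consonant_sequence(classes: str) -> bool:
    """True if a class string such as 'CHCMDB' is a valid consonant sequence."""
    return CONSONANT_SEQUENCE.fullmatch(classes) is not None
```

The bound `{0,3}` on the `(CH)` group expresses the "up to 4 consonants" limit of 5.4.3.4: three joined consonants plus the final one.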
5.4.4 The Khanda-Ta sequence

5.4.4.1 A single ‘Khanda’-Ta (Z)
Example:
Z       ৎ (= ত্)

5.4.4.2 A Khanda Ta Combination¹⁰
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) ‘scolding’.
Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).

10 Refer to Rule P in Section 7, Table 16.

5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) can be described briefly here. Let us take up S first. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ BENGALI LETTER A and U+098F এ BENGALI LETTER E). To render a Ya-phalā after অ or এ, it is necessary to type U+09CD plus U+09AF (ya) immediately after the said vowel. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound, as in English ‘bat’, ‘fat’ etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a ‘vowel killer’, although it actually indicates the absence of a vowel after the marked consonant. Only consonants can be marked with Hasanta. But as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
The other case refers to the formation of repha and ra-phalā in the script, listed in Table 16 as P. Owing to its co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka “the sun”); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra “cycle”). The point is that in both cases the slot for ra can be filled by the Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.
6 Variants
This section talks about the variants in the Bangla script. The NBGP categorizes these confusing variants into two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These are not cases of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formations. They can cause confusion even to a careful observer and are hence proposed as variants.
Variants are generated in a script when two or more forms that look the same are stored with different code points. In Bangla, the e-kāra, the ā-kāra and the o-kāra have different code points. One can type o with a consonant in one go, or obtain the same result by typing e-kāra and ā-kāra as two separate keys. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. However, this is not considered a case of variant, because a kāra followed by a kāra is not allowed.
6.1 In-Script Variants
However, we propose two cases of true in-script variants in the Bangla script.

CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases wherein Hasanta with (U+09A5) থ (tha) appears as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, can be a little confusing, since the থ (tha) in the conjunct appears like a হ (ha). The conjunct can occur in initial, medial or final position (as shown in example 1 below). It could also be typed wrongly, in the belief that it was a হ (ha) U+09B9, increasing the risk in label writing and identification.
Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)
Fonts which represent the traditional Bangla writing system tend to create this problem. Therefore, these may be taken as cases of variants in Bangla.

CASE II
Another interesting example of variants is encountered in the ra + Hasanta and Hasanta + ra combinations when writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing ‘repha’ (involving ra + Hasanta) and ‘ra-phalā’ (involving Hasanta + ra).
‘Repha’ can be formed by two sequences (mainly because both Assamese and Bangla are encoded in the same Unicode block, and ‘B_RA’ as well as ‘A_RA’ refer to the same phonetic element). Here the final ligatures look the same, and are as follows:

(1) B_RA + H + C
(2) A_RA + H + C

Where
B_RA = U+09B0 BENGALI LETTER RA (র), or
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H    = U+09CD BENGALI SIGN VIRAMA (্)
C    = any consonant (theoretically)

Example:
Sequence 1 (Using Bangla RA)          Ligature 1   Sequence 2 (Using Assamese RA)        Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক)      র্ক          U+09F0 (ৰ) U+09CD (্) U+0995 (ক)      ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ)      র্ঠ          U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ)      ৰ্ঠ
Table 12: Example of Repha
Note: As the Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. The addition of Repha does not make any difference.
‘Ra-phala’ can be formed by two sequences on similar grounds, and the final ligatures look the same:

(1) C1 + H + B_RA
(2) C1 + H + A_RA

Where C1 = any consonant except Khanda-ta

Example:
Sequence 1 (Using Bangla RA)          Ligature 1   Sequence 2 (Using Assamese RA)        Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র)      ক্র          U+0995 (ক) U+09CD (্) U+09F0 (ৰ)      ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র)      ন্র          U+09A8 (ন) U+09CD (্) U+09F0 (ৰ)      ন্ৰ
Table 13: Example of Ra-phala

As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusability for end-users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP decided to define র and ৰ as variant code points, so that a single variant set containing র and ৰ covers all cases. This, however, creates blocked variant labels: e.g. if someone registers “ররর”, the variant label “ৰৰৰ” will be generated as a variant and will be blocked, and vice versa. However, blocking applies only at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, this is still possible.
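The label-level blocking described above can be illustrated with a small sketch. It assumes the usual variant-set behaviour in which every র/ৰ substitution of a registered label, including mixed ones, counts as a variant label. The helper names `index_label`, `variant_labels` and `conflicts` are hypothetical.

```python
from itertools import product

B_RA = "\u09B0"  # র BENGALI LETTER RA
A_RA = "\u09F0"  # ৰ BENGALI LETTER RA WITH MIDDLE DIAGONAL
VARIANT_SET = (B_RA, A_RA)


def index_label(label: str) -> str:
    """Map every member of the variant set to one representative (র)."""
    return label.replace(A_RA, B_RA)


def variant_labels(label: str) -> set:
    """All labels obtained by substituting র and ৰ, excluding the original."""
    choices = [VARIANT_SET if ch in VARIANT_SET else (ch,) for ch in label]
    return {"".join(combo) for combo in product(*choices)} - {label}


def conflicts(candidate: str, registered: set) -> bool:
    """True if the candidate is, or is a variant of, a registered label."""
    return index_label(candidate) in {index_label(r) for r in registered}
```

With "ররর" registered, `conflicts` flags "ৰৰৰ" but not "ৰৰ" or "ৰৰৰৰ", because labels of a different length never share an index label.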
6.2 Cross-Script Variants
A crisp cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia¹¹ (formerly Oriya), keeping in mind the visual and technical confusion they may cause as labels in the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are proposed by the NBGP as variants. Although there are certain other characters which are somewhat similar, they have not been included here; they are provided in the Appendix (10.3) for reference.

11 Unicode uses Oriya for the script, although Odia is now the official term used.
1. Bangla and Nāgarī (Devanāgarī) Script

Bangla        Devanāgarī
ম U+09AE      म U+092E
ি U+09BF      ि U+093F
Table 14: Bangla and Devanāgarī cross-script variant code points

2. Bangla and Gurmukhī Script

Bangla        Gurmukhī
ম U+09AE      ਸ U+0A38
ি U+09BF      ਿ U+0A3F
Table 15: Bangla and Gurmukhī cross-script variant code points
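Tables 14 and 15 amount to a small lookup table, sketched below. Only the pairs listed in the two tables are encoded. Whether the Devanāgarī and Gurmukhī code points are also variants of each other is not stated here, so the sketch deliberately does not assume it. The names `CROSS_SCRIPT_VARIANTS` and `are_cross_script_variants` are illustrative.

```python
# Bangla code point -> cross-script variants listed in Tables 14 and 15.
CROSS_SCRIPT_VARIANTS = {
    "\u09AE": {"\u092E", "\u0A38"},  # ম : म (Devanāgarī), ਸ (Gurmukhī)
    "\u09BF": {"\u093F", "\u0A3F"},  # ি : ि (Devanāgarī), ਿ (Gurmukhī)
}


def are_cross_script_variants(a: str, b: str) -> bool:
    """True if one code point is Bangla and the other is a listed variant of it."""
    return (b in CROSS_SCRIPT_VARIANTS.get(a, set())
            or a in CROSS_SCRIPT_VARIANTS.get(b, set()))
```

The check is symmetric in its two arguments, but it pairs only a Bangla code point with a listed counterpart.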
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Bangla¹² script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications.
Below are the symbols used in the WLE rules, for each of the Indic Syllabic Categories as mentioned in the table provided in the Code point repertoire (Section 5.1).

C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), or (a/e) Ya-phalā (V1 H C1 M1), where
    V1 is either 0985 (অ BENGALI LETTER A) or 098F (এ BENGALI LETTER E)
    H is 09CD (্ BENGALI SIGN VIRAMA)
    C1 is 09AF (য BENGALI LETTER YA)
    M1 is 09BE (া BENGALI VOWEL SIGN AA)
    S1 and S2 are valid even if they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where
    C2 is either 09B0 (র BENGALI LETTER RA) or 09F0 (ৰ ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL)
    H is 09CD (্ BENGALI SIGN VIRAMA)
Table 16: Symbols used in WLE rules

12 As used by Unicode, denoting and including both Assamese and Maṇipuri.
It is also perhaps ideal to mention here that in Bangla the consonant letters (or graphemes) are physically joined to form “clusters”, which can theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are “nearly 380 unique consonant clusters”, of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, p. 4]. More details of such combinations can be seen in Pabitra Sarkar (1993) [135].

7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়
2. H must be preceded by C
3. M must be preceded by C
   Example: কা
4. D must be preceded by either of V, C or M
   Example: আঁ, খঁ, খাঁ, হাঁ
5. X must be preceded by either of V, C, M or D
   Example: উঃ, খঃ, বঃ, াঃ, দঁঃ
6. B must be preceded by either of V, C, M or D
   Example: আং, ইং, কং
7. Z must be preceded by V, C, M, D, B, X or P
   Example: ইৎ, কৎ, াৎ, র্ৎ
   (S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H (details in 7.1.1, Case of V preceded by H)
9. S CANNOT be preceded by H
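The rules above lend themselves to a predecessor check. The following is a minimal sketch under stated assumptions: the class membership is taken from the Unicode Bengali block rather than from the normative repertoire, the special sequences S1 and S2 of Table 9 are not modelled, and only the Ya-phalā form of S is handled. All names (`tokenize`, `is_valid_label`, `MUST_FOLLOW`) are hypothetical.

```python
NUKTA, HASANTA = "\u09BC", "\u09CD"
YA, AA = "\u09AF", "\u09BE"
RA = {"\u09B0", "\u09F0"}                     # C2 in rule P
S_VOWELS = {"\u0985", "\u098F"}               # V1 in rule S
NUKTA_BASES = {"\u09A1", "\u09A2", "\u09AF"}  # ড, ঢ, য (rule 1, CN)


def _span(lo, hi):
    return {chr(cp) for cp in range(lo, hi + 1)}


# Assumed class membership, for illustration only.
CLASSES = {
    "V": _span(0x0985, 0x098C) | {"\u098F", "\u0990", "\u0993", "\u0994"},
    "C": (_span(0x0995, 0x09A8) | _span(0x09AA, 0x09B0) | {"\u09B2"}
          | _span(0x09B6, 0x09B9) | {"\u09F0", "\u09F1"}),
    "M": _span(0x09BE, 0x09C4) | {"\u09C7", "\u09C8", "\u09CB", "\u09CC"},
    "B": {"\u0982"}, "D": {"\u0981"}, "X": {"\u0983"},
    "H": {HASANTA}, "Z": {"\u09CE"},
}

# Rules 2-7: class -> classes allowed immediately before it.
MUST_FOLLOW = {"H": "C", "M": "C", "D": "VCM",
               "X": "VCMD", "B": "VCMD", "Z": "VCMDBX"}


def tokenize(label):
    """Split a label into (class, text) pairs; base + nukta counts as one C."""
    tokens = []
    for ch in label:
        if ch == NUKTA and tokens and tokens[-1][1] in NUKTA_BASES:
            tokens[-1] = ("C", tokens[-1][1] + NUKTA)
            continue
        cls = next((k for k, members in CLASSES.items() if ch in members), None)
        if cls is None:
            raise ValueError(f"U+{ord(ch):04X} is outside the sketched repertoire")
        tokens.append((cls, ch))
    return tokens


def is_valid_label(label):
    """Apply rules 2-9 to every position of the label."""
    tokens = tokenize(label)
    for i, (cls, _text) in enumerate(tokens):
        prev_cls, prev_text = tokens[i - 1] if i else (None, "")
        if cls == "V" and prev_cls == "H":
            return False                      # rule 8 (and hence rule 9)
        if cls not in MUST_FOLLOW:
            continue
        if prev_cls is not None and prev_cls in MUST_FOLLOW[cls]:
            continue
        following = [t for _, t in tokens[i + 1:i + 3]]
        if cls == "H" and prev_text in S_VOWELS and following == [YA, AA]:
            continue                          # special sequence S: অ্যা / এ্যা
        if cls == "Z" and prev_cls == "H" and i >= 2 and tokens[i - 2][1] in RA:
            continue                          # Z after P (ra + Hasanta)
        return False
    return True
```

Because every dependent sign needs an allowed predecessor, a label beginning with H, M, B, D, X or Z is rejected without a separate rule.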
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in permitting such ligatures: any user can easily create them, even though they may not be attested combinations.

7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া /bæŋk ʌv ɪndiə/ (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE) (meaning ‘Bank of India’). This is a case where two different words are joined together, the first of which ends with an H (অফ্) and the second begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the intended sound. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, a space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence it is explicitly prohibited by the NBGP.
In future, if required, depending on the prevailing requirements from the community, a future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules that the ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.

1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Example:
্ক      ्क
াক      ाक
ংক      ंक
ঁক      ँक
ঃক      ःक
As can be seen, such a combination automatically results in a “golu” or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.

2. H is not permitted after V, B, D, X, M, S.
Example:
অ্      अ्
অং্     कं्
কঁ্     कँ्
কঃ্     कः्
কি্     कि्

3. The number of B, D or X permitted after a Consonant, a Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalid.
Example:
কংং     कंं
কঁঁ     कँँ
কঃঃ     कःः
কাঁঁ    काँँ
কীঃঃ    कीःः
অংং     अंं
অঁঁ     अँँ
অঃঃ     अःः

4. The number of M permitted after a Consonant is restricted to one.
Example:
কীী     कीी

5. M is not permitted after V.
Example:
ইা, ঈৗ     ईा, ईौ

6. The combinations Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Example:
কংঃ     कंः
কঃং     कःं
8 Contributors

8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members

8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals’ Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje & Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. ‘Bangla Harapher Panch Parba’. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and Language Planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the M. S. University of Baroda, Social Science Number, 29:2, 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25/11/2017.
[124] Ethnologue: Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25/11/2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20/10/2019.
[126] Wikipedia: Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25/11/2017.
[129] Omniglot: Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10/5/2018.
[130] Wikipedia: Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25/11/2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20, accessed on 10/5/2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10/5/2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10/5/2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1:1, 23-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1:2, 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). ‘Srīhatta-Nāgarī Lipir Utpatti o Bikāś’. Bangla Academy Patrika (Dhaka), Vol. 1:2 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia: Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19/5/2018.
[139] Furui, Ryosuke (2015). ‘Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century’, in Bhairabi Prasad Sahu & Hermann Kulke, eds. Interrogating Political Systems: Integrative Processes and States in Pre-Modern India. Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). ‘The Phonemes of Bengali’. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A Phonetic and Phonological Study of Nasals and Nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646. https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia: Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. httpwwwbangladesh2000combdbangla_scripthtml, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. ‘Bangla Spelling Reform: the Long and Short of It’. Bangla Journal 19: 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. “What has happened So Far In terms of Script Reforms”. Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka, on 10/07/2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 – Core Specification. Chapter 12, p. 473.
10 Appendix-I

10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when applied to a specific language or script, certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla¹³ in particular, the following rules apply:

1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of the five-fold ‘varga’ (as defined under Table 5). Bangla does not normally allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem ‘Lichuchor’ written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning, as in য়াকুব. In very recent times, while transliterating some Chinese and Japanese names into Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).

2. CH can come with Khanda Ta only in the case where C is ra (র) (09B0): র্ৎ, as in ভর্ৎসনা.

3. Only the following combinations with VHCM are allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
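Restriction 1 can be expressed as a simple check on the first code point of a label. The sketch below follows the rule exactly as stated, so the transliteration case ৎসেরিং mentioned above would be rejected. Word-initial য় is not blocked, since the text cites attested uses. `initial_restriction` is a hypothetical helper name.

```python
from typing import Optional

# Code points that restriction 1 bars from the start of a label.
FORBIDDEN_INITIAL = {
    "\u09CE": "BENGALI LETTER KHANDA TA",
    "\u099E": "BENGALI LETTER NYA",
    "\u0999": "BENGALI LETTER NGA",
}


def initial_restriction(label: str) -> Optional[str]:
    """Return the name of the offending first code point, or None if allowed."""
    if not label:
        return None
    return FORBIDDEN_INITIAL.get(label[0])
```

A caller would treat any non-None result as a reason to reject the label.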
13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.

10.2 ‘Sylheti Nāgarī lipi’ or ‘Siloti’
This version of the Bangla script resembles the ‘Kaithī’ script (ISO 12954) used by the accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, widely in use during the 1880s. There were several other names of Sylheti Nāgarī or Siloti (129), such as ‘Jalalabada Nāgarī’, ‘Fula (flower) Nāgarī’, ‘Muslim Nāgarī’ or ‘Muhammad Nāgarī’. It is said that Shah Jalal had brought the script with him to Sylhet in the 13th-14th century (138), although some have suggested that it was an invention of the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here (138).

Table 17: The Script Table of Sylheti Nāgarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.

10.3.1 Bangla and Nāgarī or Devanāgarī

Bangla        Devanāgarī     NBGP Decision
ঃ U+0983      ः U+0903       Confusable
ও U+0993      उ U+0909       Confusable
ঘ U+0998      घ U+0918       Confusable
ঁ U+0981      ॅ U+0945       Confusable
Table 18: Bangla and Devanāgarī confusable code points

10.3.2 Bangla and Gurmukhī

Bangla        Gurmukhī       NBGP Decision
ঘ U+0998      ਬ U+0A2C       Confusable
ঁ U+0981      ੱ U+0A71       Confusable
Table 19: Bangla and Gurmukhī confusable code points

Bangla                     Gurmukhī       NBGP Decision
ও U+0993                   ਤ U+0A24       Distinguishable
শ U+09B6                   ਅ U+0A05       Distinguishable
ম U+09AE                   ਮ U+0A2E       Distinguishable
বা U+09AC and U+09BE       ਗ U+0A17       Distinguishable
Table 20: Bangla and Gurmukhī distinguishable code points

10.3.3 Bangla and Oriya (Odia)

Bangla        Oriya (Odia)   NBGP Decision
ও U+0993      ଓ U+0B13       Confusable
Table 21: Bangla and Oriya confusable code points

Bangla        Oriya (Odia)   NBGP Decision
ঘ U+0998      ସ U+0B38       Distinguishable
Table 22: Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs

Consonants   Phonetic Value   Allographs: Clusters   Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (not privileged enough to have clusters as a first member) Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (not privileged enough to have clusters as a first member) egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value, but is mostly pronounced as n
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒThesecondarysymbol(allograph)jɔ-phalahastwophoneticvaluesWhenaddedtotheinitialconsonantinaworditisavowelaelig(asinশ8ামলর 8াপারetc)Butafteranon-initialconsonantitjustdoublesitinpronunciation(asinকায6ধায6etc)The+যcombinationhastwophysicalmanifestationsmdashর 8andয6
ক8(p+য)স8(+য)র 8(+য)[Justর 8isneverusedinBanglaorthographyর 8াisbutthenitslasttwosymbolsYa-phalaa-karaconstituteavowelsignrepresentingthevowelঅ8া]
66
Consonants PhoneticValue Allographs
Clusters TransparentForm(BanglaAkademifont)
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ hwordfinallywordmediallyitdoublesthepronunciationofthefollowingconsonant
অঃকঃ
অব
4    “()”    Sequence Group
Table 11: The ABNF Formalism
5.4.2 The Vowel Sequence
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla are given. To facilitate understanding for users of other Brāhmī-derived scripts, equivalents in Devanāgarī are provided wherever necessary.
A vowel sequence is made up of a single vowel (V). It may optionally be followed by an Anusvāra, onuʃʃār (B), a Candrabindu (D) or a Visarga, biʃɔrgo (X). The number of D, B or X signs that can follow a V in Bangla need not be restricted to one. Going by the rules illustrated in this document, formations such as VDD, VBB and VXX are clearly invalid orthographic units. However, it is valid and possible to have a Candrabindu followed by an Anusvāra on the one hand, and a Candrabindu followed by a Visarga on the other, as in হ্যাঁংচা ‘hænchā’ and হ্যাঁঃ ‘hæn’ respectively. In other words, a Visarga or an Anusvāra (onuʃʃār) may follow a Candrabindu in Bangla.
A Vowel can optionally be followed by a combination of Hasanta/Virama [H] and Consonant [C] to form a Ya-phala. “Ya-phala is a presentation form of U+09AF য BENGALI LETTER YA (‘ya’), represented by the sequence <U+09CD ্ BENGALI SIGN VIRAMA (Hasanta), U+09AF য BENGALI LETTER YA>. Ya-phala has a special form ্য. Again, when combined with U+09BE া BENGALI VOWEL SIGN AA (i.e. ‘aa’ (ā)), it is used for transcribing [æ], as in the “a” in the English word “bat””, written in Bangla as ব্যাট.
A Vowel-sequence admits the following combinations:
5.4.2.1 A Single Vowel
Example:
V    অ    अ
5.4.2.2 A Vowel with Conditions
A Vowel can optionally be followed by Anusvāra [B], Candrabindu [D], Visarga [X], Candrabindu + Anusvāra [DB], Candrabindu + Visarga [DX], or a combination of Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].
Examples:
VB      অং     अं
VD      অঁ     अँ
VX      অঃ     अः
VDB     অঁং    अँं
VDX     অঁঃ    अँः
VHCM    অ্যা  এ্যা
5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], Candrabindu [D], Visarga [X], Candrabindu + Anusvāra [DB] or Candrabindu + Visarga [DX].
Examples:
VHCMB     অ্যাং   এ্যাং
VHCMD     অ্যাঁ   এ্যাঁ
VHCMX     অ্যাঃ   এ্যাঃ
VHCMDB    অ্যাঁং  এ্যাঁং
VHCMDX    অ্যাঁঃ  এ্যাঁঃ
5.4.3 The Consonant Sequence
5.4.3.1 A Single Consonant (C)
Example:
C      ক     क
5.4.3.2 A Consonant with Conditions
A Consonant can optionally be followed by a dependent vowel sign, kāra (Mātrā) [M], or by Anusvāra [B], Candrabindu [D], Visarga [X], Hasanta (also known as Virama) [H], Candrabindu + Anusvāra [DB] or Candrabindu + Visarga [DX].
Examples:
CM     কি    कि
CB     কং    कं
CD     কঁ    कँ
CX     কঃ    कः
CH     ক্    क् (pure consonant)
CDB    কঁং   कँं
CDX    কঁঃ   कँः
5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB or DX.
Examples:
CMB     কীং    कीं
CMD     কাঁ    काँ
CMX     বীঃ    वीः
CMDB    কাঁং   काँं
CMDX    কাঁঃ   काँः
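5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama):
*3(CH)C
Here the repetition prefix *3 means “zero to three occurrences”, so one to four consonants can be joined.
Examples:
CHC        ন্ত       → ন + ্ + ত                  न + ् + त
CHCHC      ন্ত্র     → ন + ্ + ত + ্ + র          न + ् + त + ् + र
CHCHCHC    ন্ত্র্য   → ন + ্ + ত + ্ + র + ্ + য   न + ् + त + ् + र + ् + य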
5.4.3.5 Subsets
In considering the subsets, we take the combination CHC as a representative example; the same applies equally to CHCHC and CHCHCHC.
[A] The combination may be followed by M, B, D, X, DB or DX.
Examples:
CHCM     ক্কী    → ক + ্ + ক + ী        क्की    → क + ् + क + ी
CHCB     ক্কং    → ক + ্ + ক + ং        क्कं    → क + ् + क + ं
CHCD     ক্কঁ    → ক + ্ + ক + ঁ        क्कँ    → क + ् + क + ँ
CHCX     ক্কঃ    → ক + ্ + ক + ঃ        क्कः    → क + ् + क + ः
CHCDB    ক্কঁং   → ক + ্ + ক + ঁ + ং    क्कँं   → क + ् + क + ँ + ं
CHCDX    ক্কঁঃ   → ক + ্ + ক + ঁ + ঃ    क्कँः   → क + ् + क + ँ + ः
[B] *3(CH)CM may further be followed by B, D, X, DB or DX.
Examples:
CHCMB     ক্কীং    → ক + ্ + ক + ী + ং        क्कीं    → क + ् + क + ी + ं
CHCMD     ক্কাঁ    → ক + ্ + ক + া + ঁ        क्काँ    → क + ् + क + ा + ँ
CHCMX     ক্কীঃ    → ক + ্ + ক + ী + ঃ        क्कीः    → क + ् + क + ी + ः
CHCMDB    ক্কাঁং   → ক + ্ + ক + া + ঁ + ং    क्काँं   → क + ् + क + ा + ँ + ं
CHCMDX    ক্কাঁঃ   → ক + ্ + ক + া + ঁ + ঃ    क्काँः   → क + ् + क + ा + ँ + ः
5.4.4 The Khanda-Ta Sequence
5.4.4.1 A Single ‘Khanda’-Ta (Z)
Example:
Z    ৎ (= ত + ্)
5.4.4.2 A Khanda Ta Combination (see footnote 10)
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama):
[CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā) ‘scolding’.
Note: The condition in this context of KHANDA TA is that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
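As an illustration only, the sequence grammar of Sections 5.4.2 to 5.4.4 can be sketched as a regular expression over the category symbols (V, C, M, B, D, X, H, Z) rather than over code points. The names MODIFIERS, SEQUENCE and is_valid_sequence are hypothetical and not part of the proposal. The sketch assumes a label has already been mapped to its symbol string, and it does not capture restrictions tied to particular code points (for example, that the C in CHZ must be a RA).

```python
import re

# Optional trailing signs: B, D, X, DB or DX (never BB, DD, XX, BD, ...).
MODIFIERS = r"(?:D[BX]?|[BX])?"

SEQUENCE = re.compile(
    "|".join([
        rf"V(?:HCM)?{MODIFIERS}",         # 5.4.2: V or VHCM, plus optional signs
        rf"(?:CH){{0,3}}CM?{MODIFIERS}",  # 5.4.3: 1-4 consonants, optional M, signs
        r"CH",                            # 5.4.3.2: pure consonant
        r"(?:CH)?Z",                      # 5.4.4: Khanda Ta, optionally after C+H
    ])
)


def is_valid_sequence(symbols: str) -> bool:
    """Return True if the symbol string is one well-formed sequence."""
    return SEQUENCE.fullmatch(symbols) is not None
```

For instance, is_valid_sequence("VDB") and is_valid_sequence("CHCHCHC") hold, while "VDD" and a five-consonant cluster "CHCHCHCHC" are rejected.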
5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) can be described briefly here.
Let us take up S first. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ BENGALI LETTER A and U+098F এ BENGALI LETTER E). For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF ya, preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound, as in English ‘bat’, ‘fat’, etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a ‘vowel killer’, although it actually indicates the absence of a vowel after the marked consonant. Only consonants can carry the Hasanta. But as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
The other case concerns the formation of repha and ra-phalā in the script, listed in Table 16 as P. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka “the sun”), and ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra “cycle”). The point is that in both cases the slot for ra can be filled by the Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.
10 Refer to Rule P in Section 7, Table 16.
6 Variants
This section discusses the variants in the Bangla script. The NBGP categorizes these confusable variants into two groups:
Group 1: Confusing due to pure visual similarity.
Group 2: Confusing due to deviation from the character formations normally perceived by the larger linguistic community.
For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These are not cases of mere visual similarity, as they involve some deviation from the widely accepted norms of Bangla Akshar formation. They can cause confusion even to a careful observer and are hence proposed as variants.
Variants arise in a script when two or more forms that look the same are stored with different code points. In Bangla, the e-kāra, ā-kāra and o-kāra have different code points. One can type o with a consonant in one go, or get the same result by typing e-kāra and ā-kāra as two separate keys. A reader cannot differentiate between the two forms of ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered a case of variant, because a kāra followed by a kāra is not allowed.
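The two ways of typing ko can be compared with a short Python sketch. It relies only on the standard unicodedata module; the variable and function names are illustrative, and the use of NFC reflects the general IDNA requirement that labels be normalized rather than anything stated in this section.

```python
import unicodedata

KA, E_KAR, AA_KAR, O_KAR = "\u0995", "\u09C7", "\u09BE", "\u09CB"

two_key = KA + E_KAR + AA_KAR  # ko typed as e-kāra followed by ā-kāra
one_key = KA + O_KAR           # ko typed with the single o-kāra


def nfc(label: str) -> str:
    """Normalize a label to Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", label)


# The stored code points differ, but NFC composes e-kāra + ā-kāra into o-kāra,
# so after normalization the two spellings are the same string.
same_after_nfc = nfc(two_key) == nfc(one_key)
```

Under NFC the precomposed letters ড়, ঢ় and য় go the other way: for example nfc("\u09DC") yields U+09A1 followed by the nukta U+09BC, which is why rule 1 in Section 7.1 speaks of their normalized forms.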
6.1 In-Script Variants
However, we propose two cases of true in-script variants in the Bangla script.
CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases wherein Hasanta with থ (tha, U+09A5) appears as a conjunct with স (sa, U+09B8) and ন (na, U+09A8):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, can be a little confusing, because the থ (tha) in the conjunct appears like a হ (ha). The conjunct can occur in initial, medial or final position (as shown in example 1 below). It could also be typed wrongly, in the belief that it was a হ (ha, U+09B9), increasing the risks in label writing and identification.
Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)
Fonts that represent the traditional Bangla writing system tend to create this problem. Therefore, these may be taken as cases of variants in Bangla.
CASE II
Another interesting example of a variant is encountered in the ra + Hasanta and Hasanta + ra combinations when writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing ‘repha’ (involving ra + Hasanta) and ‘ra-phalā’ (involving Hasanta + ra).
‘Repha’ can be formed by two sequences (mainly because Assamese and Bangla share the same Unicode code points, and ‘B_RA’ as well as ‘A_RA’ refer to the same phonetic element). The final ligatures look the same and are as follows:
(1) B_RA + H + C
(2) A_RA + H + C
Where:
B_RA = U+09B0 BENGALI LETTER RA (র)
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)
Example:
Sequence 1 (Using Bangla RA)         Ligature 1    Sequence 2 (Using Assamese RA)       Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক)    র্ক           U+09F0 (ৰ) U+09CD (্) U+0995 (ক)    ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ)    র্ঠ           U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ)    ৰ্ঠ
Table 12: Example of Repha
Note: As the Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. The addition of Repha does not make any difference.
‘Ra-phala’ can be formed by two sequences on similar grounds, and the final ligatures look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
Where C1 = any consonant except Khanda-ta
Example:
Sequence 1 (Using Bangla RA)         Ligature 1    Sequence 2 (Using Assamese RA)       Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র)    ক্র           U+0995 (ক) U+09CD (্) U+09F0 (ৰ)    ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র)    ন্র           U+09A8 (ন) U+09CD (্) U+09F0 (ৰ)    ন্ৰ
Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusion for end users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP concluded that র and ৰ should be defined as variant code points, so that a single variant set between র and ৰ covers all cases. This will, however, create blocked variant labels: for example, if someone registers “ররর”, the variant label “ৰৰৰ” will be generated as a variant and will be blocked, and vice versa. The blocking applies only at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, that is still possible.
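The label-level blocking described above can be sketched as follows. This is a minimal illustration, not the LGR tooling itself: the function names are hypothetical, and the sketch assumes that every permutation of the র/ৰ substitution (including mixed forms such as রৰর) counts as a blocked variant, which goes slightly beyond the single ৰৰৰ example given in the text.

```python
from itertools import product

B_RA, A_RA = "\u09B0", "\u09F0"  # Bangla RA and Assamese RA
VARIANT_SET = (B_RA, A_RA)


def variant_labels(label: str) -> set:
    """All labels reachable by swapping র and ৰ, excluding the label itself."""
    choices = [VARIANT_SET if ch in VARIANT_SET else (ch,) for ch in label]
    return {"".join(combo) for combo in product(*choices)} - {label}


def is_blocked(candidate: str, registered: str) -> bool:
    """True if `candidate` is a variant of an already registered label."""
    return candidate in variant_labels(registered)
```

With “ররর” registered, is_blocked reports “ৰৰৰ” as blocked, while the shorter “ৰৰ” and the longer “ৰৰৰৰ” remain available because they are different labels rather than variants.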
6.2 Cross-Script Variants
A brief cross-script study for Bangla has been carried out with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia (see footnote 11; formerly Oriya), keeping in mind the visual and technical confusion they may cause as labels in the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are proposed by the NBGP as variants. Certain other characters are somewhat similar but have not been included here; they are provided in the Appendix (10.3) for reference.
11 Unicode uses Oriya for the script, although Odia is now the official term used.
1. Bangla and Nāgarī/Devanāgarī Script
Bangla       Devanāgarī
ম U+09AE     म U+092E
ি U+09BF     ि U+093F
Table 14: Bangla and Devanāgarī cross-script variant code points
2. Bangla and Gurmukhī Script
Bangla       Gurmukhī
ম U+09AE     ਸ U+0A38
ি U+09BF     ਿ U+0A3F
Table 15: Bangla and Gurmukhī cross-script variant code points
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Bangla (see footnote 12) script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications.
Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories, as mentioned in the table provided in the Code point repertoire (Section 5.1).
C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), or (a/e) Ya-phalā (V1 H C1 M1), where V1 is either 0985 (অ BENGALI LETTER A) or 098F (এ BENGALI LETTER E), H is 09CD (্ BENGALI SIGN VIRAMA), C1 is 09AF (য BENGALI LETTER YA) and M1 is 09BE (া BENGALI VOWEL SIGN AA). S1 and S2 are valid even if they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either 09B0 (র BENGALI LETTER RA) or 09F0 (ৰ ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL) and H is 09CD (্ BENGALI SIGN VIRAMA).
Table 16: Symbols used in WLE rules
12 As used by Unicode, denoting and including both Assamese and Maṇipuri.
It is also worth mentioning here that in Bangla the consonant letters (or graphemes) are physically joined to form “clusters” that can, in theory, conjoin two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are “nearly 380 unique consonant clusters”, of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, p. 4]. More details of such combinations can be seen in Pabitra Sarkar (1993) [135].
7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, the following are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়.
2. H must be preceded by C. Example: ক্
3. M must be preceded by C. Example: কা
4. D must be preceded by either of V, C or M. Example: আঁ, খঁ, খাঁ, হাঁ
5. X must be preceded by either of V, C, M or D. Example: উঃ, খঃ
6. B must be preceded by either of V, C, M or D. Example: আং, ইং, কং
7. Z must be preceded by V, C, M, D, B, X or P. Example: ইৎ, কৎ, র্ৎ (S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V preceded by H.
9. S CANNOT be preceded by H.
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in admitting such ligatures: any user can easily create them, even though they may not be attested combinations.
7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া /bæŋk ʌv ɪndiə/ (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE), meaning ‘Bank of India’. This is a case where different words are joined together, one of which ends with an H (অফ্) while the next begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the intended sound. However, by and large, the form of the word without an H (U+09CD) is considered enough for full representation of the sound intended for that word.
This is a unique situation, necessitated by the lack of the hyphen, the space or the Zero Width Non-Joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceived similarity between two labels (with and without H) for the majority of the linguistic communities; hence it is explicitly prohibited by the NBGP.
In future, if required, and depending on the prevailing requirements of the community, a future NBGP may consider revisiting this rule.
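The preceding-context rules of Section 7.1 lend themselves to a small checker. The following Python sketch is illustrative only: the category tables are assumptions derived from the Unicode Bengali block (the normative repertoire is the one in Section 5.1), the S1/S2 forms from Table 9 are not modelled, and all names are hypothetical. Only the (a/e) Ya-phalā form of S is recognized, and it is treated like an M for whatever follows it.

```python
# ASSUMED category tables (Unicode Bengali block); see Section 5.1 for the real ones.
V = set(range(0x0985, 0x098D)) | {0x098F, 0x0990, 0x0993, 0x0994}
C = (set(range(0x0995, 0x09A9)) | set(range(0x09AA, 0x09B1)) | {0x09B2}
     | set(range(0x09B6, 0x09BA)) | {0x09DC, 0x09DD, 0x09DF, 0x09F0, 0x09F1})
M = set(range(0x09BE, 0x09C5)) | {0x09C7, 0x09C8, 0x09CB, 0x09CC}
SIGNS = {0x0982: "B", 0x0981: "D", 0x0983: "X", 0x09CD: "H", 0x09CE: "Z"}
NUKTA, NUKTA_BASES = "\u09BC", {"\u09A1", "\u09A2", "\u09AF"}
RA = {"\u09B0", "\u09F0"}
S_FORMS = {"\u0985\u09CD\u09AF\u09BE", "\u098F\u09CD\u09AF\u09BE"}

# Rules 2-7: which categories may immediately precede each symbol.
ALLOWED_BEFORE = {
    "H": {"C"},
    "M": {"C"},
    "D": {"V", "C", "M", "S"},
    "X": {"V", "C", "M", "D", "S"},
    "B": {"V", "C", "M", "D", "S"},
    "Z": {"V", "C", "M", "D", "B", "X", "S"},
}


def tokenize(label):
    """Split a label into (category, text) pairs; None if a char is unknown."""
    tokens, i = [], 0
    while i < len(label):
        if label[i:i + 4] in S_FORMS:          # special case S: V1 H C1 M1
            tokens.append(("S", label[i:i + 4]))
            i += 4
            continue
        ch, cp = label[i], ord(label[i])
        if ch == NUKTA and tokens and tokens[-1][1] in NUKTA_BASES:
            tokens[-1] = ("C", tokens[-1][1] + ch)   # rule 1: normalized CN
        elif cp in V:
            tokens.append(("V", ch))
        elif cp in C:
            tokens.append(("C", ch))
        elif cp in M:
            tokens.append(("M", ch))
        elif cp in SIGNS:
            tokens.append((SIGNS[cp], ch))
        else:
            return None
        i += 1
    return tokens


def is_valid_label(label):
    tokens = tokenize(label)
    if not tokens:
        return False
    for idx, (kind, _) in enumerate(tokens):
        prev = tokens[idx - 1][0] if idx else None
        if kind in ("V", "S"):                 # rules 8 and 9
            if prev == "H":
                return False
        elif kind != "C":
            ok = prev in ALLOWED_BEFORE[kind]
            # Rule 7: Z may also follow P, i.e. RA + Hasanta.
            if (kind == "Z" and prev == "H" and idx >= 2
                    and tokens[idx - 2][1] in RA):
                ok = True
            if not ok:
                return False
    return True
```

Applied to the ‘Bank of India’ example, the checker rejects the form in which ই follows the Hasanta of অফ্ and accepts the form written without that Hasanta, mirroring rule 8.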
7.2 Additional Examples from Bangla ABNF
Below are a few examples that help one understand some of the rules the ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.
1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Example:
্ক    ्क
াক    ाक
ংক    ंक
ঁক    ँक
ঃক    ःक
As can be seen, such a combination automatically results in a “golu”, or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian-language syllable and is applied quasi-automatically wherever supported by the OS.
2. H is not permitted after V, B, D, X, M, S.
Example:
অ্     अ्
অং্    कं्
কঁ্    कँ्
কঃ্    कः्
কি্    कि्
3. The number of B, D or X permitted after a Consonant, a Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalid.
Example:
কংং     कंं
কঁঁ     कँँ
কঃঃ     कःः
কাঁঁ    काँँ
কীঃঃ    कीःः
অংং     अंं
অঁঁ     अँँ
অঃঃ     अःः
4. The number of M permitted after a Consonant is restricted to one.
Example:
কীী    कीी
5. M is not permitted after V.
Example:
ইা  ঈৗ    ईा  ईौ
6. The combinations Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Example:
কংঃ    कंः
কঃং    कःं
8 Contributors
8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members
8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof. Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof. Rafiqul Islam, National Professor of Humanities, Dhaka
Prof. Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof. Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof. Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md. Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md. Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md. Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof. Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md. Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md. Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals’ Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje & Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. ‘Bangla Harapher Panch Parba’. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii + 316 pp.
[117] -----. 2017. (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy) Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the MS University of Baroda, Social Science Number, 29.2: 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25.11.2017.
[124] Ethnologue: Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25.11.2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20.10.2019.
[126] Wikipedia: Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25.11.2017.
[129] Omniglot: Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10.5.2018.
[130] Wikipedia: Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25.11.2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20, accessed on 10.5.2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10.5.2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10.5.2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1.1: 23-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1.2: 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). ‘Srīhatta-Nāgarī Lipir Utpatti o Bikāś’. Bangla Academy Patrika (Dhaka), Vol. 1.2 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia: Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19.5.2018.
[139] Furui, Ryosuke (2015). ‘Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century’, in Bhairabi Prasad Sahu & Hermann Kulke, eds. Interrogating Political Systems: Integrative Processes and States in Pre-Modern India, Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). ‘The Phonemes of Bengali’. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646. UNICODE. https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia: Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. ‘Bangla Spelling Reform: the Long and Short of It’. Bangla Journal 19: 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. “What has happened So Far In terms of Script Reforms”. Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka, on 10.07.2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 – Core Specification, Chapter 12, p. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language or script certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla (see footnote 13) in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold ‘varga’ (as defined under Table 5). Bangla does not normally allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem ‘Lichuchor’ by Kazi Nazrul Islam. There are also instances of its use in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning, as in য়াকুব. In very recent times, when transliterating some Chinese and Japanese names into Bangla, one does come across Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2. CH can come with Khanda Ta only in the case where C is ra (র) (09B0): র্ৎ, as in ভর্ৎসনা.
3. Only the following combinations with VHCM are allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
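The three Bangla-specific restrictions can be expressed as a short check that complements the generic grammar. This is a hedged sketch: the function name and messages are hypothetical, the independent-vowel range is assumed from the Unicode Bengali block, and the check deliberately ignores the য়-initial question, which the text leaves open.

```python
KHANDA_TA, NGA, NYA = "\u09CE", "\u0999", "\u099E"
HASANTA, YA, AA, B_RA = "\u09CD", "\u09AF", "\u09BE", "\u09B0"
YA_PHALA_VOWELS = {"\u0985", "\u098F"}                   # অ and এ
INDEPENDENT_VOWELS = {chr(cp) for cp in range(0x0985, 0x0995)}  # assumed range


def restriction_violations(label: str) -> list:
    """Return a list of Appendix-I restrictions that the label breaks."""
    problems = []
    # Restriction 1: ৎ, ঞ and ঙ may not start a label.
    if label[:1] in {KHANDA_TA, NYA, NGA}:
        problems.append("label starts with Khanda Ta, NYA or NGA")
    for i, ch in enumerate(label):
        prev = label[i - 1] if i else ""
        # Restriction 2: C + H + Khanda Ta only when C is Bangla RA.
        if ch == KHANDA_TA and prev == HASANTA:
            if i < 2 or label[i - 2] != B_RA:
                problems.append("Khanda Ta after Hasanta without RA")
        # Restriction 3: Hasanta after a vowel only in অ্যা / এ্যা.
        if ch == HASANTA and prev in INDEPENDENT_VOWELS:
            if prev not in YA_PHALA_VOWELS or label[i + 1:i + 3] != YA + AA:
                problems.append("Hasanta after vowel outside VHCM forms")
    return problems
```

For example, অ্যাসিড and ভর্ৎসনা produce an empty list, whereas the transliteration ৎসেরিং is flagged under the first restriction.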
10.2 ‘Sylheti Nagarī lipi’ or ‘Siloti’
This version of the Bangla script resembles the ‘Kaithī’ script (ISO 15924) used by accountants (perhaps of the Kāyastha community) in Eastern Uttar Pradesh and Bihar, and was widely in use during the 1880s. There were several other names for Sylheti Nagarī or Siloti [129], such as ‘Jalalabada Nagarī’, ‘Fula (flower) Nagarī’, ‘Muslim Nagarī’ or ‘Muhammad Nagarī’. It is said that Shah Jalal brought the script with him to Sylhet in the 13th-14th century [138], although some have suggested that it was an invention of the Afghan rulers of Sylhet [137]. Some ascribe the credit to Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script, with 32 symbols, are reproduced here [138].
Table 17 – The Script Table of Sylheti Nagarī or Siloti
13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri are not covered in this section.
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla       Devanāgarī    NBGP Decision
ঃ U+0983     ः U+0903      Confusable
ও U+0993     उ U+0909      Confusable
ঘ U+0998     घ U+0918      Confusable
ঁ U+0981     ॅ U+0945      Confusable
Table 18: Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhī
Bangla       Gurmukhī      NBGP Decision
ঘ U+0998     ਬ U+0A2C      Confusable
ঁ U+0981     ੱ U+0A71      Confusable
Table 19: Bangla and Gurmukhī confusable code points
Bangla                     Gurmukhī      NBGP Decision
ও U+0993                   ਤ U+0A24      Distinguishable
শ U+09B6                   ਅ U+0A05      Distinguishable
ম U+09AE                   ਮ U+0A2E      Distinguishable
বা U+09AC and U+09BE       ਗ U+0A17      Distinguishable
Table 20 – Bangla and Gurmukhī distinguishable code points
10.3.3 Bangla and Oriya (Odia)
Bangla       Oriya (Odia)    NBGP Decision
ও U+0993     ଓ U+0B13        Confusable
Table 21 – Bangla and Oriya confusable code points
Bangla       Oriya (Odia)    NBGP Decision
ঘ U+0998     ସ U+0B38        Distinguishable
Table 22 – Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs
Consonants    Phonetic Value    Allographs
Clusters    Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value but is mostly pronounced as n
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒThesecondarysymbol(allograph)jɔ-phalahastwophoneticvaluesWhenaddedtotheinitialconsonantinaworditisavowelaelig(asinশ8ামলর 8াপারetc)Butafteranon-initialconsonantitjustdoublesitinpronunciation(asinকায6ধায6etc)The+যcombinationhastwophysicalmanifestationsmdashর 8andয6
ক8(p+য)স8(+য)র 8(+য)[Justর 8isneverusedinBanglaorthographyর 8াisbutthenitslasttwosymbolsYa-phalaa-karaconstituteavowelsignrepresentingthevowelঅ8া]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃকঃ
অব
39
Examples
VB অং अं
VD অঁ अँ
VX অঃ अः
VDB অঁং अँं
VDX অঁঃ अँः
VHCM অ্যা এ্যা
5.4.2.3 VHCM Sequence
A VHCM sequence can optionally be followed by Anusvāra [B], Candrabindu [D], Visarga [X], Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Examples
VHCMB অ্যাং এ্যাং
VHCMD অ্যাঁ এ্যাঁ
VHCMX অ্যাঃ এ্যাঃ
VHCMDB অ্যাঁং এ্যাঁং
VHCMDX অ্যাঁঃ এ্যাঁঃ
5.4.3 The Consonant Sequence
5.4.3.1 A Single Consonant (C)
Example: C ক क
5.4.3.2 A Consonant with Conditions
A consonant may optionally be followed by a dependent vowel sign, kāra (Mātrā) [M], or by Anusvāra [B], Candrabindu [D], Visarga [X], Hasanta (also known as Virama) [H], Candrabindu + Anusvāra [DB], or Candrabindu + Visarga [DX].
Example
CM িকক -कक
CB কং कं
CD কঁ कँ
CX কঃ कः
CH ক্ क् (pure consonant)
CDB কঁং कँं
CDX কঁঃ कँः
5.4.3.3 CM Sequence
A CM sequence can optionally be followed by B, D, X, DB, or DX.
Example CMB কীংকং कक
CMD কাঁ काँ
CMX বীঃ वीः
CMDB কাঁং काँं
CMDX কাঁঃ काँः
5.4.3.4 Sequence of Consonants
A sequence of consonants (up to 4) joined by Hasanta (also known as Virama): *3(CH)C
Example
CHC ন্ত → ন + ্ + ত (न्त → न + ् + त)
CHCHC ন্ত্র → ন + ্ + ত + ্ + র (न्त्र → न + ् + त + ् + र)
CHCHCHC ন্ত্র্য → ন + ্ + ত + ্ + র + ্ + য (न्त्र्य → न + ् + त + ् + र + ् + य)
5.4.3.5 Subsets
In considering the subsets, we take the combination CHC as a representative example; the same applies equally to CHCHC and CHCHCHC.
[A] The combination may be followed by M, B, D, X, DB, or DX.
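Example
CHCM ক্কী → ক + ্ + ক + ী (क्की → क + ् + क + ी)
CHCB ক্কং → ক + ্ + ক + ং (क्कं → क + ् + क + ं)
CHCD ক্কঁ → ক + ্ + ক + ঁ (क्कँ → क + ् + क + ँ)
CHCX ক্কঃ → ক + ্ + ক + ঃ (क्कः → क + ् + क + ः)
CHCDB ক্কঁং → ক + ্ + ক + ঁ + ং (क्कँं → क + ् + क + ँ + ं)
CHCDX ক্কঁঃ → ক + ্ + ক + ঁ + ঃ (क्कँः → क + ् + क + ँ + ः)
[B] *3(CH)CM may further be followed by B, D, X, DB, or DX.
Example
CHCMB ক্কীং → ক + ্ + ক + ী + ং (क्कीं → क + ् + क + ी + ं)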
sup3ং rarr ক ক ং 4क rarr क क
CHCMD ক্কাঁ → ক + ্ + ক + া + ঁ (क्काँ → क + ् + क + ा + ँ)
CHCMX ক্কীঃ → ক + ্ + ক + ী + ঃ (क्कीः → क + ् + क + ी + ः)
CHCMDB ক্কাঁং → ক + ্ + ক + া + ঁ + ং (क्काँं → क + ् + क + ा + ँ + ं)
CHCMDX ক্কাঁঃ → ক + ্ + ক + া + ঁ + ঃ (क्काँः → क + ् + क + ा + ँ + ः)
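The consonant-sequence patterns of 5.4.3.4 and 5.4.3.5 can be sketched as a check on strings of category letters (C, H, M, B, D, X). This is only an illustrative reading of the patterns described above, not the normative rule set; the names `CLUSTER` and `is_consonant_sequence` are hypothetical.

```python
import re

# Category letters follow the document: C consonant, H Hasanta, M Mātrā,
# B Anusvāra, D Candrabindu, X Visarga.
# Pattern: up to three "CH" pairs, a final C, an optional M, and then at
# most one of B, D, X, DB, or DX.
CLUSTER = re.compile(r"(?:CH){0,3}C(?:M)?(?:DB|DX|B|D|X)?")


def is_consonant_sequence(categories: str) -> bool:
    """Return True if the category string fits the 5.4.3.4/5.4.3.5 shapes."""
    return CLUSTER.fullmatch(categories) is not None
```

For instance, `is_consonant_sequence("CHCMDB")` accepts a two-consonant cluster with a Mātrā, Candrabindu, and Anusvāra, while a five-consonant string such as `"CHCHCHCHC"` is rejected.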
5.4.4 The Khanda-Ta Sequence
5.4.4.1 A Single 'Khanda'-Ta (Z)
Example: Z ৎ
5.4.4.2 A Khanda Ta Combination (see footnote 10)
A Khanda Ta can be preceded by a consonant and Hasanta (also known as Virama): [CH]Z
Example: র + ্ + ৎ = র্ৎ, as in ভর্ৎসনা (bhartsanā), 'scolding'.
Note: In this context of KHANDA TA, the C must be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ) (used in Assamese).
10 Refer to Rule P in Section 7, Table 16.
5.4.5 Special Cases: S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) are described briefly here. Let us take up S first. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ - BANGLA LETTER A and U+098F এ - BANGLA LETTER E). To render Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF (ya) preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound, as in English 'bat', 'fat', etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a 'vowel killer', although it actually indicates the absence of a vowel after the marked consonant. Only consonants can carry the Hasanta. But, as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
The other case, listed as P in Table 16, concerns the formation of repha and ra-phalā in the script. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka, "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra, "cycle"). The point is that in both cases the slot for ra can be filled by Bangla ra র (U+09B0) or Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.
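The two P sequences can be written out as code point strings. The sketch below simply concatenates the code points named in the text; the helper names (`repha`, `ra_phala`, `code_points`) are hypothetical and are introduced only for illustration.

```python
HASANTA = "\u09CD"      # BENGALI SIGN VIRAMA
BANGLA_RA = "\u09B0"    # BENGALI LETTER RA
ASSAMESE_RA = "\u09F0"  # BENGALI LETTER RA WITH MIDDLE DIAGONAL


def repha(consonant: str, ra: str = BANGLA_RA) -> str:
    """repha = ra + Hasanta + C (ra loses its implicit vowel)."""
    return ra + HASANTA + consonant


def ra_phala(consonant: str, ra: str = BANGLA_RA) -> str:
    """ra-phalā = C + Hasanta + ra (C loses its implicit vowel)."""
    return consonant + HASANTA + ra


def code_points(text: str) -> str:
    """Format a string as a space-separated list of U+XXXX values."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)
```

Calling `repha("\u0995")` and `repha("\u0995", ASSAMESE_RA)` yields two different code point sequences that a typical font draws with the same repha shape, which is the situation the paragraph above describes.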
6 Variants
This section discusses the variants in the Bangla script. The NBGP categorizes confusable variants into two groups:
Group 1: Confusing due to pure visual similarity.
Group 2: Confusing due to deviation from the character formations normally perceived by the larger linguistic community.
For Group 1, only identical code points are defined as variants. Confusable but not identical cases are not proposed, as another panel (the String Similarity Assessment Panel) is entrusted with such cases. Cases that belong to Group 2, however, are proposed as variants. These are not cases of mere visual similarity, as they involve deviations from the widely accepted norms of Bangla akshar formation. They can confuse even a careful observer and are hence proposed as variants.
Variants arise in a script when two or more forms that look the same are stored with different code points. In Bangla, the e-kāra, ā-kāra, and o-kāra have different code points. One can type o-kāra with a consonant in one go, or obtain the same result by typing e-kāra and ā-kāra as two separate keys. A reader cannot differentiate between the two forms of ko (কো), one typed with a single key and the other with two different keys. However, this is not considered a case of variants, because a kāra followed by a kāra is not allowed.
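The o-kāra observation can be reproduced with standard Unicode normalization. The sketch below relies on a fact of the Unicode Character Database rather than anything stated in this proposal: U+09CB (o-kāra) has the canonical decomposition U+09C7 (e-kāra) + U+09BE (ā-kāra). The variable and function names are hypothetical.

```python
import unicodedata

KA = "\u0995"       # BENGALI LETTER KA
E_KARA = "\u09C7"   # BENGALI VOWEL SIGN E
AA_KARA = "\u09BE"  # BENGALI VOWEL SIGN AA
O_KARA = "\u09CB"   # BENGALI VOWEL SIGN O

one_key = KA + O_KARA              # "ko" typed with a single o-kāra
two_keys = KA + E_KARA + AA_KARA   # "ko" typed as e-kāra then ā-kāra


def same_after_nfc(a: str, b: str) -> bool:
    """Compare two strings after Unicode NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)
```

The raw strings differ, but `same_after_nfc(one_key, two_keys)` holds, so a label that has been NFC-normalized can only ever contain the single-code-point form.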
6.1 In-Script Variants
We nevertheless propose two cases of true in-script variants in the Bangla script.
CASE I
The first case concerns conjuncts in which থ (tha, U+09A5) follows স (sa, U+09B8) or ন (na, U+09A8) with an intervening Hasanta:
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, can be somewhat confusing, because the থ (tha) in the conjunct looks like a হ (ha). The conjunct can occur in initial, medial, or final position (as shown in example 1 below). It could also be typed wrongly, in the belief that it was a হ (ha, U+09B9), which increases the risk in label writing and identification.
Examples
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)
Fonts that represent the traditional Bangla writing system tend to create this problem. Therefore, these may be taken as cases of variants in Bangla.
CASE II
Another interesting example of variants is encountered in the ra + Hasanta and Hasanta + ra combinations when writing labels in the Bangla script (for languages such as Bangla, Assamese, and Manipuri). The variant cases arise in typing 'repha' (involving ra + Hasanta) and 'ra-phalā' (involving Hasanta + ra). 'Repha' can be formed by two sequences (mainly because Assamese and Bangla are encoded in the same Unicode block, and 'B_RA' and 'A_RA' refer to the same phonetic element). The final ligatures look the same; the sequences are as follows:
(1) B_RA + H + C
(2) A_RA + H + C
where
B_RA = U+09B0 BENGALI LETTER RA (র)
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)
Example
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12: Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical; the addition of Repha does not make any difference.
'Ra-phala' can be formed by two sequences on similar grounds, and the final ligatures look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
where C1 = any consonant except Khanda-ta
Example
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, they could confuse end users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP decided to define র and ৰ as variant code points, so that a single variant set between র and ৰ covers all cases. This will, however, create blocked variant labels: for example, if someone registers "ররর", the variant label "ৰৰৰ" will be generated and blocked, and vice versa. The blocking applies only at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, that is still possible.
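The effect of the র / ৰ variant set on labels can be sketched as follows. The text only mentions the fully swapped label; enumerating every mixed combination as well is an assumption about how code point variants expand into variant labels, and the names `RA_VARIANTS` and `variant_labels` are hypothetical.

```python
from itertools import product

# Single variant set between Bangla RA and Assamese RA.
RA_VARIANTS = {"\u09B0": "\u09F0", "\u09F0": "\u09B0"}


def variant_labels(label: str) -> set:
    """Return every label obtained by swapping RA variants, minus the original."""
    choices = [
        (ch, RA_VARIANTS[ch]) if ch in RA_VARIANTS else (ch,)
        for ch in label
    ]
    return {"".join(combo) for combo in product(*choices)} - {label}
```

For the label ররর, the result contains ৰৰৰ among its seven variant labels, whereas a label of a different length such as ৰৰ is not among them and therefore is not blocked.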
6.2 Cross-Script Variants
A cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī, and Odia (formerly Oriya; see footnote 11), keeping in mind the visual and technical confusion they may cause as labels in web domains. Beyond the two cases described in Section 6.1, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are proposed by the NBGP as variants. Certain characters that are somewhat similar have not been included here; they are provided in the Appendix (Section 10.3) for reference.
11 Unicode uses Oriya for the script, although Odia is now the official term used.
1. Bangla and Nāgarī (Devanāgarī) Script
Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14 - Bangla and Devanāgarī cross-script variant code points
2. Bangla and Gurmukhī Script
Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15 - Bangla and Gurmukhī cross-script variant code points
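Tables 14 and 15 amount to a small lookup from Bangla code points to their cross-script variants. A minimal sketch, with the hypothetical names `CROSS_SCRIPT_VARIANTS`, `variants_of`, and `positions_with_variants`:

```python
# Bangla code point -> cross-script variant code points (Tables 14 and 15).
CROSS_SCRIPT_VARIANTS = {
    "\u09AE": {"\u092E", "\u0A38"},  # ম -> Devanāgarī म, Gurmukhī ਸ
    "\u09BF": {"\u093F", "\u0A3F"},  # ি -> Devanāgarī ि, Gurmukhī ਿ
}


def variants_of(ch: str) -> set:
    """Return the cross-script variants of one Bangla code point."""
    return CROSS_SCRIPT_VARIANTS.get(ch, set())


def positions_with_variants(label: str) -> list:
    """Return the indexes in a label that have a cross-script variant."""
    return [i for i, ch in enumerate(label) if ch in CROSS_SCRIPT_VARIANTS]
```

For example, `positions_with_variants("\u09AE\u09BF")` (মি) reports both positions, while a label without ম or ি yields an empty list.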
7 Whole Label Evaluation Rules (WLE)
This section provides the WLE rules required by all the languages mentioned in Section 3.2 when written in the Bangla script (see footnote 12). The rules have been drafted in such a way that they can be easily translated into the LGR specification. Below are the symbols used in the WLE rules for each Indic Syllabic Category, as mentioned in the table provided in the code point repertoire (Section 5.1).
12 As used by Unicode, denoting and including both Assamese and Maṇipuri.
C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9), or (a/e) Ya-phalā (V1 H C1 M1), where V1 is either U+0985 (অ - BENGALI LETTER A) or U+098F (এ - BENGALI LETTER E), H is U+09CD (্ - BENGALI SIGN VIRAMA), C1 is U+09AF (য - BENGALI LETTER YA), and M1 is U+09BE (া - BENGALI VOWEL SIGN AA). S1 and S2 are valid even though they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either U+09B0 (র - BENGALI LETTER RA) or U+09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL) and H is U+09CD (্ - BENGALI SIGN VIRAMA)
Table 16 - Symbols used in WLE rules
It is also worth mentioning here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters", which can theoretically conjoin two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", of which bi-consonantal combinations number 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, p. 4]. More details of such combinations can be found in Pabitra Sarkar (1993) [135].
7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, the following are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়.
2. H must be preceded by C. Example: ক্
3. M must be preceded by C. Example: কা
4. D must be preceded by either V, C, or M. Example: আখখা হা
5. X must be preceded by either V, C, M, or D. Example: উঃখঃবঃাঃ দ ঃ
6. B must be preceded by either V, C, M, or D. Example: আং ইং কং
7. Z must be preceded by V, C, M, D, B, X, or P. Example: ইৎকৎাৎাৎপ6ৎrৎ (S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H. Details are in 7.1.1, Case of V Preceded by H.
9. S CANNOT be preceded by H.
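Rules 2 to 9 can be sketched as a simple left-to-right check. This is an illustration only, not the normative LGR XML: the code point ranges used to classify characters are an assumption based on the Unicode Bengali block (the authoritative repertoire is the table in Section 5.1), S1 and S2 from Table 9 are not modelled beyond the (a/e) Ya-phalā sequence, and all names (`tokenize`, `is_valid`, `ALLOWED_BEFORE`) are hypothetical.

```python
# Assumed classification by Unicode Bengali block ranges (not the official repertoire).
VOWELS = set(range(0x0985, 0x098D)) | {0x098F, 0x0990, 0x0993, 0x0994}
CONSONANTS = (set(range(0x0995, 0x09A9)) | set(range(0x09AA, 0x09B1))
              | {0x09B2} | set(range(0x09B6, 0x09BA)) | {0x09F0, 0x09F1})
MATRAS = set(range(0x09BE, 0x09C5)) | {0x09C7, 0x09C8, 0x09CB, 0x09CC}
SINGLE = {0x0982: "B", 0x0981: "D", 0x0983: "X", 0x09CD: "H", 0x09CE: "Z"}
NUKTA = "\u09BC"
NUKTA_BASES = {0x09A1, 0x09A2, 0x09AF}   # bases of ড়, ঢ়, য় (rule 1, CN)
RA = {"\u09B0", "\u09F0"}                # C2 of rule P

# Rules 2-7: categories that may immediately precede each symbol.
ALLOWED_BEFORE = {
    "H": {"C"},
    "M": {"C"},
    "D": {"V", "C", "M"},
    "X": {"V", "C", "M", "D"},
    "B": {"V", "C", "M", "D"},
    "Z": {"V", "C", "M", "D", "B", "X"},
}


def tokenize(label):
    """Split a label into (category, text) pairs; None if a char is unknown."""
    tokens, i = [], 0
    while i < len(label):
        cp = ord(label[i])
        if cp in NUKTA_BASES and label[i + 1:i + 2] == NUKTA:
            tokens.append(("C", label[i:i + 2]))   # normalized CN form
            i += 2
            continue
        if cp in CONSONANTS:
            cat = "C"
        elif cp in VOWELS:
            cat = "V"
        elif cp in MATRAS:
            cat = "M"
        elif cp in SINGLE:
            cat = SINGLE[cp]
        else:
            return None
        tokens.append((cat, label[i]))
        i += 1
    return tokens


def _is_s(tokens, start):
    """True if tokens[start:start+4] is the S sequence (অ or এ) + ্ + য + া."""
    if start < 0:
        return False
    texts = [text for _, text in tokens[start:start + 4]]
    return (len(texts) == 4 and texts[0] in {"\u0985", "\u098F"}
            and texts[1:] == ["\u09CD", "\u09AF", "\u09BE"])


def is_valid(label):
    """Apply the context rules to every position of the label."""
    tokens = tokenize(label)
    if not tokens:
        return False
    for i, (cat, _) in enumerate(tokens):
        prev_cat = tokens[i - 1][0] if i else None
        if cat in ALLOWED_BEFORE:
            if prev_cat in ALLOWED_BEFORE[cat]:
                continue
            if cat == "H" and _is_s(tokens, i - 1):
                continue                  # S: Hasanta after অ / এ
            if (cat == "Z" and prev_cat == "H" and i >= 2
                    and tokens[i - 2][1] in RA):
                continue                  # P: Khanda Ta after RA + Hasanta
            return False
        if cat == "V" and prev_cat == "H":
            return False                  # rules 8 and 9
    return True
```

Rule 9 needs no separate branch here because S always begins with a vowel, so the rule 8 check already rejects an S that follows a Hasanta.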
Let us now elaborate on the rules with examples from the script, keeping in mind the Bangla, Assamese, and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in permitting such ligatures: any user can easily create them, even though they may not be attested combinations.
7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া bæŋk ʌv ɪndiə (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE), meaning "Bank of India". This is a case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the intended sound. By and large, however, the form of the first word without an H (U+09CD) is considered sufficient to represent its intended sound.
This is a unique situation, necessitated by the absence of the hyphen, the space, and the Zero Width Non-Joiner from the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to follow an H. Permitting this may create a perceived similarity between two labels (with and without H) for the majority of the linguistic communities; hence it is explicitly prohibited by the NBGP.
In future, depending on the prevailing requirements of the community, a future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF
Below are a few examples that help to explain some of the rules that the ABNF puts in place. They are given for reference only and are not meant to be comprehensive.
1. H, M, B, D, or X cannot occur at the beginning of a Bangla word.
Example
্ক ्क
াক ाक
ংক ंक
ঁক ँक
ঃক ःक
As can be seen, such a combination automatically results in a "golu", or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is applied quasi-automatically wherever supported by the OS.
2. H is not permitted after V, B, D, X, M, or S.
Example
অ্
কং্
কঁ্
কঃ্
কি্
3. The number of B, D, or X permitted after a Consonant, a Vowel, or a kāra (Mātrā) is restricted to one; thus the following combinations are invalid.
Example
কংং कंं
কঁঁ कँँ
কঃঃ कःः
কাঁঁ काँँ
কীঃঃ कीःः
অংং अंं
অঁঁ अँँ
অঃঃ अःः
4. The number of M permitted after a Consonant is restricted to one.
Example
কীী कीी
5. M is not permitted after V.
Example
ইা ঈৗ ईा ईौ
6. The combinations Anusvāra + Visarga and Visarga + Anusvāra are not permissible.
Example
কংঃ कंः
কঃং कःं
8 Contributors
8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members
8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals' Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] BandyopadhyayChittaranjan1981DuiShatakerBanglaMudranoPrakashanKolkataAnandaPublishers
[102] BanerjiRD1919TheOriginoftheBengaliScriptKolkataNewDelhiAsianEducationalServices2003reprint
[103] ChatterjiSK1926TheOriginandDevelopmentoftheBengaliLanguageCalcuttaCalcuttaUniversityPress
[104] -----1939Bhasha-prakashBangalaVyakaran(AGrammaroftheBengaliLanguage)CalcuttaUniversityofCalcutta
[105] HaiMuhammadAbdul1964DhvaniVijnanOBanglaDhvani-tattwa(PhoneticsandBengaliPhonology)DhakaBanglaAcademy
[106]JhaSubhadra1958TheFormationofMaithiliLondonLuzacampCo
[107] KosticDjordjeDasRheaS1972AShortOutlineofBengaliPhoneticsCalcuttaStatisticalPublishingCompany
[108] MajumdarRC1971HistoryofAncientBengalCalcuttaGBhardwaj
[109] MazumdarBijaychandra19202000TheHistoryoftheBengaliLanguage(ReprCalcutta1920ed)NewDelhiAsianEducationalServices
[110] PandeyAnshuman2001ProposaltoEncodetheTirhutaScriptinISOIEC10646
[111] PalPalashBaran2001DhwanimalaBarnamalaKolkataPapyrus
[112] -----2007lsquoBanglaHarapherPanchParbarsquoInSwapanChakrabortyedMudranerSanskritiOBanglaBoiKolkataAbabhas
[113] RossFiona1999ThePrintedBengaliCharacteranditsEvolutionLondonCurzon
[114] ShastriMahamahopadhyayHaraPrasad1916HājārBacharērPurāṇaBāṅgālāBhāṣāyBauddhaGānōDōhāCalcuttaBangiyaSahityaParishat
[115] SinghUdayaNarayana(JointlyManiruzzaman)1983DiglossiainBangladeshandlanguageplanningCalcuttaGyanBharati214pp
[116] -----1987ABibliographyofBengaliLinguisticsMysoreCIILxii+316pp
[117] -----2017(withRajibChakrabortyBidishaBhattacharjeeampArimardanKumarTripathy)LanguagesandCulturesontheMarginGuidelinesforFieldworkonEndangeredLanguagesMimeoCentreforEndangeredLanguagesVisva-Bharati
[118] -----1980ScriptalchoiceandspellingreformAnessayinlanguageandplanningJournaloftheMSUniversityofBarodaSocialScienceNumber292173-186AmodifiedversionreprintedEAnnamalaiBjornJernuddandJoanRubinedsLanguagePlanningProceedingsofanInstituteMysoreCIIL405-417
[119] Sripantha1996JakhanChapakhanaEloKolkataPaschim-BangaBanglaAcademy
[120] SurAtul1986BanglaMudranerDushoBacharKolkataJijnasa
[121] ScriptBehaviourforBengaliVersion11TDILandC-DACPune
[122] BoraMahendra1981TheEvolutionofAssameseScriptJorhatAssamSahityaSabha
[123] ProposaltoEncodetheTirhutaScriptinISOIEC10646httpwwwunicodeorgL2L201111175r-tirhutapdfaccessedon25112017
[124] EthnologueAssameseintheLanguageCloudhttpswwwethnologuecomcloudasmaccessedon25112017
[125] BengalialphabetforManipurifoundinEthnologueManipuri(MeeteilonMeithei)httpswwwomniglotcomwritingmanipurihtmaccessedon20102019
[126] WikipediaBengalialphabethttpsenwikipediaorgwikiBengali_alphabetaccessedon25112017
[129]OmniglotSlyhetihttpwwwomniglotcomwritingsylotihtmaccessedon1052018
[130]WikipediaBishnupriyaManipurilanguagehttpsenwikipediaorgwikiBishnupriya_Manipuri_languageaccessedon25112017
[131] TheEMILLECIILCorpushttpmetashareeldaorgrepositorybrowsethe-emilleciil-corpusabdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20accessedon1052018
[132]TheEMILLECIILCorpushttpcatalogelrainfoproduct_infophpproducts_id=696accessedon1052018
[133] BanglaLanguageampScript httpswwwisicalacin~rc_banglabanglahtmlaccessedon1052018
[134] SarkarPabitra1992BanglaBananSanskarSamasyaoSambhabanaKolkataChirayataPrakashan
[135] SarkarPabitra1993BanglaBhasharYuktabyanjanBhasha1123-45
[136] DashNiladriShekharandBBChaudhuri1998BanglaScriptAStructuralStudyLinguisticsToday121-28Alsoavailableathttpswwwacademiaedu9967428Bangla_Script_A_Structural_Study
[137] DaniAhmedHasan(1957)lsquoSrīhatta-NāgarīLipirUtpattioBikāśrsquoBanglaAcademyPatrika(Dhaka)Vol12(Bhadra-Agrahayan1364BangabdaNumber)pg1
[138] WikipediaSylhetiNagari httpsenwikipediaorgwikiSylheti_Nagariaccessedon1952018
[139] FuruiRyosuke(2015)lsquoVariegatedAdaptationsStateFormationinBengalfromtheFifthtoSeventhCenturyrsquoinBhairabiPrasadSahuampHermannKulkeedsInterrogatingPoliticalSystemsIntegrativeProcessesandStatesinPre-ModernIndiaChapter9Pp255-73NewDelhiManohar
[140] FergusonCharesAandMunierChowdhury(1960)lsquoPhonesofBengalirsquoLanguageVol36No1pp22-59
[141] ShahidullahMuhammad(2007)BuddhistMysticSongsDhakaMowlaBrothers
[142] RayPunyaSloka(1966)BengaliLanguageHandbookWashington
[143] HaiMuhammadAbdul(1960)AphoneticandphonologicalstudyofnasalsandnasalizationinBengaliDhakaUniversityofDhaka
[144] UnicodeConsortiumProposalSummaryFormtoAccompanySubmissionsforAdditionstotheRepertoireofISOIEC10646UNICODEhttpswwwunicodeorgL2L200202387r-syloti-formpdfaccessedonMay212018
[145] WikipediaOlChiki(Unicodeblock)httpsenwikipediaorgwikiOl_Chiki_(Unicode_block)accessedonMay212018
[146]BanglaScripthttpwwwbangladesh2000combdbangla_scripthtmlaccessedonMay212018
[147] BhattacharyaAshutoshed(1942)GopichandrerGanCalcuttaCalcuttaUniversity
[149] DasSisirKumar(1975)SahibsandMunshisAnAccountoftheCollegeofFortWilliamCalcutta
[150] IslamRafiqulPabitraSarkarMahbubulHaqampRajibChakraborty(eds)(2014)BanglaAcademyPromitoBanglaByabaharikByakaran(AFunctionalGrammarofStandardBangla)DhakaBanglaAcademy
[151] SarkarPabitra[2013]lsquoBanglaSpellingReformtheLongandShortofItrsquoBanglaJournal19215-232
[152] BanglaAcademy(2012)BanglaAcademyPromitoBanglaBananerNiyam(StandardBanglaSpellingasadoptedbyBanglaAcademy)DhakaBanglaAcademy
[153] SarkarPabitraampRajibChakraborty2018ldquoWhathashappenedSoFarIntermsof Script Reformsrdquo Paper presented at the Face to Face meeting jointly held by theBanglaAcademyDhakaampICANNatBanglaAcademyDhakaon10072018
[154] The Unicode Consortium 2018 The Unicodereg Standard Version 110 ndash CoreSpecificationChapter12P473
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language or script, certain restriction rules apply. In other words, in a given language some of the formalism's structures do not necessarily apply. Restriction rules are put in place to take care of such cases; they help to fine-tune the ABNF. In the case of Bangla (see footnote 13) in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold 'varga' (as defined under Table 5). Bangla does not normally allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem 'Lichuchor' by Kazi Nazrul Islam. There are also instances of its use in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning, as in য়াকুব. In very recent times, in transliterating some Chinese and Japanese names into Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2. CH can come with Khanda Ta only in the case where C is ra (র) (U+09B0): র্ৎ, as in ভর্ৎসনা.
3. Only the following combinations with VHCM are allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
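Restrictions 1 and 3 above lend themselves to simple membership checks. The sketch below covers only ৎ, ঞ, and ঙ as forbidden initials; য় is deliberately left out because the text cites attested word-initial uses. The names `FORBIDDEN_INITIAL`, `may_start_label`, and `is_allowed_vhcm` are hypothetical.

```python
# Restriction 1: characters that may not begin an IDN label.
FORBIDDEN_INITIAL = {
    "\u09CE",  # ৎ KHANDA TA
    "\u099E",  # ঞ NYA
    "\u0999",  # ঙ NGA (velar nasal)
}

# Restriction 3: the only VHCM combinations allowed (অ্যা and এ্যা).
ALLOWED_VHCM = {
    "\u0985\u09CD\u09AF\u09BE",
    "\u098F\u09CD\u09AF\u09BE",
}


def may_start_label(label: str) -> bool:
    """Return False if the label is empty or begins with a forbidden initial."""
    return bool(label) and label[0] not in FORBIDDEN_INITIAL


def is_allowed_vhcm(sequence: str) -> bool:
    """Return True only for the two permitted VHCM sequences."""
    return sequence in ALLOWED_VHCM
```

Under these checks a transliteration such as ৎসেরিং (Tsering) would be rejected as a label even though it may occur in running text.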
13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
10.2 'Sylheti Nāgarī lipi' or 'Siloti'
This version of the Bangla script resembles the 'Kaithī' script (ISO 15924) used by accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, which was widely in use during the 1880s. There were several other names for Sylheti Nāgarī or Siloti [129], such as 'Jalalabada Nāgarī', 'Fula (flower) Nāgarī', 'Muslim Nāgarī', or 'Muhammad Nāgarī'. It is said that Shah Jalala had brought the script with him to Sylhet in the 13th-14th century [138], although some have suggested that it was an invention of the Afghan rulers of Sylhet [137]. Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here [138].
Table 17 - The Script Table of Sylheti Nāgarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable but not enough to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla | Devanāgarī | NBGP decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable
Table 18 - Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhī
Bangla | Gurmukhī | NBGP decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable
Table 19 - Bangla and Gurmukhī confusable code points
Bangla | Gurmukhī | NBGP decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable
Table 20 - Bangla and Gurmukhī distinguishable code points
10.3.3 Bangla and Oriya (Odia)
Bangla | Oriya (Odia) | NBGP decision
ও U+0993 | ଓ U+0B13 | Confusable
Table 21 - Bangla and Oriya confusable code points
Bangla | Oriya (Odia) | NBGP decision
ঘ U+0998 | ସ U+0B38 | Distinguishable
Table 22 - Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs
Consonants | Phonetic Value | Allographs
Clusters | Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ Thisletterdoesnothaveanyparticularphoneticvaluebutmostlypronouncedasn
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒThesecondarysymbol(allograph)jɔ-phalahastwophoneticvaluesWhenaddedtotheinitialconsonantinaworditisavowelaelig(asinশ8ামলর 8াপারetc)Butafteranon-initialconsonantitjustdoublesitinpronunciation(asinকায6ধায6etc)The+যcombinationhastwophysicalmanifestationsmdashর 8andয6
ক8(p+য)স8(+য)র 8(+য)[Justর 8isneverusedinBanglaorthographyর 8াisbutthenitslasttwosymbolsYa-phalaa-karaconstituteavowelsignrepresentingthevowelঅ8া]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ hwordfinallywordmediallyitdoublesthepronunciationofthefollowingconsonant
অঃকঃ
অব
40
CD ক क
CX কঃ कः
CH p क(Pureconsonant)
CDB কং क
CDX কঃ कः
5433 CMSequenceACMsequencecanbeoptionallyfollowedbyBDXDBorDX
Example CMB কীংকং कक
CMD কা का
CMX বীঃ वीः
CMDB কাং का
CMDX কাঃ काः
5434 SequenceofConsonants Asequenceofconsonants(upto4)joinedbyHasanta(alsoknownasVirama)
3(CH)CExample
CHC W rarr ন++ ত न+त
CHCHC sup2 rarr ন+ + ত+ + র न+त+र
CHCHCHC q8 rarr ন++ত++র++য় न+त+र+य
5435 Subsets While considering its subsets as a representative example we will consider thecombinationCHConlyhoweverthesameisequallyapplicabletoCHCHCandCHCHCHC[A]ThecombinationmaybefollowedbyMBDXDBorDX
Example CHCM sup3ী rarrক ক ী 4कrarrककী
CHCB sup3ং rarrক ক ং 4कrarrकक
CHCD sup3 rarrক ক 4कrarrकक
41
CHCX sup3ঃ rarrক ক ঃ 4कःrarrककঃ
CHCDB sup3 ং rarrক ক ং 4कrarrकक
CHCDX sup3ঃ rarrক ক ঃ 4कःrarrककः [B] 3(CH)CMmayfurtherbefollowedbyaBDXDBorDX
Example
CHCMB sup3ীং rarr ক ক ী ং 4क rarr क क ी
sup3ং rarr ক ক ং 4क rarr क क
CHCMD sup3া rarr ক ক া 4का rarr क क ा
CHCMX sup3ীঃ rarr ক ক ী ঃ 4कः rarr क क ी ः
CHCMDB sup3াংrarr ক ক া ং 4काrarr क क ा
CHCMDX sup3াঃ rarr ক ক া ঃ 4काः rarr क क ा ः
544 TheKhanda-Tasequence
5441 AsinglelsquoKhandarsquo-Ta(Z) Example Zৎ=x
5442 AKhandaTaCombination10AKhandaTacanbeprecededbyaconsonantandHasanta(alsoknownasVirama)
[CH]Z
Example র++ৎ =ৎHasinভৎHসনা (bhartsanā) scoldingNoteTheconditionsinthiscontextofKHANDATAarethattheCshouldbeeitherRAU+09B0(র)(usedinBangla)orRAU+09F0(ৰ)(usedinAssamese)
5.4.5 Special Cases S and P
Two special cases involving sequences (referred to as S and P in Table 16 under Section 7) can be described briefly here. Let us take up S first. It is noteworthy that there are two instances in Bangla where Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ - BANGLA LETTER A and U+098F এ - BANGLA LETTER E). For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF (ya) preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound, as in English ‘bat’, ‘fat’, etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as a ‘vowel killer’, although it actually indicates the absence of a vowel after the marked consonant. Only consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in its orthography, in which Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).
The other case refers to the formation of repha and ra-phalā in the script, listed in Table 16 as P. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka “the sun”); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra “cycle”). The point is that in both cases the slot for ra could be filled by Bangla ra র (U+09B0) or Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases.

10 Refer to Rule P in Section 7, Table 16.
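The two special sequences are fixed code point patterns, so they can be located with simple regular expressions. The sketch below is illustrative only: the names `S_SEQUENCE`, `P_SEQUENCE` and `find_sequences` are assumptions, S is limited to the (æ) Ya-phalā form V1 H C1 M1, and P is taken as RA + Hasanta as defined in Table 16.

```python
import re

# S: (অ or এ) + Hasanta + YA + AA-kāra;  P: (র or ৰ) + Hasanta.
S_SEQUENCE = re.compile("[\u0985\u098F]\u09CD\u09AF\u09BE")
P_SEQUENCE = re.compile("[\u09B0\u09F0]\u09CD")


def find_sequences(label: str) -> dict:
    """Return the start offsets of every S and P sequence found in the label."""
    return {
        "S": [m.start() for m in S_SEQUENCE.finditer(label)],
        "P": [m.start() for m in P_SEQUENCE.finditer(label)],
    }
```

Because the P pattern accepts either RA letter, the Bangla and Assamese spellings of a repha are found at the same offset.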
6 Variants
This section discusses the variants in the Bangla script. The NBGP categorizes these confusing variants into two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These are not cases of mere visual similarity, as they involve some deviation from the widely accepted norms of Bangla Akshar formation. They can cause confusion even to a careful observer and are hence proposed as variants.
Variants arise in a script when two or more forms are produced with different storage, i.e. different code points. In Bangla, the e-kāra, ā-kāra and o-kāra have different code points. One can type o-kāra with a consonant in one go, or obtain the same result by typing e-kāra and ā-kāra as two separate keys. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered a case of variant, because a kāra followed by a kāra is not allowed.
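The two ways of typing ko can be compared at the code point level. The short Python sketch below assumes that labels are handled in Unicode Normalization Form C, as IDNA2008 requires; the helper name `same_after_nfc` is hypothetical. It shows that the two-key sequence (e-kāra U+09C7 followed by ā-kāra U+09BE) normalizes to the single o-kāra code point U+09CB.

```python
import unicodedata


def same_after_nfc(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


one_key = "\u0995\u09CB"        # ka + o-kāra
two_keys = "\u0995\u09C7\u09BE"  # ka + e-kāra + ā-kāra

# The raw strings differ, but they are identical once normalized.
print(one_key == two_keys, same_after_nfc(one_key, two_keys))
```

Under this assumption, the two-key form does not survive normalization as a separate label, which is consistent with not treating it as a variant.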
6.1 In-Script Variants
However, we propose two cases of true in-script variants in the Bangla script.
CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases wherein (U+09A5) থ (tha) appears with Hasanta as a conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, could be a little confusing, since the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final position (as shown below in example no. 1). It could also be typed wrongly, in the belief that it was a হ (ha) U+09B9, increasing the risk of errors in label writing and identification.
Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)
Fonts which represent the traditional Bangla writing system tend to create this problem. Therefore these may be taken as cases of variants in Bangla.
CASE II
Another interesting example of variants is encountered in ra + Hasanta and Hasanta + ra combinations in writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing ‘repha’ (involving ra + Hasanta) and ‘ra-phalā’ (involving Hasanta + ra). ‘Repha’ could be formed by two sequences (mainly because both Assamese and Bangla are encoded in the same Unicode block, and ‘B_RA’ as well as ‘A_RA’ refer to the same phonetic element). Here the final ligatures look the same and are as follows:
(1) B_RA + H + C
(2) A_RA + H + C
Where
B_RA = U+09B0 BENGALI LETTER RA (র)
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)
Example:
Sequence 1 (Using Bangla RA) | Ligature 1 | Sequence 2 (Using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12: Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. Addition of Repha does not make any difference.
‘Ra-phala’ could be formed by two sequences on similar grounds, and the final ligatures would look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
Where C1 = any consonant except Khanda-ta
Example:
Sequence 1 (Using Bangla RA) | Ligature 1 | Sequence 2 (Using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusion for end users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP concluded to define র and ৰ as variant code points, so that a single variant set between র and ৰ covers all cases. But this will create blocked variant labels: e.g. if someone registers “ররর”, the variant label “ৰৰৰ” will be generated as a variant and will be blocked, and vice versa. However, it is only blocked at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, that is still possible.
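The effect of a single র/ৰ variant set on whole labels can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the proposal's implementation: the names `VARIANT_SET`, `variant_labels` and `are_variants` are hypothetical, and the sketch enumerates every mixed substitution as well as the fully substituted label, without deciding which of them would be blocked or allocated.

```python
from itertools import product

VARIANT_SET = ("\u09B0", "\u09F0")  # র and ৰ are variants of each other


def variant_labels(label: str) -> set:
    """Return every label obtained by swapping র and ৰ, excluding the original."""
    choices = [VARIANT_SET if ch in VARIANT_SET else (ch,) for ch in label]
    return {"".join(combo) for combo in product(*choices)} - {label}


def are_variants(a: str, b: str) -> bool:
    """Two distinct labels are variants if one can be produced from the other."""
    return a != b and b in variant_labels(a)
```

Labels of different length, such as ৰৰ and ৰৰৰ, never appear in each other's variant sets, which matches the statement that blocking applies only at the label level.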
6.2 Cross-Script Variants
A crisp cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia¹¹ (formerly Oriya), keeping in mind the visual and technical confusion they may cause as labels in the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are proposed by the NBGP as variants. Although certain other characters are somewhat similar, they have not been included here; they are provided in the Appendix (Section 10.3) for reference.

11 Unicode uses Oriya for the script, although Odia is now the official term used.
1. Bangla and Nāgarī/Devanāgarī Script
Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14 - Bangla and Devanāgarī cross-script variant code points
2. Bangla and Gurmukhī Script
Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15 - Bangla and Gurmukhī cross-script variant code points
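Tables 14 and 15 amount to a small lookup from Bangla code points to their counterparts in the other two scripts. The following Python sketch encodes exactly those four pairs; the names `CROSS_SCRIPT` and `cross_script_variants` are hypothetical, and the rule that a whole-label variant exists only when every code point has a counterpart is an assumption made for the illustration.

```python
# Bangla code point -> counterpart per script, taken from Tables 14 and 15.
CROSS_SCRIPT = {
    "\u09AE": {"Devanagari": "\u092E", "Gurmukhi": "\u0A38"},  # ম -> म, ਸ
    "\u09BF": {"Devanagari": "\u093F", "Gurmukhi": "\u0A3F"},  # ি -> ि, ਿ
}


def cross_script_variants(label: str) -> dict:
    """Map a Bangla label to its look-alike label in each other script.

    Returns an empty dict unless every code point has a counterpart.
    """
    if not label or any(ch not in CROSS_SCRIPT for ch in label):
        return {}
    return {
        script: "".join(CROSS_SCRIPT[ch][script] for ch in label)
        for script in ("Devanagari", "Gurmukhi")
    }
```

For example, the two-code-point label মি maps to मि in Devanāgarī and ਸਿ in Gurmukhī, while any label containing a code point outside the two tables has no cross-script variant.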
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Bangla¹² script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications.
Below are the symbols used in the WLE rules, based on the Indic Syllabic Category of each code point as given in the code point repertoire table (Section 5.1).

C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9) or (æ) Ya-phalā (V1 H C1 M1), where V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E); H is 09CD (্ - BENGALI SIGN VIRAMA); C1 is 09AF (য - BENGALI LETTER YA); M1 is 09BE (া - BENGALI VOWEL SIGN AA). S1 and S2 are valid even if they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL); H is 09CD (্ - BENGALI SIGN VIRAMA)
Table 16 - Symbols used in WLE rules

12 As used by Unicode, denoting and including both Assamese and Maṇipuri.
It is also perhaps worth mentioning here that in Bangla the consonant letters (or graphemes) are physically joined to form “clusters”, which could theoretically conjoin two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are “nearly 380 unique consonant clusters”, of which bi-consonantal combinations number 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, p. 4]. More details of such combinations can be found in Pabitra Sarkar (1993) [135].

7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1. C is the set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়
2. H must be preceded by C
Example:
3. M must be preceded by C. Example: কা
4. D must be preceded by either of V, C or M. Example: আখখা হা
5. X must be preceded by either of V, C, M or D. Example: উঃখঃবঃাঃ দ ঃ
6. B must be preceded by either of V, C, M or D. Example: আং, ইং, কং
7. Z must be preceded by V, C, M, D, B, X or P. Example: ইৎকৎাৎাৎপ6ৎrৎ (S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V preceded by H
9. S CANNOT be preceded by H
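The nine rules can be read as a check on what may precede each symbol class. The Python sketch below is one possible reading, not the normative LGR: the class membership is assumed from the Unicode Bengali block rather than from the repertoire table of Section 5.1, only the (æ) Ya-phalā form of S is handled, and the names `CLASSES`, `tokenize` and `is_valid_label` are hypothetical.

```python
import unicodedata


def _chars(*ranges):
    """Build a set of characters from inclusive (first, last) code point ranges."""
    return {chr(cp) for lo, hi in ranges for cp in range(lo, hi + 1)}


# Assumed class membership (Unicode Bengali block), not the proposal's repertoire.
CLASSES = {
    "V": _chars((0x0985, 0x098C), (0x098F, 0x0990), (0x0993, 0x0994)),
    "C": _chars((0x0995, 0x09A8), (0x09AA, 0x09B0), (0x09B2, 0x09B2),
                (0x09B6, 0x09B9), (0x09F0, 0x09F1)),
    "M": _chars((0x09BE, 0x09C4), (0x09C7, 0x09C8), (0x09CB, 0x09CC)),
    "B": {"\u0982"}, "D": {"\u0981"}, "X": {"\u0983"},
    "H": {"\u09CD"}, "Z": {"\u09CE"},
}
NUKTA = "\u09BC"
NUKTA_BASES = {"\u09A1", "\u09A2", "\u09AF"}  # bases of ড়, ঢ়, য় (rule 1, CN)
RA_LETTERS = {"\u09B0", "\u09F0"}
S_VOWELS = {"\u0985", "\u098F"}

# Rules 2-7: classes that may directly precede each symbol.
ALLOWED_BEFORE = {
    "H": {"C"},
    "M": {"C"},
    "D": {"V", "C", "M"},
    "X": {"V", "C", "M", "D"},
    "B": {"V", "C", "M", "D"},
    "Z": {"V", "C", "M", "D", "B", "X"},
}


def tokenize(label):
    """Split an NFC-normalized label into (class, text) tokens, or None."""
    text = unicodedata.normalize("NFC", label)
    tokens, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch in NUKTA_BASES and text[i + 1:i + 2] == NUKTA:
            tokens.append(("C", ch + NUKTA))  # normalized ড়, ঢ়, য় count as C
            i += 2
            continue
        cls = next((name for name, members in CLASSES.items() if ch in members), None)
        if cls is None:
            return None  # code point outside the assumed repertoire
        tokens.append((cls, ch))
        i += 1
    return tokens


def is_valid_label(label):
    tokens = tokenize(label)
    if not tokens:
        return False
    for i, (cls, _) in enumerate(tokens):
        prev_cls, prev_text = tokens[i - 1] if i > 0 else (None, "")
        if cls == "C":
            continue
        if cls == "V":
            if prev_cls == "H":  # rules 8 and 9
                return False
            continue
        if prev_cls in ALLOWED_BEFORE[cls]:
            continue
        following = [text for _, text in tokens[i + 1:i + 3]]
        if cls == "H" and prev_text in S_VOWELS and following == ["\u09AF", "\u09BE"]:
            continue  # Hasanta inside the S sequence V1 H C1 M1
        if cls == "Z" and prev_cls == "H" and i >= 2 and tokens[i - 2][1] in RA_LETTERS:
            continue  # Z preceded by P (RA + Hasanta)
        return False
    return True
```

The label is normalized to NFC first, so the precomposed letters U+09DC, U+09DD and U+09DF are seen in their decomposed form and handled through the nukta branch of `tokenize`.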
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in allowing such ligatures: any user can easily create them, even though they may not be attested combinations.

7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া /bæŋkʌv ɪndiə/ (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE) (meaning Bank of India). This is a case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of the hyphen, space or Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise V is never required to follow an H. Permitting this may create a perceived similarity between two labels (with and without H) for the majority of the linguistic communities; hence this is explicitly prohibited by the NBGP.
In future, depending on the prevailing requirements of the community, a future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules the ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.

1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Example:
্ক ्क
াক ाक
ংক ंक
ঁক ँक
ঃক ःक
As can be seen, such a combination automatically results in a “golu” or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.
2. H is not permitted after V, B, D, X, M, S.
Example:
অ্ अ्
অং্ कं्
কঁ্ कँ्
কঃ্ कः्
কি্ कि्
3. The number of B, D or X permitted after a Consonant, Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalid.
Example:
কংং कंं
কঁঁ कँँ
কঃঃ कःः
কাঁঁ काँँ
কীঃঃ कीःः
অংং अंं
অঁঁ अँँ
অঃঃ अःः
4. The number of M permitted after a Consonant is restricted to one.
Example:
কীী कीी
5. M is not permitted after V.
Example:
ইা ঈৗ ईा ईौ
6. The combinations Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Example:
কংঃ कंः
কঃং कःं
8 Contributors

8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members

8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals’ Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje & Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. ‘Bangla Harapher Panch Parba’. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the MS University of Baroda, Social Science Number, 29(2): 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25.11.2017.
[124] Ethnologue: Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25.11.2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20.10.2019.
[126] Wikipedia: Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25.11.2017.
[129] Omniglot: Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10.5.2018.
[130] Wikipedia: Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25.11.2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20, accessed on 10.5.2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10.5.2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10.5.2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1123-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1(2): 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). ‘Srīhatta-Nāgarī Lipir Utpatti o Bikāś’. Bangla Academy Patrika (Dhaka), Vol. 1:2 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia: Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19.5.2018.
[139] Furui, Ryosuke (2015). ‘Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century’, in Bhairabi Prasad Sahu & Hermann Kulke, eds. Interrogating Political Systems: Integrative Processes and States in Pre-Modern India, Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). ‘The Phonemes of Bengali’. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646 (UNICODE). https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia: Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. ‘Bangla Spelling Reform: the Long and Short of It’. Bangla Journal 19: 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. “What has happened So Far: In terms of Script Reforms”. Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka, on 10.07.2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 - Core Specification, Chapter 12, p. 473.
10 Appendix-I

10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language or script, certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla¹³ in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold ‘varga’ (as defined under Table 5). Bangla does not generally allow ya (য়) at the beginning of a word either, but a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem ‘Lichuchor’ written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names into Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).

2. CH can come with Khanda Ta only in the case where C is ra (র) (09B0):
র্ৎ, as in ভর্ৎসনা

3. Only the following combinations with VHCM will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
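Restriction 1 is a test on the first code point of a label. The Python sketch below covers only the three letters that the text rules out unconditionally (ৎ, ঞ, ঙ); initial য় is left out because the text itself cites attested exceptions. The names `DISALLOWED_INITIAL` and `has_allowed_initial` are hypothetical.

```python
# Code points that restriction 1 bars from the start of a label.
DISALLOWED_INITIAL = {
    "\u09CE",  # ৎ Khanda Ta
    "\u099E",  # ঞ NYA
    "\u0999",  # ঙ NGA (velar nasal)
}


def has_allowed_initial(label: str) -> bool:
    """Return True if the label is non-empty and does not start with ৎ, ঞ or ঙ."""
    return bool(label) and label[0] not in DISALLOWED_INITIAL
```

Under this reading a transliteration such as ৎসেরিং is rejected as a label even though the spelling is attested in running text.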
13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.

10.2 ‘Sylheti Nāgarī lipi’ or ‘Siloti’
This version of the Bangla script resembles the ‘Kaithī’ script (ISO 15924) used by accountants (perhaps of the Kāyastha community) in Eastern Uttar Pradesh and Bihar, which was widely in use during the 1880s. There were several other names for Sylheti Nāgarī or Siloti [129], such as ‘Jalalabada Nāgarī’, ‘Fula (flower) Nāgarī’, ‘Muslim Nāgarī’ or ‘Muhammad Nāgarī’. It is said that Shah Jalal had brought the script with him to Sylhet in the 13th-14th century [138], although some have suggested that it was an invention of the Afghan rulers of Sylhet [137]. Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here [138].

Table 17 - The Script Table of Sylheti Nāgarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.

10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable
Table 18: Bangla and Devanāgarī confusable code points

10.3.2 Bangla and Gurmukhi
Bangla | Gurmukhi | NBGP Decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable
Table 19: Bangla and Gurmukhi confusable code points

Bangla | Gurmukhi | NBGP Decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable
Table 20 - Bangla and Gurmukhī distinguishable code points

10.3.3 Bangla and Oriya (Odia)
Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable
Table 21 - Bangla and Oriya confusable code points

Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable
Table 22 - Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs

Consonants | Phonetic Value | Allographs: Clusters | Allographs: Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (not privileged enough to have clusters as a first member) Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (not privileged enough to have clusters as a first member) egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value, but is mostly pronounced as n
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (not privileged enough to have clusters)
য dʒ The secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word it is a vowel, æ (as in শ8ামলর 8াপার etc.). But after a non-initial consonant it just doubles it in pronunciation (as in কায6ধায6 etc.). The +য combination has two physical manifestations: র 8 and য6
ক8(p+য)স8(+য)র 8(+য) [Just র 8 is never used in Bangla orthography; র 8া is, but then its last two symbols, Ya-phala and a-kara, constitute a vowel sign representing the vowel অ8া]
র r Two manifestations:
i. রেফ repʰ, as the first member of a cluster, e.g. পৎ6 not6 য6D6(+deg+ব) (earlier E6=+brvbar+deg+ব, a four-term cluster) etc. (placed over the following consonant)
ii. র-ফলা rɔ-pʰɔla, as the second/third member of a cluster (placed under the consonant it follows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃকঃ
অব
CHCDX sup3ঃ rarrক ক ঃ 4कःrarrककः [B] 3(CH)CMmayfurtherbefollowedbyaBDXDBorDX
Example
CHCMB sup3ীং rarr ক ক ী ং 4क rarr क क ी
sup3ং rarr ক ক ং 4क rarr क क
CHCMD sup3া rarr ক ক া 4का rarr क क ा
CHCMX sup3ীঃ rarr ক ক ী ঃ 4कः rarr क क ी ः
CHCMDB sup3াংrarr ক ক া ং 4काrarr क क ा
CHCMDX sup3াঃ rarr ক ক া ঃ 4काः rarr क क ा ः
544 TheKhanda-Tasequence
5441 AsinglelsquoKhandarsquo-Ta(Z) Example Zৎ=x
5442 AKhandaTaCombination10AKhandaTacanbeprecededbyaconsonantandHasanta(alsoknownasVirama)
[CH]Z
Example র++ৎ =ৎHasinভৎHসনা (bhartsanā) scoldingNoteTheconditionsinthiscontextofKHANDATAarethattheCshouldbeeitherRAU+09B0(র)(usedinBangla)orRAU+09F0(ৰ)(usedinAssamese)
545 SpecialCasesSandPTwospecialcasesinvolvingSequences(referredtoasSandPinTable16underSection7)couldbedescribedbrieflyhereLetustakeupSinthefirstinstanceItisnoteworthythat thereare two instances inBanglawhereHasanta(U+09CD) isprecededbya full
10 Refer to Rule P in Section 7 Table 16
42
vowel (U+0985 অ - BANGLA LETTER A and U+098F এ - BANGLA LETTER E) ForrenderingYa-phalāfollowedbyঅandএitisnecessarytotypeU+09CDplusU+09AFyapreceded by the said vowels This is a purely ligatural entity and the addition ofYa-phalāandā-kāraisusedtoelicittheaeligsoundasinEnglishlsquobatrsquolsquofatrsquoetcTheBrahmıscriptbynaturedoesnothaveHasantaafteravowelHasantaisgenerallydescribedaslsquovowel killerrsquo although it actually indicates absence of a vowel after the markedconsonant Only the consonants can have the Hasanta marked But as we see hereBanglaendsupwithadeviantfeatureintheorthographyhereinwhichHasantacomesimmediatelyafteravowelinligaturesঅGাandএGা(CfUnicode100p473[100])Another case refers to the formation of repha and ra-phalā in the said script andmentioned in the tableaboveasPOwing toco-occurrencewithHASANTARAeitherlosesitsownimplicitvowel(REPHA)orsuppressestheimplicitvoweloftheprecedingconsonant(RA-PHALA )Forinstancerepha=ra+Hasanta+C(egকHiera+Hasanta+
kaasinঅকHarkaldquothesunldquo)ra-phalā=C+Hasanta+ra(egTieka+Hasanta+raas
inচTcakraldquocycleldquo)ThepointisinboththecasestheslotforracouldbeBanglaraর
(U+09B0)ortheAssameseraৰ(U+09F0)followedprecededbythecommonHasanta(U+09CD)whereastheshapesofrephaandra-phalāinboththecasesremainthesame
43
6 VariantsThissection talksabout thevariants in theBanglascriptTheNBGPcategorizes theseconfusinglyvariantsintwogroupsGroup1ConfusingduetopurevisualsimilarityGroup2ConfusingduetodeviationfromnormallyperceivedcharacterformationsbylargerlinguisticcommunityForGroup1anyidenticalcodepointsaredefinedasvariantsTheconfusablebutnotidenticalcasesarenotproposedasthereisanotherpanel(Stringsimilarityassessmentpanel)entrustedtodealwithsuchcasesHowevercaseswhichbelongtoGroup2areproposedtobeconsideredasvariantsThesecasesarenotofmerevisualsimilarityasthey involve some deviations from the widely accepted norms of Bangla Aksharformations These can cause confusion even to a careful observer and hence beingproposedasvariantsThe variants are generated in a script when two or more forms are formed withdifferent storage or code points In Bangla the e-kāra ā-kāra and the o-kāra havedifferentcodepointsOnecantypeowithaconsonantatonegoandthesamebytypinge-kāra and ā-kāra as two separate keys getting the same results A reader cannotdifferentiatebetween the twoko (েকা)one typedwitha singlekeyand theotheronetypedwithtwodifferentkeysMoreoverthiswillnotbeconsideredasacaseofvariantbecauseakārafollowedbyakāraisnotallowed
61 InScriptVariantsHoweverweproposetwocasesoftruein-scriptvariantsinBanglascriptCASEIAs far as true variants in Bangla are concernedwemay drawour attention to caseswhereinHasantawith(U+09A5)থ(tha)appearsasconjunctwith(U+09B8)স(sa)and(U+09A8)ন(na)1 স+Hasanta+থ(U+09B8+U+09CD+U+09A5)versus
স +Hasanta+হ(U+09B8+U+09CD+U+09B9)
2 ন+Hasanta+থ(U+09A8+U+09CD+U+09A5)versus
ন+Hasanta+হ(U+09A8+U+09CD+U+09B9)
44
Theabovecombinationsifwrittenintraditionalorthographycouldbelittleconfusingwheretheথ(tha)inconjunctappearslikeaহ(ha)Theconjunctcouldbeintheinitialmedialorfinalpositions(asshownbelowinegno1)Itcouldbetypedwrongaswellthinking itwas aহ (ha)U+09B9 increasing the chances of risks in labelwriting andidentificationExamples1 acuteandসহ(asinacuteানsthānaacuteলsthulaাacuteGsvāsthyaঅacuteায়ীasthāyı)
2 andনহ(asinparagrantha)The fontswhichrepresent traditionalBanglawritingsystemcould tend tocreate thisproblemThereforethesemaybetakenascasesofvariantsinBanglaCASEIIAnotherinterestingexampleofvariantisencounteredinra+HasantaandHasanta+racombinationsinwritinglabelsintheBanglascript(forlanguagessuchasBanglaAssameseandManipuri)Thevariantcasesariseintypinglsquorepharsquo(involvingra+Hasanta)andlsquora-phalārsquo(involvingHasanta+ra)lsquoRepharsquocouldbeformedbytwosequences(mainlybecausebothAssameseandBanglafindplaceinthesameUNICODEpointsandlsquoB_RArsquoaswellaslsquoA_RArsquorefertothesamephoneticelement)Herethefinalligatureslookthesameandwillbeasfollows
(1) B_RA+H+C(2) A_RA+H+C
Where
B_RA = U+09B0BENGALILETTERRA(র)orA_RA = U+09F0BENGALILETTERRAWITHMIDDLEDIAGONAL(ৰ)H = U+09CDBENGALISIGNVIRAMA()C = anyconsonant(theoretically)
Example
Sequence1(UsingBanglaRA)
Ligature1
Sequence2(UsingAssameseRA)
Ligature2
U+09B0(র)U+09CD()U+0995(ক) কH U+09F0(ৰ)U+09CD()U+0995(ক) কH
U+09B0(র)U+09CD()U+09A0(ঠ) ঠH U+09F0(ৰ)U+09CD()U+09A0(ঠ) ঠH
Table12ExampleofRepha
45
NoteAsBanglaandAssameseকandঠlookexactlythesametheresultantcombinationswithRephalookidenticalAdditionofRephadoesnotmakeanydifferencelsquoRa-phalarsquocouldbeformedbytwosequencesonsimilargroundsandthefinalligatureswouldlookthesame
(1) C1+H+B_RA(2) C1+H+A_RA
WhereC1 = anyconsonantsexceptKhanda-ta
Example
Sequence1(UsingBanglaRA)
Ligature1
Sequence2(UsingAssameseRA)
Ligature2
U+0995(ক)U+09CD()U+09B0(র) U+0995(ক)U+09CD()U+09F0(ৰ)
U+09A8(ন)U+09CD()U+09B0(র) ) U+09A8(ন)U+09CD()U+09F0(ৰ) )
Table13ExampleofRa-phala
AstheAssameseandBanglaRephaandRa-phalaconjunctformslookthesamethiscouldcauseconfusabilitytotheend-usersHencetherephaandra-phalacasesneedtobedefinedasvariantsNBGPconcludedtodefineরandৰasvariantcodepointswhereonlyonevariantsetbetweenরandৰcouldcoverallcasesButthiswillcreateblockedvariantlabelsegif
someoneregistersldquoরররrdquothevariantlabelldquoৰৰৰrdquowillbegeneratedasvariantandwillbeblockedandviceversaHoweveritisonlyblockedatthelabellevelifsomeoneelseneedstoregisterotherlabelsegৰৰorৰৰৰৰitisstillpossible
6.2 Cross-Script Variants
A crisp cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia11 (formerly Oriya), keeping in mind the visual and technical confusions they may cause as labels on the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain characters which are somewhat similar, they have not been included here. They have been provided in the Appendix (10.3) for reference.
11 Unicode uses Oriya for the script, although Odia is now the official term used.
1. Bangla and Nāgarī/Devanāgarī Script
Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14 - Bangla and Devanāgarī cross-script variant code points
2. Bangla and Gurmukhī Script
Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15 - Bangla and Gurmukhī cross-script variant code points
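The two cross-script tables can be read as a lookup from a Bangla code point to its variants in the other scripts. The sketch below is a hypothetical illustration: CROSS_SCRIPT_VARIANTS and could_collide are assumed names, and the check assumes that two labels collide only when every position is covered by one of the listed variant pairs.

```python
# Variant pairs copied from Tables 14 and 15.
CROSS_SCRIPT_VARIANTS = {
    "\u09AE": {"\u092E", "\u0A38"},  # ম -> म (Devanāgarī), ਸ (Gurmukhī)
    "\u09BF": {"\u093F", "\u0A3F"},  # ি -> ि (Devanāgarī), ਿ (Gurmukhī)
}


def could_collide(bangla_label, other_label):
    """True if other_label is a code-point-by-code-point variant of bangla_label."""
    if len(bangla_label) != len(other_label):
        return False
    return all(
        other in CROSS_SCRIPT_VARIANTS.get(bangla, ())
        for bangla, other in zip(bangla_label, other_label)
    )
```

A label made only of ম and ি would be flagged against its Devanāgarī or Gurmukhī look-alike, while a label containing any other Bangla code point would not.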
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in Bangla12 Script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications.
Below are the symbols used in the WLE rules for each of the Indic Syllabic Category as mentioned in the table provided in Code point repertoire (Section 5.1).
C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9) or (æ) Ya-phalā (V1 H C1 M1), where V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E), H is 09CD (্ - BENGALI SIGN VIRAMA), C1 is 09AF (য - BENGALI LETTER YA), M1 is 09BE (া - BENGALI VOWEL SIGN AA). S1 and S2 are valid even if they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL), H is 09CD (্ - BENGALI SIGN VIRAMA)
Table 16 - Symbols used in WLE rules
12 As used by the Unicode, denoting and including both Assamese and Maṇipuri
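To make the symbol table concrete, the sketch below maps a label to a list of these symbols. It is a rough illustration only: the code point sets are an assumption drawn from the Unicode Bengali block, not the repertoire of Section 5.1, the function name classify is hypothetical, and the compound symbols S and P are not detected.

```python
import unicodedata

# Assumed code point sets (Unicode Bengali block); the real repertoire may differ.
VOWELS = set("\u0985\u0986\u0987\u0988\u0989\u098A\u098B\u098F\u0990\u0993\u0994")
MATRAS = set("\u09BE\u09BF\u09C0\u09C1\u09C2\u09C3\u09C7\u09C8\u09CB\u09CC")
CONSONANTS = (
    {chr(cp) for cp in range(0x0995, 0x09A9)}    # ক .. ন
    | {chr(cp) for cp in range(0x09AA, 0x09B1)}  # প .. র
    | {"\u09B2"}                                 # ল
    | {chr(cp) for cp in range(0x09B6, 0x09BA)}  # শ .. হ
    | {"\u09F0", "\u09F1"}                       # ৰ, ৱ
)
SIGNS = {"\u0982": "B", "\u0981": "D", "\u0983": "X", "\u09CD": "H", "\u09CE": "Z"}
NUKTA = "\u09BC"


def classify(label):
    """Return the WLE symbol for each code point of the NFC-normalized label."""
    symbols = []
    for ch in unicodedata.normalize("NFC", label):
        if ch in CONSONANTS:
            symbols.append("C")
        elif ch == NUKTA and symbols and symbols[-1] == "C":
            continue  # consonant + nukta (e.g. normalized ড়) still counts as one C
        elif ch in VOWELS:
            symbols.append("V")
        elif ch in MATRAS:
            symbols.append("M")
        elif ch in SIGNS:
            symbols.append(SIGNS[ch])
        else:
            raise ValueError(f"U+{ord(ch):04X} is outside the assumed repertoire")
    return symbols
```

For example, চক্র (U+099A U+0995 U+09CD U+09B0) is classified as C, C, H, C.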
It is also perhaps ideal to mention here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters" that could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, Pg. 4]. More details of such combinations could be seen in Pabitra Sarkar (1993) [135].
7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়
2. H must be preceded by C
Example: ক্
3. M must be preceded by C
Example: কা
4. D must be preceded by either of V, C or M
Example: আঁ খঁ খাঁ হাঁ
5. X must be preceded by either of V, C, M or D
Example: উঃ খঃ বঃ াঃ দঁঃ
6. B must be preceded by either of V, C, M or D
Example: আং ইং কং
7. Z must be preceded by V, C, M, D, B, X or P
Example: ইৎ কৎ াৎ াৎ প6ৎ rৎ (S is not listed because S ends with M; Z may also follow S)
8. V CANNOT be preceded by H (details in 7.1.1, Case of V preceded by H)
9. S CANNOT be preceded by H
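Rules 2 to 9 above are all "what may come immediately before" constraints, so they can be sketched as a lookup over a sequence of symbols. This is a minimal sketch, not the normative LGR encoding: the names ALLOWED_BEFORE, FORBIDDEN_BEFORE and check_wle are hypothetical, the input is assumed to be already tokenized into the symbols of Table 16, and P is treated as ending in H for rules 8 and 9 (an assumption that follows from P = C2 H).

```python
# Symbols that MUST be preceded by one of the listed symbols (rules 2-7).
ALLOWED_BEFORE = {
    "H": {"C"},
    "M": {"C"},
    "D": {"V", "C", "M"},
    "X": {"V", "C", "M", "D"},
    "B": {"V", "C", "M", "D"},
    "Z": {"V", "C", "M", "D", "B", "X", "P", "S"},  # S ends with M, so Z may follow S
}
# Symbols that must NOT be preceded by the listed symbols (rules 8-9).
# Assumption: P ends in H, so it is treated like H here.
FORBIDDEN_BEFORE = {"V": {"H", "P"}, "S": {"H", "P"}}


def check_wle(symbols):
    """Return True if the symbol sequence satisfies rules 2 to 9."""
    previous = None  # None marks the start of the label
    for symbol in symbols:
        if symbol in ALLOWED_BEFORE and previous not in ALLOWED_BEFORE[symbol]:
            return False
        if symbol in FORBIDDEN_BEFORE and previous in FORBIDDEN_BEFORE[symbol]:
            return False
        previous = symbol
    return True
```

Because a label-initial symbol has no predecessor, sequences starting with H, M, B, D, X or Z are rejected by the same lookup, and repeated signs such as C B B fail because B is not in the allowed set for B.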
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in adding such ligatures, because any user can easily create them even though they may not be attested combinations.
Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া bæŋk ʌv ɪndiə (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE) (meaning Bank of India). This is the case where two different words are joined together, the first of which ends with an H (অফ্) and the second word begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root zone repertoire. Otherwise V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic communities; hence this is explicitly prohibited by the NBGP.
In future, if required, depending on the prevailing requirements from the community, the future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules ABNF puts in place. These are just given for reference purposes and are not meant to be comprehensive.
1. H, M, B, D or X cannot occur in the beginning of a Bangla word.
Example
্ক ्क
াক ाक
ংক ंक
ঁক ँक
ঃক ःक
As can be seen, such a combination will result automatically in a "golu" or a dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.
2. H is not permitted after V, B, D, X, M, S
Example
অ্ अ्
অং্ कं्
কঁ্ कँ्
কঃ্ कः्
কি্ कि्
3. Number of B, D or X permitted after Consonant or Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalidated:
Example
কংং कंं
কঁঁ कँँ
কঃঃ कःः
কাঁঁ काँँ
কীঃঃ कःः
অংং अंं
অঁঁ अँँ
অঃঃ अःः
4. Number of M permitted after Consonant is restricted to one
Example
কীী की
5. M is not permitted after V
Example
ইা ঈৗ ईा ईौ
6. The combinations of Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible
Example
কংঃ कंः
কঃং कःं
8 Contributors
8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members
8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals' Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje; Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920, 2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. 'Bangla Harapher Panch Parba'. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the MS University of Baroda, Social Science Number 29.2: 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25/11/2017.
[124] Ethnologue. Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25/11/2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20/10/2019.
[126] Wikipedia. Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25/11/2017.
[129] Omniglot. Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10/5/2018.
[130] Wikipedia. Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25/11/2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20, accessed on 10/5/2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10/5/2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10/5/2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1.1: 23-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1.2: 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). 'Srīhatta-Nāgarī Lipir Utpatti o Bikāś'. Bangla Academy Patrika (Dhaka), Vol. 1.2 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia. Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19/5/2018.
[139] Furui, Ryosuke (2015). 'Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century', in Bhairabi Prasad Sahu & Hermann Kulke, eds. Interrogating Political Systems: Integrative Processes and States in Pre-Modern India. Chapter 9, Pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). 'Phones of Bengali'. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646 (UNICODE). https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia. Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. 'Bangla Spelling Reform: the Long and Short of It'. Bangla Journal 19: 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. "What has happened So Far In terms of Script Reforms". Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka on 10/07/2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 - Core Specification. Chapter 12, P. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when applied to a specific language/script, certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions will help to fine-tune the ABNF. In case of Bangla13 in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold 'varga' (as defined under Table 5). Moreover, Bangla does not allow ya (য়) in the beginning of a word either, but we can cite a couple of native examples, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem 'Lichuchor' written by Kazi Nazrul Islam. However, there are instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) in the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) in the beginning of a word, for example ৎসেরিং (Tsering).
2. C H can come with Khanda Ta in only the case where C is ra (র) (09B0): র্ৎ as in ভর্ৎসনা
3. Only the following combinations with V H C M will be allowed:
→ অ্যা (together pronounced as æ) as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ) as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
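Restrictions 1 and 2 above lend themselves to a direct code point check. The sketch below is an assumption-laden illustration: the function name passes_bangla_restrictions is hypothetical, only ৎ, ঞ and ঙ are treated as forbidden label-initial code points (য় is left out because the text itself cites counter-examples), and restriction 3 is not covered.

```python
KHANDA_TA = "\u09CE"  # ৎ
HASANTA = "\u09CD"    # ্
RA = "\u09B0"         # র
FORBIDDEN_INITIAL = {KHANDA_TA, "\u099E", "\u0999"}  # ৎ, ঞ, ঙ


def passes_bangla_restrictions(label):
    """Check restrictions 1 and 2 of the Bangla-specific ABNF fine-tuning."""
    if not label or label[0] in FORBIDDEN_INITIAL:
        return False
    for index, ch in enumerate(label):
        # Restriction 2: Hasanta + Khanda Ta is only accepted after ra (U+09B0).
        if ch == KHANDA_TA and index >= 1 and label[index - 1] == HASANTA:
            if index < 2 or label[index - 2] != RA:
                return False
    return True
```

Under these assumptions ভর্ৎসনা passes, while ৎসেরিং is rejected by restriction 1, which matches the position taken in the rule even though the text notes that such transliterations are beginning to appear.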
13 This section specifically takes up issues of restrictions pertaining to Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
10.2 'Sylheti Nagarī lipi' or 'Siloti'
This version of Bangla script resembles the 'Kaithī' script (ISO 12954) used by the accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, widely in use during the 1880s. There were several other names of Sylheti Nagarī or Siloti (129), such as 'Jalalabada Nagarī', 'Fula (flower) Nagarī', 'Muslim Nagarī' or 'Muhammad Nagarī'. It is said that Shah Jalala had brought the script with him in the 13th-14th century in Sylhet (138), although some suggested that it was an invention by the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here (138).
Table 17 - The Script Table of Sylheti Nagarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable but not enough to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable
Table 18: Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhī
Bangla | Gurmukhī | NBGP Decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable
Table 19: Bangla and Gurmukhī confusable code points
Bangla | Gurmukhī | NBGP Decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable
Table 20 - Bangla and Gurmukhī distinguishable code points
10.3.3 Bangla and Oriya (Odia)
Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable
Table 21 - Bangla and Oriya confusable code points
Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable
Table 22 - Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs
Consonants | Phonetic Value | Allographs: Clusters | Allographs: Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ Thisletterdoesnothaveanyparticularphoneticvaluebutmostlypronouncedasn
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒThesecondarysymbol(allograph)jɔ-phalahastwophoneticvaluesWhenaddedtotheinitialconsonantinaworditisavowelaelig(asinশ8ামলর 8াপারetc)Butafteranon-initialconsonantitjustdoublesitinpronunciation(asinকায6ধায6etc)The+যcombinationhastwophysicalmanifestationsmdashর 8andয6
ক8(p+য)স8(+য)র 8(+য)[Justর 8isneverusedinBanglaorthographyর 8াisbutthenitslasttwosymbolsYa-phalaa-karaconstituteavowelsignrepresentingthevowelঅ8া]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ hwordfinallywordmediallyitdoublesthepronunciationofthefollowingconsonant
অঃকঃ
অব
vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E). For rendering Ya-phalā after অ and এ, it is necessary to type U+09CD plus U+09AF ya preceded by the said vowels. This is a purely ligatural entity, and the addition of Ya-phalā and ā-kāra is used to elicit the æ sound as in English 'bat', 'fat' etc. The Brāhmī script by nature does not have Hasanta after a vowel. Hasanta is generally described as 'vowel killer', although it actually indicates absence of a vowel after the marked consonant. Only the consonants can have the Hasanta marked. But as we see here, Bangla ends up with a deviant feature in the orthography, wherein Hasanta comes immediately after a vowel in the ligatures অ্যা and এ্যা (Cf. Unicode 10.0, p. 473 [100]).
Another case refers to the formation of repha and ra-phalā in the said script, mentioned in the table above as P. Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant (RA-PHALA). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as in অর্ক arka "the sun"); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as in চক্র cakra "cycle"). The point is that in both the cases the slot for ra could be Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed/preceded by the common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā in both the cases remain the same.
6 Variants
This section talks about the variants in the Bangla script. The NBGP categorizes these confusing variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formations. These can cause confusion even to a careful observer and hence are being proposed as variants.
The variants are generated in a script when two or more forms are formed with different storage or code points. In Bangla, the e-kāra, ā-kāra and the o-kāra have different code points. One can type o with a consonant at one go, and the same by typing e-kāra and ā-kāra as two separate keys, getting the same results. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other one typed with two different keys. However, this will not be considered as a case of variant, because a kāra followed by a kāra is not allowed.
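The two ways of typing ko can be compared at the code point level. The following is a small sketch using Python's standard unicodedata module; the helper name same_after_nfc is illustrative. It shows that the two stored sequences differ as typed, and that Unicode NFC normalization composes e-kāra (U+09C7) followed by ā-kāra (U+09BE) into the single o-kāra (U+09CB).

```python
import unicodedata

KA = "\u0995"       # ক
E_KARA = "\u09C7"   # ে
AA_KARA = "\u09BE"  # া
O_KARA = "\u09CB"   # ো

one_key = KA + O_KARA             # ko typed with the single o-kāra key
two_keys = KA + E_KARA + AA_KARA  # ko typed as e-kāra followed by ā-kāra


def same_after_nfc(first, second):
    """Compare two strings after Unicode NFC normalization."""
    return unicodedata.normalize("NFC", first) == unicodedata.normalize("NFC", second)
```

Here one_key and two_keys are different strings as stored, but same_after_nfc(one_key, two_keys) is true.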
6.1 In-Script Variants
However, we propose two cases of true in-script variants in Bangla script.
CASE I
As far as true variants in Bangla are concerned, we may draw our attention to cases wherein Hasanta with (U+09A5) থ (tha) appears as conjunct with (U+09B8) স (sa) and (U+09A8) ন (na):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, could be a little confusing, where the থ (tha) in conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final positions (as shown below in e.g. no. 1). It could be typed wrong as well, thinking it was a হ (ha) U+09B9, increasing the chances of risks in label writing and identification.
Examples
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2 andনহ(asinparagrantha)The fontswhichrepresent traditionalBanglawritingsystemcould tend tocreate thisproblemThereforethesemaybetakenascasesofvariantsinBanglaCASEIIAnotherinterestingexampleofvariantisencounteredinra+HasantaandHasanta+racombinationsinwritinglabelsintheBanglascript(forlanguagessuchasBanglaAssameseandManipuri)Thevariantcasesariseintypinglsquorepharsquo(involvingra+Hasanta)andlsquora-phalārsquo(involvingHasanta+ra)lsquoRepharsquocouldbeformedbytwosequences(mainlybecausebothAssameseandBanglafindplaceinthesameUNICODEpointsandlsquoB_RArsquoaswellaslsquoA_RArsquorefertothesamephoneticelement)Herethefinalligatureslookthesameandwillbeasfollows
(1) B_RA+H+C(2) A_RA+H+C
Where
B_RA = U+09B0BENGALILETTERRA(র)orA_RA = U+09F0BENGALILETTERRAWITHMIDDLEDIAGONAL(ৰ)H = U+09CDBENGALISIGNVIRAMA()C = anyconsonant(theoretically)
Example
Sequence1(UsingBanglaRA)
Ligature1
Sequence2(UsingAssameseRA)
Ligature2
U+09B0(র)U+09CD()U+0995(ক) কH U+09F0(ৰ)U+09CD()U+0995(ক) কH
U+09B0(র)U+09CD()U+09A0(ঠ) ঠH U+09F0(ৰ)U+09CD()U+09A0(ঠ) ঠH
Table12ExampleofRepha
45
NoteAsBanglaandAssameseকandঠlookexactlythesametheresultantcombinationswithRephalookidenticalAdditionofRephadoesnotmakeanydifferencelsquoRa-phalarsquocouldbeformedbytwosequencesonsimilargroundsandthefinalligatureswouldlookthesame
(1) C1+H+B_RA(2) C1+H+A_RA
WhereC1 = anyconsonantsexceptKhanda-ta
Example
Sequence1(UsingBanglaRA)
Ligature1
Sequence2(UsingAssameseRA)
Ligature2
U+0995(ক)U+09CD()U+09B0(র) U+0995(ক)U+09CD()U+09F0(ৰ)
U+09A8(ন)U+09CD()U+09B0(র) ) U+09A8(ন)U+09CD()U+09F0(ৰ) )
Table13ExampleofRa-phala
AstheAssameseandBanglaRephaandRa-phalaconjunctformslookthesamethiscouldcauseconfusabilitytotheend-usersHencetherephaandra-phalacasesneedtobedefinedasvariantsNBGPconcludedtodefineরandৰasvariantcodepointswhereonlyonevariantsetbetweenরandৰcouldcoverallcasesButthiswillcreateblockedvariantlabelsegif
someoneregistersldquoরররrdquothevariantlabelldquoৰৰৰrdquowillbegeneratedasvariantandwillbeblockedandviceversaHoweveritisonlyblockedatthelabellevelifsomeoneelseneedstoregisterotherlabelsegৰৰorৰৰৰৰitisstillpossible
62 CrossScriptVariantsAcrispcrossscriptstudyforBanglahasbeendonewithrespecttosisterscriptssuchasDevanāgarī Gurmukhı and Odia11(formerly Oriya) keeping in mind the visual andtechnicalconfusionstheymaycauseaslabelsonthewebdomainMoreoverthereisnoin-script variant in Bangla as far as the orthography is concerned The followingcharacters are being proposed by the NBPG as variants Although there are certaincharacters which are somewhat similar they but have not been included here TheyhavebeenprovidedintheAppendix(102)forreference
11 Unicode uses Oriya for the script although Odia is now the official term used
46
1 BanglaandNāgarīDevanāgarīScript
Bangla Devanāgarī
মU+09AE
मU+092E
িU+09BF
िU+093F
Table14-BanglaandDevanāgarīcross-scriptvariantcodepoint
2 BanglaandGurmukhiScript
Bangla Gurmukhı
মU+09AE
ਸU+0A38
িU+09BF
ਿU+0A3F
Table15-BanglaandGurmukhıcross-scriptvariantcodepoint
7 WholeLabelEvaluationRules(WLE) ThissectionprovidestheWLEsthatarerequiredbyallthelanguagesmentionedinsection32whenwritten inBangla12ScriptThe ruleshavebeendrafted in suchawaythattheycanbeeasilytranslatedintotheLGRspecificationsBelow are the symbols used in the WLE rules for each of the Indic SyllabicCategoryasmentionedinthetableprovidedinCodepointrepertoire(Section51)
C rarr Consonant
M rarr Kāra(Mātrā)
V rarr Vowel
B rarr Anusvāra
12 As used by the Unicode denoting and including both Assamese and Maṇipuri
47
D rarr Candrabindu
X rarr Visarga
H rarr Hasanta
Z rarr KhandaTa
S rarr S1S2(fromTable9)or(ae)Ya-phalā(V1HC1M1)whereV1iseither0985(অ-BENGALILETTERA)or098F(এ-BENGALILETTERE)His09CD(-BENGALISIGNVIRAMA)C1is-09AF(য-BENGALILETTERYA)M1is-09BE(া-BENGALIVOWELSIGNAA)S1 and S2 are valid even they are not allowed by the other context rules
P rarr Ra-Hasanta(C2H)whereC2iseither09B0(র-BENGALILETTERRA)or 09F0 (ৰ - ASSAMESE LETTER RA
Unicode name BENGALI LETTER RAWITHMIDDLEDIAGONAL)
His09CD(-BENGALISIGNVIRAMA)
Table16-SymbolsusedinWLErules
ItisalsoperhapsidealtomentionherethatinBanglatheconsonantletters(orgraphemes)arephysicallyjoinedtoformldquoclustersrdquothatcouldtheoreticallyconjoinfromtwotofourconsonantsandcombinetocreatenewshapesDashandChaudhuri(1998)statethatthereareldquonearly380uniqueconsonantclustersrdquooutofwhichBi-consonantalcombinationsare290three-lettercombinationsaccountforanother80andtherareroneswithfourlettersnumber10more[136Pg4]MoredetailsofsuchcombinationscouldbeseeninPabitraSarkar(1993)[135] 71FinalSetofWLERulesTheprevalentpatternsinBanglaandvariousrestrictionsbelowarethespecificWLErulesthatneedtobeimplemented
48
1 CisasetofCandCNwhereCNisthesetofnormalizedformsofড়ঢ়য়2 HmustbeprecededbyC
Example
3 MmustbeprecededbyCExampleকা
4 DmustbeprecededbyeitherofVCorMExampleআখখা হা
5 XmustbeprecededbyeitherofVCMorDExampleউঃখঃবঃাঃ দ ঃ
6 BmustbeprecededbyeitherofVCMorDExampleআংইংকং
7 ZmustbeprecededbyVCMDBXorPExampleইৎকৎাৎাৎপ6ৎrৎ (S is not listed because S ends with M Z may also follow S)rdquo
8 VCANNOTbeprecededbyHDetailsin711CaseofVprecededbyH
9 SCANNOTbeprecededbyH
Now letuselaborateeach rulewithexamples from the scriptkeeping inmindthe Bangla Assamese and Manipuri communities Some combinations ofcharactersmayseemunrealisticorrareinusagebutthereisnoharminaddingsuch ligaturesbecause it ispossible tocreate thembyanyusereasilybutmaynotbeattestedcombinationsCase of V Preceded by H There could be cases involvingmulti-word domains where Vmay need to beallowedtofollowanHegব8াsঅtইিuয়া baeligŋkʌv ɪndiə (U+09ACU+09CDU+09AFU+09BEU+0999U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1U+09BFU+09DFU+09BE)(meaningBankofIndia)ThisisthecasewheretwodifferentwordsarejoinedtogetherfirstofwhichendswithanH(অt)andthesecondwordbeginswithaV(ইিuয়া)Somesectionsofthe linguistic community require the explicit presence of H for fullrepresentation of the sound intended However by and large the form of thefirstwordwithoutanH(U+09CD)isconsideredenoughforfullrepresentationofthesoundintendedforthefirstword
49
This isauniquesituationnecessitatedbythelackofhyphenspaceortheZeroWidthNon-joinercharacterinthepermissiblesetofcharactersintheRootzonerepertoire Otherwise V is never required to be allowed to follow an HPermittingthismaycreateaperceptivesimilaritybetweentwolabels(withandwithout H) for majority of the linguistic communities hence this is explicitlyprohibitedbytheNBGPIn future if required depending on the prevailing requirements from thecommunitythefutureNBGPmayconsiderrevisitingthisrule
72 AdditionalExamplesfromBanglaABNF
Belowarea fewexampleswhichhelponeunderstandsomeof the rulesABNFputsinplaceThesearejustgivenforreferencepurposesandarenotmeanttobecomprehensive
1HMBDorXcannotoccurinthebeginningofaBanglawordExample
ক क
াক ाक
ংক क
ক क
ঃক ःकAscanbeseensuchcombinationwillresultautomaticallyinaldquogolurdquooradottedcircle marking it as an invalid formation This is an intrinsic property of theIndianlanguagesyllableandisquasi-automaticallyappliedwhereversupportedbytheOS
2HisnotpermittedafterVBDXMS
Example
অ अ
অং क ক क কঃ कः ি )क
3NumberofBDorXpermittedafterConsonantorVowelorakāra(Mātrā)isrestrictedtoonethusthefollowingcombinationsareinvalidated
50
Example
কংং क
ক क
কঃঃ कःः
কা का
কীঃঃ कःः
অংং अ
অ अ
অঃঃ अःः
4. The number of M permitted after a consonant is restricted to one.
Example
কীী की
5. M is not permitted after V.
Example
ইাঈৗ ईाईौ
6. The combinations Anusvāra + Visarga and Visarga + Anusvāra are not permissible.
Example
কংঃ कः
কঃং कः
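The six example constraints above can be sketched as regular-expression checks. This is a minimal illustration under stated assumptions, not the normative ABNF: the function name `violated_rules` is hypothetical, the character classes are approximated by code point ranges chosen for this sketch, rule 3 is simplified to "the same sign repeated", and the sequence S mentioned in rule 2 is not modelled.

```python
import re

# Assumed character classes (illustrative; the normative sets are in the LGR).
H = "\u09CD"                                  # Hasanta
B, D, X = "\u0982", "\u0981", "\u0983"        # Anusvāra, Candrabindu, Visarga
M = "\u09BE-\u09C4\u09C7\u09C8\u09CB\u09CC"   # kāra (mātrā) signs
V = "\u0985-\u0994"                           # independent vowels

# One pattern per numbered example rule; a match means the rule is violated.
CHECKS = {
    1: re.compile(f"^[{H}{M}{B}{D}{X}]"),     # sign at the start of the label
    2: re.compile(f"[{V}{B}{D}{X}{M}]{H}"),   # H after V, B, D, X or M
    3: re.compile(f"([{B}{D}{X}])\\1"),       # the same B, D or X repeated
    4: re.compile(f"[{M}][{M}]"),             # two kāras in a row
    5: re.compile(f"[{V}][{M}]"),             # kāra after an independent vowel
    6: re.compile(f"{B}{X}|{X}{B}"),          # Anusvāra next to Visarga
}


def violated_rules(label: str) -> list[int]:
    """Return the numbers of the example rules that the label violates."""
    return [number for number, pattern in CHECKS.items() if pattern.search(label)]


print(violated_rules("\u09BE\u0995"))        # [1]
print(violated_rules("\u0995\u0982\u0983"))  # [6]
print(violated_rules("\u0995\u09BE"))        # []
```

Keeping one pattern per rule makes it easy to report which constraint rejected a label, at the cost of scanning the label several times.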
8 Contributors
8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members
8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals’ Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata; New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje & Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. ‘Bangla Harapher Panch Parba’. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the MS University of Baroda, Social Science Number, 29(2): 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25/11/2017.
[124] Ethnologue: Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25/11/2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20/10/2019.
[126] Wikipedia: Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25/11/2017.
[129] Omniglot: Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10/5/2018.
[130] Wikipedia: Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25/11/2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20/, accessed on 10/5/2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10/5/2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10/5/2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1123-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1(2): 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). ‘Srīhatta-Nāgarī Lipir Utpatti o Bikāś’. Bangla Academy Patrika (Dhaka), Vol 12 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia: Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19/5/2018.
[139] Furui, Ryosuke (2015). ‘Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century’, in Bhairabi Prasad Sahu & Hermann Kulke, eds. Interrogating Political Systems: Integrative Processes and States in Pre-Modern India, Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). ‘Phones of Bengali’. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646. UNICODE. https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia: Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. ‘Bangla Spelling Reform: the Long and Short of It’. Bangla Journal 19: 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. “What has happened So Far In terms of Script Reforms”. Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka on 10/07/2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 – Core Specification, Chapter 12, p. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language or script, certain restriction rules apply. In other words, some of the Formalism structures do not necessarily apply in a given language. Restriction rules are set in place to take care of such cases; they help to fine-tune the ABNF. In the case of Bangla¹³ in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold ‘varga’ (as defined under Table 5). Bangla does not normally allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem ‘Lichuchor’ written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names into Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2. CH can come with Khaṇḍa ta only in the case where C is ra (র) (U+09B0):
র্ৎ as in ভর্ৎসনা
3. Only the following combinations with VHCM will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
10.2 ‘Sylheti Nagarī lipi’ or ‘Siloti’
This version of the Bangla script resembles the ‘Kaithī’ script (ISO 15924) used by accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, which was widely in use during the 1880s. There were several other names for Sylheti Nagarī or Siloti (129), such as ‘Jalalabada Nagarī’, ‘Fula (flower) Nagarī’, ‘Muslim Nagarī’ or ‘Muhammad Nagarī’. It is said that Shah Jalal brought the script with him to Sylhet in the 13th-14th century (138), although some have suggested that it was an invention of the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here (138).
Table 17 – The Script Table of Sylheti Nagarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla        Devanāgarī    NBGP Decision
ঃ U+0983      ः U+0903      Confusable
ও U+0993      उ U+0909      Confusable
ঘ U+0998      घ U+0918      Confusable
ঁ U+0981      ॅ U+0945      Confusable
Table 18 – Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhī
Bangla        Gurmukhī      NBGP Decision
ঘ U+0998      ਬ U+0A2C      Confusable
ঁ U+0981      ੱ U+0A71      Confusable
Table 19 – Bangla and Gurmukhī confusable code points
Bangla                  Gurmukhī      NBGP Decision
ও U+0993                ਤ U+0A24      Distinguishable
শ U+09B6                ਅ U+0A05      Distinguishable
ম U+09AE                ਮ U+0A2E      Distinguishable
বা U+09AC and U+09BE    ਗ U+0A17      Distinguishable
Table 20 – Bangla and Gurmukhī distinguishable code points
10.3.3 Bangla and Oriya (Odia)
Bangla        Oriya (Odia)  NBGP Decision
ও U+0993      ଓ U+0B13      Confusable
Table 21 – Bangla and Oriya confusable code points
Bangla        Oriya (Odia)  NBGP Decision
ঘ U+0998      ସ U+0B38      Distinguishable
Table 22 – Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs
Consonants    Phonetic Value    Allographs
Clusters    Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value but is mostly pronounced as n
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒ. The secondary symbol (allograph) jɔ-phalā has two phonetic values. When added to the initial consonant in a word, it is a vowel æ (as in শ্যামল, র‍্যাপার etc.). After a non-initial consonant, however, it just doubles that consonant in pronunciation (as in কার্য, ধার্য etc.). The র + য combination has two physical manifestations: র‍্য and র্য.
ক্য (ক্ + য), স্য (স্ + য), র‍্য (র্ + য) [র‍্য on its own is never used in Bangla orthography; র‍্যা is, but then its last two symbols, Ya-phalā and ā-kāra, constitute a vowel sign representing the vowel অ্যা]
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃকঃ
অব
6 Variants
This section discusses the variants in the Bangla script. The NBGP categorizes these confusing variants into two groups:
Group 1: Confusing due to pure visual similarity.
Group 2: Confusing due to deviation from the character formations normally perceived by the larger linguistic community.
For Group 1, any identical code points are defined as variants. The confusable but not identical cases are not proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. However, cases which belong to Group 2 are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Bangla Akshar formation. They can cause confusion even to a careful observer and hence are being proposed as variants.
Variants are generated in a script when two or more forms are produced with different storage or code points. In Bangla, the e-kāra, the ā-kāra and the o-kāra have different code points. One can type o with a consonant at one go, or obtain the same result by typing e-kāra and ā-kāra as two separate keys. A reader cannot differentiate between the two ko (কো), one typed with a single key and the other typed with two different keys. However, this will not be considered a case of variant, because a kāra followed by a kāra is not allowed.
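The two ways of typing ko can be compared directly at the code point level. The sketch below relies on the fact that U+09CB (o-kāra) has the canonical decomposition U+09C7 U+09BE in the Unicode Character Database, so that NFC normalization maps the two-key sequence onto the single code point; the variable names are illustrative only.

```python
import unicodedata

KA = "\u0995"        # BENGALI LETTER KA
E_KARA = "\u09C7"    # BENGALI VOWEL SIGN E
AA_KARA = "\u09BE"   # BENGALI VOWEL SIGN AA
O_KARA = "\u09CB"    # BENGALI VOWEL SIGN O

typed_with_one_key = KA + O_KARA
typed_with_two_keys = KA + E_KARA + AA_KARA

# The raw code point sequences differ although they render identically.
print(typed_with_one_key == typed_with_two_keys)   # False

# After NFC normalization the two-key form becomes the single o-kāra.
normalized = unicodedata.normalize("NFC", typed_with_two_keys)
print(normalized == typed_with_one_key)            # True
```

Because the two-key sequence is a kāra followed by a kāra, it is rejected by the rules of this proposal in any case, which is consistent with not treating the pair as variants.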
6.1 In-Script Variants
However, we propose two cases of true in-script variants in the Bangla script.
CASE I
As far as true variants in Bangla are concerned, we may draw attention to cases wherein থ (tha, U+09A5) appears with Hasanta as a conjunct with স (sa, U+09B8) and ন (na, U+09A8):
1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)
The above combinations, if written in traditional orthography, could be a little confusing, since the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final position (as shown below in example no. 1). It could also be typed wrongly, on the assumption that it was a হ (ha, U+09B9), increasing the risk of errors in label writing and identification.
Examples
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)
Fonts which represent the traditional Bangla writing system tend to create this problem. Therefore, these may be taken as cases of variants in Bangla.
CASE II
Another interesting example of variants is encountered in the ra + Hasanta and Hasanta + ra combinations when writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing ‘repha’ (involving ra + Hasanta) and ‘ra-phalā’ (involving Hasanta + ra).
‘Repha’ could be formed by two sequences (mainly because both Assamese and Bangla are encoded in the same Unicode block, and ‘B_RA’ as well as ‘A_RA’ refer to the same phonetic element). Here the final ligatures look the same and are as follows:
(1) B_RA + H + C
(2) A_RA + H + C
where
B_RA = U+09B0 BENGALI LETTER RA (র), or
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)
Example
Sequence 1 (using Bangla RA)          Ligature 1    Sequence 2 (using Assamese RA)        Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক)      র্ক           U+09F0 (ৰ) U+09CD (্) U+0995 (ক)      ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ)      র্ঠ           U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ)      ৰ্ঠ
Table 12 – Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical; the addition of Repha does not make any difference.
‘Ra-phalā’ could be formed by two sequences on similar grounds, and the final ligatures would look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
where C1 = any consonant except Khaṇḍa ta
Example
Sequence 1 (using Bangla RA)          Ligature 1    Sequence 2 (using Assamese RA)        Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র)      ক্র           U+0995 (ক) U+09CD (্) U+09F0 (ৰ)      ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র)      ন্র           U+09A8 (ন) U+09CD (্) U+09F0 (ৰ)      ন্ৰ
Table 13 – Example of Ra-phalā
As the Assamese and Bangla Repha and Ra-phalā conjunct forms look the same, this could cause confusion for end users. Hence the repha and ra-phalā cases need to be defined as variants. The NBGP concluded that র and ৰ should be defined as variant code points, so that a single variant set between র and ৰ covers all cases. This will, however, create blocked variant labels: for example, if someone registers “ররর”, the variant label “ৰৰৰ” will be generated as a variant and will be blocked, and vice versa. The blocking applies only at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, that is still possible.
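The effect of a single variant set between র and ৰ can be sketched by enumerating the variant labels of a registered label. This is an illustration only: the function name `variant_labels` is hypothetical, and enumerating every permutation of the two code points is an assumption about how variant labels are generated, not a statement of the normative LGR processing.

```python
from itertools import product

# The single variant set discussed above: Bangla RA and Assamese RA.
VARIANT_SET = ("\u09B0", "\u09F0")


def variant_labels(label: str) -> set[str]:
    """Return every label obtained by swapping র and ৰ in any position."""
    choices = [VARIANT_SET if ch in VARIANT_SET else (ch,) for ch in label]
    return {"".join(combo) for combo in product(*choices)} - {label}


registered = "\u09B0\u09B0\u09B0"      # ররর
blocked = variant_labels(registered)

print("\u09F0\u09F0\u09F0" in blocked)  # True: ৰৰৰ is a blocked variant
print("\u09F0\u09F0" in blocked)        # False: ৰৰ is a different label
print(len(blocked))                     # 7
```

Labels of a different length never appear in the result, which matches the observation that blocking applies only at the label level.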
6.2 Cross-Script Variants
A concise cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia¹¹ (formerly Oriya), keeping in mind the visual and technical confusion they may cause as labels in web domains. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain characters which are somewhat similar, they have not been included here; they are provided in the Appendix (Section 10.3) for reference.
11 Unicode uses Oriya for the script, although Odia is now the official term used.
1. Bangla and Nāgarī (Devanāgarī) Script
Bangla        Devanāgarī
ম U+09AE      म U+092E
ি U+09BF      ि U+093F
Table 14 – Bangla and Devanāgarī cross-script variant code points
2. Bangla and Gurmukhī Script
Bangla        Gurmukhī
ম U+09AE      ਸ U+0A38
ি U+09BF      ਿ U+0A3F
Table 15 – Bangla and Gurmukhī cross-script variant code points
7 Whole Label Evaluation Rules (WLE)
This section provides the WLE rules that are required by all the languages mentioned in Section 3.2 when written in the Bangla¹² script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications.
Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories mentioned in the table provided under Code point repertoire (Section 5.1).
C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khaṇḍa ta
S → S1, S2 (from Table 9) or (a/e) Ya-phalā (V1 H C1 M1), where V1 is either U+0985 (অ, BENGALI LETTER A) or U+098F (এ, BENGALI LETTER E), H is U+09CD (্, BENGALI SIGN VIRAMA), C1 is U+09AF (য, BENGALI LETTER YA) and M1 is U+09BE (া, BENGALI VOWEL SIGN AA). S1 and S2 are valid even though they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either U+09B0 (র, BENGALI LETTER RA) or U+09F0 (ৰ, Assamese letter ra; Unicode name BENGALI LETTER RA WITH MIDDLE DIAGONAL) and H is U+09CD (্, BENGALI SIGN VIRAMA)
Table 16 – Symbols used in WLE rules
12 As used by Unicode, denoting and including both Assamese and Maṇipuri.
It is also perhaps worth mentioning here that in Bangla the consonant letters (or graphemes) are physically joined to form “clusters”, which can theoretically conjoin two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are “nearly 380 unique consonant clusters”, of which bi-consonantal combinations number 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, p. 4]. More details of such combinations can be found in Pabitra Sarkar (1993) [135].
7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ় and য়.
2. H must be preceded by C.
Example
3. M must be preceded by C.
Example: কা
4. D must be preceded by either of V, C or M.
Example: আখখা হা
5. X must be preceded by either of V, C, M or D.
Example: উঃখঃবঃাঃ দ ঃ
6. B must be preceded by either of V, C, M or D.
Example: আং, ইং, কং
7. Z must be preceded by V, C, M, D, B, X or P. (S is not listed because S ends with M; Z may also follow S.)
Example: ইৎকৎাৎাৎপ6ৎrৎ
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V preceded by H.
9. S CANNOT be preceded by H.
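The "must be preceded by" rules above lend themselves to a table-driven check. The following is a minimal sketch under several assumptions that are not part of the proposal: the names `classify`, `ALLOWED_BEFORE` and `is_valid` are hypothetical, the classes are approximated by code point ranges, the nukta (U+09BC) is counted as C so that the normalized forms in CN pass, and the sequence S (rule 9) is not modelled.

```python
RA_LETTERS = "\u09B0\u09F0"  # C2 in the definition of P (Ra-Hasanta)

# Classes that may immediately precede each sign (rules 2 to 7).
ALLOWED_BEFORE = {
    "H": "C", "M": "C", "D": "VCM", "X": "VCMD", "B": "VCMD", "Z": "VCMDBX",
}


def classify(ch: str) -> str:
    """Map a code point to a WLE symbol (approximate, illustrative ranges)."""
    cp = ord(ch)
    if cp == 0x09CD:
        return "H"
    if cp == 0x09CE:
        return "Z"
    if cp in (0x0981, 0x0982, 0x0983):
        return {0x0981: "D", 0x0982: "B", 0x0983: "X"}[cp]
    if 0x0985 <= cp <= 0x0994:
        return "V"
    if 0x09BE <= cp <= 0x09CC:
        return "M"
    # Assumption: nukta completes a normalized consonant, so it counts as C.
    if 0x0995 <= cp <= 0x09B9 or cp in (0x09BC, 0x09F0, 0x09F1):
        return "C"
    return "?"


def is_valid(label: str) -> bool:
    """Check rules 2 to 8 for every position in the label."""
    classes = [classify(ch) for ch in label]
    for i, cls in enumerate(classes):
        prev = classes[i - 1] if i else ""
        if cls in ALLOWED_BEFORE:
            ok = prev != "" and prev in ALLOWED_BEFORE[cls]
            # Rule 7: Z may also follow P, i.e. a ra letter plus Hasanta.
            if cls == "Z" and prev == "H" and i >= 2 and label[i - 2] in RA_LETTERS:
                ok = True
            if not ok:
                return False
        elif cls == "V" and prev == "H":   # rule 8
            return False
        elif cls == "?":                   # outside the assumed repertoire
            return False
    return True


print(is_valid("\u0995\u09BE"))                              # True
print(is_valid("\u09AD\u09B0\u09CD\u09CE\u09B8\u09A8\u09BE"))  # True
print(is_valid("\u0985\u09AB\u09CD\u0987"))                  # False
```

A real implementation would express these constraints as context rules in the machine-readable LGR file rather than in application code; the table form is used here only because it mirrors the wording of the rules.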
Now letuselaborateeach rulewithexamples from the scriptkeeping inmindthe Bangla Assamese and Manipuri communities Some combinations ofcharactersmayseemunrealisticorrareinusagebutthereisnoharminaddingsuch ligaturesbecause it ispossible tocreate thembyanyusereasilybutmaynotbeattestedcombinationsCase of V Preceded by H There could be cases involvingmulti-word domains where Vmay need to beallowedtofollowanHegব8াsঅtইিuয়া baeligŋkʌv ɪndiə (U+09ACU+09CDU+09AFU+09BEU+0999U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1U+09BFU+09DFU+09BE)(meaningBankofIndia)ThisisthecasewheretwodifferentwordsarejoinedtogetherfirstofwhichendswithanH(অt)andthesecondwordbeginswithaV(ইিuয়া)Somesectionsofthe linguistic community require the explicit presence of H for fullrepresentation of the sound intended However by and large the form of thefirstwordwithoutanH(U+09CD)isconsideredenoughforfullrepresentationofthesoundintendedforthefirstword
49
This isauniquesituationnecessitatedbythelackofhyphenspaceortheZeroWidthNon-joinercharacterinthepermissiblesetofcharactersintheRootzonerepertoire Otherwise V is never required to be allowed to follow an HPermittingthismaycreateaperceptivesimilaritybetweentwolabels(withandwithout H) for majority of the linguistic communities hence this is explicitlyprohibitedbytheNBGPIn future if required depending on the prevailing requirements from thecommunitythefutureNBGPmayconsiderrevisitingthisrule
72 AdditionalExamplesfromBanglaABNF
Belowarea fewexampleswhichhelponeunderstandsomeof the rulesABNFputsinplaceThesearejustgivenforreferencepurposesandarenotmeanttobecomprehensive
1HMBDorXcannotoccurinthebeginningofaBanglawordExample
ক क
াক ाक
ংক क
ক क
ঃক ःकAscanbeseensuchcombinationwillresultautomaticallyinaldquogolurdquooradottedcircle marking it as an invalid formation This is an intrinsic property of theIndianlanguagesyllableandisquasi-automaticallyappliedwhereversupportedbytheOS
2HisnotpermittedafterVBDXMS
Example
অ अ
অং क ক क কঃ कः ি )क
3NumberofBDorXpermittedafterConsonantorVowelorakāra(Mātrā)isrestrictedtoonethusthefollowingcombinationsareinvalidated
50
Example
কংং क
ক क
কঃঃ कःः
কা का
কীঃঃ कःः
অংং अ
অ अ
অঃঃ अःः
4NumberofMpermittedafterConsonantisrestrictedtooneExample
কীী की5MisnotpermittedafterV Example
ইাঈৗ ईाईौ6ThecombinationsofAnusvāra+ VisargaaswellasVisarga+Anusvāraarenotpermissible
Example
কংঃ कः
কঃং कः
8 Contributors
81ExpertsfromIndia ProfessorUdayaNarayanaSinghChair-ProfessorofLinguisticsampDeanFacultyofArtsAmityUniversityHaryanaGurgaonPachgaonManesarPIN122431(Haryana)India
51
ProfessorPabitraSarkarformerlyVice-ChancellorRabindraBharatiUniversityKolkataDrAtiurRahmanKhanPrincipalTechnicalOfficerGISTGroupC-DACPunePIN411008(Maharashtra)IndiaMrRajibChakrabortyLinguistSocietyforNaturalLanguageTechnologyResearch(SNLTR)Module114amp130SDFBuildingSaltLakeSector-VKolkata-700091(WestBengal)IndiaMrAkshatJoshiProjectEngineerGISTGroupC-DACPunePIN411008(Maharashtra)IndiaMsMoumitaChowdhurySeniorTechnicalOfficerGISTGroupC-DACPunePIN411008(Maharashtra)IndiaMrChandrakantaMurasinghAgartalaTripuraSomeotherNBGPmembers
82ContributorsfromBangladeshJanabMustafaJabbarHonorableMinisterMinistryofPostsTelecommunicationsampInformationTechnologyGovtofBangladeshProfShamsuzzamanKhanFormerDirector-GeneralBanglaAcademyDhakaProfRafiqulIslamNationalProfessorofHumanitiesDhakaProfSwarochisSarkarDirectorInstituteofBangladeshStudiesRajshahiUniversityRajshahiBangladeshProfJinnatImtiazAliDirector-GeneralInternationalMotherLanguageInstituteDhakaMrMohammadMamunOrRashidDepartmentofBanglaJahangirnagarUniversityampMemberBangladeshComputerCouncilProfManiruzzamanformerlyProfessorChittagongUniversityChattagramBangladeshMrShyamSunderSikderSecretarySecretaryPostampTelecommunicationsDivisionGovtofBangladesh
52
MrMdMustafaKamalFormerDirectorGeneralBangladeshTelecommunicationsRegulatoryAuthorityGovernmentofBangladeshDhakaBrigadierGeneralMdMahfuzulKarimMajumderDirector-GeneralEngineeringampOperationsDivisionBangladeshTelecommunicationsRegulatoryAuthorityGovernmentofBangladeshDhakaMdZiarulIslamProgrammerPostsampTelecommunicationsDivisionGovernmentofBangladeshDhakaProfSyedShahriyarRahmanDepartmentofLinguisticsUniversityofDhakaDrMizanurRahmanDirector(In-Charge)TranslationTextBookandInternationalRelationsDivisionBanglaAcademyDhakaDrApareshBandyopadhyayDirectorBanglaAcademyDhakaMrMdMobarakHossainDirectorBanglaAcademyDhakaDrJalalAhmedDirectorBanglaAcademyDhakaMrJahangirHossainInternetSocietyBangladesh(ICANNALS)JanabSarwarMostafaChoudhuryBangladeshComputerCouncilDhakaJanabMdRashidWasifBangladeshComputerCouncilDhakaJanabIstiaqueArifSeniorAssistantDirectorBangladeshTelecommunicationsRegulatoryAuthorityDhakaMsAfifaAbbasInformationSecurityandGovernanceLeadEngineeratBanglalinkandICANNFellowMr Mohammad Abdul Haque Secretary General Bangladesh Internet GovernanceForumMrImranHossenCEOEyeSoftandkeymemberofBangladeshAssociationofSoftwareampInformationServices(BASIS)MsShahidaKhatunDirectorFolkloreMuseumampArchiveDivisionBanglaAcademyDhakaMrSyedAshikRehmanCEOBengalMediaCorporationDhakaMrHaseebRahmanCEOProfessionalsrsquoSystemsDhaka
53
9 References[100] UnicodeConsortium2017UnicodeStandard100MountainViewCA
[101] BandyopadhyayChittaranjan1981DuiShatakerBanglaMudranoPrakashanKolkataAnandaPublishers
[102] BanerjiRD1919TheOriginoftheBengaliScriptKolkataNewDelhiAsianEducationalServices2003reprint
[103] ChatterjiSK1926TheOriginandDevelopmentoftheBengaliLanguageCalcuttaCalcuttaUniversityPress
[104] -----1939Bhasha-prakashBangalaVyakaran(AGrammaroftheBengaliLanguage)CalcuttaUniversityofCalcutta
[105] HaiMuhammadAbdul1964DhvaniVijnanOBanglaDhvani-tattwa(PhoneticsandBengaliPhonology)DhakaBanglaAcademy
[106]JhaSubhadra1958TheFormationofMaithiliLondonLuzacampCo
[107] KosticDjordjeDasRheaS1972AShortOutlineofBengaliPhoneticsCalcuttaStatisticalPublishingCompany
[108] MajumdarRC1971HistoryofAncientBengalCalcuttaGBhardwaj
[109] MazumdarBijaychandra19202000TheHistoryoftheBengaliLanguage(ReprCalcutta1920ed)NewDelhiAsianEducationalServices
[110] PandeyAnshuman2001ProposaltoEncodetheTirhutaScriptinISOIEC10646
[111] PalPalashBaran2001DhwanimalaBarnamalaKolkataPapyrus
[112] -----2007lsquoBanglaHarapherPanchParbarsquoInSwapanChakrabortyedMudranerSanskritiOBanglaBoiKolkataAbabhas
[113] RossFiona1999ThePrintedBengaliCharacteranditsEvolutionLondonCurzon
[114] ShastriMahamahopadhyayHaraPrasad1916HājārBacharērPurāṇaBāṅgālāBhāṣāyBauddhaGānōDōhāCalcuttaBangiyaSahityaParishat
[115] SinghUdayaNarayana(JointlyManiruzzaman)1983DiglossiainBangladeshandlanguageplanningCalcuttaGyanBharati214pp
[116] -----1987ABibliographyofBengaliLinguisticsMysoreCIILxii+316pp
[117] -----2017(withRajibChakrabortyBidishaBhattacharjeeampArimardanKumarTripathy)LanguagesandCulturesontheMarginGuidelinesforFieldworkonEndangeredLanguagesMimeoCentreforEndangeredLanguagesVisva-Bharati
54
[118] -----1980ScriptalchoiceandspellingreformAnessayinlanguageandplanningJournaloftheMSUniversityofBarodaSocialScienceNumber292173-186AmodifiedversionreprintedEAnnamalaiBjornJernuddandJoanRubinedsLanguagePlanningProceedingsofanInstituteMysoreCIIL405-417
[119] Sripantha1996JakhanChapakhanaEloKolkataPaschim-BangaBanglaAcademy
[120] SurAtul1986BanglaMudranerDushoBacharKolkataJijnasa
[121] ScriptBehaviourforBengaliVersion11TDILandC-DACPune
[122] BoraMahendra1981TheEvolutionofAssameseScriptJorhatAssamSahityaSabha
[123] ProposaltoEncodetheTirhutaScriptinISOIEC10646httpwwwunicodeorgL2L201111175r-tirhutapdfaccessedon25112017
[124] EthnologueAssameseintheLanguageCloudhttpswwwethnologuecomcloudasmaccessedon25112017
[125] BengalialphabetforManipurifoundinEthnologueManipuri(MeeteilonMeithei)httpswwwomniglotcomwritingmanipurihtmaccessedon20102019
[126] WikipediaBengalialphabethttpsenwikipediaorgwikiBengali_alphabetaccessedon25112017
[129] Omniglot. Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10/5/2018.
[130] Wikipedia. Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25/11/2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20, accessed on 10/5/2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10/5/2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10/5/2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1123-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 121-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan. 1957. ‘Srīhatta-Nāgarī Lipir Utpatti o Bikāś’. Bangla Academy Patrika (Dhaka), Vol 12 (Bhadra-Agrahayan 1364 Bangabda Number), pg 1.
[138] Wikipedia. Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19/5/2018.
[139] Furui, Ryosuke. 2015. ‘Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century’, in Bhairabi Prasad Sahu & Hermann Kulke, eds. Interrogating Political Systems: Integrative Processes and States in Pre-Modern India, Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury. 1960. ‘The Phonemes of Bengali’. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad. 2007. Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka. 1966. Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul. 1960. A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646 (UNICODE). https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia. Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. 1942. Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar. 1975. Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds). 2014. Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra. [2013]. ‘Bangla Spelling Reform: the Long and Short of It’. Bangla Journal 19215-232.
[152] Bangla Academy. 2012. Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. “What has happened So Far In terms of Script Reforms”. Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka on 10/07/2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 – Core Specification, Chapter 12, p. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language or script certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla¹³ in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold ‘varga’ (as defined under Table 5). Moreover, Bangla does not normally allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem ‘Lichuchor’ written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2. C H can come with Khanda Ta only in the case where C is ra (র) (U+09B0): র্ৎ, as in ভর্ৎসনা.
3. Only the following combinations with V H C M will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
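Restriction rules 1 and 2 above can be sketched as two small checks. This is a minimal illustration only, not the normative LGR: the function names are hypothetical, and ya (য়) is left out of the forbidden-initial set because the text itself cites word-initial examples of it.

```python
# Hypothetical helpers illustrating restriction rules 1 and 2 of Section 10.1.
KHANDA_TA = 0x09CE  # ৎ BENGALI LETTER KHANDA TA
VIRAMA = 0x09CD     # hasanta
RA = 0x09B0         # র BENGALI LETTER RA

# Rule 1: code points that may not begin a label (assumption: ya is excluded
# from this set because the text cites attested word-initial uses).
FORBIDDEN_INITIAL = {
    KHANDA_TA,
    0x099E,  # ঞ BENGALI LETTER NYA
    0x0999,  # ঙ BENGALI LETTER NGA
}


def starts_with_forbidden(label: str) -> bool:
    """Return True if the first code point of the label is barred by rule 1."""
    return bool(label) and ord(label[0]) in FORBIDDEN_INITIAL


def khanda_ta_after_virama_ok(label: str) -> bool:
    """Rule 2: khanda ta may follow 'C + hasanta' only when C is ra (U+09B0)."""
    cps = [ord(ch) for ch in label]
    for i, cp in enumerate(cps):
        if cp == KHANDA_TA and i >= 1 and cps[i - 1] == VIRAMA:
            if i < 2 or cps[i - 2] != RA:
                return False
    return True
```

For example, the word ভর্ৎসনা (U+09AD U+09B0 U+09CD U+09CE U+09B8 U+09A8 U+09BE) passes the second check, while a sequence such as U+0995 U+09CD U+09CE does not.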
13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
10.2 ‘Sylheti Nagarī lipi’ or ‘Siloti’
This version of the Bangla script resembles the ‘Kaithī’ script (ISO 15924) used by the accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, widely in use during the 1880s. There were several other names of Sylheti Nagarī or Siloti (129), such as ‘Jalalabada Nagarī’, ‘Fula (flower) Nagarī’, ‘Muslim Nagarī’ or ‘Muhammad Nagarī’. It is said that Shah Jalal had brought the script with him to Sylhet in the 13th-14th century (138), although some have suggested that it was an invention of the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here (138).
Table 17 – The Script Table of Sylheti Nagarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not confusable enough to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla | Devanāgarī | NBGP Decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable
Table 18 – Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhi
Bangla | Gurmukhi | NBGP Decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable
Table 19 – Bangla and Gurmukhi confusable code points
Bangla | Gurmukhi | NBGP Decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable
Table 20 – Bangla and Gurmukhi distinguishable code points
10.3.3 Bangla and Oriya (Odia)
Bangla | Oriya (Odia) | NBGP Decision
ও U+0993 | ଓ U+0B13 | Confusable
Table 21 – Bangla and Oriya confusable code points
Bangla | Oriya (Odia) | NBGP Decision
ঘ U+0998 | ସ U+0B38 | Distinguishable
Table 22 – Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs
Consonants | Phonetic Value | Allographs: Clusters | Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (not privileged enough to have clusters as a first member) জ্ঝ (জ+ঝ), ঞ্ঝ (ঞ+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (not privileged enough to have clusters as a first member) ঙ্খ (ঙ+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value, but is mostly pronounced as n
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (not privileged enough to have clusters)
য dʒ The secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word it is a vowel æ (as in শ্যামল, র‍্যাপার etc.). But after a non-initial consonant it just doubles it in pronunciation (as in কার্য, ধার্য etc.). The র+য combination has two physical manifestations: র‍্য and র্য.
ক্য (ক+য), স্য (স+য), র‍্য (র+য) [Just র‍্য is never used in Bangla orthography; র‍্যা is, but then its last two symbols, Ya-phala and a-kara, constitute a vowel sign representing the vowel অ্যা]
র r Two manifestations:
i. রেফ repʰ, as the first member of a cluster, e.g. র্প, র্ৎ, র্দ্র, র্য, র্ধ্ব (র+ধ+ব) (earlier র্দ্ধ্ব = র+দ+ধ+ব, a four-term cluster) etc. (placed over the following consonant)
ii. র-ফলা rɔ-pʰɔla, as the second/third member of a cluster (placed under the consonant it follows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃকঃ
অব
The above combinations, if written in traditional orthography, could be a little confusing where the থ (tha) in a conjunct appears like a হ (ha). The conjunct could be in the initial, medial or final position (as shown below in example no. 1). It could also be typed wrongly by someone thinking it was a হ (ha) U+09B9, increasing the risk of errors in label writing and identification.
Examples:
1. স্থ and সহ (as in স্থান sthāna, স্থূল sthula, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and নহ (as in গ্রন্থ grantha)
Fonts which represent the traditional Bangla writing system tend to create this problem. Therefore these may be taken as cases of variants in Bangla.
CASE II: Another interesting example of a variant is encountered in the ra + Hasanta and Hasanta + ra combinations when writing labels in the Bangla script (for languages such as Bangla, Assamese and Manipuri). The variant cases arise in typing ‘repha’ (involving ra + Hasanta) and ‘ra-phalā’ (involving Hasanta + ra). ‘Repha’ could be formed by two sequences (mainly because both Assamese and Bangla find place in the same UNICODE points, and ‘B_RA’ as well as ‘A_RA’ refer to the same phonetic element). Here the final ligatures look the same and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C
Where
B_RA = U+09B0 BENGALI LETTER RA (র), or
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (্)
C = any consonant (theoretically)
Example:
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+09B0 (র) U+09CD (্) U+0995 (ক) | র্ক | U+09F0 (ৰ) U+09CD (্) U+0995 (ক) | ৰ্ক
U+09B0 (র) U+09CD (্) U+09A0 (ঠ) | র্ঠ | U+09F0 (ৰ) U+09CD (্) U+09A0 (ঠ) | ৰ্ঠ
Table 12 – Example of Repha
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. Addition of Repha does not make any difference.
‘Ra-phala’ could be formed by two sequences on similar grounds, and the final ligatures would look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
Where C1 = any consonant except Khanda-ta
Example:
Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ
Table 13 – Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusion for end-users. Hence the repha and ra-phala cases need to be defined as variants. The NBGP concluded that র and ৰ should be defined as variant code points, so that a single variant set between র and ৰ covers all cases. This will, however, create blocked variant labels: e.g. if someone registers “ররর”, the variant label “ৰৰৰ” will be generated as a variant and will be blocked, and vice versa. However, blocking applies only at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, it is still possible.
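The label-level blocking described above can be sketched as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, and collapsing both code points to one representative means that mixed forms such as “রৰর” are also treated as variants of “ররর”, which is how this sketch interprets a single variant set.

```python
# Sketch of label-level blocking for the variant set U+09B0 (র) <-> U+09F0 (ৰ).
# Names are illustrative; this is not the normative LGR processing.
RA_VARIANTS = {"\u09F0": "\u09B0"}  # map Assamese RA onto Bangla RA


def variant_key(label: str) -> str:
    """Collapse every member of the variant set to a single representative."""
    return "".join(RA_VARIANTS.get(ch, ch) for ch in label)


def conflicts(candidate: str, registered: set) -> bool:
    """Return True if the candidate is identical to, or a variant of, a registered label."""
    key = variant_key(candidate)
    return any(variant_key(existing) == key for existing in registered)


registered_labels = {"\u09B0\u09B0\u09B0"}  # "ররর" has been registered
```

With this registry, `conflicts("\u09F0\u09F0\u09F0", registered_labels)` is true (the label “ৰৰৰ” is blocked), while the shorter “ৰৰ” and the longer “ৰৰৰৰ” remain available because their keys differ from that of “ররর”.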
6.2 Cross Script Variants
A crisp cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia¹¹ (formerly Oriya), keeping in mind the visual and technical confusion they may cause as labels on the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain other characters which are somewhat similar, they have not been included here. They have been provided in the Appendix (10.3) for reference.
11 Unicode uses Oriya for the script, although Odia is now the official term used.
1. Bangla and Nāgarī/Devanāgarī Script
Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F
Table 14 – Bangla and Devanāgarī cross-script variant code points
2. Bangla and Gurmukhī Script
Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F
Table 15 – Bangla and Gurmukhī cross-script variant code points
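Tables 14 and 15 can be held as a small lookup structure. The sketch below is an assumption about how one might use it, with hypothetical names: it reports, per script, the cross-script look-alike of a Bangla label only when every code point of the label has a variant in that script.

```python
# Cross-script variant mappings taken from Tables 14 and 15.
# The dictionary layout and function name are illustrative assumptions.
CROSS_SCRIPT_VARIANTS = {
    "Devanagari": {0x09AE: 0x092E, 0x09BF: 0x093F},
    "Gurmukhi": {0x09AE: 0x0A38, 0x09BF: 0x0A3F},
}


def cross_script_variant_labels(label: str) -> dict:
    """Map script name -> variant label, for scripts where every code point has a variant."""
    cps = [ord(ch) for ch in label]
    result = {}
    for script, mapping in CROSS_SCRIPT_VARIANTS.items():
        if cps and all(cp in mapping for cp in cps):
            result[script] = "".join(chr(mapping[cp]) for cp in cps)
    return result
```

A label made only of ম and ি, such as “মি”, therefore has a Devanāgarī and a Gurmukhī counterpart, whereas a label containing any other Bangla code point has none under these two tables.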
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in section 3.2 when written in the Bangla¹² script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications.
Below are the symbols used in the WLE rules for each of the Indic Syllabic Categories, as mentioned in the table provided in Code point repertoire (Section 5.1).
C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9) or (ae) Ya-phalā (V1 H C1 M1), where V1 is either U+0985 (অ - BENGALI LETTER A) or U+098F (এ - BENGALI LETTER E), H is U+09CD (্ - BENGALI SIGN VIRAMA), C1 is U+09AF (য - BENGALI LETTER YA) and M1 is U+09BE (া - BENGALI VOWEL SIGN AA). S1 and S2 are valid even if they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either U+09B0 (র - BENGALI LETTER RA) or U+09F0 (ৰ - ASSAMESE LETTER RA; Unicode name BENGALI LETTER RA WITH MIDDLE DIAGONAL) and H is U+09CD (্ - BENGALI SIGN VIRAMA)
Table 16 – Symbols used in WLE rules
12 As used by Unicode, denoting and including both Assamese and Maṇipuri.
It is also perhaps ideal to mention here that in Bangla the consonant letters (or graphemes) are physically joined to form “clusters” that could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are “nearly 380 unique consonant clusters”, out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, Pg 4]. More details of such combinations can be seen in Pabitra Sarkar (1993) [135].
7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়
2. H must be preceded by C
3. M must be preceded by C. Example: কা
4. D must be preceded by either of V, C or M. Example: আখখা হা
5. X must be preceded by either of V, C, M or D. Example: উঃ, খঃ, বঃ, াঃ, দ ঃ
6. B must be preceded by either of V, C, M or D. Example: আং, ইং, কং
7. Z must be preceded by V, C, M, D, B, X or P. Example: ইৎ, কৎ, াৎ, াৎ, প6ৎ, rৎ (S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V preceded by H
9. S CANNOT be preceded by H
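The context rules 2 to 8 above lend themselves to a simple "what may precede what" table. The following is a minimal sketch, not the normative rule set: the category sets are an assumed subset of the Unicode Bengali block rather than the repertoire of Section 5.1, and the nukta forms (CN) and the special sequences S are deliberately left out.

```python
# Illustrative category sets (assumption): a subset of the Unicode Bengali block.
V = {0x0985, 0x0986, 0x0987, 0x0988, 0x0989, 0x098A, 0x098B,
     0x098F, 0x0990, 0x0993, 0x0994}
C = (set(range(0x0995, 0x09A9)) | set(range(0x09AA, 0x09B1)) | {0x09B2}
     | set(range(0x09B6, 0x09BA)) | {0x09F0, 0x09F1})
M = set(range(0x09BE, 0x09C5)) | {0x09C7, 0x09C8, 0x09CB, 0x09CC}
SINGLE = {0x0982: "B", 0x0981: "D", 0x0983: "X", 0x09CD: "H", 0x09CE: "Z"}
RA = {0x09B0, 0x09F0}  # C2 in the definition of P (Ra-Hasanta)

# Rules 2-7: categories that may immediately precede each symbol.
ALLOWED_BEFORE = {
    "H": {"C"},
    "M": {"C"},
    "D": {"V", "C", "M"},
    "X": {"V", "C", "M", "D"},
    "B": {"V", "C", "M", "D"},
    "Z": {"V", "C", "M", "D", "B", "X"},
}


def category(cp):
    """Return the WLE symbol for a code point, or None if outside this sketch."""
    if cp in V:
        return "V"
    if cp in C:
        return "C"
    if cp in M:
        return "M"
    return SINGLE.get(cp)


def is_valid_label(label: str) -> bool:
    """Check rules 2-8 (S and nukta forms are not modelled here)."""
    cps = [ord(ch) for ch in label]
    cats = [category(cp) for cp in cps]
    if not cats or None in cats:
        return False
    for i, cat in enumerate(cats):
        prev = cats[i - 1] if i > 0 else None
        if cat == "V" and prev == "H":
            return False  # rule 8
        if cat == "Z" and prev == "H":
            # rule 7, case P: khanda ta after ra + hasanta
            if i >= 2 and cps[i - 2] in RA:
                continue
            return False
        if cat in ALLOWED_BEFORE and prev not in ALLOWED_BEFORE[cat]:
            return False
    return True
```

Because a mark at the start of a label has no preceding category, the same table also rejects labels that begin with H, M, B, D, X or Z.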
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in admitting such ligatures: any user can easily create them, even though they may not be attested combinations.
7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া bæŋk ʌv ɪndiə (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE) (meaning Bank of India). This is a case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of the hyphen, the space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise V is never required to follow an H. Permitting this may create a perceptual similarity between two labels (with and without H) for the majority of the linguistic communities; hence this is explicitly prohibited by the NBGP.
In future, if required, and depending on the prevailing requirements from the community, a future NBGP may consider revisiting this rule.
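The prohibited context can be located mechanically in the code point sequence quoted above. The following sketch uses hypothetical names and an assumed list of independent vowels; it finds every position where a vowel directly follows a Hasanta.

```python
# Locate "V preceded by H" in a code point sequence (illustrative helper).
HASANTA = 0x09CD
INDEPENDENT_VOWELS = {0x0985, 0x0986, 0x0987, 0x0988, 0x0989, 0x098A,
                      0x098B, 0x098F, 0x0990, 0x0993, 0x0994}

# The "Bank of India" sequence exactly as listed in the text.
BANK_OF_INDIA = [0x09AC, 0x09CD, 0x09AF, 0x09BE, 0x0999, 0x09CD, 0x0995,
                 0x0985, 0x09AB, 0x09CD, 0x0987, 0x09A8, 0x09CD, 0x09A1,
                 0x09BF, 0x09DF, 0x09BE]


def vowel_after_hasanta_positions(code_points):
    """Return the indexes of vowels that immediately follow a Hasanta."""
    return [i for i in range(1, len(code_points))
            if code_points[i - 1] == HASANTA
            and code_points[i] in INDEPENDENT_VOWELS]


# Dropping the Hasanta at index 9 gives the form that remains permitted.
WITHOUT_H = BANK_OF_INDIA[:9] + BANK_OF_INDIA[10:]
```

For the listed sequence the only offending position is index 10 (U+0987 after U+09CD); the variant without that Hasanta contains no such position.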
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.
1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Example:
্ক  ्क
াক  ाक
ংক  ंक
ঁক  ँक
ঃক  ःक
As can be seen, such combinations will automatically result in a “golu”, or dotted circle, marking the sequence as an invalid formation. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.
2. H is not permitted after V, B, D, X, M or S.
Example:
অ अ
অং क ক क কঃ कः ি )क
3. The number of B, D or X permitted after a Consonant, a Vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalid.
Example:
কংং क
ক क
কঃঃ कःः
কা का
কীঃঃ कःः
অংং अ
অ अ
অঃঃ अःः
4. The number of M permitted after a Consonant is restricted to one.
Example:
কীী  की
5. M is not permitted after V.
Example:
ইা, ঈৗ  ईा, ईौ
6. The combinations Anusvāra + Visarga and Visarga + Anusvāra are not permissible.
Example:
কংঃ  कंः
কঃং  कःं
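The six examples above can also be expressed as forbidden patterns over a string of category symbols. This is a rough sketch under stated assumptions: the code point ranges are coarse approximations of the Unicode Bengali block, the special sequences S are not modelled, and anything outside the ranges is simply flagged.

```python
import re

# Single-code-point categories (symbols as in Table 16).
MARKS = {0x0981: "D", 0x0982: "B", 0x0983: "X", 0x09CD: "H", 0x09CE: "Z"}


def category_string(label: str) -> str:
    """Translate a label into WLE symbols; '?' marks anything not covered here."""
    out = []
    for ch in label:
        cp = ord(ch)
        if 0x0985 <= cp <= 0x0994:
            out.append("V")  # independent vowels (coarse range)
        elif 0x0995 <= cp <= 0x09B9 or cp in (0x09F0, 0x09F1):
            out.append("C")  # consonants (coarse range)
        elif 0x09BE <= cp <= 0x09CC:
            out.append("M")  # vowel signs (coarse range)
        else:
            out.append(MARKS.get(cp, "?"))
    return "".join(out)


# Examples 1-6: initial mark | H after V/B/D/X/M | doubled B, D or X |
# doubled M | M after V | anusvara next to visarga | unknown code point.
FORBIDDEN = re.compile(r"^[HMBDX]|[VBDXM]H|([BDX])\1|MM|VM|BX|XB|\?")


def violates_abnf_examples(label: str) -> bool:
    """Return True if the label matches any of the forbidden patterns."""
    return FORBIDDEN.search(category_string(label)) is not None
```

A pattern list like this is easy to audit against the prose, at the cost of being less general than a full context-rule evaluation.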
8 Contributors
8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members
8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals’ Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje; Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. ‘Bangla Harapher Panch Parba’. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----1987ABibliographyofBengaliLinguisticsMysoreCIILxii+316pp
[117] -----2017(withRajibChakrabortyBidishaBhattacharjeeampArimardanKumarTripathy)LanguagesandCulturesontheMarginGuidelinesforFieldworkonEndangeredLanguagesMimeoCentreforEndangeredLanguagesVisva-Bharati
54
[118] -----1980ScriptalchoiceandspellingreformAnessayinlanguageandplanningJournaloftheMSUniversityofBarodaSocialScienceNumber292173-186AmodifiedversionreprintedEAnnamalaiBjornJernuddandJoanRubinedsLanguagePlanningProceedingsofanInstituteMysoreCIIL405-417
[119] Sripantha1996JakhanChapakhanaEloKolkataPaschim-BangaBanglaAcademy
[120] SurAtul1986BanglaMudranerDushoBacharKolkataJijnasa
[121] ScriptBehaviourforBengaliVersion11TDILandC-DACPune
[122] BoraMahendra1981TheEvolutionofAssameseScriptJorhatAssamSahityaSabha
[123] ProposaltoEncodetheTirhutaScriptinISOIEC10646httpwwwunicodeorgL2L201111175r-tirhutapdfaccessedon25112017
[124] EthnologueAssameseintheLanguageCloudhttpswwwethnologuecomcloudasmaccessedon25112017
[125] BengalialphabetforManipurifoundinEthnologueManipuri(MeeteilonMeithei)httpswwwomniglotcomwritingmanipurihtmaccessedon20102019
[126] WikipediaBengalialphabethttpsenwikipediaorgwikiBengali_alphabetaccessedon25112017
[129]OmniglotSlyhetihttpwwwomniglotcomwritingsylotihtmaccessedon1052018
[130]WikipediaBishnupriyaManipurilanguagehttpsenwikipediaorgwikiBishnupriya_Manipuri_languageaccessedon25112017
[131] TheEMILLECIILCorpushttpmetashareeldaorgrepositorybrowsethe-emilleciil-corpusabdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20accessedon1052018
[132]TheEMILLECIILCorpushttpcatalogelrainfoproduct_infophpproducts_id=696accessedon1052018
[133] BanglaLanguageampScript httpswwwisicalacin~rc_banglabanglahtmlaccessedon1052018
[134] SarkarPabitra1992BanglaBananSanskarSamasyaoSambhabanaKolkataChirayataPrakashan
[135] SarkarPabitra1993BanglaBhasharYuktabyanjanBhasha1123-45
55
[136] DashNiladriShekharandBBChaudhuri1998BanglaScriptAStructuralStudyLinguisticsToday121-28Alsoavailableathttpswwwacademiaedu9967428Bangla_Script_A_Structural_Study
[137] DaniAhmedHasan(1957)lsquoSrīhatta-NāgarīLipirUtpattioBikāśrsquoBanglaAcademyPatrika(Dhaka)Vol12(Bhadra-Agrahayan1364BangabdaNumber)pg1
[138] WikipediaSylhetiNagari httpsenwikipediaorgwikiSylheti_Nagariaccessedon1952018
[139] FuruiRyosuke(2015)lsquoVariegatedAdaptationsStateFormationinBengalfromtheFifthtoSeventhCenturyrsquoinBhairabiPrasadSahuampHermannKulkeedsInterrogatingPoliticalSystemsIntegrativeProcessesandStatesinPre-ModernIndiaChapter9Pp255-73NewDelhiManohar
[140] FergusonCharesAandMunierChowdhury(1960)lsquoPhonesofBengalirsquoLanguageVol36No1pp22-59
[141] ShahidullahMuhammad(2007)BuddhistMysticSongsDhakaMowlaBrothers
[142] RayPunyaSloka(1966)BengaliLanguageHandbookWashington
[143] HaiMuhammadAbdul(1960)AphoneticandphonologicalstudyofnasalsandnasalizationinBengaliDhakaUniversityofDhaka
[144] UnicodeConsortiumProposalSummaryFormtoAccompanySubmissionsforAdditionstotheRepertoireofISOIEC10646UNICODEhttpswwwunicodeorgL2L200202387r-syloti-formpdfaccessedonMay212018
[145] WikipediaOlChiki(Unicodeblock)httpsenwikipediaorgwikiOl_Chiki_(Unicode_block)accessedonMay212018
[146]BanglaScripthttpwwwbangladesh2000combdbangla_scripthtmlaccessedonMay212018
[147] BhattacharyaAshutoshed(1942)GopichandrerGanCalcuttaCalcuttaUniversity
[149] DasSisirKumar(1975)SahibsandMunshisAnAccountoftheCollegeofFortWilliamCalcutta
[150] IslamRafiqulPabitraSarkarMahbubulHaqampRajibChakraborty(eds)(2014)BanglaAcademyPromitoBanglaByabaharikByakaran(AFunctionalGrammarofStandardBangla)DhakaBanglaAcademy
[151] SarkarPabitra[2013]lsquoBanglaSpellingReformtheLongandShortofItrsquoBanglaJournal19215-232
[152] BanglaAcademy(2012)BanglaAcademyPromitoBanglaBananerNiyam(StandardBanglaSpellingasadoptedbyBanglaAcademy)DhakaBanglaAcademy
56
[153] SarkarPabitraampRajibChakraborty2018ldquoWhathashappenedSoFarIntermsof Script Reformsrdquo Paper presented at the Face to Face meeting jointly held by theBanglaAcademyDhakaampICANNatBanglaAcademyDhakaon10072018
[154] The Unicode Consortium 2018 The Unicodereg Standard Version 110 ndash CoreSpecificationChapter12P473
57
10 Appendix-I
101 AugmentedBackus-NaurFormalism(ABNF)The Augmented Backus-Naur Formalism (ABNF) is generic in nature and whenappliedtoaspecificlanguagescriptcertainrestrictionrulesapplyInotherwordsinagivenlanguagesomeoftheFormalismstructuresdonotnecessarilyapplyTotakecareofsuchcasesrestrictionrulesaresetinplaceTheserestrictionswillhelptofine-tunetheABNFIncaseofBangla13inparticularthefollowingrulesapply
1Khaṇḍata (ৎ) is NOT allowed at the beginning of an IDN label The sameappliestoঞandthevelarnasalঙintheBanglaSchemeoffive-foldlsquovargarsquo(asdefined under Table 5) Moreover Bangla does not allow ya (য়) in thebeginningofawordeitherbutwecanciteacoupleofnativeexamples forexamplethewordয়8াvেড়া(yaeligbbɔRo)fromthepoemlsquoLichuchorrsquowrittenbyKaziNazrul IslamHowever there are instances of it being used in namesmostlyof foreignoriginsuchasYaqubwhichmaybewrittenwithya(য়) inthe beginning as inয়াwব) In very recent timeswhile transliterating someChineseandJapanesenamesinBanglaonedoescomeacrossthepossibilityofKhaṇḍata (ৎ) followedbysa (স) inthebeginningofaword forexamplexেসিরং(Tsering)
2CHcancomewithKhandaTainonlythecasewhereCisra(র)(09B0)
ৎ6 asinভৎ6 সনা
3OnlyfollowingcombinationswithVHCMwillbeallowedrarrঅ8া(togetherpronouncedasaelig)asinঅ8ািসড(acid)rarrএ8া(togetheralsopronouncedasaelig)asinএ8ািসডএ8ােসািসেয়শান
(acidassociation)
102 lsquoSylhetiNagarılipirsquoorlsquoSilotirsquoThisversionofBanglascriptresemblesthe lsquoKaithīrsquoscript(ISO12954)usedbytheAccountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh andBiharndashwidelyinuseduringthe1880sTherewereseveralothernamesofSylheti
13 This section specifically takes up issues of restrictions pertaining to Bangla (Bengali) language Assamese and Maṇipuri have not been covered in this section
58
Nagarı or Siloti (129)ndash suchas lsquoJalalabadaNagarırsquo lsquoFula (flower)Nagarırsquo lsquoMuslimNagarırsquoorlsquoMuhammadNagarırsquoItissaidthatShahJalalahadbroughtthescriptwithhim in13th-14thCentury inSylhet (138)althoughsomesuggested that itwasaninvention by the Afghan rulers of Sylhet (137) Some ascribe the credit to theBuddhistBhikkhusfromNepalPurelyforhistoricalreasonsthedetailsofthescriptwith32symbolsarereproducedhere(138)
Table17ndashTheScriptTableofSylhetiNagarıorSiloti
103 ConfusablecodepointsThe following code points were analysed and concluded that they are either (a)distinguishable or (b) confusable but not enough to be defined as variant codepoints
1031 BanglaandNāgarīorDevanāgarī
Bangla Devanāgarī NBGP
Decision
ঃ U+0983 ः U+0903 Confusable
ওU+0993 उU+0909 Confusable
ঘU+0998 घU+0918 Confusable
U+0981 U+0945 Confusable
Table18BanglaandDevanāgarīconfusablecodepoints
59
1032 BanglaandGurmukhi
Bangla Gurmukhi NBGPdecision
ঘU+0998 ਬU+0A2C Confusable
U+0981 U+0A71 Confusable
Table19BanglaandGurmukhiconfusablecodepoints
Bangla Gurmukhi
NBGPdecision
ওU+0993 ਤU+0A24
Distinguishable
শU+09B6 ਅU+0A05
Distinguishable
মU+09AE ਮU+0A2E
Distinguishable
বাU+09ACandU+09BE
ਗU+0A17
Distinguishable
Table20ndashBanglaandGurmukhıdistinguishablecodepoints
1033BanglaandOriya(Odia)
Bangla Oriya(Odia) NBGPDecision
ওU+0993 ଓU+0B13 Confusable
Table21ndashBanglaandOriyadistinguishablecodepoints
Bangla Oriya(Odia) NBGP
Decision
ঘU+0998 ସU+0B38 Distinguishable
Table22ndashBanglaandOriyadistinguishablecodepoints
60
11 Appendix-IIBengaliconsonantsandtheirallographs
Consonants PhoneticValue Allographs
Clusters TransparentForm(BanglaAkademifont)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
61
Consonants PhoneticValue Allographs
Clusters TransparentForm(BanglaAkademifont)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant combinations with Repha look identical. Addition of Repha does not make any difference. 'Ra-phala' could be formed by two sequences on similar grounds, and the final ligatures would look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
where C1 = any consonant except Khanda-ta, B_RA = Bangla RA (U+09B0) and A_RA = Assamese RA (U+09F0).
Example:

Sequence 1 (using Bangla RA) | Ligature 1 | Sequence 2 (using Assamese RA) | Ligature 2
U+0995 (ক) U+09CD (্) U+09B0 (র) | ক্র | U+0995 (ক) U+09CD (্) U+09F0 (ৰ) | ক্ৰ
U+09A8 (ন) U+09CD (্) U+09B0 (র) | ন্র | U+09A8 (ন) U+09CD (্) U+09F0 (ৰ) | ন্ৰ

Table 13: Example of Ra-phala
As the Assamese and Bangla Repha and Ra-phala conjunct forms look the same, this could cause confusability for end-users. Hence, the repha and ra-phala cases need to be defined as variants. NBGP decided to define র and ৰ as variant code points, so that a single variant set between র and ৰ covers all cases. But this will create blocked variant labels: e.g. if someone registers “ররর”, the variant label “ৰৰৰ” will be generated as a variant and will be blocked, and vice versa. However, it is only blocked at the label level; if someone else needs to register other labels, e.g. ৰৰ or ৰৰৰৰ, it is still possible.
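The blocking behaviour described above can be illustrated with a minimal Python sketch. It assumes that every র/ৰ position in a label may be swapped independently, so all mixed forms are generated as well; the function name `ra_variant_labels` is hypothetical and not part of the proposal.

```python
from itertools import product

BANGLA_RA = "\u09B0"    # র BENGALI LETTER RA
ASSAMESE_RA = "\u09F0"  # ৰ BENGALI LETTER RA WITH MIDDLE DIAGONAL
VARIANT_SET = (BANGLA_RA, ASSAMESE_RA)


def ra_variant_labels(label: str) -> set[str]:
    """Return every label reachable by swapping র/ৰ at any position.

    The input label itself is excluded from the result.
    """
    # Each position offers either both RA forms or just the original character.
    choices = [VARIANT_SET if ch in VARIANT_SET else (ch,) for ch in label]
    variants = {"".join(combo) for combo in product(*choices)}
    variants.discard(label)
    return variants


if __name__ == "__main__":
    for variant in sorted(ra_variant_labels(BANGLA_RA * 3)):
        print(variant)
```

Registering “ররর” would block “ৰৰৰ” (and, under the assumption above, the mixed forms), while a label of a different length such as “ৰৰ” is not in the generated set and stays available.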
6.2 Cross-Script Variants
A crisp cross-script study for Bangla has been done with respect to sister scripts such as Devanāgarī, Gurmukhī and Odia¹¹ (formerly Oriya), keeping in mind the visual and technical confusion they may cause as labels in the web domain. Moreover, there is no in-script variant in Bangla as far as the orthography is concerned. The following characters are being proposed by the NBGP as variants. Although there are certain other characters which are somewhat similar, they have not been included here. They have been provided in the Appendix (Section 10.3) for reference.

¹¹ Unicode uses “Oriya” for the script, although “Odia” is now the official term.
1. Bangla and Nāgarī (Devanāgarī) script

Bangla | Devanāgarī
ম U+09AE | म U+092E
ি U+09BF | ि U+093F

Table 14: Bangla and Devanāgarī cross-script variant code points

2. Bangla and Gurmukhī script

Bangla | Gurmukhī
ম U+09AE | ਸ U+0A38
ি U+09BF | ਿ U+0A3F

Table 15: Bangla and Gurmukhī cross-script variant code points
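As an illustration only, the pairs in Tables 14 and 15 can be held in a symmetric lookup table. This Python sketch records just the four pairs listed above; the names `CROSS_SCRIPT_PAIRS` and `cross_script_variants` are hypothetical, and the transitive closure that a full LGR variant set would need is deliberately left out.

```python
# (Bangla code point, other-script code point) pairs from Tables 14 and 15.
CROSS_SCRIPT_PAIRS = [
    (0x09AE, 0x092E),  # ম  <-> म  (Devanāgarī)
    (0x09BF, 0x093F),  # ি  <-> ि  (Devanāgarī)
    (0x09AE, 0x0A38),  # ম  <-> ਸ  (Gurmukhī)
    (0x09BF, 0x0A3F),  # ি  <-> ਿ  (Gurmukhī)
]


def build_variant_index(pairs):
    """Build a symmetric mapping: code point -> set of variant code points."""
    index: dict[int, set[int]] = {}
    for left, right in pairs:
        index.setdefault(left, set()).add(right)
        index.setdefault(right, set()).add(left)
    return index


VARIANT_INDEX = build_variant_index(CROSS_SCRIPT_PAIRS)


def cross_script_variants(code_point: int) -> set[int]:
    """Return the cross-script variants recorded for a code point (may be empty)."""
    return set(VARIANT_INDEX.get(code_point, set()))


if __name__ == "__main__":
    for cp in sorted(cross_script_variants(0x09AE)):
        print(f"U+{cp:04X}")
```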
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in Bangla¹² script. The rules have been drafted in such a way that they can be easily translated into the LGR specifications. Below are the symbols used in the WLE rules for each of the Indic Syllabic Category values, as mentioned in the table provided in Code point repertoire (Section 5.1).
C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1, S2 (from Table 9) or (a/e) Ya-phalā (V1 H C1 M1), where
    V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E),
    H is 09CD (্ - BENGALI SIGN VIRAMA),
    C1 is 09AF (য - BENGALI LETTER YA),
    M1 is 09BE (া - BENGALI VOWEL SIGN AA).
    S1 and S2 are valid even if they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where
    C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL),
    H is 09CD (্ - BENGALI SIGN VIRAMA).

Table 16: Symbols used in WLE rules

¹² As used by Unicode, denoting and including both Assamese and Maṇipuri.
It is also perhaps ideal to mention here that in Bangla the consonant letters (or graphemes) are physically joined to form “clusters” that could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are “nearly 380 unique consonant clusters”, out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, p. 4]. More details of such combinations can be found in Pabitra Sarkar (1993) [135].

7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, the following are the specific WLE rules that need to be implemented:
1. C is the set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়.
2. H must be preceded by C.
   Example
3. M must be preceded by C. Example: কা
4. D must be preceded by either of V, C or M. Example: আখখা হা
5. X must be preceded by either of V, C, M or D. Example: উঃখঃবঃাঃ দ ঃ
6. B must be preceded by either of V, C, M or D. Example: আংইংকং
7. Z must be preceded by V, C, M, D, B, X or P. Example: ইৎকৎাৎাৎপ6ৎrৎ (S is not listed because S ends with M; Z may also follow S.)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V Preceded by H.
9. S CANNOT be preceded by H.
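The nine rules above amount to a table of permitted predecessors, which the following Python sketch checks. Several things here are assumptions rather than part of the proposal: the code point ranges in `symbol_of` only approximate the repertoire of Section 5.1, S covers only the two Ya-phalā sequences (S1 and S2 from Table 9 are omitted), S is treated like M when it precedes D, X or B, and all function and table names are hypothetical.

```python
HASANTA = "\u09CD"
RA_LETTERS = {"\u09B0", "\u09F0"}             # C2 in the definition of P
S_SEQUENCES = {"\u0985\u09CD\u09AF\u09BE",    # অ্যা (V1 H C1 M1)
               "\u098F\u09CD\u09AF\u09BE"}    # এ্যা
# Rule 1: fold normalized (base + nukta) forms into single consonants.
CN_FORMS = {"\u09A1\u09BC": "\u09DC", "\u09A2\u09BC": "\u09DD",
            "\u09AF\u09BC": "\u09DF"}

# Rules 2-7: symbols that MUST have one of these predecessors.
ALLOWED_BEFORE = {
    "H": {"C"},
    "M": {"C"},
    "D": {"V", "C", "M", "S"},
    "X": {"V", "C", "M", "D", "S"},
    "B": {"V", "C", "M", "D", "S"},
    "Z": {"V", "C", "M", "D", "B", "X", "P", "S"},
}
# Rules 8-9: symbols that must NOT follow a hasanta (P also ends in H).
BLOCKED_BEFORE = {"V": {"H", "P"}, "S": {"H", "P"}}
SINGLE = {0x0981: "D", 0x0982: "B", 0x0983: "X", 0x09CD: "H", 0x09CE: "Z"}


def symbol_of(ch: str) -> str:
    """Map one character to its WLE symbol (approximate ranges, assumed)."""
    cp = ord(ch)
    if cp in SINGLE:
        return SINGLE[cp]
    if 0x0985 <= cp <= 0x0994:
        return "V"
    if 0x0995 <= cp <= 0x09B9 or cp in (0x09DC, 0x09DD, 0x09DF, 0x09F0, 0x09F1):
        return "C"
    if 0x09BE <= cp <= 0x09CC:
        return "M"
    raise ValueError(f"U+{cp:04X} is outside this sketch's repertoire")


def tokenize(label: str) -> list[str]:
    """Split a label into WLE symbols, recognising S and P as units."""
    for sequence, single in CN_FORMS.items():
        label = label.replace(sequence, single)
    tokens, i = [], 0
    while i < len(label):
        if label[i:i + 4] in S_SEQUENCES:
            tokens.append("S")
            i += 4
        elif label[i] in RA_LETTERS and label[i + 1:i + 2] == HASANTA:
            tokens.append("P")
            i += 2
        else:
            tokens.append(symbol_of(label[i]))
            i += 1
    return tokens


def is_valid_label(label: str) -> bool:
    """Check a non-empty label against rules 2-9."""
    if not label:
        return False
    prev = None  # None marks the start of the label
    for sym in tokenize(label):
        if sym in ALLOWED_BEFORE and prev not in ALLOWED_BEFORE[sym]:
            return False
        if sym in BLOCKED_BEFORE and prev in BLOCKED_BEFORE[sym]:
            return False
        prev = sym
    return True
```

Because a symbol that requires a predecessor fails when `prev` is `None`, the same table also rejects labels that begin with H, M, B, D, X or Z.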
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in allowing such ligatures, because any user can easily create them even though they may not be attested combinations.

7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া /bæŋk ʌv ɪndiə/ (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE), meaning “Bank of India”. This is the case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the intended sound. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, space or Zero Width Non-Joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptual similarity between two labels (with and without H) for the majority of the linguistic communities; hence this is explicitly prohibited by the NBGP. In future, if required, depending on the prevailing requirements from the community, a future NBGP may consider revisiting this rule.
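To make the “Bank of India” case concrete, the short Python sketch below rebuilds the label from the code points listed in the text and reports where an independent vowel directly follows a hasanta. The vowel range U+0985 to U+0994 is an assumption standing in for the V class of the repertoire, and the helper names are hypothetical.

```python
CODE_POINTS = (
    "U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB "
    "U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE"
)
HASANTA = "\u09CD"


def label_from_code_points(spec: str) -> str:
    """Turn a whitespace-separated 'U+XXXX' list into a string."""
    return "".join(chr(int(token[2:], 16)) for token in spec.split())


def is_independent_vowel(ch: str) -> bool:
    """Approximate the V class by the Bengali independent-vowel range."""
    return 0x0985 <= ord(ch) <= 0x0994


def vowel_after_hasanta_positions(label: str) -> list[int]:
    """Return the indices of vowels that directly follow a hasanta (rule 8)."""
    return [
        i for i in range(1, len(label))
        if label[i - 1] == HASANTA and is_independent_vowel(label[i])
    ]


if __name__ == "__main__":
    label = label_from_code_points(CODE_POINTS)
    print(label, vowel_after_hasanta_positions(label))
```

The only offending position is the ই that follows ফ্; dropping that single hasanta yields the form that rule 8 accepts.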
7.2 Additional Examples from Bangla ABNF

Below are a few examples which help one understand some of the rules ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.
1. H, M, B, D or X cannot occur in the beginning of a Bangla word.
Example:
্ক ्क
াক ाक
ংক ंक
ঁক ँक
ঃক ःक
As can be seen, such a combination will automatically result in a “golu” or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.
2. H is not permitted after V, B, D, X, M, S.
Example:
অ अ
অং क ক क কঃ कः ি )क
3. The number of B, D or X permitted after a consonant, a vowel or a kāra (mātrā) is restricted to one; thus the following combinations are invalid:
Example:
কংং क
ক क
কঃঃ कःः
কা का
কীঃঃ कःः
অংং अ
অ अ
অঃঃ अःः
4. The number of M permitted after a consonant is restricted to one.
Example:
কীী की
5. M is not permitted after V.
Example:
ইা ঈৗ ईा ईौ
6. The combinations of Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Example:
কংঃ कः
কঃং कः
8 Contributors

8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members

8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof. Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof. Rafiqul Islam, National Professor of Humanities, Dhaka
Prof. Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof. Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof. Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof. Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals’ Systems, Dhaka
9 References

[100] Unicode Consortium. 2017. The Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje & Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920 (2000). The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. ‘Bangla Harapher Panch Parba’. In Swapan Chakraborty, ed., Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the M.S. University of Baroda, Social Science Number, 29(2), 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds., Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25/11/2017.
[124] Ethnologue. Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25/11/2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20/10/2019.
[126] Wikipedia. Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25/11/2017.
[129] Omniglot. Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10/5/2018.
[130] Wikipedia. Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25/11/2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20/, accessed on 10/5/2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10/5/2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10/5/2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1(1), 23-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1(2), 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan. 1957. ‘Srīhatta-Nāgarī Lipir Utpatti o Bikāś’. Bangla Academy Patrika (Dhaka), Vol. 1, No. 2 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia. Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19/5/2018.
[139] Furui, Ryosuke. 2015. ‘Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century’. In Bhairabi Prasad Sahu & Hermann Kulke, eds., Interrogating Political Systems: Integrative Processes and States in Pre-Modern India, Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury. 1960. ‘Phones of Bengali’. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad. 2007. Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka. 1966. Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul. 1960. A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646 (UNICODE). https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia. Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. 1942. Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar. 1975. Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.). 2014. Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra. 2013. ‘Bangla Spelling Reform: the Long and Short of It’. Bangla Journal 19, 215-232.
[152] Bangla Academy. 2012. Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. “What has happened So Far In terms of Script Reforms”. Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka, on 10/07/2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 – Core Specification, Chapter 12, p. 473.
10 Appendix-I

10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when applied to a specific language or script, certain restriction rules apply. In other words, in a given language, some of the formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla¹³ in particular, the following rules apply:

1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold ‘varga’ (as defined under Table 5). Moreover, Bangla does not generally allow ya (য়) in the beginning of a word either, but we can cite a couple of native examples, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem ‘Lichuchor’ written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) in the beginning, as in য়াকুব. In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) in the beginning of a word, for example ৎসেরিং (Tsering).
2. C H can come with Khanda Ta only in the case where C is ra (র) (09B0): র্ৎ, as in ভর্ৎসনা.
3. Only the following combinations with V H C M will be allowed:
   → অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
   → এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
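Restrictions 1 and 3 above lend themselves to a simple mechanical check, sketched here in Python. The sketch makes two assumptions: ya (য়) is not treated as a forbidden initial, because the text cites attested word-initial uses, and the vowel range U+0985 to U+0994 stands in for the V class. The function names are hypothetical.

```python
HASANTA = "\u09CD"
# Restriction 1: ৎ (U+09CE), ঞ (U+099E) and ঙ (U+0999) may not start a label.
FORBIDDEN_INITIALS = {"\u09CE", "\u099E", "\u0999"}
# Restriction 3: the only vowel + hasanta sequences allowed (V H C M).
ALLOWED_VHCM = {"\u0985\u09CD\u09AF\u09BE",   # অ্যা
                "\u098F\u09CD\u09AF\u09BE"}   # এ্যা


def starts_with_forbidden_initial(label: str) -> bool:
    """True if the label begins with ৎ, ঞ or ঙ."""
    return label[:1] in FORBIDDEN_INITIALS


def vowel_hasanta_sequences_ok(label: str) -> bool:
    """True if every vowel followed by a hasanta opens an allowed V H C M unit."""
    for i, ch in enumerate(label):
        is_vowel = 0x0985 <= ord(ch) <= 0x0994
        if is_vowel and label[i + 1:i + 2] == HASANTA:
            if label[i:i + 4] not in ALLOWED_VHCM:
                return False
    return True


if __name__ == "__main__":
    acid = "\u0985\u09CD\u09AF\u09BE\u09B8\u09BF\u09A1"  # অ্যাসিড
    print(starts_with_forbidden_initial(acid), vowel_hasanta_sequences_ok(acid))
```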
¹³ This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.

10.2 ‘Sylheti Nagarī lipi’ or ‘Siloti’
This version of Bangla script resembles the ‘Kaithī’ script (ISO 15924) used by accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, widely in use during the 1880s. There were several other names for Sylheti Nagarī or Siloti [129], such as ‘Jalalabada Nagarī’, ‘Fula (flower) Nagarī’, ‘Muslim Nagarī’ or ‘Muhammad Nagarī’. It is said that Shah Jalal brought the script with him to Sylhet in the 13th-14th century [138], although some have suggested that it was an invention of the Afghan rulers of Sylhet [137]. Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script, with 32 symbols, are reproduced here [138].

Table 17: The Script Table of Sylheti Nagarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.

10.3.1 Bangla and Nāgarī or Devanāgarī

Bangla | Devanāgarī | NBGP decision
ঃ U+0983 | ः U+0903 | Confusable
ও U+0993 | उ U+0909 | Confusable
ঘ U+0998 | घ U+0918 | Confusable
ঁ U+0981 | ॅ U+0945 | Confusable

Table 18: Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhī

Bangla | Gurmukhī | NBGP decision
ঘ U+0998 | ਬ U+0A2C | Confusable
ঁ U+0981 | ੱ U+0A71 | Confusable

Table 19: Bangla and Gurmukhī confusable code points

Bangla | Gurmukhī | NBGP decision
ও U+0993 | ਤ U+0A24 | Distinguishable
শ U+09B6 | ਅ U+0A05 | Distinguishable
ম U+09AE | ਮ U+0A2E | Distinguishable
বা U+09AC and U+09BE | ਗ U+0A17 | Distinguishable

Table 20: Bangla and Gurmukhī distinguishable code points

10.3.3 Bangla and Oriya (Odia)

Bangla | Oriya (Odia) | NBGP decision
ও U+0993 | ଓ U+0B13 | Confusable

Table 21: Bangla and Oriya confusable code points

Bangla | Oriya (Odia) | NBGP decision
ঘ U+0998 | ସ U+0B38 | Distinguishable

Table 22: Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs

Consonants | Phonetic Value | Allographs
Clusters | Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র) There is a marked form of ত+=ৎ ৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (not privileged enough to have clusters as a first member) Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (not privileged enough to have clusters as a first member) egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value but is mostly pronounced as n
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (not privileged enough to have clusters)
য dʒ The secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word, it is a vowel æ (as in শ8ামল র 8াপার etc.). But after a non-initial consonant it just doubles it in pronunciation (as in কায6 ধায6 etc.). The +য combination has two physical manifestations: র 8 and য6
ক8(p+য)স8(+য)র 8(+য) [Just র 8 is never used in Bangla orthography; র 8া is, but then its last two symbols, Ya-phala + a-kara, constitute a vowel sign representing the vowel অ8া]
র r Two manifestations:
(i) রেফ repʰ, as the first member of a cluster, e.g. প6ৎ6 not6 য6D6(+deg+ব) (earlier E6=+brvbar+deg+ব, a four-term cluster) etc. (placed over the following consonant)
(ii) র-ফলা rɔ-pʰɔla, as the second or third member of a cluster, e.g. etc. (placed under the consonant it follows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃকঃ
অব
46
1 BanglaandNāgarīDevanāgarīScript
Bangla Devanāgarī
মU+09AE
मU+092E
িU+09BF
िU+093F
Table14-BanglaandDevanāgarīcross-scriptvariantcodepoint
2 BanglaandGurmukhiScript
Bangla Gurmukhı
মU+09AE
ਸU+0A38
িU+09BF
ਿU+0A3F
Table15-BanglaandGurmukhıcross-scriptvariantcodepoint
7 WholeLabelEvaluationRules(WLE) ThissectionprovidestheWLEsthatarerequiredbyallthelanguagesmentionedinsection32whenwritten inBangla12ScriptThe ruleshavebeendrafted in suchawaythattheycanbeeasilytranslatedintotheLGRspecificationsBelow are the symbols used in the WLE rules for each of the Indic SyllabicCategoryasmentionedinthetableprovidedinCodepointrepertoire(Section51)
C rarr Consonant
M rarr Kāra(Mātrā)
V rarr Vowel
B rarr Anusvāra
12 As used by the Unicode denoting and including both Assamese and Maṇipuri
47
D rarr Candrabindu
X rarr Visarga
H rarr Hasanta
Z rarr KhandaTa
S rarr S1S2(fromTable9)or(ae)Ya-phalā(V1HC1M1)whereV1iseither0985(অ-BENGALILETTERA)or098F(এ-BENGALILETTERE)His09CD(-BENGALISIGNVIRAMA)C1is-09AF(য-BENGALILETTERYA)M1is-09BE(া-BENGALIVOWELSIGNAA)S1 and S2 are valid even they are not allowed by the other context rules
P rarr Ra-Hasanta(C2H)whereC2iseither09B0(র-BENGALILETTERRA)or 09F0 (ৰ - ASSAMESE LETTER RA
Unicode name BENGALI LETTER RAWITHMIDDLEDIAGONAL)
His09CD(-BENGALISIGNVIRAMA)
Table16-SymbolsusedinWLErules
ItisalsoperhapsidealtomentionherethatinBanglatheconsonantletters(orgraphemes)arephysicallyjoinedtoformldquoclustersrdquothatcouldtheoreticallyconjoinfromtwotofourconsonantsandcombinetocreatenewshapesDashandChaudhuri(1998)statethatthereareldquonearly380uniqueconsonantclustersrdquooutofwhichBi-consonantalcombinationsare290three-lettercombinationsaccountforanother80andtherareroneswithfourlettersnumber10more[136Pg4]MoredetailsofsuchcombinationscouldbeseeninPabitraSarkar(1993)[135] 71FinalSetofWLERulesTheprevalentpatternsinBanglaandvariousrestrictionsbelowarethespecificWLErulesthatneedtobeimplemented
48
1 CisasetofCandCNwhereCNisthesetofnormalizedformsofড়ঢ়য়2 HmustbeprecededbyC
Example
3 MmustbeprecededbyCExampleকা
4 DmustbeprecededbyeitherofVCorMExampleআখখা হা
5 XmustbeprecededbyeitherofVCMorDExampleউঃখঃবঃাঃ দ ঃ
6 BmustbeprecededbyeitherofVCMorDExampleআংইংকং
7 ZmustbeprecededbyVCMDBXorPExampleইৎকৎাৎাৎপ6ৎrৎ (S is not listed because S ends with M Z may also follow S)rdquo
8 VCANNOTbeprecededbyHDetailsin711CaseofVprecededbyH
9 SCANNOTbeprecededbyH
Now letuselaborateeach rulewithexamples from the scriptkeeping inmindthe Bangla Assamese and Manipuri communities Some combinations ofcharactersmayseemunrealisticorrareinusagebutthereisnoharminaddingsuch ligaturesbecause it ispossible tocreate thembyanyusereasilybutmaynotbeattestedcombinationsCase of V Preceded by H There could be cases involvingmulti-word domains where Vmay need to beallowedtofollowanHegব8াsঅtইিuয়া baeligŋkʌv ɪndiə (U+09ACU+09CDU+09AFU+09BEU+0999U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1U+09BFU+09DFU+09BE)(meaningBankofIndia)ThisisthecasewheretwodifferentwordsarejoinedtogetherfirstofwhichendswithanH(অt)andthesecondwordbeginswithaV(ইিuয়া)Somesectionsofthe linguistic community require the explicit presence of H for fullrepresentation of the sound intended However by and large the form of thefirstwordwithoutanH(U+09CD)isconsideredenoughforfullrepresentationofthesoundintendedforthefirstword
49
This isauniquesituationnecessitatedbythelackofhyphenspaceortheZeroWidthNon-joinercharacterinthepermissiblesetofcharactersintheRootzonerepertoire Otherwise V is never required to be allowed to follow an HPermittingthismaycreateaperceptivesimilaritybetweentwolabels(withandwithout H) for majority of the linguistic communities hence this is explicitlyprohibitedbytheNBGPIn future if required depending on the prevailing requirements from thecommunitythefutureNBGPmayconsiderrevisitingthisrule
72 AdditionalExamplesfromBanglaABNF
Belowarea fewexampleswhichhelponeunderstandsomeof the rulesABNFputsinplaceThesearejustgivenforreferencepurposesandarenotmeanttobecomprehensive
1HMBDorXcannotoccurinthebeginningofaBanglawordExample
ক क
াক ाक
ংক क
ক क
ঃক ःकAscanbeseensuchcombinationwillresultautomaticallyinaldquogolurdquooradottedcircle marking it as an invalid formation This is an intrinsic property of theIndianlanguagesyllableandisquasi-automaticallyappliedwhereversupportedbytheOS
2HisnotpermittedafterVBDXMS
Example
অ अ
অং क ক क কঃ कः ি )क
3NumberofBDorXpermittedafterConsonantorVowelorakāra(Mātrā)isrestrictedtoonethusthefollowingcombinationsareinvalidated
50
Example
কংং क
ক क
কঃঃ कःः
কা का
কীঃঃ कःः
অংং अ
অ अ
অঃঃ अःः
4NumberofMpermittedafterConsonantisrestrictedtooneExample
কীী की5MisnotpermittedafterV Example
ইাঈৗ ईाईौ6ThecombinationsofAnusvāra+ VisargaaswellasVisarga+Anusvāraarenotpermissible
Example
কংঃ कः
কঃং कः
8 Contributors
81ExpertsfromIndia ProfessorUdayaNarayanaSinghChair-ProfessorofLinguisticsampDeanFacultyofArtsAmityUniversityHaryanaGurgaonPachgaonManesarPIN122431(Haryana)India
51
ProfessorPabitraSarkarformerlyVice-ChancellorRabindraBharatiUniversityKolkataDrAtiurRahmanKhanPrincipalTechnicalOfficerGISTGroupC-DACPunePIN411008(Maharashtra)IndiaMrRajibChakrabortyLinguistSocietyforNaturalLanguageTechnologyResearch(SNLTR)Module114amp130SDFBuildingSaltLakeSector-VKolkata-700091(WestBengal)IndiaMrAkshatJoshiProjectEngineerGISTGroupC-DACPunePIN411008(Maharashtra)IndiaMsMoumitaChowdhurySeniorTechnicalOfficerGISTGroupC-DACPunePIN411008(Maharashtra)IndiaMrChandrakantaMurasinghAgartalaTripuraSomeotherNBGPmembers
82ContributorsfromBangladeshJanabMustafaJabbarHonorableMinisterMinistryofPostsTelecommunicationsampInformationTechnologyGovtofBangladeshProfShamsuzzamanKhanFormerDirector-GeneralBanglaAcademyDhakaProfRafiqulIslamNationalProfessorofHumanitiesDhakaProfSwarochisSarkarDirectorInstituteofBangladeshStudiesRajshahiUniversityRajshahiBangladeshProfJinnatImtiazAliDirector-GeneralInternationalMotherLanguageInstituteDhakaMrMohammadMamunOrRashidDepartmentofBanglaJahangirnagarUniversityampMemberBangladeshComputerCouncilProfManiruzzamanformerlyProfessorChittagongUniversityChattagramBangladeshMrShyamSunderSikderSecretarySecretaryPostampTelecommunicationsDivisionGovtofBangladesh
52
MrMdMustafaKamalFormerDirectorGeneralBangladeshTelecommunicationsRegulatoryAuthorityGovernmentofBangladeshDhakaBrigadierGeneralMdMahfuzulKarimMajumderDirector-GeneralEngineeringampOperationsDivisionBangladeshTelecommunicationsRegulatoryAuthorityGovernmentofBangladeshDhakaMdZiarulIslamProgrammerPostsampTelecommunicationsDivisionGovernmentofBangladeshDhakaProfSyedShahriyarRahmanDepartmentofLinguisticsUniversityofDhakaDrMizanurRahmanDirector(In-Charge)TranslationTextBookandInternationalRelationsDivisionBanglaAcademyDhakaDrApareshBandyopadhyayDirectorBanglaAcademyDhakaMrMdMobarakHossainDirectorBanglaAcademyDhakaDrJalalAhmedDirectorBanglaAcademyDhakaMrJahangirHossainInternetSocietyBangladesh(ICANNALS)JanabSarwarMostafaChoudhuryBangladeshComputerCouncilDhakaJanabMdRashidWasifBangladeshComputerCouncilDhakaJanabIstiaqueArifSeniorAssistantDirectorBangladeshTelecommunicationsRegulatoryAuthorityDhakaMsAfifaAbbasInformationSecurityandGovernanceLeadEngineeratBanglalinkandICANNFellowMr Mohammad Abdul Haque Secretary General Bangladesh Internet GovernanceForumMrImranHossenCEOEyeSoftandkeymemberofBangladeshAssociationofSoftwareampInformationServices(BASIS)MsShahidaKhatunDirectorFolkloreMuseumampArchiveDivisionBanglaAcademyDhakaMrSyedAshikRehmanCEOBengalMediaCorporationDhakaMrHaseebRahmanCEOProfessionalsrsquoSystemsDhaka
53
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata; New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje and Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. 'Bangla Harapher Panch Parba'. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the M.S. University of Baroda, Social Science Number 29(2): 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25/11/2017.
[124] Ethnologue: Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25/11/2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20/10/2019.
[126] Wikipedia: Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25/11/2017.
[129] Omniglot: Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10/5/2018.
[130] Wikipedia: Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25/11/2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20, accessed on 10/5/2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10/5/2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10/5/2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 11: 23-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1(2): 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). 'Srīhatta-Nāgarī Lipir Utpatti o Bikāś'. Bangla Academy Patrika (Dhaka), Vol. 1(2) (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia: Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19/5/2018.
[139] Furui, Ryosuke (2015). 'Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century', in Bhairabi Prasad Sahu & Hermann Kulke, eds. Interrogating Political Systems: Integrative Processes and States in Pre-Modern India. Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). 'Phones of Bengali'. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646. https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia: Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. 'Bangla Spelling Reform: the Long and Short of It'. Bangla Journal 19: 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. "What has happened So Far In terms of Script Reforms". Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka, on 10/07/2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 – Core Specification, Chapter 12, p. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language or script, certain restriction rules apply. In other words, in a given language some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla¹³ in particular, the following rules apply:

1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and to the velar nasal ঙ in the Bangla scheme of the five-fold 'varga' (as defined under Table 5). Moreover, Bangla generally does not allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem 'Lichu-chor' written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).

2. C H can come with Khanda Ta only in the case where C is ra (র) (09B0): র্ৎ, as in ভর্ৎসনা.

3. Only the following combinations with V H C M will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
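Restrictions 1 and 3 above lend themselves to a very small check. The following is only a sketch under stated assumptions: the function name `restriction_problems`, the message strings, and the reading of restriction 1 as "ৎ, ঞ and ঙ may not start a label" are illustrative choices made here, not part of the normative LGR XML.

```python
import re

# Letters that restriction 1 bars from the start of a label (assumed reading):
# KHANDA TA (U+09CE), NYA (U+099E) and NGA (U+0999).
FORBIDDEN_INITIAL = {"\u09ce", "\u099e", "\u0999"}

# Independent vowels of the Bengali block, written as a regex character class.
VOWELS = "\u0985-\u098c\u098f\u0990\u0993\u0994"
VOWEL_PLUS_HASANTA = re.compile(f"[{VOWELS}]\u09cd")

# Restriction 3: the only V H C M sequences allowed are A/E + HASANTA + YA + AA.
ALLOWED_VHCM = ("\u0985\u09cd\u09af\u09be", "\u098f\u09cd\u09af\u09be")


def restriction_problems(label: str) -> list[str]:
    """Return the Appendix-I restrictions (1 and 3) that a label breaks."""
    problems = []
    if label[:1] in FORBIDDEN_INITIAL:
        problems.append("forbidden initial letter")
    for match in VOWEL_PLUS_HASANTA.finditer(label):
        # A vowel followed by hasanta is acceptable only inside the two
        # sequences listed under restriction 3.
        if not label.startswith(ALLOWED_VHCM, match.start()):
            problems.append("vowel + hasanta outside the allowed sequences")
    return problems
```

For instance, `restriction_problems("\u09ce\u09b8\u09c7")` (a label starting with ৎ) reports the initial-letter problem, while the spelling অ্যাসিড passes both checks.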
10.2 'Sylheti Nagarī lipi' or 'Siloti'
This version of the Bangla script resembles the 'Kaithī' script (ISO 15924) used by the accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, and was widely in use during the 1880s. There were several other names for Sylheti Nagarī or Siloti (129), such as 'Jalalabada Nagarī', 'Fula (flower) Nagarī', 'Muslim Nagarī' or 'Muhammad Nagarī'. It is said that Shah Jalal brought the script with him to Sylhet in the 13th-14th century (138), although some have suggested that it was an invention of the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here (138).

¹³ This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.

Table 17 – The Script Table of Sylheti Nagarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.

10.3.1 Bangla and Nāgarī or Devanāgarī

Bangla        Devanāgarī    NBGP Decision
ঃ U+0983     ः U+0903     Confusable
ও U+0993     उ U+0909     Confusable
ঘ U+0998     घ U+0918     Confusable
ঁ U+0981     ॅ U+0945     Confusable

Table 18 – Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhi

Bangla        Gurmukhi      NBGP Decision
ঘ U+0998     ਬ U+0A2C     Confusable
ঁ U+0981     ੱ U+0A71     Confusable

Table 19 – Bangla and Gurmukhi confusable code points

Bangla                    Gurmukhi      NBGP Decision
ও U+0993                 ਤ U+0A24     Distinguishable
শ U+09B6                 ਅ U+0A05     Distinguishable
ম U+09AE                 ਮ U+0A2E     Distinguishable
বা U+09AC and U+09BE     ਗ U+0A17     Distinguishable

Table 20 – Bangla and Gurmukhi distinguishable code points
10.3.3 Bangla and Oriya (Odia)

Bangla        Oriya (Odia)  NBGP Decision
ও U+0993     ଓ U+0B13     Confusable

Table 21 – Bangla and Oriya confusable code points

Bangla        Oriya (Odia)  NBGP Decision
ঘ U+0998     ସ U+0B38     Distinguishable

Table 22 – Bangla and Oriya distinguishable code points
11 Appendix-II: Bengali consonants and their allographs
Consonants    Phonetic Value    Allographs
                                Clusters    Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (not privileged enough to have clusters as a first member) জ্ঝ (জ+ঝ) ঞ্ঝ (ঞ+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (not privileged enough to have clusters as a first member) ঙ্খ (ঙ+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value, but is mostly pronounced as n
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (not privileged enough to have clusters)
য dʒ The secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word, it is a vowel æ (as in শ্যামল, র‍্যাপার etc.). But after a non-initial consonant, it just doubles that consonant in pronunciation (as in কার্য, ধার্য etc.). The র+য combination has two physical manifestations: র‍্য and র্য.
ক্য (ক+য) স্য (স+য) র‍্য (র+য) [Just র‍্য is never used in Bangla orthography; র‍্যা is, but then its last two symbols, Ya-phala and ā-kāra, constitute a vowel sign representing the vowel অ্যা]
র r Two manifestations:
i. রেফ repʰ, as the first member of a cluster, e.g. র্প, র্ৎ, র্দ্র, র্য, র্ধ্ব (র+ধ+ব) (earlier র্দ্ধ্ব = র+দ+ধ+ব, a four-term cluster) etc. (placed over the following consonant)
ii. র-ফলা rɔ-pʰɔla, as the second or third member of a cluster (placed under the consonant it follows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃ কঃ
অব
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khanda Ta
S → S1 S2 (from Table 9), or (a/e) Ya-phalā (V1 H C1 M1), where V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E), H is 09CD (্ - BENGALI SIGN VIRAMA), C1 is 09AF (য - BENGALI LETTER YA) and M1 is 09BE (া - BENGALI VOWEL SIGN AA). S1 and S2 are valid even if they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H), where C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA; Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL) and H is 09CD (্ - BENGALI SIGN VIRAMA)

Table 16 - Symbols used in WLE rules

It is also worth mentioning here that in Bangla the consonant letters (or graphemes) are physically joined to form "clusters" that could theoretically conjoin from two to four consonants and combine to create new shapes. Dash and Chaudhuri (1998) state that there are "nearly 380 unique consonant clusters", out of which bi-consonantal combinations are 290, three-letter combinations account for another 80, and the rarer ones with four letters number 10 more [136, Pg. 4]. More details of such combinations can be seen in Pabitra Sarkar (1993) [135].

7.1 Final Set of WLE Rules
Given the prevalent patterns in Bangla and the various restrictions, below are the specific WLE rules that need to be implemented:
1. C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়
2. H must be preceded by C
   Example: ক্
3. M must be preceded by C
   Example: কা
4. D must be preceded by either of V, C or M
   Example: আঁ খঁ খাঁ হাঁ
5. X must be preceded by either of V, C, M or D
   Example: উঃ খঃ বঃাঃ দঁঃ
6. B must be preceded by either of V, C, M or D
   Example: আং ইং কং
7. Z must be preceded by V, C, M, D, B, X or P
   Example: ইৎ কৎ াৎাৎপ6ৎrৎ (S is not listed because S ends with M; Z may also follow S)
8. V CANNOT be preceded by H. Details in 7.1.1, Case of V preceded by H
9. S CANNOT be preceded by H
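The nine rules above are all "what may precede this symbol" constraints, so they can be sketched as a single table lookup over a symbol stream. The sketch below is illustrative only and rests on several assumptions that are not taken from the normative XML: the code point ranges assigned to V, C and M are the plain Unicode Bengali block ranges rather than the exact LGR repertoire, S is recognised only in its Ya-phalā form (অ্যা / এ্যা), and S is treated like M for whatever follows it, extending the note under rule 7 to rules 4 to 6. All names (`tokenize`, `is_valid`, `ALLOWED_BEFORE`) are hypothetical.

```python
import unicodedata


def _chars(*ranges):
    """Expand inclusive (low, high) code point ranges into a set of characters."""
    return {chr(cp) for low, high in ranges for cp in range(low, high + 1)}


# Assumed symbol classes, named as in Table 16.
CLASSES = {
    "V": _chars((0x0985, 0x098C), (0x098F, 0x0990), (0x0993, 0x0994)),
    "C": _chars((0x0995, 0x09A8), (0x09AA, 0x09B0), (0x09B2, 0x09B2),
                (0x09B6, 0x09B9), (0x09DC, 0x09DD), (0x09DF, 0x09DF),
                (0x09F0, 0x09F1)),
    "M": _chars((0x09BE, 0x09C4), (0x09C7, 0x09C8), (0x09CB, 0x09CC)),
    "H": {"\u09cd"}, "B": {"\u0982"}, "D": {"\u0981"},
    "X": {"\u0983"}, "Z": {"\u09ce"},
}
CHAR_CLASS = {ch: name for name, chars in CLASSES.items() for ch in chars}

# Rule 1: fold base letter + nukta into one consonant (the CN set).
NUKTA_FORMS = {"\u09a1\u09bc": "\u09dc", "\u09a2\u09bc": "\u09dd",
               "\u09af\u09bc": "\u09df"}
RA = {"\u09b0", "\u09f0"}      # C2 in the definition of P
S_TAIL = "\u09cd\u09af\u09be"  # H C1 M1 in the definition of S

# Rules 2-7: symbols that may directly precede each restricted symbol.
ALLOWED_BEFORE = {
    "H": {"C"},
    "M": {"C"},
    "D": {"V", "C", "M", "S"},
    "X": {"V", "C", "M", "D", "S"},
    "B": {"V", "C", "M", "D", "S"},
    "Z": {"V", "C", "M", "D", "B", "X", "S"},
}


def tokenize(label):
    """Turn a label into (symbol, text) pairs, collapsing Ya-phalā S sequences."""
    text = unicodedata.normalize("NFC", label)
    for decomposed, composed in NUKTA_FORMS.items():
        text = text.replace(decomposed, composed)
    tokens, i = [], 0
    while i < len(text):
        if text[i] in "\u0985\u098f" and text[i + 1:i + 4] == S_TAIL:
            tokens.append(("S", text[i:i + 4]))
            i += 4
        else:
            tokens.append((CHAR_CLASS.get(text[i], "?"), text[i]))
            i += 1
    return tokens


def is_valid(label):
    """Check a label against WLE rules 2-9 as sketched above."""
    tokens = tokenize(label)
    if not tokens:
        return False
    for i, (symbol, _) in enumerate(tokens):
        previous = tokens[i - 1][0] if i else None
        if symbol == "?":
            return False  # outside the assumed repertoire
        if symbol in ALLOWED_BEFORE:
            if previous in ALLOWED_BEFORE[symbol]:
                continue
            # Rule 7: Z may also follow P, i.e. ra + hasanta.
            if symbol == "Z" and previous == "H" and i >= 2 and tokens[i - 2][1] in RA:
                continue
            return False
        if symbol in ("V", "S") and previous == "H":
            return False  # rules 8 and 9
    return True
```

Because every restricted symbol needs a predecessor from its allowed set, a label that starts with H, M, B, D, X or Z is rejected without a separate start-of-label rule.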
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in permitting such ligatures: any user can easily create them, even though they may not be attested combinations.

7.1.1 Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া bæŋk ʌv ɪndiə (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE) (meaning Bank of India). This is the case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.

This is a unique situation necessitated by the lack of a hyphen, a space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic communities; hence this is explicitly prohibited by the NBGP. In future, if required, depending on the prevailing requirements from the community, the future NBGP may consider revisiting this rule.
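The "Bank of India" case can be reproduced directly from the code points quoted above. The sketch below, with hypothetical names, locates every independent vowel that directly follows a hasanta; the vowel set is an assumption based on the Unicode Bengali block, not the exact LGR repertoire.

```python
# Code points of the example label, as listed in the text.
BANK_OF_INDIA = [
    0x09AC, 0x09CD, 0x09AF, 0x09BE, 0x0999, 0x09CD, 0x0995,  # first word
    0x0985, 0x09AB, 0x09CD,                                   # second word, ends in H
    0x0987, 0x09A8, 0x09CD, 0x09A1, 0x09BF, 0x09DF, 0x09BE,  # third word, starts with V
]

HASANTA = 0x09CD
# Independent vowels (V); assumed from the Unicode Bengali block.
INDEPENDENT_VOWELS = set(range(0x0985, 0x098D)) | {0x098F, 0x0990, 0x0993, 0x0994}


def vowels_after_hasanta(code_points):
    """Return the indices of independent vowels that directly follow a hasanta."""
    return [
        i
        for i in range(1, len(code_points))
        if code_points[i - 1] == HASANTA and code_points[i] in INDEPENDENT_VOWELS
    ]


# The same label with the word-final hasanta (index 9) left out.
WITHOUT_FINAL_HASANTA = [cp for i, cp in enumerate(BANK_OF_INDIA) if i != 9]
```

The full label has exactly one such position, the ই at index 10, which is what rule 8 rejects; the variant without the word-final hasanta has none and therefore passes this particular check.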
7.2 Additional Examples from Bangla ABNF

Below are a few examples which help one understand some of the rules ABNF puts in place. These are given for reference purposes only and are not meant to be comprehensive.

1. H, M, B, D or X cannot occur at the beginning of a Bangla word.
Example:
্ক ्क
াক ाक
ংক ंक
ঁক ँक
ঃক ःक
As can be seen, such a combination will automatically result in a "golu", or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.

2. H is not permitted after V, B, D, X, M, S.
Example:
অ अ
অং क ক क কঃ कः ি )क
3. The number of B, D or X permitted after a consonant, a vowel or a kāra (Mātrā) is restricted to one; thus the following combinations are invalid.
Example:
কংং क
ক क
কঃঃ कःः
কা का
কীঃঃ कःः
অংং अ
অ अ
অঃঃ अःः
4. The number of M permitted after a consonant is restricted to one.
Example:
কীী की
5. M is not permitted after V.
Example:
ইাঈৗ ईाईौ
6. The combinations of Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Example:
কংঃ कः
কঃং कः
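Several of the patterns listed in this section are local enough to be caught with regular expressions. The following sketch covers examples 1, 3, 4, 5 and 6; the pattern names, the `violations` helper and the code point ranges (taken from the Unicode Bengali block) are assumptions made for illustration and do not reproduce the ABNF itself.

```python
import re
import unicodedata

SIGNS = "\u0981\u0982\u0983"                         # D, B, X
MATRAS = "\u09be-\u09c4\u09c7\u09c8\u09cb\u09cc"     # M (dependent vowel signs)
VOWELS = "\u0985-\u098c\u098f\u0990\u0993\u0994"     # V (independent vowels)

INVALID_PATTERNS = {
    # Example 1: H, M, B, D or X at the beginning of a label.
    "sign at start": re.compile(f"^[{SIGNS}\u09cd{MATRAS}]"),
    # Example 3: the same B, D or X repeated.
    "repeated sign": re.compile(r"([\u0981-\u0983])\1"),
    # Example 4: more than one M in a row.
    "two matras": re.compile(f"[{MATRAS}]{{2}}"),
    # Example 5: M directly after V.
    "matra after vowel": re.compile(f"[{VOWELS}][{MATRAS}]"),
    # Example 6: Anusvara + Visarga in either order.
    "anusvara with visarga": re.compile("\u0982\u0983|\u0983\u0982"),
}


def violations(label):
    """Return the names of the patterns above that occur in the label."""
    # NFC composes the two-part vowel signs (U+09C7 + U+09BE -> U+09CB), so a
    # decomposed O-kar is not mistaken for two separate matras.
    text = unicodedata.normalize("NFC", label)
    return [name for name, pattern in INVALID_PATTERNS.items() if pattern.search(text)]
```

A regular expression only sees adjacent code points, so this complements rather than replaces a full check of the WLE rules in section 7.1.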
8 Contributors

8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr. Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr. Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr. Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms. Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr. Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members
82ContributorsfromBangladeshJanabMustafaJabbarHonorableMinisterMinistryofPostsTelecommunicationsampInformationTechnologyGovtofBangladeshProfShamsuzzamanKhanFormerDirector-GeneralBanglaAcademyDhakaProfRafiqulIslamNationalProfessorofHumanitiesDhakaProfSwarochisSarkarDirectorInstituteofBangladeshStudiesRajshahiUniversityRajshahiBangladeshProfJinnatImtiazAliDirector-GeneralInternationalMotherLanguageInstituteDhakaMrMohammadMamunOrRashidDepartmentofBanglaJahangirnagarUniversityampMemberBangladeshComputerCouncilProfManiruzzamanformerlyProfessorChittagongUniversityChattagramBangladeshMrShyamSunderSikderSecretarySecretaryPostampTelecommunicationsDivisionGovtofBangladesh
52
MrMdMustafaKamalFormerDirectorGeneralBangladeshTelecommunicationsRegulatoryAuthorityGovernmentofBangladeshDhakaBrigadierGeneralMdMahfuzulKarimMajumderDirector-GeneralEngineeringampOperationsDivisionBangladeshTelecommunicationsRegulatoryAuthorityGovernmentofBangladeshDhakaMdZiarulIslamProgrammerPostsampTelecommunicationsDivisionGovernmentofBangladeshDhakaProfSyedShahriyarRahmanDepartmentofLinguisticsUniversityofDhakaDrMizanurRahmanDirector(In-Charge)TranslationTextBookandInternationalRelationsDivisionBanglaAcademyDhakaDrApareshBandyopadhyayDirectorBanglaAcademyDhakaMrMdMobarakHossainDirectorBanglaAcademyDhakaDrJalalAhmedDirectorBanglaAcademyDhakaMrJahangirHossainInternetSocietyBangladesh(ICANNALS)JanabSarwarMostafaChoudhuryBangladeshComputerCouncilDhakaJanabMdRashidWasifBangladeshComputerCouncilDhakaJanabIstiaqueArifSeniorAssistantDirectorBangladeshTelecommunicationsRegulatoryAuthorityDhakaMsAfifaAbbasInformationSecurityandGovernanceLeadEngineeratBanglalinkandICANNFellowMr Mohammad Abdul Haque Secretary General Bangladesh Internet GovernanceForumMrImranHossenCEOEyeSoftandkeymemberofBangladeshAssociationofSoftwareampInformationServices(BASIS)MsShahidaKhatunDirectorFolkloreMuseumampArchiveDivisionBanglaAcademyDhakaMrSyedAshikRehmanCEOBengalMediaCorporationDhakaMrHaseebRahmanCEOProfessionalsrsquoSystemsDhaka
53
9 References[100] UnicodeConsortium2017UnicodeStandard100MountainViewCA
[101] BandyopadhyayChittaranjan1981DuiShatakerBanglaMudranoPrakashanKolkataAnandaPublishers
[102] BanerjiRD1919TheOriginoftheBengaliScriptKolkataNewDelhiAsianEducationalServices2003reprint
[103] ChatterjiSK1926TheOriginandDevelopmentoftheBengaliLanguageCalcuttaCalcuttaUniversityPress
[104] -----1939Bhasha-prakashBangalaVyakaran(AGrammaroftheBengaliLanguage)CalcuttaUniversityofCalcutta
[105] HaiMuhammadAbdul1964DhvaniVijnanOBanglaDhvani-tattwa(PhoneticsandBengaliPhonology)DhakaBanglaAcademy
[106]JhaSubhadra1958TheFormationofMaithiliLondonLuzacampCo
[107] KosticDjordjeDasRheaS1972AShortOutlineofBengaliPhoneticsCalcuttaStatisticalPublishingCompany
[108] MajumdarRC1971HistoryofAncientBengalCalcuttaGBhardwaj
[109] MazumdarBijaychandra19202000TheHistoryoftheBengaliLanguage(ReprCalcutta1920ed)NewDelhiAsianEducationalServices
[110] PandeyAnshuman2001ProposaltoEncodetheTirhutaScriptinISOIEC10646
[111] PalPalashBaran2001DhwanimalaBarnamalaKolkataPapyrus
[112] -----2007lsquoBanglaHarapherPanchParbarsquoInSwapanChakrabortyedMudranerSanskritiOBanglaBoiKolkataAbabhas
[113] RossFiona1999ThePrintedBengaliCharacteranditsEvolutionLondonCurzon
[114] ShastriMahamahopadhyayHaraPrasad1916HājārBacharērPurāṇaBāṅgālāBhāṣāyBauddhaGānōDōhāCalcuttaBangiyaSahityaParishat
[115] SinghUdayaNarayana(JointlyManiruzzaman)1983DiglossiainBangladeshandlanguageplanningCalcuttaGyanBharati214pp
[116] -----1987ABibliographyofBengaliLinguisticsMysoreCIILxii+316pp
[117] -----2017(withRajibChakrabortyBidishaBhattacharjeeampArimardanKumarTripathy)LanguagesandCulturesontheMarginGuidelinesforFieldworkonEndangeredLanguagesMimeoCentreforEndangeredLanguagesVisva-Bharati
54
[118] -----1980ScriptalchoiceandspellingreformAnessayinlanguageandplanningJournaloftheMSUniversityofBarodaSocialScienceNumber292173-186AmodifiedversionreprintedEAnnamalaiBjornJernuddandJoanRubinedsLanguagePlanningProceedingsofanInstituteMysoreCIIL405-417
[119] Sripantha1996JakhanChapakhanaEloKolkataPaschim-BangaBanglaAcademy
[120] SurAtul1986BanglaMudranerDushoBacharKolkataJijnasa
[121] ScriptBehaviourforBengaliVersion11TDILandC-DACPune
[122] BoraMahendra1981TheEvolutionofAssameseScriptJorhatAssamSahityaSabha
[123] ProposaltoEncodetheTirhutaScriptinISOIEC10646httpwwwunicodeorgL2L201111175r-tirhutapdfaccessedon25112017
[124] EthnologueAssameseintheLanguageCloudhttpswwwethnologuecomcloudasmaccessedon25112017
[125] BengalialphabetforManipurifoundinEthnologueManipuri(MeeteilonMeithei)httpswwwomniglotcomwritingmanipurihtmaccessedon20102019
[126] WikipediaBengalialphabethttpsenwikipediaorgwikiBengali_alphabetaccessedon25112017
[129]OmniglotSlyhetihttpwwwomniglotcomwritingsylotihtmaccessedon1052018
[130]WikipediaBishnupriyaManipurilanguagehttpsenwikipediaorgwikiBishnupriya_Manipuri_languageaccessedon25112017
[131] TheEMILLECIILCorpushttpmetashareeldaorgrepositorybrowsethe-emilleciil-corpusabdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20accessedon1052018
[132]TheEMILLECIILCorpushttpcatalogelrainfoproduct_infophpproducts_id=696accessedon1052018
[133] BanglaLanguageampScript httpswwwisicalacin~rc_banglabanglahtmlaccessedon1052018
[134] SarkarPabitra1992BanglaBananSanskarSamasyaoSambhabanaKolkataChirayataPrakashan
[135] SarkarPabitra1993BanglaBhasharYuktabyanjanBhasha1123-45
55
[136] DashNiladriShekharandBBChaudhuri1998BanglaScriptAStructuralStudyLinguisticsToday121-28Alsoavailableathttpswwwacademiaedu9967428Bangla_Script_A_Structural_Study
[137] DaniAhmedHasan(1957)lsquoSrīhatta-NāgarīLipirUtpattioBikāśrsquoBanglaAcademyPatrika(Dhaka)Vol12(Bhadra-Agrahayan1364BangabdaNumber)pg1
[138] WikipediaSylhetiNagari httpsenwikipediaorgwikiSylheti_Nagariaccessedon1952018
[139] FuruiRyosuke(2015)lsquoVariegatedAdaptationsStateFormationinBengalfromtheFifthtoSeventhCenturyrsquoinBhairabiPrasadSahuampHermannKulkeedsInterrogatingPoliticalSystemsIntegrativeProcessesandStatesinPre-ModernIndiaChapter9Pp255-73NewDelhiManohar
[140] FergusonCharesAandMunierChowdhury(1960)lsquoPhonesofBengalirsquoLanguageVol36No1pp22-59
[141] ShahidullahMuhammad(2007)BuddhistMysticSongsDhakaMowlaBrothers
[142] RayPunyaSloka(1966)BengaliLanguageHandbookWashington
[143] HaiMuhammadAbdul(1960)AphoneticandphonologicalstudyofnasalsandnasalizationinBengaliDhakaUniversityofDhaka
[144] UnicodeConsortiumProposalSummaryFormtoAccompanySubmissionsforAdditionstotheRepertoireofISOIEC10646UNICODEhttpswwwunicodeorgL2L200202387r-syloti-formpdfaccessedonMay212018
[145] WikipediaOlChiki(Unicodeblock)httpsenwikipediaorgwikiOl_Chiki_(Unicode_block)accessedonMay212018
[146]BanglaScripthttpwwwbangladesh2000combdbangla_scripthtmlaccessedonMay212018
[147] BhattacharyaAshutoshed(1942)GopichandrerGanCalcuttaCalcuttaUniversity
[149] DasSisirKumar(1975)SahibsandMunshisAnAccountoftheCollegeofFortWilliamCalcutta
[150] IslamRafiqulPabitraSarkarMahbubulHaqampRajibChakraborty(eds)(2014)BanglaAcademyPromitoBanglaByabaharikByakaran(AFunctionalGrammarofStandardBangla)DhakaBanglaAcademy
[151] SarkarPabitra[2013]lsquoBanglaSpellingReformtheLongandShortofItrsquoBanglaJournal19215-232
[152] BanglaAcademy(2012)BanglaAcademyPromitoBanglaBananerNiyam(StandardBanglaSpellingasadoptedbyBanglaAcademy)DhakaBanglaAcademy
56
[153] SarkarPabitraampRajibChakraborty2018ldquoWhathashappenedSoFarIntermsof Script Reformsrdquo Paper presented at the Face to Face meeting jointly held by theBanglaAcademyDhakaampICANNatBanglaAcademyDhakaon10072018
[154] The Unicode Consortium 2018 The Unicodereg Standard Version 110 ndash CoreSpecificationChapter12P473
57
10 Appendix-I
101 AugmentedBackus-NaurFormalism(ABNF)The Augmented Backus-Naur Formalism (ABNF) is generic in nature and whenappliedtoaspecificlanguagescriptcertainrestrictionrulesapplyInotherwordsinagivenlanguagesomeoftheFormalismstructuresdonotnecessarilyapplyTotakecareofsuchcasesrestrictionrulesaresetinplaceTheserestrictionswillhelptofine-tunetheABNFIncaseofBangla13inparticularthefollowingrulesapply
1Khaṇḍata (ৎ) is NOT allowed at the beginning of an IDN label The sameappliestoঞandthevelarnasalঙintheBanglaSchemeoffive-foldlsquovargarsquo(asdefined under Table 5) Moreover Bangla does not allow ya (য়) in thebeginningofawordeitherbutwecanciteacoupleofnativeexamples forexamplethewordয়8াvেড়া(yaeligbbɔRo)fromthepoemlsquoLichuchorrsquowrittenbyKaziNazrul IslamHowever there are instances of it being used in namesmostlyof foreignoriginsuchasYaqubwhichmaybewrittenwithya(য়) inthe beginning as inয়াwব) In very recent timeswhile transliterating someChineseandJapanesenamesinBanglaonedoescomeacrossthepossibilityofKhaṇḍata (ৎ) followedbysa (স) inthebeginningofaword forexamplexেসিরং(Tsering)
2CHcancomewithKhandaTainonlythecasewhereCisra(র)(09B0)
ৎ6 asinভৎ6 সনা
3OnlyfollowingcombinationswithVHCMwillbeallowedrarrঅ8া(togetherpronouncedasaelig)asinঅ8ািসড(acid)rarrএ8া(togetheralsopronouncedasaelig)asinএ8ািসডএ8ােসািসেয়শান
(acidassociation)
102 lsquoSylhetiNagarılipirsquoorlsquoSilotirsquoThisversionofBanglascriptresemblesthe lsquoKaithīrsquoscript(ISO12954)usedbytheAccountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh andBiharndashwidelyinuseduringthe1880sTherewereseveralothernamesofSylheti
13 This section specifically takes up issues of restrictions pertaining to Bangla (Bengali) language Assamese and Maṇipuri have not been covered in this section
58
Nagarı or Siloti (129)ndash suchas lsquoJalalabadaNagarırsquo lsquoFula (flower)Nagarırsquo lsquoMuslimNagarırsquoorlsquoMuhammadNagarırsquoItissaidthatShahJalalahadbroughtthescriptwithhim in13th-14thCentury inSylhet (138)althoughsomesuggested that itwasaninvention by the Afghan rulers of Sylhet (137) Some ascribe the credit to theBuddhistBhikkhusfromNepalPurelyforhistoricalreasonsthedetailsofthescriptwith32symbolsarereproducedhere(138)
Table17ndashTheScriptTableofSylhetiNagarıorSiloti
103 ConfusablecodepointsThe following code points were analysed and concluded that they are either (a)distinguishable or (b) confusable but not enough to be defined as variant codepoints
1031 BanglaandNāgarīorDevanāgarī
Bangla Devanāgarī NBGP
Decision
ঃ U+0983 ः U+0903 Confusable
ওU+0993 उU+0909 Confusable
ঘU+0998 घU+0918 Confusable
U+0981 U+0945 Confusable
Table18BanglaandDevanāgarīconfusablecodepoints
59
1032 BanglaandGurmukhi
Bangla Gurmukhi NBGPdecision
ঘU+0998 ਬU+0A2C Confusable
U+0981 U+0A71 Confusable
Table19BanglaandGurmukhiconfusablecodepoints
Bangla Gurmukhi
NBGPdecision
ওU+0993 ਤU+0A24
Distinguishable
শU+09B6 ਅU+0A05
Distinguishable
মU+09AE ਮU+0A2E
Distinguishable
বাU+09ACandU+09BE
ਗU+0A17
Distinguishable
Table20ndashBanglaandGurmukhıdistinguishablecodepoints
1033BanglaandOriya(Odia)
Bangla Oriya(Odia) NBGPDecision
ওU+0993 ଓU+0B13 Confusable
Table21ndashBanglaandOriyadistinguishablecodepoints
Bangla Oriya(Odia) NBGP
Decision
ঘU+0998 ସU+0B38 Distinguishable
Table22ndashBanglaandOriyadistinguishablecodepoints
60
11 Appendix-IIBengaliconsonantsandtheirallographs
Consonants PhoneticValue Allographs
Clusters TransparentForm(BanglaAkademifont)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
61
Consonants PhoneticValue Allographs
Clusters TransparentForm(BanglaAkademifont)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
1 C is a set of C and CN, where CN is the set of normalized forms of ড়, ঢ়, য়
2 H must be preceded by C
3 M must be preceded by C. Example: কা
4 D must be preceded by either of V, C or M. Example: আঁ, খঁ, খাঁ, হাঁ
5 X must be preceded by either of V, C, M or D. Example: উঃ, খঃ, বঃ
6 B must be preceded by either of V, C, M or D. Example: আং, ইং, কং
7 Z must be preceded by V, C, M, D, B, X or P. Example: ইৎ, কৎ (S is not listed because S ends with M; Z may also follow S)
8 V CANNOT be preceded by H. Details in 7.1.1, Case of V preceded by H
9 S CANNOT be preceded by H
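The context rules above can be sketched as a small checker. This is only an illustrative sketch, not the normative LGR. The code point ranges assigned to each category are assumptions, the sequence category S is left out (so a label containing অ্যা is rejected here), and P is read as ra + H, following the Khaṇḍa ta restriction in Appendix-I. The names `category`, `first_violation` and `ALLOWED_BEFORE` are hypothetical.

```python
from typing import Optional


def _chars(*ranges):
    """Expand inclusive (low, high) code point ranges into a set of characters."""
    return {chr(cp) for low, high in ranges for cp in range(low, high + 1)}


# Assumed category membership (illustrative only, not the normative repertoire).
VOWELS = _chars((0x0985, 0x098C), (0x098F, 0x0990), (0x0993, 0x0994))   # V
CONSONANTS = _chars((0x0995, 0x09A8), (0x09AA, 0x09B0),
                    (0x09B2, 0x09B2), (0x09B6, 0x09B9))                 # C
MATRAS = _chars((0x09BE, 0x09C4), (0x09C7, 0x09C8), (0x09CB, 0x09CC))   # M
MARKS = {"\u09CD": "H", "\u0981": "D", "\u0982": "B",
         "\u0983": "X", "\u09CE": "Z", "\u09BC": "N"}
# Rule 1: the nukta forms are assumed to arrive normalized as base + nukta.
NUKTA_BASES = {"\u09A1", "\u09A2", "\u09AF"}
RA = "\u09B0"

# Rules 2-7: which categories may immediately precede each category.
ALLOWED_BEFORE = {
    "H": {"C"},
    "M": {"C"},
    "D": {"V", "C", "M"},
    "X": {"V", "C", "M", "D"},
    "B": {"V", "C", "M", "D"},
    "Z": {"V", "C", "M", "D", "B", "X"},  # P (ra + H) is handled separately
}


def category(ch: str) -> Optional[str]:
    """Return the assumed category letter for a code point, or None."""
    if ch in VOWELS:
        return "V"
    if ch in CONSONANTS:
        return "C"
    if ch in MATRAS:
        return "M"
    return MARKS.get(ch)


def first_violation(label: str) -> Optional[str]:
    """Return a description of the first rule violation, or None if none is found."""
    prev = None  # category of the preceding code point; None at label start
    for i, ch in enumerate(label):
        cat = category(ch)
        if cat is None:
            return f"position {i}: U+{ord(ch):04X} is outside the assumed repertoire"
        if cat == "N":
            if prev != "C" or label[i - 1] not in NUKTA_BASES:
                return f"position {i}: nukta without a valid base consonant"
            continue  # consonant + nukta still counts as C
        if cat in ALLOWED_BEFORE and prev not in ALLOWED_BEFORE[cat]:
            # Z may follow H only when that H follows ra (our reading of P).
            reph = cat == "Z" and prev == "H" and i >= 2 and label[i - 2] == RA
            if not reph:
                return f"position {i}: {cat} cannot follow {prev or 'start of label'}"
        if cat == "V" and prev == "H":
            return f"position {i}: V cannot follow H"  # rule 8
        prev = cat
    return None


print(first_violation("\u0995\u09BE"))        # C + M
print(first_violation("\u0995\u0982\u0983"))  # C + B + X
```

Because every mark category requires a specific predecessor, a label that begins with H, M, B, D, X or Z is rejected by the same table lookup, with no separate start-of-label test.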
Now let us elaborate each rule with examples from the script, keeping in mind the Bangla, Assamese and Manipuri communities. Some combinations of characters may seem unrealistic or rare in usage, but there is no harm in allowing such ligatures: any user can create them easily, even though they may not be attested combinations.
Case of V Preceded by H
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. ব্যাঙ্কঅফ্ইন্ডিয়া /bæŋkʌv ɪndiə/ (U+09AC U+09CD U+09AF U+09BE U+0999 U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1 U+09BF U+09DF U+09BE) (meaning Bank of India). This is the case where two different words are joined together, the first of which ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H (U+09CD) is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic communities; hence this is explicitly prohibited by the NBGP.
In future, if required, depending on the prevailing requirements from the community, the future NBGP may consider revisiting this rule.
7.2 Additional Examples from Bangla ABNF
Below are a few examples which help one understand some of the rules ABNF puts in place. These are just given for reference purposes and are not meant to be comprehensive.
1 H, M, B, D or X cannot occur at the beginning of a Bangla word. Example:
্ক ्क
াক ाक
ংক ंक
ঁক ँक
ঃক ःक
As can be seen, such a combination will automatically result in a “golu”, or dotted circle, marking it as an invalid formation. This is an intrinsic property of the Indian language syllable and is quasi-automatically applied wherever supported by the OS.
2 H is not permitted after V, B, D, X, M, S
Example
অ্ अ्
অং্ कं्
কঁ্ कँ्
কঃ্ कः्
কি্ कि्
3 Number of B, D or X permitted after a Consonant, Vowel or akāra (Mātrā) is restricted to one; thus the following combinations are invalid:
Example
কংং कंं
কঁঁ कँँ
কঃঃ कःः
কাঁঁ काँँ
কীঃঃ कीःः
অংং अंं
অঁঁ अँँ
অঃঃ अःः
4 Number of M permitted after a Consonant is restricted to one. Example
কীী कीी
5 M is not permitted after V. Example
ইা ঈৗ ईा ईौ
6 The combinations of Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible
Example
কংঃ कंः
কঃং कःं
8 Contributors
8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr. Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr. Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr. Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms. Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr. Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members
8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt. of Bangladesh
Prof. Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof. Rafiqul Islam, National Professor of Humanities, Dhaka
Prof. Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof. Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr. Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof. Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr. Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt. of Bangladesh
Mr. Md. Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md. Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md. Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof. Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr. Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr. Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr. Md. Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr. Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr. Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md. Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms. Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr. Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr. Imran Hossen, CEO, EyeSoft, and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms. Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr. Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr. Haseeb Rahman, CEO, Professionals’ Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje & Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. ‘Bangla Harapher Panch Parba’. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the MS University of Baroda, Social Science Number, 29.2: 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25.11.2017.
[124] Ethnologue. Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25.11.2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue. Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20.10.2019.
[126] Wikipedia. Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25.11.2017.
[129] Omniglot. Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10.5.2018.
[130] Wikipedia. Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25.11.2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emille-ciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20, accessed on 10.5.2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10.5.2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10.5.2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1.1: 23-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1.2: 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). ‘Srīhatta-Nāgarī Lipir Utpatti o Bikāś’. Bangla Academy Patrika (Dhaka), Vol. 1.2 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia. Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19.5.2018.
[139] Furui, Ryosuke (2015). ‘Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century’, in Bhairabi Prasad Sahu & Hermann Kulke, eds. Interrogating Political Systems: Integrative Processes and States in Pre-Modern India. Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). ‘Phones of Bengali’. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646. UNICODE. https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia. Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. ‘Bangla Spelling Reform: the Long and Short of It’. Bangla Journal 19: 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. “What has happened So Far In terms of Script Reforms”. Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka on 10.07.2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 – Core Specification, Chapter 12, p. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when applied to a specific language or script, certain restriction rules apply. In other words, in a given language, some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla¹³ in particular, the following rules apply:
1 Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold ‘varga’ (as defined under Table 5). Moreover, Bangla does not normally allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem ‘Lichu chor’ written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning, as in য়াকুব. In very recent times, while transliterating some Chinese and Japanese names into Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2 C H can come with Khaṇḍa ta only in the case where C is ra (র) (U+09B0):
র্ৎ as in ভর্ৎসনা
3 Only the following combinations with V H C M will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
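Restrictions 1 and 3 above lend themselves to a short illustration. The sketch below is a hedged reading of those two restrictions only. The function names and constants are hypothetical, the vowel set is an assumed approximation of category V, and the word-initial ya (য়) cases discussed in restriction 1 are deliberately not modelled.

```python
HALANT = "\u09CD"

# Restriction 1: khanda ta, nya and nga may not begin a label.
FORBIDDEN_INITIALS = {"\u09CE", "\u099E", "\u0999"}

# Assumed membership of category V (independent vowels).
VOWELS = {chr(cp) for cp in (*range(0x0985, 0x098D), 0x098F, 0x0990, 0x0993, 0x0994)}

# Restriction 3: the only V + H + C + M sequences allowed are
# a + halant + ya + aa-kar and e + halant + ya + aa-kar.
ALLOWED_VHCM = {"\u0985\u09CD\u09AF\u09BE", "\u098F\u09CD\u09AF\u09BE"}


def starts_legally(label: str) -> bool:
    """True if the label is non-empty and does not begin with a forbidden letter."""
    return bool(label) and label[0] not in FORBIDDEN_INITIALS


def vowel_halant_ok(label: str) -> bool:
    """True if every vowel directly followed by halant opens an allowed sequence."""
    for i, ch in enumerate(label):
        if ch in VOWELS and label[i + 1:i + 2] == HALANT:
            if label[i:i + 4] not in ALLOWED_VHCM:
                return False
    return True


acid = "\u0985\u09CD\u09AF\u09BE\u09B8\u09BF\u09A1"  # the 'acid' example above
print(starts_legally(acid), vowel_halant_ok(acid))
```

Keeping the two permitted sequences in an explicit set mirrors the way the restriction is worded: anything of the shape V + H that is not one of the listed forms is refused rather than pattern-matched.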
13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
10.2 ‘Sylheti Nagarī lipi’ or ‘Siloti’
This version of Bangla script resembles the ‘Kaithī’ script (ISO 15924) used by the accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, widely in use during the 1880s. There were several other names of Sylheti Nagarī or Siloti (129), such as ‘Jalalabada Nagarī’, ‘Fula (flower) Nagarī’, ‘Muslim Nagarī’ or ‘Muhammad Nagarī’. It is said that Shah Jalal had brought the script with him to Sylhet in the 13th-14th century (138), although some have suggested that it was an invention by the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here (138).
Table 17 – The Script Table of Sylheti Nagarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla        Devanāgarī    NBGP Decision
ঃ U+0983      ः U+0903      Confusable
ও U+0993      उ U+0909      Confusable
ঘ U+0998      घ U+0918      Confusable
◌ঁ U+0981     ◌ॅ U+0945     Confusable
Table 18 – Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhi
Bangla        Gurmukhi      NBGP Decision
ঘ U+0998      ਬ U+0A2C      Confusable
◌ঁ U+0981     ◌ੱ U+0A71     Confusable
Table 19 – Bangla and Gurmukhi confusable code points
Bangla                    Gurmukhi      NBGP Decision
ও U+0993                  ਤ U+0A24      Distinguishable
শ U+09B6                  ਅ U+0A05      Distinguishable
ম U+09AE                  ਮ U+0A2E      Distinguishable
বা U+09AC and U+09BE      ਗ U+0A17      Distinguishable
Table 20 – Bangla and Gurmukhi distinguishable code points
10.3.3 Bangla and Oriya (Odia)
Bangla        Oriya (Odia)  NBGP Decision
ও U+0993      ଓ U+0B13      Confusable
Table 21 – Bangla and Oriya confusable code points
Bangla        Oriya (Odia)  NBGP Decision
ঘ U+0998      ସ U+0B38      Distinguishable
Table 22 – Bangla and Oriya distinguishable code points
11 Appendix-II Bengali consonants and their allographs
Consonants    Phonetic Value    Allographs
Clusters    Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (not privileged enough to have clusters as a first member) Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (not privileged enough to have clusters as a first member) egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value, but is mostly pronounced as n
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (not privileged enough to have clusters)
য dʒ The secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word, it is a vowel æ (as in শ্যামল, র‍্যাপার etc.). But after a non-initial consonant it just doubles it in pronunciation (as in কার্য, ধার্য etc.). The র+য combination has two physical manifestations: র‍্য and র্য
ক্য (ক+য), স্য (স+য), র‍্য (র+য) [Just র‍্য is never used in Bangla orthography; র‍্যা is, but then its last two symbols, ya-phala and ā-kāra, constitute a vowel sign representing the vowel অ্যা]
র r Two manifestations:
i রেফ (repʰ) as the first member of a cluster, e.g. র্প, র্ৎ, র্দ্র, র্য, র্ধ্ব (র+ধ+ব) (earlier র্দ্ধ্ব = র+দ+ধ+ব, a four-term cluster) etc. (placed over the following consonant)
ii র-ফলা (rɔ-pʰɔla) as the second or third member of a cluster (placed under the consonant it follows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃ, কঃ
অব
49
This isauniquesituationnecessitatedbythelackofhyphenspaceortheZeroWidthNon-joinercharacterinthepermissiblesetofcharactersintheRootzonerepertoire Otherwise V is never required to be allowed to follow an HPermittingthismaycreateaperceptivesimilaritybetweentwolabels(withandwithout H) for majority of the linguistic communities hence this is explicitlyprohibitedbytheNBGPIn future if required depending on the prevailing requirements from thecommunitythefutureNBGPmayconsiderrevisitingthisrule
72 AdditionalExamplesfromBanglaABNF
Belowarea fewexampleswhichhelponeunderstandsomeof the rulesABNFputsinplaceThesearejustgivenforreferencepurposesandarenotmeanttobecomprehensive
1HMBDorXcannotoccurinthebeginningofaBanglawordExample
ক क
াক ाक
ংক क
ক क
ঃক ःकAscanbeseensuchcombinationwillresultautomaticallyinaldquogolurdquooradottedcircle marking it as an invalid formation This is an intrinsic property of theIndianlanguagesyllableandisquasi-automaticallyappliedwhereversupportedbytheOS
2HisnotpermittedafterVBDXMS
Example
অ अ
অং क ক क কঃ कः ি )क
3NumberofBDorXpermittedafterConsonantorVowelorakāra(Mātrā)isrestrictedtoonethusthefollowingcombinationsareinvalidated
50
Example
কংং क
ক क
কঃঃ कःः
কা का
কীঃঃ कःः
অংং अ
অ अ
অঃঃ अःः
4NumberofMpermittedafterConsonantisrestrictedtooneExample
কীী की5MisnotpermittedafterV Example
ইাঈৗ ईाईौ6ThecombinationsofAnusvāra+ VisargaaswellasVisarga+Anusvāraarenotpermissible
Example
কংঃ कः
কঃং कः
8 Contributors
81ExpertsfromIndia ProfessorUdayaNarayanaSinghChair-ProfessorofLinguisticsampDeanFacultyofArtsAmityUniversityHaryanaGurgaonPachgaonManesarPIN122431(Haryana)India
51
ProfessorPabitraSarkarformerlyVice-ChancellorRabindraBharatiUniversityKolkataDrAtiurRahmanKhanPrincipalTechnicalOfficerGISTGroupC-DACPunePIN411008(Maharashtra)IndiaMrRajibChakrabortyLinguistSocietyforNaturalLanguageTechnologyResearch(SNLTR)Module114amp130SDFBuildingSaltLakeSector-VKolkata-700091(WestBengal)IndiaMrAkshatJoshiProjectEngineerGISTGroupC-DACPunePIN411008(Maharashtra)IndiaMsMoumitaChowdhurySeniorTechnicalOfficerGISTGroupC-DACPunePIN411008(Maharashtra)IndiaMrChandrakantaMurasinghAgartalaTripuraSomeotherNBGPmembers
82ContributorsfromBangladeshJanabMustafaJabbarHonorableMinisterMinistryofPostsTelecommunicationsampInformationTechnologyGovtofBangladeshProfShamsuzzamanKhanFormerDirector-GeneralBanglaAcademyDhakaProfRafiqulIslamNationalProfessorofHumanitiesDhakaProfSwarochisSarkarDirectorInstituteofBangladeshStudiesRajshahiUniversityRajshahiBangladeshProfJinnatImtiazAliDirector-GeneralInternationalMotherLanguageInstituteDhakaMrMohammadMamunOrRashidDepartmentofBanglaJahangirnagarUniversityampMemberBangladeshComputerCouncilProfManiruzzamanformerlyProfessorChittagongUniversityChattagramBangladeshMrShyamSunderSikderSecretarySecretaryPostampTelecommunicationsDivisionGovtofBangladesh
52
MrMdMustafaKamalFormerDirectorGeneralBangladeshTelecommunicationsRegulatoryAuthorityGovernmentofBangladeshDhakaBrigadierGeneralMdMahfuzulKarimMajumderDirector-GeneralEngineeringampOperationsDivisionBangladeshTelecommunicationsRegulatoryAuthorityGovernmentofBangladeshDhakaMdZiarulIslamProgrammerPostsampTelecommunicationsDivisionGovernmentofBangladeshDhakaProfSyedShahriyarRahmanDepartmentofLinguisticsUniversityofDhakaDrMizanurRahmanDirector(In-Charge)TranslationTextBookandInternationalRelationsDivisionBanglaAcademyDhakaDrApareshBandyopadhyayDirectorBanglaAcademyDhakaMrMdMobarakHossainDirectorBanglaAcademyDhakaDrJalalAhmedDirectorBanglaAcademyDhakaMrJahangirHossainInternetSocietyBangladesh(ICANNALS)JanabSarwarMostafaChoudhuryBangladeshComputerCouncilDhakaJanabMdRashidWasifBangladeshComputerCouncilDhakaJanabIstiaqueArifSeniorAssistantDirectorBangladeshTelecommunicationsRegulatoryAuthorityDhakaMsAfifaAbbasInformationSecurityandGovernanceLeadEngineeratBanglalinkandICANNFellowMr Mohammad Abdul Haque Secretary General Bangladesh Internet GovernanceForumMrImranHossenCEOEyeSoftandkeymemberofBangladeshAssociationofSoftwareampInformationServices(BASIS)MsShahidaKhatunDirectorFolkloreMuseumampArchiveDivisionBanglaAcademyDhakaMrSyedAshikRehmanCEOBengalMediaCorporationDhakaMrHaseebRahmanCEOProfessionalsrsquoSystemsDhaka
53
9 References[100] UnicodeConsortium2017UnicodeStandard100MountainViewCA
[101] BandyopadhyayChittaranjan1981DuiShatakerBanglaMudranoPrakashanKolkataAnandaPublishers
[102] BanerjiRD1919TheOriginoftheBengaliScriptKolkataNewDelhiAsianEducationalServices2003reprint
[103] ChatterjiSK1926TheOriginandDevelopmentoftheBengaliLanguageCalcuttaCalcuttaUniversityPress
[104] -----1939Bhasha-prakashBangalaVyakaran(AGrammaroftheBengaliLanguage)CalcuttaUniversityofCalcutta
[105] HaiMuhammadAbdul1964DhvaniVijnanOBanglaDhvani-tattwa(PhoneticsandBengaliPhonology)DhakaBanglaAcademy
[106]JhaSubhadra1958TheFormationofMaithiliLondonLuzacampCo
[107] KosticDjordjeDasRheaS1972AShortOutlineofBengaliPhoneticsCalcuttaStatisticalPublishingCompany
[108] MajumdarRC1971HistoryofAncientBengalCalcuttaGBhardwaj
[109] MazumdarBijaychandra19202000TheHistoryoftheBengaliLanguage(ReprCalcutta1920ed)NewDelhiAsianEducationalServices
[110] PandeyAnshuman2001ProposaltoEncodetheTirhutaScriptinISOIEC10646
[111] PalPalashBaran2001DhwanimalaBarnamalaKolkataPapyrus
[112] -----2007lsquoBanglaHarapherPanchParbarsquoInSwapanChakrabortyedMudranerSanskritiOBanglaBoiKolkataAbabhas
[113] RossFiona1999ThePrintedBengaliCharacteranditsEvolutionLondonCurzon
[114] ShastriMahamahopadhyayHaraPrasad1916HājārBacharērPurāṇaBāṅgālāBhāṣāyBauddhaGānōDōhāCalcuttaBangiyaSahityaParishat
[115] SinghUdayaNarayana(JointlyManiruzzaman)1983DiglossiainBangladeshandlanguageplanningCalcuttaGyanBharati214pp
[116] -----1987ABibliographyofBengaliLinguisticsMysoreCIILxii+316pp
[117] -----2017(withRajibChakrabortyBidishaBhattacharjeeampArimardanKumarTripathy)LanguagesandCulturesontheMarginGuidelinesforFieldworkonEndangeredLanguagesMimeoCentreforEndangeredLanguagesVisva-Bharati
54
[118] -----1980ScriptalchoiceandspellingreformAnessayinlanguageandplanningJournaloftheMSUniversityofBarodaSocialScienceNumber292173-186AmodifiedversionreprintedEAnnamalaiBjornJernuddandJoanRubinedsLanguagePlanningProceedingsofanInstituteMysoreCIIL405-417
[119] Sripantha1996JakhanChapakhanaEloKolkataPaschim-BangaBanglaAcademy
[120] SurAtul1986BanglaMudranerDushoBacharKolkataJijnasa
[121] ScriptBehaviourforBengaliVersion11TDILandC-DACPune
[122] BoraMahendra1981TheEvolutionofAssameseScriptJorhatAssamSahityaSabha
[123] ProposaltoEncodetheTirhutaScriptinISOIEC10646httpwwwunicodeorgL2L201111175r-tirhutapdfaccessedon25112017
[124] EthnologueAssameseintheLanguageCloudhttpswwwethnologuecomcloudasmaccessedon25112017
[125] BengalialphabetforManipurifoundinEthnologueManipuri(MeeteilonMeithei)httpswwwomniglotcomwritingmanipurihtmaccessedon20102019
[126] WikipediaBengalialphabethttpsenwikipediaorgwikiBengali_alphabetaccessedon25112017
[129]OmniglotSlyhetihttpwwwomniglotcomwritingsylotihtmaccessedon1052018
[130]WikipediaBishnupriyaManipurilanguagehttpsenwikipediaorgwikiBishnupriya_Manipuri_languageaccessedon25112017
[131] TheEMILLECIILCorpushttpmetashareeldaorgrepositorybrowsethe-emilleciil-corpusabdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20accessedon1052018
[132]TheEMILLECIILCorpushttpcatalogelrainfoproduct_infophpproducts_id=696accessedon1052018
[133] BanglaLanguageampScript httpswwwisicalacin~rc_banglabanglahtmlaccessedon1052018
[134] SarkarPabitra1992BanglaBananSanskarSamasyaoSambhabanaKolkataChirayataPrakashan
[135] SarkarPabitra1993BanglaBhasharYuktabyanjanBhasha1123-45
55
[136] DashNiladriShekharandBBChaudhuri1998BanglaScriptAStructuralStudyLinguisticsToday121-28Alsoavailableathttpswwwacademiaedu9967428Bangla_Script_A_Structural_Study
[137] DaniAhmedHasan(1957)lsquoSrīhatta-NāgarīLipirUtpattioBikāśrsquoBanglaAcademyPatrika(Dhaka)Vol12(Bhadra-Agrahayan1364BangabdaNumber)pg1
[138] WikipediaSylhetiNagari httpsenwikipediaorgwikiSylheti_Nagariaccessedon1952018
[139] FuruiRyosuke(2015)lsquoVariegatedAdaptationsStateFormationinBengalfromtheFifthtoSeventhCenturyrsquoinBhairabiPrasadSahuampHermannKulkeedsInterrogatingPoliticalSystemsIntegrativeProcessesandStatesinPre-ModernIndiaChapter9Pp255-73NewDelhiManohar
[140] FergusonCharesAandMunierChowdhury(1960)lsquoPhonesofBengalirsquoLanguageVol36No1pp22-59
[141] ShahidullahMuhammad(2007)BuddhistMysticSongsDhakaMowlaBrothers
[142] RayPunyaSloka(1966)BengaliLanguageHandbookWashington
[143] HaiMuhammadAbdul(1960)AphoneticandphonologicalstudyofnasalsandnasalizationinBengaliDhakaUniversityofDhaka
[144] UnicodeConsortiumProposalSummaryFormtoAccompanySubmissionsforAdditionstotheRepertoireofISOIEC10646UNICODEhttpswwwunicodeorgL2L200202387r-syloti-formpdfaccessedonMay212018
[145] WikipediaOlChiki(Unicodeblock)httpsenwikipediaorgwikiOl_Chiki_(Unicode_block)accessedonMay212018
[146]BanglaScripthttpwwwbangladesh2000combdbangla_scripthtmlaccessedonMay212018
[147] BhattacharyaAshutoshed(1942)GopichandrerGanCalcuttaCalcuttaUniversity
[149] DasSisirKumar(1975)SahibsandMunshisAnAccountoftheCollegeofFortWilliamCalcutta
[150] IslamRafiqulPabitraSarkarMahbubulHaqampRajibChakraborty(eds)(2014)BanglaAcademyPromitoBanglaByabaharikByakaran(AFunctionalGrammarofStandardBangla)DhakaBanglaAcademy
[151] SarkarPabitra[2013]lsquoBanglaSpellingReformtheLongandShortofItrsquoBanglaJournal19215-232
[152] BanglaAcademy(2012)BanglaAcademyPromitoBanglaBananerNiyam(StandardBanglaSpellingasadoptedbyBanglaAcademy)DhakaBanglaAcademy
56
[153] SarkarPabitraampRajibChakraborty2018ldquoWhathashappenedSoFarIntermsof Script Reformsrdquo Paper presented at the Face to Face meeting jointly held by theBanglaAcademyDhakaampICANNatBanglaAcademyDhakaon10072018
[154] The Unicode Consortium 2018 The Unicodereg Standard Version 110 ndash CoreSpecificationChapter12P473
57
10 Appendix-I
101 AugmentedBackus-NaurFormalism(ABNF)The Augmented Backus-Naur Formalism (ABNF) is generic in nature and whenappliedtoaspecificlanguagescriptcertainrestrictionrulesapplyInotherwordsinagivenlanguagesomeoftheFormalismstructuresdonotnecessarilyapplyTotakecareofsuchcasesrestrictionrulesaresetinplaceTheserestrictionswillhelptofine-tunetheABNFIncaseofBangla13inparticularthefollowingrulesapply
1Khaṇḍata (ৎ) is NOT allowed at the beginning of an IDN label The sameappliestoঞandthevelarnasalঙintheBanglaSchemeoffive-foldlsquovargarsquo(asdefined under Table 5) Moreover Bangla does not allow ya (য়) in thebeginningofawordeitherbutwecanciteacoupleofnativeexamples forexamplethewordয়8াvেড়া(yaeligbbɔRo)fromthepoemlsquoLichuchorrsquowrittenbyKaziNazrul IslamHowever there are instances of it being used in namesmostlyof foreignoriginsuchasYaqubwhichmaybewrittenwithya(য়) inthe beginning as inয়াwব) In very recent timeswhile transliterating someChineseandJapanesenamesinBanglaonedoescomeacrossthepossibilityofKhaṇḍata (ৎ) followedbysa (স) inthebeginningofaword forexamplexেসিরং(Tsering)
2CHcancomewithKhandaTainonlythecasewhereCisra(র)(09B0)
ৎ6 asinভৎ6 সনা
3OnlyfollowingcombinationswithVHCMwillbeallowedrarrঅ8া(togetherpronouncedasaelig)asinঅ8ািসড(acid)rarrএ8া(togetheralsopronouncedasaelig)asinএ8ািসডএ8ােসািসেয়শান
(acidassociation)
102 lsquoSylhetiNagarılipirsquoorlsquoSilotirsquoThisversionofBanglascriptresemblesthe lsquoKaithīrsquoscript(ISO12954)usedbytheAccountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh andBiharndashwidelyinuseduringthe1880sTherewereseveralothernamesofSylheti
13 This section specifically takes up issues of restrictions pertaining to Bangla (Bengali) language Assamese and Maṇipuri have not been covered in this section
58
Nagarı or Siloti (129)ndash suchas lsquoJalalabadaNagarırsquo lsquoFula (flower)Nagarırsquo lsquoMuslimNagarırsquoorlsquoMuhammadNagarırsquoItissaidthatShahJalalahadbroughtthescriptwithhim in13th-14thCentury inSylhet (138)althoughsomesuggested that itwasaninvention by the Afghan rulers of Sylhet (137) Some ascribe the credit to theBuddhistBhikkhusfromNepalPurelyforhistoricalreasonsthedetailsofthescriptwith32symbolsarereproducedhere(138)
Table 17 – The Script Table of Sylheti Nagarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla        Devanāgarī    NBGP Decision
ঃ U+0983      ः U+0903      Confusable
ও U+0993      उ U+0909      Confusable
ঘ U+0998      घ U+0918      Confusable
◌ঁ U+0981     ◌ॅ U+0945     Confusable
Table 18 – Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhi
Bangla        Gurmukhi      NBGP decision
ঘ U+0998      ਬ U+0A2C      Confusable
◌ঁ U+0981     ◌ੱ U+0A71     Confusable
Table 19 – Bangla and Gurmukhi confusable code points
Bangla                    Gurmukhi      NBGP decision
ও U+0993                  ਤ U+0A24      Distinguishable
শ U+09B6                  ਅ U+0A05      Distinguishable
ম U+09AE                  ਮ U+0A2E      Distinguishable
বা U+09AC and U+09BE      ਗ U+0A17      Distinguishable
Table 20 – Bangla and Gurmukhī distinguishable code points
10.3.3 Bangla and Oriya (Odia)
Bangla        Oriya (Odia)  NBGP Decision
ও U+0993      ଓ U+0B13      Confusable
Table 21 – Bangla and Oriya confusable code points
Bangla        Oriya (Odia)  NBGP Decision
ঘ U+0998      ସ U+0B38      Distinguishable
Table 22 – Bangla and Oriya distinguishable code points
11 Appendix-II Bengali consonants and their allographs
Consonants    Phonetic Value    Allographs
                                Clusters    Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (not privileged enough to have clusters as a first member) Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (not privileged enough to have clusters as a first member) egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value, but is mostly pronounced as n
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (not privileged enough to have clusters)
য dʒ The secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word, it is a vowel æ (as in শ্যামল, র‍্যাপার, etc.). But after a non-initial consonant it just doubles it in pronunciation (as in কার্য, ধার্য, etc.). The র+য combination has two physical manifestations: র‍্য and র্য
ক্য (ক+য), স্য (স+য), র‍্য (র+য) [Just র‍্য is never used in Bangla orthography; র‍্যা is, but then its last two symbols, Ya-phala and a-kara, constitute a vowel sign representing the vowel অ্যা]
র r Two manifestations:
i. রেফ repʰ as the first member of a cluster, e.g. র্প, র্ৎ, র্দ্র, র্য, র্ধ্ব (র+ধ+ব) (earlier র্দ্ধ্ব = র+দ+ধ+ব, a four-term cluster), etc. (placed over the following consonant)
ii. র-ফলা rɔ-pʰɔla as the second/third member of a cluster, e.g. … etc. (placed under the consonant it follows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃকঃ
অব
Example
কংং क
ক क
কঃঃ कःः
কা का
কীঃঃ कःः
অংং अ
অ अ
অঃঃ अःः
4. Number of M permitted after Consonant is restricted to one.
Example
কীী की
5. M is not permitted after V.
Example
ইা ঈৗ ईा ईौ
6. The combinations of Anusvāra + Visarga as well as Visarga + Anusvāra are not permissible.
Example
কংঃ कः
কঃং कः
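Rules 4 to 6 above lend themselves to simple pattern checks. The following is a minimal sketch, assuming NFC-normalised input; the character classes for M (dependent vowel signs) and V (independent vowels), and the names `FORBIDDEN` and `broken_rules`, are assumptions made for illustration and are not taken from the LGR repertoire tables.

```python
import re

# Assumed character classes; not copied from the LGR repertoire.
MATRA = "\u09BE-\u09C4\u09C7\u09C8\u09CB\u09CC"   # dependent vowel signs (M)
VOWEL = "\u0985-\u098B\u098F\u0990\u0993\u0994"   # independent vowels (V)
ANUSVARA, VISARGA = "\u0982", "\u0983"            # ং, ঃ

# One pattern per rule; a match means the rule is broken.
FORBIDDEN = {
    4: re.compile(f"[{MATRA}]{{2,}}"),            # more than one M in a row
    5: re.compile(f"[{VOWEL}][{MATRA}]"),         # M directly after V
    6: re.compile(f"{ANUSVARA}{VISARGA}|{VISARGA}{ANUSVARA}"),
}


def broken_rules(label: str) -> list[int]:
    """Return the numbers (4-6) of the rules whose forbidden pattern occurs in `label`."""
    return [rule for rule, pattern in FORBIDDEN.items() if pattern.search(label)]
```

With these assumed classes, the example strings listed under each rule are reported against that rule, while a plain consonant plus single matra such as কী is not flagged.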
8 Contributors
8.1 Experts from India
Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts, Amity University Haryana, Gurgaon, Pachgaon, Manesar, PIN 122431 (Haryana), India
Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University, Kolkata
Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research (SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West Bengal), India
Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra), India
Mr Chandrakanta Murasingh, Agartala, Tripura
Some other NBGP members
8.2 Contributors from Bangladesh
Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications & Information Technology, Govt of Bangladesh
Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka
Prof Rafiqul Islam, National Professor of Humanities, Dhaka
Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University, Rajshahi, Bangladesh
Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka
Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University & Member, Bangladesh Computer Council
Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh
Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division, Govt of Bangladesh
Mr Md Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr Imran Hossen, CEO, EyeSoft and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr Haseeb Rahman, CEO, Professionals’ Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje; Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920, 2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. ‘Bangla Harapher Panch Parba’. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017 (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy). Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the MS University of Baroda, Social Science Number, 29(2): 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25.11.2017.
[124] Ethnologue: Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25.11.2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20.10.2019.
[126] Wikipedia: Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25.11.2017.
[129] Omniglot: Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10.5.2018.
[130] Wikipedia: Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25.11.2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20/, accessed on 10.5.2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10.5.2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10.5.2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1(1): 23-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1(2): 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). ‘Srīhatta-Nāgarī Lipir Utpatti o Bikāś’. Bangla Academy Patrika (Dhaka), Vol 12 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia: Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19.5.2018.
[139] Furui, Ryosuke (2015). ‘Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century’, in Bhairabi Prasad Sahu & Hermann Kulke, eds. Interrogating Political Systems: Integrative Processes and States in Pre-Modern India. Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). ‘The Phonemes of Bengali’. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646. UNICODE. https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia: Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. ‘Bangla Spelling Reform: the Long and Short of It’. Bangla Journal 19: 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. “What has happened So Far In terms of Script Reforms”. Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka on 10.07.2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 – Core Specification. Chapter 12, p. 473.
10 Appendix-I
101 AugmentedBackus-NaurFormalism(ABNF)The Augmented Backus-Naur Formalism (ABNF) is generic in nature and whenappliedtoaspecificlanguagescriptcertainrestrictionrulesapplyInotherwordsinagivenlanguagesomeoftheFormalismstructuresdonotnecessarilyapplyTotakecareofsuchcasesrestrictionrulesaresetinplaceTheserestrictionswillhelptofine-tunetheABNFIncaseofBangla13inparticularthefollowingrulesapply
1Khaṇḍata (ৎ) is NOT allowed at the beginning of an IDN label The sameappliestoঞandthevelarnasalঙintheBanglaSchemeoffive-foldlsquovargarsquo(asdefined under Table 5) Moreover Bangla does not allow ya (য়) in thebeginningofawordeitherbutwecanciteacoupleofnativeexamples forexamplethewordয়8াvেড়া(yaeligbbɔRo)fromthepoemlsquoLichuchorrsquowrittenbyKaziNazrul IslamHowever there are instances of it being used in namesmostlyof foreignoriginsuchasYaqubwhichmaybewrittenwithya(য়) inthe beginning as inয়াwব) In very recent timeswhile transliterating someChineseandJapanesenamesinBanglaonedoescomeacrossthepossibilityofKhaṇḍata (ৎ) followedbysa (স) inthebeginningofaword forexamplexেসিরং(Tsering)
2CHcancomewithKhandaTainonlythecasewhereCisra(র)(09B0)
ৎ6 asinভৎ6 সনা
3OnlyfollowingcombinationswithVHCMwillbeallowedrarrঅ8া(togetherpronouncedasaelig)asinঅ8ািসড(acid)rarrএ8া(togetheralsopronouncedasaelig)asinএ8ািসডএ8ােসািসেয়শান
(acidassociation)
13 This section specifically takes up restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri are not covered in this section.

10.2 ‘Sylheti Nagarī lipi’ or ‘Siloti’
This version of the Bangla script resembles the ‘Kaithī’ script (ISO 15924) used by accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, where it was widely in use during the 1880s. There were several other names for Sylheti Nagarī or Siloti (129), such as ‘Jalalabada Nagarī’, ‘Fula (flower) Nagarī’, ‘Muslim Nagarī’ or ‘Muhammad Nagarī’. It is said that Shah Jalal brought the script with him to Sylhet in the 13th-14th century (138), although some have suggested that it was an invention of the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script, with 32 symbols, are reproduced here (138).
Table 17 – The Script Table of Sylheti Nagarī or Siloti

10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not to a degree that warrants defining them as variant code points.

10.3.1 Bangla and Nāgarī or Devanāgarī

Bangla        Devanāgarī    NBGP decision
ঃ U+0983      ः U+0903      Confusable
ও U+0993      उ U+0909      Confusable
ঘ U+0998      घ U+0918      Confusable
ঁ U+0981      ॅ U+0945      Confusable
Table 18 – Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhi

Bangla        Gurmukhi      NBGP decision
ঘ U+0998      ਬ U+0A2C      Confusable
ঁ U+0981      ੱ U+0A71      Confusable
Table 19 – Bangla and Gurmukhi confusable code points

Bangla                   Gurmukhi      NBGP decision
ও U+0993                 ਤ U+0A24      Distinguishable
শ U+09B6                 ਅ U+0A05      Distinguishable
ম U+09AE                 ਮ U+0A2E      Distinguishable
বা U+09AC and U+09BE     ਗ U+0A17      Distinguishable
Table 20 – Bangla and Gurmukhi distinguishable code points
10.3.3 Bangla and Oriya (Odia)

Bangla        Oriya (Odia)  NBGP decision
ও U+0993      ଓ U+0B13      Confusable
Table 21 – Bangla and Oriya confusable code points

Bangla        Oriya (Odia)  NBGP decision
ঘ U+0998      ସ U+0B38      Distinguishable
Table 22 – Bangla and Oriya distinguishable code points
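The decisions in Tables 18 to 22 can be held in a simple lookup table, for instance when a tool needs to flag cross-script look-alikes during review. The sketch below is illustrative only: the names `NBGP_DECISIONS` and `nbgp_decision` are hypothetical, the data are copied from the tables above, and a pair that is not listed returns `None`, meaning only that this section records no decision for it.

```python
from typing import Optional

CONFUSABLE = "confusable"
DISTINGUISHABLE = "distinguishable"

# (Bangla code point sequence, other-script code point) -> recorded decision.
NBGP_DECISIONS = {
    # Bangla and Devanagari (Table 18)
    ("\u0983", "\u0903"): CONFUSABLE,
    ("\u0993", "\u0909"): CONFUSABLE,
    ("\u0998", "\u0918"): CONFUSABLE,
    ("\u0981", "\u0945"): CONFUSABLE,
    # Bangla and Gurmukhi (Tables 19 and 20)
    ("\u0998", "\u0A2C"): CONFUSABLE,
    ("\u0981", "\u0A71"): CONFUSABLE,
    ("\u0993", "\u0A24"): DISTINGUISHABLE,
    ("\u09B6", "\u0A05"): DISTINGUISHABLE,
    ("\u09AE", "\u0A2E"): DISTINGUISHABLE,
    ("\u09AC\u09BE", "\u0A17"): DISTINGUISHABLE,  # two-code-point sequence
    # Bangla and Oriya (Tables 21 and 22)
    ("\u0993", "\u0B13"): CONFUSABLE,
    ("\u0998", "\u0B38"): DISTINGUISHABLE,
}


def nbgp_decision(bangla: str, other: str) -> Optional[str]:
    """Return the decision recorded for a pair, or None if none is listed."""
    return NBGP_DECISIONS.get((bangla, other))
```

Keying on strings rather than single integers keeps the two-code-point Bangla sequence of Table 20 in the same table as the single code points.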
11 Appendix-II Bengali consonants and their allographs

Consonants    Phonetic Value    Allographs
                                Clusters    Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (not privileged enough to have clusters as a first member) জ্ঝ (জ্ + ঝ), ঞ্ঝ (ঞ্ + ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (not privileged enough to have clusters as a first member) ঙ্খ (ঙ্ + খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value, but it is mostly pronounced as n.
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (not privileged enough to have clusters)
য dʒ. The secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word, it is a vowel æ (as in শ্যামল, র‍্যাপার, etc.). But after a non-initial consonant it just doubles that consonant in pronunciation (as in কার্য, ধার্য, etc.). The র + য combination has two physical manifestations: র‍্য and র্য.
ক্য (ক্ + য), স্য (স্ + য), র‍্য (র্ + য) [Just র‍্য is never used in Bangla orthography; র‍্যা is, but then its last two symbols, ya-phala and ā-kāra, constitute a vowel sign representing the vowel অ্যা.]
র r Two manifestations:
(i) রেফ repʰ, as the first member of a cluster, e.g. র্প, র্ৎ, র্দ্র, র্য, র্ধ্ব (র্ + ধ্ + ব) (earlier র্দ্ধ্ব = র্ + দ্ + ধ্ + ব, a four-term cluster), etc. (placed over the following consonant)
(ii) র-ফলা rɔ-pʰɔla, as the second or third member of a cluster, e.g. … etc. (placed under the consonant it follows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃকঃ
অব
Mr. Md. Mustafa Kamal, Former Director General, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Brigadier General Md. Mahfuzul Karim Majumder, Director-General, Engineering & Operations Division, Bangladesh Telecommunications Regulatory Authority, Government of Bangladesh, Dhaka
Md. Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of Bangladesh, Dhaka
Prof. Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka
Dr. Mizanur Rahman, Director (In-Charge), Translation, Text Book and International Relations Division, Bangla Academy, Dhaka
Dr. Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka
Mr. Md. Mobarak Hossain, Director, Bangla Academy, Dhaka
Dr. Jalal Ahmed, Director, Bangla Academy, Dhaka
Mr. Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)
Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka
Janab Md. Rashid Wasif, Bangladesh Computer Council, Dhaka
Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications Regulatory Authority, Dhaka
Ms. Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink and ICANN Fellow
Mr. Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance Forum
Mr. Imran Hossen, CEO, EyeSoft and key member of Bangladesh Association of Software & Information Services (BASIS)
Ms. Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy, Dhaka
Mr. Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka
Mr. Haseeb Rahman, CEO, Professionals’ Systems, Dhaka
9 References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View, CA.
[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan. Kolkata: Ananda Publishers.
[102] Banerji, R. D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi: Asian Educational Services, 2003 reprint.
[103] Chatterji, S. K. 1926. The Origin and Development of the Bengali Language. Calcutta: Calcutta University Press.
[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language). Calcutta: University of Calcutta.
[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology). Dhaka: Bangla Academy.
[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.
[107] Kostic, Djordje; Das, Rhea S. 1972. A Short Outline of Bengali Phonetics. Calcutta: Statistical Publishing Company.
[108] Majumdar, R. C. 1971. History of Ancient Bengal. Calcutta: G. Bhardwaj.
[109] Mazumdar, Bijaychandra. 1920, 2000. The History of the Bengali Language (Repr. Calcutta, 1920 ed.). New Delhi: Asian Educational Services.
[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC 10646.
[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.
[112] -----. 2007. ‘Bangla Harapher Panch Parba’. In Swapan Chakraborty, ed. Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.
[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London: Curzon.
[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.
[115] Singh, Udaya Narayana (jointly with Maniruzzaman). 1983. Diglossia in Bangladesh and language planning. Calcutta: Gyan Bharati. 214 pp.
[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.
[117] -----. 2017. (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar Tripathy) Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.
[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the M. S. University of Baroda, Social Science Number, 29.2: 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin, eds. Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25.11.2017.
[124] Ethnologue: Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25.11.2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20.10.2019.
[126] Wikipedia: Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25.11.2017.
[129] Omniglot: Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10.5.2018.
[130] Wikipedia: Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25.11.2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20/, accessed on 10.5.2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10.5.2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10.5.2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1.1: 23-45.
[136] Dash, Niladri Shekhar and B. B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1.2: 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan. 1957. ‘Srīhatta-Nāgarī Lipir Utpatti o Bikāś’. Bangla Academy Patrika (Dhaka), Vol. 1.2 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia: Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19.5.2018.
[139] Furui, Ryosuke. 2015. ‘Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century’, in Bhairabi Prasad Sahu & Hermann Kulke, eds. Interrogating Political Systems: Integrative Processes and States in Pre-Modern India. Chapter 9. Pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury. 1960. ‘Phones of Bengali’. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad. 2007. Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka. 1966. Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul. 1960. A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646. UNICODE. https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia: Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh, ed. 1942. Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar. 1975. Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds). 2014. Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra. 2013. ‘Bangla Spelling Reform: the Long and Short of It’. Bangla Journal 19: 215-232.
[152] Bangla Academy. 2012. Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. “What has happened So Far: In terms of Script Reforms”. Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka on 10.07.2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 – Core Specification. Chapter 12. P. 473.
Consonants PhoneticValue Allographs
Clusters TransparentForm(BanglaAkademifont)
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ hwordfinallywordmediallyitdoublesthepronunciationofthefollowingconsonant
অঃকঃ
অব
[118] ----- 1980. Scriptal choice and spelling reform: An essay in language and planning. Journal of the M.S. University of Baroda, Social Science Number 29(2): 173-186. A modified version reprinted in E. Annamalai, Bjorn Jernudd and Joan Rubin (eds.), Language Planning: Proceedings of an Institute. Mysore: CIIL, 405-417.
[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla Academy.
[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.
[121] Script Behaviour for Bengali, Version 1.1. TDIL and C-DAC, Pune.
[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya Sabha.
[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646. http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf, accessed on 25/11/2017.
[124] Ethnologue: Assamese in the Language Cloud. https://www.ethnologue.com/cloud/asm, accessed on 25/11/2017.
[125] Bengali alphabet for Manipuri, found in Ethnologue: Manipuri (Meeteilon, Meithei). https://www.omniglot.com/writing/manipuri.htm, accessed on 20/10/2019.
[126] Wikipedia: Bengali alphabet. https://en.wikipedia.org/wiki/Bengali_alphabet, accessed on 25/11/2017.
[129] Omniglot: Sylheti. http://www.omniglot.com/writing/syloti.htm, accessed on 10/5/2018.
[130] Wikipedia: Bishnupriya Manipuri language. https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language, accessed on 25/11/2017.
[131] The EMILLE/CIIL Corpus. http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20/, accessed on 10/5/2018.
[132] The EMILLE/CIIL Corpus. http://catalog.elra.info/product_info.php?products_id=696, accessed on 10/5/2018.
[133] Bangla Language & Script. https://www.isical.ac.in/~rc_bangla/bangla.html, accessed on 10/5/2018.
[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata: Chirayata Prakashan.
[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1(1): 23-45.
[136] Dash, Niladri Shekhar and B.B. Chaudhuri. 1998. Bangla Script: A Structural Study. Linguistics Today 1(2): 1-28. Also available at https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study
[137] Dani, Ahmed Hasan (1957). ‘Srīhatta-Nāgarī Lipir Utpatti o Bikāś’. Bangla Academy Patrika (Dhaka), Vol. 1.2 (Bhadra-Agrahayan 1364 Bangabda Number), pg. 1.
[138] Wikipedia: Sylheti Nagari. https://en.wikipedia.org/wiki/Sylheti_Nagari, accessed on 19/5/2018.
[139] Furui, Ryosuke (2015). ‘Variegated Adaptations: State Formation in Bengal from the Fifth to Seventh Century’, in Bhairabi Prasad Sahu & Hermann Kulke (eds.), Interrogating Political Systems: Integrative Processes and States in Pre-Modern India, Chapter 9, pp. 255-73. New Delhi: Manohar.
[140] Ferguson, Charles A. and Munier Chowdhury (1960). ‘The Phonemes of Bengali’. Language, Vol. 36, No. 1, pp. 22-59.
[141] Shahidullah, Muhammad (2007). Buddhist Mystic Songs. Dhaka: Mowla Brothers.
[142] Ray, Punya Sloka (1966). Bengali Language Handbook. Washington.
[143] Hai, Muhammad Abdul (1960). A phonetic and phonological study of nasals and nasalization in Bengali. Dhaka: University of Dhaka.
[144] Unicode Consortium. Proposal Summary Form to Accompany Submissions for Additions to the Repertoire of ISO/IEC 10646 (UNICODE). https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf, accessed on May 21, 2018.
[145] Wikipedia: Ol Chiki (Unicode block). https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block), accessed on May 21, 2018.
[146] Bangla Script. http://www.bangladesh2000.com/bd/bangla_script.html, accessed on May 21, 2018.
[147] Bhattacharya, Ashutosh (ed.) (1942). Gopichandrer Gan. Calcutta: Calcutta University.
[149] Das, Sisir Kumar (1975). Sahibs and Munshis: An Account of the College of Fort William. Calcutta.
[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.) (2014). Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of Standard Bangla). Dhaka: Bangla Academy.
[151] Sarkar, Pabitra [2013]. ‘Bangla Spelling Reform: the Long and Short of It’. Bangla Journal 19: 215-232.
[152] Bangla Academy (2012). Bangla Academy Promito Bangla Bananer Niyam (Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.
[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. “What has happened So Far In terms of Script Reforms”. Paper presented at the Face to Face meeting jointly held by the Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka on 10/07/2018.
[154] The Unicode Consortium. 2018. The Unicode® Standard, Version 11.0 – Core Specification, Chapter 12, p. 473.
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when applied to a specific language or script, certain restriction rules apply. In other words, in a given language, some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions help to fine-tune the ABNF. In the case of Bangla (footnote 13) in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold ‘varga’ (as defined under Table 5). Moreover, Bangla does not generally allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem ‘Lichu chor’ written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2. CH can come with Khaṇḍa ta only in the case where C is ra (র) (U+09B0): র্ৎ, as in ভর্ৎসনা.
3. Only the following combinations with VHCM will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
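The three restriction rules above can be sketched as a small checker. This is only an illustrative reading of the rules, not the normative LGR XML: the function name `violated_rules` and the constant names are hypothetical, rule 1 is limited to the three explicitly prohibited initials (ৎ, ঞ, ঙ), and V in rule 3 is assumed to mean any independent vowel in the range U+0985 to U+0994.

```python
HASANTA = "\u09CD"    # ্  (virama, the H in the rules)
KHANDA_TA = "\u09CE"  # ৎ
RA = "\u09B0"         # র
YA = "\u09AF"         # য
AA_KAR = "\u09BE"     # া

# Rule 1: code points that may not start a label (ৎ, ঞ, ঙ).
FORBIDDEN_INITIAL = {KHANDA_TA, "\u099E", "\u0999"}
# Rule 3: the only vowels that may be followed by hasanta + য + া (অ, এ).
VHCM_VOWELS = {"\u0985", "\u098F"}


def is_independent_vowel(ch):
    """Assumption: V covers the independent-vowel range U+0985..U+0994."""
    return "\u0985" <= ch <= "\u0994"


def violated_rules(label):
    """Return the sorted list of rule numbers (1, 2, 3) that a label breaks."""
    violations = set()
    # Rule 1: prohibited label-initial code points.
    if label and label[0] in FORBIDDEN_INITIAL:
        violations.add(1)
    for i, ch in enumerate(label):
        nxt = label[i + 1] if i + 1 < len(label) else ""
        # Rule 2: hasanta + khanda ta is allowed only after ra.
        if ch == HASANTA and nxt == KHANDA_TA:
            if i == 0 or label[i - 1] != RA:
                violations.add(2)
        # Rule 3: vowel + hasanta is allowed only as অ/এ + ্ + য + া.
        if is_independent_vowel(ch) and nxt == HASANTA:
            if ch not in VHCM_VOWELS or label[i + 2:i + 4] != YA + AA_KAR:
                violations.add(3)
    return sorted(violations)
```

For instance, `violated_rules` returns an empty list for ভর্ৎসনা and অ্যাসিড, and reports rule 1 for a label beginning with ৎ. A full implementation would sit on top of the repertoire and syllable-structure checks defined elsewhere in the proposal.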
13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
10.2 ‘Sylheti Nagarī lipi’ or ‘Siloti’
This version of the Bangla script resembles the ‘Kaithī’ script (ISO 15924) used by accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, and widely in use during the 1880s. There were several other names for Sylheti Nagarī or Siloti (129), such as ‘Jalalabada Nagarī’, ‘Fula (flower) Nagarī’, ‘Muslim Nagarī’ or ‘Muhammad Nagarī’. It is said that Shah Jalala brought the script with him to Sylhet in the 13th-14th century (138), although some have suggested that it was an invention of the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script, with 32 symbols, are reproduced here (138).
Table 17 – The Script Table of Sylheti Nagarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla      Devanāgarī    NBGP Decision
ঃ U+0983    ः U+0903      Confusable
ও U+0993    उ U+0909      Confusable
ঘ U+0998    घ U+0918      Confusable
U+0981      U+0945        Confusable
Table 18 – Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhi
Bangla      Gurmukhi      NBGP Decision
ঘ U+0998    ਬ U+0A2C      Confusable
U+0981      U+0A71        Confusable
Table 19 – Bangla and Gurmukhi confusable code points
Bangla                    Gurmukhi      NBGP Decision
ও U+0993                  ਤ U+0A24      Distinguishable
শ U+09B6                  ਅ U+0A05      Distinguishable
ম U+09AE                  ਮ U+0A2E      Distinguishable
বা U+09AC and U+09BE      ਗ U+0A17      Distinguishable
Table 20 – Bangla and Gurmukhi distinguishable code points
10.3.3 Bangla and Oriya (Odia)
Bangla      Oriya (Odia)  NBGP Decision
ও U+0993    ଓ U+0B13      Confusable
Table 21 – Bangla and Oriya confusable code points
Bangla      Oriya (Odia)  NBGP Decision
ঘ U+0998    ସ U+0B38      Distinguishable
Table 22 – Bangla and Oriya distinguishable code points
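The NBGP decisions in Tables 18 to 22 can be held in a simple lookup keyed by the Bangla sequence and the code point of the other script. The sketch below is illustrative only: the names `NBGP_DECISIONS`, `nbgp_decision` and `confusable_partners` are hypothetical, and the data is transcribed from the tables as printed. None of these pairs is a variant mapping, since the panel judged them not confusable enough for that.

```python
CONFUSABLE = "confusable"
DISTINGUISHABLE = "distinguishable"

# (Bangla sequence, other-script code point) -> NBGP decision
NBGP_DECISIONS = {
    # Table 18: Bangla and Devanagari
    ("\u0983", "\u0903"): CONFUSABLE,
    ("\u0993", "\u0909"): CONFUSABLE,
    ("\u0998", "\u0918"): CONFUSABLE,
    ("\u0981", "\u0945"): CONFUSABLE,
    # Tables 19 and 20: Bangla and Gurmukhi
    ("\u0998", "\u0A2C"): CONFUSABLE,
    ("\u0981", "\u0A71"): CONFUSABLE,
    ("\u0993", "\u0A24"): DISTINGUISHABLE,
    ("\u09B6", "\u0A05"): DISTINGUISHABLE,
    ("\u09AE", "\u0A2E"): DISTINGUISHABLE,
    ("\u09AC\u09BE", "\u0A17"): DISTINGUISHABLE,  # two-code-point sequence
    # Tables 21 and 22: Bangla and Oriya (Odia)
    ("\u0993", "\u0B13"): CONFUSABLE,
    ("\u0998", "\u0B38"): DISTINGUISHABLE,
}


def nbgp_decision(bangla, other):
    """Return the recorded decision for a pair, or None if it was not analysed."""
    return NBGP_DECISIONS.get((bangla, other))


def confusable_partners(bangla):
    """List the other-script code points recorded as confusable with `bangla`."""
    return sorted(
        other
        for (b, other), decision in NBGP_DECISIONS.items()
        if b == bangla and decision == CONFUSABLE
    )
```

A lookup like this is useful mainly for review tooling, for example to flag labels for manual inspection; it is not part of label validation itself.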
11 Appendix-II Bengali consonants and their allographs
Consonants  Phonetic Value  Allographs
Clusters  Transparent Form (Bangla Akademi font)
প p প্ত (প+ত), প্ন (প+ন), প্প (প+প), প্য (প+য), প্র (প+র), প্ল (প+ল), প্স (প+স), (+প), (+প)
ফ pʰ ফ্র (ফ+র), ফ্ল (ফ+ল), (+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t ত্ত (ত+ত), ত্ত্য (ত+ত+য), ত্ত্ব (ত+ত+ব), ত্থ (ত+থ), ত্ন (ত+ন), ত্য (ত+য), ত্ম (ত+ম), (ত+ +য), ত্ব (ত+ব), ত্র (ত+র), প্ত (প+ত), ক্ত (ক+ত), ক্ত্ব (ক+ত+ব), (+ত), (+ত+ +য), (+ত+র). There is a marked form of ত + ্ = ৎ; র্ৎ (র+ৎ)
amp (5+ত)
থ tʰ থ্য (থ+য), থ্র (থ+র), (+থ), ত্থ (ত+থ), (+থ)
(7+থ) 9 (+থ)
দ d দ্গ (দ+গ), দ্ঘ (দ+ঘ), দ্দ (দ+দ), দ্ধ (দ+ধ), দ্য (দ+য), দ্ব (দ+ব), দ্ভ (দ+ভ), দ্র (দ+র), (+দ), (+দ), (+দ+র), র্দ্র (র+দ+র)
(lt+গ) gt (lt+ধ)
ধ dʱ ধ্ন (ধ+ন), ধ্ম (ধ+ম), ধ্য (ধ+য), ধ্র (ধ+র), গ্ধ (গ+ধ), দ্ধ (দ+ধ), (+ধ), (+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ ট্ট (ট+ট), ট্য (ট+য), ট্ব (ট+ব), ট্র (ট+র), ক্ট (ক+ট), ষ্ট (ষ+ট)
ঠ ʈʰ ঠ্য (ঠ+য), ণ্ঠ (ণ+ঠ), ষ্ঠ (ষ+ঠ)
ড ɖ ড্ড (ড+ড), ড্য (ড+য), ড্র (ড+র)
ঢ ɖʱ ঢ্য (ঢ+য), ণ্ঢ (ণ+ঢ)
চ tʃ চ্চ (চ+চ), চ্ছ (চ+ছ), চ্ছ্র (চ+ছ+র), চ্ঞ (চ+ঞ), চ্য (চ+য), ঞ্চ (ঞ+চ), শ্চ (শ+চ)
(A+চ)
ছ tʃʰ ছ্র (ছ+র), চ্ছ (চ+ছ), ঞ্ছ (ঞ+ছ), শ্ছ (শ+ছ)
$ (A+ছ)
জ dʒ জ্জ (জ+জ), জ্জ্ব (জ+জ+ব), জ্ঝ (জ+ঝ), জ্ঞ (জ+ঞ), জ্য (জ+য), জ্র (জ+র), ঞ্জ (ঞ+জ)
(A+জ)
ঝ dʒʱ (not privileged enough to have clusters as a first member) জ্ঝ (জ+ঝ), ঞ্ঝ (ঞ+ঝ)
ক k ক্ক (ক+ক), ক্ট (ক+ট), ক্ত (ক+ত), ক্ত্র (ক+ত+র), ক্ত্ব (ক+ত+ব), ক্ন (ক+ন), ক্ব (ক+ব), ক্ম (ক+ম), ক্য (ক+য), ক্র (ক+র), ক্ষ (ক+ষ), ক্ষ্ণ (ক+ষ+ণ), ক্ষ্ম (ক+ষ+ম), ক্ষ্ব (ক+ষ+ব), ক্ষ্য (ক+ষ+য), ক্স (ক+স), ঙ্ক (ঙ+ক), (+ক+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (not privileged enough to have clusters as a first member) ঙ্খ (ঙ+খ)
গ g গ্গ (গ+গ), গ্দ (গ+দ), গ্ধ (গ+ধ), গ্ন (গ+ন), গ্ব (গ+ব), গ্ম (গ+ম), গ্য (গ+য), গ্র (গ+র), গ্ল (গ+ল), ঙ্গ (ঙ+গ), র্ঙ্গ (র+ঙ+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ঘ্ন (ঘ+ন), ঘ্য (ঘ+য), ঘ্র (ঘ+র), ঙ্ঘ (ঙ+ঘ)
ঞ This letter does not have any particular phonetic value but is mostly pronounced as n. ঞ্চ (ঞ+চ), ঞ্ছ (ঞ+ছ), ঞ্জ (ঞ+জ), ঞ্ঝ (ঞ+ঝ), জ্ঞ (জ+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n ণ্ট (ণ+ট), ণ্ঠ (ণ+ঠ), ণ্ড (ণ+ড), ণ্ড্র (ণ+ড+র), ণ্ঢ (ণ+ঢ), ণ্ণ (ণ+ণ), ণ্য (ণ+য), ণ্ব (ণ+ব), ক্ষ্ণ (ক+ষ+ণ), ষ্ণ (ষ+ণ), (+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ ঙ্ক (ঙ+ক), ঙ্ক্র (ঙ+ক+র), ঙ্খ (ঙ+খ), ঙ্গ (ঙ+গ), ঙ্ঘ (ঙ+ঘ), ঙ্ক্ষ (ঙ+ক+ষ) (In some contexts ঙ is replaced by ং): কং, অং
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ শ্চ (শ+চ), শ্ছ (শ+ছ), শ্ন (শ+ন), শ্ম (শ+ম), শ্র (শ+র), শ্ল (শ+ল), শ্য (শ+য)
ষ ʃ ষ্ক (ষ+ক), ষ্ট (ষ+ট), ষ্ঠ (ষ+ঠ), ষ্ণ (ষ+ণ), ষ্প (ষ+প), ষ্প্র (ষ+প+র), ষ্ফ (ষ+ফ), ষ্ট্র (ষ+ট+র), ষ্য (ষ+য), ক্ষ (ক+ষ), ক্ষ্ণ (ক+ষ+ণ), ক্ষ্ম (ক+ষ+ম)
+ (S+ণ)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (not privileged enough to have clusters)
য dʒ The secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word, it is a vowel æ (as in শ্যামল, র‍্যাপার, etc.). But after a non-initial consonant it just doubles it in pronunciation (as in কার্য, ধার্য, etc.). The র+য combination has two physical manifestations: র‍্য and র্য.
ক্য (ক+য), স্য (স+য), র‍্য (র+য) [Just র‍্য is never used in Bangla orthography; র‍্যা is, but then its last two symbols (Ya-phala and a-kara) constitute a vowel sign representing the vowel অ্যা]
র r Two manifestations:
(i) রেফ repʰ as the first member of a cluster, e.g. র্প, র্ৎ, র্দ্র, র্য, র্ধ্ব (র+ধ+ব) (earlier র্দ্ধ্ব = র+দ+ধ+ব, a four-term cluster), etc. (placed over the following consonant)
(ii) র-ফলা rɔ-pʰɔla as the second/third member of a cluster, e.g. … etc. (placed under the consonant it follows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant. অঃ, কঃ
অব
55
[136] DashNiladriShekharandBBChaudhuri1998BanglaScriptAStructuralStudyLinguisticsToday121-28Alsoavailableathttpswwwacademiaedu9967428Bangla_Script_A_Structural_Study
[137] DaniAhmedHasan(1957)lsquoSrīhatta-NāgarīLipirUtpattioBikāśrsquoBanglaAcademyPatrika(Dhaka)Vol12(Bhadra-Agrahayan1364BangabdaNumber)pg1
[138] WikipediaSylhetiNagari httpsenwikipediaorgwikiSylheti_Nagariaccessedon1952018
[139] FuruiRyosuke(2015)lsquoVariegatedAdaptationsStateFormationinBengalfromtheFifthtoSeventhCenturyrsquoinBhairabiPrasadSahuampHermannKulkeedsInterrogatingPoliticalSystemsIntegrativeProcessesandStatesinPre-ModernIndiaChapter9Pp255-73NewDelhiManohar
[140] FergusonCharesAandMunierChowdhury(1960)lsquoPhonesofBengalirsquoLanguageVol36No1pp22-59
[141] ShahidullahMuhammad(2007)BuddhistMysticSongsDhakaMowlaBrothers
[142] RayPunyaSloka(1966)BengaliLanguageHandbookWashington
[143] HaiMuhammadAbdul(1960)AphoneticandphonologicalstudyofnasalsandnasalizationinBengaliDhakaUniversityofDhaka
[144] UnicodeConsortiumProposalSummaryFormtoAccompanySubmissionsforAdditionstotheRepertoireofISOIEC10646UNICODEhttpswwwunicodeorgL2L200202387r-syloti-formpdfaccessedonMay212018
[145] WikipediaOlChiki(Unicodeblock)httpsenwikipediaorgwikiOl_Chiki_(Unicode_block)accessedonMay212018
[146]BanglaScripthttpwwwbangladesh2000combdbangla_scripthtmlaccessedonMay212018
[147] BhattacharyaAshutoshed(1942)GopichandrerGanCalcuttaCalcuttaUniversity
[149] DasSisirKumar(1975)SahibsandMunshisAnAccountoftheCollegeofFortWilliamCalcutta
[150] IslamRafiqulPabitraSarkarMahbubulHaqampRajibChakraborty(eds)(2014)BanglaAcademyPromitoBanglaByabaharikByakaran(AFunctionalGrammarofStandardBangla)DhakaBanglaAcademy
[151] SarkarPabitra[2013]lsquoBanglaSpellingReformtheLongandShortofItrsquoBanglaJournal19215-232
[152] BanglaAcademy(2012)BanglaAcademyPromitoBanglaBananerNiyam(StandardBanglaSpellingasadoptedbyBanglaAcademy)DhakaBanglaAcademy
56
[153] SarkarPabitraampRajibChakraborty2018ldquoWhathashappenedSoFarIntermsof Script Reformsrdquo Paper presented at the Face to Face meeting jointly held by theBanglaAcademyDhakaampICANNatBanglaAcademyDhakaon10072018
[154] The Unicode Consortium 2018 The Unicodereg Standard Version 110 ndash CoreSpecificationChapter12P473
57
10 Appendix-I
101 AugmentedBackus-NaurFormalism(ABNF)The Augmented Backus-Naur Formalism (ABNF) is generic in nature and whenappliedtoaspecificlanguagescriptcertainrestrictionrulesapplyInotherwordsinagivenlanguagesomeoftheFormalismstructuresdonotnecessarilyapplyTotakecareofsuchcasesrestrictionrulesaresetinplaceTheserestrictionswillhelptofine-tunetheABNFIncaseofBangla13inparticularthefollowingrulesapply
1Khaṇḍata (ৎ) is NOT allowed at the beginning of an IDN label The sameappliestoঞandthevelarnasalঙintheBanglaSchemeoffive-foldlsquovargarsquo(asdefined under Table 5) Moreover Bangla does not allow ya (য়) in thebeginningofawordeitherbutwecanciteacoupleofnativeexamples forexamplethewordয়8াvেড়া(yaeligbbɔRo)fromthepoemlsquoLichuchorrsquowrittenbyKaziNazrul IslamHowever there are instances of it being used in namesmostlyof foreignoriginsuchasYaqubwhichmaybewrittenwithya(য়) inthe beginning as inয়াwব) In very recent timeswhile transliterating someChineseandJapanesenamesinBanglaonedoescomeacrossthepossibilityofKhaṇḍata (ৎ) followedbysa (স) inthebeginningofaword forexamplexেসিরং(Tsering)
2CHcancomewithKhandaTainonlythecasewhereCisra(র)(09B0)
ৎ6 asinভৎ6 সনা
3OnlyfollowingcombinationswithVHCMwillbeallowedrarrঅ8া(togetherpronouncedasaelig)asinঅ8ািসড(acid)rarrএ8া(togetheralsopronouncedasaelig)asinএ8ািসডএ8ােসািসেয়শান
(acidassociation)
102 lsquoSylhetiNagarılipirsquoorlsquoSilotirsquoThisversionofBanglascriptresemblesthe lsquoKaithīrsquoscript(ISO12954)usedbytheAccountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh andBiharndashwidelyinuseduringthe1880sTherewereseveralothernamesofSylheti
13 This section specifically takes up issues of restrictions pertaining to Bangla (Bengali) language Assamese and Maṇipuri have not been covered in this section
58
Nagarı or Siloti (129)ndash suchas lsquoJalalabadaNagarırsquo lsquoFula (flower)Nagarırsquo lsquoMuslimNagarırsquoorlsquoMuhammadNagarırsquoItissaidthatShahJalalahadbroughtthescriptwithhim in13th-14thCentury inSylhet (138)althoughsomesuggested that itwasaninvention by the Afghan rulers of Sylhet (137) Some ascribe the credit to theBuddhistBhikkhusfromNepalPurelyforhistoricalreasonsthedetailsofthescriptwith32symbolsarereproducedhere(138)
Table17ndashTheScriptTableofSylhetiNagarıorSiloti
103 ConfusablecodepointsThe following code points were analysed and concluded that they are either (a)distinguishable or (b) confusable but not enough to be defined as variant codepoints
1031 BanglaandNāgarīorDevanāgarī
Bangla Devanāgarī NBGP
Decision
ঃ U+0983 ः U+0903 Confusable
ওU+0993 उU+0909 Confusable
ঘU+0998 घU+0918 Confusable
U+0981 U+0945 Confusable
Table18BanglaandDevanāgarīconfusablecodepoints
59
1032 BanglaandGurmukhi
Bangla Gurmukhi NBGPdecision
ঘU+0998 ਬU+0A2C Confusable
U+0981 U+0A71 Confusable
Table19BanglaandGurmukhiconfusablecodepoints
Bangla Gurmukhi
NBGPdecision
ওU+0993 ਤU+0A24
Distinguishable
শU+09B6 ਅU+0A05
Distinguishable
মU+09AE ਮU+0A2E
Distinguishable
বাU+09ACandU+09BE
ਗU+0A17
Distinguishable
Table20ndashBanglaandGurmukhıdistinguishablecodepoints
1033BanglaandOriya(Odia)
Bangla Oriya(Odia) NBGPDecision
ওU+0993 ଓU+0B13 Confusable
Table21ndashBanglaandOriyadistinguishablecodepoints
Bangla Oriya(Odia) NBGP
Decision
ঘU+0998 ସU+0B38 Distinguishable
Table22ndashBanglaandOriyadistinguishablecodepoints
60
11 Appendix-IIBengaliconsonantsandtheirallographs
Consonants PhoneticValue Allographs
Clusters TransparentForm(BanglaAkademifont)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
61
Consonants PhoneticValue Allographs
Clusters TransparentForm(BanglaAkademifont)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
62
Consonants PhoneticValue Allographs
Clusters TransparentForm(BanglaAkademifont)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
63
Consonants PhoneticValue Allographs
Clusters TransparentForm(BanglaAkademifont)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ Thisletterdoesnothaveanyparticularphoneticvaluebutmostlypronouncedasn
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ)(Insomecontextsaeligisreplacedbyং)কংঅং
) (H+ক) (H+গ) U (H+ঘ)
64
Consonants PhoneticValue Allographs
Clusters TransparentForm(BanglaAkademifont)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
65
Consonants PhoneticValue Allographs
Clusters TransparentForm(BanglaAkademifont)
স sampʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (notprivilegedenoughtohaveclusters)
য dʒThesecondarysymbol(allograph)jɔ-phalahastwophoneticvaluesWhenaddedtotheinitialconsonantinaworditisavowelaelig(asinশ8ামলর 8াপারetc)Butafteranon-initialconsonantitjustdoublesitinpronunciation(asinকায6ধায6etc)The+যcombinationhastwophysicalmanifestationsmdashর 8andয6
ক8(p+য)স8(+য)র 8(+য)[Justর 8isneverusedinBanglaorthographyর 8াisbutthenitslasttwosymbolsYa-phalaa-karaconstituteavowelsignrepresentingthevowelঅ8া]
66
Consonants PhoneticValue Allographs
Clusters TransparentForm(BanglaAkademifont)
র r Twomanifestationsmdashi lরফrepʰasthefirst
memberofaclusteregপ6ৎ6 not6 য6D6(+deg+ব)(earlierE6=+brvbar+deg+বafour-termcluster)etc(placedoverthefollowingconsonant)
ii র-ফলাrɔ-pʰɔlaasthesecondthirdmemberofaclustereg etc(placedundertheconsonantitfollows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ hwordfinallywordmediallyitdoublesthepronunciationofthefollowingconsonant
অঃকঃ
অব
56
[153] SarkarPabitraampRajibChakraborty2018ldquoWhathashappenedSoFarIntermsof Script Reformsrdquo Paper presented at the Face to Face meeting jointly held by theBanglaAcademyDhakaampICANNatBanglaAcademyDhakaon10072018
[154] The Unicode Consortium 2018 The Unicodereg Standard Version 110 ndash CoreSpecificationChapter12P473
57
10 Appendix-I
101 AugmentedBackus-NaurFormalism(ABNF)The Augmented Backus-Naur Formalism (ABNF) is generic in nature and whenappliedtoaspecificlanguagescriptcertainrestrictionrulesapplyInotherwordsinagivenlanguagesomeoftheFormalismstructuresdonotnecessarilyapplyTotakecareofsuchcasesrestrictionrulesaresetinplaceTheserestrictionswillhelptofine-tunetheABNFIncaseofBangla13inparticularthefollowingrulesapply
1Khaṇḍata (ৎ) is NOT allowed at the beginning of an IDN label The sameappliestoঞandthevelarnasalঙintheBanglaSchemeoffive-foldlsquovargarsquo(asdefined under Table 5) Moreover Bangla does not allow ya (য়) in thebeginningofawordeitherbutwecanciteacoupleofnativeexamples forexamplethewordয়8াvেড়া(yaeligbbɔRo)fromthepoemlsquoLichuchorrsquowrittenbyKaziNazrul IslamHowever there are instances of it being used in namesmostlyof foreignoriginsuchasYaqubwhichmaybewrittenwithya(য়) inthe beginning as inয়াwব) In very recent timeswhile transliterating someChineseandJapanesenamesinBanglaonedoescomeacrossthepossibilityofKhaṇḍata (ৎ) followedbysa (স) inthebeginningofaword forexamplexেসিরং(Tsering)
2CHcancomewithKhandaTainonlythecasewhereCisra(র)(09B0)
ৎ6 asinভৎ6 সনা
3OnlyfollowingcombinationswithVHCMwillbeallowedrarrঅ8া(togetherpronouncedasaelig)asinঅ8ািসড(acid)rarrএ8া(togetheralsopronouncedasaelig)asinএ8ািসডএ8ােসািসেয়শান
(acidassociation)
102 lsquoSylhetiNagarılipirsquoorlsquoSilotirsquoThisversionofBanglascriptresemblesthe lsquoKaithīrsquoscript(ISO12954)usedbytheAccountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh andBiharndashwidelyinuseduringthe1880sTherewereseveralothernamesofSylheti
13 This section specifically takes up issues of restrictions pertaining to Bangla (Bengali) language Assamese and Maṇipuri have not been covered in this section
58
Nagarı or Siloti (129)ndash suchas lsquoJalalabadaNagarırsquo lsquoFula (flower)Nagarırsquo lsquoMuslimNagarırsquoorlsquoMuhammadNagarırsquoItissaidthatShahJalalahadbroughtthescriptwithhim in13th-14thCentury inSylhet (138)althoughsomesuggested that itwasaninvention by the Afghan rulers of Sylhet (137) Some ascribe the credit to theBuddhistBhikkhusfromNepalPurelyforhistoricalreasonsthedetailsofthescriptwith32symbolsarereproducedhere(138)
Table17ndashTheScriptTableofSylhetiNagarıorSiloti
103 ConfusablecodepointsThe following code points were analysed and concluded that they are either (a)distinguishable or (b) confusable but not enough to be defined as variant codepoints
1031 BanglaandNāgarīorDevanāgarī
Bangla Devanāgarī NBGP
Decision
ঃ U+0983 ः U+0903 Confusable
ওU+0993 उU+0909 Confusable
ঘU+0998 घU+0918 Confusable
U+0981 U+0945 Confusable
Table18BanglaandDevanāgarīconfusablecodepoints
59
1032 BanglaandGurmukhi
Bangla Gurmukhi NBGPdecision
ঘU+0998 ਬU+0A2C Confusable
U+0981 U+0A71 Confusable
Table19BanglaandGurmukhiconfusablecodepoints
Bangla Gurmukhi
NBGPdecision
ওU+0993 ਤU+0A24
Distinguishable
শU+09B6 ਅU+0A05
Distinguishable
মU+09AE ਮU+0A2E
Distinguishable
বাU+09ACandU+09BE
ਗU+0A17
Distinguishable
Table20ndashBanglaandGurmukhıdistinguishablecodepoints
1033BanglaandOriya(Odia)
Bangla Oriya(Odia) NBGPDecision
ওU+0993 ଓU+0B13 Confusable
Table21ndashBanglaandOriyadistinguishablecodepoints
Bangla Oriya(Odia) NBGP
Decision
ঘU+0998 ସU+0B38 Distinguishable
Table22ndashBanglaandOriyadistinguishablecodepoints
60
11 Appendix-IIBengaliconsonantsandtheirallographs
Consonants PhoneticValue Allographs
Clusters TransparentForm(BanglaAkademifont)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র)Thereisamarkedformofত+=ৎৎ6 (+xৎ)
amp (5+ত)
61
Consonants PhoneticValue Allographs
Clusters TransparentForm(BanglaAkademifont)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
62
Consonants PhoneticValue Allographs
Clusters TransparentForm(BanglaAkademifont)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (notprivilegedenoughtohaveclustersasafirstmember)Ouml(Ocirc+ঝ)Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (notprivilegedenoughtohaveclustersasafirstmember)egrave(aelig+খ)
63
Consonants PhoneticValue Allographs
Clusters TransparentForm(BanglaAkademifont)
10 Appendix-I
10.1 Augmented Backus-Naur Formalism (ABNF)
The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when applied to a specific language or script, certain restriction rules apply. In other words, in a given language, some of the Formalism structures do not necessarily apply. To take care of such cases, restriction rules are set in place. These restrictions will help to fine-tune the ABNF. In the case of Bangla¹³ in particular, the following rules apply:
1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold ‘varga’ (as defined under Table 5). Moreover, Bangla does not normally allow ya (য়) at the beginning of a word either, although a couple of native examples can be cited, for example the word য়্যাব্বড়ো (yæbbɔRo) from the poem ‘Lichuchor’ written by Kazi Nazrul Islam. There are also instances of it being used in names, mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at the beginning (as in য়াকুব). In very recent times, while transliterating some Chinese and Japanese names in Bangla, one does come across the possibility of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example ৎসেরিং (Tsering).
2. CH can come with Khaṇḍa ta only in the case where C is ra (র) (U+09B0): র্ৎ, as in ভর্ৎসনা.
3. Only the following combinations with VHCM will be allowed:
→ অ্যা (together pronounced as æ), as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ), as in এ্যাসিড, এ্যাসোসিয়েশান (acid, association)
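The three restriction rules above can be sketched as a small label check. This is only an illustration under stated assumptions: the function name check_label and the message strings are hypothetical, the normative rules live in the LGR XML file, and the word-level remarks about ya (য়) in rule 1 are not enforced here because the text cites exceptions to them.

```python
# Hypothetical helper illustrating restriction rules 1-3; not the normative LGR.
KHANDA_TA = "\u09ce"  # ৎ
NYA = "\u099e"        # ঞ
NGA = "\u0999"        # ঙ
RA = "\u09b0"         # র
HASANTA = "\u09cd"    # ্  (the H of CH and VHCM)
YA = "\u09af"         # য
AA_SIGN = "\u09be"    # া
VOWEL_A = "\u0985"    # অ
VOWEL_E = "\u098f"    # এ
# Independent vowels অ..ঔ (the range also spans a few unassigned code points).
INDEPENDENT_VOWELS = {chr(cp) for cp in range(0x0985, 0x0995)}


def check_label(label):
    """Return a list of rule violations found in `label` (empty if none)."""
    problems = []
    # Rule 1: ৎ, ঞ and ঙ may not start a label.
    if label[:1] in (KHANDA_TA, NYA, NGA):
        problems.append("rule 1: label starts with a disallowed letter")
    for i, ch in enumerate(label):
        # Rule 2: C + H + ৎ is allowed only when C is র.
        if ch == KHANDA_TA and i >= 1 and label[i - 1] == HASANTA:
            if i < 2 or label[i - 2] != RA:
                problems.append("rule 2: khanda ta after a consonant other than ra")
        # Rule 3: V + H is allowed only in the sequences অ্যা and এ্যা.
        if ch in INDEPENDENT_VOWELS and label[i + 1:i + 2] == HASANTA:
            if ch not in (VOWEL_A, VOWEL_E) or label[i + 2:i + 4] != YA + AA_SIGN:
                problems.append("rule 3: vowel + hasanta outside the allowed sequences")
    return problems


print(check_label("\u09ad\u09b0\u09cd\u09ce\u09b8\u09a8\u09be"))  # ভর্ৎসনা
print(check_label("\u09ce\u09b8\u09c7\u09b0\u09bf\u0982"))        # ৎসেরিং
```

Returning a list of messages rather than a single boolean makes it easy to see which rule a test label trips over.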
13 This section specifically takes up issues of restrictions pertaining to the Bangla (Bengali) language. Assamese and Maṇipuri have not been covered in this section.
10.2 ‘Sylheti Nagarī lipi’ or ‘Siloti’
This version of the Bangla script resembles the ‘Kaithī’ script (ISO 12954) used by the accountants (perhaps by the Kāyastha community) in Eastern Uttar Pradesh and Bihar, widely in use during the 1880s. There were several other names of Sylheti Nagarī or Siloti (129), such as ‘Jalalabada Nagarī’, ‘Fula (flower) Nagarī’, ‘Muslim Nagarī’ or ‘Muhammad Nagarī’. It is said that Shah Jalala had brought the script with him in the 13th-14th century to Sylhet (138), although some suggested that it was an invention by the Afghan rulers of Sylhet (137). Some ascribe the credit to the Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script with 32 symbols are reproduced here (138).
Table 17 – The Script Table of Sylheti Nagarī or Siloti
10.3 Confusable code points
The following code points were analysed, and it was concluded that they are either (a) distinguishable or (b) confusable, but not enough to be defined as variant code points.
10.3.1 Bangla and Nāgarī or Devanāgarī
Bangla        Devanāgarī    NBGP Decision
ঃ U+0983      ः U+0903      Confusable
ও U+0993      उ U+0909      Confusable
ঘ U+0998      घ U+0918      Confusable
◌ঁ U+0981     ◌ॅ U+0945     Confusable
Table 18 – Bangla and Devanāgarī confusable code points
10.3.2 Bangla and Gurmukhi
Bangla        Gurmukhi      NBGP decision
ঘ U+0998      ਬ U+0A2C      Confusable
◌ঁ U+0981     ◌ੱ U+0A71     Confusable
Table 19 – Bangla and Gurmukhi confusable code points
Bangla                    Gurmukhi      NBGP decision
ও U+0993                  ਤ U+0A24      Distinguishable
শ U+09B6                  ਅ U+0A05      Distinguishable
ম U+09AE                  ਮ U+0A2E      Distinguishable
বা U+09AC and U+09BE      ਗ U+0A17      Distinguishable
Table 20 – Bangla and Gurmukhī distinguishable code points
10.3.3 Bangla and Oriya (Odia)
Bangla        Oriya (Odia)  NBGP Decision
ও U+0993      ଓ U+0B13      Confusable
Table 21 – Bangla and Oriya confusable code points
Bangla        Oriya (Odia)  NBGP Decision
ঘ U+0998      ସ U+0B38      Distinguishable
Table 22 – Bangla and Oriya distinguishable code points
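For readers who want to consult these decisions programmatically, the rows of Tables 18 to 22 can be held in a simple lookup. The sketch below is only a convenience transcription of the tables; the names DECISIONS, nbgp_decision and confusable_partners are hypothetical, and, as the section states, none of these pairs is defined as a variant.

```python
# Decisions transcribed from Tables 18-22.
# Keys are (Bangla sequence, other-script code point); names are illustrative.
DECISIONS = {
    # Devanagari (Table 18)
    ("\u0983", "\u0903"): "confusable",
    ("\u0993", "\u0909"): "confusable",
    ("\u0998", "\u0918"): "confusable",
    ("\u0981", "\u0945"): "confusable",
    # Gurmukhi (Tables 19 and 20)
    ("\u0998", "\u0a2c"): "confusable",
    ("\u0981", "\u0a71"): "confusable",
    ("\u0993", "\u0a24"): "distinguishable",
    ("\u09b6", "\u0a05"): "distinguishable",
    ("\u09ae", "\u0a2e"): "distinguishable",
    ("\u09ac\u09be", "\u0a17"): "distinguishable",
    # Oriya (Tables 21 and 22)
    ("\u0993", "\u0b13"): "confusable",
    ("\u0998", "\u0b38"): "distinguishable",
}


def nbgp_decision(bangla, other):
    """Return the recorded decision for a pair, or None if the pair is not listed."""
    return DECISIONS.get((bangla, other))


def confusable_partners(bangla):
    """List the other-script code points recorded as confusable with `bangla`."""
    return sorted(o for (b, o), d in DECISIONS.items()
                  if b == bangla and d == "confusable")


print(nbgp_decision("\u0993", "\u0b13"))
print([f"U+{ord(c):04X}" for c in confusable_partners("\u0998")])
```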
11 Appendix-II: Bengali consonants and their allographs
Consonants    Phonetic Value    Allographs
Clusters    Transparent Form (Bangla Akademi font)
প p y(z+ত)(z +ন)|(z +প)প8(z +য)r(z +র)(z +ল)~(z +স)(+প)(+প)
ফ pʰ (t +র)(t +ল)(+ফ)
ব b ( +জ)( +দ)( +ধ)v(+ব)ব8(+য)(+র)(+ল)ভ(+ভ)(+ব)(+ব)
(0+ধ) 2 (3+ব)
ভ bʱ ভ8(+য)(+র)(+ল)
ত t (x+ত)8(x+x+য)(x+x+ব)(x+থ)(x+ন)ত8(x+য)(x+ম)8(x++য)(x+ব)(x+র)y(z+ত)(p+ত)(p+x+ব)(+ত)q8(+x++য) (+x+র) There is a marked form of ত + ্ = ৎ; র্ৎ (র + ৎ)
amp (5+ত)
থ tʰ থ8(iexcl+য)cent(iexcl+র)pound(+থ)(x+থ)curren(+থ)
(7+থ) 9 (+থ)
দ d yen(brvbar+গ)sect(brvbar+ঘ)uml(brvbar+দ)copy(brvbar+ধ)দ8(brvbar+য)ordf(brvbar+ব)laquo(brvbar+ভ)not(brvbar+র)(+দ)shy(+দ)reg(+brvbar+র)not6 (+brvbar+র)
(lt+গ) gt (lt+ধ)
ধ dʱ macr(deg+ন)plusmn(deg+ম)ধ8(deg+য)sup2(deg+র)sup3(acute+ধ)copy(brvbar+ধ)(+ধ)micro(+ধ)
( (+ধ) gt (lt+ধ) (0+ধ) (7+ধ)
ট ʈ para(middot+ট)ট8(middot+য)cedil(middot+ব)sup1(middot+র)ordm(p+ট)raquo(frac14+ট)
ঠ ʈʰ ঠ8(frac12+য)frac34(iquest+ঠ)Agrave(frac14+ঠ)
ড ɖ Aacute(Acirc+ড)ড8(Acirc+য)Atilde(Acirc+র)
ঢ ɖʱ ঢ8(Auml+য)Aring(iquest+ঢ)
চ tʃ AElig(Ccedil+চ)Egrave(Ccedil+ছ)Eacute(Ccedil+Ecirc+র)Euml(Ccedil+ঞ)চ8(Ccedil+য)Igrave(Iacute+চ)Icirc(Iuml+চ)
(A+চ)
ছ tʃʰ ETH(Ecirc+র)Egrave(Ccedil+ছ)Ntilde(Iacute+ছ)Ograve(Iuml+ছ)
$ (A+ছ)
জ dʒ Oacute(Ocirc+জ)Otilde(Ocirc+Ocirc+ব)Ouml(Ocirc+ঝ)times(Ocirc+ঞ)জ8(Ocirc+য)Oslash(Ocirc+র)Ugrave(Iacute+জ)
(A+জ)
ঝ dʒʱ (not privileged enough to have clusters as a first member) Ouml(Ocirc+ঝ) Uacute(Iacute+ঝ)
ক k Ucirc(p+ক)ordm(p+ট)(p+ত)Uuml(p+x+র)(p+x+ব)Yacute(p+ন)THORN(p+ব)szlig(p+ম)ক8(p+য)agrave(p+র)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)auml(p+frac14+ব)aacute8(p+frac14+য)aring(p+স)s(aelig+ক)ccedil(+p+র)
amp (5+ত) (5+E+র) G (5+E+ব) (5+র) ) (H+ক) J (+5+র)
খ kʰ (not privileged enough to have clusters as a first member) egrave(aelig+খ)
গ g eacute(acute+গ)ecirc(acute+দ)sup3(acute+ধ)euml(acute+ন)igrave(acute+ব)iacute(acute+ম)গ8(acute+য)icirc(acute+র)iuml(acute+ল)eth(aelig+গ)eth6 (+aelig+গ)
( (+ধ) (H+গ) K (L+H+গ)
ঘ gʱ ntilde(ograve+ন)ঘ8(ograve+য)oacute(ograve+র)ocirc(aelig+ঘ)
ঞ This letter does not have any particular phonetic value, but it is mostly pronounced as n.
Igrave(Iacute+চ)Ntilde(Iacute+ছ)Ugrave(Iacute+জ)Uacute(Iacute+ঝ)times(Ocirc+ঞ)
(A+চ) $ (A+ছ) (A+জ) M (A+ঝ)
ণ n otilde(iquest+ট)frac34(iquest+ঠ)ouml(iquest+ড)divide(iquest+Acirc+র)Aring(iquest+ঢ)oslash(iquest+ণ)ণ8(iquest+য)ugrave(iquest+ব)acirc(p+frac14+ণ)uacute(frac14+ণ)ucirc(+ণ)
O (P+ড) - (P+R+র) + (S+ণ)
ঙং ŋ s(aelig+ক)uuml(aelig+p+র)egrave(aelig+খ)eth(aelig+গ)ocirc(aelig+ঘ)yacute(aelig+p+ষ) (In some contexts the first member ঙ is replaced by ং: কং, অং)
) (H+ক) (H+গ) U (H+ঘ)
ম m thorn(+ল)yuml(+প)(+z+র)(+ভ)(++র)$(+ম)(+র)(x+ম)plusmn(deg+ম)amp(+ম)atilde(p+frac14+ম)
W (3+ম)
ন n (+ট)((+middot+র))(iquest+ঠ)u(+ড)(+Acirc+র)(+ত)q(+x+র)q8(+x++য)curren(+থ)shy(+দ)reg(+brvbar+র)micro(+ধ)+(+deg+র)(+brvbar+ব)-(+ন)(+ম)ন8(+য)(+স)0(+ন)
(7+থ) (7+ধ) (7+Y+র)
শ ʃ Icirc(Iuml+চ)Ograve(Iuml+ছ)1(Iuml+ন)2(Iuml+ম)3(Iuml+র)4(Iuml+ল)শ8(Iuml+য)
ষ ʃ 5(frac14+ক)raquo(frac14+ট)Agrave(frac14+ঠ)uacute(frac14+ণ)6(frac14+প)7(frac14+z+র)8(frac14+ফ)raquo(frac14+ট)9(frac14+middot+র)Agrave(frac14+ঠ)uacute(frac14+ণ)ষ8(frac14+য)aacute(p+ষ)acirc(p+frac14+ণ)atilde(p+frac14+ম)
+ (S+ণ)
স s & ʃ (+ক)(+ট)(+প)(+ফ)lt(+ত)pound(+থ)(+ট)(+ক)=(+খ)স8(+য)gt(+র)(+ল)aring(p+স)
9 (+থ)
হ h ucirc(+ণ)0(+ন)amp(+ম)হ8(+য)(+র)A(+ল)
W (3+ম)
ড় ɽ B(C+গ)
ঢ় ɽʱ (not privileged enough to have clusters)
য dʒ The secondary symbol (allograph) jɔ-phala has two phonetic values. When added to the initial consonant in a word, it is the vowel æ (as in শ্যামল, র‍্যাপার, etc.). But after a non-initial consonant, it just doubles that consonant in pronunciation (as in কার্য, ধার্য, etc.). The র+য combination has two physical manifestations: র‍্য and র্য.
ক্য (ক+য), স্য (স+য), র‍্য (র+য) [র‍্য on its own is never used in Bangla orthography; র‍্যা is, but then its last two symbols, Ya-phala and a-kara, constitute a vowel sign representing the vowel অ্যা.]
র r Two manifestations:
(i) রেফ (repʰ), as the first member of a cluster, e.g. র্প, র্ৎ, র্দ্র, র্য, র্ধ্ব (র+ধ+ব) (earlier র্দ্ধ্ব = র+দ+ধ+ব, a four-term cluster), etc. (placed over the following consonant)
(ii) র-ফলা (rɔ-pʰɔla), as the second or third member of a cluster (placed under the consonant it follows)
ল l F(+গ)(+প)G(+ব)H(+ম)I(+ট)J(+ড)K(+ক)F(+গ)L(+দ)ল8(+য)iuml(acute+ল)(+ল)thorn(+ল)
ঃ h word-finally; word-medially it doubles the pronunciation of the following consonant
অঃকঃ
অব
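The parenthesised entries in the table above list the members of each cluster joined by "+". As a minimal sketch of how such a cluster is usually encoded in Unicode text, the members can be joined with the hasanta (U+09CD); the same member list also shows which manifestation of র applies. The function names compose_cluster and ra_forms are hypothetical, and the code says nothing about how a given font renders the result.

```python
HASANTA = "\u09cd"  # ্, joins the members of a cluster
RA = "\u09b0"       # র


def compose_cluster(members):
    """Join consonants with hasanta, e.g. ['ক', 'ষ', 'ণ'] -> 'ক্ষ্ণ'."""
    return HASANTA.join(members)


def ra_forms(members):
    """Report which manifestations of র a cluster contains.

    'reph' when র is the first member of a multi-member cluster,
    'ra-phala' when র is the second or a later member.
    """
    forms = []
    if len(members) > 1 and members[0] == RA:
        forms.append("reph")
    if RA in members[1:]:
        forms.append("ra-phala")
    return forms


print(compose_cluster(["\u0995", "\u09b7", "\u09a3"]))  # ক + ষ + ণ
print(ra_forms(["\u09b0", "\u09a7", "\u09ac"]))         # র + ধ + ব
print(ra_forms(["\u0995", "\u09b0"]))                   # ক + র
```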
63
Consonants PhoneticValue Allographs
Clusters TransparentForm(BanglaAkademifont)
গ   g
Clusters: গ্গ (গ+গ), গ্দ (গ+দ), গ্ধ (গ+ধ), গ্ন (গ+ন), গ্ব (গ+ব), গ্ম (গ+ম), গ্য (গ+য), গ্র (গ+র), গ্ল (গ+ল), ঙ্গ (ঙ+গ), র্ঙ্গ (র+ঙ+গ)
Transparent forms (Bangla Akademi font) given for: গ+ধ, ঙ+গ, র+ঙ+গ
ঘ   gʱ
Clusters: ঘ্ন (ঘ+ন), ঘ্য (ঘ+য), ঘ্র (ঘ+র), ঙ্ঘ (ঙ+ঘ)
ঞ   This letter does not have any particular phonetic value of its own, but it is mostly pronounced as n.
Clusters: ঞ্চ (ঞ+চ), ঞ্ছ (ঞ+ছ), ঞ্জ (ঞ+জ), ঞ্ঝ (ঞ+ঝ), জ্ঞ (জ+ঞ)
Transparent forms (Bangla Akademi font) given for: ঞ+চ, ঞ+ছ, ঞ+জ, ঞ+ঝ
ণ   n
Clusters: ণ্ট (ণ+ট), ণ্ঠ (ণ+ঠ), ণ্ড (ণ+ড), ণ্ড্র (ণ+ড+র), ণ্ঢ (ণ+ঢ), ণ্ণ (ণ+ণ), ণ্য (ণ+য), ণ্ব (ণ+ব), ক্ষ্ণ (ক+ষ+ণ), ষ্ণ (ষ+ণ), হ্ণ (হ+ণ)
Transparent forms (Bangla Akademi font) given for: ণ+ড, ণ+ড+র, ষ+ণ
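The component notation used in the table, such as (ঙ+ক+ষ), can be related to encoded text with a small sketch. It assumes the usual Unicode encoding of Bangla conjuncts, in which consonants are joined by the hasanta (U+09CD), and it assumes the Unicode convention of placing a zero width joiner (U+200D) before the hasanta to request ya-phala after র instead of reph. The helper names `cluster` and `code_points` are illustrative only and are not part of this proposal.

```python
HASANTA = "\u09CD"  # BENGALI SIGN VIRAMA, called hasanta in Bangla
ZWJ = "\u200D"      # ZERO WIDTH JOINER

NGA, KA, SSA = "\u0999", "\u0995", "\u09B7"  # ঙ, ক, ষ
RA, YA = "\u09B0", "\u09AF"                  # র, য


def cluster(*consonants: str) -> str:
    """Join consonants with hasanta, mirroring the (X+Y+Z) notation."""
    return HASANTA.join(consonants)


def code_points(text: str) -> list[str]:
    """List the code points of a string as U+XXXX labels."""
    return [f"U+{ord(ch):04X}" for ch in text]


nga_ka_ssa = cluster(NGA, KA, SSA)      # (ঙ+ক+ষ)
ra_ya_reph = cluster(RA, YA)            # reph placed over য
ra_ya_phala = RA + ZWJ + HASANTA + YA   # র followed by ya-phala (assumed convention)

for name, text in [
    ("nga+ka+ssa", nga_ka_ssa),
    ("reph+ya", ra_ya_reph),
    ("ra+ya-phala", ra_ya_phala),
]:
    print(name, text, " ".join(code_points(text)))
```

How each sequence is drawn depends on the font and the shaping engine. The sketch only illustrates encoding and makes no claim about which sequences the LGR admits.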