International Journal of Computer Applications (0975 – 8887)
Volume 147 – No.14, August 2016

A Revised Unicode based Sorting Algorithm for Bengali Texts

Md. Mahfuzur Rahaman
Dept. of Computer Science and Engineering
Shahjalal University of Science and Technology
Sylhet – 3114, Bangladesh

ABSTRACT
This paper describes a sorting algorithm for Bengali texts, which is one of the most vital tasks in Bengali Natural Language Processing. Because Unicode is much preferable to ASCII encoding, we need to use this representation for the Bengali language. However, due to some distinct properties of the Bengali language, Bengali texts cannot be sorted directly using the order of the Unicode character scheme. A few works have addressed this topic, some for ASCII encoding and some for Unicode, but they still have drawbacks, and there is still no standard for sorting Bengali texts. In this paper, we discuss the previous approaches and propose a revised and easier procedure to sort Unicode Bengali texts. We use a mapping to simplify the sorting process. The efficiency depends on the efficiency of the underlying sorting algorithm. This method is able to sort any Unicode Bengali text. It will also work for Unicode text in any language if only the mapping part is changed, so the process is both keyboard and language independent.

General Terms
Theoretical Informatics

Keywords
Bengali Word Sorting, Bengali Text Sorting, Unicode Bengali Text Sorting, Bengali Linguistic Sort, Bengali Dictionary Sort, Bangla Academy Dictionary Based Sort.

1.  INTRODUCTION
Bengali or Bangla is an Indo-Aryan language spoken predominantly in Bangladesh and in the Indian states of West Bengal and Tripura [1]. With about 250 million native and about 300 million total speakers worldwide, it is the second most spoken language in the Indian subcontinent, the seventh most spoken language in the world by total number of native speakers, and the tenth most spoken language by total number of speakers [1][2]. The language is derived from Sanskrit and hence appears to be similar to Hindi [3]. It is written left to right and top to bottom on the page. The vocabulary of Bengali is similar to that of Sanskrit, and there are to some extent similarities with Latin. As it is one of the most spoken languages and has some complexities in its structure, it becomes a fundamental necessity to have standardization in areas such as Bengali keyboard layout, Bengali character recognition, and voice technologies such as speech-to-text and text-to-speech. Bengali text sorting is the first issue that needs to be standardized. There are some papers on this topic, but none of them has yet set a standard for Bengali text sorting. In this paper, we present an analysis of the drawbacks and limitations of the previous works. We also propose a revised procedure that can be used as a standard procedure to sort Bengali texts. This procedure is easy to comprehend and straightforward to implement in any programming language. It sorts Bengali texts in Unicode representation according to the Bangla Academy [4] standard. As Bangla Academy is Bangladesh's national language authority [5] and the national academy for promoting the Bengali language in Bangladesh, we need to follow Bangla Academy to set the standard for Bengali linguistic works.

2.  BENGALI LANGUAGE
The Bengali language is written using the Bengali alphabet, which is the 6th most widely used writing system in the world. The script is shared by Assamese with minor variants and is the basis for other writing systems such as Meithei and Bishnupriya Manipuri [6]. The script has also been used to write Sanskrit in the region of Bengal.

2.1  Base Letters
There are 11 vowels and 39 consonants in the written form of the Bengali alphabet. When these letters are used in full form, we call them base letters.

Independent Vowels (স্বরবর্ণ)
    অ আ ই ঈ উ ঊ ঋ এ ঐ ও ঔ

Consonants (ব্যঞ্জনবর্ণ)
    ক খ গ ঘ ঙ চ ছ জ ঝ ঞ ট ঠ ড ঢ ণ ত থ দ ধ ন
    প ফ ব ভ ম য র ল শ ষ স হ ড় ঢ় য় ৎ ◌ং ◌ঃ ◌ঁ

2.2  Modifiers
There are two types of modifiers in the Bengali alphabet: vowel modifiers and consonant modifiers.

Dependent Vowels or Vowel Modifiers (-কার)
10 of the 11 vowels are used as modifiers to consonants. They are called vowel modifiers and are generally known as -কার. They can never be used independently. Following is the list of vowel modifiers with examples:

Table 1. List of Vowel Modifiers with Examples

    Vowel    Vowel Modifier    Example
    আ        ◌া                কা
    ই        ◌ি                কি
    ঈ        ◌ী                কী
    উ        ◌ু                কু
    ঊ        ◌ূ                কূ
    ঋ        ◌ৃ                কৃ
Table 1 (continued)

    Vowel    Vowel Modifier    Example
    এ        ◌ে                কে
    ঐ        ◌ৈ                কৈ
    ও        ◌ো                কো
    ঔ        ◌ৌ                কৌ

Consonant Modifiers (-ফলা)
Like the vowel modifiers, some consonants have short forms when they are used with another consonant. They are called consonant modifiers and are generally known as -ফলা. Some of them are listed below with examples:

Table 2. List of Consonant Modifiers with Examples

    Consonant    Consonant Modifier    Example
    ন            ন-ফলা                 যত্ন
    ম            ম-ফলা                 আত্মা
    য            য-ফলা                 জন্য
    র            র-ফলা                 প্রতি
    ল            ল-ফলা                 শুক্ল
    ব            ব-ফলা                 জ্বর

2.3  Compound Characters
When two or more consonant characters are used together, they are called compound characters. There are about 285 compound characters in Bengali [7]. Some examples of compound characters are listed below:

Table 3. Some Compound Characters with usage

    Word        Compound Character    Decompressed Form       No. of Alphabets Used
    উজ্জ্বল      জ্জ্ব                  জ + ◌্ + জ + ◌্ + ব     3
    উচ্ছ্বাস     চ্ছ্ব                  চ + ◌্ + ছ + ◌্ + ব     3
    দ্বন্দ্ব      দ্ব                    দ + ◌্ + ব              2
    দ্বন্দ্ব      ন্দ্ব                  ন + ◌্ + দ + ◌্ + ব     3
    বৃষ্টি       ষ্ট                    ষ + ◌্ + ট              2
    মুক্তি       ক্ত                    ক + ◌্ + ত              2

2.4  Alphabetical order of Bangla Academy
Generally, we use the following alphabetical order everywhere:

    অ আ ই ঈ উ ঊ ঋ এ ঐ ও ঔ
    ক খ গ ঘ ঙ চ ছ জ ঝ ঞ ট ঠ ড ঢ ণ ত থ দ ধ ন প ফ ব ভ ম য র ল
    শ ষ স হ ড় ঢ় য় ৎ ◌ং ◌ঃ ◌ঁ

But ◌ং, ◌ঃ and ◌ঁ are used like modifiers and cannot be used without another letter. Although many compound characters are made up with consonant modifiers, they can also be written with the conjunct character (◌্) between two consonants. To simplify these kinds of complexities, Bangla Academy uses the following order for Bengali words in its dictionary:

    অ আ ই ঈ উ ঊ ঋ এ ঐ ও ঔ ◌ং ◌ঃ ◌ঁ
    ক খ গ ঘ ঙ চ ছ জ ঝ ঞ ট ঠ ড ড় ঢ ঢ় ণ ত ৎ থ দ ধ ন প ফ ব ভ ম
    য য় র ল শ ষ স হ

We followed this alphabetic order to sort Bengali texts in our approach.

3.  DIFFICULTIES TO SORT BENGALI TEXTS
The problems associated with sorting Bengali texts are as follows:

• Bengali words should be sorted according to the Bangla Academy [4] standard, but the Unicode representation of the Bengali alphabet is not in Bangla Academy Dictionary order. So, a mapping is required to sort texts.
• Compound characters with a consonant modifier or the conjunct character make Bengali sorting more complex.
• Vowel modifiers can precede or follow the base letters in Bengali text, but the modifier should be considered after the base letter in computation for proper sorting.
• The Unicode characters র, য়, ড়, ঢ় can be written in two ways: as a single character or as a compound character with the ◌় character.
• The two vowel modifiers ◌ো and ◌ৌ can each be written as a single Unicode character or as two modifiers, one preceding and one following the base letter.
• The ambiguity between র্য and র‌্য adds a bit more complexity to sorting Bengali texts. In both cases we get র + ◌্ + য, but they are not the same (র‌্য = র + ZWNJ + ◌্ + য).

4.  PREVIOUS WORKS
Md. Ruhul Amin et al. [8] proposed an efficient Unicode based sorting algorithm for Bengali words. They used a null modifier, which is not mandatory. This approach cannot sort texts in the following situation:

Table 4. Situation cannot be solved by [8]

    Word       Decompressed Form     Representation with mapped value
    বসতি       ব ৹ স ৹ ত ◌ি          520161014503
    বস্‌তি      ব ৹ স ◌্ ত ◌ি         520161124503
    বস্তি       ব ৹ স ◌্ ত ◌ি         520161124503
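
The last two rows of Table 4 can be told apart at the code point level. The following is a minimal Python sketch, not the procedure of [8] or of this paper. The helper name `code_points` and the variable names are hypothetical and introduced only for illustration. It builds the conjunct spelling (BA, SA, hasanta, TA, vowel sign I) and the same spelling with ZWNJ after the hasanta, and checks that the two strings become identical once ZWNJ (U+200C) is discarded.

```python
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER


def code_points(word):
    """Return the code points of `word` as 'U+XXXX' strings (hypothetical helper)."""
    return [f"U+{ord(ch):04X}" for ch in word]


# Conjunct form: BA + SA + HASANTA + TA + VOWEL SIGN I
joined = "\u09ac\u09b8\u09cd\u09a4\u09bf"
# Explicit-hasanta form: the same letters, with ZWNJ after the hasanta
split = "\u09ac\u09b8\u09cd" + ZWNJ + "\u09a4\u09bf"

print(code_points(joined))
print(code_points(split))

# The two strings differ only by ZWNJ, so a mapping that ignores
# ZWNJ assigns both of them the same key.
print(joined == split)                    # False
print(joined == split.replace(ZWNJ, ""))  # True
```

A mapping that assigns ZWNJ its own value keeps the two spellings distinct in the sort key.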
                                                                                                                 
We actually get বস্‌তি = ব + ৹ + স + ◌্ + ZWNJ + ত + ◌ি, where ZWNJ is not mentioned in their process. So their algorithm will treat both বস্‌তি and বস্তি as the same word.

Aamira Shabnam et al. [9] have described an easily comprehensible Unicode based sorting algorithm for Bangla words. They did not use any null modifier and used a single-digit mapping.

Table 5. Situation not handled by [9]

    Word      Decompressed Form     Representation with mapped value
    কলম       ক + ল + ম             255652
    কলাম      ক + ল + ◌া + ম        2556052

If the mapped strings are sorted in lexicographical order, we will get কলাম before কলম, which is not correct.

Aamira Shabnam et al. [10] have also described a faster approach to sort Unicode represented Bengali words. This paper also has the drawbacks of the previous one. In addition, the order mentioned in its discussion is different from the Bangla Academy standard; they used just the regular sequence of the Bengali alphabet.

Partha Sarathi Kar et al. [11] proposed an improved Unicode based sorting algorithm for Bengali words. It is a bit different from the previous approaches. They mapped each character together with its modifier and also mapped the joined letters. They assigned the mapping values according to the following order:

    Base letter < Base letter with vowel modifier < Base letter with consonant modifier + Joint letter (according to order of each character)

There are about 285 joint letters [7], as mentioned earlier. The mapping for all alphabets and joint letters adds an extra overhead to this algorithm. Again, joint letters with more than two characters are not mapped there. So words like উজ্জ্বল and উচ্ছ্বাস cannot be sorted using this algorithm.

5.  PROPOSED METHOD
5.1  Assumptions
• A mapping is a must, as the Unicode character set for Bengali is not in sorted order.
• We need to use the same number of digits for mapping every letter or modifier to get rid of the drawbacks of [9] and [10].
• ZWJ (Zero-Width-Joiner) and ZWNJ (Zero-Width-Non-Joiner) should be considered while mapping and also while decompressing a word.
• It is important to maintain the alphabetic order of Bangla Academy to sort text according to the Bangla Academy Dictionary.
• We assume that র, য়, ড়, ঢ় are each made up of a single character, not a conjunct with the ◌় character. ◌ো and ◌ৌ are also each assumed to be a single modifier.

5.2  Mapping
Our proposed mapping scheme is listed below. We propose at least two digits for each letter or modifier.

Table 6. Mapping for our proposed method

    Unicode Value    Character    Mapped Value
    200C             ZWNJ         00
    200D             ZWJ          01
    0985             অ            02
    0986             আ            03
    0987             ই            04
    0988             ঈ            05
    0989             উ            06
    098A             ঊ            07
    098B             ঋ            08
    098F             এ            09
    0990             ঐ            10
    0993             ও            11
    0994             ঔ            12
    0982             ◌ং           13
    0983             ◌ঃ           14
    0981             ◌ঁ           15
    0995             ক            16
    0996             খ            17
    0997             গ            18
    0998             ঘ            19
    0999             ঙ            20
    099A             চ            21
    099B             ছ            22
    099C             জ            23
    099D             ঝ            24
    099E             ঞ            25
    099F             ট            26
    09A0             ঠ            27
    09A1             ড            28
                        The  precedence  to  follow  Bangla  Academy                         09DC                     ড়                    29 
                         Dictionary order: 
                         ZWJ,  ZWNJ  <  Vowel  <  Consonant  <  Vowel                          09A2                    ঠ                    30 
                         Modifier < Conjunct Character (ং  )                                  09DD                     ঢ়                    31 
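The second assumption (equal-width codes) can be illustrated with a minimal Python sketch. It assumes that the per-character codes are concatenated into one string per word and that words are compared by that string. The two-digit codes are taken from the paper's mapping table; the UNPADDED table and the helper name mapped_string are hypothetical and exist only for contrast, not as a reconstruction of the methods in [9] or [10].

```python
# Two-digit codes copied from the paper's mapping table (a small subset).
FIXED = {
    "\u098A": "07",  # uu (independent vowel)
    "\u0981": "15",  # candrabindu
    "\u0995": "16",  # ka
    "\u099A": "21",  # ca
    "\u09BE": "52",  # aa (vowel sign)
}

# Hypothetical unpadded variant, built here only for contrast.
UNPADDED = {ch: str(int(code)) for ch, code in FIXED.items()}


def mapped_string(word, table):
    """Concatenate the code of every code point in `word`."""
    return "".join(table[ch] for ch in word)


uu = "\u098A"
ka = "\u0995"
kaanch = "\u0995\u09BE\u0981\u099A"  # ka + aa sign + candrabindu + ca

fixed_order = sorted([ka, uu], key=lambda w: mapped_string(w, FIXED))
unpadded_order = sorted([ka, uu], key=lambda w: mapped_string(w, UNPADDED))

print(mapped_string(kaanch, FIXED))  # 16521521
print(fixed_order == [uu, ka])       # True: "07" < "16"
print(unpadded_order == [ka, uu])    # True: "16" < "7" as strings
```

With two-digit codes the vowel sorts before the consonant, as the stated precedence requires. With unpadded codes the string comparison places the consonant first, because "16" compares lower than "7".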
Unicode Value    Character    Mapped Value
09A3             ণ            32
09A4             ত            33
09CE             ৎ            34
09A5             থ            35
09A6             দ            36
09A7             ধ            37
09A8             ন            38
09AA             প            39
09AB             ফ            40
09AC             ব            41
09AD             ভ            42
09AE             ম            43
09AF             য            44
09DF             য়            45
09B0             র            46
09B2             ল            47
09B6             শ            48
09B7             ষ            49
09B8             স            50
09B9             হ            51
09BE             ◌া           52
09BF             ◌ি           53
09C0             ◌ী           54
09C1             ◌ু           55
09C2             ◌ূ           56
09C3             ◌ৃ           57
09C7             ◌ে           58

Word          Decompressed Word
কাঁচ           ক ◌া ◌ঁ চ
কাচ           ক ◌া চ
র‍্যাঁদা         র ZWJ ◌্ য ◌া ◌ঁ দ ◌া
র‍্যাম          র ZWJ ◌্ য ◌া ম
র‍্যাব          র ZWJ ◌্ য ◌া ব
বসতি          ব স ত ◌ি
বস্‌তি          ব স ◌্ ZWNJ ত ◌ি
বস্তি          ব স ◌্ ত ◌ি
বই            ব ই
বল            ব ল
বন            ব ন
উতরাই         উ ত র ◌া ই
উৎরাই         উ ৎ র ◌া ই
উত্তর          উ ত ◌্ ত র
কংস           ক ◌ং স
কাংস          ক ◌া ◌ং স
কাঁক           ক ◌া ◌ঁ ক
কাক           ক ◌া ক
আক্‌দ          আ ক ◌্ ZWNJ দ
আক্কেল         আ ক ◌্ ক ◌ে ল

Step 2: Generate the mapped string with corresponding values for each letter and modifier.

Table 8. Second step for proposed method

Word          Decompressed Word          Mapped String
ক্যাঁট          ক ◌্ য ◌া ◌ঁ ট              166244521526
ক্যাটালগ        ক ◌্ য ◌া ট ◌া ল গ          1662445226524718
কাঁচ           ক ◌া ◌ঁ চ                  16521521
                                                                                                                                                          09C7                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                                                                                                                          09C8                                                                                                                                                                   ৈং                                                                                                                                                      59                                                                                                                                                                                ওাঘ                                                                                                                                          ও ংা ঘ                                                                                                                                                                       165221 
                                                                                                                                                                                                                                                                                                                              েংা                                                                                                                                                        60                                                                                                                                                                     য‌ং  মা দা                                                                                                 য ZWJ ং   ম ংা ং দ ংা                                                                                                                                                         4601624452153652 
                                                                                                                                                        09CB                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
                                                                                                                                                        09CC                                                                                                                                                                  েং                                                                                                                                                         61                                                                                                                                                                      য‌ং  মাভ                                                                                                                 য ZWJ ং   ম ংা ভ                                                                                                                                                                 460162445243 
                                                                                                                                                        09CD                                                                                                                                                                       ং                                                                                                                                                     62                                                                                                                                                                      য‌ং  মাফ                                                                                                                 য ZWJ ং   ম ংা ফ                                                                                                                                                                 460162445241 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     ফ঳িঢ                                                                                                                                        ফ ঳ ঢ িং                                                                                                                                                                   41503353 
                                                                                                          5.3  Steps for Sorting                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     ফস্‌িঢ                                                                                                         ফ ঳ ং   ZWNJ ঢ িং                                                                                                                                                                      415062003353 
                                                                                                          Step 1: Decompress each word into smaller parts like letter or                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   ফিি                                                                                                                             ফ ঳ ং   ঢ িং                                                                                                                                                             4150623353 
                                                                                                          modifier.  
                                                                                                                                                                                  Table 7. First step for proposed method                                                                                                                                                                                                                                                                                                                                                                                                                                                     ফই                                                                                                                                                  ফ ই                                                                                                                                                                         4104 
                                                                                                                                                                                              Word                                                                                                                                                                           Decompressed Word                                                                                                                                                                                                                                                                               ফর                                                                                                                                                   ফ র                                                                                                                                                                         4147 
                                                                                                                                                                                                      ওযা ঝ                                                                                                                                                                                                        ও ং   ম ংা ং ঝ                                                                                                                                                                                                                                            ফন                                                                                                                                                   ফ ন                                                                                                                                                                         4138 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                                                                                                                                                                           ওযাঝারক                                                                                                                                                                                                  ও ং   ম ংা ঝ ংা র ক                                                                                                                                                                                                                                          উঢযাই                                                                                                                                     উ ঢ য ংা ই                                                                                                                                                               0633465204 
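Step 1 and the mapped strings tabulated above can be illustrated with a minimal Python sketch. The eight sign weights (U+09C1 to U+09CD mapped to 55 to 62) are copied from the code table in this section. Everything else is an assumption made for illustration: the function names, the two placeholder base-letter weights, and the idea that the final ordering is obtained by comparing mapped strings as plain text. The paper's complete weight table is not reproduced here.

```python
# Weights for the dependent signs, copied from the code table in this section.
SIGN_WEIGHTS = {
    "\u09C1": "55",
    "\u09C2": "56",
    "\u09C3": "57",
    "\u09C7": "58",
    "\u09C8": "59",
    "\u09CB": "60",
    "\u09CC": "61",
    "\u09CD": "62",
}

# ASSUMPTION: placeholder weights for two base letters (U+0995, U+0996),
# used only so the demo has something to sort. They are hypothetical
# stand-ins, not the paper's full table.
DEMO_LETTER_WEIGHTS = {"\u0995": "18", "\u0996": "19"}

WEIGHTS = {**SIGN_WEIGHTS, **DEMO_LETTER_WEIGHTS}


def decompress(word: str) -> list[str]:
    """Step 1: split a word into its code points (letters and modifiers)."""
    return list(word)


def mapped_string(word: str, weights: dict[str, str] = WEIGHTS) -> str:
    """Concatenate the two-digit weight of every unit of the word.

    Raises KeyError for a unit that has no entry in `weights`.
    """
    return "".join(weights[unit] for unit in decompress(word))


def sort_words(words: list[str], weights: dict[str, str] = WEIGHTS) -> list[str]:
    """ASSUMPTION: order words by comparing their mapped strings as text."""
    return sorted(words, key=lambda w: mapped_string(w, weights))
```

Because every weight has the same width (two digits), comparing mapped strings character by character compares the words unit by unit, and a word that is a prefix of another sorts first.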