Pakistan Research Repository

Online Urdu Character Recognition In Unconstrained Environment

Muhammad Imran, Razzak (2011) Online Urdu Character Recognition In Unconstrained Environment. PhD thesis, International Islamic University, Islamabad.



The computer, a giant of technology, has brought innovative changes to every aspect of life, especially in applications that imitate human abilities. It is currently used in every field to facilitate human endeavor. One such application is character recognition. Character recognition is an important offshoot of pattern recognition: it imitates a human’s ability to read, using a machine. It has been a field of intensive, if exotic, research since the early days of the computer. The task becomes more complex and demanding for handwritten and cursive text. Arabic script-based languages, which are used by almost a quarter of the world’s population [Belaid et al., 2010], are cursive, rich in diacritical marks, and written in a variety of styles, and so present a challenging task for researchers. Urdu is an Arabic script-based language; however, the Urdu character set is a superset of those of all Arabic script-based languages. Character recognition has been performed through either segmentation-free or segmentation-based approaches. There are numerous issues with a segmentation-free approach, and it is very difficult to train using a large dataset. On the other hand, for Urdu, a segmentation-based approach has a large overhead and is less accurate for cursive script than segmentation-free methods. In terms of classification, this thesis presents two approaches for Urdu character recognition: a segmentation-free method based on a hybrid approach (hidden Markov model (HMM) and fuzzy logic), and a bio-inspired character recognition system that uses fuzzy logic. Fuzzy logic is used as inner and outer shells for preprocessing and post-processing of the HMM.
A biologically inspired, multilayered, fuzzy rule-based system is presented. Using the human visual concept, a layered approach is suggested in which the diacritical marks are separated from the ghost characters and mapped onto the primary ligature in the final layer. The proposed technique also caters to a multi-language character recognition system for all Arabic script-based languages, such as Arabic, Persian, Urdu, and Punjabi. The presented multilayered bio-inspired approach recognizes the ligature by extracting features and combining them to find new premises in a bottom-up fashion, and it achieved an accuracy of 87.4%.

Item Type: Thesis (PhD)
Uncontrolled Keywords: Technology, Urdu, Unconstrained, Ghost, Character, Environment, Logics, Visual, Online, Concept, Recognition, Script, Arabic, Classification
Subjects: Engineering & Technology (e) > Engineering (e1) > Computer Sciences & related disciplines (e1.9)
ID Code: 7493
Deposited By: Mr. Javed Memon
Deposited On: 08 Nov 2012 15:06
Last Modified: 08 Nov 2012 15:06
