04-05-2011, 11:55 AM
OCR for Script Identification of Hindi (Devnagari) Numerals using Feature Sub Selection by Means of End-Point with Neuro-Memetic Model
Abstract
Recognition of Indian language scripts is a challenging problem. In Optical Character Recognition [OCR], a character or symbol to be recognized can be a machine-printed or handwritten character/numeral. There are several approaches that deal with the problem of numeral/character recognition, depending on the type of feature extracted and the way it is extracted. This paper proposes a recognition scheme for handwritten Hindi (Devnagari) numerals, the most popular script in the Indian subcontinent. Our work focuses on a feature extraction technique, i.e. a global approach using end-point information, which is extracted from images of isolated numerals. These feature vectors are fed to a neuro-memetic model [18] that has been trained to recognize a Hindi numeral. The prototype of the system has been tested on a variety of numeral images. In the proposed scheme, data sets are fed to the neuro-memetic algorithm, which identifies the rule with the highest fitness value, nearly 100%; the template associated with this rule is the identified numeral. Experimental results show that the recognition rate is 92-97% compared to other models.
Keywords: OCR, Global Feature, End-Points, Neuro-Memetic model.
I. INTRODUCTION
In Optical Character Recognition [OCR], a character or symbol to be recognized can be a machine-printed or handwritten character/numeral [1]. Handwritten numeral recognition is a demanding task due to the restricted shape variation, different script styles, and different kinds of noise that break the strokes in a number or change their topology [1]. As handwriting varies even when one person writes the same character twice, one can expect enormous dissimilarity among people. These are the reasons that led researchers to seek techniques that improve the ability of computers to characterize and recognize handwritten numerals; such techniques are presented in [14]. Offline recognition and online recognition are reviewed in [7, 10, 12, 15] and [16, 17] respectively. Some development can be observed for isolated digit recognition because many research scholars [8, 9, 11, 13] across the globe have chosen handwritten numeral/character recognition as their field.
II. OVERVIEW OF RECOGNITION SYSTEM IN OUR PROPOSED WORK
The recognition system consists of three parts: a feature extractor, a learning stage, and a recognition stage. In the feature extractor, global features based on end-point information are extracted from the binary image and fed to the Neuro-Memetic model [18], which has been trained to recognize a numeral. The block diagram of the recognition model is portrayed in Fig. 1.
Fig. 1 Block diagram of recognition model
1-> Binary Image
2-> Draw the directed graph
3-> Extract the feature from the End point Approach
4-> Memetic algorithm (MA)
5-> Neural network
6-> Recognition Result
A. Feature Extraction
The feature extractor is a vital part of any recognition system. The main aim of feature extraction is to describe the pattern by means of the bare minimum number of attributes. One significant job in the design of a pattern recognition system is to develop an algorithm to extract the characteristics of a pattern from the initial measurements. Features that have been used for numeral recognition include geometric, topological, directional, mathematical, and structural features [1]. The derived features are then used as input to the numeral classifier.
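To make the end-point idea concrete, here is a minimal Python sketch of one common way to locate end points in a binary numeral image. It is not taken from the paper: it assumes the stroke has already been thinned to one pixel in width, and it treats a stroke pixel as an end point when exactly one of its 8 neighbours is also a stroke pixel. The function names `find_end_points` and `end_point_features`, the `max_points` parameter, and the layout of the feature vector are all hypothetical choices made for illustration.

```python
# Offsets (row, col) of the 8 neighbours around a pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1),
              (0, -1),           (0, 1),
              (1, -1),  (1, 0),  (1, 1)]


def find_end_points(image):
    """Return (row, col) of stroke pixels with exactly one stroke neighbour.

    `image` is a list of rows holding 0 (background) or 1 (stroke).
    Assumption: the stroke is already thinned to one-pixel width.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    end_points = []
    for r in range(height):
        for c in range(width):
            if not image[r][c]:
                continue
            count = 0
            for dr, dc in NEIGHBOURS:
                rr, cc = r + dr, c + dc
                if 0 <= rr < height and 0 <= cc < width and image[rr][cc]:
                    count += 1
            if count == 1:
                end_points.append((r, c))
    return end_points


def end_point_features(image, max_points=4):
    """Hypothetical fixed-length feature vector built from the end points.

    Layout (an assumption, not the paper's): the end-point count, then the
    normalised (row, col) of up to `max_points` end points, zero-padded.
    """
    height, width = len(image), len(image[0])
    points = find_end_points(image)
    features = [float(len(points))]
    for r, c in points[:max_points]:
        features += [r / max(height - 1, 1), c / max(width - 1, 1)]
    used = min(len(points), max_points)
    features += [0.0] * (2 * (max_points - used))
    return features


# A tiny one-pixel-wide stroke: a horizontal bar that turns downwards.
STROKE = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
]

print(find_end_points(STROKE))     # [(1, 1), (3, 3)]
print(end_point_features(STROKE))
```

A fixed-length vector is used here only because a neural network classifier expects inputs of constant size; the directed-graph step shown in Fig. 1 is not modelled in this sketch.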
Download full report
http://citeseerx.ist.psu.edu/viewdoc/dow...1&type=pdf