Document Processing for Automatic Color Form Dropout

This article is presented by:
Andreas E. Savakis
Chris R. Brown


ABSTRACT
Color dropout refers to the process of converting color form documents to black and white by removing the colors that are part of the blank form and keeping only the information entered in the form. In this paper, no prior knowledge of the form type is assumed. Color dropout is performed by associating darker non-dropout colors with the information that is entered in the form and needs to be preserved. The color dropout filter parameters include the color values of the non-dropout colors (e.g., black and blue), the distance metric (e.g., Euclidean), and the tolerances allowed around these colors. Color dropout is accomplished by converting pixels whose color lies within the tolerance sphere of a non-dropout color to black and all other pixels to white. This approach lends itself to high-speed hardware implementation with low memory requirements, such as on an FPGA platform. Processing may be performed in RGB or in a Luminance-Chrominance space, such as YC. The color space transformation from RGB to YC involves a matrix multiplication, and the dropout filter implementation is similar in both cases. Results for color dropout processing in both RGB and YC space are presented.
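The filter described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the non-dropout colors, the tolerance radii, and the function names are hypothetical, and the luminance-chrominance conversion assumes the "YC" space behaves like full-range YCbCr with the common BT.601 (JPEG) coefficients, which the text does not confirm.

```python
from math import dist

# Hypothetical non-dropout colors (RGB) with a tolerance radius for each.
# The values are illustrative only; they are not taken from the paper.
NON_DROPOUT = [
    ((0, 0, 0), 100.0),    # black ink
    ((0, 0, 160), 90.0),   # blue ink
]

def dropout_pixel(pixel, keep=NON_DROPOUT):
    """Return 0 (black) if the pixel lies inside the tolerance sphere of
    any non-dropout color (Euclidean distance), otherwise 1 (white)."""
    for color, tolerance in keep:
        if dist(pixel, color) <= tolerance:
            return 0
    return 1

def dropout_image(rows, keep=NON_DROPOUT):
    """Apply the dropout filter to an image given as rows of 3-tuples."""
    return [[dropout_pixel(p, keep) for p in row] for row in rows]

def rgb_to_ycbcr(rgb):
    """RGB to luminance-chrominance via a 3x3 matrix multiply plus offset.
    Assumes full-range BT.601 coefficients (an assumption, see lead-in)."""
    r, g, b = rgb
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return (y, cb, cr)

# A dark pen stroke, blue ink, and a light pink form background.
row = [(20, 20, 30), (10, 10, 150), (250, 200, 200)]
binary = dropout_image([row])
```

To run the same filter in the luminance-chrominance space, both the pixels and the non-dropout colors would be passed through `rgb_to_ycbcr` first, with the tolerances chosen for that space.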



INTRODUCTION
Color forms make up a large number of the documents that are scanned using high-speed scanners. In color forms, the information of interest is the text that has been entered, while the document background and the form lines, originally placed in the document to facilitate data entry, are of no practical use. Representative documents of this type are medical forms, insurance forms, and census forms. When performing character recognition on these forms, it is desirable to eliminate the color background and lines that are part of the form and keep only the relevant textual information. Color dropout is the image processing function that converts the scanned color document to a binary image in which the form background colors are turned to white and the text colors are turned to black. To accomplish this, we need to distinguish between the colors of the background and the colors of the entered text. Color dropout may be viewed as a form of color image rendering, since the image is converted from full color to black and white.
There are several advantages to performing color dropout. First, the textual information of interest is enhanced, because it is rendered black, while the background color, which may reduce the text contrast, is suppressed. In addition, the removal of the form lines minimizes interference with the text characters and may reduce errors during character recognition. Another advantage is that the uncompressed file size is reduced by a factor of 24, since the color image at 24 bits per pixel is converted to a binary image at only one bit per pixel. This significantly reduces the storage requirements for the resulting document files.
Color dropout may be accomplished using optical or digital methods. Optical filters have been used when the document form involves a single dropout color. However, optical filters cannot be used with multiple dropout colors, and it is difficult to adjust their parameters to match nonstandard colors.
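The factor-of-24 reduction in uncompressed size mentioned above can be checked with a short calculation. The page dimensions below (8.5 x 11 inches at 300 dpi) and the helper name are assumptions chosen for illustration; they do not come from the paper.

```python
def uncompressed_bytes(width, height, bits_per_pixel):
    """Uncompressed image size in bytes, rounded up to a whole byte."""
    return (width * height * bits_per_pixel + 7) // 8

# Assumed page: 8.5 x 11 inches scanned at 300 dpi (illustrative only).
WIDTH, HEIGHT = 2550, 3300

color_size = uncompressed_bytes(WIDTH, HEIGHT, 24)   # 24-bit RGB scan
binary_size = uncompressed_bytes(WIDTH, HEIGHT, 1)   # 1-bit dropout result

print(color_size, binary_size, color_size / binary_size)
```

For this page size the color scan occupies about 25 MB uncompressed, while the binary image occupies about 1 MB.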
Digital color dropout methods sometimes attempt to remove the form lines and background information from the scanned gray scale image by postprocessing. Examples of this approach include [1], where form frames are identified for the purpose of form line removal, and [2], where the distance transformation and its gradient flow are employed to remove form lines. Such approaches may work for specific cases, but they require significant computational effort and are very expensive to implement in the real-time hardware that is used in high-speed scanners. Another approach to color dropout, originally proposed in the context of optical character recognition, was developed by Rudak [3]. In this work, the average RGB dropout colors in color patches are determined and used in a dropout filter that can be implemented in electronic hardware. The filter bandwidth is adjusted to accommodate color variations between forms. The advantage of this approach is that the presence of noise, e.g., black specks, does not significantly affect the average color in the color patch considered, and consequently does not affect the final color dropout result. Another approach, presented in [4], proposes scanning a blank form, extracting the dropout colors from it, and using them to perform color dropout when scanning other forms.
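The robustness of a patch average to isolated specks can be illustrated numerically. The sketch below is not the method of [3]; the patch size, the form color, and the function name are hypothetical and serve only to show how little a few black pixels shift the mean.

```python
def mean_color(patch):
    """Average RGB color of a patch given as a flat list of 3-tuples."""
    n = len(patch)
    return tuple(sum(p[c] for p in patch) / n for c in range(3))

# Hypothetical 10 x 10 patch of a red form color with two black specks.
FORM_COLOR = (200, 60, 60)
clean_patch = [FORM_COLOR] * 100
noisy_patch = [FORM_COLOR] * 98 + [(0, 0, 0)] * 2

clean_avg = mean_color(clean_patch)
noisy_avg = mean_color(noisy_patch)

# Largest per-channel shift caused by the specks.
max_shift = max(abs(a - b) for a, b in zip(clean_avg, noisy_avg))
print(clean_avg, noisy_avg, max_shift)
```

With two specks in a hundred pixels, each channel of the average moves by only 2 percent of its value, so a dropout filter centered on the averaged color would be essentially unchanged.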


For more information about this article, please follow the link:
http://googleurl?sa=t&source=web&cd=1&ve....1.87.7005%26rep%3Drep1%26type%3Dpdf&ei=Szq0TJT0EYemvgOwuLGHCg&usg=AFQjCNEcFNNa8Pt89jy3sBedJkhZKdEr6Q
