seminar class · 02-05-2011, 04:23 PM

ABSTRACT
When implemented in hardware, image-processing algorithms
should be robust to memory limitations because some hardware
architectures may not have enough memory to hold a whole
frame. Although this is not generally a problem for low-level
processing, higher-level understanding, such as object detection,
demands novel solutions because the available information may, in
some cases, be very local, e.g., only a partial view of the object
could fit in the available memory size. In this paper, we propose a
novel hardware-oriented overlaid text detection algorithm that can
detect text with height as large as five times the memory size. The
algorithm integrates a connected component (CC)-based algorithm
with a texture-based machine learning approach. The CC-based
algorithm uses character-level features in the horizontal direction
whereas the texture-based algorithm extracts block-based features
to integrate information from all directions. Furthermore, the
texture-based algorithm employs a support vector machine (SVM)
to benefit from the strength of machine learning tools. In order to
detect text of large font size, we also propose a novel hardware-oriented,
height-preserving multi-resolution analysis. Finally, the
results of the two classifiers as well as color and edge cues are
used for the final pixel-based text/non-text decision.
1. INTRODUCTION
Text overlays are added to TV broadcasts to supplement the
audio-visual content with additional metadata. Because of its
information value, overlaid text detection has mainly been
considered in the context of video indexing and retrieval. In
contrast, our main application is visual quality improvement. This
difference is important because most storage and retrieval
applications can afford to have offline and distributed (in the sense
that the task can be assigned to a non-TV device) processing
whereas visual quality improvement must often be performed
online using the very limited computational and memory resources
of a TV set. In this context, memory size refers to the number of
image lines available for processing at a single instant. In software,
whole frame or video information can usually be used; but only a
few lines of image data are available in hardware. In the following,
we will mainly focus on the implications of the memory
limitations to overlaid text detection algorithms.
In general, overlaid text detection involves two stages: 1) text
candidate extraction, and 2) text verification. In the first stage, the
smallest spatial processing units, such as pixels and blocks, are
processed independently of each other and are labeled as text
candidates or not. In the second stage, spatially connected
detections are merged for region-level morphological analysis,
e.g., by using region bounding box information. In one or both of
these stages, the state-of-the-art text detection algorithms rely on
non-local information, which can even be at the frame level. In this
paper, we define the local region as lines of data supported by the
hardware memory. For example, Figure 1 shows, in red, the local region
for the line marked in yellow. The local region spans 11 lines in our case. The
memory size could be much less than the height of the text, but its
width is the same as the image width. The problem we address
in this paper is the robust detection of text of any
size using only the limited information available in
memory. This is a challenging problem, especially when the text
height is greater than the memory size, as shown in Figure 1.
Figure 1: The local area for the line in yellow is highlighted in
red (total of 11 lines); we use only the information from the
local region for text detection and verification.
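As an illustration of the local-region constraint, the following Python sketch emulates a line buffer that holds only 11 image lines at a time. The function name local_regions, the centering of the window on the current line, and the list-of-rows frame representation are assumptions made for illustration; they are not details taken from the paper.

```python
from collections import deque


def local_regions(frame, memory_lines=11):
    """Yield (center_row, window) pairs for a frame given as a list of rows.

    The window holds the `memory_lines` most recent rows, mimicking a
    hardware line buffer. Centering on the middle row is an assumption.
    """
    half = memory_lines // 2
    buffer = deque(maxlen=memory_lines)  # oldest line is dropped automatically
    for row_index, line in enumerate(frame):
        buffer.append(line)
        if len(buffer) == memory_lines:
            # The detector may only look at these rows, never the whole frame.
            yield row_index - half, list(buffer)


if __name__ == "__main__":
    # Hypothetical 20-line frame, 4 pixels wide; each row stores its own index.
    frame = [[r] * 4 for r in range(20)]
    for center, window in local_regions(frame):
        print(center, window[0][0], window[-1][0])
```

Any detector written against this generator sees full image width but never more than 11 rows, which is why text taller than the window appears only as a partial view.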
The existing text detection algorithms exploit color
connectedness (connected component (CC)-based approaches)
and/or textural features [1][2]. Recently, machine learning tools,
such as neural networks [3] or SVMs [4], have also been applied to determine
the text/non-text boundary. In these algorithms, non-local
information is needed mainly for multi-resolution image analysis
(for large-sized text detection), for text segmentation (for
extraction of word and text line boundaries), and finally for
suppression of false alarms.
Facing constraints that do not apply to the existing approaches,
we propose an integrated framework that employs a CC-based and
a texture-based algorithm. The CC-based algorithm utilizes character-based
features in the horizontal direction; hence, it processes each
image line independently of the others. In contrast, the texture-based
algorithm uses block-based features to exploit information from
all directions. In order to take advantage of machine learning tools,
an SVM with a linear kernel has been trained to make the text/non-text
decision for the texture-based algorithm. We also propose a novel
height-preserving horizontal scaling method for multi-resolution
analysis. In the verification stage, we also use color and edge
features to fully utilize the available information. As a result, the
proposed algorithm is able to robustly detect text whose height can
be five times larger than the memory size.
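The excerpt does not describe how the height-preserving horizontal scaling works. One plausible reading is that each image line is shortened while the number of lines stays the same, so that no extra line memory is needed. The Python sketch below illustrates only that reading; the function name downscale_horizontally, the averaging filter, and the factor of 2 are assumptions and should not be taken as the authors' method.

```python
def downscale_horizontally(lines, factor=2):
    """Shrink each row by `factor` using box averaging; the row count is unchanged.

    `lines` is a list of rows (lists of pixel intensities). Trailing pixels
    that do not fill a complete group of `factor` are dropped.
    """
    scaled = []
    for line in lines:
        usable = len(line) - len(line) % factor
        scaled.append(
            [sum(line[x:x + factor]) / factor for x in range(0, usable, factor)]
        )
    return scaled


if __name__ == "__main__":
    window = [[0, 2, 4, 6], [10, 10, 20, 20]]
    print(downscale_horizontally(window))
```

Because only the horizontal axis is resampled, each coarser level can be computed from the same lines already held in memory.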
Section 2 explains the text candidate extraction, which
comprises edge-based preprocessing, CC-based detection, and
the texture-based algorithm. After that, Section 3 describes the text
verification stage where the individual detections are fused.
Section 4 presents the experimental results while Section 5
concludes the paper.
2. TEXT CANDIDATE EXTRACTION
In this section, we first explain the edge-based preprocessing step
that determines the region of interest for the CC- and texture-based
algorithms, which are explained in Sections 2.2 and 2.3,
respectively.
2.1. Edge-based preprocessing
The insertion of text onto video should result in large intensity and
color differences from the background so that the viewers can read
the overlaid text easily. In this section, we use this feature to
eliminate non-text regions to speed up the overall processing.
We first compute the horizontal, Gh(x,y), and vertical, Gv(x,y),
derivatives at each image location (x,y) by applying a 2x2 mask.
After that, the edge strength, ES(x,y), is computed for the pixel
(x,y) as in Equation 1, where we prefer the L1 norm to the Euclidean
norm for computational reasons. The pixels having
edge strength greater than the adaptively computed threshold
value, GThrHigh, are assumed to include the text regions, if there are
any. The value of GThrHigh is determined as a function of the
average edge strength as shown in Equation 2, where k is a
constant coefficient (we found k = 5 to be an appropriate value for not
losing any pixels at the text boundaries), and M and N are the image width
and height, respectively. Figure 2 demonstrates the output of this
stage, illustrating that strong edges exist at the transitions
between text and natural video content.
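Equations 1 and 2 are not reproduced in this post. The Python sketch below follows the description in the text under stated assumptions: the edge strength is taken as the L1 norm |Gh| + |Gv|, the threshold as k times the mean edge strength over the M x N image, and the 2x2 mask as simple forward differences (the actual mask coefficients are not given). All function names are hypothetical.

```python
def edge_strength(image):
    """Return ES(x,y) = |Gh| + |Gv| using forward differences (assumed 2x2 mask).

    `image` is a list of rows of intensities; the last row and column are
    left at zero because the mask would reach outside the image there.
    """
    height, width = len(image), len(image[0])
    es = [[0] * width for _ in range(height)]
    for y in range(height - 1):
        for x in range(width - 1):
            gh = image[y][x + 1] - image[y][x]  # horizontal derivative
            gv = image[y + 1][x] - image[y][x]  # vertical derivative
            es[y][x] = abs(gh) + abs(gv)        # L1 norm, no square root
    return es


def high_threshold(es, k=5):
    """GThrHigh as k times the average edge strength over all M*N pixels."""
    height, width = len(es), len(es[0])
    total = sum(sum(row) for row in es)
    return k * total / (width * height)


def strong_edge_mask(image, k=5):
    """Mark pixels whose edge strength exceeds the adaptive threshold."""
    es = edge_strength(image)
    threshold = high_threshold(es, k)
    return [[value > threshold for value in row] for row in es]


if __name__ == "__main__":
    # Hypothetical 8x8 test image: dark left half, bright right half.
    image = [[0] * 4 + [100] * 4 for _ in range(8)]
    for row in strong_edge_mask(image):
        print("".join("#" if flag else "." for flag in row))
```

The L1 form needs only subtractions, absolute values, and one addition per pixel, which is consistent with the computational motivation given in the text.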
