A VLIW Vector Media Coprocessor With Cascaded SIMD ALUs

Abstract—
High-definition video applications, such as digital
TV and digital video cameras, require high processing performance
for high-quality visual images in addition to a complex
video CODEC. Pre-/postprocessing to improve video quality
is becoming much more important because requirements for
pre-/postprocessing vary among applications and processing
algorithms have not been stabilized. Therefore, a new processor
architecture that has a highly parallel datapath is needed. In this
paper, we introduce a VLIW vector media coprocessor, “vector
coprocessor (VCP),” that includes three asymmetric execution
pipelines with cascaded SIMD ALUs. To improve performance
efficiency, we reduce the area ratio of the control circuit while
increasing the ratio of the arithmetic circuit. The total gate count
of VCP is 1268 kgates and its maximum operating frequency is
300 MHz in a 90-nm CMOS process. Some of the processing kernels
in an adaptive prefilter that is applied to preprocessing for video
encoding are evaluated. In the case of the edgeness and the sum of
absolute differences, the performance is 183 giga operations per
second. VCP offers enough performance for HD video processing
and good cost-performance while all processing pipeline units
operate effectively.
Index Terms—Single instruction stream, multiple data stream
(SIMD), vector coprocessor (VCP), very long instruction word
(VLIW).
I. INTRODUCTION
Nowadays, high-definition video applications, such as
digital TV and digital video cameras, require high processing
performance for high-quality visual images in addition
to a complex video CODEC. Pre-/postprocessing to improve
video quality is becoming much more important because requirements
for pre-/postprocessing vary among applications and
processing algorithms have not been stabilized.
We focused on the fact that image processing for much video
pre-/postprocessing is characterized by operating on sets of data
elements as vectors that evolve continuously in time and that
image processing algorithms are characterized by frequent executions
of the same computation on each of the elements in a
vector and by execution of sequences of operations on vector elements.
For such sequences of operations, an effective implementation
performs loop operations on the same single instruction stream,
multiple data stream (SIMD) ALUs as many times as the sequence
requires and structures the hardware as a pipeline of cascaded
ALUs. This would achieve a high-performance and energy-efficient
architecture while providing reusable hardware. Reuse for
many video coding applications would realize a low development
cost.
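
The cascading idea described above can be illustrated with a small software model. This is only a sketch, not the VCP datapath: the lane count LANES, the helper simd, and the function cascaded_sad are hypothetical names chosen for illustration. It chains two lane-wise stages (absolute difference, then accumulation) so that the output of the first stage feeds the second directly, which is how a sum of absolute differences could be mapped onto cascaded SIMD ALUs.

```python
LANES = 16  # assumed lane count for illustration; not a figure from the paper


def simd(op):
    """Lift a scalar operation to a lane-wise operation on equal-length vectors."""
    def stage(*vectors):
        return [op(*lane) for lane in zip(*vectors)]
    return stage


abs_diff = simd(lambda a, b: abs(a - b))  # first ALU stage in the cascade
add = simd(lambda a, b: a + b)            # second ALU stage in the cascade


def cascaded_sad(cur_rows, ref_rows):
    """Sum of absolute differences over rows of LANES pixels each."""
    acc = [0] * LANES
    for cur, ref in zip(cur_rows, ref_rows):
        # The result of stage 1 is consumed by stage 2 without an
        # intermediate store, mimicking a cascaded pipeline.
        acc = add(acc, abs_diff(cur, ref))
    return sum(acc)  # final reduction across lanes


cur = [[10] * LANES, [20] * LANES]
ref = [[7] * LANES, [25] * LANES]
print(cascaded_sad(cur, ref))  # prints 128
```

In this model the loop body corresponds to one pass through the cascade, and the loop itself corresponds to reusing the same ALUs for every row.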
In this paper, we introduce a very long instruction word
(VLIW) vector coprocessor, “vector coprocessor (VCP),”
that has been customized to the computation requirements of
image processing. The coprocessor architecture includes three
asymmetric execution pipelines with cascaded SIMD ALUs to
exploit the loop-level parallelism. The new architecture of VCP
is a combination of cascaded SIMD ALUs and asymmetric
parallel pipelines, which provide good cost-performance to enhance
specialized datapaths for lower-level image processing,
such as preprocessing and postprocessing, at the expense of
generality compared with conventional processors with SIMD
instructions. VCP is designed to be a coprocessor for image
processing of video CODECs and the width of SIMD ALUs
is limited to that of macroblocks of CODECs. Therefore, we
introduce a cascaded structure of SIMD ALUs to exploit high
parallelism. To achieve high performance with small hardware
size, we reduce the area ratio of the control circuit while
increasing the ratio of the arithmetic circuit. For instance, the
coprocessor architecture has no forwarding hardware and instead
relies on static optimizations by the compiler. Omitting this
control logic raises the share of area available to the arithmetic circuit.
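
To make the trade-off concrete, the following sketch shows one way a compiler could compensate for missing forwarding hardware by spacing dependent instructions statically. It is an assumption-laden toy model, not the VCP toolchain: the single-issue in-order model, the RESULT_LATENCY value, and the schedule function are all hypothetical.

```python
RESULT_LATENCY = 3  # assumed cycles before a result can be read back; hypothetical


def schedule(instrs, latency=RESULT_LATENCY):
    """Insert 'nop' slots so no instruction reads a register before it is ready.

    instrs is a list of (dest, sources) tuples in program order.
    """
    ready = {}  # register name -> first cycle at which its value is readable
    out = []
    for dest, srcs in instrs:
        cycle = len(out)
        earliest = max([ready.get(s, 0) for s in srcs], default=0)
        # Without forwarding, the consumer must simply wait for the writeback.
        out.extend(["nop"] * max(0, earliest - cycle))
        out.append((dest, srcs))
        ready[dest] = len(out) - 1 + latency
    return out


program = [("r1", ("r0",)), ("r2", ("r1",)), ("r3", ("r0",))]
for slot in schedule(program):
    print(slot)
```

A production compiler would try to fill the waiting slots with independent instructions rather than NOPs; the sketch only shows that the hazard can be resolved at compile time instead of in hardware.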
The remainder of this paper is organized as follows. Section II
reviews previous related work. We introduce the architecture of
VCP in Section III. Section IV shows examples of image processing
kernels in an adaptive prefilter used in preprocessing
for high-definition video encoding. Finally, in Section V, we
present the hardware implementation and the performance evaluation
results and discuss the effectiveness of the architecture
in real-time adaptive prefilter processing and other image processing
kernels.
