29-12-2010, 01:02 PM
ABSTRACT
Template matching is an object detection technique in image processing to find small parts of an image which match a template image. There are many methods one can adopt for this purpose.
Matching can be template-based, in which case every pixel of the template is compared against the image pixels it overlaps at each candidate position, using some measure of comparison (e.g. SAD, SSD, cross-correlation), or it can be feature-based, where the matching relies on strong features that the template may possess.
We use a template-based approach, with Normalized Cross Correlation (NCC) as the specific method.
The project used OpenCV, an open source computer vision library, for most of the image processing functions. We tested the algorithm by finding instances of soft-drink cans in a database of images, which we also compiled ourselves. Several methods of pre-processing the image and of scoring image pixels as candidate match points were tried. Finally, the algorithm will be implemented to run on GPGPUs using NVIDIA’s CUDA architecture, in order to exploit the large scope for parallelization in the program.
INTRODUCTION
Object detection in image processing is useful in many settings: as part of quality control on assembly lines, in medical imaging, in navigation by machine intelligence, and so on.
The automatic detection of objects in their environments has been tackled in many different ways. Template matching is one of them.
A template is a small image which may be matched to a part of the larger image by correlating them in some manner. Template matching is a technique in digital image processing for finding small parts of an image which match a template image.
Template matching can be subdivided into two approaches: feature-based and template-based matching. The feature-based approach uses features of the search and template images, such as edges or corners, as the primary measure of match quality when looking for the best matching location of the template in the source image. This is better suited to specific detection challenges, where the template is not generalized and always retains certain features. The template-based (also called global) approach uses the entire template, generally with a sum-based comparison metric (SAD, SSD, cross-correlation, etc.), and determines the best location by testing all, or a sample, of the viable locations in the search image where the template may match.
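The template-based search described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the project's code: images are plain nested lists of grayscale values, SAD (sum of absolute differences) is used as the metric because it is the simplest of those listed, and the names `sad` and `best_match_sad` are hypothetical.

```python
def sad(patch, template):
    """Sum of absolute differences between two equal-size grayscale patches."""
    return sum(
        abs(p - t)
        for patch_row, template_row in zip(patch, template)
        for p, t in zip(patch_row, template_row)
    )


def best_match_sad(image, template):
    """Test every valid top-left position; return (row, col, score) of the lowest SAD."""
    img_h, img_w = len(image), len(image[0])
    tpl_h, tpl_w = len(template), len(template[0])
    best = None
    for row in range(img_h - tpl_h + 1):
        for col in range(img_w - tpl_w + 1):
            # Cut out the image region currently covered by the template.
            patch = [r[col:col + tpl_w] for r in image[row:row + tpl_h]]
            score = sad(patch, template)
            if best is None or score < best[2]:
                best = (row, col, score)
    return best


# Toy data (assumed for illustration): a 2x2 "object" embedded in a dark image.
image = [
    [0, 0, 0, 0, 0],
    [0, 9, 8, 0, 0],
    [0, 7, 6, 0, 0],
    [0, 0, 0, 0, 0],
]
template = [
    [9, 8],
    [7, 6],
]
print(best_match_sad(image, template))  # prints (1, 1, 0)
```

Every candidate position is scored independently of the others, which is why this kind of search costs so much on large images and also why it parallelizes well.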
Since we required our matching to be as accurate as possible, and since our template image was non-specific (cans, bottles, logos, anything), we chose the second approach: template-based matching. This choice involved weighing some further advantages and disadvantages, which are discussed later.
To test our algorithm, we compiled a database of images featuring soft-drink cans in different environments (differences in light and shadow were found to affect the detection the most). The template was a small image of the can, which was then to be detected in each database image.
By examining the different methods of Template-based Matching, we concluded that Normalized Cross Correlation (NCC) would be the most suitable method for us to run.
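As a rough illustration of what an NCC score computes, the sketch below scores one image patch against a template of the same size. The report does not state here which NCC variant was used, so the zero-mean form is assumed; the function name `ncc` and the nested-list image representation are likewise assumptions made for the example.

```python
import math


def ncc(patch, template):
    """Zero-mean normalized cross-correlation of two equal-size patches, in [-1, 1]."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mean_p = sum(p) / len(p)
    mean_t = sum(t) / len(t)
    # Correlate the deviations from each patch's own mean.
    numerator = sum((a - mean_p) * (b - mean_t) for a, b in zip(p, t))
    # Normalize by the energy of both patches.
    denominator = math.sqrt(
        sum((a - mean_p) ** 2 for a in p) * sum((b - mean_t) ** 2 for b in t)
    )
    # A perfectly flat patch has no structure to correlate; score it as 0.
    return numerator / denominator if denominator > 0 else 0.0


template = [[1, 2], [3, 4]]
brighter = [[12, 14], [16, 18]]   # same pattern, scaled by 2 and offset by 10
inverted = [[4, 3], [2, 1]]       # pattern reversed

print(ncc(brighter, template))    # 1.0
print(ncc(inverted, template))    # -1.0
```

Because the mean is subtracted and the result is divided by the energy of both patches, a uniform change in brightness or contrast leaves the score unchanged. This matters given that light and shadow differences were the factor found to affect detection the most.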
We also created certain preprocessing and peripheral tools, such as one for extracting the template image from a larger image (basically cutting out the part with the can), and one for automatically judging the results of the tests (whether the can had been detected or not) without human intervention.
From an early stage, we decided to use OpenCV, an open source computer vision library. The considerations behind the choice of library (and whether to use libraries at all) are highlighted later in the report.
Since template-based matching is a compute-intensive job, it has a lot of scope for parallelization. Finally, it was decided that we would adapt our code to compile and run on NVIDIA’s CUDA (Compute Unified Device Architecture) platform. CUDA abstracts the workings of NVIDIA GPGPUs (General Purpose Graphics Processing Units) behind APIs with a structure similar to C. This gives programmers and scientists easier access to GPGPU programming, letting them take advantage of its parallel architecture without having to learn the inner workings of the hardware.
We aim to achieve a speedup by running our program on a parallel architecture, cutting down the running and testing times, which can reach several hours even for a meager database of 75 images.
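The two peripheral tools mentioned above could look roughly like the following. This is a hypothetical sketch, not the project's tooling: the names `crop_template` and `is_detected`, the nested-list image format, and the pixel tolerance are all assumptions, and it presumes that the true position of the can in each test image is known in advance.

```python
def crop_template(image, top, left, height, width):
    """Cut a height x width template out of a larger image (nested lists)."""
    return [row[left:left + width] for row in image[top:top + height]]


def is_detected(found, expected, tolerance=2):
    """Judge a result automatically: is the reported top-left corner (row, col)
    within `tolerance` pixels of the known position on both axes?"""
    return (
        abs(found[0] - expected[0]) <= tolerance
        and abs(found[1] - expected[1]) <= tolerance
    )


# Toy image (assumed for illustration) with the object at row 1, column 1.
image = [
    [0, 0, 0, 0, 0],
    [0, 9, 8, 0, 0],
    [0, 7, 6, 0, 0],
    [0, 0, 0, 0, 0],
]
template = crop_template(image, top=1, left=1, height=2, width=2)
print(template)                                    # [[9, 8], [7, 6]]
print(is_detected((1, 1), (2, 1), tolerance=1))    # True
print(is_detected((0, 4), (2, 1), tolerance=1))    # False
```

A tolerance is used rather than an exact comparison because a match that is off by a pixel or two would normally still count as a successful detection; the right value would depend on the template size.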
TEMPLATE MATCHING
As mentioned above, template matching has two approaches, from which we chose the Template-based or global approach rather than the Feature-based approach.