Exploiting the Human-Machine Gap in Image Recognition for Designing CAPTCHAs

Abstract
Security researchers have, for long, devised mechanisms to prevent adversaries from conducting automated network attacks, such as denial-of-service, which lead to significant wastage of resources. On the other hand, several attempts have been made to automatically recognize generic images, make them semantically searchable by content, annotate them, and associate them with linguistic indexes. In the course of these attempts, the limitations of state-of-the-art algorithms in mimicking human vision have become exposed. In this paper, we explore the exploitation of this limitation for potentially preventing automated network attacks. While undistorted natural images have been shown to be algorithmically recognizable and searchable by content to moderate levels, controlled distortions of specific type and strength can potentially make machine recognition harder without affecting human recognition. This difference in recognizability makes it a promising candidate for automated Turing tests called CAPTCHAs which can differentiate humans from machines.

We empirically study the application of controlled distortions of varying nature and strength, and their effect on human and machine recognizability. While human recognizability is measured on the basis of an extensive user study, machine recognizability is based on memory-based content-based image retrieval (CBIR) and matching algorithms. We give a detailed description of our experimental image CAPTCHA system, IMAGINATION, that uses systematic distortions at its core. A significant research topic within signal analysis, CBIR is actually conceived here as a tool for an adversary, so as to help us design more foolproof image CAPTCHAs.

Index Terms: Automated Turing tests, CAPTCHAs, Systematic network attacks, Image recognition.
R. Datta is with the Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA 16802, USA. Phone: (814) 865-6168. E-mail: datta[at]cse.psu.edu. Fax: (814) 865-6426.
J. Li is with the Department of Statistics, The Pennsylvania State University, University Park, PA 16802, USA. Phone: (814) 863-3074. E-mail: jiali[at]stat.psu.edu. Fax: (814) 865-6426.
J. Z. Wang is with the College of Information Sciences and Technology, The Pennsylvania State University, University Park, PA 16802, USA. Phone: (814) 865-7889. E-mail: jwang[at]ist.psu.edu. Fax: (814) 865-6426.

I. INTRODUCTION
Robust image understanding remains an open problem. The gap between human and computational ability to recognize visual content has been termed the semantic gap by Smeulders et al. [30]. A key area of research that would greatly benefit from the narrowing of this gap is content-based image retrieval (CBIR). For more than a decade, attempts have been made to build tools and systems that can retrieve images (from repositories) that are semantically similar to query images, and these have enjoyed moderate success [7], [30]. While the inability to bridge the semantic gap highlights the limitations of the state-of-the-art in image content analysis, we see in it an opportunity for system security. This, and any task that humans are better at performing than the best computational means, can be treated as an ‘automated Turing test’ [1], [32] that tells humans and computers apart. Typically referred to as HIPs (Human Interactive Proofs) or CAPTCHAs (Completely Automated Public Turing tests to tell Computers and Humans Apart) [3], such tests help reduce e-mail spam, stop automated blog and forum responses, save resources, and prevent denial-of-service (DoS) attacks on Web servers [23], among others.
In general, DoS attacks involve generating a large number of automated (machine) requests to one or more network devices (e.g., servers) for resources in some form, with the goal of overwhelming them and preventing legitimate (human) users from getting their service. In a distributed DoS, multiple machines are compromised and used for coordinated automated attacks, making it hard to detect and block the attack sources. To prevent such forms of attacks and save resources, the servers or other network devices can require CAPTCHA solutions to accompany each request, thus forcing human intervention, and hence, at the very least, reducing the intensity of the attacks. Because CAPTCHAs can potentially play a very critical role in system security, it is imperative that the design and implementation of CAPTCHAs be relatively foolproof.

There has been sizable research output in designing as well as breaking CAPTCHAs. In both these efforts, computing research stands to benefit. A better CAPTCHA design means greater security for computing systems, and the breaking of an existing CAPTCHA usually means the advancement of artificial intelligence (AI). While text-based CAPTCHAs have been traditionally used in real-world applications (Yahoo! Mail sign up, PayPal sign up, Ticketmaster search, Blogger comment posting, etc.), their vulnerability has been repeatedly shown by computer vision researchers [24], [31], [4], [25], reporting over 90% success rate. Among the earliest commercial ones, the Yahoo! CAPTCHA has also been reportedly compromised, with a success rate of 35% [29], allowing e-mail accounts to be opened automatically, and encouraging e-mail spam.

In principle, there exist many hard AI problems that can replace text-based CAPTCHAs, but in order to have general appeal and accessibility, recognition of image content has been an oft-suggested alternative [1], [5], [8], [9], [27].
While automatic image recognition is usually considered to be a much harder problem than text recognition (which is a reason for it to be suggested as an alternative to text CAPTCHAs), it has also enjoyed moderate success as part of computer vision research. This implies that a straightforward replacement of text with images may subject it to similar risks of being ‘broken’ by image recognition techniques. Techniques such as near-duplicate image matching [15], content-based image retrieval [30], and real-time automatic image annotation [19] are all potential attack tools for an adversary. One approach that can potentially make automated attacks harder while maintaining recognizability by humans is systematic distortion. A brief mention of the use of distortions in the context of image CAPTCHAs has been made in the literature [5], but this has not been followed up by any study or implementation. Furthermore, while there have been ample studies on the algorithmic ability to handle noisy signals (occlusion, low light, clutter, noise), most often to test robustness of recognition methods, their behavior under strong artificial distortions has rarely been studied systematically.

[Figure: Sample CAPTCHAs proposed or in real-world use. (a)-(b) Text-based CAPTCHAs in public use. (c) Image-based CAPTCHA proposed by CMU’s Captcha Project. User is asked to choose an appropriate label from a list. (d) Asirra presents pictures of cats and dogs and asks users to select all the cats.]

In this work, we explore the use of systematic image distortion in designing CAPTCHAs, for inclusion in our experimental system called IMAGINATION. We compare human and machine recognizability of images under distortion based on extensive user studies and image matching algorithms respectively.
The criteria for a distortion to be eligible for image CAPTCHA design are that, when applied, it must
1) make algorithmic recognition difficult, and
2) have only a minor effect on recognizability by humans.

Formally, let H denote a representative set of humans, and let M denote one particular algorithm of demonstrated image recognition capability. We introduce a recognizability function ρ_X(I) to indicate whether image I has been correctly recognized by X or not. Thus, ρ_H(I) and ρ_M(I) are human and machine recognizabilities respectively, and we refer to |ρ_H(I) − ρ_M(I)| as the recognizability gap with respect to image I. This image can be visually distorted to varying degrees. We define a distortion function δ_y(·) that can be applied to a natural image, the degree of distortion being abstractly represented by parameter y. This study focuses on analyzing (a) recognizability, and (b) recognizability gap, of distorted images δ_y(I), over a large number of natural images. The following are of interest:
• Current state-of-the-art image recognition methods typically test and report results on undistorted natural images, and on minor distortions. In the absence of distortion, the likelihood of ‘breaking’ an image CAPTCHA is therefore roughly equal to the accuracy of these image recognition techniques.
• On application of a distortion, the image recognition performance is expected to degrade. There has been no comprehensive study on the effect of various artificial distortions on image recognizability.
• Distortion also affects human recognizability of images. It is safe to assume, though, that humans are relatively more resilient to distortion; they can ‘see through’ clutter and fill in the missing pieces, owing to their power of imagination.
• In CAPTCHA design, the goal is to evade recognition by machines while being easily recognizable by humans. It is therefore important to be able to figure out the types and strengths of distortion on images that keep human recognizability high while significantly affecting machine recognizability.

While the primary aim of this work is the systematic design of a security mechanism, the results from the study (see, e.g., Figs. 7, 8, 9, 10, and 11) also reveal to us some of the shortcomings of image matching algorithms, i.e., how the application of certain distortions makes it difficult for even state-of-the-art image matching methods to pair up distorted images with their originals. Furthermore, through large-scale user studies, we are also made aware of the kinds of distortions that make image recognition difficult for humans. These peripheral observations may find use in other research domains.
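
The recognizability gap defined above can be illustrated with a small Python sketch. It is not the evaluation code used in the study: the function names, the `min_human_rate` threshold, the distortion labels, and the 0/1 outcome lists are all hypothetical placeholders, and averaging per-image outcomes into a rate is an assumption made here for illustration. Each outcome stands for ρ_X(δ_y(I)) on one image.

```python
from statistics import mean

def recognizability_rate(outcomes):
    """Fraction of images recognized; each outcome is rho_X(delta_y(I)) in {0, 1}."""
    return mean(outcomes)

def recognizability_gap(rho_h, rho_m):
    """|rho_H - rho_M|, for one image or for two aggregate rates."""
    return abs(rho_h - rho_m)

def rank_distortions(trials, min_human_rate=0.9):
    """Keep distortions that humans still recognize, sorted by descending gap.

    trials maps a (hypothetical) distortion label to a pair
    (human_outcomes, machine_outcomes) of 0/1 lists over the same images.
    """
    ranked = []
    for label, (human, machine) in trials.items():
        h = recognizability_rate(human)
        m = recognizability_rate(machine)
        if h >= min_human_rate:  # criterion 2: minor effect on humans
            ranked.append((label, h, m, recognizability_gap(h, m)))
    # criterion 1: prefer distortions that hurt the machine most
    return sorted(ranked, key=lambda r: r[3], reverse=True)

# Placeholder outcomes for illustration only; NOT data from the study.
trials = {
    "none":        ([1, 1, 1, 1], [1, 1, 1, 0]),
    "mild_noise":  ([1, 1, 1, 1], [1, 0, 0, 0]),
    "heavy_noise": ([1, 0, 0, 1], [0, 0, 0, 0]),
}

for label, h, m, gap in rank_distortions(trials):
    print(f"{label}: human={h:.2f} machine={m:.2f} gap={gap:.2f}")
```

With these placeholder values, the heavy distortion is discarded because human recognizability falls below the chosen threshold, and the remaining candidates are ordered by how far machine recognizability drops relative to human recognizability.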

Download full report
http://infolab.stanford.edu/~wangz/proje.../datta.pdf