An Alternative To Captcha - Video Captcha
B.Tech Seminar report
by
Nawas Abdulla
Department of Computer Science And Engineering
Government Engineering College, Thrissur
December 2010

Abstract
Video CAPTCHA is a new technique that uses content-based video labeling as a
CAPTCHA task. The challenges are currently generated from YouTube videos, which
carry tags supplied by the person who uploaded the video. Responses to these
videos are graded using the tags of the given video and the tags of related
videos. The human success rate of video CAPTCHA is roughly 70% to 90%, while
the attack success rate is around 13%. The usability and security of video
CAPTCHA are comparable to those of existing CAPTCHAs, and users find it more
enjoyable than traditional CAPTCHAs.

Contents
1 Introduction
2 COLLECTING VIDEO SAMPLES
3 CHALLENGE GENERATION
3.1 Related Videos
3.2 Cosine Similarity of Tag Sets
3.3 Adding Related Tags
3.4 Rejecting Frequent Tags
4 GRADING FUNCTION
4.1 Preprocessing
4.2 Expanding Tags through Word Stemming
4.3 Allowing Inexact Matching
5 ATTACK SIMULATION
6 USER STUDIES
7 Conclusion and Future Works
7.1 Conclusion
7.2 Future Works
References

Chapter 1
Introduction

A Completely Automated Public Turing test to tell Computers and Humans
Apart (CAPTCHA) is a variant of the Turing test in which an online challenge
is used to determine whether the user is a human or a computer program. The
purpose of a CAPTCHA is to prevent the abuse of online services, for example by
programs that create thousands of free email accounts and then use them to send
spam. Common types of CAPTCHA are based on character recognition, speech
recognition, and image understanding.
Four desirable properties of CAPTCHAs are:
1. Automated: Challenges should be automatically generated and graded by a
computer.
2. Open: The database(s) and algorithm(s) used for generating and grading the
challenges should be made public. By Kerckhoffs's principle, a system should
remain secure even if everything about the system is public knowledge.
3. Usable: Challenges must be easily solved by humans in a reasonable amount
of time.
4. Secure: Challenges should be difficult for machines to solve algorithmically.
The most widely used type of CAPTCHA requires a user to transcribe distorted
characters displayed within a noisy image. The algorithms used to generate these
challenges automatically are publicly available, but many users find the
challenges frustrating. Researchers have also developed automated programs that
defeat them. For example, one such program eventually achieved an attack success
rate of 60% against Microsoft's Hotmail CAPTCHA. There is therefore a need for a
new CAPTCHA that is automated, open, usable, and secure.

Chapter 2
COLLECTING VIDEO SAMPLES

YouTube.com, at present the largest user-generated-content video system
available, is used as the dataset for challenge generation. It currently stores
and indexes about 150 million videos.
Generating YouTube video identifiers (IDs) at random would yield a truly random
sample, but collecting a large sample in this fashion is not practicable. YouTube
video IDs are 11 characters long, drawn from a character set comprising digits
(0-9), lowercase letters (a-z), uppercase letters (A-Z), underscores (_) and
dashes (-), for a total of 64 different characters. There are therefore
$64^{11} \approx 7.4 \times 10^{19}$ possible IDs. Currently there are
approximately $1.5 \times 10^{8}$ videos on YouTube.com, so the
probability of generating a valid video ID randomly is approximately 2 10