FUZZY SIMILARITY APPROACH full report
#1

CHAPTER – 1
INTRODUCTION

E-mail spam has continued to increase at a very fast rate over the last couple of years. It has become a major threat to business users, network administrators and even ordinary users [2]. A study in July 1997 reported that spam messages constituted approximately 10% of the incoming messages to a corporate network.
More recently, Message Labs [1] stated in its 2006 Annual Security Report that spam activity increased significantly in 2006, reaching 86.2% of e-mail traffic. The report also indicated that, largely because of the increased sophistication of robot networks (botnets), spam volumes rose by 70% over the last quarter of 2006, which in turn increased overall e-mail traffic by a third. Based on projections of current analysis and trends, spam was expected to keep rising through the end of 2007 and plateau at around 92% of e-mail traffic [2]. One prediction is that by 2015 spam will exceed 95% of all e-mail traffic [4]. Although these figures may not be exact, they show that spam volume has been increasing dramatically over the years.
1.1 PROBLEM STATEMENT
The Internet has opened new channels of communication, enabling an e-mail to be sent to a relative thousands of kilometers away. This medium opens the door to virtually free mass e-mailing that reaches hundreds of thousands of users within seconds. However, this freedom of communication can be misused.
Spam can be very costly to e-mail recipients. According to Ferris Research, an employee who receives five spam e-mails per day and spends 30 seconds on each will waste about 15 hours a year on them. Multiplying this by the hourly rate of each employee in a company gives the cost of spam to that company. Spam software can also be used to distribute harmful content such as viruses, Trojan horses, worms and other malicious code. It can be a means of phishing attacks as well.
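The arithmetic behind the Ferris Research figure can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and it assumes spam arrives on all 365 days of the year, which is what makes the result come out at roughly 15 hours.

```python
def hours_lost_per_year(messages_per_day, seconds_per_message, days_per_year=365):
    """Hours per year that one person spends handling spam.

    days_per_year=365 is an assumption; the report does not say whether
    only working days are counted.
    """
    return messages_per_day * seconds_per_message * days_per_year / 3600


def company_cost_per_year(hourly_rates, messages_per_day, seconds_per_message):
    """Yearly cost of spam: hours lost multiplied by each employee's hourly rate."""
    hours = hours_lost_per_year(messages_per_day, seconds_per_message)
    return sum(rate * hours for rate in hourly_rates)


# Five messages a day at 30 seconds each, as in the text above.
hours = hours_lost_per_year(5, 30)

# Hypothetical company with two employees paid 20 and 30 per hour.
cost = company_cost_per_year([20.0, 30.0], 5, 30)
```

With these inputs `hours` is a little over 15, which matches the figure quoted in the report.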
As a result, spam has become an area of growing concern, attracting the attention of many security researchers and practitioners. In addition to regulation and legislation, various technical anti-spam solutions have been proposed and deployed to combat the problem. However, most of them were static: they relied on a blacklist of spammers, a whitelist of good sources, or a fixed set of keywords to identify spam messages. Although these substantially reduced the risk, they failed to scale and adapt to spammers' tactics. They can be defeated easily by changing the sender's address each time, intentionally misspelling words, or forging the content to bypass spam filters [1].
The proposed system is based on fuzzy logic and can automatically classify e-mail messages as spam or legitimate. We study its performance with various conjunction and disjunction operators on several datasets. The proposed system checks both a predefined list of words and the content of the message [7].
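For readers new to fuzzy logic, the sketch below lists three commonly used pairs of conjunction (t-norm) and disjunction (s-norm) operators on membership values in [0, 1]. The report does not say which operators were actually tested, so this selection and the dictionary name OPERATOR_PAIRS are assumptions made for illustration.

```python
def t_min(a, b):
    """Zadeh conjunction: minimum."""
    return min(a, b)


def s_max(a, b):
    """Zadeh disjunction: maximum."""
    return max(a, b)


def t_product(a, b):
    """Algebraic product conjunction."""
    return a * b


def s_prob_sum(a, b):
    """Probabilistic sum disjunction."""
    return a + b - a * b


def t_lukasiewicz(a, b):
    """Lukasiewicz conjunction (bounded difference)."""
    return max(0.0, a + b - 1.0)


def s_lukasiewicz(a, b):
    """Lukasiewicz disjunction (bounded sum)."""
    return min(1.0, a + b)


# Hypothetical registry so a classifier can be run once per operator pair.
OPERATOR_PAIRS = {
    "min/max": (t_min, s_max),
    "product/probabilistic sum": (t_product, s_prob_sum),
    "Lukasiewicz": (t_lukasiewicz, s_lukasiewicz),
}

for name, (t_norm, s_norm) in OPERATOR_PAIRS.items():
    print(name, t_norm(0.3, 0.8), s_norm(0.3, 0.8))
```

Each conjunction never exceeds the smaller of its inputs and each disjunction never falls below the larger, so swapping pairs changes how strictly partial matches are counted.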
1.2 MOTIVATION
The use of fuzzy similarity is motivated by the fact that the category of a document cannot be determined from a single term; rather, it is determined from a set of terms that co-occur in training documents.
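To make the idea concrete, here is a minimal sketch of one way a fuzzy similarity classifier can be built from co-occurring terms. It is not taken from the report: the membership definitions, the ratio-of-sums similarity, the function names and the tiny training set are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


def train(labeled_docs):
    """Build a fuzzy term-category relation from (text, category) pairs.

    The membership of a term in a category is taken to be the share of
    the term's occurrences that fall in that category (an assumption).
    """
    counts = defaultdict(Counter)
    for text, category in labeled_docs:
        for term in text.lower().split():
            counts[term][category] += 1
    relation = {}
    for term, per_category in counts.items():
        total = sum(per_category.values())
        relation[term] = {cat: n / total for cat, n in per_category.items()}
    return relation


def doc_membership(text):
    """Membership of each term in a document: frequency scaled to [0, 1]."""
    freq = Counter(text.lower().split())
    if not freq:
        return {}
    top = max(freq.values())
    return {term: n / top for term, n in freq.items()}


def similarity(doc, relation, category, t_norm=min, s_norm=max):
    """Ratio of summed conjunctions to summed disjunctions over the document's terms."""
    num = 0.0
    den = 0.0
    for term, mu_doc in doc.items():
        mu_cat = relation.get(term, {}).get(category, 0.0)
        num += t_norm(mu_cat, mu_doc)
        den += s_norm(mu_cat, mu_doc)
    return num / den if den else 0.0


def classify(text, relation, categories=("spam", "legitimate")):
    """Return the best-matching category and the score of every category."""
    doc = doc_membership(text)
    scores = {cat: similarity(doc, relation, cat) for cat in categories}
    return max(scores, key=scores.get), scores


# Made-up training messages, for illustration only.
TRAINING = [
    ("cheap pills buy now", "spam"),
    ("buy cheap watches now", "spam"),
    ("meeting agenda for monday", "legitimate"),
    ("project report attached for review", "legitimate"),
]

relation = train(TRAINING)
label, scores = classify("buy cheap pills", relation)
```

No single word decides the outcome here; the score accumulates over every term the message shares with each category, and the t_norm and s_norm arguments let different conjunction and disjunction operators be plugged in.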
To estimate the membership function of a temporal fuzzy set, one first determines the membership function of a property. The temporal membership function is then obtained by tracking the feature vector as it travels along its trajectory. In most practical applications, the trajectory is given by a collection of samples, or feature vectors, equally spaced in time. Thus, the membership function estimation problem may be approached as a pattern recognition problem. By partitioning the samples of a trajectory into classes, one obtains the regions of attraction. Since in general it is not possible to determine crisp boundaries between classes, fuzzy partitioning is the preferred methodology.
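As an illustration of fuzzy partitioning, the sketch below computes the membership of each trajectory sample in a set of class centers using the standard fuzzy c-means membership formula. The report does not name a partitioning method, so fuzzy c-means, the fuzzifier m = 2 and the example centers and trajectory are all assumptions.

```python
def fuzzy_memberships(sample, centers, m=2.0):
    """Fuzzy c-means membership of one sample in each class center.

    m > 1 is the fuzzifier; larger values give softer boundaries.
    """
    dists = [
        sum((x - c) ** 2 for x, c in zip(sample, center)) ** 0.5
        for center in centers
    ]
    # A sample lying exactly on a center belongs fully to that center.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
        for d_i in dists
    ]


# Hypothetical class centers and an equally spaced trajectory of samples.
CENTERS = [(0.0, 0.0), (4.0, 0.0)]
TRAJECTORY = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]

# Membership of the trajectory in each class as it evolves over time.
memberships_over_time = [fuzzy_memberships(s, CENTERS) for s in TRAJECTORY]
```

The memberships of each sample sum to 1, and a sample midway between two centers belongs equally to both rather than being forced across a crisp boundary.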
1.3 SCOPE:
The scope of this project is to provide automated spam filtering with a fuzzy similarity approach. This method considers the content of the message to predict its category rather than relying on a fixed set of preclassified keywords. Thus it can adapt to spammer tactics and build its knowledge base dynamically.
1.4 OUTLINE
We investigate an alternative automated spam filtering technique based on fuzzy similarity. We briefly review related work on applying machine-learning techniques to classifying e-mail messages. We then describe the fuzzy similarity approach to anti-spam filtering and evaluate the performance of the proposed method using several datasets from a public corpus.
The purpose of this document is to present a detailed description of the e-mail Filtering System. It will explain the purpose and features of the system, the interfaces of the system, what the system will do, the constraints under which it must operate and how the system will react to external stimuli. This document is intended for both the stakeholders and the developers of the system and will be proposed to the Regional Historical Society for its approval.
CHAPTER-2
BACKGROUND

E-mail spam, also known as "bulk e-mail" or "junk e-mail," is a subset of spam that involves nearly identical messages sent to numerous recipients by e-mail. A common synonym for spam is unsolicited bulk e-mail (UBE). Definitions of spam usually require that the e-mail be both unsolicited and sent in bulk. "UCE" refers specifically to "unsolicited commercial e-mail." [1]
E-mail spam grew exponentially for several decades, reaching several billion messages a day. Spam has frustrated, confused, and annoyed e-mail users. Laws against spam have been implemented sporadically, with some following an opt-out model and others requiring opt-in. The total volume of spam (over 100 billion e-mails per day as of April 2008) has leveled off slightly in recent years and is no longer growing exponentially. About 80% of all spam is sent by fewer than 200 spammers.
2.1 DEFINITION
The term spam refers to unsolicited, unwanted, inappropriate bulk e-mail. It is sometimes expanded as "Stupid Pointless Annoying Messages".
2.2 TYPES OF SPAM
Spam has several definitions, varying by the source.[21]
• Unsolicited bulk e-mail (UBE): unsolicited e-mail sent in large quantities.
• Unsolicited commercial e-mail (UCE): this more restrictive definition is used by regulators whose mandate is to regulate commerce.
• Any e-mail message that is fraudulent.
• Any e-mail message where the sender's identity is forged, or messages sent through unprotected SMTP servers or unauthorized proxies.
2.3 CHARACTERISTICS OF SPAM:
Spam characteristics appear in two parts of a message: email headers and message content[2].
E-mail headers: email headers show the route an email has taken in order to arrive at its destination. They also contain other information about the email, such as the sender and recipient, the message ID, date and time of transmission, subject and several other email characteristics.
Typical email header characteristics in spam messages:
• Recipient's email address is not in the To: or Cc: field
• Empty To: field
• To: contains an invalid email address
• Missing To: field
• From: field is same as To: field
• Missing From: field
• More than 10 recipients in To: and/or Cc: fields
• Bcc: header exists
• Message contents
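The header characteristics listed above can be checked mechanically. The following sketch uses Python's standard email package; the function name header_flags, the flag strings and the very crude "no @ sign" validity test are assumptions made for illustration, not part of the report.

```python
from email import message_from_string
from email.utils import getaddresses


def header_flags(raw_message, recipient):
    """Return the spam-like header characteristics found in a raw message."""
    msg = message_from_string(raw_message)
    to_values = msg.get_all("To", [])
    cc_values = msg.get_all("Cc", [])
    from_values = msg.get_all("From", [])

    to_addrs = [a.lower() for _, a in getaddresses(to_values) if a]
    cc_addrs = [a.lower() for _, a in getaddresses(cc_values) if a]
    from_addrs = [a.lower() for _, a in getaddresses(from_values) if a]

    flags = []
    if not to_values:
        flags.append("missing To")
    elif not to_addrs:
        flags.append("empty To")
    # Crude validity check (assumption): an address without "@" is invalid.
    if any("@" not in a for a in to_addrs):
        flags.append("invalid To address")
    if not from_values:
        flags.append("missing From")
    if from_addrs and set(from_addrs) == set(to_addrs):
        flags.append("From same as To")
    if recipient.lower() not in to_addrs + cc_addrs:
        flags.append("recipient not in To or Cc")
    if len(to_addrs) + len(cc_addrs) > 10:
        flags.append("more than 10 recipients")
    if msg.get_all("Bcc"):
        flags.append("Bcc header present")
    return flags


# Made-up message whose sender and recipient fields are identical.
SUSPICIOUS = "From: a@x.example\nTo: a@x.example\nBcc: b@y.example\nSubject: hi\n\nbody"
flags = header_flags(SUSPICIOUS, "me@z.example")
```

A real filter would probably weight these flags rather than treat each as decisive, since legitimate mailing-list traffic can trigger some of them.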
The email filtering system should filter out spam messages in three ways:
1. Block spam at the gateway by checking domains against real-time blackhole lists (RBLs).
2. Filter out spam based on email characteristics
3. Identify Junk mail content
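The three filtering stages above can be sketched as a simple pipeline in which a message is stopped by the first stage that objects to it. Everything named here is hypothetical: real blackhole lists are normally queried over DNS, so the local BLOCKED_DOMAINS set is only a stand-in, and the word list and threshold are invented for the example.

```python
# Stand-in for a real-time blackhole list lookup (assumption).
BLOCKED_DOMAINS = {"spam.example"}

# Invented junk vocabulary and threshold, for illustration only.
JUNK_WORDS = {"lottery", "winner", "prize", "pills"}
JUNK_THRESHOLD = 2


def blocked_at_gateway(sender):
    """Stage 1: reject senders whose domain is on the blackhole list."""
    return sender.rsplit("@", 1)[-1].lower() in BLOCKED_DOMAINS


def suspicious_headers(headers):
    """Stage 2: flag messages with missing To/From or a visible Bcc header."""
    return "To" not in headers or "From" not in headers or "Bcc" in headers


def junk_content(body):
    """Stage 3: flag bodies that contain several junk words."""
    words = set(body.lower().split())
    return len(words & JUNK_WORDS) >= JUNK_THRESHOLD


def filter_message(sender, headers, body):
    """Run the three stages in order and report the first one that fires."""
    if blocked_at_gateway(sender):
        return "rejected at gateway"
    if suspicious_headers(headers):
        return "filtered on headers"
    if junk_content(body):
        return "filtered on content"
    return "delivered"


verdict = filter_message(
    "promo@mail.example",
    {"From": "promo@mail.example", "To": "me@example.org"},
    "you are a lottery winner claim your prize",
)
```

Ordering the stages from cheapest to most expensive means most spam is dropped before any content analysis has to run.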
2.4 THE PROBLEMS ASSOCIATED WITH SPAM:
Spam is becoming increasingly unpopular among Internet users. An estimated 74% favor making spam illegal, while only 12% are opposed to banning it. The sections below outline a number of the reasons that spam has become such a problem [1].
COSTS OF SPAM:
Spam imposes costs on all Internet users. These costs have been increasing with the growth in the number of spam messages infiltrating the Internet daily. Spam consumes network and computing resources, e-mail administrator and helpdesk personnel time, and reduces worker productivity.
It is difficult to calculate the total costs of spam at the global level, though estimates suggest the costs are high. One estimate is that people who receive six spam messages a day waste two hours each year deleting spam (assuming it takes 3-4 seconds to determine the nature of a message and delete it).
COSTS FOR INDIVIDUAL USERS:
Consumers waste time deleting repeated unsolicited commercial messages. The costs for consumers can also include additional communications charges from ISPs or telephone companies (or both) as well as additional data storage charges. Likewise, costs incurred by ISPs in dealing with spam tend to be passed on to consumers.
COSTS FOR COMPANIES:
Companies are beginning to have concerns about the significant costs of spam, which threaten the business environment in multiple ways. Using techniques that harvest e-mail addresses on the Internet, spammers have databases of addresses taken from corporate Web sites. Furthermore, the increase in spam attacks on companies introduces serious threats.
The costs of spam to companies can be categorized as follows. First, there is the productivity loss from employees dealing with spam. Second, there are additional costs for network and computing resources. Third, there are additional human resources and financial burdens for deploying technical tools to deal with spam. A fourth category is the security risks due to spam attacks such as dictionary attacks and e-mail-borne viruses and worms. Finally there is potential legal liability.
The network and computing resources needed to deal with spam messages may be quite high. Spam requires filtering resources, may slow down company networks by increasing the traffic load, and takes up the enterprise's computer storage space and bandwidth. Spam also consumes e-mail administrator and helpdesk personnel time and increases the financial costs of deploying anti-spam technology and operating filtering systems.
#2
Hello sir

I want help for this topic regarding the project development.

Project Name : A fuzzy similarity for automated spam filtering

Ravi Deshpande