seminar class · 05-05-2011, 09:54 AM

Abstract
Data may contain sensitive information which, if exposed to unauthorized parties, may cause harm. Exposing data to unauthorized parties has therefore always raised questions about individual privacy and about data and information security. Yet these data still need to be exposed for data mining applications to be effective. To date, researchers have tried to address the privacy issue with techniques such as randomization and cryptography. This paper addresses the problem of privacy preserving data mining by transforming the attributes into fuzzy attributes. Data security is maintained because the exact value cannot be predicted, and at the same time better accuracy of the mining algorithms is achieved. K-NN and J48 classification algorithms are used with four datasets in the experiments to show the effectiveness of the approach. In this work, we use two privacy measures, Bias in Mean (BIM) and Bias in Standard Deviation (BISD), to measure information loss. Experimental results, in terms of both classifier accuracy and the privacy measures, suggest that the fuzzy based transformation technique performs better and can be adopted for privacy preserving data mining.
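The two information-loss measures named in the abstract can be sketched as follows. This excerpt does not give the paper's exact formulas, so the definitions below are an assumption: each measure is taken as the relative absolute difference between a statistic of the original attribute and the same statistic of the transformed attribute. The function names and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def bias_in_mean(original, transformed):
    """Assumed BIM: |mean(orig) - mean(trans)| / |mean(orig)|."""
    m = mean(original)
    return abs(m - mean(transformed)) / abs(m)


def bias_in_std(original, transformed):
    """Assumed BISD: |std(orig) - std(trans)| / std(orig), population std."""
    s = pstdev(original)
    return abs(s - pstdev(transformed)) / s


# Made-up attribute values, for illustration only.
original = [20.0, 30.0, 40.0, 50.0, 60.0]
transformed = [22.0, 30.0, 40.0, 50.0, 58.0]

bim = bias_in_mean(original, transformed)
bisd = bias_in_std(original, transformed)
print("BIM:", bim)
print("BISD:", bisd)
```

Under these assumed definitions, a value near zero means the transformation left that statistic almost unchanged, that is, little information was lost with respect to it.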
1. Introduction
Explosive growth in data storage and data processing technologies has led to the creation of large databases that record an unprecedented amount of information. Consequently, with the increase in data storage and processing, concerns about information privacy have emerged. Data mining, with its promise to efficiently discover valuable, non-obvious information from large databases, is particularly sensitive to privacy concerns. In recent years, data mining has also endeavored to become compatible with privacy.
Organizations assure individuals that their privacy will be protected and that data will be used only for its well-defined primary purpose. However, it is common practice for an organization to use individual data for secondary purposes, that is, for purposes other than those for which the data were initially collected. Many organizations sell the data to other organizations, which use these data for their own purposes. The data is thus exposed to a number of parties, including collectors, owners, users and miners, and the privacy of the individual is called into question.
Fruitful research has been produced by different researchers on the topic of privacy preserving data mining (PPDM). PPDM deals with the problem of learning accurate models over aggregate data while protecting privacy at the level of individual records.
In this paper, we address issues relating to privacy preserving data mining. In particular, we focus on privacy preserving data classification. Classification techniques have been studied extensively for more than two decades [1]. The main objective of data classification is to build a classifier that predicts the class labels of data tuples based on a training data set [2]. The classifier is usually represented by classification rules, decision trees, neural networks or mathematical formulae that can be used for classification.
The issue of privacy protection in classification has been raised by many researchers in the past [3, 4]. The objective of privacy preserving data classification is to build accurate classifiers without disclosing private information while the data is being mined. The performance of privacy preserving techniques should be analyzed and compared in terms of both the privacy protection of individual data and the predictive accuracy of the constructed classifiers.
Recent research in the area of privacy preserving data mining has devoted much effort to determining a trade-off between privacy and the need for knowledge discovery, which is crucial for improving decision-making processes and other human activities. Three main approaches are adopted for privacy preserving data mining: heuristic based, cryptography based and reconstruction based [5]. Heuristic based techniques are mainly adopted in the centralized database scenario, whereas cryptography based techniques find their application in distributed environments. Reconstruction based algorithms, however, are well accepted in both centralized and distributed environments.
There is a clear trade-off between the accuracy of knowledge and privacy: the higher the accuracy, the lower the privacy, and the lower the accuracy, the higher the privacy. Hence, privacy preserving data mining remains an open research issue.
This paper suggests a simple approach for privacy preserving data mining using the concept of fuzzy sets. It shows that data privacy can be maintained without compromising the accuracy of the result if attributes are transformed into fuzzy sets. The reason is that, when data points are transformed into fuzzy sets, the similarity between points is still preserved. The accuracy of the mining results is preserved because similar attributes still lie in one class, since the dissimilarity between data points falling in each class is very small.
The rest of the paper is organized as follows. Section 2 describes the related work in the area of privacy preserving data mining. Section 3 gives the outline of our methodology to carry out privacy preserving data mining. Section 4 describes the detailed experimentation, and finally we conclude in Section
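The similarity argument made in the introduction can be illustrated with a small sketch. The membership function the paper actually uses is not stated in this excerpt, so the S-shaped membership function below is an assumption chosen for illustration, as are the function names and the sample ages. Each numeric value is replaced by a membership degree in [0, 1], which hides the exact value while keeping the ordering of the points.

```python
def s_membership(x, a, c):
    """S-shaped membership function on [a, c] (assumed, not the paper's)."""
    if x <= a:
        return 0.0
    if x >= c:
        return 1.0
    b = (a + c) / 2.0  # crossover point, where membership is 0.5
    if x <= b:
        return 2.0 * ((x - a) / (c - a)) ** 2
    return 1.0 - 2.0 * ((x - c) / (c - a)) ** 2


def fuzzify(values):
    """Map every value of one attribute to its membership degree."""
    lo, hi = min(values), max(values)
    return [s_membership(v, lo, hi) for v in values]


# Hypothetical attribute values, already sorted for easy comparison.
ages = [23, 25, 41, 43, 67]
fuzzy = fuzzify(ages)

for age, degree in zip(ages, fuzzy):
    print(age, round(degree, 4))
```

Because the function is non-decreasing, values that were adjacent in the original ordering remain adjacent after the transformation, which is the property a distance-based classifier such as K-NN relies on. The distances themselves are rescaled, so this sketch shows order preservation only, not that accuracy is unchanged.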
