Simple Approach for Privacy Preserving Data Mining
Abstract
Data may contain sensitive information which, if exposed to unauthorized parties, may lead to harmful effects. Hence, exposing data to unauthorized parties has always raised questions relating to individual privacy and to data and information security. Still, these data need to be exposed for effective data mining applications. To date, researchers have tried to address the privacy issue with different techniques such as randomization and cryptographic techniques. This paper addresses the problem of privacy preserving data mining by transforming the attributes to fuzzy attributes. Thus, data security is maintained, as one cannot predict the exact value, while at the same time better accuracy of the mining algorithms is achieved. K-NN and J48 classification algorithms with four datasets are used in the experiments to show the effectiveness of the approach. In this work, we have used two privacy measures, namely Bias in Mean (BIM) and Bias in Standard Deviation (BISD), to measure information loss. Experimental results, both in terms of classifier accuracy and privacy measures, suggest that the fuzzy based transformation technique is better and can be adopted for privacy preserving data mining.
1. Introduction
Explosive growth in data storage and data processing technologies has led to the creation of large databases that record an unprecedented amount of information. Consequently, with the increase in data storage and processing, concerns about information privacy have emerged. Data mining, with its promise to efficiently discover valuable non-obvious information from large databases, is particularly sensitive to privacy concerns. In recent years, data mining has also endeavored to become compatible with privacy.

Organizations provide assurance of individual privacy and that data will be used only for its primary, well-defined purpose. However, it is common practice for an organization to use individual data for secondary purposes, that is, for purposes other than those for which the data were initially collected. Many organizations sell the data to other organizations, which use these data for their own purposes. Thus, the data gets exposed to a number of parties including collectors, owners, users and miners, and the privacy of the individual is called into question.

Different researchers have produced fruitful research on the topic of privacy preserving data mining (PPDM). PPDM deals with the problem of learning accurate models over aggregate data while protecting privacy at the level of individual records.

In this paper, we address issues relating to privacy preserving data mining. In particular, we focus on privacy preserving data classification. Classification techniques have been extensively studied for more than two decades [1]. The main objective of data classification is to build a classifier to predict the class labels of data tuples based on a training data set [2]. The classifier is usually represented by classification rules, decision trees, neural networks or mathematical formulae that can be used for classification.

The issue of privacy protection in classification has been raised by many researchers in the past [3, 4]. The objective of privacy preserving data classification is to build accurate classifiers without disclosing private information while the data is being mined. The performance of privacy preserving techniques should be analyzed and compared in terms of both the privacy protection of individual data and the predictive accuracy of the constructed classifiers.

Recent research in the area of privacy preserving data mining has devoted much effort to determining a trade-off between privacy and the need for knowledge discovery, which is crucial in order to improve decision-making processes and other human activities. Mainly, three approaches are being adopted for privacy preserving data mining, namely heuristic based, cryptographic based and reconstruction based [5]. Heuristic based techniques are mainly adopted in the centralized database scenario, whereas cryptographic based techniques find application in distributed environments. However, reconstruction based algorithms are well accepted in both centralized and distributed environments.

There is a clear trade-off between the accuracy of knowledge and privacy: the higher the accuracy, the lower the privacy, and the lower the accuracy, the higher the privacy. Hence, privacy preserving data mining remains an open research issue.

This paper suggests a simple approach for privacy preserving data mining using the concept of fuzzy sets. It shows that data privacy can be maintained without compromising the accuracy of the result if attributes are transformed into fuzzy sets. The reason is that, if data points are transformed into fuzzy sets, the similarity between points is still preserved. The accuracy of the mining results is preserved as similar attributes still lie in one class. This is because the dissimilarity between data points falling in each class is very small.

The rest of the paper is organized as follows. Section 2 describes the related work in the area of privacy preserving data mining. Section 3 gives the outline of our methodology to carry out privacy preserving data mining. Section 4 describes the detailed experimentation and finally we conclude in Section
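
The introduction's claim that similarity between points survives a transformation into fuzzy sets can be illustrated with a minimal sketch. The excerpt does not say which membership function the paper uses, so the S-shaped function below, the helper names `s_membership` and `fuzzify`, and the sample attribute values are all assumptions made for illustration only.

```python
from typing import List


def s_membership(x: float, a: float, b: float) -> float:
    """S-shaped fuzzy membership in [0, 1] (assumed form, not taken from the paper)."""
    if b <= a:
        raise ValueError("b must be greater than a")
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    midpoint = (a + b) / 2.0
    if x <= midpoint:
        return 2.0 * ((x - a) / (b - a)) ** 2
    return 1.0 - 2.0 * ((x - b) / (b - a)) ** 2


def fuzzify(values: List[float]) -> List[float]:
    """Replace each raw value with its membership degree over the observed range."""
    a, b = min(values), max(values)
    return [s_membership(v, a, b) for v in values]


# Hypothetical attribute values, for illustration only.
ages = [23.0, 25.0, 31.0, 47.0, 52.0, 60.0]
fuzzy_ages = fuzzify(ages)

# The mapping is strictly increasing between a and b, so the ordering of
# the records is the same before and after the transformation.
order_before = sorted(range(len(ages)), key=lambda i: ages[i])
order_after = sorted(range(len(ages)), key=lambda i: fuzzy_ages[i])
print(order_before == order_after)
print([round(v, 3) for v in fuzzy_ages])
```

Because the mapping is monotonic, records that are close in the raw attribute stay close in the transformed one, which is the property the paragraph above relies on. This particular mapping can be inverted by anyone who knows `a` and `b`, so the sketch illustrates only the similarity-preserving aspect and not the privacy argument itself.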