Data Types Generalization for Data Mining Algorithm

Abstract
With the growth of database applications, mining interesting information from huge databases has become a major concern, and a variety of mining algorithms have been proposed in recent years. The data processed in data mining may be obtained from many sources in which different data types are used. However, no algorithm can be applied to all applications, because each algorithm fits only certain data types; the selection of an appropriate mining algorithm therefore depends not only on the goal of the application but also on data fittability. Transforming a non-fitting data type into the target one is thus an important task in data mining, but it is often tedious or complex because so many data types exist in the real world. Merging the similar data types of a given selected mining algorithm into a generalized data type appears to be a good way to reduce the transformation complexity. In this work, the data types fittability problem for six kinds of widely used data mining techniques is discussed, and a data type generalization process consisting of merging and transforming phases is proposed. In the merging phase, the original data types of the data sources to be mined are first merged into generalized ones. The transforming phase then converts the generalized data types into the target ones for the selected mining algorithm. Using the data type generalization process, the user can select an appropriate mining algorithm based on the goal of the application alone, without considering the data types.

1. Introduction

In recent years, the amount of data of various kinds has grown rapidly. Widely available, low-cost computer technology now makes it possible both to collect historical data and to institute on-line analysis for newly arriving data. Automated data generation and gathering leads to tremendous amounts of data stored in databases. Although we are flooded with data, we lack knowledge. Data mining [4, 9, 16] is the automated discovery of non-trivial, previously unknown, and potentially useful knowledge embedded in databases. Different kinds of data mining methods and algorithms have been proposed [4, 9], each of which has its own [...] to choose an appropriate one by themselves. This is because the data provided cannot be used directly by data mining algorithms. Since most data mining algorithms can only be applied to some specific data types, the types of data stored in databases restrict the choice of data mining methods.

If certain kinds of knowledge need to be obtained using some data mining algorithms, data types transformation must be done first; this is what we call "the data types fittability problem" for data mining. For the time being, there is no tool that can help users perform this kind of data types transformation. In this paper, we survey and analyze the data types fittability problem for data mining algorithms, and then propose a "data types generalization process" to solve the data types fittability problem for the attributes in relational databases.

The "data types generalization process", consisting of merging and transforming phases, is a procedure for transforming the data types of the attributes contained in relations (tables). In the merging phase, the original data types of the data sources to be mined are first merged into generalized ones. The transforming phase then converts the generalized data types into the target ones for the selected mining algorithm. Using the data type generalization process, the user can select an appropriate mining algorithm based on the goal of the application alone, without considering the data types.

2. Related work

As mentioned above, because many data mining algorithms can only be applied to a restricted range of data types, users may need to perform data types transformation before the selected algorithm is executed. In this paper, we propose a general concept called the "data types generalization process", which provides a procedure for this kind of data types transformation. Data types generalization can be seen as a pre-processing step of data mining. Of course, other pre-processing steps such as data selection, data cleaning, dimension (attribute) reduction, and missing data handling may also need to be performed before running the selected data mining algorithm. In summary, the whole process of data mining is the so-called KDD
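
The two-phase idea described in this excerpt (merge the source data types into generalized ones, then transform them into the type a chosen algorithm expects) might be sketched in Python as follows. This is only an illustration under stated assumptions. The excerpt does not define the generalized types or the conversions, so the two generalized types ("numeric" and "categorical"), the `MERGE_MAP` table, the `merge_type` and `transform` helpers, the equal-width binning, and the sample values are all hypothetical.

```python
# Hypothetical sketch of a two-phase "merge, then transform" pipeline.
# The type names, MERGE_MAP, and both helpers are illustrative
# assumptions, not the definitions used in the paper.

# Merging phase: collapse source column types into a few generalized types.
MERGE_MAP = {
    "INTEGER": "numeric",
    "FLOAT": "numeric",
    "CHAR": "categorical",
    "VARCHAR": "categorical",
    "BOOLEAN": "categorical",
}


def merge_type(source_type):
    """Return the (assumed) generalized type for a source column type."""
    return MERGE_MAP[source_type.upper()]


def transform(values, generalized, target, bins=3):
    """Convert a column from its generalized type to the target type."""
    if generalized == target:
        return list(values)
    if generalized == "numeric" and target == "categorical":
        # Equal-width binning: one simple way to discretize numbers.
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # avoid a zero width for constant columns
        return [f"bin{min(int((v - lo) / width), bins - 1)}" for v in values]
    if generalized == "categorical" and target == "numeric":
        # Integer coding in order of first appearance.
        codes = {}
        return [codes.setdefault(v, len(codes)) for v in values]
    raise ValueError(f"no transformation from {generalized} to {target}")


# Made-up sample columns: an INTEGER attribute and a VARCHAR attribute.
ages = [18, 25, 40, 62]
age_labels = transform(ages, merge_type("INTEGER"), "categorical")
city_codes = transform(["north", "south", "north"], merge_type("varchar"), "numeric")
print(age_labels, city_codes)
```

The point of the split is that only the small set of generalized types needs a conversion to each target type, so a new source type costs one extra entry in the merge table rather than a new conversion per algorithm.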


Download full report
http://ieeexplore.ieeeiel5/6569/17812/00...ber=823352