A Link-Based Cluster Ensemble Approach for Categorical Data Clustering
#1

Abstract

Although attempts have been made to solve the problem of clustering categorical data via cluster ensembles, with the results being competitive with conventional algorithms, it is observed that these techniques unfortunately generate a final data partition based on incomplete information. The underlying ensemble-information matrix presents only cluster-data point relations, with many entries being left unknown. The paper presents an analysis that suggests this problem degrades the quality of the clustering result, and it presents a new link-based approach, which improves the conventional matrix by discovering unknown entries through similarity between clusters in an ensemble. In particular, an efficient link-based algorithm is proposed for the underlying similarity assessment. Afterward, to obtain the final clustering result, a graph partitioning technique is applied to a weighted bipartite graph that is formulated from the refined matrix. Experimental results on multiple real data sets suggest that the proposed link-based method almost always outperforms both conventional clustering algorithms for categorical data and well-known cluster
#2

Although attempts have been made to solve the problem of clustering categorical data with cluster ensembles, and the results are competitive with conventional algorithms, these techniques unfortunately generate the final data partition from incomplete information. The underlying ensemble-information matrix records only the relations between clusters and data points, and many of its entries remain unknown. The article presents an analysis suggesting that this problem degrades the quality of the clustering result, and it presents a new link-based approach that improves the conventional matrix by discovering the unknown entries through the similarity between clusters in an ensemble. In particular, an efficient link-based algorithm is proposed for the underlying similarity assessment. Subsequently, to obtain the final clustering result, a graph partitioning technique is applied to a weighted bipartite graph that is formulated from the refined matrix. Experimental results on multiple real data sets suggest that the proposed link-based method almost always outperforms both conventional clustering algorithms for categorical data and well-known cluster ensemble techniques.
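To make the idea of the refined matrix concrete, here is a minimal Python sketch. It is not the algorithm from the paper: the similarity score below (weight shared through common neighbouring clusters, scaled by a `decay` factor) is only an assumed stand-in for the link-based measure, and all function names are hypothetical. It builds the cluster-data point matrix from an ensemble of base clusterings, keeps 1 where a point belongs to a cluster, and fills each remaining entry with the similarity between that cluster and the cluster the point actually belongs to in the same base clustering.

```python
from itertools import combinations


def clusters_of(ensemble):
    """Map (clustering index, label) -> set of member point indices."""
    members = {}
    for m, labels in enumerate(ensemble):
        for i, lab in enumerate(labels):
            members.setdefault((m, lab), set()).add(i)
    return members


def jaccard(a, b):
    """Overlap of two non-empty member sets."""
    return len(a & b) / len(a | b)


def link_similarity(members, decay=0.9):
    """Illustrative link-based score (an assumption, not the paper's measure):
    two clusters are similar when they share strongly connected neighbours."""
    keys = sorted(members)
    # Edge weight between two clusters = overlap of their member sets.
    w = {k: {} for k in keys}
    for a, b in combinations(keys, 2):
        j = jaccard(members[a], members[b])
        if j > 0:
            w[a][b] = j
            w[b][a] = j
    # Score each pair by the weight it shares through common neighbours.
    raw = {}
    for a, b in combinations(keys, 2):
        common = set(w[a]) & set(w[b])
        raw[(a, b)] = sum(min(w[a][k], w[b][k]) for k in common)
    top = max(raw.values(), default=0.0)
    sim = {}
    for (a, b), v in raw.items():
        s = decay * v / top if top > 0 else 0.0
        sim[(a, b)] = sim[(b, a)] = s
    return sim


def refined_matrix(ensemble, decay=0.9):
    """Rows = data points, columns = all clusters in the ensemble."""
    members = clusters_of(ensemble)
    keys = sorted(members)
    sim = link_similarity(members, decay)
    rows = []
    for i in range(len(ensemble[0])):
        row = []
        for m, lab in keys:
            own = (m, ensemble[m][i])  # cluster of point i in clustering m
            row.append(1.0 if own == (m, lab) else sim[(own, (m, lab))])
        rows.append(row)
    return keys, rows


# Two base clusterings of six data points (made-up labels).
example = [[0, 0, 1, 1, 2, 2], [0, 0, 0, 1, 1, 1]]
columns, matrix = refined_matrix(example)
print(columns)
print(matrix[0])
```

Each entry of the resulting matrix can then be read as the weight of an edge between a data point and a cluster in the bipartite graph; the graph partitioning step itself is left out of this sketch.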

Data mining is the practice of automatically searching large data stores to discover patterns and trends that go beyond simple analysis. Data mining models (prediction and description) are achieved using the following main data mining tasks: Classification, Regression, Clustering, Summarization and Dependency Modelling, and Detection of Changes and Deviations. Clustering groups the elements of a data set according to their similarity, in such a way that the elements of each cluster are similar, whereas the elements of different clusters are dissimilar. It is used to analyse or process multivariate data, for example to characterise customer groups based on purchasing patterns, classify web documents, group genes and proteins that have similar functionality, or group spatial locations prone to earthquakes based on seismological data.

A cluster ensemble integrates the results of several clustering algorithms using a consensus function to obtain stable results. The idea of combining different clustering results (cluster ensemble or cluster aggregation) emerged as an alternative approach to improve the quality of clustering algorithm results. In this work we have designed and implemented a cluster ensemble approach that uses the divide and conquer technique to handle mixed data sets, i.e., data sets with both numerical and categorical attributes. First, the initial data set is divided into sub-sets of data, one numerical and one categorical. Next, clustering algorithms designed for numeric and categorical data sets are used to produce the corresponding clusters. Finally, the clustering results from the previous step are combined as a categorical data set, on which the same categorical clustering algorithm or any other one can be used to produce the final output clusters.
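The three divide and conquer steps can be sketched in Python as follows. This is only an illustration under assumptions of my own: the post does not name the base algorithms, so k-means is assumed for the numeric part and k-modes for the categorical part, with deterministic farthest-first seeding; the function names and the sample records are hypothetical.

```python
from collections import Counter


def sq_euclid(a, b):
    """Squared Euclidean distance for numeric tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def hamming(a, b):
    """Number of attributes on which two categorical tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def mean(rows):
    """Per-attribute mean (centre update for k-means)."""
    return tuple(sum(col) / len(rows) for col in zip(*rows))


def mode(rows):
    """Per-attribute most frequent value (centre update for k-modes)."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def lloyd(rows, k, dist, centre, iters=20):
    """Generic assign/update loop: k-means with (sq_euclid, mean),
    k-modes with (hamming, mode). Seeds are chosen farthest-first."""
    centres = [rows[0]]
    while len(centres) < k:
        centres.append(max(rows, key=lambda r: min(dist(r, c) for c in centres)))
    labels = None
    for _ in range(iters):
        new = [min(range(k), key=lambda j: dist(r, centres[j])) for r in rows]
        if new == labels:  # assignments stopped changing
            break
        labels = new
        for j in range(k):
            group = [r for r, lab in zip(rows, labels) if lab == j]
            if group:
                centres[j] = centre(group)
    return labels


def cluster_mixed(records, numeric_idx, categorical_idx, k):
    # Step 1: divide the data set into numeric and categorical sub-sets.
    numeric = [tuple(r[i] for i in numeric_idx) for r in records]
    categorical = [tuple(r[i] for i in categorical_idx) for r in records]
    # Step 2: cluster each sub-set with an algorithm suited to its type.
    num_labels = lloyd(numeric, k, sq_euclid, mean)
    cat_labels = lloyd(categorical, k, hamming, mode)
    # Step 3: treat the two label columns as a new categorical data set
    # and cluster it again to obtain the final output clusters.
    combined = list(zip(num_labels, cat_labels))
    return lloyd(combined, k, hamming, mode)


# Made-up records: two numeric attributes followed by two categorical ones.
records = [
    (1.0, 1.1, "red", "s"), (0.9, 1.0, "red", "s"), (1.2, 0.8, "red", "m"),
    (8.0, 8.2, "blue", "l"), (7.9, 8.1, "blue", "l"), (8.3, 7.7, "blue", "m"),
]
print(cluster_mixed(records, numeric_idx=[0, 1], categorical_idx=[2, 3], k=2))
```

Passing the distance and the centre update as arguments keeps one loop for both data types, which mirrors the remark that the final step can reuse the same categorical algorithm or any other one.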
