
jaseela123d · 16-02-2017, 10:51 AM

Although attempts have been made to solve the problem of clustering categorical data via cluster ensembles, with results competitive with conventional algorithms, these techniques unfortunately generate the final data partition from incomplete information. The underlying ensemble-information matrix records only cluster-to-data-point relations, leaving many entries unknown. The paper presents an analysis suggesting that this problem degrades the quality of the clustering result, and it proposes a new link-based approach that improves the conventional matrix by discovering the unknown entries through the similarity between clusters in an ensemble. In particular, an efficient link-based algorithm is proposed for assessing this underlying similarity. Afterwards, to obtain the final clustering result, a graph partitioning technique is applied to a weighted bipartite graph formulated from the refined matrix. Experimental results on multiple real data sets suggest that the proposed link-based method almost always outperforms both conventional clustering algorithms for categorical data and well-known cluster ensemble techniques.
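To make the idea of "discovering unknown entries" concrete, here is a minimal Python sketch. It is not the paper's algorithm: the function name `refine_association_matrix`, the `decay` parameter, the Jaccard edge weights, and the shared-neighbour score are all assumptions chosen to illustrate a link-based similarity. Each base clustering contributes its clusters as columns; a point gets 1 for the clusters it belongs to, and every other entry is filled with a similarity between that cluster and the point's own cluster in the same base clustering.

```python
from itertools import combinations

def refine_association_matrix(base_clusterings, decay=0.9):
    """base_clusterings: list of label lists, one list per base clustering.
    Returns (clusters, refined), where refined[i][c] lies in [0, 1]."""
    n = len(base_clusterings[0])
    # One entry per cluster: (clustering index, label, set of member points).
    clusters = []
    for t, labels in enumerate(base_clusterings):
        for lab in sorted(set(labels)):
            members = {i for i, l in enumerate(labels) if l == lab}
            clusters.append((t, lab, members))
    m = len(clusters)

    # Edge weight between two clusters: Jaccard overlap of their members
    # (an assumed weighting; clusters of the same clustering never overlap).
    w = [[0.0] * m for _ in range(m)]
    for a, b in combinations(range(m), 2):
        inter = len(clusters[a][2] & clusters[b][2])
        if inter:
            union = len(clusters[a][2] | clusters[b][2])
            w[a][b] = w[b][a] = inter / union

    # Link-based score: two clusters are similar if they share neighbours.
    triple = [[0.0] * m for _ in range(m)]
    for a, b in combinations(range(m), 2):
        s = sum(min(w[a][k], w[b][k]) for k in range(m) if k not in (a, b))
        triple[a][b] = triple[b][a] = s
    top = max(max(row) for row in triple) or 1.0  # avoid division by zero

    # Known entries stay 1; unknown entries get a scaled similarity.
    refined = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for c, (t, _, members) in enumerate(clusters):
            if i in members:
                refined[i][c] = 1.0
            else:
                # The cluster that point i belongs to in the same clustering.
                own = next(k for k, (t2, _, mem) in enumerate(clusters)
                           if t2 == t and i in mem)
                refined[i][c] = decay * triple[c][own] / top
    return clusters, refined
```

For example, `refine_association_matrix([[0, 0, 1, 1], [0, 1, 1, 1]])` yields four cluster columns for four points. The rows of `refined` could then serve as edge weights of a point-to-cluster bipartite graph, which is the structure the abstract says is partitioned in the final step.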

Data mining is the practice of automatically searching large data stores to discover patterns and trends that go beyond simple analysis. Data mining models (predictive and descriptive) are built using the following main data mining tasks: Classification, Regression, Clustering, Summarization, Dependency Modelling, and Detection of Changes and Deviations.

Clustering groups the elements of a data set according to their similarity, so that the elements within each cluster are similar, whereas elements of different clusters are dissimilar. It is used to analyse multivariate data, for example to characterise customer groups based on purchasing patterns, classify web documents, group genes and proteins that have similar functionality, or group spatial locations prone to earthquakes based on seismological data.

A cluster ensemble integrates the results of several clustering algorithms using a consensus function to obtain stable results. The idea of combining different clustering results (cluster ensemble or cluster aggregation) emerged as an alternative approach to improving the quality of clustering algorithm results. In this work we have designed and implemented a cluster ensemble approach that uses the divide and conquer technique to handle mixed data sets, i.e., data sets with both numerical and categorical attributes. First, the initial data set is divided into two sub-sets, one numerical and one categorical. Next, clustering algorithms designed for numerical and for categorical data are applied to the respective sub-sets to produce the corresponding clusters. Finally, the clustering results from the previous step are combined into a categorical data set, to which the same categorical clustering algorithm, or any other, can be applied to produce the final output clusters.
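The divide and conquer steps described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the post does not name the algorithms used, so a tiny k-means stands in for the numerical clusterer and a tiny k-modes for the categorical one, both seeded deterministically from the first rows. The names `kmeans`, `kmodes` and `cluster_mixed` are hypothetical.

```python
from collections import Counter

def kmeans(rows, k, iters=20):
    """Tiny k-means on numeric tuples; the first k rows seed the centroids."""
    centers = [list(r) for r in rows[:k]]
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assign each row to the nearest centroid (squared Euclidean distance).
        labels = [min(range(len(centers)),
                      key=lambda c: sum((x - y) ** 2
                                        for x, y in zip(r, centers[c])))
                  for r in rows]
        # Recompute each centroid as the mean of its members.
        for c in range(len(centers)):
            members = [r for r, l in zip(rows, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def kmodes(rows, k, iters=20):
    """Tiny k-modes on categorical tuples with simple-matching distance."""
    modes = []
    for r in rows:  # seed with the first k distinct rows
        if list(r) not in modes:
            modes.append(list(r))
        if len(modes) == k:
            break
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assign each row to the mode it mismatches in the fewest attributes.
        labels = [min(range(len(modes)),
                      key=lambda c: sum(a != b for a, b in zip(r, modes[c])))
                  for r in rows]
        # Recompute each mode as the most frequent value per attribute.
        for c in range(len(modes)):
            members = [r for r, l in zip(rows, labels) if l == c]
            if members:
                modes[c] = [Counter(col).most_common(1)[0][0]
                            for col in zip(*members)]
    return labels

def cluster_mixed(records, numeric_idx, categorical_idx, k):
    """Split mixed records, cluster each part, then cluster the label pairs."""
    numeric = [tuple(float(r[i]) for i in numeric_idx) for r in records]
    categorical = [tuple(r[i] for i in categorical_idx) for r in records]
    num_labels = kmeans(numeric, k)        # step 2a: numerical sub-set
    cat_labels = kmodes(categorical, k)    # step 2b: categorical sub-set
    # Step 3: the two label columns form a new categorical data set.
    combined = list(zip(num_labels, cat_labels))
    return kmodes(combined, k)
```

A call such as `cluster_mixed([(1.0, "a"), (1.1, "a"), (9.0, "b"), (9.2, "b")], [0], [1], k=2)` places the first two records in one cluster and the last two in another. Treating the intermediate labels as categorical values is what lets a single categorical algorithm produce the final partition.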
