
smart paper boy · 30-08-2011, 09:38 AM

Abstract
Association rules represent an important class of knowledge that can be discovered from data warehouses. Current research efforts focus on inventing efficient ways of discovering these rules from large databases. As databases grow, the discovered rules need to be verified and new rules need to be added to the knowledge base. Since mining afresh every time the database grows is inefficient, algorithms for incremental mining are being investigated. Their primary aim is to avoid or minimize scans of the older database by using the intermediate data constructed during the earlier mining. In this paper, we present one such algorithm. We make use of the large and candidate itemsets and their counts in the older database, and scan the increment to find which rules continue to prevail and which ones fail in the merged database. We are also able to find new rules for the incremental and updated database. The algorithm is adaptive in nature: it infers the nature of the increment and, where possible, avoids multiple scans of the incremental database altogether. Another salient feature is that it does not need multiple scans of the older database. We also report some results on its performance on synthetic data.
1. Introduction
Data mining, also referred to as knowledge discovery in databases, is the nontrivial extraction of implicit, previously unknown, and potentially useful information from data in a database. Data mining has recently attracted considerable attention from the database user community, as users realize that this information, locked inside large organizational databases built over many years, can provide knowledge for enhancing their organization's effectiveness and competitiveness. The process of data mining provides knowledge in the form of rules and patterns based on statistical analysis of data. The process is challenging because the source databases from which the knowledge is extracted are large and growing. The knowledge itself is time-varying, as some rules and patterns may hold now but not in the future, or vice versa. The mining techniques must scale well to handle very large and growing databases, and should permit efficient maintenance of the extracted knowledge.
One of the most studied data mining problems is mining for association rules. Given a collection of items and a set of records (i.e., transactions), each of which contains some number of items from the given collection, association rules indicate affinities that exist among the items. These affinities can be expressed by rules such as "62% of all the records that contain items A, B and C also contain items D and E." The specific percentage of occurrences is called the confidence factor of the rule. A database may yield a very large number of association rules. Much work has been done in the field of finding association rules [1] [2] [3] [8] [6]. These efforts are directed at devising algorithms to mine the rules efficiently in large databases. They commonly require multiple scans of the given database.
As databases grow over time, mining must be undertaken again, both to maintain (i.e., verify) the rules discovered earlier and to discover new rules. However, applying the proposed algorithms to the updated database (i.e., the older and the incremental database together) may be too costly. Researchers are now investigating ways in which rule maintenance can be done by processing the incremental part separately and scanning the older database only if necessary. To achieve this, incremental mining algorithms generally plan to use intermediate information collected during the earlier mining process.
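To make the confidence factor concrete, here is a minimal sketch in Python. The function names and the five sample transactions are made up for illustration and do not come from the paper; the sketch only shows how the confidence of a rule such as "A, B, C implies D, E" could be computed from transaction counts.

```python
def support_count(transactions, itemset):
    """Count the transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t)


def confidence(transactions, antecedent, consequent):
    """Fraction of the transactions containing `antecedent` that also contain `consequent`."""
    antecedent = frozenset(antecedent)
    both = antecedent | frozenset(consequent)
    n_antecedent = support_count(transactions, antecedent)
    if n_antecedent == 0:
        return 0.0  # the rule never applies, so report zero confidence
    return support_count(transactions, both) / n_antecedent


# Hypothetical toy database: each transaction is a set of single-letter items.
transactions = [
    frozenset("ABCDE"),
    frozenset("ABCDE"),
    frozenset("ABCD"),
    frozenset("ABC"),
    frozenset("BD"),
]

# Four transactions contain A, B and C; two of those also contain D and E.
rule_conf = confidence(transactions, "ABC", "DE")
print(f"confidence(ABC -> DE) = {rule_conf:.0%}")
```

Every call to `support_count` here is a full pass over the database, which is why algorithms that need many such passes become expensive as the database grows.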
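The idea of reusing counts saved from the earlier mining can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm presented in the paper: the function `update_itemsets`, the stored counts, and the sample increment are all hypothetical. It assumes the earlier run kept the count of every large and candidate itemset, so that only the increment has to be scanned for those itemsets.

```python
def update_itemsets(old_counts, old_size, increment, min_support):
    """Classify the tracked itemsets after merging an increment into the database.

    old_counts  : dict mapping frozenset itemsets to their counts in the older
                  database (assumed to have been saved by the earlier mining run)
    old_size    : number of transactions in the older database
    increment   : list of new transactions, each a frozenset of items
    min_support : minimum support as a fraction of all transactions
    """
    old_threshold = min_support * old_size
    new_threshold = min_support * (old_size + len(increment))
    result = {"prevail": [], "fail": [], "new": []}

    for itemset, old_count in old_counts.items():
        # Only the increment is scanned; the older database is not touched.
        inc_count = sum(1 for t in increment if itemset <= t)
        was_large = old_count >= old_threshold
        is_large = old_count + inc_count >= new_threshold
        if was_large and is_large:
            result["prevail"].append(itemset)
        elif was_large:
            result["fail"].append(itemset)
        elif is_large:
            result["new"].append(itemset)
    return result


# Hypothetical state saved from mining an older database of 10 transactions.
old_counts = {
    frozenset("AB"): 6,  # large at 50% support
    frozenset("AC"): 5,  # large at 50% support
    frozenset("BC"): 4,  # candidate that was counted but not large
}
increment = [frozenset("ABC"), frozenset("BC"), frozenset("BC"), frozenset("AB")]

result = update_itemsets(old_counts, old_size=10, increment=increment, min_support=0.5)
for status, itemsets in result.items():
    print(status, [sorted(s) for s in itemsets])
```

An itemset that appears only in the increment and was never counted before is not handled by this sketch; deciding whether it is large in the merged database would require its count in the older database, which is the kind of rescan incremental algorithms try to avoid or limit.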

Download full report
http://googleurl?sa=t&source=web&cd=2&ve...F15304.pdf&ei=_mFcTuHnO4bsrAfwppSlDw&usg=AFQjCNHloi7ghY1gAqHdriBuM9YIgLLHDQ
