seminar class · 13-05-2011, 12:00 PM

Abstract: Spatial data mining algorithms heavily depend on the efficient processing of neighborhood
relations since the neighbors of many objects have to be investigated in a single run of a typical
algorithm. Therefore, providing general concepts for neighborhood relations as well as an efficient
implementation of these concepts will allow a tight integration of spatial data mining algorithms
with a spatial database management system. This will speed up both the development and
the execution of spatial data mining algorithms. In this paper, we define neighborhood graphs and
paths and a small set of database primitives for their manipulation. We show that typical spatial
data mining algorithms are well supported by the proposed basic operations. For finding significant
spatial patterns, only certain classes of paths “leading away” from a starting object are relevant.
We discuss filters that admit only such neighborhood paths, which significantly reduces the
search space for spatial data mining algorithms. Furthermore, we introduce neighborhood indices
to speed up the processing of our database primitives. We implemented the database primitives on
top of a commercial spatial database management system. The effectiveness and efficiency of the
proposed approach were evaluated by using an analytical cost model and an extensive experimental
study on a geographic database.
1 Introduction
The computerization of many business and government transactions and the advances in scientific
data collection tools provide us with a huge and continuously increasing amount of data. This
explosive growth of databases has far outpaced the human ability to interpret this data, creating an
urgent need for new techniques and tools that support the human in transforming the data into useful
information and knowledge. Knowledge discovery in databases (KDD) has been defined as the
non-trivial process of discovering valid, novel, potentially useful, and ultimately understandable
patterns from data [FPS 96]. The process of KDD is interactive and iterative, involving several
steps such as the following ones:
• Selection: selecting a subset of all attributes and a subset of all data from which the knowledge
should be discovered.
• Data reduction: using dimensionality reduction or transformation techniques to reduce the effective
number of attributes to be considered.
• Data mining: the application of appropriate algorithms that, under acceptable computational efficiency
limitations, produce a particular enumeration of patterns over the data.
• Evaluation: interpreting and evaluating the discovered patterns with respect to their usefulness
in the given application.
[submitted to Special Issue on: "Integration of Data Mining with Database]
Spatial Database Systems (SDBS) (see [Gue 94] for an overview) are database systems for the
management of spatial data. To find implicit regularities, rules or patterns hidden in large spatial
databases, e.g. for geo-marketing, traffic control or environmental studies, spatial data mining algorithms
are very important (see [KHA 96] for an overview of spatial data mining).
Most existing data mining algorithms run on separate and specially prepared files, but integrating
them with a database management system (DBMS) has the following advantages. Redundant
storage and potential inconsistencies can be avoided. Furthermore, commercial database systems
offer various index structures to support different types of database queries. This functionality can
be used without extra implementation effort to speed up the execution of data mining algorithms
(which, in general, have to perform many database queries). As with the relational standard language
SQL, the use of standard primitives will speed up the development of new data mining algorithms
and will also make them more portable.
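To illustrate the point about reusing the index structures of a DBMS, here is a minimal sketch in which a neighborhood-style query is delegated to the database instead of being run over a separately prepared file. It is not the system used in the paper: SQLite stands in for the commercial spatial DBMS, an ordinary B-tree index stands in for a spatial index, and the table name objects, the index name objects_xy, the function neighbors_within and the sample points are all hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for a spatial DBMS (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE objects (id TEXT PRIMARY KEY, x REAL, y REAL)")
# An ordinary B-tree index; a spatial DBMS would offer a spatial index instead.
conn.execute("CREATE INDEX objects_xy ON objects (x, y)")
# Hypothetical sample data, not taken from the paper.
conn.executemany(
    "INSERT INTO objects VALUES (?, ?, ?)",
    [("a", 0, 0), ("b", 1, 0), ("c", 5, 5), ("d", 1, 1)],
)


def neighbors_within(conn, obj_id, radius):
    """Return the ids of all other objects within `radius` of `obj_id`.

    The bounding-box conditions can be served by the index; the exact
    Euclidean distance check then refines the candidates.
    """
    x, y = conn.execute(
        "SELECT x, y FROM objects WHERE id = ?", (obj_id,)
    ).fetchone()
    rows = conn.execute(
        """
        SELECT id FROM objects
        WHERE x BETWEEN ? AND ? AND y BETWEEN ? AND ?
          AND id <> ?
          AND (x - ?) * (x - ?) + (y - ?) * (y - ?) <= ? * ?
        ORDER BY id
        """,
        (x - radius, x + radius, y - radius, y + radius,
         obj_id, x, x, y, y, radius, radius),
    )
    return [row[0] for row in rows]


print(neighbors_within(conn, "a", 1.5))
```

With the sample data above, the call returns the objects b and d. The mining code itself contains no indexing logic; whether the index is actually used is left to the query planner of the database.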
In this paper, we introduce a set of database primitives for mining in spatial databases. [AIS 93]
follows a similar approach for mining in relational databases. Our database primitives are based on
the concept of neighborhood relations since attributes of the neighbors of some object of interest
may have an influence on the object itself. The proposed primitives are sufficient to express most
of the algorithms for spatial data mining from the literature. We present techniques for efficiently
supporting these primitives by a DBMS.
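The idea of neighborhood graphs, neighborhood paths and a filter for paths "leading away" from a starting object can be sketched as follows. This excerpt does not give the paper's formal definitions, so everything here is an assumption made for illustration: the neighbor relation is purely distance-based, the filter keeps only extensions that move strictly farther from the start, and the names neighborhood_graph, neighbors and extend_paths are hypothetical rather than the paper's primitives.

```python
from itertools import combinations
from math import dist

# Hypothetical toy data: object id -> (x, y) position. Not from the paper.
POINTS = {"a": (0, 0), "b": (1, 0), "c": (2, 0), "d": (1, 1)}


def neighborhood_graph(points, max_dist):
    """Build an adjacency map; two objects are neighbors if they lie within
    max_dist of each other (an assumed, purely distance-based relation)."""
    graph = {obj: set() for obj in points}
    for p, q in combinations(points, 2):
        if dist(points[p], points[q]) <= max_dist:
            graph[p].add(q)
            graph[q].add(p)
    return graph


def neighbors(graph, obj):
    """Assumed primitive: return the neighbors of obj in sorted order."""
    return sorted(graph[obj])


def extend_paths(graph, points, paths):
    """Extend each path by one neighbor, keeping only extensions that move
    strictly farther from the path's starting object (assumed filter)."""
    extended = []
    for path in paths:
        start, last = points[path[0]], points[path[-1]]
        for nxt in neighbors(graph, path[-1]):
            if nxt not in path and dist(start, points[nxt]) > dist(start, last):
                extended.append(path + [nxt])
    return extended


graph = neighborhood_graph(POINTS, max_dist=1.0)
one_step = extend_paths(graph, POINTS, [["a"]])
two_step = extend_paths(graph, POINTS, one_step)
print(neighbors(graph, "b"))
print(two_step)
```

On this toy data, b has the neighbors a, c and d, and the only paths of length two starting at a are a, b, c and a, b, d. The filter discards any step that returns toward the start, which is how such a restriction shrinks the set of paths an algorithm has to examine.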
The rest of the paper is organized as follows. Section 2 introduces our database primitives for
spatial data mining. In Section 3, we review spatial data mining algorithms and demonstrate how
they can be expressed by using the proposed primitives. Section 4 presents methods of efficiently
supporting our database primitives by existing DBMSs. Section 5 summarizes the contributions
and discusses several issues for future research.

Download full report
http://dbs.informatik.uni-muenchen.de/~e...mitted.pdf
