DATA MINING AND WAREHOUSING
Presented By
S.PAVANI

ABSTRACT
The words “DATA WAREHOUSE & DATA MINING” are of interest because in today’s world the competitive edge comes less from optimization and more from the use of the information that operational systems have been collecting over the years.
Data warehousing has quickly evolved into a unique and popular business application class. Early builders of data warehouses already consider their systems to be key components of their IT strategy and architecture.
In reviewing the development of data warehousing, we need to begin with a review of what had been done with the data before the evolution of data warehouses.
This paper first places its primary emphasis on the different types of data warehouses, the architecture of data warehouses and the analysis processes of data warehouses. The next section discusses ways to build data warehouses, along with the requirements they impose. Another key component, ETL tools, is also covered.
No discussion of data warehousing systems is complete without a review of “DATA MINING”. That section explores the processes and workings of data mining, along with the different approaches and components commonly found in it.
Further evolution of the hardware and software technology will also continue to greatly influence the capabilities that are built into data warehouses. Data warehousing systems have become a key component of information technology architecture. A flexible enterprise data warehouse strategy can yield significant benefits for a long period.
The main ideas of data warehouses and data mining are presented in this paper with the help of several diagrams and examples. In conclusion, it shows how “DATA WAREHOUSES & DATA MINING” can be used in the near future.
INTRODUCTION:
"A data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management's decision-making process".
A data warehouse is a relational/multidimensional database that is designed for query and analysis rather than transaction processing. It usually contains historical data that is derived from transaction data. It separates analysis workload from transaction workload and enables a business to consolidate data from several sources.
In addition to a relational/multidimensional database, a data warehouse environment often consists of an ETL solution, an OLAP engine, client analysis tools, and other applications that manage the process of gathering data and delivering it to business users.
Data warehouses can be classified into three types:
• Enterprise data warehouse: An enterprise data warehouse provides a central database for decision support throughout the enterprise.
• Operational data store (ODS): This has a broad, enterprise-wide scope, but unlike a true enterprise data warehouse, its data is refreshed in near real time and used for routine business activity.
• Data mart: A data mart is a subset of a data warehouse that supports a particular region, business unit or business function.
Data warehouses and data marts are built on dimensional data modeling where fact tables are connected with dimension tables. This is most useful for users to access data since a database can be visualized as a cube of several dimensions. A data warehouse provides an opportunity for slicing and dicing that cube along each of its dimensions.
A data mart is designed for a particular line of business, such as sales, marketing, or finance. In a dependent data mart, data is derived from an enterprise-wide data warehouse. In an independent data mart, data is collected directly from the sources.
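The fact/dimension layout and the slicing and dicing of the cube described above can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration under stated assumptions: the table names (dim_product, dim_location, fact_sales), their columns and the sample rows are hypothetical and do not come from any real system.

```python
import sqlite3


def build_star_schema():
    """Create a tiny in-memory star schema: one fact table, two dimension tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_product  (product_id INTEGER PRIMARY KEY, product_name TEXT);
        CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, country TEXT);
        CREATE TABLE fact_sales (
            product_id  INTEGER REFERENCES dim_product(product_id),
            location_id INTEGER REFERENCES dim_location(location_id),
            sale_year   INTEGER,
            amount      REAL
        );
    """)
    # Hypothetical sample rows, for illustration only.
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                     [(1, "Savings"), (2, "Loans")])
    conn.executemany("INSERT INTO dim_location VALUES (?, ?)",
                     [(1, "India"), (2, "France")])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", [
        (1, 1, 2004, 100.0),
        (1, 2, 2005, 150.0),
        (2, 1, 2005, 200.0),
        (2, 2, 2005, 50.0),
    ])
    return conn


def slice_by_year(conn, year):
    """Slice: fix the time dimension, then total the measure per product."""
    return conn.execute("""
        SELECT p.product_name, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        WHERE f.sale_year = ?
        GROUP BY p.product_name
        ORDER BY p.product_name
    """, (year,)).fetchall()


def dice(conn, year, country):
    """Dice: restrict two dimensions at once and total the measure."""
    return conn.execute("""
        SELECT SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_location AS l ON l.location_id = f.location_id
        WHERE f.sale_year = ? AND l.country = ?
    """, (year, country)).fetchone()[0]
```

Calling slice_by_year(conn, 2005) totals the 2005 facts per product, while dice(conn, 2005, "India") narrows the same cube along both the time and location dimensions.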
Over the years, application designers in each branch have made their own decisions about how an application and its database should be built. As a result, source systems differ in naming conventions, variable measurements, encoding structures, and physical attributes of data.
Consider a bank that has several branches in several countries and millions of customers, and whose lines of business are savings and loans. The following example shows how data is integrated from source systems into target systems.
EXAMPLE OF SOURCE DATA:
Source System    Attribute Name             Column Name                Data type     Values
Source System 1  Customer Application Date  CUSTOMER_APPLICATION_DATE  NUMERIC(8,0)  11012005
Source System 2  Customer Application Date  CUST_APPLICATION_DATE      DATE          11012005
Source System 3  Application Date           APPLICATION_DATE           DATE          01NOV2005
In the aforementioned example, the attribute name, column name, data type and value format differ from one source system to another. This inconsistency can be avoided by integrating the data into a data warehouse with good standards.
EXAMPLE OF TARGET DATA (DATA WAREHOUSE)
Target System  Attribute Name             Column Name                Data type  Values
Record #1      Customer Application Date  CUSTOMER_APPLICATION_DATE  DATE       01112005
Record #2      Customer Application Date  CUSTOMER_APPLICATION_DATE  DATE       01112005
Record #3      Customer Application Date  CUSTOMER_APPLICATION_DATE  DATE       01112005
In the above example of target data, attribute names, column names, and data types are consistent throughout the target system. This is how data from various source systems is integrated and accurately stored into the data warehouse.
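The integration step in this example can be sketched in Python as follows. This is a hedged illustration, not the method of any particular ETL tool. It assumes that source systems 1 and 2 store the date as MMDDYYYY and that the warehouse displays it as DDMMYYYY; both assumptions are inferred from the sample values above (11012005, 01NOV2005 and 01112005). The function names are hypothetical.

```python
from datetime import date

# Month abbreviations as used by source system 3 (e.g. 01NOV2005).
MONTHS = {
    "JAN": 1, "FEB": 2, "MAR": 3, "APR": 4, "MAY": 5, "JUN": 6,
    "JUL": 7, "AUG": 8, "SEP": 9, "OCT": 10, "NOV": 11, "DEC": 12,
}


def parse_source_date(value, source_system):
    """Convert a source-specific application date into one standard date value.

    Assumption: systems 1 and 2 use MMDDYYYY, system 3 uses DDMONYYYY.
    """
    text = str(value)
    if source_system in (1, 2):
        # A NUMERIC(8,0) column can drop a leading zero, so pad back to 8 digits.
        text = text.zfill(8)
        return date(int(text[4:8]), int(text[0:2]), int(text[2:4]))
    if source_system == 3:
        return date(int(text[5:9]), MONTHS[text[2:5].upper()], int(text[0:2]))
    raise ValueError(f"unknown source system: {source_system}")


def format_for_warehouse(day):
    """Render the standard date in the DDMMYYYY form shown in the target table."""
    return day.strftime("%d%m%Y")


def to_warehouse_row(value, source_system):
    """Store every source value under the single standard column name."""
    return {"CUSTOMER_APPLICATION_DATE": parse_source_date(value, source_system)}
```

Whatever the source column was called, each record ends up under CUSTOMER_APPLICATION_DATE with a true date value, so the three source rows become indistinguishable in the target.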
The primary concept of data warehousing is that the data stored for business analysis can most effectively be accessed by separating it from the data in the operational systems. Many of the reasons for this separation have evolved over the years. In the past, legacy systems archived data onto tapes as it became inactive and many analysis reports ran from these tapes or mirror data sources to minimize the performance impact on the operational systems. Data warehousing systems are most successful when data can be combined from more than one operational system. When the data needs to be brought together from more than one source application, it is natural that this integration be done at a place independent of the source applications. Before the evolution of structured data warehouses, analysts in many instances would combine data extracted from more than one operational system into a single spreadsheet or a database.
The data warehouse model needs to be extensible and structured so that data from different applications can be added once a business case has been made for that data.
The architecture of the data warehouse and the data warehouse model greatly impact the success of the project.
Fig:1 ARCHITECTURE OF DATA WAREHOUSE:
The data warehouse architecture has several stages:
• Business modeling: Many organizations justify building data warehouses as an “act of faith”. This first stage is necessary to identify the projected business benefits that should be derived from using the data warehouse.
• Data modeling: Data models are developed for the source systems, and dimensional data models are developed for the data warehouse.
• Data sources: In the third stage, the data source systems are brought together.
• ETL process: In the fourth stage, the ETL process is developed and data is extracted, transformed and loaded.
• Target data: The fifth stage is the target data, which takes the form of several data marts.
• Business intelligence: The last stage generates business reports.
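The later stages listed above (data sources, ETL, target data marts, business reports) can be sketched as a small Python pipeline. This is a simplified illustration under stated assumptions: the two in-memory sources, their column names and the sample figures are all hypothetical stand-ins for real source systems.

```python
from collections import defaultdict

# Data sources: hypothetical source systems, each with its own naming convention.
SOURCE_SAVINGS = [{"cust": "A", "bal": "100.50"}, {"cust": "B", "bal": "20.00"}]
SOURCE_LOANS = [{"customer_id": "A", "loan_amount": 500}]


def extract():
    """Extract: pull raw rows from every source, tagging each with its origin."""
    for row in SOURCE_SAVINGS:
        yield "savings", row
    for row in SOURCE_LOANS:
        yield "loans", row


def transform(tagged_rows):
    """Transform: map each source's columns onto one standard record layout."""
    for origin, row in tagged_rows:
        if origin == "savings":
            yield {"customer": row["cust"], "line": origin,
                   "amount": float(row["bal"])}
        else:
            yield {"customer": row["customer_id"], "line": origin,
                   "amount": float(row["loan_amount"])}


def load(records):
    """Load: place records into one data mart per line of business."""
    marts = defaultdict(list)
    for record in records:
        marts[record["line"]].append(record)
    return dict(marts)


def report(marts):
    """Business intelligence: total amount held in each data mart."""
    return {line: sum(r["amount"] for r in rows) for line, rows in marts.items()}
```

Chaining the stages as report(load(transform(extract()))) mirrors the flow of the architecture: each function consumes only the output of the stage before it.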
Data marts are generally described as subsets of a data warehouse. For example, for an organization selling products throughout the world, the four major dimensions are product, location, time and organization.
Interface with other data warehouses:
The data warehouse system is likely to be interfaced with other applications that use it as the source of operational system data. A data warehouse may feed data to other data warehouses or smaller data warehouses called data marts.
The operational system interfaces with the data warehouse often become increasingly stable and powerful. As the data warehouse becomes a reliable source of data that has been consistently moved from the operational systems, many downstream applications find that a single interface with the data warehouse is much easier and more functional than multiple interfaces with the operational applications. The data warehouse can be a better single and consistent source for many kinds of data than the operational systems. It is, however, important to remember that much of the operational state information is not carried over to the data warehouse. Thus, the data warehouse cannot be the source for all operational system interfaces.
Fig: 2 The Analysis processes of Data warehouse
Figure 2 illustrates the analysis processes that run against a data warehouse. Although the majority of the activity against today’s data warehouses is simple reporting and analysis, the sophistication of analysis at the high end continues to increase rapidly. Analysis run against the data warehouse is simpler and cheaper than analysis run through the old methods. This simplicity continues to be a main attraction of data warehousing systems.
Four ways to build a data warehouse:
Although we have been building data warehouses since the early 1990s, there is still a great deal of confusion about the similarities and differences among the four major architectures: “top-down, bottom-up, hybrid and federated.” As a result, some firms fail to adopt a clear vision of how their data warehousing environments can and should evolve. Others, paralyzed by confusion or fear of deviating from prescribed tenets for success, cling too rigidly to one approach, undermining their ability to respond to new or unexpected situations. Ideally, organizations need to borrow concepts and tactics from each approach to create an environment that meets their needs.
Top-down vs. bottom-up:
The two most influential approaches are championed by industry heavyweights Bill Inmon and Ralph Kimball, both prolific authors and consultants in the data warehousing field.
Inmon, who is credited with coining the term "data warehousing" in the early 1990s, advocates a top-down approach in which organizations first build a data warehouse followed by data marts.
The data warehouse holds atomic or transaction data that is extracted from one or more source systems and integrated within a normalized, enterprise data model. From there, the data is “summarized”, "dimensionalized" and “distributed” to one or more "dependent" data marts. These data marts are "dependent" because they derive all their data from the centralized data warehouse. A data warehouse surrounded by dependent data marts is often called a "hub-and-spoke" architecture.
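The derivation of a dependent data mart from the central warehouse can be sketched in Python as follows. This is a minimal illustration only: the atomic rows, their field names and the grouping by branch and month are hypothetical choices made for the example.

```python
from collections import defaultdict

# Atomic rows as they might sit in the central warehouse (hypothetical sample data).
WAREHOUSE_TRANSACTIONS = [
    {"branch": "Paris", "line": "loans", "month": "2005-11", "amount": 300.0},
    {"branch": "Paris", "line": "loans", "month": "2005-11", "amount": 200.0},
    {"branch": "Delhi", "line": "loans", "month": "2005-11", "amount": 100.0},
    {"branch": "Delhi", "line": "savings", "month": "2005-11", "amount": 75.0},
]


def build_dependent_mart(transactions, line_of_business):
    """Summarize atomic warehouse rows for one line of business.

    The mart is "dependent" because its only input is the warehouse data;
    it never reads from the source systems directly.
    """
    totals = defaultdict(float)
    for row in transactions:
        if row["line"] == line_of_business:
            # Dimensionalize by (branch, month) and summarize the amount.
            totals[(row["branch"], row["month"])] += row["amount"]
    return dict(totals)
```

Each spoke of the hub-and-spoke layout would be built by a call such as build_dependent_mart(WAREHOUSE_TRANSACTIONS, "loans"), with the warehouse itself left unchanged.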
Kimball, on the other hand, advocates a bottom-up approach because it starts and ends with data marts, negating the need for a physical data warehouse altogether. Without a data warehouse, the data marts in a bottom-up approach contain all the data -- both atomic and summary -- users may want or need, now or in the future. Data is modeled in a star schema design to optimize usability and query performance. Each data mart (whether it is logically or physically deployed) builds on the next, reusing dimensions and facts so that users can query across data marts, if they want to, to obtain a single version of the truth.