WEB MINING
INTRODUCTION
What is web mining?

Web mining is the extraction of interesting and potentially useful patterns and implicit information from artifacts or activity related to the World Wide Web.
Why web usage mining?
E-commerce
E-business
How to perform web usage mining?

Web server log files were used initially by webmasters and system administrators to answer questions such as:
1. How much traffic is the site getting?
2. How many requests fail?
3. What kinds of errors are being generated?
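The three questions above can be answered with a few lines of script. The following is only a minimal sketch: it assumes the server writes its access log in the Common Log Format, and the sample lines, the regular expression and the function name summarize are made up for illustration.

```python
import re
from collections import Counter

# Hypothetical sample lines in Common Log Format (not taken from a real server).
SAMPLE_LOG = """\
192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326
192.0.2.2 - - [10/Oct/2000:13:56:01 -0700] "GET /missing.html HTTP/1.0" 404 209
192.0.2.1 - - [10/Oct/2000:13:57:12 -0700] "POST /cart HTTP/1.0" 500 512
192.0.2.3 - - [10/Oct/2000:13:58:40 -0700] "GET /index.html HTTP/1.0" 200 2326
"""

# host, identity, user, [time], "request", status, size
LINE_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)

def summarize(log_text):
    """Count total requests, failed requests, and failures per status code."""
    total = 0
    errors = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        total += 1
        status = int(match.group("status"))
        if status >= 400:  # 4xx and 5xx responses are treated as failures
            errors[status] += 1
    return {"total": total, "failed": sum(errors.values()), "errors": dict(errors)}

summary = summarize(SAMPLE_LOG)
print(summary)
```

Treating every status of 400 or above as a failure is a simplification; a real site may want to separate client errors from server errors.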
TAXONOMY OF WEB MINING
Web Content Mining:
- Web crawler: A crawler searches the Web pages. The problems it faces are scale, variety, duplicates, and domain name resolution.
Types of crawler:
1. Traditional crawler
2. Periodic crawler
3. Incremental crawler
4. Focused crawler
- Harvest system:
1. Collector: Internet Service Provider
2. Broker: index and query interface
- Virtual Web View:
This approach is based on a database.
Personalization:
With Web personalization, users can get more information on the Internet faster because Web sites know their interests and needs.
The Web site then uses the database to match users’ needs to the products or information provided at the site, with middleware facilitating the process.
Web Structure Mining:
Two techniques for structure mining:
1. Page Rank: PR is one of the methods Google uses to determine a page’s relevance or importance. The PR value for a page is calculated from the pages that point to it, weighted by their own PR values. PR is displayed on the toolbar of your browser if you have installed the Google toolbar.
Page Rank: The actual page rank for each page, as calculated by Google.
Toolbar PR: The page rank displayed in the Google toolbar in your browser. This ranges from 0 to 10.
Back link: If page A links out to page B, then page B is said to have a “back link” from page A.
- Definition by Google:
We assume page A has pages T1…Tn which point to it. The parameter d is a damping factor, which can be set between 0 and 1. We usually set d to 0.85. Also, C(A) is defined as the number of links going out of page A.
The PR of a page A is given as follows:
PR(A)=(1-d)+d(PR(T1)/C(T1)+…+PR(Tn)/C(Tn))
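The formula can be applied repeatedly until the values settle. Below is a minimal sketch of that iteration, not Google's actual implementation: the four-page link graph, the function name pagerank and the fixed iteration count are assumptions for illustration, and every page in the sketch is assumed to have at least one outgoing link.

```python
def pagerank(links, d=0.85, iterations=100):
    """Iterate PR(A) = (1-d) + d * sum(PR(T)/C(T)) over all pages.

    links maps each page to the list of pages it links out to.
    """
    pages = list(links)
    pr = {page: 1.0 for page in pages}  # arbitrary starting values

    # For each page A, collect the pages T1..Tn that point to it.
    backlinks = {page: [] for page in pages}
    for src, outs in links.items():
        for dst in outs:
            backlinks[dst].append(src)

    for _ in range(iterations):
        pr = {
            a: (1 - d) + d * sum(pr[t] / len(links[t]) for t in backlinks[a])
            for a in pages
        }
    return pr

# Hypothetical four-page site; C(A) is len(links["A"]).
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
for page, value in sorted(ranks.items()):
    print(page, round(value, 4))
```

In this sketch page D has no back links, so its PR stays at 1-d, while page C, which is linked from three pages, ends up with the highest value.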
2. Important Pages:
A page is important if important pages link to it.

Assume that the Web consists of only three pages, say Netscape, Microsoft and Amazon. The links among these pages were shown in a figure that is not reproduced here. In the limit, the solution is n=a=6/5; m=3/5. That is, Netscape and Amazon each have the same importance, and twice the importance of Microsoft.
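The limit can be checked numerically. The link structure in this sketch is an assumption, since the figure is missing: Netscape links to itself and Amazon, Microsoft links to Amazon, and Amazon links to Netscape and Microsoft. This structure is chosen only because it reproduces the stated solution. Each page starts with importance 1 and splits it equally among the pages it links to.

```python
# Assumed link structure (the original figure is not available).
links = {
    "Netscape": ["Netscape", "Amazon"],
    "Microsoft": ["Amazon"],
    "Amazon": ["Netscape", "Microsoft"],
}

def importance(links, iterations=200):
    """Repeatedly pass each page's importance to its successors."""
    rank = {page: 1.0 for page in links}
    for _ in range(iterations):
        new_rank = {page: 0.0 for page in links}
        for src, outs in links.items():
            share = rank[src] / len(outs)  # equal split over out-links
            for dst in outs:
                new_rank[dst] += share
        rank = new_rank
    return rank

result = importance(links)
print({page: round(value, 3) for page, value in result.items()})
```

Under this assumption the values approach n=a=6/5 and m=3/5, and the total importance stays at 3 throughout.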
The following problems arise on the Web:
a. Dead ends: A page that has no successors has nowhere to send its importance. Eventually all importance will “leak out of” the Web.
b. Spider traps: A group of one or more pages that have no links out of the group will eventually accumulate all the importance of the Web.
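Both problems are easy to reproduce on a toy graph. The sketch below uses two hypothetical three-page graphs (pages N, M and A): in the first, M has no out-links (a dead end); in the second, M links only to itself (a one-page spider trap). The graphs and the function name iterate are assumptions for illustration.

```python
def iterate(links, iterations=200):
    """Pass importance along out-links, starting from 1.0 per page."""
    rank = {page: 1.0 for page in links}
    for _ in range(iterations):
        new_rank = {page: 0.0 for page in links}
        for src, outs in links.items():
            for dst in outs:
                new_rank[dst] += rank[src] / len(outs)
        # A page with no out-links contributes nothing: its importance is lost.
        rank = new_rank
    return rank

# Hypothetical graphs: M is a dead end in the first, a spider trap in the second.
dead_end = {"N": ["N", "A"], "M": [], "A": ["N", "M"]}
spider_trap = {"N": ["N", "A"], "M": ["M"], "A": ["N", "M"]}

leaked = iterate(dead_end)
trapped = iterate(spider_trap)
print("dead end total:", sum(leaked.values()))
print("spider trap:", trapped)
```

With the dead end, the total importance shrinks toward zero; with the spider trap, the total is preserved but ends up concentrated in M.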
Web Usage Mining:
Web usage mining has the three activities given below:

Preprocessing activities center on reformatting the Web log data before processing.
Pattern discovery activities form the major portion of the mining activities because they look for hidden patterns within the log data.
Pattern analysis is the process of looking at and interpreting the results of the discovery activities.
Web usage mining is quite different from traditional data mining applications such as the “Goods Basket” model. We can interpret this problem from two aspects:
1. Weak relations between user and site
2. Complicated behaviors
WEB MINING ARCHITECTURE
WebMiner system:

- This system divides the Web usage mining process into three main parts. Its input data are the server logs (access, referrer, and agent) and the HTML files that make up the site.
Data cleaning
Transaction identification
Data integration
User identification
Session identification
Preprocessing:
Preprocessing in Web usage mining includes the following:
- Collection of usage data for Web visitors: Some services require user registration.
- User identification: With registration it is easy to identify different users, but the risk remains that private registration information is misused by hackers.
- Session construction: A session is a visit. Two time constraints are needed for session construction: neither the time gap between any two consecutively accessed pages nor the duration of any session can exceed a defined threshold.
- Behavior recovery:
User behavior is recovered from the session of the user and defined as b=(S’,R), where R is the relation among S.
<0,292,300,304,350,326,512,510,512,515,513,292,319,350,517,286>
Two kinds of behavior can be recovered.
The first is that user behaviors are represented with only the unique accessed pages.
S’= <0,292,300,304,326,510,512, 513,515,319,350,517,286>
The second is that user behaviors are represented with the unique accessed pages and also the access sequence among these pages.
<0-292-300-304-350-326-512-510-513-515-319-517-286>
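Session construction and behavior recovery can be sketched together as follows. This is an illustration only: the threshold values, the timestamps and the function names are assumptions, and keeping pages in order of first access is just one simple way to retain the access sequence, so it can order a few pages (here 513 and 515) differently from the sequence listed above.

```python
def build_sessions(clicks, max_gap=600, max_duration=1800):
    """Split one user's (timestamp_seconds, page_id) clicks into sessions.

    A new session starts when the gap since the previous click exceeds
    max_gap, or when the session would last longer than max_duration.
    Both thresholds are assumed values, in seconds.
    """
    sessions = []
    current = []
    for ts, page in sorted(clicks):
        if current and (ts - current[-1][0] > max_gap
                        or ts - current[0][0] > max_duration):
            sessions.append(current)
            current = []
        current.append((ts, page))
    if current:
        sessions.append(current)
    return [[page for _, page in session] for session in sessions]

def unique_pages(session):
    """First kind of behavior: only the unique accessed pages."""
    return set(session)

def unique_sequence(session):
    """Second kind: unique pages kept in order of first access."""
    return list(dict.fromkeys(session))

# Hypothetical clicks: a long pause separates the two visits.
clicks = [(0, 0), (30, 292), (60, 300), (5000, 304), (5020, 350)]
sessions = build_sessions(clicks)
print(sessions)

# The example session from the text above.
session = [0, 292, 300, 304, 350, 326, 512, 510, 512, 515, 513,
           292, 319, 350, 517, 286]
pages = unique_pages(session)
sequence = unique_sequence(session)
print(sorted(pages))
print("-".join(str(page) for page in sequence))
```

The set returned by unique_pages contains the same 13 pages as S’ above.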
Applications:
- Intelligent Web services
- Log analysis for security applications
- Contextual information access and retrieval
- Recommendation and personalization systems
- Fraud and misuse detection, such as credit-card fraud and network intrusion detection
Services:
- User modeling and profiling
- Enabling technologies
- Web content, usage, and structure mining
Conclusion:
- In this paper we proposed a definition of Web mining and developed a taxonomy of the various ongoing efforts related to it.
- Companies can find new and better ways to do business.
- However, an e-business cannot just build a Web site and then sit back and reap the benefits; in most cases that is fruitless.
- Companies have to implement Web mining systems to understand their customers’ profiles, and to identify the strengths and weaknesses of their e-marketing efforts on the Web through continuous improvement.
