
project topics · 22-04-2010, 12:32 AM

Search Engine with Web Crawler

A web crawler (also known as a Web spider or Web robot) is a program or automated script which browses the World Wide Web in a methodical, automated manner.

This process is called Web crawling or spidering. Search engines use spidering as a means of keeping their data up to date. Web crawlers download and index web pages so that searches can be answered quickly.

A Web crawler starts with a list of URLs to visit, called the seeds. As the crawler visits these URLs, it identifies all the hyperlinks in the page and adds them to the list of URLs to visit, called the crawl frontier. URLs from the frontier are recursively visited according to a set of policies.
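The seed and frontier loop described above can be sketched in Java roughly as follows. This is only a minimal sketch: the class name `FrontierCrawler`, the `fetchLinks` function, and the `maxPages` limit are assumptions made for illustration, and the three-page "web" is a hypothetical in-memory map standing in for real HTTP downloads.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

class FrontierCrawler {
    /**
     * Visits pages breadth-first, starting from the seeds.
     * fetchLinks stands in for "download the page and extract its hyperlinks".
     */
    static List<String> crawl(List<String> seeds,
                              Function<String, List<String>> fetchLinks,
                              int maxPages) {
        Deque<String> frontier = new ArrayDeque<>(seeds); // URLs waiting to be visited
        Set<String> seen = new HashSet<>(seeds);          // URLs already queued or visited
        List<String> visited = new ArrayList<>();

        while (!frontier.isEmpty() && visited.size() < maxPages) {
            String url = frontier.removeFirst();
            visited.add(url);
            for (String link : fetchLinks.apply(url)) {
                if (seen.add(link)) {        // add() returns false for duplicates
                    frontier.addLast(link);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        // Hypothetical three-page site used instead of real HTTP requests.
        Map<String, List<String>> web = Map.of(
            "a", List.of("b", "c"),
            "b", List.of("a", "c"),
            "c", List.of());
        System.out.println(crawl(List.of("a"), u -> web.getOrDefault(u, List.of()), 10));
    }
}
```

The `seen` set is what keeps the crawler from looping forever when pages link back to each other.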

Two characteristics of the Web make crawling very difficult: its large volume and its rate of change, with a huge number of pages being added, changed and removed every day. In addition, network speeds have improved more slowly than processing speeds and storage capacities.

The large volume implies that the crawler can only download a fraction of the Web pages within a given time, so it needs to prioritize its downloads. The high rate of change implies that by the time the crawler is downloading the last pages from a site, it is very likely that new pages have been added to the site, or that pages have already been updated or even deleted.
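One possible way to prioritize downloads is to keep the frontier in a priority queue, so that the most promising URL is always fetched next. The sketch below assumes a numeric score per URL. How that score is computed (link counts, freshness, topic match) is left open here, and the class name `PriorityFrontier` is made up for this example.

```java
import java.util.HashSet;
import java.util.PriorityQueue;
import java.util.Set;

class PriorityFrontier {
    private static final class Entry {
        final String url;
        final double score;
        Entry(String url, double score) { this.url = url; this.score = score; }
    }

    // Orders entries so that the highest score comes out first.
    private final PriorityQueue<Entry> queue =
        new PriorityQueue<Entry>((a, b) -> Double.compare(b.score, a.score));
    private final Set<String> seen = new HashSet<>();

    /** Queues a URL with its score; returns false if the URL was already offered. */
    boolean offer(String url, double score) {
        if (!seen.add(url)) {
            return false;
        }
        queue.add(new Entry(url, score));
        return true;
    }

    /** Returns the highest-scored URL still waiting, or null if the frontier is empty. */
    String poll() {
        Entry next = queue.poll();
        return next == null ? null : next.url;
    }
}
```

With this structure the crawler can stop at any time and still have spent its limited download budget on the highest-scored pages seen so far.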

The behavior of a Web crawler is the outcome of a combination of policies:

selection policy that states which pages to download.
re-visit policy that states when to check for changes to the pages.
politeness policy that states how to avoid overloading websites.
parallelization policy that states how to coordinate distributed web crawlers.
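
As an illustration of the politeness policy, a crawler can enforce a minimum delay between two requests to the same host. The following is a rough sketch only: the class name `PolitenessGate`, the fixed per-host delay, and passing the current time in as a parameter are all assumptions made to keep the example small and predictable.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

class PolitenessGate {
    private final long minDelayMillis;
    // host -> earliest time (in milliseconds) at which the next request may start
    private final Map<String, Long> nextAllowed = new HashMap<>();

    PolitenessGate(long minDelayMillis) {
        this.minDelayMillis = minDelayMillis;
    }

    /**
     * Reserves the next request slot for the URL's host and returns how many
     * milliseconds the caller should wait before fetching.
     */
    long reserve(String url, long nowMillis) {
        String host = URI.create(url).getHost();
        long earliest = nextAllowed.getOrDefault(host, nowMillis);
        long start = Math.max(earliest, nowMillis);
        nextAllowed.put(host, start + minDelayMillis);
        return start - nowMillis;
    }
}
```

A crawler would call `reserve(url, System.currentTimeMillis())` and sleep for the returned time before downloading. Requests to different hosts do not delay each other.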

Implementation

In this search engine project the web crawler will start with some seeds and will select pages using filters and policies. For example, if we are building a blog search engine, the crawler will be programmed to download blog-related pages only.

The crawler can be developed as a simple Java program. The program will download the pages and index them in a database for faster searching.
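To give an idea of what "indexing" means here, the sketch below builds a small inverted index that maps each word to the pages containing it. It keeps everything in memory instead of a database, and the class name `InvertedIndex` and the simple split-on-non-alphanumerics tokenizer are assumptions for illustration. A real project would store the same term-to-page mapping in database tables.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class InvertedIndex {
    // term -> URLs of the pages that contain the term
    private final Map<String, Set<String>> postings = new HashMap<>();

    /** Splits the page text into lowercase terms and records the URL under each one. */
    void addPage(String url, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(url);
            }
        }
    }

    /** Returns the URLs of all pages containing the term (case-insensitive). */
    Set<String> lookup(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Set.of());
    }
}
```

Looking up a keyword is then a single map access, rather than a scan over every downloaded page.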

The search interface can be developed using JSP/Servlet and Ajax. The search engine will accept the search keywords and look them up in the database using a search algorithm. The most relevant results will be shown as a paged list, ordered by merit.
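A very simple version of keyword search with ranking and paging could look like the sketch below. It is not a definitive design: scoring a page by the number of keyword occurrences in its text is just one assumed measure of "merit", and the class name `KeywordSearch`, the in-memory `pages` map, and the 1-based `pageNumber` are choices made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

class KeywordSearch {
    private static String[] tokens(String s) {
        return s.toLowerCase(Locale.ROOT).split("[^a-z0-9]+");
    }

    /**
     * pages maps URL -> page text. Returns the URLs on the requested result
     * page (1-based), best match first.
     */
    static List<String> search(Map<String, String> pages, String query,
                               int pageNumber, int pageSize) {
        Set<String> keywords = new HashSet<>(Arrays.asList(tokens(query)));
        keywords.remove("");

        // Score = number of words in the page that match any keyword.
        Map<String, Integer> scores = new HashMap<>();
        for (Map.Entry<String, String> e : pages.entrySet()) {
            int score = 0;
            for (String t : tokens(e.getValue())) {
                if (keywords.contains(t)) {
                    score++;
                }
            }
            if (score > 0) {
                scores.put(e.getKey(), score);
            }
        }

        // Highest score first; ties are broken by URL so the order is stable.
        List<String> ranked = new ArrayList<>(scores.keySet());
        ranked.sort((a, b) -> {
            int byScore = Integer.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : a.compareTo(b);
        });

        int from = Math.min(Math.max(pageNumber - 1, 0) * pageSize, ranked.size());
        int to = Math.min(from + pageSize, ranked.size());
        return new ArrayList<>(ranked.subList(from, to));
    }
}
```

In a JSP/Servlet front end, the servlet would read the keywords and page number from the request, call something like `search(...)`, and render the returned URLs as one page of results.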

**seminar details** · 26-11-2012, 02:25 PM

To get information about the topic "search engine" (full report, ppt) and related topics, refer to the page links below:

http://studentbank.in/report-desktop-sea...e=threaded

http://studentbank.in/report-search-engi...e=threaded

http://studentbank.in/report-3d-search-engine

http://studentbank.in/report-a-search-en...-3d-models

http://studentbank.in/report-image-searc...ike-google
