21-04-2010, 11:47 PM
A sitemap is a list of the pages of a website, accessible to crawlers or users. Sitemaps can improve the search engine optimization of a site by making sure that all of its pages can be found. This is especially important if a site serves content dynamically, such as through Adobe Flash or JavaScript menus that do not include HTML links. A sitemap also acts as a navigation aid by providing an overview of a site's content at a single glance.
Most search engines will only follow a finite number of links from a page, so if the number of links is very large, the site map may be required so that search engines and visitors can access all content on the site.
Google introduced Google Sitemaps so that web developers could publish lists of links from across their sites. The basic premise is that some sites have a large number of dynamic pages that are only reachable through forms and user input. Sitemap files can then be used to tell a web crawler how such pages can be found. Google, MSN, Yahoo and Ask now jointly support the Sitemaps protocol. XML sitemaps have replaced the older method of "submitting to search engines" by filling out a form on the search engine's submission page. Now web developers submit a sitemap directly, or wait for search engines to find it.
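The Sitemaps protocol itself is a small XML format: a urlset element containing one url entry per page. As a rough illustration (the example.com URLs and dates are placeholders, not part of any real site), such a file can be built with Python's standard library:

```python
from xml.etree import ElementTree as ET

# Namespace defined by the Sitemaps protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(pages):
    """Build a Sitemaps-protocol XML document from (loc, lastmod) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")

# Placeholder pages for illustration only
xml = build_sitemap([
    ("http://www.example.com/", "2010-04-21"),
    ("http://www.example.com/news/1", "2010-04-21"),
])
print(xml)
```

The resulting file is what gets submitted to (or discovered by) the search engines.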
Since new pages are created frequently, the sitemaps of dynamic websites such as news portals and community networks need to be updated regularly to include the newly created URLs.
The proposed project is a web application that can be hosted in a separate folder alongside any website that requires daily sitemap updates. The application will crawl every page at a given time interval, and the generated sitemap will be saved to a given location. So, without user intervention, the sitemap is created and saved at the location that was submitted to the search engines. The PHP, J2EE, and ASP.NET versions of this application need to be hosted on different servers.
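The crawling step described above can be sketched as a breadth-first walk over same-host links. This is a simplified illustration, not the project's actual code: the fetch function is injected as a parameter (a real deployment would fetch pages over HTTP and run on a scheduler such as cron), and the URLs are placeholders.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class LinkParser(HTMLParser):
    """Collect href values from <a> tags in an HTML page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start_url, fetch, max_pages=500):
    """Breadth-first crawl of a single site.

    `fetch(url)` returns the page's HTML, or None if unavailable.
    Only URLs on the same host as start_url are followed, and
    max_pages caps the crawl so it always terminates.
    """
    host = urlparse(start_url).netloc
    seen, queue = {start_url}, [start_url]
    while queue and len(seen) <= max_pages:
        url = queue.pop(0)
        html = fetch(url)
        if html is None:
            continue
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href).split("#")[0]  # resolve and drop fragments
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return sorted(seen)
```

The list of URLs returned by `crawl` would then be fed into the sitemap generator and the output written to the configured location on each run.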