11-08-2011, 12:42 PM
Description:
In this project we build a client-side web assistant that eases personalized web search. In contrast to the regular linear display of search results, we display results in clusters (two dimensions). We do this by classifying queries based on the ODP ontology and query expansion.
1. Introduction:
The web search engine is currently one of the most important information access and management
tools for WWW users. Most users interact with search engines using short queries composed of
four words or fewer. These “short queries” prevent search engines from identifying the
information needs behind them.
Given the impact of search engines on Web users’ experience, improving the quality of
search results has become the holy grail of search engine operators. As part of this
endeavour, there has been recent interest in identifying the “goal” of a user during search, so that
the identified goal can be used to improve page ranking, result clustering and the final answer
presentation. Since server-side models are already widely accepted, we propose a client-side
model for personalization.
Drawbacks of present model:
None of today's popular search engines, such as Google, AOL and Yahoo, personalizes
results for individual users. They show the same set of results to every user,
irrespective of the user's personal interests. For example, when two users, User1 and User2,
interested in football and cricket respectively, submit the query “World Cup” to the search engine,
User1 expects most of the results to relate to the football World Cup, whereas User2
expects the results to be dominated by cricket. In this paper, we deal with this problem by calculating
a “Modified Page Rank”, which takes the user's interests into account, and re-ranking the results accordingly.
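The re-ranking idea can be illustrated with a minimal sketch. The description above does not give the formula for the “Modified Page Rank”, so the linear blend below is only an assumption made for illustration; the function name rerank, the weight alpha, the category labels and the scores are all hypothetical.

```python
from typing import Dict, List, Tuple

# Each result is (title, base_score, category). The base score stands in for
# whatever rank the search engine assigned; the values here are made up.
Result = Tuple[str, float, str]


def rerank(results: List[Result],
           interests: Dict[str, float],
           alpha: float = 0.5) -> List[str]:
    """Return titles ordered by a blend of base score and user interest.

    ASSUMPTION: a linear blend, (1 - alpha) * base + alpha * interest.
    This is not the project's actual "Modified Page Rank" formula.
    """
    def blended(item: Result) -> float:
        _, base, category = item
        # Categories the user has shown no interest in contribute 0.
        return (1 - alpha) * base + alpha * interests.get(category, 0.0)

    ordered = sorted(results, key=blended, reverse=True)
    return [title for title, _, _ in ordered]


# Hypothetical results for the query "World Cup".
results = [
    ("Cricket World Cup schedule", 0.9, "Sports/Cricket"),
    ("FIFA World Cup fixtures", 0.8, "Sports/Football"),
]

user1 = {"Sports/Football": 1.0}  # User1 is interested in football
user2 = {"Sports/Cricket": 1.0}   # User2 is interested in cricket

print(rerank(results, user1))
print(rerank(results, user2))
```

With these made-up numbers the same result list is ordered differently for the two users, which is the behaviour the example with User1 and User2 describes.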
Also, present search engines give a linear view of the results even when the query
term may refer to multiple domains of interest, so the user is biased by the order in which the results appear.
For example, “Iris” could refer to iris recognition in biometrics or to a movie. Because the
page rank of pages about iris recognition is high, the results about Iris the movie go
completely missing from the first few pages, which is undesirable. So in this paper, we show the results
in clusters based on the ontology, so that the user can see the different domains the query falls into.
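A clustered view of this kind can be sketched as follows. The sketch assumes each result has already been assigned an ontology category by some classifier, which is not shown here; the function name cluster_by_category and the category labels are hypothetical and are not actual ODP paths.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def cluster_by_category(results: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (title, category) pairs by category.

    The original ranking order is preserved inside each cluster.
    """
    clusters: Dict[str, List[str]] = defaultdict(list)
    for title, category in results:
        clusters[category].append(title)
    return dict(clusters)


# Hypothetical ranked results for the query "Iris".
results = [
    ("Iris recognition overview", "Biometrics"),
    ("Iris scanner accuracy", "Biometrics"),
    ("Iris (film)", "Movies"),
]

clusters = cluster_by_category(results)
for category, titles in clusters.items():
    print(category, titles)
```

Even though the movie result sits last in the linear ranking, it appears at the top of its own cluster, so the user sees that the query spans more than one domain.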