INFORMATION RETRIEVAL ppt.
#1





BY
Sudheer Reddy B.


Agenda

Definition
History
Overview
Performance Measures
What IR Systems Do and How
Traditional View of IR

History:

The idea of using computers to search for relevant pieces of information was popularized by the article “As We May Think” by Vannevar Bush in 1945.
The first automated information retrieval systems were introduced in the 1950s and 1960s.
In 1992, the US Department of Defense, together with NIST, co-sponsored the Text REtrieval Conference (TREC) program, which helped drive the research behind web search engines.

Overview:

An information retrieval process begins when a user enters a query into the system.
The process may then be iterated if the user wishes to refine the query.

What IR Systems Try to Do?

Predict, on the basis of information about the user and about the knowledge resource, which information objects are likely to be the most appropriate for the user to interact with at any particular time.

How IR Systems Try to Do This

Represent the user’s information problem (the query)
Represent (surrogate) and organize (classify) the contents of the knowledge resource
Compare query to surrogates (predict relevance)
Present results to the user for interaction/judgment
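
As a rough illustration of these four steps (not the method of any particular system), here is a minimal Python sketch: the query and each document are represented as bags of words, compared by simple term overlap, and the results are presented in ranked order. The document collection, the tokenizer, and the overlap score are all assumptions made up for the example.

def tokenize(text):
    # Represent text as a bag of lower-cased word tokens (the "surrogate").
    return set(text.lower().split())

def search(query, documents):
    # documents: dict mapping a document id to its text (organize/classify step).
    surrogates = {doc_id: tokenize(text) for doc_id, text in documents.items()}
    query_terms = tokenize(query)                      # represent the information problem
    scores = {doc_id: len(query_terms & terms)         # compare query to surrogates
              for doc_id, terms in surrogates.items()}
    # Present results in decreasing order of the overlap score.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

docs = {"d1": "information retrieval systems", "d2": "database systems"}
print(search("information retrieval", docs))   # [('d1', 2), ('d2', 0)]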

Performance Measures:

The traditional goal of IR is to retrieve all and only the relevant information objects (IOs) in response to a query.
"All" is measured by recall: the proportion of relevant IOs in the collection that are retrieved.
"Only" is measured by precision: the proportion of retrieved IOs that are relevant.
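
A minimal sketch of how the two measures are computed, assuming the retrieved and relevant IOs are available as Python sets of document ids (the ids below are invented for the example):

def precision_recall(retrieved, relevant):
    # Precision ("only"): proportion of retrieved IOs that are relevant.
    # Recall ("all"): proportion of relevant IOs that were retrieved.
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

retrieved = {"d1", "d2", "d3", "d4"}
relevant = {"d1", "d3", "d5"}
print(precision_recall(retrieved, relevant))   # precision 0.5, recall ~0.67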



#2

Information Retrieval Systems

- Information retrieval (IR) systems use a simpler data model than database systems
  - Information is organized as a collection of documents
  - Documents are unstructured, with no schema
- Information retrieval locates relevant documents on the basis of user input such as keywords or example documents
  - e.g., find documents containing the words “database systems”
- Can be used even on textual descriptions provided with non-textual data such as images
- Web search engines are the most familiar example of IR systems
- Differences from database systems
  - IR systems don’t deal with transactional updates (including concurrency control and recovery)
  - Database systems deal with structured data, with schemas that define the data organization
  - IR systems deal with some querying issues not generally addressed by database systems
    - Approximate searching by keywords
    - Ranking of retrieved answers by estimated degree of relevance

Keyword Search

- In full text retrieval, all the words in each document are considered to be keywords.
  - We use the word "term" to refer to the words in a document
- Information-retrieval systems typically allow query expressions formed using keywords and the logical connectives and, or, and not
  - Ands are implicit, even if not explicitly specified (see the sketch after this section)
- Ranking of documents on the basis of estimated relevance to a query is critical
  - Relevance ranking is based on factors such as
    - Term frequency: frequency of occurrence of the query keyword in the document
    - Inverse document frequency: how many documents the query keyword occurs in (the fewer the documents, the more importance given to the keyword)
    - Hyperlinks to documents: the more links to a document, the more important the document is
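
As promised above, a small sketch of a keyword query with implicit ands, written in Python under the assumption that documents are plain strings and that matching documents are ranked by the total term frequency of the query keywords (a deliberately crude stand-in for the relevance factors listed above):

from collections import Counter

def keyword_search(query, documents):
    # documents: dict of doc id -> text; the query keywords are implicitly and-ed.
    keywords = query.lower().split()
    results = []
    for doc_id, text in documents.items():
        counts = Counter(text.lower().split())      # term frequencies within this document
        if all(counts[k] > 0 for k in keywords):    # implicit "and": every keyword must occur
            results.append((doc_id, sum(counts[k] for k in keywords)))
    return sorted(results, key=lambda item: item[1], reverse=True)

docs = {"d1": "database systems and database design", "d2": "operating systems"}
print(keyword_search("database systems", docs))   # [('d1', 3)]
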
Relevance Ranking Using Terms

- TF-IDF (term frequency / inverse document frequency) ranking:
  - Let n(d) = number of terms in the document d
  - n(d, t) = number of occurrences of term t in the document d
  - Relevance of a document d to a term t:
        TF(d, t) = log(1 + n(d, t) / n(d))
    - The log factor is there to avoid giving excessive weight to frequent terms
  - Relevance of a document d to a query Q:
        r(d, Q) = sum over t in Q of TF(d, t) / n(t)
    where n(t) is the number of documents that contain term t (a small sketch using these formulas follows at the end of this section)
- Most systems add to the above model
  - Words that occur in the title, author list, section headings, etc. are given greater importance
  - Words whose first occurrence is late in the document are given lower importance
  - Very common words such as "a", "an", "the", "it", etc. are eliminated
    - These are called stop words
  - Proximity: if keywords in the query occur close together in the document, the document has higher importance than if they occur far apart
- Documents are returned in decreasing order of relevance score
  - Usually only the top few documents are returned, not all
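
A minimal sketch of the scoring above, assuming the reconstructed formulas TF(d, t) = log(1 + n(d, t) / n(d)) and r(d, Q) = sum over t in Q of TF(d, t) / n(t); stop words, proximity, title weighting, and the other refinements are left out, and the tiny collection is invented for the example:

import math
from collections import Counter

def tf(doc_terms, term):
    # TF(d, t) = log(1 + n(d, t) / n(d)); the log damps very frequent terms.
    counts = Counter(doc_terms)
    return math.log(1 + counts[term] / len(doc_terms))

def relevance(doc_terms, query_terms, collection):
    # r(d, Q) = sum over t in Q of TF(d, t) / n(t), where n(t) is the number of
    # documents in the collection containing t (the inverse-document-frequency part).
    score = 0.0
    for t in query_terms:
        n_t = sum(1 for terms in collection.values() if t in terms)
        if n_t:
            score += tf(doc_terms, t) / n_t
    return score

docs = {"d1": "database systems store structured data".split(),
        "d2": "retrieval systems rank documents".split()}
query = "database systems".split()
ranked = sorted(docs, key=lambda d: relevance(docs[d], query, docs), reverse=True)
print(ranked)   # ['d1', 'd2'] -- d1 matches both query terms
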
Similarity-Based Retrieval

- Similarity-based retrieval: retrieve documents similar to a given document
  - Similarity may be defined on the basis of common words
    - E.g., find the k terms in A with the highest TF(d, t) / n(t) and use these terms to find the relevance of other documents
- Relevance feedback: similarity can be used to refine the answer set of a keyword query
  - The user selects a few relevant documents from those retrieved by the keyword query, and the system finds other documents similar to these
- Vector space model: define an n-dimensional space, where n is the number of words in the document set
  - The vector for document d goes from the origin to the point whose i-th coordinate is TF(d, t_i) / n(t_i)
  - The cosine of the angle between the vectors of two documents is used as a measure of their similarity
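
A small illustrative sketch of the vector space model: each document becomes a vector with one coordinate per vocabulary word (raw term counts here, rather than the TF(d, t) / n(t) weighting on the slide, to keep the example short), and similarity is the cosine of the angle between two such vectors:

import math
from collections import Counter

def doc_vector(doc_terms, vocabulary):
    # One coordinate per vocabulary word; the value is the term count in this document.
    counts = Counter(doc_terms)
    return [counts[w] for w in vocabulary]

def cosine(u, v):
    # Cosine similarity: dot product divided by the product of the vector lengths.
    # 1.0 means the documents use terms in identical proportions, 0.0 means no terms in common.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

d1 = "information retrieval systems".split()
d2 = "database systems and information systems".split()
vocab = sorted(set(d1) | set(d2))
print(round(cosine(doc_vector(d1, vocab), doc_vector(d2, vocab)), 3))   # ~0.655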
