Distributional Features for Text Categorization

Text categorization is the task of assigning predefined categories to natural language text. With the widely used 'bag of words' representation, previous research usually assigns each word a value that indicates either whether the word appears in the document concerned or how frequently it appears. Although these values are useful for text categorization, they do not fully express the abundant information contained in the document. This paper explores the effect of other types of values, which express the distribution of a word in the document. These novel values assigned to a word are called distributional features, and they include the compactness of the appearances of the word and the position of the first appearance of the word. The proposed distributional features are exploited by a tf-idf-style equation, and different features are combined using ensemble learning techniques. Experiments show that the distributional features are useful for text categorization. Compared with using the traditional term frequency values alone, including the distributional features requires only a little additional cost, while the categorization performance can be significantly improved. Further analysis shows that the distributional features are especially useful when documents are long and the writing style is casual.
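To make the idea of distributional features more concrete, here is a minimal Python sketch. It is not the paper's method: the abstract does not give the exact definitions, so the function name `distributional_features` and both formulas are assumptions made for illustration only. The first-appearance position is taken as the index of the word's first occurrence divided by the document length, and compactness is approximated by the distance between the first and last occurrence, also divided by the document length.

```python
from collections import defaultdict


def distributional_features(tokens):
    """Return {word: (term_frequency, first_position, spread)} for one document.

    Illustrative assumptions, not definitions taken from the paper:
      first_position = index of the first occurrence / document length
      spread         = (last index - first index) / document length
    A small spread means the occurrences are compact (close together).
    """
    n = len(tokens)

    # Record every position at which each word occurs, in reading order.
    positions = defaultdict(list)
    for index, token in enumerate(tokens):
        positions[token].append(index)

    features = {}
    for word, occurrences in positions.items():
        term_frequency = len(occurrences)
        first_position = occurrences[0] / n
        spread = (occurrences[-1] - occurrences[0]) / n
        features[word] = (term_frequency, first_position, spread)
    return features


if __name__ == "__main__":
    doc = "the cat sat on the mat while the dog slept".split()
    for word, values in distributional_features(doc).items():
        print(word, values)
```

Two words with the same term frequency can get very different values here: a word mentioned only in the opening sentences has a small first position and a small spread, while a word scattered through the whole text has a large spread. This is the kind of information a plain frequency count does not capture.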

Messages In This Thread
Distributional Features for Text Categorization - by project report tiger - 10-02-2010, 10:55 PM
