Frequent Term-Based Text Clustering

ABSTRACT

Text clustering methods can be used to structure large sets of text or hypertext documents. The well-known methods of text clustering, however, do not address the special problems of text clustering: the very high dimensionality of the data, the very large size of the databases, and the understandability of the cluster descriptions. In this paper, we introduce a novel approach which uses frequent item (term) sets for text clustering. Such frequent sets can be efficiently discovered using algorithms for association rule mining. To cluster based on frequent term sets, we measure the mutual overlap of frequent sets with respect to the sets of supporting documents. We present two algorithms for frequent term-based text clustering: FTC, which creates flat clusterings, and HFTC, which creates hierarchical clusterings. An experimental evaluation on classical text documents as well as on web documents demonstrates that the proposed algorithms obtain clusterings of comparable quality significantly more efficiently than state-of-the-art text clustering algorithms. Furthermore, our methods provide an understandable description of the discovered clusters by their frequent term sets.
read full report
http://citeseerx.ist.psu.edu/viewdoc/dow...1&type=pdf
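
The abstract describes the idea only at a high level, so the following is a minimal Python sketch of it and not the paper's FTC algorithm. It assumes each document is already a set of terms, finds frequent term sets by brute-force enumeration (a real system would use an association rule mining algorithm such as Apriori), and then greedily picks the term set whose supporting documents overlap least with the other candidates. The function names, the `max_size` limit, the simple overlap formula, and the tie-breaking rule are all assumptions made for illustration.

```python
from itertools import combinations


def frequent_term_sets(docs, min_support, max_size=2):
    """Map each frequent term set to its cover (indices of supporting docs).

    Brute-force enumeration, for illustration only.
    """
    n = len(docs)
    vocab = sorted(set().union(*docs))
    result = {}
    for size in range(1, max_size + 1):
        for terms in combinations(vocab, size):
            cover = frozenset(i for i, d in enumerate(docs) if set(terms) <= d)
            if len(cover) / n >= min_support:
                result[frozenset(terms)] = cover
    return result


def greedy_clusters(docs, min_support, max_size=2):
    """Greedy flat clustering: repeatedly pick the least-overlapping term set.

    Overlap here is a simplified, assumed measure: the average number of
    *other* candidate term sets that also cover a document of the cluster.
    """
    candidates = frequent_term_sets(docs, min_support, max_size)
    remaining = set(range(len(docs)))
    clusters = []
    while remaining:
        # Restrict every candidate's cover to documents not yet assigned.
        live = {ts: cov & remaining for ts, cov in candidates.items()}
        live = {ts: cov for ts, cov in live.items() if cov}
        if not live:
            break  # leftover documents contain no frequent term set
        # counts[d] = number of live candidates that cover document d
        counts = {d: sum(d in cov for cov in live.values()) for d in remaining}

        def overlap(ts):
            cov = live[ts]
            return sum(counts[d] - 1 for d in cov) / len(cov)

        # Ties: prefer the larger cover, then a deterministic term order.
        best = min(live, key=lambda ts: (overlap(ts), -len(live[ts]), sorted(ts)))
        clusters.append((sorted(best), sorted(live[best])))
        remaining -= live[best]
        del candidates[best]
    return clusters


docs = [
    {"sun", "beach"},
    {"sun", "beach", "sand"},
    {"stock", "market"},
    {"stock", "market", "bond"},
]
for terms, members in greedy_clusters(docs, min_support=0.5):
    print(terms, members)
```

Each cluster is returned together with the term set that selected it, which serves as the understandable cluster description the abstract mentions. Documents that contain no frequent term set stay unassigned in this sketch.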