Java implementation of effective pattern discovery for text mining
#1

Please send a Java implementation for this project.
#2
I want code for effective pattern discovery for text mining.
#3

Java implementation of effective pattern discovery for text mining

Abstract

Many data mining techniques have been proposed for mining useful patterns in text documents. However, how to effectively use and update discovered patterns is still an open research issue, especially in the domain of text mining. Since most existing text mining methods adopted term-based approaches, they all suffer from the problems of polysemy and synonymy. Over the years, people have often held the hypothesis that pattern (or phrase)-based approaches should perform better than the term-based ones, but many experiments do not support this hypothesis. This paper presents an innovative and effective pattern discovery technique which includes the processes of pattern deploying and pattern evolving, to improve the effectiveness of using and updating discovered patterns for finding relevant and interesting information. Substantial experiments on RCV1 data collection and TREC topics demonstrate that the proposed solution achieves encouraging performance.

Existing System :

Many data mining techniques have been proposed in the last decade for discovering sequential patterns, frequent itemsets, co-occurring terms and multiple grams. These techniques include association rule mining, frequent itemset mining, sequential pattern mining, maximum pattern mining, and closed pattern mining. However, using this discovered knowledge (or these patterns) in the field of text mining is difficult and ineffective. The reason is that some useful long patterns with high specificity lack support (i.e., the low-frequency problem).

Limitations and Disadvantages of existing system:

The existing system finds frequent patterns and assigns weights according to their occurrence, without taking the order of terms in a pattern into account, as in the Apriori algorithm.
If both positive and negative documents have terms in common, the existing system produces a high error rate.
If a document contains little text, the system cannot classify it correctly.
Proposed System :

The proposed system aims to improve effectiveness by making effective use of closed patterns in text mining. In addition, a two-stage model that used both term-based methods and pattern-based methods was introduced to significantly improve the performance of information filtering. Natural language processing (NLP) is a modern computational technology that can help people to understand the meaning of text documents. For a long time, NLP struggled to deal with uncertainties in human languages. Recently, a new concept-based model was presented to bridge the gap between NLP and text mining, which analyzed terms on the sentence and document levels. This model included three components. The first component analyzed the semantic structure of sentences; the second component constructed a conceptual ontological graph (COG) to describe the semantic structures; and the last component extracted top concepts based on the first two components to build feature vectors using the standard vector space model. The advantage of the concept-based model is that it can effectively discriminate between unimportant terms and meaningful terms that describe the meaning of a sentence. Compared with the above methods, however, the concept-based model usually depends on the NLP techniques it employs.

Advantages :

This system gives high priority to long sequences in the evaluated patterns.
This system assigns term weights based on the occurrence of a term in long patterns (sequences).
This system reduces the weight of a term, assigning a negative weight, if the term occurs in both negative and positive documents.
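
The first two advantages can be illustrated with a minimal Java sketch. The post does not give the actual weighting formula, so the rule used here is only an assumption: each pattern contributes its support multiplied by its length to every term it contains, so longer sequences count for more, and the weights are then normalized to sum to 1. The class name `PatternDeployer` and the method `deploy` are hypothetical.

```java
import java.util.*;

class PatternDeployer {

    /**
     * Turns discovered patterns into term weights.
     * ASSUMPTION (not from the post): a pattern contributes support * length
     * to each distinct term it contains, which favors long patterns.
     */
    static Map<String, Double> deploy(Map<List<String>, Integer> patternSupports) {
        Map<String, Double> weights = new TreeMap<>();
        double total = 0.0;
        for (Map.Entry<List<String>, Integer> entry : patternSupports.entrySet()) {
            List<String> pattern = entry.getKey();
            double contribution = entry.getValue() * (double) pattern.size();
            // Count each term once per pattern, even if it repeats in the pattern.
            for (String term : new LinkedHashSet<>(pattern)) {
                weights.merge(term, contribution, Double::sum);
                total += contribution;
            }
        }
        // Normalize so that all term weights sum to 1.
        if (total > 0) {
            for (Map.Entry<String, Double> entry : weights.entrySet()) {
                entry.setValue(entry.getValue() / total);
            }
        }
        return weights;
    }

    public static void main(String[] args) {
        Map<List<String>, Integer> patterns = new LinkedHashMap<>();
        patterns.put(Arrays.asList("carbon"), 3);
        patterns.put(Arrays.asList("carbon", "emission"), 2);
        patterns.put(Arrays.asList("carbon", "tax"), 2);
        System.out.println(deploy(patterns));
    }
}
```

In this toy input, "carbon" appears in all three patterns and ends up with the largest weight, while "emission" and "tax" each appear in one two-term pattern and receive equal weights.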
Modules :

1) SP Mining.

2) PTM

3) IPEvolving

4) User Interface Design

SP Mining.

In this module we generate frequent sequential patterns. A frequent sequential pattern is a maximal sequential pattern if there exists no frequent sequential pattern that is a super-pattern of it. The length of a sequential pattern indicates the number of words (or terms) contained in the pattern. A sequential pattern that contains n terms is called an n-term sequence and is extracted from the given set of documents. Here we take a set of documents as input and generate n-term sequences.
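
As a rough starting point for this module, here is a minimal Java sketch of level-wise (Apriori-style) sequential pattern mining. It is not the project's actual code. It assumes that each document has already been tokenized into an ordered list of terms, that a pattern matches a document when its terms occur in the same order (gaps allowed), and that support is the number of documents that match. The class `SequentialPatternMiner`, its methods, and the `minSupport` parameter are hypothetical names.

```java
import java.util.*;

class SequentialPatternMiner {

    /** True if the pattern's terms occur in the sequence in the same order (gaps allowed). */
    static boolean isSubsequence(List<String> pattern, List<String> sequence) {
        int matched = 0;
        for (String term : sequence) {
            if (matched < pattern.size() && pattern.get(matched).equals(term)) {
                matched++;
            }
        }
        return matched == pattern.size();
    }

    /** Absolute support: number of input sequences that contain the pattern. */
    static int support(List<String> pattern, List<List<String>> sequences) {
        int count = 0;
        for (List<String> sequence : sequences) {
            if (isSubsequence(pattern, sequence)) count++;
        }
        return count;
    }

    /** Grows frequent n-term sequences from frequent (n-1)-term sequences. */
    static Map<List<String>, Integer> mineFrequent(List<List<String>> sequences, int minSupport) {
        if (minSupport < 1) throw new IllegalArgumentException("minSupport must be >= 1");
        Map<List<String>, Integer> frequent = new LinkedHashMap<>();
        Set<String> vocabulary = new TreeSet<>();
        for (List<String> sequence : sequences) vocabulary.addAll(sequence);

        // Level 1: frequent single terms.
        List<String> frequentTerms = new ArrayList<>();
        List<List<String>> level = new ArrayList<>();
        for (String term : vocabulary) {
            List<String> pattern = Collections.singletonList(term);
            int sup = support(pattern, sequences);
            if (sup >= minSupport) {
                frequent.put(pattern, sup);
                frequentTerms.add(term);
                level.add(pattern);
            }
        }
        // Level n: extend each frequent pattern by one frequent term at the end.
        while (!level.isEmpty()) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> pattern : level) {
                for (String term : frequentTerms) {
                    List<String> candidate = new ArrayList<>(pattern);
                    candidate.add(term);
                    int sup = support(candidate, sequences);
                    if (sup >= minSupport) {
                        frequent.put(candidate, sup);
                        next.add(candidate);
                    }
                }
            }
            level = next;
        }
        return frequent;
    }

    /** Keeps only patterns that have no frequent super-pattern. */
    static List<List<String>> maximal(Set<List<String>> frequent) {
        List<List<String>> result = new ArrayList<>();
        for (List<String> p : frequent) {
            boolean hasSuperPattern = false;
            for (List<String> q : frequent) {
                if (q.size() > p.size() && isSubsequence(p, q)) {
                    hasSuperPattern = true;
                    break;
                }
            }
            if (!hasSuperPattern) result.add(p);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("carbon", "emission", "tax"),
                Arrays.asList("carbon", "tax", "policy"),
                Arrays.asList("carbon", "emission", "policy"));
        Map<List<String>, Integer> frequent = mineFrequent(docs, 2);
        System.out.println(maximal(frequent.keySet()));
    }
}
```

With the three toy documents in `main` and a minimum support of 2, the maximal patterns are the three two-term sequences that start with "carbon". The brute-force support counting is fine for a demo but would need an index or a projection-based method for a large collection.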

PTM

In this module we present a pattern-based model, PTM (Pattern Taxonomy Model), for the representation of text documents. A pattern taxonomy is a tree-like structure that illustrates the relationships between patterns extracted from a text collection. In such a taxonomy, the longest patterns (i.e., maximal sequential patterns) have no super-patterns above them. Once the tree is constructed, we can easily find the relationships between patterns. The next step is to prune the meaningless patterns in the pattern taxonomy.
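
The taxonomy and the pruning step might be sketched in Java as follows. This is only an illustration under two assumptions: the taxonomy is stored as links from each pattern to its direct super-patterns (one term longer), and "meaningless" is taken to mean non-closed, that is, a pattern that has a super-pattern with the same support. The post does not define the pruning rule, and the class `PatternTaxonomy` and its methods are hypothetical names.

```java
import java.util.*;

class PatternTaxonomy {

    /** True if the pattern's terms occur in the sequence in the same order (gaps allowed). */
    static boolean isSubsequence(List<String> pattern, List<String> sequence) {
        int matched = 0;
        for (String term : sequence) {
            if (matched < pattern.size() && pattern.get(matched).equals(term)) {
                matched++;
            }
        }
        return matched == pattern.size();
    }

    /** Links every pattern to its direct super-patterns (exactly one term longer). */
    static Map<List<String>, List<List<String>>> buildTaxonomy(Set<List<String>> patterns) {
        Map<List<String>, List<List<String>>> superPatterns = new LinkedHashMap<>();
        for (List<String> p : patterns) {
            List<List<String>> parents = new ArrayList<>();
            for (List<String> q : patterns) {
                if (q.size() == p.size() + 1 && isSubsequence(p, q)) {
                    parents.add(q);
                }
            }
            superPatterns.put(p, parents);
        }
        return superPatterns;
    }

    /**
     * ASSUMED pruning rule: drop a pattern when a direct super-pattern has the
     * same support. Checking direct super-patterns is enough only if the input
     * holds the complete set of frequent patterns.
     */
    static Map<List<String>, Integer> pruneNonClosed(Map<List<String>, Integer> supports) {
        Map<List<String>, List<List<String>>> taxonomy = buildTaxonomy(supports.keySet());
        Map<List<String>, Integer> closed = new LinkedHashMap<>();
        for (Map.Entry<List<String>, Integer> entry : supports.entrySet()) {
            boolean absorbed = false;
            for (List<String> parent : taxonomy.get(entry.getKey())) {
                if (supports.get(parent).equals(entry.getValue())) {
                    absorbed = true;
                    break;
                }
            }
            if (!absorbed) closed.put(entry.getKey(), entry.getValue());
        }
        return closed;
    }

    public static void main(String[] args) {
        Map<List<String>, Integer> supports = new LinkedHashMap<>();
        supports.put(Arrays.asList("carbon"), 3);
        supports.put(Arrays.asList("emission"), 2);
        supports.put(Arrays.asList("carbon", "emission"), 2);
        System.out.println(pruneNonClosed(supports).keySet());
    }
}
```

In the toy input, "emission" has the same support as "carbon, emission" and is pruned, while "carbon" is kept because it occurs more often than any of its super-patterns.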

IPEvolving

In this module we take positive documents and negative documents, and we adjust the term weights based on the weights of the terms in the positive and negative documents. This technique is intended to give accurate results even when documents have many overlapping terms and little content.
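
A minimal Java sketch of this kind of weight adjustment is given below. The post does not state the update rule, so the one here is an assumption: every term that also occurs in a negative document has its weight multiplied by a `penalty` factor (a value below 1 reduces the weight, and a negative value would produce negative weights). The class `TermWeightEvolver`, the methods `evolve` and `score`, and the `penalty` parameter are hypothetical.

```java
import java.util.*;

class TermWeightEvolver {

    /**
     * Adjusts weights learned from positive documents using negative documents.
     * ASSUMPTION (not from the post): terms seen in any negative document are
     * scaled by the penalty factor; all other weights are left unchanged.
     */
    static Map<String, Double> evolve(Map<String, Double> positiveWeights,
                                      List<Set<String>> negativeDocs,
                                      double penalty) {
        Set<String> negativeTerms = new HashSet<>();
        for (Set<String> doc : negativeDocs) negativeTerms.addAll(doc);

        Map<String, Double> adjusted = new TreeMap<>();
        for (Map.Entry<String, Double> entry : positiveWeights.entrySet()) {
            double weight = entry.getValue();
            if (negativeTerms.contains(entry.getKey())) {
                weight *= penalty;
            }
            adjusted.put(entry.getKey(), weight);
        }
        return adjusted;
    }

    /** Scores a document as the sum of the weights of the terms it contains. */
    static double score(Set<String> doc, Map<String, Double> weights) {
        double total = 0.0;
        for (String term : doc) {
            total += weights.getOrDefault(term, 0.0);
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Double> weights = new TreeMap<>();
        weights.put("carbon", 0.5);
        weights.put("emission", 0.3);
        weights.put("tax", 0.2);
        List<Set<String>> negatives = new ArrayList<>();
        negatives.add(new HashSet<>(Arrays.asList("carbon", "paper", "copy")));

        Map<String, Double> adjusted = evolve(weights, negatives, 0.5);
        Set<String> incoming = new HashSet<>(Arrays.asList("carbon", "paper"));
        System.out.println(score(incoming, weights) + " -> " + score(incoming, adjusted));
    }
}
```

In this toy case the term "carbon" is shared with a negative document, so a new document that only matches on "carbon" scores lower after the adjustment, while documents that match on "emission" and "tax" keep their score.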