I want the full, working source code for Automatic Discovery of Personal Name Aliases from the Web.
An individual is typically referred to by numerous aliases on the web. Accurate identification of the aliases of a given person's name is useful in various web-related tasks such as information retrieval, sentiment analysis, personal name disambiguation, and relation extraction. We propose a method to extract aliases of a given personal name from the web. Given a personal name, the proposed method first extracts a set of candidate aliases. Second, we rank the extracted candidates according to the likelihood that a candidate is a correct alias of the given name. We propose a novel approach based on automatically extracted lexical patterns to efficiently extract a large set of candidate aliases from snippets retrieved from a web search engine. We define several ranking scores to evaluate candidate aliases using three approaches: lexical pattern frequency, word co-occurrences in an anchor text graph, and page counts on the web.
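The candidate extraction step described above can be sketched roughly as follows. This is not the authors' code: the method learns its lexical patterns automatically, whereas the four patterns here are hand-written for illustration, and the names `CANDIDATE`, `PATTERN_TEMPLATES`, `compile_patterns`, and `extract_candidates` are made up for this example. It also assumes you already have the search-engine snippets as plain strings and that an alias is one to three capitalised words.

```python
import re
from collections import Counter

# Assumption: a candidate alias is one to three capitalised tokens.
CANDIDATE = r"([A-Z][\w'-]*(?: [A-Z][\w'-]*){0,2})"

# Hypothetical hand-written patterns. <NAME> is the given name and
# <ALIAS> is the slot the candidate is captured from.
PATTERN_TEMPLATES = [
    r"<NAME>,? aka <ALIAS>",
    r"<NAME>,? also known as <ALIAS>",
    r"<NAME>,? better known as <ALIAS>",
    r"<ALIAS>,? whose real name is <NAME>",
]


def compile_patterns(name):
    """Instantiate every template for one concrete personal name."""
    escaped = re.escape(name)
    return [
        re.compile(t.replace("<ALIAS>", CANDIDATE).replace("<NAME>", escaped))
        for t in PATTERN_TEMPLATES
    ]


def extract_candidates(name, snippets):
    """Count how often each candidate alias is matched across the snippets."""
    patterns = compile_patterns(name)
    counts = Counter()
    for snippet in snippets:
        for pattern in patterns:
            for candidate in pattern.findall(snippet):
                if candidate != name:  # the name is not its own alias
                    counts[candidate] += 1
    return counts
```

Calling `extract_candidates("William Henry Gates", snippets)` on a list of snippet strings returns a `Counter` of candidates. The raw counts are only the simplest possible pattern-frequency score, so a real system would still need the anchor-text and page-count scores on top.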
To build a robust alias detection system, we integrate the different ranking scores into a single ranking function using ranking support vector machines. We evaluated the proposed method on three data sets: an English personal names data set, an English place names data set, and a Japanese personal names data set. The proposed method outperforms numerous baselines and previously proposed name alias extraction methods, achieving a statistically significant mean reciprocal rank (MRR) of 0.67. Experiments using location names and Japanese personal names suggest that the proposed method can be extended to extract aliases for different types of named entities and for different languages. In addition, aliases extracted using the proposed method are used successfully in an information retrieval task and improve recall by 20% in a relation detection task.
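Here is a small sketch of the last two steps: combining the ranking scores into one function and evaluating the result with mean reciprocal rank (MRR), the metric behind the 0.67 figure. A linear ranking SVM ends up scoring each candidate with a weighted sum of its features, so the code takes the weights as given instead of training them. The weights, the feature values, and the function names `rank_candidates` and `mean_reciprocal_rank` are all assumptions for illustration, not taken from the paper.

```python
def rank_candidates(features, weights):
    """Order candidates by the weighted sum of their ranking scores.

    features: dict mapping candidate -> tuple of scores
              (e.g. pattern frequency, anchor-text score, page-count score).
    weights:  one weight per score; a stand-in for learned SVM weights.
    """
    def combined(scores):
        return sum(w * s for w, s in zip(weights, scores))

    return sorted(features, key=lambda c: combined(features[c]), reverse=True)


def mean_reciprocal_rank(rankings, gold):
    """Average of 1/rank of the first correct alias for each name.

    rankings: dict mapping name -> ranked list of candidates.
    gold:     dict mapping name -> set of correct aliases.
    A name with no correct alias in its ranking contributes 0.
    """
    total = 0.0
    for name, ranked in rankings.items():
        for position, candidate in enumerate(ranked, start=1):
            if candidate in gold[name]:
                total += 1.0 / position
                break
    return total / len(rankings)
```

For actual training you would need labelled name and alias pairs and a ranking SVM implementation; the functions above only cover scoring and evaluation once the weights exist.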