
NATURAL LANGUAGE GRAMMAR INDUCTION USING PARALLEL GENETIC ALGORITHM

Presented By:
Hari Mohan Pandey
Department of Computer Engineering
MPSTME, NMIMS University, Mumbai
Under the Guidance of
Prof. Dr. N.S. Choubey

ABSTRACT
Human communication takes two forms: oral and written. Natural Language Processing (NLP) deals with both. For oral communication, NLP supports the area known as "Speech Processing"; for written communication, it supports the area known as "Text Processing". This report focuses on the second mode, in which we communicate by writing in some language. We propose a methodology that describes the components and operation of a program that induces the grammatical structure of a language from a set of sample sentences (simple English sentences). To optimize the result, we use a parallel genetic algorithm to extract the grammar as a finite set of strings of non-terminals and pre-terminals. We also present an architectural approach to load balancing for the global class of genetic algorithms (semi-synchronous master-slave). Throughout, the global parallel genetic algorithm is used to induce the grammar in the form of a context-free grammar (CFG).
LIST OF FIGURES
1. Figure 2.1: Natural Language Processing Levels
2. Figure 2.2: Approaches for Natural Language Processing
3. Figure 2.3: A schematic of a Master-Slave Parallel GA
4. Figure 2.4: A schematic of a Fine-Grained Parallel GA
5. Figure 2.5: A schematic of a Multiple-Population GA
6. Figure 2.6: A schematic of a Hierarchical GA
7. Figure 2.7: A schematic of a Hierarchical Parallel GA
8. Figure 2.8: A schematic of a Hybrid that uses a Multi-deme GA at both the upper and lower levels
9. Figure 4.1: Parallelization of Genetic Algorithm showing overall tasks.
10. Figure 4.2: Layout of work to the nodes (CPUs) used to balance the compute load between synchronous points.
LIST OF TABLES
1. Table-1: Summary of related work on the Global, Regional and Local Models
2. Table-2: Algorithm to construct a Grammar from an Individual Chromosome
3. Table-3: Extraction of a Grammar from a Set of Samples by the Induction Algorithm
LIST OF ALGORITHMS
1. Algorithm to Construct Grammar from Individual Chromosome.
2. Grammar Induction Algorithm
CONTENTS
1. Introduction
1.1 Introduction to Natural Language Processing
1.2 NLP: Goals
1.3 NLP: Origins
1.4 NLP: Divisions
1.5 Report Outlines
2. Background
2.1 History of Natural Language Processing.
2.2 Levels of Natural Language Processing
2.3 Approaches to Natural Language Processing.
2.3.1 Symbolic Approach
2.3.2 Statistical Approach
2.3.3 Connectionist Approach
2.3.4 Hybrid Approach
2.4 Comparison Among Approaches
2.5 NLP: Application
2.6 INTRODUCTION: Genetic Algorithms
2.7 Parallel Genetic Algorithms
2.7.1 Classification of Parallel Genetic Algorithms
2.8 Master Slave Parallelization
2.9 Multiple-Deme Parallel GAs- The First Generation
2.10 Coarse-Grained Parallel GAs-The Second Generation
2.11 Fine-Grained Parallel GAs
2.12 Hierarchical Parallel Algorithms
3. Problem Definition and Proposed Solution
3.1 Introduction
3.2 Proposed Method
3.3 Previous Work
4. Implementation Methodology
4.1 Introduction
1. Grammar Induction Process
2. Genetic Algorithms
4.2 Parallel Genetic Algorithm
a) Global
b) Coarse Grained
c) Fine Grained
d) Hybrid GA
4.2.1 Parallel Genetic Algorithms: Population Model
a) Global Model
b) Regional Model
c) Local Model
4.3 System Model for the Problem
4.3.1 Load Balancing in Master-Slave type GA
4.3.2 Learning Approach for CFG Using GAs
4.4 Suitability of GAs for the problem
4.5 Parser and Tagger
4.6 Representation of Individuals
4.7 Fitness Function
4.8 Grammar Induction: A Case Study
5. Conclusion and Future Enhancements
6. References
CHAPTER-1
INTRODUCTION
1.1 NATURAL LANGUAGE PROCESSING
Natural Language Processing (NLP) is the computerized approach to analyzing text that is based on both a set of theories and a set of technologies. And, being a very active area of research and development, there is not a single agreed-upon definition that would satisfy everyone, but there are some aspects, which would be part of any knowledgeable person's definition. The definition one can offer is:
Definition: Natural Language Processing is a theoretically motivated range of computational techniques for analyzing and representing naturally occurring texts at one or more levels of linguistic analysis for the purpose of achieving human-like language processing for a range of tasks or applications.
Work in computational linguistics began very soon after the development of the first computers (Booth, Brandwood and Cleave 1958), yet in the intervening four decades there has been a pervasive feeling that progress in computer understanding of natural language has not been commensurate with progress in other computer applications. In 1993, a number of prominent researchers in natural language processing met to assess the state of the discipline and discuss future directions (Bates and Weischedel 1993). The consensus of this meeting was that increased attention to large amounts of lexical and domain knowledge was essential for significant progress, and current research efforts in the field reflect this point of view.
The traditional approach in computational linguistics (Allen 1987) included a prominent concentration on the formal mechanisms available for processing language, especially as these applied to syntactic processing and, somewhat less so, to semantic interpretation. Work in these areas continues, but there has been a marked trend toward enhancing these core resources with statistical knowledge acquisition techniques. There is considerable research aimed at using online resources for assembling large knowledge bases, drawing on natural language corpora as well as dictionaries and other structured resources. Recent research in lexical semantics addresses the proper structuring of this information to support linguistic processing. Furthermore, the availability of large amounts of machine-readable text naturally supports continued work in the analysis of connected discourse. Statistical techniques are also being used as part of the parsing process, for automatic part-of-speech assignment, and for word sense disambiguation.
An indication of the development of natural language processing systems is that they are increasingly being used in support of other computer programs. This trend is particularly noticeable with regard to information management applications. Natural language processing provides a potential means of gaining access to the information inherent in the large amount of text made available through the Internet.
Several points emerge from the above definition of Natural Language Processing.
A) NATURALLY OCCURRING TEXT: Naturally occurring texts can be of any language, mode, genre, etc. The texts can be in two forms:
a) Oral, or
b) Written.
The only requirement is that they be in a language used by humans to communicate with one another. Also, the text being analyzed should not be specifically constructed for the purpose of the analysis; rather, it should be gathered from actual usage.
B) LEVELS OF LINGUISTIC ANALYSIS: This refers to the fact that there are multiple types of language processing known to be at work when humans produce or comprehend language. It is thought that humans normally utilize all of these levels, since each level conveys different types of meaning. But various NLP systems utilize different levels, or combinations of levels, of linguistic analysis, and this is seen in the differences amongst various NLP applications. It also leads to much confusion on the part of non-specialists as to what NLP really is, because a system that uses any subset of these levels of analysis can be said to be an NLP-based system. The difference between them, therefore, may actually be whether the system uses 'weak' NLP or 'strong' NLP.
C) HUMAN-LIKE LANGUAGE PROCESSING: It reveals that NLP is considered a discipline within Artificial Intelligence (AI). And while the full lineage of NLP does depend on a number of other disciplines, since NLP strives for human-like performance, it is appropriate to consider it an AI discipline.
D) RANGE OF TASKS OR APPLICATIONS: The application areas point out that Natural Language Processing is not usually considered a goal in and of itself, except perhaps for AI researchers. For others, NLP is the means for accomplishing a particular task. Therefore, we have Information Retrieval (IR) systems that utilize NLP, as well as Machine Translation (MT), Question-Answering, and so on.
1.2 NLP: GOAL
The goal of NLP, as stated above, is "to accomplish human-like language processing". The choice of the word 'processing' is very deliberate, and should not be replaced with 'understanding'. For although the field of NLP was originally referred to as Natural Language Understanding (NLU) in the early days of AI, it is well agreed today that while the goal of NLP is true NLU, that goal has not yet been accomplished.
A full NLU System would be able to:
1. Paraphrase an input text
2. Translate the text into another language
3. Answer questions about the contents of the text
4. Draw inferences from the text
While NLP has made serious inroads into accomplishing goals 1 to 3, the fact that NLP systems cannot, of themselves, draw inferences from text means that NLU still remains the goal of NLP.
There are more practical goals for NLP, many related to the particular application for which it is being utilized. For example: an NLP-based IR system has the goal of providing more precise, complete information in response to a user's real information need. The goal of the NLP system here is to represent the true meaning and intent of the user's query, which can be expressed as naturally in everyday language as if they were speaking to a reference librarian. Also, the contents of the documents that are being searched will be represented at all their levels of meaning so that a true match between need and response can be found, no matter how either is expressed in its surface form.
1.3 NLP: ORIGINS
As with most modern disciplines, the lineage of NLP is mixed, and the field still shows strong emphases from groups whose backgrounds are more influenced by one or another of its contributing disciplines.
Key among the contributors to the discipline and practice of NLP are:
a) Linguistics - It focuses on formal, structural models of language and the discovery of language universals - in fact the field of NLP was originally referred to as Computational Linguistics.
b) Computer Science - It is concerned with developing internal representations of data and efficient processing of these structures.
c) Cognitive Psychology - It looks at language usage as a window into human cognitive processes, and has the goal of modeling the use of language in a psychologically plausible way.
1.4 NLP: DIVISIONS
While the entire field is referred to as Natural Language Processing, there are in fact two distinct focuses:
1. Language Processing
2. Language Generation.
The first of these refers to the analysis of language for the purpose of producing a meaningful representation, while the latter refers to the production of language from a representation.
The task of Natural Language Processing is equivalent to the role of reader/listener, while the task of Natural Language Generation is that of the writer/speaker.
While much of the theory and technology are shared by these two divisions, Natural Language Generation also requires a planning capability. That is, the generation system requires a plan or model of the goal of the interaction in order to decide what the system should generate at each point in an interaction.
We will focus on the task of natural language analysis, as this is most relevant to Library and Information Science.
Another distinction is traditionally made between language understanding and speech understanding. Speech understanding starts with, and speech generation ends with, oral language; these therefore rely on the additional fields of acoustics and phonology.
Speech understanding focuses on how the 'sounds' of language as picked up by the system in the form of acoustical waves are transcribed into recognizable morphemes and words. Once in this form, the same levels of processing which are utilized on written text are utilized.
1.5 REPORT OUTLINE
In this report we propose a method of "Natural Language Grammar Induction Using Parallel Genetic Algorithm". Many previous grammar induction systems have worked on the assumption that grammar induction is a subset of inductive logic programming (ILP), and as such have primarily depended on ILP techniques. Much less work has been done on using evolutionary techniques to evolve natural language grammars. In one of the few papers applying evolutionary techniques to grammar induction, Keller and Lutz [4] used a genetic algorithm to evolve stochastic context-free grammars for finite languages; in that work only positive examples of language data were presented. Margaret Aycinena, Mykel J. Kochenderfer and David Carl Mulford [5] presented a paper at Stanford University in 2003 that goes significantly beyond what was done in [4]. First, their system evolves non-stochastic grammars, rather than simply the probability parameters for a pre-chosen grammar. Second, the system uses both positive and negative examples of language data. Third, the fitness function is based on the ability of a given grammar to parse the data. Finally, the corpora were samples of part-of-speech tagged natural English.
This project presents some work that goes significantly beyond what was done in [4] and [5]. We propose a technique for grammar induction using a parallel genetic algorithm to obtain a well-structured grammar for natural language (a set of positive and negative sentences). Finally, we form parse trees to check the correctness of the induced grammar. Several techniques have been proposed for grammar induction using genetic algorithms, but here we use a parallel genetic algorithm, which can yield a more nearly optimal result.
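To make the role of positive and negative examples concrete, the following Python sketch shows one plausible way a grammar's fitness could be scored: reward positive sentences that the grammar parses and negative sentences that it rejects. The parses predicate and the toy "grammar" are hypothetical stand-ins for illustration only; they are not the parser or the exact fitness function developed later in this report.

# One plausible fitness score: fraction of positives parsed plus negatives rejected.
def fitness(grammar, positives, negatives, parses):
    accepted = sum(1 for s in positives if parses(grammar, s))
    rejected = sum(1 for s in negatives if not parses(grammar, s))
    return (accepted + rejected) / (len(positives) + len(negatives))

# Toy demonstration: a "grammar" that is just a set of allowed tag patterns,
# and a "parser" that checks membership (purely illustrative assumptions).
toy_grammar = {("DT", "NN", "VB", "DT", "NN")}
toy_parses = lambda g, sentence: tuple(sentence) in g

positives = [["DT", "NN", "VB", "DT", "NN"]]
negatives = [["VB", "DT", "NN"]]
print(fitness(toy_grammar, positives, negatives, toy_parses))   # 1.0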
The remaining part of the report contains the detailed information about the previous systems that we have made extensions on and the enhancements that are done.
Chapter-2 covers the historical development of Natural Language Processing, the different levels of natural language processing, approaches to natural language processing, and a basic introduction to Genetic Algorithms.
Chapter-3 gives a detailed description of the problem definition and the proposed solution.
Chapter-4 gives a detailed description of the solution and the methodology for achieving the goal.
Chapter-5 presents the conclusion and future enhancements.
CHAPTER-2
BACKGROUND: NATURAL LANGUAGE PROCESSING, GENETIC ALGORITHMS
2.1 HISTORY OF NLP
Research in natural language processing has been going on for several decades dating back to the late 1940s. Machine translation (MT) was the first computer-based application related to natural language. While Weaver and Booth started one of the earliest MT projects in 1946 on computer translation based on expertise in breaking enemy codes during World War II, it was generally agreed that it was Weaver's memorandum of 1949 that brought the idea of MT to general notice and inspired many projects. He suggested using ideas from cryptography and information theory for language translation. Research began at various research institutions in the United States within a few years.
Early work in MT took the simplistic view that the only differences between languages resided in their vocabularies and the permitted word orders. Systems developed from this perspective simply used dictionary lookup for appropriate words for translation and reordered the words after translation to fit the word-order rules of the target language, without taking into account the lexical ambiguity inherent in natural language. This produced poor results. The apparent failure made researchers realize that the task was a lot harder than anticipated, and that they needed a more adequate theory of language. However, it was not until 1957, when Chomsky published Syntactic Structures introducing the idea of generative grammar, that the field gained better insight into whether or how mainstream linguistics could help MT.
During this period, other NLP application areas began to emerge, such as speech recognition. The language processing community and the speech community were then split into two camps, with the language processing community dominated by the theoretical perspective of generative grammar and hostile to statistical methods, and the speech community dominated by statistical information theory and hostile to theoretical linguistics.
Owing to developments in the syntactic theory of language and parsing algorithms, there was such over-enthusiasm in the 1950s that people believed fully automatic high-quality translation systems would be able to produce results indistinguishable from those of human translators, and that such systems would be in operation within a few years. This expectation was not only unrealistic given the linguistic knowledge and computer systems available at the time, but also impossible in principle.
The inadequacies of then-existing systems, perhaps compounded by the over-enthusiasm, led to the ALPAC (Automatic Language Processing Advisory Committee of the National Academy of Sciences - National Research Council) report of 1966. The report concluded that MT was not immediately achievable and recommended that it not be funded. This had the effect of halting MT, and most work in other applications of NLP, at least within the United States.
Although there was a substantial decrease in NLP work during the years after the ALPAC report, there were some significant developments, both in theoretical issues and in construction of prototype systems. Theoretical work in the late 1960's and early 1970's focused on the issue of how to represent meaning and developing computationally tractable solutions that the then-existing theories of grammar were not able to produce. In 1965, Chomsky introduced the transformational model of linguistic competence. However, the transformational generative grammars were too syntactically oriented to allow for semantic concerns. They also did not lend themselves easily to computational implementation. As a reaction to Chomsky's theories and the work of other transformational generativists, case grammar of Fillmore, semantic networks of Quillian, and conceptual dependency theory of Schank, were developed to explain syntactic anomalies, and provide semantic representations. Augmented transition networks of Woods, extended the power of phrase-structure grammar by incorporating mechanisms from programming languages such as LISP. Other representation formalisms included Wilks' preference semantics, and Kay's functional grammar.
Alongside theoretical development, many prototype systems were developed to demonstrate the effectiveness of particular principles. Weizenbaum's ELIZA was built to replicate the conversation between a psychologist and a patient, simply by permuting or echoing the user input. Winograd's SHRDLU simulated a robot that manipulated blocks on a tabletop. Despite its limitations, it showed that natural language understanding was indeed possible for the computer. PARRY attempted to embody a theory of paranoia in a system. Instead of single keywords, it used groups of keywords, and used synonyms if keywords were not found. LUNAR was developed by Woods as an interface system to a database that consisted of information about lunar rock samples using augmented transition network and procedural semantics.
In the late 1970's, attention shifted to semantic issues, discourse phenomena, and communicative goals and plans. Grosz analyzed task-oriented dialogues and proposed a theory to partition the discourse into units based on her findings about the relation between the structure of a task and the structure of the task-oriented dialogue. Mann and Thompson developed Rhetorical Structure Theory, attributing hierarchical structure to discourse. Other researchers have also made significant contributions, including Hobbs and Rosenschein, Polanyi and Scha, and Reichman.
This period also saw considerable work on natural language generation. McKeown's discourse planner TEXT and McDonald's response generator MUMBLE used rhetorical predicates to produce declarative descriptions in the form of short texts, usually paragraphs. TEXT's ability to generate coherent responses online was considered a major achievement.
In the early 1980s, motivated by the availability of critical computational resources, the growing awareness within each community of the limitations of isolated solutions to NLP problems, and a general push toward applications that worked with language in a broad, real-world context, researchers started re-examining non-symbolic approaches that had lost popularity in early days. By the end of 1980s, symbolic approaches had been used to address many significant problems in NLP and statistical approaches were shown to be complementary in many respects to symbolic approaches.
In the last ten years of the millennium, the field was growing rapidly. This can be attributed to:
a) Increased availability of large amounts of electronic text.
b) Availability of computers with increased speed and memory.
c) The advent of the Internet.
Statistical approaches succeeded in dealing with many generic problems in computational linguistics, such as part-of-speech identification, word sense disambiguation, etc., and have become standard throughout NLP.
NLP researchers are now developing next generation NLP systems that deal reasonably well with general text and account for a good portion of the variability and ambiguity of language.
2.2 LEVELS OF NATURAL LANGUAGE PROCESSING
The most explanatory method for presenting what actually happens within a Natural Language Processing system is by means of the 'levels of language' approach. This is also referred to as the synchronic model of language and is distinguished from the earlier sequential model, which hypothesizes that the levels of human language processing follow one another in a strictly sequential manner. Psycholinguistic research suggests that language processing is much more dynamic, as the levels can interact in a variety of orders. Introspection reveals that we frequently use information we gain from what is typically thought of as a higher level of processing to assist in a lower level of analysis.
For example, the pragmatic knowledge that the document you are reading is about biology will be used when a particular word that has several possible senses (or meanings) is encountered, and the word will be interpreted as having the biology sense.
Of necessity, the following description of levels will be presented sequentially. The key point here is that meaning is conveyed by each and every level of language and that since humans have been shown to use all levels of language to gain understanding, the more capable an NLP system is, the more levels of language it will utilize.
Figure 2.1: Natural Language Processing Levels - Phonology, Morphology, Lexical, Syntactic, Semantic, Discourse, and Pragmatic
2.2.1 Phonology
This level deals with the interpretation of speech sounds within and across words. There are, in fact, three types of rules used in phonological analysis:
1) Phonetic rules - for sounds within words;
2) Phonemic rules - for variations of pronunciation when words are spoken together, and;
3) Prosodic rules - for fluctuation in stress and intonation across a sentence.
In an NLP system that accepts spoken input, the sound waves are analyzed and encoded into a digitized signal for interpretation by various rules or by comparison to the particular language model being utilized.
2.2.2 Morphology
This level deals with the componential nature of words, which are composed of morphemes - the smallest units of meaning. For example, the word preregistration can be morphologically analyzed into three separate morphemes:
1. The prefix pre
2. The root registra, and
3. The suffix tion.
Since the meaning of each morpheme remains the same across words, humans can break down an unknown word into its constituent morphemes in order to understand its meaning. Similarly, an NLP system can recognize the meaning conveyed by each morpheme in order to gain and represent meaning.
For example, adding the suffix -ed to a verb conveys that the action of the verb took place in the past. This is a key piece of meaning, and in fact, it is frequently only evidenced in a text by the use of the -ed morpheme.
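As a toy Python sketch of this idea, a handful of suffix rules can map a surface form to a root plus a grammatical feature. The suffix table and the length check below are illustrative assumptions; real morphological analysis needs a lexicon and far richer rules.

# Naive suffix stripping: map a word to (root, suffix, grammatical feature).
SUFFIXES = {"ed": "PAST", "ing": "PROGRESSIVE", "s": "PLURAL-or-3SG"}

def analyse(word):
    for suffix, feature in SUFFIXES.items():
        # require a reasonably long remaining root before stripping
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return {"root": word[:-len(suffix)], "suffix": suffix, "feature": feature}
    return {"root": word, "suffix": None, "feature": None}

print(analyse("registered"))   # {'root': 'register', 'suffix': 'ed', 'feature': 'PAST'}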
2.2.3 Lexical
At this level, humans, as well as NLP systems, interpret the meaning of individual words. Several types of processing contribute to word-level understanding - the first of these being assignment of a single part-of-speech tag to each word. In this processing, words that can function as more than one part-of-speech are assigned the most probable part-of-speech tag based on the context in which they occur.
Additionally at the lexical level, those words that have only one possible sense or meaning can be replaced by a semantic representation of that meaning. The nature of the representation varies according to the semantic theory utilized in the NLP system. The following representation of the meaning of the word launch is in the form of logical predicates. As can be observed, a single lexical unit is decomposed into its more basic properties. Given that there is a set of semantic primitives used across all words; these simplified lexical representations make it possible to unify meaning across words and to produce complex interpretations, much the same as humans do.
Launch (a large boat used for carrying people on rivers, lakes, harbors, etc.)
((CLASS BOAT) (PROPERTIES (LARGE) (PURPOSE (PREDICATION (CLASS CARRY) (OBJECT PEOPLE)))))
The lexical level may require a lexicon, and the particular approach taken by an NLP system will determine whether a lexicon will be utilized, as well as the nature and extent of information that is encoded in the lexicon. Lexicons may be quite simple, with only the words and their part(s)-of-speech, or may be increasingly complex and contain information on the semantic class of the word, what arguments it takes, and the semantic limitations on these arguments, definitions of the sense(s) in the semantic representation utilized in the particular system, and even the semantic field in which each sense of a polysemous word is used.
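As an illustration, a tiny lexicon of this kind could be held as a nested Python structure, with the word's possible parts of speech and, for an unambiguous sense, a decomposed semantic representation along the lines of the 'launch' example above. The entries and tag names below are assumptions for illustration only, not a lexicon used by this project.

# A minimal lexicon: parts of speech plus a decomposed sense representation.
LEXICON = {
    "launch": {
        "pos": ["NN", "VB"],
        "senses": {
            "NN": ("CLASS", "BOAT",
                   ("PROPERTIES", "LARGE"),
                   ("PURPOSE", ("PREDICATION", ("CLASS", "CARRY"),
                                ("OBJECT", "PEOPLE")))),
        },
    },
    "the": {"pos": ["DT"], "senses": {}},
}

def lookup(word, tag):
    # return the semantic representation for a word under a given tag, if any
    return LEXICON.get(word, {}).get("senses", {}).get(tag)

print(lookup("launch", "NN"))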
2.2.4 Syntactic
This level focuses on analyzing the words in a sentence so as to uncover the grammatical structure of the sentence. This requires both:
1. A grammar and
2. A parser.
The output of this level of processing is a (possibly delinearized) representation of the sentence that reveals the structural dependency relationships between the words. There are various grammars that can be utilized, and which will, in turn, impact the choice of a parser. Not all NLP applications require a full parse of sentences, therefore the remaining challenges in parsing of prepositional phrase attachment and conjunction scoping no longer stymie those applications for which phrasal and clausal dependencies are sufficient. Syntax conveys meaning in most languages because order and dependency contribute to meaning.
For example, the two sentences:
S1: 'The dog chased the cat.'
S2: 'The cat chased the dog.'
differ only in terms of syntax, yet convey quite different meanings.
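As a minimal illustration of how a grammar and a parser work together, the following Python sketch implements a CYK recognizer for a toy context-free grammar in Chomsky normal form. The grammar, the sentences, and the function names are illustrative assumptions, not the grammar or parser developed later in this report.

# CYK recognition with a tiny CNF grammar: terminal rules POS -> word
# and binary rules A -> B C.
lexical = {
    ("Det", "the"), ("N", "dog"), ("N", "cat"), ("V", "chased"),
}
binary = {
    ("S", "NP", "VP"), ("NP", "Det", "N"), ("VP", "V", "NP"),
}

def cyk_parses(words, start="S"):
    n = len(words)
    # table[i][j] holds the non-terminals that span words[i:j+1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, w in enumerate(words):
        table[i][i] = {a for (a, t) in lexical if t == w}
    for span in range(2, n + 1):                 # span length
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):                # split point
                for (a, b, c) in binary:
                    if b in table[i][k] and c in table[k + 1][j]:
                        table[i][j].add(a)
    return start in table[0][n - 1]

print(cyk_parses("the dog chased the cat".split()))   # True
print(cyk_parses("dog the the chased cat".split()))   # False

The same recognizer shape is what a grammar-induction fitness function can call on each example sentence, which is why the parser's efficiency matters when many candidate grammars must be evaluated.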
2.2.5 Semantic
This is the level at which most people think meaning is determined, however, as we can see in the above defining of the levels, it is all the levels that contribute to meaning. Semantic processing determines the possible meanings of a sentence by focusing on the interactions among word-level meanings in the sentence. This level of processing can include the semantic disambiguation of words with multiple senses; in an analogous way to how syntactic disambiguation of words that can function as multiple parts-of-speech is accomplished at the syntactic level. Semantic disambiguation permits one and only one sense of polysemous words to be selected and included in the semantic representation of the sentence. For example, amongst other meanings, 'file' as a noun can mean either a folder for storing papers, or a tool to shape one's fingernails, or a line of individuals in a queue. If information from the rest of the sentence were required for the disambiguation, the semantic, not the lexical level, would do the disambiguation. A wide range of methods can be implemented to accomplish the disambiguation, some which require information as to the frequency with which each sense occurs in a particular corpus of interest, or in general usage, some which require consideration of the local context, and others which utilize pragmatic knowledge of the domain of the document.
2.2.6 Discourse
While syntax and semantics work with sentence-length units, the discourse level of NLP works with units of text longer than a sentence. That is, it does not interpret multisentence texts as just concatenated sentences, each of which can be interpreted singly. Rather, discourse focuses on the properties of the text as a whole that convey meaning by making connections between component sentences. Several types of discourse processing can occur at this level, two of the most common being anaphora resolution and discourse/text structure recognition. Anaphora resolution is the replacing of words such as pronouns, which are semantically vacant, with the appropriate entity to which they refer. Discourse/text structure recognition determines the functions of sentences in the text, which, in turn, adds to the meaningful representation of the text. For example, newspaper articles can be deconstructed into discourse components such as: Lead, Main Story, Previous Events, Evaluation, Attributed Quotes, and Expectation.
2.2.7 Pragmatic
This level is concerned with the purposeful use of language in situations and utilizes context over and above the contents of the text for understanding. The goal is to explain how extra meaning is read into texts without actually being encoded in them. This requires much world knowledge, including the understanding of intentions, plans, and goals. Some NLP applications may utilize knowledge bases and inferencing modules.
For example, the following two sentences require resolution of the anaphoric term 'they', but this resolution requires pragmatic or world knowledge.
S1: The city councilors refused the demonstrators a permit because they feared violence.
S2: The city councilors refused the demonstrators a permit because they advocated revolution.
2.3 APPROACHES TO NATURAL LANGUAGE PROCESSING
Natural language processing approaches fall roughly into four categories: symbolic, statistical, connectionist, and hybrid. Symbolic and statistical approaches have coexisted since the early days of this field.
Figure 2.2: Approaches for Natural Language Processing - Symbolic, Statistical, Connectionist, and Hybrid
Connectionist NLP work first appeared in the 1960's. For a long time, symbolic approaches dominated the field. In the 1980's, statistical approaches regained popularity as a result of the availability of critical computational resources and the need to deal with broad, real-world contexts. Connectionist approaches also recovered from earlier criticism by demonstrating the utility of neural networks in NLP. This section examines each of these approaches in terms of their foundations, typical techniques, differences in processing and system aspects, and their robustness, flexibility, and suitability for various tasks.
2.3.1 Symbolic Approach
Symbolic approaches perform deep analysis of linguistic phenomena and are based on explicit representation of facts about language through well-understood knowledge representation schemes and associated algorithms. In fact, the description of the levels of language analysis in the preceding section is given from a symbolic perspective. The primary source of evidence in symbolic systems comes from human-developed rules and lexicons.
A good example of symbolic approaches is seen in logic or rule-based systems. In logic based systems, the symbolic structure is usually in the form of logic propositions. Manipulations of such structures are defined by inference procedures that are generally truth preserving. Rule-based systems usually consist of a set of rules, an inference engine, and a workspace or working memory. Knowledge is represented as facts or rules in the rule-base. The inference engine repeatedly selects a rule whose condition is satisfied and executes the rule.
Another example of symbolic approaches is semantic networks. First proposed by Quillian to model associative memory in psychology, semantic networks represent knowledge through a set of nodes that represent objects or concepts and the labeled links that represent relations between nodes. The pattern of connectivity reflects semantic organization, that is; highly associated concepts are directly linked whereas moderately or weakly related concepts are linked through intervening concepts. Semantic networks are widely used to represent structured knowledge and have the most connectionist flavor of the symbolic models.
Symbolic approaches have been used for a few decades in a variety of research areas and applications such as information extraction, text categorization, ambiguity resolution, and lexical acquisition. Typical techniques include: explanation-based learning, rule-based learning, inductive logic programming, decision trees, conceptual clustering, and K-nearest neighbor algorithms.
2.3.2 Statistical Approach
Statistical approaches employ various mathematical techniques and often use large text corpora to develop approximate generalized models of linguistic phenomena based on actual examples of these phenomena provided by the text corpora without adding significant linguistic or world knowledge. In contrast to symbolic approaches, statistical approaches use observable data as the primary source of evidence.
A frequently used statistical model is the Hidden Markov Model (HMM) inherited from the speech community. HMM is a finite state automaton that has a set of states with probabilities attached to transitions between states. Although outputs are visible, states themselves are not directly observable, thus "hidden" from external observations. Each state produces one of the observable outputs with a certain probability.
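To make the HMM idea concrete, the following Python sketch decodes the most probable tag sequence for a two-state toy part-of-speech tagger using the Viterbi algorithm. All states and probabilities are invented for illustration; a real tagger would estimate them from a tagged corpus.

# Toy Viterbi decoding over a two-state (N/V) hidden Markov model.
states = ["N", "V"]
start_p = {"N": 0.7, "V": 0.3}
trans_p = {"N": {"N": 0.4, "V": 0.6}, "V": {"N": 0.8, "V": 0.2}}
emit_p = {
    "N": {"dogs": 0.5, "bark": 0.1, "cats": 0.4},
    "V": {"dogs": 0.1, "bark": 0.8, "cats": 0.1},
}

def viterbi(words):
    # best[t][s] = (probability, previous state) of the best path ending in s
    best = [{s: (start_p[s] * emit_p[s].get(words[0], 1e-6), None) for s in states}]
    for w in words[1:]:
        row = {}
        for s in states:
            prob, prev = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s].get(w, 1e-6), p)
                for p in states
            )
            row[s] = (prob, prev)
        best.append(row)
    # backtrack from the most probable final state
    tag = max(states, key=lambda s: best[-1][s][0])
    path = [tag]
    for row in reversed(best[1:]):
        tag = row[tag][1]
        path.append(tag)
    return list(reversed(path))

print(viterbi(["dogs", "bark"]))   # expected ['N', 'V']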
Statistical approaches have typically been used in tasks such as speech recognition, lexical acquisition, parsing, part-of-speech tagging, collocations, statistical machine translation, and statistical grammar learning, and so on.
2.3.3 Connectionist Approach
Similar to the statistical approaches, connectionist approaches also develop generalized models from examples of linguistic phenomena. What separates connectionism from other statistical methods is that connectionist models combine statistical learning with various theories of representation - thus the connectionist representations allow transformation, inference, and manipulation of logic formulae. In addition, in connectionist systems, linguistic models are harder to observe due to the fact that connectionist architectures are less constrained than statistical ones.
Generally speaking, a connectionist model is a network of interconnected simple processing units with knowledge stored in the weights of the connections between units. Local interactions among units can result in dynamic global behavior, which, in turn, leads to computation.
Some connectionist models are called localist models, assuming that each unit represents a particular concept. For example, one unit might represent the concept "mammal" while another unit might represent the concept "whale". Relations between concepts are encoded by the weights of connections between those concepts. Knowledge in such models is spread across the network, and the connectivity between units reflects their structural relationship. Localist models are quite similar to semantic networks, but the links between units are not usually labeled as they are in semantic nets. They perform well at tasks such as word-sense disambiguation, language generation, and limited inference.
Other connectionist models are called distributed models. Unlike that in localist models, a concept in distributed models is represented as a function of simultaneous activation of multiple units. An individual unit only participates in a concept representation. These models are well suited for natural language processing tasks such as syntactic parsing, limited domain translation tasks, and associative retrieval.
2.4 COMPARISON AMONG APPROACHES
From the above section, we have seen that similarities and differences exist between approaches in terms of their assumptions, philosophical foundations, and source of evidence. In addition to that, the similarities and differences can also be reflected in the processes each approach follows, as well as in system aspects, robustness, flexibility, and suitable tasks.
2.4.1 Process: Research using these different approaches follows a general set of steps, namely, data collection, data analysis/model building, rule/data construction and application of rules/data in system. The data collection stage is critical to all three approaches although statistical and connectionist approaches typically require much more data than symbolic approaches. In the data analysis/model building stage, symbolic approaches rely on human analysis of the data in order to form a theory while statistical approaches manually define a statistical model that is an approximate generalization of the collected data. Connectionist approaches build a connectionist model from the data. In the rule / data construction stage, manual efforts are typical for symbolic approaches and the theory formed in the previous step may evolve when new cases are encountered. In contrast, statistical and connectionist approaches use the statistical or connectionist model as guidance and build rules or data items automatically, usually in relatively large quantity. After building rules or data items, all approaches then automatically apply them to specific tasks in the system. For instance, connectionist approaches may apply the rules to train the weights of links between units.
2.4.2 System aspects: By system aspects, we mean source of data, theory or model formed from data analysis, rules, and basis for evaluation.
2.4.3 Data: As mentioned earlier, symbolic approaches use human introspective data, which are usually not directly observable. Statistical and connectionist approaches are built on the basis of machine observable facets of data, usually from text corpora.
2.4.4 Theory or model based on data analysis: As the outcome of data analysis, a theory is formed for symbolic approaches whereas a parametric model is formed for statistical approaches and a connectionist model is formed for connectionist approaches.
2.4.5 Rules: For symbolic approaches, the rule construction stage usually results in rules with detailed criteria of rule application. For statistical approaches, the criteria of rule application are usually at the surface level or under-specified. For connectionist approaches, individual rules typically cannot be recognized.
2.4.6 Basis for Evaluation: Evaluation of symbolic systems is typically based on intuitive judgments of unaffiliated subjects and may use system-internal measures of growth such as the number of new rules. In contrast, the basis for evaluation of statistical and connectionist systems are usually in the form of scores computed from some evaluation function. However, if all approaches are utilized for the same task, then the results of the task can be evaluated both quantitatively and qualitatively and compared.
2.4.7 Robustness: Symbolic systems may be fragile when presented with unusual or noisy input. To deal with anomalies, they can anticipate them by making the grammar more general to accommodate them. Compared to symbolic systems, statistical systems may be more robust in the face of unexpected input provided that training data is sufficient, which may be difficult to be assured of. Connectionist systems may also be robust and fault tolerant because knowledge in such systems is stored across the network. When presented with noisy input, they degrade gradually.
2.4.8 Flexibility: Since symbolic models are built by human analysis of well-formulated examples, symbolic systems may lack the flexibility to adapt dynamically to experience. In contrast, statistical systems allow broad coverage, and may be better able to deal with unrestricted text for more effective handling of the task at hand. Connectionist systems exhibit flexibility by dynamically acquiring appropriate behavior based on the given input. For example, the weights of a connectionist network can be adapted in real time to improve performance. However, such systems may have difficulty with the representation of structures needed to handle complex conceptual relationships, thus limiting their abilities to handle high-level NLP.
2.4.9 Suitable tasks: Symbolic approaches seem to be suited for phenomena that exhibit identifiable linguistic behavior. They can be used to model phenomena at all the various linguistic levels described in earlier sections. Statistical approaches have proven to be effective in modeling language phenomena based on frequent use of language as reflected in text corpora. Linguistic phenomena that are not well understood or do not exhibit clear regularity are candidates for statistical approaches. Similar to statistical approaches, connectionist approaches can also deal with linguistic phenomena that are not well understood. They are useful for low-level NLP tasks that are usually subtasks in a larger problem.
2.5 NATURAL LANGUAGE PROCESSING APPLICATIONS
Natural language processing provides both theory and implementations for a range of applications. In fact, any application that utilizes text is a candidate for NLP.
The most frequent applications utilizing NLP include the following:
a) Information Retrieval - given the significant presence of text in this application, it is surprising that so few implementations utilize NLP. Recently, statistical approaches for accomplishing NLP have seen more utilization, but few groups other than those of Liddy and Strzalkowski have developed significant systems based on NLP.
b) Information Extraction (IE) - a more recent application area, IE focuses on the recognition, tagging, and extraction into a structured representation of certain key elements of information, e.g. persons, companies, locations, organizations, from large collections of text. These extractions can then be utilized for a range of applications including question-answering, visualization, and data mining.
c) Question-Answering - in contrast to Information Retrieval, which provides a list of potentially relevant documents in response to a user's query, question-answering provides the user with either just the text of the answer itself or answer-providing passages.
d) Summarization - the higher levels of NLP, particularly the discourse level, can empower an implementation that reduces a larger text into a shorter, yet richly constituted, abbreviated narrative representation of the original document.
e) Machine Translation - perhaps the oldest of all NLP applications, various levels of NLP have been utilized in MT systems, ranging from the 'word-based' approach to applications that include higher levels of analysis.
f) Dialogue Systems - perhaps the omnipresent application of the future, in the systems envisioned by large providers of end-user applications. Dialogue systems, which usually focus on a narrowly defined application (e.g. your refrigerator or home sound system), currently utilize the phonetic and lexical levels of language. It is believed that utilization of all the levels of language processing explained above offers the potential for truly habitable dialogue systems.
2.6 INTRODUCTION: GENETIC ALGORITHMS
In the case of evolutionary computation, there are four historical paradigms that have served as the basis for much of the activity of the field:
1. Genetic Algorithms (Holland, 1975)
2. Genetic Programming (Koza, 1992, 1994)
3. Evolutionary Strategies (Rechenberg, 1973), and
4. Evolutionary Programming (Fogel et al., 1966).
The basic differences between the paradigms lie in the nature of the representation schemes, the reproduction operators and selection methods.
SIMPLE GENETIC ALGORITHMS: A simple GA operates on a fixed-size population of fixed-length individuals. Each individual is a binary string that encodes the variables of the problem that the algorithm is trying to optimize.
A simple GA uses three operators:
1. Selection
2. Crossover and
3. Mutation.
The selection operator identifies the fittest individuals of the current population to serve as parents of the next generation. The fitness value of each individual is given by a problem-dependent function. The selection mechanism can take many forms but it always ensures that the best individuals have a higher probability to be selected to reproduce to form a new generation.
The primary exploration mechanism for GAs is crossover. This operator randomly chooses a pair of individuals among those previously selected to breed and exchanges substrings between them. The exchange occurs at a randomly selected crossover point.
The mutation operator is usually considered a secondary operator. Its main function is to restore diversity that may be lost from the repeated application of selection and crossover. This operator simply takes one string from the population and randomly alters some value within it. Following nature's example the probability of applying the mutation operator is very low compared to the probability of applying the crossover operator.
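The interplay of these three operators can be summarized in a short sketch. The following Python code is a minimal simple GA (tournament selection, one-point crossover, bit-flip mutation) maximizing the number of 1s in a binary string; the problem, the operators chosen, and all parameter values are illustrative assumptions, not the configuration used in this project.

import random

LENGTH, POP_SIZE, GENERATIONS = 20, 30, 50
P_CROSSOVER, P_MUTATION = 0.9, 0.01

def fitness(ind):                      # problem-dependent fitness function
    return sum(ind)

def tournament(pop, k=2):              # selection: best of k random picks
    return max(random.sample(pop, k), key=fitness)

def crossover(a, b):                   # one-point crossover
    if random.random() < P_CROSSOVER:
        point = random.randint(1, LENGTH - 1)
        return a[:point] + b[point:], b[:point] + a[point:]
    return a[:], b[:]

def mutate(ind):                       # low-probability bit flips
    return [bit ^ 1 if random.random() < P_MUTATION else bit for bit in ind]

pop = [[random.randint(0, 1) for _ in range(LENGTH)] for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    nxt = []
    while len(nxt) < POP_SIZE:
        c1, c2 = crossover(tournament(pop), tournament(pop))
        nxt += [mutate(c1), mutate(c2)]
    pop = nxt[:POP_SIZE]

print(max(fitness(ind) for ind in pop))   # approaches LENGTH as the GA converges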
It has been established that the time required by a GA to converge is O(n log n) function evaluations (Goldberg, 1991), where n is the population size. We say that a population has converged when all the individuals are very much alike and further improvement may only be possible by a favorable mutation.
Genetic algorithms are not guaranteed to find an optimal solution, and their effectiveness is determined largely by the population size n (Goldberg, 1991). As the population size increases, the GA has a better chance of finding the global solution, but the computational cost also increases as a function of the population size.
With serial GAs we have to choose between getting a good result with high confidence at a high computational cost, or loosening the confidence requirement and getting (possibly poor) results fast. In contrast, parallel GAs can keep the quality of the results high and find them fast, because larger populations can be processed in less time on parallel machines. This keeps the confidence factor high and the response time low, opening opportunities to apply genetic algorithms in time-constrained applications. Additionally, parallel GAs may evolve several different independent solutions that can be recombined at later stages to form better solutions.
2.7 PARALLEL GENETIC ALGORITHMS
Parallel execution of several Simple Genetic Algorithms (SGAs) is called a Parallel Genetic Algorithm (PGA). PGAs have been used, for example, to solve the job shop scheduling problem, exploiting its precedence constraints to achieve a high degree of optimization. PGAs have been developed to reduce the large execution times that are associated with simple genetic algorithms when searching for near-optimal solutions in large search spaces. They have also been used to solve larger problems and to find better solutions. PGAs offer considerable gains in terms of performance and scalability, and they can easily be implemented on networks of heterogeneous computers or on parallel mainframes.
The way in which GAs can be parallelized depends upon the following elements:
- How fitness is evaluated and mutation is applied
- Whether selection is applied locally or globally
- Whether single or multiple subpopulations are used
- If multiple subpopulations are used, how individuals are exchanged
The simplest way of parallelizing a GA is to execute multiple copies of the same SGA, one on each Processing Element (PE). Each PE starts with a different initial subpopulation, evolves, and stops independently. The complete PGA halts when all PEs stop. There are no inter-PE communications.
The various methods of PGA are:
- Independent PGA
- Migration PGA
- Partition PGA
- Segmentation PGA
- Segmentation-Migration PGA
The advantage of independent PGA approach is that each PE starts with an independent subpopulation. Such subpopulation diversity reduces the chance that all PEs prematurely converge to the same poor quality solution. This approach is equivalent to simply taking the best solution after multiple executions of the SGA on different initial populations.
The second PGA approach, the migration PGA, augments the independent approach with periodic chromosome migrations among the PEs to prevent premature convergence and to share high-quality solutions. Chromosome migrations occur after a certain number of iterations, with each PE P sending a copy of its locally best chromosome to PE (P+1) modulo N at the first migration step, then to PE (P+2) modulo N at the second migration step, and so on. The received chromosome replaces the locally worst chromosome unless an identical chromosome already exists in the local population.
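A rough Python sketch of this exchange, assuming demes held as plain lists and a placeholder fitness function, might look as follows; it is only meant to illustrate the ring-style migration and worst-replacement policy described above, not the migration scheme used in this project.

import random

def fitness(ind):
    return sum(ind)                          # placeholder fitness

def migrate(demes, step):
    # each deme i sends a copy of its best individual to deme (i + step) mod N
    bests = [max(d, key=fitness) for d in demes]
    n = len(demes)
    for i, best in enumerate(bests):
        target = demes[(i + step) % n]
        if best not in target:               # skip if an identical copy exists
            worst = min(range(len(target)), key=lambda j: fitness(target[j]))
            target[worst] = best[:]          # replace the locally worst individual

demes = [[[random.randint(0, 1) for _ in range(10)] for _ in range(5)]
         for _ in range(4)]
migrate(demes, step=1)                       # first migration step
migrate(demes, step=2)                       # second migration step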
The partition PGA partitions the search space into disjoint subspaces and forces the PEs to search in different subspaces. The segmentation PGA starts by segmenting tours into sub-tours; after sub-tour improvements, they are recombined into longer sub-tours. The combination of segmentation and migration is the segmentation-migration approach: recombination occurs at the end of each phase, with sub-tours held by a group of PEs numbered in ascending order.
PGAs are implemented using the standard parallel approach or the decomposition approach. In the first approach, the sequential GA model is implemented on a parallel computer by dividing the task of implementation among the processors. In the decomposition approach, the full population exists in distributed form: either multiple independent or interacting subpopulations exist (coarse-grained or distributed GA), or there is only one population with each population member interacting only with a limited set of members (fine-grained GA). The interactions between the populations, or between the members of the population, take place with respect to a spatial structure of the problem. These models maintain more diverse subpopulations, mitigating the problem of premature convergence. They also fit the evolution model, with a large degree of independence in the subpopulations.
2.7.1 CLASSIFICATION OF PARALLEL GAs
The basic idea behind most parallel programs is to divide a task into chunks and to solve the chunks simultaneously using multiple processors. This divide-and-conquer approach can be applied to GAs in many different ways, and the literature contains many examples of successful parallel implementations. Some parallelization methods use a single population, while others divide the population into several relatively isolated subpopulations. Some methods can exploit massively parallel computer architectures, while others are better suited to multi-computers with fewer and more powerful processing elements. The classification of parallel GAs used in this review is similar to others [ADA 94, GOR 93, LIN 94], but it is extended to include one more category.
There are three main types of parallel GAs:
1) Global Single-Population Master-Slave GAs,
2) Single-Population Fine-Grained GAs, and
3) Multiple-Population Coarse-Grained GAs.
In a master-slave GA there is a single panmictic population (just as in a simple GA), but the evaluation of fitness is distributed among several processors. Since selection and crossover consider the entire population in this type of parallel GA, it is also known as a global parallel GA.
Figure 2.3: A schematic of a master-slave parallel GA. The master stores the population, executes GA operations, and distributes individuals to the slaves. The slaves only evaluate the fitness of the individuals.
Fine-grained parallel GAs are suited for massively parallel computers and consist of one spatially-structured population. Selection and mating are restricted to a small neighborhood, but neighborhoods overlap, permitting some interaction among all the individuals (see Figure 2.4 for a schematic of this class of GAs). The ideal case is to have only one individual for every processing element available.
Multiple-population (or multiple-deme) GAs are more sophisticated, as they consist of several subpopulations which exchange individuals occasionally. This exchange of individuals is called migration and, as we shall see in later sections, it is controlled by several parameters. Multiple-deme GAs are very popular, but they are also the class of parallel GAs that is most difficult to understand, because the effects of migration are not fully understood. Multiple-deme parallel GAs introduce fundamental changes in the operation of the GA and behave differently from simple GAs.
Figure 2.5: A schematic of a multiple-population parallel GA. Each process is a simple GA, and there is (infrequent) communication between the populations.
Multiple-deme parallel GAs are known by different names. Sometimes they are called "distributed" GAs, because they are usually implemented on distributed-memory MIMD computers. Since the computation-to-communication ratio is usually high, they are occasionally called coarse-grained GAs. Finally, multiple-deme GAs resemble the "island model" in population genetics, which considers relatively isolated demes, so these parallel GAs are also known as "island" parallel GAs.
Since the size of the demes is smaller than the population used by a serial GA, we would expect the parallel GA to converge faster. However, when we compare the performance of the serial and the parallel algorithms, we must also consider the quality of the solutions found in each case. Therefore, while it is true that smaller demes converge faster, it is also true that the quality of the solution might be poorer. A later section presents a recent theory that predicts the quality of the solutions for some extreme cases of multi-deme parallel GAs and allows fair comparisons with serial GAs.
It is important to emphasize that while the master-slave parallelization method does not affect the behavior of the algorithm, the last two methods change the way the GA works. For example, in master-slave parallel GAs, selection takes into account all the population, but in the other two parallel GAs, selection only considers a subset of individuals. Also, in the master-slave any two individuals in the population can mate (i.e., there is random mating), but in the other methods mating is restricted to a subset of individuals.
The final method to parallelize GAs combines multiple demes with master-slave or fine-grained GAs. We call this class of algorithms hierarchical parallel GAs, because at the higher level they are multiple-deme algorithms with single-population parallel GAs (either master-slave or fine-grained) at the lower level. A hierarchical parallel GA combines the benefits of its components and promises better performance than any of them alone.
2.8 Master-Slave Parallelization
This section reviews the master-slave (or global) parallelization method. The algorithm uses a single population and the evaluation of the individuals and/or the application of genetic operators are done in parallel. As in the serial GA, each individual may compete and mate with any other (thus selection and mating are global). Global parallel GAs are usually implemented as master-slave programs, where the master stores the population and the slaves evaluate the fitness.
The most common operation that is parallelized is the evaluation of the individuals, because the fitness of an individual is independent from the rest of the population, and there is no need to communicate during this phase. The evaluation of individuals is parallelized by assigning a fraction of the population to each of the processors available. Communication occurs only as each slave receives its subset of individuals to evaluate and when the slaves return the fitness values.
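As a sketch of this idea, the following Python fragment uses a multiprocessing pool so that worker ("slave") processes evaluate fractions of the population while the master keeps the population and would go on to apply the genetic operators. The fitness function and the population sizes are trivial stand-ins for an expensive evaluation, and the chunk size shown is an illustrative choice, not the load-balancing scheme proposed later in this report.

import random
from multiprocessing import Pool

def fitness(ind):                        # stand-in for an expensive evaluation
    return sum(ind)

if __name__ == "__main__":
    population = [[random.randint(0, 1) for _ in range(50)] for _ in range(200)]
    with Pool(processes=4) as pool:
        # each worker evaluates a chunk of the population; the master blocks
        # here until all fitness values return, so the generation is synchronous
        scores = pool.map(fitness, population, chunksize=25)
    print(max(scores))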
If the algorithm stops and waits to receive the fitness values for the whole population before proceeding to the next generation, then the algorithm is synchronous. A synchronous master-slave GA has exactly the same properties as a simple GA, with speed being the only difference. However, it is also possible to implement an asynchronous master-slave GA, where the algorithm does not stop to wait for any slow processors, but such an algorithm does not work exactly like a simple GA. Most global parallel GA implementations are synchronous, and the rest of this paper assumes that global parallel GAs carry out exactly the same search as simple GAs.
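As a concrete illustration of the synchronous scheme, the short Python sketch below evaluates a population in parallel using a pool of worker processes. The bit-string encoding and the fitness function are placeholders, not the grammar-induction fitness used in this work.

from multiprocessing import Pool
import random

def fitness(individual):
    # Stand-in for an expensive evaluation, e.g. scoring a candidate
    # grammar against a set of sample sentences.
    return sum(individual)

def evaluate_population(population, workers=4):
    # Synchronous master-slave step: the master blocks in map() until every
    # slave has returned its fitness values, so the search is identical to a
    # serial GA and only the wall-clock time changes.
    with Pool(processes=workers) as pool:
        return pool.map(fitness, population)

if __name__ == "__main__":
    population = [[random.randint(0, 1) for _ in range(64)]
                  for _ in range(200)]
    scores = evaluate_population(population)
    print(max(scores))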
The global parallelization model does not assume anything about the underlying computer architecture, and it can be implemented efficiently on shared-memory and distributed-memory computers. On a shared-memory multiprocessor, the population could be stored in shared memory and each processor can read the individuals assigned to it and write the evaluation results back without any conflicts.
On a distributed-memory computer, the population can be stored on one processor. This "master" processor is responsible for explicitly sending the individuals to the other processors (the "slaves") for evaluation, collecting the results, and applying the genetic operators to produce the next generation. The number of individuals assigned to each processor may be constant, but in some cases (such as a multi-user environment where processor utilization is variable) it may be necessary to balance the computational load among the processors by using a dynamic scheduling algorithm (e.g., guided self-scheduling).
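The sketch below illustrates the chunk sizes a guided self-scheduling policy would produce: each idle slave takes roughly ceil(remaining/P) individuals, so chunks shrink as the work runs out and faster or less loaded processors naturally absorb more of the evaluations. The population size and processor count are arbitrary example values.

from math import ceil

def guided_chunks(num_individuals, num_slaves):
    # Each time a slave becomes idle it is handed ceil(remaining / P)
    # individuals to evaluate; the chunks therefore start large and shrink,
    # which smooths out differences in processor speed or load.
    chunks, remaining = [], num_individuals
    while remaining > 0:
        chunk = ceil(remaining / num_slaves)
        chunks.append(chunk)
        remaining -= chunk
    return chunks

# Example: distributing 200 fitness evaluations over 8 slaves prints a
# list of shrinking chunk sizes that sums to 200.
print(guided_chunks(200, 8))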
Fogarty and Huang [FOG 91] attempted to evolve a set of rules for a pole-balancing application that takes considerable time to simulate. They used a network of transputers, which are microprocessors designed specifically for parallel computation. A transputer can connect directly to only four other transputers, and communication between arbitrary nodes is handled by retransmitting messages. This causes a communication overhead, and in an attempt to minimize it, Fogarty and Huang connected the transputers in different topologies. They concluded that the configuration of the network did not make a significant difference. They obtained reasonable speed-ups, but identified the fast-growing communication overhead as an impediment to further improvements in speed.
Abramson and Abela [ABR 92] implemented a GA on a shared-memory computer (an Encore Multimax with 16 processors) to search for efficient timetables for schools. They reported limited speed-ups, and blamed a few sections of serial code on the critical path of the program for the results. Later, Abramson, Mills, and Perkins [ABR 93] added a distributed-memory machine (a Fujitsu AP1000 with 128 processors) to the experiments, changed the application to train timetables, and modified the code. This time, they reported significant (and almost identical) speed-ups for up to 16 processors on the two computers, but the speed-ups degraded significantly as more processors were added, mainly due to the increase in communication.
A more recent implementation of a global GA is the work by Hauser and Manner [HAU 94]. They used three different parallel computers, but only obtained good speed-ups on a NERV multiprocessor (a speed-up of 5 using 6 processors), which has a very low communication overhead. They attributed the poor performance on the other systems they used (a SparcServer and a KSR1) to the inadequate scheduling of computation threads to processors by the system.
Another aspect of GAs that can be parallelized is the application of the genetic operators. Crossover and mutation can be parallelized using the same idea of partitioning the population and distributing the work among multiple processors. However, these operators are so simple that it is very likely that the time required to send individuals back and forth would offset any performance gains.
The communication overhead also obstructs the parallelization of selection, mainly because several forms of selection need information about the entire population and thus require some communication. Recently, Branke [BRA 97] parallelized different types of global selection on a 2-D grid of processors and showed that these algorithms are optimal for the topology used (they take O(sqrt(n)) time steps on a sqrt(n) x sqrt(n) grid).
In conclusion, master-slave parallel GAs are easy to implement and can be a very efficient method of parallelization when the evaluation requires considerable computation. Besides, the method has the advantage of not altering the search behavior of the GA, so we can directly apply all the theory available for simple GAs.
2.9 Multiple-Deme Parallel GAs - The First Generation
The important characteristics of multiple-deme parallel GAs are the use of a few relatively large subpopulations and migration. Multiple-deme GAs are the most popular parallel method, and many papers have been written describing innumerable aspects and details of their implementation. Obviously, it is impossible to review all these papers, and therefore this section presents only the most influential papers and a few relevant examples of particular implementations and applications.
Probably the first systematic study of parallel GAs with multiple populations was Grosso's dissertation [GRO 85]. His objective was to simulate the interaction of several parallel subcomponents of an evolving population. Grosso simulated diploid individuals (so there were two subcomponents for each "gene"), and the population was divided into five demes. Each deme exchanged individuals with all the others with a fixed migration rate.
With controlled experiments, Grosso found that the improvement of the average population fitness was faster in the smaller demes than in a single large panmictic population. This confirms a long-held principle in Population Genetics: favorable traits spread faster when the demes are small than when the demes are large. However, he also observed that when the demes were isolated, the rapid rise in fitness stopped at a lower fitness value than with the large population. In other words, the quality of the solution found after convergence was worse in the isolated case than in the single population.
With a low migration rate, the demes still behaved independently and explored different regions of the search space. The migrants did not have a significant effect on the receiving deme and the quality of the solutions was similar to the case where the demes were isolated. However, at intermediate migration rates the divided population found solutions similar to those found in the panmictic population. These observations indicate that there is a critical migration rate below which the performance of the algorithm is obstructed by the isolation of the demes, and above which the partitioned population finds solutions of the same quality as the panmictic population.
In a similar parallel GA by Pettey, Leuze, and Grefenstette [PET 87a], a copy of the best individual found in each deme is sent to all its neighbors after every generation. The purpose of this constant communication was to ensure a good mixing of individuals. Like Grosso, the authors of this paper observed that parallel GAs with such a high level of communication found solutions of the same quality as a sequential GA with a single large panmictic population. These observations prompted other questions that are still unresolved today:
1) What is the level of communication necessary to make a parallel GA behave like a panmictic GA?
2) What is the cost of this communication?
3) Is the communication cost small enough to make this a viable alternative for the design of parallel GAs?
More research is necessary to understand the effect of migration on the quality of the search in parallel GAs to be able to answer these questions.
It is interesting that such important observations were made so long ago.
#2
Natural Language Processing

NATURAL LANGUAGE PROCESSING (NLP) is one of the upcoming applications of AI. The goal of NLP is to design and build software that will analyze, understand, and generate languages that humans use naturally, so that eventually you will be able to address your computer as though you were addressing another person.


This goal is not easy to reach. "Understanding" language means, among other things, knowing what concepts a word or phrase stands for and knowing how to link those concepts together in a meaningful way. It is ironic that natural language, the symbol system that is easiest for humans to learn and use, is hardest for a computer to master. Long after machines have proven capable of inverting large matrices with speed and grace, they still fail to master the basics of our spoken and written languages.


The challenges we face stem from the highly ambiguous nature of natural language. As an English speaker you effortlessly understand a sentence like "Flying planes can be dangerous". Yet this sentence presents difficulties to a software program that lacks both your knowledge of the world and your experience with linguistic structures. Is the more plausible interpretation that the pilot is at risk, or that the danger is to people on the ground? Should "can" be analyzed as a verb or as a noun? Which of the many possible meanings of "plane" is relevant? Depending on context, "plane" could refer to, among other things, an airplane, a geometric object, or a woodworking tool. How much and what sort of context needs to be brought to bear on these questions in order to adequately disambiguate the sentence?


We address these problems using a mix of knowledge-engineered and statistical/machine-learning techniques to disambiguate and respond to natural language input. Our work has implications for applications like text critiquing, information retrieval, question answering, summarization, gaming, and translation. The grammar checkers in Office for English, French, German, and Spanish are outgrowths of our research; Encarta uses our technology to retrieve answers to user questions; Intellishrink uses natural language technology to compress cellphone messages; Microsoft Product Support uses our machine translation software to translate the Microsoft Knowledge Base into other languages. As our work evolves, we expect it to extend to any area where human users can benefit from communicating with their computers in a natural way.


Extensive research in NLP over the past decade has brought us one of the most useful applications of AI: machine translation. If we could one day create a program that could translate (for example) English text to Japanese and vice versa without the need for polishing by a professional translator, then bridges of communication could be significantly widened. Our current translation programs have not yet reached this level, but they may do so very soon. NLP research also deals with speech recognition: programs that convert spoken language into text are now widely used and fairly dependable.


Recent research in Machine Translation (MT) has focused on “data-driven” systems. Such systems are “self-customizing” in the sense that they can learn the translations of terminology and even stylistic phrasing from already translated materials. Microsoft Research’s MT (MSR-MT) system is such a data-driven system, and it has been customized to translate Microsoft technical materials through the automatic processing of hundreds of thousands of sentences from Microsoft product documentation and support articles, together with their corresponding translations. This customization processing can be completed in a single night, and yields an MT system that is capable of producing output on par with systems that have required months of costly human customization.