Abstract: Traditionally, parsing of text is based on an explicit grammar and an associated parsing procedure. Examples of grammars are context-free, context-sensitive, transformational, etc. The grammars are specified in a generative mode. A parsing procedure (e.g. LR parsing, CYK parsing, Earley parsing, etc.) is then designed for the grammar in question and is supposed to reverse the process: given text, find the particular generative sequence whose result was the text. Parsed text is useful in text understanding or in language translation. In most cases it consists of a tree with labeled nodes and individual words at the leaves of the tree. Understanding systems attempt to derive meaning from operations on the structure of the tree. Machine translators frequently accomplish their task by transforming the tree of the source language into a tree of the target language. There are two major problems with the traditional procedure: a grammar has to be designed, usually by hand, and the corresponding text analysis yields highly ambiguous parses. For some time now, attempts have been made to extract the grammar automatically from data, attach probabilities to its productions, and resolve the parsing ambiguity by selecting the most probable parse. The grammar extraction process has been based on TREEBANKS, which are databases consisting of large amounts of parsed text. Cooperating researchers at IBM and the University of Pennsylvania have recently realized that since one is interested in parsing and not in generation, one might as well develop parsers directly, without recourse to the painful process of grammar development. Two separate and promising approaches have emerged, one statistical, one rule-based. This talk will describe both, and point out their differences and affinities.
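The probabilistic approach the abstract alludes to (attach probabilities to productions, then select the most probable parse) can be illustrated with a small sketch. The grammar, its probabilities, and the example sentence below are textbook-style illustrations, not material from the talk itself; the code is a minimal probabilistic CYK parser over a toy PCFG in Chomsky normal form.

```python
from collections import defaultdict

# Toy PCFG in Chomsky normal form. Probabilities are illustrative
# (a standard textbook-style PP-attachment example), not from the talk.
lexical = {          # (A, word) -> P(A -> word)
    ("NP", "astronomers"): 0.1,
    ("NP", "stars"): 0.18,
    ("NP", "ears"): 0.18,
    ("V", "saw"): 1.0,
    ("P", "with"): 1.0,
}
binary = {           # (A, B, C) -> P(A -> B C)
    ("S", "NP", "VP"): 1.0,
    ("VP", "V", "NP"): 0.7,
    ("VP", "VP", "PP"): 0.3,
    ("NP", "NP", "PP"): 0.4,
    ("PP", "P", "NP"): 1.0,
}

def cyk_best_parse(words):
    """Probabilistic CYK: return (probability, tree) of the most
    probable parse rooted at S, or (0.0, None) if none exists."""
    n = len(words)
    best = defaultdict(dict)  # best[(i, j)][A] = (prob, tree) over words[i:j]
    # Fill in length-1 spans from the lexical rules.
    for i, w in enumerate(words):
        for (A, word), p in lexical.items():
            if word == w:
                best[(i, i + 1)][A] = (p, (A, w))
    # Combine smaller spans bottom-up, keeping only the best tree per label.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (A, B, C), p in binary.items():
                    if B in best[(i, k)] and C in best[(k, j)]:
                        pb, tb = best[(i, k)][B]
                        pc, tc = best[(k, j)][C]
                        cand = p * pb * pc
                        if cand > best[(i, j)].get(A, (0.0,))[0]:
                            best[(i, j)][A] = (cand, (A, tb, tc))
    return best[(0, n)].get("S", (0.0, None))

prob, tree = cyk_best_parse("astronomers saw stars with ears".split())
```

With these numbers the parser resolves the ambiguity by attaching "with ears" to the noun phrase "stars" (probability 0.0009072) rather than to the verb phrase (0.0006804), which is exactly the "select the most probable parse" step the abstract describes.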
Publication Year: 1996
Publication Date: 1996-01-01
Language: en
Type: book-chapter
Indexed In: ['crossref']