<?xml version="1.0" standalone="yes"?>
<Paper uid="J79-1031">
  <Title>American Journal of Computational Linguistics Microfiche 31 A CASE-DRIVEN PARSER FOR NATURAL LANGUAGE</Title>
  <Section position="2" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
2 Contact R. S. Rosenberg for information on [8].
</SectionTitle>
    <Paragraph position="0"> detailed operation of the system will be described subsequently, we will now present the major influences on this work, some similar systems, and what we believe the important contributions to be.</Paragraph>
    <Paragraph position="1"> Beginning at the end, we decided to represent an input sentence with a structure which is very similar to the conceptual dependency networks of Schank[5,6]. This does not imply agreement with the overall philosophy of Schank, but rather a recognition that an underlying representation should contain as much knowledge as possible, as it may be crucial for subsequent analysis. As will be seen, our representation, in addition to the basic syntactic relations, also reveals semantic relations not explicitly given in the input sentence. This latter knowledge is derived from a complex semantic lexicon organized around the concept of case as first formulated by Fillmore[3]. One point should be emphasized: Fillmore, as a linguist, was concerned with formulating a theory to explain data which a transformational approach seemed unable to do. As such, he felt the need to worry about the number and nature of the cases necessary to treat the linguistic data adequately.</Paragraph>
    <Paragraph position="2"> This is to be contrasted with our approach, which is to use an extended version of case in order to represent the meaning of a sentence as fully as possible. The basic difference is revealed in Fillmore's description of five or six cases, whereas our system uses, at present, twenty-four cases.</Paragraph>
    <Paragraph position="3"> Perhaps the term "case" is inappropriate here, but there is enough similarity with, and motivation from, Fillmore's work that we decided to use the term. The most complex part of the lexicon is the verb with its associated case frame: actually an environment of obligatory and optional cases associated with the verb. One basic problem of sentence analysis is to choose among alternate verb meanings for an appropriate candidate. This selection process is governed, in part, by attempting to satisfy the constraints imposed by the case frames associated with each verb meaning. But prior to activating the case-driven part of the system, a preliminary stage of analysis must be initiated.</Paragraph>
    <Paragraph position="4"> This is an almost purely syntactic phase carried out by a rather simple augmented transition network (ATN) (Woods[11,12]). The ATN has proven to be very useful in natural language processing mainly because of the ease of representing complicated and interrelated syntactic structures. For our purposes, the ATN is used to produce a very fast preliminary parse of the input sentence which indicates gross structural relations. Using this parse, the case-driven component seeks to select the appropriate verb meaning. It is important to note that if the case procedures fail on the first pass, conditions for success are progressively weakened until the most suitable meaning is chosen. Although not a feature of the present system, it would be possible to re-enter the ATN phase in order to produce another parse if the case phase were unable to complete its task in a satisfactory manner.</Paragraph>
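The selection of a verb meaning by case-frame matching, with conditions weakened after a failed pass, can be sketched roughly as follows. This is a minimal illustration in Python, not the authors' LISP implementation. The names CaseFrame, fits, and select_meaning, the case labels, the two invented senses of "run", and the particular three levels of strictness are all assumptions made for the example; the source does not say how its conditions are weakened or how competing candidates are ranked, so this sketch simply returns the first frame that fits.

```python
from dataclasses import dataclass, field


@dataclass
class CaseFrame:
    """Hypothetical case frame for one verb meaning."""
    meaning: str
    obligatory: frozenset
    optional: frozenset = field(default_factory=frozenset)


def fits(frame, found, level):
    """Check a frame against the cases found in the preliminary parse.

    level 0: every obligatory case is filled and no unexpected case appears.
    level 1: every obligatory case is filled; unexpected cases are tolerated.
    level 2: any overlap between the found cases and the frame is accepted.
    """
    allowed = frame.obligatory | frame.optional
    has_obligatory = frame.obligatory.issubset(found)
    if level == 0:
        return has_obligatory and found.issubset(allowed)
    if level == 1:
        return has_obligatory
    return bool(found.intersection(allowed))


def select_meaning(frames, found):
    """Try the strictest conditions first and weaken them on failure."""
    for level in range(3):
        for frame in frames:
            if fits(frame, found, level):
                return frame.meaning, level
    return None, None


# Invented frames for two senses of "run"; not taken from the paper's lexicon.
RUN_FRAMES = [
    CaseFrame("run-operate", frozenset({"AGENT", "OBJECT"}), frozenset({"INSTRUMENT"})),
    CaseFrame("run-move", frozenset({"AGENT"}), frozenset({"LOCATION"})),
]

# Succeeds on the strict first pass.
strict = select_meaning(RUN_FRAMES, frozenset({"AGENT", "OBJECT"}))
# Fails at level 0 because of the unexpected TIME case, then fits at level 1.
weakened = select_meaning(RUN_FRAMES, frozenset({"AGENT", "TIME"}))
```

Returning the level alongside the meaning lets a caller see how far the conditions had to be relaxed before a candidate was accepted.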
    <Paragraph position="5"> We would like to stress those aspects of the system which make it flexible and useful for a wide range of language processing applications. It is straightforward to incorporate different kinds of knowledge necessary for adequate processing.</Paragraph>
    <Paragraph position="6"> For example, the current system has a procedure for resolving a fair range of anaphoric references for pronouns. If additional procedures are developed, they can also be incorporated into the system, and will exert their influence by modifying the search procedures for candidates which satisfy the requirements of the case frame for verbs. Another important feature is the facility within the dictionary entries of nouns for providing information about relevant properties, such as superset and subset. Thus a kind of semantic network links nouns of the dictionary, and this knowledge is available to aid in the processing.</Paragraph>
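One way to picture the superset links between noun entries is the small Python sketch below. The table NOUN_SUPERSET, its entries, and the helper names supersets and satisfies are hypothetical and chosen only for illustration; the source does not give the actual layout of its dictionary entries.

```python
# Hypothetical noun entries; each records its immediate superset.
NOUN_SUPERSET = {
    "dog": "animal",
    "animal": "animate",
    "animate": "physical-object",
    "hammer": "tool",
    "tool": "physical-object",
}


def supersets(noun):
    """Return the chain of supersets above a noun, nearest first."""
    chain = []
    seen = {noun}  # guards against accidental cycles in the dictionary
    current = NOUN_SUPERSET.get(noun)
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = NOUN_SUPERSET.get(current)
    return chain


def satisfies(noun, required):
    """True if the noun is the required class or falls under it."""
    return noun == required or required in supersets(noun)


dog_chain = supersets("dog")
dog_is_animate = satisfies("dog", "animate")
hammer_is_animate = satisfies("hammer", "animate")
```

A check like satisfies could serve as one of the constraints a case frame places on a candidate filler, for instance requiring an animate noun for a given case.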
    <Paragraph position="7"> The user can construct the dictionary appropriate for his purposes and can readily add necessary domain-specific features.</Paragraph>
    <Paragraph position="8"> This system is considerably more than just a front-end for a traditional linear language processor. It integrates syntactic and semantic linguistic knowledge in a particularly transparent and flexible manner.</Paragraph>
    <Paragraph position="9"> There are other current language systems which are based on notions of case but which differ mainly in the way the processed sentence is represented. We might mention Simmons[7], Martin[4], and Bruce[1].</Paragraph>
    <Paragraph position="10"> The system is written in LISP/MTS, and runs on an IBM 370/168 under the MTS operating system at the University of British Columbia. The code occupies 240K bytes, and the current dictionary of 450 words occupies an additional 90K bytes. When the system is running the total space used is 470K bytes. In spite of its large size, it is relatively fast. For example, the total time taken to parse sentence (11) below, is -90 CPU seconds, executing interpretively. A compiled version of the program would run approximately 10 times faster.</Paragraph>
  </Section>
</Paper>