<?xml version="1.0" standalone="yes"?>
<Paper uid="P92-1040">
  <Title>REFERENCES</Title>
  <Section position="3" start_page="0" end_page="288" type="metho">
    <SectionTitle>
ON LINGUISTIC THEORY
</SectionTitle>
    <Paragraph position="0"> For a parser to be linguistically motivated, it must be transparent to a linguistic theory, under some precise notion of transparency (see Abney 1987)~ GB theory is a modular theory of abstract principles. A parser which encodes a modular theory of grammax must fulfill apparently contradictory demands: for the parser to be explanatory it must maintain the modularity of the theory, while for the paxser to be efficient, modularization must be minimized so that all potentially necessary information is available at all times, x We explore a possible solution to this contradiction. We observe that linguistic information can be classified into 5 different classes, as shown in (1), on the basis of their informational content. These we will ca\]\] IC Classes.</Paragraph>
    <Paragraph position="1">  (1) a. Configurations: sisterhood, c-command, m-command, :t:maximal projection ...</Paragraph>
    <Paragraph position="2"> b. Lexical features: ~N, +-V, +-Funct, +-c-selected, :t:Strong Agr ...</Paragraph>
    <Paragraph position="3"> c. Syntactic features: +-Case, ~8, +-7, ~baxrier.</Paragraph>
    <Paragraph position="4"> d. Locality information: minimality, binding, antecedent government.</Paragraph>
    <Paragraph position="5"> e. Referential information: +D-linked, +-anaphor, +-pronominal.</Paragraph>
    <Paragraph position="6"> IOn efficiency of GB-based systems tad(1990), Kashkett(1991).</Paragraph>
    <Paragraph position="7"> see RJs null This classification can be used to specify precisely the amount of modularity in the parser. Berwick(1982:400ff) shows that a modulax system is efficient only if modules that depend on each other axe compiled, while independent modules axe not. We take the notion of dependent and independent to correspond to IC Classes, in that primitives that belong to the same IC Class axe dependent on each other, while primitives that belong to different IC Classes axe independent from each other. We impose a modularity requirement that makes precise predictions for the design of the parser.</Paragraph>
    <Paragraph position="8"> Modularity Requirement (MR) Only primitives that belong to the same IC Class can be compiled in the parser.</Paragraph>
  </Section>
  <Section position="4" start_page="288" end_page="288" type="metho">
    <SectionTitle>
RECOVERING PHRASE
STRUCTURE
</SectionTitle>
    <Paragraph position="0"> According to the MR, notions such as headedness, directionality, sisterhood, and maximal projection can be compiled and stored in a data structure, because these notions belong to the same IC Class, configurations. These features are compiled into context-free rules in our parser. These basic X rules axe augmented by A rules licensed by the part of Trace theory that deals with configurations. The crucial feature of this grammar is that nontermina\]s specify only the X projection level, and not the category. The full context-free grammax is shown in Figure 1.</Paragraph>
    <Paragraph position="1"> The recovery of phrase structure is a crucial component of a parser, as it builds the skeleton which is needed for feature annotation. It must be efficient and it must fail as soon as an error is encountered, in order to limit backtracking. An LR(k) parser (Knuth 1965) has these properties, since it is deterministic on unambiguous input, and it has been proved to recognize only valid prefixes. In our parser, we compile the grammar shown above into an LALR(1) (Aho and Ullma~n 1972) parse table. The table has been modified</Paragraph>
    <Paragraph position="3"> in order to have more than one action for each table entry. 2 Three stacks are used: a stack for the states traversed so far; a stack for the semantic attributes associated with each of the nodes; a tree stack of partial trees. The LR algorithm is encoded in a parse predicate, which establishes a relation between two sets of 5-tuples, as shown in (2). s (2) Tix$ixA~xCixPT~--* T~xSjxA.~xCjxPT~ Our parser is more elaborate and less restrictive than a standard LR parser, because it imposes conditions on the attributes of the states and it is nondeterministic. In order to reduce the amount of nondeterminism, some predictive power has been introduced. The cooccurenee restrictions between categories, and subcategorization information of verbs is compiled in a table, which we call Left Corner Prediction Table (LC Table). By looking at the current token, at its category label, and its subcategorization frame, the number of choices of possible next states can be restricted. For instance, if the current token is a verb, and the LR table allows the parser either to project one level up to V ~, or it requires to create an empty object NP, then, on consulting the subcategorization information, the parser can eliminate the second option as incorrect if the verb is intransitive.</Paragraph>
  </Section>
  <Section position="5" start_page="288" end_page="289" type="metho">
    <SectionTitle>
RESULTS AND COMMENTS
</SectionTitle>
    <Paragraph position="0"> The design presented so far embodies the MR, since it compiles only dependent features in two tables off-line. Compared to the use of partially or fully instantiated context-free grammars, this 2This modification is necessary because the grammar compiled into the LR table is not an LR grammar. Sin (2) T~ is an element of the set of input tokens, Ss is an element of the set of states in the LR table, At is an element of the set of attributes associated with each state in the table, C~ iS an element of the set of chains, i.e. displaced element, and PTk iS an element of the set of tokens predicted by the left corner table (see below).</Paragraph>
    <Paragraph position="1">  organization of the parsing algorithms has been found to be better on several grounds.</Paragraph>
    <Paragraph position="2"> Consider again the X grammar that we use in the parser, shown in Figure 1. One of the crucial features of this grammar is that the nonterminals are specified only for level and headedness. This version of the grammar is a recent result. In previous implementations of the parser, the projections of the head in a rule were instantiated: for instance NP--~ YP IV' . Empirically, we find that on compiling the partially instantiated grammar the number of rules is increased proportionately to the number of categories, and so is the number of conflicts in the table. Figure 2 shows the relative sizes of the LALR(1) tables and the number of conflicts. Moreover, on closer inspection of the entries in the table, categories that belong to the same level of projection show the same reduce/reduce conflicts. This means that introducing unrestricted categoriM information increases the size of the table without decreasing the number of conflicts in each entry, i.e. without reducing the nondeterminism in the table.</Paragraph>
    <Paragraph position="3"> These findings confirm that categorial information can be factored out of the compiled table, as predicted by the MR. The information about cooccurrenee restrictions, category and subcategorization frame is compiled in the Left Corner (LC) table, as described above. Using two compiled tables that interact on-line is better than compiling all the information into a fully instantiated, standard context-free grammar for several reasons. 4 Computational\]y, it is more efllcient, s Practically, manipulating a small, highly abstract grammar is 4Fully iustantiated grammars have been used, among others, by Tomita(1985) in an LR parser, and by Doff(1990), Fong(1991) in GB-based parsers.</Paragraph>
    <Paragraph position="4"> sit has been argued elsewhere that for context-free parsing algorithms, the size of the graxrtrnsr (which iS a constant factor) can easily become the predominant factor for a11 useful inputs (see Berwick and Weinberg 1982). Work on compilation of parsers that use GPSG seems to point in the same direction. The separation of strnctu~al information from cooccttrence restrictions iS advocated in Kilbury(1986); both Shieber(1986) and Phi\]Hps(1987) argue that the combinatorial explosion (Barton 1985) of a fully expanded ID/LP formalism can be avoided by using feature variables in the compiled gxammar. See also Thompson 1982.</Paragraph>
    <Paragraph position="5"> much easier. It is easy to maintain and to embed in a full-fledged parsing system. Linguistically, a fully-instantiated paxser would not be transpaxent to the theory and it would be language dependent.</Paragraph>
    <Paragraph position="6"> Finally, it could not model some experimental psycholingnistic evidence, which we present below.</Paragraph>
  </Section>
  <Section position="6" start_page="289" end_page="289" type="metho">
    <SectionTitle>
PSYCHOLINGUISTIC SUPPORT
</SectionTitle>
    <Paragraph position="0"> A reading task is presented in F~azier and Rayner 1987 where eye movements are monitored: they find that in locally ambiguous contexts, the ambiguous region takes less time than an unambiguous eounterpaxt, while a slow down in processing time is registered in the disambiguating region. This suggests that selection of major categorial information in lexically ambiguous sentences is delayed, e This delay means that the parser must be able to operate in absence of categorial information, making use of a set of category-neutral phrase structure rules. This separation of itemdependent and item-independent information is encoded in the grammax used in our paxser. A parser that uses instantiated categories would have to store categorial cooccurence restrictions in a different data structure, to be consulted in case of lexically ambiguous inputs. Such design would be redundant, because categorial information would be encoded twice.</Paragraph>
  </Section>
class="xml-element"></Paper>