<?xml version="1.0" standalone="yes"?>
<Paper uid="C90-3080">
  <Title>QPATR and Constraint Threading</Title>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> QPATR (&quot;Quick PATR&quot;) is an MS-DOS Arity/PROLOG implementation of the PATR-II formalism (cf Shieber et al. 1983, Shieber 1986) with certain logical extensions. The name was chosen to reflect the fact that the prototype system was developed in a short period of time but nevertheless runs quickly enough for practical use. QPATR was developed at the University of Düsseldorf within the research project &quot;Simulation of Lexical Acquisition&quot;, which is funded by the Deutsche Forschungsgemeinschaft.</Paragraph>
    <Paragraph position="1"> In contrast to most existing PATR implementations such as D-PATR (cf Karttunen 1986a, 1986b), QPATR runs under MS-DOS and thus makes minimal hardware demands. Like ProP (cf Carpenter 1989), QPATR is implemented in PROLOG but uses both the negation and disjunction of PROLOG in the extended PATR formalism; moreover, it employs a left-corner parser with a &quot;linking&quot; relation and PROLOG backtracking rather than a pure bottom-up chart parser.</Paragraph>
    <Paragraph position="2"> The system comprises the following components: (1) grammar compiler, (2) unification, (3) left-corner parser, (4) lexical look-up, (5) input/output, (6) testing off-line input, and (7) tracing. The grammar compiler (1) transforms syntax rules and lexical entries from their external notation to an internal form; at the same time partial feature structure matrices (FSMs) and the linking relation (see below) are constructed. The unification package (2) uses techniques introduced by Eisele and Dörre (1986) and described by Gazdar and Mellish (1989) to implement the unification of FSMs with the term unification of PROLOG. A prediction facility is included in the input/output package that allows new lexical items in the input to be identified on the basis of contextual information. While QPATR uses a full-form lexicon at present, a package for morphological analysis is being developed.</Paragraph>
    <Paragraph position="3"> Since QPATR is distributed in a compiled version, knowledge of PROLOG is only needed in order to write macros (see below) but not to write grammars or to run the system. Thus, QPATR can also be used in teaching students who have no background in PROLOG programming.</Paragraph>
  </Section>
  <Section position="3" start_page="0" end_page="382" type="metho">
    <SectionTitle>
2 Descriptions of FSMs
</SectionTitle>
    <Paragraph position="0"> The formalism of PATR-II has been adopted for QPATR and will not be introduced here. As presented by Shieber (1986: 21), rules consist of a context-free skeleton introducing variables for FSMs and a conjunction of path equations that describe the FSMs, e.g.:</Paragraph>
    <Paragraph position="2"> where cat, head, and subject are attributes. Such path equations are written with &quot;*=&quot; in QPATR, which is implemented with the normal (&quot;destructive&quot;) PROLOG unification. Furthermore, QPATR provides for pseudo-constraints written with &quot;*==&quot; in the path equations, which capture the expressiveness of constraining schemata in LFG (cf Kaplan/Bresnan 1982: 213) and allow the grammar writer to specify that some attribute must receive a value unifiable with the indicated value. These are implemented with the &quot;==&quot; unification of PROLOG.</Paragraph>
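    <Paragraph> The difference between the two kinds of equation can be illustrated with a small Python sketch. It is not the PROLOG code of QPATR: FSMs are modelled here as nested dictionaries with atomic string values, and the names unify_path and check_path are hypothetical. The first function adds information where none is present, the second only tests what is already there.</Paragraph>
    <Paragraph><![CDATA[
```python
# Assumption for illustration: an FSM is a nested dict, atoms are strings.
# This models the behaviour of "*=" and "*==", not QPATR's actual code.

def unify_path(fsm, path, value):
    """Model of a "*=" equation: add the value if absent, fail on a clash."""
    node = fsm
    for attr in path[:-1]:
        node = node.setdefault(attr, {})
        if not isinstance(node, dict):
            return False            # the path runs through an atomic value
    last = path[-1]
    if last not in node:
        node[last] = value          # destructive update of the structure
        return True
    return node[last] == value


def check_path(fsm, path, value):
    """Model of a "*==" pseudo-constraint: test only, never add anything."""
    node = fsm
    for attr in path:
        if not isinstance(node, dict) or attr not in node:
            return False            # no value has been supplied elsewhere
        node = node[attr]
    return node == value
```
]]></Paragraph>
    <Paragraph> On an empty FSM the test with check_path fails while the equation with unify_path succeeds and fills in the value; this is why pseudo-constraints are only meaningful once the rest of the description has been evaluated.</Paragraph>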
    <Paragraph position="3"> FSMs are described in QPATR with a logic generally based on that developed by Kasper and Rounds (1986). The presentation of the logical description language here is parallel to that of Carpenter (1989).</Paragraph>
    <Paragraph position="4"> Atomic well-formed formulas (wffs) of this logic consist of the two types of equations just introduced as well as macro heads (see below); heads of macros defined in terms of constraints are prefixed with the operator &quot;@&quot; in atomic wffs. Equations contain two designators, which are atoms or FSM variables, implemented with PROLOG atoms and variables, respectively, or else paths. The latter are defined recursively and may contain atoms or paths as attribute expressions. The evaluation of embedded paths must yield an atom.</Paragraph>
    <Paragraph position="5"> All derived wffs of the logic are built from atomic descriptions with conjunction &quot;,&quot;, disjunction &quot;;&quot;, and negation &quot;not&quot;; parentheses may be simplified in the customary manner. Disjunction and negation are not directly reflected in the FSMs generated in QPATR.</Paragraph>
    <Paragraph position="6"> Disjunctions are implemented with PROLOG backtracking, while negations are treated like pseudo-constraints, which are executed as tests after the complete FSM of an input phrase has been constructed by the parser. The &quot;negation&quot; employed here is thus the negation-as-failure of PROLOG.</Paragraph>
    <Paragraph position="7"> FSMs themselves are represented internally as a PROLOG list of feature-value pairs with a variable remainder list (cf Eisele/Dörre 1986: 551; Gazdar/Mellish 1989: 228). Since FSMs are described rather than directly represented in the grammar and lexicon, these internal PROLOG representations normally are neither constructed nor seen by the user.</Paragraph>
    <Paragraph position="8"> The syntax of the logical description language is defined here in Backus-Naur form:  Macros (or templates; cf Shieber 1986: 51) may be employed in QPATR to reduce redundancy in syntax rules and lexical entries and thereby to capture generalizations. In the present version of QPATR macros are defined as conjunctions of other macros and FSM descriptions with &quot;*=&quot; and &quot;*==&quot;; they may not contain disjunctions or negations. Furthermore, macros may not be defined recursively as this would lead to nonterminating loops.</Paragraph>
    <Paragraph position="9"> Since macros are ultimately defined in terms of FSM descriptions with &quot;*=&quot; and &quot;*==&quot;, which themselves are implemented as executable PROLOG goals, macros are represented in the present QPATR version simply as PROLOG inference rules with a head consisting of the macro name as its predicate and the variables for FSMs referred to as its arguments. This is the only part of the system that requires elementary PROLOG programming in order to write grammars in the formalism.</Paragraph>
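    <Paragraph> The restriction against recursive macro definitions can be made concrete with a short Python sketch. It models macros as named lists whose members are either equations or references to other macros and expands them into a flat list of equations. This expansion view is an assumption made for illustration; QPATR itself represents macros as PROLOG inference rules. The table layout, the macro names, and the function expand are all hypothetical.</Paragraph>
    <Paragraph><![CDATA[
```python
# Hypothetical macro table: a string refers to another macro, a tuple is an
# equation of the form (operator, path, value). Not QPATR's own notation.

def expand(name, macros, active=()):
    """Flatten a macro into its equations; reject recursive definitions."""
    if name in active:
        # In PROLOG such a definition would simply loop; here it is reported.
        raise ValueError("recursive macro: " + name)
    equations = []
    for item in macros[name]:
        if isinstance(item, str):
            equations.extend(expand(item, macros, active + (name,)))
        else:
            equations.append(item)
    return equations


MACROS = {
    "third_sg": [("*=", ("head", "agr", "person"), "third"),
                 ("*=", ("head", "agr", "number"), "sg")],
    "verb_3sg": [("*=", ("cat",), "v"), "third_sg"],
}
```
]]></Paragraph>
    <Paragraph> Expanding verb_3sg yields the category equation followed by the two agreement equations of third_sg, so that a generalization stated once in third_sg is shared by every macro that mentions it.</Paragraph>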
    <Paragraph position="10"> A special representation language for the definition of macros is being developed and will be included in new versions of QPATR.</Paragraph>
  </Section>
  <Section position="4" start_page="382" end_page="383" type="metho">
    <SectionTitle>
4 Rules and Lexical Entries
</SectionTitle>
    <Paragraph position="0"> Syntax rules are indexed with an integer which is used by the linking relation constructed during compilation of the grammar into its internal form (see below). The numbering of rules is arbitrary and need not be consecutive or ordered.</Paragraph>
    <Paragraph position="1"> Category descriptions are macro heads. In principle, a single dummy macro name cat can be used for all categories so that all information about the FSMs contained in a rule is put in the description wff of the right-hand side; however, the linking relation would then lose its value for the parser. In order to modularize the grammatical description, the wffs of rules and entries may be defined exclusively in terms of macros.</Paragraph>
    <Paragraph position="2"> The syntax of rules and lexical entries is defined as follows:  By convention, the wffs of rules and lexical entries are written in conjunctive normal form as a list of atomic wffs, disjunctions, and negations. When a rule or entry is compiled the list representing its wff is sorted into lists of atomic wffs (except constraints), disjunctions, and constraints (including negations) whose members are executed as PROLOG goals before, during, and after parsing, respectively. The execution of the atomic wffs without constraints builds partial FSMs which contribute to the information encoded in the linking relation (see below). In their compiled form rules and entries thus contain partial FSMs associated with lists of disjunctions and negations that apply to them.</Paragraph>
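    <Paragraph> The sorting step performed at compile time can be sketched in Python as follows. The tagged tuples used to encode the members of a wff list and the function name sort_wff are hypothetical and were chosen only for this illustration; the sketch shows the three-way split into goals run before, during, and after parsing.</Paragraph>
    <Paragraph><![CDATA[
```python
# Assumed encoding of the members of a wff list (illustration only):
#   ("eq", path, value)          "*=" equation
#   ("macro", name)              ordinary macro head
#   ("constraint", path, value)  "*==" pseudo-constraint
#   ("cmacro", name)             macro head prefixed with "@"
#   ("or", left, right)          disjunction
#   ("not", wff)                 negation

def sort_wff(items):
    """Split a wff list into (atomic, disjunctions, constraints)."""
    atomic, disjunctions, constraints = [], [], []
    for item in items:
        kind = item[0]
        if kind == "or":
            disjunctions.append(item)      # executed during parsing
        elif kind in ("constraint", "cmacro", "not"):
            constraints.append(item)       # executed after parsing
        else:
            atomic.append(item)            # executed before parsing
    return atomic, disjunctions, constraints
```
]]></Paragraph>
    <Paragraph> Only the first of the three lists contributes to the partial FSMs built at compile time; the other two are stored with the compiled rule or entry.</Paragraph>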
    <Paragraph position="3"> Disjunctions are executed during parsing and make use of the normal backtracking mechanism of PROLOG, while constraints and negations are executed after parsing to test whether an FSM in fact fulfills all conditions of the original wff. During parsing the constraints and negations contributing to the complete description of the FSM associated with the input must be collected. In order to accomplish this a technique of constraint threading is introduced based on the difference lists used by Pereira and Shieber (1987) for gap threading. The PROLOG term associated with a syntactic constituent contains difference lists of constraints associated with the constituent before and after it has been parsed. The first difference list for an entire input phrase is the empty list, while the second is instantiated with the complete list of constraints and negations after parsing is completed.</Paragraph>
    <Paragraph position="4"> A complication arises from the fact that constraints and negations may be embedded in disjunctions and that their execution must be deferred. This can be dealt with by &quot;percolating&quot; such embedded constraints up into the difference lists for constraint threading when the disjunction is solved. The following program implements the execution of disjunctions during parsing:</Paragraph>
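    <Paragraph> The idea of threading can be sketched in Python with an explicit accumulator in place of PROLOG difference lists. In this sketch, which rests on assumptions made for illustration, a parsed constituent is a dictionary with optional constraints and children, a constraint is a test applied to the finished FSM, and the names thread and accept are hypothetical. Constraints are appended here, whereas a PROLOG version would typically add them at the front.</Paragraph>
    <Paragraph><![CDATA[
```python
# Hypothetical parse-tree nodes: dicts with optional "constraints" and
# "children" keys. A constraint is modelled as a test on the finished FSM.

def thread(node, c_in):
    """Return the constraint list after `node`, given the list before it."""
    c = c_in + node.get("constraints", [])
    for child in node.get("children", []):
        c = thread(child, c)    # the output of one constituent feeds the next
    return c


def accept(tree, fsm):
    """Collect constraints during the traversal, run them only afterwards."""
    collected = thread(tree, [])    # the list before the whole phrase is empty
    return all(test(fsm) for test in collected)
```
]]></Paragraph>
    <Paragraph> Each constituent thus receives the constraints gathered so far and hands on an extended list, and no constraint is evaluated until the complete list for the input phrase is available.</Paragraph>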
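    <Paragraph position="6"> % solve_disjunctions(+Disjunctions, +C0, -C): thread constraints C0 to C.
    solve_disjunctions([], C, C).</Paragraph>
    <Paragraph position="7"> solve_disjunctions([D|Ds], C0, C) :-
        dsolve(D, C0, C1),
        solve_disjunctions(Ds, C1, C).</Paragraph>
    <Paragraph position="8"> % Disjunction: try the first alternative, then the rest on backtracking.
    dsolve((Wff ; Wffs), C0, C) :- !,
        ( dsolve(Wff, C0, C) ; dsolve(Wffs, C0, C) ).
    % Conjunction: thread the constraint list through both conjuncts.
    dsolve((Wff , Wffs), C0, C) :- !,
        dsolve(Wff, C0, C1), dsolve(Wffs, C1, C).
    % Negations and constraint macros are deferred: add them to the list.
    dsolve((not Wff), C, [(not Wff)|C]) :- !.
    dsolve((@ Wff), C, [Wff|C]) :- !.</Paragraph>
    <Paragraph position="9"> % Any other wff is executed immediately as a PROLOG goal.
    dsolve(Wff, C, C) :- call(Wff).</Paragraph>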
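    <Paragraph> For readers who do not use PROLOG, the control flow of the program can be mirrored in Python with generators, where each value produced stands for one solution found on backtracking. This is a simplified analogue under stated assumptions: wffs are encoded as tagged tuples, an ordinary goal is a Python callable returning true or false, and variable bindings (which PROLOG would undo on backtracking) are not modelled.</Paragraph>
    <Paragraph><![CDATA[
```python
# Assumed encoding (illustration only): ("or", a, b), ("and", a, b),
# ("not", wff), ("at", wff) for "@"-prefixed macros, or a callable goal.

def dsolve(wff, c0):
    """Yield one constraint list per way of solving `wff`."""
    if callable(wff):
        if wff():                 # ordinary goal: run it now
            yield c0
        return
    op = wff[0]
    if op == "or":                # alternatives are tried in order
        yield from dsolve(wff[1], c0)
        yield from dsolve(wff[2], c0)
    elif op == "and":             # thread the list through both conjuncts
        for c1 in dsolve(wff[1], c0):
            yield from dsolve(wff[2], c1)
    elif op == "not":             # deferred: keep the negation for later
        yield [wff] + c0
    elif op == "at":              # deferred: keep the constraint macro
        yield [wff[1]] + c0
    else:
        raise ValueError("unknown wff: %r" % (wff,))


def solve_disjunctions(disjunctions, c0):
    """Solve a list of disjunctions, threading constraints from c0."""
    if not disjunctions:
        yield c0
        return
    for c1 in dsolve(disjunctions[0], c0):
        yield from solve_disjunctions(disjunctions[1:], c1)
```
]]></Paragraph>
    <Paragraph> A disjunct that contains a negation therefore succeeds at once and merely extends the constraint list, so that the negation &quot;percolates&quot; upward and is tested only after parsing.</Paragraph>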
  </Section>
  <Section position="5" start_page="383" end_page="383" type="metho">
    <SectionTitle>
6 The Parser of QPATR
</SectionTitle>
    <Paragraph position="0"> The parser is based on a left-corner algorithm with backtracking for context-free grammars (cf Kilbury 1988 and Pereira/Shieber 1987: 179ff). The efficiency of the parser is improved with top-down filtering in the form of a linking relation (cf Pereira/Shieber 1987: 182). This ordinarily is a transitive binary relation over categories represented as PROLOG atoms or terms with atomic category labels as functors. The PATR formalism requires a modified technique since the syntax rules contain FSMs, whose unification is more costly than that of atomic category labels. QPATR therefore uses numbered syntax rules and then defines the filter with a binary relation over the rule indices. If the grammar contains some rules</Paragraph>
    <Paragraph position="2"> where the subscripted F's are FSMs, then we have dlink(i,j) iff F_j1 subsumes F_i0, i.e. if F_i0 is an immediate left corner of F_j0. Then link(i,j) is the reflexive and transitive closure of dlink(i,j).</Paragraph>
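    <Paragraph> A linking relation over rule indices can be sketched in Python as follows. Several assumptions are made for illustration: FSMs are flat dictionaries, subsumption is simple inclusion of attribute-value pairs, a rule is stored as a pair of its left-hand side and its list of right-hand-side FSMs, and dlink(i,j) is taken to hold when the first right-hand-side FSM of rule j subsumes the left-hand side of rule i. The function names are hypothetical.</Paragraph>
    <Paragraph><![CDATA[
```python
# Assumed model: a rule index maps to (lhs, rhs_list); FSMs are flat dicts.

def subsumes(general, specific):
    """True if every attribute-value pair of `general` occurs in `specific`."""
    return all(k in specific and specific[k] == v for k, v in general.items())


def direct_links(rules):
    """dlink(i, j): the left corner of rule j subsumes the LHS of rule i."""
    return {(i, j)
            for i, (lhs_i, _) in rules.items()
            for j, (_, rhs_j) in rules.items()
            if rhs_j and subsumes(rhs_j[0], lhs_i)}


def link_closure(indices, dlink):
    """Reflexive and transitive closure of dlink over the rule indices."""
    link = {(i, i) for i in indices} | set(dlink)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(link):
            for (c, d) in list(link):
                if b == c and (a, d) not in link:
                    link.add((a, d))
                    changed = True
    return link


RULES = {
    1: ({"cat": "np"}, [{"cat": "det"}, {"cat": "n"}]),
    2: ({"cat": "s"}, [{"cat": "np"}, {"cat": "vp"}]),
    5: ({"cat": "det"}, [{"cat": "art"}]),
}
```
]]></Paragraph>
    <Paragraph> Because the closure is computed once over integers at compile time, the parser can consult the filter by looking up a pair of rule indices rather than by unifying FSMs; the rule numbers in the example are deliberately not consecutive.</Paragraph>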
  </Section>
</Paper>