<?xml version="1.0" standalone="yes"?>
<Paper uid="W97-1502">
  <Title>The TreeBanker: a Tool for Supervised Training of Parsed Corpora</Title>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 Representational Issues
</SectionTitle>
    <Paragraph position="0"> In the version of QLF output by the CLE's analyser, content word senses are represented as predicates and predicate-argument relations are shown, so that selecting a single QLF during disambiguation entails resolving content word senses and many structural ambiguities. However, many function words, particularly prepositions, are not resolved to senses, and quantifier scope and anaphoric references are also left unresolved. Some syntactic information, such as number and tense, is represented. Thus QLF encodes quite a wide range of the syntactic and semantic information that can be useful both in supervised training and in run-time disambiguation.</Paragraph>
    <Paragraph position="1"> QLFs are designed to be appropriate for the inference or other processing that follows utterance analysis in whatever application (translation, database query, etc.) the CLE is being used for. However, they are not easy for humans to work with directly in supervised training. Even for an expert, inspecting all the analyses produced for a sentence is a tedious and time-consuming task. There may be dozens of analyses that are variations on a small number of largely independent themes: choices of word sense, modifier attachment, conjunction scope and so on.</Paragraph>
    <Paragraph position="2"> Further, if the representation language is designed with semantic and computational considerations in mind, there is no reason why it should be easy to read even for someone who fully understands it. And indeed, as already argued, it is preferable that selection of the correct analysis should as far as possible not require the intervention of experts at all.</Paragraph>
    <Paragraph position="3"> The TreeBanker (and, in fact, the CLE's preference mechanism, omitted here for space reasons but discussed in detail by Becket et al, forthcoming) therefore treats a QLF as completely characterized by its properties: smaller pieces of information, extracted from the QLF or the syntax tree associated with it, that are likely to be easy for humans to work with.</Paragraph>
    <Paragraph position="4"> The TreeBanker presents instances of many kinds of property to the user during training. However, its functionality in no way depends on the specific nature of QLF, and in fact its first action in the training process is to extract properties from QLFs and their associated parse trees, and then never again to process the QLFs directly. The database of analysed sentences that it maintains contains only these properties and not the analyses themselves.</Paragraph>
    <Paragraph position="5"> It would therefore be straightforward to adapt the TreeBanker to any system or formalism from which properties could be derived that both distinguished competing analyses and could be presented to a non-expert user in a comprehensible way. Many mainstream systems and formalisms would satisfy these criteria, including ones such as the University of Pennsylvania Treebank (Marcus et al, 1993) which are purely syntactic (though of course, only syntactic properties could then be extracted). Thus although I will ground the discussion of the TreeBanker in its use in adapting the CLE system to the ATIS domain, the work described is of much more general application.</Paragraph>
  </Section>
  <Section position="5" start_page="0" end_page="10" type="metho">
    <SectionTitle>
3 Discriminant-Based Training
</SectionTitle>
    <Paragraph position="0"> Many of the properties extracted from QLFs can be presented to non-expert users in a form they can easily understand. Those properties that hold for some analyses of a particular utterance but not for others I will refer to as discriminants (Dagan and Itai, 1994; Yarowsky, 1994). Discriminants that fairly consistently hold for correct but not (some) incorrect analyses, or vice versa, are likely to be useful in distinguishing correct from incorrect analyses at run time.</Paragraph>
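As a minimal sketch of this definition (in Python; the function name, analysis ids and property strings are illustrative assumptions, not part of the TreeBanker itself), a property counts as a discriminant when it holds for at least one analysis of an utterance but not for all of them:

```python
def find_discriminants(analyses):
    """Return the properties that hold for some, but not all, analyses.

    `analyses` maps an analysis id to the set of properties extracted from it.
    The ids and property strings used below are hypothetical.
    """
    property_sets = list(analyses.values())
    if not property_sets:
        return set()
    all_properties = set().union(*property_sets)
    shared_by_all = set.intersection(*property_sets)
    # Properties shared by every analysis cannot distinguish between them.
    return all_properties - shared_by_all


example = {
    "qlf1": {"NP: the flights to Boston serving a meal",
             "VP: serving a meal", "serve = provide"},
    "qlf2": {"NP: the flights to Boston serving a meal",
             "VP: serving a meal", "serve = fly to"},
    "qlf3": {"VP: serving a meal", "serve = provide"},
}
discriminants = find_discriminants(example)
```

In this toy input, "VP: serving a meal" holds for every analysis and so is a property but not a discriminant, whereas the two senses of "serve" and the NP reading are discriminants.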
    <Paragraph position="1"> Thus for training on an utterance to be effective, we need to provide enough &amp;quot;user-friendly&amp;quot; discriminants to allow the user to select the correct analyses, and as many as possible &amp;quot;system-friendly&amp;quot; discriminants that, over the corpus as a whole, distinguish reliably between correct and incorrect analyses. Ideally, a discriminant will be both user-friendly and system-friendly, but this is not essential. In the rest of this paper we will only encounter user-friendly properties and discriminants.</Paragraph>
    <Paragraph position="2"> The TreeBanker presents properties to the user in a convenient graphical form, exemplified in Figure 1 for the sentence &amp;quot;Show me the flights to Boston serving a meal&amp;quot;. Initially, all discriminants are displayed in inverse video to show they are viewed as undecided. Through the disambiguation process, discriminants and the analyses they apply to can be undecided, correct (&amp;quot;good&amp;quot;, shown in normal video), or incorrect (&amp;quot;bad&amp;quot;, normal video but preceded by a negation symbol). The user may click on any discriminant with the left mouse button to select it as correct, or with the right button to select it as incorrect. The types of property currently extracted, ordered approximately from most to least user-friendly, are as follows; examples are taken from the six QLFs for the sentence used in Figure 1.</Paragraph>
    <Paragraph position="3">  discriminant, holding only for readings that could be paraphrased &amp;quot;show me the flights to Boston while you're serving a meal&amp;quot;); VP for &amp;quot;serving a meal&amp;quot; (holds for all readings, so not a discriminant and not shown in figure 1).</Paragraph>
    <Paragraph position="4"> Semantic triples: relations between word senses mediated usually by an argument position, preposition or conjunction (Alshawi and Carter, 1994). Examples here (abstracting from senses to root word forms, which is how they are presented to the user) are &amp;quot;flight to Boston&amp;quot; and &amp;quot;show -to Boston&amp;quot; (the &amp;quot;-&amp;quot; indicates that the attachment is not a low one; this distinction is useful at run time as it significantly affects the likelihood of such discriminants being correct). Argument-position relations are less user-friendly and so are not displayed.</Paragraph>
    <Paragraph position="5"> When used at run time, semantic triples undergo abstraction to a set of semantic classes defined on word senses. For example, the obvious senses of &amp;quot;Boston&amp;quot;, &amp;quot;New York&amp;quot; and so on all map onto the class name co_city. These classes are currently defined manually by experts; however, only one level of abstraction, rather than a full semantic hierarchy, seems to be required, so the task is not too arduous.</Paragraph>
    <Paragraph position="6"> Word senses: &amp;quot;serve&amp;quot; in the sense of &amp;quot;fly to&amp;quot; (&amp;quot;does United serve Dallas?&amp;quot;) or &amp;quot;provide&amp;quot; (&amp;quot;does that flight serve meals?&amp;quot;).</Paragraph>
    <Paragraph position="7"> * Sentence type: imperative sentence in this case (other moods are possible; fragmentary sentences are displayed as &amp;quot;elliptical NP&amp;quot;, etc). * Grammar rules used: the rule name is given.</Paragraph>
    <Paragraph position="8"> This can be useful for experts in the minority of cases where their intervention is required.</Paragraph>
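The run-time abstraction of semantic triples to semantic classes, mentioned in the list of property types above, might look roughly like the following Python sketch. Only the class name co_city comes from the text; the tuple representation of a triple, the table SENSE_TO_CLASS, the function abstract_triple, and the choice to leave unmapped senses unchanged are all assumptions made for illustration.

```python
# Hypothetical one-level mapping from word senses to semantic classes.
SENSE_TO_CLASS = {
    "Boston": "co_city",
    "New York": "co_city",
}


def abstract_triple(triple):
    """Replace each word sense in a (head, relation, dependent) triple
    by its semantic class, if one is defined; otherwise keep it as is."""
    head, relation, dependent = triple
    return (
        SENSE_TO_CLASS.get(head, head),
        relation,
        SENSE_TO_CLASS.get(dependent, dependent),
    )


boston = abstract_triple(("flight", "to", "Boston"))
new_york = abstract_triple(("flight", "to", "New York"))
```

Under this mapping the two triples abstract to the same class-level triple, which is what lets judgments made for one city generalize to another.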
    <Paragraph position="9"> In all, 27 discriminants are created for this sentence, of which 15 are user-friendly enough to display, and a further 28 non-discriminant properties may be inspected if desired. This is far more than the three distinct differences between the analyses (&amp;quot;serve&amp;quot; as &amp;quot;fly to&amp;quot; or &amp;quot;provide&amp;quot;; &amp;quot;to Boston&amp;quot; attaching to &amp;quot;show&amp;quot; or &amp;quot;flights&amp;quot;; and, if &amp;quot;to Boston&amp;quot; does attach to &amp;quot;flights&amp;quot;, a choice between &amp;quot;serving a meal&amp;quot; as relative or adverbial). The effect of this is that the user can give attention to whatever discriminants he[1] finds it easiest to judge; other, harder ones will typically be resolved automatically by the TreeBanker as it reasons about what combinations of discriminants apply to which analyses. The first rule the TreeBanker uses in this reasoning process to propagate decisions is: R1 If an analysis (represented as a set of discriminants) has a discriminant that the user has marked as bad, then the analysis must be bad.</Paragraph>
    <Paragraph position="10"> This rule is true by definition. The other rules used depend on the assumption that there is exactly one good analysis among those that have been found, which is of course not true for all sentences; see Section 4 below for the ramifications of this.</Paragraph>
    <Paragraph position="11"> [1] I make the customary apologies for this use of pronouns, and offer the excuse that most use of the TreeBanker to date has been by men.</Paragraph>
    <Paragraph position="12">  R2 If a discriminant is marked as good, then only analyses of which it is true can be good (since there is at most one good analysis).</Paragraph>
    <Paragraph position="13"> R3 If a discriminant is true only of bad analyses, then it is bad (since there is at least one good analysis).</Paragraph>
    <Paragraph position="14"> R4 If a discriminant is true of all the undecided  analyses, then it is good (since it must be true of the correct one, whichever it is).</Paragraph>
    <Paragraph position="15"> Thus if the user selects &amp;quot;the flights to Boston serving a meal&amp;quot; as a correct NP, the TreeBanker applies rule R2 to narrow down the set of possible good analyses to just two of the six (hence the item &amp;quot;2 good QLFs&amp;quot; at the top of the control menu in the figure; this is really a shorthand for &amp;quot;2 possibly good QLFs&amp;quot;). It then applies R1-R4 to resolve all the other discriminants except the two for the sense of &amp;quot;serve&amp;quot;; and only those two remain highlighted in inverse video in the display, as shown in Figure 2.</Paragraph>
    <Paragraph position="16"> So, for example, there is no need for the user explicitly to make the trickier decision about whether or not &amp;quot;serving a meal&amp;quot; is an adverbial phrase. The user simply clicks on &amp;quot;serve = provide&amp;quot;, at which point R2 is used to rule out the other remaining analysis and then R3 to decide that &amp;quot;serve = fly to&amp;quot; is bad.</Paragraph>
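The propagation just described can be sketched in Python as a fixed-point loop over rules R1-R4. This is an illustrative reconstruction under stated assumptions, not the TreeBanker's actual code: the function name propagate, the data layout, and the six toy analyses (which only mimic the three ambiguities of the example sentence) are all hypothetical.

```python
def propagate(analyses, user_marks):
    """Apply rules R1-R4 repeatedly until nothing changes.

    analyses: dict mapping analysis id to the set of discriminants true of it.
    user_marks: dict mapping a discriminant to "good" or "bad".
    Returns (marks, bad_analyses) after propagation.
    """
    marks = dict(user_marks)
    bad = set()
    all_discriminants = set().union(*analyses.values())
    changed = True
    while changed:
        changed = False
        for name, props in analyses.items():
            if name in bad:
                continue
            # R1: an analysis with a bad discriminant is bad.
            has_bad = any(marks.get(d) == "bad" for d in props)
            # R2: an analysis lacking a good discriminant is bad.
            lacks_good = any(m == "good" and d not in props
                             for d, m in marks.items())
            if has_bad or lacks_good:
                bad.add(name)
                changed = True
        undecided = [p for n, p in analyses.items() if n not in bad]
        for d in all_discriminants:
            if d in marks:
                continue
            holders = [n for n, p in analyses.items() if d in p]
            if all(n in bad for n in holders):
                marks[d] = "bad"    # R3: true only of bad analyses
                changed = True
            elif undecided and all(d in p for p in undecided):
                marks[d] = "good"   # R4: true of all undecided analyses
                changed = True
    return marks, bad


NP = "NP: the flights to Boston serving a meal"
ADV = "ADVP: serving a meal"
toy = {
    "a1": {"show -to Boston", "serve = provide"},
    "a2": {"show -to Boston", "serve = fly to"},
    "a3": {"flight to Boston", NP, "serve = provide"},
    "a4": {"flight to Boston", NP, "serve = fly to"},
    "a5": {"flight to Boston", ADV, "serve = provide"},
    "a6": {"flight to Boston", ADV, "serve = fly to"},
}
# First click: the user approves the NP.
marks1, bad1 = propagate(toy, {NP: "good"})
# Second click: the user approves "serve = provide".
marks2, bad2 = propagate(toy, dict(marks1, **{"serve = provide": "good"}))
```

With these toy analyses, the first click leaves two analyses and only the two "serve" discriminants undecided, and the second click leaves a single analysis, mirroring the sequence described in the text. The guard on R4 (requiring at least one undecided analysis) is a design choice of this sketch, so that a coverage failure in which every analysis is bad does not mark discriminants as vacuously good.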
    <Paragraph position="17"> The TreeBanker's propagation rules often act like this to simplify the judging of sentences whose discriminants combine to produce an otherwise unmanageably large number of QLFs. As a further example, the sentence &amp;quot;What is the earliest flight that has no stops from Washington to San Francisco on Friday?&amp;quot; yields 154 QLFs and 318 discriminants, yet the correct analysis may be obtained with only two selections. Selecting &amp;quot;the earliest flight ... on Friday&amp;quot; as an NP eliminates all but twenty of the analyses produced, and approving &amp;quot;that has no stops&amp;quot; as a relative clause eliminates eighteen of these, leaving two analyses which are both correct for the purposes of translation. 152 incorrect analyses may thus be dismissed in less than fifteen seconds.</Paragraph>
    <Paragraph position="18"> The utterance &amp;quot;Show me the flights serving meals on Wednesday&amp;quot; demonstrates the TreeBanker's facility for presenting the user with multiple alternatives for determining correct analyses. As shown in Figure 3, the following decisions must be made: * Does &amp;quot;serving&amp;quot; mean &amp;quot;flying to&amp;quot; or &amp;quot;providing&amp;quot;? * Does &amp;quot;on Wednesday&amp;quot; modify &amp;quot;show&amp;quot;, &amp;quot;flights&amp;quot;, &amp;quot;serving&amp;quot; or &amp;quot;meals&amp;quot;? * Does &amp;quot;serving&amp;quot; modify &amp;quot;show&amp;quot; or &amp;quot;flights&amp;quot;? but this can be done by approving and rejecting various constituents such as &amp;quot;the flights serving meals&amp;quot; and &amp;quot;meals on Wednesday&amp;quot;, or through the selection of triples such as &amp;quot;flight -on Wednesday&amp;quot;. Whichever method is used, the user can choose among the 14 QLFs produced for this sentence within twenty seconds.</Paragraph>
  </Section>
  <Section position="6" start_page="10" end_page="13" type="metho">
    <SectionTitle>
4 Additional Functionality
</SectionTitle>
    <Paragraph position="0"> Although primarily intended for the disambiguation of corpus sentences that are within coverage, the TreeBanker also supports the diagnosis and categorization of coverage failures. Sometimes, the user may suspect that none of the provided analyses for a sentence is correct. This situation often becomes apparent when the TreeBanker (mis-)applies rules R2-R4 above and insists on automatically assigning incorrect values to some discriminants when the user makes decisions on others; the coverage failure may be confirmed, if the user is relatively accomplished, by inspecting the non-discriminant properties as well (thus turning the constituent window into a display of the entire parse forest) and verifying that the correct parse tree is not among those offered. Then the user may mark the sentence as &amp;quot;Not OK&amp;quot; and classify it under one of a number of failure types, optionally typing a comment as well. At a later stage, a system expert may ask the TreeBanker to print out all the coverage failures of a given type as an aid to organizing work on grammar and lexicon development. For some long sentences with many different readings, more discriminants may be displayed than will fit onto the screen at one time. In this case, the user may judge one or two discriminants (scrolling if necessary to find likely candidates), and ask the TreeBanker thereafter to display only undecided discriminants; these will rapidly reduce in number as decisions are made, and can quite soon all be viewed at once.</Paragraph>
    <Paragraph position="1"> If the user changes his mind about a discriminant, he can click on it again, and the TreeBanker will take later judgments as superseding earlier ones, inferring other changes on that basis. Alternatively, the &amp;quot;Reset&amp;quot; button may be pressed to undo all judgments for the current sentence.</Paragraph>
    <Paragraph position="2"> It has proved most convenient to organize the corpus into files that each contain data for a few dozen sentences; this is enough to represent a good-sized corpus in a few hundred files, but not so big that the user is likely to want to finish his session in the middle of a file.</Paragraph>
    <Paragraph position="3"> Once part of the corpus has been judged and the information extracted for run-time use (not discussed here), the TreeBanker may be told to resolve discriminants automatically when their values can safely be inferred. In the ATIS domain, &amp;quot;show -to (city)&amp;quot; is a triple that is practically never correct, since it only arises from incorrect PP attachments in sentences like &amp;quot;Show me flights to New York&amp;quot;. The user can then be presented with an initial screen in which that choice, and others resulting from it, are already made. This speeds up his work, and may in fact mean that some sentences do not need to be presented at all.</Paragraph>
    <Paragraph position="4"> In practice, coverage development tends to overlap somewhat with the judging of a corpus. In view of this, the TreeBanker includes a &amp;quot;merge&amp;quot; option which allows existing judgments applying to an old set of analyses of a sentence to be transferred to a new set that reflects a coverage change. Properties tend to be preserved much better than whole analyses as coverage changes; and since only properties, and not analyses, are kept in the corpus database, the vast bulk of the judgments made by the user can be preserved.</Paragraph>
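The paper does not spell out how the merge option works internally. One plausible reading, sketched here in Python with a hypothetical function merge_judgments and a hypothetical data layout, is that judgments are keyed by property, so any judgment whose property still occurs among the new analyses can simply be carried over:

```python
def merge_judgments(old_marks, new_analyses):
    """Carry judgments over to a new set of analyses of the same sentence.

    old_marks: dict mapping a property to "good" or "bad" (earlier judgments).
    new_analyses: dict mapping analysis id to the set of properties extracted
        after the coverage change.
    Returns (kept, dropped): judgments that still apply, and judgments whose
    property no longer appears in any new analysis.
    """
    current = set().union(*new_analyses.values())
    kept = {p: m for p, m in old_marks.items() if p in current}
    dropped = {p: m for p, m in old_marks.items() if p not in current}
    return kept, dropped


old = {"flight to Boston": "good", "show -to Boston": "bad",
       "rule: old_np_rule": "bad"}
new = {
    "b1": {"flight to Boston", "serve = provide"},
    "b2": {"show -to Boston", "serve = provide"},
}
kept, dropped = merge_judgments(old, new)
```

Here the two triple judgments survive the coverage change, while the judgment on a grammar rule that no longer fires (a made-up rule name) is set aside.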
    <Paragraph position="5"> The TreeBanker can also interact directly with the CLE's analysis component to allow a user or developer to type sentences to the system, see what discriminants they produce, and select one analysis for further processing. This configuration can be used in a number of ways. Newcomers can use it to familiarize themselves with the system's grammar. More generally, beginning students of grammar can use it to develop some understanding of what grammatical analysis involves. It is also possible to use this mode during grammar development as an aid to visualizing the effect of particular changes to the grammar on particular sentences.</Paragraph>
  </Section>
</Paper>