<?xml version="1.0" standalone="yes"?> <Paper uid="J02-3005"> <Title>© 2002 Association for Computational Linguistics. Squibs and Discussions. A Note on Typing Feature Structures</Title> <Section position="6" start_page="393" end_page="396" type="evalu"> <SectionTitle> 4.2 Results </SectionTitle> <Paragraph position="0"> Applying the type inference algorithm to the XTAG English grammar, we have validated the consistency of all feature structures specified in the grammar. We have been able to detect a great number of errors, which we discuss in this section. [Figure caption: Inferred TFSs.]</Paragraph> <Paragraph position="1"> The errors can be classified into four different types: ambiguous names, typos, undocumented features, and plain errors.</Paragraph> <Paragraph position="2"> 4.2.1 Ambiguous Names. Ambiguous names are not easy to track without the typing mechanism that we discuss in this article. As the XTAG grammar has been developed by as many as a dozen developers over a period of more than a decade, such errors are probably unavoidable. Specifically, a single name is used for two different features or values, with completely different intentions in mind.</Paragraph> <Paragraph position="3"> We have found several such errors in the grammar.</Paragraph> <Paragraph position="4"> The feature gen was used for two purposes: in nouns, it referred to gender and took values such as masc, fem, or neuter; in pronouns, it was a boolean feature denoting genitive case. We even found a few cases in which the values of these incompatible features were equated. As another example, the value nom was used both to denote nominative case, where it was an appropriate value for the case feature, and to denote a nominal predicate, where it was the appropriate value of the mode feature. Of course, these two features have nothing to do with each other and should never be equated (hence, should never have the same value).
Finally, values such as nil or none were used abundantly for a variety of purposes.</Paragraph> <Paragraph position="5"> Footnote 3: Recall that, by the feature introduction condition, each feature must be introduced by some most general type (and be appropriate for all its subtypes).</Paragraph> <Paragraph position="6"> [Running head: Wintner and Sarkar, A Note on Typing Feature Structures] 4.2.2 Typos. The best example is probably a feature that occurred about 80% of the time as relpron and the rest of the time as rel-pron: S_r.t:<relpron> = NP_w.t:<rel-pron>. 4.2.3 Undocumented Features. The grammar uses features that are not mentioned in the technical report documenting it. Some of them turned out to be remnants of old analyses that were obsolete; others indicated a need for better documentation. Of course, the fewer features the grammar uses, the more efficient unification (and, hence, parsing) becomes.</Paragraph> <Paragraph position="7"> Other cases necessitated updates of the grammar documentation. For example, the feature displ-const was documented as taking boolean values but turned out to be a complex feature, with a substructure under the feature set1. The feature gen (in its gender use) was defined at the top level of nouns, whereas it should have been under the agr feature.</Paragraph> <Paragraph position="8"> 4.2.4 Other Errors. Finally, some errors are plain mistakes of the grammar designer. For example, the specification S_r.t:<assign-case> = NP_w.t:<assign-case> implies that assign-case is appropriate for nouns, which is of course wrong; the specification S_r.t:<case> = nom implies that sentences have cases; and the specification V.t:<refl> = V_r.b:<refl> implies that verbs can be reflexive. Another example is the specification D_r.b:<punct bal> = Punct_1.t:<punct>, which handles the balancing of punctuation marks such as parentheses.
This should have been either</Paragraph> <Paragraph position="10"> bal>.</Paragraph> <Section position="1" start_page="395" end_page="396" type="sub_section"> <SectionTitle> 4.3 Additional Advantages </SectionTitle> <Paragraph position="0"> Since the feature structure validation procedure practically expands path equations to (most general) totally well-typed feature structures, we have implemented a mode in which the system outputs the expanded TFSs. Users can thus have a better idea of what feature structures are associated with tree nodes, both because all the features are present, and because typing adds information that was unavailable in the untyped specification. As an example, consider the following specification:</Paragraph> <Paragraph position="2"> When it is expanded by the system, the TFS that is output for PP.b is depicted in Figure 5 (left). Note that the type of this TFS was set to p or v or comp, indicating that there is not sufficient information for the type inference procedure to distinguish among these three types. Many features that are not explicitly mentioned are added by the inference procedure, with their &quot;default&quot; (most general) values.</Paragraph> <Paragraph position="3"> The node N.t is associated with a TFS, parts of which are depicted in Figure 5 (right). It is worth noting that the type of this TFS was correctly inferred to be noun, and that the case feature is reentrant with the assign-case feature of the PP.b node (through the reentrancy tag [304]), thus restricting it to nom, although the specification listed a disjunctive value, nom/acc.</Paragraph> </Section> </Section> </Paper>