<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-0502">
  <Title>Generating Hebrew verb morphology by default inheritance hierarchies</Title>
  <Section position="8" start_page="0" end_page="0" type="concl">
    <SectionTitle>
9 DATR and KATR
</SectionTitle>
    <Paragraph position="0"> We discuss KATR and its relation to DATR extensively elsewhere (Finkel et al., 2002); here we only summarize the differences. The DATR formalism is quite powerful; we have demonstrated that it is capable of emulating a Turing machine. The KATR enhancements are therefore aimed at usability, not theoretical power. The principal innovations of KATR are: • Set notation. The left-hand sides of DATR rules may only use list notation. KATR allows set notation as well, which allows us to deal with morphosyntactic properties in any order.</Paragraph>
    <Paragraph position="1"> Hebrew verb morphology provides abundant motivation for this enhancement. In the VERBSUFFIX node, Rule 15 identifies its suffix as an exponent of number and gender but not of person; Rule 10 identifies its suffix as an exponent of person and number but not of gender. Both rules are indifferent to the order in which properties of person, number, and gender are listed in any matching query. If a rule's left-hand side were required to be a list (as in ordinary DATR), then one of these two rules would have to be complicated by the inclusion of either a variable over properties of person (Rule 15) or a variable over properties of gender (Rule 10); moreover, all queries would have to adhere to a fixed (but otherwise unmotivated) ordering among properties of person, number, and gender.</Paragraph>
    <Paragraph position="2"> • Regular expressions. KATR allows limited regular expressions in lists in left-hand sides of rules; DATR has no such expressions. We use this facility in the ACCENT node in the Hebrew theory, both for the Kleene star * and for the ? operator. More generally, we often find regular expressions valuable in representing non-local sandhi phenomena, such as the Sanskrit rule of n-retroflexion.</Paragraph>
    <Paragraph position="3"> • Non-subtractive rules. DATR rules have a subtractive quality: The atoms of the query matched by the left-hand side are removed from the query used for subsequent evaluation in the right-hand side. The KATR =+= operator allows us to represent rules that preserve the atoms matched by the left-hand side, substituting new atoms where necessary. We generally use this facility for rules of referral. For example, Latin neuter nouns share the same nominative and accusative plural; we capture this fact by a rule that converts accusative to nominative in the context of neuter plural. In the Hebrew theory, we use non-subtractive rules to convert shewa naḥ to shewa na‘.</Paragraph>
    <Paragraph position="4"> • Enhanced matching length. In some cases, competing rules have left-hand sides of the same length, but one of the rules should always be chosen when both apply. KATR includes the ++ syntax for explicitly enhancing the effective length of the preferred left-hand side; we use this facility in the VERBSUFFIX node. DATR does not have this syntax.</Paragraph>
    <Paragraph position="5"> • Syntax. KATR has several minor syntax enhancements. It allows special characters to be used as atoms if escaped by the \ character.</Paragraph>
    <Paragraph position="6"> The atom $$ can be used to match the end of the query. Variables can be computed instead of being enumerated; we use this facility in defining the $letter variable. KATR allows greater control over which nodes are to be displayed under default queries. The interactive KATR program has new facilities for rapid testing and debugging of theories.</Paragraph>
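The contrast between list and set matching, and between subtractive and non-subtractive rules, can be illustrated with a minimal Python sketch. It is not KATR's implementation: the function names and the atom spellings are assumptions made for illustration, and the Latin example only mirrors the accusative-to-nominative referral described above.

```python
# Illustrative sketch only; names and atoms are hypothetical, not KATR syntax.

def list_match(lhs, query):
    """DATR-style list match: lhs must be a prefix of the query, in order."""
    return tuple(query[:len(lhs)]) == tuple(lhs)


def set_match(lhs, query):
    """KATR-style set match: every lhs atom occurs somewhere in the query."""
    return set(lhs).issubset(query)


def subtractive_rest(lhs, query):
    """Subtractive rule: matched atoms are dropped before further evaluation."""
    matched = set(lhs)
    return [atom for atom in query if atom not in matched]


def non_subtractive_rest(query, substitutions):
    """Non-subtractive rule: atoms are kept, with selected atoms replaced."""
    return [substitutions.get(atom, atom) for atom in query]


if __name__ == "__main__":
    query = ["third", "masc", "plural"]
    print(list_match(["plural", "masc"], query))   # order matters for lists
    print(set_match(["plural", "masc"], query))    # order is irrelevant for sets
    print(subtractive_rest(["masc", "plural"], query))
    # Rule of referral: accusative is rewritten as nominative, nothing is removed.
    print(non_subtractive_rest(["neuter", "plural", "accusative"],
                               {"accusative": "nominative"}))
```

With set matching, a rule that mentions only number and gender applies regardless of where a person atom appears in the query, which is the property the VERBSUFFIX rules rely on.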
    <Paragraph position="7"> KATR is entirely coded in Java, making it quite portable to a variety of platforms. It runs as an interactive program, with commands for compiling theories, executing queries, and performing various debugging functions. The KATR algorithm is based on evaluating a query at a node within a context.</Paragraph>
    <Paragraph position="8"> First, KATR identifies the rule within the node with the best matching left-hand side. The result of the query involves evaluating the associated right-hand side, which might require further evaluations of new queries at a variety of nodes and contexts; KATR recursively undertakes these evaluations. The algorithm is completely deterministic and reasonably fast: Compiling the entire Hebrew theory and evaluating all the forms of a verb takes about 2 seconds on an 863 MHz Linux machine.</Paragraph>
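The evaluation procedure just described (pick the best-matching rule at a node, then recursively evaluate its right-hand side) can be sketched roughly as follows. This is a toy model under stated assumptions, not the KATR algorithm itself: the theory, the node names WORD and SUFFIX, the output strings, and the integer "boost" standing in for the ++ syntax are all invented for illustration, and contexts are omitted.

```python
# Toy model of best-match rule selection with recursive evaluation.
# Every node name, rule, and output string below is hypothetical.

THEORY = {
    "WORD": [
        # (left-hand side as a set, length boost, right-hand side)
        (frozenset(), 0, ["STEM", ("ref", "SUFFIX")]),
    ],
    "SUFFIX": [
        (frozenset(), 0, []),                            # default: no suffix
        (frozenset({"plural"}), 0, ["-PL"]),
        (frozenset({"plural", "masc"}), 0, ["-MPL"]),
        (frozenset({"plural", "third"}), 1, ["-3PL"]),   # boosted, wins ties
    ],
}


def best_rule(rules, query):
    """Return the matching rule with the greatest effective length."""
    atoms = set(query)
    candidates = [rule for rule in rules if rule[0].issubset(atoms)]
    if not candidates:
        raise LookupError("no rule matches the query")
    return max(candidates, key=lambda rule: len(rule[0]) + rule[1])


def evaluate(node, query):
    """Evaluate a query at a node, following references recursively."""
    lhs, _boost, rhs = best_rule(THEORY[node], query)
    # Subtractive behavior: matched atoms are not passed on.
    rest = [atom for atom in query if atom not in lhs]
    result = []
    for item in rhs:
        if isinstance(item, tuple):      # ("ref", node): evaluate elsewhere
            result.extend(evaluate(item[1], rest))
        else:                            # literal output atom
            result.append(item)
    return result


if __name__ == "__main__":
    print(evaluate("WORD", ["masc", "plural"]))
    print(evaluate("WORD", ["third", "masc", "plural"]))
    print(evaluate("WORD", ["singular"]))
```

In this sketch the two SUFFIX rules with two-atom left-hand sides would otherwise tie on a third-person masculine plural query; the boost resolves the tie, in the spirit of the enhanced matching length described earlier. The empty left-hand side plays the role of a default that more specific rules override.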
    <Paragraph position="9"> The interested reader can acquire KATR and our Hebrew morphology theory from the authors (under the GNU General Public License).</Paragraph>
    <SectionTitle>
10 Strategies for building KATR theories
</SectionTitle>
    <Paragraph position="10"> We have been applying KATR to generation of natural-language morphology for several years. In addition to Hebrew, we have built a complete morphology of Latin verbs and nouns, large parts of Sanskrit (and other related languages), and smaller studies of Bulgarian, Swahili, Georgian, and Turkish. We have found that KATR allows us to represent morphological rules for these languages with great elegance. It is especially well-suited to cases like Hebrew verbs, where a similar structure applies across the entire spectrum of words, and where that spectrum is partitioned into binyanim with distinguishable rules, but where euphony introduces standard vowel shifts based on accent, guttural letters, and weak letters.</Paragraph>
    <Paragraph position="11"> As we have gained experience with KATR, we have noted encoding strategies that apply across language families; we used each of these in our Hebrew verb specification.</Paragraph>
    <Paragraph position="12"> • Priming. A node representing a specific category provides information needed by more general nodes to which it refers queries. Rules in the more general nodes refer to primed information by means of quoted queries.</Paragraph>
    <Paragraph position="13"> • Lookup. A node is invoked solely to provide information needed by other rules.</Paragraph>
    <Paragraph position="14"> • Overriding. A node representing a specific category answers a query that is usually answered (with different results) by a more general node to which queries are usually referred.</Paragraph>
    <Paragraph position="15"> • Specializing. A rule introduces a specific exception to a more general pattern specified by another rule housed at the same node.</Paragraph>
    <Paragraph position="16"> The strategies of overriding and specializing both exploit the nonmonotonicity inherent in KATR's semantics.</Paragraph>
    <Paragraph position="17"> • Combining. A rule concatenates various morphological units by referring queries to multiple nodes.</Paragraph>
    <Paragraph position="18"> • Postprocessing. The result of combining morphological units is referred to a node that makes local adjustments to account for euphony and other sandhi principles.</Paragraph>
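The combining and postprocessing strategies listed above can be pictured as a two-stage pipeline. The following Python sketch is only an analogy: the morphological units and both adjustment rules are invented toy examples, not rules from the Hebrew theory, and regular-expression substitution stands in for what a KATR postprocessing node would do.

```python
import re

# Hypothetical adjustment rules, applied in order to the concatenated form.
TOY_ADJUSTMENTS = [
    (re.compile(r"aa"), "a"),          # merge two adjacent identical vowels
    (re.compile(r"n(?=[pb])"), "m"),   # assimilate n before a labial stop
]


def combine(prefix, stem, suffix):
    """Combining: concatenate units supplied by separate lookups."""
    return prefix + stem + suffix


def postprocess(form):
    """Postprocessing: apply local euphonic adjustments to the whole form."""
    for pattern, replacement in TOY_ADJUSTMENTS:
        form = pattern.sub(replacement, form)
    return form


if __name__ == "__main__":
    raw = combine("in", "pa", "a")
    print(raw, postprocess(raw))
```

Keeping the adjustments in a separate stage means the units themselves can be stated in their basic shapes, with boundary effects handled once for all combinations.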
    <Paragraph position="19"> We do not want to leave the impression that writing specifications in KATR is easy. The tool is capable of presenting elegant specifications, but arriving at those specifications requires considerable effort. Early choices color the entire structure of the resulting KATR specification, and it happens frequently that the author of a specification must discard code and rethink how to represent the morphological structures that are being specified. Perhaps our experience will eventually lead to a second-generation KATR that better facilitates the linguist's task.</Paragraph>
    <Paragraph position="20"> The definition of Hebrew verb inflection that we have sketched here rests on the hypothesis that an inflected word's morphological form is determined by a system of realization rules organized in a default inheritance hierarchy. There are other approaches to defining Hebrew verb inflection; one could, for example, assume that an inflected word's form is determined by a ranked system of violable constraints on morphological structure, as in Optimality Theory (Prince and Smolensky, 1993), or by a finite-state machine (Karttunen, 1993). The facts of Hebrew verb inflection are apparently compatible with any of these approaches. Even so, there are strong theoretical grounds for preferring our approach. It provides a uniform, well-defined architecture for the representation of both morphological rules and lexical information. Moreover, it embodies the assumption that inflectional morphology is inferential and realizational, readily accommodating such phenomena as extended exponence and the frequent underdetermination of morphosyntactic content by inflectional form; in this sense, it effectively excludes a morpheme-based conception of word structure, unlike both the optimality-theoretic and the finite-state approaches.</Paragraph>
  </Section>
</Paper>