<?xml version="1.0" standalone="yes"?>
<Paper uid="P86-1019">
  <Title>COMPUTER METHODS FOR MORPHOLOGICAL ANALYSIS</Title>
  <Section position="2" start_page="0" end_page="120" type="intro">
    <SectionTitle>
1. Introduction
</SectionTitle>
    <Paragraph position="0"> This paper describes our current research on the properties of derivational affixation in English. Our research arises from a more general research project, the Lexical Systems project at the IBM Thomas J. Watson Research laboratories, the goal of which is to build a variety of computerized dictionary systems for use both by people and by computer programs. An important sub-goal is to build reliable and robust word recognition mechanisms for these dictionaries. One of the more important issues in word recognition for all morphologically complex languages involves mechanisms for dealing with affixes.</Paragraph>
    <Paragraph position="1"> Two complementary motivations underlie our research on derivational morphology. On the one hand, our goal is to discover linguistically significant generalizations and principles governing the attachment of affixes to English words to form other words. If we can find such generalizations, then we can use them to build an improved word recognizer. We will be better able to correctly recognize and analyse well-formed words and, conversely, to reject ill-formed words. On the other hand, we want to use our existing word-recognition and analysis programs as tools for gathering further information about English affixation. This circular process allows us to test and refine our emerging word recognition logic while at the same time providing a large amount of data for linguistic analysis.</Paragraph>
    <Paragraph position="2"> It is important to note that, while doing derivational morphology is not the only way to deal with complex words in a computerized dictionary, it offers certain advantages. It allows systems to deal with coinages, a possibility which is not open to most systems. Systems which do no morphology and even those which handle primarily inflectional affixation (such as Winograd (1971) and Koskenniemi (1983)) are limited by the fixed size of their lists of stored words. Koskenniemi claims that his two-level morphology framework can handle derivational affixation, although his examples are all of inflectional processes. It is not clear how that framework accounts for the variety of phenomena that we observe in English derivational morphology.</Paragraph>
    <Paragraph position="3"> Morphological analysis also provides an additional source of lexical information about words, since a word's properties can often be predicted from its structure. In this respect, our dictionaries are distinguished from those of Allen (1976), where complex words are merely analysed as concatenations of word-parts, and Cercone (1974), where word structure is not exploited even though derivational affixes are analysed.</Paragraph>
    <Paragraph position="4"> Our morphological analysis system was conceived within the linguistic framework of word-based morphology, as described in Aronoff (1976). In our dictionaries, we store a large number of words, together with associated idiosyncratic information. The retrieval mechanism contains a grammar of derivational (and inflectional) affixation which is used to analyse input strings in terms of the stored words. The mechanism handles both prefixes and suffixes. The framework and mechanism are described in Byrd (1983a). Crucially, in our system, the attachment of an affix to a base word is conditioned on the properties of the base word. The purpose of our research is to determine the precise nature of those conditions. These conditions may refer to syntactic, semantic, etymological, morphological or phonological properties.</Paragraph>
    <Paragraph position="5"> (See Byrd (1983b)).</Paragraph>
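    <Paragraph> The word-based scheme described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy lexicon, the three rules, and the names Entry, AffixRule, LEXICON, RULES and analyse are all assumptions made for illustration. Each rule strips one affix, looks up the remaining base, and accepts the analysis only if a condition on the base word's properties holds (here, only its part of speech).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Entry:
    """A stored word's properties (hypothetical; only part of speech here)."""
    pos: str


@dataclass(frozen=True)
class AffixRule:
    """One illustrative affixation rule; the rule set is an assumption."""
    affix: str
    is_prefix: bool
    result_pos: str
    condition: Callable[[Entry], bool]  # test on the base word's properties


# Toy lexicon of stored words (illustrative entries only).
LEXICON: Dict[str, Entry] = {
    "read": Entry("verb"),
    "kind": Entry("adj"),
    "table": Entry("noun"),
}

# Toy grammar: each affix attaches only to bases that satisfy its condition.
RULES: List[AffixRule] = [
    AffixRule("able", False, "adj", lambda base: base.pos == "verb"),
    AffixRule("ness", False, "noun", lambda base: base.pos == "adj"),
    AffixRule("un", True, "adj", lambda base: base.pos == "adj"),
]


def analyse(word: str) -> Optional[Entry]:
    """Return properties for a stored or derivable word, else None."""
    if word in LEXICON:
        return LEXICON[word]
    for rule in RULES:
        if rule.is_prefix and word.startswith(rule.affix):
            base_form = word[len(rule.affix):]
        elif not rule.is_prefix and word.endswith(rule.affix):
            base_form = word[:-len(rule.affix)]
        else:
            continue
        # Recursion lets stacked affixes (un + read + able) be analysed.
        base = analyse(base_form)
        if base is not None and rule.condition(base):
            return Entry(rule.result_pos)
    return None
```

Under these assumptions, analyse("unreadable") succeeds by way of "readable" and "read", while analyse("tableness") returns None because the toy "ness" rule requires an adjective base. The sketch ignores spelling changes at affix boundaries and the semantic, etymological and phonological conditions mentioned above.</Paragraph>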
    <Paragraph position="6"> Our research is of interest to two related audiences: computational linguists and theoretical linguists. Computational linguists will find here a powerful set of programs for processing natural language material.</Paragraph>
    <Paragraph position="7"> Furthermore, they should welcome the improvements to those programs' capabilities offered by our linguistic results. Theoretical linguists, on the other hand, will find a novel set of tools and data sources for morphological research. The generalizations that result from our analyses should be welcome additions to linguistic theory.</Paragraph>
  </Section>
</Paper>