<?xml version="1.0" standalone="yes"?>
<Paper uid="P96-1053">
  <Title>Using Terminological Knowledge Representation Languages to Manage Linguistic Resources</Title>
  <Section position="3" start_page="0" end_page="366" type="metho">
    <SectionTitle>
2 Sentence Classification
</SectionTitle>
    <Paragraph position="0"> The usual practice in investigating the alternation patterning of a verb is to construct example sentences in which simple, illustrative noun phrases are used as arguments of a verb. The sentences in (1)  exemplify two familiar alternations of give.</Paragraph>
    <Paragraph position="1">  (1) a. John gave Mary a book b. John gave a book to Mary.</Paragraph>
    <Paragraph position="2">  Such sentences exemplify an alternation that belongs to the alternation pattern of their verb.1 I will call this the alternation type of the test sentence. To determine the alternation type of a test sentence, the sentence must be syntactically analyzed so that its grammatical functions (e.g. subject, object) are marked. Then, given semantic feature information about the words filling those grammatical functions (GFs), and information about the possible argument structures for the verb in the sentence and the semantic feature restrictions on these arguments, it is possible to find the argument structures appropriate to the input sentence. Consider the sentences and descriptions shown below for pour: (2) a. \[Mary-subj\] poured \[Tina-obj\] \[a glass of milk-io\]. b. \[Mary-subj\] poured \[a glass of milk-obj\] for \[Tina-ppo\].</Paragraph>
    <Paragraph position="3"> pour1: subj -> agent\[volitional\], obj -> recipient\[volitional\], io -> patient\[liquid\]
pour2: subj -> agent\[volitional\], obj -> patient\[liquid\], ppo -> recipient\[volitional\]
Given the semantic type restrictions and the GFs, pour1 describes (2a) and pour2, (2b). The mapping from the GFs to the appropriate argument structure is similar to lexical rules in the LFG syntactic theory, except that here I semantically type the arguments. To indicate the alternation types for these sentences, I call sentence (2a) a benefactive-ditransitive and sentence (2b) a benefactive-transitive.</Paragraph>
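The GF-to-argument-structure mapping just described can be sketched as follows. This is a minimal Python sketch, not the paper's Classic/LISP implementation; the data layout, role names, and toy feature lexicon are illustrative assumptions.

```python
# Hypothetical encoding of the two pour senses: each sense maps a
# grammatical function (GF) to a (role, semantic restriction) pair.
POUR_SENSES = {
    "pour1": {"subj": ("agent", "volitional"),
              "obj": ("recipient", "volitional"),
              "io": ("patient", "liquid")},
    "pour2": {"subj": ("agent", "volitional"),
              "obj": ("patient", "liquid"),
              "ppo": ("recipient", "volitional")},
}

# Toy lexicon of semantic features for the noun-phrase fillers.
FEATURES = {"Mary": {"volitional"}, "Tina": {"volitional"},
            "a glass of milk": {"liquid"}}

def matching_senses(gf_fillers):
    """Return the senses whose GFs and type restrictions fit the parse."""
    matches = []
    for sense, mapping in POUR_SENSES.items():
        if set(mapping) != set(gf_fillers):
            continue  # every required GF must be present, with no extras
        if all(restriction in FEATURES[gf_fillers[gf]]
               for gf, (_, restriction) in mapping.items()):
            matches.append(sense)
    return matches
```

On sentence (2a), `matching_senses({"subj": "Mary", "obj": "Tina", "io": "a glass of milk"})` selects pour1; the GF set of (2b) selects pour2.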
    <Paragraph position="4"> Classifying a sentence by its alternation type requires linguistic and world knowledge. World knowledge is used in the definitions of nouns and verbs in the lexicon and describes high-level entities, such as events, and animate and inanimate objects. Properties (such as LIQUID) are used to define specialized entities. For example, the property NON-CONSUMABLE (SMALL CAPITALS indicate Classic concepts in my implementation) specializes a LIQUID-ENTITY to define PAINT and distinguish it from WATER, which has the property that it is CONSUMABLE. Specialized EVENT entities are used in the definition of verbs in the lexicon and represent the argument structures for the verbs.</Paragraph>
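The specialization of LIQUID-ENTITY into PAINT and WATER can be sketched as a small concept hierarchy. The paper encodes this in Classic; the Python dictionary below is an illustrative assumption that mirrors the named concepts only.

```python
# Hypothetical encoding of the world-knowledge hierarchy: each concept
# records its parent and the properties that specialize it.
CONCEPTS = {
    "ENTITY": {"parent": None, "properties": set()},
    "LIQUID-ENTITY": {"parent": "ENTITY", "properties": {"LIQUID"}},
    # PAINT specializes LIQUID-ENTITY via NON-CONSUMABLE, WATER via CONSUMABLE.
    "PAINT": {"parent": "LIQUID-ENTITY", "properties": {"NON-CONSUMABLE"}},
    "WATER": {"parent": "LIQUID-ENTITY", "properties": {"CONSUMABLE"}},
}

def is_a(concept, ancestor):
    """Walk the parent chain to test subsumption."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = CONCEPTS[concept]["parent"]
    return False
```

Under this encoding, PAINT and WATER both subsume under LIQUID-ENTITY while remaining distinguished by their consumability property.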
    <Paragraph position="5"> The linguistic knowledge needed to support sentence classification includes the definitions of (1) verb types such as intransitive, transitive and ditransitive; (2) verb definitions; and (3) concepts that define the links between the GFs and verb argument structures as represented by events.</Paragraph>
    <Paragraph position="6"> 1 In the examples that I will consider, and in most examples used by linguists to test alternation patterns, there will only be one verb; this is the verb to be tested.
Verb types (SUBCATEGORIZATIONS) are defined according to the GFs found in the sentence. For example, (2a) classifies as DITRANSITIVE and (2b) as a specialized TRANSITIVE with a PP. Once the verb type is identified, verb definitions (VERBs) are needed to provide the argument structures. A VERB can have multiple senses which are instances of EVENTs; for example, the verb "pour" can have the senses pour or prepare, with the required arguments shown below.2 Note that pour1 and pour2 in (2) are subcategorizations of prepare.</Paragraph>
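Assigning a SUBCATEGORIZATION from the GFs present in a parse can be sketched as a simple dispatch. The type names follow the text; the specific GF combinations handled are illustrative assumptions.

```python
# Hypothetical classifier: map the set of GFs found in a sentence to a
# verb-type (SUBCATEGORIZATION) label.
def subcategorization(gfs):
    """Return the verb-type label for the given grammatical functions."""
    gfs = set(gfs)
    if gfs == {"subj", "obj", "io"}:
        return "DITRANSITIVE"        # e.g. sentence (2a)
    if gfs == {"subj", "obj", "ppo"}:
        return "TRANSITIVE-WITH-PP"  # e.g. sentence (2b)
    if gfs == {"subj", "obj"}:
        return "TRANSITIVE"
    if gfs == {"subj"}:
        return "INTRANSITIVE"
    return "UNKNOWN"
```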
    <Paragraph position="7"> pour: pourer\[volitional\]  For a sentence to classify as a particular ALTERNATION, a legal linking must exist between an EVENT and the SUBCATEGORIZATION. Linking involves restricting the fillers of the GFs in the SUBCATEGORIZATION to be the same as the arguments in an EVENT. In Classic, the same-as restriction is limited so that either both attributes must be filled already with the same instance or the concept must already be known as a LEGAL-LINKING. Because of this I created a test (written in LISP) to identify a LEGAL-LINKING. The test inputs are the sentence predicate and GF fillers arranged in the order of the event arguments against which they are to be tested. A linking is legal when at least one of the events associated with the verb can be linked in the indicated way, and all the required arguments are filled.</Paragraph>
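The legal-linking test just described can be re-sketched in Python (the paper's actual test is written in LISP over Classic concepts; the event inventory, argument names, and feature lexicon below are illustrative assumptions).

```python
# Hypothetical event definitions: each event lists its required
# arguments together with their semantic restrictions.
EVENTS = {
    "pour": [("pourer", "volitional"), ("poured", "liquid")],
    "prepare": [("preparer", "volitional"), ("prepared", "liquid"),
                ("beneficiary", "volitional")],
}
VERB_SENSES = {"pour": ["pour", "prepare"]}  # senses of the verb "pour"
FEATURES = {"Mary": {"volitional"}, "Tina": {"volitional"},
            "a glass of milk": {"liquid"}}

def legal_linking(predicate, fillers):
    """True if at least one event associated with the verb links to the
    ordered GF fillers, with every required argument filled and
    type-compatible."""
    for event in VERB_SENSES.get(predicate, []):
        args = EVENTS[event]
        if len(fillers) != len(args):
            continue  # all required arguments must be filled
        if all(restriction in FEATURES[filler]
               for filler, (_, restriction) in zip(fillers, args)):
            return True
    return False
```

Here the fillers are passed in the order of the event arguments, as in the paper's test, so a type clash or a missing required argument makes the linking illegal.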
    <Paragraph position="8"> Once a sentence passes the linking test, and classifies as a particular ALTERNATION, a rule associated with the ALTERNATION classifies it as a specialization of the concept. This causes the EVENT arguments to be filled with the appropriate GF fillers from the SUBCATEGORIZATION. A side-effect of the alternation classification is that the EVENT classifies as a specialized EVENT and indicates which sense of the verb is used in the sentence.</Paragraph>
  </Section>
  <Section position="4" start_page="366" end_page="367" type="metho">
    <SectionTitle>
3 Semantic Class Classification
</SectionTitle>
    <Paragraph position="0"> The semantic class of the verb can be identified once the example sentences are classified by their alternation type. Specialized VERB-CLASSes are defined by their good and bad alternations. Note that VERB defines one verb whereas VERB-CLASS describes a set of verbs (e.g. the spray/load class). Which ALTERNATIONs are associated with a VERB-CLASS is a matter of linguistic evidence; the linguist discovers these associations by testing examples for grammaticality. To assist in this task, I provide two tests, have-instances-of and have-no-instances-of.</Paragraph>
    <Paragraph position="1">  The have-instances-of test for an ALTERNATION searches a corpus of good sentences or bad sentences and tests whether at least one instance of the specified ALTERNATION, for example a benefactive-ditransitive, is present.</Paragraph>
    <Paragraph position="2"> A bad sentence with all the required verb arguments will classify as an ALTERNATION despite the ungrammatical syntactic realization, while a bad sentence with missing required arguments will only classify as a SUBCATEGORIZATION. The have-no-instances-of test for a SUBCATEGORIZATION searches a corpus of bad sentences and tests whether at least one instance of the specified SUBCATEGORIZATION, for example TRANSITIVE, is present as the most specific classification.</Paragraph>
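The two corpus tests can be sketched as follows. The corpus representation (each sentence carrying its classification chain, most specific last) is an illustrative assumption, not the Classic implementation.

```python
# Hypothetical corpus format: each sentence records its classification
# chain from most general to most specific.

def have_instances_of(corpus, alternation):
    """True if at least one sentence in the corpus classifies as the
    specified ALTERNATION (anywhere in its classification chain)."""
    return any(alternation in s["classifications"] for s in corpus)

def have_no_instances_of(corpus, subcat):
    """True if at least one (bad) sentence has the specified
    SUBCATEGORIZATION as its MOST SPECIFIC classification, i.e. it
    failed to classify further as any ALTERNATION."""
    return any(s["classifications"][-1] == subcat for s in corpus)
```

A bad sentence missing a required argument bottoms out at a SUBCATEGORIZATION and is caught by have-no-instances-of, while a bad sentence with all required arguments still classifies as an ALTERNATION and is visible to have-instances-of.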
  </Section>
  <Section position="5" start_page="367" end_page="367" type="metho">
    <SectionTitle>
4 Discussion
</SectionTitle>
    <Paragraph position="0"> The ultimate test of this approach is in how well it will scale up. The linguist may choose to add knowledge as it is needed or may prefer to do this work in batches. To support the batch approach, it may be useful to extract detailed subcategorization information from English learner's dictionaries. Also it will be necessary to decide what semantic features are needed to restrict the fillers of the argument structures. Finally, there is the problem of collecting complete sets of example sentences for a verb. In general, a corpus of tagged sentences is inadequate since it rarely includes negative examples and is not guaranteed to exhibit the full range of alternations. In applications where a domain specific corpus is available (e.g. the Kant MT project (Mitamura et al., 1993)), the full range of relevant alternations is more likely. However, the lack of negative examples still poses a problem and would require the project linguist to create appropriate negative examples or manually adjust the class definitions for further differentiation.</Paragraph>
    <Paragraph position="1"> While I have focused on a lexical research tool, an area I will explore in future work is how classification could be used in grammar writing. One task for which a terminological language is appropriate is flagging inconsistent rules. When writing and maintaining a large grammar, inconsistent rules are one type of bug that occurs. For example, the following three rules are inconsistent since feature1 of NP and feature1 of VP would not unify in rule 1 given the values assigned in 2 and 3.</Paragraph>
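The kind of inconsistency described above can be illustrated with a toy check. The three rules themselves are not reproduced in this text, so the rule shapes and feature values below are invented for illustration only; a terminological system would detect the clash by classification rather than by an explicit loop like this.

```python
# Hypothetical rule 1: S -> NP VP, with the equality constraint
# NP.feature1 = VP.feature1.
RULES = {
    1: ("S", ["NP", "VP"], (("NP", "feature1"), ("VP", "feature1"))),
}
# Hypothetical rules 2 and 3 assign clashing atomic values to the
# two constrained features, so rule 1 can never apply.
ASSIGNMENTS = {("NP", "feature1"): "sg",   # from rule 2
               ("VP", "feature1"): "pl"}   # from rule 3

def inconsistent_constraints(rules, assignments):
    """Flag rules whose equality constraint joins two features that
    carry different atomic values, i.e. that can never unify."""
    flagged = []
    for rule_id, (_, _, constraint) in rules.items():
        left, right = constraint
        if assignments.get(left) != assignments.get(right):
            flagged.append(rule_id)
    return flagged
```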
  </Section>
</Paper>