<?xml version="1.0" standalone="yes"?>
<Paper uid="W93-0109">
  <Title>The Automatic Acquisition of Frequencies of Verb Subcategorization Frames from Tagged Corpora</Title>
  <Section position="6" start_page="104" end_page="104" type="concl">
    <SectionTitle>
5 Conclusion and Future Direction
</SectionTitle>
    <Paragraph position="0"> We have demonstrated that by combining syntactic and statistical analysis, the frequencies of verb-subcat frames can be estimated with high accuracy. Although the present system measures the frequencies of only six subcat frames, the method is general enough to be extended to many more frames. The traditional use of regular expressions as rules for deterministic processing has self-evident limitations, since a linear grammar is not powerful enough to capture general linguistic phenomena. The statistical method we propose instead uses regular expressions as filters that detect specific features of verb occurrences, and it applies multi-dimensional analysis to those features based on log-linear models and Bayes' theorem.</Paragraph>
    <Paragraph position="1"> We expect that identifying other useful syntactic features will further improve the accuracy of the frequency estimation. Such features can be regarded, quite broadly, as characterizing the syntactic context of the verbs; they need not be tied to the local context of the verb. For example, a regular expression such as &quot;w[^vex]*k&quot; can be used to find cases where the target verb is preceded by a relative pronoun, with no other finite verb, punctuation mark, or sentence-final period between the relative pronoun and the target verb.</Paragraph>
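    <Paragraph> The relative-pronoun filter described above can be illustrated with a minimal sketch. It reads the expression as w[^vex]*k and assumes that each token of a tagged sentence has already been mapped to a one-character code. The code assignments (w, k, v, e, x, n) and the function name are hypothetical and chosen only for illustration; the section does not define them.

```python
import re

# Assumed one-character codes per token (hypothetical; not defined in this section):
#   w = relative pronoun, k = target verb, v = other finite verb,
#   e = punctuation, x = sentence-final period, n = any other token
RELATIVE_CONTEXT = re.compile(r"w[^vex]*k")


def has_relative_pronoun_feature(symbols: str) -> bool:
    """Return True if a relative pronoun precedes the target verb with no
    finite verb, punctuation, or sentence-final period in between."""
    return RELATIVE_CONTEXT.search(symbols) is not None


examples = [
    "nwnnk",   # relative pronoun, then the target verb, nothing blocking
    "nwnvnk",  # an intervening finite verb blocks the match
    "nwnenk",  # intervening punctuation blocks the match
    "nnkx",    # no relative pronoun before the target verb
]
for symbols in examples:
    print(symbols, has_relative_pronoun_feature(symbols))
```

Under these assumptions the filter yields one binary feature per verb occurrence, which could then be combined with other features in the statistical analysis.</Paragraph>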
    <Paragraph position="2"> If the syntactic structure of a sentence can be predicted using only syntactic and lexical knowledge, we can hope to estimate the subcat frame of each occurrence of a verb from the context expressed by a set of features. We can thus aim to extend and refine this method for use in general probabilistic parsing of unrestricted text.</Paragraph>
  </Section>
</Paper>