<?xml version="1.0" standalone="yes"?> <Paper uid="P99-1015"> <Title>Corpus-Based Linguistic Indicators for Aspectual Classification</Title> <Section position="6" start_page="117" end_page="118" type="concl"> <SectionTitle> 6 Conclusions </SectionTitle> <Paragraph position="0"> We have developed a full-scale system for aspectual classification with multiple linguistic indicators. Once trained, this system can automatically classify all verbs appearing in a corpus, including &quot;unseen&quot; verbs that were not included in the supervised training sample. This framework is expandable, since additional lexico-syntactic markers may also correlate with aspectual class. Future work will extend this approach to other semantic distinctions in natural language.</Paragraph> <Paragraph position="1"> Linguistic indicators successfully exploit linguistic insights to provide a much-needed method for aspectual classification. When combined with a decision tree to classify according to stativity, the indicators achieved an accuracy of 93.9% and stative recall of 74.2%. When combined with CART to classify according to completedness, the indicators achieved 74.0% accuracy and 53.1% non-culminated recall.</Paragraph> <Paragraph position="2"> A favorable tradeoff in recall presents an advantage for applications that weigh the identification of non-dominant classes more heavily (Cardie and Howe, 1997). For example, correctly identifying occurrences of &quot;for&quot; that denote event durations relies on positively identifying non-culminated events. A system that summarizes the duration of events but incorrectly classifies &quot;She ran (for a minute)&quot; as culminated will not detect that &quot;for a minute&quot; describes the duration of the run event. This is because durative for-PPs that modify culminated events denote the duration of the ensuing state, e.g., &quot;I left the room for a minute&quot; (Vendler, 1967). Our analysis has revealed several insights regarding individual indicators. 
For example, both duration in-PP and manner adverb are particularly valuable for multiple aspectual distinctions: they were ranked in the top two positions by log-linear modeling for both stativity and completedness.</Paragraph> <Paragraph position="3"> We have discovered several new linguistic indicators that are not traditionally linked to aspectual class. In particular, verb frequency with no deep subject was positively correlated with both stativity and completedness. Moreover, four other indicators are newly linked to stativity: (1) verb frequency, (2) occurrences modified by &quot;not&quot; or &quot;never&quot;, (3) occurrences in the past or present participle, and (4) occurrences in the perfect tense. Additionally, another three are newly linked to completedness: (1) occurrences modified by a manner adverb, (2) occurrences in the past or present participle, and (3) occurrences in the progressive.</Paragraph> <Paragraph position="4"> These new correlations can be understood in pragmatic terms. For example, since points (non-culminated, punctual events, e.g., hiccup) are rare, punctual events are likely to be culminated. Therefore, an indicator that discriminates events according to extendedness, e.g., the progressive, past/present participle, and duration for-PP, is likely to also discriminate between culminated and non-culminated events.</Paragraph> <Paragraph position="5"> As a second example, the not/never indicator correlates with stativity in medical reports because diagnoses (i.e., states) are often ruled out in medical discharge summaries, e.g., &quot;The patient was not hypertensive,&quot; but procedures (i.e., events) that were not done are not usually mentioned, e.g., &quot;An examination was not performed.&quot;</Paragraph> </Section> </Paper>