<?xml version="1.0" standalone="yes"?>
<Paper uid="C86-1118">
  <Title>TOPIC Essentials*</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> This paper gives an overview of TOPIC, a knowledge-based text information system for the analysis of German-language texts. TOPIC supplies text condensates (summaries) at variable degrees of generality and makes available facts acquired from the texts. The presentation focuses on the major methodological principles underlying the design of TOPIC: a frame representation model that incorporates various integrity constraints, text parsing with a focus on the text cohesion and text coherence properties of expository texts, a lexically distributed semantic text grammar in the format of word experts, a model of partial text parsing, and text graphs as appropriate representation structures for text condensates.</Paragraph>
    <Paragraph position="1"> I. Introduction
This paper provides an overview of TOPIC, a text understanding and text condensation system which analyzes German-language texts: complete magazine articles in the domain of information technology products. TOPIC performs the following functions:
* Text summarization (abstracting): TOPIC produces a graph representation of the most relevant topics dealt with in a text. This summary is derived from text representation structures, and its level of generality varies from quite generic descriptions (similar to a system of index terms) to rather detailed information concerning facts, newly acquired concepts and their properties. Because of the flexibility inherent in this cascaded approach to text summarization (cf. KUHLEN 84), we refer to it as text condensation. It contrasts with invariant forms of text summarization based on summary schemata (DeJONG 79, TAIT 82) or structural features of the text representations (TAYLOR 74, LEHNERT 81), and with dynamic abstracting procedures which depend on a priori specifications of appropriate parameters (FUM et al. 82) or rule sets for importance evaluation (FUM et al. 85) prior to text analysis.</Paragraph>
    <Paragraph position="2"> * Extraction of facts / acquisition of new concepts: Knowledge extraction resulting from text analysis not only leads TOPIC to assign specific properties to concepts already known to the system, but also comprises the acquisition of new concepts and their corresponding properties.</Paragraph>
    <Paragraph position="3"> * Linking thematic descriptions with text passages: TOPIC's analytic devices are by no means exhaustive enough to capture all the knowledge encoded in a text. Thus, the text representation structures provided might be incomplete.</Paragraph>
    <Paragraph position="4"> However, the thematic descriptions generated are linked to the corresponding text passages, so that querying a text knowledge base may end up in the retrieval of relevant fragments of the original text (cf. similar approaches in LOEF 80, HOBBS et al. 82).</Paragraph>
    <Paragraph position="5"> * The development of the TOPIC system is supported by BMFT/GID under contract 1020016 0. We want to thank D. Soergel for his contributions to this paper.</Paragraph>
    <Paragraph position="6"> To perform these functions, the design of TOPIC is based on the following methodological principles:
* a method for making strategic decisions to control the depth of text understanding according to the functional level of system performance desired
* a knowledge representation model whose expressive power primarily comes from various integrity constraints which control the validity of the knowledge representation structures during text analysis
* a parsing model adapted to the specific constructive requirements of expository prose (local text cohesion and global text coherence phenomena)
* a text condensation model based on empirical well-formedness conditions on texts (text grammatical macro rules) and criteria derived from the knowledge representation model (complex operations)</Paragraph>
  </Section>
</Paper>