<?xml version="1.0" standalone="yes"?> <Paper uid="E06-2012"> <Title>Maytag: A multi-staged approach to identifying complex events in textual data</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Our goal is to support the discovery of complex events in text. By complex events, we mean events that might be structured out of multiple occurrences of other events, or that might occur over a span of time. In financial analysis, the domain that concerns us here, an example of what we mean is the problem of understanding corporate acquisition practices. To gauge a company's modus operandi in acquiring other companies, it isn't enough to know just that an acquisition occurred; it may also be important to understand the degree to which it was debt-leveraged, or whether it was performed through reciprocal stock exchanges.</Paragraph> <Paragraph position="1"> In other words, complex events are often composed of multiple facets beyond the basic event itself. One of our concerns is therefore to enable end users to access complex events through a combination of their possible facets.</Paragraph> <Paragraph position="2"> Another key characteristic of rich domains like financial analysis is that facts and events are subject to interpretation in context. To a financial analyst, it makes a difference whether a multi-million-dollar loss occurs in the context of recurring operations (a potentially chronic problem), or in the context of a one-time event, such as a merger or layoff. A second concern is thus to enable end users to interpret facts and events through automated context assessment.</Paragraph> <Paragraph position="3"> The route we have taken towards this end is to model the domain of corporate finance through an interactive suite of language processing tools. Maytag, our prototype, makes the following novel contribution.
Rather than trying to model complex events monolithically, we provide a range of multi-purpose information extraction and text classification methods, and allow the end user to combine these interactively. Think of it as Boolean queries where the query terms are not keywords but extracted facts, events, entities, and contextual text classifications.</Paragraph> </Section> </Paper>