<?xml version="1.0" standalone="yes"?> <Paper uid="E06-1049"> <Title>A Machine Learning Approach to Extract Temporal Information from Texts in Swedish and Generate Animated 3D Scenes</Title> <Section position="3" start_page="2004" end_page="2004" type="metho"> <SectionTitle> 2 Previous Work </SectionTitle> <Paragraph position="0"> Research on the representation of time, events, and temporal relations dates back to the beginning of logic. It has resulted in an impressive number of formulations and models. In a review of contemporary theories and an attempt to unify them, Bennett and Galton (2004) classified the most influential formalisms along three lines. A first approach is to consider events as transitions between states, as in STRIPS (Fikes and Nilsson, 1971). A second one is to map events onto temporal intervals and to define relations between pairs of intervals.</Paragraph> <Paragraph position="1"> Allen's (1984) 13 temporal relations are a widely accepted example of this. A third approach is to reify events, to quantify them existentially, and to connect them to other objects using predicates based on action verbs and their modifiers (Davidson, 1967). The sentence John saw Mary in London on Tuesday is then translated into the logical form: $\exists \epsilon\,[\mathit{Saw}(\epsilon, j, m) \land \mathit{Place}(\epsilon, l) \land \mathit{Time}(\epsilon, t)]$. The description of relations between time, events, and verb tenses has also attracted considerable interest, especially in English. Modern work on temporal event analysis probably started with Reichenbach (1947), who proposed the distinction between the point of speech, point of reference, and point of event in utterances. This separation allows for a systematic description of tenses and proved to be very powerful.</Paragraph> <Paragraph position="2"> Many authors proposed general principles to automatically extract temporal relations between events. A basic observation is that the temporal order of events is related to their narrative order. 
Dowty (1986) investigated it and formulated a Temporal Discourse Interpretation Principle to interpret the advance of narrative time in a sequence of sentences. Lascarides and Asher (1993) described a complex logical framework to deal with events in simple past and pluperfect sentences.</Paragraph> <Paragraph position="3"> Hitzeman et al. (1995) proposed a constraint-based approach taking into account tense, aspect, temporal adverbials, and rhetorical structure to analyze a discourse.</Paragraph> <Paragraph position="4"> Recently, groups have used machine learning techniques to determine temporal relations.</Paragraph> <Paragraph position="5"> They automatically trained classifiers on hand-annotated corpora. Mani et al. (2003) achieved the best results so far by using decision trees to partially order the events of successive clauses in English texts. Boguraev and Ando (2005) is another example for English, and Li et al. (2004) for Chinese.</Paragraph> </Section> <Section position="4" start_page="2004" end_page="2004" type="metho"> <SectionTitle> 3 Annotating Texts with Temporal Information </SectionTitle> <Paragraph position="0"> Several schemes have been proposed to annotate temporal information in texts; see Setzer and Gaizauskas (2002), inter alia. Many of them were incompatible or incomplete and, in an effort to reconcile and unify the field, Ingria and Pustejovsky (2002) introduced the XML-based Time markup language (TimeML).</Paragraph> <Paragraph position="1"> TimeML is a specification language whose goal is to capture most aspects of temporal relations between events in discourses. It is based on Allen's (1984) relations and a variation of Vendler's (1967) classification of verbs. It defines XML elements to annotate time expressions, events, and &quot;signals&quot;. The SIGNAL tag marks sections of text indicating a temporal relation. 
It includes function words such as later and not.</Paragraph> <Paragraph position="2"> TimeML also features elements to connect entities using different types of links, most notably temporal links, TLINKs, that describe the temporal relation holding between events or between an event and a time.</Paragraph> </Section> <Section position="5" start_page="2004" end_page="2004" type="metho"> <SectionTitle> 4 A System to Convert Narratives of Road Accidents into 3D Scenes </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="2004" end_page="2004" type="sub_section"> <SectionTitle> 4.1 Carsim </SectionTitle> <Paragraph position="0"> Carsim is a text-to-scene converter. From a narrative, it creates a complete and unambiguous 3D geometric description, which it renders visually.</Paragraph> <Paragraph position="1"> Carsim considers authentic texts describing road accidents, generally collected from web sites of Swedish newspapers or transcribed from hand-written accounts by victims of accidents. One of the program's key features is that it animates the generated scene to visualize events.</Paragraph> <Paragraph position="2"> The Carsim architecture is divided into two parts that communicate using a frame representation of the text. Carsim's first part is a linguistic module that extracts information from the report and fills the frame slots. The second part is a virtual scene generator that takes the structured representation as input, creates the visual entities, and animates them.</Paragraph> </Section> <Section position="2" start_page="2004" end_page="2004" type="sub_section"> <SectionTitle> 4.2 Knowledge Representation in Carsim </SectionTitle> <Paragraph position="0"> The Carsim language processing module reduces the text content to a frame representation - a template - that outlines what happened and enables a conversion to a symbolic scene. It contains: * Objects. They correspond to the physical entities mentioned in the text. 
They also include abstract symbols that show in the scene. Each object has a type that is selected from a predefined, finite set. An object's semantics is a separate geometric entity, where its shape (and possibly its movement) is determined by its type.</Paragraph> <Paragraph position="1"> * Events. They correspond intuitively to an activity that goes on during a period in time and here to the possible object behaviors. We represent events as entities with a type taken from a predefined set, where an event's semantics is a proposition paired with a point or interval in time during which the proposition is true.</Paragraph> <Paragraph position="2"> * Relations and Quantities. They describe specific features of objects and events and how they are related to each other. The most obvious examples of such information are spatial information about objects and temporal information about events. Other meaningful relations and quantities include physical properties such as velocity, color, and shape.</Paragraph> </Section> </Section> <Section position="6" start_page="2004" end_page="2004" type="metho"> <SectionTitle> 5 Time and Event Processing </SectionTitle> <Paragraph position="0"> We designed and implemented a generic component to extract temporal information from the texts. It sits inside the natural language part of Carsim and proceeds in two steps. The first step uses a pipeline of finite-state machines and phrase-structure rules that identifies time expressions, signals, and events. This step also generates a feature vector for each element it identifies. Using the vectors, the second step determines the temporal relations between the extracted events and orders them in time. The result is a text annotated using the TimeML scheme.</Paragraph> <Paragraph position="1"> We use a set of decision trees and a machine learning approach to find the relations between events. 
As input to the second step, the decision trees take sequences of events extracted by the first step and decide the temporal relation, possibly none, between pairs of them. To run the learning algorithm, we manually annotated a small set of texts on which we trained the trees.</Paragraph> <Section position="1" start_page="2004" end_page="2004" type="sub_section"> <SectionTitle> 5.1 Processing Structure </SectionTitle> <Paragraph position="0"> We use phrase-structure rules and finite-state machines to mark up events and time expressions. In addition to the identification of expressions, we often need to interpret them, for instance to compute the absolute time an expression refers to. We therefore augmented the rules with procedural attachments. We wrote a parser to control the processing flow, where the rules, possibly recursive, apply regular expressions, call procedures, and create TimeML entities.</Paragraph> </Section> <Section position="2" start_page="2004" end_page="2004" type="sub_section"> <SectionTitle> 5.2 Detection of Time Expressions </SectionTitle> <Paragraph position="0"> We detect and interpret time expressions with a two-level structure. The first level processes individual tokens using a dictionary and regular expressions. The second level uses the results from the token level to compute the meaning of multi-word expressions.</Paragraph> <Paragraph position="1"> Token-Level Rules. In Swedish, time expressions such as en tisdagseftermiddag 'a Tuesday afternoon' use nominal compounds. To decode them, we automatically generate a comprehensive dictionary with mappings from strings onto compound time expressions. We decode other types of expressions such as 2005-01-14 using regular expressions. Multiword-Level Rules. We developed a grammar to interpret the meaning of multiword time expressions. 
It includes instructions on how to combine the values of individual tokens for expressions such as {vid lunchtid}t1 {en tisdagseftermiddag}t2 '{at noon}t1 {a Tuesday afternoon}t2'. The most common case consists in merging the tokens' attributes to form a more specific expression.</Paragraph> <Paragraph position="2"> However, relative time expressions such as i torsdags 'last Thursday' are more complex. Our grammar handles the most frequent ones, mainly those that need the publishing date for their interpretation.</Paragraph> </Section> <Section position="3" start_page="2004" end_page="2004" type="sub_section"> <SectionTitle> 5.3 Detection of Signals </SectionTitle> <Paragraph position="0"> We detect signals using a lexicon and naive string matching. We annotate each signal with a sense where the possible values are: negation, before, after, later, when, and continuing. TimeML only defines one attribute for the SIGNAL tag, an identifier, and encodes the sense as an attribute of the LINKs that refer to it. We found it more appropriate to store the sense directly in the SIGNAL element, and so we extended it with a second attribute.</Paragraph> <Paragraph position="1"> We use the sense information in decision trees as a feature to determine the order of events. Our strategy based on string matching results in a limited overdetection. However, it does not break the rest of the process.</Paragraph> </Section> <Section position="4" start_page="2004" end_page="2004" type="sub_section"> <SectionTitle> 5.4 Detection of Events </SectionTitle> <Paragraph position="0"> We detect the TimeML events using a part-of-speech tagger and phrase-structure rules. We consider that all verbs and verb groups are events. We also included some nouns or compounds, which are directly relevant to Carsim's application domain, such as bilolycka 'car accident' or krock 'collision'. 
We detect these nouns through a set of six morphemes.</Paragraph> <Paragraph position="1"> TimeML annotates events with three features: aspect, tense, and &quot;class&quot;, where the class corresponds to the type of the event. The TimeML specifications define seven classes. We kept only the two most frequent ones: states and occurrences.</Paragraph> <Paragraph position="2"> We determine the features using procedures attached to each grammatical construct we extract.</Paragraph> <Paragraph position="3"> The grammatical features aspect and tense are straightforward and a direct output of the phrase-structure rules. To infer the TimeML class, we use heuristics such as the following: predicative clauses (copulas) are generally states and verbs in the preterit are generally occurrences.</Paragraph> <Paragraph position="4"> The domain, reports of car accidents, makes this approach viable. The texts describe sequences of real events. They are generally simple, to the point, and devoid of speculations and hypothetical scenarios. This makes the task of feature identification simpler than it is in more general cases. In addition to the TimeML features, we extract the grammatical properties of events. Our hypothesis is that specific sequences of grammatical constructs are related to the temporal order of the described events. The grammatical properties consist of the part of speech, noun (NOUN) or verb (VB). Verbs can be finite (FIN) or infinitive (INF).</Paragraph> <Paragraph position="5"> They can be reduced to a single word or part of a group (GR). They can be a copula (COP), a modal (MOD), or a lexical verb. We combine these properties into eight categories that we use in the feature vectors of the decision trees (see ...EventStructure in Sect. 
6.2).</Paragraph> </Section> </Section> <Section position="7" start_page="2004" end_page="2004" type="metho"> <SectionTitle> 6 Event Ordering </SectionTitle> <Paragraph position="0"> TimeML defines three different types of links: subordinate (SLINK), temporal (TLINK), and aspectual (ALINK). Aspectual links connect two event instances, one being aspectual and the other the argument. As their significance was minor in the visualization of car accidents, we set aside this type of link.</Paragraph> <Paragraph position="1"> Subordinate links generally connect signals to events, for instance to mark polarity by linking a not to its main verb. We identify these links simultaneously with the event detection. We augmented the phrase-structure rules to handle subordination cases at the same time they annotate an event. We restricted the cases to modality and polarity and we set aside the other ones.</Paragraph> <Section position="1" start_page="2004" end_page="2004" type="sub_section"> <SectionTitle> 6.1 Generating Temporal Links </SectionTitle> <Paragraph position="0"> To order the events in time and create the temporal links, we use a set of decision trees. We apply each tree to sequences of events where it decides the order between two of the events in each sequence. If e1,...,en are the events in the order in which they appear in the text, the trees correspond to the following functions:</Paragraph> <Paragraph position="2"> The possible output values of the trees are: simultaneous, after, before, is_included, includes, and none. These values correspond to the relations described by Setzer and Gaizauskas (2001).</Paragraph> <Paragraph position="3"> The first decision tree should capture more general relations between two adjacent events without the need of a context. Decision trees dt2 and dt3 extend the context by one event to the left and one event to the right, respectively. They should capture more specific phenomena. 
However, they are not always applicable as we never apply a decision tree when there is a time expression between any of the events involved. In effect, time expressions &quot;reanchor&quot; the narrative temporally, and we noticed that the decision trees performed very poorly across time expressions.</Paragraph> <Paragraph position="4"> We complemented the decision trees with a small set of domain-independent heuristic rules that encode common-sense knowledge. We assume that events in the present tense occur after events in the past tense and that all mentions of events such as olycka 'accident' refer to the same event. In addition, the Carsim event interpreter recognizes some semantically motivated identity relations.</Paragraph> </Section> <Section position="2" start_page="2004" end_page="2004" type="sub_section"> <SectionTitle> 6.2 Feature Vectors </SectionTitle> <Paragraph position="0"> The decision trees use a set of features corresponding to certain attributes of the considered events, temporal signals between them, and some other parameters such as the number of tokens separating the pair of events to be linked. We list below the features of fdt1 together with their values. The first event in the pair is denoted by a mainEvent prefix and the second one by relatedEvent: * mainEventTense: none, past, present, future, NOT_DETERMINED.</Paragraph> <Paragraph position="1"> * mainEventAspect: progressive, perfective, perfective_progressive, none, NOT_DETERMINED.</Paragraph> <Paragraph position="3"> The four other decision trees consider more events but use similar features. The values for the ...Distance features are of course greater.</Paragraph> </Section> <Section position="3" start_page="2004" end_page="2004" type="sub_section"> <SectionTitle> 6.3 Temporal Loops </SectionTitle> <Paragraph position="0"> The process described above results in an overgeneration of temporal links. 
As some of them may be conflicting, a post-processing module reorganizes them and discards the temporal loops.</Paragraph> <Paragraph position="1"> The initial step of the loop resolution assigns each link a score. This score is created by the decision trees and is derived from the C4.5 metrics (Quinlan, 1993). It reflects the accuracy of the leaf as well as the overall accuracy of the decision tree in question. The score for links generated from heuristics is rule dependent.</Paragraph> <Paragraph position="2"> The loop resolution algorithm begins with an empty set of orderings. It adds the partial orderings to the set if their inclusion doesn't introduce a temporal conflict. It first adds the links with the highest scores, and thus, in each temporal loop, the ordering with the lowest score is discarded.</Paragraph> </Section> </Section> <Section position="8" start_page="2004" end_page="2004" type="metho"> <SectionTitle> 7 Experimental Setup and Evaluation </SectionTitle> <Paragraph position="0"> As far as we know, there is no available time-annotated corpus in Swedish, which makes the evaluation more difficult. As development and test sets, we collected approximately 300 reports of road accidents from various Swedish newspapers. Each report is annotated with its publishing date. Analyzing the reports is complex because of their variability in style and length. Their size ranges from a couple of sentences to more than a page. The amount of detail is overwhelming in some reports, while in others most of the information is implicit. The complexity of the accidents described ranges from simple accidents with only one vehicle to multiple collisions with several participating vehicles and complex movements.</Paragraph> <Paragraph position="1"> We manually annotated a subset of our corpus consisting of 25 texts, 476 events and 1,162 temporal links. 
We built the trees automatically from this set using the C4.5 program (Quinlan, 1993).</Paragraph> <Paragraph position="2"> Our training set is relatively small and the number of features we use relatively large for the set size. This can cause overfitting. However, C4.5, to some extent, makes provision for this and prunes the decision trees.</Paragraph> <Paragraph position="3"> We evaluated three aspects of the temporal information extraction modules: the detection and interpretation of time expressions, the detection and interpretation of events, and the quality of the final ordering. We report here the detection of events and the final ordering.</Paragraph> <Section position="1" start_page="2004" end_page="2004" type="sub_section"> <SectionTitle> 7.1 Event Detection </SectionTitle> <Paragraph position="0"> We evaluated the performance of the event detection on a test corpus of 40 previously unseen texts.</Paragraph> <Paragraph position="1"> It should be noted that we used a simplified definition of what an event is, and that the manual annotation and evaluation were both done using the same definition (i.e. all verbs, verb groups, and a small number of nouns are events). The system detected 584 events correctly, overdetected 3, and missed 26. This gives a recall of 95.7%, a precision of 99.4%, and an F-measure of 97.5%.</Paragraph> <Paragraph position="2"> The feature detection is more interesting and Table 1 shows an evaluation of it. We carried out this evaluation on the first 20 texts of the test corpus.</Paragraph> </Section> <Section position="2" start_page="2004" end_page="2004" type="sub_section"> <SectionTitle> 7.2 Evaluation of Final Ordering </SectionTitle> <Paragraph position="0"> We evaluated the final ordering with the method proposed by Setzer and Gaizauskas (2001). Their scheme is comprehensive and makes it possible to compare the performance of different systems.</Paragraph> <Paragraph position="1"> Description of the Evaluation Method. 
Setzer and Gaizauskas carried out an inter-annotator agreement test for temporal relation markup.</Paragraph> <Paragraph position="2"> When evaluating the final ordering of a text, they defined the set $E$ of all the events in the text and the set $T$ of all the time expressions. They computed the set $(E \cup T) \times (E \cup T)$ and they defined the sets $S^{\vdash}$, $I^{\vdash}$, and $B^{\vdash}$ as the transitive closures for the relations simultaneous, includes, and before, respectively.</Paragraph> <Paragraph position="3"> If $S^{\vdash}_k$ and $S^{\vdash}_r$ represent the set $S^{\vdash}$ for the answer key (&quot;Gold Standard&quot;) and system response, respectively, the measures of precision and recall for the simultaneous relation are: $R = \frac{|S^{\vdash}_k \cap S^{\vdash}_r|}{|S^{\vdash}_k|}$ and $P = \frac{|S^{\vdash}_k \cap S^{\vdash}_r|}{|S^{\vdash}_r|}$.</Paragraph> <Paragraph position="5"> For an overall measure of recall and precision, Setzer and Gaizauskas proposed the following formulas: $R = \frac{|S^{\vdash}_k \cap S^{\vdash}_r| + |B^{\vdash}_k \cap B^{\vdash}_r| + |I^{\vdash}_k \cap I^{\vdash}_r|}{|S^{\vdash}_k| + |B^{\vdash}_k| + |I^{\vdash}_k|}$ and $P = \frac{|S^{\vdash}_k \cap S^{\vdash}_r| + |B^{\vdash}_k \cap B^{\vdash}_r| + |I^{\vdash}_k \cap I^{\vdash}_r|}{|S^{\vdash}_r| + |B^{\vdash}_r| + |I^{\vdash}_r|}$.</Paragraph> <Paragraph position="7"> They used the classical definition of the F-measure: the harmonic mean of precision and recall. Note that the precision and recall are computed per text, not for all relations in the test set simultaneously.</Paragraph> <Paragraph position="8"> Results. We evaluated the output of the Carsim system on 10 previously unseen texts against our Gold Standard. As a baseline, we used a simple algorithm that assumes that all events occur in the order they are introduced in the narrative. For comparison, we also did an inter-annotator evaluation on the same texts, where we compared the Gold Standard, annotated by one of us, with the annotation produced by another member of our group.</Paragraph> <Paragraph position="9"> As our system doesn't support comparisons of time expressions, we evaluated the relations contained in the set $E \times E$. We counted the symmetric simultaneous relation only once for each pair of tuples (ex,ey) and (ey,ex), and we didn't count relations (ex,ex).</Paragraph> <Paragraph position="10"> Table 2 shows our results averaged over the 10 texts. 
As a reference, we also included Setzer and Gaizauskas' averaged results for inter-annotator agreement on temporal relations in six texts. Their results are not directly comparable, however, as they did the evaluation over the set $(E \cup T) \times (E \cup T)$ for English texts of another type.</Paragraph> <Paragraph position="11"> Comments. The computation of ratios on the transitive closure makes Setzer and Gaizauskas' evaluation method extremely sensitive. Missing a single link often results in the loss of scores of generated transitive links and thus has a massive impact on the final evaluation figures.</Paragraph> <Paragraph position="12"> As an example, one of our texts contains six events whose order is e4 &lt; e5 &lt; e6 &lt; e1 &lt; e2 &lt; e3. The event module automatically detects the chains e4 &lt; e5 &lt; e6 and e1 &lt; e2 &lt; e3 correctly, but misses the link e6 &lt; e1. The transitive closure of the Gold Standard ordering contains 15 pairs, of which the system recovers only the 3 + 3 = 6 pairs inside the two chains. This gives a recall of 6/15 = 0.40. When considering evaluations performed using the method above, it is worth keeping this in mind.</Paragraph> </Section> </Section> <Section position="9" start_page="2004" end_page="2004" type="metho"> <SectionTitle> 8 Carsim Integration </SectionTitle> <Paragraph position="0"> The visualization module considers a subset of the detected events that it interprets graphically. We call this subset the Carsim events. Once the event processing has been done, Carsim extracts these specific events from the full set using a small domain ontology and inserts them into the template. We use the event relations resulting from the temporal information extraction module to order them. For all pairs of events in the template, Carsim queries the temporal graph to determine their relation.</Paragraph> <Paragraph position="1"> Figure 1 shows a part of the template representing the accident described in Section 1. It lists the participants, with the unmentioned vehicle inferred to be a car. It also shows the events and their temporal order. Then, the visualization module synthesizes a 3D scene and animates it. 
Figure 2 shows four screenshots picturing the events.</Paragraph> </Section> </Paper>
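<!--
The following Python sketch is not the Carsim implementation. Under strong
simplifying assumptions, it illustrates two ideas from the text: the greedy
loop resolution of Section 6.3 and the closure-based recall of Section 7.2.
It handles the before relation only and ignores simultaneous and includes.
The function names, the link scores, and the events a, b, c are hypothetical;
the six-event example is the one discussed in the Comments of Section 7.2.

```python
from itertools import product


def transitive_closure(pairs):
    """Transitive closure of a set of (earlier, later) event pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so that the set can grow inside the loop.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def resolve_loops(scored_links):
    """Greedy loop resolution over (score, (earlier, later)) items.

    Links are tried by decreasing score. A link is kept only if the
    closure of the kept links then contains no pair (x, x), i.e. no loop.
    """
    kept = set()
    for _score, link in sorted(scored_links, reverse=True):
        candidate = transitive_closure(kept | {link})
        if all(x != y for x, y in candidate):
            kept.add(link)
    return kept


def closure_recall(key_links, response_links):
    """Recall of the response, measured on transitive closures."""
    key = transitive_closure(key_links)
    response = transitive_closure(response_links)
    return len(key & response) / len(key)


# Hypothetical scores: the weakest link (c before a) closes a loop and is dropped.
links = [(0.9, ("a", "b")), (0.8, ("b", "c")), (0.3, ("c", "a"))]
print(sorted(resolve_loops(links)))  # [('a', 'b'), ('b', 'c')]

# Example from Section 7.2: the system misses the link e6 before e1.
gold = [("e4", "e5"), ("e5", "e6"), ("e6", "e1"), ("e1", "e2"), ("e2", "e3")]
system = [("e4", "e5"), ("e5", "e6"), ("e1", "e2"), ("e2", "e3")]
print(closure_recall(gold, system))  # 0.4
```

Recomputing the closure for every candidate link is quadratic or worse. That
is acceptable for the short event sequences considered here, but an
incremental reachability check would be preferable for longer texts.
-->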