<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-1095">
<Title>Machine Learning of Temporal Relations</Title>
<Section position="8" start_page="758" end_page="758" type="relat">
<SectionTitle>5 Related Work</SectionTitle>
<Paragraph position="0"> Our approach of classifying pairs independently during learning does not take into account dependencies between pairs. For example, a classifier may label <X, Y> as BEFORE; given the pair <X, Z>, it has no way of knowing whether <Y, Z> has been classified as BEFORE, in which case, by closure, <X, Z> should also be classified as BEFORE (see the closure sketch following this section). This can result in the classifier producing an inconsistently annotated text. The machine-learning approach of (Cohen et al. 1999) addresses this, but it is limited to total orderings involving BEFORE, whereas TLINKs introduce partial orderings involving BEFORE and five other relations. Future research will investigate methods for tighter integration of temporal reasoning and statistical classification.</Paragraph>
<Paragraph position="1"> The only closely comparable machine-learning approach to the problem of TLINK extraction was that of (Boguraev and Ando 2005), who trained a classifier on TimeBank 1.1 to anchor events to times within the same sentence, obtaining an F-measure (for tasks A and B together) of 53.1. Other machine-learning and hand-coded approaches, while interesting, are harder to compare in terms of accuracy, since they do not share common task definitions, annotation standards, and evaluation measures. (Li et al. 2004) obtained 78-88% accuracy on ordering within-sentence temporal relations in Chinese texts. (Mani et al. 2003) obtained an F-measure of 80.2 training a decision tree on 2,069 clauses to anchor events to reference times inferred for each clause. (Berglund et al. 2006) use a document-level evaluation approach pioneered by (Setzer and Gaizauskas 2000), which relies on a distinct evaluation metric. Finally, (Lapata and Lascarides 2004) use found data to successfully learn which (possibly ambiguous) temporal markers connect a main and a subordinate clause, without inferring the underlying temporal relations.</Paragraph>
<Paragraph position="2"> In terms of hand-coded approaches, (Mani and Wilson 2000) used a baseline method of blindly propagating TempEx time values to events based on proximity, obtaining 59.4% accuracy on a small sample of 8,505 words of text. (Filatova and Hovy 2001) obtained 82% accuracy on 'timestamping' clauses for a single type of event/topic on a data set of 172 clauses. (Schilder and Habel 2001) report 84% accuracy inferring temporal relations in German data, and (Li et al. 2001) report 93% accuracy on extracting temporal relations in Chinese. Because these accuracies are reported on different data sets with different metrics, they cannot be compared directly with our methods.</Paragraph>
<Paragraph position="3"> Recently, researchers have developed other tools for automatically tagging aspects of TimeML, including EVENT tags (Sauri et al. 2005) at 0.80 F-measure and TIMEX3 tags at 0.82-0.85 F-measure. In addition, the TERN competition (tern.mitre.org) has shown very high performance (F-measures close to 0.95) for TIMEX2 tagging, which is fairly similar to TIMEX3 tagging. These results suggest the time is ripe for exploiting such 'imperfect' features in our machine-learning approach.</Paragraph>
</Section>
</Paper>
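
The closure sketch referenced in Paragraph 0: a minimal Python illustration of how a transitive-closure table over BEFORE links exposes the inconsistency that independent pairwise classification cannot see. The class name `TemporalClosure` and its `add_link` method are hypothetical, not from the paper, and the sketch handles only the BEFORE relation; a full TLINK reasoner would also need the five other relations mentioned above.

```python
from collections import defaultdict

class TemporalClosure:
    """Maintain the transitive closure of BEFORE links between events.

    Hypothetical sketch: X BEFORE Y and Y BEFORE Z jointly entail
    X BEFORE Z, so a later label Z BEFORE X is inconsistent, even
    though a pairwise classifier would happily emit all three.
    """

    def __init__(self):
        # before[x] = events known (directly or by closure) to follow x
        self.before = defaultdict(set)

    def add_link(self, x, y):
        """Assert x BEFORE y; return False if it contradicts the closure."""
        if x == y or x in self.before[y]:
            return False  # y BEFORE x (or x == y) already holds: inconsistent
        # Everything at or before x now precedes y and all of y's successors.
        preds = [p for p in self.before if x in self.before[p]] + [x]
        succs = self.before[y] | {y}
        for p in preds:
            self.before[p] |= succs
        return True

# Three labels an independent pairwise classifier might emit:
tc = TemporalClosure()
print(tc.add_link("X", "Y"))  # True:  X BEFORE Y
print(tc.add_link("Y", "Z"))  # True:  Y BEFORE Z; closure infers X BEFORE Z
print(tc.add_link("Z", "X"))  # False: contradicts the inferred X BEFORE Z
```

Maintaining the closure eagerly, as here, keeps each consistency check to a set-membership test; the trade-off is extra work on every insertion, which is one reason the paper defers tighter integration of temporal reasoning and statistical classification to future work.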