<?xml version="1.0" standalone="yes"?> <Paper uid="W06-0507"> <Title>Sydney, July 2006. © 2006 Association for Computational Linguistics. Towards Large-scale Non-taxonomic Relation Extraction: Estimating the Precision of Rote Extractors</Title> <Section position="4" start_page="0" end_page="49" type="intro"> <SectionTitle> 2 Related work </SectionTitle> <Paragraph position="0"> Extracting information using Machine Learning algorithms has received much attention since the nineties, mainly motivated by the Message Understanding Conferences. Since the mid-nineties, systems have been developed that learn extraction patterns from partially annotated and unannotated data (Huffman, 1995; Riloff, 1996; Riloff and Schmelzenbach, 1998; Soderland, 1999).</Paragraph> <Paragraph position="1"> Generalising textual patterns (both manually and automatically) for the identification of relations has been proposed since the early nineties (Hearst, 1992), and it has been applied to extending ontologies with hyperonymy and holonymy relations (Morin and Jacquemin, 1999; Kietz et al., 2000; Cimiano et al., 2004; Berland and Charniak, 1999). Finkelstein-Landau and Morin (1999) learn patterns for company merging relations with very high accuracy. More recently, kernel methods have also become widely used for relation extraction (Bunescu and Mooney, 2005; Zhao and Grishman, 2005).</Paragraph> <Paragraph position="2"> Rote extractors trained from the web have the advantage that the training corpora can be collected easily and automatically, so they are useful in discovering many different relations from text. Several similar approaches have been proposed (Brin, 1998; Agichtein and Gravano, 2000; Ravichandran and Hovy, 2002), with various applications: Question-Answering (Ravichandran and Hovy, 2002), multi-document Named Entity Coreference (Mann and Yarowsky, 2003), and generating biographical information (Mann and Yarowsky, 2005). Szpektor et al. 
(2004) apply a similar method, with no seed lists, to automatically extract entailment relationships between verbs, and Etzioni et al. (2005) report very good results extracting Named Entities and relationships from the web.</Paragraph> <Section position="1" start_page="49" end_page="49" type="sub_section"> <SectionTitle> 2.1 Rote extractors </SectionTitle> <Paragraph position="0"> Rote extractors (Mann and Yarowsky, 2005) estimate the probability of a relation $r(p,q)$ given the surrounding context $A_1 p A_2 q A_3$. This is calculated, with a training corpus $T$, as the number of times that two related elements $r(x,y)$ from $T$ appear with that same context $A_1 x A_2 y A_3$, divided by the total number of times that $x$ appears in that context together with any other word:</Paragraph> <Paragraph position="1"> $$P(r(p,q) \mid A_1 p A_2 q A_3) = \frac{\sum_{r(x,y) \in T} c(A_1 x A_2 y A_3)}{\sum_{x,z} c(A_1 x A_2 z A_3)}$$ </Paragraph> <Paragraph position="2"> Here $c(\cdot)$ is the number of times a context occurs in $T$, and $z$ ranges over any word. $x$ is called the hook, and $y$ the target. To train a rote extractor from the web, the following procedure is commonly used (Ravichandran and Hovy, 2002): 1. Select a pair of related elements to be used as seed. For instance, (Dickens, 1812) for the relation birth year.</Paragraph> <Paragraph position="3"> 2. Submit the query Dickens AND 1812 to a search engine, and download a number of documents to build the training corpus.</Paragraph> <Paragraph position="4"> 3. Keep all the sentences containing both elements. 4. Extract the set of contexts between them and identify repeated patterns. This may just be the m characters to the left or to the right (Brin, 1998), the longest common substring of several contexts (Agichtein and Gravano, 2000), or all substrings obtained with a suffix tree constructor (Ravichandran and Hovy, 2002).</Paragraph> <Paragraph position="5"> 5. Download a separate corpus, called the hook corpus, containing just the hook (in the example, Dickens).</Paragraph> <Paragraph position="6"> 6. 
Apply the previous patterns to the hook corpus and calculate the precision of each pattern as the number of times it identifies a target related to the hook, divided by the total number of times the pattern appears. 7. Repeat the procedure for other examples of the same relation.</Paragraph> <Paragraph position="7"> To illustrate this process, let us suppose that we want to learn patterns to identify birth years. We may start with the pair (Dickens, 1812). From the downloaded corpus, we extract sentences such as "Dickens was born in 1812", "Dickens (1812 - 1870) was an English writer" and "Dickens (1812 - 1870) wrote Oliver Twist". The system identifies that the contexts of the last two sentences are very similar and chooses their longest common substring to produce the following patterns: "&lt;hook&gt; was born in &lt;target&gt;" and "&lt;hook&gt; ( &lt;target&gt; - 1870 )". The rote extractor needs to estimate the precision of the extracted patterns automatically, in order to keep the best ones. To measure these precision values, a hook corpus is now downloaded using the hook Dickens as the only query word, and the system looks for appearances of the patterns in this corpus. For every occurrence in which the hook of the relation is Dickens, if the target is 1812 it is deemed correct, and otherwise it is deemed incorrect (e.g. in "Dickens was born in Portsmouth").</Paragraph> </Section> </Section> </Paper>
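<!--
Illustrative note on Section 2.1. The precision estimate of steps 5 and 6 can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the implementation used in the paper: the names pattern_to_regex, estimate_precision and HOOK_CORPUS are hypothetical, the hook corpus is a tiny hand-made list of already tokenised sentences rather than downloaded web text, and hooks and targets are assumed to be single whitespace-free tokens.

```python
import re
from typing import List, Optional


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Compile a pattern such as '<hook> was born in <target>' into a regex.

    Assumption: hook and target are single tokens without whitespace.
    """
    pieces = []
    for part in re.split(r"(<hook>|<target>)", pattern):
        if part == "<hook>":
            pieces.append(r"(?P<hook>\S+)")
        elif part == "<target>":
            pieces.append(r"(?P<target>\S+)")
        else:
            # Literal context: escape each token, allow any whitespace between tokens.
            pieces.append(r"\s+".join(re.escape(tok) for tok in part.split(" ")))
    return re.compile("".join(pieces))


def estimate_precision(
    pattern: str, hook_corpus: List[str], hook: str, target: str
) -> Optional[float]:
    """Correct extractions divided by all occurrences of the pattern with this hook.

    Returns None when the pattern never appears with the hook.
    """
    regex = pattern_to_regex(pattern)
    correct = 0
    total = 0
    for sentence in hook_corpus:
        for match in regex.finditer(sentence):
            if match.group("hook") != hook:
                continue  # only occurrences whose hook is the seed hook are counted
            total += 1
            if match.group("target") == target:
                correct += 1
    return correct / total if total else None


# Hypothetical, hand-made hook corpus for the seed pair (Dickens, 1812).
HOOK_CORPUS = [
    "Dickens was born in 1812 .",
    "Dickens was born in Portsmouth .",
    "Dickens ( 1812 - 1870 ) was an English writer .",
    "Dickens was born in 1812 in Portsmouth .",
]

born_precision = estimate_precision(
    "<hook> was born in <target>", HOOK_CORPUS, "Dickens", "1812"
)
dates_precision = estimate_precision(
    "<hook> ( <target> - 1870 )", HOOK_CORPUS, "Dickens", "1812"
)
```

On this toy corpus the first pattern fires three times and extracts 1812 twice, so its estimated precision is 2/3, while the second pattern fires once and is correct. The sentence with Portsmouth is the kind of incorrect extraction described in the text.
-->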