<?xml version="1.0" standalone="yes"?> <Paper uid="C00-2101"> <Title>Learning Semantic-Level Information Extraction Rules by Type-Oriented ILP</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 Our Approach to IE Tasks </SectionTitle> <Paragraph position="0"> This section describes our approach to IE tasks.</Paragraph> <Paragraph position="1"> Our approach is based on semantic representations. First, training articles are analyzed and converted into semantic representations, which are filled case frames represented as atomic formulae. Training templates are prepared by hand as well. The ILP system learns IE rules in the form of logic programs with type information. To extract key information from a new article, the semantic representation automatically generated from the article is matched against the IE rules. Extracted information is filled into the template slots.</Paragraph> </Section> <Section position="4" start_page="0" end_page="699" type="metho"> <SectionTitle> 3 NLP Resources and Tools </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="698" type="sub_section"> <SectionTitle> 3.1 The Semantic Attribute System </SectionTitle> <Paragraph position="0"> We used the semantic attribute system of &quot;Goi-Taikei -- A Japanese Lexicon&quot; (Ikehara et al., 1997a; Kurohashi and Sakai, 1999) compiled by the NTT Communication Science Laboratories for a Japanese-to-English machine translation system, ALT-J/E (Ikehara et al., 1994). The semantic attribute system is a sort of hierarchical concept thesaurus represented as a tree structure in which each node is called a semantic category. An edge in the tree represents an is_a or has_a relation between two categories. The semantic attribute system is 12 levels deep and contains about 3,000 semantic category nodes.</Paragraph> <Paragraph position="1"> More than 300,000 Japanese words are linked to the category nodes.</Paragraph> </Section> <Section position="2" start_page="698" end_page="698" type="sub_section"> <SectionTitle> 3.2 Verb Case Frame Dictionary </SectionTitle> <Paragraph position="0"> The Japanese-to-English valency pattern dictionary of &quot;Goi-Taikei&quot; (Ikehara et al., 1997b; Kurohashi and Sakai, 1999) was also originally developed for ALT-J/E. The valency dictionary contains about 15,000 case frames with semantic restrictions on their arguments for 6,000 Japanese verbs. Each case frame consists of one predicate and one or more case elements that have a list of semantic categories.</Paragraph> </Section> <Section position="3" start_page="698" end_page="699" type="sub_section"> <SectionTitle> 3.3 Natural Language Processing Tools </SectionTitle> <Paragraph position="0"> We used the NLP components of ALT-J/E for text analysis. These include the morphological analyzer, the syntactic analyzer, and the case analyzer for Japanese. The components are robust and generic tools, mainly targeted at newspaper articles.</Paragraph> <Paragraph position="1"> Let us examine the case analysis in more detail. The case analyzer reads a set of parse tree candidates produced by the Japanese syntactic analyzer. A parse tree is represented as a dependency structure of phrases (i.e., Japanese bunsetsu). First, it divides the parse tree into unit sentences, where a unit sentence consists of one predicate and its noun and adverb dependent phrases. Second, it compares each unit sentence with the verb case frame dictionary. Each frame consists of a predicate condition and several case element conditions. The predicate condition specifies a verb that matches the frame, and each case-role has a
case element condition which specifies the particles and semantic categories of noun phrases. The preference value is defined as the summation of noun phrase preferences, which are calculated from the distances between the categories of the input sentences and the categories written in the frames. The case analyzer then chooses the most preferable parse tree and the most preferable combination of case frames.</Paragraph> <Paragraph position="2"> The valency dictionary also has case-roles (Table 1) for noun phrase conditions. The case-roles of adjuncts are determined by using the particles of the adjuncts and the semantic categories of the noun phrases.</Paragraph> <Paragraph position="3"> As a result, the output of the case analysis is a set of case frames for each unit sentence. The noun phrases in the frames are labeled by the case-roles in Table 1.</Paragraph> <Paragraph position="4"> For simplicity, we use the case-role codes, such as N1 and N2, as the labels (or slot names) to represent case frames. The relation between sentences and case-roles is described in detail in (Ikehara et al., 1993).</Paragraph> <Paragraph position="5"> We developed a logical form translator that generates semantic representations expressed as atomic formulae from the case frames and parse trees. For later use, document ID and tense information are also added to the case frames.</Paragraph> <Paragraph position="6"> For example, the case frame in Table 2 is obtained after analyzing the following sentence of document D1: &quot;Jakku (Jack) ha sutsukesu (suitcase) wo shokuba (the office) kara (from) o (the air</Paragraph> <Paragraph position="7"> Table 1: Case-roles and example sentences.
Subject N1: the agent/experiencer of an event/situation -- I throw a ball.
Object1 N2: the object of an event -- I throw a ball.
Object2 N3: another object of an event -- I compare it with them.
Loc-Source N4: the source location of a movement -- I start from Japan.
Loc-Goal N5: the goal location of a movement -- I go to Japan.
Purpose N6: the purpose of an action -- I go shopping.
Result N7: the result of an event -- It results in failure.
Locative N8: the location of an event -- It occurs at the station.
Comitative N9: co-experiencer -- I share a room with him.
Quotative N10: quoted expression -- I say that ....
Material N11: material/ingredient -- I fill the glass with water.
Cause N12: the reason for an event -- It collapsed from the weight.
Instrument N13: a concrete instrument -- I speak with a microphone.
Means N14: an abstract instrument -- I speak in Japanese.
Time-Position TN1: the time of an event -- I go to bed at 10:00.
Time-Source TN2: the starting time of an event -- I work from Monday.
Time-Goal TN3: the end time of an event -- It continues until Monday.
Amount QUANT: the quantity of something -- I spend $10.</Paragraph> </Section> </Section> <Section position="5" start_page="699" end_page="700" type="metho"> <SectionTitle> 4 Inductive Learning Tool </SectionTitle> <Paragraph position="0"> Conventional ILP systems take a set of positive and negative examples, and background knowledge.
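As a toy illustration of this conventional ILP setting (a sketch only: the predicate names, the tuple encoding of atoms, and the simplistic coverage test are invented here and are not part of RHB+ or any real ILP engine), the acceptance criterion for a hypothesis can be written as:

```python
# Toy sketch of the generic ILP acceptance test: a hypothesis must
# cover all positive examples and no negative ones.  Atoms are plain
# tuples (predicate, arg1, ...); real ILP systems test coverage by
# logical derivation, which this sketch reduces to predicate lookup.

def covers(hypothesis, example, background):
    # hypothesis = (head_predicate, body_predicates); it "covers" an
    # example atom if the head predicate matches and every body
    # predicate occurs somewhere in the background knowledge.
    head_pred, body_preds = hypothesis
    if example[0] != head_pred:
        return False
    bg_preds = {atom[0] for atom in background}
    return all(p in bg_preds for p in body_preds)

def acceptable(hypothesis, positives, negatives, background):
    covers_all_pos = all(covers(hypothesis, e, background) for e in positives)
    covers_no_neg = not any(covers(hypothesis, e, background) for e in negatives)
    return covers_all_pos and covers_no_neg

# Invented mini-task: which predicate signals a company name?
background = [("release", "c1", "p1"), ("announce", "c1", "d1")]
positives = [("company", "c1")]

good_rule = ("company", ("release",))  # body supported by the BK
bad_rule = ("company", ("acquire",))   # body predicate absent from BK
```

Coverage is deliberately reduced to predicate lookup so the sketch stays self-contained; an actual ILP system derives coverage by inference over the background knowledge.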
The output is a set of hypotheses in the form of logic programs that cover the positive examples and do not cover the negative examples. We employed the type-oriented ILP system RHB+.</Paragraph> <Section position="1" start_page="699" end_page="699" type="sub_section"> <SectionTitle> 4.1 Features of Type-oriented ILP System RHB+ </SectionTitle> <Paragraph position="0"> The type-oriented ILP system has the following features that match the needs of learning IE rules.</Paragraph> <Paragraph position="1"> * A type-oriented ILP system can efficiently and effectively handle type (or semantic category) information in training data. This feature is advantageous in controlling the generality and accuracy of learned IE rules.</Paragraph> <Paragraph position="2"> * It can directly use semantic representations of the text as background knowledge.</Paragraph> <Paragraph position="3"> * It can learn from positive examples only. * Predicates are allowed to have labels (or keywords) for readability and expressibility.</Paragraph> </Section> <Section position="2" start_page="699" end_page="700" type="sub_section"> <SectionTitle> 4.2 Summary of Type-oriented ILP System RHB+ </SectionTitle> <Paragraph position="0"> This section summarizes the employed type-oriented ILP system RHB+. The input of RHB+ is a set of positive examples and background knowledge, including the type hierarchy (or the semantic attribute system). The output is a set of Horn clauses (Lloyd, 1987) having variables with type information. That is, the term is extended to the τ-term.</Paragraph> </Section> <Section position="3" start_page="700" end_page="700" type="sub_section"> <SectionTitle> 4.3 τ-terms </SectionTitle> <Paragraph position="0"> τ-terms are a restricted form of ψ-terms (Aït-Kaci and Nasr, 1986; Aït-Kaci et al., 1994). Informally, τ-terms are Prolog terms whose variables are replaced with a variable Var of type T, which is denoted as Var:T.
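To make typed variables concrete, here is a minimal Python sketch (the category names and is_a links below are invented for illustration and are far smaller than Goi-Taikei's roughly 3,000-category hierarchy): a variable X:T1 subsumes Y:T2 whenever T2 is a descendant of T1, and the least upper bound of two types plays the role of their typed generalization, as used later when heads are generalized from pairs of positive examples.

```python
# Minimal sketch of typed variables (Var:T) over a small is_a
# hierarchy.  PARENT maps each type to its parent category; the
# hierarchy is a toy invention, not Goi-Taikei's real one.

PARENT = {
    "human": "agent", "organization": "agent",
    "company": "organization",
    "agent": "top", "language": "top",
}

def ancestors(t):
    # The chain from t up to the root, t itself included.
    chain = [t]
    while t in PARENT:
        t = PARENT[t]
        chain.append(t)
    return chain

def subsumes(general, specific):
    # X:general subsumes Y:specific iff specific is_a* general.
    return general in ancestors(specific)

def lub(t1, t2):
    # Least upper bound of two types: the most specific common
    # ancestor, i.e. the typed generalization of two variables.
    a1 = ancestors(t1)
    for t in ancestors(t2):
        if t in a1:
            return t
    return "top"
```

With this toy hierarchy, `lub("human", "company")` yields `"agent"`: generalizing a human-typed and a company-typed variable produces an agent-typed one.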
Predicate and function symbols are allowed to have features (or labels). For example, speak(agent => X:human, object => Y:language) is a clause based on τ-terms which has the labels agent and object, and the types human and language.</Paragraph> </Section> </Section> <Section position="6" start_page="700" end_page="702" type="metho"> <SectionTitle> 4.4 Algorithm </SectionTitle> <Paragraph position="0"> The algorithm of RHB+ is basically a greedy covering algorithm. It constructs clauses one by one by calling inner_loop (Algorithm 1), which returns a hypothesis clause. A hypothesis clause is represented in the form head :- body. Covered examples are removed from P in each cycle.</Paragraph> <Paragraph position="1"> The inner loop consists of two phases: the head construction phase and the body construction phase. It constructs heads in a bottom-up manner and constructs the body in a top-down manner, following the result described in (Zelle et al., 1994).</Paragraph> <Paragraph position="2"> The search heuristic PWI is a weighted informativity employing the Laplace estimate. Let</Paragraph> <Paragraph position="4"> where |P| denotes the number of positive examples covered by T and Q(T) is the empirical content. The smaller the value of PWI, the better the candidate clause. Q(T) is defined as the set of atoms (1) that are derivable from T and (2) whose predicate is the target predicate, i.e., the predicate name of the head.</Paragraph> <Paragraph position="5"> The dynamic type restriction by positive examples uses the positive examples currently covered in order to determine appropriate types for the variables of the current clause.</Paragraph> <Paragraph position="6"> Algorithm 1 inner_loop 1. Given positives P, original positives Po, and background knowledge BK.</Paragraph> <Paragraph position="7"> 2.
Decide the types of variables in a head by computing the typed least general generalizations (lgg) of N pairs of elements in P, and select the most general head as Head.</Paragraph> <Paragraph position="8"> 3. If the stopping condition is satisfied, return Head.</Paragraph> <Paragraph position="9"> 4. Let Body be empty.</Paragraph> <Paragraph position="10"> 5. Create a set of all possible literals L using variables in Head and Body.</Paragraph> <Paragraph position="11"> 6. Let BEAM be the top K literals lk of L with respect to the positive weighted informativity PWI.</Paragraph> <Paragraph position="12"> 7. Do the later steps, assuming that lk is added to Body, for each literal lk in BEAM.</Paragraph> <Paragraph position="13"> 8. Dynamically restrict the types in Body by calling the dynamic type restriction by positive examples. 9. If the stopping condition is satisfied, return (Head :- Body).</Paragraph> <Paragraph position="14"> 10. Goto 5.</Paragraph> <Paragraph position="15"> 5 Illustration of a Learning Process Now, we examine the two short notices of new product releases in Table 3. The following table shows a sample template for articles reporting a new product release.</Paragraph> <Paragraph position="16"> Template
1. article id:
2. company:
3. product:
4. release date:</Paragraph> <Section position="1" start_page="700" end_page="701" type="sub_section"> <SectionTitle> 5.1 Preparation </SectionTitle> <Paragraph position="0"> Suppose that the following semantic representation is obtained from Article 1.</Paragraph> <Paragraph position="1"> (c1) announce( article => 1, tense => past,</Paragraph> <Paragraph position="3"> The filled template for Article 1 is as follows. Template 1
1. article id: 1
2. company: ABC Corp.</Paragraph> <Paragraph position="4"> 3. product: a color printer
4. release date: Jan. 20</Paragraph> <Paragraph position="5"> Suppose that the following semantic representation is obtained from Article 2.</Paragraph> <Paragraph position="6"> The filled template for Article 2 is as follows.</Paragraph> <Paragraph position="7"> Template 2
1.
article id: 2
2. company: XYZ Corp.</Paragraph> <Paragraph position="8"> 3. product: a color scanner
4. release date: last month</Paragraph> </Section> <Section position="2" start_page="701" end_page="701" type="sub_section"> <SectionTitle> 5.2 Head Construction </SectionTitle> <Paragraph position="0"> Two positive examples are selected for the template slot &quot;company&quot;.</Paragraph> <Paragraph position="2"> By computing a least general generalization (lgg) (Sasaki, 1997), the following head is obtained:</Paragraph> <Paragraph position="4"/> </Section> <Section position="3" start_page="701" end_page="701" type="sub_section"> <SectionTitle> 5.3 Body Construction </SectionTitle> <Paragraph position="0"> Generate possible literals (&quot;literals&quot; here means atomic formulae or negated ones) by combining predicate names and variables, then check the PWI values of the clauses to which one of the literals is added.</Paragraph> <Paragraph position="1"> In this case, suppose that adding the following literal, with the predicate release, is the best one. After the dynamic type restriction, the current clause satisfies the stopping condition. Finally, the rule for extracting &quot;company name&quot; is returned. Extraction rules for the other slots, &quot;product&quot; and &quot;release date&quot;, can be learned in the same manner. Note that several literals may be needed in the body of the clause to satisfy the stopping condition.</Paragraph> <Paragraph position="3"/> </Section> <Section position="4" start_page="701" end_page="702" type="sub_section"> <SectionTitle> 5.4 Extraction </SectionTitle> <Paragraph position="0"> Now, we have the following semantic representation extracted from the new article, Article 3: &quot;JPN Corp. has released a new CD player.&quot;</Paragraph> <Paragraph position="2"> Applying the learned IE rules and other rules, we can obtain the filled template for Article 3.</Paragraph> <Paragraph position="3"> Template 3
1. article id: 3
2.
company: JPN Corp.</Paragraph> <Paragraph position="4"> 3. product: CD player
4. release date:</Paragraph> </Section> <Section position="5" start_page="702" end_page="702" type="sub_section"> <SectionTitle> 6.1 Setting of Experiments </SectionTitle> <Paragraph position="0"> We extracted articles related to the release of new products from a one-year newspaper corpus written in Japanese. One hundred articles were randomly selected from 362 relevant articles. The template we used consisted of five slots: company name, product name, release date, announce date, and price. We filled one template for each article. After analyzing the sentences, case frames were converted into atomic formulae representing semantic representations, as described in Sections 2 and 3. All the semantic representations were given to the learner as background knowledge, and the filled templates were given as positive examples. To speed up the learning process, we selected predicate names that are relevant to the words in the templates as the target predicates to be used by the ILP system, and we also restricted the number of literals in the body of hypotheses to one.</Paragraph> <Paragraph position="1"> Precision and recall, the standard metrics for IE tasks, are counted by using remove-one-out cross validation on the examples for each item. We used a VArStation with a Pentium II Xeon (450 MHz) for this experiment.</Paragraph> </Section> </Section> </Paper>