<?xml version="1.0" standalone="yes"?> <Paper uid="W99-0209"> <Title>Orthographic Co-Reference Resolution Between Proper Nouns Through the Calculation of the Relation of &quot;Replicancia&quot;</Title> <Section position="4" start_page="62" end_page="62" type="metho"> <SectionTitle> 3 Replicancia Calculation Algorithm </SectionTitle> <Paragraph position="0"> Once the replicancia relation is defined, we are in a position to present the algorithm devised to calculate it. Although the algorithm has been developed in Prolog, we present it in a pseudocode similar to that used for PASCAL, with the following notation: the symbols &quot;,&quot;, &quot;/&quot; and &quot;¬&quot; represent the conjunction, disjunction and negation of terms, respectively; the brackets &quot;(...)&quot; delimit optional units; braces &quot;{... }&quot; establish the precedence of logical operators; &quot;::=&quot; is the symbol chosen for the definition and &quot;:=&quot; refers to the assignment of values. 2 We will use Courier New font to write the names of the predicates defined in Prolog which take part in the replicancia calculation algorithm.</Paragraph> <Paragraph position="1"> 3 The canonic form of a proper noun is its representation in lowercase letters without accents.</Paragraph> </Section> <Section position="5" start_page="62" end_page="63" type="metho"> <SectionTitle> 4 In the last case the whole word P1 in the longer </SectionTitle> <Paragraph position="0"> noun is assimilated to only the initial ' of P2. The rest of the letters in P2 must have some correspondence in the first noun so that we can consider both nouns as replicantes.</Paragraph> <Paragraph position="2"> The calculation of the replicancia relation can impose a heavy computational burden if it is not somehow limited. 
The variety of forms in which two nouns can be mutually replicantes forces a long search before we can decide that two candidates are linked by such a relation. That is why we have provided our algorithm with two instruments to reduce the computational burden: the first is its ability to learn, which permits the automatic creation of a database of pairs of proper nouns that are mutually replicantes; the second is a filter which, based on a fast analysis of the initials of the nouns under comparison, rejects most of the pairs of nouns that are not mutually replicantes. The main predicate can be expressed as follows:</Paragraph> <Paragraph position="4"> where Nxc represents the canonic form of noun Nx and predicate resto replicante is defined by:</Paragraph> <Paragraph position="6"> where Nx-Ny represents the sequence of lexemes of noun Nx not present in noun Ny, and D12 is N1-N2; predicate sin_prep(Nx) is true for the nouns Nx which do not include prepositions; suprimenexos is the result of removing from noun N1 the lexemes which do not begin with a capital letter; finally, predicate version(Nx,Ny) is satisfied if every lexeme of Ny is a version_palabra of one of the lexemes of Nx, with the relative order of occurrence of the homologous lexemes preserved, as we have already indicated in Section 2.</Paragraph> <Paragraph position="7"> In order to evaluate this algorithm we have designed the experiment described in Section 4.</Paragraph> </Section> </Paper>
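<!--
Footnote 3 above defines the canonic form of a proper noun as its representation in lowercase letters without accents. The following is a minimal Python sketch of that definition only, not the authors' Prolog code. The function name canonical_form and the use of Unicode NFD decomposition are assumptions made for illustration; the paper does not say how accents are removed.

```python
import unicodedata


def canonical_form(noun):
    """Return the noun in lowercase with accent marks removed.

    Hypothetical helper: the name and the approach are assumptions,
    not taken from the paper.
    """
    # NFD splits each accented letter into a base letter followed by
    # one or more combining marks.
    decomposed = unicodedata.normalize("NFD", noun)
    # Drop the combining marks and keep every other character.
    # Assumption: this also maps the Spanish letter "ñ" to "n",
    # a case the paper's definition does not address.
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_accents.lower()
```

Under these assumptions, two spellings of a noun that differ only in case or accentuation receive the same canonic form, so they can be compared with plain string equality before any further analysis of their lexemes.
-->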