<?xml version="1.0" standalone="yes"?> <Paper uid="W99-0206"> <Title>Pronoun Resolution in Japanese Sentences Using Surface Expressions and Examples</Title> <Section position="5" start_page="39" end_page="861" type="metho"> <SectionTitle> 3 Heuristic Rules for Demonstratives </SectionTitle> <Paragraph position="0"> We constructed heuristic rules for demonstratives by consulting previous studies (NLRI 81) (Hayashi 83) (Takahashi et al. 90) (Kinsui &amp; Takubo 92) and by examining Japanese sentences by hand. Demonstratives fall into three categories: demonstrative pronouns, demonstrative adjectives, and demonstrative adverbs. In the following sections, we explain the rules for analyzing demonstratives.</Paragraph> <Section position="1" start_page="39" end_page="861" type="sub_section"> <SectionTitle> 3.1 Rule for Demonstrative Pronouns </SectionTitle> <Paragraph position="0"> Rule in the case when the referent is a noun phrase Candidate enumerating rule 1 When a pronoun is a demonstrative pronoun or &quot;sono (of it) / kono (of this) / ano (of that)&quot;, {(A topic which has weight W and distance D, W - D - 2) (A focus which has weight W and distance D, W - D + 4)} This bracketed expression represents the lists of proposals in Figure 1. The definition and weight W of the topic and focus are shown in Tables 1 and 2. The distance D is the number of topics and foci between the demonstrative and the possible referent. Since a demonstrative refers to foci more often than a zero pronoun does, we add the coefficient -2 or +4 as compared with the heuristic rules in zero pronoun resolution. The score (in other words, the certification value) of a candidate referent depends on the weight of topics/foci and the physical distance between the demonstrative and the candidate referent.</Paragraph> <Paragraph position="1"> Rule when the referent is a verb phrase [Table residue: a table of points by similarity level for demonstrative pronouns (Sim. = similarity level); values not recoverable]
Candidate enumerating rule 2 When a pronoun is &quot;kore/sore/are&quot; or a demonstrative adjective, {(the previous sentence (or, if the verb phrase is in the same sentence, the verb phrase in a conditional form containing a conjunctive particle such as &quot;ga (but)&quot;, &quot;daga (but)&quot;, or &quot;keredo (but)&quot;), 15)} The following is an example of a pronoun referring to the verb phrase in the previous sentence.</Paragraph> <Paragraph position="2"> tengu-wa maenoban-noyouni utattari odottari shihajimeta. (tengu) (the previous night) (sing) (dance) (begin to do) (Tengus began singing and dancing just as they had done the previous night.) ojiisan-wa sore-wo mite, kon'nahuuni utai-hajimeta. (the old man) (it) (see) (as follows) (begin to sing) (When the old man saw this, he began to sing as follows.) (1) In these sentences, the demonstrative pronoun &quot;sore (it)&quot; refers to the event &quot;tengutachi-ga utattari odottari shihajimemashita (tengus began singing and dancing just as they had done the previous night)&quot;.</Paragraph> <Paragraph position="3"> Rule using the feature that demonstrative pronouns usually do not refer to people Candidate judging rule 1 When a pronoun is a demonstrative pronoun and a candidate referent has a semantic marker HUM (human), the candidate referent is given -10 points. We used the Noun Semantic Marker Dictionary (Watanabe et al. 92) as a semantic marker dictionary.</Paragraph> <Paragraph position="4"> Candidate judging rule 2 When a pronoun is a demonstrative pronoun, a candidate referent is given the points in Table 3 by using the highest semantic similarity between the candidate referent and the codes {5200003010 5201002060 5202001020</Paragraph> <Paragraph position="6"> &quot;125&quot; and &quot;126&quot; are given two category numbers. When we calculate the semantic similarity, we use the modified code table in Table 4. 
The reason for this modification is that some codes in BGH (NLRI 64) are not suitable for semantic constraints.</Paragraph> <Paragraph position="7"> These rules use the feature that a demonstrative pronoun rarely refers to people, which reduces the number of candidate referents. For example, &quot;sore (it)&quot; in the following sentences is found to refer to &quot;konpyuuta (computer)&quot;, because &quot;sore (it)&quot; can refer only to a non-human thing, and the only non-human noun near &quot;sore (it)&quot; is &quot;konpyuuta (computer)&quot;.</Paragraph> <Paragraph position="9"> (Taroo bought a new computer.) jon-ni sassoku sore-wo misemashita.</Paragraph> <Paragraph position="10"> (John) (at once) (it) (show) ([He] showed it at once to John.) (2) Rule using the feature that &quot;koko&quot; and &quot;soko&quot; often refer to locations Candidate judging rule 3 When a pronoun is &quot;koko (here) / soko (there) / asoko (over there)&quot; and a candidate referent has a semantic marker LOC (location), the candidate referent is given 10 points.</Paragraph> <Paragraph position="11"> Candidate judging rule 4 When a pronoun is &quot;koko/soko/asoko&quot;, a candidate referent is given the points in Table 5 based on the semantic similarity between the candidate referent and the codes {6563006010 6559005020 9113301090 9113302010 6471001030 6314020130}, which signify locations in BGH (NLRI 64). A 10-digit category number indicates seven levels of an is-a hierarchy. The top five levels are expressed by the first five digits of a category number. The sixth level is expressed by the following two digits. The last level is expressed by the last three digits.</Paragraph> <Paragraph position="12"> [Table 5 residue: points by similarity level (Sim.) for demonstratives that refer to places; values not recoverable]</Paragraph> <Paragraph position="13"> &quot;soko (there)&quot; commonly refers to a location. 
For example, &quot;soko&quot; in the following sentences refers to &quot;baiten (shop)&quot;, which signifies a location.</Paragraph> <Paragraph position="14"> koora-wo kaini baiten-ni hairimashita.</Paragraph> <Paragraph position="15"> (cola) (buy) (shop) (enter) (Taroo entered a shop to buy a cola.) jiroo-wa soko-de guuzen dekuwashimashita.</Paragraph> <Paragraph position="16"> (Jiroo) (there) (by chance) (meet) (Jiroo met Taroo there by chance.) (3) Rule when &quot;kokode&quot; or &quot;sokode&quot; is used as a conjunction Candidate enumerating rule 3 When a pronoun is &quot;kokode&quot; or &quot;sokode&quot;, {(the pronoun is used as a conjunction, 11)} This rule covers the case in which &quot;kokode (here or then)&quot; or &quot;sokode (there or then)&quot; is used as a conjunction. If no word that signifies a location is found near &quot;kokode&quot; or &quot;sokode&quot;, the candidate listed by this rule has the highest score, and &quot;kokode&quot; or &quot;sokode&quot; is judged to be a conjunction. By this rule, &quot;sokode&quot; in the following sentences is judged to be a conjunction. ojiisan-wa tengu-ga kowakunakunatte-imashita.</Paragraph> <Paragraph position="17"> (old man) (tengu) (lose all fear of) (The old man lost all fear of the tengus.) sokode ojiisan-wa kakureteita ana-kara detekimashita. (so) (old man) (be hiding) (hole) (leave) (So, he left the hole where he had been hiding.) (4) This rule is necessary when a system that translates &quot;sokode&quot; into English must judge whether it is used as a demonstrative or as a conjunction, and translate it as &quot;there&quot; or &quot;then&quot; accordingly. Rule when an anaphor does not have its antecedent Candidate enumerating rule 5 When a pronoun is a demonstrative pronoun, a demonstrative adverb, or a demonstrative adjective, {(Introduce an individual, 10)} This rule is used when the sentences contain no referent for a pronoun. 
This rule makes the system introduce a certain individual.</Paragraph> </Section> <Section position="2" start_page="861" end_page="861" type="sub_section"> <SectionTitle> 3.2 Rule for Demonstrative Adjectives </SectionTitle> <Paragraph position="0"> Demonstrative adjectives such as &quot;kono (this)&quot;, &quot;sono (the)&quot;, &quot;ano (that)&quot;, &quot;kon'na (like this)&quot;, and &quot;son'na (like it)&quot; are classified into two reference categories: gentei-reference and daikou-reference.</Paragraph> <Paragraph position="1"> In gentei-reference, although a demonstrative adjective does not refer to an entity by itself, the phrase &quot;demonstrative adjective + noun phrase&quot; refers to the antecedent. An example is &quot;kono ojiisan (this old man)&quot; in the following sentences: ojiisan-wa tengutachi-no-maeni deteitte odori-hajimemashita (old man) (before the tengus) (appear) (begin to dance) (He appeared before the tengus, and began to dance.) keredomo kono ojiisan-wa uta-mo odori-mo hetakuso-deshita (but) (this old man) (sing) (dance) (poor) (But the old man was a poor singer, and his dancing was no better.) (5) In this example, although the demonstrative &quot;kono (this)&quot; does not by itself refer to &quot;ojiisan (old man)&quot; in the first sentence, the noun phrase &quot;kono ojiisan (this old man)&quot; refers to &quot;ojiisan (old man)&quot; in the first sentence. In daikou-reference, the demonstrative adjective itself refers to an entity. In this case, we can analyze &quot;sono (the)&quot; in the same way as &quot;sore-no (of it)&quot;. In the following sentences, &quot;sono&quot; refers to &quot;tengu (tengus)&quot;. It is an example of daikou-reference. [Table residue: points by similarity level (Sim.), including an exact-match column; values not recoverable]</Paragraph> <Paragraph position="2"> mata karasu-no-youna kao-wo-shita tengu-mo imashita (also) (like crows) (with face) (tengu) (exist) (There were also some tengus with faces like those of crows.) 
sono kuchi-wa torino-kuchibashi-noyouni togatte-imashita (their mouths) (like the beaks of birds) (be pointed) (Their mouths were pointed like the beaks of birds.) (6)</Paragraph> <Paragraph position="4"> Rules for gentei-reference and daikou-reference are as follows: Candidate enumerating rule I When a pronoun is &quot;demonstrative adjective + noun α&quot;, {(the noun phrase containing a noun α, 45) (the topic which is a subordinate of noun α and which has weight W and distance D, W - D + 30) (the focus which is a subordinate of noun α and which has weight W and distance D, W - D + 30)} The relationship between a super-ordinate word and a subordinate word is detected by judging the last word in the definition of the word α in the EDR Japanese word dictionary (EDR 95a) to be the super-ordinate of the word α.</Paragraph> <Paragraph position="5"> Because of this rule, when a pronoun is &quot;demonstrative adjective + noun phrase α&quot; and the same noun phrase α appears near it, the pronoun is judged to be a gentei-reference and that noun phrase is selected as a candidate referent. When a subordinate of noun phrase α appears near it, the subordinate is also selected as a candidate referent. These rules give higher points to a candidate referent [...] the subordinate of noun phrase α. [Table 7 residue: examples of Noun X in &quot;Noun X no kuchi (mouth of Noun X)&quot;: hukuro (sack), ruporaitaa (documentary writer), iin (member), akachan (baby), kare (he)]</Paragraph> <Paragraph position="6"> ojiisan-wa toonoiteiku tsuru-no sugata-wo miokurimashita. (old man) (recede) (crane) (figure) (watch) (The old man watched the receding figure of the crane.) &quot;ano tori-wo tasukete yokatta&quot; to iimashita. (that bird) (save) (glad) (say) (&quot;I'm glad I saved that bird,&quot; said the old man to himself.) 
(7) In this example, &quot;ano tori (that bird)&quot; refers to its subordinate &quot;tsuru (crane)&quot; in the previous sentence.</Paragraph> </Section> <Section position="3" start_page="861" end_page="861" type="sub_section"> <SectionTitle> Rules for daikou-reference of so-series demonstrative adjective </SectionTitle> <Paragraph position="0"> Candidate judging rule 5 When a pronoun is a so-series demonstrative adjective, the system consults examples of the form &quot;Noun X no Noun Y&quot; whose Noun Y is modified by the pronoun, and gives a candidate referent the points in Table 6 according to the similarity between the candidate referent and Noun X in &quot;Bunrui Goi Hyou&quot; (NLRI 64). The Japanese Co-occurrence Dictionary (EDR 95c) is used as a source of examples of &quot;X no Y&quot;.</Paragraph> <Paragraph position="1"> This rule checks the semantic constraint. (For a daikou-reference, candidates of the referent are selected by Candidate enumerating rule 1 in Section 3.1.)</Paragraph> <Paragraph position="2"> We explain how the rule applies to &quot;sono (the)&quot; in the sentences (6). First, the system gathers examples of the form &quot;Noun X no kuchi (mouth of Noun X)&quot;. Table 7 shows some examples of &quot;Noun X no kuchi (mouth of Noun X)&quot; in the Japanese Co-occurrence Dictionary (EDR 95c). Next, the system checks the semantic similarity between candidate referents and Noun X, and judges a candidate referent with a higher similarity to be a better candidate. In this example, &quot;tengu&quot; is semantically similar to Noun X in that they are both living things. 
Finally, the system selects &quot;tengu&quot; as the proper referent.</Paragraph> <Paragraph position="3"> Rules when non-so-series demonstrative has daikou-reference Candidate judging rule 6 When a pronoun is a non-so-series demonstrative adjective, the system consults examples of the form &quot;Noun X no (of) Noun Y (Y of X)&quot; whose Noun Y is modified by the pronoun, and gives candidate referents the points in Table 8 according to the similarity between the candidate referent and Noun X in &quot;Bunrui Goi Hyou&quot; (NLRI 64). Since a non-so-series demonstrative adjective is rarely a daikou-reference (NLRI 81) (Yamamura et al. 92), the number of points is smaller than in the case of the so-series.</Paragraph> <Paragraph position="4"> Rule when a pronoun refers to a verb phrase Like a demonstrative pronoun, a demonstrative adjective can refer to the meaning of the verb phrase in the previous sentence. This case is resolved by Candidate enumerating rule 2 in Section 3.1.</Paragraph> <Paragraph position="5"> Rule for &quot;kon'na noun&quot; (noun like this) In addition to a noun phrase and the previous sentences, &quot;kon'na noun&quot; can also refer to the next sentences.</Paragraph> <Paragraph position="6"> ojiisan-wa odorinagara kon'na uta-wo utaimashita.</Paragraph> <Paragraph position="7"> (old man) (dance) (song like this) (sing) (As he danced, he sang the following song:) &quot;tengu tengu hachi tengu.&quot; (8)</Paragraph> <Paragraph position="9"> In the above example, &quot;kon'na uta (song like this)&quot; refers to the next sentence &quot;tengu, tengu, hachi tengu.&quot; However, the expression &quot;kon'na noun&quot; (noun like this) alone does not tell us whether it refers to the previous or the next sentences. 
To make the decision, we gathered 317 sentences containing &quot;kon'na&quot; (like this) from about 60,000 sentences in Japanese essays and editorials, and counted how often &quot;kon'na&quot; refers to the previous sentences and how often to the next sentences. The results are shown in Table 9. This table indicates that &quot;kon'na noun&quot; followed by particles other than &quot;ga&quot; and &quot;wo,&quot; which are used when representing new information, very often refers to the previous sentence. Therefore, in that case the system judges that the desired antecedent is the previous sentence. When &quot;kon'na noun&quot; is followed by the particle &quot;ga&quot; or &quot;wo,&quot; the proper referent is determined by the expression in quotation marks.</Paragraph> </Section> <Section position="4" start_page="861" end_page="861" type="sub_section"> <SectionTitle> 3.3 Rule for Demonstrative Adverbs </SectionTitle> <Paragraph position="0"> Rule when so-series demonstrative adverb refers to the previous sentences Candidate enumerating rule 9 When an anaphor is a so-series demonstrative adverb such as &quot;sou (so),&quot; {(the previous sentences, 30)} The following is an example.</Paragraph> <Paragraph position="1"> &quot;tengu tengu hachi tengu.&quot;</Paragraph> <Paragraph position="3"> sou utatta-nowa sokoni hachihiki-no tengu-ga itakara-desu.</Paragraph> <Paragraph position="4"> (sing so) (there) (eight) (tengu) (exist) (He sang so because he counted eight of them there.
) (9) &quot;sou (so)&quot; refers to the previous sentence &quot;tengu tengu hachi tengu&quot;.</Paragraph> <Paragraph position="5"> Rule when so-series demonstrative adverb cataphorically refers to the verb phrase in the same sentence Candidate enumerating rule 10 When an anaphor is &quot;sou/soushite/sonoyouni&quot; and is in a subordinate clause which has a conjunctive particle such as &quot;ga&quot;, &quot;daga&quot;, or &quot;keredo&quot;, or an adjective conjunction such as &quot;youni&quot;, {(the main clause, 45)} 4 Heuristic Rule for Personal Pronouns Candidate enumerating rule 1 When an anaphor is a first-person pronoun, {(the first person (the speaker) in the context, 25)} Candidate enumerating rule 2 When an anaphor is a second-person pronoun, {(the second person (the hearer) in the context, 25)} A first- or second-person pronoun often appears in quotations, and can be resolved by estimating the first person (speaker) or the second person (hearer) in advance. The first and second persons are estimated by regarding the ga-case (subjective) and ni-case (objective) components of the verb phrase representing the speaking action of the quotation as the first and second persons, respectively. The verb phrase representing the speaking action is detected as follows. If the quotation is followed by a speaking action verb phrase such as &quot;to itta (said),&quot; that verb phrase is regarded as the verb phrase representing the speaking action. Otherwise, the last verb phrase in the previous sentence is regarded as the verb phrase representing the speaking action. For example, the second-person pronoun &quot;omaesan (you)&quot; in the following sentences refers to the second person &quot;ojiisan (the old man)&quot;. ojiisan-wa jimen-ni koshi-wo-oroshimashita.</Paragraph> <Paragraph position="6"> (old man) (ground) (sit down) (The old man sat down on the ground.) 
yagate (ojiisan-wa) nemutte-shimaimashita.</Paragraph> <Paragraph position="7"> (soon) (old man) (fall asleep) (He soon fell asleep.) (of course) (you) (don't mean to doubt) (&quot;Of course, we don't mean to doubt you,&quot;) tengu-ga ojiisan-ni iimashita.</Paragraph> <Paragraph position="8"> (tengu) (old man) (said) (said one of the &quot;tengu&quot; to the old man.) (10) The second person in the quotation is estimated to be &quot;ojiisan&quot; because the ni-case component of the verb phrase &quot;iimashita (said)&quot; representing the speaking action of the quotation is &quot;ojiisan&quot;.</Paragraph> <Paragraph position="9"> Candidate enumerating rule 3 When an anaphor is a third-person pronoun, {(a first person, -10) (a second person, -10)} [Table residue: points by similarity level (Sim.); values not recoverable] Rule using semantic relation to verb phrase Candidate judging rule 1 When a candidate referent of a case component (a zero pronoun) does not satisfy the semantic marker of the case component in the case frame, it is given -5. Candidate judging rule 2 A candidate referent of a case component (a zero pronoun) is given the points in Table 10 by using the highest semantic similarity between the candidate referent and examples of the case component in the case frame. These two rules check the semantic constraint between the candidate referent and the verb phrase which has the candidate referent in its case component. Candidate judging rule 1 checks semantic constraints by using semantic markers. Candidate judging rule 2 checks semantic constraints by using examples. Figure 3 explains how to check semantic constraints in the example sentences.</Paragraph> <Paragraph position="10"> In the method using semantic markers, a candidate referent is judged to be a proper referent if one of the semantic markers belonging to the candidate referent is equal or subordinate to the semantic marker of the case component. 
For example, with respect to the zero pronoun in Figure 3, since the ga-case component of the verb &quot;nemuru (sleep)&quot; has the semantic markers HUM (human being) and ANI (animal) and since &quot;ojiisan (old man)&quot; has the semantic marker HUM, the proper referent is judged to be &quot;ojiisan.&quot; In the example-based method, the validity of a candidate referent is decided by the semantic similarity between the candidate referent and the examples of the case component in the verb case frame. The higher the semantic similarity is, the greater the validity is. For example, with respect to the zero pronoun in Figure 3, since the examples of the ga-case are &quot;kare (he)&quot; and &quot;inu (dog),&quot; and since &quot;ojiisan (old man)&quot; is semantically similar to &quot;kare (he)&quot;, the proper referent is &quot;ojiisan (old man).&quot; These rules, which use semantic relationships to verbs, are also used in the estimation of the referent of demonstratives and personal pronouns.</Paragraph> <Paragraph position="11"> Rule using the feature that it is difficult for a noun phrase to be filled in multiple case components of the same verb Candidate enumerating rule 4 When there is &quot;Noun X&quot; in another case component of the verb which has the analyzed case component (the analyzed zero pronoun), {(Noun X, -20)} Rule using empathy This rule is based on empathy theory (Kameyama 86).</Paragraph> <Paragraph position="12"> When an anaphor is a ga-case zero pronoun whose verb is followed by an auxiliary verb such as &quot;kureru&quot; or &quot;kudasaru,&quot; the ni-case zero pronoun is analyzed first, and it is filled with the noun phrase that has high empathy such as the topic, and a ga-case zero pronoun is filled with another noun phrase. The points given in each rule are manually adjusted by using the training sentences. 
Training sentences {example sentences (43 sentences), a folk tale &quot;kobutori jiisan&quot; (Nakao 85) (93 sentences), an essay in &quot;tenseijingo&quot; (26 sentences), an editorial (26 sentences), an article in &quot;Scientific American (in Japanese)&quot; (16 sentences)} Test sentences {a folk tale &quot;tsuru no ongaeshi&quot; (Nakao 85) (91 sentences), two essays in &quot;tenseijingo&quot; (50 sentences), an editorial (30 sentences), articles in &quot;Scientific American (in Japanese)&quot; (13 sentences)}</Paragraph> </Section> </Section> </Paper>
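<!-- The scoring described by Candidate enumerating rule 1 and Candidate judging rule 1 in Section 3.1 can be sketched as follows. This is only a minimal sketch under stated assumptions: the class Candidate, the function names, and the weights in the usage lines are hypothetical, and the real weights W come from Tables 1 and 2, which are not reproduced in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    surface: str                      # candidate referent, e.g. "konpyuuta"
    role: str                         # "topic" or "focus"
    weight: int                       # W (hypothetical value; see Tables 1 and 2)
    distance: int                     # D: topics and foci between demonstrative and candidate
    markers: frozenset = frozenset()  # semantic markers such as "HUM" or "LOC"


def enumerating_score(c: Candidate) -> int:
    """Candidate enumerating rule 1: W - D - 2 for a topic, W - D + 4 for a focus."""
    if c.role == "topic":
        return c.weight - c.distance - 2
    if c.role == "focus":
        return c.weight - c.distance + 4
    raise ValueError(f"unknown role: {c.role}")


def judging_score(c: Candidate) -> int:
    """Candidate judging rule 1: a candidate with the marker HUM is given -10."""
    return -10 if "HUM" in c.markers else 0


def resolve(candidates: list) -> Candidate:
    """Pick the candidate with the highest total score."""
    return max(candidates, key=lambda c: enumerating_score(c) + judging_score(c))


# Usage with made-up weights for example (2): "sore (it)" should pick the computer.
taroo = Candidate("taroo", "topic", weight=20, distance=1, markers=frozenset({"HUM"}))
computer = Candidate("konpyuuta", "focus", weight=15, distance=0)
best = resolve([taroo, computer])
print(best.surface)
```

With these hypothetical weights, the human topic "taroo" scores 20 - 1 - 2 - 10 = 7 and the non-human focus "konpyuuta" scores 15 - 0 + 4 = 19, so "konpyuuta" is selected, in line with the outcome described for example (2). The other judging rules (semantic similarity points, LOC marker points) would add further terms to the same sum.
-->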