<?xml version="1.0" standalone="yes"?> <Paper uid="E93-1037"> <Title>Resolving Zero Anaphora in Japanese</Title> <Section position="3" start_page="315" end_page="318" type="metho"> <SectionTitle> 2 Theory </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="315" end_page="315" type="sub_section"> <SectionTitle> 2.1 General </SectionTitle> <Paragraph position="0"> A discourse segment (DS) is a two-part structure consisting of a head and a body: the head is a nominal with a wa marking; the body is a set of sentences, each of which ends with a period. Note that an adjunctive clause is not a sentence here, since it ends with a connective such as -node 'because', -kara 'because/after', -to 'and then', etc. Formally, we assume that a sentence has the following analyses, which are given in the DCG formalism.</Paragraph> <Paragraph position="2"> C+ denotes one or more occurrences of clause, C* zero or more occurrences of clause, and N(pp:wa) denotes a wa-marked nominal; pp:wa specifies that the attribute pp (for postposition) has wa for the value (footnote 3). Let us define discourse segment by: (4) D -> S+. Footnote 2: [Hobbs, to appear] talks about the cognitive economy in understanding discourse: it says in effect that coherence is the result of minimizing the number of entities in discourse.</Paragraph> <Paragraph position="3"> Footnote 3: We take a wa-marked nominal to be a sentence adverbial. Thus our approach differs from the traditional gap analysis of topic construction [Kuroda, 1965; Inoue, 1978; Kitagawa, 1982; Gunji, 1987], which assumes that a wa- (continued below)</Paragraph> <Paragraph position="4"> and text by (5) T -> D+.</Paragraph> <Paragraph position="5"> As discussed in section 1, we choose to restrict D to containing at most one N(pp:wa). 
We implement the restriction by way of some additions to the rule set 3.</Paragraph> <Paragraph position="6"> (6) a S(head:X) -> C+, N(morph:X,pp:wa).</Paragraph> <Paragraph position="7"> b S(head:X) -> C*, N(morph:X,pp:wa), C+.</Paragraph> <Paragraph position="8"> Here, rule 6(a) takes care of inverted sentences and rule 6(b) of non-inverted sentences. The rule set 6 enforces unification between the head value and the morph value; morph represents the morphology of the nominal, so that morph:taro specifies that the associated nominal has the morphology &quot;taro&quot;. Notice that unification fails on a multiply headed segment: a head attribute, once instantiated to some value, will never unify with another. Unification, therefore, acts to limit each segment in the discourse to a single head. Note also that a non-headed discourse, that is, a discourse with no headed segments, has a legitimate DS analysis, for unification is possible between empty heads. The following lists the rules for DS Grammar.</Paragraph> <Paragraph position="10"/> </Section> <Section position="2" start_page="315" end_page="316" type="sub_section"> <SectionTitle> 2.2 Headed vs. Non-Headed Discourse </SectionTitle> <Paragraph position="0"> A discourse can be perfectly intelligible without an explicit topic or wa-nominal, which implies that a discourse segment need not be headed at all. It appears, however, that a discourse segment always comes out headed except when there is no head available in the text. In fact, a segment associates with a head nominal regardless of where the nominal occurs in that segment.</Paragraph> <Paragraph position="1"> (8) Taro&lt;i&gt; -wa 01&lt;i&gt; 02&lt;j&gt; seki -wo uzutte {top seat acc give} -ageta node, 01&lt;i&gt; 02&lt;j&gt; orei -wo {help because thank} iwareta. 01&lt;i&gt; chotto terekusa katta.</Paragraph> <Paragraph position="2"> {say pass slightly embarrassed cop} (Footnote 3, continued) nominal is dislocated from the sentence and leaves a gap behind. 
In fact the analysis meets some difficulty in accounting for the wa-nominal having semantic control over a set of period-marked sentences; cf. [Mikami, 1960]. Ours, however, is free from the problem, as we see below. (End of footnote 3.) Because Taro gave him/her a favor of giving a seat, he/she thanked Taro, who was slightly embarrassed.</Paragraph> <Paragraph position="3"> (9) 01&lt;i&gt; 02&lt;j&gt; seki-wo uzutte-ageta-node, Taro&lt;i&gt; -wa 01&lt;i&gt; 02&lt;j&gt; orei-wo iwareta.</Paragraph> <Paragraph position="4"> 01&lt;i&gt; chotto terekusak -atta.</Paragraph> <Paragraph position="5"> Because Taro gave him/her a favor of giving a seat, he/she thanked Taro, who was slightly embarrassed.</Paragraph> <Paragraph position="6"> (10) 01&lt;i&gt; 02&lt;j&gt; seki-wo uzutte-ageta-node, 01&lt;i&gt; 02&lt;j&gt; orei-wo iwareta. Taro&lt;i&gt; -wa 01&lt;i&gt; chotto terekusak -atta.</Paragraph> <Paragraph position="7"> Because Taro gave him/her a favor of giving a seat, he/she thanked Taro, who was slightly embarrassed.</Paragraph> <Paragraph position="8"> Examples 8, 9 and 10 each constitute a discourse segment headed with Taro (footnote 4). A discourse can be acceptable without any head at all: (11) 01&lt;i&gt; 02&lt;j&gt; seki wo uzutte ageta node, {seat acc give favor because} 01&lt;i&gt; 02&lt;j&gt; orei -wo iwareta. 01&lt;i&gt; {thanks acc say pass} chotto terekusa katta {slightly embarrassed cop} Because he/she gave him/her a favor of giving a seat, he/she thanked him/her, who was slightly embarrassed.</Paragraph> <Paragraph position="9"> The speaker of 11, or watashi 'I', would be the most likely antecedent for the elided subjects here; whoever gave the favor was thanked for the kindness. Let us say that a discourse is headed if each of its segments is headed, and non-headed otherwise. Our assumption is that a discourse is either headed or non-headed, and not both (e.g. figure 1, figure 2). 
(footnote 5) Formally, this will be expressed through the value for the head attribute.</Paragraph> <Paragraph position="10"> (12) T -> D(head:empty).</Paragraph> <Paragraph position="11"> An empty-headed discourse expands into one segment; its head value will be inherited by each of the S-trees down below. Footnote 4: [...] cases like 9 and 10, where the backward-looking center Taro refers back to an item in the previous discourse. Footnote 5: It is interesting to note that a multiple-head discourse may reduce to a single-head discourse. This happens when the discourse segments (DS) of a discourse share an identical head, say, Taro, and head-unify with each other. In fact, such a reduction is linguistically possible and widely attested. Our guess is that repeated use of the same wa-phrase may help the reader keep track of a coreferent for zero anaphora.</Paragraph> <Paragraph position="12"> Note that unification fails on the head value if any of the S's should be headed and thus specified for the head attribute.</Paragraph> <Paragraph position="13"> The following rule takes care of headed constructions.</Paragraph> <Paragraph position="14"> (13) T -> D+(head:.).</Paragraph> <Paragraph position="15"> The rule says that each of the segments has a non-null specification for the head attribute.</Paragraph> </Section> <Section position="3" start_page="316" end_page="317" type="sub_section"> <SectionTitle> 2.3 Minimal Semantics Thesis </SectionTitle> <Paragraph position="0"> The Minimal Semantics Thesis (MST) concerns the way zero pronouns are interpreted in the discourse segment; it involves an empirical claim that the segment's zeros are coreferential unless considerations of the empathy hierarchy (section 2.4) dictate to the contrary.</Paragraph> <Paragraph position="1"> (14) Kono ryori&lt;i&gt; wa saishoni 01&lt;i&gt; mizu {this food acc first water} wo irete kudasai. Tugini 01&lt;i&gt; sio {acc pour in imperative next salt} wo hurimasu. 5 hun sitekara, 01&lt;i&gt; {acc put-in min.} 
{after passing} niku wo iremasu.</Paragraph> <Paragraph position="2"> {meat acc add} As for this food, first pour in some water. Then put in salt. Add meat after 5 min.</Paragraph> <Paragraph position="3"> We see that 14 constitutes a single discourse segment. According to the minimal semantics thesis, all of the zeros in the segment are interpreted as coreferential, which is consistent with the reading we have for the example. Here is a more complex discourse.</Paragraph> <Paragraph position="4"> (15) Taro-wa 01&lt;i&gt; machi -ni itte, 01&lt;i&gt; huku {top town to go cloth} -wo katta. Masako&lt;j&gt; -wa 01&lt;k&gt; sono {acc bought top that} huku -wo tanjyobi -ni moratte, 01&lt;k&gt; {cloth acc birthday on got} totemo yoroko -n'da.</Paragraph> <Paragraph position="5"> {much rejoice past} Taro went downtown to buy a piece of clothing. Masako got it as a birthday present and was very happy.</Paragraph> <Paragraph position="6"> The first two zeros refer to Taro and the last two refer to Masako. This is exactly what the MST predicts: 15 breaks up into two discourse segments, one that starts with Taro-wa and the other that starts with Masako-wa, so the zeros within each segment become coreferential.</Paragraph> </Section> <Section position="4" start_page="317" end_page="318" type="sub_section"> <SectionTitle> 2.4 Empathy Hierarchy </SectionTitle> <Paragraph position="0"> It appears to be a fact about Japanese that the speaker of an utterance empathizes or identifies more with the subject than with the indirect object, and more with the indirect object than with the direct object [Kuno, 1987; Kuno and Kaburaki, 1977]. In fact, there are predicates in Japanese which are lexically specified to take an empathy-loaded argument; yaru 'give' and kureru 'receive' are two such. 
For yaru, the speaker empathizes with the subject, but, in the case of kureru, with the indirect object.</Paragraph> <Paragraph position="1"> The relevance of the speaker's empathy to the resolution problem is that an empathized entity becomes more salient than other elements in the discourse and is thus more likely to act as the antecedent for an anaphor.</Paragraph> <Paragraph position="2"> (16) Taro-ga Masako&lt;j&gt; -ni hon -wo katte {nom to book acc buy} -kureta. Imademo 01&lt;i&gt; sono hon -wo {helped still that book acc} daijini siteiru.</Paragraph> <Paragraph position="3"> {care keep} Taro gave Masako a favor in buying her a book. She still keeps it with care.</Paragraph> <Paragraph position="4"> In 16, 01, the subject of the second sentence, corefers with the indirect object Masako in the first sentence, which is assigned empathy by virtue of the verb kureta.</Paragraph> <Paragraph position="5"> Formally, we define the empathy hierarchy as a function with three arguments (footnote 6): empathy(Z1, Z2, Z3). Footnote 6: The definition is based on the observation that Japanese predicates take no more than three argument roles.</Paragraph> <Paragraph position="6"> With the definition at hand, we are able to formulate the lexical specification for kureru: V(empathy(Arg2, Arg1, Arg3), subject:Arg1, object2:Arg2, object:Arg3) -> [kureru].</Paragraph> <Paragraph position="7"> yaru has a formulation like the following: V(empathy(Arg1, Arg2, Arg3), subject:Arg1, object2:Arg2, object:Arg3) -> [yaru].</Paragraph> <Paragraph position="8"> Further, let us assume that variables in the empathy hierarchy represent zero pronouns. 
If a variable in the hierarchy is instantiated to some non-zero item, we will remove the variable from the hierarchy and move the items following it one position to the left; we might call this empathy shifting (footnote 7). Now consider the discourse: (17) 01&lt;i&gt; 02&lt;j&gt; hon -wo yatta -node, {book acc favored because} 01&lt;k&gt; 02&lt;a&gt; orei -wo iwareta.</Paragraph> <Paragraph position="9"> {gratitude acc say cop} 'Because he/she gave a book to him/her, he/she was thanked for it.' (18) a empathy(01&lt;i&gt;, 02&lt;j&gt;, _) b empathy(01&lt;k&gt;, 02&lt;a&gt;, _) 18(a) corresponds to the empathy hierarchy for the first clause in 17; 18(b) corresponds to the hierarchy for the second clause. Unifying the two structures gives us the correct result: namely, 01&lt;i&gt; = 01&lt;k&gt;, and 02&lt;j&gt; = 02&lt;a&gt;. Notice that zero items in the segment are all unified through the empathy hierarchy, which in effect realizes the Minimal Semantics Thesis. As it turns out, the MST reduces the number of semantically distinct zero pronouns for a discourse segment to at most three (figure 3). 
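To make empathy shifting and unification through the hierarchy concrete, a minimal Python sketch may help. It is only an illustration and not the DCG implementation described in this paper: the names shift and unify, the encoding of zero pronouns as strings beginning with 0, and the use of None for the anonymous slot _ are all assumptions made here. The sketch merges zeros position by position, which is a simplification of full term unification.

```python
from typing import List, Optional, Tuple

# None stands in for the anonymous variable "_" (an assumed encoding).
Slot = Optional[str]
Hierarchy = Tuple[Slot, Slot, Slot]


def is_zero(item: Slot) -> bool:
    """Assumed convention: zero pronouns are strings that start with '0'."""
    return item is not None and item.startswith("0")


def shift(hierarchy: Hierarchy) -> Hierarchy:
    """Empathy shifting: drop non-zero items and move what follows leftward."""
    kept = [item for item in hierarchy if item is None or is_zero(item)]
    padded = kept + [None] * (3 - len(kept))
    return (padded[0], padded[1], padded[2])


def unify(hierarchies: List[Hierarchy]) -> List[List[str]]:
    """Group zeros that occupy the same position after shifting."""
    classes: List[List[str]] = [[], [], []]
    for hierarchy in hierarchies:
        for position, item in enumerate(shift(hierarchy)):
            if item is not None and item not in classes[position]:
                classes[position].append(item)
    return [group for group in classes if group]


# Hypothetical encodings of the two hierarchies in (18).
first_clause: Hierarchy = ("01_i", "02_j", None)
second_clause: Hierarchy = ("01_k", "02_a", None)
coreference_classes = unify([first_clause, second_clause])
```

Under these assumptions, the two hierarchies in 18 yield two coreference classes, one per occupied position. Since a hierarchy has only three slots, no more than three classes can arise, which mirrors the bound stated above. 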
We conclude the section with a listing of the relevant DCG rules.</Paragraph> <Paragraph position="11"> Footnote 7: The empathy hierarchy here deals only with pronoun variables; we do not want two constant terms unifying via the hierarchy, which is doomed to failure.</Paragraph> </Section> <Section position="5" start_page="318" end_page="318" type="sub_section"> <SectionTitle> 3.1 Embedding and Interleaving </SectionTitle> <Paragraph position="0"> In this section, we will illustrate some of the ways in which T-structure figures in Japanese discourse (footnote 8). What we have below is a father talking about the health of his children.</Paragraph> <Paragraph position="1"> Chichioya&lt;i&gt; -wa 01&lt;i&gt; warat -te, {father top laugh and} &quot;Taro&lt;k&gt; -wa yoku kaze -wo hiku -n'desuyo.</Paragraph> <Paragraph position="2"> {Taro top often cold acc catch aux-polite} Kinou -mo 01&lt;k&gt; kaze -wo hi'ite, 01&lt;k&gt; {yesterday also cold acc catch} gakko -wo yasu -n'da-n'desuyo.</Paragraph> <Paragraph position="3"> {school acc take leave past aux-polite} Masako&lt;j&gt; -wa 01&lt;j&gt; gen'kide, 01&lt;j&gt; kaze {top healthy cold} -wo hi'ita koto -ga arimasen.</Paragraph> <Paragraph position="4"> {acc caught experience nom occur aux-neg} 01&lt;j&gt; itsumo sotode ason'de -imasuyo.&quot; {often outdoors play aux-polite} -to 01&lt;i&gt; itta.</Paragraph> <Paragraph position="5"> {comp said} &quot;Taro often catches a cold. He got one yesterday again and didn't go to school. Masako stays in good health and has never been sick with flu. I often see her playing outdoors.&quot; Father said with a smile on his face.</Paragraph> <Paragraph position="6"> Here are the facts: (a) zero anaphora occurring within the quotation (internal anaphora) are coreferential either with Taro or with Masako; (b) those occurring outside (external anaphora), however, all refer to chichioya; (c) chichioya has an anaphoric link which crosses over the entire quotation; (d) syntactically, the quoted portion functions as a complement for the verb -to itta. 
It appears, moreover, that an internal anaphor associates itself with Taro in case it occurs in the segment headed with Taro, and with Masako in case it occurs in the segment headed with Masako. Then, since the quoted discourse consists of a set of discourse segments, it will be assigned to a T-structure. But the structure does not extend over the part 01 itta, which completes the discourse, for the 01 corefers with chichioya, and with neither Taro nor Masako. This would give us an analysis like the one in figure 4.</Paragraph> <Paragraph position="7"> Footnote 8: Here and below we call a tree rooted at T a 'T-structure' and one rooted at D a 'D-structure'.</Paragraph> <Paragraph position="9"> The following discourse shows that the T-structure can be discontinuous: [a] &quot;Masako&lt;i&gt; -ga kinou sigoto-wo {nom yesterday work acc} yasun'da -n'desuyo.&quot; [b] Hahaoya&lt;k&gt; -wa {took leave aux-polite mother nom} 01&lt;k&gt; isu -ni suwaru -to 01&lt;k&gt; hanashi {chair on sit when tell} hazimeta [c] &quot;Kaze-demo 01&lt;i&gt; hi'ita -noka.&quot; {began, cold acc caught question} [d] -to Chichioya-ga 03&lt;k&gt; tazuneta.</Paragraph> <Paragraph position="10"> {comp father nom asked} &quot;Masako took leave from work yesterday,&quot; Mother began to tell, as she sat on the chair. &quot;Did she catch a cold?&quot; asked Father.</Paragraph> <Paragraph position="11"> 01&lt;i&gt; corefers with Masako, so [c] forms a T-structure with [a]. But the two are separated by a narrative [b]. Similarly, the coreference between 03&lt;k&gt; and Hahaoya gives rise to a T-structure that spans [d] and [b], but there is an interruption by narrative [c] (figure 5).</Paragraph> </Section> </Section> </Paper>