<?xml version="1.0" standalone="yes"?> <Paper uid="C94-2164"> <Title>A GRAMMAR AND A PARSER FOR SPONTANEOUS SPEECH</Title> <Section position="4" start_page="1014" end_page="1014" type="metho"> <SectionTitle> 2. Adjacency rule </SectionTitle> <Paragraph position="0"> Rule for VP-AUXV constructions, NP-particle constructions, etc.</Paragraph> <Paragraph position="2"> (M sem restric) = (A sem restric) U (H sem restric) M, A, and H correspond to mother, adjacent daughter, and head daughter. The head daughter's adjacent feature value is unified with the adjacent daughter's feature structure.</Paragraph> </Section> <Section position="5" start_page="1014" end_page="1014" type="metho"> <SectionTitle> 3. Adjunction rule </SectionTitle> <Paragraph position="0"> Rule for modifier-modifiee constructions.</Paragraph> <Paragraph position="2"> (M sem restric) = (A sem restric) U (H sem restric) M, A, and H correspond to mother, adjunct daughter (modifier), and head daughter (modifiee). The adjunct daughter's adjunct feature value is the feature structure for the head daughter.</Paragraph> </Section> <Section position="6" start_page="1014" end_page="1014" type="metho"> <SectionTitle> 2 A GRAMMAR FOR WRITTEN SENTENCES </SectionTitle> <Paragraph position="0"> Grat-J, a grammar for written sentences, is a unification grammar loosely based on Japanese phrase structure grammar (JPSG) (Gunji, 1986). Of the six phrase structure rules used in Grat-J, the three related to the discussion in the following sections are shown in Fig. 1 in a PATR-II-like notation (Shieber, 1986).1 Lexical items are represented by feature structures, an example of which is shown in Fig. 2.</Paragraph> <Paragraph position="1"> 1 Rules for relative clauses and for verb-phrase coordinations are not shown here.</Paragraph> <Paragraph position="3"> Grat-J-based parsers generate semantic representations in logical form in Davidsonian style.
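The rule schemata above synthesize the mother's logical form by aggregating the daughters' (sem restric) sets. A minimal sketch of that aggregation step, written in Python purely for illustration (the authors' parser is in Common Lisp; the data layout here is our assumption, with set union standing in for the rules' U operation):

```python
# Hypothetical sketch, not the authors' code: restrictions are sets of
# tuples, and applying a phrase structure rule takes the union of the
# two daughters' (sem restric) values, as the rule schemata specify.

def apply_rule(adjacent_restric, head_restric):
    """(M sem restric) = (A sem restric) U (H sem restric)."""
    return adjacent_restric | head_restric

# 'Taro ga' contributes {(taro *x)}; a transitive verb contributes
# Davidsonian event restrictions with agent and patient roles.
np_restric = {("taro", "*x")}
v_restric = {("love", "*e"), ("agent", "*e", "*x"), ("patient", "*e", "*y")}

mother = apply_rule(np_restric, v_restric)
```

The union is order-insensitive, which matches the idea that each rule application simply accumulates restrictions until a full logical form is synthesized.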
The semantic representation in each lexical item consists of a variable called an index (feature (sem index)) and restrictions placed on it (feature (sem restric)). Every time a phrase structure rule is applied, these restrictions are aggregated and a logical form is synthesized.</Paragraph> <Paragraph position="4"> For example, let us again consider 'aisuru' (love).</Paragraph> <Paragraph position="5"> If, in the feature structure for the phrase 'Taro ga' (Taro-NOM), the (sem index) value is *x and the (sem restric) value is {(taro *x)}, after the subcategorization rule is applied the (sem restric) value in the resulting feature structure for the phrase 'Taro ga aisuru' (Taro loves) is {(taro *x) (love *e) (agent *e *x) (patient *e *y)}.</Paragraph> <Paragraph position="6"> Grat-J covers such fundamental Japanese phenomena as subcategorization, passivization, interrogation, coordination, and negation, and also covers copulas, relative clauses, and conjunctions. We developed a parser based on Grat-J by using bottom-up chart parsing (Kay, 1980). Unification operations are performed by using constraint projection, an efficient method for unifying disjunctive feature descriptions (Nakano, 1991). The parser is implemented in Lucid Common Lisp ver. 4.0.</Paragraph> </Section> <Section position="7" start_page="1014" end_page="1014" type="metho"> <SectionTitle> 3 DISTINCTIVE PHENOMENA IN JAPANESE SPONTANEOUS SPEECH </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="1014" end_page="1014" type="sub_section"> <SectionTitle> 3.1 Classification of Phenomena </SectionTitle> <Paragraph position="0"> We analyzed 97 telephone dialogues (about 300,000 bytes) about using LaTeX to prepare documents and 26 dialogues (about 160,000 bytes) obtained from three radio listener call-in programs (Shimazu et al., 1993a).
We found that augmenting the grammars and analysis methods requires taking into account at least the following six phenomena in Japanese spontaneous speech.</Paragraph> <Paragraph position="1"> (p1) expressions peculiar to Japanese spontaneous speech, including fillers (or hesitations).</Paragraph> <Paragraph position="2"> (ex.) 'etto aru ndesukedomo ...' 'kono fairu tte ...' (well, we have them... this file is...) (p2) particle (case particle) omission (ex.) 'sore watashi yarimasu' (I will do it.) (p3) main verb ellipsis, or fragmentary utterances (ex.) 'aa, shinkansen de Kyoto kara.' (uh, from Kyoto by Shinkansen line.) (p4) repairing phrases (ex.) 'ano chosya no, chosya no arufabetto jun ni naranda, indekkusu nai?' (well, are there, aren't there indices ordered alphabetically by authors' names?) (p5) inversion (ex.) 'kopii shite kudasai, sono ronbun.' (That paper, please copy.) (p6) semantic mismatch of the theme/subject and the main verb (ex.) 'rikuesuto no uketsukejikan wa, 24-jikan jouji uketsuke teorimasu.' (The hours we receive your requests, they are received 24 hours a day.)</Paragraph> </Section> <Section position="2" start_page="1014" end_page="1014" type="sub_section"> <SectionTitle> 3.2 Treatment of the Phenomena by the Ensemble Model </SectionTitle> <Paragraph position="0"> These kinds of phenomena can be handled by the Ensemble Model. As described in Section 1, the Ensemble Model has syntactic, semantic, and pragmatic processing modules, as well as modules that combine some or all of those processes, to analyze the input in parallel and independently. Their output is unified, and even if some of the modules are unable to analyze the input, the other modules output their own results.
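This parallel, fail-soft arrangement can be sketched as follows (a hypothetical Python illustration; the module names, return values, and merging policy are our assumptions, not the Ensemble Model implementation):

```python
# Hypothetical sketch of the ensemble idea: every module analyzes the
# input independently; a module that cannot analyze it returns None (or
# raises), and the surviving results are collected, so one failure does
# not block the whole analysis.

def run_ensemble(utterance, modules):
    results = {}
    for name, analyze in modules.items():
        try:
            out = analyze(utterance)
        except Exception:
            out = None              # this module could not analyze the input
        if out is not None:
            results[name] = out     # collect the surviving analyses
    return results

modules = {
    "syntactic": lambda u: {"parse": u.split()},
    "semantic": lambda u: None,     # pretend this module fails here
    "pragmatic": lambda u: {"act": "request"},
}
analysis = run_ensemble("sore watashi yarimasu", modules)
```

Even with the semantic module failing on this particle-omitted input, the syntactic and pragmatic results are still produced.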
This makes the Ensemble Model robust.</Paragraph> <Paragraph position="1"> Moreover, even if some of the modules are unable to analyze the input in real time, the others output their results in real time.</Paragraph> <Paragraph position="2"> The Ensemble Model has been partially implemented: Ensemble/Trio-I consists of syntactic, semantic, and syntactic-semantic modules. It can handle (p2) above, as described in detail elsewhere (Shimazu et al., 1993b). Phenomena (p3) through (p6) can be partly handled by another implementation of the Ensemble Model, Ensemble/Quartet-I, which has a pragmatic processing module as well as the three modules of Ensemble/Trio-I. The pragmatic processing module uses plan and domain knowledge to handle not only well-structured sentences but also ill-structured sentences, such as those including inversion and omission (Kogure et al., 1994).</Paragraph> <Paragraph position="3"> To make the system more robust by enabling the syntactic and semantic processing modules to handle phenomena (p1) and (p3) through (p6), we incorporated Grass-J into those modules. Grass-J differs from Grat-J in two ways. First, Grass-J has lexical entries for expressions peculiar to spontaneous speech, so that it can handle (p1). Second, because sentence boundaries are not clear in spontaneous speech, it uses the concept of utterance unit (Shimazu et al., 1993a) instead of the sentence. This allows it to handle phenomena (p3) through (p6). For example, an inverted sentence can be handled by decomposing it, at the point where the inversion occurs, into two utterance units.</Paragraph> <Paragraph position="4"> Fig. 3 shows the architecture of Ensemble/Quartet-I. Each processing module is based on the bottom-up chart analysis method (Kay, 1980) and a disjunctive feature description unification method called constraint projection (Nakano, 1991).
</Paragraph> <Paragraph position="5"> The syntactic-semantic processing module uses Grass-J, the syntactic processing module uses Grass-J without semantic constraints such as sortal restriction, the semantic processing module uses Grass-J without syntactic constraints such as case information, and the pragmatic processing module uses a plan-based grammar. [Fig. 4, Dialogue 1, excerpt — A: 1 'anoo kisoken eno ikikata o desune' (well, how to go to the Basic Research Labs.)]</Paragraph> </Section> </Section> <Section position="8" start_page="1014" end_page="1018" type="metho"> <SectionTitle> 4 A GRAMMAR FOR SPONTANEOUS SPEECH </SectionTitle> <Paragraph position="0"> This section describes Grass-J.</Paragraph> <Section position="1" start_page="1014" end_page="1016" type="sub_section"> <SectionTitle> 4.1 Processing Units </SectionTitle> <Paragraph position="0"> 'Sentence' is used as the start symbol in grammars for written languages, but sentence boundaries are not clear in spontaneous speech. 'Sentence' therefore cannot be used as the start symbol in grammars for spontaneous speech. Many studies, though, have shown that utterances are composed of short units (Levelt, 1989: pp. 23-24) that need not be sentences in written language. Grass-J uses such units instead of sentences. Consider, for example, Dialogue 1 in Fig. 4. Utterances 1 and 3 cannot be regarded as sentences in written language. Let us, however, consider 'hai' in Utterance 2. It expresses participant B's confirmation of the contents of Utterance 1. 2 Each utterance in Dialogue 1 can thus be considered to be a speech act (Shimazu et al., 1993a). These utterances are processing units we call &quot;utterance units&quot;. They are used in Grass-J instead of the sentences used in Grat-J. One feature of these units is that 'hai' can be
interjected by the hearer at the end of the unit.</Paragraph> <Paragraph position="1"> The boundaries for these units can be determined by using pauses, linguistic clues described in the next section, syntactic form, and so on. In using syntactic form to determine utterance unit boundaries, Grass-J first stipulates what an utterance unit actually is. This stipulation is based on an investigation of dialogue transcripts, and in the current version of Grass-J, the following syntactic constituents are recognized as utterance units.</Paragraph> <Paragraph position="2"> * verb phrases (including auxiliary verb phrases and adjective phrases) that may be followed by 2 The roles of 'hai', an interjectory response corresponding to a back-channel utterance such as uh-huh in English but which occurs more frequently in Japanese dialogue, are discussed in Shimazu et al. (1993a) and Katagiri (1993).</Paragraph> <Paragraph position="3"> used to derive speech act representation from the logical form of these constituents. A Grass-J-based parser inputs an utterance unit and outputs the representation of the speech act performed by the unit, which is then input to the discourse processing system.</Paragraph> <Paragraph position="4"> Consider the following simple dialogue.</Paragraph> <Paragraph position="5"> A: 1 'genkou o' (manuscript-ACC) The logical form for Utterance 1 is ((manuscript *x)), so that its resulting speech act representation is</Paragraph> <Paragraph position="7"> or, as written in usual notation, (2) Refer(speaker, ?x:manuscript(?x)).</Paragraph> <Paragraph position="8"> In the same way, the speech act representation for Utterance 3 is (3) Request(speaker, hearer, send(hearer, speaker, ?y)). The discourse processor would find that ?x in (2) is the same as ?y in (3). A detailed explanation of this discourse processing is beyond the scope of this paper.
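The construction of a surface referring act from a logical form, and the identification the discourse processor would make between ?x in (2) and ?y in (3), can be sketched as follows (a hypothetical Python illustration following the notation of (2) and (3) above; the helper names and tuple encoding are ours, not Grass-J's):

```python
# Hypothetical sketch: a noun-phrase utterance unit whose logical form
# is a single restriction on a variable yields a surface referring act,
# Refer(speaker, ?x: restriction(?x)).

def referring_act(speaker, var, restric):
    return ("Refer", speaker, (var, restric))

# Utterance 1: logical form ((manuscript *x)) -> representation (2).
act1 = referring_act("A", "?x", ("manuscript", "?x"))

def corefer(refer_act, other_restric):
    # The discourse processor would identify the referred variable with
    # another variable whose restriction predicate is compatible.
    return refer_act[2][1][0] == other_restric[0]

# ?y in Request(...send(hearer, speaker, ?y)) carries the same
# 'manuscript' restriction, so the two variables are identified.
same = corefer(act1, ("manuscript", "?y"))
```

This is only the matching step; the paper leaves the full discourse processing out of scope, and so do we.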
3 'Refer' stands for the surface referring act in Allen and Perrault (1980).</Paragraph> </Section> <Section position="2" start_page="1016" end_page="1018" type="sub_section"> <SectionTitle> 4.2 Treatment of Expressions Peculiar to Spontaneous Speech </SectionTitle> <Paragraph position="0"> Classification. The underlined words in Dialogue 1 in Fig. 4 do not normally appear in written sentences. We analyzed the dialogue transcripts to identify expressions that frequently appear in spoken sentences (which include spontaneous speech) but that do not appear in written sentences, and we classified them as follows.</Paragraph> <Paragraph position="1"> 1. words phonologically different from those in written sentences (words in parentheses are the corresponding written-sentence words) (ex.) 'shinakya' ('shinakereba', if someone does not do), 'shichau' ('shiteshimau', have done) 2. fillers (or hesitations, such as well in English) (ex.) 'etto', 'anoo' 3. particles peculiar to spoken language (ex.) 'tte', 'nante', 'toka' 4. interjectory particles (words inserted interjectorily after noun phrases and adverbial/adnominal-form verb phrases) (ex.) 'ne', 'desune', 'sa' 5. expressions introducing topics (ex.) '(na)ndesukedo', '(na)ndesukedomo', '(na)ndesuga' 6. words appearing after main verb phrases (these words take the sentence-final form of verbs/auxiliary verbs/adjectives) (ex.) 'yo', 'ne', 'yone', 'keredo', 'kedo', 'keredomo', 'ga', 'kedomo', 'kara' Nagata and Kogure (1990) addressed Japanese sentence-final expressions peculiar to spoken Japanese sentences but did not deal with all the spontaneous speech expressions listed above. These expressions may be analyzed morphologically (Takeshita & Fukunaga, 1991). Because some expressions peculiar to spontaneous speech do not affect the propositional content of the sentences, disregarding those expressions might be a way to process spontaneous speech.
Such cascaded processing of morphological analysis followed by syntactic and semantic analysis, however, disables the incremental processing required for real-time dialogue understanding. Another approach is to treat these kinds of expressions as extra, 'noisy' words. Although this can be done by using a robust parsing technique, such as the one developed by Mellish (1989), it requires the sentence to be processed more than two times, and it is therefore not suitable for real-time dialogue understanding. In Grass-J these expressions are handled in the same way as expressions appearing in written language, so no special techniques are needed. Words phonologically different from corresponding words in written language. The words 'teru' and 'ndesu' in 'shitteru ndesu ka' (do you know that?) correspond semantically to 'teiru' and 'nodesu' in written sentences. We investigated such words in the dialogue data (Fig. 5). One way to handle these words is to translate them into their corresponding written-language words, but because this requires several steps it is not suitable for incremental dialogue processing. We therefore regard these words as independent of their corresponding words in written language, even though their lexical entries have the same content.</Paragraph> <Paragraph position="2"> Fillers. Fillers such as 'anoo' and 'etto', which roughly correspond to well in English, appear frequently in spontaneous speech (Arita et al., 1993) and do not affect the propositional content of the sentences in which they appear.4 One way to handle them is to disregard them after morphological analysis is completed. As noted above, however, such an approach is not suitable for dialogue processing. We therefore treat them directly in parsing.</Paragraph> <Paragraph position="3"> In Grass-J, fillers modify the following words, whatever their grammatical categories are.
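A minimal sketch of this treatment (our own simplification in Python, not Grass-J's phrase structure rule): a filler attaches to the following word, whatever its category, and contributes nothing to the logical form.

```python
# Hypothetical sketch: fillers are attached to the next constituent in
# a single left-to-right pass, mimicking the rule that a filler
# modifies the following word regardless of its category.

FILLERS = {"etto", "anoo"}

def attach_fillers(words):
    """Return (word, attached_fillers) pairs; fillers add no semantics."""
    out = []
    pending = []
    for w in words:
        if w in FILLERS:
            pending.append(w)           # filler modifies the next word
        else:
            out.append((w, tuple(pending)))
            pending = []
    return out

parsed = attach_fillers("etto 400-yen desu".split())
```

Here '400-yen' carries the filler 'etto', and since fillers contribute an empty (sem restric), the logical form is the same as that of '400-yen desu'.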
The feature structure for fillers is as follows.</Paragraph> <Paragraph position="4"> head [pos interjection]</Paragraph> <Paragraph position="6"> The value of the feature lexical is either + or -: it is + in lexical items and - in feature structures for phrases composed, by phrase structure rules, of subphrases. Because these words do not affect propositional contents, the value of the feature (sem restric) is empty.</Paragraph> <Paragraph position="7"> For example, let us look at the parse tree for 'etto 400-yen desu' (well, it's 400 yen). Symbols I (Interjection), NP, and VP are abbreviations for the complex feature structures.</Paragraph> <Paragraph position="8"> 4 Although Sadanobu and Takubo (1993) investigated the discourse management function of fillers, we do not discuss it here.</Paragraph> <Paragraph position="9"> [Fig. 5] 1. expressions related to aspect: teku (teiku in written language), teru (teiru), chau (teshimau), etc.</Paragraph> <Paragraph position="10"> 2. expressions related to the topic marker 'wa': cha (tewa), chaa (tewa), ccha (tewa), ja (dewa), etc. 3. expressions related to the conjunctive particle 'ba': nakerya (nakereba), nakya (nakereba), etc.</Paragraph> <Paragraph position="11"> 4. expressions related to formal nouns: n (no), mon (mono), toko (tokoro), etc.</Paragraph> <Paragraph position="12"> 5. demonstratives: kocchi (kochira), korya (korewa), so (sou), soshitara (soushitara), sokka (souka), socchi (sochira), son (sono), soreja (soredewa), sorejaa (soredewa), etc. 6. expressions related to the interrogative pronoun nani: nanka (nanika), nante (nanito), etc.</Paragraph> <Paragraph position="13"> 7. other The filler 'etto' modifies the following word '400-yen', and the logical form of the sentence is the same as that of '400-yen desu'.</Paragraph> <Paragraph position="14"> Particles peculiar to spoken language. Words such as 'tte' in 'Kyoto tte Osaka no tsugi no eki desu yone' (Kyoto is the station next to Osaka, isn't it?)
work in the same way as case-marking/topic-marking particles. Because they have no corresponding words in written language, lexical entries for them are required. These words do not correspond to any specific surface case, such as 'ga' and 'o'. Like the topic marker 'wa', the semantic relationships they express depend on the meaning of the phrases they connect. Interjectory particles. Interjectory particles, such as 'ne' and 'desune', follow noun phrases and adverbial/adnominal-form verb phrases, and they do not affect the meaning of the utterances. The interjectory particle 'ne' differs from the sentence-final particle 'ne' in the sense that the latter follows sentence-final-form verb phrases. These kinds of words can be treated by regarding them as particles following noun phrases and verb phrases. The following is the feature structure for these words. The interjectory particles indicate the end of utterance units; they do not appear in the middle of utterance units. They function as, so to speak, utterance-unit-final particles. Therefore, a noun phrase followed by an interjectory particle forms a (surface) referring speech act in the same way as noun phrase utterances do. Interjectory particles add nothing to logical forms. For example, the speech act representation of 'genkou o desune' is the same as (2) in Section 4.1. Expressions introducing topics. As in Utterance 4 of Dialogue 1, an expression such as '(na)ndesukedo(mo)' frequently appears in dialogues, especially in the beginning. This expression introduces a new topic. One way to handle an expression such as this is to break it down into na + ndesu + kedo + mo. This process, however, prevents the system from detecting its role in topic introduction. We therefore consider each of these expressions to be one word. The reason these expressions are used is to make a topic explicit by introducing a discourse referent (Thomason, 1990).
Consequently, an 'introduce-topic' speech act is formed. These expressions indicate the end of an utterance unit, as an interjectory particle does.</Paragraph> <Paragraph position="15"> Words appearing after main verb phrases. It has already been pointed out that sentence-final particles, such as 'yo' and 'ne', frequently appear in spoken Japanese sentences (Kawamori, 1991). Conjunctive particles, such as 'kedo' and 'kara', are also used as sentence-final particles (Hosaka et al., 1991), and they are treated as such in Grass-J. They perform the function of anticipating the hearer's reaction, as a trial expression does (Clark & Wilkes-Gibbs, 1990). They also indicate the end of utterance units.</Paragraph> </Section> </Section> <Section position="9" start_page="1018" end_page="1018" type="metho"> <SectionTitle> 5 ANALYSIS EXAMPLES </SectionTitle> <Paragraph position="0"> Below we show results obtained by using a Grass-J-based parser to analyze some of the utterances in Dialogue 1. UU means the utterance unit category.</Paragraph> <Paragraph position="1"> Utterance 1: 'anoo kisoken eno ikikata o desune' (well, how to go to the Basic Research Labs.) parse tree:</Paragraph> </Section> </Paper>