<?xml version="1.0" standalone="yes"?> <Paper uid="H89-2009"> <Title>ANSWERS AND QUESTIONS: PROCESSING MESSAGES AND QUERIES*</Title> <Section position="4" start_page="60" end_page="61" type="metho"> <SectionTitle> MESSAGE PROCESSING AND QUESTION-ANSWERING </SectionTitle> <Paragraph position="0"> The PUNDIT natural language processing system was initially developed in the context of message-processing applications. Because PUNDIT is a modular system, typical porting tasks include the creation of a domain-specific lexicon, knowledge base, and semantics rules. Another common feature is basic extensions to handle constructions which are part of the standard written language, but which have not previously appeared in the data (e.g. comparatives, superlatives, address expressions). More interesting are the extensions occasioned by basic differences between messages and face-to-(inter)face conversational interaction. Basic Differences Between Messages and Voyager Dialogue: To fully characterize these differences requires representative data from naturally-occurring messages and task-oriented dialogue. We have the former; it is our plan to collect the latter, possibly using PUNDIT + VOYAGER as a data collection vehicle. For present purposes, we may highlight some of the observed differences and similarities. Our message corpora consist largely of short narratives in what has been called telegraphic style (cf. [GK86]). As a consequence, we find frequent agentless passives, noun-noun compounds, nominalizations, run-on sentences, and zeroing of determiners, subjects, copula, and prepositions. Explicit pronouns and dummy subjects are rare, as are interrogatives, indirect questions, imperatives, and conditionals. Where temporal relations are explicitly marked, they are commonly marked by preposed time adverbials. 
Modals are rare.</Paragraph> <Paragraph position="1"> In contrast, our initial corpus of user inputs to VOYAGER shows, not surprisingly, that interrogatives and imperatives are more frequent than statements. The types of zeroing seen in telegraphic narrative do not occur in the corpus (see footnote 2), nor do nominalizations and run-ons. Passives are rare. Dummy it and there are common, as are I and you, and (in the context of questions about restaurants) they. Preposing (other than wh-movement) does not occur. Modals are common. Many of these differences are predicted by considering the two genres to be at opposite ends along Biber's textual dimensions of 'involved versus informational production', 'narrative vs. non-narrative concerns', and 'abstract vs. non-abstract information' [Bib88]. We plan to take advantage of such differences to tune the system for the VOYAGER task.</Paragraph> <Paragraph position="2"> In terms of discourse structure, we might expect there to be radical differences between messages and dialogue, but in fact here there are interesting similarities. The basic interactional unit in VOYAGER dialogue can be characterized as the request-response pair (e.g. Where are you? At MIT.). A similar request-response structure appears in messages containing labelled discourse segments (or header fields), e.g. Failed Part: system tape. Cause of Failure: tape was wound backwards. Here, the message originator can be viewed as answering the implicit questions What part failed? What caused the failure? [Bal89] discusses our approach to handling such structured messages as a series of question/answer pairs, and we were able to extend this approach to dialogue. Footnote 2: Nevertheless, elliptical questions and answers are certainly seen in task-oriented dialogue, e.g. North: [It meant to] ... basically clear up the record. Nields: Did you? North: Tried to. Nields: Then? North: No ... [Scr87]</Paragraph> <Paragraph position="3"> The interactional structure of monologue and dialogue is, of course, quite different. To provide the control structure for dialogue, we extended a query front-end (QFE) which had been developed for database query applications. The result is a general dialogue manager which can be used for a variety of interactive applications.</Paragraph> </Section> <Section position="5" start_page="61" end_page="62" type="metho"> <SectionTitle> SYSTEM ARCHITECTURE </SectionTitle> <Paragraph position="0"> The system contains four major components: the VOYAGER front-end (VFE), PUNDIT, a query translation and interface module (QTIP), and the VOYAGER expert system. The first three components are currently running on Sun workstations under Quintus Prolog, and VOYAGER, which is written in Lisp, runs on a Symbolics machine. A simplified system flow diagram is shown in Figure 1.</Paragraph> <Paragraph position="2"> VFE is a dialogue manager, which uses PUNDIT and QTIP as resources to interpret and respond to the user's requests. As discussed below, VOYAGER is also a conversational participant, whose utterances must be analyzed and integrated into the discourse context. VFE administers the turn-taking structure, and maintains a higher-level model of the discourse than that available to PUNDIT. This level of knowledge enables it, for example, to call the parser in different modes, depending on preceding discourse (see below). VFE also keeps track of the current speaker and hearer, so that PUNDIT's Reference Resolution component can correctly interpret I and you.</Paragraph> <Paragraph position="3"> PUNDIT, as described in ([HPD+89], [PDP+86], [Dah86]), provides syntactic, semantic, and pragmatic interpretation. 
The input to PUNDIT is currently text, and the output is a set of semantic representations and other predications representing the discourse context (the DISCOURSE LIST), and a list of entities in focus, ordered by saliency (the FOCUS LIST).</Paragraph> <Paragraph position="4"> QTIP's function is to translate PUNDIT representations into Lisp function calls, to pass these to VOYAGER, and to return VOYAGER's response to VFE. QTIP also incorporates some knowledge about VOYAGER's capabilities which enables it to trap certain types of queries for appropriate action by VFE. For example, VOYAGER cannot answer direction requests with an unspecified starting point, unless it knows where the user is. In this case, QTIP informs VFE that it must elicit the user's location. As another example, VOYAGER cannot answer questions about whether a class of objects is located on a street. QTIP traps such questions, and VFE informs the user: User: Is there a subway station on Church Street? VFE: Sorry, Voyager can't determine whether something is on a street.</Paragraph> <Paragraph position="5"> QTIP also monitors the state of the machine-machine interface to VOYAGER, and notifies VFE when the link is down or VOYAGER is not loaded; VFE then notifies the user.</Paragraph> <Paragraph position="6"> The final component is the VOYAGER expert system, a version of which has been made available to us by MIT. VOYAGER includes a generation component, and one of our initial issues was how to deal with its output. Clearly, it would be a poor sort of interactive system that did not allow for ordinary anaphoric and definite reference to entities introduced in the course of the conversation, e.g. to Royal East in the example below: A: Is there a restaurant near here? B: ... 
The nearest restaurant to MIT is Royal East ...</Paragraph> <Paragraph position="7"> A: What kind of food does it serve? However, in order to integrate VOYAGER's conversational contribution into the discourse context, we were faced with the choice of (a) modifying VOYAGER to return some semantic representation of its utterances (together with information relevant to focusing), or (b) simply treating VOYAGER as an ordinary conversational participant, and using PUNDIT to analyze what VOYAGER said. The latter proved the more expedient course of action. Though not the most efficient approach in the long run, it has the unexpected benefit of allowing us to look at issues on both sides of the dialogue.</Paragraph> </Section> <Section position="6" start_page="62" end_page="63" type="metho"> <SectionTitle> MAINTAINING THE DISCOURSE CONTEXT </SectionTitle> <Paragraph position="0"> Interactive systems must provide for the strategies by which speakers refer to what has been previously discussed, e.g. one-anaphora: Q1. Is there a restaurant that serves seafood near MIT? ... Q2. Is there one near Harvard? A system which does not handle anaphoric and definite references requires unnatural behaviour on the part of the user, who will be forced to refer to all entities as though they were brand-new in the discourse, e.g. Q2': Q2'. Is there a restaurant that serves seafood near Harvard? But it is clearly not sufficient to allow reference only to what the user has said, for the system's response forms part of the user's discourse context, as illustrated above and below.</Paragraph> <Paragraph position="1"> These points, of course, are not specific to the application or domain, but are inherent in language and dialogue. For this reason, the discourse processing facilities of PUNDIT are largely embodied in the domain-independent Reference Resolution module. 
This module, which required no significant extensions for the VOYAGER application, is capable of handling a variety of referential strategies. In the following annotated dialogue, we illustrate some of the capabilities of Reference Resolution and the importance of understanding both sides of the dialogue.</Paragraph> </Section> <Section position="7" start_page="63" end_page="63" type="metho"> <SectionTitle> 1 User: 2 VFE: 3 Vgr: </SectionTitle> <Paragraph position="0"> Is there a restaurant near Harvard? I'll ask Voyager...</Paragraph> <Paragraph position="1"> I can't determine if objects are near something. I'll try to find the nearest one.</Paragraph> <Paragraph position="2"> The nearest restaurant to Harvard University is Border Cafe. Border Cafe is located at 32 Church Street.</Paragraph> <Paragraph position="3"> Comments: VOYAGER's response is processed by PUNDIT and integrated into the discourse context. As a result, the Border Cafe enters the list of entities in focus, and is thus available for anaphoric reference in (4) below.</Paragraph> <Paragraph position="4"> 4 User: What kind of food does it serve? 5 VFE: I'll ask Voyager...</Paragraph> <Paragraph position="5"> 6 Vgr: Border Cafe serves Mexican food.</Paragraph> <Paragraph position="6"> Comments: it: the user is referring to the Border Cafe, an entity introduced by VOYAGER in turn (3). Reference Resolution quickly finds this entity, which is at the head of the Focus List. 7 User: How do I get there? 8 VFE: I'll ask Voyager...</Paragraph> <Paragraph position="7"> Where are you? 9 User: At MIT.</Paragraph> <Paragraph position="8"> 10 VFE: Ok. I will tell Voyager...</Paragraph> <Paragraph position="9"> 11 Vgr: MIT is located in the area of 77 Massachusetts Avenue. I'll try to find the best way to get from MIT to Border Cafe. If your car is on the same side of the street as MIT, start driving. Take the next left onto Massachusetts Avenue. 
There is a traffic light there.</Paragraph> <Paragraph position="10"> After you cross Garden Street, take the next right onto Church Street. Border Cafe is about one eighth mile down on your left side. Comments: there in (7) is interpreted as an anaphoric reference to a salient location. Note the clarification dialogue (VOYAGER will need to know the starting point), where VFE becomes the querier. Since there is a salient wh-query in the context (Where are you?), VFE anticipates a short response and calls the parser and semantics in a special mode. The user's short response is analyzed and bound to the variable in the query (where), and VFE creates a new proposition: The user is at MIT. This information, together with the translation of the original query (How do I get there?) is passed to VOYAGER.</Paragraph> <Paragraph position="11"> Comments: the phone number is a definite reference to an inferrable entity (cf. [Pri81]). There is no previously mentioned telephone number, and yet it is a stereotypic assumption that certain classes of objects, e.g. commercial establishments, have phone numbers. This information is encoded in our knowledge base. Reference Resolution looks for previously-mentioned entities that have the property of having phone numbers, and finds the Border Cafe. Is there a subway stop near the restaurant? I'll ask Voyager...</Paragraph> <Paragraph position="12"> I can't determine if objects are near something.</Paragraph> <Paragraph position="13"> I'll try to find the nearest one.</Paragraph> <Paragraph position="14"> The nearest subway stop to Border Cafe is Harvard Station.</Paragraph> <Paragraph position="15"> Harvard Station is located at the intersection of Massachusetts Avenue and Church Street.</Paragraph> <Paragraph position="16"> Comments: the restaurant is a definite reference to the Border Cafe. 
Note that it would not be correct to look for the last explicit mention of a restaurant, for this algorithm would find the restaurant introduced in turn 1: Is there a restaurant near Harvard? Instead, Reference Resolution looks for the salient entity of type restaurant, and finds the Border Cafe.</Paragraph> </Section> </Paper>