Robustness versus Fidelity in Natural Language Understanding

2 Our Architecture

Our NLU sub-system, NUBEE, is part of a tutorial dialogue system, BEETLE (Basic Electricity and Electronics Tutorial Learning Environment) (Zinn et al., forthcoming). BEETLE users are given tasks to perform in the circuit simulator pictured in figure 1. Users manipulate objects in this simulation and converse with the system through typed input. The typed input is sent to NUBEE, which queries the domain reasoner, BEER, and the dialogue history to help build a logical form, which it sends to the dialogue manager, the central component of BEETLE's response generation.

NUBEE uses an application-specific logical framework which closely resembles minimal recursion semantics (MRS) (Copestake et al., 1999). An example logical form is shown in figure 2.

[Figure 2: logical form for the example utterance (1) "connect the battery to the wire"]

The identifiers in square brackets define the handles for each of the three elementary predications (EPs). Handles are used by one EP to reference another EP. The first EP, connect′′, takes the handles of two EPs as arguments (the handles for battery′′ and wire′′). Arguments can be handles or atomic values. Note that we differentiate the definition of a handle from a handle reference by marking the former with square brackets. The two arguments to battery′′ are the atomic values defNP and singular. To simplify processing, we pass these syntactic features, defNP and singular, on for later processing rather than defining quantifiers such as "the".

In section 5, we describe our two-stage interpretation process. Predicates output by the first stage are marked with a prime (e.g., connect′) and predicates output by the second stage (such as those in figure 2) are marked with a double prime.
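To make this representation concrete, the following is a minimal sketch in Python of the three elementary predications in figure 2. The EP class and its field names are our own illustration rather than NUBEE's actual data structures, and wire′′'s arguments are assumed to parallel battery′′'s.

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class EP:
        handle: str         # defined in square brackets in figure 2, e.g., [h1]
        predicate: str      # double prime marks second-stage output
        args: List[str]     # handle references or atomic values

    logical_form = [
        EP("h1", "connect''", ["h2", "h3"]),           # arguments are the handles of the two EPs below
        EP("h2", "battery''", ["defNP", "singular"]),  # atomic syntactic features, passed on for later processing
        EP("h3", "wire''", ["defNP", "singular"]),     # assumed to parallel battery''
    ]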
Ideally, each EP and each of its individual atomic elements would have a confidence score (reflecting the sub-system's confidence that it captures the speaker's meaning) and a link back to the syntactic structures corresponding to the predicate or atomic element. Such a fine-grained representation would ensure that a dialogue system could separate low-fidelity elements from high-fidelity ones, and that all the speaker's lexical and grammatical choices were captured.

Building such a representation is difficult because the NLU process consists of a series of tasks: preprocessing (in a typed system, spelling correction and unknown word handling), syntactic and semantic analysis, and reference resolution. Backward links must be built across these processing steps, and each step introduces ambiguity (e.g., the parser will output multiple possibilities). In section 7, we see that the system's parser uses a packed representation for ambiguity. Such a representation is efficient, but the connection between individual syntactic structures and semantic structures is lost, meaning these links must be recreated post hoc.

NUBEE's architecture is shown in figure 3. The spelling correction, parsing, and post-processing components were built using the Carmel workbench (Rosé, 2000; Rosé et al., 2003). In our architecture for typed NLU, speech recognition is replaced by unknown word handling and spelling correction, which we discuss in sections 3 and 4. In these modules, it is relatively easy to calculate confidence scores and record the transformations made to the user input. These modules can make dramatic changes to the user input, so it is unclear why current NLU sub-systems do not track these transformations.

In section 5, we discuss the parsing and post-processing modules (parts of the Carmel workbench). We highlight why it is difficult to assign confidence scores to Carmel's output and to maintain links between logical predicates and the corresponding words typed by the user.

Section 6 discusses our reference resolution module and how it calculates confidence scores and records the transformations that it makes (from logical predicates to simulated objects in the domain reasoner). In section 7, we discuss how we calculate global confidence scores and link references to simulated objects back to the user's referring expressions.
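As a rough illustration of the bookkeeping this implies, the sketch below records each module's transformations together with a confidence score and derives a global score from them. The record fields and the multiplicative combination rule are our assumptions for illustration, not the scheme the paper develops in section 7.

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class Transformation:
        module: str        # e.g., "spelling" or "reference"
        source: str        # what the module saw, e.g., a misspelled word
        result: str        # what it produced, e.g., the corrected word
        confidence: float  # the module's confidence in this transformation

    def global_confidence(record: List[Transformation]) -> float:
        # Multiply per-module scores; an assumed, deliberately simple combination rule.
        score = 1.0
        for t in record:
            score *= t.confidence
        return score

    record = [
        Transformation("spelling", "conect", "connect", 0.9),
        Transformation("reference", "battery''", "sim-battery-1", 0.8),
    ]
    print(global_confidence(record))  # ~0.72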