<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-0908">
  <Title>Towards Light Semantic Processing for Question Answering</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 System Components
</SectionTitle>
    <Paragraph position="0"> The JAVELIN system consists of four basic components: a question analysis module, a retrieval engine, a passage analysis module (supporting both statistical and NLP techniques), and an answer selection module. JAVELIN also includes a planner module, which supports feedback loops and finer control over specific components (Nyberg et al., 2003). In this paper we are concerned with the two components which support linguistic analysis: the question analysis and passage understanding modules (Question Analyzer and Information Extractor, respectively).</Paragraph>
    <Paragraph position="1"> The relevant aspects of syntactic processing in both modules are presented in Section 3, whereas the semantic representation is introduced in Section 4.</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Parsing
</SectionTitle>
    <Paragraph position="0"> The system employs two different parsing techniques: a chart parser with hand-written grammars for question analysis, and a lexicalized, broad coverage skipping parser for passage analysis. For question analysis, parsing serves two goals: to identify the finest answer focus (Moldovan et al., 2000; Hermjakob, 2001), and to produce a grammatical analysis (f-structure) for questions.</Paragraph>
    <Paragraph position="1"> Due to the lack of publicly available parsers which have suitable coverage of question forms, we have manually developed a set of grammars to achieve these goals. On the other hand, the limited coverage and ambiguity in these grammars made adopting the same approach for passage analysis inefficient. Consequently, we use two distinct parsers which provide two syntactic representations, including grammatical functions. These syntactic structures are then transformed into a common semantic representation discussed in Section 4.</Paragraph>
    <Paragraph position="3"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Questions
</SectionTitle>
      <Paragraph position="0"> The question analysis consists of two steps: lexical processing and syntactic parsing. For the lexical processing step, we have integrated several external resources: the Brill part-of-speech tagger (Brill, 1995), BBN IdentiFinder (BBN, 2000) (to tag named entities such as proper names, time expressions, numbers, etc.), WordNet (Fellbaum, 1998) (for semantic categorization), and the KANTOO Lexifier (Nyberg and Mitamura, 2000) (to access a syntactic lexicon for verb valence information).</Paragraph>
      <Paragraph position="1"> The hand-written grammars employed in the project are based on the Lexical Functional Grammar (LFG) formalism (Bresnan, 1982), and are used with the KANTOO parser (Nyberg and Mitamura, 2000). The parser outputs a functional structure (f-structure) which specifies the grammatical functions of question components, e.g., subject, object, adjunct, etc. As illustrated in Fig. 1, the resulting f-structure provides a deep, detailed syntactic analysis of the question.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Passages
</SectionTitle>
      <Paragraph position="0"> Passages selected by the retrieval engine are processed by the Link Grammar parser (Grinberg et al., 1995). The parser uses a lexicalized grammar which specifies links, i.e., grammatical functions, and provides a constituent structure as output. The parser covers a wide range of syntactic constructions and is robust: it can skip over unrecognized fragments of text, and is able to handle unknown words.</Paragraph>
      <Paragraph position="1"> An example of the passage analysis produced by the Link Parser is presented in Fig. 2. Links are treated as predicates which relate various arguments. For example, O in Fig. 2 indicates that Wendy's is an object of the verb founded. In parallel to the Link parser, passages are tagged with the BBN IdentiFinder (BBN, 2000), in order to group together multi-word proper names such as R. David Thomas.</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 Semantic Representation
</SectionTitle>
    <Paragraph position="0"> At the core of our linguistic analysis is the semantic representation, which bridges the distinct representations of the functional structure obtained for questions and passages. Although our semantic representation is quite simple, it aims at providing the means of understanding and processing broad-coverage linguistic data. The representation uses the following main constructs:2 a formula is a conjunction of literals and represents the meaning of the entire sentence (or question); a literal is a predicate relation over two terms; in particular, we distinguish two types of literals: an extrinsic literal, which relates a label to a label, and an intrinsic literal, which relates a label to a word. (Footnote 2: The use of terminology common in the field of formal logic is aimed at providing an intuitive understanding to the reader, but is not meant to give the impression that our work is built on a firm logic-theoretic framework.)</Paragraph>
    <Paragraph position="2"> A predicate is used to capture relations between terms; a term is either a label, i.e., a variable which refers to a specific entity or an event, or a word, which is either a single word (e.g., John) or a sequence of words separated by whitespace (e.g., for proper names such as John Smith).</Paragraph>
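As a rough illustration of these constructs (the class and field names below are our own, not the JAVELIN implementation), a formula can be modelled as a list of literals over labels and words:

```python
# Hedged sketch of the representation's main constructs: a formula is a
# conjunction (list) of literals; a literal is a predicate over two terms;
# the label-vs-word distinction separates extrinsic from intrinsic literals.
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Label:          # a variable referring to a specific entity or event
    name: str

@dataclass(frozen=True)
class Word:           # a single word or whitespace-separated word sequence
    text: str

@dataclass(frozen=True)
class Literal:
    predicate: str
    arg0: Label
    arg1: Union[Label, Word]

    @property
    def extrinsic(self) -> bool:
        # extrinsic: label-to-label; intrinsic: label-to-word
        return isinstance(self.arg1, Label)

formula = [
    Literal("SUBJECT", Label("x1"), Label("x2")),      # extrinsic literal
    Literal("ROOT", Label("x2"), Word("John Smith")),  # intrinsic literal
]
print([lit.extrinsic for lit in formula])  # [True, False]
```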
    <Paragraph position="3"> The BNF syntax corresponding to this representation is given in (1).</Paragraph>
    <Paragraph position="5"> With the exception of the unary ANS predicate which indicates the sought answer, all predicates are binary relations (see examples in Fig. 3). Currently, most predicate names are based on grammatical functions (e.g., SUBJECT, OBJECT, DET) which link events and entities with their arguments. Unlike in (Moldovan et al., 2003), names of predicates belong to a fixed vocabulary, which provides a more sound basis for a formal interpretation.</Paragraph>
    <Paragraph position="6"> Names of labels and terms are restricted only by the syntax in (1). Examples of semantic representations for the question When was Wendy's founded? and the passage R. David Thomas founded Wendy's in 1969. are shown in Fig. 4.</Paragraph>
    <Paragraph position="7"/>
    <Paragraph position="8"> Note that our semantic representation reflects the 'canonical' structure of an active sentence. This design decision was made in order to eliminate structural differences between semantically equivalent structures. Hence, at the semantic level, all passive sentences correspond to their equivalents in the active form. The semantic representation of questions is not always derived directly from the f-structure. For some types of questions, e.g., definition questions, specialized (dedicated) grammars are used, which allows us to more easily arrive at an appropriate representation of meaning.</Paragraph>
    <Paragraph position="10"> Also, in the preliminary implementation of the unification algorithm (see Section 5), we have adopted some simplifying assumptions, and we do not incorporate sets in the current representation.</Paragraph>
    <Paragraph position="11"> The present formalism can quite successfully handle questions (or sentences) which refer to specific events or relations. However, it is more difficult to represent questions like What is the relationship between Jesse Ventura and Target Stores?, which seek a relation between entities or a common event they participated in. In the next section, we discuss the unification scheme which allows us to select answer candidates based on the proposed representation.</Paragraph>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
5 Fuzzy Unification
</SectionTitle>
    <Paragraph position="0"> A unification algorithm is required to match question representations with the representations of extracted passages which might contain answers. Using a precursor</Paragraph>
    <Paragraph position="2"> [Table: example predicates. ROOT(x13,|John|): the root form of entity/event x13. APPOSITION: apposition (&quot;John, a student of CMU&quot;) and the equality operator in copular sentences (&quot;John is a student of CMU&quot;). ATTRIBUTE(x1,x3): x3 is an adjective modifier of x1, covering adjective-noun (&quot;stupid John&quot;) and copular constructions (&quot;John is stupid&quot;).]</Paragraph>
      <Paragraph position="4"> to the representation presented above, we constructed an initial prototype using a traditional theorem prover (Kalman, 2001). Answer extraction was performed by attempting a unification between logical forms of the question and retrieved passages. Early tests showed that a unification strategy based on a strict boolean logic was not as flexible as we desired, given the lack of traditional domain constraints that one normally possesses when considering this type of approach. Unless a retrieved passage exactly matched the question, as in Fig. 4, the system would fail due to lack of information. For instance, knowing that &quot;Benjamin killed Jefferson.&quot; would not answer the question &quot;Who murdered Jefferson?&quot;, using a strict unification strategy.</Paragraph>
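The failure mode can be sketched as follows. The graded score of 0.8 for the kill/murder pair echoes the worked example in Section 5.3; the similarity table and function names here are illustrative stand-ins, not the real WordNet metric.

```python
# Hedged sketch: strict unification demands identical words, so kill/murder
# scores 0; a graded word similarity keeps the candidate alive.
WORD_SIM = {("kill", "murder"): 0.8}  # assumed value, echoing sim[|kill|,|murder|]

def word_sim(w1, w2):
    """Graded similarity: exact match scores 1.0, known pairs get a partial score."""
    if w1 == w2:
        return 1.0
    return WORD_SIM.get((w1, w2), WORD_SIM.get((w2, w1), 0.0))

def strict_unify(w1, w2):
    """Boolean unification at the word level: all-or-nothing."""
    return w1 == w2

print(strict_unify("kill", "murder"))  # False: strict matching fails outright
print(word_sim("kill", "murder"))      # 0.8: fuzzy matching gives partial credit
```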
    <Paragraph position="5"> This has led to more recent experimentation with probabilistic models that perform what we informally refer to as fuzzy unification.3 The basic idea of our unification strategy is to treat relationships between question terms as a set of weighted constraints. The confidence score assigned to each extracted answer candidate is related to the number of constraints the retrieved passage satisfies, along with a measure of similarity between the relevant terms.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.1 Definitions
</SectionTitle>
      <Paragraph position="0"> In this section, we present definitions which are necessary for discussion of the similarity measure employed by our fuzzy unification framework.</Paragraph>
      <Paragraph position="1"> Given a user query Q, where Q is a formula, we retrieve a set of passages P. Our task is to find the best passage Pbest ∈ P from which an answer candidate can be extracted. An answer candidate exists within a passage P if the result of a fuzzy unification between Q and P results in the single term of ANS(x0) being ground in a term from P.</Paragraph>
      <Paragraph position="2"> (2) Pbest = argmax_{P ∈ P} sim(Q, P). The restriction that an answer candidate must be found within a passage P must be made explicit, as our notion of fuzzy unification is such that a passage can unify against a query with a non-zero level of confidence even if one or more constraints from the query are left unsatisfied. Since the final goal is to find and return the best possible answer, we are not concerned with those passages which seem highly related yet do not offer answer candidates.</Paragraph>
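The selection in (2), restricted to passages that actually ground ANS(x0), can be sketched as below. The function names and the stub scores are our own assumptions for illustration.

```python
# Hedged sketch of (2): choose the passage maximising sim(Q, P), but only
# among passages offering an answer candidate. Helper names are assumed.
def best_passage(question, passages, sim, has_answer_candidate):
    """Return argmax over passages of sim(Q, P), filtered to those that
    ground the ANS term; None if no passage yields a candidate."""
    candidates = [p for p in passages if has_answer_candidate(question, p)]
    return max(candidates, key=lambda p: sim(question, p), default=None)

# Toy usage with stub similarity scores (illustrative values only).
sims = {"p1": 0.4, "p2": 0.9}
print(best_passage("q", ["p1", "p2"],
                   sim=lambda q, p: sims[p],
                   has_answer_candidate=lambda q, p: True))  # p2
```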
      <Paragraph position="3"> In Section 4, we introduced extrinsic literals where predicates serve as relations over two labels. Extrinsic literals can be thought of as relations defined over distinct entities in our formula.3 (Footnote 3: Fuzzy unification in a formal setting generally refers to a unification framework that is employed in the realm of fuzzy logics. Our current representation is of an ad-hoc nature, but our usage of this term does foreshadow future progression towards a representation scheme dependent on such a formal, non-boolean model.)</Paragraph>
      <Paragraph position="4"> For example, SUBJECT(x1,x2) is an extrinsic literal, while ROOT(x1,|Benjamin|) is not. The latter has been defined as an intrinsic literal in Section 4 and it relates a label and a word.</Paragraph>
      <Paragraph position="5"> This terminology is motivated by the intuitive distinction between intrinsic and extrinsic properties of an entity in the world. We use this distinction as a simplifying assumption in our measurements of similarity, which we will now explain in more detail.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.2 Similarity Measure
</SectionTitle>
      <Paragraph position="0"> Given a set of extrinsic literals PE and QE from a passage and the question, respectively, we measure the similarity between QE and a given ordering of PE as the geometric mean of the similarity between each pair of extrinsic literals from the sets QE and PE.</Paragraph>
      <Paragraph position="1"> Let O be the set of all possible orderings of PE, O an element of O, QE_j the j-th literal of QE, and O_j the j-th literal of ordering O. Then: (3) sim(QE, PE) = max_{O ∈ O} (∏_{j=0}^{n} sim(QE_j, O_j))^{1/n}.</Paragraph>
      <Paragraph position="3"> The similarity of two extrinsic literals, lE and lE', is computed as the square root of the product of the similarity scores of each pair of labels, multiplied by the weight of the given literal, depending on the equivalence of the predicates p, p' of the respective literals lE, lE'. If the predicates are not equivalent, we rely on the engineer's tactic of assigning an epsilon value ε of similarity, where ε is lower than any possible similarity score.4 Note that the similarity score is still dependent on the weight of the literal, meaning that failing to satisfy a heavier constraint imposes a greater penalty than if we fail to satisfy a constraint of lesser importance.</Paragraph>
      <Paragraph position="4"> Let t_j and t'_j be the respective j-th terms of lE and lE'.</Paragraph>
      <Paragraph position="5"> Then: (4) sim(lE, lE') = weight(lE) · (sim(t_0, t'_0) · sim(t_1, t'_1))^{1/2} if p = p', and ε · weight(lE) otherwise.</Paragraph>
      <Paragraph position="7"> The weight of a literal is meant to capture the relative importance of a particular constraint in a query. In standard boolean unification the importance of a literal is uniform, as any local failure dooms the entire attempt.5 In a non-boolean framework the importance of one literal vs.</Paragraph>
      <Paragraph position="8"> another becomes an issue. As an example, given a question concerning a murder we might be more interested in the suspect's name than in the fact that he was tall. This</Paragraph>
      <Paragraph position="9"> idea is similar to that commonly seen in information retrieval systems which place higher relative importance on terms in a query that are judged a priori to possess higher information value. While our prototype currently sets all literals with a weight of 1.0, we are investigating methods to train these weights to be specific to question type.</Paragraph>
      <Paragraph position="10"> Per our definition, all terms within an extrinsic literal will be labels. Thus, in the extrinsic-literal similarity above, t_0 is a label, as is t_1, and so on. Given a pair of labels, b and b', we let I and I' be the respective sets of intrinsic literals from the formulae containing b and b' such that for all intrinsic literals lI ∈ I, the first term of lI is b, and likewise for b' and I'.</Paragraph>
      <Paragraph position="11"> Much like similarity between two formulae, the similarity between two labels relies on finding the maximal score over all possible orderings of a set of literals.</Paragraph>
      <Paragraph position="12"> Now let O be the set of all possible orderings of I', O an element of O, I_j the j-th literal of I, and O_j the j-th literal of O. Then: (5) sim(b, b') = max_{O ∈ O} (∏_{j=0}^{n} sim(I_j, O_j))^{1/n}. We measure the similarity between a pair of intrinsic literals as the similarity between the two words multiplied by the weight of the first literal, provided the predicates p, p' of the respective literals are equivalent: (6) sim(lI, lI') = weight(lI) · sim(w, w') if p = p', and ε · weight(lI) otherwise.</Paragraph>
      <Paragraph position="14"> The similarity between two words is currently measured using a WordNet distance metric, applying weights introduced in (Moldovan et al., 2003). We will soon be integrating metrics which rely on other dimensions of similarity.</Paragraph>
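The ordering-based similarity can be sketched end to end as below: the maximum over all orderings of the geometric mean of pairwise literal scores, an epsilon for mismatched predicates, and uniform weights of 1.0 as in the prototype. All names, the epsilon value, and the word-similarity table are our own illustrative assumptions, not the JAVELIN code.

```python
# Hedged sketch of the label-similarity computation: max over orderings of
# the geometric mean of intrinsic-literal similarities.
from itertools import permutations

EPSILON = 1e-3   # assumed: lower than any real similarity score
WEIGHT = 1.0     # prototype setting: uniform literal weights

def word_sim(w1, w2):
    """Stand-in for the WordNet distance metric (illustrative values)."""
    if w1 == w2:
        return 1.0
    return {("kill", "murder"): 0.8}.get(tuple(sorted((w1, w2))), EPSILON)

def intrinsic_sim(l1, l2):
    """Similarity of intrinsic literals (pred, label, word): word similarity
    times the first literal's weight if predicates match, else epsilon."""
    p1, _, w1 = l1
    p2, _, w2 = l2
    return WEIGHT * word_sim(w1, w2) if p1 == p2 else WEIGHT * EPSILON

def label_sim(i1, i2):
    """Max over orderings of the geometric mean of pairwise literal scores."""
    best = 0.0
    for order in permutations(i2):
        scores = [intrinsic_sim(a, b) for a, b in zip(i1, order)]
        if not scores:
            continue
        prod = 1.0
        for s in scores:
            prod *= s
        best = max(best, prod ** (1.0 / len(scores)))
    return best

# x2 in the question is an event rooted in "murder"; y2 in the passage, "kill".
q_lits = [("ROOT", "x2", "murder"), ("TYPE", "x2", "event")]
p_lits = [("ROOT", "y2", "kill"), ("TYPE", "y2", "event")]
print(round(label_sim(q_lits, p_lits), 4))  # prints 0.8944, i.e. sqrt(0.8)
```

The best ordering pairs ROOT with ROOT and TYPE with TYPE, giving the geometric mean of 0.8 and 1.0; any other ordering mismatches predicates and collapses toward epsilon.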
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.3 Example
</SectionTitle>
      <Paragraph position="0"> We now walk through a simple example in order to present the current framework used to measure the level of constraint satisfaction (confidence score) achieved by a given passage. While a complete traversal of even a small passage would exceed the space available here, we will present a single instance of each type of usage of the sim() function.</Paragraph>
      <Paragraph position="1"> If we limit our focus to only a few key relationships, we get the following analysis of a given question and passage. Computing the similarity between two formulae (loosely referred to here by their original text) gives the following:</Paragraph>
      <Paragraph position="3"> The similarity between the given extrinsic literals sharing the predicate SUBJECT is computed as sim[SUBJECT(x1,x2), SUBJECT(y1,y2)] = weight[SUBJECT(x1,x2)] · (sim[x1,y1] · sim[x2,y2])^{1/2}.</Paragraph>
      <Paragraph position="5"> In order to find the result of this extrinsic similarity evaluation, we need to determine the similarity between the paired terms, (x1,y1) and (x2,y2). The similarity between x2 and y2 is measured as: (11) sim[x2, y2] = (sim[ROOT(x2,|kill|), ROOT(y2,|murder|)] · sim[TYPE(x2,|event|), TYPE(y2,|event|)])^{1/2}. The result of this function depends on the combined similarity of the intrinsic literals that relate the given terms to values. The similarity between one of these intrinsic literal pairs is measured by: (12) sim[ROOT(x2,|kill|), ROOT(y2,|murder|)] = sim[|kill|,|murder|] · weight[ROOT(x2,|kill|)]. Finally, the similarity between a pair of words is computed as: (13) sim[|kill|,|murder|] = 0.8. As stated earlier, our similarity metrics at the word level are currently based on recent work on WordNet distance functions. We are actively developing methods to complement this approach.</Paragraph>
    </Section>
  </Section>
</Paper>