<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-0902">
  <Title>Solving Logic Puzzles: From Robust Processing to Precise Semantics</Title>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Challenges
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Combinatorial Semantics
</SectionTitle>
      <Paragraph position="0"> The challenge of combinatorial semantics is to be able to assign exactly one semantic representation to each word and sub-phrase regardless of its surrounding context, and to combine these representations in a systematic way until the representation for the entire sentence is obtained. There are many linguistic constructions in the puzzles whose compositional analysis is difficult, such as a large variety of noun-phrase structures (e.g., &amp;quot;Every sculpture must be exhibited in a different room&amp;quot;) and ellipses (e.g., &amp;quot;Brian saw a taller man than Carl [did]&amp;quot;).</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Scope Ambiguities
</SectionTitle>
      <Paragraph position="0"> A sentence has a scope ambiguity when quantifiers and other operators in the sentence can have more than one relative scope. E.g., in constraint (4) of Figure 1, &amp;quot;each room&amp;quot; outscopes &amp;quot;at least one sculpture&amp;quot;, but in other contexts, the reverse scoping is possible. The challenge is to find, out of all the possible scopings, the appropriate one, to understand the text as the writer intended.</Paragraph>
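<Paragraph position="1"> The two scopings can be made concrete by checking a candidate assignment. The following is a minimal sketch (not part of the system); the sculpture names, rooms, and the exhibit assignment are illustrative assumptions:

```python
# Two relative scopings of "at least one sculpture is exhibited in each
# room", evaluated against one candidate assignment of sculptures to rooms.
sculptures = ["C", "D", "E", "F", "G", "H"]
rooms = [1, 2, 3]
exhibit = {"C": 1, "D": 1, "E": 2, "F": 2, "G": 3, "H": 3}

# "each room" outscopes "at least one sculpture": every room has a sculpture.
each_wide = all(any(exhibit[s] == r for s in sculptures) for r in rooms)

# Reverse scoping: a single sculpture is in every room (false here, since
# the assignment places each sculpture in exactly one room).
some_wide = any(all(exhibit[s] == r for r in rooms) for s in sculptures)

print(each_wide, some_wide)  # True False
```

The two readings differ exactly in the order of the universal and existential loops, which is why selecting the intended scoping matters for solving the puzzle.</Paragraph>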
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.3 Reference Resolution
</SectionTitle>
      <Paragraph position="0"> The puzzle texts contain a wide variety of anaphoric expressions, including pronouns, definite descriptions, and anaphoric adjectives. The challenge is to identify the possible antecedents that these expressions refer to, and to select the correct ones. The problem is complicated by the fact that anaphoric expressions interact with quantifiers and may not refer to any particular context element. E.g., the anaphoric expressions in &amp;quot;Sculptures C and E are exhibited in the same room&amp;quot; and in &amp;quot;Each man saw a different woman&amp;quot; interact with sets ({C,E} and the set of all men, respectively).</Paragraph>
    </Section>
    <Section position="4" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.4 Plurality Disambiguation
</SectionTitle>
      <Paragraph position="0"> Sentences that include plural entities are potentially ambiguous between different readings: distributive, collective, cumulative, and combinations of these. For example, sentence 1 in Figure 1 says (among other things) that each of the six sculptures is displayed in one of the three rooms - the group of sculptures and the group of rooms behave differently here. Plurality is a thorny topic which interacts in complex ways with other semantic issues, including quantification and reference.</Paragraph>
    </Section>
    <Section position="5" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.5 Lexical Semantics
</SectionTitle>
      <Paragraph position="0"> The meaning of open-category words is often irrelevant to solving a puzzle. For example, the meaning of &amp;quot;exhibited&amp;quot;, &amp;quot;sculpture&amp;quot;, and &amp;quot;room&amp;quot; can be ignored, because it is enough to understand that the first is a binary relation that holds between elements of the groups described by the second and third words.1 (Footnote 1: The meanings are still important for the implicit knowledge that a sculpture cannot be exhibited in more than one room. However, such knowledge can be guessed, as explained in §8.)</Paragraph>
      <Paragraph position="1"> This observation provides the potential for a general system that solves logic puzzles.</Paragraph>
      <Paragraph position="2"> Of course, in many cases, the particular meaning of open-category words and other expressions is crucial to the solution. An example is provided in question 2 of Figure 1: the system has to understand what &amp;quot;a complete list&amp;quot; means. Therefore, to finalize the meaning computed for a sentence, such expressions should be expanded to their explicit meaning. Although there are many such cases and their analysis is difficult, we anticipate that it will be possible to develop a relatively compact library of critical puzzle text expressions. We may also be able to use existing resources such as WordNet and FrameNet.</Paragraph>
    </Section>
    <Section position="6" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.6 Information Gaps
</SectionTitle>
      <Paragraph position="0"> Natural language texts invariably assume some knowledge implicitly. E.g., Figure 1 does not explicitly specify that a sculpture may not be exhibited in more than one room at the same time.</Paragraph>
      <Paragraph position="1"> Humans know this implicit information, but a computer reasoning from texts must be given it explicitly. Filling these information gaps is a serious challenge; representation and acquisition of the necessary background knowledge are very hard AI problems. Fortunately, the puzzles domain allows us to tackle this issue, as explained in §8.</Paragraph>
    </Section>
    <Section position="7" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.7 Presuppositions and Implicatures
</SectionTitle>
      <Paragraph position="0"> In addition to its semantic meaning, a natural language text conveys two other kinds of content. Presuppositions are pieces of information assumed in a sentence. Anaphoric expressions bear presuppositions about the existence of entities in the context; the answer choice &amp;quot;Sculptures C and E&amp;quot; conveys the meaning {C,E}, but has the presupposition sculpture(C) ∧ sculpture(E); and a question of the form A → B, such as question 1 in Figure 1, presupposes that A is consistent with the preamble.</Paragraph>
      <Paragraph position="1"> Implicatures are pieces of information suggested by the very fact of saying, or not saying, something. Two maxims of (Grice, 1989) dictate that each sentence should be both consistent and informative (i.e. not entailed) with respect to its predecessors. Another maxim dictates saying as much as required, and hence the sentence &amp;quot;No more than three sculptures may be exhibited in any room&amp;quot; carries the implicature that in some possible solution, three sculptures are indeed exhibited in the same room.</Paragraph>
      <Paragraph position="2"> Systematic calculation of presuppositions and implicatures has been given less attention in NLP and is less understood than the calculation of meaning. Yet computing and verifying them can provide valuable hints to the system whether it understood the meaning of the text correctly.</Paragraph>
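      <Paragraph position="3"> The consistency and informativity tests suggested by the first two maxims can be mechanized once a model enumerator is available. A hedged sketch over a toy domain of two sculptures and two rooms (the constraints and names are illustrative assumptions, not the system's representation):

```python
from itertools import product

def models(constraints):
    """All assignments of sculptures C, E to rooms 1, 2 satisfying constraints."""
    for c_room, e_room in product([1, 2], repeat=2):
        m = {"C": c_room, "E": e_room}
        if all(c(m) for c in constraints):
            yield m

def check_maxims(previous, new):
    # Consistent: some model satisfies the predecessors plus the new sentence.
    consistent = any(True for _ in models(previous + [new]))
    # Informative (not entailed): some model of the predecessors violates it.
    informative = any(not new(m) for m in models(previous))
    return consistent, informative

prev = [lambda m: m["C"] == 1]       # "Sculpture C is exhibited in room 1"
new = lambda m: m["C"] == m["E"]     # "C and E are exhibited in the same room"
print(check_maxims(prev, new))       # (True, True)
```

A failure of either test would be a hint to the system that it has not understood the meaning of the text correctly.</Paragraph>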
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 Morpho-Syntactic Analysis
</SectionTitle>
    <Paragraph position="0"> While traditional hand-built grammars often include a rich semantics, we have found their coverage inadequate for the logic puzzles task.</Paragraph>
    <Paragraph position="1"> For example, the English Resource Grammar (Copestake and Flickinger, 2000) fails to parse any of the sentences in Figure 1 for lack of coverage of some words and of several different syntactic structures; and parsable simplified versions of the text produce dozens of unranked parse trees. For this reason, we use a broad-coverage statistical parser (Klein and Manning, 2003) trained on the Penn Treebank. In addition to robustness, treebank-trained statistical parsers have the benefit of extensive research on accurate ambiguity resolution. Qualitatively, we have found that the output of the parser on logic puzzles is quite good (see §10). After parsing, each word in the resulting parse trees is converted to base form by a stemmer.</Paragraph>
    <Paragraph position="2"> A few tree-transformation rules are applied on the parse trees to make them more convenient for combinatorial semantics. Most of them are general, e.g. imposing a binary branching structure on verb phrases, and grouping expressions like &amp;quot;more than&amp;quot;. A few of them correct some parsing errors, such as nouns marked as names and vice-versa. There is growing awareness in the probabilistic parsing literature that mismatches between training and test set genre can degrade parse accuracy, and that small amounts of correct-genre data can be more important than large amounts of wrong-genre data (Gildea, 2001); we have found corroborating evidence in misparsings of noun phrases common in puzzle texts, such as &amp;quot;Sculptures C and E&amp;quot;, which do not appear in the Wall Street Journal corpus. Depending on the severity of this problem, we may hand-annotate a small amount of puzzle texts to include in parser training data.</Paragraph>
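    <Paragraph position="3"> As an illustration of such a tree-transformation rule, the following sketch imposes right-branching binary structure on VP nodes, with parse trees encoded as nested tuples; the rule and node labels are simplified assumptions rather than the system's actual rule set:

```python
def binarize_vp(tree):
    """Recursively impose binary (right-branching) structure on VP nodes."""
    if isinstance(tree, str):
        return tree
    label, *children = tree
    children = [binarize_vp(c) for c in children]
    if label == "VP" and len(children) > 2:
        # Keep the head; pack the remaining children into a nested VP.
        return ("VP", children[0], binarize_vp(("VP",) + tuple(children[1:])))
    return (label,) + tuple(children)

tree = ("VP", ("VB", "exhibited"), ("PP", "in room 1"), ("ADVP", "together"))
print(binarize_vp(tree))
# ('VP', ('VB', 'exhibited'), ('VP', ('PP', 'in room 1'), ('ADVP', 'together')))
```

Binary branching simplifies the semantic combination step, which then only ever has to combine two sibling meanings at a time.</Paragraph>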
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
5 Combinatorial Semantics
</SectionTitle>
    <Paragraph position="0"> Work in NLP has shifted from hand-built grammars, which must explicitly cover every sentence structure and which break down on unexpected inputs, to more robust statistical parsing.</Paragraph>
    <Paragraph position="1"> However, grammars that involve precise semantics are still largely hand-built (e.g. (Carpenter, 1998; Copestake and Flickinger, 2000)). We aim to extend the robustness trend to the semantics. We start with the compositional semantics framework of (Blackburn and Bos, 2000; Bos, 2001) and modify it to achieve greater robustness and coverage.2 One difference is that our lexicon is kept very small and includes only a few words with special semantic entries (like pronouns, connectives, and numbers). Open-category words come with their part-of-speech information in the parse trees (e.g. (NN dog)), so their semantics can be obtained using generic semantic templates (but cf. §3.5).</Paragraph>
    <Paragraph position="2"> In classic rule-to-rule systems of semantics like (Blackburn and Bos, 2000), each syntactic rule has a separate semantic combination rule, and so the system completely fails on unseen syntactic structures. The main distinguishing goal of our approach is to develop a more robust process that does not need to explicitly specify how to cover every bit of every sentence. The system incorporates a few initial ideas in this direction.</Paragraph>
    <Paragraph position="3"> First, role and argument-structure information for verbs is expensive to obtain and unreliable in natural texts anyway. So to deal with verbs and VPs robustly, their semantics in our system exports only an event variable rather than variables for the subject, the direct object, etc. VP modifiers (such as PPs and ADVPs) combine with the VP by being applied to the exported event variable. NP modifiers (including the sentence subject) are linked to the event variable through generic roles: subj, np1, np2, etc. The resulting generic representations are suitable in the puzzles domain because usually only the relation between objects is important, not their particular roles in the relation.</Paragraph>
    <Paragraph position="4"> This is true for other tasks as well, including some broad-coverage question answering.</Paragraph>
    <Paragraph position="5"> All NPs are analyzed as generalized quantifiers, but a robust compositional analysis of the internal semantics of NPs remains a serious challenge. For example, the NP &amp;quot;three rooms&amp;quot; should be analyzed as Q(num(3),x,room(x),..), but the word &amp;quot;three&amp;quot; by itself does not contribute the quantifier - compare with &amp;quot;at least three rooms&amp;quot; Q(≥3,x,room(x),..). Yet another case is &amp;quot;the three rooms&amp;quot; (which presupposes a group g such that g ⊆ room ∧ |g| = 3).2 (Footnote 2: Our system uses a reimplementation in Lisp rather than their Prolog code.)</Paragraph>
    <Paragraph position="6"> The system currently handles a number of NP structures by scanning the NP left-to-right to identify important elements. This may make it easier than a strictly compositional analysis to extend the coverage to additional cases.</Paragraph>
    <Paragraph position="7"> All other cases are handled by a flexible combination process. In case of a single child, its semantics is copied to its parent. With more children, all combinations of applying the semantics of one child to its siblings are tried, until an application does not raise a type error (variables are typed to support type checking). This makes it easier to extend the coverage to new grammatical constructs, because usually only the lexical entry needs to be specified, and the combination process takes care to apply it correctly in the parse tree.</Paragraph>
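    <Paragraph position="8"> The flexible combination process can be sketched as typed function application with backtracking. The type system and lexical entries below are deliberately simplified assumptions:

```python
class Sem:
    """A semantic value with a type; functions carry an (arg, result) type pair."""
    def __init__(self, typ, fn=None, val=None):
        self.typ, self.fn, self.val = typ, fn, val

def apply(f, a):
    """Apply f to a only if f is a function whose argument type matches a."""
    if f.fn is not None and f.typ[0] == a.typ:
        return f.fn(a)
    raise TypeError("type mismatch")

def combine(children):
    """Try applying each child's semantics to each sibling until one type-checks."""
    for i, f in enumerate(children):
        for j, a in enumerate(children):
            if i != j:
                try:
                    return apply(f, a)
                except TypeError:
                    pass
    raise ValueError("no type-correct combination")

# (NN dog) receives a generic template; a determiner consumes a noun meaning.
noun = Sem(typ="et", val="dog")
det = Sem(typ=("et", "q"), fn=lambda n: Sem("q", val=f"Q(every,x,{n.val}(x),..)"))
print(combine([det, noun]).val)  # Q(every,x,dog(x),..)
```

Because the order of application is discovered rather than stipulated per syntactic rule, the same lexical entry works in parse configurations that were never explicitly anticipated.</Paragraph>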
  </Section>
  <Section position="7" start_page="0" end_page="0" type="metho">
    <SectionTitle>
6 Scope Resolution
</SectionTitle>
    <Paragraph position="0"> One way of dealing with scope ambiguities is by using underspecified representations (URs). A UR is a meta-language construct, describing a set of object-language formulas.3 It describes the pieces shared by these formulas, but possibly underspecifies how they combine with each other. A UR can then be resolved to the specific readings it implicitly describes.</Paragraph>
    <Paragraph position="1"> We use an extension of Hole Semantics (Blackburn and Bos, 2000)4 for expressing URs and calculating them from parse trees (modulo the modifications in §5). There are several advantages to this approach. First, it supports the calculation of just one UR per sentence in a combinatorial process that visits each node of the parse tree once. This contrasts with approaches such as Categorial Grammars (Carpenter, 1998), which produce explicitly all the scopings by using type raising rules for different combinations of scope, and require scanning the entire parse tree once per scoping.</Paragraph>
    <Paragraph position="2"> Second, the framework supports the expression of scoping constraints between different parts of the final formula. Thus it is possible to express hierarchical relations that must exist between certain quantifiers, avoiding the problems of naive approaches such as Cooper storage (Cooper, 1983). The expression of scoping constraints is not limited to quantifiers and is applicable to all other operators as well. Moreover, it is possible to express scope islands. Another advantage is that URs support efficient elimination of logically-equivalent readings. Enumerating all scopings and using a theorem-prover to determine logical equivalences requires O(n²) comparisons for n scopings. Instead, filtering methods (Chaves, 2003) can add tests to the UR-resolution process, disallowing certain combinations of operators.</Paragraph>
    <Paragraph position="3"> Thus, only one ordering of identical quantifiers is allowed, so &amp;quot;A man saw a woman&amp;quot; yields only one of its two equivalent scopings. We also filter ∀□ and ◇∃ combinations, allowing only the equivalent □∀ and ∃◇. However, numeric quantifiers are not filtered (the two scopings of &amp;quot;Three boys saw three films&amp;quot; are not equivalent). Such filtering can result in substantial speed-ups for sentences with a few quantifiers (see (Chaves, 2003) for some numbers).</Paragraph>
    <Paragraph position="4"> Finally, our true goal is determining the correct relative scoping in context rather than enumerating all possibilities. We are developing a probabilistic scope resolution module that learns from hand-labeled training examples to predict the most probable scoping, using features such as the quantifiers' categories and their positions and grammatical roles in the sentence.5</Paragraph>
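    <Paragraph position="5"> The redundancy filtering described above can be sketched as a test inserted into the scoping enumeration; the quantifier encoding below is an illustrative assumption:

```python
from itertools import permutations

def scopings(quants):
    """Yield quantifier orderings, skipping reorderings of identical
    non-numeric quantifiers (which denote logically equivalent readings)."""
    seen = set()
    numeric = any(isinstance(q[0], int) for q in quants)
    for order in permutations(range(len(quants))):
        key = tuple(quants[i][0] for i in order)  # sequence of quantifier symbols
        if not numeric and key in seen:
            continue  # same symbol sequence as an ordering already produced
        seen.add(key)
        yield [quants[i] for i in order]

# "A man saw a woman": two existentials, only one scoping survives.
print(len(list(scopings([("exists", "man"), ("exists", "woman")]))))  # 1
# "Three boys saw three films": numeric quantifiers are not filtered.
print(len(list(scopings([(3, "boy"), (3, "film")]))))                 # 2
```

A production filter would compare full operator sequences, including negation and modals, along the lines of (Chaves, 2003).</Paragraph>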
  </Section>
  <Section position="8" start_page="0" end_page="0" type="metho">
    <SectionTitle>
7 Reference Resolution
</SectionTitle>
    <Paragraph position="0"> SL is not convenient for representing directly the meaning of referring expressions, because (as in FOL) the scope of a quantifier in a formula cannot easily be extended to span variables in subsequent formulas. We therefore use Discourse Logic (DL), which is SL extended with DRSs and α-expressions as in (Blackburn and Bos, 2000) (which is based on Discourse Representation Theory (Kamp and Reyle, 1993) and its recent extensions for dealing with presuppositions).6 This approach (like other dynamic semantics approaches) supports the introduction of entities that can later be referred back to, and explains when indefinite NPs should be interpreted as existential or universal quantifiers (such as in the antecedent of conditionals).</Paragraph>
    <Paragraph position="1"> (Footnote 5: E.g., there is a strong preference for 'each' to take wide scope, a moderate preference for the first quantifier in a sentence to take wide scope, and a weak preference for a quantifier of the grammatical subject to take wide scope. Footnote 6: Thus, the URs calculated from parse trees are actually URs of DL formulas. The scope resolution phase resolves the URs to explicit DL formulas, and the reference resolution phase converts these formulas to SL formulas.)</Paragraph>
    <Paragraph position="2"> The reference resolution framework from (Blackburn and Bos, 2000) provides a basis for finding all possible resolutions, but does not specify which one to choose. We are working on a probabilistic reference-resolution module, which will pick from the legal resolutions the most probable one, based on features such as distance, gender, and syntactic position and constraints.</Paragraph>
  </Section>
  <Section position="9" start_page="0" end_page="0" type="metho">
    <SectionTitle>
8 Filling Information Gaps
</SectionTitle>
    <Paragraph position="0"> To find a unique answer to every question of a puzzle, background information is required beyond the literal meaning of the text. In Question 1 of Figure 1, for example, without the constraint that a sculpture may not be exhibited in multiple rooms, answers B, D and E are all correct. Human readers deduce this implicit constraint from their knowledge that sculptures are physical objects, rooms are locations, and physical objects can have only one location at any given time. In principle, such information could be derived from ontologies. Existing ontologies, however, have limited coverage, so we also plan to leverage information about expected puzzle structures.</Paragraph>
    <Paragraph position="1"> Most puzzles we collected are formalizable as constraints on possible tuples of objects. The crucial information includes: (a) the object classes; (b) the constants naming the objects; and (c) the relations used to link objects, together with their arguments' classes. For the sculptures puzzle, this information is: (a) the classes are sculpture and room; (b) the constants are C,D,E,F,G,H for sculpture and 1,2,3 for room; (c) the relation is exhibit(sculpture,room). This information is obtainable from the parse trees and SL formulas. Within this framework, implicit world knowledge can often be recast as mathematical properties of relations. The unique location constraint on sculptures, for example, is equivalent to constraining the mapping from sculptures to rooms to be injective (one-to-one); other cases exist of constraining mappings to be surjective (onto) and/or total. Such properties can be obtained from various sources, including cardinality of object classes, pure lexical semantics, and even through a systematic search for sets of implicit constraints that, in combination with the explicitly stated constraints, yield exactly one answer per question. Figure 3 shows the number of possible models for the sculptures puzzle as affected by explicit and implicit constraints in the preamble.</Paragraph>
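    <Paragraph position="2"> The effect of recasting implicit knowledge as properties of the exhibit relation can be demonstrated by counting models. A hedged sketch over a deliberately small domain (three sculptures, two rooms; the encoding is an assumption for illustration):

```python
from itertools import product

sculptures, rooms = ["C", "D", "E"], [1, 2]
pairs = [(s, r) for s in sculptures for r in rooms]

def count_models(constraints):
    """Count subsets of sculpture-room pairs satisfying all constraints."""
    n = 0
    for bits in product([False, True], repeat=len(pairs)):
        rel = {p for p, b in zip(pairs, bits) if b}
        if all(c(rel) for c in constraints):
            n += 1
    return n

# Explicit: every sculpture is exhibited in at least one room (total).
total = lambda rel: all(any((s, r) in rel for r in rooms) for s in sculptures)
# Implicit: no sculpture is exhibited in more than one room (functional).
functional = lambda rel: all(sum((s, r) in rel for r in rooms) <= 1
                             for s in sculptures)

print(count_models([total]))              # 27 models
print(count_models([total, functional]))  # 8 models
```

Adding the implicit constraint prunes the model space, which is exactly the effect a systematic search for implicit constraints exploits when it looks for constraint sets yielding a unique answer per question.</Paragraph>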
  </Section>
  <Section position="10" start_page="0" end_page="0" type="metho">
    <SectionTitle>
9 Solving the Puzzle
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
9.1 Expanding the answer choices
</SectionTitle>
      <Paragraph position="0"> The body of a logic puzzle question contains a (unique) wh-term (typically &amp;quot;which of the following&amp;quot;), a modality (such as &amp;quot;must be true&amp;quot; or &amp;quot;could be true&amp;quot;), and (possibly) an added condition. Each answer choice is expanded by substituting its SL form for the wh-term in the question body. For example, the expansion for answer choice (A) of question 1 in Figure 1 would be the SL form corresponding to: &amp;quot;If sculpture D is exhibited ..., then [Sculpture C is exhibited in room 1] must be true&amp;quot;.</Paragraph>
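      <Paragraph position="1"> The expansion step itself is a substitution; in the following minimal sketch, plain strings stand in for the system's SL forms:

```python
def expand(question_body, wh_term, choices):
    """Substitute each answer choice for the wh-term in the question body."""
    return [question_body.replace(wh_term, choice) for choice in choices]

body = ("If sculpture D is exhibited in room 3, then "
        "[which of the following] must be true")
choices = ["Sculpture C is exhibited in room 1",
           "Sculpture E is exhibited in room 1"]
for q in expand(body, "[which of the following]", choices):
    print(q)
```

In the actual system the substitution is performed on SL terms rather than strings, so that the modality and any added condition in the question body scope correctly over the substituted choice.</Paragraph>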
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
9.2 Translating SL to FOL
</SectionTitle>
      <Paragraph position="0"> To translate an SL representation to pure FOL, we eliminate event variables by replacing an SL form ∃e.P(e) ∧ R1(e,t1) ∧ .. ∧ Rn(e,tn) with the FOL form P(t1,..,tn). An ordering is imposed on role names to guarantee that arguments are always used in the same order in relations. Numeric quantifiers are encoded in FOL in the obvious way, e.g., Q(≥2,x,φ,ψ) is translated to ∃x1∃x2. x1 ≠ x2 ∧ (φ∧ψ)[x1/x] ∧ (φ∧ψ)[x2/x]. Each expanded answer choice contains one modal operator. Modals are moved outward of negation as usual, and outward of conditionals by changing A → □B to □(A → B) and A → ◇B to ◇(A ∧ B). A modal operator in the outermost scope can then be interpreted as a directive to the reasoning module to test either entailment (□) or consistency (◇) between the preamble and the expanded answer choice.</Paragraph>
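      <Paragraph position="1"> The numeric-quantifier encoding generalizes to any n by introducing n pairwise-distinct witnesses. A sketch using ASCII connectives in place of the logical symbols:

```python
def encode_at_least(n, var, phi, psi):
    """Encode Q(>=n, var, phi, psi) as an FOL string with n distinct witnesses."""
    vs = [f"{var}{i}" for i in range(1, n + 1)]
    distinct = [f"{a} != {b}" for i, a in enumerate(vs) for b in vs[i + 1:]]
    witness = [f"({phi} & {psi})[{v}/{var}]" for v in vs]
    return "exists " + " ".join(vs) + ". " + " & ".join(distinct + witness)

print(encode_at_least(2, "x", "ph", "ps"))
# exists x1 x2. x1 != x2 & (ph & ps)[x1/x] & (ph & ps)[x2/x]
```

Note that the distinctness conjuncts grow quadratically in n, so numeric quantifiers produce noticeably larger FOL formulas than ordinary ones.</Paragraph>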
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
9.3 Using FOL reasoners
</SectionTitle>
      <Paragraph position="0"> There are two reasons for using both theorem provers and model builders. First, they are complementary reasoners: while a theorem prover is designed to demonstrate the inconsistency of a set of FOL formulas, and so can find the correct answer to &amp;quot;must be true&amp;quot; questions through proof by contradiction, a model builder is designed to find a satisfying model, and is thus suited to finding the correct answer to &amp;quot;could be true&amp;quot; questions.7 Second, a reasoner may take a very long time to halt on some queries, but the complementary reasoner may still be used to answer the query in the context of a multiple-choice question through a process of elimination. Thus, if the model builder is able to show that the negations of four choices are consistent with the preamble (indicating they are not entailed), then it can be concluded that the remaining choice is entailed by the preamble, even if the theorem prover has not yet found a proof.</Paragraph>
      <Paragraph position="1"> We use the Otter 3.3 theorem prover and the MACE 2.2 model builder (McCune, 1998).8 The reasoning module forks parallel subprocesses, two per answer choice (one for Otter, one for MACE). If a reasoner succeeds for an answer choice, the choice is marked as correct or incorrect, and the dual sub-process is killed. If all answer-choices but one are marked incorrect, the remaining choice is marked correct even if its sub-processes did not yet terminate.</Paragraph>
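      <Paragraph position="2"> The division of labor between the two reasoners can be illustrated with brute-force model enumeration standing in for Otter (entailment) and MACE (consistency); the toy domain and constraints are assumptions for illustration only:

```python
from itertools import product

rooms = [1, 2]
def all_models():
    """Assignments of sculptures C and E to rooms 1 and 2."""
    for c, e in product(rooms, repeat=2):
        yield {"C": c, "E": e}

def entailed(preamble, claim):
    """'Must be true': the claim holds in every model of the preamble."""
    return all(claim(m) for m in all_models() if preamble(m))

def consistent(preamble, claim):
    """'Could be true': the claim holds in some model of the preamble."""
    return any(claim(m) for m in all_models() if preamble(m))

preamble = lambda m: m["C"] == 1                    # "C is in room 1"
print(entailed(preamble, lambda m: m["C"] == 1))    # True
print(consistent(preamble, lambda m: m["E"] == 2))  # True
print(entailed(preamble, lambda m: m["E"] == 2))    # False
```

The elimination strategy follows directly: showing the negations of four choices consistent with the preamble marks them as non-entailed, leaving the fifth as the entailed answer.</Paragraph>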
    </Section>
  </Section>
</Paper>