<?xml version="1.0" standalone="yes"?>
<Paper uid="C02-1047">
  <Title>Towards a Noise-Tolerant, Representation-Independent Mechanism for Argument Interpretation</Title>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Argument Interpretation Using MML
</SectionTitle>
    <Paragraph position="0"> According to the MML criterion, we imagine sending to a receiver the shortest possible message that describes an NL argument. When a good interpretation is found, a message which encodes the NL argument in terms of this interpretation will be shorter than the message which transmits the words of the argument directly.</Paragraph>
    <Paragraph position="1"> A message that encodes an NL argument in terms of an interpretation is composed of two parts: (1) instructions for building the interpretation, and (2) instructions for rebuilding the original argument from this interpretation. These two parts balance the need for a concise interpretation (Part 1) with the need for an interpretation that closely matches the original argument (Part 2). For instance, the message for a concise interpretation that does not match the original argument well will have a short first part but a long second part. In contrast, a more complex interpretation which better matches the original argument may yield a shorter message overall. As a result, in finding the interpretation that yields the shortest message for an NL argument, we will have produced a plausible interpretation, which hopefully is the intended interpretation. To find this interpretation, we compare the message lengths of the candidate interpretations. These candidates are obtained as described in Section 4.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 MML Encoding
</SectionTitle>
      <Paragraph position="0"> The MML criterion is derived from Bayes Theorem: $\Pr(D \,\&amp;\, H) = \Pr(H) \times \Pr(D \mid H)$, where $D$ is the data and $H$ is a hypothesis which explains the data.</Paragraph>
      <Paragraph position="3"> An optimal code for an event $E$ with probability $\Pr(E)$ has message length $ML(E) = -\log_2 \Pr(E)$ (measured in bits). Hence, the message length for the data and a hypothesis is: $ML(D \,\&amp;\, H) = ML(H) + ML(D \mid H)$. The hypothesis for which $ML(D \,\&amp;\, H)$ is minimal is considered the best hypothesis.</Paragraph>
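The two-part criterion above can be tried out on toy numbers. The following is a minimal sketch, not the authors' implementation: the function names and the probabilities are hypothetical, chosen only to show that the hypothesis with the smallest ML(H) + ML(D|H) is the one selected.

```python
import math


def message_length(probability: float) -> float:
    """Bits needed by an optimal code for an event with the given probability."""
    if not (probability > 0.0 and 1.0 >= probability):
        raise ValueError("probability must lie in (0, 1]")
    return -math.log2(probability)


def best_hypothesis(priors: dict, likelihoods: dict) -> str:
    """Return the hypothesis H that minimizes ML(H) + ML(D|H)."""
    def total(h):
        return message_length(priors[h]) + message_length(likelihoods[h])
    return min(priors, key=total)


# Hypothetical numbers, for illustration only.
priors = {"H1": 0.5, "H2": 0.25}         # Pr(H)
likelihoods = {"H1": 0.125, "H2": 0.5}   # Pr(D|H)
print(best_hypothesis(priors, likelihoods))  # H2: 2 + 1 = 3 bits, against 1 + 3 = 4 bits for H1
```

Because the message length is a negative logarithm, minimizing the sum of the two lengths is the same as maximizing the product Pr(H) × Pr(D|H).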
      <Paragraph position="6"> Now, in our context, Arg contains the argument, and SysInt an interpretation generated by our system. Thus, we are looking for the SysInt which yields the shortest message length for $ML(Arg \,\&amp;\, SysInt) = ML(SysInt) + ML(Arg \mid SysInt)$.</Paragraph>
      <Paragraph position="8"> The first part of the message describes the interpretation, and the second part describes how to reconstruct the argument from the interpretation. To calculate the second part, we rely on an intermediate representation called Implication Graph (IG). An Implication Graph is a graphical representation of an argument, which represents a basic &quot;understanding&quot; of the argument. It is composed of simple implications of the form $Antecedent_1 \; Antecedent_2 \; \ldots \; Antecedent_n \Rightarrow Consequent$ (where $\Rightarrow$ indicates that the antecedents imply the consequent, without distinguishing between causal and evidential implications). $IG_{Arg}$ represents an understanding of the input argument.</Paragraph>
      <Paragraph position="9"> It contains propositions from the underlying representation, but retains the structure of the argument. $IG_{SysInt}$ represents an understanding of a candidate interpretation. It is directly obtained from SysInt.</Paragraph>
      <Paragraph position="10"> Hence, both its structure and its propositions correspond to the underlying representation. Since both $IG_{Arg}$ and $IG_{SysInt}$ use domain propositions and have the same type of representation, they can be compared with relative ease.</Paragraph>
      <Paragraph position="11"> Figure 1 illustrates the interpretation of a small argument, and the calculation of the message length of the interpretation. The interpretation process obtains $IG_{Arg}$ from the input, and SysInt from $IG_{Arg}$ (left-hand side of Figure 1). If a sentence in Arg matches more than one domain proposition, the system generates more than one $IG_{Arg}$ from Arg (Section 4.1). Each $IG_{Arg}$ may in turn yield more than one SysInt. This happens when the underlying representation has several ways of connecting the nodes in $IG_{Arg}$ (Section 4.2). The message length calculation goes from SysInt to Arg through the intermediate representations $IG_{SysInt}$ and $IG_{Arg}$ (right-hand side of Figure 1). This calculation takes advantage of the fact that there can be only one $IG_{Arg}$ for each Arg-SysInt combination.</Paragraph>
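      <Paragraph position="12"> Hence, $\Pr(Arg \mid SysInt) = \Pr(Arg \,\&amp;\, IG_{Arg} \mid SysInt) = \Pr(Arg \mid IG_{Arg}) \times \Pr(IG_{Arg} \mid SysInt)$, where the second equality holds when Arg depends on SysInt only through $IG_{Arg}$. Thus, the length of the message required to transmit the original argument from an interpretation is $ML(Arg \mid SysInt) = ML(Arg \mid IG_{Arg}) + ML(IG_{Arg} \mid SysInt)$.</Paragraph>
      <Paragraph position="14"> That is, for each candidate interpretation, we calculate the length of the message which conveys: SysInt - the interpretation (Section 3.2); $IG_{Arg}$ given SysInt - how $IG_{Arg}$ differs from the interpretation in belief and argument structure; and Arg given $IG_{Arg}$ - how to obtain the sentences in Arg from the corresponding nodes in $IG_{Arg}$ (Section 3.4).</Paragraph>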
      <Paragraph position="15"> The interpretation which yields the shortest message is selected (the message-length equations for each component are summarized in Table 1).</Paragraph>
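The selection step can be sketched as follows. This is an illustrative sketch only: the `Candidate` class, its field names and the bit counts are hypothetical, and the three components are assumed to have been computed already as described in Sections 3.2 to 3.4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    ml_sysint: float               # ML(SysInt), in bits
    ml_igarg_given_sysint: float   # ML(IG_Arg | SysInt), in bits
    ml_arg_given_igarg: float      # ML(Arg | IG_Arg), in bits

    def total(self) -> float:
        """Length of the whole message for this candidate interpretation."""
        return self.ml_sysint + self.ml_igarg_given_sysint + self.ml_arg_given_igarg


def select_interpretation(candidates):
    """Return the candidate whose message is shortest."""
    return min(candidates, key=lambda c: c.total())


# Hypothetical bit counts: a concise interpretation that fits the argument poorly,
# and a more detailed one that fits it well.
candidates = [
    Candidate("concise", 4.0, 9.5, 6.0),
    Candidate("detailed", 7.0, 3.0, 6.0),
]
print(select_interpretation(candidates).name)  # detailed (16.0 bits against 19.5)
```

The example mirrors the trade-off described in Section 3: the detailed candidate costs more to describe, but it is cheaper overall because less has to be said about how the argument departs from it.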
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Calculating ML(SysInt)
</SectionTitle>
      <Paragraph position="0"> In order to transmit SysInt, we simply send its propositions and the relations between them. A standard MML assumption is that the sender and receiver share domain knowledge. Hence, one way to send SysInt consists of transmitting how SysInt is extracted from the domain representation. This involves selecting its propositions from those in the domain, and then choosing which of the possible relations between these propositions are included in the interpretation. In the case of a BN, the propositions are represented as nodes, and the relations between propositions as arcs. Thus, the message length for SysInt in the context of a BN is the cost of selecting the nodes of SysInt from the nodes of the domain BN, plus the cost of selecting the arcs of SysInt from the arcs that could connect these nodes.</Paragraph>
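One plausible way to cost these two selections is sketched below. It rests on an assumption that the text above does not confirm: each selection is encoded as a choice of one k-subset among n items, with all subsets equally likely, so that it costs log2 of a binomial coefficient. The function and parameter names are hypothetical.

```python
import math


def log2_choose(n: int, k: int) -> float:
    """Bits needed to name one k-subset of n items when all subsets are equally likely."""
    return math.log2(math.comb(n, k))


def ml_sysint(domain_nodes: int, sysint_nodes: int,
              possible_arcs: int, sysint_arcs: int) -> float:
    """Assumed cost of SysInt: select its nodes, then select its arcs."""
    return log2_choose(domain_nodes, sysint_nodes) + log2_choose(possible_arcs, sysint_arcs)


# Hypothetical sizes: 8 domain nodes, 2 of them in SysInt; 4 possible arcs, 1 used.
print(ml_sysint(8, 2, 4, 1))
```

Under this assumption, an interpretation that uses few nodes and arcs from a small domain is cheap to transmit, which is what rewards concise interpretations.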
      <Paragraph position="2"> The message which describes $IG_{Arg}$ in terms of SysInt (or rather in terms of $IG_{SysInt}$) conveys how $IG_{Arg}$ differs from the system's interpretation in two respects: (1) belief, and (2) argument structure.</Paragraph>
      <Paragraph position="3">  For each proposition $p$ in both $IG_{SysInt}$ and $IG_{Arg}$, we transmit any discrepancy between the belief stated in the argument and the system's belief in this proposition (propositions that appear in only one IG are handled by the message component which describes structural differences). The length of the message required to convey this information is $-\sum_{p \in IG_{Arg} \cap IG_{SysInt}} \log_2 \Pr(B_{Arg}(p) \mid B_{SysInt}(p))$, (3)</Paragraph>
      <Paragraph position="5"> where $B_{Arg}(p)$ and $B_{SysInt}(p)$ denote the belief in proposition $p$ according to the argument and according to the system, respectively. This expresses discrepancies in belief as a probability that the argument will posit a particular belief in a proposition, given the belief held by the system in this proposition. We have modeled this probability using a function which yields a maximum probability mass when the belief in proposition $p$ according to the argument agrees with the system's belief.</Paragraph>
      <Paragraph position="6"> This probability gradually falls as the discrepancy between the belief stated in the argument and the system's belief increases, which in turn yields an increased message length.</Paragraph>
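The text does not give the form of this probability function, so the following sketch only illustrates the stated behaviour. It assumes, hypothetically, that beliefs are expressed on a small ordered scale of levels and that the probability decays geometrically with the distance between the argument's level and the system's level; `levels` and `decay` are illustrative parameters, not values from the paper.

```python
import math


def belief_probability(arg_level: int, sys_level: int,
                       levels: int = 5, decay: float = 0.5) -> float:
    """Pr(argument states arg_level | system holds sys_level) on an ordered scale.

    Assumed form: weight decay**distance, normalized over all levels, so the
    probability peaks when the two beliefs agree.
    """
    weights = [decay ** abs(level - sys_level) for level in range(levels)]
    return weights[arg_level] / sum(weights)


def ml_belief(arg_level: int, sys_level: int,
              levels: int = 5, decay: float = 0.5) -> float:
    """Bits needed to state the argument's belief given the system's belief."""
    return -math.log2(belief_probability(arg_level, sys_level, levels, decay))


for arg_level in (2, 3, 4):
    print(arg_level, round(ml_belief(arg_level, sys_level=2), 3))
```

With the system's belief at level 2, the message grows by one bit for every level of disagreement under these illustrative settings, which matches the qualitative description above.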
      <Paragraph position="7">  The message which transmits the structural discrepancies between $IG_{SysInt}$ and $IG_{Arg}$ describes the structural operations required to transform $IG_{SysInt}$ into $IG_{Arg}$. These operations are: node insertions and deletions, and arc insertions and deletions. A node is inserted in $IG_{SysInt}$ when the system cannot reconcile a proposition in the given argument with any proposition in its domain representation.</Paragraph>
      <Paragraph position="8"> In this case, the system proposes a special Escape (wild card) node. Note that the system does not presume to understand this proposition, but still hopes to achieve some understanding of the argument as a whole. Similarly, an arc is inserted when the argument mentions a relationship which does not appear in $IG_{SysInt}$. An arc (node) is deleted when the corresponding relation (proposition) appears in $IG_{SysInt}$, but is omitted from $IG_{Arg}$. When a node is deleted, all the arcs incident upon it are rerouted to connect its antecedents directly to its consequent. This operation, which models a small inferential leap, preserves the structure of the implication around the deleted node. If the arcs so rerouted are inconsistent with $IG_{Arg}$ they will be deleted separately. For each of these operations, the message announces how many times the operation was performed (e.g., how many nodes were deleted) and then provides sufficient information to enable the message receiver to identify the targets of the operation (e.g., which nodes were deleted). Thus, the length of the message which describes the structural operations required to transform $IG_{SysInt}$ into $IG_{Arg}$ comprises the following components:</Paragraph>
      <Paragraph position="10"> Node insertions: the number of inserted nodes plus the penalty for each insertion. Since a node is inserted when no proposition in the domain matches a statement in the argument, we use an insertion penalty equal to the match-score threshold - the probability-like score of the worst acceptable word-match between a statement and a proposition (Section 4.1). Thus the message length for node insertions is the cost of stating how many nodes were inserted plus one insertion penalty per inserted node.</Paragraph>
      <Paragraph position="12"> Node deletions: the number of deleted nodes plus their designations. To designate the nodes to be deleted, we select them from the nodes in SysInt.</Paragraph>
      <Paragraph position="14"> Arc insertions: the number of inserted arcs plus their designations plus the direction of each arc. (This component also describes the arcs incident upon newly inserted nodes.) To designate an arc, we need to select a pair of nodes (head and tail) from the nodes in $IG_{SysInt}$ and the newly inserted nodes. However, some nodes in $IG_{SysInt}$ are already connected by arcs. These arcs must be subtracted from the total number of arcs that can be inserted, yielding $\#\,\text{poss arc ins} = \binom{\#\,\text{nodes}(IG_{SysInt}) + \#\,\text{nodes ins}}{2} - \#\,\text{arcs}(IG_{SysInt})$. We also need to send 1 extra bit per inserted arc to convey its direction. Hence, the length of the message that conveys arc insertions is the cost of stating how many arcs were inserted, plus the cost of selecting them from the possible arc insertions, plus 1 bit per inserted arc.</Paragraph>
      <Paragraph position="19"> Arc deletions: the number of deleted arcs plus their designations.</Paragraph>
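Two of the structural steps described above lend themselves to a short sketch: rerouting arcs when a node is deleted, and counting the arcs that could still be inserted. The code below is a sketch under stated assumptions, not the authors' implementation. Arcs are represented as hypothetical (tail, head) pairs; the designation cost assumes every set of inserted arcs is equally likely; and `count_bits`, the cost of announcing how many arcs were inserted, is left as a parameter because the text above does not specify how the count is encoded.

```python
import math


def delete_node(arcs: set, node: str) -> set:
    """Remove `node`, connecting each of its antecedents directly to each of its consequents."""
    antecedents = {tail for tail, head in arcs if head == node}
    consequents = {head for tail, head in arcs if tail == node}
    kept = {(tail, head) for tail, head in arcs if node not in (tail, head)}
    rerouted = {(a, c) for a in antecedents for c in consequents if a != c}
    return kept | rerouted


def possible_arc_insertions(nodes_in_ig_sysint: int, nodes_inserted: int,
                            arcs_in_ig_sysint: int) -> int:
    """Node pairs that could receive a new arc: all pairs minus the pairs already connected."""
    return math.comb(nodes_in_ig_sysint + nodes_inserted, 2) - arcs_in_ig_sysint


def ml_arc_insertions(possible: int, arcs_inserted: int, count_bits: float) -> float:
    """Count announcement + designation of the arcs + 1 direction bit per arc."""
    designation_bits = math.log2(math.comb(possible, arcs_inserted))  # assumed encoding
    direction_bits = float(arcs_inserted)
    return count_bits + designation_bits + direction_bits


arcs = {("A", "B"), ("B", "C")}
print(delete_node(arcs, "B"))            # {('A', 'C')}
print(possible_arc_insertions(4, 1, 3))  # 7, i.e. 10 node pairs minus 3 existing arcs
```

The rerouting keeps the implication chain intact across the deleted node, which is the small inferential leap described above; any rerouted arc that the argument does not support would then be charged as a separate arc deletion.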
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.4 Calculating ML(Arg|IG_Arg)
</SectionTitle>
      <Paragraph position="0"> The given argument is structurally equivalent to $IG_{Arg}$. Hence, in order to transmit Arg in terms of $IG_{Arg}$ we only need to transmit how each statement in Arg differs from the canonical statement generated for the matching node in $IG_{Arg}$ (Section 4.1). The length of the message which conveys this information is $\sum_{p \in IG_{Arg}} ML(Sentence_p \text{ in Arg} \mid p)$,</Paragraph>
      <Paragraph position="2"> where $Sentence_p$ in Arg is the sentence in the original argument which matches the proposition for node $p$ in $IG_{Arg}$. Assuming an optimal message encoding, we obtain $-\sum_{p \in IG_{Arg}} \log_2 \Pr(Sentence_p \text{ in Arg} \mid p)$.</Paragraph>
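As a small worked instance of this sum, the sketch below adds up the code lengths of the sentences given their matching propositions. The probabilities and the second proposition label are hypothetical; in the system they would come from the normalized match-scores of Section 4.1.

```python
import math


def ml_arg_given_igarg(sentence_probabilities: dict) -> float:
    """Sum of -log2 Pr(sentence | proposition) over the nodes of IG_Arg."""
    return sum(-math.log2(p) for p in sentence_probabilities.values())


# Hypothetical probabilities, keyed by the proposition of each node in IG_Arg.
probabilities = {
    "G was in garden at 11": 0.5,
    "P2": 0.25,  # placeholder for another node of the same IG_Arg
}
print(ml_arg_given_igarg(probabilities))  # 3.0 bits: 1 bit + 2 bits
```

A sentence that matches its proposition closely has a high probability and therefore adds few bits, so a candidate whose propositions fit the wording of the argument is favoured.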
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 Proposing Interpretations
</SectionTitle>
    <Paragraph position="0"> Our system generates candidate interpretations for an argument by first postulating propositions that match the sentences in the argument, and then finding different ways to connect these propositions; each variant is a candidate interpretation.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.1 Postulating propositions
</SectionTitle>
      <Paragraph position="0"> We currently use a naive approach for postulating propositions. For each sentence $S_{Arg}$ in the given argument we generate candidate propositions as follows. For each proposition $p$ in the domain, the system proposes a canonical sentence $CS_p$ (produced by a simple English generator). This sentence is compared to $S_{Arg}$, yielding a match-score for the pair ($S_{Arg}$, $p$). When a match-score is above a threshold, we have found a candidate interpretation for $S_{Arg}$. For example, the proposition [G was in garden at 11] in Figure 1(b) is a plausible interpretation of the input sentence &quot;Mr Green was seen in the garden at 11&quot; in Figure 1(a). Some sentences may have no propositions with match-scores above the threshold. This does not automatically invalidate the argument, as it may still be possible to interpret the argument as a whole, even if a few sentences are not understood (Section 3.3).</Paragraph>
      <Paragraph position="1"> The match-score for a sentence $S_{Arg}$ and a proposition $p$ - a number in the [0,1] range - is calculated using a function which compares words in $S_{Arg}$ with words in $CS_p$. The goodness of a word-match depends on the following factors: (1) level of synonymy - the number of synonyms the words have in common (according to WordNet, Miller et al., 1990); (2) position in sentence; and (3) part-of-speech (PoS) - obtained using MINIPAR (Lin, 1998). That is, a word $w_i^{CS_p}$ in position $i$ in $CS_p$ perfectly matches a word $w_j^{S_{Arg}}$ in position $j$ in sentence $S_{Arg}$, if both words are exactly the same, they are in the same sentence position, and they have the same PoS. The match-score between $w_i^{CS_p}$ and $w_j^{S_{Arg}}$ is reduced as their level of synonymy falls, and as the difference in their sentence position increases. The match-score of two words is further reduced if they have different PoS. In addition, the PoS affects the penalty for a mismatch, so that mismatched non-content words are penalized less than mismatched content words.</Paragraph>
      <Paragraph position="2"> The match-scores between a sentence and its candidate propositions are normalized, and the result used to approximate $\Pr(S_{Arg} \mid p)$, which is required for the MML evaluation (Section 3.4).</Paragraph>
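The thresholding and normalization steps can be sketched as follows. The text does not say how the scores are normalized, so the sketch assumes the simplest option, dividing each retained score by the sum of the retained scores; the function name, the scores and the threshold value are hypothetical.

```python
def normalize_match_scores(scores: dict, threshold: float) -> dict:
    """Keep propositions scoring above the threshold and rescale their scores to sum to 1."""
    kept = {prop: score for prop, score in scores.items() if score > threshold}
    total = sum(kept.values())
    return {prop: score / total for prop, score in kept.items()}


# Hypothetical match-scores for the sentence "Mr Green was seen in the garden at 11".
scores = {
    "G was in garden at 11": 0.5,
    "N saw G in garden": 0.25,
    "unrelated proposition": 0.0625,
}
for proposition, probability in normalize_match_scores(scores, threshold=0.125).items():
    print(proposition, round(probability, 3))
```

If no proposition scores above the threshold, the function returns an empty dictionary, which corresponds to a sentence that the system cannot match to its domain.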
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.2 Connecting the propositions
</SectionTitle>
      <Paragraph position="0"> Since more than one node may match each of the sentences in an argument, there may be more than one $IG_{Arg}$ that is consistent with the argument. For instance, the sentence &quot;Mr Green was seen in the garden at 11&quot; in Figure 1(a) matches both [G was in garden at 11] and [N saw G in garden] (although the former has a higher probability). If the other sentences in Figure 1(a) match only one proposition, two IGs that match the argument will be generated - one for each of the above alternatives.</Paragraph>
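The enumeration of alternatives amounts to taking one matching proposition per sentence in every possible combination. The sketch below shows this with a Cartesian product; the function name is hypothetical, and "P2" and "P3" are placeholders for the propositions matched by the other sentences, which are not given here.

```python
from itertools import product


def candidate_node_sets(matches_per_sentence: list) -> list:
    """Return one tuple of propositions per candidate IG_Arg.

    matches_per_sentence[i] lists the propositions that match sentence i.
    """
    return list(product(*matches_per_sentence))


matches = [
    ["G was in garden at 11", "N saw G in garden"],  # two alternatives for one sentence
    ["P2"],                                          # hypothetical placeholder
    ["P3"],                                          # hypothetical placeholder
]
for nodes in candidate_node_sets(matches):
    print(nodes)
```

With one ambiguous sentence and two unambiguous ones, the product yields exactly two node sets, one for each alternative reading. A sentence with no matching proposition would need separate handling, since an empty list makes the product empty.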
      <Paragraph position="1"> Figure 2 illustrates the remainder of the interpretation-generation process with respect to one $IG_{Arg}$. This process consists of finding connections between the nodes in $IG_{Arg}$; eliminating superfluous nodes; and generating sub-graphs of the resulting graph, such that all the nodes in $IG_{Arg}$ are connected (Figures 2(b), 2(c) and 2(d), respectively). The connections between the nodes in $IG_{Arg}$ are found by applying two rounds of inferences from these nodes (spreading outward). These two rounds enable the system to &quot;make sense&quot; of an argument with small inferential leaps (Zukerman, 2001). If upon completion of this process, some nodes in $IG_{Arg}$ are still unconnected, the system rejects $IG_{Arg}$. This process is currently implemented in the context of a BN. However, any representation that supports the generation of a connected argument involving a given set of propositions would be appropriate.</Paragraph>
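The spreading step and the rejection test can be approximated on a plain graph. The sketch below is a simplification under stated assumptions: the domain is treated as an undirected adjacency map rather than a BN, one round of inference is modeled as moving to all neighbouring nodes, and all function names are hypothetical.

```python
def undirected(edges) -> dict:
    """Build a symmetric adjacency map from a list of node pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def expand(adjacency: dict, start: set, rounds: int = 2) -> set:
    """Nodes reached by spreading outward from `start` for the given number of rounds."""
    reached = set(start)
    frontier = set(start)
    for _ in range(rounds):
        frontier = {n for node in frontier for n in adjacency.get(node, ())} - reached
        reached |= frontier
    return reached


def connects_argument(adjacency: dict, arg_nodes: set, rounds: int = 2) -> bool:
    """True if all argument nodes are linked through the nodes reached by spreading."""
    if not arg_nodes:
        return True
    reached = expand(adjacency, arg_nodes, rounds)
    seed = next(iter(arg_nodes))
    component, stack = {seed}, [seed]
    while stack:  # flood-fill restricted to the reached sub-graph
        node = stack.pop()
        for n in adjacency.get(node, ()):
            if n in reached and n not in component:
                component.add(n)
                stack.append(n)
    return arg_nodes.issubset(component)


near = undirected([("A", "x"), ("x", "y"), ("y", "B")])
far = undirected([("A", "n1"), ("n1", "n2"), ("n2", "n3"),
                  ("n3", "n4"), ("n4", "n5"), ("n5", "B")])
print(connects_argument(near, {"A", "B"}))  # True: the gap is bridged within two rounds
print(connects_argument(far, {"A", "B"}))   # False: n3 is never reached, so this IG_Arg is rejected
```

Limiting the spread to two rounds is what keeps the inferential leaps small: argument nodes that lie too far apart in the domain stay unconnected and the candidate is discarded.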
    </Section>
  </Section>
</Paper>