Towards a Noise-Tolerant, Representation-Independent Mechanism for
Argument Interpretationa0
Ingrid Zukerman and Sarah George
School of Computer Science and Software Engineering
Monash University
Clayton, VICTORIA 3800, AUSTRALIA
Abstract
We describe a mechanism for the interpretation of
arguments, which can cope with noisy conditions
in terms of wording, beliefs and argument struc-
ture. This is achieved through the application of
the Minimum Message Length Principle to evalu-
ate candidate interpretations. Our system receives
as input a quasi-Natural Language argument, where
propositions are presented in English, and gener-
ates an interpretation of the argument in the form
of a Bayesian network (BN). Performance was eval-
uated by distorting the system’s arguments (gener-
ated from a BN) and feeding them to the system for
interpretation. In 75% of the cases, the interpreta-
tions produced by the system matched precisely or
almost-precisely the representation of the original
arguments.
1 Introduction
In this paper, we focus on the interpretation of argu-
mentative discourse, which is composed of impli-
cations. We present a mechanism for the interpreta-
tion of NL arguments which is based on the applica-
tion of the Minimum Message Length (MML) Prin-
ciple for the evaluation of candidate interpretations
(Wallace and Boulton, 1968). The MML princi-
ple provides a uniform and incremental framework
for combining the uncertainty arising from different
stages of the interpretation process. This enables
our mechanism to cope with noisy input in terms of
wording, beliefs and argument structure, and to fac-
tor out the elements of an interpretation which rely
on a particular knowledge representation.
So far, our mechanism has been tested on one
knowledge representation – Bayesian Networks
(BNs) (Pearl, 1988); logic-based representations
will be tested in the future. Figure 1(a) shows a
simple argument, and Figure 1(d) shows a subset
a1 This research was supported in part by Australian Research
Council grant A49927212.
of a BN which contains the preferred interpretation
of this argument (the nodes corresponding to the
original argument are shaded). In this example, the
argument is obtained through a web interface (the
uncertainty value of the consequent is entered us-
ing a drop-down menu). As seen in this example,
the input argument differs structurally from the sys-
tem’s interpretation. In addition, the belief value for
the consequent differs from that in the domain BN,
and the wording of the statements differs from the
canonical wording of the BN nodes. Still, the sys-
tem found a reasonable interpretation in the context
of its domain model.
The results obtained in this informal trial are val-
idated by our automated evaluation. This evalua-
tion, which assesses baseline performance, consists
of passing distorted versions of the system’s argu-
ments back to the system for interpretation. In 75%
of the cases, the interpretations produced by the sys-
tem matched the original arguments (in BN form)
precisely or almost-precisely.
In the next section, we review related research.
We then describe the application of the MML cri-
terion to the evaluation of interpretations. In Sec-
tion 4, we outline the argument interpretation pro-
cess. The results of our evaluation are reported in
Section 5, followed by concluding remarks.
2 Related Research
Our research integrates reasoning under uncertainty
for plan recognition in discourse understanding with
the application of the MML principle (Wallace and
Boulton, 1968).
BNs in particular have been used in several such
plan recognition tasks, e.g., (Charniak and Gold-
man, 1993; Horvitz and Paek, 1999; Zukerman,
2001). Charniak and Goldman’s system handled
complex narratives, using a BN and marker passing
for plan recognition. It automatically built and in-
crementally extended a BN from propositions read
ML(IG      |IG          )Arg SysInt
ML(Arg | IG      )Arg
ML(IG          |SysInt) = 0SysInt
(d) SysInt (Top 
     Candidate) N reported argument
N heard argument
G argued with B
G and B were enemies
G was in garden at 11
G was in garden
at time of death
G had motiveG had opportunity
G murdered B
N reported argument
N heard argument
G argued with B
G and B were enemies
G was in garden at 11
G was in garden
at time of death
G had motiveG had opportunity
G murdered B
(c) IG        for best 
     SysIntSysInt
(b) Top-ranked IG
N reported argumentG was in garden at 11
G murdered B
(a) Original argument (Arg)
Arg
The neighbour reported a             argument
  between Mr Green and Mr Body last week
Mr Green was         in the garden at 11
Mr Body was murdered by Mr Green
AND
-> [likely]
heated
seen
Figure 1: Interpretation and MML evaluation
in a story, so that the BN represented hypotheses
that became plausible as the story unfolded. In
contrast, we use a BN to constrain our understand-
ing of the propositions in an argument, and apply
the MML principle to select a plausible interpreta-
tion. Both Horvitz and Paek’s system and Zuker-
man’s handled short dialogue contributions. Horvitz
and Paek used BNs at different levels of an abstrac-
tion hierarchy to infer a user’s goal in information-
seeking interactions with a Bayesian Receptionist.
Zukerman used a domain model and user model rep-
resented as a BN, together with linguistic and at-
tentional information to infer a user’s goal from a
short-form rejoinder. However, the combination of
these knowledge sources was based on heuristics.
The MML principle is a model selection tech-
nique which applies information-theoretic criteria
to trade data fit against model complexity. Se-
lected applications which use MML are listed
in http://www.csse.monash.edu.au/a0 dld/
Snob.application.papers.
3 Argument Interpretation Using MML
According to the MML criterion, we imagine send-
ing to a receiver the shortest possible message that
describes an NL argument. When a good interpreta-
tion is found, a message which encodes the NL ar-
gument in terms of this interpretation will be shorter
than the message which transmits the words of the
argument directly.
A message that encodes an NL argument in terms
of an interpretation is composed of two parts: (1) in-
structions for building the interpretation, and (2) in-
structions for rebuilding the original argument from
this interpretation. These two parts balance the need
for a concise interpretation (Part 1) with the need
for an interpretation that matches closely the orig-
inal argument (Part 2). For instance, the message
for a concise interpretation that does not match well
the original argument will have a short first part
but a long second part. In contrast, a more com-
plex interpretation which better matches the orig-
inal argument may yield a shorter message over-
all. As a result, in finding the interpretation that
yields the shortest message for an NL argument, we
will have produced a plausible interpretation, which
hopefully is the intended interpretation. To find this
interpretation, we compare the message length of
the candidate interpretations. These candidates are
obtained as described in Section 4.
3.1 MML Encoding
The MML criterion is derived from Bayes Theorem:
Pra1a3a2a5a4a7a6a9a8a11a10 Pra1a3a6a9a8a13a12 Pra1a3a2a9a14a6a15a8 , where a2 is the
data and a6 is a hypothesis which explains the data.
An optimal code for an event a16 with probability
Pra1 a16
a8 has message length MLa1
a16
a8a17a10a19a18a21a20a23a22a25a24a27a26 Pra1
a16
a8
(measured in bits). Hence, the message length for
the data and a hypothesis is:
MLa1a3a2a5a4a7a6a15a8a28a10 MLa1a3a6a9a8a30a29 MLa1a3a2a15a14a6a9a8a32a31
The hypothesis for which MLa1a3a2a33a4a34a6a9a8 is minimal is
considered the best hypothesis.
Now, in our context, Arg contains the argument,
and SysInt an interpretation generated by our sys-
tem. Thus, we are looking for the SysInt which
yields the shortest message length for
MLa1Arga4 SysInta8a25a10 MLa1SysInta8a30a29 MLa1Arga14SysInta8
The first part of the message describes the in-
terpretation, and the second part describes how
to reconstruct the argument from the interpreta-
tion. To calculate the second part, we rely on
an intermediate representation called Implication
Graph (IG). An Implication Graph is a graphi-
cal representation of an argument, which repre-
sents a basic “understanding” of the argument.
It is composed of simple implications of the
form Antecedenta0 Antecedenta26 a31 a31 a31 Antecedenta1a3a2
Consequent (wherea2 indicates that the antecedents
imply the consequent, without distinguishing be-
tween causal and evidential implications). a4a6a5 Arg
represents an understanding of the input argument.
It contains propositions from the underlying repre-
sentation, but retains the structure of the argument.
a4a6a5 SysInt represents an understanding of a candidate
interpretation. It is directly obtained from SysInt.
Hence, both its structure and its propositions corre-
spond to the underlying representation. Since both
a4a6a5 Arg and a4a7a5 SysInt use domain propositions and
have the same type of representation, they can be
compared with relative ease.
Figure 1 illustrates the interpretation of a small
argument, and the calculation of the message length
of the interpretation. The interpretation process
obtains a4a6a5 Arg from the input, and SysInt from
a4a6a5 Arg (left-hand side of Figure 1). If a sentence in
Arg matches more than one domain proposition, the
system generates more than one a4a6a5 Arg from Arg
(Section 4.1). Each a4a6a5 Arg may in turn yield more
than one SysInt. This happens when the underlying
representation has several ways of connecting
between the nodes in a4a6a5 Arg (Section 4.2). The
message length calculation goes from SysInt to Arg
through the intermediate representations a4a6a5 SysInt
anda4a6a5 Arg (right-hand side of Figure 1). This calcu-
lation takes advantage of the fact that there can be
only one a4a6a5 Arg for each Arg–SysInt combination.
Hence,
Pra1 Arga4 SysInta8 a10 Pra1 Arga8a9a4a6a5 Arga8 SysInta8
a10 Pra1 Arga14
a4a7a5 Arga8 SysInt
a8 Pra1
a4a6a5 Arg
a14SysInta8 Pra1 SysInta8
cond. ind.a10 Pra1 Arga14
a4a7a5 Arg
a8 Pra1
a4a6a5 Arg
a14SysInta8 Pra1 SysInta8
Thus, the length of the message required to trans-
mit the original argument from an interpretation is
MLa1 Arg a4 SysInta8 a10 (1)
MLa1 Arga14a4a7a5 Arga8 a29 MLa1a4a7a5 Arga14SysInta8 a29 MLa1 SysInta8
That is, for each candidate interpretation, we cal-
culate the length of the message which conveys:
a10 SysInt – the interpretation,
a10
a4a6a5 Arg
a14SysInt – how to obtain the belief and struc-
ture of a4a6a5 Arg from SysInt,1 and
1We use
a11a13a12 SysInt for this calculation, rather than SysInt.
This does not affect the message length because the receiver
can obtaina11a13a12 SysInt directly from SysInt.
a10 Arg
a14
a4a6a5 Arg – how to obtain the sentences in Arg
from the corresponding nodes ina4a7a5 Arg.
The interpretation which yields the shortest mes-
sage is selected (the message-length equations for
each component are summarized in Table 1).
3.2 Calculating MLa1 SysInta8
In order to transmit SysInt, we simply send its
propositions and the relations between them. A
standard MML assumption is that the sender and re-
ceiver share domain knowledge. Hence, one way to
send SysInt consists of transmitting how SysInt is
extracted from the domain representation. This in-
volves selecting its propositions from those in the
domain, and then choosing which of the possible
relations between these propositions are included in
the interpretation. In the case of a BN, the proposi-
tions are represented as nodes, and the relations be-
tween propositions as arcs. Thus the message length
for SysInt in the context of a BN is
a20a23a22a25a24 a26 C# nodes(domainBN)
# nodes(SysInt)
a29 a20a23a22a25a24 a26 C# incident arcs(SysInt)
# arcs(SysInt) (2)
3.3 Calculating MLa1 IGArga14SysInta8
The message which describes a4a7a5 Arg in terms of
SysInt (or rather in terms of a4a6a5 SysInt) conveys how
a4a6a5 Arg differs from the system’s interpretation in two
respects: (1) belief, and (2) argument structure.
3.3.1 Belief differences
For each proposition a14 in both a4a6a5 SysInt and a4a6a5 Arg,
we transmit any discrepancy between the belief
stated in the argument and the system’s belief in this
proposition (propositions that appear in only one IG
are handled by the message component which de-
scribes structural differences). The length of the
message required to convey this information is
a15
a16a18a17a20a19a22a21
Arga23
a19a24a21
SysInt
MLa1a26a25a28a27a13a29 a1a14a30a8a9a4a7a5 Arga8 a14a25a31a27a13a29 a1a14a32a8a9a4a6a5 SysInta8 a8
where a25a31a27a33a29 a1a14a30a8a9a4a7a5a35a34 a8 is the belief in proposition a14
in a4a6a5 a34 . Assuming an optimal message encoding,
we obtain
a15
a16a36a17a37a19a22a21
Arga23
a19a22a21
SysInt
a18 a20a23a22a25a24 a26 Pra1a26a25a31a27a33a29 a1
a14a32a8a9a4a6a5 Arg
a8 a14a25a31a27a33a29 a1
a14a32a8a9a4a6a5 SysInt
a8 a8
(3)which expresses discrepancies in belief as a proba-
bility that the argument will posit a particular belief
in a proposition, given the belief held by the system
in this proposition. We have modeled this probabil-
ity using a function which yields a maximum proba-
bility mass when the belief in propositiona14 accord-
ing to the argument agrees with the system’s belief.
This probability gradually falls as the discrepancy
between the belief stated in the argument and the
system’s belief increases, which in turn yields an
increased message length.
3.3.2 Structural differences
The message which transmits the structural discrep-
ancies between a4a6a5 SysInt and a4a7a5 Arg describes the
structural operations required to transform a4a6a5 SysInt
into a4a7a5 Arg. These operations are: node insertions
and deletions, and arc insertions and deletions. A
node is inserted in a4a6a5 SysInt when the system can-
not reconcile a proposition in the given argument
with any proposition in its domain representation.
In this case, the system proposes a special Escape
(wild card) node. Note that the system does not pre-
sume to understand this proposition, but still hopes
to achieve some understanding of the argument as a
whole. Similarly, an arc is inserted when the argu-
ment mentions a relationship which does not appear
ina4a6a5 SysInt. An arc (node) is deleted when the corre-
sponding relation (proposition) appears in a4a7a5 SysInt,
but is omitted from a4a7a5 Arg. When a node is deleted,
all the arcs incident upon it are rerouted to connect
its antecedents directly to its consequent. This oper-
ation, which models a small inferential leap, pre-
serves the structure of the implication around the
deleted node. If the arcs so rerouted are inconsis-
tent witha4a6a5 Arg they will be deleted separately.
For each of these operations, the message an-
nounces how many times the operation was per-
formed (e.g., how many nodes were deleted) and
then provides sufficient information to enable the
message receiver to identify the targets of the op-
eration (e.g., which nodes were deleted). Thus, the
length of the message which describes the structural
operations required to transforma4a7a5 SysInt intoa4a6a5 Arg
comprises the following components:
MLa1 node insertionsa8 a29 MLa1 node deletionsa8 a29
MLa1 arc insertionsa8a30a29 MLa1 arc deletionsa8 (4)
a10 Node insertions = number of inserted nodes
plus the penalty for each insertion. Since a node
is inserted when no proposition in the domain
matches a statement in the argument, we use an
insertion penalty equal to a0a2a1 – the probability-
like score of the worst acceptable word-match
between a statement and a proposition (Sec-
tion 4.1). Thus the message length for node in-
sertions is
a20a23a22a25a24 a26 a1 # nodes insa8 a29 # nodes insa12 a1a18 a20a23a22a25a24 a26
a0a3a1
a8 (5)
a10 Node deletions = number of deleted nodes plus
their designations. To designate the nodes to be
deleted, we select them from the nodes in SysInt
(or a4a6a5 SysInt):
a20a23a22a25a24 a26 a1 # nodes dela8 a29a34a20a23a22a25a24 a26 C# nodes(a11a13a12 SysInt)
# nodes del (6)
a10 Arc insertions = number of inserted arcs plus
their designations plus the direction of each arc.
(This component also describes the arcs incident
upon newly inserted nodes.) To designate an arc,
we need to select a pair of nodes (head and tail)
from the nodes ina4a7a5 SysInt and the newly inserted
nodes. However, some nodes in a4a6a5 SysInt are al-
ready connected by arcs. These arcs must be
subtracted from the total number of arcs that can
be inserted, yielding
# poss arc ins a10 C# nodes(a11a13a12 SysInt)+# nodes insa26
a18 # arcs(
a4a6a5 SysInt)
We also need to send 1 extra bit per inserted arc
to convey its direction. Hence, the length of the
message that conveys arc insertions is:
a20a23a22a25a24 a26a25a1 # arcs insa8 a29 a20a23a22a25a24 a26 C# poss arc ins
# arcs ins
a29 # arcs ins
(7)
a10 Arc deletions = number of deleted arcs plus
their designations.
a20a23a22a25a24 a26 a1 # arcs dela8 a29 a20a22a25a24 a26 C# arcs(a11a13a12 SysInt)
# arcs del (8)
3.4 Calculating ML(Arga14IGArg)
The given argument is structurally equivalent to
a4a6a5 Arg. Hence, in order to transmit Arg in terms of
a4a6a5 Arg we only need to transmit how each statement
in Arg differs from the canonical statement gener-
ated for the matching node in a4a6a5 Arg (Section 4.1).
The length of the message which conveys this infor-
mation is
a15
a16a18a17a20a19a22a21
Arg
MLa1 Sentencea16 in Arga14a14
a8
where Sentencea16 in Arg is the sentence in the orig-
inal argument which matches the proposition for
node a14 in a4a6a5 Arg. Assuming an optimal message
encoding, we obtain
a15
a16a36a17a37a19a22a21
Arg
a18a21a20a23a22a25a24 a26 Pra1 Sentence
a16 in Arg
a14
a14
a8 (9)
We approximate Pra1 Sentencea16 in Arga14a14 a8 using
the score returned by the comparison function de-
scribed in Section 4.1.
Table 1: Summary of Message Length Calculation
MLa1 Arg a4 SysInta8 Equation 1
MLa1 SysInta8 Equation 2
MLa1a4a7a5 Arga14SysInta8
belief operations Equation 3
structural operations Equations 4, 5, 6, 7, 8
MLa1 Arga14a4a7a5 Arga8 Equation 9
4 Proposing Interpretations
Our system generates candidate interpretations for
an argument by first postulating propositions that
match the sentences in the argument, and then find-
ing different ways to connect these propositions –
each variant is a candidate interpretation.
4.1 Postulating propositions
We currently use a naive approach for postulating
propositions. For each sentence a0 Arg in the given
argument we generate candidate propositions as fol-
lows. For each proposition a14 in the domain, the
system proposes a canonical sentence a1a2a0 a16 (pro-
duced by a simple English generator). This sen-
tence is compared to a0 Arg, yielding a match-score
for the pair (a0 Arg,a14 ). When a match-score is above
a threshold a0 a1 , we have found a candidate interpre-
tation for a0 Arg. For example, the proposition [G was
in garden at 11] in Figure 1(b) is a plausible interpre-
tation of the input sentence “Mr Green was seen in
the garden at 11” in Figure 1(a). Some sentences
may have no propositions with match-scores above
a0a3a1 . This does not automatically invalidate the ar-
gument, as it may still be possible to interpret the
argument as a whole, even if a few sentences are
not understood (Section 3.3).
The match-score for a sentence a0 Arg and a propo-
sition a14 – a number in the [0,1] range – is cal-
culated using a function which compares words in
a0 Arg with words in a1a2a0
a16 . The goodness of a word-
match depends on the following factors: (1) level
of synonymy – the number of synonyms the words
have in common (according to WordNet, Miller et
al., 1990); (2) position in sentence; and (3) part-
of-speech (PoS) – obtained using MINIPAR (Lin,
1998). That is, a word a3a5a4a7a6a8a10a9a12a11 in position a13 in a1a2a0 a16
matches perfectly a word a3a15a14 a6a9 Arg in position a16 in
sentence a0 Arg, if both words are exactly the same,
they are in the same sentence position, and they have
the same PoS. The match-score between a3a5a4a17a6a8a10a9a12a11
and a3a18a14 a6a9 Arg is reduced as their level of synonymy
falls, and as the difference in their sentence position
increases. The match-score of two words is further
reduced if they have different PoS. In addition, the
PoS affects the penalty for a mismatch, so that mis-
matched non-content words are penalized less than
mismatched content words.
The match-scores between a sentence and its can-
didate propositions are normalized, and the result
used to approximate Pra1 a0 Arga14a14 a8 , which is required
for the MML evaluation (Section 3.4).2
4.2 Connecting the propositions
Since more than one node may match each of the
sentences in an argument, there may be more than
one a4a6a5 Arg that is consistent with the argument. For
instance, the sentence “Mr Green was seen in the
garden at 11” in Figure 1(a) matches both [G was in
garden at 11] and [N saw G in garden] (although the
former has a higher probability). If the other sen-
tences in Figure 1(a) match only one proposition,
two IGs that match the argument will be generated
– one for each of the above alternatives.
Figure 2 illustrates the remainder of the
interpretation-generation process with respect to
one a4a6a5 Arg. This process consists of finding con-
nections between the nodes in a4a7a5 Arg; eliminat-
ing superfluous nodes; and generating sub-graphs
of the resulting graph, such that all the nodes in
a4a6a5 Arg are connected (Figures 2(b), 2(c) and 2(d),
respectively). The connections between the nodes
ina4a7a5 Arg are found by applying two rounds of infer-
ences from these nodes (spreading outward). These
two rounds enable the system to “make sense” of an
argument with small inferential leaps (Zukerman,
2001). If upon completion of this process, some
nodes ina4a6a5 Arg are still unconnected, the system re-
jects a4a6a5 Arg. This process is currently implemented
in the context of a BN. However, any representa-
tion that supports the generation of a connected ar-
gument involving a given set of propositions would
be appropriate.
5 Evaluation
Our evaluation consisted of an automated experi-
ment where the system interpreted noisy versions
of its own arguments. These arguments were gener-
ated from different sub-nets of its domain BN, and
they were distorted at the BN level and at the NL
level. At the BN level, we changed the beliefs in
the nodes, and we inserted and deleted nodes and
arcs. At the NL level, we distorted the wording
of the propositions in the resultant arguments. All
2We are implementing a more principled model for sentence
comparison which yields more accurate probabilities.
(a) IG (b) Expand twice from the
      nodes in IG
(c) Eliminate nodes that
      aren’t in a shortest path
(d) Candidates are all the subgraphs of (c) that connect the nodes in IG
      (4 of the 9 candidates are shown)
a
b
c
a
d
e
b
f
c
i
j
m
n
p
s
tx
a
d
e
b
f
c
n
tx
a
d
e
b
f
c
m
n
p
s
tx
a
b
f
c
i
j
m p
s
b
f
c
i
j
n
tx
a
a
d
e
b
f
c
i
j
k
l
m
n
o
pq
s
tv x
Arg
Arg
Arg
Figure 2: Argument interpretation process
these distortions were performed for BNs of differ-
ent sizes (3, 5, 7 and 9 arcs). Our measure of perfor-
mance is the edit-distance between the original BN
used to generate an argument, and the BN produced
as the interpretation of this argument. For instance,
two BNs that differ by one arc have an edit-distance
of 2 (one addition and one deletion), while a perfect
match has an edit-distance of 0.
Overall, our results were as follows. Our system
produced an interpretation in 86% of the 5400 tri-
als. In 75% of the 5400 cases, the generated inter-
pretations had an edit-distance of 3 or less from the
original BN, and in 50% of the cases, the interpre-
tations matched perfectly the original BN. Figure 3
depicts the frequency of edit distances for the differ-
ent BN sizes under all noise conditions. We plotted
edit-distances of 0, a31 a31 a31 , 9 and a0a2a1 , plus the cate-
gory NI, which stands for “No Interpretation”. As
shown in Figure 3, the 0 edit-distance has the high-
est frequency, and performance deteriorates as BN
size increases. Still, for BNs of 7 arcs or less, the
vast majority of the interpretations have an edit dis-
tance of 3 or less. Only for BNs of 9 arcs the number
of NIs exceeds the number of perfect matches.
We also tested each kind of noise separately,
maintaining the other kinds of noise at 0%. All
the distortions were between 0 and 40%. We per-
formed 1560 trials for word noise, arc noise and
node insertions, and 2040 trials for belief noise,
which warranted additional observations. Figures 4,
Figure 3: Frequency of edit-distances for all noise
conditions (5400 trials)
5 and 6 show the recognition accuracy of our sys-
tem (in terms of average edit distance) as a func-
tion of arc, belief and word noise percentages, re-
spectively. The performance for the different BN
sizes (in arcs) is also shown. Our system’s perfor-
mance for node insertions is similar to that obtained
for belief noise (the graph was not included owing
to space limitations). Our results show that the two
main factors that affect recognition performance are
BN size and word noise, while the average edit dis-
tance remains stable for belief and arc noise, as well
as for node insertions (the only exception occurs for
40% arc noise and size 9 BNs). Specifically, for arc
noise, belief noise and node insertions, the average
Figure 4: Effect of arc noise on performance (1560
trials)
Figure 5: Effect of belief noise on performance
(2040 trials)
edit distance was 3 or less for all noise percentages,
while for word noise, the average edit distance was
higher for several word-noise and BN-size combi-
nations. Further, performance deteriorated as the
percentage of word noise increased.
The impact of word noise on performance rein-
forces our intention to implement a more principled
sentence comparison procedure (Section 4.1), with
the expectation that it will improve this aspect of our
system’s performance.
6 Conclusion
We have offered a mechanism which produces in-
terpretations of segmented NL arguments. Our ap-
plication of the MML principle enables our system
to handle noisy conditions in terms of wording, be-
liefs and argument structure, and allows us to isolate
the effect of the underlying knowledge representa-
tion on the interpretation process. The results of our
automated evaluation were encouraging, with inter-
Figure 6: Effect of word noise on performance
(1560 trials)
pretations that match perfectly or almost-perfectly
the source-BN being generated in 75% of the cases
under all noise conditions.

References

Eugene Charniak and Robert P. Goldman. 1993. A
Bayesian model of plan recognition. Artificial In-
telligence, 64(1):50–56.

Eric Horvitz and Tim Paek. 1999. A computa-
tional architecture for conversation. In UM99 –
Proceedings of the Seventh International Confer-
ence on User Modeling, pages 201–210, Banff,
Canada.

Dekang Lin. 1998. Dependency-based evaluation
of MINIPAR. In Workshop on the Evaluation of
Parsing Systems, Granada, Spain.

George Miller, Richard Beckwith, Christiane Fell-
baum, Derek Gross, and Katherine Miller. 1990.
Introduction to WordNet: An on-line lexical
database. Journal of Lexicography, 3(4):235–
244.

Judea Pearl. 1988. Probabilistic Reasoning in In-
telligent Systems. Morgan Kaufmann Publishers,
San Mateo, California.

C.S. Wallace and D.M. Boulton. 1968. An infor-
mation measure for classification. The Computer
Journal, 11:185–194.

Ingrid Zukerman. 2001. An integrated approach
for generating arguments and rebuttals and un-
derstanding rejoinders. In UM01 – Proceedings
of the Eighth International Conference on User
Modeling, pages 84–94, Sonthofen, Germany.
