Ambiguity Resolution in the DMTRANS PLUS 
Hiroaki Kitano, Hideto Tomabechi, and Lori Levin 
Abstract 
We present a cost-based (or energy-based) model of dis- 
ambiguation. When a sentence is ambiguous, a parse with 
the least cost is chosen from among multiple hypotheses. 
Each hypothesis is assigned a cost which is added when: 
(1) a new instance is created to satisfy reference success, 
(2) links between instances are created or removed to sat- 
isfy constraints on concept sequences, and (3) a concept 
node with insufficient priming is used for further process- 
ing. This method of ambiguity resolution is implemented in 
DMTRANS PLUS, which is a second generation bi-directional 
English/Japanese machine translation system based on a mas- 
sively parallel spreading activation paradigm developed at 
the Center for Machine Translation at Carnegie Mellon Uni- 
versity. 
Center for Machine Translation 
Carnegie Mellon University 
Pittsburgh, PA 15213 U.S.A. 
1 Introduction 
One of the central issues in natural language understanding
research is ambiguity resolution. Since many
sentences are ambiguous out of context, techniques for
ambiguity resolution have been an important topic in
natural language understanding. In this paper, we describe
a model of ambiguity resolution implemented
in DMTRANS PLUS, which is a next generation machine
translation system based on a massively parallel
computational paradigm. In our model, ambiguities
are resolved by evaluating the cost of each hypothesis;
the hypothesis with the least cost will be selected.
Costs are assigned when (1) a new instance is created
to satisfy reference success, (2) links between instances
are created or removed to satisfy constraints
on concept sequences, and (3) a concept node with
insufficient priming is used for further processing.
The underlying philosophy of the model is to view
parsing as a dynamic physical process in which one
trajectory is taken from among many other possible
paths. Thus our notion of the cost of the hypothesis is
a representation of the workload required to take the
path representing the hypothesis. One other important
idea is that our model employs the direct memory
access (DMA) paradigm of natural language processing.
Under the DMA paradigm, the mental state of
the hearer is modelled by a massively parallel network
representing memory. Parsing is performed by passing
markers in the memory network. In our model,
the meaning of a sentence is viewed as modifications
made to the memory network. The meaning of a sentence
in our model is definable as the difference in the
memory network before and after understanding the
sentence.
*E-mail address is hiroaki@a.nl.cs.cmu.edu. Also with NEC Corporation. 
2 Limitations of Current Methods 
of Ambiguity Resolution 
Traditional syntactic parsers have been using attach- 
ment preferences and local syntactic and semantic con- 
straints for resolving lexical and structural ambiguities. 
(\[17\], \[28\], \[2\], \[7\], \[26\], \[11\], \[5\]) However, these 
methods cannot select one interpretation from several 
plausible interpretations because they do not incorpo- 
rate the discourse context of the sentences being parsed 
(\[8\], \[4\]). 
Connectionist-type approaches as seen in \[18\], \[25\], 
and \[8\] essentially stick to semantic restrictions and 
associations. However, \[18\], \[25\], \[24\] only provide 
local interactions, omitting interaction with context. 
Moreover, difficulties regarding variable-binding and 
embedded sentences should be noted. 
In \[8\], world knowledge is used through testing ref- 
erential success and other sequential tests. However, 
this method does not provide a uniform model of pars- 
ing: lexical ambiguities are resolved by marker passing 
and structural disambiguations are resolved by apply- 
ing separate sequential tests. 
An approach by \[15\] is similar to our model in that 
both perceive parsing as a physical process. However, 
their model, along with most other models, fails to 
capture discourse context. 
\[12\] uses marker passing as a method of contextual
inference after a parse; however, no contextual
information is fed back during the sentential parsing
(marker-passing is performed after a separate parsing
process providing multiple hypotheses of the parse). 
\[20\] is closer to our model in that marker-passing 
based contextual inference is used during a sentential 
parse (i.e., an integrated processing of syntax, semantics
and pragmatics in real time); however the parsing
(LFG, and case-frame based) and contextual inferences
(marker-passing) are not under a uniform architecture. 
Past generations of DMTRANS (\[19\], \[23\]) have not 
incorporated cost-based structural ambiguity resolution 
schemes. 
3 Overview of DMTRANS PLUS 
3.1 Memory Access Parsing 
DMTRANS PLUS is a second generation DMA system 
based upon DMTRANS (\[19\]) with new methods of am- 
biguity resolution based on costs. 
Unlike most natural language systems, which are 
based on the "Build-and-Store" model, our system 
employs a "Recognize-and-Record" model (\[14\],\[19\], 
\[21\]). Understanding of an input sentence (or speech 
input in ΦDMTRANS PLUS) is defined as changes made 
in a memory network. Parsing and natural language 
understanding in these systems are considered to be 
memory-access processes, identifying existent knowl- 
edge in memory with the current input. Sentences 
are always parsed in context, i.e., through utilizing 
the existing and (currently acquired) knowledge about 
the world. In other words, during parsing, relevant 
discourse entities in memory are constantly being re- 
membered. 
The model behind DMTRANS PLUS is a simulation 
of such a process. The memory network incorporates 
knowledge from morphophonetics to discourse. Each 
node represents a concept (Concept Class node; CC) 
or a sequence of concepts (Concept Sequence Class 
node; CSC). 
CCs represent such knowledge as phones (e.g. \[k\]),
phonemes (e.g. /k/), concepts (e.g. *Hand-Gun,
*Event, *Mtrans-Action), and plans (e.g. *Pick-Up-Gun).
A hierarchy of Concept Class (CC) entities
stores knowledge both declaratively and procedurally
as described in \[19\] and \[21\]. Lexical entries are
represented as lexical nodes which are a kind of CC.
Phoneme sequences are used only for ΦDMTRANS
PLUS, the speech-input version of DMTRANS PLUS. 
CSCs represent sequences of concepts such as 
phoneme sequences (e.g. </k/ /a/ /i/ /g/ /i/>), concept 
sequences (e.g. <*Conference *Goal-Role *Attend 
*Want>), and plan sequences (e.g. <*Declare-Want-Attend
*Listen-Instruction>). The linguistic knowl- 
edge represented as CSCs can be low-level surface 
specific patterns such as phrasal lexicon entries \[1\] 
or material at higher levels of abstraction such as in
MOP's \[16\]. However, CSCs should not be confused
with 'discourse segments' \[6\]. In our model, information
represented in discourse segments is distributively
incorporated in the memory network. 
During sentence processing we create concept in- 
stances (CI) correpsonding to CCs and concept se- 
quence instances (CSI) corresponding to CSCs. This 
is a substantial improvement over past DMA research. 
Lack of instance creation and reference in past research 
was a major obstacle to seriously modelling discourse 
phenomena. 
CIs and CSIs are connected through several types of 
links. A guided marker passing scheme is employed 
for inference on the memory network following meth- 
ods adopted in past DMA models. 
DMTRANS PLUS uses three markers for parsing: 
• An Activation Marker (A-Marker) is created 
when a concept is initially activated by a lexical 
item or as a result of concept refinement. It indi- 
cates which instance of a concept is the source of 
activation and contains relevant cost information. 
A-Markers are passed upward along is-a links in 
the abstraction hierarchy. 
• A Prediction marker (P-Marker) is passed along 
a concept sequence to identify the linear order 
of concepts in the sequence. When an A-Marker 
reaches a node that has a P-Marker, the P-Marker 
is sent to the next element of the concept se- 
quence, thus predicting which node is to be acti- 
vated next. 
• A Context marker (C-Marker) is placed on a node 
which has contextual priming. 
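The three marker types can be pictured as simple records. The following Python sketch is illustrative only and is not taken from the Lisp implementation; the class names and fields (`source_instance`, `cost`, `sequence`, `position`, `bindings`) are assumptions chosen to mirror the description above, where an A-Marker carries its source instance and cost and a P-Marker carries role bindings.

```python
from dataclasses import dataclass, field


@dataclass
class AMarker:
    """Activation marker: records the activating instance and its cost."""
    source_instance: str
    cost: float = 0.0


@dataclass
class PMarker:
    """Prediction marker: a position in a concept sequence plus role bindings."""
    sequence: tuple                                # ordered element names
    position: int = 0                              # index of the predicted element
    bindings: dict = field(default_factory=dict)   # role -> instance name

    def predicted(self):
        """Return the element that is currently predicted."""
        return self.sequence[self.position]

    def advance(self, marker: AMarker) -> bool:
        """Bind the predicted element to the activating instance.

        Returns True when the last element of the sequence was matched,
        otherwise moves the prediction to the next element and returns False.
        """
        self.bindings[self.predicted()] = marker.source_instance
        if self.position == len(self.sequence) - 1:
            return True
        self.position += 1
        return False


@dataclass
class CMarker:
    """Context marker: flags a node as contextually primed."""
    node: str
```

Keeping the bindings inside the P-Marker is what lets a completed sequence know which instance filled which role without any separate variable-binding machinery.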
Information about which instances originated acti- 
vations is carried by A-Markers. The binding list of 
instances and their roles are held in P-Markers 1. 
1 Marker-passing spreading activation is our choice over
a connectionist network precisely for this reason. Variable
binding (which cannot be easily handled in a connectionist
network) can be trivially attained through structure
(information) passing of A-Markers and P-Markers.
The following is the algorithm used in DMTRANS 
PLUS parsing: 
Let Lex, Con, Elem, and Seq be the sets of lexical
nodes, conceptual nodes, elements of concept
sequences, and concept sequences, respectively.
Parse(S)
  For each word w in S, do:
    Activate(w),
    For all i and j:
      if Active(Ni) ∧ Ni ∈ Con
        then do concurrently:
          Activate(isa(Ni))
      if Active(ej.Ni) ∧ Predicted(ej.Ni) ∧ ¬Last(ej.Ni)
        then Predict(ej+1.Ni)
      if Active(ej.Ni) ∧ Predicted(ej.Ni) ∧ Last(ej.Ni)
        then Accept(Ni), Activate(isa(Ni))
Predict(N)
  for all Ni ∈ N do:
    if Ni ∈ Con,
      then Pmark(Ni), Predict(isainv(Ni))
    if Ni ∈ Elem,
      then Pmark(Ni), Predict(isainv(Ni))
    if Ni ∈ Seq,
      then Pmark(e0.Ni), Predict(isainv(e0.Ni))
    if Ni = NIL,
      then Stop.
Activate
  I ← instanceof(c)
  if I = ∅ then
    createinst(c), Addcost, activate(c)
  else
    for each i ∈ I
      do concurrently:
        activate(c)
Accept
  if Constraints ≠ T
    Assume(Constraints), Addcost
  activate(isa(c))
where Ni and ej.Ni denote a node in the memory net- 
work indexed by i and a j-th element of a node Ni, 
respectively. 
Active(N) is true iff a node or an element of a node 
gets an A-Marker. 
Activate(N) sends A-Markers to nodes and elements 
given in the argument. 
Predict(N) moves a P-Marker to the next element of 
the CSC. 
Predicted(N) is true iff a node or an element of a node 
gets a P-Marker. 
Pmark(N) puts a P-Marker on a node or an element 
given in the argument. 
Last(N) is true iff an element is the last element of the 
concept sequence. 
Accept(N) creates an instance under N with links which 
connect the instance to other instances. 
isa(N) returns a list of nodes and elements which are 
connected to the node in the argument by abstraction 
links. 
isainv(N) returns a list of nodes and elements which 
are daughters of a node N. 
Some explanation may help in understanding this
algorithm:
1. Prediction. 
Initially all the first elements of concept sequences 
(CSC - Concept Sequence Class) are predicted by 
putting P-Markers on them. 
2. Lexical Access. 
A lexical node is activated by the input word. 
3. Concept Activation. 
An A-Marker is created and sent to the corresponding
CC (Concept Class) nodes. A cost is added to the
A-Marker if the CC is not C-Marked (i.e. no C-Marker
is placed on it).
4. Discourse Entity Identification. 
A CI (Concept Instance) under the CC is searched 
for. 
If the CI exists, an A-Marker is propagated to 
higher CC nodes. 
Else, a CI node is created under the CC, and an 
A-Marker is propagated to higher CC nodes. 
5. Activation Propagation. 
An A-Marker is propagated upward in the abstraction
hierarchy.
6. Sequential Prediction. 
When an A-Marker reaches any P-Marked node (i.e. 
part of a CSC), the P-Marker on the node is sent to the 
next element of the concept sequence. 
7. Contextual Priming. 
When an A-Marker reaches any Contextual Root
node, C-Markers are put on the contextual child
nodes designated by the root node.
8. Conceptual Relation Instantiation. 
When the last element of a concept sequence
receives an A-Marker, constraints (world and
discourse knowledge) are checked.
A CSI is created under the CSC with packaging
links to each CI. This process is called concept
refinement. See \[19\].
The memory network is modified by performing
inferences stored in the root CSC which had the
accepted CSC attached to it.
9. Activation Propagation. 
An A-Marker is propagated from the CSC to higher 
nodes. 
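The flow of steps 1, 2, 3, 5, 6 and 8 can be sketched in a few lines of Python. This is a minimal, sequential toy under stated assumptions, not the DMTRANS PLUS implementation: the lexicon, the is-a hierarchy and the single concept sequence are hypothetical, P-Markers are reduced to an index per sequence, and costs, instance creation, contextual priming and concurrency are all omitted.

```python
# Hypothetical toy memory network (all names are illustrative).
LEXICON = {"mary": "*Mary", "shot": "*Shoot", "john": "*John"}
ISA = {"*Mary": "*Person", "*John": "*Person", "*Shoot": "*Action"}
SEQUENCES = {"*Shoot-Event": ["*Person", "*Shoot", "*Person"]}


def ancestors(concept):
    """Yield the concept and every node above it along is-a links."""
    while concept is not None:
        yield concept
        concept = ISA.get(concept)


def parse(words):
    """Return the accepted sequences with the concepts bound to their elements."""
    # Step 1: predict the first element of every concept sequence.
    position = {name: 0 for name in SEQUENCES}
    bindings = {name: [] for name in SEQUENCES}
    accepted = []
    for word in words:
        concept = LEXICON[word]            # step 2: lexical access
        active = set(ancestors(concept))   # steps 3 and 5: A-Marker moves upward
        for name, elements in SEQUENCES.items():
            pos = position[name]
            # Step 6: an A-Marker meets the predicted (P-Marked) element.
            if pos < len(elements) and elements[pos] in active:
                bindings[name].append(concept)
                position[name] = pos + 1
                # Step 8: the last element was matched, so accept the sequence.
                if position[name] == len(elements):
                    accepted.append((name, tuple(bindings[name])))
    return accepted
```

Calling `parse(["mary", "shot", "john"])` accepts the hypothetical `*Shoot-Event` sequence, while `parse(["mary", "john"])` accepts nothing because the second word does not activate the predicted element.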
3.2 Memory Network Modification 
Several different incidents trigger the modification of 
the memory network during parsing: 
• An individual concept is instantiated (i.e. an instance
is created) under a CC when the CC receives
an A-Marker and no CI (an instance that
was created by preceding utterances) exists.
This instantiation is a creation of a specific
discourse entity which may be used as an existent
instance in subsequent recognitions.
• A concept sequence instance is created under the
accepted CSC. In other words, if a whole concept
sequence is accepted, we create an instance of
the sequence, instantiating it with the specific CIs
that were created by (or identified with) the specific
lexical inputs. This newly created instance
is linked to the accepted CSC with an instance
relation link and to the instances of the elements of
the concept sequences by links labelled with their
roles given in the CSC.
• Links are created or removed in the CSI creation 
phase as a result of invoking inferences based on 
the knowledge attached to CSCs. For example, 
when the parser accepts the sentence I went to 
the UMIST, an instance of I is created under the 
CC representing I. Next, a CSI is created under 
PTRANS. Since PTRANS entails that the agent 
is at the location, a location link must be created 
between the discourse entities I and UMIST. Such 
revision of the memory network is conducted by 
invoking knowledge attached to each CSC. 
Since modification of any part of the memory net- 
work requires some workload, certain costs are added 
to analyses which require such modifications. 
4 Cost-based Approach to the 
Ambiguity Resolution 
Ambiguity resolution in DMTRANS PLUS is based on 
the calculation of the cost of each parse. Costs are 
attached to each parse during the parse process. 
Costs are attached when: 
1. A CC with insufficient priming is activated, 
2. A CI is created under CC, and 
3. Constraints imposed on CSC are not satisfied ini- 
tially and links are created or removed to satisfy 
the constraint. 
Costs are attached to A-Markers when these oper- 
ations are taken because these operations modify the 
memory network and, hence, workloads are required. 
Cost information is then carried upward by A-Markers. 
The parse with the least cost will be chosen. 
The cost of each hypothesis is calculated by:
$$C_i = \sum_{j=0}^{n} c_{ij} + \sum_{k=0}^{m} \mathit{constraint}_{ik} + \mathit{bias}_i$$
where $C_i$ is the cost of the i-th hypothesis, $c_{ij}$ is the cost
carried by an A-Marker activating the j-th element of
the CSC for the i-th hypothesis, $\mathit{constraint}_{ik}$ is the cost
of assuming the k-th constraint of the i-th hypothesis, and
$\mathit{bias}_i$ represents the lexical preference of the CSC for the
i-th hypothesis. This cost is assigned to each CSC and
the value of $C_i$ is passed up by A-Markers if higher-level
processing is performed. At higher levels, each
$c_{ij}$ may be a result of the sum of costs at lower levels.
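As a small worked illustration of the cost equation, the following Python sketch sums the three terms for two competing hypotheses and picks the cheaper one. The function names and the example numbers are assumptions made for illustration; the value 10 only echoes the illustrative link-creation cost used in Section 4.3.

```python
def hypothesis_cost(marker_costs, constraint_costs, bias=0.0):
    """Total cost: A-Marker costs plus assumed-constraint costs plus bias."""
    return sum(marker_costs) + sum(constraint_costs) + bias


def least_cost(hypotheses):
    """Return the name of the hypothesis with the smallest total cost.

    `hypotheses` maps a name to (marker_costs, constraint_costs, bias).
    On a tie, the first hypothesis in iteration order is returned.
    """
    return min(hypotheses, key=lambda name: hypothesis_cost(*hypotheses[name]))


# Hypothetical values: every element is primed and already instantiated,
# but the second hypothesis has to assume one constraint at cost 10.
hypotheses = {
    "CSC1": ([0, 0, 0], [0], 0.0),
    "CSC2": ([0, 0, 0, 0, 0], [10], 0.0),
}
best = least_cost(hypotheses)
```

Because the equation is linear and has no threshold, a cost computed for a lower-level sequence can simply be passed in as one of the `marker_costs` of a higher-level one.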
It should be noted that this equation is very similar
to the activation function of most neural networks
except for the fact that our equation is a simple linear
equation which does not have a threshold value. In fact, if
we only assume the addition of cost by priming at the 
lexical-level, our mechanism of ambiguity resolution 
would behave much like connectionist models with- 
out inhibition among syntactic nodes and excitation 
links from syntax to lexicon 2. However, the major 
difference between our approach and the connectionist 
approach is the addition of costs for instance creation 
and constraint satisfaction. We will show that these 
factors are especially important in resolving structural 
ambiguities. 
The following subsections describe three mechanisms
that play a role in ambiguity resolution. However,
we do not claim that these are the only mechanisms
involved in the examples which follow 3.
4.1 Contextual Priming 
In our system, some CC nodes designated as Contex- 
tual Root Nodes have a list of thematically relevant 
nodes. C-Markers are sent to these nodes as soon as 
a Contextual Root Node is activated. Thus each sen- 
tence and/or each word might influence the interpre- 
tation of following sentences or words. When a node 
with C-Marker is activated by receiving an A-Marker, 
the activation will be propagated with no cost. Thus, a 
parse using such nodes would have no cost. However, 
when a node without a C-Marker is activated, a small 
cost is attached to the interpretation using that node. 
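The priming rule can be sketched as follows. This is a minimal Python illustration under stated assumptions: the numeric value of the "small cost" is not given in the text, so `UNPRIMED_COST` is a placeholder, and the node names and the `Context` class are hypothetical.

```python
UNPRIMED_COST = 1.0  # assumed value; the text only says "a small cost"

# Hypothetical Contextual Root Node with its list of thematically relevant nodes.
CONTEXT_ROOTS = {"*board-meeting": {"*have-a-problem"}}


class Context:
    """Tracks which nodes currently carry a C-Marker."""

    def __init__(self, roots):
        self.roots = roots
        self.c_marked = set()

    def activate(self, concept):
        """Return the priming cost of activating `concept`.

        A C-Marked node is activated at no cost; any other node incurs
        UNPRIMED_COST. If the node is a Contextual Root Node, C-Markers
        are then placed on the nodes it designates.
        """
        cost = 0.0 if concept in self.c_marked else UNPRIMED_COST
        self.c_marked |= self.roots.get(concept, set())
        return cost
```

Activating `*board-meeting` first makes a later activation of `*have-a-problem` free, whereas an unrelated node still pays the unprimed cost.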
In \[19\] the discussion of C-Marker propagation concentrated
on the resolution of word-level ambiguities.
However, C-Markers are also propagated to conceptual
class nodes, which can represent word-level, phrasal,
or sentential knowledge. Therefore, C-Markers can
be used for resolving phrasal-level and sentential-level
ambiguities such as structural ambiguities. For example,
atama ga itai literally means, '(my) head hurts.'
This normally is identified with the concept sequences
associated with the *have-a-symptom concept class
node, but if the preceding sentence is asita yakuinkai
da ('There is a board of directors meeting tomorrow'),
the *have-a-problem concept class node must be
activated instead. Contextual priming attained by
C-Markers can also help resolve structural ambiguity in
sentences like did you read about the problem with
the students? The cost of each parse will be determined
by whether reading with students or problems
with students is contextually activated. (Of course,
many other factors are involved in resolving this type
of ambiguity.)
2 We have not incorporated these factors primarily because
structured P-Markers can play the role of top-down priming; however,
we may be incorporating these factors in the future.
3 For example, in one implementation of DMTRANS, we are using
time-delayed decaying activations which resolve ambiguity even
when two CI nodes are concurrently active.
Our model can incorporate either C-Markers or a
connectionist-type competitive activation and inhibition
scheme for priming. In the current implementation,
we use C-Markers for priming simply because
C-Marker propagation is computationally less expensive
than connectionist-type competitive activation and
inhibition schemes 4. Although connectionist approaches
can resolve certain types of lexical ambiguity, they
are computationally expensive unless we have massively
parallel computers. C-Markers are a reasonable
compromise because they are sent to semantically relevant
concept nodes to attain contextual priming without
computationally expensive competitive activation
and inhibition methods.
4.2 Reference to the Discourse Entity 
When a lexical node activates any CC node, a CI node
under the CC node is searched for (\[19\], \[21\]). This
activity models reference to an already established
discourse entity \[27\] in the hearer's mind. If such a CI
node exists, the reference succeeds and no cost is
attached to this parse. However, if no such instance
is found, reference failure results. If this happens, an
instantiation activity is performed, creating a new
instance at a certain cost. As a result, some cost is
attached to a parse that uses a newly created instance
node.
For example, if a preceding discourse contained a
reference to a thesis, a CI node such as THESIS005
would have been created. Now if a new input sentence
contains the word paper, CC nodes for THESIS
and SHEET-OF-PAPER are activated. This causes a
search for CI nodes under both CC nodes. Since the
CI node THESIS005 will be found, the reading where
paper means thesis will not acquire a cost. However,
assuming that there is not a CI node corresponding to
a sheet of paper, we will need to create a new one for
this reading, thus incurring a cost.
4 This does not mean that our model cannot incorporate a
connectionist model. The choice of C-Markers over the connectionist
approach is mostly due to computational cost. As we will describe
later, our model is capable of incorporating a connectionist approach.
We can also use reference to discourse entities to
resolve structural ambiguities. In the sentence We
sent her papers, if the preceding discourse mentioned
Yoshiko's papers, a specific CI node such as
YOSHIKO-PAPER003 representing Yoshiko's papers would have
been created. Therefore, during the processing of We
sent her papers, the reading which means we sent papers
to her needs to create a CI node representing the papers
that we sent, incurring some cost for creating that
instance node. On the other hand, the reading which
means we sent Yoshiko's papers does not need to create
an instance (because it was already created) so it is
costless. Also, the reading that uses paper as a sheet
of paper is costly, as we have demonstrated above.
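The reference mechanism of this subsection can be sketched as a lookup that either succeeds at no cost or instantiates at a cost. The Python below is a minimal illustration under stated assumptions: `INSTANCE_COST` is a placeholder value, and the `Memory` class and its naming scheme for new instances are hypothetical.

```python
INSTANCE_COST = 5.0  # assumed value; the text only says "certain costs"


class Memory:
    """Maps each concept class (CC) to the concept instances (CIs) under it."""

    def __init__(self):
        self.instances = {}

    def refer(self, concept):
        """Return (instance, cost) for a reference to `concept`.

        Reference success: an existing CI is reused at no cost.
        Reference failure: a new CI is created and INSTANCE_COST is charged.
        """
        found = self.instances.setdefault(concept, [])
        if found:
            return found[-1], 0.0
        instance = f"{concept}001"   # hypothetical naming scheme
        found.append(instance)
        return instance, INSTANCE_COST


memory = Memory()
memory.instances["THESIS"] = ["THESIS005"]   # left by the preceding discourse
thesis_reading = memory.refer("THESIS")
sheet_reading = memory.refer("SHEET-OF-PAPER")
```

Here the THESIS reading of paper is free while the SHEET-OF-PAPER reading pays for a new instance. A fuller model would evaluate competing readings against separate copies of memory so that an instance created for a rejected reading does not persist.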
4.3 Constraints 
Constraints are attached to each CSC. These con- 
straints play important roles during disambiguation. 
Constraints define relations between instances when 
sentences or sentence fragments are accepted. When 
a constraint is satisfied, the parse is regarded as plau- 
sible. On the other hand, the parse is less plausible 
when the constraint is unsatisfied. Whereas traditional 
parsers simply reject a parse which does not satisfy a 
given constraint, DMTRANS PLUS builds or removes 
links between nodes forcing them to satisfy constraints. 
A parse with such forced constraints will record an 
increased cost and will be less preferred than parses 
without attached costs. 
The following example illustrates how this scheme
resolves an ambiguity. As an initial setting we assume
that the memory network has instances of 'man'
(MAN1) and 'hand-gun' (HAND-GUN1) connected
with a POSSES relation (i.e. link). The input utterance
is: "Mary picked up an Uzzi. Mary shot the man with
the hand-gun." The second sentence is ambiguous in
isolation and it is also ambiguous if it is not known
that an Uzzi is a machine gun. However, when it is
preceded by the first sentence and if the hearer knows
that an Uzzi is a machine gun, the ambiguity is drastically
reduced. DMTRANS PLUS hypothesizes and models
this disambiguation activity utilizing knowledge about
the world through the cost recording mechanism described
above.
During the processing of the first sentence, DMTRANS
PLUS creates instances of 'Mary' and 'Uzzi'
and records them as active instances in memory (i.e.,
MARY1 and UZZI1 are created). In addition, a
link between MARY1 and UZZI1 is created with the
POSSES relation label. This link creation is invoked by
triggering side-effects (i.e., inferences) stored in the
CSC representing the action of 'MARY1 picking up
the UZZI1'. We omit the details of marker passing
(for A-, P-, and C-Markers) since it is described in detail
elsewhere (particularly in \[19\]).
When the second sentence comes in, an instance 
MARY1 already exists and, therefore, no cost is 
charged for parsing 'Mary' 5. However, we now have 
three relevant concept sequences (CSC's 6): 
CSC1: (<agent> <shoot> <object>) 
CSC2: (<agent> <shoot> <object> <with> <instrument>) 
CSC3: (<person> <with> <instrument>) 
These sequences are activated when concepts in 
the sequences are activated in order from below in 
the abstraction hierarchy. When the "man" comes in, 
recognition of CSC3:(<person> <with> <instrument>) 
starts. When the whole sentence is received, we have 
two top-level CSCs (i.e., CSC1 and CSC2) accepted 
(all elements of the sequences recognized). The ac- 
ceptance of CSC1 is performed through first accepting 
CSC3 and then substituting CSC3 for <object>. 
When the concept sequences are satisfied, their constraints
are tested. A constraint for CSC2 is (POSSES
<agent> <instrument>) and a constraint for CSC3 (and
CSC1, which uses CSC3) is (POSSES <person>
<instrument>). Since 'MARY1 POSSESS HAND-GUN1'
now has to be satisfied and there is no instance of this
in memory, we must create a POSSESS link between
MARY1 and HAND-GUN1. A certain cost, say 10,
is associated with the creation of this link. On the
other hand, MAN1 POSSESS HAND-GUN1 is known
in memory because of an earlier sentence. As a result,
CSC3 is instantiated with no cost and an A-Marker
from CSC3 is propagated upward to CSC1 with no
cost. Thus, the cost of instantiating CSC1 is 0 and
the cost of instantiating CSC2 is 10. This way, the
interpretation with CSC1 is favored by our system.
5 Of course, 'Mary' can be 'She'. The method for handling this
type of pronoun reference was already reported in \[19\] and we do
not discuss it here.
6 As we can see from this example of CSC's, a concept sequence
can be normally regarded as a subcategorization list of a VP head.
However, concept sequences are not restricted to such lists and are
actually often at higher levels of abstraction representing MOP-like
sequences.
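The constraint check in the Mary example can be sketched as follows. This Python fragment is an illustration under stated assumptions rather than the system's code: links are modelled as plain tuples, the function name `assume` is hypothetical, and each hypothesis works on its own copy of the link set so that a link forced for one reading does not leak into the other. The cost of 10 is the illustrative value used in the text.

```python
LINK_COST = 10.0  # the illustrative link-creation cost from the example


def assume(links, constraint):
    """Make `constraint` hold in `links` and return the cost of doing so.

    If the link already exists the constraint is satisfied at no cost;
    otherwise the link is created and LINK_COST is charged.
    """
    if constraint in links:
        return 0.0
    links.add(constraint)
    return LINK_COST


# Memory after "Mary picked up an Uzzi." plus the initial setting.
links = {
    ("POSSES", "MAN1", "HAND-GUN1"),
    ("POSSES", "MARY1", "UZZI1"),
}

# CSC1 (via CSC3): the man has the hand-gun.
cost_csc1 = assume(set(links), ("POSSES", "MAN1", "HAND-GUN1"))
# CSC2: Mary uses the hand-gun as an instrument.
cost_csc2 = assume(set(links), ("POSSES", "MARY1", "HAND-GUN1"))
preferred = "CSC1" if cost_csc1 < cost_csc2 else "CSC2"
```

Unlike a parser that rejects unsatisfied constraints outright, both readings survive here; the second one is merely more expensive.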
5 Discussion: 
5.1 Global Minima 
The correct hypothesis in our model is the hypothe- 
sis with the least cost. This corresponds to the notion 
of global minima in most connectionist literature. On 
the other hand, the hypothesis which has the least cost 
within a local scope but does not have the least cost 
when it is combined with global context is a local 
minimum. The goal of our model is to find a global 
minimum hypothesis in a given context. This idea is 
advantageous for discourse processing because a parse 
which may not be preferred in a local context may 
yield a least cost hypothesis in the global context.
Similarly, the least costly parse in a local context may
turn out to be costly at the end of processing due to some
contextual inference triggered by some higher context.
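The distinction between a local and a global minimum can be made concrete with a toy comparison. In the Python sketch below, the two hypotheses and all of their cost values are invented purely for illustration; `local` stands for cost accumulated within the sentence and `context` for cost added by inferences from the wider discourse.

```python
def total(hypothesis):
    """Cost of a hypothesis once global context is taken into account."""
    return hypothesis["local"] + hypothesis["context"]


# Invented example values.
hypotheses = [
    {"name": "A", "local": 1.0, "context": 10.0},
    {"name": "B", "local": 3.0, "context": 0.0},
]

local_minimum = min(hypotheses, key=lambda h: h["local"])["name"]
global_minimum = min(hypotheses, key=total)["name"]
```

Hypothesis A is the local minimum, but once the contextual cost is added B becomes the global minimum, which is the hypothesis the model aims to select.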
One advantage of our system is that it is possible to
define global and local minima using massively parallel
marker passing, which is computationally efficient
and is more powerful in high-level processing involving
variable-binding, structure building, and constraint
propagations 7 than neural network models. In addition,
our model is suitable for massively parallel architectures
which are now being researched by hardware
designers as next generation machines 8.
5.2 Psycholinguistic Relevance of the 
Model 
The phenomenon of lexical ambiguity has been studied 
by many psycholinguistic researchers including \[13\], 
\[3\], and \[17\]. These studies have identified contextual 
priming as an important factor in ambiguity resolution. 
One psycholinguistic study that is particularly
relevant to DMTRANS PLUS is Crain and Steedman
\[4\], which argues for the principle of referential success.
Their experiments demonstrate that people prefer
the interpretation which is most plausible and accesses
previously defined discourse entities. This psycholinguistic
claim and experimental result were incorporated
in our model by adding costs for instance creation and
constraint satisfaction.
Another study relevant to our model is the lexical
preference theory by Ford, Bresnan and Kaplan 
\[5\]. Lexical preference theory assumes a preference 
order among lexical entries of verbs which differ in 
subcategorization for prepositional phrases. This type 
of preference was incorporated as the bias term in our 
cost equation. 
7 Refer to \[22\] for details in this direction. 
8 See \[23\] and \[9\] for discussion. 
Although we have presented a basic mechanism to
incorporate these psycholinguistic theories, well controlled
psycholinguistic experiments will be necessary
to set values of each constant and to validate our model
psycholinguistically.
5.3 Reverse Cost 
In our example in the previous section, if the first
sentence was Mary picked an S&W where the hearer
knows that an S&W is a hand-gun, then an instance
of 'MARY POSSES HAND-GUN1' is asserted as true
in the first sentence and no cost is incurred in the
interpretation of the second sentence using CSC2. This
means that the costs for both PP-attachments in Mary
shot the man with the handgun are the same (no cost
in either case) and the sentence remains ambiguous.
This seems contrary to the fact that in Mary picked an
S&W. She shot the man with the hand-gun, the natural
interpretation (given that the hearer knows an S&W is a
hand-gun) seems to be that it was Mary who had the
hand-gun, not the man. Since our costs are only
negatively charged, the fact that 'MARY1 POSSES S&W'
is recorded in the previous sentence does not help the
disambiguation of the second sentence.
In order to resolve ambiguities such as this one 
which remain after our cost-assignment procedure has
been applied, we are currently working on a reverse cost
charge scheme. This scheme will retroactively in- 
crease or decrease the cost of parses based on other 
evidence from the discourse context. For example, the 
discourse context might contain information that would 
make it more plausible or less plausible for Mary to use 
a handgun. We also plan to implement time-sensitive 
diminishing levels of charges to prefer facts recognized 
in later utterances. 
5.4 Incorporation of Connectionist Model 
As already mentioned, our model can incorporate 
connectionist models of ambiguity resolution. In a 
connectionist network activation of one node trig- 
gers interactive excitation and inhibition among nodes. 
Nodes which get more activated will be primed more 
than others. When a parse uses these more active 
nodes, no cost will be added to the hypothesis. On 
the other hand, hypotheses using less activated nodes 
should be assigned higher costs. There is nothing 
to prevent our model from integrating this idea, es- 
pecially for lexical ambiguity resolution. The only 
reason that we do not implement a connectionist ap- 
proach at present is that the computational cost will 
be enormous on current computers. Readers should 
also be aware that DMA is a guided marker passing al- 
gorithm in which markers are passed only along certain 
links whereas connectionist models allow spreading 
of activation and inhibition virtually to any connected 
nodes. We hope to integrate DMA and connectionist 
models on a real massively parallel computer and wish 
to demonstrate real-time translation. One other possi- 
bility is to integrate with a connectionist network for 
speech recognition 9. We expect, by integrating with 
connectionist networks, to develop a uniform model 
of cost-based processing. 
6 Conclusion 
We have described the ambiguity resolution scheme
in DMTRANS PLUS. Perhaps the central contribution
of this paper to the field is that we have shown a
method of ambiguity resolution in a massively parallel
marker passing paradigm. Cost evaluations for each
parse through (1) reference and instance creation, (2)
constraint satisfaction, and (3) C-Markers are combined
into the marker passing model. We have also discussed
the possibility of merging our model with connectionist
models where they are applicable. The guiding
principle of our model, that parsing is a physical process
of memory modification, was useful in deriving the
mechanisms described in this paper. We expect further
investigation along these lines to provide us with insights
into many aspects of natural language processing.
Acknowledgements 
The authors would like to thank members of the Center 
for Machine Translation for fruitful discussions. We 
would especially like to thank Masaru Tomita, Hitoshi 
Iida, Jaime Carbonell, and Jay McClelland for their 
encouragement. 
Appendix: Implementation 
DMTRANS PLUS is implemented on IBM-RT's using 
both CMU-COMMONLISP and MULTILISP running on 
the Mach distributed operating system at CMU. Algorithms
for structural disambiguation using cost attachment
were added along with some other house-keeping
functions to the original DMTRANS to implement
DMTRANS PLUS. All capacities reported in this paper have
been implemented except the schemes mentioned in
sections 5.3 and 5.4 (i.e., negative costs, integration
of connectionist models).
9 Augmentation of the cost-based model to the phonological level
has already been implemented in \[10\].
References 
\[1\] Becker, J. D., The phrasal lexicon, in 'Theoretical Issues in
Natural Language Processing', 1975.
\[2\] Boguraev, B. K., et al., Three Papers on Parsing, Technical
Report 17, Computer Laboratory, University of Cambridge,
1982.
\[3\] Cottrell, G., A Model of Lexical Access of Ambiguous Words, in
'Lexical Ambiguity Resolution', S. Small, et al. (eds.), Morgan
Kaufmann Publishers, 1988.
\[4\] Crain, S. and Steedman, M., On not being led up the garden
path: the use of context by the psychological syntax processor,
in 'Natural Language Parsing', 1985.
\[5\] Ford, M., Bresnan, J. and Kaplan, R., A Competence-Based 
Theory of Syntactic Closure, in 'The Mental Representation of 
Grammatical Relations', 1981. 
\[6\] Grosz, B. and Sidner, C. L., The Structure of Discourse Struc- 
ture, CSLI Report No. CSLI-85-39, 1985. 
\[7\] Hayes, P. J., On semantic nets, frames and associations, in
'Proceedings of IJCAI-77', 1977.
\[8\] Hirst, G., Semantic Interpretation and the Resolution of
Ambiguity, Cambridge University Press, 1987.
\[9\] Kitano, H., Multilingual Information Retrieval Mechanism us- 
ing VLSI, in 'Proceedings of RIAO-88', 1988. 
\[10\] Kitano, H., et al., Manuscript. An Integrated Discourse
Understanding Model for an Interpreting Telephony under the Direct
Memory Access Paradigm, Carnegie Mellon University, 1989.
\[11\] Marcus, M. P., A theory of syntactic recognition for natural 
language, MIT Press, 1980. 
\[12\] Norvig, P., Unified Theory of Inference for Text Understanding,
Ph.D. Dissertation, University of California, Berkeley, 1987.
\[13\] Prather, P. and Swinney, D., Lexical Processing and Ambiguity
Resolution: An Autonomous Processing in an Interactive
Box, in 'Lexical Ambiguity Resolution', S. Small, et al. (eds.),
Morgan Kaufmann Publishers, 1988.
\[14\] Riesbeck, C. and Martin, C., Direct Memory Access Parsing,
YALEU/DCS/RR 354, 1985.
\[15\] Selman, B. and Hirst, G., Parsing as an Energy Minimization
Problem, in 'Genetic Algorithms and Simulated Annealing',
Davis, L. (Ed.), Morgan Kaufmann Publishers, CA, 1987.
\[16\] Schank, R., Dynamic Memory: A theory of learning in
computers and people, Cambridge University Press, 1982.
\[17\] Small, S., et al. (eds.), Lexical Ambiguity Resolution, Morgan
Kaufmann Publishers, Inc., CA, 1988.
\[18\] Small, S., et al., Toward Connectionist Parsing, in 'Proceedings
of AAAI-82', 1982.
\[19\] Tomabechi, H., Direct Memory Access Translation, in
'Proceedings of the IJCAI-88', 1987.
\[20\] Tomabechi, H. and Tomita, M., The Integration of Unification-based
Syntax/Semantics and Memory-based Pragmatics for
Real-Time Understanding of Noisy Continuous Speech Input,
in 'Proceedings of the AAAI-88', 1988.
\[21\] Tomabechi, H. and Tomita, M., Application of the Direct
Memory Access paradigm to natural language interfaces to
knowledge-based systems, in 'Proceedings of the COLING-88',
1988.
\[22\] Tomabechi, H. and Tomita, M., Manuscript. MASSIVELY
PARALLEL CONSTRAINT PROPAGATION: Parsing with
Unification-based Grammar without Unification. Carnegie
Mellon University.
\[23\] Tomabechi, H., Mitamura, T., and Tomita, M., DIRECT MEMORY
ACCESS TRANSLATION FOR SPEECH INPUT: A Massively
Parallel Network of Episodic/Thematic and Phonological
Memory, in 'Proceedings of the International Conference
on Fifth Generation Computer Systems 1988' (FGCS'88),
1988.
\[24\] Touretzky, D. S., Connectionism and PP Attachment, in
'Proceedings of the 1988 Connectionist Models Summer School',
1988.
\[25\] Waltz, D. L. and Pollack, J. B., Massively Parallel Parsing: A
Strongly Interactive Model of Natural Language Interpretation.
Cognitive Science 9(1): 51-74, 1985.
\[26\] Wanner, E., The ATN and the Sausage Machine: Which one
is baloney? Cognition, 8(2), June, 1980.
\[27\] Webber, B. L., So what can we talk about now?, in 'Com- 
putational Models of Discourse', (Eds. M. Brady and R.C. 
Berwick), MIT Press, 1983. 
\[28\] Wilks, Y. A., Huang, X. and Fass, D., Syntax, preference and
right attachment, in 'Proceedings of the IJCAI-85', 1985.
