Shalt2 - a Symmetric Machine Translation System with 
ABSTRACT 
Shall2 is a knowledge-based machine translation sys- 
tem with a symmetric architecture. The grammar 
rules, mapping rules between syntactic and conceptual 
(semantic) representations, and transfer rules for con- 
ceptual paraphrasing are all bi-directional knowledge 
sources used by both a parser and a generator. 
1. Introduction 
Shalt2 is a research prototype of a knowledge-based, 
multi-domain/multi-lingual MT system. It has two 
predecessors, SHALT \[~31 (1982-90) and KBMT-89131 
(1987-89), which brought us valuable lessons we re- 
flected in the design and implementation of Shalt2. As 
a result, Shall2 has been designed and implemented 
as a symmetric MT system with frame-based concep- 
tual representation. The advantages of a symmetric ar- 
chitecture coupled with a conceptual representation are 
obvious. First, it allows us to maintain single knowl- 
edge sources for both parsing and generation. We can 
automatically control the coverage and reversibility of 
these processes. Second, conceptual structures are a 
desirable interface for representing the meaning of sen- 
tences, machine-generated output (such as diagnosis by 
an expert system), and expressions in graphical lan- 
guages. AI-baeed approaches can provide powerful in- 
ference methods for identification of a (semi-)equivalent 
class of conceptual representations, which corresponds 
to a paraphrasing capability. Unlike interlingual MT 
systems, our approach relieves the parser of the bur- 
den of generating a unique meaning representation of 
all equivalent sentences, which becomes harder in pro- 
portion to the number of languages the system has to 
translate. 
2. Shalt2 Architecture and Knowl- 
edge Sources 
Shalt2 has five types of knowledge sources: a set 
G of syntactic grammar rules (including dictionaries), 
a set C of hierarchically defined conceptual primitives 
called concept definitions, a set M of mapping rules be- 
tween syntactic and conceptual structures, a set P of 
conceptual paraphrasing rules, and a set E of cases (a 
structurally and conceptually disambiguated set of sam- 
ple sentences). These knowledge sources are shared by 
three major processes: a parser, a concept mapper, and 
a generator. G should be provided for each language, 
whereas the set C should be defined for each applica- 
tion domain. M, P, and E should be provided for each 
pair of a language and a domain.t Figure 1 shows an 
overview of the Shall2 architecture. 
*fOur theory of multi-domaln translation aims to compose a set of mapping rules efilclelxtly when Mever~l doma.lnn are combined. 
It it expected thffit two ~tJ of mapping rules for • ,ingle language will differ m~nly is lexieal mipping rules. 
Conceptual Transfer 
Koichi TAKEDA Naohiko URAMOTO 
Tetsuya NASUKAWA TMjiro TSUTSUMI 
Tokyo l~esearch Laboratory, IBM Research 
5-19 Sanban-cho, Chiyoda-ku, Tokyo 102, Japan 
{takeda,uramoto,nasukawa,tsutsumi} ~trl.vnet.ibm.com 
| .... 
~p~ 
Figure h Shalt2 Architecture 
2.1 Syntactic Grammar 
Shalt2 currently has two English grammars (PEG 
and a less competent English grammar) and a Japanese 
grammar. The last two grammars are bi-directionai 
grammars written in an LFG-llke formalism called a 
pseudo-unification grammar (PUG)\[ x°\], and were orig- 
inally employed in KBMT-89.tt PEG is not bi- 
directional, since it has too many destructive opera- 
tions to build or encode record structures, but it is 
our primary English grammar for three reasons: (1) 
broad coverage, (2) ability to encapsulate structural am- 
biguities in a compact parse tree, and (3) compatibil- 
ity with other NLP programs that use PEG to analyze 
English sentences. Our bi-directional English gram- 
mar is following up PEG and will be able to replace 
it. The symmetric architecture of Shalt2, however, al- 
lows uni-directional knowledge sources and processes to 
be hooked into the system. Their coverage and abil- 
ity to parse or generate sentences can be measured in 
terms of a set of conceptual representations that they 
can relate to syntactic structures. 
Although the syntactic structures of PEG and PUG 
grammars differ, they are integrated into a single 
syntactic representation called Dependency Structures 
(DSs) \[s\]. Roughly speaking, a DS is a tree-like structure 
with a set of nodes and a set of ares, which correspond to 
maximal projections of syntactic constituents and gram- 
t-f The Eagfish version was originally written by GLteg et al. 12l but wan notbi-directional. The Jap~mese gr~.nmax was originally 
written by Mitamura and T~keda\[2\]. The Shall2 verlions of these PUG graxama~ have been modified conBiderably to allow 
them to handle coordinations, compaxativ~, and so on. 
ACIES DE COLING-92. NANTES. 23-28 AOLrT 1992 I 0 3 4 PROC. OF COLING-92. NANTES. AUG. 23-28. 1992 
tactical relationships, respectively. Some grammatical 
formalisms such as Constraint Dependency Grammar 14\] 
postulate DSs as syntactic representations to which a 
constraint satisfaction algorithm \[~1 can be applied in or- 
der to resolve syntactic/semantic ambiguities efficiently. 
In the following, we show a PEG parse tree and a PUG 
f-structure, which will have the same DS, for the simple 
sentence "Insert a diskette into the drive." 
" PEG Parse Tree: 
(IMPR (VERB* 'Insert' (insert PS)) 
(NP (DET (ADJ* 'a' (a BS))) 
(NOUN* 'diskette' (diskette SG))) 
(PP (PREP ~into') 
(DET (ADJ* 'the' (the BS))) 
(NOUN 'drive' (drive SG))) 
(PUNC '.')) 
F-structure in the PUG English grammar: 
((ROOT insert) (CAT v) (SUBCAT trans) 
(FORM inf) (MOOD imp) (TENSE pros) 
(0BJ ((ROOT diskette) (CAT n) 
(DET indef) (NUM sg))) 
(PPADJUNCT (((ROOT drive) (CAT n) 
(DET def) (NUM sg))))) 
DS: 
insert (CAT v, SUBCAT traus, 
MOOD imp, TENSE pros) 
DOBJECT 
diskette (CAT n, DET indof, NUM sg) 
PPADJUNCT 
drive (CAT n, PREP into, DET def, NUM ag) 
The reader *nay no*lee that the above sentence 
should really have ambiguities in prepositional phrase 
attachment, which result in two conflicting depen- 
dencies "insert -PPADJUNCT- drive" and "diskette - 
PPADJUNCT- drkve" in a single DS. We will discuss 
the handling of such ambiguities in Section 3. 
2.2 Concept Definitions 
A set of conceptual primitives is called a Natural Lan- 
guage (NL) class system\[hi, and is maintained as an 
e It 171 object-oriented knowledge base under l~ram K' . The 
NL class system consists of a set of constant classes, 
three meta-classes, and virtual classeo with an exclusive 
inheritance mechanism discussed below. 
Each class represcnts a concept in the real world. A 
class hms zero or more slots, which describe the fillers it 
may use to represent a compound concept. NL objects 
are particular instances of NL classes. There are is-a 
and parl-of relationships defined over the NL classes. 
For example, 
(deTclass *insert 
(is-a (value *action)) 
(:agent (sem *human *system)) 
(:theme (sem *physical-object)) 
(:goal (sem *location *physical-object))) 
defines a class *insert with an is-a link to a class *action, 
and three slots - :agent, :theme, and :goal. The (value 
...) facet shows an actual filler of the slot, while the 
(see ...) facet shows the selectional restrictions on the 
slot. 
A class inherits each slot definition from its super- 
class(on), that is, the fillers of the is-a slot, uuless the 
slot is redefined. A class can have more than one im- 
mediate superclass, but we restrict its inheritance to 
be exclusive rather than general multiple inheritance. 
Tilat is, au instance of a claim c~s inherit slot definitions 
from only one of its imraediate superclaaqes. The idea 
behind exclusive inheritance is to realize certain iden- 
tity of verbal and nominal word senses without mixing 
the slot definitions of both. For example, most verbs 
have nominal counterparts in a natural language, such 
as '~insert" and "insertion." Such a pair usually shares 
slot definitions (:agent, :tlmmc, and :goal) and selec- 
tional restrictions, except that "insert" inherits tense, 
aspect, and modality from its "verbal" superclazs but 
not cardinality, ordinal, and definiteness (that is, the 
quality of being indefinite or definite) from its "nom- 
inal" superclazs, although these features are inherited 
by "insertion." The following class definitions 
(defclass *physical-action 
(is-a (value *predicate))) 
(defclass *mental-object 
(is-a (value *object))) 
(defclass *action 
(in-u (value *physical-action 
*mental-object) ) ) 
allow every instance of subclasses of *action to inherit 
slot definitions from either *physical-action or *mental- 
object. Exclusive inheritance also contributes to per- 
formance improvement of the parser since it allows us 
to reduce the number of possible superclasses from an 
exponential number to a linear number. 
There are three recta-classes in NL classes - *vat, 
*set, and *fun - to represent concepts that are not in- 
eluded in the normal class hierarchy. The first, *vat, is 
a variable that ranges over a set of NL classes, which 
are constants. Pronouns and question words in natural 
languages usually carry this kind of incomplete concept. 
The second, *set, is a set constructor that can represent 
a coordinated structure in natural languages. The third, 
*fun, is a function from NL objects to NL objects. It 
captures the meaning of a so-called sentiofunction word. 
For example, in some usages, the verb "take" does not 
really indicate any specific action until it gets an argu- 
ment such as "a walk," "a rest," "a look." It is therefore 
well characterized as a function. 
Since we allow only exclusive inheritance, the NL 
class system certainly lacks the ability to organize 
classe~ front various viewpoints, unlike ordinary multi- 
pie inheritance. Virtual classes are therefore introduced 
to compensate for this inability. I~br example, 
(de~velass *option 
(dsf (*math-coprocessor 
*hard-disk *software))) 
(ds£vclass *male-thing 
(dsf (equal :sex *male))) 
shows two types of virtual classe, *option and *male- 
thing. The *option consists of the classes *math- 
coprocessor, *hard-disk, and *software. The *male- 
thing is a class that includes instances of any class with 
the :sex slot filled with *male. Note that the main- 
tainability of a class hierarchy drastically declines if we 
allow classes such as *option to be "actual" rather than 
virtual, as we will have many is-a links from anything 
that could be an option. The second type of virtual 
class helps incorporate an-called semantic features into 
the NL class system. Existin~ machine-readable dictio- 
naries (for example, LDOCEt el) often have entries with 
semantic features such as HUMAN, LIQUID, and VF~ 
HICLE that may not fit into a given ontological class 
hierarchy. A virtual class definition 
AcrEs DE COL1NG-92, NAlVtES, 23-28 ^OUr 1992 1 0 3 5 PRec. ol. COLING-92, NAIVrES, Auo. 23-28. 1992 
(de:f vclaos *haman 
(def (equal :haman *true))) 
with semantic restrictions (:agent (sere *human)) make 
it possible to integrate such entries into the NL class 
system. 
The NL class system currently includes a few thou- 
sand concepts extracted from the personal-computer 
domain. The word senses in the SHALT dictionary 
(about 100,000 entries) and the LDOCE (about 55,000 
entries) have been gradually incorporated into the NL 
class system. 
2.3 Mapping Rules 
Mapping rules define lexlcal and structural correspon- 
dences between syntactic and conceptual representa- 
tions. A lexical mapping rule has the form 
(emap *insert 
<=i=> insert ((CAT v) (SUBCAT trans)) 
(role =sQm (*physical-action)) 
(:agent =syn (SUBJECT)) 
(:theme =syn (DOBJECT)) 
(:goal =syn (PPADJUNCT 
((PREP into) (CAT n))))) 
where a transitive verb "insert" maps to or from an in- 
stance of*insert with its exclusive superclass *physical- 
action. The three slots for st,'uctural mapping between 
concepts (:agent, :theme, and :goal) and grammatical 
roles (SUBJECT, DOBJECT, and PPADJUNCT) are 
also defined in this rule. The :agent filler, for exam- 
ple, should be an instance that is mapped from a syn- 
tactic SUBJECT of the verb "insert." The :goal filler 
must be a prepositional phrase consisting of a noun with 
the preposition "into." The fragments of syntactic fea- 
ture structures following a lexical word or a grammat- 
ical function in a mapping rule specify the minimum 
structures that subsume feature structures of candidate 
syntactic constituents. These structural mappings are 
specific to this instance. 
The structural mapping rule 
(emap *physical-action <=s=> 
(:mood =syn (MOOD)) 
(:time =syn (TENSE))) 
specifies that the conceptual slots :mood and :time map 
to or from the grammatical roles MOOD and TENSE, 
respectively. Unlike the structural mapping in a lexi- 
cal mapping rule, these slot mappings can be inherited 
by any instance of a subclass of *physical-action. The 
*insert instance defined above, for example, can inherit 
these :mood and :time mappings. Given a dependency 
structure (DS), mapping rules work as distributed con- 
straints on the nodes and arcs in the DS in such a way 
that a conceptual representation R is au image of the DS 
iff R is the minimal representation satisfying all the lex- 
ical and structural mappings associated with each node 
and arc. On the other hand, given a conceptual repre- 
sentation K, mapping rules work inversely as constraints 
on 1% to define a minimal DS that can be mapped to 1%. 
Thus, assuming that lexieal mapping rules are similarly 
provided for nouns (diskette and drive) and feature val- 
ues (imp, pros, and so on), we will have the conceptual 
representation~ 
~Conceptual representation of a sentence consists of instances 
of classes. We use a hyphen and a number following ~ c|a~s name (*insert-l, *imp-l, ...) when it is necessaxy to show instaJlces 
explicitly. Otherwise, we idtntlfy class na~nes and instance names. 
(*insert-I 
(:mood (*imp-l)) 
(:time (*pros-l)) 
(:theme (*diskette-I (:def (*indef-l)) 
(:ham (*sg-l)))) 
(:goal (*drive-i (:dof (*def-l)) 
(:hUm (*sg-2))))) 
for tile sample sentence and its DS shown earlier in this 
section. 
2.4 Conceptual Paraphrasing Rules 
We assume that a source language sentence and 
its translation into a target language frequently have 
slightly different conceptual representations. An adjec- 
tive in English might be a eonjugable verb in trans- 
lation. These differences result in added/missing in- 
formation in the corresponding representations. The 
conceptual paraphrasing rules describe such equiva- 
lence and seml-equivalence among conceptual represen- 
tations. These rules are sensitive to the target language, 
but not to the source language, since the definition of 
equivalence among conceptual representations depends 
on the cultural and pragmatic background of the lan- 
guage in which a translation has to be expressed. An 
example of a paraphrasing rule is 
(oquiv (*equal (:agent (*X (:hUm (*V)))) 
(:theme (*Y/*porson 
(:def *indef) 
(:sum (*W))))) 
(*Z/*ac%ion (:agent (*X) (:num (*V)))) 
(such-that (humanization *Z *Y) 
(sibling *V *W))) 
where *Y/*person specifies *Y to be an instance of any 
subclass of *person, *equal is roughly the verb "be," 
humanization is a relation that holds for pairs such ms 
(*singer, *sing) and (*swimmer, *swim), and sibling 
holds for two instances of the same class. Intuitively, 
this rule specifies an equivalence relationship between 
sentences such as "Tom is a good singer" and "Tom 
sings well," as the following bindings hold: 
(*equal (:mood (*dec)) (:time (*pros)) 
(:agent (*tom (:num (*sg)))) 
(:theme (*singer (:property (*good)) 
(:dof (*indof)) 
(:aura (*sg))))) 
(*sing (:mood (*dec)) (:time (*pros)) 
(:agent (*tom (:num (*sg)))) 
(:property (*good))) 
whore *X = *tom, *Y = *singer, *Z = *sing, 
*V = *sg, and *W = *sg 
All the instances that have no binding in a rule must 
remain unchanged as the same slot fillers (e.g., *dec and 
*pros), while some fillers that have bindings in a rule 
may be missing from a counterpart instance (e.g., *indef 
and *W above). Note that *good has lexical mappings 
to the adjective "good" and the adverb "well." 
2.5 Case Base 
A case base is a set of DSs with no syntactic/semantic 
ambiguities. A conceptual representation for a DS can 
be computed by a top-down algorithm that recursively 
tries to combine an instance mapped from the root node 
of a DS with an instance of each of its subtrees. The arc 
from the node to a subtree deternfiues the conceptual 
slot name. 
We have already built a case base that includes 
about 30,000 sentences from the IBM Dictionary of 
ACRES DE COLING-92. NANTES, 23-28 AOUl 1992 1 0 3 6 PrisE, oF COLING-92, NANTES. AUG. 23-28, 1992 
teAT v, keteBPc~ 
theme OBJECT " : location--ln -- theme 
ECT ~ PPADJONCT 
• "*'-... / " (CAT n) diskette ...... 
ICAT n) :iocatlon-{n ~-~ --~ 
Only relevant hfformation is slmwn. Mapping constraints (e.g. :Umme = DOBJECT) are actually associated for each 
pall" of instances. 
Figure 2: SamI)le DS with Mapping Constraints 
Computing \[x\]. Selected sentences in the \],DOCE have 
also been added to the case base. The sentences in 
the LDOCE define word senses or show sample usages 
of each entry. Though composed of a limited vocabu- 
lary, they are often syntactically/semantically ambigu- 
ous and it is time-consuming for users to build the 
case base completely manually. Therefore, the Shalt2 
parser is used to develop the case base. Starting with 
a small, manually crafted "core" ease base, e~ch new 
sentence is analyzed and disambiguated by the parser 
to generate a DS, which is corrected or otodified by the 
user and then added to the case base. As the size of 
the case base grows, tim prot)ortion of human correc- 
tions/modifications decreases, since the output of the 
parser becomes more and more accurate. Wise process 
is called knowledge bootstrapping and is discussed by 
Nagao \[6\] in more detail. Mapping coustraints, howevcr, 
are associated with only a part of the case base, because 
the NL class system and the mapping rules arc not yet 
complete. 
3. Pm'ser 
The Shall2 parser first generates a DS with packed 
structural ambiguities for a given inlmt sentence. It 
actually calls a PEG parser or 'lbmita's LR-parser \[12\] 
for PUG, and then calls a DS converter to map a PEG 
parse tree or a PUG feature structurc~ to a DS. Next, 
mapping rules arc applied to the DS so that lexical 
and structural mappings are associated with each node 
and arc in the DS. Figure 2 shows a DS with map- 
ping constraints for the sentence "Keep the diskette in 
the drive," where the verb "keep" has tive word senses: 
*execute, *guard, *hold, *employ, and *own. It is clear 
that we will end up with ten distinct conceptual repre- 
sentations if we evaluate all tim mapping rules, and in 
general, combinatorial explosion could easily make the 
parser impractical. Viewing the mapping rules as coo- 
stralnts rather titan procedural rules is the key to our 
parser, and is called delayed composition of conceptual 
representation. 
A sentence analyzer called SENA \[x41 disambiguates 
the DS by using a constraint solver JAUNT\[S\] and a 
"~Tomlta~s lucid amhigulty packing \[12\] is used to obtain a fea- ture structure with packed atructurM ambiguities. 
case base. JAUNT applies grammatical constraints (for 
instance, nlodifier-modifiee links between nodes do not 
cross one another) and semantic constraints (such as 
selectional restrictions,t~ functional control, and other 
NL object identity constraints detected by the context 
analyzer) uniformly to a DS that has ambiguities, and 
calculates pairwise consistency efficiently for each com- 
bination of nodes. Finally, the case base provides prefo 
erential knowledge to favor one pair of nodes over all 
other consistent pairs. The disambiguation process call 
be summarized as follows: 
1. For each conflicting arc in the DS, calculate the 
"dlstauce" \[6\] between the two nodes in the arc by 
using a case base. 
2. Leave tile ,xrc with the minimal distance and elimi- 
nate all the other conflicting arcs. Each NL object 
associated with a matching node in the case base 
also gives a higher preference to the same class of 
instance over the other instances in a node. 
3. Propagate the deletion of arcs to other nodes and 
arcs in the DS. Eliminate nodes and arcs that are 
no longer valid in the DS. 
4. Apply the above steps until there are no conflicting 
arc~ in the DS. 
The resulting 1)S has 11o structural ambiguity. B.e- 
madling lexical ambiguities are similarly resolved, be- 
cause we can also determine which pair of NL objects 
connected with an arc has the minimal distance in the 
case base. Our case base for cmnputer manuals would 
support the "diskette -PPADJUNCT- drive" arc and 
the *hold-1 instance with diskette as its DOBJECT. 
Therefore, the parser will eventually return 
(*hold-1 
( : ~cheme (*diske~;t e 
( : location-in (*drive)) ) ) ) 
as a disambiguated result. Nagao \[e} discusses snore so- 
phisticated techniques such as scheduling sets of arcs to 
bc disambiguated, backtracking, and relaaation of case 
base matching by means of an is-a hierarchy. 
Finally, a context analyzer is called to resolve 
auaphora, identity of definitc nouns, and implicit iden- 
tity between NL objects. It stores the DS in the working 
space, where references to preceding instances are repre- 
sented by the links between instances in the DSs. These 
inter-sentential links are used to determine the scopes 
of adverbs such as "also" and "only". 
For example, if the phrase "role of an operator" ap- 
pears in a text, the word "operator" could be a person 
who operates a machine or a logical operator for com- 
putation, but no sufficient information is available to 
resolve this ambiguity at this moment. In such cases, 
creating referential links in a forest of DSs could lead 
us to find evidence for disamblguating these two mean- 
ings. The scope of an adverb, such as "also," is de- 
termined by identifying repeated NL objects and newly 
introduced NL objects, where the latter are more likely 
to fall within the scope of the adverb. 
The context analyzer uses a similar method to deter- 
mine lexical ambiguities that were not resolved by the 
sentence analyzer wlmn the case base failed to provide 
enough information. 
?~We use about 20 of the semantic features described in the LDOCE. The restrictions imposed by the 
features are rather "loose," and are used to eliminate only unlikely combinations 
of word senses, 
Aches DE COL1NG-92. NANII~S, 23-28 AO~r\[ 1992 l 03 '7 PRO{:, O1: COLING-92. NANTES, AUO. 23-280 1992 
4. Concept Mapper 
Given a conceptual representation, which is an out- 
put from the parser, and a target language, the concept 
mapper tries to discover another conceptual represea- 
tation that has a well-defined mapping to a DS while 
keeping the semantic content as intact as possible. This 
process is called conceptual transfer. If the given con- 
eeptual representation already has a well-defined map- 
ping to a DS, the concept mapper does nothing and 
Shalt2 works like an interlingual MT system. It is im- 
portant that conceptual transfer should be related with 
the mapping to a DS, because there are generally many 
conceptual representations with a similar semantic con- 
tent. The existence of well-defined mapping not only 
guarantees that the generator can produce a sentence 
in the target language, but also effectively eliminates 
unsuccessful paraphrasing. 
111 addition to the paraphrasing rules mentioned ear- 
lier, the concept mapper uses the following genera\] rules 
for conceptual transferJ The paraphrasing rules are 
composed to make a complex mapping. 
* Projection: Map an NL object witha filled slot s 
to an instance of the same class with the unfilled 
slot s. Projection corresponds to deletion of a slot 
8. 
s Generalization: Map an NL object of a class X to 
an instance of one of the superclasses of X. 
e Specialization: Map an NL object of a class X to 
an instance of one of the subclasses of X. 
As an example, a projection rule is frequently used when 
we translate English nouns into Japanese ones, as in the 
following examp: 
diskette (*diskette (:sum (*st))) 
diskettes (*diskette (:num (*pl))) 
a diskette (*diskette (:num (*st)) 
(:def (*indeX))) 
the diskettes (*diskette (:num (*pl)) 
(:def (*def))) 
~4A~r 7 ~ (*diskette) 
Here, the four English noun phrases above are usually 
translated by the same Japanese noun phasef~ (the fifth 
one), which does not carry any information on mum 
and :def. We provide a paraphrasing rule for trans- 
lation in the opposite direction such that for any in- 
stance of the *object can obtain appropriate :sum and 
:def fillers. The parser, however, is responsible for de- 
termining these fillers in most cases. In general, the 
designer of semi-equivalent rules for translation in one 
direction has to provide a way of inferring missing infor- 
mation for translation in the opposite direction. Gen- 
eralization and specialization rules are complementary 
and can be paired to become equivalent rules when a 
specialization rule for any instance of a class z is un- 
ambiguous. That is, without losing any fillers, one can 
always choose an instance of a subclas~ V to which z can 
be uniquely mapped. A generalization from e~ch ~ to z 
provides the opposite mapping. 
~Theee are ~emi-equivMent rules. Equivalent rules have higher priority when the tulsa axe to be applied. 
~fOne exception is that deictic noun phrases are translated when we use the ~apanme counterpart "-~¢3 ~ for the determiner 
"the". 
5. Grammar-Based Sentence Genera- 
tor 
Recent investigation of unification grammars and 
their bi-direetionality Its, 9, \]0\] has enabled us to design 
and implement a grammar-based generator. Our gen~ 
crater uses a PUG grammar, which is also used by the 
parser, to traverse or decompose a feature structure ob- 
tained from a D$ in order to find a sequence of grammar 
rule applications, which eventually lead to lexical rules 
for generating a sentence. The generation algorithm is 
based primarily on Wedekind's algorithm, but h merit- 
tied for PUG. 
The current implementation of our generator lacks 
subtle control of word ordering, honorific expressions, 
and style preference. We are developing a set of dis- 
course parameters to a~ociate preferences with gram- 
mar rules to be tried, so that specific expressions are 
favored by the parameter settings. 
Acknowledgment 
Yutaka Tsutsumi built an initial version of concep- 
tual definitions. Katashi Nagao originally designed and 
implemented the disambiguation algorighm of the sen- 
tence analyzer. His continuous efforts to build a large- 
scale ease base and to improve the disambiguation al- 
gorithm have been continued by the Shalt2 group. Hi- 
roshi Maruyama enhanced his constraint solver for our 
project, which has led us to a constraint propagation 
method with delayed composition of conceptual repre- 
sentations. Michael McDonald read through an early 
draft of this paper and gave us many suggestions on 
technical writing. We thank them all. 
References 
1) IBM Corp. "Dictionary off Computing" (gth Edition). SC20- 1999-07, 1987. 
2) D. Gates, K. Takeda, T. Mitamur&, L. LevLn, and M. Kee. "Anab y=i, and Generation Grammar". 
M~ehine T~,n=latio,, 4(1)153- 9S t March 1989. 
3) K. Goodman and S. Nirenburg, editors. "The KBMT Project: A 
Coze Study in Kno~ledse-B~#ed Me, chine ~an#|allon n. Mor- gan Kaufmann Pubi~herw, San Matzo, California, 1991. 
4) H. Maruyama. "Structural Disambiguation with Constraint PropagationS. In 
Prec. el the £8th Anntml Meetin 9 of ACL, volume 31 pages 31-38, June 1990. 
5) H. Maruyama. "JAUNT: A Constraint Solver for Disjunctive Feature Structures". Technical Report PDfOO59, Tokyo ~arch 
Lab., IBM Reaearchj 1991. 
6) K. Nagao. "Dependency Analyzer: A Knowledge-Baaed Ap- proach to Structural Dmambiguation". In prec. of Ihe 13th 
lnlern¢tlonal Con\]erence on Comput¢tional Ling,lille*, pagen 484-489, Aug. 1990. 
7) E. Nyberg. "The FrameKit Uler% Guide Version 2.0". Technical Report CMU-CMT-88-MEMO, Center for Machine Translation, 
Carnegie Mellon Univereity, March 1988. 
8) Procter P. Lo,t#man Dictionarp el Contemporar~ 1English. Longman Group Limited, Harlow and London~ England, I978. 
9) S.M. Shieher, F. C. I"4. Pereira, G. van Noord, and R. C. Moore. "Semantle-Head-Driven Generation". 
Computational Linsni#- |icJ, 19(1):30-42, March 1990. 
1O) K. Takeda. =Bi-DireetlonalGrammarlfor MachineTranalation". In 
Pros. of Seoul International Conference on Natural Lan. 
guase Proce,Jin~, page~ 1~2~197, Seoul, Korea, 1November I990. 
11) K. Takeda. UD~ign~ng Natural Language Objects n. In Prec. 
oJ £nd lnlernationgl SFmpo*i~m on Databame S~/Jtem* for Ad. vaneed Appllcationlt pagem 444-448, Tokyo, Japan, April 1991. 
12) M. Tomlta. "Effleienl Parsln 9 for N¢tural Language: A Ftst 
Algor~th~ \]or Practical Sp*leml n. Kluwer Academic Publi0here, Beaten, MA, 198S. 
13) T. Ttutgurnl. "A Prototype Engllsh-Japanene Machine Tr~nsi~- tips System for Tranalatln 8 IBM Computer Minuals". In prec. 
of the llth lnternation¢l ConJerenee on Computation¢l Ligtt. i/tieJ, Auguut 1986. 
14) N, Uramoto. "I~xical and Structural Disarnbiguation Using an Example*Brute". In Prec. 
of the Sad Ja~n-A~Jtmlia Joint .qytnpo#iu~t on NLP, pages 1nO-160, Oct. 1991. 
15) J, Wedekind, =Generation as Structure Driven Derivatlon". In 
P~e. of the l£th International Conference on Computational Ligui*iiem , 
page= 732-737 t August 1988. 
ACRES DE COLING-92, NANTES, 23-28 AOt~T 1992 i 0 3 8 PROC. OV COLING-92, NANTES, AUG. 23-28, 1992 
