Natural Language ~-~I Planner PAR graphics 
Figure 3: General architecture of the animation system 
The planner uses information from the general 
schema, such as pre-conditions and post-assertions, 
as well as information derived from the agents' ca- 
pabilities and the objects properties to fill in these 
gaps in several ways: 
• to select the way (activity) in which the instruc- 
tion is performed (enter by walking, by swim- 
ming, etc.); 
• to determine the prepartory actions that must 
be completed before the instruction is carried 
out, (for example, in order for an agent to open 
the door, the door has to be reachable and that 
may involve a locomotion process); 
• to decompose the action into smaller units (put 
the glass on the table, involves getting the glass, 
planning a route to the table, etc.) 
The output of the planner for the input instruction 
is a complete description of the actions involved, in- 
cluding participants, preparatory specifications, ter- 
mination conditions, manner, duration, etc. Partic- 
ipants bring with them a list of inherent properties 
of the agent (e.g. agent capabilities) or physical ob- 
jects (e.g., object configurations) and other charac- 
teristics, such as 'how to open' for an object such 
as a door. This complete description refers to a set 
of animation PARS which can be immediately ani- 
mated. 
In this way, a PAR schema for the action enter 
may actually translate into an animation PAR for 
walking into a certain area. One way to differenti- 
ate between action PAR schemas and instantiated 
animation PARs is to consider what it is possible to 
motion capture 4 (by attaching sensors to a moving 
human figure). For example, the enter action and 
the put action are quite general and underspecified 
and could not be motion captured. However, char- 
acteristic activities such as walking and swimming 
could be. For further details about the animation 
PARs and the animation system see (Badler et al., 
1999) and (Bindiganavaie et al., 2000). 
4 PAR as anIL 
The PAR representation for an action can be seen as 
a general template. PAR schemas include, as part 
of the basic sub-categorization frame, properties of 
4There are several other ways to generate motions, for 
example, through inverse kinematics, dynamics and key- 
framing. 
the action that can occur linguistically either as the 
main verb or as adjuncts to the main verb phrase. 
This captures problems of divergences, such as the 
ones described by Talmy (Talmy, 1991), for verb- 
framed versus satellite-framed s. 
New information may come from a sentence in 
natural  that modifies the action's inherent 
properties, such as in John hit the ball slowly, where 
'slowly' is not part of the initial representation of 
the action 'hit'. This new information is added to 
the PAR schema. 
Verb- versus Satellite-framed s 
Verb-Framed Languages (VFL) map the motion 
(path or path + ground location) onto the verb, 
and the manner either onto a satellite or an ad- 
junct, while Satellite-Framed Languages (SFL) map 
the motion into the satellite, and the manner onto 
the main verb. 
English and other Germanic s are consid- 
ered satellite-framed s, expressing the path 
in the satellite; Spanish, among other Romance lan- 
guages, is a verb-framed  and expresses the 
path in the main verb. The pairs of sentences (1) 
and (2) from Talmy (1991) show examples of these 
divergences. In (1), in English, the exit of the bot- 
tle is expressed by the preposition out, in Spanish 
the same concept is incorporated in the main verb 
salir (to exit). In (2), the concept of blowing out 
the candle is represented differently in English and 
Spanish. 
(1) The bottle .floated out 
La boteUa sali6 flotando 
(the bottle exited floating) 
(2) I blew out the candle 
Apagud la vela sopldndola 
(I extinguish the candle blowing) 
4.1 Motion 
In order to capture generalizations about motion ac- 
tions, we have a generalized PAR schema for mo- 
tion, and our hierarchy includes different types of 
motion actions such as inherently directed motion 
and manner of motion actions that inherit from the 
more general schema, as shown in Figure 4. Directed 
motion actions, such as enter and exit, don't bring 
with them the manner by which the action is carried 
out but they have a inherent termination condition. 
For example, 'enter a room' may be done by walk- 
ing, crawling or flying depending on the agents' ca- 
14 
motion/(par: motion) 
directed_motion manner_raotion 
enter/(term: in (0B J) ) exit/(term: out (0B J) ) crawl/(act : crawl) float/(act ::float) 
Figure 4: PAR schema hierarchy for motion actions 
pabilities, but it should end when the agent is in the 
room. In contrast, manner of motion verbs express 
the action explicitly and don't have an intrinsic ter- 
mination condition. 
Motion is a type of framing event where the path 
is in the main verb for VFLs and in the satellite for 
SFLs. In (3), we see the English sentence expressing 
the 'enter' idea in the preposition into whereas the 
Spanish sentence expresses it in the main verb entrar 
(to enter). 
(3) The bottle floated into the cave 
La botella entr5 flotando a la cueva 
(the bottle entered floating the cave) 
The PAR schemas don't distinguish the represen- 
tation for these sentences, because there is a sin- 
gle schema which includes both the manner and the 
path without specifying how they are realiized lin- 
guistically. Mappings from the lexical items to the 
schemas or to constraints in the schemas can be seen 
in Figure 5. 5 Independent of which is the source lan- 
guage, the PAR schema selected is motion, the ac- 
tivity field, which determines how the action is per- 
formed (in this case, by floating), is filled by float 
(the main verb in English, or the adjunct in Span- 
ish). The termination condition, which says that 
action ends when the agent is in the object, is added 
from the preposition in English and is part of the 
semantics of the main verb to enter in Spanish. 
EN float/\[par:motion,activity:float\] 
into/\[term:in(AG,OBJ)\] 
SP entrar/\[par:motion,term:in(AG,OBJ)\] 
flotar/\[activity :float\] 
Figure 5: Entries for the example sentences in (3) 
Because all of the necessary elements for a trans- 
lation are specified in this representation, it is up 
5A lexical item may have several mappings to reflect its 
semantics: For instance, float in English can be used also in 
the non-motion sense, in which case there will be two entries 
to capture that distinction. 
MOTION PAR 
activity : float 
agent : participants : object : bottle \] 
cave 
termination_cond : in(bottle, cave) 
Figure 6: A (simplified) PAR schema for the sen- 
tences in (3) 
to the  specific component to transform it 
into a surface structure that satisfies the grammati- 
cal principles of the destination . 
Comparison with other work 
Our approach now diverges considerably from the 
approach outlined in Palmer et al. (1998) which 
discusses the use of Feature-Based Tree Adjoining 
Grammars, (Joshi, 1985; Vijay-Shanker and Joshi, 
1991) to capture generalizations about manner-of- 
motion verbs. They do not propose an interlin- 
gua but use a transfer-based mechanism expressed 
in Synchronous Tree Adjoining Grammars to cap- 
ture divergences of VFL and SFL through the use 
of semantic features and links between the gram- 
mars. The problem of whether or not a preposi- 
tional phrase constitutes an argument to a verb or 
an adjunct (described by Palmer et al.) does not 
constitute a problem in our representation, since all 
the information is recovered in the same template 
for the action to be animated. 
The PAR approach is much more similar to 
the Lexical Conceptual Structures (LCS) approach, 
(Jackendoff, 1972; Jackendoff, 1990), used as an in- 
terlingua representation (Doff, 1993). Based on the 
assumption that motion and manner of motion are 
conflated in a matrix verb like swim, the use of LCS 
allows separation of the concepts of motion, direc- 
tion, and manner of motion in the sentence John 
swam across the lake. Each one of these concepts is 
15 
represented separately in the interlingua represen- 
tation, as GO, PATH and MANNER, respectively. 
Our approach allows for a similar representation and 
the end result is the same, namely that the event of 
swimming across the lake is characterized by sepa- 
rate semantic components, which can be expressed 
by the main schema and by the activity field. In ad- 
dition, our representation also incorporates details 
about the action such as applicability conditions, 
preparatory specifications, termination conditions, 
and adverbial modifiers. It is not clear to us how 
the LCS approach could be used to effect the same 
commonality of representation. 
4.2 Instrument 
The importance of the additional information such 
as the termination conditions can be more clearly 
illustrated with a different set of examples. Another 
class of actions that presents interesting divergences 
involves instruments where the instrument is used 
as the main verb or as an adjunct depending on the 
. The sentence pair in (4) shows this di- 
vergence for English and Portuguese. Because Por- 
tuguese does not have a verb for to spoon, it uses 
a more general verb colocar (to put) as the main 
verb and expresses the instrument in a prepositional 
phrase. Unlike directed motion actions, a put with 
hand-held instrument action (e.g., spoon, scoop, la- 
dle, etc.) leaves the activity field unspecified in both 
s. The specific action is generated by taking 
the instrument into account. A simplified schema is 
shown in Figure 7. 
(4) Mary spoons chocolate over the ice cream 
Mary coloca chocolate sobre o sorvete coma 
colher 
(Mary puts chocolate over the ice cream with 
a spoon) 
PUT3 PAR 
activity : - 
participants : 
agent: Mary 
objects: chocolate, 
icecresm, 
spoon 
preparatory_spec : get(Mary, spoon) 
termination_cond : over(chocolate, icecream) 
Figure 7: Representation of the sentences in (4) 
Notice that the only connection between to spoon 
and its Portuguese translation would be the termi- 
nation condition where the object of the verb, choco- 
late, has a new location which is over the ice cream. 
5 Conclusion 
We have discussed a parameterized representation 
of actions grounded by the needs of animation of 
instructions in a simulated environment. In order 
to support the animation of these instructions, our 
representation makes explicit many details that are 
often underspecified in the , such as start 
and end states and changes in the environment that 
happen as a result of the action. 
Sometimes the start and end state information 
provides critical information for accurate translation 
but it is not always necessary. Machine translation 
can often simply preserve ambiguities in the transla- 
tion without resolving them. In our application we 
cannot afford this luxury. An interesting question 
to pursue for future work will be whether or not we 
can determine which PAR slots are not needed for 
machine translation purposes. 
Generalizations based on action classes provide 
the basis for an interlingua approach that captures 
the semantics of actions without committing to any 
-dependent specification. This framework 
offers a strong foundation for handling the range 
of phenomena presented by the machine translation 
task. 
The structure of our PAR schemas incorpo- 
rate into a single template the kind of divergence 
presented in verb-framed and satellite-framed lan- 
guages. Although not shown in this paper, this 
representation can also capture idioms and non- 
compositional constructions since the animations of 
actions - and therefore the PARs that control them 
- must be equivalent for the same actions described 
in different s. 
Currently, we are also investigating the possibility 
of building these action representations from a class- 
based verb lexicon which has explicit syntactic and 
semantic information (Kipper et al., 2000). 
Acknowledgments 
The authors would like to thank the Actionary 
group, Hoa Trang Dang, and the anonymous review- 
ers for their valuable comments. This work was par- 
tially supported by NSF Grant 9900297. 

References 
Norman I. Badler, Martha Palmer, and Rama Bindi- 
ganavale. 1999. Animation control for real-time 
virtual humans. Communications off the ACM, 
42(7):65-73. 
Norman I. Badler, Rarna Bindiganavale, Jan All- 
beck, William Schuler, Liwei Zhao, and Martha 
Palmer, 2000. Embodied Conversational Agents, 
chapter Parameterized Action Representation for 
Virtual Human Agents. MIT Press. to appear. 
Rama Bindiganavale, William Schuler, Jan M. All- 
beck, Norman I. Badler, Aravind K. Joshi, and 
Martha Palmer. 2000. Dynamically altering agent 
behaviors using natural  instructions. 
Fourth International Conference on Autonomous 
Agents, June. 
Hoa Trang Dang, Karin Kipper, Martha Palmer, 
and Joseph Rosenzweig. 1998. Investigating reg- 
ular sense extensions based on intersective levin 
classes. In Proceedings of COLING-A CL98, pages 
293-299, Montreal, CA, August. 
Bonnie J. Dorr. 1993. Machine Translation: A View 
from the Lexicon. MIT Press, Boston, MA. 
R. Jackendoff. 1972. Semantic Interpretation in 
Generative Grammar. MIT Press, Cambridge, 
Massachusetts. 
R. Jackendoff. 1990. Semantic Structures. MIT 
Press, Boston, Mass. 
Aravind K. Joshi. 1985. How much context sensi- 
tivity is necessary for characterizing structural de- 
scriptions:. Tree adjoining grammars. In L. Kart- 
tunen D. Dowry and A. Zwicky, editors, Nat- 
ural  parsing: Psychological, computa- 
tional and theoretical perspectives, pages 206-250. 
Cambridge University Press, Cambridge, U.K. 
Aravind K. Joshi. 1987. An introduction to tree ad- 
joining grammars. In A. Manaster-Ramer, editor, 
Mathematics of Language. John Benjamins, Ams- 
terdam. 
Karin Kipper, Hoa Trang Dang, and Martha 
Palmer. 2000. Class-based construction of a verb 
lexicon. In submitted to AAAL 
Beth Levin. 1993. English Verb Classes and Alter- 
nation, A Preliminary Investigation. The Univer- 
sity of Chicago Press. 
Martha Palmer, Joseph Rosenzweig, and William 
Schuler. 1998. Capturing Motion Verb General- 
izations with Synchronous TAG. In Patrick St. 
Dizier, editor, Predicative Forms in NLP. Kluwer 
Press. 
William Schuler. 1999. Preserving semantic depen- 
dencies in synchronous tree adjoining grammar. 
Proceedings of the 37th Annual Meeting of the 
Association for Computational Linguistics (ACL 
'99). 
Stuart M. Shieber and Yves Schabes. 1990. Syn- 
chronous tree adjoining grammars. In Proceedings 
of the 13th International Conference on Compu- 
tational Linguistics (COLING '90), Helsinki, Fin- 
land, August. 
Stuart M. Shieber. 1994. Restricting the weak- 
generative capability of synchronous tree adjoin- 
ing grammars. Computational Intelligence, 10(4). 
Leonard Talmy. 1991. Path to realization-via as- 
pect and result. In Proceedings of the 17th Annual 
Meeting of the Berkeley Linguistic Society, pages 
480-519. 
K. Vijay-Shanker and Aravind Joshi. 1991. Uni- 
fication based tree adjoining grammars. In 
J. Wedekind, editor, Unification-based Grammars. 
MIT Press, Cambridge, Massachusetts. 
