FROM STRUCTURE TO PROCESS
Computer-assisted teaching of various strategies for generating
pronoun constructions in French (1)
Michael Zock
Gérard Sabah Christophe Alviset
LIMSI - Langues Naturelles
B.P. 30 - Orsay Cédex / France
INSSEE - 3, av. P.Larousse
92241 Malakoff - France
ABSTRACT
This paper describes an implemented tutoring system (2),
designed to help students to generate clitic-constructions in French.
While showing various ways of converting a given meaning structure
into its corresponding surface expression, the system helps not only
to discover what data to process but also how this information
processing should take place. In other words, we are concerned with
efficiency in verbal planning (performance).
Recognizing that the same result can be obtained by various
methods, the student should find out which one is best suited to
the circumstances (what is known, task demands etc.). Informational
states, hence the processor's needs, may vary to a great extent,
as may his strategies or cognitive styles. In consequence, in
order to become an efficient processor, the student has to acquire
not only STRUCTURAL or RULE-KNOWLEDGE but also PROCEDURAL KNOWLEDGE
(skill).
With this in mind we have designed three modules in order
to foster a reflective, experimental attitude in the learner,
helping him to discover insightfully the most efficient strategy.
1. INTRODUCTION
It is well known that the same output can be achieved by
several methods. For example, a given set of sentences or texts can
be generated by a variety of equivalent but different grammars.
Any of these grammars can be used in numerous ways.
Grammars are generally neutral with respect to processing
(3). They pertain only to competence, and performance factors such
as memory load, focus of attention, etc. lie outside their scope.
Though different grammars may be equivalent in terms of their
product (they all produce the same result, i.e. the same set of
sentences), they certainly differ in terms of the processing, that is
to say in terms of their relative efficiency (speed, memory load,
etc.).
Whereas most scholars working in the domain of generation
do not deal with strategies (4), since they consider but one way to
reach the solution, we will be concerned with the procedural
implications of using a given grammar in a variety of ways.
Instead of having competing grammars, we will take one of
them (5) and relate its efficiency to the way it is used. This
performance-oriented approach seems justified on theoretical as well
as on practical grounds (economy and flexibility of processing).
Let us take, for example, a student who would like to become
fluent in French. Obviously, he would have to learn not only
what to process, but also how to process in order to efficiently
convert a given meaning (conceptual graph) into its corresponding
expression (sentence). In other words, our student has to learn
not only a set of grammatical rules but also a set of strategies
or operating principles (6) powerful and flexible enough to get
from a given input (meaning) to the output (sentence) in the most
economic way, i.e. with the fewest operations, with the least
storage, and with the minimum amount of transformations.
2. PROCESS, FUNCTION OF STRUCTURE:
It is a well known fact that students learning French have
difficulties in fluently producing sentences with 2 pronoun complements
such as:
Dis-le moi              V-DO-IO           Tell me (tell me it)
Ne le lui dis pas!      neg-DO-IO-V-neg   Don't tell him (that)!
Il te le donnera        S-IO-DO-V         He'll give it to you
Il le lui donnera       S-DO-IO-V         He'll give it to him
Je te présente à elle   S-DO-V-prep-IO    I'll present you to her
It is interesting to find out why these constructions are so
difficult to learn and to process. We believe that there are three
basic reasons for this:
1) the structural idiosyncrasies of the French system:
morphology and syntax are interdependent;
2) the procedural implications of this structure:
many morphemes have an embedded structure (see below);
3) the resource limitations of the human processor:
being a serial processor, the learner can focus his attention
on but one thing at a time.
2.1 STRUCTURAL PARTICULARITIES:
French pronoun constructions are complicated because syntax
and morphology are interrelated, form as well as position depending
upon each other. Their generation implies that one is capable of
determining at least three things:
- the form of a given referent:
for example, the concepts SPEAKER or 3d PERSON can be realized
in any of the following forms:
SPEAKER: je, me, moi
3d PERSON: il, elle, ils, elles, on, se, soi,
le, la, les, lui, leur, eux;
- its position:
In the affirmative mode there are three positions or
sentence-frames:
a) S-IO-DO-V        il me le présente      (he presents him to me)
b) S-DO-IO-V        il le lui présente     (he presents him to her)
c) S-DO-V-prep-IO   il me présente à elle  (he presents me to her)
- whether the preposition, inherent in the base, should be made
explicit or not. As the examples (b) and (c) clearly show, the
same verb construction may or may not require elision of the
preposition. Either one affects form as well as position (7).
It should be noted that while most verbs allow only for two
patterns in the declarative mode ('a' and 'b'), those with an
animate object such as 'présenter' (to present) allow also for 'c'.
2.2 PROCEDURAL IMPLICATIONS:
The linguistic constraints operate on all levels: phonological,
morphological and syntactical.
a) Phonological constraints:
The determination of morphology generally requires three
operations (person, case, number and sometimes gender), yet pronouns
are monosyllabic. In consequence, one cannot plan the next pronoun
while uttering the current one: the pronoun uttered is too short
and the time needed for planning the next one is too long.
b) Morphological constraints:
There are a number of cases where the indirect object has an
embedded structure, i.e. the morphology of the indirect object
depends upon information coming from the direct object (8). This
implies interruption of a routine. Suppose that the sentence:
John presents Paul to Mary
is to be pronominalized. The problem is the determination of form
and position of the pronouns, referring respectively to "Paul" and
to "Mary". The indirect object (Mary) lexicalizes either as LUI
or as ELLE, depending upon whether the direct object (Paul)
represents the speaker/listener or a 3d person. In this latter case (e)
the verb follows the indirect object, whereas in the former (d) it
precedes it.
(d) il me présente à ELLE (he presents me to her)
(e) il le LUI présente (he presents him to her)
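The dependency illustrated by (d) and (e) can be sketched in a few lines of Python. This is only an illustration of the rule as stated above, not the authors' implementation; the function name `indirect_object` and the string labels are hypothetical, and the sketch is restricted to a 3d person singular indirect object.

```python
def indirect_object(do_person, io_gender):
    """Return (form, position) of a 3d-person singular indirect object.

    The form cannot be chosen before the person of the direct object
    is known: this is the "embedded structure" described in the text.
    """
    if do_person in (1, 2):
        # Case (d): the direct object is speaker or listener, so the
        # preposition is made explicit and the pronoun follows the verb.
        stressed = "elle" if io_gender == "female" else "lui"
        return ("à " + stressed, "after verb")
    # Case (e): 3d-person direct object, clitic LUI before the verb.
    return ("lui", "before verb")


if __name__ == "__main__":
    print(indirect_object(1, "female"))  # il me présente à ELLE
    print(indirect_object(3, "female"))  # il le LUI présente
```

Note that the gender of the indirect object only matters in case (d); in case (e) the clitic is the same for both genders.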
c) Syntactical constraints:
The linear order of the constituents can generally not be
established until both objects are known. In consequence, at least
one of the two elements has to be stored in working memory.
(f) il le lui donne   he gives it to him   (S-DO-IO-V)
(g) il me le donne    he gives it to me    (S-IO-DO-V)
Suppose that the direct object has been processed right after
the subject. In that case one knows its form but not necessarily
its position ('f' or 'g'). This latter depends upon the value
of the indirect object. If the indirect object is in the first or
second person it precedes the direct object (g), otherwise it
follows it (f). Should we start by processing the indirect object
before the direct one, we might have to keep the former in working
memory. This is precisely the case of "f", where the indirect
object is in the third person and not reflexive. As one can see, in
both situations one is faced with unwanted storage problems.
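The ordering conditions described in 2.1 and 2.2 can be summarized as a small decision function. The following Python sketch is an assumption-laden reading of the text, not the system's actual rule base: the name `select_frame` and its parameters are hypothetical, and the treatment of reflexives is inferred from the remark that case "f" requires a non-reflexive third person.

```python
def select_frame(do_person, io_person, io_reflexive=False):
    """Choose one of the three declarative sentence frames (a, b, c)."""
    if do_person in (1, 2):
        # Frame (c): il me présente à elle
        return "S-DO-V-prep-IO"
    if io_person in (1, 2) or io_reflexive:
        # Frame (a), example (g): il me le donne
        return "S-IO-DO-V"
    # Frame (b), example (f): il le lui donne
    return "S-DO-IO-V"


if __name__ == "__main__":
    print(select_frame(do_person=3, io_person=3))  # S-DO-IO-V
    print(select_frame(do_person=3, io_person=1))  # S-IO-DO-V
    print(select_frame(do_person=1, io_person=3))  # S-DO-V-prep-IO
```

The function needs the values of both objects before it can return, which is the reason why word-to-word processing is excluded.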
Obviously these structural particularities of the French
pronoun system have implications not only for the process of
learning but also for the process of generation, namely:
they exclude any word-to-word processing, and
they require a certain amount of preplanning or
look-ahead.
What is needed then, in order to avoid false starts or
corrections (backtracking), is global planning on the clause level
rather than local planning on the word level.
In the light of these facts one has to admit that the
generation of pronoun constructions in French is not all that simple.
Although the relevant features (rules) are simple in nature, their
interaction is highly complex. It is thus not surprising that
students take a long time to understand all the intricacies of the
system, which would allow them eventually to integrate the rules
into an efficient process-model.
3. OBJECTIVE:
The system described here is an attempt to help the student
to acquire the necessary structural and procedural knowledge.
Its goal can be characterized as follows:
While learning experimentally about structure (grammar rules)
he should learn as well about the process of incremental
sentence generation. In other words, by playing with the system,
the student should gain necessary insights into the grammar, its
procedural implications etc. He should also reflect upon his own
strategies. All these insights should help him to develop a more
efficient set of procedures.
Since the discovery of optimal processing strategies
implies that one learns how to access the grammatical database
under different circumstances (the data and their use being
separated), we have varied the processing situation as well as the
coding of the data. Variable task demands and multiple
representation should enhance the flexibility, speed and economy of
processing.
4. DESCRIPTION OF THE SYSTEM:
The heart of the system is a knowledge base which contains,
in the form of production rules, the structural information
necessary to incrementally determine form as well as position.
Furthermore the system contains an inference mechanism, i.e. a
set of rules, whose function is to deduce new facts from any
information given to the system.
The base can be accessed in various ways, thus allowing
for varying usage of the knowledge according to the objective.
We will use it here in three ways, varying one of the following
parameters: input, output, or processing, while keeping the others
constant. The methods thus differ in the following ways:
- what is known at the input ?
- what is expected at the output ?
- which method or strategy is used to get from one to the other ?
The three methods have a common goal, namely, the building
of larger blocks (schemata). One of the main objectives is
to induce strategies where items belonging conceptually together
are also processed together (grouping). This chunking method
not only avoids unnecessary disruptions and memory load, but
hopefully also favors the evolution from serial to simultaneous
processing.
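To make the combination of production rules and inference mechanism more concrete, here is a minimal forward-chaining sketch in Python. It is not the system's code (which, according to note 2, was written in other languages); the fact encoding as (attribute, value) pairs, the names `RULES` and `forward_chain`, and the four rules themselves are assumptions chosen to mirror examples (f) and (g).

```python
# Each rule: (set of premises, conclusion). Facts are (attribute, value) pairs.
# The rules are hypothetical encodings of statements made in section 2.2.
RULES = [
    ({("do.person", 3), ("io.person", 1)}, ("frame", "S-IO-DO-V")),
    ({("do.person", 3), ("io.person", 2)}, ("frame", "S-IO-DO-V")),
    ({("do.person", 3), ("io.person", 3)}, ("frame", "S-DO-IO-V")),
    ({("frame", "S-IO-DO-V"), ("io.person", 1), ("io.number", "singular")},
     ("io.form", "me")),
]


def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new fact can be deduced."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


if __name__ == "__main__":
    given = {("do.person", 3), ("io.person", 1), ("io.number", "singular")}
    deduced = forward_chain(given, RULES) - given
    print(sorted(deduced))  # the frame, then the form of the indirect object
```

In this toy run the frame is deduced first and the form of the indirect object follows from it, a two-step chain of the kind the inference mechanism is meant to produce.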
5. APPLICATIONS:
5.1 THE SOCRATIC METHOD:
The system guides the student in the form of a dialogue,
by showing him what and how to process in order to get from an
input to the output. The user starts by providing the input
(verb pattern composed of a verb, its complements and
prepositions):
donner (qn, qc, à qn)   to give (so, sth, to so)
The system takes over, asking for more information
about these basic elements. By asking specific questions
(person, gender, number etc.), the system shows which information
is relevant when determining form as well as position. While
answering these questions the student incrementally determines
the final form of the sentence. The following example may
illustrate the process:
INPUT given by the user:
    donner (quelqu'un, quelque chose, à quelqu'un)
    to give (somebody, something, to somebody)

PROCESSING
prompts from the system    answers given by    successive
questions (attributes)     the user (value)    OUTPUTS

SPEECH-ACT                 order
SUBJECT
    person                 2
    number                 plural              donnez
DIRECT OBJECT
    quantity               definite
    person                 3
    number                 singular
    gender                 male                le
INDIRECT OBJECT
    person                 1
    number                 singular            moi

linearized output: donnez-le moi!
                   (give it to me!)
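The dialogue above can be replayed with a short Python sketch. Everything in it is illustrative: the mini-lexicon, the key names of the `ANSWERS` dictionary and the function `socratic_dialogue` are hypothetical, and the sketch deliberately covers only affirmative orders with a definite 3d person direct object, i.e. the case shown in the example.

```python
# Hypothetical mini-lexicon, restricted to what the worked example needs.
IMPERATIVE_DONNER = {
    (2, "singular"): "donne",
    (1, "plural"): "donnons",
    (2, "plural"): "donnez",
}
# Indirect-object forms used after an affirmative imperative.
POSTVERBAL_IO = {
    (1, "singular"): "moi", (2, "singular"): "toi", (3, "singular"): "lui",
    (1, "plural"): "nous", (2, "plural"): "vous", (3, "plural"): "leur",
}

ANSWERS = {
    "speech_act": "order",
    "subject_person": 2, "subject_number": "plural",
    "do_person": 3, "do_number": "singular", "do_gender": "male",
    "io_person": 1, "io_number": "singular",
}


def direct_object_clitic(number, gender):
    """Form of a definite 3d-person direct object."""
    if number == "plural":
        return "les"
    return "le" if gender == "male" else "la"


def socratic_dialogue(answers):
    """Turn the user's answers into successive outputs and a sentence."""
    if answers["speech_act"] != "order" or answers["do_person"] != 3:
        raise ValueError("sketch covers only orders with a 3d-person object")
    verb = IMPERATIVE_DONNER[(answers["subject_person"],
                              answers["subject_number"])]
    do = direct_object_clitic(answers["do_number"], answers["do_gender"])
    io = POSTVERBAL_IO[(answers["io_person"], answers["io_number"])]
    # Affirmative order: V-DO-IO, spelled as in the example above.
    return [verb, do, io], f"{verb}-{do} {io}!"


if __name__ == "__main__":
    outputs, sentence = socratic_dialogue(ANSWERS)
    print(outputs, sentence)
```

Each element of the returned list corresponds to one entry of the OUTPUTS column, produced as soon as the attributes of its referent are complete.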
The qualities of this socratic dialogue lie in the
visualization of the whole process. The system demonstrates which
information should be processed and in what order. It also shows under
what conditions movements of constituents are necessary. These
permutations are shown on the screen, so that the user can learn which
features control those movements. Furthermore, the results of the
processed data are shown on-line, i.e. the form and position of
the word determined are shown instantaneously. Finally the system
tells whether the newly determined item can be articulated right
away or not. The system is thus explicit with respect to rule
knowledge and optimal in terms of processing. The result is
obtained in the most economic way.
The disadvantage of this system-driven processing resides in
the fact that the solution, or more precisely, the method used to
arrive at the solution, is shown but not discovered. Moreover,
only one method is considered, hence the procedural knowledge remains
implicit. The student will not even envisage other methods. He may
thus know how to convert meaning into sentences, but this knowledge
being implicit, he will not know how to transfer it to other
situations.
5.2 GUIDED DISCOVERY:
The system still controls the nature of the operations but no
longer controls their order. The latter is controlled, via
strategies, by the user. He decides in what order to process the data.
Having determined the subject, whose position is invariable, one
can choose from three strategies:
- a syntactical one (syntax-driven processing),
- and two morphological ones (lexically-driven processing).
If priority is given to syntax, no reordering of constituents
is meant to take place, i.e. all information pertaining to word
order is processed first. The result is an ordered categorial structure
or syntactical frame (h) which will be filled in by the
morphological values determined later (i), for example:
he gives it to her
(h) sentence frame: SUBJECT - DIRECT OBJECT - IND. OBJECT - VERB
(i) morphology:     il      - le            - lui         - donne
If priority is given to morphology (lexically-driven generation),
the form is determined before the relative order of the constituent
elements. In this case two strategies are possible: one processes
first either the direct or the indirect object.
The efficiency of these three strategies is of course not the
same. It is precisely the user's task to find out which of these
strategies is the most efficient. The system invites him to compare
these methods by applying certain performance criteria:
- number of steps necessary to generate the sentence,
- what is known when ? (form/position),
- congruence of input/output order (are permutations necessary?
LIFO/FIFO)
- are there any conceptual disruptions ? (9)
This experimental method should make the student aware of the
fact that several strategies can be used to arrive at the solution.
He should compare them with respect to certain criteria and reach
his conclusions.
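One of these criteria, the congruence of input and output order, lends itself to a small simulation. The Python sketch below is our own illustration, not a description of how the system scores strategies: it assumes that the target frame is already known, and the function name `storage_cost` is hypothetical. It counts how many determined items have to wait in working memory before they can be articulated.

```python
def storage_cost(processing_order, output_frame):
    """Largest number of items held in memory while generating the frame.

    An item can be uttered only when it is the next slot of the frame;
    otherwise it is stored until the slots before it have been uttered.
    """
    pending = []      # items determined but not yet utterable
    next_slot = 0     # index of the next slot to be articulated
    max_stored = 0
    for item in processing_order:
        pending.append(item)
        # Utter everything that is now next in line.
        while (next_slot < len(output_frame)
               and output_frame[next_slot] in pending):
            pending.remove(output_frame[next_slot])
            next_slot += 1
        max_stored = max(max_stored, len(pending))
    return max_stored


if __name__ == "__main__":
    frame_f = ["S", "DO", "IO", "V"]   # il le lui donne
    frame_g = ["S", "IO", "DO", "V"]   # il me le donne
    print(storage_cost(["S", "DO", "IO", "V"], frame_f))
    print(storage_cost(["S", "DO", "IO", "V"], frame_g))
    print(storage_cost(["S", "IO", "DO", "V"], frame_f))
```

A cost of zero means that processing order and output order are congruent; any other value signals a permutation and hence a storage problem of the kind discussed in 2.2.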
5.3 USER DRIVEN EXPERIMENTATION:
This method, like the previous one, is empirical. By playing
with the system the student may gain certain insights about
processing order.
A matrix appears on the screen, whose blank spaces have to be
filled in by the student. The horizontal line shows the syntactic
information given with the input (verb, subject, object,
preposition), about which more information is needed; the vertical
line shows the nature of the information necessary to arrive
at the output.
Thus the processing once again consists of the specification
of the values of a list of attributes. However there is a
fundamental difference between this approach and the former, namely,
the system has an inference mechanism. Each item of information
given to the system is considered for its meaning potential, i.e.
the system tries to find out whether some new facts can be inferred
from the old fact.
It should be noted that the inference power varies with the
nature of the data as well as with their order. There are cases
where a single fact enables 3 other facts to be deduced (reflexives).
A given inference may allow further deductions (inference-chain,
knowledge propagation). This has of course an effect on the process,
namely, the greater the inference power, the greater the economy
of processing. This speaks for the following operating principle:
the greater the inference power of a given piece
of information, the earlier it should be processed.
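The operating principle can be illustrated by ranking pieces of information according to the number of facts they allow one to deduce. The Python sketch below is purely illustrative: the table `DIRECT_INFERENCES` is a hypothetical stand-in for the system's rules (only the reflexive entry, with its three deductions, is taken from the text), and the names `inference_power` and `processing_order` are ours.

```python
# Hypothetical table: which facts follow directly from a given fact.
DIRECT_INFERENCES = {
    # From the text: a reflexive object takes person, number and
    # gender from the subject, i.e. one fact yields three others.
    "object is reflexive": ["object person", "object number",
                            "object gender"],
    # Illustrative simplification: the person bears on the position.
    "object person": ["object position"],
}


def inference_power(fact, table):
    """Number of facts reachable from `fact` through inference chains."""
    reached = set()
    stack = [fact]
    while stack:
        current = stack.pop()
        for consequence in table.get(current, []):
            if consequence not in reached:
                reached.add(consequence)
                stack.append(consequence)
    return len(reached)


def processing_order(facts, table):
    """Most informative facts first, as the operating principle suggests."""
    return sorted(facts, key=lambda f: inference_power(f, table),
                  reverse=True)


if __name__ == "__main__":
    candidates = ["object gender", "object person", "object is reflexive"]
    print(processing_order(candidates, DIRECT_INFERENCES))
```

In this toy table the reflexive fact propagates to four others (three directly, one through a chain), so it is ranked first, while gender, which licenses no deduction, comes last.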
This method is interesting in that, by testing different items
and different orders, it makes it possible to watch on the screen which
items allow what inferences. Since those inferences depend upon
the nature of the input as well as on the moment at which that
information is given, we believe that this module is particularly
useful in helping discover the best possible order of processing.
Furthermore we think that this method has another virtue,
namely that it can simulate literally any knowledge state, thus
making it possible, by experimental means, to discover the shortest
path between a given information state (input) and the solution
(output).
6. CONCLUSIONS:
We have stressed the need for teaching procedural knowledge
(strategies) as well as structural knowledge (linguistic rules).
Furthermore, we argued that the procedures to be learned had to
be flexible, because the input conditions (informational states)
as well as the cognitive styles may vary both among individuals
and within the same individual. In integrating the student into
the learning-process we hopefully make him:
- actively curious (testing of hypothesis - learning by discovery);
- conscious about the need for planning (how far should one plan
ahead ? what are the planning units ?);
- selective about the means he should use (which strategy is best
under what circumstances ?).
The whole idea of having different strategies compete has
been largely ignored by current work on language generation. While
this aspect may be only of secondary interest for automatic
generation in general, it certainly is not an unimportant issue in
cognitive modelling, whether it be second language-learning or
usage.
7. NOTES: 
1° Our grammar deals only with a small subset of French, namely
pronoun constructions (clitics). Starting with input
propositions of the type:
to give (somebody, something, to somebody)
the system helps the student to determine the output. The
input above could lead to any of the following outputs:
QUESTION: Est-ce que tu le lui as donné ?
ASSERTION: Je le lui donne.
ORDER: Donne-le lui !
2° The modules described are written in Simula and Prolog. They
were implemented by G.Sabah and C.Alviset.
3° There are a few exceptions like Robinson's (1975), Carroll's
(1980) or Kempen & Hoenkamp's (1982) approach.
4° See for example: Davey (1978), McDonald (1983), McKeown
(1982), Mann (1983), Sowa (1983), Danlos (1985).
5° Our grammar is basically a lexical-functional grammar (see
Kay, 1979)
6° Among those operating principles are the following:
- avoid disruptions by grouping together what belongs
conceptually together;
- start with the most informative items
(feature hierarchy: PERSON, CASE, NUMBER, GENDER);
- avoid unnecessary storage - start with the leftmost item.
7° The fact that prepositions have morphological reflexes has
been readily recognized by linguists. What has not been shown
are the conditions under which a preposition has to be made
explicit or not, but that is the kind of knowledge a speaker must
have.
8° This is generally not made explicit in linguistic descriptions.
9° Given the fact that the whole process is visualized in the form of
Pascal-like structures, the student can easily realize at what
moment conceptual disruptions take place. Hierarchy is signalled
through indentations. All features pertaining to the same
referent are represented on the same level. It can happen that one
cannot process all information for a given referent. For example
if priority is given to syntax it often happens that one cannot
complete a procedure because of an embedded structure. Having
started with the direct object, one needs information from the
indirect object before getting back to the original object. This
jumping back and forth results in conceptual disruption, which
is precisely what should be avoided.
References

Carroll, J.B.
1980 A Performance Grammar Approach to Language Teaching, in:
Schiefelbusch, R.: Non-speech language communication,
Univ. Park Press, Baltimore

Danlos, L.
1985 Génération Automatique de Textes en Langues Naturelles
Masson, Paris

Davey, A.
1978 Discourse production, Edinburgh University Press

Kay, M.
1979 Functional Grammar, in: Proceedings of the Annual Meeting
of the Linguistic Society of America

Kempen, G. & Hoenkamp, E.
1982 Incremental Sentence Generation: Implications for the
Structure of a Syntactic Processor, Coling, Prague

Mann, W.
1983 An Overview of the Penman Text Generation System
ISI / RR-83-114

McDonald, D.
1983 Natural Language Generation as a Computational Problem:
An Introduction, in: Brady & Berwick (eds.) Computational
Models of Discourse

McKeown, K.
1982 Generating Natural Language Text in Response to Questions
about Data-Base Structure, Ph.D Dissertation, University
of Pennsylvania, TR MS-CIS-82-5

Robinson, J.
1975 Performance Grammar, in: Reddy, R. (Ed.) Speech Recognition
Academic Press, New York

Sowa, J.
1983 Generating Language from Conceptual Graphs
Comp. & Maths with Appls., Vol. 9, n° 1.
