An Empirical Analysis of Constructing Non-restrictive NP
Modifiers to Express Semantic Relations 
Hua Cheng and Chris Mellish 
Division of Informatics, University of Edinburgh 
80 South Bridge, Edinburgh EH1 1HN, UK 
{huac, chrism}@dai.ed.ac.uk
Abstract 
It is not a rare phenomenon for human written text 
to use non-restrictive NP modifiers to express es- 
sential pieces of information or support the situa- 
tion presented in the main proposition containing 
the NP, for example, "Private Eye, which couldn't 
afford the libel payment, had been threatened with 
closure." (from Wall Street Journal) Yet no previ- 
ous research in NLG investigates this in detail. This 
paper describes corpus analysis and a psycholinguis- 
tic experiment regarding the acceptability of using 
non-restrictive NP modifiers to express semantic re- 
lations that might normally be signalled by 'because' 
and 'then'. The experiment tests several relevant 
factors and enables us to accept or reject a number 
of hypotheses. The results are incorporated into an 
NLG system based on a Genetic Algorithm. 
1 Introduction 
To produce natural text, an NLG system
must be able to choose among possible paraphrases 
one that satisfies the highest number of constraints 
in a certain context. Paraphrases can use various 
constructions, for example, using nominalisation in- 
stead of a clause for event representation. We are 
particularly interested in the use of non-restrictive 
(NR) modifiers within a referring expression to express
certain semantic relations 1 other than object-attribute
elaboration (in the sense defined in (Mann
and Thompson, 1987)), for instance, causal rela- 
tions, which are normally expressed by separate 
clauses connected by cue phrases (Knott, 1996) such 
as 'because'.
"A non-restrictive component gives additional in- 
formation to a head that has already been viewed 
as unique or as a member of a class that has been 
independently identified, and therefore is not essential
for the identification of the head" (Quirk et al.,
1985). This definition can be extended to account 
for modifiers of not only definite referring expres- 
sions, but also definite and indefinite NPs of var- 
ious types. In this paper, an NR modifier refers 
to any NP modifying component that is not essen- 
tial for identifying the object denoted by the head, 
including all modifiers of an NP that does not in- 
tend to identify (e.g. indefinite referring expressions 
and predicative phrases) (Kronfeld, 1990). Our dis- 
cussion focuses on definite referring expressions in- 
cluding proper-names because of the dominance of 
such examples in our corpus. However, we would 
expect no difficulty in applying our observation to 
other types of NPs. 
The semantic roles of NR modifiers, in particular 
NR clauses, are mentioned in many grammar and 
linguistics books. Quirk et al. (1985) point out that 
an NR clause in a referring expression is usually neu- 
tral in its semantic role (i.e. it provides descriptive 
information about its head), but sometimes it can 
contribute to the semantics of the main clause in 
a variety of ways. They summarise three types of 
semantic relations that can be expressed by an NR 
clause (examples are given in Figure 1): 
• Causal, where the situation in the main clause
is caused by that in the NR clause, e.g. (1a).
• Temporal, where the two clauses form a time
sequence, e.g. (1b).
• Circumstantial, where the NR clause sets a temporal
or spatial framework for interpreting the
main clause, e.g. (1c).
Halliday (1985) mentions that a subordinate 
clause can elaborate a part of its primary clause 
through restating, clarifying, refining or adding a de- 
scriptive attribute or comment (see (2) of Figure 1). 
Halliday's notion of elaboration is much more gen- 
eral than that in other coherence theories like RST 
(Mann and Thompson, 1987), and the relation expressed
in (2) would not be treated as elaboration
in most NLG systems. 
Similar phenomena were observed from the MUSE 
corpus 2, a corpus of museum exhibit labels, which 
1 We are concerned with semantic (informational) relations
in this paper. Argumentative (intentional) relations are beyond
the scope of this paper.
2This corpus is collected and annotated for the GNOME 
project (Poesio, 2000), which aims at developing general al- 
gorithms for generating nominal expressions. 
(1) a. He sent ahead the sergeant, who was the most experienced scout in the company.
    b. In 1960 he came to London, where he has lived ever since.
    c. The boy, who had his satchel trailing behind him, ran past.
(2) Inflation, which was necessary for the system, became also lethal.
(3) In spite of his French name, Martin Carlin was born in Germany and emigrated to Paris to become
    an ebeniste.
Figure 1: Examples for NR modifiers contributing to the semantics of the main clauses
describe museum objects on display. For example, 
in (3) of Figure 1, the modifier French is not for 
identifying the name, but for establishing a conces- 
sion relation between the main proposition and the 
subordinate phrase to increase the reader's positive 
regard for where Martin Carlin was born. 
For the convenience of discussion, we define some 
terminology to be used throughout the paper: 
An NR construction/sentence : a sentence that has 
a main clause and a subordinate NR modifier 
attached to one of its NPs (e.g. (4b) of Fig- 
ure 2). 
A hypotactic construction/sentence : a sentence 
that has a main clause and a dependent clause, 
connected by a cue phrase. This is a common 
way of expressing semantic relations such as 
causality (e.g. (4a) of Figure 2). In this syn- 
tactic category, we single out a subclass of sen- 
tences according to one possible semantic con- 
nection between the two clauses. It is defined 
below. 
An elaboration realisation : a type of hypotactic 
construction where one clause elaborates the se- 
mantics of the other. We take cue phrases "as 
for" or "what is more" to signal elaboration re- 
lations 3 . 
Previous research in NLG mainly focuses on us- 
ing NR constructions to realise elaboration relations 
but not other semantic relations (e.g. (Scott and 
de Souza, 1990) and (Hovy, 1993)). The NR modifier
usually adds a descriptive attribute to the object
denoted by the head. 
The linguistic research suggests for an NLG system
the possibility to express certain semantic relations
through NR constructions, which is important
in two aspects. Firstly, an NR construction gives
a more concise alternative realisation for a relation,
where the relation is expressed implicitly rather than
explicitly and usually more subtly. It does not need
cue phrases in most cases, and therefore could avoid
using cues too heavily. This could be a better
realisation under certain circumstances. Secondly, an
NR construction enables a wider range of relations
(especially those that are preferred to be expressed
implicitly) to be selected for text structuring because
the corresponding syntactic option is available.
To understand how to enable an NLG system to
generate such modifiers, we are faced with two
questions, which are not answered by linguistic research:
1. Can this type of modifier be identified by human
subjects, i.e. can humans tell the difference
between different NP modifier uses?
2. Under what circumstances can an NR construction
be used in substitution of a hypotactic
construction without changing the meaning
dramatically and how close are the meanings
conveyed by the two representations?
An NLG system must come up with some solutions,
simple or complex, to these two questions in
order to choose among paraphrases. In this paper,
we use cue phrases as a signal of semantic relations
rather than try to identify the relations directly.
We describe systematically controlled experiments
aimed at finding out the factors related to the
generation of this type of modifier in referring
expressions. The result is intended to be reliable enough
to be used by NLG systems in generating descriptive
text.
3 We acknowledge that these cue phrases are controversial
in their semantic interpretations, but not using cue phrases
would be even more ambiguous. Besides, our experiment does
not heavily depend on these cue phrases.
2 Corpus annotation
To answer the first question, we annotated the
MUSE corpus, from which we have observed three
types of modifier uses in an NP:
Firstly, providing properties to uniquely identify
the objects or concepts denoted by the NP. Without
these modifiers, the NP can denote more than
one object/concept or sets of objects/concepts and
is ambiguous in its interpretation, e.g. those in (6a).
Such modifiers usually appear in phrases headed by
the definite article 'the', which according to Loebner
(1987) has the same meaning in all its uses, including
in generic references and predicatives. Modifiers
(4) a. Private Eye had been threatened with closure because it couldn't afford the libel payment.
    b. Private Eye, which couldn't afford the libel payment, had been threatened with closure.
(5) a. But P&G contends the new Cheer is a unique formula that also offers an ingredient that prevents
       colors from fading. And retailers are expected to embrace the product, because it will take up less
       shelf space.
    b. And retailers are expected to embrace the product, which will take up less shelf space.
Figure 2: Examples for inferrability
in other types of generic references, e.g. indefinites,
also belong here.
This type subsumes the modifiers normally con- 
sidered by the referring expression generation mod- 
ule of an NLG system for uniquely identifying the 
referents (e.g. (Dale, 1992)). 
Secondly, having no effect in constraining a unique 
or unambiguous concept out of the NP which is ei- 
ther already unique or not required to have a unique 
interpretation, but being important to the situation 
presented in the main proposition containing the NP. 
This type includes the modifiers described in the 
previous section and many modifiers in indefinite 
predicatives, e.g. that in (6b). 
Thirdly, providing additional details about the 
referents of the NP, which functions the same way 
as the NP without these modifiers, e.g. those in 
(6c). The effect of such modifiers is usually local 
to the heads they describe rather than to the main 
propositions as a whole, which is the main difference 
between this and the second type of modifier. 
This type subsumes the modifiers normally gen- 
erated by an aggregation module, in particular one 
using embedding (e.g. (Shaw and McKeown, 1997), 
(Cheng, 1998)). 
(6) a. the decoration on this cabinet; the best 
looking food I ever saw 
b. This is a mighty empty country. 
c. the wide gilt bronze straps on the cof- 
fer fronts and sides; He lived in a five- 
room apartment in the Faubourg Saint- 
Antoine. 
To find out whether the above distinctions make 
sense to human subjects, we designed an annotation 
scheme for modifiers in NPs, describing which ele- 
ments of an NP should be marked as a modifier and 
how to mark the features for a modifier. Apart from 
other features, each modifier should be annotated
with a pragmatic function feature (PRAGM), which
specifies why a modifier is used in an NP. The possible
values for this feature are unique, int and attr,
corresponding to the three types of modifier uses
described above (we will use the value names to refer
to the different types of modifier in the rest of this
paper). XML was used as the markup.
We had two trained annotators mark the NP
modifiers in the MUSE corpus according to their
understanding of the scheme. The agreement between
them on the PRAGM feature by means of the
Kappa statistic (Carletta, 1996) is .734, which means
that the distinctions we are trying to make can be 
identified by human subjects to some extent. The 
main ambiguity exists between int and attr modi- 
fiers. There seems to be a gradual difference between 
them and where to draw the line is a bit arbitrary. 
In the MUSE corpus annotated so far, 19% of 1078 
modifiers in all types of NPs are identified as int. So
this is not a trivial phenomenon. 
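As a rough illustration of how an agreement figure of this kind can be computed, the sketch below implements Cohen's two-annotator form of kappa. This is an assumption: the text does not say which kappa variant was used, and the label sequences are invented for illustration rather than taken from the MUSE annotation.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: proportion of items given the same label.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement: chance overlap given each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical PRAGM annotations (values: unique, int, attr).
annotator_1 = ["unique", "int", "attr", "attr", "int", "unique", "attr", "int"]
annotator_2 = ["unique", "int", "attr", "int", "int", "unique", "attr", "attr"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

In this invented example the two disagreements are both int/attr confusions, the kind of ambiguity reported for the real annotation.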
3 An experiment 
We reduced the size of the problem of when to use 
an NR construction by focusing on two relations: a 
causal relation signalled by 'because' and a temporal 
relation signalled by 'then'. The reason for choosing 
these relations is that the possibilities of expressing 
them through NR constructions have already been 
shown by linguists. The two cue phrases are typical 
for the corresponding relations and can often substi- 
tute other cue phrases for the same relations. In the 
rest of this paper, we will still use the term causal 
or temporal relation, but what we actually mean is 
the specific relation signalled by 'because' or 'then'. 
3.1 Independent variables and hypotheses 
From the generation point of view, our question is: 
given two facts and the semantic relation between 
them, what extra input do we need for making real- 
isation decisions? 
We collected examples of 'because' sentences from
the MUSE corpus and Wall Street Journal source
data, and transformed them into NR sentences by hand.
Comparing the two constructions, we found some
interesting variation. For example, comparing the
sentences in Figure 2, we found intuitively that the
meanings of (4a) and (4b) are much closer than those
of (5a) and (5b). In other words, (4b) can be used
in substitution of (4a), whereas (5b) cannot so easily
4 In (Carletta, 1996), a value of K between .8 and 1 indicates
good agreement; a value between .6 and .8 indicates
some agreement.
Independent Variables   Levels
Relation                causal              temporal
Inferrability           strong              weak
Position                initial             final
Order                   hypotactic vs. NR   NR vs. hypotactic
Subordination           nuc subordinate     sat subordinate
Cued/NoCue              use cue             not use cue
Table 1: Independent variables and their values
substitute (5a). A similar pattern can be found in a
number of other collected sentences.
We claim that it is the degree of inferrability of the
relation between the semantics expressed through 
the two clauses that makes the difference. We define 
the inferrability of a causal/temporal relation as: 
Given two separate facts, the likelihood
of human subjects inferring from their
world knowledge that a causal/temporal
connection between the facts might plausibly
exist.
In examples (4) and (5), the fact that Private Eye
cannot afford the libel payment is very likely to
directly cause the threat of closure, whereas a product
occupying less space is not usually a cause of
it being accepted by retailers according to common 
sense. Therefore, the two realisations in (4) can be 
used in substitution of one another whereas those in 
(5) cannot. 
Inferrability is dynamic and user dependent.
Given two facts, people with different background 
knowledge can infer the relation between them with 
different ease. If a relation is easily recognisable 
according to general world knowledge, we say that 
the inferrability of the relation is globally strong, 
in which case a hypotactic and an NR construction 
can express the relation almost equally well (if not 
considering rhetorical effect). Context can also con- 
tribute to the inferrability of a relation. A relation 
not easily recognisable from world knowledge may 
be identified by a reader with ease as the discourse 
proceeds. In this case, we say that the inferrabil- 
ity of the relation is locally strong, where the two 
constructions can express the relation equally well 
only in a certain context. In this paper, we mainly 
consider the global aspect of a relation and we will 
describe how we decided the value of inferrability in 
the next section. 
In Table 1, we summarise the factors (independent
variables) that might play a role in the closeness
judgement between the semantics of a hypotactic
construction and an NR construction. The levels
are possible values of these factors. Besides Relation
and Inferrability, Position gives the location of
the NP that contains the NR modifier, which can be the
first (initial) or the last (final) phrase in a sentence 5;
Order gives the order of presentation, i.e. a hypotactic
sentence to be compared with an NR sentence or vice
versa, which is used to balance the influence of cue
phrases on human judgement; Subordination specifies
whether the nucleus or the satellite is realised
as an NR clause 6; and Cued/NoCue means using a
cue phrase in the NR clause or not, which is only
applicable to the temporal relation, for example,
(7) The health-care services announced the spinoff 
plan last January, which was then revised 
in May. 
Based on our observation of human written sen- 
tences, we have the following hypotheses: 
Hypothesis 1 For both causal and temporal relations,
the inferrability of the relation between the semantics
of two facts contributes significantly to the
semantic similarities between a hypotactic construction
and an NR construction.
In other words, if the inferrability of the relation
between the two facts is strong, the semantic relation
can be expressed similarly through an NR construction;
otherwise, the similarity is significantly reduced.
Hypothesis 2 For the causal relation, the satellite
subordination bears significantly higher similarity in
meaning to the hypotactic construction than the nucleus
subordination does.
For example, (4b) would be preferred to "Private 
Eye, which had been threatened with closure, couldn't 
afford the libel payment." 
Hypothesis 3 For the temporal relation, both the 
position of subordination and the use of an appro- 
priate cue phrase in the NR clause make a signifi- 
cant difference to the semantic similarities between 
a hypotactic and an NR construction.
This hypothesis prefers Example (7) to the reali- 
sation that does not have 'then'. 
5 In our implementation, we restrict ourselves to sentences
with two NPs.
6 We assume that in the causal relation, the clause bearing
'because' is always the satellite. Since the temporal relation
is a multinuclear relation, this factor does not apply.
Dependent Variables
Similarity                     Naturalness
exactly the same               N/A
very similar                   natural
more similar than different    fairly natural
more different than similar    so-so
very different                 fairly unnatural
totally different              unnatural
Table 2: Dependent variables and their values
3.2 The design of the experiment 
To assess a semantic similarity, which is thought to 
be influenced by the independent variables, we use 
human subjects to judge the following two depen- 
dent variables: 
Naturalness : how fluent a sentence is on its own. 
Similarity : how similar the meanings of two sen- 
tences are without considering their natural- 
ness. 
The scales of the variables are selected such that 
all values on the scale have natural verbal descrip- 
tions that could be grasped easily by our subjects 
(see Table 2). Similar rating methods have been 
described in (Jordan et al., 1993) to compare the 
output of a machine translation system with that of 
expert humans. 
Since we want to measure different groups of 
similarity judgement based on different inferrability,
order or position levels, a between-groups design
(Hatch and Lazaraton, 1991) seems to be most
appropriate. The design we used is illustrated in 
Table 3, where all possible combinations of the in- 
dependent variables are listed. In the table, para- 
phrases gives the types of alternative sentences each 
original sentence has. They should be scored by hu- 
man subjects for their similarities to the original sen- 
tences and their naturalness. 
We used a method similar to random selection 
to create a stratified random sample. The sample 
should contain 12 hypotactic sentences and 12 NR 
sentences: two for each combination of the causal re- 
lation and one for each combination of the temporal 
relation. These numbers were used to obtain as big 
a sample as possible which could still be judged by 
human subjects in a relatively short period of time 
(say less than 30 minutes). 
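The sample sizes follow from the factor combinations. A minimal sketch, assuming the levels listed in Table 1 and the per-combination counts stated above (the names `design_cells` and `items_per_order` are ours, not part of the original experiment), enumerates the cells and checks the totals:

```python
from itertools import product

ORDERS = ("hypotactic vs. NR", "NR vs. hypotactic")
INFERRABILITY = ("strong", "weak")
POSITIONS = ("initial", "final")
# Items per factor combination, as stated in the text.
ITEMS_PER_CELL = {"causal": 2, "temporal": 1}

def design_cells():
    """Yield (relation, order, inferrability, position, n_items) for every cell."""
    for relation, n_items in ITEMS_PER_CELL.items():
        for order, inf, pos in product(ORDERS, INFERRABILITY, POSITIONS):
            yield relation, order, inf, pos, n_items

def items_per_order():
    """Total number of original sentences needed for each presentation order."""
    totals = {order: 0 for order in ORDERS}
    for _, order, _, _, n_items in design_cells():
        totals[order] += n_items
    return totals

totals = items_per_order()
```

Each order covers four causal cells with two items and four temporal cells with one item, which gives the 12 hypotactic and 12 NR sentences of the sample.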
Using cue phrases as the indicators of the semantic
relations between clauses, we collected all
the sentences containing 'because' or 'then' from the
Wall Street Journal source data, and went through
each of them to pick out those that actually signal
the desired relations and can potentially have NR
realisations, i.e. where there is a coreference relation
between the two NPs in the two clauses. Sentences
containing NR clauses signalled by ', which' or ',
who' were collected similarly. From these sentences,
we randomly selected one by category. If it realised 
an unused factor combination, it was kept in the 
sample. This process was repeated until we collected 
the right number of test items which instantiated all 
combinations of properties in Table 3. 
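The selection loop described above can be sketched as follows. This is a simplified reading, not the procedure actually run: candidates are assumed to be already labelled with their factor combination, and `fill_design`, its arguments and the seed are hypothetical names introduced for illustration.

```python
import random

def fill_design(candidates, quota, seed=0):
    """Draw candidates at random, keeping those whose cell still has room.

    candidates: list of (sentence, cell) pairs, where cell is any hashable
                description of a factor combination.
    quota:      dict mapping each cell to the number of items wanted.
    """
    rng = random.Random(seed)
    pool = list(candidates)
    rng.shuffle(pool)  # random draw order without replacement
    remaining = dict(quota)
    sample = []
    for sentence, cell in pool:
        if remaining.get(cell, 0) > 0:
            sample.append((sentence, cell))
            remaining[cell] -= 1
        if not any(remaining.values()):
            break  # every cell is filled
    return sample, remaining
```

The second return value shows which cells, if any, could not be filled from the available candidates.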
We asked two subjects to mark the 24 selected 
items with regard to their inferrability on a five-point
scale: 5 for very likely, 4 for quite likely, 3
for possibly, 2 for even less possibly and 1 for unknown.
We took values of 4 and 5 as strong and the
others as weak. The subjects and an author agreed
on 19 items, and the author's version was used for 
the experiment. 
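The mapping from ratings to the two inferrability levels, and the agreement count between two raters, can be written down directly. The function names below are illustrative assumptions; only the threshold (4 and 5 count as strong) comes from the text.

```python
def inferrability_level(rating):
    """Map a 1-5 likelihood rating to the two levels used in the experiment."""
    if rating not in (1, 2, 3, 4, 5):
        raise ValueError("rating must be an integer from 1 to 5")
    return "strong" if rating >= 4 else "weak"

def agreement_count(levels_a, levels_b):
    """Number of items on which two raters assign the same level."""
    return sum(a == b for a, b in zip(levels_a, levels_b))
```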
For the test items, we manually produced the cor- 
responding paraphrases, which were then put into a 
questionnaire for human assessment of the two de- 
pendent variables for each paraphrase. 
3.3 Results 
We had ten native English speakers evaluating the
similarity and naturalness on the sample. 
3.3.1 Similarity 
Since the similarity data is ordinal data and departs 
significantly from a theoretical normal distribution 
according to the One-Sample Kolmogorov-Smirnov Test,
we chose Mann Whitney U, which is a test for com- 
paring two groups on the basis of their ranks above 
and below the median. The result is summarised in 
Table 4, with statistically significant items in bold- 
face (taking the conventional .05 p level). The Z 
scores tell how many standard deviations above or 
below the mean an observation might be. Means 
gives the means of the similarity scores with respect 
to the values of the independent variables in Table 1. 
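For readers unfamiliar with the test, the following is a minimal pure-Python sketch of the Mann Whitney U statistic and its Z score under the normal approximation with tie correction. It is not the software used for Table 4, which the text does not name, and `mann_whitney_u` is our own helper; statistical packages provide equivalent routines.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def mann_whitney_u(group_a, group_b):
    """Return (U for group_a, z score) using the tie-corrected normal approximation."""
    n_a, n_b = len(group_a), len(group_b)
    combined = list(group_a) + list(group_b)
    ranks = average_ranks(combined)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    n = n_a + n_b
    # Tie correction: sum of (t^3 - t) over groups of tied values.
    tie_term = sum(t ** 3 - t for t in (combined.count(v) for v in set(combined)))
    variance = n_a * n_b / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u_a - n_a * n_b / 2) / math.sqrt(variance) if variance > 0 else 0.0
    return u_a, z
```

Because similarity scores take only six values, ties are frequent, which is why the sketch uses average ranks and the corrected variance.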
For the causal relation, there is a significant dif- 
ference between the means of similarities of the two 
groups of different inferrabilities (P<.0005). So we 
have high confidence to accept part of Hypothesis 1,
i.e. the strong inferrability of the causal relation between
the semantics of two facts makes the semantic
similarities between a hypotactic construction and
an NR construction significantly higher than the
weak case does. In the strong case, the mean of
similarity is 4.59, which is close to very similar.
We treated order as a factor to be balanced and 
did not expect it to have a significant effect, but 
it does (P=.008). An NR paraphrase shows much 
higher similarity to its corresponding hypotactic sentence
(with a mean of 4.46) than the other way
round (with a mean of 3.83), but the difference be- 
comes smaller for the strong inferrability case. This 
could be because the causal relations expressed in 
NR sentences generally sound weaker than those in 
hypotactic sentences and the cue phrase has a big 
influence on the perceptibility of a relation. 
Independent Variables                                     Paraphrases
Relation   Order               Inferrability  Position
causal     hypotactic vs.      strong         initial     nuc & sat subordination
           NR sentence                        final       NR sentence
                               weak           initial     nuc & sat subordination
                                              final       NR sentence
           NR sentence         strong         initial     causal &
           vs. hypotactic                     final       elaboration hypotactic
                               weak           initial
                                              final
temporal   hypotactic vs.      strong         initial     cued & not
           NR sentence                        final       cued NR sentence
                               weak           initial
                                              final
           NR sentence         strong         initial     temporal &
           vs. hypotactic                     final       elaboration hypotactic
                               weak           initial
                                              final
Table 3: A between-groups design
Relation      DependVar    Factors        Means      Z        2-tailed P
causal        Similarity   Inferrability  4.59/3.70  -4.1015  <.0005
(160 cases)                Order          4.46/3.83  -2.6400  .0083
                           Position       4.11/4.18  -.2136   .8308
temporal      Similarity   Inferrability  4.88/5.00  -.1022   .9086
(80 cases)    (cued)       Order          5.08/4.80  -1.1756  .2398
                           Position       4.80/5.08  -2.0649  .0389
Table 4: The output of Mann Whitney U on the similarity data
For the temporal relation, position is the only significant
factor (P=.0389). So part of Hypothesis 3 is
confirmed, that is, the final position subordination
makes an NR paraphrase significantly more similar
to the corresponding hypotactic construction than
the initial position does.
We do not have enough evidence to accept the
claim that the inferrability of the temporal relation
contributes significantly to the similarity judgement
(as in Hypothesis 1). However, when we calculated
the similarity mean for the alternative sentences using
cue phrases, strong or weak in inferrability, we
got 4.94 (very similar). Comparing this with that of
the strong causal case using the Mann Whitney U
test, we get a significance level of 0.0294. This means
that we have strong confidence to believe that the
similarity mean for the temporal relation if using a
cue phrase is significantly higher than that for the
strong causal relation. Therefore, the temporal relation
can always be realised by an NR construction
as long as an appropriate cue phrase is used in the
NR clause.
The assumption of normality is also not met by
the subset of the data related to Hypothesis 2 and 3
(i.e. the similarity scores for nucleus/satellite subordination
paraphrases and cued/nocue paraphrases).
We used the Wilcoxon Matched-Pairs Signed-Ranks 
Test because we were comparing pairs of para- 
phrases. The result is given in Table 5. We accept 
the hypothesis that the similarity means of nucleus 
and satellite subordination are significantly different 
in the initial position (Hypothesis 2). This confirms 
the linguistic observation that information of greater 
importance should be presented in a main position 
rather than a subordinate position. We can also ac- 
cept the hypothesis that for the temporal relation, 
using cue phrases in NR clauses can significantly im- 
prove the similarity score of the NR construction 
(Hypothesis 3). 
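The paired comparison can be illustrated with a short sketch of the signed-rank computation. The scores below are invented for illustration and `wilcoxon_signed_rank` is a hypothetical helper; it returns only the rank sums, not the Z value or significance level reported in Table 5.

```python
def wilcoxon_signed_rank(first, second):
    """Return (W_plus, W_minus, n_used) for paired scores; zero differences are dropped."""
    diffs = [a - b for a, b in zip(first, second) if a != b]
    abs_sorted = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(abs_sorted):
        j = i
        while (j + 1 < len(abs_sorted)
               and abs(diffs[abs_sorted[j + 1]]) == abs(diffs[abs_sorted[i]])):
            j += 1
        for k in range(i, j + 1):
            ranks[abs_sorted[k]] = (i + j) / 2 + 1  # mean rank for tied |differences|
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, len(diffs)

# Hypothetical similarity scores for the same items with and without a cue phrase.
cued = [5, 6, 5, 4, 6, 5]
not_cued = [4, 4, 5, 3, 3, 4]
w_plus, w_minus, n_used = wilcoxon_signed_rank(cued, not_cued)
```

A large imbalance between the two rank sums, as in this invented example, is what the test treats as evidence of a systematic difference between the paired paraphrases.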
3.3.2 Naturalness 
We used the Mann Whitney U test on naturalness
with regard to order, inferrability and position, and
found no significant connection. Figure 3 shows the 
distribution of naturalness assessment of the para- 
phrases for the causal and temporal relation respec- 
tively. The majority of the NR constructions are 
natural or fairly natural, which suggests that they 
could be good alternative realisations. 
[Table 5: columns are Relation (causal, temporal), Paired Variables, Means, Z value and 2-tail Sig; the only cell values that survive in the source are a Z value of -3.02 with a 2-tail Sig of .003]
Table 5: The output of the Wilcoxon Matched-Pairs Signed-Ranks Test
[Figure 3: legend entries are "because to NR clause" and "NR clause to because" (left), "then to NR clause" and "NR clause to then" (right)]
Figure 3: The naturalness of the causal paraphrases (left) and the temporal paraphrases (right)
3.3.3 Summary 
We briefly summarise the heuristics drawn from the 
experiment for expressing the causal and temporal 
relations with an NR construction. This is an ac- 
ceptable realisation in the following circumstances: 
• the causal relation holds between two facts and
the inferrability of the relation is strong, in
which case satellite subordination should be
used; or
• the temporal relation holds between two facts,
in which case a final position subordination and
an appropriate cue phrase, like 'then', should be
used in the NR clause.
We also found that an NR construction can ex- 
press the causal/temporal relation and the object- 
attribute elaboration relation at the same time, ir- 
respective of the inferrability of the relation. Gen- 
erally speaking, a semantic relation expressed by an 
NR construction sounds weaker than a hypotactic 
realisation with a cue phrase. Therefore, if a rela- 
tion is to be emphasised, NR constructions should 
not be used. 
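The heuristics above can be read as a small decision rule. The sketch below is one possible encoding under our own assumptions: the `NRPlan` record, the `emphasise` flag and the string labels are hypothetical and do not describe the interface of the actual system.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class NRPlan:
    subordinate: Optional[str]  # which part becomes the NR clause, if constrained
    position: Optional[str]     # preferred position of the host NP, if any
    cue: Optional[str]          # cue phrase to place inside the NR clause

def plan_nr_realisation(relation, inferrability, emphasise=False):
    """Return an NRPlan if the heuristics allow an NR construction, else None."""
    if emphasise:
        return None  # an emphasised relation keeps its hypotactic form
    if relation == "causal" and inferrability == "strong":
        return NRPlan(subordinate="satellite", position=None, cue=None)
    if relation == "temporal":
        # Inferrability was not a significant factor for the temporal relation.
        return NRPlan(subordinate=None, position="final", cue="then")
    return None
```

For instance, `plan_nr_realisation("causal", "weak")` returns `None`, which corresponds to keeping the 'because' realisation in cases such as (5a).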
4 Implementing the results in a 
GA-based text planner
int-modifiers have a mixed character, i.e. like attr- 
modifiers they are not essential for identifying the 
referents, but like unique-modifiers they are not op- 
tional. Because of their role in supporting the se- 
mantics of the main propositions, the selection of 
int-modifiers should be a part of the text planning 
process, where a text structure is constructed to ful- 
fill the overall goals for producing the text. How- 
ever, compared with unique-modifiers, int-modifiers 
are less essential for an NP and they can only be 
added if there are available syntactic slots. 
Since embedding deals with attr-modifiers at both 
a content selection and an abstract realisation level, 
it could coordinate the addition of int-modifiers. 
Therefore, the text planner could consult the embed- 
ding module as to whether a property can be realised 
as an NP modifier, under the constraints from the 
NP type and the unique-modifiers that are already 
there. In other words, the text planner chooses facts 
to satisfy certain goals and the embedding process 
decides if the facts can be realised as NP modifiers 
in an abstract sense. 
We need a generation architecture that allows a 
certain degree of interaction between text planning, 
referring expression generation and embedding. So 
we chose the Genetic Algorithm based text planner 
described in (Mellish et al., 1998). Their task is,
given a set of facts and relations between facts, to
produce a legal RST tree using all the facts and some
relations. The text planning is basically a two-step
process. Firstly sequences of facts are generated by
applying GA operators, and secondly the rhetorical 
structure trees built from these sequences are evalu- 
ated and the good sequences are kept for producing 
better offspring. 
We extended the text planner by adding a GA operator
called embedding mutation, which randomly
selects two items mentioning a common entity from 
a sequence and assumes an embedding on them. Em- 
beddings are evaluated together with the other prop- 
erties an RST tree has. In this way, embedding is 
performed during text planning. The ultimate score 
of a tree is the sum of positive and negative scores 
for all the good and bad properties it bears. Since 
good embeddings are scored higher, they are kept in 
the sequences for producing better offspring and are
very likely to be included in the final output. 
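A minimal sketch of such an operator is given below. The data representation is an assumption made for illustration: facts are dictionaries with an "id" and a set of "entities", an embedding is a (host, embedded) pair, and the earlier fact is arbitrarily taken as the host. None of this is taken from the planner's actual code.

```python
import random

def embedding_mutation(sequence, embeddings, rng=random):
    """Pick two facts that mention a common entity and record an embedding.

    sequence:   list of facts, each a dict with an "id" and a set of "entities".
    embeddings: set of (host_id, embedded_id) pairs already assumed.
    Returns a new set of embeddings; the inputs are left unchanged.
    """
    candidates = [
        (a["id"], b["id"])
        for i, a in enumerate(sequence)
        for b in sequence[i + 1:]
        if a["entities"] & b["entities"] and (a["id"], b["id"]) not in embeddings
    ]
    if not candidates:
        return set(embeddings)  # nothing to embed; mutation is a no-op
    return set(embeddings) | {rng.choice(candidates)}
```

Whether an assumed embedding survives is then left to the evaluation of the resulting tree, as described above.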
We incorporated the results from the experiment 
into the GA planner by using them as preferences 
for evaluating RST trees. We treated inferrability 
as an input to the system. If a good embedding can 
be formed from two facts connected by an RST re- 
lation (i.e. either of the two cases in Section 3.3.3 
is satisfied and the required syntactic slot is free), 
the embedding is scored higher than the hypotactic 
realisation. However, this emphasis on embedding 
might not be appropriate. In a real application en- 
vironment, other communicative intentions should 
be incorporated to balance the scoring for differ- 
ent realisations. And generally, inferrability has to 
be implemented based on limited domain-dependent 
knowledge and user configuration. 
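The preference scheme can be sketched as a weighted sum over tree properties. The weights and the property names ("good_embedding", "hypotactic", "topic_break") are hypothetical; the text gives no numeric scores, only that a good embedding should score higher than the hypotactic realisation.

```python
def score_tree(properties, weights):
    """Sum the weights of all properties a candidate tree exhibits."""
    return sum(weights.get(name, 0) * count for name, count in properties.items())

def realisation_property(relation, inferrability, slot_free):
    """Name the property earned by a relation between two facts sharing an entity."""
    good_causal = relation == "causal" and inferrability == "strong"
    good_temporal = relation == "temporal"
    if slot_free and (good_causal or good_temporal):
        return "good_embedding"
    return "hypotactic"

# Hypothetical weights: positive for good properties, negative for bad ones.
WEIGHTS = {"good_embedding": 3, "hypotactic": 2, "topic_break": -4}
example_score = score_tree({"good_embedding": 1, "hypotactic": 2, "topic_break": 1}, WEIGHTS)
```

Balancing other communicative intentions, as suggested above, would amount to adding further properties and adjusting these weights.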
5 Conclusion and future work 
This paper investigates the use of NR modifiers in 
referring expressions to express certain semantic re- 
lations. This is a commonly used strategy by human 
authors, which has not been explored by an NLG 
system before. Our experiment shows that when the 
conditions for inferrability etc. are satisfied, certain 
relations can be expressed through an NR construction
as well as a normally used hypotactic construction
with little difference in semantics. This gives
an NLG system a way of expressing these
semantic relations more concisely and subtly, which
could not be achieved by other means.
Our experiment is restricted in many ways. One 
possible extension is to use more cue phrases to cover 
a wider range of cases for each semantic relation. In 
reality, the application domain should decide which 
relations need to be tested. 

References 
Jean Carletta. 1996. Assessing agreement on classification
tasks: the kappa statistic. Computational
Linguistics, 22(2):249-254. 
Hua Cheng. 1998. Embedding new information into 
referring expressions. In Proceedings of COLING-ACL '98,
pages 1478-1480, Montreal, Canada.
Robert Dale. 1992. Generating Referring Expres- 
sions: Constructing Descriptions in a Domain of 
Objects and Processes. The MIT Press. 
M.A.K. Halliday. 1985. An Introduction to Functional
Grammar. Edward Arnold (Publishers)
Ltd., London, UK.
Evelyn Hatch and Anne Lazaraton. 1991. The Re- 
search Manual: Design and Statistics for Applied 
Linguistics. Newbury House Publishers. 
Eduard Hovy. 1993. Automated discourse genera- 
tion using discourse structure relations. Artificial 
Intelligence 63, Special Issue on Natural Language 
Processing, 1. 
Pamela Jordan, Bonnie Dorr, and John Benoit.
1993. A first-pass approach for evaluating machine
translation systems. Machine Translation,
8(1-2):49-58. 
Alistair Knott. 1996. A Data-Driven Methodology
for Motivating a Set of Coherence Relations.
Ph.D. thesis, Department of Artificial Intelligence, 
University of Edinburgh, Edinburgh. 
Amichai Kronfeld. 1990. Reference and Compu- 
tation. Studies in Natural Language Processing. 
Cambridge University Press. 
Sebastian Loebner. 1987. Definites. Journal of Se- 
mantics, 4:279-306. 
William Mann and Sandra Thompson. 1987. 
Rhetorical structure theory: A theory of text or- 
ganization. Technical Report ISI/RR-87-190, In- 
formation Sciences Institute, University of South- 
ern California. 
Chris Mellish, Alistair Knott, Jon Oberlander, 
and Mick O'Donnell. 1998. Experiments using 
stochastic search for text planning. In Proceed- 
ings of the 9th International Workshop on Natural 
Language Generation, Ontario, Canada. 
Massimo Poesio. 2000. Annotating a corpus to de- 
velop and evaluate discourse entity realization al- 
gorithms: Issues and preliminary results. In Pro- 
ceedings of LREC, Athens, May. 
Randolph Quirk, Sidney Greenbaum, Geoffrey 
Leech, and Jan Svartvik. 1985. A Grammar of 
Contemporary English. Longman Group Ltd. 
Donia Scott and Clarisse Sieckenius de Souza. 1990. 
Getting the message across in rst-based text gen- 
eration. In R. Dale, C. Mellish, and M. Zock, edi- 
tors, Current Research in Natural Language Gen- 
eration, pages 47-73. Academic Press. 
James Shaw and Kathleen McKeown. 1997. An ar- 
chitecture for aggregation in text generation. In 
Proceedings of the Fifteenth International Joint 
Conference on Artificial Intelligence, Poster Ses- 
sion, Japan. 
