Models for the Semantic Classification of Noun Phrases
Dan Moldovan, Adriana Badulescu,
Marta Tatu, and Daniel Antohe
Computer Science Department
University of Texas at Dallas
Dallas, Texas
moldovan@utdallas.edu
Roxana Girju
Department of Computer Science
Baylor University
Waco, Texas
girju@cs.baylor.edu
Abstract
This paper presents an approach for detecting
semantic relations in noun phrases. A learning
algorithm, called semantic scattering, is used
to automatically label complex nominals, gen-
itives and adjectival noun phrases with the cor-
responding semantic relation.
1 Problem description
This paper is about the automatic labeling of semantic
relations in noun phrases (NPs).
The semantic relations are the underlying relations be-
tween two concepts expressed by words or phrases. We
distinguish here between semantic relations and semantic
roles. Semantic roles are always between verbs (or nouns
derived from verbs) and other constituents (run quickly,
went to the store, computer maker), whereas semantic
relations can occur between any constituents, for exam-
ple in complex nominals (malaria mosquito (CAUSE)),
genitives (girl’s mouth (PART-WHOLE)), prepositional
phrases attached to nouns (man at the store (LOCATIVE)),
or discourse level (The bus was late. As a result, I missed
my appointment (CAUSE)). Thus, in a sense, semantic
relations are more general than semantic roles and many
semantic role types will appear on our list of semantic
relations.
The following NP level constructions are consid-
ered here (cf. the classifications provided by (Quirk
et al.1985) and (Semmelmeyer and Bolander 1992)):
(1) Compound Nominals consisting of two consecutive
nouns (eg night club - a TEMPORAL relation - indicat-
ing that club functions at night), (2) Adjective Noun con-
structions where the adjectival modifier is derived from a
noun (eg musical clock - a MAKE/PRODUCE relation), (3)
Genitives (eg the door of the car - a PART-WHOLE rela-
tion), and (4) Adjective phrases (cf. (Semmelmeyer and
Bolander 1992)) in which the modifier noun is expressed
by a prepositional phrase which functions as an adjective
(eg toy in the box - a LOCATION relation).
Example: “Saturday’s snowfall topped a one-day record
in Hartford, Connecticut, with the total of 12.5 inches,
the weather service said. The storm claimed its fatal-
ity Thursday, when a car which was driven by a college
student skidded on an interstate overpass in the moun-
tains of Virginia and hit a concrete barrier, police said”.
(www.cnn.com - “Record-setting Northeast snowstorm
winding down”, Sunday, December 7, 2003).
There are several semantic relations at the noun phrase
level: (1) Saturday’s snowfall is a genitive encoding a
TEMPORAL relation, (2) one-day record is a TOPIC noun
compound indicating that record is about one-day snow-
ing - an ellipsis here, (3) record in Hartford is an adjective
phrase in a LOCATION relation, (4) total of 12.5 inches
is an of-genitive that expresses MEASURE, (5) weather
service is a noun compound in a TOPIC relation, (6) car
which was driven by a college student encodes a THEME
semantic role in an adjectival clause, (7) college student is
a compound nominal in a PART-WHOLE/MEMBER-OF re-
lation, (8) interstate overpass is a LOCATION noun com-
pound, (9) mountains of Virginia is an of-genitive show-
ing a PART-WHOLE/PLACE-AREA and LOCATION rela-
tion, (10) concrete barrier is a noun compound encoding
PART-WHOLE/STUFF-OF.
1.1 List of Semantic Relations
After many iterations over a period of time we identified a
set of semantic relations that cover a large majority of text
semantics. Table 1 lists these relations, their definitions,
examples, and some references. Most of the time, the
semantic relations are encoded by lexico-syntactic pat-
terns that are highly ambiguous. One pattern can express
a number of semantic relations, its disambiguation be-
ing provided by the context or world knowledge. Often
semantic relations are not disjoint or mutually exclusive,
two or more appearing in the same lexical construct. This
is called semantic blend (Quirk et al.1985). For example,
the expression “Texas city” contains both a LOCATION as
well as a PART-WHOLE relation.
Other researchers have identified other sets of seman-
tic relations (Levi 1979), (Vanderwende 1994), (Sowa
1994), (Baker, Fillmore, and Lowe 1998), (Rosario and
Hearst 2001), (Kingsbury, et al. 2002), (Blaheta and
Charniak 2000), (Gildea and Jurafsky 2002), (Gildea
and Palmer 2002). Our list contains the most frequently
used semantic relations we have observed on a large cor-
pus.
Besides the work on semantic roles, considerable in-
terest has been shown in the automatic interpretation of
complex nominals, and especially of compound nomi-
nals. The focus here is to determine the semantic re-
lations that hold between different concepts within the
same phrase, and to analyze the meaning of these com-
pounds. Several approaches have been proposed for em-
pirical noun-compound interpretation, such as syntactic
analysis based on statistical techniques (Lauer and Dras
1994), (Pustejovsky et al. 1993). Another popular ap-
proach focuses on the interpretation of the underlying se-
mantics. Many researchers that followed this approach
relied mostly on hand-coded rules (Finin 1980), (Van-
derwende 1994). More recently, (Rosario and Hearst
2001), (Rosario, Hearst, and Fillmore 2002), (Lapata
2002) have proposed automatic methods that analyze and
detect noun compounds relations from text. (Rosario and
Hearst 2001) focused on the medical domain making use
of a lexical ontology and standard machine learning tech-
niques.
2 Approach
2.1 Basic Approach
We approach the problem top-down, namely identify
and study first the characteristics or feature vectors of
each noun phrase linguistic pattern, then develop mod-
els for their semantic classification. This is in contrast to
our prior approach ( (Girju, Badulescu, and Moldovan
2003a)) when we studied one relation at a time, and
learned constraints to identify only that relation. We
study the distribution of the semantic relations across dif-
ferent NP patterns and analyze the similarities and dif-
ferences among resulting semantic spaces. We define a
semantic space as the set of semantic relations an NP con-
struction can encode. We aim at uncovering the general
aspects that govern the NP semantics, and thus delineate
the semantic space within clusters of semantic relations.
This process has the advantage of reducing the annotation
effort, a time consuming activity. Instead of manually an-
notating a corpus for each semantic relation, we do it only
for each syntactic pattern and get a clear view of its se-
mantic space. This syntactico-semantic approach allows
us to explore various NP semantic classification models
in a unified way.
This approach stemmed from our desire to answer
questions such as:
1. What influences the semantic interpretation of various
linguistic constructions?
2. Is there only one interpretation system/model that
works best for all types of expressions at all syntactic lev-
els? and
3. What parameters govern the models capable of seman-
tic interpretation of various syntactic constructions?
2.2 Semantic Relations at NP level
It is well understood and agreed in linguistics that con-
cepts can be represented in many ways using various con-
structions at different syntactic levels. This is in part why
we decided to take the syntactico-semantic approach that
analyzes semantic relations at different syntactic levels
of representation. In this paper we focus only on the be-
havior of semantic relations at NP level. A thorough un-
derstanding of the syntactic and semantic characteristics
of NPs provides valuable insights into defining the most
representative feature vectors that ultimately drive the
discriminating learning models.
Complex Nominals
Levi (Levi 1979) defines complex nominals (CNs) as ex-
pressions that have a head noun preceded by one or more
modifying nouns, or by adjectives derived from nouns
(usually called denominal adjectives). Most importantly
for us, each sequence of nouns, or possibly adjectives and
nouns, has a particular meaning as a whole carrying an
implicit semantic relation; for example, “spoon handle”
(PART-WHOLE) or “musical clock” (MAKE/PRODUCE).
CNs have been studied intensively in linguistics,
psycho-linguistics, philosophy, and computational lin-
guistics for a long time. The semantic interpretation
of CNs proves to be very difficult for a number of rea-
sons. (1) Sometimes the meaning changes with the
head (eg “musical clock” MAKE/PRODUCE, “musical cre-
ation” THEME), other times with the modifier (eg “GM
car” MAKE/PRODUCE, “family car” POSSESSION). (2)
CNs’ interpretation is knowledge intensive and can be id-
iosyncratic. For example, in order to interpret correctly
“GM car” we have to know that GM is a car-producing
company. (3) There can be many possible semantic re-
lations between a given pair of word constituents. For
example, “USA city” can be regarded as a LOCATION as
well as a PART-WHOLE relation. (4) Interpretation of CNs
can be highly context-dependent. For example, “apple
juice seat” can be defined as “seat with apple juice on the
table in front of it” (cf. (Downing 1977)).
Genitives
The semantic interpretation of genitive constructions
No. Semantic Definition / Example
Relation
1 POSSESSION an animate entity possesses (owns) another entity; (family estate; the girl has a new car.), (Vanderwende 1994)
2 KINSHIP an animated entity related by blood, marriage, adoption or strong affinity to another animated entity; (Mary’s daughter;
my sister); (Levi 1979)
3 PROPERTY/ characteristic or quality of an entity/event/state; (red rose; The thunderstorm was awful.); (Levi 1979)
ATTRIBUTE-HOLDER
4 AGENT the doer or instigator of the action denoted by the predicate;
(employee protest; parental approval; The king banished the general.); (Baker, Fillmore, and Lowe 1998)
5 TEMPORAL time associated with an event; (5-o’clock tea; winter training; the store opens at 9 am),
includes DURATION (Navigli and Velardi 2003),
6 DEPICTION- an event/action/entity depicting another event/action/entity; (A picture of my niece.),
DEPICTED
7 PART-WHOLE an entity/event/state is part of another entity/event/state (door knob; door of the car),
(MERONYMY) (Levi 1979), (Dolan et al. 1993),
8 HYPERNYMY an entity/event/state is a subclass of another; (daisy flower; Virginia state; large company, such as Microsoft)
(IS-A) (Levi 1979), (Dolan et al. 1993)
9 ENTAIL an event/state is a logical consequence of another; (snoring entails sleeping)
10 CAUSE an event/state makes another event/state to take place; (malaria mosquitoes; to die of hunger; The earthquake
generated a Tsunami), (Levi 1979)
11 MAKE/PRODUCE an animated entity creates or manufactures another entity; (honey bees; nuclear power plant; GM makes cars) (Levi 1979)
12 INSTRUMENT an entity used in an event/action as instrument; (pump drainage; the hammer broke the box) (Levi 1979)
13 LOCATION/SPACE spatial relation between two entities or between an event and an entity; includes DIRECTION; (field mouse;
street show; I left the keys in the car), (Levi 1979), (Dolan et al. 1993)
14 PURPOSE a state/action intended to result from a another state/event; (migraine drug; wine glass; rescue mission;
He was quiet in order not to disturb her.) (Navigli and Velardi 2003)
15 SOURCE/FROM place where an entity comes from; (olive oil; I got it from China) (Levi 1979)
16 TOPIC an object is a topic of another object; (weather report; construction plan; article about terrorism); (Rosario and Hearst 2001)
17 MANNER a way in which an event is performed or takes place; (hard-working immigrants; enjoy immensely; he died of
cancer); (Blaheta and Charniak 2000)
18 MEANS the means by which an event is performed or takes place; (bus service; I go to school by bus.) (Quirk et al.1985)
19 ACCOMPANIMENT one/more entities accompanying another entity involved in an event; (meeting with friends; She came with us) (Quirk et al.1985)
20 EXPERIENCER an animated entity experiencing a state/feeling; (Mary was in a state of panic.); (Sowa 1994)
21 RECIPIENT an animated entity for which an event is performed; (The eggs are for you) ; includes BENEFICIARY; (Sowa 1994)
22 FREQUENCY number of occurrences of an event; (bi-annual meeting; I take the bus every day); (Sowa 1994)
23 INFLUENCE an entity/event that affects other entity/event; (drug-affected families; The war has an impact on the economy.);
24 ASSOCIATED WITH an entity/event/state that is in an (undefined) relation with another entity/event/state; (Jazz-associated company;)
25 MEASURE an entity expressing quantity of another entity/event; (cup of sugar;
70-km distance; centennial rite; The jacket cost $60.)
26 SYNONYMY a word/concept that means the same or nearly the same as another word/concept;
(NAME) (Marry is called Minnie); (Sowa 1994)
27 ANTONYMY a word/concept that is the opposite of another word/concept; (empty is the opposite of full); (Sowa 1994)
28 PROBABILITY OF the quality/state of being probable; likelihood
EXISTENCE (There is little chance of rain tonight); (Sowa 1994)
29 POSSIBILITY the state/condition of being possible; (I might go to Opera tonight); (Sowa 1994)
30 CERTAINTY the state/condition of being certain or without doubt; (He definitely left the house this morning);
31 THEME an entity that is changed/involved by the action/event denoted by the predicate;
(music lover; John opened the door.); (Sowa 1994)
32 RESULT the inanimate result of the action/event denoted by the predicate; includes EFFECT and PRODUCT.
(combustion gases; I finished the task completely.); (Sowa 1994)
33 STIMULUS stimulus of the action or event denoted by the predicate (We saw [the painting].
I sensed [the eagerness] in him. I can see [that you are feeling great].) (Baker, Fillmore, and Lowe 1998)
34 EXTENT the change of status on a scale (by a percentage or by a value) of some entity;
(The price of oil increased [ten percent]. Oil’s price increased by [ten percent]. ); (Blaheta and Charniak 2000)
35 PREDICATE expresses the property associated with the subject or the object through the verb;
(He feels [sleepy]. They elected him [treasurer]. ) (Blaheta and Charniak 2000)
Table 1: A list of semantic relations at various syntactic levels (including NP level), their definitions, some examples,
and references.
is considered problematic by linguists because they
involve an implicit relation that seems to allow for
a large variety of relational interpretations; for ex-
ample: “John’s car”-POSSESSOR-POSSESSEE, “Mary’s
brother”-KINSHIP, “last year’s exhibition”-TEMPORAL,
“a picture of my nice”-DEPICTION-DEPICTED, and “the
desert’s oasis”-PART-WHOLE/PLACE-AREA. A charac-
teristic of these constructions is that they are very pro-
ductive, as the construction can be given various inter-
pretations depending on the context. One such example
is “Kate’s book” that can mean the book Kate owns, the
book Kate wrote, or the book Kate is very fond of.
Thus, the features that contribute to the semantic in-
terpretation of genitives are: the nouns’ semantic classes,
the type of genitives, discourse and pragmatic informa-
tion.
Adjective Phrases are prepositional phrases attached to
nouns acting as adjectives (cf. (Semmelmeyer and
Bolander 1992)). Prepositions play an important role
both syntactically and semantically. Semantically speak-
ing, prepositional constructions can encode various se-
mantic relations, their interpretations being provided
most of the time by the underlying context. For instance,
the preposition “with” can encode different semantic re-
lations: (1) It was the girl with blue eyes (MERONYMY),
(2) The baby with the red ribbon is cute (POSSESSION),
(3) The woman with triplets received a lot of attention
(KINSHIP).
The conclusion for us is that in addition to the nouns se-
mantic classes, the preposition and the context play im-
portant roles here.
In order to focus our research, we will concentrate for
now only on noun - noun or adjective - noun composi-
tional constructions at NP level, ie those whose mean-
ing can be derived from the meaning of the constituent
nouns (“door knob”, “cup of wine”). We don’t consider
metaphorical names (eg, “ladyfinger”), metonymies (eg,
“Vietnam veteran”), proper names (eg, “John Doe”), and
NPs with coordinate structures in which neither noun is
the head (eg, “player-coach”). However, we check if
the constructions are non-compositional (lexicalized) (the
meaning is a matter of convention; e.g., “soap opera”,
“sea lion”), but only for statistical purposes. Fortunately,
some of these can be identified with the help of lexicons.
2.3 Corpus Analysis at NP level
In order to provide a unified approach for the detection of
semantic relations at different NP levels, we analyzed the
syntactic and semantic behavior of these constructions on
a large open-domain corpora of examples. Our intention
is to answer questions like: (1) What are the semantic re-
lations encoded by the NP-level constructions?, (2) What
is their distribution on a large corpus?, (3) Is there a com-
mon subset of semantic relations that can be fully para-
phrased by all types of NP constructions?, (4) How many
NPs are lexicalized?
The data
We have assembled a corpus from two sources: Wall
Street Journal articles from TREC-9, and eXtended
WordNet glosses (XWN) (http://xwn.hlt.utdallas.edu).
We used XWN 2.0 since all its glosses are syntacti-
cally parsed and their words semantically disambiguated
which saved us considerable amount of time. Table 2
shows for each syntactic category the number of ran-
domly selected sentences from each corpus, the num-
ber of instances found in these sentences, and finally the
number of instances that our group managed to annotate
by hand. The annotation of each example consisted of
specifying its feature vector and the most appropriate se-
mantic relation from those listed in Table 1.
Inter-annotator Agreement
The annotators, four PhD students in Computational Se-
mantics worked in groups of two, each group focusing
on one half of the corpora to annotate. Noun - noun
(adjective - noun, respectively) sequences of words were
extracted using the Lauer heuristic (Lauer 1995) which
looks for consecutive pairs of nouns that are neither
preceded nor succeeded by a noun after each sentence
was syntactically parsed with Charniak parser (Charniak
2001) (for XWN we used the gold parse trees). More-
over, they were provided with the sentence in which the
pairs occurred along with their corresponding WordNet
senses. Whenever the annotators found an example en-
coding a semantic relation other than those provided or
they didn’t know what interpretation to give, they had
to tag it as “OTHERS”. Besides the type of relation, the
annotators were asked to provide information about the
order of the modifier and the head nouns in the syntac-
tic constructions if applicable. For instance, in “owner
of car”-POSSESSION the possessor owner is followed by
the possessee car, while in “car of John”-POSSESSION/R
the order is reversed. On average, 30% of the training
examples had the nouns in reverse order.
Most of the time, one instance was tagged with one
semantic relation, but there were also situations in which
an example could belong to more than one relation in the
same context. For example, the genitive “city of USA”
was tagged as a PART-WHOLE/PLACE-AREA relation and
as a LOCATION relation. Overall, there were 608 such
cases in the training corpora. Moreover, the annotators
were asked to indicate if the instance was lexicalized or
not. Also, the judges tagged the NP nouns in the training
corpus with their corresponding WordNet senses.
The annotators’ agreement was measured using the
Kappa statistics, one of the most frequently used mea-
sure of inter-annotator agreement for classification tasks:
a0a2a1a4a3a6a5a8a7a10a9a12a11a14a13a15a3a6a5a16a7a18a17a19a11
a20
a13a21a3a6a5a8a7a10a17a22a11
, where a23a25a24a27a26a29a28a31a30 is the proportion of
times the raters agree and a23a25a24a27a26a29a32a33a30 is the probability of
agreement by chance. The K coefficient is 1 if there is
a total agreement among the annotators, and 0 if there is
no agreement other than that expected to occur by chance.
Table 3 shows the semantic relations inter-annotator
agreement on both training and test corpora for each NP
construction. For each construction, the corpus was splint
into 80/20 training/testing ratio after agreement.
We computed the K coefficient only for those instances
tagged with one of the 35 semantic relations. For each
pattern, we also computed the number of pairs that were
tagged with OTHERS by both annotators, over the number
of examples classified in this category by at least one of
the judges, averaged by the number of patterns consid-
ered.
The K coefficient shows a fair to good level of agree-
ment for the training and testing data on the set of 35 re-
lations, taking into consideration the task difficulty. This
can be explained by the instructions the annotators re-
ceived prior to annotation and by their expertise in lexical
semantics. There were many heated discussions as well.
Wall Street Journal eXtended WordNet 2.0
CNs Genitives Adjective Complex Nominals
NN AdjN ’s of Phrases NN
No. of sentences 7067 5381 50291 27067 14582 51058
No. of instances 5557 500 2990 4185 3502 12412
No. of annotated instances 2315 383 1816 3404 1341 1651
Table 2: Corpus statistics.
Kappa Agreement ( 1 - 35 )
Complex Nominals Genitives Adjective OTHERS
NN Adj N ’s of Phrases
0.55 0.68 0.66 0.65 0.67 0.69
Table 3: The inter-annotator agreement on the semantic annotation of the NP constructions considered on both training
and test corpora. For the semantic blend examples, the agreement was done on one of the relations only.
2.4 Distribution of Semantic Relations
Even noun phrase constructions are very productive al-
lowing for a large number of possible interpretations, Ta-
ble 4 shows that a relatively small set of 35 semantic re-
lations covers a significant part of the semantic distribu-
tion of these constructions on a large open-domain cor-
pus. Moreover, the distribution of these relations is de-
pendent on the type of NP construction, each type en-
coding a particular subset. For example, in the case of
of-genitives, there were 21 relations found from the total
of 35 relations considered. The most frequently occur-
ring relations were PART-WHOLE, ATTRIBUTE-HOLDER,
POSSESSION, LOCATION, SOURCE, TOPIC, and THEME.
By comparing the subsets of semantic relations in each
column we can notice that these semantic spaces are not
identical, proving our initial intuition that the NP con-
structions cannot be alternative ways of packing the same
information. Table 4 also shows that there is a subset
of semantic relations that can be fully encoded by all
types of NP constructions. The statistics about the lex-
icalized examples are as follows: N-N (30.01%), Adj-N
(0%), s-genitive (0%), of-genitive (0%), adjective phrase
(1%). From the 30.01% lexicalized noun compounds ,
18% were proper names.
This simple analysis leads to the important conclusion
that the NP constructions must be treated separately as
their semantic content is different. This observation is
also partially consistent with other recent work in lin-
guistics and computational linguistics on the grammatical
variation of the English genitives, noun compounds, and
adjective phrases.
We can draw from here the following conclusions:
1. Not all semantic relations can be encoded by all NP
syntactic constructions.
2. There are semantic relations that have preferences over
particular syntactic constructions.
2.5 Models
2.5.1 Mathematical formulation
Given each NP syntactic construction considered, the
goal is to develop a procedure for the automatic label-
ing of the semantic relations they encode. The semantic
relation derives from the lexical, syntactic, semantic and
contextual features of each NP construction.
Semantic classification of syntactic patterns in general
can be formulated as a learning problem, and thus bene-
fit from the theoretical foundation and experience gained
with various learning paradigms. This is a multi-class
classification problem since the output can be one of the
semantic relations in the set. We cast this as a supervised
learning problem where input/ output pairs are available
as training data.
An important first step is to map the characteristics of
each NP construction (usually not numerical) into feature
vectors. Let’s define witha0a2a1 the feature vector of an in-
stancea3 and leta4 be the space of all instances; iea0a1a6a5 a4 .
The multi-class classification is performed by a func-
tion that maps the feature spacea4 into a semantic space
a7 ,
a8a10a9
a4a12a11
a7 , wherea7 is the set of semantic relations
from Table 1, ie a24a14a13 a5 a7 .
Let a15 be the training set of examples or instances
a15
a1
a26a16a0
a20
a24
a20a18a17
a0a20a19a16a24a21a19
a17a23a22a24a22a25a22a24a17
a0a27a26a24a28a26a30a30a29 a26a16a4a32a31
a7
a30
a26 where
a33 is the
number of examplesa0 each accompanied by its semantic
relation label a24 . The problem is to decide which semantic
relation a24 to assign to a new, unseen examplea0a26a35a34
a20 . In or-
der to classify a given set of examples (members ofa4 ),
one needs some kind of measure of the similarity (or the
difference) between any two given members ofa4 . Most
of the times it is difficult to explicitly define this func-
tion, sincea4 can contain features with numerical as well
as non-numerical values.
Note that the features, thus spacea4 , vary from an NP
pattern to another and the classification function will be
pattern dependent. The novelty of this learning problem
is the feature spacea4 and the nature of the discriminating
No. Semantic Frequencya7a1a0 a11 Examples
Relations CNs Genitives Adjective
NN AdjN ’s of Phrases
1 POSSESSION 2.91 9.44 14.55 3.89 1.47 “family estate”
2 KINSHIP 0 0 7.94 3.23 0.20 “woman with triplets”
3 ATTRIBUTE-HOLDER 12.38 7.34 8.96 10.77 4.09 “John’s coldness”
4 AGENT 4.21 10.49 9.75 0.98 3.46 “the investigation of the crew”
5 TEMPORAL 0.82 0 11.96 0.53 7.97 “last year’s exhibition”
6 DEPICTION-DEPICTED 0 0 1.49 2.86 0.20 “a picture of my nice”
7 PART-WHOLE 19.68 10.84 27.38 31.70 5.56 “the girl’s mouth”
8 IS-A (HYPERNYMY) 3.95 1.05 0 0.04 1.15 “city of Dallas”
9 ENTAIL 0 0 0 0 0
10 CAUSE 0.08 0 0.23 0.57 1.04 “malaria mosquitoes”
11 MAKE/PRODUCE 1.56 2.09 0.86 0.98 0.63 “shoe factory”
12 INSTRUMENT 0.65 0 0.07 0.16 0.94 “pump drainage”
13 LOCATION/SPACE 7.51 20.28 1.57 3.89 18.04 “university in Texas”
14 PURPOSE 6.69 3.84 0.07 0.20 5.45 “migraine drug”
15 SOURCE 1.69 11.53 2.51 5.85 2.94 “president of Bolivia”
16 TOPIC 4.04 4.54 0.15 4.95 9.13 “museum of art”
17 MANNER 0.40 0 0 0 0.20 “performance with style”
18 MEANS 0.26 0 0 0 0.10 “bus service”
19 ACCOMPANIMENT 0 0 0 0 0.42 “meeting with friends”
20 EXPERIENCER 0.08 2.09 0.07 0.16 2.30 “victim of lung disease”
21 RECIPIENT 1.04 0 3.54 2.66 2.51 “Josephine’s reward”
22 FREQUENCY 0.04 7.00 0 0 0.52 “bi-annual meeting”
23 INFLUENCE 0.13 0.35 0 0 0.73 “drug-affected families”
24 ASSOCIATED WITH 1.00 0.35 0.15 0.16 2.51 “John’s lawyer”
25 MEASURE 1.47 0.35 0.07 13.88 1.36 “hundred of dollars”
26 SYNONYMY 0 0 0 0 0
27 ANTONYMY 0 0 0 0 0
28 PROBABILITY 0 0 0 0 0
29 POSSIBILITY 0 0 0 0 0
30 CERTAINTY 0 0 0 0 0
31 THEME 6.51 1.75 3.30 6.26 9.75 “car salesman”
32 RESULT 0.48 0 0.07 0.41 1.36 “combustion gas”
33 STIMULUS 0 0 0 0 0
34 EXTENT 0 0 0 0 0
35 PREDICATE 0.04 0 0 0 0.10
OTHERS 23.19 6.64 5.19 5.77 15.73 “airmail stamp”
Total no. of examples 100a0 (2302) 100a0 (286) 100a0 (1271) 100a0 (2441) 100a0 (953)
Table 4: The distribution of the semantic relations on the annotated corpus after agreement. The number in parentheses
represent number of examples for a particular pattern.
function
a8
derived for each syntactic pattern.
2.5.2 Feature space
An essential aspect of our approach below is the
word sense disambiguation (WSD) of the content words
(nouns, verbs, adjectives and adverbs). Using a state-
of-the-art open-text WSD system, each word is mapped
into its corresponding WordNet 2.0 sense. When dis-
ambiguating each word, the WSD algorithm takes into
account the surrounding words, and this is one important
way through which context gets to play a role in the se-
mantic classification of NPs.
So far, we have identified and experimented with the
following NP features:
1. Semantic class of head noun specifies
the WordNet sense (synset) of the head noun and
implicitly points to all its hypernyms. It is extracted
automatically via a word sense disambiguation module.
The NP semantics is influenced heavily by the meaning
of the noun constituents. Example: “car manufacturer”
is a kind of manufacturer that MAKES/PRODUCES cars.
2. Semantic class of modifier noun
specifies the WordNet synset of the modifier noun. In
case the modifier is a denominal adjective, we take the
synset of the noun from which the adjective is derived.
Example: “musical clock” - MAKE/PRODUCE, and
“electric clock”- INSTRUMENT.
2.5.3 Learning Models
Several learning models can be used to provide the dis-
criminating function
a8
. So far we have experimented
with three models: (1) semantic scattering, (2) decision
trees, and (3) naive Bayes. The first is described below,
the other two are fairly well known from the machine
learning literature.
Semantic Scattering. This is a new model developed
by us particularly useful for the classification of com-
pound nominals a2 a20 a2 a19 without nominalization. The se-
mantic relation in this case derives from the semantics of
the two noun concepts participating in these constructions
as well as the surrounding context.
Model Formulation. Let us define witha7a4a3a6a5 a1a8a7
a8
a5
a1a10a9
anda7a11a3a6a12 a1a13a7
a8
a12
a13a11a9
the sets of semantic class features (ie,
WordNet synsets) of the NP modifiers and, respectively
NP heads (ie features 2 and 1). The compound nominal
semantics is distinctly specified by the feature pair
a8
a5
a1
a8
a12
a13
,
written shortly as
a8
a1a13 . Given feature pair
a8
a1a13 , the proba-
bility of a semantic relation r is a23 a26 a24a1a0
a8
a1a13 a30
a1a3a2 a7a10a5a5a4 a6a8a7a10a9 a11
a2 a7a11a6a8a7a12a9 a11
, de-
fined as the ratio between the number of occurrences of a
relation r in the presence of feature pair
a8
a1a13 over the num-
ber of occurrences of feature pair
a8
a1a13 in the corpus. The
most probable relation a13a24 is a13a24 a1a15a14 a24a17a16a19a18 a14a21a20 a5a23a22a25a24 a23 a26 a24a1a0
a8
a1a13 a30
a1
a14
a24a17a16a21a18
a14a21a20
a5a26a22a25a24 a23 a26
a8
a1a13a19a0a24 a30 a23 a26 a24 a30
Since the number of possible noun synsets combina-
tions is large, it is difficult to measure the quantities
a27
a26 a24
a17
a8
a1a13 a30 and
a27
a26
a8
a1a13 a30 on a training corpus to calculate
a23 a26 a24a1a0
a8
a1a13 a30 . One way of approximating the feature vectora8
a1a13 is to perform a semantic generalization, by replacing
the synsets with their most general hypernyms, followed
by a series of specializations for the purpose of eliminat-
ing ambiguities in the training data. There are 9 noun hi-
erarchies, thus only 81 possible combinations at the most
general level. Table 5 shows a row of the probability ma-
trix a23 a26 a24a1a0
a8
a1a13 a30 for
a8
a1a13
a1a15a28
a27a30a29
a3
a29a32a31a34a33
a28
a27a30a29
a3
a29a32a31 . Each entry, for
which there is more than one relation, is scattered into
other subclasses through an iterative process till there is
only one semantic relation per line. This can be achieved
by specializing the feature pair’s semantic classes with
their immediate WordNet hyponyms. The iterative pro-
cess stops when new training data does not bring any im-
provements (see Table 6).
2.5.4 Overview of the Preliminary Results
The f-measure results obtained so far are summarized
in Table 7. Overall, these results are very encouraging
given the complexity of the problem.
2.5.5 Error Analysis
An important way of improving the performance of a
system is to do a detailed error analysis of the results.
We have analyzed the sources of errors in each case and
found out that most of them are due to (in decreasing or-
der of importance): (1) errors in automatic sense disam-
biguation, (2) missing combinations of features that occur
in testing but not in the training data, (3) levels of special-
ization are too high, (4) errors caused by metonymy, (6)
errors in the modifier-head order, and others. These er-
rors could be substantially decreased with more research
effort.
A further analysis of the data led us to consider a differ-
ent criterion of classification that splits the examples into
nominalizations and non-nominalizations. The reason is
that nominalization noun phrases seem to call for a differ-
ent set of learning features than the non-nominalization
noun phrases, taking advantage of the underlying verb-
argument structure. Details about this approach are pro-
vided in (Girju et al. 2004)).
3 Applications
Semantic relations occur with high frequency in open
text, and thus, their discovery is paramount for many ap-
plications. One important application is Question An-
swering. A powerful method of answering more difficult
questions is to associate to each question the semantic re-
lation that reflects the meaning of that question and then
search for that semantic relation over the candidates of
semantically tagged paragraphs. Here is an example.
Q. Where have nuclear incidents occurred? From the
question stem word where, we know the question asks
for a LOCATION which is found in the complex nomi-
nal “Three Mile Island”-LOCATION of the sentence “The
Three Mile Island nuclear incident caused a DOE policy
crisis”, leading to the correct answer “Three Mile Island”.
Q. What did the factory in Howell Michigan make?
The verb make tells us to look for a MAKE/PRODUCE
relation which is found in the complex nominal “car
factory”-MAKE/PRODUCE of the text: “The car factory in
Howell Michigan closed on Dec 22, 1991” which leads to
answer car.
Another important application is building semantically
rich ontologies. Last but not least, the discovery of
text semantic relations can improve syntactic parsing and
even WSD which in turn affects directly the accuracy of
other NLP modules and applications. We consider these
applications for future work.

References
C. Baker, C. Fillmore, and J. Lowe. 1998. The Berkeley
FrameNet Project. In Proceedings of COLLING/ACL,
Canada.
D. Blaheta and E. Charniak. 2000. Assigning function
tags to parsed text. In Proceedings of the 1st Annual
Meeting of the North American Chapter of the Associa-
tion for Computational Linguistics (NAACL), Seattle,
WA.
E. Charniak. 2001. Immediate-head parsing for language
models. In Proceedings of ACL, Toulouse, France.
W. Dolan, L. Vanderwende, and S. Richardson. 1993.
Automatically deriving structured KBs from on-line
dictionaries. In Proceedings of the Pacific Association
for Computational Linguistics Conference.
P. Downing. 1977. On the creation and use of English
compound nouns. Language, 53(4), 810-842.
T. Finin. 1980. The Semantic Interpretation of Com-
pound Nominals. Ph.D dissertation, University of Illi-
nois, Urbana, Illinois.
D. Gildea and D. Jurafsky. 2002. Automatic Labeling of
Semantic Roles. In Computational Linguistics, 28(3).
D. Gildea and M. Palmer. 2002. The Necessity of Parsing
for Predicate Argument Recognition. In Proceedings
of the 40th Meeting of the Association for Computa-
tional Linguistics (ACL 2002).
R. Girju, A. Badulescu, and D. Moldovan. 2003. Learn-
ing Semantic Constraints for the Automatic Discovery
of Part-Whole Relations. In Proceedings of the Human
Language Technology Conference (HLT-03), Canada.
R. Girju, A.M. Giuglea, M. Olteanu, O. Fortu, and D.
Moldovan. 2004. Support Vector Machines Applied to
the Classification of Semantic Relations in Nominal-
ized Noun Phrases. In Proceedings of HLT/NAACL
2004 - Computational Lexical Semantics workshop,
Boston, MA.
P. Kingsbury, M. Palmer, and M. Marcus. 2002. Adding
Semantic Annotation to the Penn TreeBank. In Pro-
ceedings of the Human Language Technology Confer-
ence (HLT 2002), California.
M. Lapata. 2002. The Disambiguation of Nominalisa-
tions. In Computational Linguistics 28:3, 357-388.
M. Lauer and M. Dras. 1994. A probabilistic model of
compound nouns. In Proceedings of the 7th Australian
Joint Conference on AI.
M. Lauer. 1995. Designing Statistical Language Learn-
ers: Experiments on Compound Nouns. In PhD thesis,
Macquarie University, Sidney.
Judith Levi. 1979. The Syntax and Semantics of Com-
plex Nominals. New York: Academic Press.
R. Navigli and P. Velardi. 2003. Ontology Learning and
Its Application to Automated Terminology Transla-
tion. In IEEE Intelligent Systems.
J. Pustejovsky, S. Bergler, and P. Anick. 1993. Lexical
semantic techniques for corpus analysis. In Computa-
tional Linguistics, 19(2).
R. Quirk, S. Greenbaum, G. Leech, and J. Svartvik. 1985.
A comprehensive grammar of english language, Long-
man, Harlow.
B. Rosario and M. Hearst. 2001. Classifying the Se-
mantic Relations in Noun Compounds via a Domain-
Specific Lexical Hierarchy. In the Proceedings of the
2001 Conference on Empirical Methods in Natural
Language Processing, (EMNLP 2001), Pittsburgh, PA.
B. Rosario, M. Hearst, and C. Fillmore. 2002. The De-
scent of Hierarchy, and Selection in Relational Seman-
tics. In the Proceedings of the Association for Compu-
tational Linguistics (ACL-02), University of Pennsyl-
vania.
M. Semmelmeyer and D. Bolander. 1992. The New Web-
ster’s Grammar Guide. Lexicon Publications, Inc.
J. F. Sowa. 1994. Conceptual Structures: Information
Processing in Mind and Machine. Addison Wesley.
L. Vanderwende. 1994. Algorithm for automatic in-
terpretation of noun sequences. In Proceedings of
COLING-94, pg. 782-788.
