Extracting Structural Paraphrases from Aligned Monolingual Corpora
Ali Ibrahim, Boris Katz, Jimmy Lin
MIT Artificial Intelligence Laboratory
200 Technology Square
Cambridge, MA 02139
{aibrahim,boris,jimmylin}@ai.mit.edu
Abstract
We present an approach for automatically
learning paraphrases from aligned mono-
lingual corpora. Our algorithm works
by generalizing the syntactic paths be-
tween corresponding anchors in aligned
sentence pairs. Compared to previous
work, structural paraphrases generated by
our algorithm tend to be much longer
on average, and are capable of captur-
ing long-distance dependencies. In ad-
dition to a standalone evaluation of our
paraphrases, we also describe a question
answering application currently under de-
velopment that could immensely bene-
fit from automatically-learned structural
paraphrases.
1 Introduction
The richness of human language allows people to
express the same idea in many different ways; they
may use different words to refer to the same entity or
employ different phrases to describe the same con-
cept. Acquisition of paraphrases, or alternative ways
to convey the same information, is critical to many
natural language applications. For example, an ef-
fective question answering system must be equipped
to handle these variations, because it should be able
to respond to differently phrased natural language
questions.
While there are many resources that help sys-
tems deal with single-word synonyms, e.g., Word-
Net, there are few resources for multiple-word or
domain-specific paraphrases. Because manually
collecting paraphrases is time-consuming and im-
practical for large-scale applications, attention has
recently focused on techniques for automatically ac-
quiring paraphrases.
We present an unsupervised method for acquir-
ing structural paraphrases, or fragments of syntactic
trees that are roughly semantically equivalent, from
aligned monolingual corpora. The structural
paraphrases produced by our algorithm are similar to
the S-rules advocated by Katz and Levin (1988) for
question answering, except that our paraphrases are
automatically generated. Because there is
disagreement regarding the exact definition of
paraphrases (Dras, 1999), we employ the operating
definition that structural paraphrases are roughly
interchangeable within the specific configuration of
syntactic structures that they specify.
Our approach is a synthesis of techniques devel-
oped by Barzilay and McKeown (2001) and Lin
and Pantel (2001), designed to overcome the limi-
tations of both. In addition to the evaluation of para-
phrases generated by our method, we also describe
a novel information retrieval system under develop-
ment that is designed to take advantage of structural
paraphrases.
2 Previous Work
There has been a rich body of research on automati-
cally deriving paraphrases, including equating mor-
phological and syntactic variants of technical terms
(Jacquemin et al., 1997), and identifying equivalent
adjective-noun phrases (Lapata, 2001). Unfortunately,
both are limited in the types of paraphrases that
they can extract. Other researchers have explored
distributional clustering of similar words (Pereira et
al., 1993; Lin, 1998), but it is unclear to what extent
such techniques produce paraphrases.[1]
Most relevant to this paper is the work of Barzi-
lay and McKeown and the work of Lin and Pan-
tel. Barzilay and McKeown (2001) extracted
both single- and multiple-word paraphrases from a
sentence-aligned corpus for use in multi-document
summarization. They constructed an aligned corpus
from multiple translations of foreign novels. From
this, they co-trained a classifier that decided whether
or not two phrases were paraphrases of each other
based on their surrounding context. Barzilay and
McKeown collected 9483 paraphrases with an av-
erage precision of 85.5%. However, 70.8% of the
paraphrases were single words. In addition, the
paraphrases were required to be contiguous.
Lin and Pantel (2001) used a general text corpus
to extract what they called inference rules, which
we can take to be paraphrases. In their algorithm,
rules are represented as dependency tree paths be-
tween two words. The words at the ends of a path
are considered to be features of that path. For each
path, they recorded the different features (words)
that were associated with the path and their respec-
tive frequencies. Lin and Pantel calculated the sim-
ilarity of two paths by looking at the similarity of
their features. This method allowed them to extract
inference rules of moderate length from general cor-
pora. However, the technique is computationally
expensive, and furthermore can give misleading re-
sults, i.e., paths having the opposite meaning often
share similar features.
[1] For example, “dog” and “cat” are recognized to be similar,
but they are obviously not paraphrases of one another.
3 Approach
Our approach, like Barzilay and McKeown’s, is built
on the application of sentence-alignment techniques
used in machine translation to generate paraphrases.
The insight is simple: if we have pairs of sentences
with the same semantic content, then the difference
in lexical content can be attributed to variations in
the surface form. By generalizing these differences
we can automatically derive paraphrases. Barzilay
and McKeown perform this learning process by only
considering the local context of words and their
frequencies; as a result, paraphrases must be
contiguous, and in the majority of cases, are only one
word long. We believe that disregarding the rich syntactic
structure of language is an oversimplification, and
that structural paraphrases offer several distinct ad-
vantages over lexical paraphrases. Long distance re-
lations can be captured by syntactic trees, so that
words in the paraphrases do not need to be contigu-
ous. Use of syntactic trees also buffers against mor-
phological variants (e.g., different inflections) and
some syntactic variants (e.g., active vs. passive).
Finally, because paraphrases are context-dependent,
we believe that syntactic structures can encapsulate
a richer context than lexical phrases.
Based on aligned monolingual corpora, our tech-
nique for extracting paraphrases builds on Lin and
Pantel’s insight of using dependency paths (derived
from parsing) as the fundamental unit of learning
and using parts of those paths as features. Based
on the hypothesis that paths between identical words
in aligned sentences are semantically equivalent, we
can extract paraphrases by scoring the path fre-
quency and context. Our approach addresses the
limitations of both Barzilay and McKeown’s and
Lin and Pantel’s work: using syntactic structures al-
lows us to generate structural paraphrases, and using
aligned corpora renders the process more computa-
tionally tractable. The following sections describe
our approach in greater detail.
3.1 Corpus Alignment
Multiple English translations of foreign novels, e.g.,
Twenty Thousand Leagues Under the Sea by Jules
Verne, were used for extraction of paraphrases.
Although translations by different authors differ
slightly in their literary interpretation of the origi-
nal text, it was usually possible to find correspond-
ing sentences that have the same semantic content.
Sentence alignment was performed using the Gale
and Church algorithm (1991) with the following cost
function:
$$\text{cost of substitution} = 1 - \frac{\mathit{ncw}}{\mathit{anw}}$$
ncw: number of common words
anw: average number of words in two strings
Here is a sample from two different translations
of Twenty Thousand Leagues Under the Sea:
Ned Land tried the soil with his feet, as if to take
possession of it.
Ned Land tested the soil with his foot, as if he were
laying claim to it.
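As a rough illustration, the cost function can be computed for the sample pair above with a few lines of Python. This is only a sketch: the tokenizer (lowercasing and stripping punctuation) and the decision to count common words as a multiset intersection are assumptions, since neither detail is specified here, and the function names are hypothetical.

```python
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase and split into word tokens (an assumed, simplistic tokenizer)."""
    return re.findall(r"[a-z']+", sentence.lower())


def substitution_cost(s1, s2):
    """Return 1 - ncw / anw for two sentences."""
    t1, t2 = tokenize(s1), tokenize(s2)
    # ncw: words shared by both strings, counted as a multiset intersection (assumption)
    ncw = sum((Counter(t1) & Counter(t2)).values())
    # anw: average number of words in the two strings
    anw = (len(t1) + len(t2)) / 2
    return 1.0 - ncw / anw if anw else 0.0


a = "Ned Land tried the soil with his feet, as if to take possession of it."
b = "Ned Land tested the soil with his foot, as if he were laying claim to it."
cost = substitution_cost(a, b)
print(round(cost, 3))
```

Under these assumptions the two translations share 10 of their 15 and 16 words, so the cost is 1 - 10/15.5, and a sentence compared with itself has a cost of zero.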
To test the accuracy of our alignment, we man-
ually aligned 454 sentences from two different ver-
sions of Chapter 21 from Twenty Thousand Leagues
Under the Sea and compared the results of our au-
tomatic alignment algorithm against the manually
generated “gold standard.” We obtained a precision
of 0.93 and a recall of 0.88, which is comparable to
the figures (precision 0.94, recall 0.85) reported by
Barzilay and McKeown, who used a different cost
function for the alignment process.
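For concreteness, precision and recall against a manually aligned gold standard can be computed as in the sketch below. Representing an alignment as a set of sentence-index pairs is an assumption made for illustration, and the toy data is hypothetical; it is not the Chapter 21 data.

```python
def precision_recall(predicted, gold):
    """Precision and recall of predicted sentence alignments against a gold standard."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data: (sentence index in version A, sentence index in version B)
gold = {(0, 0), (1, 1), (2, 2), (3, 4)}
predicted = {(0, 0), (1, 1), (2, 3)}
p, r = precision_recall(predicted, gold)
print(p, r)
```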
3.2 Parsing and Postprocessing
The sentence pairs produced by the alignment
algorithm are then parsed by the Link Parser (Sleator
and Temperley, 1993), a dependency-based parser
developed at CMU. The resulting parse structures are
post-processed to render the links more consistent.
Because the Link Parser does not directly identify
the subject of a passive sentence, our postprocessor
takes the object of the by-phrase as the subject by
default. For our purposes, auxiliary verbs are
ignored; the postprocessor connects verbs directly to
their subjects, discarding links through any auxiliary
verbs. In addition, subjects and objects within
relative clauses are modified so that their
linkages remain consistent with subject and object
linkages in the matrix clause. For sentences
involving verbs that have particles, the Link Parser
connects the object of the verb directly to the verb
itself, attaching the particle separately. Our
postprocessor modifies the link structure so that the
object is connected to the particle in order to form a
continuous path. Predicate adjectives are converted
into an adjective-noun modification link instead of a
complete verb-argument structure. Also, common nouns
denoting places and people are marked by consulting
WordNet.
3.3 Paraphrase Extraction
The paraphrase extraction process starts by finding
anchors within the aligned sentence pairs. In our ap-
proach, only nouns and pronouns serve as possible
anchors. The anchor words from the sentence pairs
are brought into alignment and scored by a simple
set of ordered heuristics:
• Exact string matches denote correspondence.
• Noun and matching pronoun (same gender and
number) denote correspondence. Such a match
penalizes the score by 50%.
• Unique semantic class (e.g., places and people)
denotes correspondence. Such a match penal-
izes the score by 50%.
• Unique part of speech (i.e., the only noun
pair in the sentences) denotes correspondence.
Such a match penalizes the score by 50%.
• Otherwise, attempt to find correspondence by
finding longest common substrings. Such a
match penalizes the score by 50%.
• If a word occurs more than once in the aligned
sentence pairs, all possible combinations are
considered, but the score for such a correspond-
ing anchor pair is further penalized by 50%.
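The ordered heuristics above might be sketched as follows. This is a simplified illustration rather than our actual implementation: the `Word` annotations, the `unique_class` and `unique_pos` flags (which a caller would have to compute over the whole sentence pair), the minimum substring length of four characters, and the reading of "penalizes by 50%" as multiplying the score by 0.5 are all assumptions.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass(frozen=True)
class Word:
    text: str
    pos: str                      # "noun" or "pronoun"
    gender: Optional[str] = None  # hypothetical annotation, e.g. "m" or "f"
    number: Optional[str] = None  # hypothetical annotation, "sg" or "pl"


def longest_common_substring(a, b):
    """Length of the longest contiguous substring shared by a and b."""
    match = SequenceMatcher(None, a, b).find_longest_match(0, len(a), 0, len(b))
    return match.size


def anchor_score(a, b, unique_class=False, unique_pos=False,
                 repeated=False, min_substring=4):
    """Apply the heuristics in order; return a score, or None if no correspondence."""
    x, y = a.text.lower(), b.text.lower()
    if x == y:
        score = 1.0   # exact string match
    elif ({a.pos, b.pos} == {"noun", "pronoun"}
          and a.gender is not None
          and (a.gender, a.number) == (b.gender, b.number)):
        score = 0.5   # noun and matching pronoun
    elif unique_class or unique_pos:
        score = 0.5   # only pair with this semantic class or part of speech
    elif longest_common_substring(x, y) >= min_substring:
        score = 0.5   # fallback: longest common substring
    else:
        return None
    # Words occurring more than once are penalized by a further 50%.
    return score * 0.5 if repeated else score


print(anchor_score(Word("captain", "noun", "m", "sg"), Word("he", "pronoun", "m", "sg")))
```

The first heuristic that fires determines the base score, which keeps the ordering explicit; the repetition penalty is applied afterwards regardless of which heuristic matched.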
For each pair of anchors, a breadth-first search is
used to find the shortest path between the anchor
words. The search algorithm explicitly rejects paths
that contain conjunctions and punctuation. If valid
paths are found between anchor pairs in both of the
aligned sentences, the resulting paths are considered
candidate paraphrases, with a default score of one
(subject to penalties imposed by imperfect anchor
matching).
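The path search can be sketched as a standard breadth-first search over the link structure. In this illustration the links are treated as undirected, words stand in for token positions, the list of rejected conjunctions and punctuation is an assumed placeholder, and the toy parse is hand-written rather than actual Link Parser output.

```python
from collections import deque

# Assumed placeholder list of conjunctions and punctuation to reject.
REJECTED = {"and", "or", "but", ",", ";", ":", "."}


def shortest_path(links, start, goal):
    """Breadth-first search between two anchors.

    links: iterable of (word, label, word) triples.
    Returns an alternating list [word, label, word, ...] or None.
    """
    graph = {}
    for left, label, right in links:
        graph.setdefault(left, []).append((label, right))
        graph.setdefault(right, []).append((label, left))

    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for label, neighbor in graph.get(node, []):
            # Skip visited words and any conjunction or punctuation node.
            if neighbor in visited or neighbor.lower() in REJECTED:
                continue
            visited.add(neighbor)
            queue.append(path + [label, neighbor])
    return None


# Hypothetical post-processed links for "he puts on his waistcoat"
links = [("he", "S", "puts"), ("puts", "K", "on"),
         ("on", "O", "waistcoat"), ("his", "D", "waistcoat")]
print(shortest_path(links, "he", "waistcoat"))
```

Because breadth-first search explores paths in order of length, the first path that reaches the second anchor is a shortest one; a path that could only pass through a rejected node is never returned.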
Scores of candidate paraphrases take into account
two factors: the frequency of anchors with respect
to a particular candidate paraphrase and the variety
of different anchors from which the paraphrase was
produced. The initial default score of any paraphrase
is one (assuming perfect anchor matches), but for
each additional occurrence the score is incremented
by $\left(\frac{1}{2}\right)^{n}$, where n is the number of times the current
set of anchors has been seen. Therefore, seeing a
new set of anchors has a large initial impact on
the score, but the additional increase is subject to
diminishing returns as more occurrences of
the same anchors are encountered.
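A minimal sketch of this scoring scheme follows. Two details are assumptions, since the description leaves them open: n is taken to include the current occurrence (so a new anchor set adds 1/2 and its next sighting adds 1/4), and the anchor-matching penalty is applied only to the initial score of one.

```python
from collections import Counter


def score_paraphrase(occurrences):
    """Score one candidate paraphrase.

    occurrences: list of (anchor_pair, anchor_match_score) tuples, one per
    aligned sentence pair in which the candidate was found.
    """
    seen = Counter()
    score = 0.0
    for i, (anchors, match_score) in enumerate(occurrences):
        seen[anchors] += 1
        if i == 0:
            # Default score of one, scaled by any anchor-matching penalty.
            score = 1.0 * match_score
        else:
            # n: times this anchor set has been seen, including now (assumption).
            n = seen[anchors]
            score += 0.5 ** n
    return score


varied = [(("he", "waistcoat"), 1.0), (("she", "dress"), 1.0), (("he", "waistcoat"), 1.0)]
repeated = [(("he", "waistcoat"), 1.0)] * 4
print(score_paraphrase(varied), score_paraphrase(repeated))
```

With these assumptions, three occurrences over two distinct anchor sets score higher than four occurrences of the same anchor set, which reflects the intended preference for variety of anchors over raw frequency.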
                                      count
aligned sentences                     27479
parsed aligned sentences              25292
anchor pairs                          43974
paraphrases                            5925
unique paraphrases                     5502
gathered paraphrases (score ≥ 1.0)     2886
Table 1: Summary of the paraphrase generation process
Figure 1: Distribution of paraphrase length
4 Results
Using the approach described in previous sections,
we were able to extract nearly six thousand different
paraphrases (see Table 1) from our corpus, which
consisted of two translations of Twenty Thousand
Leagues Under the Sea, two translations of The
Kreutzer Sonata, and three translations of Madame
Bovary.
Our corpus was essentially the same as the one
used by Barzilay and McKeown, with the exception
of some short fairy tale translations that we found to
be unsuitable. Due to the length of sentences (some
translations were noted for their paragraph-length
sentences), the Link Parser was unable to produce
a parse for approximately eight percent of the sen-
tences. Although the Link Parser is capable of pro-
ducing partial linkages, accuracy deteriorated signif-
icantly as the length of the input string increased.
The distribution of paraphrase length is shown in
Figure 1. The length of paraphrases is measured by
the number of words that it contains (discounting the
anchors on both sides).
Evaluator     Precision
Evaluator 1   36.2%
Evaluator 2   40.0%
Evaluator 3   44.6%
Average       41.2%
Table 2: Summary of judgments by human evaluators
for 130 unique paraphrases
To evaluate the accuracy of our results, 130
unique paraphrases were randomly chosen to be
assessed by human judges. The human assessors
were specifically asked whether they thought the
paraphrases were roughly interchangeable with each
other, given the context of the genre. We believe that
the genre constraint was important because some
paraphrases captured literary or archaic uses of par-
ticular words that were not generally useful. This
should not be viewed as a shortcoming of our ap-
proach, but rather an artifact of our corpus. In ad-
dition, sample sentences containing the structural
paraphrases were presented as context to the judges;
structural paraphrases are difficult to comprehend
without this information.
A summary of the judgments provided by human
evaluators is shown in Table 2. The average
precision of our approach stands at just over forty
percent; the average length of the paraphrases learned
was 3.26 words. Our results also show that
judging structural paraphrases is a difficult task and
inter-assessor agreement is rather low. All of the
evaluators agreed on the judgments (either positive
or negative) only 75.4% of the time. The average
correlation constant of the judgments is only 0.66.
The highest scoring paraphrase was the equiva-
lence of the possessive morpheme ’s with the prepo-
sition of. We found it encouraging that our algo-
rithm was able to induce this structural paraphrase,
complete with co-indexed anchors on the ends of the
paths, i.e., A’s B ⇐⇒ B of A. Some other
interesting examples include:[2]
A1 <-∗-> liked <-O-> A2 ⇐⇒
A1 <-∗-> fond <-OF-> of -J-> A2
Example: The clerk liked Monsieur Bovary. ⇐⇒
The clerk was fond of Monsieur Bovary.
A1 <-s-> rush -K-> over -MV-> to -J-> A2 ⇐⇒
A1 <-s-> run -MV-> to -J-> A2
Example: And he rushed over to his son, who had
just jumped into a heap of lime to whiten his shoes.
⇐⇒ And he ran to his son, who had just precipitated
himself into a heap of lime in order to whiten
his boots.
A1 <-s-> put -K-> on -O-> A2 ⇐⇒
A1 <-s-> wear -O-> A2
Example: That is why he puts on his best waistcoat
and risks spoiling it in the rain. ⇐⇒ That’s why he
wears his new waistcoat, even in the rain!
A1 <-∗-> fit -MV-> to -I-> give -O-> A2 ⇐⇒
A1 <-∗-> appropriate -MV-> to -I-> supply -O-> A2
Example: He thought fit, after the first few
mouthfuls, to give some details as to the catastrophe. ⇐⇒
After the first few mouthfuls he considered it
appropriate to supply a few details concerning the
catastrophe.
[2] Brief description of link labels: S: subject to verb; O: object
to verb; OF: certain verbs to of; K: verbs to particles; MV: verbs
to certain modifying phrases. See Link Parser documentation
for full descriptions.
Score Threshold   Avg. Precision   Avg. Length   Count
≥ 1.0             40.2%            3.24          130
≥ 1.25            46.0%            2.88           58
≥ 1.5             47.8%            2.22           23
≥ 1.75            38.9%            1.67           12
Table 3: Breakdown of our evaluation results
A more detailed breakdown of the evaluation re-
sults can be seen in Table 3. Increasing the thresh-
old for generating paraphrases tends to increase their
precision, up to a certain point. In general, the high-
est ranking structural paraphrases consisted of sin-
gle word paraphrases of prepositions, e.g., at ⇐⇒
in. Our algorithm noticed that different prepositions
were often interchangeable, which is something that
our human assessors disagreed widely on. Beyond a
certain threshold, the accuracy of our approach ac-
tually decreases.
5 Discussion
An obvious first observation about our algorithm is
the dependence on parse quality; bad parses lead to
many bogus paraphrases. Although the parse results
from the Link Parser are far from perfect, it is un-
clear whether other purely statistical parsers would
fare any better, since they are generally trained on
corpora containing a totally different genre of text.
However, future work will most likely include a
comparison of different parsers.
Examination of our results shows that a better
notion of constituency would increase the accuracy
of our results. Our algorithm occasionally
generates nonsensical paraphrases that cross constituent
boundaries, for example, including the verb of a
subordinate clause with elements from the matrix
clause. Other problems arise because our current
algorithm has no notion of verb phrases; it often
generates near misses such as fail ⇐⇒ succeed,
neglecting to include not as part of the paraphrase.
However, there are problems inherent in para-
phrase generation that simple knowledge of con-
stituency alone cannot solve. Consider the following
two sentences:
John made out gold at the bottom of the well.
John discovered gold near the bottom of the well.
Which structural paraphrases should we be able to
extract?
made out X at Y ⇐⇒ discovered X near Y
made out X ⇐⇒ discovered X
at X ⇐⇒ near X
Arguably, all three paraphrases are valid, although
opinions vary more regarding the last paraphrase.
What is the optimal level of structure for para-
phrases? Obviously, this represents a tradeoff be-
tween specificity and accuracy, but the ability of
structural paraphrases to capture long-distance re-
lationships across large numbers of lexical items
complicates the problem. Due to the sparseness of
our data, our algorithm cannot make a good deci-
sion on what constituents to generalize as variables;
naturally, greater amounts of data would alleviate
this problem. This current inability to decide on a
good “scope” for paraphrasing was a primary rea-
son why we were unable to perform a strict eval-
uation of recall. Our initial attempts at generating
a gold standard for estimating recall failed because
human judges could not agree on the boundaries of
paraphrases.
The accuracy of our structural paraphrases is
highly dependent on the corpus size. As can be seen
from the numbers in Table 1, paraphrases are rather
sparse—nearly 93% of them are unique. Without
adequate statistical evidence, validating candidate
paraphrases can be very difficult. Although our data
sparseness problem can be alleviated simply by
gathering a larger corpus, the type of parallel text our
algorithm requires is rather hard to obtain, i.e., there
are only so many translations of so many foreign
novels. Furthermore, since our paraphrases are ar-
guably genre-specific, different applications may re-
quire different training corpora. Similar to the work
of Barzilay and Lee (2003), who have applied para-
phrase generation techniques to comparable corpora
consisting of different newspaper articles about the
same event, we are currently attempting to solve the
data sparseness problem by extending our approach
to non-parallel corpora.
We believe that generating paraphrases at the
structural level holds several key advantages over
lexical paraphrases, from the capturing of long-
distance relationships to the more accurate modeling
of context. The paraphrases generated by our
approach could prove to be useful in any natural
language application where understanding of
linguistic variations is important. In particular, we are
attempting to apply our results to improve the
performance of question answering systems, which we
describe in the following section.
6 Paraphrases and Question Answering
The ultimate goal of our work on paraphrases is
to enable the development of high-precision
question answering systems (cf. Katz and Levin, 1988;
Soubbotin and Soubbotin, 2001; Hermjakob et al.,
2002). We believe that a knowledge base of
paraphrases is the key to overcoming challenges
presented by the expressiveness of natural languages.
Because the same semantic content can be expressed
in many different ways, a question answering sys-
tem must be able to cope with a variety of alternative
phrasings. In particular, an answer stated in a form
that differs from the form of the question presents
significant problems:
When did Colorado become a state?
(1a) Colorado became a state in 1876.
(1b) Colorado was admitted to the Union in 1876.
Who killed Abraham Lincoln?
(2a) John Wilkes Booth killed Abraham Lincoln.
(2b) John Wilkes Booth ended Abraham Lincoln’s
life with a bullet.
In the above examples, question answering sys-
tems have little difficulty extracting answers if the
answers are stated in a form directly derived from
the question, e.g., (1a) and (2a); simple keyword
matching techniques with primitive named-entity
detection technology will suffice. However,
question answering systems will have a much harder time
extracting answers from sentences where they are
not obviously stated, e.g., (1b) and (2b). To
relate questions to answers in those examples, a system
would need access to rules like the following:
X became a state in Y ⇐⇒
X was admitted to the Union in Y
X killed Y ⇐⇒ X ended Y’s life
We believe that such rules are best formulated at
the syntactic level: structural paraphrases represent
a good level of generality and provide much more
accurate results than keyword-based approaches.
The simplest approach to overcoming the “para-
phrase problem” in question answering is via key-
word query expansion when searching for candidate
answers:
(AND X became state) ⇐⇒
(AND X admitted Union)
(AND X killed) ⇐⇒
(AND X ended life)
The major drawback of such techniques is over-
generation of bogus answer candidates. For ex-
ample, it is a well-known result that query expan-
sion based on synonymy, hyponymy, etc. may actu-
ally degrade performance if done in an uncontrolled
manner (Voorhees, 1994). Typically, keyword-
based query expansion techniques sacrifice signifi-
cant amounts of precision for little (if any) increase
in recall.
The problems associated with keyword query ex-
pansion techniques stem from the fundamental de-
ficiencies of “bag-of-words” approaches; in short,
they simply cannot accurately model the semantic
content of text, as illustrated by the following pairs
of sentences and phrases that have the same word
content, but dramatically different meaning:
(3a) The bird ate the snake.
(3b) The snake ate the bird.
(4a) the largest planet’s volcanoes
(4b) the planet’s largest volcanoes
(5a) the house by the river
(5b) the river by the house
(6a) The Germans defeated the French.
(6b) The Germans were defeated by the French.
The above examples are nearly indistinguishable
in terms of lexical content, yet their meanings are
vastly different. Naturally, because one text frag-
ment might be an appropriate answer to a question
while the other fragment may not be, a question an-
swering system seeking to achieve high precision
must provide mechanisms for differentiating the se-
mantic content of the pairs.
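The point can be checked directly with a small sketch that reduces each example to a bag of words. The tokenization (lowercasing, dropping the final period, splitting on whitespace) is an assumption chosen for brevity.

```python
from collections import Counter


def bag_of_words(text):
    """Word multiset of a sentence or phrase; word order is discarded."""
    return Counter(text.lower().rstrip(".").split())


pairs = [
    ("The bird ate the snake.", "The snake ate the bird."),
    ("the largest planet's volcanoes", "the planet's largest volcanoes"),
    ("the house by the river", "the river by the house"),
    ("The Germans defeated the French.", "The Germans were defeated by the French."),
]
identical = [bag_of_words(a) == bag_of_words(b) for a, b in pairs]
extra = bag_of_words(pairs[3][1]) - bag_of_words(pairs[3][0])
print(identical, sorted(extra))
```

Pairs (3) through (5) produce identical bags, and pair (6) differs only in the function words "were" and "by", which a keyword system would typically discard as stop words.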
While paraphrase techniques at the keyword-level
vastly overgenerate, paraphrase techniques at the
phrase-level undergenerate, that is, they are often
too specific. Although paraphrase rules can eas-
ily be formulated at the string-level, e.g., using
regular expression matching and substitution tech-
niques (Soubbotin and Soubbotin, 2001; Hermjakob
et al., 2002), such a treatment fails to capture im-
portant linguistic generalizations. For example, the
addition of an adverb typically does not alter the va-
lidity of a paraphrase; thus, a phrase-level rule “X
killed Y” ⇐⇒ “X ended Y’s life” would not be able
to match an answer like “John Wilkes Booth sud-
denly ended Abraham Lincoln’s life with a bullet”.
String-level paraphrases are also unable to handle
syntactic phenomena like passivization, which are
easily captured at the syntactic level.
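The brittleness of a string-level rule can be illustrated with a regular expression. The pattern below is only one hypothetical way to write the rule "X ended Y's life"; in particular, restricting X and Y to sequences of capitalized words is an assumption made so that the variables bind to names.

```python
import re

# Hypothetical string-level rule: X and Y are sequences of capitalized words.
NAME = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"
RULE = re.compile(rf"^(?P<x>{NAME}) ended (?P<y>{NAME})'s life")

plain = "John Wilkes Booth ended Abraham Lincoln's life with a bullet"
adverb = "John Wilkes Booth suddenly ended Abraham Lincoln's life with a bullet"

hit = RULE.match(plain)
print(hit.group("x"), "|", hit.group("y"))
print(RULE.match(adverb))
```

The rule binds X and Y for the first sentence but fails on the second, because the intervening adverb breaks the literal string, even though the subject-verb-object relation is unchanged.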
We believe that answering questions at the level of
syntactic relations, that is, matching parsed
representations of questions with parsed
representations of candidates, addresses the issues presented
above. Syntactic relations, basically simplified
versions of dependency structures derived from the
Link Parser, can capture significant portions of the
meaning present in text documents, while providing
a flexible foundation on which to build machinery
for paraphrases.
Our position is that question answering should be
performed at the level of “key relations” in addition
to keywords. We have begun to experiment with the
relation indexing and matching techniques described
above using an electronic encyclopedia as the test
corpus. We identified a particular set of linguistic
phenomena where relation-based indexing can dra-
matically boost the precision of a question answer-
ing system (Katz and Lin, 2003). As an example,
consider a sample output from a baseline keyword-
based IR system:
What do frogs eat?
(R1) Alligators eat many kinds of small animals
that live in or near the water, including fish, snakes,
frogs, turtles, small mammals, and birds.
(R2) Some bats catch fish with their claws, and a
few species eat lizards, rodents, birds, and frogs.
(R3) Bowfins eat mainly other fish, frogs, and cray-
fish.
(R4) Adult frogs eat mainly insects and other small
animals, including earthworms, minnows, and spi-
ders.
...
(R32) Kookaburras eat caterpillars, fish, frogs, in-
sects, small mammals, snakes, worms, and even
small birds.
Of the 32 sentences returned, only (R4) correctly
answers the user query; the other results answer a
different question—“What eats frogs?” A bag-of-
words approach fundamentally cannot differentiate
between a query in which the frog is in the subject
position and a query in which the frog is in the object
position. Compare this to the results produced by
our relations matcher:
What do frogs eat?
(R4) Adult frogs eat mainly insects and other small
animals, including earthworms, minnows, and spi-
ders.
By examining subject-verb-object relations, our
system can filter out irrelevant results and return
only the correct responses.
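The filtering step can be sketched with subject-verb-object triples. The triples below are hand-written, simplified annotations of four of the retrieved sentences, not output of our parser, and the `match` function is a hypothetical stand-in for the relations matcher.

```python
# Hand-written, simplified (subject, verb, object) triples (hypothetical annotations).
RELATIONS = {
    "R1": [("alligators", "eat", "animals")],
    "R2": [("bats", "catch", "fish"), ("species", "eat", "lizards"),
           ("species", "eat", "frogs")],
    "R3": [("bowfins", "eat", "fish"), ("bowfins", "eat", "frogs")],
    "R4": [("frogs", "eat", "insects"), ("frogs", "eat", "animals")],
}


def match(relations, subject=None, verb=None, obj=None):
    """Return ids of sentences with a triple matching every specified slot."""
    hits = []
    for sentence_id, triples in relations.items():
        if any((subject is None or s == subject)
               and (verb is None or v == verb)
               and (obj is None or o == obj)
               for s, v, o in triples):
            hits.append(sentence_id)
    return hits


what_do_frogs_eat = match(RELATIONS, subject="frogs", verb="eat")
what_eats_frogs = match(RELATIONS, verb="eat", obj="frogs")
print(what_do_frogs_eat, what_eats_frogs)
```

Fixing the subject slot to "frogs" keeps only (R4), while fixing the object slot instead retrieves the sentences that answer the converse question.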
We are currently working on combining this
relations-indexing technology with the automatic
paraphrase generation technology described earlier.
For example, our approach would be capable of au-
tomatically learning a paraphrase like X eat Y ⇐⇒ Y
is a prey of X; a large collection of such paraphrases
would go a long way in overcoming the brittleness
associated with a relations-based indexing scheme.
7 Contributions
We have presented a method for automatically
learning structural paraphrases from aligned monolingual
corpora that overcomes the limitations of previous
approaches. In addition, we have sketched how this
technology can be applied to enhance the perfor-
mance of a question answering system based on in-
dexing relations. Although we have not completed
a task-based evaluation, we believe that the ability
to handle variations in language is key to building
better question answering systems.
8 Acknowledgements
This research was funded by DARPA under contract
number F30602-00-1-0545 and administered by the
Air Force Research Laboratory.

References
Regina Barzilay and Lillian Lee. 2003. Learning to
paraphrase: An unsupervised approach using multiple-
sequence alignment. In Proceedings of HLT-NAACL
2003.
Regina Barzilay and Kathleen McKeown. 2001. Extract-
ing paraphrases from a parallel corpus. In Proceed-
ings of the 39th Annual Meeting of the Association for
Computational Linguistics (ACL-2001).
Mark Dras. 1999. Tree Adjoining Grammar and the Re-
luctant Paraphrasing of Text. Ph.D. thesis, Macquarie
University, Australia.
William A. Gale and Kenneth Ward Church. 1991. A
program for aligning sentences in bilingual corpora.
In Proceedings of the 29th Annual Meeting of the As-
sociation for Computational Linguistics (ACL-1991).
Ulf Hermjakob, Abdessamad Echihabi, and Daniel
Marcu. 2002. Natural language based reformulation
resource and Web exploitation for question answering.
In Proceedings of the Eleventh Text REtrieval Confer-
ence (TREC 2002).
Christian Jacquemin, Judith L. Klavans, and Evelyne
Tzoukermann. 1997. Expansion of multi-word terms
for indexing and retrieval using morphology and syn-
tax. In Proceedings of the 35th Annual Meeting of
the Association for Computational Linguistics (ACL-
1997).
Boris Katz and Beth Levin. 1988. Exploiting lexical
regularities in designing natural language systems. In
Proceedings of the 12th International Conference on
Computational Linguistics (COLING-1988).
Boris Katz and Jimmy Lin. 2003. Selectively using re-
lations to improve precision in question answering. In
Proceedings of the EACL-2003 Workshop on Natural
Language Processing for Question Answering.
Maria Lapata. 2001. A corpus-based account of reg-
ular polysemy: The case of context-sensitive adjec-
tives. In Proceedings of the Second Meeting of the
North American Chapter of the Association for Com-
putational Linguistics (NAACL-2001).
Dekang Lin and Patrick Pantel. 2001. DIRT—discovery
of inference rules from text. In Proceedings of the
ACM SIGKDD Conference on Knowledge Discovery
and Data Mining.
Dekang Lin. 1998. Extracting collocations from text cor-
pora. In Proceedings of the First Workshop on Com-
putational Terminology.
Fernando Pereira, Naftali Tishby, and Lillian Lee. 1993.
Distributional clustering of English words. In
Proceedings of the 31st Annual Meeting of the Association
for Computational Linguistics (ACL-1993).
Daniel Sleator and Davy Temperley. 1993. Parsing
English with a link grammar. In Proceedings of the Third
International Workshop on Parsing Technologies.
Martin M. Soubbotin and Sergei M. Soubbotin. 2001.
Patterns of potential answer expressions as clues to the
right answers. In Proceedings of the Tenth Text RE-
trieval Conference (TREC 2001).
Ellen M. Voorhees. 1994. Query expansion using
lexical-semantic relations. In Proceedings of the
17th Annual International ACM SIGIR Conference on
Research and Development in Information Retrieval
(SIGIR-1994).
