Towards Light Semantic Processing for Question Answering
Benjamin Van Durme , Yifen Huang , Anna Kup´s´c +, Eric Nyberg 
 Language Technologies Institute, Carnegie Mellon University
+Polish Academy of Sciences
fvandurme,hyifen,aniak,ehng@cs.cmu.edu
Abstract
The paper1 presents a lightweight knowledge-
based reasoning framework for the JAVELIN
open-domain Question Answering (QA) sys-
tem. We propose a constrained representation
of text meaning, along with a flexible unifica-
tion strategy that matches questions with re-
trieved passages based on semantic similarities
and weighted relations between words.
1 Introduction
Modern Question Answering (QA) systems aim at pro-
viding answers to natural language questions in an open-
domain context. This task is usually achieved by com-
bining information retrieval (IR) with information extrac-
tion (IE) techniques, modified to be applicable to unre-
stricted texts. Although semantics-poor techniques, such
as surface pattern matching (Soubbotin, 2002; Ravichan-
dran and Hovy, 2002) or statistical methods (Ittycheriah
et al., 2002), have been successful in answering fac-
toid questions, more complex tasks require a consider-
ation of text meaning. This requirement has motivated
work on QA systems to incorporate knowledge process-
ing components such as semantic representation, ontolo-
gies, reasoning and inference engines, e.g., (Moldovan et
al., 2003), (Hovy et al., 2002), (Chu-Carroll et al., 2003).
Since world knowledge databases for open-domain tasks
are unavailable, alternative approaches for meaning rep-
resentation must be adopted. In this paper, we present
our preliminary approach to semantics-based answer de-
tection in the JAVELIN QA system (Nyberg et al., 2003).
In contrast to other QA systems, we are trying to realize
a formal model for a lightweight semantics-based open-
domain question answering. We propose a constrained
semantic representation as well as an explicit unification
1The authors appear in alphabetical order.
framework based on semantic similarities and weighted
relations between words. We obtain a lightweight roboust
mechanism to match questions with answer candidates.
The organization of the paper is as follows: Section 2
briefly presents system components; Section 3 discusses
syntactic processing strategies; Sections 4 and 5 describe
our preliminary semantic representation and the unifica-
tion framework which assigns confidence values to an-
swer candidates. The final section contains a summary
and future plans.
2 System Components
The JAVELIN system consists of four basic components:
a question analysis module, a retrieval engine, a passage
analysis module (supporting both statistical and NLP
techniques), and an answer selection module. JAVELIN
also includes a planner module, which supports feedback
loops and finer control over specific components (Nyberg
et al., 2003). In this paper we are concerned with the two
components which support linguistic analysis: the ques-
tion analysis and passage understanding modules (Ques-
tion Analyzer and Information Extractor, respectively).
The relevant aspects of syntactic processing in both mod-
ules are presented in Section 3, whereas the semantic rep-
resentation is introduced in Section 4.
3 Parsing
The system employs two different parsing techniques:
a chart parser with hand-written grammars for ques-
tion analysis, and a lexicalized, broad coverage skipping
parser for passage analysis. For question analysis, pars-
ing serves two goals: to identify the finest answer focus
(Moldovan et al., 2000; Hermjakob, 2001), and to pro-
duce a grammatical analysis (f-structure) for questions.
Due to the lack of publicly available parsers which have
suitable coverage of question forms, we have manually
developed a set of grammars to achieve these goals. On
the other hand, the limited coverage and ambiguity in
these grammars made adopting the same approach for
passage analysis inefficient. In effect, we use two dis-
tinct parsers which provide two syntactic representations,
including grammatical functions. These syntactic struc-
tures are then transformed into a common semantic rep-
resentation discussed in Section 4.
( (Brill-pos VBN)
(adjunct (
(object (
(Brill-pos WRB)
(atype temporal)
(cat n)
(ortho When)
(q-focus +)
(q-token +)
(root when)
(tokens 1)))
(time +)))
(cat v)
(finite +)
(form finite)
(modified +)
(ortho founded)
(passive +)
(punctuation (
(Brill-pos ".")
(cat punct)
(ortho ?)
(root ?)
(tokens 6)))
(qa (
(gap (
(atype temporal)
(path (*MULT* adjunct
object))))
(qtype entity)))
(root found)
(subject (
(BBN-name person)
(Brill-pos NNP)
(cat n)
(definite +)
(gen-pn +)
(human +)
(number sg)
(ortho "Wendy’s")
(person third)
(proper-noun +)
(root wendy)
(tokens 3)))
(tense past)
(tokens 5))
Figure 1: When was Wendy’s founded: KANTOO f-
structure
3.1 Questions
The question analysis consists of two steps: lexical pro-
cessing and syntactic parsing. For the lexical process-
ing step, we have integrated several external resources:
the Brill part-of-speech tagger (Brill, 1995), BBN Identi-
Finder (BBN, 2000) (to tag named entities such as proper
names, time expressions, numbers, etc.), WordNet (Fell-
baum, 1998) (for semantic categorization), and the KAN-
TOO Lexifier (Nyberg and Mitamura, 2000) (to access a
syntactic lexicon for verb valence information).
The hand-written grammars employed in the project
are based on the Lexical Functional Grammar (LFG) for-
malism (Bresnan, 1982), and are used with the KANTOO
parser (Nyberg and Mitamura, 2000). The parser out-
puts a functional structure (f-structure) which specifies
the grammatical functions of question components, e.g.,
subject, object, adjunct, etc. As illustrated in Fig. 1, the
resulting f-structure provides a deep, detailed syntactic
analysis of the question.
3.2 Passages
Passages selected by the retrieval engine are processed
by the Link Grammar parser (Grinberg et al., 1995). The
parser uses a lexicalized grammar which specifies links,
i.e., grammatical functions, and provides a constituent
structure as output. The parser covers a wide range of
syntactic constructions and is robust: it can skip over un-
recognized fragments of text, and is able to handle un-
known words.
An example of the passage analysis produced by the
Link Parser is presented in Fig. 2. Links are treated as
predicates which relate various arguments. For exam-
ple, O in Fig. 2 indicates that Wendy’s is an object of the
verb founded. In parallel to the Link parser, passages are
tagged with the BBN IdentiFinder (BBN, 2000), in or-
der to group together multi-word proper names such as
R. David Thomas.
4 Semantic Representation
At the core of our linguistic analysis is the semantic rep-
resentation, which bridges the distinct representations of
the functional structure obtained for questions and pas-
sages. Although our semantic representation is quite sim-
ple, it aims at providing the means of understanding and
processing broad-coverage linguistic data. The represen-
tation uses the following main constructs:2
 formula is a conjunction of literals and represents
the meaning of the entire sentence (or question);
 literal is a predicate relation over two terms; in par-
ticular, we distinguish two types of literals: extrin-
sic literal, a literal which relates a label to a label,
and intrinsic literal, a literal which relates a label to
a word;
2The use of terminology common in the field of formal logic
is aimed at providing an intuitive understanding to the reader,
but is not meant to give the impression that our work is built on
a firm logic-theoretic framework.
+------------------------Xp------------------------+
| +-------MVp-------+ |
+--------Wd-------+ +------O------+ | |
| +-G-+---G--+---S---+ +--YS-+ +-IN+ |
| | | | | | | | | |
LEFT-WALL R. David Thomas founded.v Wendy ’s.p in 1969 .
Constituent tree:
(S (NP R. David Thomas)
(VP founded
(NP (NP Wendy ’s))
(PP in
(NP 1969)))
.)
Figure 2: R. David Thomas founded Wendy’s in 1969.: Link Grammar parser output
 predicate is used to capture relations between
terms;
 term is either a label, a variable which refers to a
specific entity or an event, or a word, which is either
a single word (e.g., John) or a sequence of words
separated by whitespace (e.g., for proper names such
as John Smith).
The BNF syntax corresponding to this representation
is given in (1).
(1) <formula> := <literal>+
<literal> := <pred>(<term>,<term>)
<term> := <label>|<word>
<word> := |[a-nA-Z0-9\s]+|
<label> := [a-z]+[0-9]+
<pred> := [A-Z_-]+
With the exception of the unary ANS predicate which
indicates the sought answer, all predicates are binary re-
lations (see examples in Fig. 3). Currently, most pred-
icate names are based on grammatical functions (e.g.,
SUBJECT, OBJECT, DET) which link events and entities
with their arguments. Unlike in (Moldovan et al., 2003),
names of predicates belong to a fixed vocabulary, which
provides a more sound basis for a formal interpretation.
Names of labels and terms are restricted only by the syn-
tax in (1). Examples of semantic representations for the
question When was Wendy’s founded? and the passage R.
David Thomas founded Wendy’s in 1969. are shown in
Fig. 4.
Note that our semantic representation reflects the
‘canonical’ structure of an active sentence. This design
decision was made in order to eliminate structural differ-
ences between semantically equivalent structures. Hence,
at the semantic level, all passive sentences correspond to
their equivalents in the active form. Semantic representa-
tion of questions is not always derived directly from the
f-structure. For some types of questions, e.g., definition
When was Wendy’s R. David Thomas founded
founded? Wendy’s in 1969.
ROOT(x6,jWendy’sj) ROOT(x6,jWendy’sj)
ROOT(x2,jfoundj) ROOT(x2,joundj)
ADJUNCT(x2,x1) ADJUNCT(x2,x1)
OBJECT(x2,x6) OBJECT(x2,x6)
SUBJECT(x2,x7) SUBJECT(x2,x7)
ROOT(x7,jR. David Thomasj)
TYPE(x2,jeventj) TYPE(x2,jeventj)
TENSE(x2,jpastj)
ROOT(x1,x9) ROOT(x1,j1969j)
TYPE(x1,jtimej) TYPE(x1,jtimej)
ANS(x9)
Figure 4: An example of question and passage semantic
representation
questions such as What is the definition of hazmat?, spe-
cialized (dedicated) grammars are used, which allows us
to more easily arrive at an appropriate representation of
meaning. Also, in the preliminary implementation of the
unification algorithm (see Section 5), we have adopted
some simplifying assumptions, and we do not incorpo-
rate sets in the current representation.
The present formalism can quite successfully handle
questions (or sentences) which refer to specific events or
relations. However, it is more difficult to represent ques-
tions like What is the relationship between Jesse Ventura
and Target Stores?, which seek a relation between enti-
ties or a common event they participated in. In the next
section, we discuss the unification scheme which allows
us to select answer candidates based on the proposed rep-
resentation.
5 Fuzzy Unification
A unification algorithm is required to match question rep-
resentations with the representations of extracted pas-
sages which might contain answers. Using a precursor
predicate example comments
ROOT ROOT(x13,jJohnj) the root form of entity/event x13
OBJECT OBJECT(x2,x3) x3 is the object of verb
or preposition x2
SUBJECT SUBJECT(x2,x3) x3 is the subject of verb x2
DET DET(x2,x1) x1 is a determiner/quantifier of x2
TYPE TYPE(x3,jeventj) x3 is of the type event
TENSE TENSE(x1,jpresentj) x1 is a verb in present tense
EQUIV EQUIV(x1,x3) semantic equivalence:
apposition: ”John, a student of CMU”
equality operator in copular sentences:
”John is a student of CMU”
ATTRIBUTE ATTRIBUTE(x1,x3) x3 is an adjective modifier of x1:
adjective-noun: ”stupid John”
copular constructions: ”John is stupid”
PREDICATE PREDICATE(x2,x3) copular constructions: ”Y is x3”
ROOT(x2,jbej) SUBJECT(x2,Y)
PREDICATE(x2,x3)
POSSESSOR POSSESSOR(x2,x4) x4 is the possessor of x2
”x4’s x2” or ”x2 of x4”
AND AND(x3,x1) ”John and Mary laughed.”
AND(x3,x2) ROOT(x1,jJohnj) ROOT(x2,jMaryj)
ROOT(x4,jlaughj) TYPE(x4,jeventj)
AND(x3,x1)
AND(x3,x2)
SUBJECT(x4,x3)
ANS ANS(x1) only for questions: x1 indicates the answer
Figure 3: Examples of predicates
to the representation presented above, we constructed
an initial prototype using a traditional theorem prover
(Kalman, 2001). Answer extraction was performed by at-
tempting a unification between logical forms of the ques-
tion and retrieved passages. Early tests showed that a uni-
fication strategy based on a strict boolean logic was not
as flexible as we desired, given the lack of traditional do-
main constraints that one normally possesses when con-
sidering this type of approach. Unless a retrieved pas-
sage exactly matched the question, as in Fig. 4, the sys-
tem would fail due to lack of information. For instance,
knowing that Benjamin killed Jefferson. would not an-
swer the question Who murdered Jefferson?, using a strict
unification strategy.
This has led to more recent experimentation with prob-
abilistic models that perform what we informally refer to
as fuzzy unification.3 The basic idea of our unification
strategy is to treat relationships between question terms
as a set of weighted constraints. The confidence score
assigned to each extracted answer candidate is related to
the number of constraints the retrieved passage satisfies,
along with a measure of similarity between the relevant
terms.
5.1 Definitions
In this section, we present definitions which are necessary
for discussion of the similarity measure employed by our
fuzzy unification framework.
Given a user query Q, where Q is a formula, we re-
trieve a set of passages P. Our task to is find the best
passage Pbest 2 P from which an answer candidate can
be extracted. An answer candidate exists within a pas-
sage P if the result of a fuzzy unification between Q and
P results in the single term of ANS(x0) being ground in a
term from P.
(2) Pbest = argmaxP2Psim(Q;P)
The restriction that an answer candidate must be found
within a passage P must be made explicit, as our no-
tion of fuzzy unification is such that a passage can unify
against a query with a non-zero level of confidence even
if one or more constraints from the query are left unsat-
isfied. Since the final goal is to find and return the best
possible answer, we are not concerned with those pas-
sages which seem highly related yet do not offer answer
candidates.
In Section 4, we introduced extrinsic literals where
predicates serve as relations over two labels. Extrinsic lit-
erals can be thought of as relations defined over distinct
3Fuzzy unification in a formal setting generally refers to a
unification framework that is employed in the realm of fuzzy
logics. Our current representation is of an ad-hoc nature, but
our usage of this term does foreshadow future progression to-
wards a representation scheme dependent on such a formal,
non-boolean model.
entities in our formula. For example, SUBJECT(x1;x2)
is an extrinsic literal, while ROOT(x1;jBenjaminj) is not.
The latter has been defined as an intrinsic literal in Sec-
tion 4 and it relates a label and a word.
This terminology is motivated by the intuitive distinc-
tion between intrinsic and extrinsic properties of an entity
in the world. We use this distinction as a simplifying as-
sumption in our measurements of similarity, which we
will now explain in more detail.
5.2 Similarity Measure
Given a set of extrinsic literals PE and QE from a pas-
sage and the question, respectively, we measure the sim-
ilarity between QE and a given ordering of PE as the
geometric mean of the similarity between each pair of
extrinsic literals from the sets QE and PE.
Let O be the set of all possible orderings of PE, O
an element of O, QEj literal j of QE, and Oj literal j of
ordering O. Then:
(3) sim(Q;P)= sim(QE;PE)
= maxO2O(Qnj=0 sim(QEj ;Oj)) 1n
The similarity of two extrinsic literals, lE and lE0, is
computed by the square root of the similarity scores of
each pair of labels, multiplied by the weight of the given
literal, dependent on the equivilance of the predicates
p;p0 of the respective literals lE;lE0. If the predicates are
not equivilant, we rely on the engineers tactic of assign-
ing an epsilon value of similarity, where  is lower than
any possible similarity score4. Note that the similarity
score is still dependent on the weight of the literal, mean-
ing that failing to satisfy a heavier constraint imposes a
greater penalty than if we fail to satisfy a constraint of
lesser importance.
Let tj and t0j be the respective j-th term of lE;lE0.
Then:
(4) sim(lE;lE0) = weight(lE) 
((sim(t0;t00) sim(t1;t01)) 12;p=p0
 ;otherwise
The weight of a literal is meant to capture the relative
importance of a particular constraint in a query. In stan-
dard boolean unification the importance of a literal is uni-
form, as any local failure dooms the entire attempt.5 In a
non-boolean framework the importance of one literal vs.
another becomes an issue. As an example, given a ques-
tion concerning a murder we might be more interested in
the suspect’s name than in the fact that he was tall. This
4The use of a constant value of  is ad hoc, and we are in-
vestigating more principled methods for assigning this penalty.
5That is to say, classic unification is usually an all or nothing
affair.
idea is similar to that commonly seen in information re-
trieval systems which place higher relative importance on
terms in a query that are judged a priori to posses higher
information value. While our prototype currently sets all
literals with a weight of 1.0, we are investigating methods
to train these weights to be specific to question type.
Per our definition, all terms within an extrinsic literal
will be labels. Thus, in equation (10), t0 is a label, as is
t1, and so on. Given a pair of labels, b and b0, we let I;I0
be the respective sets of intrinsic literals from the formula
containing b;b0 such that for all intrinsic literals lI 2I,
the first term of lI is b, and likewise for b0;I0.
Much like similarity between two formulae, the sim-
ilarity between two labels relies on finding the maximal
score over all possible orderings of a set of literals.
Now let O be the set of all possible orderings of I0, O
an element of O, Ij the j-th literal of I, and Oj the j-th
literal of O. Then:
(5) sim(b;b0) = maxO2O(Qnj=0 sim(Ij;Oj)) 1n
We measure the similarity between a pair of intrinsic
literals as the similarity between the two words multi-
plied by the weight of the first literal, dependent on the
predicates p;p0 of the respective literals being equivilant.
(6)
sim(lI;lI0) = weight(lI) 
(sim(t1;t01);p=p0
 ;otherwise
The similarity between two words is currently measured
using a WordNet distance metric, applying weights intro-
duced in (Moldovan et al., 2003). We will soon be inte-
grating metrics which rely on other dimensions of simi-
larity.
5.3 Example
We now walk through a simple example in order to
present the current framework used to measure the level
of constraint satisfaction (confidence score) achieved by
a given passage. While a complete traversal of even a
small passage would exceed the space available here, we
will present a single instance of each type of usage of the
sim() function.
If we limit our focus to only a few key relationships,
we get the following analysis of a given question and pas-
sage.
(7) Who killed Jefferson?
ANS(x0), ROOT(x1,x0), ROOT(x2,jkillj),
ROOT(x3,jJeffersonj), TYPE(x2,jeventj),
TYPE(x1,jpersonj), TYPE(x3,jpersonj), SUB-
JECT(x2,x1), OBJECT(x2,x3)
(8) Benjamin murdered Jefferson.
ROOT(y1,jBenjaminj), ROOT(y2,jmurderj),
ROOT(y3,jJeffersonj), TYPE(y2,jeventj),
TYPE(y1,jpersonj), TYPE(y3,jpersonj), SUB-
JECT(y2,y1), OBJECT(y2,y3)
Computing the similarity between two formulae,
(loosely referred to here by their original text), gives the
following:
(9) sim[jWho killed Jefferson?j,
jBenjamin murdered Jefferson.j] =
(sim[ SUBJECT(x2,x1), SUBJECT(y2,y1)] 
sim[ OBJECT(x2,x3), OBJECT(y2,y3)])12
The similarity between the given extrinsic literals shar-
ing the predicate SUBJECT:
(10) sim[SUBJECT(x2,x1), SUBJECT(y2,y1)] =
(sim[x2, y2] sim[x1, y1])12 
weight[SUBJECT(x2,x1)]
In order to find the result of this extrinsic similarity
evaluation, we need to determine the similarity between
the paired terms, (x1,y1) and (x2,y2). The similarity be-
tween x1 and y1 is measured as:
(11) sim[x2, y2] =
(sim[ROOT(x2,jkillj), ROOT(y2,jmurderj)] 
sim[TYPE(x2,jeventj), TYPE(y2,jeventj)])12
The result of this function depends on the combined
similarity of the intrinsic literals that relate the given
terms to values. The similarity between one of these in-
trinsic literal pairs is measured by:
(12) sim[ROOT(x2,jkillj), ROOT(y2,jmurderj)] =
sim[jkillj,jmurderj] weight[ROOT(x2,jkillj)]
Finally, the similarity between a pair of words is com-
puted as:
(13) sim[jkillj,jmurderj] = 0:8
As stated earlier, our similarity metrics at the word
level are currently based on recent work on WordNet dis-
tance functions. We are actively developing methods to
complement this approach.
6 Summary and Future Work
The paper presents a lightweight semantic processing
technique for open-domain question answering. We pro-
pose a uniform semantic representation for questions and
passages, derived from their functional structure. We also
describe the unification framework which allows for flex-
ible matching of query terms with retrieved passages.
One characteristics of the current representation is that
it is built from grammatical functions and does not uti-
lize a canonical set of semantic roles and concepts. Our
overall approach in JAVELIN was to start with the sim-
plest form of meaning-based matching that could extend
simple keyword-based approaches. Since it was possi-
ble to extract grammatical functions from unrestricted
text fairly quickly (using KANTOO for questions and the
Link Grammar parser for answer passages), this frame-
work provides a reasonable first step. We intend to extend
our representation and unification algorithm by incorpo-
rating the Lexical Conceptual Structure Database (Dorr,
2001), which will allow us to use semantic roles instead
of grammatical relations as predicates in the represen-
tation. We also plan to enrich the representation with
temporal expressions, incorporating the ideas presented
in (Han, 2003).
Another limitation of the current implementation is
the limited scope of the similarity function. At present,
the similarity function is based on relationships found
in WordNet, and only relates words which belong to the
same syntactic category. We plan to extend our similar-
ity measure by using name lists, gazetteers and statistical
cooccurrence in text corpora. A complete approach to
word similarity will also require a suitable algorithm for
reference resolution. Unrestricted text makes heavy use
of various forms of co-reference, such as anaphora, def-
inite description, etc. We intend to adapt the anaphora
resolution algorithms used in KANTOO for this purpose,
but a general solution to resolving definite reference (e.g.,
the use of “the organization” to refer to “Microsoft”) is a
topic for ongoing research.
Acknowledgments
Research presented in this paper has been supported in
part by an ARDA grant under Phase I of the AQUAINT
program. The authors wish to thank all members of the
JAVELIN project for their support in preparing the work
presented in this paper. We are also grateful to two anony-
mous reviewers, Laurie Hiyakumoto and Kathryn Baker
for their comments and suggestions for improving this
paper. Needless to say, all remaining errors and omis-
sions are entirely our responsibility.

References
BBN Technologies, 2000. IdentiFinder User Manual.
Joan Bresnan, editor. 1982. The Mental Representation
of Grammatical Relations. MIT Press Series on Cog-
nitive Theory and Mental Representation. The MIT
Press, Cambridge, MA.
Eric Brill. 1995. Transformation-based error-driven
learning and natural language processing: A case study
in part-of-speech tagging. Computational Linguistics,
21:543–565.
Jenifer Chu-Carroll, John Prager, Christopher Welty,
Krzysztof Czuba, and David Ferrucci. 2003. A multi-
strategy and multi-source approach to question an-
swering. In TREC 2002 Proceedings.
Bonnie J. Dorr. 2001. LCS Database Docu-
mentation. HTML Manual. available from
http://www.umiacs.umd.edu/˜bonnie/LCS Data-
base Documentation.html.
Christiane Fellbaum. 1998. WordNet: An Electronic Lex-
ical Database. MIT Press.
Dennis Grinberg, John Lafferty, and Daniel Sleator.
1995. A robust parsing algorithm for link grammars.
In Proceedings of the Fourth International Workshop
on Parsing Technologies, Prague, September.
Benjamin Han. 2003. Text temporal analysis. Research
status summary. Draft of January 2003.
Ulf Hermjakob. 2001. Parsing and question classifica-
tion for question answering. In Proceedings of the
Workshop on Open-Domain Question Answering at
ACL-2001.
Eduard Hovy, Ulf Hermjakob, and Chin-Yew Lin. 2002.
The use of external knowledge in factoid qa. In Pro-
ceedings of the TREC-10 Conference.
Abraham Ittycheriah and Salim Roukos. 2003. IBM’s
statistical question answering system – TREC-11. In
TREC 2002 Proceedings.
Abraham Ittycheriah, Martin Franz, and Salim Roukos.
2002. IBM’s statistical question answering system –
TREC-10. In TREC 2001 Proceedings.
John A. Kalman. 2001. Automated Reasoning with OT-
TER. Rinton Press.
Dan Moldovan, Sanda Harabagiu, Marius Pasca, Rada
Mihalcea, Roxana Girju, Richard Goodrum, and Vasile
Rus. 2000. The structure and performance of an open-
domain question answering system. In Proceedings of
the Conference of the Association for Computational
Linguistics (ACL-2000).
Dan Moldovan, Sanda Harabagiu, Roxana Girju, Paul
Morarescu, Finley Lacatusu, Adrian Novischi, Adriana
Badulescu, and Orest Bolohan. 2003. LCC tools for
question answering. In TREC 2002 Proceedings.
Eric Nyberg and Teruko Mitamura. 2000. The KAN-
TOO machine translation environment. In Proceed-
ings of AMTA 2000.
Eric Nyberg, Teruko Mitamura, Jaime Carbonell, Jaime
Callan, Kevyn Collins-Thompson, Krzysztof Czuba,
Michael Duggan, Laurie Hiyakumoto, Ng Hu, Yifen
Huang, Jeongwoo Ko, Lucian V. Lita, Stephen
Murtagh, Vasco Pedro, and David Svoboda. 2003. The
JAVELIN question answering system at TREC 2002.
In TREC 2002 Proceedings.
Deepak Ravichandran and Eduard Hovy. 2002. Learning
surface text patterns for a Question Answering system.
In Proceedings of the ACL Conference.
Martin M. Soubbotin. 2002. Patterns of potential answer
expressions as clues to the right answer. In TREC 2001
Proceedings.
