<?xml version="1.0" standalone="yes"?> <Paper uid="W06-0802"> <Title>Sydney, July 2006. (c) 2006 Association for Computational Linguistics Hybrid Systems for Information Extraction and Question Answering</Title> <Section position="3" start_page="0" end_page="10" type="metho"> <SectionTitle> 2 Ternary Expressions as Predicate- </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> Argument Structures </SectionTitle> <Paragraph position="0"> Researchers like Lin, Katz and Litkowski have started to work in the direction of using NLP to populate a database of RDFs, thus creating the premises for the automatic creation of ontologies to be used in IR/IE tasks. However, in no way can RDFs and ternary expressions constitute a formal tool sufficient to express the complexity of natural language texts.</Paragraph> <Paragraph position="1"> RDFs are assertions about the things (people, webpages and whatever) they predicate about, asserting that those things have certain properties with certain values. Even if we agree that this is the natural way of dealing with data handled by computers most frequently, it is also a fact that this is not equivalent to being useful for natural language. The misconception seems to be deeply embedded in the nature of RDFs as a whole: they are directly comparable to attribute-value pairs and DAGs, which are also the formalisms used by most recent unification-based linguistic grammars. From the logical and semantic point of view, RDFs also resemble very closely first order predicate logic constructs: but we must remember that FOPL is as such insufficient to describe natural language texts.</Paragraph> <Paragraph position="2"> Ternary expressions (T-expressions) have the form &lt;subject relation object&gt;.</Paragraph> <Paragraph position="3"> Certain other parameters (adjectives, possessive nouns, prepositional phrases, etc.)
are used to create additional T-expressions in which prepositions and several special words may serve as relations. In Litkowski's system, for instance, the key step in the question-answering prototype was the analysis of the parse trees to extract semantic relation triples and populate the databases used to answer the question. A semantic relation triple consists of a discourse entity, a semantic relation which characterizes the entity's role in the sentence, and a governing word to which the entity stands in the semantic relation. The semantic relations in which entities participate are intended to capture the semantic roles of the entities, as generally understood in linguistics. These include such roles as agent, theme, location, manner, modifier, purpose, and time. Surrogate placeholders included are &quot;SUBJ,&quot; &quot;OBJ,&quot; &quot;TIME,&quot; &quot;NUM,&quot; &quot;ADJMOD,&quot; and the prepositions heading prepositional phrases. The governing word was generally the word in the sentence that the discourse entity stood in relation to. For &quot;SUBJ,&quot; &quot;OBJ,&quot; and &quot;TIME,&quot; this was generally the main verb of the sentence. For prepositions, the governing word was generally the noun or verb that the prepositional phrase modified. For adjectives and numbers, the governing word was generally the noun that was modified.</Paragraph> </Section> <Section position="2" start_page="0" end_page="10" type="sub_section"> <SectionTitle> 2.1 Ternary Expressions are better than the </SectionTitle> <Paragraph position="0"> BOWs approach, but...</Paragraph> <Paragraph position="1"> People advocating the supremacy of the TEs approach were reacting against the Bag of Words approach to IR/IE, in which words were wrongly regarded as entertaining a meaningful relation simply on the basis of topological criteria: normally a distance criterion, i.e. the greater or lesser proximity between the words to be related.
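The triple representation described above can be sketched as a simple data structure. This is our own minimal illustration, not Litkowski's actual code; the field names and the toy parse are invented for the example.

```python
# A minimal sketch of semantic relation triples: each triple pairs a
# discourse entity with the role it plays and the governing word it
# depends on. The parse input here is hand-built, not produced by a parser.
from typing import NamedTuple

class Triple(NamedTuple):
    entity: str      # discourse entity
    relation: str    # SUBJ, OBJ, TIME, NUM, ADJMOD, or a preposition
    governor: str    # the word the entity stands in relation to

def triples_for(parsed):
    """Wrap toy (entity, relation, governor) tuples as Triples."""
    return [Triple(*t) for t in parsed]

# "John went into a restaurant" as relation triples:
sentence = triples_for([
    ("John", "SUBJ", "went"),
    ("restaurant", "into", "went"),
])
```

Storing the role explicitly, rather than relying on word order, is what makes the database queryable by grammatical function later on.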
Intervening words might have already been discarded from the input text on the basis of stopword filtering. Stopword lists include all grammatical closed-class words of the language, considered useless for the main purposes of IR/IE practitioners seeing that they cannot be used to denote concepts. Stopwords constitute what is usually regarded as the noisy part of the channel in information theory. However, it is just because the redundancy of the information channel is guaranteed by the presence of grammatical words that the message gets appropriately computed by the subjects of the communication process, i.e. human beings. Besides, entropy is not to be computed in terms of the number of words or letters of the alphabet, but in the number of semantic and syntactic relations entertained by open-class words (nouns, verbs, adjectives, adverbials), basically by virtue of closed-class words. Redundancy should then be computed on the basis of the ambiguity intervening when enumerating those relations, a very hard task to accomplish which has never been attempted yet, at least to my knowledge.</Paragraph> <Paragraph position="2"> What people working with TEs noted was just the problem of encoding relations appropriately, at least some of these relations. The IR/IE BOWs approach suffers (at least) from the Reversible Arguments Problem (see [7]): What do frogs eat? vs What eats frogs? The verb &quot;eat&quot; entertains asymmetrical relations with its SUBJect and its OBJect: in one case we talk of the &quot;eater&quot;, the SUBJect, and in the other case of the &quot;eatee&quot;, the OBJect. Other similar problems occur with TEs when the two elements of the relation have the same head, as in: - The president of Russia visited the president of China. Who visited the president? The question will not be properly answered in lack of some intervening clarification dialogue, but the corresponding TEs should have more structure in order to be able to represent the internal relations of the two presidents.
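The Reversible Arguments Problem can be made concrete with a toy comparison, entirely our own illustration: a stemmed word bag collapses the two frog questions into the same set, while SUBJ/OBJ triples keep them apart.

```python
# Hypothetical illustration of the Reversible Arguments Problem: a
# bag-of-words match cannot tell "What do frogs eat?" from "What eats
# frogs?", whereas role-labelled triples preserve the asymmetry of "eat".

def bow(text):
    """Crude word bag: lowercase, strip plural 's' and '?', drop wh/aux words."""
    return {w.rstrip("s?") for w in text.lower().split()} - {"do", "what"}

q1 = bow("What do frogs eat?")
q2 = bow("What eats frogs?")
same_bag = (q1 == q2)          # the two questions reduce to one bag

# SUBJ/OBJ triples keep the eater/eatee distinction:
t1 = ("frogs", "SUBJ", "eat")  # What do frogs eat?  -> frogs are the eater
t2 = ("frogs", "OBJ", "eat")   # What eats frogs?    -> frogs are the eatee
distinct_triples = (t1 != t2)
```

The bag is identical in both cases, so any bag-based scorer must answer the two questions identically; the triples cannot be confused.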
The asymmetry of relation in transitive constructions involving verbs of accomplishment and achievement (or simply world-changing events) is however further complicated by a number of structural problems which are typically found in most languages of the world, the first and most common being Passive constructions: i. John killed Tom.</Paragraph> <Paragraph position="3"> ii. Tom was killed by a man.</Paragraph> <Paragraph position="4"> Who killed the man? The question would be answered by &quot;John&quot; in case the information available was represented by sentence i., but it would be answered by &quot;Tom&quot; in case the information available was represented by sentence ii. Obviously this would happen only in lack of sufficient NLP elaboration: a too shallow approach would not be able to capture the presence of a passive structure. We are here referring to &quot;Chunk&quot;-based approaches, those in which the object of computation is constituted by the creation of Noun Phrases and no attempt is made to compute clause-level structure.</Paragraph> <Paragraph position="5"> There is a certain number of other similar structures in texts which must be regarded as inducing the same type of miscomputation: i.e. taking the surface order of NPs as indicating the deep intended meaning.
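The passive example above can be sketched as a normalization step. This is our own toy code, not the paper's implementation: a clause-level analysis that knows about voice demotes the surface subject of a passive before filling the agent/patient slots, which a chunk-based approach cannot do.

```python
# Toy sketch of passive normalisation: without it, "Tom was killed by a man"
# would wrongly yield Tom as the killer, exactly the miscomputation the
# shallow chunk-based approach is criticised for above.

def deep_triple(subject, verb, obj=None, passive=False, by_agent=None):
    """Return (agent, verb, patient), demoting the surface subject of a passive."""
    if passive:
        # surface subject is the Affected Theme; the by-phrase (if any) is agent
        return (by_agent, verb, subject)
    return (subject, verb, obj)

active = deep_triple("John", "kill", obj="Tom")                       # i.
passive = deep_triple("Tom", "kill", passive=True, by_agent="a man")  # ii.
```

Both sentences now yield the same patient slot for Tom, so a question about the killer retrieves the agent, never the surface subject.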
In all of the following constructions the surface subject is on the contrary the deep object, i.e. the Affected Theme or argument that suffers the effects of the action expressed by the governing verb, rather than the Agent: Inchoatized structures; Ergativized structures; Impersonal structures. Other important and typical structures which constitute problematic cases for a surface-chunk-based TEs approach to text computation are the following ones, in which one of the arguments is missing and Control should be applied by a governing NP; they are called in one definition Open Predicative structures and they are: Relative clauses; Fronted Adjectival adjunct clauses; Infinitive clauses; Fronted Participial clauses; Gerundive Clauses; Elliptical Clauses; Coordinate constructions. In addition to that there is one further problem, definable as the Factuality Prejudice: by collecting keywords and TEs people apply a Factuality Presupposition to the text they are mining: they believe that all terms being recovered by the search represent real facts. This is however not true, and the problem is related to the possibility of detecting in texts the presence of such semantic indicators as those listed here below: Negation; Quantification; Opaque contexts (wish, want); Future, Subjunctive Mood; Modality; Conditionals. Finally there is a discourse-related problem, the Anaphora Resolution problem, which is the hardest to be tackled by NLP: it is a fact that anaphoric relations are the building blocks of cohesiveness and coherence in texts. Whenever an anaphoric link is missed, one relation will be assigned to a wrong referring expression, thus presumably jeopardising the possibility of answering a related question appropriately.
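A first line of defence against the Factuality Prejudice can be sketched as a marker scan over the clause feeding each triple. The marker lists and category names below are illustrative placeholders, not an exhaustive inventory and not the paper's method.

```python
# Rough sketch of factuality screening: before a mined triple is treated
# as a real fact, check its clause for the semantic indicators listed
# above (negation, opacity, modality, conditionals). Markers are toy lists.

FACTUALITY_MARKERS = {
    "negation":    {"not", "never", "no"},
    "opacity":     {"wish", "want", "hope"},
    "modality":    {"may", "might", "could", "should"},
    "conditional": {"if", "unless"},
}

def factuality_flags(clause):
    """Return the set of indicator categories present in the clause."""
    words = set(clause.lower().split())
    return {cat for cat, markers in FACTUALITY_MARKERS.items() if words & markers}

def is_presumably_factual(clause):
    return not factuality_flags(clause)
```

A triple extracted from a flagged clause should at minimum be stored with its indicators rather than asserted as a bare fact.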
Anaphora resolution is, we believe, the most relevant argument to be put forward in favour of the need for symbolic computational linguistic processing (besides statistical processing).</Paragraph> </Section> </Section> <Section position="4" start_page="10" end_page="14" type="metho"> <SectionTitle> 3 GETARUNS - the NLUS </SectionTitle> <Paragraph position="0"> GETARUNS, the System for Natural Language Understanding, produces a semantic representation in XML format, in which each sentence of the input text is divided up into predicate-argument structures where arguments and adjuncts are related to their appropriate head. Consider now a simple sentence like the following: (1) John went into a restaurant GETARUNS represents this sentence in different manners according to whether it is operating in Complete or in Shallow modality. In turn the operating modality is determined by its ability to compute the current text: in case of failure the system will switch automatically from Complete to Partial/Shallow modality.</Paragraph> <Paragraph position="1"> The system will produce a representation inspired by Situation Semantics [14], where reality is represented in Situations which are collections of Facts: in turn, facts are made up of Infons, which are information units characterised as follows: In addition, each Argument has a semantic identifier which is unique in the Discourse Model and is used to individuate the entity uniquely. Also propositional facts have semantic identifiers assigned, thus constituting second-level ontological objects. They may be &quot;quantified&quot; over by temporal representations but also by discourse-level operators, like subordinating conjunctions and a performative operator if needed. Negation, on the contrary, is expressed in each fact.</Paragraph> <Paragraph position="2"> In case of failure at the Complete level, the system will switch to Partial and the representation will be deprived of its temporal and spatial location information.
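The infon representation just described can be sketched as a small record type. The field names are our own guesses at a plausible encoding and do not reproduce GETARUNS's actual output format.

```python
# Minimal sketch of the Situation Semantics facts described above: each
# infon carries role-to-identifier argument bindings, a polarity (negation
# is expressed in each fact), and spatio-temporal indices that the Partial
# fallback strips away. Field names are illustrative, not GETARUNS's own.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Infon:
    relation: str
    args: dict                  # role -> unique semantic identifier, e.g. {"agent": "id1"}
    polarity: int = 1           # 1 = asserted, 0 = negated
    time_index: Optional[str] = None
    space_index: Optional[str] = None

# "John went into a restaurant": Complete modality keeps the indices,
# Partial modality (the failure fallback) drops spatio-temporal location.
complete = Infon("go", {"agent": "id1", "locative": "id2"},
                 time_index="t1", space_index="l1")
partial = Infon("go", {"agent": "id1", "locative": "id2"})
```

Making polarity a per-fact field, rather than an operator over facts, mirrors the remark that negation is expressed in each fact.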
In the current version of the system, we use Complete modality for tasks which involve short texts (like the students' summaries and text understanding queries), where text analyses may be supervised and updates to the grammar and/or the lexicon may be needed. For unlimited text from the web we only use Partial modality. Evaluations of the two modalities are reported in a section below.</Paragraph> <Section position="1" start_page="10" end_page="10" type="sub_section"> <SectionTitle> 3.1 The Parser and the Discourse Model </SectionTitle> <Paragraph position="0"> As said above, the query-building process needs an ontology, which is created from the translation of the Discourse Model built by GETARUNS in its Complete/Partial representation. GETARUNS is equipped with three main modules: a lower module for parsing, where sentence strategies are implemented; a middle module for semantic interpretation and discourse model construction, which is cast into Situation Semantics; and a higher module where reasoning and generation take place. The system works in Italian and English.</Paragraph> <Paragraph position="1"> Our parser is a rule-based deterministic parser in the sense that it uses a lookahead and a Well-Formed Substring Table to reduce backtracking. It also implements Finite State Automata in the task of tag disambiguation, and produces multiwords whenever lexical information allows it. In our parser we use a number of parsing strategies and graceful recovery procedures which follow a strictly parameterized approach to their definition and implementation. A shallow or partial parser is also implemented and always activated before the complete parse takes place, in order to produce the default baseline output to be used by further computation in case of total failure.
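The Well-Formed Substring Table mentioned above is essentially memoisation over (category, position) spans, so sub-parses, successful or failed, are never recomputed during backtracking. The toy grammar and recognizer below are invented for illustration and are far simpler than GETARUNS's rule-based parser.

```python
# A toy recognizer with a Well-Formed Substring Table: `table` caches the
# set of end positions each category can reach from each start position,
# so backtracking never re-derives the same span twice.

GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["john"], ["restaurant"]],
    "VP": [["went"], ["went", "NP"]],
}

def parse(cat, toks, i, table):
    """Return the end positions where `cat` can span toks[i:...], memoised."""
    key = (cat, i)
    if key in table:
        return table[key]          # WFST hit: reuse the stored sub-parse
    ends = set()
    for rule in GRAMMAR.get(cat, []):
        starts = {i}
        for sym in rule:           # thread each symbol through current frontiers
            nxt = set()
            for j in starts:
                if sym in GRAMMAR:
                    nxt |= parse(sym, toks, j, table)
                elif j < len(toks) and toks[j] == sym:
                    nxt.add(j + 1)
            starts = nxt
        ends |= starts
    table[key] = ends
    return ends

tokens = ["john", "went"]
ok = len(tokens) in parse("S", tokens, 0, {})
```

Even on this tiny grammar the table pays off: "NP" at position 0 is computed once although two "VP" rules could each request it on larger inputs.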
In that case partial semantic mapping will take place, where no Logical Form is built and only referring expressions are asserted in the Discourse Model - but see below.</Paragraph> </Section> <Section position="2" start_page="10" end_page="11" type="sub_section"> <SectionTitle> 3.2 Lexical Information </SectionTitle> <Paragraph position="0"> The output of the grammatical modules is then fed into the Binding Module (BM), which activates an algorithm for anaphoric binding in LFG terms (see [13]), using f-structures as domains and grammatical functions as entry points into the structure. We show here below the architecture of the system. The grammar is equipped with a lexicon containing a list of 300 wordforms derived from the Penn Treebank.</Paragraph> <Paragraph position="1"> However, morphological analysis for English has also been implemented and is used for OOV words. The system uses a core fully specified lexicon, which contains approximately 10,000 of the most frequent entries of English.</Paragraph> <Paragraph position="2"> In addition to that, there are all the lexical forms provided by a fully revised version of COMLEX. In order to take into account phrasal and adverbial verbal compound forms, we also use lexical entries made available by UPenn and TAG encoding. Their grammatical verbal syntactic codes have then been adapted to our formalism and are used to generate an approximate subcategorization scheme with an approximate aspectual class associated to it.</Paragraph> <Paragraph position="3"> Semantic inherent features for Out-of-Vocabulary words, be they nouns, verbs, adjectives or adverbs, are provided by a fully revised version of WordNet - 270,000 lexical entries - in which we used 75 semantic classes similar to those provided by CoreLex. Subcategorization information and Semantic Roles are then derived from carefully adapted versions of FrameNet and VerbNet.
Our &quot;training&quot; corpus is made up of 20,000 words and contains a number of texts taken from different genres, portions of the UPenn Treebank corpus, test-suites for grammatical relations, and sentences taken from the COMLEX manual. An evaluation carried out on the Susanne-Corpus-related GREVAL testsuite, made up of 500 sentences, has recently been reported [12] to have achieved 90% F-measure over all major grammatical relations. We achieved a similar result with the shallow cascaded parser, limited though to only SUBJect and OBJect relations, on the LFG-XEROX 700 corpus.</Paragraph> </Section> <Section position="3" start_page="11" end_page="12" type="sub_section"> <SectionTitle> 3.3 The Upper Module </SectionTitle> <Paragraph position="0"> GETARUNS, as shown in Fig. 2, has a linguistically-based semantic module which is used to build up the Discourse Model. Semantic processing is strongly modularized and distributed amongst a number of different submodules which take care of Spatio-Temporal Reasoning, Discourse-Level Anaphora Resolution, and other subsidiary processes like the Topic Hierarchy, which will impinge on Relevance Scoring when creating semantic individuals. These are then asserted in the Discourse Model (hence the DM), which is then used to solve nominal coreference together with WordNet. Semantic Mapping is performed in two steps: at first a Logical Form is produced, which is a structural mapping from DAGs onto unscoped well-formed formulas. These are then turned into situation semantics informational units, infons, which may become facts or sits.</Paragraph> <Paragraph position="1"> In each infon, Arguments each have a semantic identifier which is unique in the DM and is used to individuate the entity. Also propositional facts have semantic identifiers assigned, thus constituting second-level ontological objects. They may be &quot;quantified&quot; over by temporal representations but also by discourse-level operators, like subordinating conjunctions.
Negation, on the contrary, is expressed in each fact. All entities and their properties are asserted in the DM with the relations in which they are involved; in turn the relations may have modifiers, i.e. sentence-level adjuncts, and entities may also have modifiers or attributes. Each entity has a polarity and a couple of spatiotemporal indices, which are linked to main temporal and spatial locations if any exist; else they are linked to a presumed time reference derived from tense and aspect computation. Entities are mapped into semantic individuals with the following ontology: on first occurrence of a referring expression, it is asserted as an INDividual if it is a definite or indefinite expression; it is asserted as a CLASS if it is quantified (depending on quantifier type) or has no determiner. Special individuals are ENTs, which are associated to discourse-level anaphora which bind relations and their arguments.</Paragraph> <Paragraph position="2"> Finally, we have LOCs for main locations, both spatial and temporal. Whenever there is cardinality determined by a digit, its number is plural or it is quantified (depending on quantifier type), the referring expression is asserted as a SET. Cardinality is simply inferred in case of a naked plural: in case of a collective nominal expression it is set to 10, otherwise to 5.
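The first-occurrence assertion rules above can be condensed into a small decision function. This is a deliberate simplification under invented feature names: in particular, the text says quantified expressions go to CLASS or SET depending on quantifier type, which we collapse here into a single branch.

```python
# Condensed sketch of the first-occurrence ontology: map a referring
# expression's features to IND, CLASS, SET or LOC, with the default plural
# cardinalities from the text (10 for collectives, 5 otherwise). The
# boolean feature encoding is our own simplification.

def assert_entity(definite=False, indefinite=False, quantified=False,
                  plural=False, cardinal=None, collective=False, location=False):
    """Return (semantic type, cardinality) for a first-occurrence expression."""
    if location:
        return ("LOC", None)
    if cardinal is not None or plural or quantified:
        # naked plural: infer default cardinality
        n = cardinal if cardinal is not None else (10 if collective else 5)
        return ("SET", n)
    if definite or indefinite:
        return ("IND", 1)
    return ("CLASS", None)      # bare singular with no determiner
```

A real implementation would also consult the quantifier type and assert the result, with its unique identifier, into the DM.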
On second occurrence of the same nominal head, the semantic index is recovered from the history list and the system checks whether it is the same referring expression: - in case it is definite or indefinite with a predicative role and has no attributes nor modifiers, nothing is done; - in case it has different number - singular - and the one present in the DM is a set or a class, nothing happens; - in case it has attributes and modifiers which are different and the one present in the DM has none, nothing happens; - in case it is a quantified expression and has no cardinality, and the one present in the DM is a set or a class, again nothing happens.</Paragraph> <Paragraph position="3"> In all other cases a new entity is asserted in the DM, which however is also computed as being included in (a superset of) or including (a subset of) the previous entity.</Paragraph> <Paragraph position="4"> The upper module of GETARUNS has been evaluated on the basis of its ability to perform anaphora resolution and to individuate referring expressions, with a corpus of 40,000 words: it achieved 74% F-measure.</Paragraph> <Paragraph position="5"> 4. Two experiments with GETARUNS As an example of the shallow system, we discuss here below the analysis of a newspaper article which, as would usually be the case, has a certain number of pronominal expressions, which modify the relevance of lexical descriptions in the overall processing for the search of either &quot;Named Entities&quot; or simply entities individuated by common nouns. If the count is based solely on lexical lemmata and not on the presence of coreferential pronominal expressions, the results will be heavily biased and certainly wrong.
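The bias just described can be illustrated with a toy topic count, using invented tokens and antecedent links: counting lexical lemmata alone versus crediting each resolved pronoun to its antecedent.

```python
# Illustration of the lemma-count bias: without coreference, frequent
# surface lemmata dominate; with pronouns credited to their antecedents,
# the entity actually talked about wins. Data are invented for the example.
from collections import Counter

def topic_counts(lemmata, pronoun_antecedents=None):
    """Count topics, optionally crediting resolved pronouns to antecedents."""
    counts = Counter(lemmata)
    for antecedent in (pronoun_antecedents or []):
        counts[antecedent] += 1   # each pronoun counts for its antecedent
    return counts

lemmata = ["party", "internet", "party", "author"]
naive = topic_counts(lemmata).most_common(1)[0][0]
# three pronouns ("they", "they", "this") all resolving to "author":
resolved = topic_counts(lemmata, ["author", "author", "author"]).most_common(1)[0][0]
```

This mirrors the newspaper example below, where "party, internet" win on raw lemmata but "researcher, author" emerge once pronouns are resolved.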
Here is the text: propaganda with boring content and largely ignoring interactivity.&quot; 7.The report concludes: &quot;The new media is a way for them to get closer to the public without necessarily allowing the public to become overly familiar in return.</Paragraph> <Paragraph position="6"> 8.The authors - Rachel Gibson and Stephen Ward - go on to state that this may be because parties still regard the web as an electioneering tool, rather than as a democratic device.</Paragraph> <Paragraph position="7"> 9.They said: &quot;Very few offered original material, or changed their sites noticeably over the course of the campaign.</Paragraph> <Paragraph position="8"> 10.Indeed, a large majority of local sites were really no more than static electronic brochures.&quot; 11.They dub this &quot;rather disappointing&quot;, but praise the Liberal Democrats as &quot;clearly the most active&quot; with around 150 sites. The report concludes: &quot;Parties, as with the general public, need incentives to use the technology.</Paragraph> <Paragraph position="9"> 12.As yet, there seems more to lose and less to gain if they make mistakes experimenting with the technology.&quot; We highlighted pronominal expressions in bold. In a BOWs approach, the count for the most relevant topics is solely based on lexical descriptions, and &quot;party, internet&quot; are computed as the most important keywords. However, after the text has been passed through the partial semantic analysis, &quot;researcher, author&quot; come up as important topics. We report here below the output of the Anaphora Resolution module, in interaction with the Discourse Model where semantic indices are asserted for each entity. Sentence numbers are taken from the text. We report Anaphora Resolution decisions: in particular, in sentences where a pronoun is coreferred to an antecedent, the antecedent is set as the current Main Topic and its semantic ID is used.</Paragraph> <Paragraph position="10"> 1.
state(1, change) topics: main: party, secondary: internet topics(1, main, id1; secondary, id2; potential, id3) 2. state(2, continue) topics: main: party, secondary: survey topics(2, main, id1; secondary, id7; potential, id2) 3. state(3, retaining) topics: main: researcher, secondary: party topics(3, main, id18; secondary, id1; potential, id19) 4. Anaphora Resolution: their resolved as researcher state(4, continue) topics: main: researcher, secondary: contest topics(4, main, id18; secondary, id26; potential, id27) 5. state(5, retaining) topics: main: report, secondary: researcher topics(5, main, id7; secondary, id18; potential, id1) 6. Anaphora Resolution: it resolved as report state(6, continue) topics: main: report, secondary: party topics(6, main, id7; secondary, id1; potential, id40) 7. state(7, continue) topics: main: report, secondary: party topics(7, main, id7; secondary, id1; potential, id2) 8. The authors - Rachel Gibson and Stephen Ward - go on to state that this may be because parties still regard the web as an electioneering tool, rather than as a democratic device.</Paragraph> <Paragraph position="11"> Anaphora Resolution: this resolved as 'discourse bound' state(8, retaining) topics: main: author, secondary: report topics(8, main, id54; secondary, id7; potential, id5) 9. Anaphora Resolution: they resolved as author state(9, continue) topics: main: author, secondary: material topics(9, main, id54; secondary, id61; potential, id62) 10. state(10, continue) topics: main: author, secondary: site topics(10, main, id54; secondary, id67; potential, id68) 11. Anaphora Resolution: this resolved as 'discourse bound'; they resolved as author state(11, retaining) topics: main: author, secondary: active topics(11, main, id54; secondary, id71; potential, id72) 12.
Anaphora Resolution: they resolved as party state(12, continue) topics: main: party, secondary: mistake topics(12, main, id1; secondary, id78)</Paragraph> </Section> <Section position="4" start_page="12" end_page="13" type="sub_section"> <SectionTitle> 4.1 The First Experiment: Anaphora Resolution in Technical Manuals </SectionTitle> <Paragraph position="0"> We downloaded the only freely available corpus annotated with anaphoric relations, i.e. Wolverhampton's Manual Corpus, made available by Prof. Ruslan Mitkov on his website. The corpus contains text from manuals at the following address: http://clg.wlv.ac.uk/resources/corpus.html We report in Tab. 2 the general data of the Coreference Corpus. As can be easily noted, there is no direct relationship between the number of referring expressions and the number of coreferring expressions. We assume that the higher the number of coreferring expressions in a text, the higher the cohesion achieved. Thus the text identified as CDROM has a very small number of coreferring expressions if compared to the total number of referring expressions. The proportion of referring expressions to words and of coreferring expressions to referring expressions is reported in percent values in table 3,
where the most highly cohesive texts are highlighted in italics and highly non-cohesive texts are highlighted in bold. The final results are reported in the following figure, where we plot Precision and Recall for each text and then the comprehensive values.</Paragraph> </Section> <Section position="5" start_page="13" end_page="14" type="sub_section"> <SectionTitle> 4.2 GETARUNS approach to WEB-Q/A </SectionTitle> <Paragraph position="0"> Totally shallow approaches, when compared to ours, will always lack sufficient information for semantic processing at the propositional level: in other words, as happens with our &quot;Partial&quot; modality, there will be no possibility of checking for precision in producing predicate-argument structures.</Paragraph> <Paragraph position="1"> Most systems would use some Word Matching algorithm to count the number of words appearing in both the question and the sentence being considered, after stripping stopwords: usually two words will match if they share the same morphological root after some stemming has taken place. Most QA systems presented in the literature rely on the classification of words into two classes: function and content words. They do not make use of a Discourse Model where the input text has been transformed via a rigorous semantic mapping algorithm: they rather access tagged input text in order to sort best-matched words, phrases or sentences according to some scoring function. It is an accepted fact that introducing or increasing the amount of linguistic knowledge over crude IR-based systems will contribute substantial improvements.
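The Word Matching baseline described above can be sketched for contrast. The stoplist and suffix-stripping "stemmer" below are toy versions, not any particular system's resources.

```python
# Sketch of the Word Matching baseline: strip stopwords, reduce the rest
# to crude stems, and score a candidate sentence by stem overlap with the
# question. This is exactly the approach the surrounding text criticises.

STOPWORDS = {"the", "a", "an", "was", "in", "of", "who", "when", "to"}

def stem(word):
    """Toy suffix stripper standing in for real stemming."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def content_stems(text):
    words = text.lower().replace("?", "").split()
    return {stem(w) for w in words if w not in STOPWORDS}

def overlap_score(question, sentence):
    """Number of shared content stems between question and candidate."""
    return len(content_stems(question) & content_stems(sentence))

score = overlap_score("When was Abraham Lincoln born?",
                      "Abraham Lincoln was born in a log cabin in Kentucky.")
```

Note what the score ignores: every candidate mentioning the same stems ranks equally, regardless of who did what to whom, which is precisely the objection raised below.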
In particular, systems based on simple Named-Entity identification tasks are too rigid to be able to match the phrase relation constraints often involved in a natural language query.</Paragraph> <Paragraph position="2"> We raise a number of objections to these approaches: the first objection is the impossibility of taking into account pronominal expressions, and their relations and properties as belonging to the antecedent, if no head transformation has taken place during the analysis process.</Paragraph> <Paragraph position="3"> Another objection comes from the treatment of the Question: it is usually the case that QA systems divide the question to be answered into two parts: the Question Target represented by the wh-word and the rest of the sentence; otherwise the words making up the yes/no question are taken in their order, and then a match takes place in order to identify the most likely answers in relation to the rest/whole of the sentence except for stopwords. However, it is just the semantic relations that need to be captured, and not just the words making up the question that matter. Some systems have implemented more sophisticated methods (notably [8;9;10]) using syntactic-semantic question analysis. This involves a robust syntactic-semantic parser to analyze the question and candidate answers, and a matcher that combines word- and parse-tree-level information to identify answer passages more precisely.</Paragraph> </Section> <Section position="6" start_page="14" end_page="14" type="sub_section"> <SectionTitle> 4.3 A Prototype Q/A system for the web </SectionTitle> <Paragraph position="0"> We experimented with our approach over the web using 450 factoid questions from TREC. On a first run the base system only used an off-the-shelf tagger in order to recover the main verb from the query. In this way we managed to get 67% correct results, by this meaning that the correct answer was contained in the best five snippets selected by the BOWs system from the output of the Google API.
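The reranking step applied to those five snippets can be sketched schematically. The relation-matching function below is a stand-in for the actual ADM lookup, and the relation tuples are invented for the Mandela example.

```python
# Schematic sketch of snippet reranking: the five snippets returned by the
# bag-of-words front end are reordered by how many predicate-argument
# relations extracted from each snippet match those of the question.
# The relation inventory here is a toy stand-in for the ADM contents.

def rerank(question_relations, snippets):
    """snippets: list of (label, extracted_relations) in BOWs order."""
    def matches(relations):
        return len(set(relations) & set(question_relations))
    return sorted(snippets, key=lambda s: matches(s[1]), reverse=True)

question = [("president", "of", "South_Africa"), ("elect", "OBJ", "president")]
snippets = [
    # snippet 1: "president" relation holds of the wrong entity, so no match
    ("snippet 1", [("president", "of", "homeland")]),
    ("snippet 4", [("president", "of", "South_Africa"),
                   ("elect", "OBJ", "president")]),
]
best = rerank(question, snippets)[0][0]
```

Because the score counts matched relations rather than shared words, a snippet mentioning two presidents and two dates cannot outrank one whose relations actually bind the queried entity.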
However, only 30% of the total correct results had the right snippet ranked in position one.</Paragraph> <Paragraph position="1"> Then we applied GETARUNS shallow to the best five snippets with the intent of improving the automatic ranking of the system and having the best snippet always positioned first. Here below is a figure showing the main components of the GETARUNS-based analysis.</Paragraph> <Paragraph position="2"> We will present two examples and discuss them in some detail. The questions are the following ones: Q: Who was elected president of South Africa in 1994? A: Nelson Mandela Q: When was Abraham Lincoln born? A: Lincoln was born February_12_1809 The answers produced by our system are indicated after each question. Now consider the best five snippets as filtered by the BOWs system: yesterday called by the South African government and the Transitional Executive Council to smooth the way for the peaceful reincorporation of the homeland into South Africa following the resignation of Oupa Gqozo as president.</Paragraph> <Paragraph position="3"> Notice snippet n.1, where two presidents are present and two dates are reported for each one: however, the relation &quot;president&quot; is only indicated for the wrong one, Mbeki, and the system rejects it. The answer is collected from snippet no.4 instead. As a matter of fact, after computing the ADM, the system decides to rerank the snippets and use the contents of snippet 4 for the answer. Now the second question: when/WRB was/VBD abraham/N lincoln/N born/VBN Main keywords: abraham lincoln Verb roots: bear Google search: abraham lincoln born 1. Abraham Lincoln was born in a log cabin in Kentucky to Thomas and Nancy Lincoln.</Paragraph> <Paragraph position="4"> 2. Two months later on February 12, 1809, Abraham Lincoln was born in a one-room log cabin near the Sinking Spring.</Paragraph> <Paragraph position="5"> 3.
Abraham Lincoln was born in a log cabin near Hodgenville, Kentucky.</Paragraph> <Paragraph position="6"> 4. Lincoln himself set the date of his birth at Feb. 12, 1809, though some have attempted to disprove that claim.</Paragraph> <Paragraph position="7"> 5. A. Lincoln (February 12, 1809 - April 15, 1865) was the 16th president of the United States of America.</Paragraph> <Paragraph position="8"> In this case, snippet n.2 is selected by the system as the one containing the required information to answer the question. In both cases, the answer is built from the ADM, so it is not precisely the case that the snippets are selected for the answer: they are nonetheless reranked to make the answer available.</Paragraph> </Section> </Section> <Section position="5" start_page="14" end_page="15" type="metho"> <SectionTitle> 5. System Evaluation </SectionTitle> <Paragraph position="0"> After running with GETARUNS, the 450 questions recovered the whole of the original 67% correct results from the first snippet.</Paragraph> <Paragraph position="1"> The complete system has been tested with a set of texts derived from newspapers, narrative texts, and children's stories. The performance is 75% correct. However, updating and tuning of the system is required for each new text whenever a new semantic relation is introduced by the parser and the semantics does not provide the appropriate mapping. For instance, consider the case of the constituent &quot;holes in the tree&quot;, where the syntax produces the appropriate structure but the semantics does not map &quot;holes&quot; as being in a LOCATion semantic relation with &quot;tree&quot;. In lack of such semantic role information, a dummy &quot;MODal&quot; will be produced, which however will not generate the adequate semantic mapping in the DM and the meaning is lost.</Paragraph> <Paragraph position="2"> As to the partial system, it has been used for the DUC summarization contest, i.e.
it has run over approximately 1 million words, including training and test sets, for a number of sentences totaling over 50K. We tested the &quot;Partial&quot; modality with an additional 90,000 words of text taken from the testset made available by the DUC 2002 contest. On a preliminary perusal of samples of the results, we calculated 85% Precision on parsing and 70% on semantic mapping. However, evaluating full results requires a manually annotated database in which all linguistic properties have been carefully decided by human annotators. In lack of such a database, we are unable to provide precise performance data. The system has also been used for the RTE Challenge and performance was over 60% correct [1].</Paragraph> </Section> </Paper>