<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-1606">
<Title>Normalization and Paraphrasing Using Symbolic Methods</Title>
<Section position="7" start_page="0" end_page="0" type="evalu">
<SectionTitle> 4 Evaluation </SectionTitle>
<Paragraph position="0"> We decided to perform two kinds of evaluation. First, we wanted to check whether our system correctly performs the extraction of the selected information.</Paragraph>
<Paragraph position="1"> Second, we wanted to verify the impact of the normalization and corpus-oriented paraphrase modules on the obtained results.</Paragraph>
<Section position="1" start_page="0" end_page="0" type="sub_section">
<SectionTitle> 4.1 Performance of the whole system for information extraction </SectionTitle>
<Paragraph position="0"> In order to evaluate the results of the information extraction system, we apply the full information extraction chain to an unseen collection of 30 texts describing toxic substances. We then associate the output predicates with the corresponding texts and ask each of the five evaluators to compare six text/predicate pairs. We ask them to read the texts carefully and to fill in a table covering the different types of information in scope, i.e. substance, physical form, colour, odor, synonyms, physical properties, and use. For each topic, they have to state what is missing, superfluous, or wrong in the list of predicates compared to the original texts. We count one missing answer for each piece of missing information detected by the evaluators, and one incorrect response for each piece of information extracted by the system that does not correspond to any realization in the text. We then compute precision and recall, obtaining the following results: We obtain high precision, which could be expected given our IE methodology. In most cases, when information has been extracted, it is correct. Most of the remaining problems are a consequence of insufficient coverage of both the extraction grammar (problems with structural ambiguity) and the domain knowledge. The main sources of error identified during the evaluation are the following. Coordination detection problems: from the sentence Hexachlorobutadiene is also used as a solvent, and to make lubricants, in gyroscopes, as a heat transfer liquid, and as an hydraulic fluid, the system detects only one &quot;use&quot; of the element, USE(Hexachlorobutadiene,solvent,as,NONE,NONE), because the complex coordination has not been resolved.</Paragraph>
<Paragraph position="1"> Scope of the extraction: from the sentence Nitrobenzene is used in the manufacture of dyes, the system extracts USE(Nitrobenzene,manufacture,in,NONE,NONE), because the PP of dyes was not expected in the structure of the USE predicate.</Paragraph>
<Paragraph position="2"> Domain-knowledge coverage: from the sentence Acetone completely miscible in water and soluble in organics., the system extracts PROPERTY(Acetone,dissolve,in,organic,NONE), because soluble is encoded as a property equivalent to dissolve in the lexical relations for paraphrasing. However, it should also extract PROPERTY(Acetone,mix,in,water,NONE), but miscible was not coded as a possible chemical-property adjective.</Paragraph>
<Paragraph position="3"> From the evaluation results, it appears that further development needs to focus on improving recall. This could be achieved by: extending our paraphrase detection module, since some equivalences have not yet been considered (for instance, take fire, which did not appear in the working corpus, appeared in the test corpus; this expression had not been coded as a possible equivalent of burn, so the expected information about the physical property of burning for a given element is missing when this property is expressed in the text by take fire); enriching the ontological knowledge of the domain; and improving structural ambiguity resolution, since coordination and PP attachment resolution could be improved by developing more fine-grained semantic and ontological resources.</Paragraph>
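<Paragraph position="4"> To make the scoring procedure above concrete, here is a minimal illustrative sketch in Python (not the authors' code); it assumes predicates are represented as tuples mirroring the USE(...) and PROPERTY(...) notation used in this section, and that the evaluators' judgements yield a gold-standard predicate set per text:

# Minimal sketch (illustrative assumption, not the authors' code):
# scoring extracted predicates against a gold-standard set derived
# from the evaluators' judgements. The tuple layout
# (NAME, substance, value, prep, arg, arg) mirrors the predicate
# notation used in this section.

def precision_recall(system_preds, gold_preds):
    system, gold = set(system_preds), set(gold_preds)
    correct = system & gold    # extracted and attested in the text
    missing = gold - system    # counted as missing answers
    incorrect = system - gold  # no realization in the text
    precision = len(correct) / len(system) if system else 0.0
    recall = len(correct) / len(gold) if gold else 0.0
    return precision, recall

# The Acetone example above: "mix in water" is missed because
# "miscible" was not coded as a chemical-property adjective.
gold = {("PROPERTY", "Acetone", "dissolve", "in", "organic", "NONE"),
        ("PROPERTY", "Acetone", "mix", "in", "water", "NONE")}
system = {("PROPERTY", "Acetone", "dissolve", "in", "organic", "NONE")}
print(precision_recall(system, gold))  # (1.0, 0.5): high precision, low recall
</Paragraph>
</Section>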
<Section position="2" start_page="0" end_page="0" type="sub_section">
<SectionTitle> 4.2 Impact of the normalization and corpus-oriented paraphrase modules </SectionTitle>
<Paragraph position="0"> This second experiment was intended to verify to what extent the normalization and paraphrase detection modules affect the results obtained in the previous evaluation. The test was performed by removing from the complete processing chain the modules described in sections 3.2 and 3.3.2. The results show that we obtain only about 60% of the predicates found in the first version. In other words, without these processing steps, recall decreases dramatically. All predicates found in this second experiment were also found in the first. The predicates missing from the second experiment were the most complex to extract (i.e. USE, PROPERTY, ORIGIN), since they heavily involve reformulations and lexical equivalences.</Paragraph>
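<Paragraph position="1"> As an illustration of how the two runs can be compared, here is a minimal sketch (again illustrative Python, not the authors' code, with predicate tuples as in the sketch in 4.1):

# Minimal sketch: comparing the full chain against the ablated chain
# (normalization and corpus-oriented paraphrase modules removed).

def ablation_report(full_run, ablated_run):
    full, ablated = set(full_run), set(ablated_run)
    # The evaluation reports that every predicate from the ablated
    # run also appears in the full run:
    assert ablated <= full
    retained = len(ablated) / len(full) if full else 0.0  # ~0.60 reported
    lost = full - ablated  # concentrated in USE, PROPERTY, ORIGIN
    return retained, lost
</Paragraph>
</Section>
</Section>
</Paper>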