<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-1621">
<Title>Lexical Reference: a Semantic Matching Subtask</Title>
<Section position="4" start_page="172" end_page="172" type="intro">
<SectionTitle>2 Background</SectionTitle>
<Paragraph position="0"/>
<Section position="1" start_page="172" end_page="172" type="sub_section">
<SectionTitle>2.1 Term Matching</SectionTitle>
<Paragraph position="0">Thesaurus-based term expansion is a commonly used technique for enhancing the recall of NLP systems and coping with lexical variability. Expansion consists of altering a given text (usually a query) by adding terms of similar meaning.</Paragraph>
<Paragraph position="1">WordNet is commonly used as a source of related words for expansion. For example, many QA systems perform expansion in the retrieval phase using query-related words based on WordNet's lexical relations, such as synonymy or hyponymy (e.g., (Harabagiu et al., 2000; Hovy et al., 2001)). Lexical similarity measures (e.g., (Lin, 1998)) have also been suggested for measuring semantic similarity. They are based on the distributional hypothesis, which holds that words occurring within similar contexts are semantically similar.</Paragraph>
</Section>
<Section position="2" start_page="172" end_page="172" type="sub_section">
<SectionTitle>2.2 Textual Entailment</SectionTitle>
<Paragraph position="0">The Recognising Textual Entailment (RTE-1) challenge (Dagan et al., 2006) is an attempt to promote an abstract generic task that captures major semantic inference needs across applications. The task requires recognizing, given two text fragments, whether the meaning of one text can be inferred (entailed) from the other. Different techniques and heuristics were applied to the RTE-1 dataset to specifically model textual entailment. Interestingly, a number of works (e.g., (Bos and Markert, 2005; Corley and Mihalcea, 2005; Jijkoun and de Rijke, 2005; Glickman et al., 2006)) applied or utilized lexically based word-overlap measures. Various word-to-word similarity measures were applied, including distributional similarity (such as (Lin, 1998)), web-based co-occurrence statistics, and WordNet-based similarity measures (such as (Leacock et al., 1998)).</Paragraph>
</Section>
<Section position="3" start_page="172" end_page="172" type="sub_section">
<SectionTitle>2.3 Paraphrase Acquisition</SectionTitle>
<Paragraph position="0">A substantial body of work has been dedicated to learning patterns of semantic equivalency between different language expressions, typically considered paraphrases. Recently, several works have addressed the task of acquiring paraphrases (semi-)automatically from corpora. Most attempts were based on identifying corresponding sentences in parallel or 'comparable' corpora, where each corpus is known to include texts that largely correspond to texts in another corpus (e.g., (Barzilay and McKeown, 2001)). Distributional similarity was also used to identify paraphrase patterns from a single corpus rather than from a comparable set of corpora (Lin and Pantel, 2001). Similarly, (Glickman and Dagan, 2004) developed statistical methods that match verb paraphrases within a regular corpus.</Paragraph>
</Section>
</Section>
</Paper>
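
The thesaurus-based expansion described in Section 2.1 can be made concrete with a short sketch. The snippet below is a minimal illustration, assuming NLTK's WordNet interface; the function names, the cap on related terms, and the restriction to synonymy and hyponymy are illustrative choices, not the implementation of any system cited above.

# A minimal sketch of thesaurus-based query expansion, assuming NLTK's
# WordNet corpus reader (requires nltk.download('wordnet') once).
from nltk.corpus import wordnet as wn

def expand_term(term, max_related=5):
    """Collect synonym and hyponym lemmas for a single query term."""
    related = set()
    for synset in wn.synsets(term):
        # Synonymy: other lemmas in the same synset.
        for lemma in synset.lemmas():
            related.add(lemma.name().replace('_', ' '))
        # Hyponymy: lemmas of more specific synsets.
        for hypo in synset.hyponyms():
            for lemma in hypo.lemmas():
                related.add(lemma.name().replace('_', ' '))
    related.discard(term)
    return sorted(related)[:max_related]

def expand_query(query):
    """Alter a query by adding terms of similar meaning."""
    terms = query.split()
    expanded = list(terms)
    for t in terms:
        expanded.extend(expand_term(t))
    return ' '.join(expanded)

print(expand_query("car accident"))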
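
The distributional hypothesis mentioned at the end of Section 2.1 can likewise be sketched. The example below uses a plain cosine over context-count vectors, which is simpler than Lin's (1998) information-theoretic measure over dependency triples; the window size and toy corpus are assumptions made for illustration.

# A minimal sketch of distributional similarity: represent each word by
# counts of its context words, then compare words by cosine similarity.
import math
from collections import Counter, defaultdict

def context_vectors(sentences, window=2):
    """Count context words within a fixed window around each token."""
    vectors = defaultdict(Counter)
    for sent in sentences:
        toks = sent.lower().split()
        for i, w in enumerate(toks):
            lo, hi = max(0, i - window), min(len(toks), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[w][toks[j]] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = math.sqrt(sum(c * c for c in u.values())) * \
           math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0

corpus = [
    "the doctor treated the patient",
    "the physician treated the patient",
    "the doctor examined the patient",
]
vecs = context_vectors(corpus)
# 'doctor' and 'physician' share most contexts, so similarity is high.
print(cosine(vecs["doctor"], vecs["physician"]))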
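
The lexically based word-overlap measures surveyed in Section 2.2 reduce to a simple idea: score how much of the hypothesis is covered by the text. The sketch below is a hedged illustration of that idea; the tokenizer and the 0.75 threshold are arbitrary assumptions, not the method of any cited RTE-1 system.

# A minimal word-overlap entailment score: the fraction of hypothesis
# tokens that also appear in the text, thresholded to a yes/no decision.
import re

def tokens(s):
    """Lowercase word tokens; a deliberately simple tokenizer."""
    return set(re.findall(r"[a-z']+", s.lower()))

def overlap_score(text, hypothesis):
    """Fraction of hypothesis words covered by the text."""
    h = tokens(hypothesis)
    return len(h & tokens(text)) / len(h) if h else 0.0

def entails(text, hypothesis, threshold=0.75):
    """Decide entailment by thresholding the overlap score."""
    return overlap_score(text, hypothesis) >= threshold

text = "The company reported a sharp rise in quarterly profits."
hyp = "The company profits rose."
print(overlap_score(text, hyp))  # 0.75: 'rose' is not covered by 'rise'
print(entails(text, hyp))        # True at the illustrative threshold

Note that pure overlap misses the 'rise'/'rose' match, which is exactly why the works cited above plugged word-to-word similarity measures into such scores.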