<?xml version="1.0" standalone="yes"?> <Paper uid="W04-0910"> <Title>Paraphrastic Grammars</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> A salient feature of natural language is that it allows paraphrases, that is, it allows different verbalisations of the same content. Thus, although the various verbalisations in (1) may have different pragmatic or communicative values (with respect, for instance, to topicalisation, presuppositions or focus/ground partitioning), they all share a core semantic content, the content approximated by a traditional Montagovian compositional semantics.</Paragraph> <Paragraph position="1"> (1) a. La croisière coûte cher. Lit. the cruise is expensive</Paragraph> <Paragraph position="2"> b. Le coût de la croisière est élevé. Lit. the cost of the cruise is high</Paragraph> <Paragraph position="3"> c. La croisière a un coût élevé. Lit. the cruise has a high cost Linguists have long noticed the pervasiveness of paraphrases in natural language and attempted to characterise it. Thus, for instance, Chomsky's &quot;transformations&quot; capture the relation between one core meaning (a deep structure in Chomsky's terms) and several surface realisations (for instance, between the passive and the active form of the same sentence), while (Mel'čuk, 1988) presents sixty paraphrastic rules designed to account for paraphrastic relations between sentences.</Paragraph> <Paragraph position="4"> More recently, work in information extraction (IE) and question answering (QA) has triggered renewed research interest in paraphrases, since IE and QA systems typically need to recognise various verbalisations of the same content. Because of the large, open-domain corpora these systems deal with, coverage and robustness are key issues, and much of the work on paraphrases in that domain is based on automatic learning techniques. 
For instance, (Lin and Pantel, 2001) acquire two-argument templates (inference rules) from corpora using an extended version of distributional analysis in which paths in dependency trees that have similar arguments are taken to be close in meaning. Similarly, (Barzilay and Lee, 2003) and (Shinyama et al., 2002) learn sentence-level paraphrase templates from a corpus of news articles stemming from different news sources. Finally, (Glickman and Dagan, 2003) use clustering and similarity measures to identify similar contexts in a single corpus and extract verbal paraphrases from these contexts.</Paragraph> <Paragraph position="5"> Such machine learning approaches have known pros and cons. On the one hand, they produce large-scale resources at little cost in manual labour. On the other hand, the degree of descriptive abstraction offered by the list of inference or paraphrase rules they output is low.</Paragraph> <Paragraph position="6"> We chose to investigate an alternative research direction by aiming to develop a &quot;paraphrastic grammar&quot;, that is, a grammar which captures the paraphrastic relations between linguistic structures (see footnote 1). Footnote 1: As we shall briefly discuss in section 4, the grammar is developed with the help of a metagrammar (Candito, 1999), thus ensuring an additional level of abstraction. The metagrammar is an abstract specification of the linguistic properties (phrase structure, valency, realisation of grammatical functions, etc.) encoded in the grammar's basic units. This specification is then compiled to automatically produce a specific grammar.</Paragraph> <Paragraph position="7"> Based on a computational grammar that associates natural language expressions with both a syntactic and a semantic representation, a paraphrastic grammar is a grammar that moreover associates paraphrases with the same semantic representation. 
That is, in contrast to machine-learning-based approaches, which relate paraphrases via sentence patterns, the paraphrastic grammar approach relates paraphrases via a common semantic representation. In this way, the paraphrastic approach provides an interesting alternative basis for generation from conceptual representations and for the inference-based, deep semantic processing that is ultimately needed for high-quality question answering.</Paragraph> <Paragraph position="8"> Specifically, we aim to develop a paraphrastic grammar for French, based on the Tree Adjoining Grammar (TAG) developed for this language by Anne Abeillé (Abeillé, 2002).</Paragraph> <Paragraph position="9"> The paper is structured as follows. We start by proposing a typology of the paraphrastic means made available by natural language. We then show how this typology can be used to build a test suite for developing and evaluating a paraphrastic grammar. Finally, we highlight some of the issues that arise when developing a paraphrastic grammar.</Paragraph> </Section> </Paper>