<?xml version="1.0" standalone="yes"?> <Paper uid="W06-1202"> <Title>Measuring MWE Compositionality Using Semantic Annotation</Title> <Section position="4" start_page="2" end_page="3" type="intro"> <SectionTitle> 2 Related Work </SectionTitle> <Paragraph position="0"> In recent years, various approaches to the analysis of MWE compositionality have been proposed. Many of them employ statistical algorithms.</Paragraph> <Paragraph position="1"> One of the earliest studies in this area was reported by Lin (1999), who assumed that &quot;non-compositional phrases have a significantly different mutual information value than the phrases that are similar to their literal meanings&quot; and proposed identifying non-compositional MWEs in a corpus on the basis of their distributional characteristics. Bannard et al. (2003) tested techniques using statistical models to infer the meaning of verb-particle constructions (VPCs), focusing on prepositional particles. They tested four methods over four compositional classification tasks, reporting that, on all tasks, at least one of the four methods offered an improvement in precision over the baseline they used.</Paragraph> <Paragraph position="2"> In this lexicon, many MWEs are encoded as templates, such as driv*_* {Np/P*/J*/R*} mad_JJ, which represent variational forms of a single MWE. For further details, see Rayson et al. (2004).</Paragraph> <Paragraph position="3"> McCarthy et al. (2003) suggested that compositional phrasal verbs should have neighbours similar to those of their simplex verbs. They tested various measures using the nearest neighbours of phrasal verbs and their simplex counterparts, and reported that some of the measures produced results which showed significant correlation with human judgments. Baldwin et al. 
(2003) proposed an LSA-based model for measuring the decomposability of MWEs by examining the similarity between them and their constituent words, with higher similarity indicating greater decomposability. They evaluated their model on English noun-noun compounds and verb-particles by examining the correlation of the results with similarities and hyponymy values in WordNet. They reported that the LSA technique performs better on low-frequency items than on more frequent items. Venkatapathy and Joshi (2005) measured the relative compositionality of verb-noun collocations using an SVM (Support Vector Machine) based ranking function. They integrated seven collocational and contextual features using their ranking function, and evaluated it against manually ranked test data. They reported that the SVM-based method produces significantly better results than methods based on individual features.</Paragraph> <Paragraph position="4"> The approaches mentioned above invariably depend on a variety of statistical contextual information extracted from large corpus data. Inevitably, such statistical information can be affected by various kinds of uncontrollable &quot;noise&quot;, and hence purely statistical approaches have their limitations.</Paragraph> <Paragraph position="5"> In this paper, we contend that a manually compiled semantic lexical resource can have an important part to play in measuring the compositionality of MWEs. While any approach based on a specific lexical resource may lack generality, it can complement purely statistical approaches by importing human expert knowledge into the process. In particular, if such a resource has high lexical coverage, which is true in our case, it becomes much more useful for dealing with general English. 
It should be emphasized that we propose our semantic lexicon-based approach not as a substitute for the statistical approaches.</Paragraph> <Paragraph position="6"> Rather, we propose it as a potential complement to them.</Paragraph> <Paragraph position="7"> In the following sections, we describe our experiment and explore this approach to the issue of automatic estimation of MWE compositionality.</Paragraph> </Section> </Paper>