<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-1812">
  <Title>An Empirical Model of Multiword Expression Decomposability</Title>
  <Section position="6" start_page="0" end_page="0" type="concl">
    <SectionTitle>
5 Discussion
</SectionTitle>
    <Paragraph position="0"> While evaluation pointed to a moderate correlation between LSA similarities and occurrences of hyponymy, we have yet to answer the question of exactly where the cutoffs between simple decomposable, idiosyncratically decomposable and non-decomposable MWEs lie. While it would be possible to set arbitrary thresholds to artificially partition up the space of MWEs based on LSA similarity (or alternatively use statistical tests to derive confidence intervals for similarity values), we feel that more work needs to be done in establishing exactly what different LSA similarities for different MWE-constituent word combinations mean.</Paragraph>
    <Paragraph position="1"> One area in which we plan to extend this research is the analysis of MWEs in languages other than English. Because of LSA's independence from linguistic constraints, it is equally applicable to all languages, assuming there is some way of segmenting inputs into constituent words.</Paragraph>
    <Paragraph position="2"> To summarise, we have proposed a construction-inspecific empirical model of MWE decomposability, based on latent semantic analysis. We evaluated the method over English NN compounds and verb-particles, and showed it to correlate moderately with WordNet-based hyponymy values.</Paragraph>
  </Section>
</Paper>