<?xml version="1.0" standalone="yes"?>
<Paper uid="P99-1081">
  <Title>An Unsupervised Model for Statistically Determining Coordinate Phrase Attachment</Title>
  <Section position="4" start_page="0" end_page="610" type="intro">
    <SectionTitle>
2 Background
</SectionTitle>
    <Paragraph position="0"> The statistical model must determine the probability of a given CP attaching either high (H) or low (L), p(attachment | phrase). Results shown come from a development corpus of 500 phrases of extracted head word tuples from the WSJ TreeBank [MSM93]. 64% of these phrases attach low and 36% attach high. After further development, final testing will be done on a separate corpus.</Paragraph>
    <Paragraph position="1"> Previous work has used corpus-based approaches to solve the similar problem of prepositional phrase attachment. These have included backed-off [CB 95], maximum entropy [RRR94], rule-based [HR94], and unsupervised approaches. The phrase (busloads (of ((executives) and (their wives)))) gives the 6-tuple: L busloads of executives and wives, where a = L, n1 = busloads, p = of, n2 = executives, cc = and, n3 = wives. The CP attachment model must determine a for all (n1 p n2 cc n3) sets. The attachment decision is correct if it is the same as the corresponding decision in the TreeBank set.</Paragraph>
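The 6-tuple and the correctness criterion above can be sketched in Python. This is only an illustrative sketch: the names CPTuple and accuracy are hypothetical and do not come from the paper, and the always-low guesser is an assumed baseline used purely for demonstration, not the model the paper describes.

```python
from typing import NamedTuple, Sequence


class CPTuple(NamedTuple):
    """One coordinate-phrase example: attachment label plus five head words."""
    a: str    # attachment decision: "H" (high) or "L" (low)
    n1: str   # first noun head
    p: str    # preposition
    n2: str   # second noun head
    cc: str   # coordinating conjunction
    n3: str   # third noun head


def accuracy(predicted: Sequence[str], gold: Sequence[CPTuple]) -> float:
    """Fraction of predicted attachments that equal the gold (TreeBank) labels."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("gold must not be empty")
    correct = sum(1 for guess, item in zip(predicted, gold) if guess == item.a)
    return correct / len(gold)


# The example from the text: (busloads (of ((executives) and (their wives))))
example = CPTuple(a="L", n1="busloads", p="of",
                  n2="executives", cc="and", n3="wives")

# Hypothetical baseline (an assumption, not the paper's model): always guess low.
gold = [example]
always_low = ["L" for _ in gold]
print(accuracy(always_low, gold))
```

A decision counts as correct only when the predicted label matches the a field of the corresponding gold tuple, which mirrors the criterion stated in the paragraph above.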
    <Paragraph position="2"> The probability of a CP attaching high is conditional on the 5-tuple. The algorithm presented in this paper estimates this probability. [...] regular expressions that replace noun and quantifier phrases with their head words. These head words were then passed through a set of heuristics to extract the unambiguous phrases. The heuristics to find an unambiguous CP are: * wn is a coordinating conjunction (cc) if it is tagged cc.</Paragraph>
    <Paragraph position="4"> The parts of the CP are analogous to those of the prepositional phrase (PP), such that {n1, n2} corresponds to {n, v} and n3 corresponds to p. [AR98] determines the probability p(v, n, p, a). To be consistent, here we determine the probability p(n1, n2, n3, a).</Paragraph>
  </Section>
</Paper>