<?xml version="1.0" standalone="yes"?>
<Paper uid="P98-2231">
  <Title>Structural Disambiguation Based on Reliable Estimation of Strength of Association Haodong Wu Eduardo de Paiva Alves</Title>
  <Section position="3" start_page="1416" end_page="1417" type="metho">
    <SectionTitle>
2 Class-based Estimation of
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="1416" end_page="1417" type="sub_section">
      <SectionTitle>
Strength of Association
</SectionTitle>
      <Paragraph position="0"> The strength of association (SA) may be measured using the frequencies of word co-occurrences in large corpora. For instance, Church and Hanks (1990) calculated SA in terms of the mutual information between two words:</Paragraph>
      <Paragraph position="2"> I(w1, w2) = log2 ( N f(w1, w2) / ( f(w1) f(w2) ) ), where N is the size of the corpus used in the estimation, f(w1, w2) is the frequency of the co-occurrence, and f(w1) and f(w2) are the frequencies of each word.</Paragraph>
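This estimate can be sketched directly from counts. The numbers below are hypothetical (not from the paper) and only illustrate that a pair co-occurring far more often than chance scores high, while a pair at chance level scores near zero:

```python
import math

def mutual_information(f_w1w2, f_w1, f_w2, N):
    """Mutual information estimated from corpus counts:
    log2(N * f(w1, w2) / (f(w1) * f(w2)))."""
    if f_w1w2 == 0:
        return float("-inf")  # no observed co-occurrence
    return math.log2(N * f_w1w2 / (f_w1 * f_w2))

# Hypothetical counts in a corpus of one million words.
print(mutual_information(100, 1000, 2000, 1_000_000))  # strongly associated pair
print(mutual_information(2, 1000, 2000, 1_000_000))    # roughly chance-level pair
```

When the joint count equals what independence predicts (here 2 = 1000 x 2000 / 1,000,000), the score is exactly zero, which is the usual reading of pointwise mutual information.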
      <Paragraph position="3"> When no co-occurrence is observed, SA may be estimated using the frequencies of word classes that contain the words in question. The mutual information in this case is estimated by:</Paragraph>
      <Paragraph position="5"> I(C1, C2) = log2 ( N f(C1, C2) / ( f(C1) f(C2) ) ), where C1 and C2 are the word classes that respectively contain w1 and w2, f(C1) and f(C2) are the numbers of occurrences of all the words included in the word classes C1 and C2, and f(C1, C2) is the number of co-occurrences of the word classes C1 and C2.</Paragraph>
      <Paragraph position="6"> Normally, estimation using word classes requires selecting, from a taxonomy, classes for which the co-occurrences are significant. We use t-scores for this purpose.</Paragraph>
      <Paragraph position="7"> For a class co-occurrence (C1,C2), the t-score may be approximated by:</Paragraph>
      <Paragraph position="9"> t = ( f(C1, C2) - f(C1) f(C2) / N ) / sqrt( f(C1, C2) ). We use the lowest class co-occurrence for which the confidence measured with t-scores is above a threshold. Given a co-occurrence containing the word w, our method selects a class for w in the following way: ... If i ≤ n, go to Step 3.</Paragraph>
      <Paragraph position="10"> Otherwise exit.</Paragraph>
      <Paragraph position="11"> Step 6: Select the class Ci to replace w. Let us see what this means with an example. Suppose we try to estimate SA for (produce, telephone). See Table 1. Here f(v), f(n) and f(vn) are the frequencies for the verb produce, the classes for the noun telephone, and the co-occurrences between the verb and the classes for telephone, respectively; and t is the t-score. (The t-score (Church and Mercer, 1993) compares the hypothesis that a co-occurrence is significant against the null hypothesis that the co-occurrence can be attributed to chance.)</Paragraph>
      <Paragraph position="12"> One may compute the association over all pairs of classes (Ci, Cj), where i = 1, 2, ..., m and j = 1, 2, ..., n, to calculate strengths of lexical associations. But our experiments show that the upper classes of a verb are very unreliable for measuring the strengths. The reason may be that, unlike nouns, verbs do not have a "neat" hierarchy, or that the upper classes of a verb become too general as they contain too many concepts underneath them. Because of this observation, we use, for the classes of a verb, the verb itself or, when it does not give a good result, only the lowest class of the verb in calculating SA. For example, the verb eat has the sequence eat → ingest → put something into body → ... → event → concept in the class hierarchy, but we use only eat and ingest for the verb eat when calculating SA for (eat, apple).

verb      class for telephone     f(v)   f(n)    f(vn)  t-score
produce   concrete thing          671    18926   100    -4.6
produce   inanimate object        671    5593    69     0.83
produce   implement/tool          671    2138    35     1.91
produce   machine                 671    664     19     2.86
produce   communication machine   671    83      1      0.25
produce   telephone               671    24      0      -

Table 1: Estimation of (produce, telephone)

The lowest class co-occurrence (produce, communication machine) has a low t-score and produces a bad estimation. The most frequent co-occurrence (produce, concrete thing) also has a low t-score, reflecting the fact that it may be attributed to chance. The t-scores for (produce, machine) and (produce, implement/tool) are high and show that these co-occurrences are significant. Among them, our method selects the lowest class co-occurrence for which the t-score is above the threshold: (produce, machine).</Paragraph>
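The selection procedure — walk from the most specific class upward and stop at the first class whose co-occurrence t-score clears the threshold — might be sketched as follows. The corpus size N and the 1.28 threshold are assumptions of ours, so the t-scores here will not exactly reproduce those in Table 1:

```python
import math

def t_score(f_c1c2, f_c1, f_c2, N):
    """t-score in the sense of Church and Mercer (1993): compares the
    observed co-occurrence count with the count expected by chance."""
    if f_c1c2 == 0:
        return float("-inf")
    expected = f_c1 * f_c2 / N
    return (f_c1c2 - expected) / math.sqrt(f_c1c2)

def select_class(classes, f_v, N, threshold=1.28):
    """Walk the class list from most specific to most general and return
    the lowest class whose co-occurrence with the verb is significant."""
    for name, f_c, f_vc in classes:  # ordered specific -> general
        if t_score(f_vc, f_v, f_c, N) > threshold:
            return name
    return None  # no reliable class found

# Classes above 'telephone' with the counts from Table 1 (specific first);
# the corpus size is a hypothetical value.
classes = [
    ("communication machine", 83, 1),
    ("machine", 664, 19),
    ("implement/tool", 2138, 35),
    ("concrete thing", 18926, 100),
]
print(select_class(classes, f_v=671, N=60_000))
```

With these assumed values the procedure skips the sparse (produce, communication machine) co-occurrence and stops at machine, mirroring the choice described in the text.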
    </Section>
  </Section>
  <Section position="4" start_page="1417" end_page="1420" type="metho">
    <SectionTitle>
3 Disambiguation Using
Class-Based Estimation
</SectionTitle>
    <Paragraph position="0"> We now apply our method to estimate SA for two different types of syntactic constructions and use the results in resolving structural ambiguities.</Paragraph>
    <Section position="1" start_page="1417" end_page="1419" type="sub_section">
      <SectionTitle>
3.1 Disambiguation of Dependency
Relations in Japanese
</SectionTitle>
      <Paragraph position="0"> Identifying the dependency structure of a Japanese sentence is a difficult problem since the language allows relatively free word order. A typical dependency relation in Japanese appears in the form of a modifier-particle-modificand triplet.</Paragraph>
      <Paragraph position="1"> When a modifier is followed by a number of possible modificands, there arise situations in which syntactic roles alone cannot determine the dependency relation, i.e., the modifier-modificand relation. For instance, a modifier meaning 'vigorous' may modify either 'middle-aged' or 'health care'.</Paragraph>
      <Paragraph position="2"> But which one is the modificand of 'vigorous'? We resolve the ambiguity by comparing the strength of association for the two or more possible dependency relations.</Paragraph>
      <Paragraph position="3"> Calculation of Strength of Association. We calculate the Strength of Association (SA) score for a modifier - particle - modificand triplet by: SA(mfier, Part, mc) = log2 ( N f(Cmfier, Part, mc) / ( f(Cmfier) f(Part, mc) ) )  (4), where Cmfier stands for the classes that include the modifier word, Part is the particle following the modifier, mc is the content word in the modificand phrase, and f is the frequency.</Paragraph>
      <Paragraph position="4"> Let us see the process of obtaining the SA score for an example triplet (literally: professor - subject-marker - work). To calculate the frequencies for the classes associated with 'professor', we obtain from the Co-occurrence Dictionary (COD) the number of occurrences of (w - subject-marker - work), where w can be any modifier. (COD and CD are provided by the Japan Electronic Dictionary Research Institute (EDR, 1993). COD contains the frequencies of individual words and of modifier-particle-modificand triplets in a corpus that includes 220,000 parsed Japanese sentences. CD provides a hierarchical structure of concepts corresponding to all the words in COD; the number of concepts in CD is about 400,000.) We then obtain from the Concept Dictionary (CD) the classes that include 'professor' and sum up all the occurrences of the words included in those classes. The relevant portion of CD for 'professor' is shown in Figure 1; the numbers in parentheses there indicate the summed-up frequencies. We then calculate the t-score between 'work' and all the classes that include 'professor'. See Table 2. The t-score for the co-occurrence of the modifier and the particle-modificand pair is higher than the threshold when 'professor' is replaced with one of its ancestor classes.</Paragraph>
      <Paragraph position="5"> Using (4), the strength of association for the co-occurrence (professor - subject-marker - work) is then calculated from the SA between that class and 'subject-marker - work'. When the word in question has more than one sense, we estimate SA for each sense and choose the one that results in the highest SA score. For instance, we estimate SA between 'professor' and the various senses of 'work', and choose the highest value: in this case the one corresponding to the sense 'to be employed'. Determination of Most Strongly Associated Structure. After calculating SA for each possible construction, we choose the construction with the highest SA score as the most probable structure. See the following example:</Paragraph>
      <Paragraph position="9"> [Figure 2: an example sentence with candidate dependencies; the surviving glosses are 'technical progress', 'work', 'people', and 'stress innovation'.] Here, the arrows show the possible dependency relations, the numbers on the arrows are the estimated SA scores, and the thick arrows mark the dependencies with the highest mutual information, that is, the most probable dependency relations. In the example, the first modifier attaches to its modificand with an estimated mutual information of 2.79, and the second with 6.13; in each case our method selects the most strongly associated candidate as the modificand. In the example shown in Figure 2, our method selects the most likely modifier-modificand relation.</Paragraph>
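Concretely, choosing a modificand for one modifier reduces to computing the SA score of equation (4) for each candidate and taking the maximum. A minimal sketch, where the function names and all counts are illustrative, not taken from COD:

```python
import math

def sa_score(f_joint, f_class, f_pm, N):
    """SA in the style of eq. (4):
    log2(N * f(C, Part, mc) / (f(C) * f(Part, mc)))."""
    if f_joint == 0:
        return float("-inf")
    return math.log2(N * f_joint / (f_class * f_pm))

def choose_modificand(f_class, candidates, N):
    """candidates: (modificand, f(C, Part, mc), f(Part, mc)) triples for
    one modifier; return the modificand with the highest SA score."""
    return max(candidates,
               key=lambda c: sa_score(c[1], f_class, c[2], N))[0]

# Hypothetical counts for two candidate modificands of one modifier.
cands = [("work", 4, 900), ("people", 40, 700)]
print(choose_modificand(500, cands, N=220_000))
```

Here the second candidate co-occurs with the modifier's class far more often relative to its overall frequency, so it wins even though both candidates are common on their own.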
      <Paragraph position="10"> Experiment. Disambiguation of dependency relations was done using 75 ambiguous constructions from Fukumoto (1992). Solving the ambiguity in these constructions involves choosing among two or more modifier-particle-modificand relations. The training data consists of all 568,000 modifier-particle-modificand triplets in COD.</Paragraph>
      <Paragraph position="11"> Evaluation. We evaluated the performance of our method by comparing its results with those of other methods using the same test and training data. Table 3 shows the various results (success rates). Here, (1) indicates the performance obtained using the principle of Closest Attachment (Kimball, 1973); (2) shows the performance obtained using the lowest observed class co-occurrence (Weischedel et al., 1993); (3) is the result from the maximum mutual information over all pairs of classes corresponding to the words in the co-occurrence (Resnik, 1993; Alves, 1996); and (4) shows the performance of our method.</Paragraph>
      <Paragraph position="12"> (The precision of our method is for the default threshold of 1.28; the precision was 81.2% and 84.1% when we set the threshold to .84 and .95, respectively. In all these cases the coverage was 92.0%.)

[Figure 1: the portion of the CD class hierarchy above 'professor', with summed-up frequencies, including person (3), human or similar (42), human defined by race or origin (3), Japanese (2), worker (5), person defined by role (1), person defined by position (1), slave (0), and professor.]

(1) closest attachment   70.6%
(2) lowest classes       81.2%
(3) maximum MI           82.6%
(4) our method           87.0%

Table 3: Results for determining dependency relations

Closest attachment (1) has a low performance since it fails to take into consideration the identity of the words involved in the decision. Selecting the lowest classes (2) often produces unreliable estimates and wrong decisions due to data sparseness. Selecting the classes with the highest mutual information (3) results in overgeneralization that may lead to incorrect attachments. Our method avoids both estimating from unreliable classes and overgeneralization, and so yields better estimates and better performance. A qualitative analysis of our results shows two causes of errors, however. Some errors occurred when there were not enough occurrences of the particle-modificand pattern to estimate any of the strengths of association necessary for resolving the ambiguity. Other errors occurred when the decision could not be made without the surrounding context.</Paragraph>
    </Section>
    <Section position="2" start_page="1419" end_page="1420" type="sub_section">
      <SectionTitle>
3.2 Prepositional Phrase Attachment in English
</SectionTitle>
      <Paragraph position="0"> Prepositional phrase (PP) attachment is a paradigm case of syntactic ambiguity. The most probable attachment may be chosen by comparing the SA between the PP and the various possible attachment sites. Here SA is measured by: SA(w, p, n2) = log2 ( N f(Cw, p, n2) / ( f(Cw) f(p, n2) ) ), where Cw stands for the class that includes the word w (the verb or noun1), and f is the frequency in training data containing verb-noun1-preposition-noun2 constructions.</Paragraph>
      <Paragraph position="1"> Our method selects from a taxonomy the classes to be used to calculate the SA score and  then chooses the attachment with highest SA score as the most probable.</Paragraph>
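The attachment decision can be sketched as computing the class-based SA for each site and picking the larger. All counts below, and the simplification that f(p, n2) is shared by both sites, are illustrative assumptions:

```python
import math

def sa(f_joint, f_class, f_p_n2, N):
    """log2(N * f(Cw, p, n2) / (f(Cw) * f(p, n2))) for one attachment site."""
    if f_joint == 0:
        return float("-inf")
    return math.log2(N * f_joint / (f_class * f_p_n2))

def pp_attach(verb_site, noun_site, f_p_n2, N):
    """Each site is a pair (f(Cw, p, n2), f(Cw)); attach the PP to the
    site with the higher class-based SA score."""
    sa_v = sa(verb_site[0], verb_site[1], f_p_n2, N)
    sa_n = sa(noun_site[0], noun_site[1], f_p_n2, N)
    return "verb" if sa_v > sa_n else "noun"

# Hypothetical counts: the verb's class co-occurs with (p, n2) far more
# often relative to its size, so the PP attaches to the verb.
print(pp_attach((60, 2_000), (5, 4_000), f_p_n2=300, N=20_546))
```

Swapping the two sites' counts flips the decision, which is the whole content of the comparison.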
      <Paragraph position="2"> Experiment. We performed a PP attachment experiment on data consisting of all 21,046 semantically annotated verb-noun-preposition-noun constructions found in the EDR English Corpus. We set aside 500 constructions for testing and used the remaining 20,546 as training data. We first ran the experiment with various values of the threshold. Table 4 shows the results for the various t-score thresholds (columns: confidence, t, coverage, precision, success). The first line of the table is the default, which corresponds to the most likely attachment for each preposition. For instance, the preposition of is attached to the noun, reflecting the fact that PPs headed by of are mostly attached to nouns in the training data. The 'confidence' values correspond to a binomial distribution and are given only as a reference. The precision grows with the t-score threshold, while the coverage decreases. To improve coverage, when the method cannot find a class co-occurrence for which the t-score is above the threshold, we recursively try to find a co-occurrence using the next smaller threshold (see Table 4). When the method cannot find a co-occurrence with a t-score above the smallest threshold, the default is used. The overall success rates are shown in the "success" column of Table 4.</Paragraph>
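The coverage-improving backoff can be sketched as a loop over a descending ladder of thresholds with a final fall-back to the per-preposition default. The threshold values and the score_at callback are hypothetical, since Table 4's rows did not survive extraction:

```python
def attach_with_backoff(score_at, thresholds, default):
    """Try the strictest t-score threshold first; if no class
    co-occurrence clears it, retry with the next smaller threshold;
    if none succeeds, return the default attachment."""
    for t in thresholds:
        decision = score_at(t)  # "verb"/"noun" if some class clears t, else None
        if decision is not None:
            return decision
    return default

# Hypothetical ladder of thresholds, strictest first.
THRESHOLDS = [2.57, 1.96, 1.28, 0.84]

# A toy scorer that only finds a significant class once the threshold
# has dropped to 1.28.
def toy_scorer(t):
    return "verb" if t <= 1.28 else None

print(attach_with_backoff(toy_scorer, THRESHOLDS, "noun"))
```

The design point is that the strict thresholds keep precision high where they apply, while the backoff and default guarantee that every test construction receives some decision.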
      <Paragraph position="3"> As another way of reducing the sparse-data problem, we clustered prepositions using the method described in Wu and Furugori (1996). Prepositions that are synonyms or antonyms of one another are clustered into groups and replaced by a representative preposition (e.g., till and pending are replaced by until; amongst, amid and amidst are replaced by among).</Paragraph>
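The clustering step amounts to a normalization table mapping each preposition to its cluster representative, so that counts for rare synonyms pool with the representative's. The cluster pairs below are the ones named in the text; the function name is ours:

```python
# Cluster table: each preposition maps to its representative.
PREP_CLUSTERS = {
    "till": "until", "pending": "until",
    "amongst": "among", "amid": "among", "amidst": "among",
}

def normalize_preposition(prep):
    """Replace a preposition by its cluster representative, if any."""
    return PREP_CLUSTERS.get(prep, prep)

print(normalize_preposition("amidst"))  # among
print(normalize_preposition("of"))      # of (no cluster, unchanged)
```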
      <Paragraph position="4"> Evaluation. We evaluated the performance of our method by comparing its results with those of other methods on the same test and training data. The results are given in Table 5. Here, (5) shows the performance of two native speakers who were presented only quadruples of four head words, without the surrounding context.</Paragraph>
      <Paragraph position="5"> The lower bound and the upper bound on the performance of our method seem to be 59.6%, scored by the simple heuristic of closest attachment (1), and 87.0%, scored by human beings (5). Obviously, the success rate of closest attachment (1) is low, as it always attaches the PP to the noun without considering the words in question. The unexpectedly low success rate of the human judges is partly due to the fact that some constructions were inherently ambiguous, so that the judges' choices differed from the annotation in the corpus.</Paragraph>
      <Paragraph position="6"> Our method (4) performed better than the lowest-classes method (2) and the maximum-MI method (3). This is mainly because our method makes its estimates from class co-occurrences that are more reliable.</Paragraph>
    </Section>
  </Section>
</Paper>