File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/evalu/96/c96-1014_evalu.xml

Size: 5,173 bytes

Last Modified: 2025-10-06 14:00:21

<?xml version="1.0" standalone="yes"?>
<Paper uid="C96-1014">
  <Title>Integrating Syntactic and Prosodic Information for the Efficient Detection of Empty Categories</Title>
  <Section position="7" start_page="74" end_page="75" type="evalu">
    <SectionTitle>
5 Results
</SectionTitle>
    <Paragraph position="0"> In order to estimate how useful prosodic information is for reducing the number of verb trace hypotheses for the parser, we examined a corpus of 104 utterances with prosodic annotations denoting the probability of a syntactic boundary after every given word. For every node whose S3 boundary probability exceeds a certain threshold value, we considered the hypothesis that this node is followed by a verb trace. These hypotheses were then rated valid or invalid by the grammar writer.</Paragraph>
    <Paragraph position="1"> Note that such a setting, where each position in the input is annotated with a score representing its boundary probability, is much more robust with respect to unclear classification results than a pure binary 'boundary vs. non-boundary' distinction.</Paragraph>
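    <Paragraph> The threshold step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the token labels, the probabilities, and the threshold of 0.5 are all assumptions made for the example.

```python
from typing import List, Tuple


def propose_trace_positions(
    words: List[Tuple[str, float]], threshold: float
) -> List[int]:
    """Return the indices of words after which a verb trace is hypothesized.

    Each entry pairs a word with the S3 boundary probability that the
    prosody module assigned to the position following that word.
    A position is proposed only if its probability exceeds the threshold.
    """
    return [i for i, (_, prob) in enumerate(words) if prob > threshold]


# Hypothetical utterance: tokens and probabilities are invented.
utterance = [("w0", 0.05), ("w1", 0.10), ("w2", 0.20), ("w3", 0.15), ("w4", 0.85)]

# Only the position after the last word exceeds the assumed threshold.
hypotheses = propose_trace_positions(utterance, threshold=0.5)
print(hypotheses)
```

    Keeping the raw scores rather than a binary label means the same annotated input can be re-filtered with a different threshold without rerunning the classifier.</Paragraph>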
    <Paragraph position="2"> The observations were rated according to the following scheme (footnote 2): [Table not recovered; surviving caption text: "identification of possible verb trace positions."]</Paragraph>
    <Paragraph position="3"> where:</Paragraph>
    <Paragraph position="5"> (Footnote 2) X0 position means that the relevant position is occupied by an X0 gap; X0 prop. means that the classifier proposes an X0 at this position.</Paragraph>
    <Paragraph position="6"> In practice this means that the number of locations where the parser has to assume the presence of a verb trace could be reduced from 1121 to 412, while only 6 necessary trace positions remained unmarked. These results were obtained from a corpus of spoken utterances, many of which contained several independent phrases and sentences.</Paragraph>
    <Paragraph position="7"> These segments, however, are also often separated by an S3 boundary, so the error rate is likely to drop considerably if utterances are segmented into syntactically well-formed phrases prior to trace detection. Since cases where the verb trace is not located at the end of a sentence (i.e. where extraposition takes place) involve a highly characteristic categorial context, we expect a further improvement if the trace/no-trace classification based on prosodic information is combined with a language model.</Paragraph>
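    <Paragraph> The size of the reduction reported for the first experiment can be checked with a short calculation. The sketch below uses only the two counts given in the text (1121 candidate positions before filtering, 412 after); the variable names are illustrative. Recall and precision are not computed, because the total number of necessary trace positions is not stated here.

```python
# Counts taken from the text of the first experiment.
positions_without_prosody = 1121  # every location the parser would have to consider
positions_with_prosody = 412      # locations left after applying the threshold

# Absolute and relative reduction in candidate verb trace positions.
removed = positions_without_prosody - positions_with_prosody
relative_reduction = removed / positions_without_prosody

print(f"removed {removed} positions ({relative_reduction:.1%} of the original candidates)")
```

    Under these figures, somewhat less than two thirds of the candidate positions are eliminated, at the cost of the 6 necessary positions that remained unmarked.</Paragraph>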
    <Paragraph position="8"> The problem with the approach described above is that a careful estimation of the threshold value is necessary, and this threshold may vary from speaker to speaker or between certain discourse situations. Furthermore, the analysis fails in those cases where the correct position is rated lower than this value, i.e. where the parser does not consider the correct trace position at all. Thus, in a second experiment we examined how the syntactically correct verb trace position is ranked among the positions proposed by the prosody module with respect to its S3 boundary probability. If the correct position turns out to be consistently ranked among the positions with the highest S3 probability within a sentence, then it might be preferable for the parsing module to consider the S3 positions in descending order rather than to introduce traces for all positions ranked above a threshold.</Paragraph>
    <Paragraph position="9"> For the second experiment we considered only those segments in the input that represent V2 clauses, i.e. we assumed that the input has been segmented correctly. Within these sentences we ranked all the spaces between words according to the associated S3 probability and determined the rank of the correct verb trace position. When performing this test on 134 sentences the following picture emerged: [Table 5 not recovered; surviving caption text: "verb trace position within a sentence according to the S3 probability."]</Paragraph>
    <Paragraph position="10"> Table 5 shows that in the majority of cases the position with the highest S3 probability turns out to be the correct one. It has to be added, though, that in many cases the correct verb trace position is at the end of the sentence, which is often very reliably marked with a prosodic phrase boundary, even if this sentence is uttered in a sequence together with other phrases or sentences. This end-of-sentence marker will be assigned a higher S3 probability in most cases, even if the correct verb trace position is located elsewhere.</Paragraph>
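    <Paragraph> The ranking procedure of the second experiment can be sketched as follows. This is a hedged illustration under stated assumptions: the function name, the tie-breaking rule (earlier positions win ties), and the example probabilities are invented and do not come from the paper's data.

```python
from collections import Counter
from typing import List, Tuple


def rank_of_correct_position(s3_probs: List[float], correct_index: int) -> int:
    """Return the 1-based rank of the correct verb trace position.

    Positions (the spaces between words) are ordered by descending S3
    probability. Python's sort is stable, so tied positions keep their
    original left-to-right order (an assumption of this sketch).
    """
    order = sorted(range(len(s3_probs)), key=lambda i: s3_probs[i], reverse=True)
    return order.index(correct_index) + 1


# Hypothetical sentences: (S3 probabilities per position, index of correct position).
sentences: List[Tuple[List[float], int]] = [
    ([0.1, 0.7, 0.2, 0.9], 3),  # correct position has the highest probability
    ([0.6, 0.3, 0.4], 2),       # correct position has the second highest
]

# Tally how often the correct position lands at each rank.
rank_counts = Counter(rank_of_correct_position(p, c) for p, c in sentences)
print(sorted(rank_counts.items()))
```

    A parser that consumes positions in this order would reach the correct trace position after as many attempts as its rank, which is why a distribution concentrated on rank 1 favours the descending-order strategy over a fixed threshold.</Paragraph>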
    <Paragraph position="11"> Finally, in a third experiment we were interested in the overall speedup of the processing module that resulted from our approach. In order to estimate this, we parsed a corpus of 109 turns in two different settings: in the first pass the threshold value was set as described above, while for the second pass we selected a value of 0. In the second pass the parser thus had to consider every position in the input as a potential head trace location, just as if no prosodic information about syntactic boundaries were available at all. It turns out (cf. Table 6) that employing prosodic information reduces the parser runtime for the corpus by about [value not recovered]. [Table 6 not recovered; surviving caption text: "parsing batch-jobs with and without the use of prosodic information, resp."]</Paragraph>
  </Section>
</Paper>