<?xml version="1.0" standalone="yes"?>
<Paper uid="P04-1005">
  <Title>A TAG-based noisy channel model of speech repairs</Title>
  <Section position="4" start_page="0" end_page="0" type="evalu">
    <SectionTitle>
4 Evaluation
</SectionTitle>
    <Paragraph position="0"> This section describes how we evaluate our noisy channel model. As mentioned earlier, following Charniak and Johnson (2001) our test data consisted of all Penn III Switchboard tree-bank sw4[01]*.mrg files. However, our test data differs from theirs in that for this test we deleted all partial words and punctuation from the data, as this results in a more realistic test situation.</Paragraph>
    <Paragraph position="1"> Since the immediate goal of this work is to produce a program that identifies the words of a sentence that belong to the reparandum of a repair construction (to a first approximation these words can be ignored in later processing), our evaluation focuses on the model's performance in recovering the words in a reparandum. That is, the model is used to classify each word in the sentence as belonging to a reparandum or not, and all additional structure produced by the model is ignored.</Paragraph>
    <Paragraph position="2"> We measure model performance using the standard precision p, recall r and f-score f measures. If nc is the number of reparandum words the model correctly classified, nt is the number of true reparandum words given by the manual annotations and nm is the number of words the model predicts to be reparandum words, then the precision is nc/nm, recall is nc/nt, and f is 2pr/(p + r).</Paragraph>
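The metric definitions above can be sketched as a short computation. The counts below are illustrative placeholders, not figures from the paper's experiments:

```python
def prf(n_c, n_t, n_m):
    """Precision, recall and f-score for reparandum-word classification.

    n_c: reparandum words the model classified correctly
    n_t: true reparandum words given by the manual annotations
    n_m: words the model predicted to be reparandum words
    """
    p = n_c / n_m               # precision = nc/nm
    r = n_c / n_t               # recall    = nc/nt
    f = 2 * p * r / (p + r)     # f-score   = 2pr/(p + r)
    return p, r, f

# Hypothetical counts for illustration only.
p, r, f = prf(n_c=80, n_t=100, n_m=100)
print(round(p, 2), round(r, 2), round(f, 2))  # 0.8 0.8 0.8
```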
    <Paragraph position="3"> For comparison we include the results of running the word-by-word classifier described in Charniak and Johnson (2001), but where partial words and punctuation have been removed from the training and test data. We also provide results for our noisy channel model using a bigram language model, and for a second trigram model in which the twenty most likely analyses are rescored. Finally, we show the results using the parser language model.</Paragraph>
    <Paragraph position="4">  The noisy channel model using a bigram language model does a slightly worse job at identifying reparandum and interregnum words than the classifier proposed in Charniak and Johnson (2001). Replacing the bigram language model with a trigram model helps slightly, and the parser-based language model results in a significant performance improvement over all of the others.</Paragraph>
  </Section>
</Paper>