<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-2905">
  <Title>Using Soundex Codes for Indexing Names in ASR documents</Title>
  <Section position="12" start_page="0" end_page="0" type="concl">
    <SectionTitle>
7 Conclusions and Future Directions
</SectionTitle>
    <Paragraph position="0"> In this paper we highlighted an important problem that occurs with names in ASR text. We showed that humans may spell the same name in several ways, and that in ASR output the same name appears with many more spelling variants.</Paragraph>
    <Paragraph position="1"> We proposed a simple indexing strategy for names, wherein a name was indexed by its Soundex code. This strategy did not work for ASR text; the fault lay not with the approach but with our inability to identify names reliably in ASR text, since we did not have a named entity recognizer that performed well on it. If names could be detected with reasonable accuracy in ASR text, we would expect a reasonable improvement. We therefore verified our idea on news-wire text, which is grammatical and well punctuated. In the news-wire domain, even though names are spelt fairly consistently, we obtained about a 10% improvement in minimum cost and a consistent improvement at all points on the ROC curve. Hence, a simple technique like Soundex served as a useful normalization technique for names. We proposed alternative mechanisms that could be applied to ASR text, wherein all OOV words could be normalized by their Soundex codes. We also outlined further directions for research on how approximate string matching may be used.</Paragraph>
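    The indexing strategy summarized above can be illustrated with a minimal sketch. It assumes the standard American Soundex variant (first letter plus three digits), which may differ in detail from the variant used in the experiments, and it assumes that names have already been identified in each document. The helper names soundex and index_names are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict
from string import ascii_uppercase

# Letter groups of the standard American Soundex scheme (an assumed variant).
_CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        _CODES[letter] = digit


def soundex(name):
    """Return the four-character Soundex code of a name, or "" if it has no letters."""
    letters = [c for c in name.upper() if c in ascii_uppercase]
    if not letters:
        return ""
    code = letters[0]
    previous = _CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not separate letters that share a digit
        digit = _CODES.get(c, "")
        if digit and digit != previous:
            code += digit
        previous = digit  # a vowel resets this, so the same digit may repeat after it
    return (code + "000")[:4]


def index_names(doc_names):
    """Map each Soundex code to the ids of the documents containing a name with that code."""
    index = defaultdict(set)
    for doc_id, names in doc_names.items():
        for name in names:
            key = soundex(name)
            if key:
                index[key].add(doc_id)
    return index
```

    With such an index, spelling variants such as "Mohammed" and "Muhammad" fall under the same key, so a document containing one variant can be matched against a document containing the other.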
    <Paragraph position="2"> We think the general results of past work, which has considered the problems due to ASR errors to be insignificant, cannot be assumed to transfer to other problems.</Paragraph>
    <Paragraph position="3"> Situations will arise in which this problem is material, and research is needed in this direction.</Paragraph>
  </Section>
</Paper>