<?xml version="1.0" standalone="yes"?>
<Paper uid="A00-2038">
  <Title>A New Algorithm for the Alignment of Phonetic Sequences</Title>
  <Section position="6" start_page="292" end_page="293" type="evalu">
    <SectionTitle>
5 Evaluation
</SectionTitle>
    <Paragraph position="0"> The best alignments are obtained when local comparison is used. For example, when aligning English grass with Latin gramen, it is important to match only the first three segments in each word; the remaining segments are unrelated. ALINE obviously does not know the particular etymologies, but it can make a guess because [s] and [m] are not very similar phonetically. Only local alignment is able to distinguish between the essential and non-essential correspondences in this case (Table 5).</Paragraph>
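    <Paragraph> The effect of local comparison on the grass/gramen pair can be illustrated with a minimal sketch of Smith-Waterman-style local alignment. This is not ALINE itself: the similarity function toy_sim, its scores, the gap penalty, and the rough transcriptions "gras" and "gramen" are all assumptions made for illustration, standing in for ALINE's feature-based similarity.

```python
GAP = -2                      # assumed gap penalty (hypothetical value)
VOWELS = set("aeiou")


def toy_sim(a, b):
    """Hypothetical similarity score; NOT ALINE's feature-based function."""
    if a == b:
        return 3
    if (a in VOWELS) == (b in VOWELS):
        return -1             # same class (both vowels or both consonants)
    return -3                 # vowel against consonant


def local_align(x, y):
    """Return (score, aligned part of x, aligned part of y)."""
    n, m = len(x), len(y)
    S = [[0] * (m + 1) for _ in range(n + 1)]
    best, bi, bj = 0, 0, 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            S[i][j] = max(
                0,            # the zero floor is what makes the alignment local
                S[i - 1][j - 1] + toy_sim(x[i - 1], y[j - 1]),
                S[i - 1][j] + GAP,
                S[i][j - 1] + GAP,
            )
            if S[i][j] > best:
                best, bi, bj = S[i][j], i, j
    # Trace back from the best cell until the score drops to zero.
    i, j = bi, bj
    top, bottom = [], []
    while i > 0 and j > 0 and S[i][j] > 0:
        if S[i][j] == S[i - 1][j - 1] + toy_sim(x[i - 1], y[j - 1]):
            top.append(x[i - 1])
            bottom.append(y[j - 1])
            i, j = i - 1, j - 1
        elif S[i][j] == S[i - 1][j] + GAP:
            top.append(x[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(y[j - 1])
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


print(local_align("gras", "gramen"))  # (9, 'gra', 'gra')
```

Under these toy scores the aligned subsequences are the first three segments of each word, and the unrelated remainders fall outside the alignment. Removing the zero floor and scoring the whole matrix would instead force every segment into a global alignment.</Paragraph>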
    <Paragraph position="1"> The operations of compression and expansion prove to be very useful in the case of complex correspondences. For example, in the alignment of Latin factum with Spanish hecho, the affricate [tʃ] should be linked with both [k] and [t] rather than with just one of them, because it originates from the merger of the two consonants. Note that representing compression as a substitution followed by a deletion is unsatisfactory, because it cannot be distinguished from an actual sequence of substitution and deletion. ALINE posits this operation particularly frequently in cases of diphthongization of vowels (see the alignments in Table 6).
[Table 5: alignments of English grass and Latin gramen obtained with global, semiglobal, and local comparison. The double bars delimit the aligned subsequences.]</Paragraph>
    <Paragraph position="2"> Table 6: Word pair | Covington's alignment (English / Latin) | ALINE's alignment (English / Latin)</Paragraph>
    <Paragraph position="4">
blow : flāre | b l - - o w / f l ā r e - | ‖ b l o ‖ w / ‖ f l ā ‖ re
full : plēnus | f - - u l / p l ē n u s | ‖ f u l ‖ / ‖ p l ‖ ēnus
fish : piscis | f - - i š / p i s k i s | ‖ f i š ‖ / ‖ p i s ‖ kis
I : ego | a y / e g o - | ‖ ay ‖ / ‖ e ‖ go
tooth : dentis | - t u w θ / d e n t i - s | ‖ t uw θ / den ‖ t i s
</Paragraph>
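    <Paragraph> The compression and expansion operations discussed in this section can be sketched as one extra pair of transitions in an otherwise standard alignment recurrence. The sketch below computes a global score only; the names align_score, toy_sim, and toy_sim2, the score values, and the ASCII label "tS" for the affricate are all hypothetical and are not taken from ALINE.

```python
NEG_INF = float("-inf")


def toy_sim(a, b):
    """Hypothetical one-to-one score; not ALINE's similarity function."""
    return 2 if a == b else -1


def toy_sim2(single, first, second):
    """Hypothetical score for linking one segment with two adjacent ones.

    Only the merger of k + t into the affricate 'tS' is rewarded here.
    """
    return 3 if (single, first, second) == ("tS", "k", "t") else NEG_INF


def align_score(x, y, sim, sim2, gap=-2):
    """Global alignment score over two lists of segments.

    Besides substitution and insertion/deletion, one segment of either
    sequence may be linked with two adjacent segments of the other.
    """
    n, m = len(x), len(y)
    S = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    S[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            cands = []
            if i >= 1:
                cands.append(S[i - 1][j] + gap)
            if j >= 1:
                cands.append(S[i][j - 1] + gap)
            if i >= 1 and j >= 1:
                cands.append(S[i - 1][j - 1] + sim(x[i - 1], y[j - 1]))
            if i >= 1 and j >= 2:
                # one segment of x linked with two segments of y
                cands.append(S[i - 1][j - 2] + sim2(x[i - 1], y[j - 2], y[j - 1]))
            if i >= 2 and j >= 1:
                # two segments of x linked with one segment of y
                cands.append(S[i - 2][j - 1] + sim2(y[j - 1], x[i - 2], x[i - 1]))
            S[i][j] = max(cands)
    return S[n][m]


latin = ["f", "a", "k", "t", "u", "m"]     # rough segmentation of factum
spanish = ["e", "tS", "o"]                 # rough segmentation of hecho
print(align_score(latin, spanish, toy_sim, toy_sim2))
```

Because the two-to-one link is a transition of its own with its own score, it stays distinguishable from a substitution followed by a deletion, which is the point made in the text.</Paragraph>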
    <Paragraph position="5"> Covington's data set of 82 cognates provides a convenient test for the algorithm. His English/Latin set is particularly interesting, because these two languages are not closely related. Some of the alignments produced by Covington's algorithm and ALINE are shown in Table 6. ALINE accurately discards inflectional affixes in piscis and flare. In fish/piscis, Covington's aligner produces four alternative alignments, while ALINE selects the correct one. Both algorithms are technically wrong on tooth/dentis, but this is hardly an error considering that only the information contained in the phonetic string is available to the aligners. On Covington's Spanish/French data, ALINE does not make any mistakes. Unlike Covington's aligner, it properly aligns [l] in arbol with the second [r] in arbre. On his English/German data, it selects the correct alignment in those cases where Covington's aligner produces two alternatives. In the final, mixed set, ALINE makes a single mistake in daughter/thugatēr, in which it posits a dropped prefix rather than a syncopated syllable; in all other cases, it is right on target. Overall, ALINE clearly performs better than Covington's aligner.</Paragraph>
    <Paragraph position="6"> Somers (1999) tests one version of his algorithm, CAT, on the same set of cognates. CAT employs binary, rather than multivalued, features. Another important characteristic is that it pre-aligns the stressed segments in both sequences. Since CAT distinguishes between individual consonants, in some cases it produces more accurate alignments than Covington's aligner. However, because of its pre-alignment strategy, it is guaranteed to produce wrong alignments in all cases where the stress has moved in one of the cognates. For example, in the Spanish/French pair cabeza/cap, it aligns [p] with [θ] rather than [b] and fails to align the two [a]'s. The problem is even more acute for closely related languages that have different stress rules. In contrast, ALINE does not even consider stress, which, in the context of diachronic phonology, is too volatile to depend on. Except for the single case of daughter/thugatēr, ALINE produces better alignments than Somers's algorithm.</Paragraph>
  </Section>
</Paper>