File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/97/w97-1102_intro.xml

Size: 1,965 bytes

Last Modified: 2025-10-06 14:06:29

<?xml version="1.0" standalone="yes"?>
<Paper uid="W97-1102">
  <Title>Measuring Dialect Distance Phonetically</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Motivation
</SectionTitle>
    <Paragraph position="0"> Dialectologists frequently speak of the range of dialects they describe as a &quot;continuum&quot;, 1 which suggests a need to supersede the inherently discrete method of isoglosses. Dialectologists have long recognized the need for alternative notions of dialectal relationships (Durand (1889), p. 49).</Paragraph>
    <Paragraph position="1"> 1 For example, Tait on Inuit: &quot;a fairly unbroken chain of dialects [...] the furthest extremes of the continuum being unintelligible to one another&quot; (Tait (1994), p. 3). Furthermore, a sensitive measure of dialectal distance could have broad application to questions in sociolinguistics and historical linguistics, e.g. the significance of political boundaries or the effect of the media.</Paragraph>
    <Paragraph position="2"> Levenshtein distance is a measure of string distance that has been applied to problems in speech recognition, bird song ethology, and genetics. It is presented in (Kruskal, 1983), and may be understood as the cost of (the least costly set of) operations mapping from one string to another.</Paragraph>
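    The definition above can be illustrated with a minimal Python sketch. It assumes unit costs for insertion, deletion, and substitution; the function name levenshtein and the parameters ins_cost, del_cost, and sub_cost are illustrative choices, not taken from the cited works.

```python
def levenshtein(source, target, ins_cost=1, del_cost=1, sub_cost=1):
    """Cost of the least costly set of operations mapping source to target.

    Unit costs are an assumption made for illustration only.
    """
    # prev[j] is the cost of mapping the source prefix seen so far to target[:j].
    prev = [j * ins_cost for j in range(len(target) + 1)]
    for i, s in enumerate(source, start=1):
        # Mapping source[:i] to the empty string takes i deletions.
        curr = [i * del_cost]
        for j, t in enumerate(target, start=1):
            # Identical symbols cost nothing; nonidentical ones cost sub_cost.
            substitution = prev[j - 1] + (0 if s == t else sub_cost)
            deletion = prev[j] + del_cost
            insertion = curr[j - 1] + ins_cost
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]
```

    Under these assumed costs, levenshtein("kitten", "sitting") evaluates to 3. The arguments may be any sequences of comparable symbols, so a transcription could also be passed as a list of phone symbols rather than a string.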
    <Paragraph position="3"> Kessler (1995) applied Levenshtein distance to Irish Gaelic dialects with remarkable success, and Nerbonne et al. (1996) extended the application of his techniques to Dutch dialects, similarly with respectable results. Although Kessler (1995) and Nerbonne et al. (1996) experimented with more sensitive measures, their best results were based on calculations of phonetic distance in which phonetic overlap was binary: nonidentical phones contribute to phonetic distance, identical ones do not. Thus the pair [a,t] counts as different to the same degree as</Paragraph>
  </Section>
</Paper>