<?xml version="1.0" standalone="yes"?> <Paper uid="N01-1015"> <Title>Re-Engineering Letter-to-Sound Rules</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 4. For reasons of space efficiency, certain </SectionTitle> <Paragraph position="0"> computations are deferred until run-time (Mohri et al., 1996; Mohri et al., 2000), with a significant impact on time efficiency.</Paragraph> <Paragraph position="1"> While there is a clear need for human expert knowledge (Sproat et al., 1998, 75ff.), those experts should not have to deal with the performance aspects of the knowledge representation. Ideally, we would like a knowledge representation that is both time and space efficient and can be constructed automatically from individually meaningful features supplied by human experts. For practical reasons we have to be content with methods that address the efficiency issues and can make use of explicitly represented knowledge from legacy systems, so that moving to a new way of building TTS systems does not entail starting over from scratch.</Paragraph> <Paragraph position="2"> As a case study of how this transition might be achieved, we took the letter-to-phoneme rules for French in the TTS system described by Sproat (1998) and proceeded to 1. Construct a lexicon using the existing system.</Paragraph> <Paragraph position="3"> 2. Produce an alignment for that lexicon.</Paragraph> <Paragraph position="4"> 3. Convert the aligned lexicon into training instances for an automatically induced classifier. 4. Train and evaluate decision trees.</Paragraph> <Paragraph position="5"> By running the existing system on a small corpus (ca. 1M words of newspaper text from Le Monde) and eliminating abbreviations, we obtained a lexicon of about 18k words. 
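To make steps 2 and 3 concrete, the following minimal sketch shows how one aligned lexicon entry could be turned into classifier training instances. The function name, the boundary symbol "#", the empty-phoneme symbol "_", and the fixed context width are all illustrative assumptions; the paper's actual alignment and feature-extraction procedures are those described in its Sections 2 and 3.

```python
# Hypothetical sketch: from an aligned lexicon entry to training instances.
# Assumes each letter is already paired with exactly one phoneme symbol
# (possibly the empty phoneme "_"), which is what an alignment provides.

def training_instances(letters, phonemes, context=2):
    """Yield one instance per letter: (window of surrounding letters, target phoneme)."""
    assert len(letters) == len(phonemes)   # alignment makes the sequences equal length
    pad = ["#"] * context                  # assumed word-boundary padding symbol
    padded = pad + list(letters) + pad
    for i, target in enumerate(phonemes):
        window = padded[i:i + 2 * context + 1]
        yield (tuple(window), target)

# e.g. French "eau", pronounced as the single phoneme /o/,
# aligned one-to-one as e->o, a->_, u->_
insts = list(training_instances("eau", ["o", "_", "_"]))
```

Each instance pairs a letter-in-context with its aligned phoneme, which is exactly the shape of input a standard decision tree learner expects.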
This means that the performance of the automatically trained system built from this lexicon is measured relative to the existing system.</Paragraph> <Paragraph position="6"> The key steps, aligning the lexicon and building a training set, are described in detail in Sections 2 and 3 below.</Paragraph> <Paragraph position="7"> Our choice of decision trees was motivated by the following desirable properties: 1. Space and time efficiency, provided the feature functions can be represented and computed efficiently, which they can be in our case.</Paragraph> <Paragraph position="8"> 2. Generality.</Paragraph> <Paragraph position="9"> 3. A symbolic representation that can easily be inspected and converted.</Paragraph> <Paragraph position="10"> The first property addresses the efficiency requirements stated above: if every feature function can be computed in time O(f), where f does not involve the height h of the decision tree, then the classification function represented by the decision tree can be computed in time O(∑_{n≤h} f(n)) = O(h · f), provided feature values can be mapped to child nodes in constant time, e.g., through hashing; and similarly for space.</Paragraph> <Paragraph position="11"> The other properties justify the use of decision trees as a knowledge representation format. In particular, decision trees can be converted into implicational rules that an expert could inspect, and can in principle be compiled back into finite-state machines (Sproat and Riley, 1996), although that would re-introduce the original efficiency problems. On the other hand, finite-state transducers have the advantage of being invertible, which can be exploited, e.g., for testing hand-crafted rule sets. We use a standard decision tree learner (Quinlan, 1993), since we believe it would be premature to investigate the implications of different choices of machine learning algorithms while the fundamental question of what any such algorithm should use as training data is still open. 
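The efficiency argument can be illustrated with a minimal sketch: each internal node computes one feature and dispatches on its value through a dict (amortized constant-time lookup), so classification costs one feature evaluation per tree level. The node layout, feature functions, and the toy rule below are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of decision tree classification with hashed child dispatch.
# Walking the tree costs one O(f) feature evaluation plus one O(1) dict
# lookup per level, giving O(h * f) total for a tree of height h.

class Node:
    def __init__(self, feature=None, children=None, label=None):
        self.feature = feature            # callable: instance -> feature value
        self.children = children or {}    # feature value -> child Node
        self.label = label                # phoneme at a leaf, else None

def classify(node, instance):
    while node.label is None:             # one iteration per tree level
        value = node.feature(instance)    # feature evaluation: O(f)
        node = node.children[value]       # hashed dispatch to a child: O(1)
    return node.label

# Toy French-like rule: letter 'c' is /s/ before 'e' or 'i', else /k/
tree = Node(feature=lambda inst: inst["next"] in ("e", "i"),
            children={True: Node(label="s"), False: Node(label="k")})
```

Because each leaf corresponds to one path of feature tests, a tree in this form can also be read off as implicational rules, which is the inspectability property mentioned above.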
This topic is explored further in Section 5. Related work is discussed in Section 6.</Paragraph> </Section></Paper>