<?xml version="1.0" standalone="yes"?> <Paper uid="P81-1004"> <Title>PERFORMANCE COMPARISON OF COMPONENT ALGORITHMS FOR THE PHONEMICIZATION OF ORTHOGRAPHY</Title> <Section position="2" start_page="0" end_page="19" type="intro"> <SectionTitle> LEXICON AFFIX STRIPPER | LETTER TO SOUND | LEXICAL STRESS | ALLOPHONICS </SectionTitle> <Paragraph position="0"> output: /huw~v3/~&quot; Several research systems are of this general design, including Allen's MITALK system, the TTS-X prototype at Telesensory Systems, and Liberman's proper name phonemicizer.</Paragraph> <Paragraph position="1"> The most popular text-to-phoneme design is the NRL approach, which has only two components, of which only the first is presented in detail and evaluated by Elovitz. The original NRL system is: input: &quot;word&quot; The very great advantage of the NRL approach is the unified treatment of letter sequences, affixes, and whole words. There is exactly one pass through a word, left to right, in which the maximum string starting with the leftmost unphonemicized character is matched. These strings are sometimes whole words, sometimes affixes, and sometimes consonant or vowel sequences or word fragments like &quot;BUIL&quot;. The main constraint of the system is its greatest attraction: the unity and simplicity of the code that scans the word and accesses a single table of letter strings. In contrast, the MITALK system, for instance, has one module and associated table structure for lexical decomposition of whole words, another module for stripping common affixes, and a third module for translating consonant and vowel sequences that remain in the pseudo-root of the word.</Paragraph> </Section> </Paper>
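<!--
The single left-to-right pass described for the NRL approach can be sketched as a longest-match scan over one table of letter strings. This is a minimal illustration under stated assumptions, not the NRL implementation: the function name phonemicize, the TABLE entries, the phoneme symbols, and the "?" fallback for unmatched letters are all hypothetical, and the context-sensitive conditions that real letter-to-sound rules carry are omitted.

```python
from typing import Dict, List, Tuple

# Hypothetical single table of letter strings. Entries and phoneme symbols
# are illustrative assumptions, not taken from the NRL rule set.
TABLE: Dict[str, str] = {
    "BUIL": "B IH L",  # word fragment
    "ING": "IH NG",    # affix
    "B": "B",
    "D": "D",
    "G": "G",
    "I": "IH",
    "L": "L",
    "N": "N",
    "U": "AH",
}


def phonemicize(word: str, table: Dict[str, str] = TABLE) -> List[Tuple[str, str]]:
    """Scan the word once, left to right, always taking the longest
    table entry that starts at the leftmost unphonemicized character."""
    word = word.upper()
    max_len = max(len(key) for key in table)
    pos = 0
    matches: List[Tuple[str, str]] = []
    while pos < len(word):
        # Try the longest candidate first, then shorter ones.
        for length in range(min(max_len, len(word) - pos), 0, -1):
            chunk = word[pos:pos + length]
            if chunk in table:
                matches.append((chunk, table[chunk]))
                pos += length
                break
        else:
            # Assumed fallback: mark a letter the table does not cover.
            matches.append((word[pos], "?"))
            pos += 1
    return matches


if __name__ == "__main__":
    print(phonemicize("building"))
```

With this toy table, "building" is split into the fragment "BUIL", the single letter "D", and the affix "ING", all found through the same lookup. A MITALK-style design would instead route whole words, affixes, and the remaining letter sequences through separate modules and tables.
-->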