<?xml version="1.0" standalone="yes"?>
<Paper uid="P83-1014">
  <Title>A Finite-State Parser for Use in Speech Recognition</Title>
  <Section position="2" start_page="94" end_page="95" type="abstr">
    <SectionTitle>
2.3 Feature Manipulation
</SectionTitle>
    <Paragraph position="0"> Although &quot;pure&quot; unaugmented finite state grammars may be adequate for speech applications (in the weak generative capacity sense), I may, nevertheless, wish to introduce additional mechanism in order to account for agreement facts in a natural way. As discussed above, the formulation of the homorganic rule in (15) is unattractive because it splits the rule into three cases, one for each place of articulation. It would be preferable to state the agreement constraint just once, by defining a homorganic nasal cluster to be a nasal cluster subject to place assimilation. (Footnote 10: I personally hold a much more controversial position, that finite state grammars are sufficient for most, if not all, natural language tasks [3].)</Paragraph>
    <Paragraph position="1"> In my language of matrix operations, I can say exactly that:
      (16) (setq homorganic-nasal-cluster-lattice (M&amp; nasal-cluster-lattice place-assimilation))
 where M&amp; (element-wise intersection) implements the &quot;subject to&quot; constraint. Nasal-cluster and place-assimilation are defined as:  In this way, M&amp; seems to be an attractive solution to the agreement problem.</Paragraph>
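A minimal sketch of the element-wise intersection in (16), written in Python rather than Lisp. It assumes that a lattice is a square Boolean matrix in which cell (i, j) is set when a phrase of the given category spans positions i to j. That encoding, the helper names make_lattice, m_and and spans_of, and the example spans are all hypothetical illustrations, not details taken from the parser itself.

```python
from typing import List, Tuple

# Assumed encoding: lattice[i][j] is True when a phrase of the given
# category spans from position i to position j.
Lattice = List[List[bool]]


def make_lattice(size: int, spans: List[Tuple[int, int]]) -> Lattice:
    """Build a size x size lattice with True at each (start, end) span."""
    lattice = [[False] * size for _ in range(size)]
    for start, end in spans:
        lattice[start][end] = True
    return lattice


def m_and(a: Lattice, b: Lattice) -> Lattice:
    """Element-wise intersection: a cell survives only if set in both inputs."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("lattices must have the same shape")
    return [[x and y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def spans_of(lattice: Lattice) -> List[Tuple[int, int]]:
    """List the (start, end) spans that are set, in row-major order."""
    return [(i, j) for i, row in enumerate(lattice)
            for j, cell in enumerate(row) if cell]


# Hypothetical spans over five positions (values invented for illustration).
nasal_cluster_lattice = make_lattice(5, [(1, 3), (2, 4)])
place_assimilation = make_lattice(5, [(0, 2), (2, 4)])

homorganic_nasal_cluster_lattice = m_and(nasal_cluster_lattice, place_assimilation)
print(spans_of(homorganic_nasal_cluster_lattice))  # [(2, 4)]
```

Only the span present in both inputs survives, which is the sense in which the intersection states the agreement constraint once instead of once per place of articulation.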
    <Paragraph position="2"> In addition, M&amp; might also shed some light on co-articulation, another problem of 'feature spreading'. Co-articulation (articulation of multiple phonemes at the same time) makes it extremely difficult (perhaps impossible) to segment the speech waveform into phoneme-sized units. To account for co-articulation, Fujimura suggests that place, manner and other articulatory features be thought of as asynchronous processes, which have a certain amount of freedom to overlap in time.</Paragraph>
    <Paragraph position="3"> (18a) &quot;Speech is commonly viewed as the result of concatenating phonetic segments. In most discussions of the temporal structure of speech, a segment in such a model is assumed to represent a phoneme-sized phonetic unit, which possesses an inherent [invariant] target value in terms of articulation or acoustic manifestation. Any deviation from such an interpretation of observed phenomena requires special attention ... [B]ased on some preliminary results of X-ray microbeam studies [which associate lip, tongue and jaw movements with phonetic events in the utterance], it will be suggested that understanding articulatory processes, which are inherently multi-dimensional [and (more or less) asynchronous], may be essential for a successful description of temporal structures of speech.&quot; [9, p. 66] In light of Fujimura's suggestion, I might re-interpret my parser as a highly parallel feature-based asynchronous architecture. For example, the parser can process homorganic nasal clusters by processing place and manner phrases in parallel, and then synchronizing the results at the coda node with M&amp;. That is, (17a) can be computed in parallel with (17b), and then the results are aligned when the coda is computed with (16), as illustrated below for the word tent. Imagine that the front end produces the following analysis:</Paragraph>
    <Paragraph position="4"> (19)
      t  a  n  t
      dental:        |-|  |.....
      vowel:         |-..|
      stop:          |.|  |.....|
      nasalization:  |..|
 where many of the features overlap in an asynchronous way. The parser will correctly locate the coda by intersecting the nasal cluster lattice (computed with (17a)) with the homorganic lattice (computed with (17b)).</Paragraph>
    <Paragraph position="5"> (20)
      t  a  n  t
      nasal cluster:  |.......|
      homorganic:     |.....|
      coda:           |.....|
 This parser is a bold departure from standard practice in two respects: (1) the input stream is feature-based rather than segmental, and (2) the output parse is a heterarchy of overlapping constituents (e.g., place and manner phrases) as opposed to a list of hierarchical parse-trees. I find these two modifications most exciting and worthy of further investigation.</Paragraph>
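The synchronization step in (20) can be pictured as an intersection of time intervals. The sketch below is a simplification under stated assumptions, not the lattice computation of (16): it treats each feature track as a list of half-open frame intervals, and the function name intersect_tracks and the frame numbers are hypothetical. Only the relative widths of the intervals echo (20).

```python
from typing import List, Tuple

# Assumed representation: a feature track is a list of half-open
# (start, end) frame intervals during which the feature is present.
Interval = Tuple[int, int]


def intersect_tracks(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the stretches of time covered by both tracks."""
    overlaps = []
    for a_start, a_end in a:
        for b_start, b_end in b:
            start, end = max(a_start, b_start), min(a_end, b_end)
            if end > start:  # keep only non-empty overlaps
                overlaps.append((start, end))
    return sorted(overlaps)


# Hypothetical frame positions; only the relative widths echo (20).
nasal_cluster = [(5, 12)]
homorganic = [(6, 11)]

coda = intersect_tracks(nasal_cluster, homorganic)
print(coda)  # [(6, 11)]
```

Because each track is computed independently and aligned only at the intersection, the two inputs could in principle be produced in parallel, which is the property the re-interpretation above relies on.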
    <Paragraph position="6"> In summary, two points have been made. First, I suggested the use of parsing techniques at the segmental/feature level in speech applications. Secondly, I introduced M&amp; as a possible solution to the agreement/co-articulation problem.</Paragraph>
  </Section>
</Paper>