<?xml version="1.0" standalone="yes"?>
<Paper uid="C90-2011">
  <Title>An Augmented Chart Data Structure with Efficient Word Lattice Parsing Scheme In Speech Recognition Applications</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2. The Augmented Chart
</SectionTitle>
    <Paragraph position="0"> The conventional chart parsing algorithm was designed to parse a sequence of words. In this section the chart is augmented for parsing word lattices. The purpose is to find, efficiently and accurately, all grammatically valid sentence hypotheses and their sentence structures from a given word lattice, based on a grammar.</Paragraph>
    <Paragraph position="1"> A word lattice W is a partially ordered set of word hypotheses, W = {w_1, ..., w_m}, where each word hypothesis w_i, i = 1, ..., m, is characterized by begin, the beginning point; end, the ending point; cat, the category; phone, the associated phonemes; and name, the word name of the word hypothesis. These word hypotheses are sorted in the order of their ending points; that is, for every pair of word hypotheses w_i and w_j, i &lt; j implies end(w_i) &lt;= end(w_j). Also, two word hypotheses w_i and w_j are said to be connected if there is no other word hypothesis located exactly between the boundaries of the two word hypotheses, i.e., if w_i ≤ w_j and there does not exist any other word hypothesis w_k such that w_i ≤ w_k ≤ w_j, where w_i ≤ w_j iff end(w_i) &lt;= begin(w_j). A sentence hypothesis is then a sequence of connected word hypotheses selected from the given word lattice, and a sentence hypothesis is grammatically valid only if it can be generated by a grammar. As an example, a sample word lattice constructed for demonstration purposes is shown at the top of Fig. 1, in which only the word sequence &quot;Tad does this.&quot; is a valid sentence hypothesis.</Paragraph>
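    <Paragraph> The definitions above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the class WordHypothesis, the functions precedes and connected, and the three-word toy lattice (including its time points, categories and phoneme strings) are assumptions introduced here and are not taken from Fig. 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordHypothesis:
    begin: int   # beginning point
    end: int     # ending point
    cat: str     # category
    phone: str   # associated phonemes
    name: str    # word name


def precedes(a, b):
    """a precedes b when a ends no later than b begins."""
    return b.begin >= a.end


def connected(a, b, lattice):
    """a and b are connected when a precedes b and no other
    word hypothesis of the lattice lies between them."""
    if not precedes(a, b):
        return False
    return not any(
        precedes(a, k) and precedes(k, b)
        for k in lattice
        if k not in (a, b)
    )


# Hypothetical toy lattice (all values invented for illustration).
tad = WordHypothesis(0, 10, "N", "t ae d", "Tad")
does = WordHypothesis(10, 20, "V", "d ah z", "does")
this = WordHypothesis(20, 30, "PRON", "dh ih s", "this")

# Word hypotheses are kept sorted by ending point.
lattice = sorted([this, tad, does], key=lambda w: w.end)

print([w.name for w in lattice])      # ['Tad', 'does', 'this']
print(connected(tad, does, lattice))  # True
print(connected(tad, this, lattice))  # False: "does" lies in between
```

    In this toy lattice "Tad" and "this" are ordered but not connected, because "does" fits exactly between them; a sentence hypothesis would therefore have to pass through "does".</Paragraph>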
    <Paragraph position="2"> The augmented chart is a directed acyclic graph specified by a two-tuple &lt;V, E&gt;, where V is a sequence of vertices and E is a set of edges. Each vertex in V represents an end point of some word hypotheses in the input word lattice.</Paragraph>
    <Paragraph position="4"> The edge set E is divided into three disjoint groups: inactive, active and jump edges. As in a conventional chart, an inactive edge is a data structure representing a completed constituent, while an active edge represents an incomplete constituent which needs some other complete constituents to compose a larger one. A jump edge, however, is a functional edge which links two different edges to indicate their connection relation (described below) and guides the parser to search through all edges connected to each active edge during parsing. The partial ordering relation among the edges in the augmented chart can first be defined according to the order of the boundary vertices. Two edges E_i and E_j are then said to be connected (i.e. EConn(E_i, E_j) = true) only when the end vertex of one of them is the begin vertex of the other, or there exists a jump edge linking them together. For example, in the chart representation of the sample word lattice in Fig. 1 (at the bottom of the figure; the details are explained in the next section), EConn(E_3, E_6) = true due to the existence of Jump3 linking E_3 and E_6, but EConn(E_1, E_6) = false due to E_3 and E_4 existing in between.</Paragraph>
    <Paragraph position="6"> Fig. 1 At the top of this figure is a set of overlapping word hypotheses, assumed to be produced by an acoustic signal processor in speech recognition, where each rectangle denotes the time segment of the acoustic signal for the word hypothesis and above it is the 5-tuple information, from left to right: begin, end, cat, phone and name; in the middle are the sorted word boundary points (wbp's); and at the bottom is the resulting initial chart.</Paragraph>
    <Paragraph position="7"> The jump edge and the new connection relation are the primary difference between the conventional chart and our augmented chart.</Paragraph>
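    <Paragraph> The connection relation EConn can be sketched as below. This is only an illustration under stated assumptions: a jump edge is modelled here as a pair (from vertex, to vertex), and the names Edge, econn and the three sample edges are hypothetical; they do not reproduce the chart of Fig. 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    name: str
    begin_vertex: int
    end_vertex: int


def econn(a, b, jumps):
    """EConn(a, b): one edge ends where the other begins, or a jump edge
    leads from the end vertex of one to the begin vertex of the other."""
    def follows(x, y):
        return (x.end_vertex == y.begin_vertex
                or (x.end_vertex, y.begin_vertex) in jumps)
    return follows(a, b) or follows(b, a)


# Hypothetical chart fragment (not the chart of Fig. 1).
e_a = Edge("Ea", 0, 1)
e_b = Edge("Eb", 1, 2)
e_c = Edge("Ec", 3, 4)
jumps = {(2, 3)}            # one jump edge, from vertex 2 to vertex 3

print(econn(e_a, e_b, jumps))  # True: shared vertex 1
print(econn(e_b, e_c, jumps))  # True: linked by the jump edge
print(econn(e_a, e_c, jumps))  # False: neither condition holds
```

    Without the jump edge, Eb and Ec would not be connected, since they share no vertex; this is the behaviour a conventional chart would show.</Paragraph>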
    <Paragraph position="8">  3. The Mapping from a Word Lattice to the Augmented Chart  Before parsing is performed, any input word lattice has to be mapped to the augmented chart. At the beginning of the mapping procedure, we first have to consider the situation in which additional word hypotheses should be inserted into the input lattice so that no important word in the sentence is missed. A good example of such a situation is shown in Fig. 2, where the time segment for the word hypothesis w_i (the word &quot;same&quot;) is from 10 to 20, and that for w_j (the word &quot;message&quot;) is from 14 to 30. In this situation four cases are all possible: w_i is a correct word but w_j is not; w_j is correct but w_i is not; both w_i and w_j are correct, because they share a common phoneme (m) in the co-articulated continuous acoustic signal; or neither w_i nor w_j is correct. The simple approach used here is to insert two additional word hypotheses, w_i1 (also &quot;same&quot;, but from 10 to 17) and w_j1 (also &quot;message&quot;, but from 17 to 30), into the word lattice W, so that all four cases are properly considered during parsing and no word is missed.</Paragraph>
    <Paragraph position="9"> [Fig. 2: overlapping word hypotheses w_i (&quot;same&quot;) and w_j (&quot;message&quot;)]</Paragraph>
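    <Paragraph> The insertion step for the "same"/"message" example can be sketched as follows. The text gives only the resulting time segments (10 to 17 and 17 to 30), not the rule for choosing the common boundary; the sketch assumes the midpoint of the overlapping interval, which happens to reproduce the example. The function name insert_split_hypotheses and the tuple layout (begin, end, name) are likewise assumptions.

```python
def insert_split_hypotheses(wi, wj, split_point=None):
    """Given two partially overlapping hypotheses (begin, end, name),
    return the two additional hypotheses that meet at a common point.

    Assumption: when no split point is given, the midpoint of the
    overlapping interval is used. The source text does not state this rule.
    """
    (bi, ei, name_i), (bj, ej, name_j) = wi, wj
    # wj must start inside wi and end after wi does.
    partially_overlapping = bj > bi and ei > bj and ej > ei
    if not partially_overlapping:
        return []
    if split_point is None:
        split_point = (bj + ei) // 2
    return [(bi, split_point, name_i), (split_point, ej, name_j)]


same = (10, 20, "same")
message = (14, 30, "message")
extra = insert_split_hypotheses(same, message)
lattice = [same, message] + extra   # originals are kept, copies are added

print(extra)  # [(10, 17, 'same'), (17, 30, 'message')]
```

    The original hypotheses stay in the lattice, so the parser can still choose either word alone, both words (through the inserted copies), or neither.</Paragraph>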
    <Paragraph position="11"> After the additional word hypotheses have been inserted, every boundary point (either beginning or ending) of any word hypothesis of W is mapped to a vertex in the chart. All these word boundary points (wbp's) first have to be sorted into an ordered sequence (indicated by a function Order(x), where x is any wbp), defined as follows. For any pair of wbp's x and y, if x and y are distinct in time then their order is their order in time; if x and y are identical in time then the beginning wbp (denoted by b) is after the ending wbp (denoted by e). For each wbp x, the corresponding vertex is then assigned depending on its preceding wbp y, as described below.</Paragraph>
    <Paragraph position="12"> As shown in Fig. 3, of the four possible cases of x and y, i.e. bb (y is a beginning wbp and x is also a beginning wbp), be, eb and ee, only in the case be (y is a beginning wbp but x an ending wbp) should two different vertices be assigned to x and y, to preserve the ordering relation between the corresponding word hypotheses of x and y. In the other three cases, x and y can be given the same vertex. Let the function Vertex(x) denote this assignment.</Paragraph>
    <Paragraph position="13"> Fig. 3 Vertex assignment of the word boundary points</Paragraph>
    <Paragraph position="15"> Now, for each word hypothesis w_i, an initial inactive edge can be constructed. The function Edge(w_i) for a word hypothesis w_i is specified exactly by the two vertices assigned to the two wbp's of w_i, i.e. Edge(w_i) = &lt;Vertex(begin(w_i)), Vertex(end(w_i))&gt;. Finally, for any pair of vertices v_i and v_j, if there is no complete initial inactive edge between them, a jump edge from v_i to v_j is constructed to link v_i and v_j. Fig. 1 also shows the result of applying the above procedure to the sample word lattice. The sorted wbp's (each specified by its position on the time scale and whether it is a beginning or ending wbp) are in the middle of the figure, and the resulting initial chart is at the bottom. It can be shown that the above mapping procedure has the following useful properties: first, the ordering and connection relations among all word hypotheses in the word lattice are completely preserved among the corresponding edges in the augmented chart; second, when the input word lattice reduces to a simple sequence of word hypotheses, the augmented chart representation also reduces to a conventional chart representation.</Paragraph>
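    <Paragraph> The functions Order, Vertex and Edge described in this section can be sketched as follows. This is a minimal reading of the text, not the authors' code: the function names, the tuple layout (begin, end, name) and the numbering of vertices from 0 are assumptions, and the construction of jump edges is left out.

```python
def sorted_boundary_points(lattice):
    """Order(x): sort wbp's by time; at equal times an ending wbp ('e')
    comes before a beginning wbp ('b')."""
    points = set()
    for begin, end, _name in lattice:
        points.add((begin, "b"))
        points.add((end, "e"))
    return sorted(points, key=lambda p: (p[0], 0 if p[1] == "e" else 1))


def assign_vertices(points):
    """Vertex(x): open a new vertex only in case 'be', i.e. when the
    preceding wbp is a beginning and the current one is an ending."""
    vertex = {}
    current = 0
    previous_kind = None
    for point in points:
        kind = point[1]
        if previous_kind == "b" and kind == "e":
            current += 1
        vertex[point] = current
        previous_kind = kind
    return vertex


def initial_edges(lattice):
    """Edge(w): the pair (Vertex(begin(w)), Vertex(end(w))) for each word."""
    vertex = assign_vertices(sorted_boundary_points(lattice))
    return [(name, vertex[(begin, "b")], vertex[(end, "e")])
            for begin, end, name in lattice]


# The "same"/"message" example of this section, including the two
# inserted hypotheses; the tuple layout is an assumption.
lattice = [(10, 20, "same"), (14, 30, "message"),
           (10, 17, "same"), (17, 30, "message")]
for edge in initial_edges(lattice):
    print(edge)
# ('same', 0, 2)
# ('message', 0, 2)
# ('same', 0, 1)
# ('message', 1, 2)
```

    The two original hypotheses become parallel edges over the same pair of vertices, while the two inserted hypotheses form a chain through vertex 1. For a plain word sequence such as [(0, 10, "Tad"), (10, 20, "does"), (20, 30, "this")] the same functions yield the edges (0, 1), (1, 2) and (2, 3), i.e. a conventional chart.</Paragraph>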
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4. The Augmented Chart Parsing and Some Further Extensions
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Further Extensions
</SectionTitle>
      <Paragraph position="0"> The fundamental principle of chart parsing is: whenever an active edge A is connected to an inactive edge I which satisfies A's conditions for extension, a new edge N covering both is built. In augmented chart parsing this principle still holds, except that the inactive edge I does not have to share a vertex with the active edge A; it can be separated from A, as long as there exists a jump edge linking edges A and I.</Paragraph>
      <Paragraph position="1"> The augmented chart parsing scheme proposed here is useful and efficient not only in rule-based grammar applications but also in other applications such as a lexicalized grammar (e.g., HPSG (Pollard, 1987)), in which the syntactic relationships are stated as part of the lexical description. In the augmented chart, the structures assigned to the input may then be extended to attribute-value matrices (complex feature structures) instead of syntactic parse trees, and the recognition algorithm may rely on the head-driven slot-and-filler principle instead of derivation-oriented recognition.</Paragraph>
      <Paragraph position="2"> Such an extension is in fact straightforward.</Paragraph>
      <Paragraph position="3"> Furthermore, in some other approaches to increase the flexibility of the slot and filler principle, such as island parsing (Stock, 1988) and discontinuous segmented parsing (Hellwig, 1988), the augmented chart proposed here can also be easily extended and applied.</Paragraph>
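      <Paragraph> The fundamental principle stated at the beginning of this section, extended with jump edges, can be sketched as follows. The sketch makes several assumptions that are not in the text: an active edge is represented by the tuple of categories it still needs, a jump edge is a pair (from vertex, to vertex), and the names ChartEdge and extend are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChartEdge:
    begin: int
    end: int
    category: str
    needed: tuple = ()   # categories still required; empty means inactive


def extend(active, inactive, jumps):
    """Combine an active edge with an inactive edge that follows it,
    either directly or across a jump edge. Return the new edge or None."""
    if not active.needed or inactive.needed:
        return None                      # wrong kinds of edges
    reachable = (active.end == inactive.begin
                 or (active.end, inactive.begin) in jumps)
    if not reachable or active.needed[0] != inactive.category:
        return None
    return ChartEdge(active.begin, inactive.end,
                     active.category, active.needed[1:])


# Hypothetical edges: an S that still needs a VP, and a VP that starts
# two vertices later, reachable only through a jump edge.
s_needs_vp = ChartEdge(0, 1, "S", ("VP",))
vp = ChartEdge(2, 3, "VP")

print(extend(s_needs_vp, vp, jumps={(1, 2)}))
# ChartEdge(begin=0, end=3, category='S', needed=())
print(extend(s_needs_vp, vp, jumps=set()))   # None
```

      With an empty set of jump edges the function behaves like the conventional rule, which requires the two edges to share a vertex.</Paragraph>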
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
5. Some Preliminary Experimental Results
</SectionTitle>
    <Paragraph position="0"> In order to see how the above concept of augmented chart parsing works, a bottom-up, left-to-right parser based on the proposed augmented chart (also capable of performing conventional chart parsing) has been implemented and tested in some preliminary experiments. The test database includes a large number of Chinese word lattices obtained from an acoustic signal processor which recognizes Mandarin speech. Due to the large number of homonyms in the Chinese language and the uncertainty and errors in speech recognition, a very high degree of lexical ambiguity exists in the input lattices. One example of such a Chinese word lattice is shown in Fig. 4. The results show that all possible constituents for the input word lattice can be constructed and that no constituent needs to be built more than once using augmented chart parsing. According to the experimental results, the edge reduction ratio (the ratio of the total number of edges built in augmented chart parsing to the total number of edges built in conventional chart parsing) is on the order of 1/30 to 1/80 for our input Chinese word lattices.</Paragraph>
    <Paragraph position="1"> Although this ratio depends strongly on the degree of ambiguity of the input word lattices, the computational complexity can always be reduced significantly.</Paragraph>
    <Paragraph position="2"> Fig. 4 An example in Mandarin Chinese is given here. It is obtained from the Chinese sentence utterance: ni-3 'you' shr-4 'are' yi-2 'a' jia-4 'set' huei-4 'can' tieng-1 'listen to' guo-2 iu-3 'Mandarin' de-5 'which' dian-4 nan-3 'computer' (you are a computer which can listen to Mandarin), where the syllables are represented in Mandarin Phonetic Symbols II (MPS-II) with the integers (1 to 5) indicating the tone. The possible word hypotheses are shown above, where the horizontal axis denotes the time ordering of the syllables and the vertical scale shows the corresponding word hypotheses for the syllables, in which only those denoted by &quot;*&quot; are correct words. In this example all the syllables are actually clearly identified and correctly recognized, and therefore all word hypotheses are in fact well aligned in boundaries, except that two syllables (the first syllable ni-3 and the sixth syllable tieng-1) are confused by a second candidate (li-3 and tiang-1, respectively).</Paragraph>
    <Paragraph position="3"> Therefore the ambiguity is primarily due to the large number of homonyms in the Chinese language. The line segments under each word hypothesis indicate whether the word hypothesis is composed of one or two syllables. In our analysis, as many as 470 sentence hypotheses are obtained from this example word lattice with most syllables correctly recognized, and the experimental results show that for this example a total of 58132 edges have to be built in conventional chart parsing, while only 925 edges are necessary in augmented chart parsing. The edge reduction ratio for this example is 1/62.8.</Paragraph>
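    <Paragraph> The reported ratio for this example can be checked directly from the two edge counts given above. The variable names in this short sketch are illustrative only.

```python
conventional_edges = 58132   # edges built in conventional chart parsing
augmented_edges = 925        # edges built in augmented chart parsing

reduction_factor = conventional_edges / augmented_edges
print(f"edge reduction ratio = 1/{reduction_factor:.1f}")
# edge reduction ratio = 1/62.8
```

    The value lies within the range of 1/30 to 1/80 reported for the full set of test lattices.</Paragraph>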
  </Section>
</Paper>