<?xml version="1.0" standalone="yes"?>
<Paper uid="C90-3042">
  <Title>Bi-directional LR Parsing from an Anchor Word for Speech Recognition</Title>
  <Section position="3" start_page="0" end_page="13" type="metho">
    <SectionTitle>
2. Background: Generalized LR Parsing
</SectionTitle>
    <Paragraph position="0"> The LR parsing technique was originally developed for compilers of programming languages and has been extended for Natural Language Processing [8]. LR parsing analyzes the input sequence from left to right with no backtracking by consulting a parsing table constructed in advance from the context-free grammar rules. An example grammar and its parsing table are shown in Figure 2-1 and Figure 2-2 respectively.</Paragraph>
    <Paragraph position="1"> Entries &quot;s n&quot; in the action table (the left part of the table) indicate the action &quot;shift one word from the input buffer onto the stack and go to state n&quot;. Entries &quot;r n&quot; indicate the action &quot;reduce constituents on the stack using rule n&quot;. The entry &quot;acc&quot; stands for the action &quot;accept&quot;, and blank spaces represent &quot;error&quot;. '$' in the action table is the end-of-input symbol. The goto table (the right part of the table) decides to which state the parser should go after a reduce action.</Paragraph>
    <Paragraph position="2"> The LR parsing table in Figure 2-2 differs from the regular LR tables used by compilers of programming languages in that there are multiple entries, called conflicts, in the row of state 9. As long as the encountered entry has only one action, parsing proceeds exactly as in normal LR parsing. When an entry contains multiple actions, the parser executes all of them using the graph-structured stack [8]. The bi-directional GLR parsing method begins at an arbitrary spot of the input, while conventional GLR parsing analyzes the input sequence only from left to right.</Paragraph>
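    <Paragraph> As a minimal sketch of how the action-table notation above can be read, the following Python fragment decodes one table cell into a list of actions. The function names and the comma-separated encoding of a conflicting cell are assumptions made for illustration; the actual contents of Figure 2-2 are not reproduced here.</Paragraph>
    <Paragraph><![CDATA[
```python
def decode_action(entry):
    """Decode one action-table entry such as "s 6", "r 2", "acc" or blank."""
    entry = entry.strip()
    if not entry:
        return ("error", None)      # blank space in the table means error
    if entry == "acc":
        return ("accept", None)
    kind, number = entry.split()
    if kind == "s":
        return ("shift", int(number))    # shift, then go to state `number`
    if kind == "r":
        return ("reduce", int(number))   # reduce using rule `number`
    raise ValueError("unknown action entry: " + entry)


def decode_cell(cell):
    """Decode a whole cell; several comma-separated entries form a conflict.

    The comma-separated format is an assumption of this sketch.
    """
    return [decode_action(part) for part in cell.split(",")]


# Hypothetical cells, not taken from Figure 2-2.
print(decode_cell("s 6"))
print(decode_cell("s 5, r 3"))
```
]]></Paragraph>
    <Paragraph> A cell that decodes to more than one action is a conflict; a GLR parser would follow every action in the returned list rather than choosing one.</Paragraph>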
    <Paragraph position="3">
      (1) S --&gt; NP VP
      (2) NP --&gt; n
      (3) NP --&gt; NP PP
      (4) VP --&gt; v NP
      (5) PP --&gt; p NP</Paragraph>
    <Paragraph position="5"> In this section we describe the bi-directional GLR parsing algorithm and an example of parsing a word lattice.</Paragraph>
    <Paragraph position="6"> 3.1. Reverse LR table
      Bi-directional GLR parsing uses a reverse LR table in addition to a standard LR table. The reverse LR table is constructed from the context-free grammar in which the order of the right-hand-side symbols is reversed in each rule. For example, the grammar in Figure 3-1 is the set of reverse rules built from the example grammar in Figure 2-1. Its parsing table (Figure 3-2), which is a reverse LR table, is constructed from the reversed grammar in Figure 3-1.</Paragraph>
    <Paragraph position="8">
      (1) S --&gt; VP NP
      (2) NP --&gt; n
      (3) NP --&gt; PP NP
      (4) VP --&gt; NP v
      (5) PP --&gt; NP p</Paragraph>
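    <Paragraph> The construction of the reversed grammar can be sketched in a few lines of Python. The tuple representation of a rule and the names GRAMMAR, reverse_grammar and format_rule are assumptions made for this illustration; only the five rules themselves come from the example grammar.</Paragraph>
    <Paragraph><![CDATA[
```python
# Example grammar as (rule number, left-hand side, right-hand side).
GRAMMAR = [
    (1, "S", ["NP", "VP"]),
    (2, "NP", ["n"]),
    (3, "NP", ["NP", "PP"]),
    (4, "VP", ["v", "NP"]),
    (5, "PP", ["p", "NP"]),
]


def reverse_grammar(rules):
    """Return the same rules with each right-hand side reversed."""
    return [(num, lhs, list(reversed(rhs))) for num, lhs, rhs in rules]


def format_rule(rule):
    """Render a rule in the notation used in the figures."""
    num, lhs, rhs = rule
    return "({}) {} --> {}".format(num, lhs, " ".join(rhs))


REVERSED = reverse_grammar(GRAMMAR)
for rule in REVERSED:
    print(format_rule(rule))
```
]]></Paragraph>
    <Paragraph> Rule numbers are preserved by the reversal, which is what allows a reduce action in one direction to be matched with the reduce action of the same number in the other direction.</Paragraph>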
    <Paragraph position="10"> Here we describe the algorithm for parsing the lattice starting from an anchor symbol and expanding in both left and right directions.</Paragraph>
    <Paragraph position="11">  Parsing Procedure:
      1. Choose the anchor symbol A from the lattice.
      2. Because A is a terminal symbol, the initial state(s) are determined from the action table. Note that only the states in which the shift action(s) are performed are valid. There are two kinds of starting states:
         * initial states for left-to-right parsing from the standard LR table
         * initial states for right-to-left parsing from the reverse LR table
         Start GLR parsing from the initial states in both directions independently until a reduce action is suspended for lack of the constituents it needs. (Since the parsing starts in the middle of the input, this can happen unless A is located on the edge of the lattice.) The standard LR table is used when the parse proceeds from left to right, and the reverse LR table is used when the parse proceeds in the opposite direction.</Paragraph>
    <Paragraph position="12"> 3. Perform the suspended reduce action when the reduce action with the same rule number is ready from the other direction.</Paragraph>
    <Paragraph position="13"> Here we show how this procedure works in parsing the lattice in Figure 3-3 using the grammars and the tables in Figures 2-1, 2-2, 3-1 and 3-2. In parsing a lattice, a juncture verifier JUNCT(Wi, Wj) must be provided, which returns TRUE if Wi and Wj can abut.</Paragraph>
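    <Paragraph> The text does not say how the juncture verifier decides whether two words can abut. One possible sketch, assuming that each lattice word carries start and end frame indices and that a small tolerance on the boundary is acceptable, is given below. The LatticeWord class, its fields, the junct function and the tolerance value are all hypothetical.</Paragraph>
    <Paragraph><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LatticeWord:
    label: str   # terminal symbol, e.g. "n", "v" or "p"
    start: int   # first frame of the word (assumed field)
    end: int     # frame just after the word (assumed field)


def junct(left, right, tolerance=2):
    """Return True if `right` can directly follow `left`.

    Assumed criterion: the end of `left` and the start of `right`
    differ by at most `tolerance` frames (small gap or small overlap).
    """
    return abs(right.start - left.end) <= tolerance


# Hypothetical frame positions, for illustration only.
w_n = LatticeWord("n", 0, 10)
w_v = LatticeWord("v", 11, 20)
w_far = LatticeWord("n", 30, 40)
print(junct(w_n, w_v), junct(w_v, w_far))
```
]]></Paragraph>
    <Paragraph> Any predicate with the same two-argument shape could be substituted; the parser only needs a TRUE or FALSE answer before it shifts a neighboring word.</Paragraph>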
    <Paragraph position="15"> First we choose the most probable word from the lattice, i.e. W-2 (v). The standard LR table indicates that v is expected at the states 2, 3, 8, and 9. Only state 3 is valid, because the other states require reduce actions, which need previous words. Thus the parse starts from state 3.</Paragraph>
    <Paragraph position="16"> The current word v is shifted and the next state 6, which is expecting n, is determined. Figure 3-4 shows this situation.</Paragraph>
    <Paragraph position="17"> We consult the reverse LR table in the same way.</Paragraph>
    <Paragraph position="18"> Namely, the right-to-left parse starts from state 2 and the next state _7 is decided after v is shifted. (Figure 3-5. State numbers and the expected terminals for the left-bound parsing are written in italic font with underscore bars.) Here we perform the right-to-left parse first. State 7 is ready for the reduce action 4 by n. But the action &quot;reduce 4&quot; cannot be performed now, even on the assumption that JUNCT(W-1, W-2) returns TRUE, because the current stack does not contain enough constituents for the reduction. That means the reduce action 4 is suspended until the left-to-right parsing is ready for the reduce action 4.</Paragraph>
    <Paragraph position="19"> Therefore we proceed with the right-bound parsing now.</Paragraph>
    <Paragraph position="20"> W-3 (n) is expected by state 6. On the assumption that JUNCT(W-2, W-3) returns TRUE, n is shifted and the new state 2 is determined from the left-to-right action table (Figure 3-6).</Paragraph>
    <Paragraph position="21"> The new state 2 is ready for the reduce action 2 (NP --&gt; n) by v, p, $. On the assumption that JUNCT(W-3, W-4) returns TRUE, this reduce action is performed. The left-to-right goto table indicates that the new state is 10. [...] shifted and the new state 5 is determined (Figure 3-8).</Paragraph>
    <Paragraph position="22"> The parse continues in this way (Figure 3-9 - Figure 3-12).</Paragraph>
    <Paragraph position="23"> In Figure 3-12 the new state 10 is ready for the reduce</Paragraph>
    <Paragraph position="25"> the action &quot;reduce 4&quot; is performed. The next state 7 is also ready for the reduce action by $. But this reduce action (S --&gt; NP VP) is interrupted because the parsing stack does not have enough constituents. At this point the suspended right-to-left parse can be resumed because the suspended action &quot;reduce 4&quot; is done. The new state number 5 is determined from the right-to-left goto table (Figure 3-13). The first word W-1 is expected by state _5. On the assumption that JUNCT(W-1, W-2) returns TRUE, W-1 is shifted and the new state number 3 is determined from the reverse LR table (Figure 3-14). The new state _3 is ready for the reduce action by v, p and [...] State 10 is ready for the reduce action by $. Thus the action &quot;reduce 1 (S --&gt; VP NP)&quot; is performed, which indicates that the suspended left-to-right action &quot;reduce 1&quot; is also done. (Figure 3-16 shows the end of parsing.)</Paragraph>
    <Section position="1" start_page="13" end_page="13" type="sub_section">
      <SectionTitle>
3.3. Bi-directional GLR from Multiple Anchors
</SectionTitle>
      <Paragraph position="0"> In the previous example we considered a parse from one anchor word. Bi-directional GLR parsing can also be started from more than one word, in the following way.</Paragraph>
      <Paragraph position="1"> [1] Provide each word with its starting states for both right-bound and left-bound parsing from the action tables.</Paragraph>
      <Paragraph position="2"> [2] Start bi-directional GLR parsing from each word in parallel.</Paragraph>
      <Paragraph position="3"> [3] At the reached state s_i, check whether any nonterminal that s_i expects according to the goto table [along the row of state s_i under the column labeled with the nonterminal symbol] already exists. (Since parsing proceeds in parallel, the nonterminal may have been created already.) If JUNCT(current-word, previously-created-nonterminal) returns TRUE, shift this nonterminal onto the current word in just the same way as the standard &quot;shift action&quot; for terminals. Note that this &quot;nonterminal shift action&quot; does not prevent the regular shift/reduce/accept actions at state s_i (see footnote 2).
        3.4. Parsing Words in Order of Probability
        In the previous section we showed that parsing can start from multiple anchors. This ensures that the parse can start from any word in any order. This parsing method is very suitable for speech recognition, because parsing can proceed in the order of probability of each word in the lattice.</Paragraph>
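      <Paragraph> Visiting the lattice words in order of probability can be sketched with a priority queue. The following Python fragment is only an illustration: the probabilities are invented (apart from W-2 being the most probable word, as in the example), and the names LATTICE and words_by_probability are assumptions.</Paragraph>
      <Paragraph><![CDATA[
```python
import heapq


def words_by_probability(lattice):
    """Yield (word_id, probability) pairs, most probable word first."""
    # heapq is a min-heap, so probabilities are negated to pop the largest.
    heap = [(-prob, word_id) for word_id, prob in lattice.items()]
    heapq.heapify(heap)
    while heap:
        neg_prob, word_id = heapq.heappop(heap)
        yield word_id, -neg_prob


# Invented scores for the five words of the example lattice.
LATTICE = {"W-1": 0.70, "W-2": 0.95, "W-3": 0.80, "W-4": 0.40, "W-5": 0.60}

ORDER = [word_id for word_id, _ in words_by_probability(LATTICE)]
print(ORDER)
```
]]></Paragraph>
      <Paragraph> Each word popped from the queue would serve as a new anchor, or would be shifted by a neighboring partial parse if the juncture verifier allows it.</Paragraph>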
      <Paragraph position="4"> 3.5. Parsing Incomplete Lattice
        In the previous example the lattice contained every necessary word. If the lattice is complete, the generalized LR parsing method suffices [9]. It is often the case, however, that some words are missing in the output from the speech recognizer. In an attempt to use the generalized LR parsing technique for parsing an incomplete lattice [6] or for parsing a noisy input sequence [5], all possibly viable symbols are checked. In particular, handling missing symbols in the early stage of parsing requires a lot of search. Bi-directional GLR parsing can handle missing words more elegantly, in that only highly plausible missing candidates are explored, as follows.</Paragraph>
      <Paragraph position="5"> Suppose W-4 (&quot;p&quot;) is missing from the lattice in Figure 3-3 (see footnote 3). In parsing the lattice in the order of probability, the parse is suspended after W-3 is shifted. At this moment the left-to-right parsing is expecting &quot;p&quot; as the word following W-3, and the right-to-left parsing is expecting &quot;p&quot; as the word preceding W-5. Therefore we can confidently predict that &quot;p&quot; is missing between W-3 and W-5.</Paragraph>
      <Paragraph position="6"> Footnote 2: In practice, however, regular shift actions do not have to be performed in many cases, because the previously created nonterminals are likely to have a high score, given that the parse starts with anchor symbols. This heuristic can reduce search.</Paragraph>
      <Paragraph position="7"> Footnote 3: Function words such as prepositions and articles are likely to be missing in speech recognition.</Paragraph>
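      <Paragraph> The prediction of a missing word from both sides of a gap amounts to intersecting two sets of expected terminals. A minimal Python sketch follows; the function name predict_missing and the example sets are assumptions, chosen to mirror the case where both directions expect &quot;p&quot;.</Paragraph>
      <Paragraph><![CDATA[
```python
def predict_missing(expected_after_left, expected_before_right):
    """Return the terminals that both parse directions expect in the gap.

    expected_after_left:   terminals the left-to-right parse can shift next
    expected_before_right: terminals the right-to-left parse can shift next
    """
    common = set(expected_after_left) & set(expected_before_right)
    return sorted(common)


# Illustrative expectation sets, not read from the actual tables.
print(predict_missing({"p"}, {"p", "n"}))
print(predict_missing({"v"}, {"n"}))
```
]]></Paragraph>
      <Paragraph> An empty result would mean that no single word can fill the gap, which corresponds to the case where more than one word is missing.</Paragraph>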
      <Paragraph position="8"> If more than one word is missing in the gap, the problem can be solved by tentatively creating the expected dummy words from one side, or from both the left side and the right side. A top-down speech input verifier that checks the likelihood of dummy words should be incorporated, because the search may grow significantly if dummy words are created indiscriminately.</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="13" end_page="13" type="metho">
    <SectionTitle>
4. Parsing Noisy Speech Input
</SectionTitle>
    <Paragraph position="0"> Saito et al. implemented a system that parses noisy speech input [15]. In that system the parser analyzes the phoneme sequence from left to right while exploring the possibilities of substituted, inserted, and missing phonemes.</Paragraph>
    <Paragraph position="1"> Consequently a much larger search was required than in conventional text parsing, so the efficient GLR parsing technique was adopted. Since the parse proceeds strictly from left to right, pruning the low-scored partial parses, it is sometimes hard to parse speech input whose beginning part is very noisy. For example, the speech input &quot;ROEAIBIGAZUZIQKISURU&quot; (the correct phoneme sequence is &quot;OYAYUBIGAZUKIZUKISURU&quot;, which means &quot;I have a burning pain in the thumb.&quot;) cannot be parsed correctly by the GLR parser because of the noisy initial part. To apply the bi-directional parsing technique to this problem, we need to make a word lattice from the phoneme sequence, because
      * The current speech recognition device [3] does not give us the probability of each phoneme in the sequence.</Paragraph>
    <Paragraph position="2"> * A single phoneme is too primitive to be an anchor symbol.</Paragraph>
    <Paragraph position="3"> The word lattice built from the phoneme sequence &quot;ROEAIBIGAZUZIQKISURU&quot; is shown in Figure 4-1. This lattice clearly shows that the correct parse</Paragraph>
  </Section>
  <Section position="5" start_page="13" end_page="13" type="metho">
    <SectionTitle>
</SectionTitle>
    <Paragraph position="0"> We tested 125 sentences (5 speakers each spoke 25 sentences) in the domain of doctor-patient conversation. 111 sentences were parsed correctly by the regular GLR method (recognition rate: 89.6 %). 6 more sentences were parsed correctly by the bi-directional parsing of the word lattice (recognition rate: 93.6 %). The remaining 8 sentences were very badly pronounced, with content words missing. For these it is necessary to ask the speaker to say the sentence again or to repeat only the unclear portion.</Paragraph>
  </Section>
</Paper>