<?xml version="1.0" standalone="yes"?>
<Paper uid="H90-1003">
  <Title>Efficient, High-Performance Algorithms for N-Best Search</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3. Forward-Backward Search
</SectionTitle>
    <Paragraph position="0"> The time-synchronous beam search follows a large number of theories on the off chance that they will get better during the remainder of the sentence. Typically, we must keep over 1000 theories to guarantee finding the highest answer.</Paragraph>
    <Paragraph position="1"> In some sense the computation for all but one answer will have been wasted.</Paragraph>
    <Paragraph position="2"> We need a way to speed up the beam search without causing search errors. We could prune out most of the choices if we only knew the correct answer ahead of time or if we could look ahead at the remainder of the sentence. Several papers have described fast match schemes that look ahead (incurring a delay) to determine which words are likely (e.g. \[4\]). The basic idea is to perform some approximate match that can be used to eliminate most of the possible following words. However, since we cannot tell when words end in continuous speech, the predictions of the score for each word is quite approximate. In addition, even if a word matches well we cannot tell whether the remainder of the sentence will be consistent with that word without looking further ahead and incurring a longer delay.</Paragraph>
    <Paragraph position="3"> Let us consider the time-synchronous forward pass. The score at any given state and time at(s) is the probability of the input up to time t, summed over all of the paths that get to state s at t. When these scores are normalized they give the relative probability of paths ending at this state as opposed to paths ending at any other state. These forward pass probabilities are the ideal measure to predict which theories in a backward search are expected to score well.</Paragraph>
    <Paragraph position="4"> Figure 6 illustrates several paths from the beginning of an utterance to different states at time t, and several theories from the end of the utterance T backward to time t. From  where 7t(s) is the probability of the data given all paths through state s, divided by the probability of the data for all paths, which is the probability that slate s is appropriate at time t. aT is derived from the forward pass. Of course if we have already gone through the whole utterance in the forward direction we already know the most likely sentence.</Paragraph>
    <Paragraph position="5"> Now let us consider a practical Forward-Backward Search algorithm. First we perform a forward pass over the whole utterance using a simplified acoustics or language model. In each fran~ we save the highest forward probability and the probabilities of all words that have ending scores above the pruning beamwidth. Typically this includes about 20 words in each frame. Then we perform a search in the backward direction. This search uses the normal beam search within words. However, whenever a score is about to be transfered backwards through the language model into the end of a word we first check whether that word had an ending score for that frame in the forward pass. That is we ask, &amp;quot;Was there a reasonable path from the beginning of the utterance to this time ending with this word?&amp;quot; Again, referring to Figure 6, the backward theory that is looking for word  ward scores for the same state and time are added to predict final score for each theory extension.</Paragraph>
    <Paragraph position="6"> d cannot find any corresponding forward score, and so is aborted. When there is a score, as in the cases for words a,b,c, then we multiply the present backward score of the theory,/3t(s) by the forward pass score for this word; at(s), divided by the whole sentence score, aT. Only if this ratio is greater than the pruning beamwidth do we extend the theory backwards by this word. For example, although the backward theory looking for word c has a good score, the corresponding forward score c' is not good, and the product may be pruned out.</Paragraph>
    <Paragraph position="7"> The Forward-Backward search is only useful ff the forward pass is faster than the backward would have been. This can be true if we use a different grammar, or a less expensive acoustic model. If the forward acoustic models or language model is different than in the backward pass, then we must reestimate txa, before using it in the algorithm above. For simplicity we estimate txT at each time t as at(t) = max at(s) maxB (s) the product of the maximum state scores in each direction.</Paragraph>
    <Paragraph position="8"> (Note that since the two maxima are not necessarily on the same state it would be more accurate to use</Paragraph>
    <Paragraph position="10"> forcing the two states to be the same. However, since most of the active states are internal to words, this would require a large computation and also require that we had stored all of the state scores in the forward direction for every time.) We observe that the average number of active phoneme arcs in the backward direction is reduced by a factor of 40 (e.g. from. 800 to 20) - with a corresonding reduction in computation and with no increase in search errors.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Uses of Forward-Backward Search
</SectionTitle>
      <Paragraph position="0"> As stated above, this algorithm is only useful when the forward pass can be computed differently (much more quicldy) than the backward (real) search. For example, we could use a null grammar in the forward direction and a more complex grammar in the backward search. We have used this extensively in our past work with very large RTN grammars or high-order statistical grammars \[7\]. When no grammar is used in the forward pass we can compact the entire dictionary into a phonetic tree, thereby greatly reducing the computation for large dictionaries.</Paragraph>
      <Paragraph position="1"> A variation on the above use is to use a simpler acoustic model in the forward direction. For example restricting the model to triphones within words, using simpler HMM topologies, etc.</Paragraph>
      <Paragraph position="2"> A second use is for real-time computation of the N Best sentences \[1\]. First we perform a normal 1-Best search forward. The best answer can be processed by NL immediately (on another processor) while we perform the N-Best search backwards. We find that the backward N-Best search is sped up by a factor of 40 when using the forward pass scores for pruning. Thus the delay until we have the remainder of the answers is usually quite short. If the delay is less than the time required to process the first answer through NL, then we have lost no time.</Paragraph>
      <Paragraph position="3"> Finally, we can use the Forward-Backward Search to greatly reduce the time needed for experiments. Experiments involving expensive decoding conditions can be reduced from days to hours. For example all of the experirnents with the Word-Dependent and Lattice N-Best algorithms were performed using the Forward-Backward Search. \</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="0" end_page="10" type="metho">
    <SectionTitle>
4. Conclusion
</SectionTitle>
    <Paragraph position="0"> We have considered several approximations to the exact Sentence-Dependent N-Best algorithm, and evaluated them thoroughly. We show that an approximation that only separates theories when the previous words are different allows a significant reduction in computation, makes the algorithm scalable to long sentences and less susceptable to pruning errors, and does not increase the search errors measurably.</Paragraph>
    <Paragraph position="1"> In contrast, the Lattice N-Best algorithm, which is still less expensive, appears to miss twice as many sentences within the N-Best choices.</Paragraph>
    <Paragraph position="2"> We have introduced a new two-pass search strategy called the Forward-Backward Search, which is generally applicable to a wide range of problems. This strategy increases the speed of the recognition search by a factor of 40 with no additional pruning errors observed.</Paragraph>
  </Section>
class="xml-element"></Paper>
Download Original XML