<?xml version="1.0" standalone="yes"?>
<Paper uid="C92-3143">
  <Title>COUPLING AN AUTOMATIC DICTATION SYSTEM WITH A GRAMMAR CHECKER</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
COUPLING AN AUTOMATIC DICTATION SYSTEM WITH A GRAMMAR CHECKER
</SectionTitle>
    <Paragraph position="0"> Jean-Pierre CHANOD, Marc EL-BEZE, Sylvie GUILLEMIN-LANNE IBM France, Paris Scientific Center Automatic dictation systems (ADS) are nowadays powerful and reliable. However, some inadequacies of the underlying models still cause errors. In this paper, we are essentially interested in the language model implemented in the linguistic component, and we leave aside the acoustic module. More precisely, we aim at improving this linguistic model by coupling the ADS with a syntactic parser able to diagnose and correct grammatical errors.</Paragraph>
    <Paragraph position="1"> We describe the characteristics of such a coupling, and show how the performance of the ADS improves with the actual coupling realized for French between the Tangora ADS and the grammar checker developed at the IBM France Scientific Center.</Paragraph>
    <Paragraph position="2"> Description of the Tangora system The Tangora system is implemented on a personal computer, IBM PS/2 or IBM RS/6000. A vocal I/O card is added, as well as a specialized card equipped with two micro-processors, which provide the power needed for the decoding algorithms. The programs are written in assembly or C.</Paragraph>
    <Paragraph position="3"> The multi-lingual aspect of the Tangora system (DeGennaro 91) constitutes a major asset. Indeed, it was initially conceived for English (Averbuch, 87) by the F. Jelinek team (IBM T. J. Watson Research Center), but it has since been adapted to process Italian, German and French inputs. As a whole, the average error rate is close to 5%, but problems specific to each language require adapted solutions.</Paragraph>
    <Paragraph position="4"> The user is required to train the system by uttering 100 sentences during an enrollment phase, and to leave slight pauses between words. For the French system, liaisons are prohibited at this time.</Paragraph>
    <Paragraph position="5"> Architecture of the system The voice signal is submitted to a signal processing chain, in order to extract acoustic parameters from the sound wave.</Paragraph>
    <Paragraph position="6"> Thus, the data flow is reduced from 30,000 to 100 bytes per second. Two passes of acoustic evaluation are performed: a relatively gross pass (the so-called Fast Match) selects a first list of candidate words (around 500 words); this list is further reduced thanks to the language model (see below), so that only a small number of remaining candidates are submitted to a second, more precise, acoustic pass (the so-called Detailed Match). Storage constraints as well as the methods used to build the language model explain why the size of the dictionary is limited to about 20,000 entries. The decoding algorithm This algorithm determines the most likely uttered sequence of words. It works from left to right by combining the various scores estimated by the acoustic and linguistic models, according to a so-called stack decoding strategy. At this stage, the elementary operation consists in expanding the best existing hypothesis which is not yet expanded, i.e. it consists in keeping the sentence segment which, followed by the contemplated current word, is rated with the highest likelihood.</Paragraph>
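The stack decoding strategy described above can be sketched in a few lines: hypotheses sit in a priority queue, and the best unexpanded one is repeatedly extended with each candidate word. The lattice and the combined acoustic/linguistic log-scores below are invented for illustration; this is a minimal sketch, not the Tangora implementation.

```python
import heapq

# Toy lattice: at each position, candidate words with combined
# acoustic + language-model log-probabilities (invented values).
CANDIDATES = [
    {"the": -0.5, "a": -1.0},
    {"cat": -0.7, "cap": -1.5},
    {"sat": -0.4, "sap": -2.0},
]

def stack_decode(candidates):
    """Best-first (stack) decoding: repeatedly expand the highest-scoring
    unexpanded hypothesis until a complete sentence is popped."""
    # Heap entries: (-score, position, words); heapq is a min-heap,
    # so the log-score is negated to pop the best hypothesis first.
    heap = [(0.0, 0, ())]
    while heap:
        neg_score, pos, words = heapq.heappop(heap)
        if pos == len(candidates):          # complete hypothesis: done
            return list(words), -neg_score
        for word, logp in candidates[pos].items():
            heapq.heappush(heap, (neg_score - logp, pos + 1, words + (word,)))

best, score = stack_decode(CANDIDATES)
print(best, score)   # most likely word sequence and its log-score
```

Because the best hypothesis is always expanded first, the first complete hypothesis popped off the heap is guaranteed to be the highest-likelihood one.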
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Methods
</SectionTitle>
      <Paragraph position="0"> If one formulates the problem of speech recognition according to an information theory approach, one naturally chooses probabilistic models among all available language models (Jelinek, 76). The trigram (Cerf, 90), triPOS 1 (Derouault, 84), or trilemma (Derouault, 90) models offer ways of estimating the probability of any sequence of words. For instance, the formula of the trigram model:</Paragraph>
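The formula itself did not survive in this version of the text; the standard trigram decomposition that the sentence above introduces reads (with the usual convention that out-of-range histories are padded with sentence-boundary markers):

```latex
P(w_1, \ldots, w_n) \approx \prod_{i=1}^{n} P(w_i \mid w_{i-2}, w_{i-1})
```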
      <Paragraph position="2"> The analysis of decoding errors shows that half of them are due to the acoustic model, the other half being associated with the language model. Actually, the number of homophones being quite high (2.6) in an inflected language such as French, it is clear that no acoustic model, however perfect it may be, can produce a satisfactory decoding without the support of a language model. [Footnote 1: Model based on triplets of parts of speech (POS).] ACTES DE COLING-92, NANTES, 23-28 août 1992 - 940 - PROC. OF COLING-92, NANTES, AUG. 23-28, 1992</Paragraph>
      <Paragraph position="3"> Power and limitations of probabilistic language models Probabilistic language models are powerful enough to considerably reduce ambiguities that the acoustic model alone cannot resolve.</Paragraph>
      <Paragraph position="4"> However, they suffer from local imperfections that are bound to their formulation. This is clearly shown by testing a probabilistic model on the lattice formed by the sets of homophones of the words of every sentence. The decoding obtained by searching for the maximum likelihood path (Cerf, 91) gives an error rate close to 3%, thus showing some of the inadequacies of probabilistic language models.</Paragraph>
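The homophone-lattice test described above can be sketched as follows: each word of the sentence is replaced by its set of homophones, and the maximum likelihood path is recovered by dynamic programming. The homophone sets and log-probabilities below are invented for illustration, and a bigram model is used for brevity in place of the trigram model of the text.

```python
# Toy homophone lattice for a French sentence: each position lists
# spellings that share the same pronunciation (sets invented here).
LATTICE = [["il"], ["a", "à"], ["chanté", "chanter", "chantez"]]

# Invented bigram log-probabilities; unseen pairs get a flat penalty.
BIGRAMS = {("il", "a"): -0.2, ("a", "chanté"): -0.3, ("à", "chanter"): -1.5}
PENALTY = -5.0

def viterbi(lattice):
    """Maximum-likelihood path through a homophone lattice (dynamic programming)."""
    paths = {w: (0.0, [w]) for w in lattice[0]}
    for column in lattice[1:]:
        new_paths = {}
        for w in column:
            # Best predecessor for w: maximize previous score + bigram score.
            score, seq = max(
                (s + BIGRAMS.get((prev, w), PENALTY), seq)
                for prev, (s, seq) in paths.items()
            )
            new_paths[w] = (score, seq + [w])
        paths = new_paths
    return max(paths.values())  # (log-score, best word sequence)

score, seq = viterbi(LATTICE)
print(seq, score)
```

A decoding error of the kind discussed in the paper corresponds to the maximum-likelihood path selecting the wrong member of a homophone set.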
      <Paragraph position="5"> Besides, and again for reliability reasons, statistics need to be gathered from large learning corpora (tens or even hundreds of millions of words). In spite of all the preliminary cleaning that may be done (automatic correction of typos, tripled consonants for instance), such a huge corpus contains a certain number of grammatical errors that introduce noise into the model.</Paragraph>
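The kind of preliminary cleaning mentioned here, for instance collapsing accidentally tripled letters, can be sketched with a single regular expression. This is a minimal illustration, not the actual cleaning pipeline used for the Tangora corpora.

```python
import re

# A letter repeated three or more times is almost always a typo in French;
# collapse it down to two occurrences (e.g. "commme" -> "comme").
TRIPLED = re.compile(r"([a-zà-ÿ])\1{2,}", re.IGNORECASE)

def clean(text):
    return TRIPLED.sub(r"\1\1", text)

print(clean("il est commme ça"))  # -> "il est comme ça"
```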
      <Paragraph position="6"> Probabilistic estimations are produced by counting triplets of words or grammatical classes. In any of the trigram, triPOS or trilemma models, a word is generally predicted according to the two preceding words, classes or lemmas only. However, grammatical rules may apply to larger frames. Not only do the rules often apply to words located out of the window used by the probabilistic model, but grammatically significant words are also to be found either in previous or in posterior position. Let us mention, as illustrations, some phenomena for which the probabilistic model does not fit: * Adverbs and complements constitute an obstacle to the transfer of information on gender, number and person, while this information is needed to choose between different homophones, as in: La COMMISSION chargée d'établir un plan de soutien global aux populations des territoires occupés s'est RÉUNIE dimanche. * ... increase the distance between elements which must agree: Plusieurs PARTIS d'opposition de gauche, notamment le parti communiste, PARTAGENT ce point de vue.</Paragraph>
      <Paragraph position="7"> Predicting a word thanks to the preceding words does not allow the system to appropriately control person agreement when the subject follows the verb. Example: Que sont DEVENUS les principaux PROTAGONISTES de la victoire du onze novembre ? Moreover, some confusions due to homophony induce changes of grammatical category that require a complete interpretation of the sentence to be properly diagnosed, as in "et"/"est" (conjunction/verb) or "à"/"a" (preposition/verb).</Paragraph>
      <Paragraph position="8"> Coupling the ADS with the grammar checker To address the problems described above, we propose to perform a grammatical analysis after the decoding operation. The grammatical analysis applies to the best of the hypotheses selected by the ADS. It serves as a basis to diagnose grammatical errors and to suggest corrections 2.</Paragraph>
      <Paragraph position="9"> The syntactic parser must prove powerful and reliable enough to effectively improve the performance of the ADS. It must provide a broad coverage, in order to cope with a large variety of texts, the source and the domain of which are not known in advance. It must also compute a global analysis of the sentence in order to compensate for the deficiencies of the probabilistic model.</Paragraph>
      <Paragraph position="10"> Description of the syntactic parser The syntactic parser we use meets the requirements described above (Chanod et al.). It is actually conceived to provide the global syntactic analysis of extremely diversified texts.</Paragraph>
      <Paragraph position="11"> It is based on an original linguistic strategy developed by Karen Jensen for US English (Heidorn 82, Jensen 86). The parser initially computes a syntactic sketch, which represents the likeliest syntactic surface structure of the sentence; at this stage, such phenomena as coordinations, ellipses and interpolated clauses, if not totally resolved, do not block the parsing. The analysis is based on the so-called relaxed approach, which consists in relaxing linguistic constraints which, as pertinent as they may be in descriptive linguistics, are rarely satisfied stricto sensu in the surface structures of free texts. This strategy proves to broaden the coverage of the grammar as well as to allow the parser to deal with erroneous texts. [Footnote 2: A similar approach was tested in English, but only to detect grammatically incorrect sentences (Bellegarda 92).]</Paragraph>
      <Paragraph position="12"> Architecture of the parser:</Paragraph>
      <Paragraph position="13"> The system is written in PLNLP. [Figure: architecture of the parser, from the input sentence to the corrected forms.]</Paragraph>
      <Paragraph position="14"> Indeed, some other techniques are also used. Strong syntactic constraints are relaxed during a second pass; this allows the system to detect errors which induce major syntactic changes (for instance the confusion "et"/"est"), while forbidding undesired or too numerous parses. Fitted parses are computed in case the global analysis fails (Jensen, 83), and multiple parses are ranked thanks to specific procedures (Heidorn, 76). This last point allows the system to automatically select the strongest hypothesis, according to the linguistic features (including the grammar errors) of the syntactic trees.</Paragraph>
      <Paragraph position="15"> Adaptation of the parser to the ADS As mentioned above, many grammatical errors in written French are actually caused by homophones (gender and number agreement, confusion between infinitive and past participle, "chantez"/"chanter", "et"/"est", etc.). The parser, initially built for written French, is thus well prepared to detect errors produced by an ADS.</Paragraph>
      <Paragraph position="16"> It can however be adapted to the specific needs of the ADS, by adding specific procedures (detection of ill-recognized frozen phrases, etc.), and by filtering out non-homophonic corrections, or corrections which do not belong to the list of candidates initially proposed by the ADS.</Paragraph>
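The filtering described above can be sketched as follows. The phonetic codes and the helper names are invented for illustration; a real system would query an actual phonetizer and the candidate list returned by the ADS.

```python
# Invented phonetic codes standing in for a real French phonetizer.
PHON = {"et": "e", "est": "e", "a": "a", "à": "a",
        "chanté": "SAte", "chanter": "SAte", "chantez": "SAte"}

def keep_correction(decoded, proposed, ads_candidates):
    """Keep a correction only if it is a homophone of the decoded word
    and belongs to the candidate list initially proposed by the ADS."""
    homophone = (PHON.get(proposed) is not None
                 and PHON.get(proposed) == PHON.get(decoded))
    return homophone and proposed in ads_candidates

print(keep_correction("chanter", "chanté", {"chanté", "chantez"}))   # homophone in list
print(keep_correction("chanter", "chantons", {"chanté", "chantez"})) # not a homophone
```

Filtering against both constraints prevents the grammar checker from proposing rewrites that the acoustic evidence cannot support.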
      <Paragraph position="17"> Indeed, post-processing procedures are largely used to diagnose errors after the syntactic tree has been computed. This offers the immense advantage of making the system easy to evolve: it can be easily modified in order to improve the scope of the detections. This made the adaptation of the grammar checker to the ADS quite straightforward.</Paragraph>
      <Paragraph position="18"> Description of the processing chain In the case of the ADS, the coupling is done by a simple call to the parser for each sentence. In the case of the homophone scheme, the diagram of the processing chain is shown in the following figure. [Footnote 3: These 50,000 lemmas produce about 350,000 inflected forms, which largely exceeds the 20,000 forms used by the Tangora system.]</Paragraph>
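The per-sentence coupling described here can be sketched as a simple loop in which each decoded hypothesis is handed to the parser. Both interfaces below are invented stand-ins, not the real Tangora or PLNLP APIs.

```python
def ads_decode(utterance):
    """Stand-in for the ADS: returns its best hypothesis plus the
    candidate words it considered (hard-wired here for illustration)."""
    return "quel et le prix", {"et", "est", "prix"}

def grammar_check(sentence, candidates):
    """Stand-in for the grammar checker: corrects an "et"/"est"
    confusion, but only with a word the ADS itself proposed."""
    if " et " in sentence and "est" in candidates:
        return sentence.replace(" et ", " est ")
    return sentence

for utterance in ["<audio>"]:                 # one dummy utterance
    hypothesis, candidates = ads_decode(utterance)
    print(grammar_check(hypothesis, candidates))
```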
    </Section>
  </Section>
</Paper>