File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/97/w97-1104_metho.xml
Size: 13,153 bytes
Last Modified: 2025-10-06 14:14:51
<?xml version="1.0" standalone="yes"?> <Paper uid="W97-1104"> <Title>Prediction of Vowel and Consonant Place of Articulation</Title> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 Criteria for building an acoustic device for communication </SectionTitle> <Paragraph position="0"> Our approach is based purely on physical laws and communication theory principles, and not on any specific characteristics of the human production and/or perception systems. The acoustic device is shown in Fig. 1. It is a closed-open tube, the closed end corresponding to the vocal folds and the open end to the lip opening. Commands deform the area function of the acoustic tube, consequently producing changes in the acoustic signal. These commands constitute our phonological repertoire and may be viewed as analogous to the 'gestures' proposed by Browman and Goldstein (1989).</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> Commands Acoustic </SectionTitle> <Paragraph position="0"> [Figure 1: the acoustic device, linking lexical memory (8), commands (5, 6, 7), the acoustic tube, the articulatory-acoustic relation (2, 3, 4), and the acoustic signal (1).] The criteria used to build the communication device are the following: * acoustic contrast: maximization of the acoustic contrast of the sounds produced (1) in order to have a good signal-to-noise ratio. The spectral peaks (formants) having maximum energy are retained as acoustic parameters; * least effort: this refers to the efficiency of the articulatory-acoustic relations (2), together with the monotonicity (3) and orthogonality (4) of these relations. The use of an efficiency criterion corresponds to increasing the role of the dynamics; * simplicity: the commands that deform the acoustic tube must be simple (5) (straight deformations), few in number (6), and with a reduced number of degrees of constriction (7). 
This is based on the assumption that fewer commands make for smaller demands on the memory resources (8) that may be engaged in the learning and eventual mastery of these commands in the phonological acquisition process.</Paragraph> <Paragraph position="1"> An algorithm to automatically and efficiently deform the area function of an acoustic tube in order to increase or decrease the frequency of a formant, or of a combination of several formants, has been proposed elsewhere (Carré et al., 1994; 1995). As an example, figure 2 shows the automatic evolution of the area function of the tube from a closed-open neutral configuration, for increasing and decreasing F2. Four main regions naturally emerge, which are not of equal length. The /a/ vowel is an automatic consequence of a back constriction associated with a front cavity, and the /i/ vowel of a front constriction associated with a back cavity (anti-symmetrical behavior); a pharynx cavity is thus automatically obtained. If, however, the initial configuration is that of a tube closed at both ends (i.e. closed-closed), the /u/ vowel is automatically obtained with a central constriction (symmetrical behavior). 
The main conclusions that may be drawn from the above manipulations are as follows: * configurations obtained with the maximum acoustic contrast criterion correspond to those for the three vowels /a, i, u/ of the vowel triangle; * the deformation of the tube is minimal, and thus efficient, because of the use of the sensitivity function (Fant and Pauli, 1974); * the deformation commands (or gestures) are simple (rectilinear), limited in number (only one in the case of figure 2, since the making of a back constriction is automatically associated with a front cavity and vice versa, as in humans), and applied at specific places called distinctive regions.</Paragraph> <Paragraph position="2"> Figure 2: (a)-(b) Evolution of the area function A(x) of a uniform acoustic tube (U) for increasing and decreasing F2 (upper part).</Paragraph> <Paragraph position="3"> (c)-(d) Corresponding variations of the three formants and formant trajectories in the F1-F2 plane (lower part). In summary, characteristics important and integral to a 'good' acoustic communication device were obtained when an acoustic tube underwent specific deformations. It is this type of tube, so structured in regions, that forms the basis of the Distinctive Region Model (DRM) (Carré and Mrayati, 1992; Mrayati et al., 1988). Deformation gestures and efficient places of deformation (the regions) are thus deduced from acoustic theory.</Paragraph> </Section> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 The Distinctive Region Model (DRM) </SectionTitle> <Paragraph position="0"> The DRM model, proposed in 1988 (Mrayati et al., 1988), is structured in regions, the limits of which correspond to the zero-crossings of the sensitivity function computed on a uniform closed-open tube (figure 3).</Paragraph> <Paragraph position="1"> The DRM is a 2-region model when only the first formant is controlled. 
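As a rough numerical illustration of how the region limits arise, the sketch below assumes the textbook result for a lossless uniform closed-open tube, namely that the sensitivity function of formant n varies as cos((2n-1)πx/L) along the tube (up to sign), with x measured from the closed end. This closed form, the function names and the use of exact fractions are assumptions made for illustration and are not taken from the original. The zero-crossings for the controlled formants are merged to give the region limits as fractions of the tube length.

```python
from fractions import Fraction


def zero_crossings(n):
    """Zero-crossings of the assumed sensitivity function cos((2n-1)*pi*x/L).

    Returned as fractions x/L of the tube length, measured from the closed end.
    """
    odd = 2 * n - 1
    # cos(odd*pi*u) = 0 when odd*u = m + 1/2, i.e. u = (2m+1)/(2*odd), m = 0..odd-1
    return [Fraction(2 * m + 1, 2 * odd) for m in range(odd)]


def region_limits(num_formants):
    """Sorted, de-duplicated region limits when formants 1..num_formants are controlled."""
    limits = set()
    for n in range(1, num_formants + 1):
        limits.update(zero_crossings(n))
    return sorted(limits)


def region_lengths(num_formants):
    """Lengths of the regions (fractions of the tube length), closed end first."""
    edges = [Fraction(0)] + region_limits(num_formants) + [Fraction(1)]
    return [right - left for left, right in zip(edges, edges[1:])]


if __name__ == "__main__":
    for k in (1, 2, 3):
        lengths = region_lengths(k)
        print(k, "formant(s):", len(lengths), "regions,",
              [str(length) for length in lengths])
```

Under these assumptions, controlling one, two or three formants gives 2, 4 and 8 regions respectively, and the regions are of unequal length in the 4-region and 8-region cases.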
The DRM is a 4-region model when the first two formants (F1 and F2) are controlled (note that this 4-region structure is automatically obtained with a maximum acoustic contrast algorithm; see figure 2). It is an 8-region model when the first three formants (F1, F2, F3) are controlled. Thus, controlling more formants entails an increase in complexity and topological accuracy. Varying the cross-sectional area of any one region maximizes the formant frequency variations, with each binary combination of positive or negative formant variations corresponding to a specific region. Thus, it is hypothesized that the regions of the model could represent the best places of articulation for vowels and consonants.</Paragraph> <Paragraph position="2"> Critics of this model may claim that defining regions from the sensitivity functions for a uniform tube would limit the functioning domain to small perturbations. But synergetic command of the regions allows one to maintain monotonic formant variations as well as the region structure, as shown in figure 2.</Paragraph> <Paragraph position="4"> eight regions of the model.</Paragraph> <Paragraph position="5"> Note that the region boundaries correspond to a percentage of the total length of the tube. Since the physical length of the acoustic tube is around 19.5 cm for the vowel /u/ and 17.5 cm for the vowel /i/, the place of articulation would thus move according to the vowel pronounced. However, it is the 'effective' length that one must take into account, for, in addition to the physical length, Lp, the effective length incorporates the length, Lr, which corresponds to the radiation effect.</Paragraph> <Paragraph position="6"> Now, Lr is more or less proportional to the lip opening. For the vowel /i/, Lr = 2 cm, and for the vowel /u/, Lr = 0 cm. 
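The arithmetic behind this comparison can be sketched as follows, using only the lengths quoted above and assuming, as the text implies, that the effective length is simply the sum Lp + Lr. The function and variable names are illustrative.

```python
def effective_length_cm(physical_cm, radiation_cm):
    """Effective tube length, assumed here to be the plain sum Lp + Lr (in cm)."""
    return physical_cm + radiation_cm


# Values quoted in the text: (Lp, Lr) in cm for each vowel.
VOWELS = {"i": (17.5, 2.0), "u": (19.5, 0.0)}

if __name__ == "__main__":
    for vowel, (lp, lr) in VOWELS.items():
        print(vowel, effective_length_cm(lp, lr))
```

With these values, both vowels come out at an effective length of 19.5 cm.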
Thus the effective length remains almost constant (Mrayati et al., 1990) and the places of articulation are fixed, with region 7 always corresponding to the teeth (figure 4). Region 8 (with the radiation effect) corresponds to the lips, regions 3, 4, 5 and 6 to the tongue, and region 1 to the larynx. Regions 2 and 7, of small length, are acoustically unimportant.</Paragraph> <Paragraph position="7"> As such, a closed-open model may not be employed for /u/ production, because the vocal tract is closed at the source and more or less closed at the lips. A closed-closed model would be more appropriate.</Paragraph> <Paragraph position="8"> The behavior of such a model is symmetrical, so that a central constriction is automatically derived by the closed-closed model.</Paragraph> <Paragraph position="9"> the central constriction. For consonant production, the 8-region model is needed (taking into account the first three formants). Thus, prediction of consonant production requires a more complex model than that for vowel production (insofar as the number of degrees of constriction is not taken into account).</Paragraph> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 Vowel places of articulation </SectionTitle> <Paragraph position="0"> The DRM model with its places of articulation was used to produce vowels. For example, figure 6 shows the execution of a command (using a closed-open DRM model) to pass from a front constriction to a back constriction, along with a labial command. The acoustic results of these commands are also shown in the F1-F2 plane. The positions of the vowels obtained with such a closed-open model are given. 
Thus, to produce vowels the first two formants appear to be sufficient, the third one either being deduced from the first two (Ladefoged and Harshman, 1979) or is a speaker specific</Paragraph> <Paragraph position="2"> The places of articulation predicted by the model as the best places for maximum formant variations are represented in figure 5. It can be observed that the three main places of articulation for vowel production may be predicted with the 4-region model (taking into account the first two formants). The closed-open model predicts the front constriction and the back constriction; the closed-closed model predicts</Paragraph> <Paragraph position="4"> formant trajectories in the F1-F2 plane. The trajectory /uf/ was obtained with a central constriction (closed-closed model).</Paragraph> <Paragraph position="5"> The consequences of a change from a closed-open to a closed-closed configuration of the DRM model have been studied elsewhere (for more details, see Carré and Mrayati (1995)). Intermediate places of articulation, between central and back and between central and front, are obtained. This approach allows for a good prediction of vocalic systems (Carré, 1996).</Paragraph> </Section> <Section position="7" start_page="0" end_page="0" type="metho"> <SectionTitle> 5 Consonant places of articulation </SectionTitle> <Paragraph position="0"> model with its uniform configuration. Using the places of articulation defined by this model (taking into account the first three formants), we were able to obtain the formant transitions measured by Ohman (1966) (Carré and Chennoukh, 1995). The use of region 7 gives rise to the production of consonants perceived as labial. Thus, the classification proposed by Mrayati et al. (1988, figure 18) has been revised. Considering the vowel /E/ as close to the neutral partly contributed to the misclassification. 
The patterns of Delattre (1955) corresponding to /EcE/, where c is a plosive consonant, are obtained by the model when only regions 5, 6 and 8 are used. Regions 3 and 4, in the back part of the model, correspond to the places of articulation of pharyngeal plosives (Al Dakkak et al., 1994).</Paragraph> <Paragraph position="1"> </Paragraph> <Paragraph position="2"> Since the regions of the DRM model can be obtained from the formant variations of a uniform tube and since these regions indeed correspond to the places of articulation, we may hypothesize the following: * the bursts (Blumstein and Stevens, 1979; Stevens and Blumstein, 1981) associated with the plosives are consequences of closure-opening actions. Indeed, it has been shown that although the bursts are important, they are not decisive for plosive identification in comparison with the formant transitions (Kewley-Port and Pisoni, 1983; Walley and Carrell, 1983). * the fricative consonants, which are generally characterized by their noise sources and not by their formant transitions, have to be studied according to another criterion.</Paragraph> <Paragraph position="3"> * the third formant may be essential for differentiating certain consonants like /d/ and /g/ (the slopes of the first and second formants would not be sufficient).</Paragraph> <Paragraph position="4"> Indeed, it is noted in Harris et al. (1958) that the consonant /d/ cannot be obtained in the vocalic context /i/ without the third formant. The authors concluded that << The effects of third formant cues are independent of the two-formant patterns to which they are added. 
When a third formant cue enhances the perception of a particular phoneme, it typically does not do so equally at the expense of the other response alternatives >>. Similarly, in categorical perception experiments, Godfrey et al.</Paragraph> <Paragraph position="5"> (1981) studied the boundary between /d/ and /g/ by changing the rate and direction of the third formant only, while the first two formants were changed for a /b/-/d/ contrast. Additional experiments have to be undertaken to further explore this point.</Paragraph> <Paragraph position="6"> * the use of a uniform tube allowed all combinations of formant variations (positive and negative), which form the basis of the DRM structure, to be revealed. Does this imply that perceptual identification is carried out with reference to a uniform state? This point is worth studying further.</Paragraph> </Section> </Paper>