<?xml version="1.0" standalone="yes"?> <Paper uid="W94-0205"> <Title>LEXICAL PHONOLOGY AND SPEECH STYLE: USING A MODEL TO TEST A THEORY</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 1. INTRODUCTION </SectionTitle> <Paragraph position="0"> The theory of lexical phonology (LP) seeks to explain the inter-relationships between morphology and phonology by allocating some of the phonological processes to the dictionary or lexicon in which the morphemes reside. The functions of brackets and boundary symbols found in other phonological representations are subsumed into the framework of the system.</Paragraph> <Paragraph position="1"> The domains of both morphological and phonological rules within the lexicon are subdivided into strata which define both the type of morphological process applicable and the mode of operation (i.e. whether cyclic or noncyclic) of the associated phonological rules. Processes applied on early strata are invisible to those of later strata through the application of the &quot;Bracket Erasure Convention&quot;.</Paragraph> <Paragraph position="2"> A computational system, LexPhon, has been devised to demonstrate the principles of LP. LexPhon comprises a level-ordered phonological interpreter and a set of linguistic tools for reformulating the stratum control, developing phonological rule sets and for devising phoneme inventories, including natural class categorisation tools for making explicit the conjunctive relationships between the elements of the inventories derived.</Paragraph> <Paragraph position="3"> Aspects of LP incorporated into LexPhon are the Strict Cyclicity Condition, Bracket Erasure and both cyclic and noncyclic control mechanisms. Syllabification is based on the Maximal Onset Principle and fully-specified underlying representations are assumed. 
The number of strata and attribution of each stratum to either cyclic or noncyclic control is variable by the user.</Paragraph> <Paragraph position="4"> Ordered sets of phonological rules are allocated to strata.</Paragraph> <Paragraph position="5"> LexPhon is used for comparison of potential structures for the level-ordered frameworks within which the phonological rules operate and for alternative formulations and assignments of the rules themselves.</Paragraph> <Paragraph position="6"> Conclusions can be drawn about the effectiveness of each from the broad phonemic transcriptions resulting from applying any given setting of the system to large quantities of data and from the processing traces also provided.</Paragraph> <Paragraph position="7"> An example is given of the application of LexPhon to the study of speech style, using data from reported studies of English and Mexico City Spanish, and its ability to model the alternative surface forms evidenced by the data, through the re-allocation of the same set of phonological rules to different strata, is evaluated.</Paragraph> </Section> <Section position="4" start_page="0" end_page="44" type="metho"> <SectionTitle> 2. LEXICAL PHONOLOGY (LP) </SectionTitle> <Paragraph position="0"> Lexical phonology (Kiparsky, 1982; 1982b; Halle and Mohanan, 1985; Booij and Rubach, 1987; and many others) is a recent approach to generative phonology (Chomsky and Halle, 1968) in which the application of the phonological rules, which map from the abstract, underlying form of a word to its surface, phonetic realisation, is tied to the morphological word formation processes taking place.</Paragraph> <Paragraph position="1"> Morphological and phonological rules are seen as part of the lexicon and are grouped into strata. The order of interaction between morphological and phonological components is stratum-dependent. Thus, LP ties the application of phonological rules to morphological structure in a particularly close way. 
It provides a highly structured framework in which to develop and evaluate the rules and representations for phonological transformations. However, while there is considerable support for the theory of stratification of morphological rules and the cyclic nature of the application of some phonological rules, opinion is divided as to the number of strata required and the level(s) to which each of the identified phonological rules is assigned in any given language and further, as to whether such higher level control structures are universal (Booij and Rubach, 1987) or language specific (Kiparsky, 1982; Halle and Mohanan, 1985).</Paragraph> <Paragraph position="2"> In addition to linear representations, LP accounts of linguistic phenomena are frequently formulated within the formats provided by autosegmental and metrical, or nonlinear phonology (for examples see Tranel, 1987; van der Hulst and Smith, 1982; Halle and Vergnaud, 1982; Leben, 1982; etc.). LP descriptions usually adopt the two principal tiers from autosegmental phonology, the melody and the skeleton, to represent the data. Further layers and/or the hierarchical representations of metrical phonology may be introduced when required.</Paragraph> <Paragraph position="3"> The melody corresponds roughly to the linear representation of segments whereas the skeleton carries the positional or timing information in x-slots. The x-slots, sometimes further distinguished as C and V slots, generally correspond one-to-one with the segments and are anchored to elements of the melody tier. However, in the case of long vowels, for instance, two skeletal slots may be attached to the same vowel. 
Empty slots and floating segments are also permitted by this theory.</Paragraph> <Paragraph position="4"> Both morphological and phonological higher level structures may be attached to the skeleton tier allowing other levels of organisation, such as onset and rime within syllables, to be represented, if required.</Paragraph> <Paragraph position="5"> Nonlinearity of both morphology and phonology is incorporated into LP through the structuring of the lexicon and the independence of syllabification from the word-formation process. For LexPhon, the concept of timing slots has influenced the choice of multiple segment descriptions for long vowels and diphthongs.</Paragraph> <Paragraph position="6"> Syllabification, where needed, is derived from the current segmental representation which thus reflects both the melody and skeleton tiers.</Paragraph> <Paragraph position="7"> However, by permitting features and their domains to be independent of the segment, autosegmental phonology also allows for the description of feature and tonal units smaller than a single segment. This aspect has not been utilised in the current system.</Paragraph> </Section> <Section position="5" start_page="44" end_page="81" type="metho"> <SectionTitle> 3. THE LEXPHON SYSTEM </SectionTitle> <Paragraph position="0"> LexPhon is implemented in Prolog, using LPA MacPROLOG and is based on principles of lexical phonology.</Paragraph> <Paragraph position="1"> LexPhon is a collection of interactive linguistic tools comprising a phonological processor, or interpreter, sets of processes for creating and updating phoneme inventories, phonological rule-sets and LP control structures and investigative tools for exploring relationships within the phoneme inventory.</Paragraph> <Paragraph position="2"> The phonological processor takes as input the underlying forms of the morphemes believed to contribute to a particular word and produces, as output, a broad phonemic transcription of the resulting word. 
To do this, the processor applies sets of phonological rules to the developing word string, at each level, or stratum, of morphological development, in order to demonstrate the transformations required to produce the attested surface representations of the utterance.</Paragraph> <Paragraph position="3"> LexPhon allows the user to experiment with underlying forms, the structure of the rules to be applied, the ordering of the rules, their application within cyclic or noncyclic stratum control and the definitions of the segments to be manipulated. The investigative tools allow the segment definitions of the phoneme inventory to be interrogated to find groups of segments answering particular feature descriptions and to find alternative descriptions of specified groups of segments.</Paragraph> <Paragraph position="4"> All the software implemented for configuring the processor, updating the databases, investigating the properties of the selected phoneme inventory and applying the processor to the phonological data has been collected into a single system which, in common with most other Prolog software, is of a hierarchical structure. In general, only the predicates at the leaves of the hierarchy perform direct actions on the data, the intermediate levels being present to select the appropriate action to be taken. The top level of the hierarchy is accessible to the user via a pull-down top-bar menu containing five sections of commands. The first section of the menu holds commands which control the application of the Interpreter to data. 
The system can be called to process pre-prepared data files.</Paragraph> <Paragraph position="5"> These can be selected by name from a display menu or, if the user does not wish to make an explicit selection, a default set of data files can be selected by the system for demonstration purposes, on a rota.</Paragraph> <Paragraph position="6"> Alternatively, the user may select an interactive application, providing the data in response to prompts by the system.</Paragraph> <Paragraph position="7"> Alterations to the Control Structure, the Phonological Rule database and to the Phoneme Inventory may be made through the commands provided in the second, third and fourth sections of the control menu, respectively. Commands are provided to allow the user to list the phoneme and rule descriptions represented by the appropriate sections of the database and to alter or replace them as necessary. The Interpreter commands may then be called to demonstrate the effects of the changes.</Paragraph> <Paragraph position="8"> The commands provided in the last section of the menu allow the user to investigate relationships between the members of the current Phoneme Inventory.</Paragraph> <Paragraph position="9"> The user can choose to analyse this database by entering a list of segments by name and receiving one or more categorisations by common feature values or by entering a list of features with associated values and receiving a list of all segments in the current inventory which satisfy those values.</Paragraph> <Section position="1" start_page="45" end_page="47" type="sub_section"> <SectionTitle> 3.1 Representations </SectionTitle> <Paragraph position="0"> There are basically four different types of data upon which the processor must operate.</Paragraph> <Paragraph position="1"> The first and most obvious one is the word string, the representation of the word itself at each of its levels during the word formation process. 
Associated with this is the phoneme representation, as each of the fundamental segments, or phonemes, which make up the word string has a complex structure which serves to establish its relationship to all the other phonemes used by the language under study.</Paragraph> <Paragraph position="2"> Within the theory of LP both phonological rules and strata have paramount significance because it is through their interaction that the surface alternations between morphologically related words of the language are explained.</Paragraph> <Paragraph position="3"> Although the representation of the affixation process is itself inherent in this interaction process, an affix morpheme representation has also been devised for LexPhon, to enable the user to dictate the levels at which affixation data is incorporated into the word string when the full underlying form is presented to the system by means of a data file.</Paragraph> <Paragraph position="4"> Phonemes are represented by simple predicates of two arguments, the name, which is a phonetic character, and the feature specification. The phoneme inventory is fully-specified for all the features used by Halle and Mohanan (1985) in their description of LP for English. The specification is implemented as a list of negative and/or positive value elements to be matched against a list of feature names to obtain the complete description. This means that 2-valued logic is employed to query the database.</Paragraph> <Paragraph position="5"> Two separate lists of feature names are available so that vowels can be specified for tenseness whereas consonants are specified for stridency. 
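The representation just described might be sketched along the following lines. This is an illustration only and not the LexPhon source: the predicate names (feature_names/2, phoneme/2, class_of/2, feature/3, natural_class/2, common_features/2), the reduced feature set, the segment labels and the values shown are all assumptions made for the example.

```prolog
% Hypothetical sketch of a two-argument phoneme predicate whose value list is
% matched against a list of feature names. All names and values are assumed.
:- use_module(library(lists)).

% One feature-name list per class: vowels carry tense, consonants strident.
feature_names(vowel,     [syllabic, high, low, back, round, tense]).
feature_names(consonant, [syllabic, high, low, back, round, strident]).

% phoneme(Name, Values): each value is n (negative) or p (positive).
phoneme(i, [p, p, n, n, n, n]).
phoneme(u, [p, p, n, p, p, n]).
phoneme(a, [p, n, p, p, n, n]).
phoneme(s, [n, n, n, n, n, p]).

% The first value (syllabic) decides which name list applies.
class_of([p|_], vowel).
class_of([n|_], consonant).

% feature(Seg, Name, Value): pair a value with the feature name at the same index.
feature(Seg, Name, Value) :-
    phoneme(Seg, Values),
    class_of(Values, Class),
    feature_names(Class, Names),
    nth1(Index, Names, Name),
    nth1(Index, Values, Value).

% natural_class(Spec, Segs): segments satisfying every Name-Value pair in Spec.
natural_class(Spec, Segs) :-
    findall(Seg,
            ( phoneme(Seg, _),
              forall(member(Name-Value, Spec), feature(Seg, Name, Value)) ),
            Segs).

% common_features(Segs, Spec): Name-Value pairs shared by all listed segments.
common_features([First | Others], Spec) :-
    findall(Name-Value,
            ( feature(First, Name, Value),
              forall(member(Seg, Others), feature(Seg, Name, Value)) ),
            Spec).
```

With this toy inventory, the query natural_class(\[syllabic-p, high-p\], Segs) collects the high vowels i and u, while common_features(\[i, u\], Spec) lists the values those two segments share. Keeping a separate name list per class means that a consonant simply has no value for tense, so a query on that feature fails for it instead of returning an arbitrary value.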
This is a compromise intended to balance the objectives of not providing values for irrelevant features, while adhering as closely as possible to the feature set employed by Halle and Mohanan, and the computational efficiency obtained by having both the minimal number of features and feature matrices of the same dimensions for all segments, and may have linguistic implications (Williams).</Paragraph> <Paragraph position="6"> Example of vowel: anterior, -high, +low, -back, -round, -nasal, +continuant, +voiced, -tense.</Paragraph> <Paragraph position="7"> Example of consonant: -syllabic, ... anterior, +high, -low, -back, -round, -nasal, -continuant, -voiced, +strident. Vowels do not have a feature specification for length as all the vowels in the inventory are 'short' and 'pure'. Diphthongs and long vowels are represented as two consecutive segments.</Paragraph> <Paragraph position="8"> If the application of a phonological rule, whose conditions are otherwise satisfied, would result in the formation of a segment with a feature description which does not currently appear in the phoneme inventory then the segment is created and added to the inventory. The name for the new segment is generated by concatenating that of the segment to which transformations were applied in order to construct it, with a unique number. This enables the user to trace the processes which led to its formation.</Paragraph> <Paragraph position="9"> Word Strings are ordered lists of elements which represent the current state of the word form, or working string, at any stage of the derivational process.</Paragraph> <Paragraph position="10"> e.g.</Paragraph> <Paragraph position="11"> \[wordboundary, ..., wordboundary\] In addition to segment names for each phoneme represented, the word string may contain ... boundary markers and ... affixation which has ... 
the current stratum. Each ... application predicate within a stratum takes an input string and instantiates a variable to the string which is the result of its application. The phonological, 'prule', predicates operate ... of the current word string ... which ... the context required by ... The result of their application is a replacement word string. e.g. ... after affixation at ... 2 = \[wordboundary, ..., wordboundary\] ... This vowel-shift rule applies when its context matches two instances of the same vowel ... each other. In this case, ... applies to vowels which ... value of the feature high. ... -high becomes +high or +high becomes -high. The replacement segment is created by applying a 'reconstruct' predicate which takes the value ... matches the context ... changes, in this case PS, \[high, p\]. ... is 'p', for positive, ... for high in E is 'n' (for negative). The 'reconstruct' predicate reports the label of both the source phoneme (PS in this example) and the resulting phone which it has created (x in this example) and lists the full set of feature values for the new phone. If the phone matches a description in the current phoneme inventory, its label is the name of that phoneme. 
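The behaviour attributed to the 'reconstruct' predicate can be illustrated with a short sketch. The code below is an assumption-laden reconstruction of the idea and not the original predicate: the argument order, the helper set_values/4, the single feature_names/1 list and the toy inventory are all hypothetical.

```prolog
% Hypothetical sketch: change some feature values of a source segment, then
% either reuse an existing phoneme name or create and record a new segment.
:- use_module(library(lists)).
:- dynamic phoneme/2.

feature_names([high, low, back, round]).

% Toy inventory (assumed values): n = negative, p = positive.
phoneme(e, [n, n, n, n]).
phoneme(i, [p, n, n, n]).
phoneme(o, [n, n, p, p]).

% set_values(Names, Old, Changes, New): overwrite the features named in Changes.
set_values([], [], _, []).
set_values([Name | Names], [Old | Olds], Changes, [New | News]) :-
    (   memberchk(Name-Value, Changes)
    ->  New = Value
    ;   New = Old
    ),
    set_values(Names, Olds, Changes, News).

% reconstruct(Source, Changes, Result): Result is the name of an existing
% phoneme if the new description is already in the inventory; otherwise a new
% segment is added, named by the source name followed by a unique number.
reconstruct(Source, Changes, Result) :-
    phoneme(Source, Old),
    feature_names(Names),
    set_values(Names, Old, Changes, New),
    (   phoneme(Existing, New)
    ->  Result = Existing
    ;   findall(x, phoneme(_, _), Entries),
        length(Entries, Count),
        Number is Count + 1,
        number_codes(Number, Codes),
        atom_codes(Suffix, Codes),
        atom_concat(Source, Suffix, Result),
        assertz(phoneme(Result, New))
    ).
```

In this toy inventory, raising e gives a description that already belongs to i, so the existing name is returned; raising o gives a description that is not yet listed, so a new segment whose name begins with o is created and asserted. Deriving the number from the inventory size is only one simple way of obtaining a unique suffix.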
Thus, in the output of both the rule (shown as the partial string beginning with the start of the context) and the stratum, two instances of the phoneme x replace the two instances of the phoneme E found in the input string.</Paragraph> <Paragraph position="12"> In the case of data provided from data files, the word string may also contain atomic elements representing word and compound boundaries and affixes to be applied at particular strata. The actual working string(s) consists of any consecutive sequence, within the word string, of segment names and 'new' boundary markers (i.e. the boundary markers which denote junctures within the current stratum).</Paragraph> <Paragraph position="13"> e.g.</Paragraph> <Paragraph position="14"> \[wordboundary, compboundary, g, r, c, c, n, compboundary, h, ~, ~, s, compboundary, los2, nE:s2, wordboundary\] The level of affixation of an atomic morpheme is denoted by a number concatenated to the end of the string of segments which represents its underlying form. This is matched to a list of potential affixation types which forms one of the parameters to each stratum definition predicate. Markers denoting compound boundaries must be inserted before, between and after, the lists of elements defining the components to be attached by compounding, to ensure that associated affixes are conjoined in the correct order. 
In this case more than one current working string may be being processed at any one time within the word string.</Paragraph> <Paragraph position="15"> As with the interactive word string representation, an atomic structure called 'new' is included in each affix segment list when the affixation process takes place and the morpheme is expanded into its component segment names, to provide an explicit juncture which may be required for the context of some of the phonological transformation rules.</Paragraph> <Paragraph position="16"> e.g.</Paragraph> <Paragraph position="17"> \[wordboundary, g, r, ee, x, n, new, h, a, o, s, new, l, c, s, new, n, E, s, wordboundary\] These extra items are deleted at the end of the stratum in which they are introduced to ensure that the junctures from that stratum are not visible to later strata.</Paragraph> </Section> <Section position="2" start_page="47" end_page="47" type="sub_section"> <SectionTitle> 3.2 Phonological Rules </SectionTitle> <Paragraph position="0"> Phonological Transformation Rules may involve replacement, transposition, deletion or insertion of segments.</Paragraph> <Paragraph position="1"> Each rule is represented in the data-base by a Prolog predicate with six arguments.</Paragraph> <Paragraph position="2"> e.g.</Paragraph> <Paragraph position="3"> prule(3, \[I, s, x | Rest\], \[x, z, x | Rest\], 's -> z / z ... x . ', scc, nsyll) :- !.</Paragraph> <Paragraph position="4"> The first parameter is the rule number which is used in rule ordering and in allocation of rules to strata. The second and third parameters represent the string to be replaced by the rule and the substitution string respectively. Each consists of a list of one or more phoneme names (and possibly new morpheme boundary and/or syllable boundary atoms) or variables, designating the segments undergoing transformation and their contexts, and a list variable to be instantiated to the remainder of the working string. 
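The role of the list variable can be made concrete with a small sketch. It is a simplification under stated assumptions: the rule is reduced to three arguments, the segment labels are placeholders, and the driver predicates apply_rule/3 and apply_or_keep/3 are hypothetical names, not part of LexPhon.

```prolog
% Hypothetical, simplified rule: s becomes z between z and x. The tail variable
% Rest carries over whatever follows the matched context unchanged.
prule(3, [z, s, x | Rest], [z, z, x | Rest]).

% apply_rule(Number, In, Out): rewrite at the first position where the rule's
% pattern matches the front of the remaining string.
apply_rule(Number, In, Out) :-
    prule(Number, In, Out),
    !.
apply_rule(Number, [Segment | In], [Segment | Out]) :-
    apply_rule(Number, In, Out).

% apply_or_keep(Number, In, Out): leave the string untouched if nothing matches.
apply_or_keep(Number, In, Out) :-
    (   apply_rule(Number, In, Out)
    ->  true
    ;   Out = In
    ).
```

Because the pattern and its replacement share the same tail variable, unification alone copies the untouched remainder of the working string into the result; the driver only has to preserve the segments that precede the match.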
The list variable ensures that the remaining part of the word list, which has not taken part in the transformation, is maintained in its current form as part of the word list.</Paragraph> <Paragraph position="5"> Where a segment is represented by a variable or a feature bundle, the predicate will normally have one or more clauses to be invoked to satisfy the conditions on that segment. Thus, in the general case, the head of the prule predicate will be followed by a list of clauses representing the conditions to be satisfied in order for the rule application to succeed and the operations to be performed to instantiate all the relevant information for its consequences to be implemented.</Paragraph> <Paragraph position="6"> e.g. the diphthongization rule prule(46, \[Phon, Phon | Rest\], \[Phon, Newphon | Rest\], 'V -> \[+high,-low,αround\] / \[V, +vowel, αback,-high\] .... ', scc, nsyll) :- feature(Phon, consonantal, n), alpha(back, Val, Phon), feature(Phon, high, n), !, reconstruct(Phon, Newphon, Features, \[high, p, low, n, round, Val\]).</Paragraph> <Paragraph position="7"> Each rule also has a parameter which is the atomic form of the linguistic description of the rule. This is used for display purposes in listing the rule and, in the case of a rule which has been generated interactively, will be the actual string from which the rule has been generated by the system.</Paragraph> <Paragraph position="8"> The last two parameters which must be specified for each rule denote whether it obeys the Strict Cyclicity Condition and whether it requires that the string to which it is to be applied has been syllabified prior to application. 
The SCC marker must be set explicitly by the user when generating the rule but the syllabification marker will be set automatically during interactive rule generation if the context for the rule contains a syllable boundary marker.</Paragraph> <Paragraph position="9"> The rules in the phonological rule database are explicitly ordered by means of their identifying numbers. Phonological rules can be represented disjunctively in the database, if necessary, by asserting a series of rules in the Prolog code, all allocated to the same rule number.</Paragraph> <Paragraph position="10"> Explicit variables can be attached to segments to indicate that two or more segments must be identical in a particular context.</Paragraph> <Paragraph position="11"> An interactive rule interpreter is provided which accepts a wide range of input formats including segment labels, feature bundle descriptions, or-notation and segment variable labels and translates the linguistic formulation of a phonological rule into Prolog predicates which implement it.</Paragraph> </Section> <Section position="3" start_page="47" end_page="49" type="sub_section"> <SectionTitle> 3.3 The Control System </SectionTitle> <Paragraph position="0"> After accepting the root form, the interactive control system calls the stratum control mechanism to act upon the working string by recursing over the current stratum definitions until all the processes associated with each of the strata defined have been applied. A similar mechanism is implemented to operate upon data predicates read from data files. These mechanisms are implemented by means of tail recursion.</Paragraph> <Paragraph position="1"> Failure of the main predicate which defines the control mechanism allows a second predicate to be invoked. 
This will always succeed, returning the current value of the working string as the value of the result variable.</Paragraph> <Paragraph position="2"> Each call to the stratum control mechanism is invoked with the stratum number variable already instantiated. Thus, the top-level call passes 1 as the stratum number for the first level of processing.</Paragraph> <Paragraph position="3"> e.g. the control predicate which governs processing for data input interactively has the following definitions: do_all_strat(Number, Input, Output) :- stratum(Number, First, Last, _, Cycle), do_strat(Number, Input, Nextstring, First, Last, _, Cycle), !, Nextnum is Number + 1, do_all_strat(Nextnum, Nextstring, Output).</Paragraph> <Paragraph position="4"> do_all_strat(_, String, String).</Paragraph> <Paragraph position="5"> and is initially called with Number instantiated to 1 and Input instantiated to the wordlist consisting of the segments of the root morpheme.</Paragraph> <Paragraph position="6"> Assuming that a stratum 1 exists in the current control structure (i.e. that there is at least one stratum in the currently modelled system), this stratum definition is matched by the first clause which thereby instantiates all the variables necessary to implement the processing of the first stratum. These variables are then passed to another predicate 'do_strat' which carries out the processing, returning the final state of the working string at the end of the stratum, instantiated to the variable Nextstring. The number of the next stratum is then calculated and this is then used, together with the value of the current working string, to call the next stratum by re-applying the same predicate. 
This continues until the 'stratum' predicate fails to match, indicating that no further strata have been defined for this model of lexical phonology.</Paragraph> <Paragraph position="7"> The first clause of 'do_all_strat' then fails also and the second clause is then invoked, always succeeding, associating the value of the current working string (Input) to the Output variable of this last stratum call. This is instantiated to the Output variable in the call which generated it and thus instantiated to the Output variables in all previous calls by being passed back as the value of the final variable in each call. Thus the value of Output returned from the first call to the 'do_all_strat' predicate is the same as that of the working string (Nextstring) at the end of the last call which found a new stratum to apply.</Paragraph> <Paragraph position="8"> As each call to the stratum control mechanism is invoked with only the stratum number instantiated, it must first attempt to unify the remaining control variables in the call with the associated values in a predicate of the stratum definition database which defines the stratum identified by that number, if such a stratum exists. If no matching predicate is found for this level then the call succeeds without further recursion by means of the second control clause.</Paragraph> <Paragraph position="9"> Further processing is then carried out to convert the working string from the list form, in which the segments may be individually manipulated, to a character string. At this stage all the segments in the list should be symbols representing phoneme names. 
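The control loop and the final conversion can be tried out in isolation with the sketch below. Only the shape of do_all_strat follows the definition quoted in the text; the stratum/5 facts, the toy do_strat/7 stub, the helper predicates and the segment labels are assumptions introduced so that the fragment runs on its own.

```prolog
% Hypothetical, self-contained rendering of the stratum control loop.
% stratum(Number, FirstRule, LastRule, AffixTypes, Cycle): assumed layout.
stratum(1, 1, 10, [], cyclic).
stratum(2, 11, 20, [], noncyclic).

% Toy stand-in for stratum processing: each stratum rewrites one segment label.
toy_change(1, a, b).
toy_change(2, b, c).

do_strat(Number, Input, Output, _First, _Last, _Affixes, _Cycle) :-
    toy_change(Number, Old, New),
    replace_all(Input, Old, New, Output).

replace_all([], _, _, []).
replace_all([Old | In], Old, New, [New | Out]) :-
    !,
    replace_all(In, Old, New, Out).
replace_all([Seg | In], Old, New, [Seg | Out]) :-
    replace_all(In, Old, New, Out).

% Tail-recursive loop: process stratum Number, then move on to Number + 1.
do_all_strat(Number, Input, Output) :-
    stratum(Number, First, Last, Affixes, Cycle),
    do_strat(Number, Input, Next, First, Last, Affixes, Cycle),
    !,
    NextNumber is Number + 1,
    do_all_strat(NextNumber, Next, Output).
% No stratum with this number: hand back the current working string.
do_all_strat(_, String, String).

% transcription(Segments, Atom): join the segment names into a single atom.
transcription([], '').
transcription([Seg | Segs], Atom) :-
    transcription(Segs, Tail),
    atom_concat(Seg, Tail, Atom).
```

Called as do_all_strat(1, \[k, a, t\], Out), the loop runs the two toy strata in turn and stops when no stratum 3 is found, giving \[k, c, t\], which transcription/2 joins into the atom kct. The cut after do_strat commits to the first result of each stratum, so the catch-all clause is reached only when the strata are exhausted or a stratum fails.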
The resulting string represents the broad phonemic transcription which is the product of the Processor.</Paragraph> <Paragraph position="10"> e.g.</Paragraph> <Paragraph position="12"> In the case of data supplied from file, it may happen that, if affixation processes which can only be resolved at later strata separate affix atoms for earlier strata from the root form, then these earlier affixes will never be resolved. Also, if the application of a phonological rule involves a transformation upon a segment which results in the creation of a new segment not listed in the current phoneme inventory, segment descriptors may remain in the working string at the end of processing. In either of these cases, an appropriate message will be generated and the working string will be displayed in its list form in place of the phonemic transcription which results from a completely successful application of the system.</Paragraph> <Paragraph position="13"> e.g.</Paragraph> <Paragraph position="14"> Unresolved atoms in the final working string</Paragraph> <Paragraph position="16"> The phonological rules are applied, in numerical order, in blocks defined by the stratum definition predicates. These blocks must each contain a contiguous set of rules from the current data-base, the only restriction being that the last rule number specified must be the same as or higher than the first rule number. Thus, blocks may overlap and the rules attached to a later stratum need not be defined later than those attached to earlier strata. 
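A block of this kind could be applied as in the following sketch, which is illustrative only: rules are reduced to numbered single-segment rewrites, and the predicate names rule/3, apply_block/4 and rewrite/4 are assumptions made for the example.

```prolog
% Hypothetical sketch: numbered toy rules, each rewriting one segment label.
rule(2, a, b).
rule(5, b, c).
rule(7, c, d).

% apply_block(First, Last, In, Out): try every rule number from First to Last
% in ascending order; numbers with no rule defined are skipped.
apply_block(First, Last, String, String) :-
    First > Last,
    !.
apply_block(First, Last, In, Out) :-
    (   rule(First, Old, New)
    ->  rewrite(In, Old, New, Mid)
    ;   Mid = In
    ),
    Next is First + 1,
    apply_block(Next, Last, Mid, Out).

% rewrite(In, Old, New, Out): replace every occurrence of Old by New.
rewrite([], _, _, []).
rewrite([Old | In], Old, New, [New | Out]) :-
    !,
    rewrite(In, Old, New, Out).
rewrite([Seg | In], Old, New, [Seg | Out]) :-
    rewrite(In, Old, New, Out).
```

Here the blocks 1 to 5 and 5 to 7 overlap at rule 5, so two strata defined over those ranges would both apply it. Only the first clause found for a given number is used in this sketch, so disjunctive rules sharing a number would need a further loop.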
Although rules are applied ordinally within each stratum, there is no requirement that a rule must be defined for every number within the range of that stratum.</Paragraph> <Paragraph position="17"> Cyclic and noncyclic Stratum Control Mechanisms are defined and govern the ordering of morphological affixation and phonological transformation phases for each stratum.</Paragraph> <Paragraph position="18"> For cyclic application, the relevant rules for the stratum are applied to any appropriate contexts in the current working string, subject to the SCC which prevents application of any rule which has been marked for SCC, on this cycle, and then the appropriate morphological affixation process is invoked. For interactive use, an interface is provided to elicit and validate the data in the correct format. Data supplied from a datafile is validated and checked against the stratum information to ensure that the current stratum is an appropriate level at which to incorporate it into the word string.</Paragraph> <Paragraph position="19"> Following every subsequent affixation at the same stratum, the phonological rules assigned to that stratum are reapplied.</Paragraph> <Paragraph position="20"> In non-cyclic strata, all affixation takes place before the associated block of phonological rules is applied, once only, to the outcome of the affixation process.</Paragraph> <Paragraph position="21"> To complete the implementation of the control strategies described in the Halle and Mohanan paper, for stratum 3, the compounding level, the cyclic control mechanism has been expanded for the default model. If compounding takes place then stratum 2 is re-invoked to test for further morphological affixation before completing stratum 3 and, if this occurs, both the morphological and phonological components of stratum 2 are re-applied. 
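The difference between the cyclic and noncyclic orderings described above can be seen in a deliberately small sketch. Everything in it is assumed for illustration: the predicate names, the toy final-devoicing rule standing in for a stratum's rule block, and the segment labels. The SCC check, validation of affixes, the stratum 3 loop-back and bracket erasure are all left out.

```prolog
% Hypothetical sketch contrasting cyclic and noncyclic stratum control.
:- use_module(library(lists)).

% attach(String, Affix, New): add an affix (a list of segments) after a 'new'
% juncture marker.
attach(String, Affix, New) :-
    append(String, [new | Affix], New).

% Toy rule block: a string-final d becomes t.
final_devoice(In, Out) :-
    (   append(Front, [d], In)
    ->  append(Front, [t], Out)
    ;   Out = In
    ).

% cyclic(Rules, Root, Affixes, Out): rules apply to the root and are
% re-applied after every affixation.
cyclic(Rules, Root, Affixes, Out) :-
    call(Rules, Root, String),
    cyclic_affixes(Rules, String, Affixes, Out).

cyclic_affixes(_, String, [], String).
cyclic_affixes(Rules, String, [Affix | Affixes], Out) :-
    attach(String, Affix, Attached),
    call(Rules, Attached, Next),
    cyclic_affixes(Rules, Next, Affixes, Out).

% noncyclic(Rules, Root, Affixes, Out): all affixation first, then the rules
% are applied once to the result.
noncyclic(Rules, Root, Affixes, Out) :-
    attach_all(Root, Affixes, Attached),
    call(Rules, Attached, Out).

attach_all(String, [], String).
attach_all(String, [Affix | Affixes], Out) :-
    attach(String, Affix, Next),
    attach_all(Next, Affixes, Out).
```

With the root \[b, a, d\] and the single affix \[i\], cyclic control devoices the root before the affix arrives and yields \[b, a, t, new, i\], whereas noncyclic control sees the d only in medial position and yields \[b, a, d, new, i\]. The same rule block therefore gives different surface forms depending solely on the control mode of the stratum.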
In the Halle and Mohanan model, the only possible morphological process at stratum 3 is compounding.</Paragraph> <Paragraph position="22"> However, in order to make the system as flexible as possible, the 'hmspecial' predicate set, which controls the order of application of morphological and phonological processes, has been extended to include all types of affixation normally possible, in parallel with the normal cyclic control mechanism. Only compounding causes the loopback mechanism to be invoked even if other classes of affixation have been authorised at stratum 3. The loopback mechanism returns the current working string unchanged if no affix is found but invokes the phonological rule predicate followed by the normal 'cyclerule' predicate if affixation has been successful. This enables further affixation to take place, if necessary, cycling through stratum 2 for each affixation process.</Paragraph> <Paragraph position="23"> As this is an extension to the stratum 3 affixation process, no bracket erasure takes place between the two strata during the looping process. A single application of the bracket erasure predicate is invoked only on final completion of stratum 3, i.e. when there is no further compounding and no further stratum 2 affixation to be processed. This means that all the 'new' conjunction atoms introduced at both stratum 2 and stratum 3 are available throughout this operation.</Paragraph> <Paragraph position="24"> This mechanism raises many theoretical problems. 
However, the objective of its implementation in LexPhon was to ascertain the closest possible correspondence to the descriptions provided by Halle and Mohanan (1985) in order to explore its consequences, so it is sufficient to note here that practical demonstrations (Williams, 1993) indicate that the loopback mechanism is unnecessarily cumbersome and does not achieve the consequences predicted for it.</Paragraph> <Paragraph position="25"> The Halle and Mohanan description was chosen as the default model as it is both the most complex and the most completely specified model of LP available. Other descriptions of LP were taken into account in trying to make LexPhon sufficiently flexible to be utilised in contrasting a wide variety of approaches to LP.</Paragraph> <Paragraph position="26"> In addition to being a tool for developing linguistic descriptions, it is intended that the processor should itself be a model of the control structures of LP.</Paragraph> <Paragraph position="27"> Although the system has not yet been applied extensively to languages other than English, one of the fundamental tenets common to the various approaches to LP is that the mechanisms which apply the rules are language independent but that the phoneme inventory, the morphological and phonological rules and the number and cyclicity of the strata are language dependent (Kiparsky, 1982; Halle and Mohanan, 1985; but see also Booij and Rubach, 1987, for a universal model of stratification). The system has been developed to reflect this view with the modules which represent the phoneme inventory, the rulebase and the stratum definitions being independent and interactively changeable. 
This also facilitates comparisons between alternative models of the same language.</Paragraph> </Section> <Section position="4" start_page="49" end_page="81" type="sub_section"> <SectionTitle> 3.4 Summary of the LexPhon default </SectionTitle> <Paragraph position="0"> model.</Paragraph> <Paragraph position="1"> The default phonological interpreter is based on Halle and Mohanan's (1985) detailed description of LP which provides for a framework of five strata, four of which operate within the lexicon. The postlexical and second lexical strata are noncyclic, all other strata being cyclic in application. Although fundamentally cyclic, stratum 3, the level at which compounding takes place, has a much more complex structure allowing for backtracking to stratum 2 under specific circumstances. LexPhon can be reconfigured by the user to provide for potentially any number of strata, each of which must be designated as cyclic or noncyclic.</Paragraph> <Paragraph position="2"> The default databases of the LexPhon system comprise the phoneme inventory containing the 42 phonemes listed above, 24 of the rules suggested by Halle and Mohanan (1985) and the five stratum model also proposed by Halle and Mohanan.</Paragraph> <Paragraph position="3"> Descriptions of the current state of each database can be invoked from the LexPhon system menus. The default definitions are summarised in the tables below.</Paragraph> <Paragraph position="4"> The segment label /y/ appears in the default phoneme inventory so that the Halle and Mohanan rule specifications could be matched as closely as possible for the default rule set. It is exactly equivalent to the IPA symbol /j/ more commonly used for the same feature specification.</Paragraph> <Paragraph position="6"> 2. noncyclic 3. cyclic with loop-back to 2 4. cyclic 5. 
noncyclic Each of these databases can be updated using the dialogue-based database development tools incorporated in the LexPhon system.</Paragraph> <Paragraph position="7"> Thus, LexPhon can be applied to the comparative study of alternative control frameworks, varying in number and/or cyclicity of strata, for any particular language description; to the development of phonological rule-sets and comparisons of allocations of rules to strata; and to the investigation of particular phonological phenomena.</Paragraph> <Paragraph position="8"> An example of the last case, involving the characterisation of regular variation between speaking styles, is presented to illustrate how the LexPhon system can be applied to such a study.</Paragraph> </Section> </Section> <Section position="6" start_page="81" end_page="81" type="metho"> <SectionTitle> 4. SPEECH STYLE STUDY </SectionTitle> <Paragraph position="0"> The term &quot;speech style&quot; usually refers to intra-speaker variation although there is not as yet any consensus as to its scope (Llisterri, 1992). Definitions and categorisations may relate to the context in which the speech occurred (formal/informal); to the task involved in producing the speech (read/descriptive/conversational); to the speed at which the speech is produced (rapid/slow); or to the amount of attention believed to be directed to the speaking process (casual/careful). Stylistic variations may be restricted to prosodic, phonetic and phonemic contrasts or extend to syntax, choice of lexical items or even choice between different languages (Milroy, 1987).</Paragraph> <Paragraph position="1"> Thus, speech style may be seen as a single continuum along which each of the contributory factors may be ranged (Labov, 1981) or as a set of distinct categories within which there can be intra-style variation (Milroy, 1987). 
Further, different factors can result in the same phonemic or phonetic contrasts (Kreidler, 1989).</Paragraph> <Paragraph position="2"> Harris (1969) presented a study of Hispanic phonology in which he contrasted the regular variations in pronunciation which occurred relative to the speed at which each utterance was produced. He identified four different speeds which could be characterised for the Mexico City dialect of Spanish under study. These comprised: Largo, slow and also described as overprecise; Andante, not quite so slow and considered typical of natural careful speech; Allegretto, somewhat faster and typical of natural casual speech; and Presto, extremely fast and unconstrained.</Paragraph> <Paragraph position="3"> The styles are distinguished by differences in nasal and sibilant assimilations as well as by other variants. The assimilation phenomena seem to be at least partially dependent on differences in the level of boundaries which must be specified in the phonological rules. This suggests that the level ordering of LP might enable a simpler characterisation of speech style contrast to be formulated.</Paragraph> <Paragraph position="4"> Although Harris defines style in terms of speech speed, he assumes a direct correlation between this and the level of attention or carefulness applied to speaking. Both the amount of care taken and the degree of precision with which an utterance is formed are hypothesised to decrease with increases in the rate at which the utterances are produced. Hence, this comparison explores variations in the pronunciation of the same utterances under varying conditions of speech speed and/or carefulness. It is contrasted with a similar study of reported casual versus more careful and/or formal forms in British English.</Paragraph> <Section position="1" start_page="81" end_page="81" type="sub_section"> <SectionTitle> 4.1. 
The Hispanic Study </SectionTitle> <Paragraph position="0"> The nasal-assimilation and s-voicing rules suggested by Harris (1969) are applied to underlying forms in the context of a level-ordered LP of Spanish, together with the vowel-glide transformation rule which feeds the nasal and sibilant assimilation rules in certain contexts. Thus, certain adjustments to the Phoneme Inventory were required. In addition to language-specific differences in the phoneme inventory, Harris's treatment of the Mexico City data explicitly rescinds the distinction between phonemic and phonetic levels. The voiced velar fricative is needed and Harris posits several additional nasal symbols to support his account. In total, he suggests three underlying nasal segments and an additional four nasal variants which appear in surface forms of the Mexico City data.</Paragraph> <Paragraph position="1"> 4.1.1 The Hispanic Phoneme Inventory In principle, there is no limit to the size of the phone/phoneme inventory in LexPhon.</Paragraph> <Paragraph position="2"> However, in the current system the feature set from which the phonemic segments are created is fixed and some processes can only operate upon single-character labels. In addition to 'coronal', 'anterior' and 'back', Harris employs the feature 'distributed' to distinguish between the Hispanic nasals.</Paragraph> <Paragraph position="3"> This is not currently available in LexPhon.</Paragraph> <Paragraph position="4"> Thus, the full set of nasal segments available for the Spanish speech study, and the features over which they vary, were as follows: m n fi q fi</Paragraph> <Paragraph position="6"> where /m/, /n/ and /fi/ are the underlying nasals and /q/ and /fi/ are respectively the velar nasal and the additional '+coronal', '-anterior' nasal, hypothesised by Harris to be present and distinct although acoustically indistinguishable from the alveolar /n/. 
Each of these has been allocated a single character label whose appearance in the LexPhon display font is the corresponding IPA character.</Paragraph> <Paragraph position="7"> The phoneme inventory was updated first to ensure that any segments which might be required for rule definitions would be available. This is important as the rule interpreter has data validation mechanisms so that, at the time of input, all segment labels must refer to segments within the current phoneme inventory.</Paragraph> <Paragraph position="8"> Since rules in cyclic strata must relate to derived forms, whereas rules in non-cyclic strata are not permitted to refer to embedded boundaries, two forms of the s-voicing and nasal-assimilation rules were derived.</Paragraph> <Paragraph position="9"> The vowel-glide transformation rule must be formulated with respect to new derivation at the start of its context, so only one form is required as it is neither cross-boundary nor word-formation independent in its action and can therefore apply at any stratum. It must precede both nasal-assimilation and s-voicing, as it feeds both rules.</Paragraph> <Paragraph position="11"> \[+syllabic\].</Paragraph> <Paragraph position="12"> Harris provides a series of possible definitions for the nasal assimilation rule. The rule context includes various alternative boundaries or optional boundaries at different speech speeds. Additionally, for fast speech (e.g. 
Allegretto), Harris requires extra constraints on the nasals to ensure that /m/ is unaffected before /n/, and to account for nasal assimilation to word-onset glides.</Paragraph> <Paragraph position="14"> syllabic, o<coronal, 13high, * 3back\].</Paragraph> <Paragraph position="15"> For s-voicing, Harris provides only one form, which varies with speech speed only in whether boundaries or optional boundaries need to be defined.</Paragraph> <Paragraph position="17"> syllabic, voiced\].</Paragraph> <Paragraph position="18"> The Hispanic Rule database was formed by first selecting the &quot;Delete Phon Rules&quot; option from the top-level menu to clear the database, and then adding each of the above rule formulae interactively via the &quot;Add a Rule&quot; dialogue selected from the &quot;Change Phon Rules&quot; menu.</Paragraph> <Paragraph position="19"> As Harris had identified a style-dependent variation between article-noun boundaries and the boundaries between less primitive constituents, it was necessary to represent this in the data and adapt LexPhon to cope with utterance construction beyond word level. 
In this case, as the lowest level word boundaries involved only monomorphemic definite and indefinite articles, it was decided to treat these as an additional word-internal level and reserve the word boundary symbol to designate phrase boundaries and higher level constituent boundaries.</Paragraph> <Paragraph position="20"> As the plural morpheme consistently occurs after other possible suffixes, I have assumed at least two word-internal levels.</Paragraph> <Paragraph position="21"> Thus, most word internal affixes were designated type 1, inflectional affixes were designated type 2, definite and indefinite articles were designated type 3 and phrase boundaries were identified by the word boundary symbol.</Paragraph> <Paragraph position="22"> Three sets of data were selected to include as many cases as possible of nasal and /s/ segments in contexts selected to contrast the same underlying phoneme string with different boundaries for as many different phoneme sequences as possible.</Paragraph> <Paragraph position="23"> A total of 76 words or short phrases were included in the databases tested on all five settings of the system. They were selected to provide as many contrastive contexts as possible with examples for each level of boundary wherever possible. A selection of these are shown below.</Paragraph> <Paragraph position="24"> Optional boundaries are dealt with within the theory of LP by assuming that no boundary is specified in the rule and the rule is either attached to several strata (Halle and Mohanan, 1985) or to the first stratum following the resolution of all boundaries across which it may apply (Booij and Rubach, 1987). 
Following Booij and Rubach, I assumed that normal derivation up to word level would take place within a three stratum model consisting of cyclic, postcyclic and postlexical strata and therefore selected a five stratum model to begin the investigation (one for each level of affixation, plus one to ensure that rules could be applied after all word-formation processes were complete, including bracket erasure). The word formation processes were assigned in order with type 1 processes at stratum 1, type 2 at stratum 2, type 3 at stratum 3 and word boundaries at stratum 4. Five contrastive settings of the model, which varied only in the assignment of phonological rules to strata, were prepared. Rule-sets allocated to strata may include &quot;empty sets&quot; if no rules have been allocated to any of the rule numbers assigned. Results are shown for each of the examples in Table 5.</Paragraph> <Paragraph position="25"> The broad phonemic transcriptions resulting from the three data sets indicate a correspondence from the slow to fast speech with the re-assignment of the phonological rules from low to high strata. However, within this progression, the detailed analysis of the data presented by Harris demonstrates some anomalies. It is not possible to assign the settings C1-C5 to the speech styles Largo, Andante, Allegretto and Presto on a one-to-one (or even many-to-one) basis. At each level there may be forms which Harris assigns to contrasting styles. 
This may be partially due to the imperfect assignment of structure to the data items examined.</Paragraph> <Paragraph position="26"> For the Mexico City dialect, Largo speech roughly corresponds to settings C1-C2 of the LP control system proposed here for Spanish, Andante to C2-C3, Allegretto to C4-C5 and Presto may correspond to C5.</Paragraph> <Paragraph position="27"> Thus, the LP model, implemented in LexPhon, appears to give a reasonably good approximation to the data available.</Paragraph> <Paragraph position="28"> 4.2. The English Study A more modest study of English style contrast data is included to examine whether similar stratum assignment of phonological rules can account for the casual/careful style contrasts found in standard British English. Again the Booij and Rubach model (1987) has been extended to investigate further aspects of the data through the inclusion of an additional stratum and the re-allocation of rules to strata.</Paragraph> <Paragraph position="29"> Descriptions of English style contrasts tend to be less well structured than the Harris (1969) study of the Mexico City dialect. The most common contrasts applied to spontaneous utterances are between careful and casual speech or between formal and informal speech. These two styles are sometimes, but not always, assumed to correspond. In either case, study is not limited to a comparison of binary alternations but usually relates to a series of utterance forms varying from one extreme of the style to the other. Laver (1991), for example, cites seven possible utterance forms for the single word 'actually' varying from most formal to most informal style in educated Southern British English.</Paragraph> <Paragraph position="30"> Kreidler (1989) lists vowel reduction, vowel loss, consonant loss and assimilation among the processes of reduction which contrast formal with informal or rapid English speech. 
Stress assignment is an important factor in identifying the contexts to which these apply and less formal speech may exhibit some or all of these processes. However, Kreidler suggests that apart from phonemic context, the familiarity of a particular word or phrase may also affect the amount of reduction which it may undergo.</Paragraph> <Paragraph position="31"> Although listing contrastive most careful and most casual forms, Kreidler too states that there may be one or more intermediate pronunciations.</Paragraph> <Paragraph position="32"> The Phonological Rules and Phoneme Inventory used here are the default databases provided for LexPhon, mainly derived from Halle and Mohanan's (1985) description of LP. Four strata were derived by adding an additional noncyclic stratum to the Booij and Rubach (1987) model and again five contrastive settings were used in which blocks of the 24 ordered rules are progressively allocated to later strata to reflect more casual speech. The original rule-ordering is maintained throughout.</Paragraph> <Paragraph position="33"> The first setting is the default setting for LexPhon and no distinction was found between this and the results of the next two settings, in which some or all of the noncyclic rules from stratum 2 were shifted to stratum 3. Word-internal assimilations involving y-insertion and palatalization have taken place. At the fourth setting, in which many of the noncyclic rules are attached to the final stratum and are applied after all word-formation is complete, a number of assimilations involving palatalization and y-insertion take place across word boundaries as well as word-internally.</Paragraph> <Paragraph position="34"> Further variations are demonstrated between the fourth and fifth settings. 
At the fifth setting all post-cyclic rules operate within the final, post-word-level, stratum causing spirantization to apply to contexts spanning word-boundaries.</Paragraph> <Paragraph position="35"> The variations resulting from the first three settings generally reflect a relatively formal, carefully pronounced speech style, although certain reductions characteristic of spoken rather than written English have not taken place (e.g. 'will not' rather than 'won't'). Some reductions found at the fourth setting in such partial clauses as 'unless you' and 'did you' are attested by Kreidler (1989) as casual speech forms.</Paragraph> <Paragraph position="36"> However, I have found no evidence in the literature for other reductions found at the fourth setting, nor for the further fricative reductions found at the fifth setting. These latter appear to be more typical of &quot;slurred&quot; speech than casual speech.</Paragraph> <Paragraph position="37"> Much of the style-related variation reported by Laver and Kreidler does not occur as a result of the LexPhon settings tested above. However, the variations which they reported are characterised by phonological rules such as fricative assimilation (Kreidler, p257) and cluster reductions, not required (or not yet implemented in LexPhon) for the derivation of the standard forms of the utterances.</Paragraph> <Paragraph position="38"> Thus, the model provides a partial but inadequate description of careful/casual contrasts found in British English.</Paragraph> </Section> </Section> </Paper>
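The central idea tested in both studies, that a boundary-free rule reaches across higher-level boundaries the later the stratum to which it is attached, can be illustrated with a minimal sketch. This is not LexPhon itself: the function names, the digit notation for boundaries and the forms "an1pa", "an3pa" and "an4pa" are invented for illustration, the single rule is a toy nasal assimilation, and cyclicity, the Strict Cyclicity Condition and the loopback mechanism are all ignored. A boundary of level k is assumed to be resolved at stratum k, following the assignment of word-formation processes to strata described for the Spanish study.

```python
import re


def nasal_assimilation(form):
    """Toy rule (an assumption, not Harris's formulation): /n/ becomes /m/
    immediately before /p/ or /b/. No boundary is mentioned in the rule, so
    any boundary symbol still standing between the segments blocks it."""
    return re.sub(r"n(?=[pb])", "m", form)


def derive(form, rule_stratum, n_strata=5):
    """Run a form through the strata in order.

    Boundaries are written as digits inside the form: '1' and '2' for
    word-internal affix boundaries, '3' for article-noun boundaries and
    '4' for word or phrase boundaries. At stratum k every boundary of
    level k is resolved (erased); the rule is then applied only if it
    has been allocated to that stratum.
    """
    for stratum in range(1, n_strata + 1):
        form = form.replace(str(stratum), "")
        if stratum == rule_stratum:
            form = nasal_assimilation(form)
    return form


if __name__ == "__main__":
    # Each hypothetical setting moves the rule to a later stratum.
    for setting, stratum in enumerate(range(1, 6), start=1):
        outputs = [derive(f, stratum) for f in ("an1pa", "an3pa", "an4pa")]
        print("C%d" % setting, outputs)
```

With the rule at stratum 1 only the form containing a level 1 boundary assimilates; at stratum 3 the article-noun form assimilates as well; at stratum 5 all three forms do. This reproduces, in miniature, the progression from slower to faster styles obtained by re-assigning rules from low to high strata.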