<?xml version="1.0" standalone="yes"?> <Paper uid="C92-1026"> <Title>DATA TYPES IN COMPUTATIONAL PHONOLOGY</Title> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> One of the dominant paradigms in current computational linguistics is provided by unification-based grammar formalisms. Such formalisms (cf.</Paragraph> <Paragraph position="1"> (Shieber 1986; Kasper & Rounds 1986)) describe hierarchical feature structures, which in many ways would appear to be an ideal setting for formal phonological analyses. Feature bundles have long been used by phonologists, and more recent work on so-called feature geometry (e.g.</Paragraph> <Paragraph position="2"> (Clements 1985; Sagey 1986)) has introduced hierarchy into such representations. Nevertheless,</Paragraph> <Paragraph position="3"> there are reasons to step back from standard feature-based approaches, and instead to adopt the algebraic perspective of abstract data types (ADT) which has been widely adopted in computer science. One general motivation, which we shall not explore here, is that the activity of grammar writing, viewed as a process of programme specification, should be amenable to stepwise refinement in which the set of (not necessarily isomorphic) models admitted by a loose 1 The work reported in this paper has been carried out as part of the research programmes of the Human Communication Research Centre, supported by the UK Economic and Social Research Council, and the project Computational Phonology: A Constraint-Based Approach, funded by the UK Science and Engineering Research Council, under grant GR/G-22081. I am grateful to Steven Bird, 
Kimba Newton and Tony Simon for discussions relating to this work.</Paragraph> <Paragraph position="4"> [Actes de COLING-92, Nantes, 23-28 août 1992] specification is gradually narrowed down to a unique algebra (cf. (Sannella & Tarlecki 1987) for an overview, and (Newton in prep.) for the application to grammar writing). A second motivation, discussed in detail by (Beierle & Pletat 1988; Beierle & Pletat 1989; Beierle et al. 1988), is to use equational ADTs to provide a mathematical foundation for feature structures. A third motivation, dominant in this paper, is to use the ADT approach to provide a richer array of explicit data types than are readily admitted by 'pure' feature structure approaches. Briefly, in their raw form, feature terms (i.e., formalisms for describing feature structures) do not always provide a perspicuous format for representing structure.</Paragraph> <Paragraph position="5"> On the ADT approach, complex data types are built up from atomic types by means of constructor functions. For example, _._ (where we use the underscore '_' to mark the position of the function's arguments) creates elements of type List. A data type may also have selector functions for taking data elements apart.</Paragraph> <Paragraph position="6"> Thus, selectors for the type List are the functions first and last. Standard feature-based encoding of lists uses only selectors for the data type; i.e. the feature labels FIRST and LAST in (1) $\mathrm{FIRST} : \sigma_1 \sqcap \mathrm{LAST} : (\mathrm{FIRST} : \sigma_2 \sqcap \mathrm{LAST} : \mathrm{nil})$ However, the list constructor is left implicit. That is, the feature term encoding tells you how lists are pulled apart, but does not say how they are built up. 
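As a rough illustration only (a Python sketch, not part of the OBJ development; the names Cons and from_features are hypothetical, and None stands in for nil), the contrast between the selector-only encoding in (1) and an explicit constructor might be put as follows:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cons:
    """Hypothetical explicit constructor: one element plus the rest of the list."""
    first: str
    last: Optional["Cons"]  # None plays the role of nil


def from_features(term: Optional[dict]) -> Optional[Cons]:
    """Rebuild a list from a selector-only encoding such as (1)."""
    if term is None:
        return None
    return Cons(term["FIRST"], from_features(term["LAST"]))


# Selector-only view, mirroring (1): FIRST and LAST are just feature labels.
feature_term = {"FIRST": "sigma1", "LAST": {"FIRST": "sigma2", "LAST": None}}

# Constructor view: the same list, with the way it is built made explicit.
built = Cons("sigma1", Cons("sigma2", None))
```

The dictionary only records what the selectors return; the Cons terms additionally record which operation produced each value, which is the information the feature term leaves implicit.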
When we confine our attention just to lists, this is not much to worry about. However, the situation becomes less satisfactory when we attempt to encode a larger variety of data structures into one and the same feature term; say, for example, standard lists, associative lists (i.e.</Paragraph> <Paragraph position="7"> strings), constituent structure hierarchy, and autosegmental association. In order to distinguish adequately between elements of such data types, we really need to know the logical properties of their respective constructors, and this is awkward when the constructors are not made explicit. For computational phonology, it is not an unlikely scenario to be confronted with such a variety of data structures, since one may well wish to study the complex interaction between, say, non-linear temporal relations and prosodic hierarchy. As a vehicle for computational implementation, the uniformity of standard attribute/value notation is extremely useful. As a vehicle for theory development, it can be extraordinarily unperspicuous. The approach which we present here treats phonological concepts as abstract data types. A particularly convenient development environment is provided by the language OBJ (Goguen & Winkler 1988), which is based on order sorted equational logic, and all the examples given below (except where explicitly indicated to the contrary) run in the version of OBJ3 released by SRI in 1988. The denotational semantics of an OBJ module is an algebra, while its operational semantics is based on order sorted rewriting. 
§§ 1.1 and 1.2 give a more detailed introduction into the formal framework, while §§ 2 and 3 illustrate the approach with some phonological examples.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 1.1 Abstract Data Types </SectionTitle> <Paragraph position="0"> A data type consists of one or more domains of data items, of which certain elements are designated as basic, together with a set of operations on the domains which suffice to generate all data items in the domains from the basic items.</Paragraph> <Paragraph position="1"> A data type is abstract if it is independent of any particular representational scheme. A fundamental claim of the ADJ group (cf. (Goguen,</Paragraph> <Paragraph position="2"> Thatcher & Wagner 1976)) and much subsequent work (cf. (Ehrig & Mahr 1985)) is that abstract data types are (to be modelled as) algebras; and moreover, that the models of abstract data types are initial algebras. [Footnote 2: An initial algebra is characterized uniquely up to isomorphism as the semantics of a specification: there is a unique homomorphism from the initial algebra into every algebra of the specification.] The signature of a many-sorted algebra is a pair $\Sigma = \langle S, O \rangle$ consisting of a set $S$ of sorts and a set $O$ of constant and operation symbols. A specification is a pair $\langle \Sigma, \mathcal{E} \rangle$ consisting of a signature together with a set $\mathcal{E}$ of equations over terms constructed from symbols in $O$ and variables of the sorts in $S$. A model for a specification is</Paragraph> <Paragraph position="3"> an algebra over the signature which satisfies all the equations $\mathcal{E}$. Initial algebras play a special role as the semantics of an algebra. An initial algebra is minimal, in the sense expressed by the principles 'no junk' and 'no confusion'. 'No junk' means that the algebra only contains data which are denoted by variable-free terms built up from operation symbols in the signature. 
'No confusion' means that two such terms $t$ and $t'$ denote the same object in the algebra only if the equation $t = t'$ is derivable from the equations of the specification.</Paragraph> <Paragraph position="4"> Specifications are written in a conventional format consisting of a declaration of sorts, operation symbols (op), and equations (eq). Preceding the equations we list all the variables (var) which figure in them. As an illustration, we give below an OBJ specification of the data type LIST1.</Paragraph> <Paragraph position="5"> (2) obj LIST1 is sorts Elt List .</Paragraph> <Paragraph position="7"> The sort list between the : and the -> in an operation declaration is called the arity of the operation, while the sort after the -> is its value sort. Together, the arity and value sort constitute the rank of an operation. The declaration op nil : -> Elt means that nil is a constant of sort Elt. The specification (2) fails to guarantee that there are any objects of Elt. While we could of course add some constants of this sort, we would like to have a more general solution. In a particular application, we might want to define phonological words as a List of syllables (plus other constraints, of course), and phonological phrases as a List of words. That is, we need to parameterize the type LIST1 with respect to the class of elements which constitute the lists.</Paragraph> <Paragraph position="8"> Before turning to parameterization, we will first see how a many-sorted specification language is generalized to an order sorted language by introducing a subsort relation.</Paragraph> <Paragraph position="9"> Suppose, for example, that we adopt the claim that all syllables have CV onsets [footnote 3]. Moreover, we wish to divide syllables into the subclasses heavy and light. Obviously we want 
heavy and light syllables to inherit the properties of the class of all syllables, e.g., they have CV onsets. We use Heavy < Syll to state that Heavy is a subsort of the sort Syll. We interpret this to mean that the class of heavy syllables is a subset of the class of all syllables. Now, let onset_ : Syll -> Mora be an operation which selects the first mora of a syllable, and let us impose the following constraint (where Cv is a subsort of Mora): (3) var S : Syll . var CV : Cv .</Paragraph> <Paragraph position="11"> Then the framework of order sorted algebra ensures that onset is also defined for objects of sort Heavy.</Paragraph> <Paragraph position="12"> Returning to lists, the specification in (4) (slightly simplified from that used by (Goguen & Winkler 1988)) introduces Elt and NeList (non-empty lists) as subsorts of List, and thereby improves on LIST1 in a number of respects. In addition,</Paragraph> <Paragraph position="13"> the specification is parameterized. That is, it characterizes a list of Xs, where the parameter X can be instantiated by any module which satisfies the condition TRIV; the latter is what (Goguen & Winkler 1988) call a 'requirement theory', and in this case simply imposes on any input module that it have a sort which can be mapped to the sort Elt.</Paragraph> <Paragraph position="14"> (4) obj LIST[X :: TRIV] is sorts List NeList . subsorts Elt < NeList < List .</Paragraph> <Paragraph position="16"> Notice that the list constructor _._ now performs the additional function of append, allowing two lists to be concatenated. In addition, the selectors have been made 'safe', in the sense that they only apply to objects (i.e., nonempty lists) for which they give sensible results: for what, in LIST1, would have been the meaning of head(nil)? 
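The effect of 'safe' selectors can be approximated outside OBJ. The following Python sketch is an analogy under stated assumptions, not the specification in (4): the class names List and NeList echo the sorts, while make and cat are hypothetical helpers, and the subsort relation Elt < NeList is only approximated by wrapping single elements in a NeList.

```python
class List:
    """Sort List: a possibly empty sequence of elements."""

    def __init__(self, *items):
        self.items = tuple(items)

    def __eq__(self, other):
        return isinstance(other, List) and self.items == other.items


class NeList(List):
    """Sort NeList, a subsort of List: at least one element."""

    def __init__(self, *items):
        if not items:
            raise ValueError("a NeList needs at least one element")
        super().__init__(*items)

    def head(self):
        # Defined only here, so head(nil) cannot even be written.
        return self.items[0]

    def tail(self):
        return make(*self.items[1:])


def make(*items):
    """Pick the most specific sort for the given elements."""
    return NeList(*items) if items else List()


nil = List()


def cat(left, right):
    """Counterpart of _._ used as append: concatenate two lists."""
    return make(*left.items, *right.items)
```

For instance, cat(NeList("sig1"), NeList("sig2", "sig3")) yields a NeList whose head() is "sig1", whereas nil has no head method at all, so the question of its meaning does not arise.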
[Footnote 3: Here, the term ONSET refers to the initial mora of a syllable in Hyman's (1984) version of the moraic theory.]</Paragraph> </Section> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 Metrical Trees </SectionTitle> <Paragraph position="0"> As a further illustration, we give below a specification of the data type BINTREE. This module has two parameters, both of whose requirement theories are TRIV [footnote 4]. [...] supply appropriate sets of nonterminal and terminal symbols. Let us use uppercase quoted identifiers (elements of the OBJ module QID) for nonterminals, and lower case for terminals. The specification in (5) allows us to treat terminals as trees, so that a binary tree, rooted in a node 'A, can have terminals as its daughters. However, we also allow terminals to be directly dominated by a non-branching mother node. Both possibilities occur in the examples below. (6) illustrates the instantiation of formal parameters by an actual module, namely QID, using the make construct.</Paragraph> <Paragraph position="1"> (6) make BINTREE-QID is BINTREE[QID,QID] endm The next example shows some reductions in this module, obtained by treating the equations as rewrite rules applying from left to right.</Paragraph> <Paragraph position="2"> 4 The notation Elt.NONTERM, Elt. 
TERM utilizes a qualification of the sort Elt by the input module's parameter label; this is simply to allow disambiguation.</Paragraph> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> ~4 JB </SectionTitle> <Paragraph position="0"> Suppose we now wish to modify the definition of binary trees to obtain metrical trees. These are binary trees whose branches are ordered according to whether they are labelled 's' (strong) or 'w' (weak).</Paragraph> <Paragraph position="1"> In addition, all trees have a distinguished leaf node called the 'designated terminal element' (dte), which is connected to the root of the tree by a path of 's' nodes.</Paragraph> <Paragraph position="2"> Let us define 's' and 'w' to be our nonterminals: In order to build the data type of metrical trees on top of binary trees, we can import the module BINTREE, suitably instantiated, using OBJ's extending construct. Notice that we use MET to instantiate the parameter which fixes BINTREE's set of nonterminal symbols [footnote 5]. (9) obj METTREE is extending BINTREE[MET,QID]*(sort Id to Leaf) . op dte : Tree -> Leaf .</Paragraph> <Paragraph position="3"> var L : Leaf .</Paragraph> <Paragraph position="4"> vars T1 T2 : Tree .</Paragraph> <Paragraph position="5"> [Footnote 5: The * construct tells us that the principal sort of QID, namely Id, is mapped (by a signature morphism) to the sort Leaf in METTREE. ceq signals the presence of a conditional equation. == is a built-in polymorphic equality operation in OBJ.]</Paragraph> <Paragraph position="8"> The equations state that the dte (designated terminal element) of a tree is the dte of its strong subtree. 
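The percolation just described can be sketched in Python. This is a hedged illustration rather than a transcription of the METTREE equations: the class Node, the helper label_of, and the choice to represent leaves as plain strings are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Node:
    label: str                                  # 's' (strong) or 'w' (weak)
    left: Union["Node", str]                    # a leaf is a plain string
    right: Optional[Union["Node", str]] = None  # None: non-branching node


def label_of(tree):
    """Return the s/w label of a subtree, or None for a bare leaf."""
    return tree.label if isinstance(tree, Node) else None


def dte(tree):
    """Return the designated terminal element of a tree."""
    if isinstance(tree, str):
        return tree                # a leaf is its own dte
    if tree.right is None:
        return dte(tree.left)      # non-branching node: pass the dte up
    if label_of(tree.left) == "s":
        return dte(tree.left)      # percolate from the strong daughter
    if label_of(tree.right) == "s":
        return dte(tree.right)
    raise ValueError("no daughter is labelled 's'")


tree = Node("s", Node("w", "a"), Node("s", Node("s", "b"), Node("w", "c")))
```

Here dte(tree) follows the path of 's' nodes from the root down to the leaf "b". The two label tests play the part of the conditional equations, and the sketch does not by itself exclude a node whose daughters are both strong: it simply takes the left one.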
Another way of stating this is that the information about the dte of a subtree T is percolated up to its parent node, just in case T is the 's' branch of that node.</Paragraph> <Paragraph position="9"> The specification METTREE can be criticised on a number of grounds. It has to use conditional equations in a cumbersome way to test which daughter of a binary tree is labelled 's'. Moreover, it fails to capture the restriction that no binary tree can have daughters which are both weak, or both strong. That is, it fails to capture the essential property of metrical trees, namely that metrical strength is a relational notion.</Paragraph> <Paragraph position="10"> What we require is a method for encoding the following information at a node: &quot;my left (or right) daughter is strong&quot;. One economical method of doing this is to label (all and only) branching nodes in a binary tree with one of the following two labels: 'sw' (my left daughter is strong), 'ws' (my right daughter is strong). Thus, we replace</Paragraph> </Section> </Paper>