File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/90/c90-3047_intro.xml

Size: 16,664 bytes

Last Modified: 2025-10-06 14:04:53

<?xml version="1.0" standalone="yes"?>
<Paper uid="C90-3047">
  <Title>Unbounded Dependency: Tying strings to rings</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
CS Address Systems
</SectionTitle>
    <Paragraph position="0"> The f~co mapping can be used to define a correspondence between the symbolic operations of union and concatenation, and the CS operations of superposition and association, respectively. This allows the following 1 Characters in bold denote elements of the vector space.</Paragraph>
    <Paragraph position="1"> homomorphism to be defined .f: S -&gt; CS, mapping the semiring S into the CS semiring, where</Paragraph>
    <Paragraph position="3"> and the semiring S comprises the quintuple (L x, U,., 0, {0}) where L x is the finite set of strings defined over the symbolic alphabet X, and U and. denote the union and concatenation operators, respectively, with their corresponding identities. The existence of the homomorphism allows symbolic structures to address CS representations. However, the restriction on this mapping is that CS level address systems cannot capture the full expressive power of regular languages. This is because no CS level operator can be defined with the same closure properties as the Kleene star operator at the symbolic level. The implications of this constraint become apparent in describing how symbolic structures can function as structural addresses for the CS level.</Paragraph>
    <Paragraph position="4"> The symbolic structures that function as addresses to the CS level can be represented using directed, acyclic graphs (DAGS), and are referred to as address structure DAGs (AS-DAGs). AS-DAGs codify the way in which symbolic labels address, or map onto, the nodes and edges of ASStts. In general, two possible types of address system can be defined, direct addressing and relative addressing. A system of direct addressing involves specifying unique ASSH addresses explicitly.</Paragraph>
    <Paragraph position="5"> That is, a symbolic label functioning as an address, directly accesses a unique ASSH node.</Paragraph>
    <Paragraph position="6"> The alternative addressing scheme involves specifying nodes in the configuration in terms of their connectivity paths from some pre-defined origin node. This form of relative addressing requires, (a) a pre-specified origin, or root node, and (b) a labelling system for the connections within the configuration.</Paragraph>
    <Paragraph position="7"> The set of symbolic labels that serve as addresses in AS-DAGs can be partitioned into two classes, local and global labels, which m'e differentiated in terms of their function within an address structure. Global labels map onto the nodes of AS-DAGS providing direct addresses for the superposition spaces in ASSH configurations. Local labels, on the other hand, map onto AS-DAG edges and specify the relative addresses of ASSH spaces. That is, they specify the locations of superposition spaces relative to the addresses of their dominating nodes. This relationship is illustrated in figure 2 showing how AS-DAGs map onto ASSHs.</Paragraph>
    <Paragraph position="8">  The figure shows the CS level encoding of the LFG representation of the sentence The girl persuaded John to go (see Slack, 1990). The superposition space labelled 'JOIIN' in the AS-DAG can also be located using the compound address 'PERSUADE.obj'. 2 The obvious question that arises is what possible rationale is there for this system of double addressing? With a system of direct addressing for ASSHs, the relative addressing scheme would appear redundant.</Paragraph>
    <Paragraph position="9"> At the symbolic level, local labels specify local structure, that is, how a node relates to its immediate descendents. In situations in which the local structure is uniform and finite, the V-space encodings of local labels can be fixed under O.a, mapping each label onto a constant vectorial encoding. This allows AS-DAGs to be viewed as configurations of local structures which can be located in V-space by fixing the vectorial encodings of their root-nodes. This means that global labels must be assigned dynamically under ~a. Putting the emphasis on local structure seems to make the direct addressing system redundant, but there are good reasons for needing direct access to superposition spaces.</Paragraph>
    <Paragraph position="10"> Defining arbitrary structural addresses as strings of local labels descending from a root-node can only be achieved under symbolic level control as the representational machinary necessary for interpreting concatenated label strings does not exist at the CS level. A string of local labels can only be encoded as a single AS-DAG edge corresponding to an uninterpreted label string. That is, the CS level comprises a set of superposition spaces which support structured access, and only a single ASSH edge can link two such spaces. This means that CS level access through relative addressing is limited to addresses comprising a single edge leading from an origin node. Building up addresses in this way necessitates an AS~DAG node labelling scheme such that the origin node can be defined iteratively. In other words, because the V-space encoding of local structure is fixed under f~a, relative addressing can only be specified on a local basis, with the only form of global addressing involving direct access to ASSH nodes, or superposition spaces. This limit on symbolic level access to connectionist representational states is an important source of locality constraints in encoding linguistic structures at the CS level (Slack, 1990) a.</Paragraph>
    <Paragraph position="11">  One linguistic phenomenon which, rnore than any other, focuses on the problem of addressing structural configurations is that of unbounded dependency (UBD). Typically, in sentences like The boy who John gave the book to last week was Bill, the phrase The boy is {aken as the 'filler' for the missing argument, or 'gap', of the gave predicate, as indicated by the underline. At the level of constituent structure there are no constraints on the number of lexical items that can intervene between a filler and its corresponding gap.</Paragraph>
    <Paragraph position="12"> Such &amp;quot;unbounded dependencies&amp;quot; are typical of a class of linguistic phenomena in which the structural address of an element is determined by information which is only accessible over some arbitrary distance in the structure. To build the appropriate memory configuration, it is necessary to determine the address of the gap to which a filler belongs. However, because gaps and fillers can be separated by arbitrary distance in the input string, it is not possible to specify the set of potential predicate-argument relations that the filler can be involved in, and so a direct address cannot be identified.</Paragraph>
    <Paragraph position="13"> Instead, it is necessary to generate a relative address through the construction of a chain of global and local labels.</Paragraph>
    <Paragraph position="14"> Within the framework of Government-Binding theory, these phenomena have been explained through identifying conditions defined on constituent trees that account for the distribution of gaps and fillers both within and system based on a functional partition of V-space into an Address Space and a Content Space (Kanerva, 1988). An important feature of this architecture is that by encoding the elements of both spaces as k-bit vectors, they are potentially interchangeable. This allows elements retrieved from content space to function as addresses to other memory locations, and vice versa. Thus, the memory consists of a set of superposition spaces, where each space has a label (or address), and where labels can be encoded as elements of other spaces resulting in a hierarchical structure. In a hybrid architecture based on a CS level memory, symbolic structures are encoded through symbolic labels addressing elements of the homomorphic ASSH configurations in memory. In other words, symbolic level activity is implemented as the manipulation of address space labels (see Slack, 1990).</Paragraph>
    <Paragraph position="15"> across natural languages. One such principle is based on the structural geomeu'y of constituent trees, in particular, their connectivity properties (Kayne, 1983). Kaplan and Zaenen (1988) have taken a different approach to UBD restrictions arguing that they are best explained at the level of predicate-argument relations, rather than in terms of constituent structure.</Paragraph>
    <Paragraph position="16"> Working within the LFG framework, their formal system is based on the idea of functional uncertainty expressions. For example, for topic-alization sentences these expessions have the general form (,x TOPIC)= (^ GF* GF) involving the Kleene closure operator, where GF stands tbr the set of primitive grammatical functions. These equations express an uncertain binding between the TOPIC function and some argument of a distant predicate. The uncertainty relates to the identification of the appropriate predicate. To resolve the uncertainty it is necessary to expand this expression and match it against the functional paths of missing arguments. Different computational strategies can be used to optimise the resolution process (Kaplan &amp; Maxwell, 1988). What is common to both approaches is that they define a system for specifying the structural address of a gap relative to its corresponding filler.</Paragraph>
    <Paragraph position="17"> The notion of functional uncertainty, in common with other linguistic feature structures, uses an address system based on regular languages (Kasper &amp; Rounds, 1986). It is impossible, however, to use such addresses to access the CS level directly as the Kleene closure operator cannot be interpreted at this level. In their present form, functional uncertainty algorithms require some kind of 'symbolic level' memory in which to expand uncertainty expressions.</Paragraph>
    <Paragraph position="18"> An alternative account of UBD phenomena, also based on predicate-argument relations, can be tbunded on the notion of symbolic labels functioning as local and global addresses to the CS level. The problem of UBD can be decomposed into two sub-problems, one relating to local structure, the other relating to global indeterminacy. Consider the sentence fragment The girl John saw Bill talking to ...... where the problem is to specify the struci-ural address of the topicalised NP, The girl, as the missing argument of some COMP function 4. At the level of local structure, the structural address of the filler is minimally uncertain in that it can fulfil 4 As the present discussion focuses on predicate-argument relations, we will continue to make use of LFG notation and constructs, such as grammatical functions.</Paragraph>
    <Paragraph position="19"> 268 4 only a small set of local roles, for the present case the OBJ function. However, the structural address of the appropriate local structure is maximally uncm~ain, as the filler item carries no information to constrain it.</Paragraph>
    <Paragraph position="20"> Before considering solutions to these two crab-problems, it is necessm'y to clarify how the informational components of linguistic structures such as f-structures function as addresses to the CS level. Elements of the set GF can function as both local and global addresses to memory configurations. Each GF ~:lement defines a component of local structure ~md as such can function as a local label in the :relative address chain for an AS-DAG node. In addition, GF labels can be associated with fixed memory locations, thereby functioning as :;~lobal addresses. For example, the symbol COMP can be used to label an AS-DAG edge ibrming a constituent of a relative address, and at the same time provide direct access to a fixed k)cation, that is, label an AS-DAG node. These ~:wo addressing functions can be distinguished ~y usm~ the labels COMP and COMP to 1 P g ~ ,enote the local and global addresses, respectivelydeg null in encoding f--structures at the CS level, each sub-structure maps onto a separate :C/uperpositon space (Slack, 1986). This form of direct addressing requires a set of global ~,'~ymbolic labels that uniquely identify each subs!:ructure. The predicate names of f-structures provide such a labeling system. In this case, each predicate name constitutes a unique origin fi~r definir,,g relative local addressesdeg Hence, !,:~cal labets like COMPp specify locations rc:lative to a predicate 'p', that is, their immedi~ ate dominating node in the AS.-DA(L These labeling systems can be used to solve the two UBD sub-problems. On encountering a filler item in the input string, the analyser must allocate some structural location in memory at which to store the infi,mnation carried by the * .5 J~ern . Part of that information specifies the fi/!er's local address. For example, the inform-a i.ion carried by the phrase the girl might include an encoding of the functional subs, ructure \[OBJ ** \[pred 'girl' + spec the + hum sg\]\] 6. At some later point in processing, this information will be superposed, or unified, with stored information about the structm'e of some local tree. For example, the predicate talk 5 Problems of structural ambiguity are not being considered at this point.</Paragraph>
    <Paragraph position="21"> 6 This encoding utilises the fact that memory acldresses carl also be encoded as memory contents, and vice versa.</Paragraph>
    <Paragraph position="22"> may encode through subcategorisation tile local structure talk(subj, obj, comp). If the OgJ function remains unspecified, the filler information will unify at this location in memory, enabling its structural address to be specified ielative to the address of the predicate talk. The syntactic analyser can solve the memory allocation problem by generating the label TOPIC, enabling the encoding: TOPICg -&gt; \[OBJ*a\[NP features\]\]Tto be created. This encoding solves the local depend~ency problem.</Paragraph>
    <Paragraph position="23"> To solve the problem of global indete&gt; minacy, the analyser must also build an encoding like COMPp -&gt; TOPICg the effect of which is to move the topicalised information through connected COMP locations.</Paragraph>
    <Paragraph position="24"> in other words, it corresponds to the control equation TOPIC = COMP* at the symbolic level a.</Paragraph>
    <Paragraph position="25"> The principle underlying this latter encoding is that COMP labels a specific location in memory determined relative to the locanon with the global address 'p'. This means that the local label is automatically reassigned as each new local structure unfolds. The operation of reassignment involves two concurrent actions: 1) Direct labelling - the structural location labelled by COMPp is reqabelled using the new predicate name as a global label; 2) Build connection - a new location is connected into the structm'e with the label COMPpwhere 'p' is bound to the new predicate name. The effect of the last action is to pass the topicalised information onto the next connected level of local structure. Obviously, if the first action occurs without the second, the filler label will not be passed as the COMPp -&gt; TOPICg encoding will become undefined. Once this happens the COMP~ address is no longer retrievable mr further processing. However, the second action can only C/mcur if the building of a COMPp edge is licensed by the information carried by the predicate 'p'. For example, consider the partial fostructure shown in figure  node can be passed down the COMPp reassignment chain descending from the same node, but it cannot be passed to the COMP embedded in the SUBJ f-structure. The COMPp -&gt; TOPICg encoding is undefined at the location addressed by the SUBJp label, because the COMP function cannot be licensed by the SUBJ predicate.</Paragraph>
    <Paragraph position="26"> In brief, functional uncertainty expressions such as (^ TOPIC) = (^ COMP* OBJ) cannot be mapped directly to the CS level as structural addresses. Instead, the uncertainty is captured by the CS encodings COMPp-&gt; TOPICg and TO. PICg -&gt; OBJ**\[NP features\]. As the connectlvxty structure unfolds in memory, the action of reassigning COMPp places restrictions on the set of structural addresses to which the topicalised information can be passed. Using predicate-argument structures as address systems for the CS level leads to the conclusion that connectivity, defined at this level of linguistic structure, determines the distribution of fillers and gaps within a language.</Paragraph>
  </Section>
class="xml-element"></Paper>
Download Original XML