<?xml version="1.0" standalone="yes"?> <Paper uid="W06-3507"> <Title>Scaling Construction Grammar up to Production Systems: the Situated Constructional Interpretation Model Guillaume Pitel Langue et Dialogue</Title> <Section position="1" start_page="0" end_page="55" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> While a great deal of effort has gone into the development of fully integrated modular understanding systems, little research has focused on the problem of unifying existing linguistic formalisms with cognitive processing models. The Situated Constructional Interpretation Model is one such attempt. In this model, the notion of &quot;construction&quot; has been adapted so that it can mimic the behavior of Production Systems. The Construction Grammar approach establishes a model of the relations between linguistic forms and meaning by means of constructions. The latter can be considered as pairings from a topologically structured space to an unstructured space, which makes them, in a way, a special kind of production rule.</Paragraph> <Paragraph position="1"> Accounting for pragmatic and cognitive phenomena in a linguistic formalism is a challenging task whose resolution would be of great benefit to many fields of linguistics, especially those dealing with interpretation in context. In domains such as practical dialogue or embodied understanding, there would be a real gain in dealing with environment data the same way one deals with linguistic data. Systems of this kind currently need ad hoc heuristics or representations. These heuristics are implemented in modules that are often impossible to reuse for any task other than the one they were developed for. 
This point particularly concerns phenomena that lie at the interface of linguistics and general cognition, such as vagueness (Ballweg, 1983), reference resolution (Brown-Schmidt, 2003; Reboul, 1999), or modeling of cognitive representations (Langacker, 1983; Talmy, 1988).</Paragraph> <Paragraph position="2"> Similarly, accounting for linguistic phenomena in a psychologically motivated model is far from simple. Attempts in that direction are often limited to simple phenomena, because all linguistic formalisms rely on principles slightly or totally different from those of cognitive architectures.</Paragraph> <Paragraph position="3"> The definitive solution to this problem is probably still far out of reach; nevertheless, I think that the maturity of cognitive linguistics and the consequent emergence of language analyzers connected to cognitive architectures are an excellent step toward a unified theory mixing linguistic and psychological models. The Embodied Construction Grammar or ECG (Bergen, 2003) and its analyzer (Bryant, 2003) are a good example of such an effort, even though the analyzer does not go beyond the linguistic layer, since mental simulation is left to a separate module based on the notion of x-schema (Narayanan, 2001).</Paragraph> <Paragraph position="4"> Consequently, I propose a model that reconciles a linguistic theory with a cognitive architecture. 
The choice of linguistic theory naturally falls on Construction Grammar (Fillmore, 1988; Kay, 2002) and Frame Semantics (Fillmore, 1982), due to the parallel one can draw between a production rule and a construction, and the cognitive architecture is, obviously, the family of Production Systems (Newell, 1990; Anderson, 1993).</Paragraph> <Paragraph position="5"> Moreover, since many pragmatic models rely on topologically structured representations, I introduce the notion of context, a notion that has never been adapted to these theories, in order to organize data in &quot;storages&quot; structured in dissimilar ways.</Paragraph> <Section position="1" start_page="49" end_page="49" type="sub_section"> <SectionTitle> 1.1 Typical Problem </SectionTitle> <Paragraph position="0"> Consider a situation where a user can command a piece of software to manipulate some very simple objects (colored geometrical objects of various sizes). The user may say (a) &quot;Put the small red square on the left&quot;, (b) &quot;Remove the small red square on the left&quot; or (c) &quot;Move the small red square on the left&quot;.</Paragraph> <Paragraph position="1"> First, these three utterances may involve different parses depending on the actual environment of the utterance, at least for those with &quot;put&quot; and &quot;move&quot;. Second, the &quot;square&quot; targeted by the user may be a rectangle in the actual software representation, with slightly different width and height. It may also be relatively small compared to other red squares, but bigger than other objects, and relatively red compared to other non-square objects.</Paragraph> <Paragraph position="2"> Imagine what happens in the different situations illustrated in Figure 1. In situation 1, for instance, (a) would not be understandable, since the small square is already on the left, while (c) could lead to the one-argument sense of &quot;move&quot;, i.e. 
&quot;move something somewhere else&quot;, not to the two-argument version &quot;move something somewhere&quot; (actually, the one-argument sense is an implicit understanding of the destination allowed by &quot;move&quot;, so the difference should not be lexicalized). In situation 2, (b) and (c) would lead to two different interpretations of the referring expression &quot;the small red square on the left&quot;: in (b), it refers to the square in the center (with some possible hesitation), while in (c) it preferably refers to the square on the right. In situation 3, (c) may be interpreted with the one-argument sense of &quot;move&quot;, and will then target the square on the left since it is the smallest, but there should be strong hesitation, since the other square is not much bigger, and the two-argument sense of &quot;move&quot; is intuitively preferred. At the same time, (a) will target the square on the right, which is relatively small compared to the neighboring circles, but would not be understood if the circles were missing.</Paragraph> <Paragraph position="3"> In general, in order to take those facts into account, it is necessary either to produce all possible analyses at each layer of the interpretation (which is quite problematic if it is desirable to allow for imperfect analyses), or to allow two-way interactions between the layers of interpretation (for instance, the pragmatic layer talking back to the semantic layer about the fact that the original position of an object is the same as the requested destination, which may indicate a wrong analysis).</Paragraph> <Paragraph position="4"> My proposal is to allow for a generic capacity of interaction between the states of the interpretation (speaking of states is better than speaking of layers, since the latter presupposes something about the organization of the interpretation), based on a unified operation between all the possible states. 
More specifically, the idea is to merge the notions of construction grammar and production systems.</Paragraph> </Section> <Section position="2" start_page="49" end_page="50" type="sub_section"> <SectionTitle> 1.2 Merging Construction Grammar and Production Systems </SectionTitle> <Paragraph position="0"> Merging a linguistic analyzer with a cognitive processing model may seem somewhat pointless, since they do not share the same objective. A linguistic analyzer's goal is to provide a formal model for the representation of linguistic knowledge, in accordance with linguistic observations. Cognitive models, on the other hand, aim to help model real cognitive processing, in order to compare theoretical models of perception processing with real data from experiments. Since cognitive models like production systems are Turing-equivalent, they do not lack expressiveness, meaning that anything one can describe with any linguistic representation could be implemented within a cognitive model (hopefully, since linguistic competence is part of cognitive competence).</Paragraph> <Paragraph position="1"> However, to my knowledge, no attempt to describe linguistic competence within a cognitive model has gone very far. Existing research on that topic has focused on very narrow problems and, more importantly, has been tied to very small lexicons (Emond, 1997; Ball, 2003; Fowles-Winkler and Michaelis, 2005).</Paragraph> <Paragraph position="2"> My analysis of this problem is that production systems are too permissive to allow a human to describe a grammar with reasonable effort. More specifically, all the generalization links that exist between grammar rules would have to be encoded in some explicit way in a production system.</Paragraph> <Paragraph position="3"> Furthermore, linguistic formalisms are designed to express all possible human languages, and only those. 
In other words, a linguistic formalism is successful when it is flexible enough to describe all linguistic phenomena, while being human-readable enough to allow for large-scale grammar development. As a consequence, linguistic formalisms are too restrictive to deal with cognitive processes like the ones described using production systems.</Paragraph> <Paragraph position="4"> Putting together a linguistic formalism and a model of pragmatic and cognitive processing implies making a choice among all the current theories. Given the large predominance of production systems in cognitive modeling, it seems quite natural to choose them as the cognitive model. The choice of the linguistic formalism is more open.</Paragraph> <Paragraph position="5"> Previous attempts at linguistic modeling in cognitive models have used</Paragraph> <Paragraph position="7"> theory, categorial grammar or construction grammar. My pick has been construction grammar because it shares some interesting features with production systems, and because it deals directly with semantics, contrary to other grammatical theories. In particular, constructions are pairings between two poles, form and meaning; this is very similar to the notion of a production taking its input from one chunk and producing its output into another.</Paragraph> </Section> <Section position="3" start_page="50" end_page="51" type="sub_section"> <SectionTitle> 1.3 Example of processing </SectionTitle> <Paragraph position="0"> In such an approach, what should happen when interpreting &quot;move the small square on the left&quot; in situation 3 of Figure 1? The first step of the analysis (simplified for the sake of clarity), illustrated in Figure 3, shows how &quot;move&quot; produces a predicate that encompasses a Cause-Motion schema, itself evoking a Source-Path-Goal (SPG) and a Force-Action (FA) schema. 
The CxMove construction adds the constraint that source and goal should differ.</Paragraph> <Paragraph position="1"> After this, two CxImperative constructions can connect, through their theme role, the referents evoked by the RefExp schemas (each construction being one possible interpretation) with the source of the Source-Path-Goal. The CxImperative encapsulates the predicate in a Request schema. Another construction can connect the goal of the Source-Path-Goal with the Spatial-P produced from &quot;on the left&quot;, with the predicate modified by the construction that took its RefExp from &quot;the small square&quot;.</Paragraph> <Paragraph position="2"> At this point, the &quot;mental simulation&quot; required to resolve the referents can start. This step is illustrated in a very simplified way in Figure 2. The complete process is described in (Pitel, 2004; Pitel &amp; Sansonnet, 2003) and processes potential referents through several sorting steps, one for each referential predicate (here: square and small in Resolution Context 1 from the two-argument move interpretation; square, small and on the left in Resolution Context 2 from the other one). The process is described with the kind of constructions defined by the SCIM.</Paragraph> <Paragraph position="3"> 2 Basic Notions of the SCIM. The Situated Constructional Interpretation Model (SCIM) describes how information can be processed in a way that is both linguistically and psychologically plausible. It relies on three notions: schemas are for low-level data description, contexts are for describing the organization of instances of schemas, and s-constructions represent the means to process data. Ultimately, a SCIM-based interpretation system will run instances of s-constructions that take and produce instances of schemas situated in instances of contexts. 
These three notions are partly inherited from the ECG.</Paragraph> </Section> <Section position="4" start_page="51" end_page="52" type="sub_section"> <SectionTitle> 2.1 Schemas </SectionTitle> <Paragraph position="0"> Schemas are constrained, typed feature structures, with an inheritance mechanism and no type disjunction. Schemas are a kind of data type. They describe complex structures of information used to represent the state of the running interpretation. As shown in Figure 4, schemas are defined with three blocks. The first specifies from which schema(s) this one inherits: as a specific case of the schema x, it inherits all of its properties (roles and constraints).</Paragraph> <Paragraph position="1"> roles, which specifies a list of roles, each constrained to a given schema type or atomic type (Integer, Boolean, String, or user-defined enumerations of symbols).</Paragraph> <Paragraph position="2"> constraints, which specifies the constraints that must be verified in order for an instance of the schema to be a valid one. A constraint can be a predicate if the role has an atomic type, an identification constraint (asserting that two roles must share the same value), or a filler constraint with a constant value.</Paragraph> <Paragraph position="3"> An instance of schema is moreover described by values attached to its roles (some or all of them may be left underspecified), a unique identifier, a positive value representing its informative capacity, a trust level expressed as a percentage, and the list of its parents' identifiers. A parent of an instance of schema is an instance of schema &quot;used&quot; in the process that led to its production. It is thus possible, in an s-construction, to know whether two given instances of schema are somehow related to each other in the interpretation process.</Paragraph> <Paragraph position="4"> From the production systems perspective, schemas define the type of features that can be attached to a category. 
Basically, from that point of view, an instance of schema is a chunk and roles are slots. Schemas hierarchy. Schemas can inherit roles and constraints from other schemas. That means that schemas are organized in a multiple inheritance hierarchy. In order to avoid ambiguity in role access, inherited roles must be accessed through an inheritance path. For instance, accessing the role color in a schema Square, if the hierarchy is Figure, Rectangle, Square (each schema inheriting from the previous one) and the color role is declared in the Figure schema, would be done through this kind of path: Rectangle*Figure*color.</Paragraph> <Paragraph position="6"> One problem with this approach to inheritance is that, in order to fulfill the Liskov substitution principle (Liskov, 1988), it is sometimes necessary to use unnatural type hierarchies (stating that Square doesn't inherit from Rectangle, for instance). I am very mindful of this problem, since such a discrepancy is quite troublesome for a model that aims to approximate the human way of processing information, but this problem is outside the scope of this paper. A schema declaration contains a set of constraints that must be satisfied in order for an instance of this schema to be considered valid. Constraints are specified with six basic forms: Type constraints on roles.</Paragraph> <Paragraph position="7"> Boolean operation (OR, NOT, NAND,...) 
connecting several constraints.</Paragraph> <Paragraph position="8"> Filler constraint, symbolized by a single arrow, specifies that a constant, atomic value must fill the role in an instance.</Paragraph> <Paragraph position="9"> Identification constraint, symbolized by a double-headed arrow (↔), specifies that both sides of the constraint must unify, that is, all roles' values must be compatible with each other.</Paragraph> <Paragraph position="10"> An equality constraint (=) that constrains two roles to refer to the same instance.</Paragraph> <Paragraph position="11"> A boolean predicate constraint can be asserted between any number of roles.</Paragraph> <Paragraph position="12"> Another kind of constraint, on the places occupied by instances of schema in a context, will be explained in the section about s-constructions, as will the role of question marks in the schema declaration formalism.</Paragraph> <Paragraph position="13"> Footnote on the inheritance problem: we consider that this problem could be solved by the approach called &quot;Points of View Theory&quot; (which is not related to inter-person points of view), proposed by Pitel (2004). In this theory, there is no type hierarchy, and the ability to substitute one representation for another is described by rules that can take the dynamic context into account. In this approach, types do not represent concepts, but points of view on perceptions (in the wide sense), and the transition from one point of view to another is context-dependent.</Paragraph> </Section> <Section position="5" start_page="52" end_page="53" type="sub_section"> <SectionTitle> 2.2 Contexts </SectionTitle> <Paragraph position="0"> A context declaration is a description of a container that can hold instances of schemas. 
In other words, it describes a space (including its topology, which can be specified by a set of relations and operations) that can contain pointers to instances of schemas at given places.</Paragraph> <Paragraph position="1"> The notion of context inherits all of the properties of the notion of schema. In fact, a context is a kind of schema and, as a consequence, a schema's role can be restricted to be a context. A declaration of context adds three more blocks to the declaration of a schema, as shown in Figure 5: places declares a list of opaque types (the internal structure of the type is hidden in the implementation) that describe acceptable positions in the context. Instances of schema (or context) that will be contained in an instance of this context will be linked with a position whose type is one (and only one) of the declared places. Examples of places are: point, segment, multi-segment, line, box, disc, ...</Paragraph> <Paragraph position="2"> relations are functions that associate a value in an atomic domain with one or more places.</Paragraph> <Paragraph position="3"> Relations define constraints on the positions of a set of instances of schema. For instance, one can define a precedence relation in a linear context.</Paragraph> <Paragraph position="4"> operations are functions that associate a position with one or more positions. For instance, a union of segments is an operation.</Paragraph> <Paragraph position="5"> Terminologically, an instance of schema (or context) located in a context, that is, an instance with a place, will be called a situated instance, whereas an instance of schema (or context) simply connected to another instance by a role will just be called a role instance.</Paragraph> <Paragraph position="6"> The only explicit equivalent to contexts in ECG is the notion of space, which describes Fauconnier's mental spaces (Fauconnier, 1985). 
Implicit contexts are, however, used in Construction Grammar: the form pole, which stores instances of schemas representing linguistic data in a linear space, and the unstructured meaning pole.</Paragraph> </Section> <Section position="6" start_page="53" end_page="54" type="sub_section"> <SectionTitle> 2.3 S-constructions </SectionTitle> <Paragraph position="0"> S-constructions are situated constructions, that is, constructions that describe the relations between several instances of schemas located in structured contexts. Like the notion of context, the notion of s-construction is derived from that of schema, because the s-construction itself can hold information. Besides that, the declaration of an s-construction contains: A constructional block that describes the other instances of s-constructions this s-construction relies on. The block contains a list of label: s-construction-name declarations. Any restriction on the constituents of those instances of s-construction is described as a constraint on label.constituent in the constraints block.</Paragraph> <Paragraph position="1"> A constituents block that describes the instances of contexts and schemas constrained by the s-construction (note that the meaning of constituents is different from that in ECG).</Paragraph> <Paragraph position="2"> The declaration of those constituents specifies whether the instance must preexist and/or whether it may be created or specified by the s-construction's constraints.</Paragraph> <Paragraph position="3"> From a production system point of view, this means that we describe which instances are in the input, and which ones are produced.</Paragraph> <Paragraph position="4"> S-constructions hierarchy. Like schemas, s-constructions are organized in a multiple inheritance hierarchy. Moreover, s-constructions benefit from a mechanism of constructional dependence, carried by the constructional block. Those two notions are, to some extent, redundant. 
Indeed, inheriting from an s-construction is equivalent to having an instance of this s-construction in the constructional block. However, one can have two different instances of the same s-construction in the constructional block, whereas it is impossible to inherit twice from the same s-construction. Moreover, it is possible to add negative semantics in the constructional block, in order to assert that some instance of s-construction must not have occurred for the s-construction's conditions to be satisfied.</Paragraph> <Paragraph position="5"> The constructional block is thus more powerful than the classical inheritance relation, but as with the schema hierarchy, it is not within the scope of this paper to discuss the inheritance relations between s-constructions. A declaration of s-construction is thus, from that point of view, in conformance with the standard view shared in construction grammars.</Paragraph> <Paragraph position="6"> Situated aspects of s-constructions. An s-construction can &quot;choose&quot; instances of schemas, given positional constraints in the context where the instances of schemas are stored. Then, the s-construction will &quot;create&quot; new instances of contexts or schemas, or will specify the value of some previously underspecified role. S-constructions can connect together more than two instances of schema. In that respect, they differ from ECG's constructions (ECG's way of doing so makes use of an evoke block).</Paragraph> <Paragraph position="7"> The specification of structural constraints is very similar to that of the other constraints. A structural constraint looks like this: context-id.relation(roles-in-context-id). Basically, a context relation is considered as a boolean predicate constraint. 
The main difference is that, instead of constraining the roles, such a constraint bears on the place of the instance of schema referred to by the role.</Paragraph> <Paragraph position="8"> Dynamic aspects. The biggest gap between production systems and construction grammar is the difference between the dynamic nature of productions and the declarative nature of linguistic constructions. For instance, a typical rule in a production system (from the ACT-R tutorial) is represented in Figure 7.</Paragraph> <Paragraph position="9"> In order to take this possibility into account, it is necessary to introduce some imperative features into the s-construction.</Paragraph> <Paragraph position="10"> Imperative features are introduced through several mechanisms. The first one is about role instances, the second one is about situated instances and the third one is about specifying constituents acting as inputs and/or outputs.</Paragraph> <Paragraph position="11"> Figure 7. ACT-R declaration: (p start =goal> ISA count-from start =num1 step start ==> =goal> step counting). English description: if the goal is to count from the number =num1 and the step is start, then change the goal to note that one is now counting. Mutable roles. In the roles blocks, they are marked by a question mark (?). If a role is marked as mutable in a schema declaration, then it can be accessed in two ways in an s-construction constraint. The usual way constrains the state of the role instance before the application of the s-construction; the mutated way constrains the state of the role instance after the application of the s-construction.</Paragraph> <Paragraph position="12"> Removable situated instances. The constraint OUT(&lt;constituent-id&gt;) specifies that the situated instance must be marked as no longer present in its context after execution of the s-construction.</Paragraph> <Paragraph position="13"> Input and/or output constituents. 
Each constituent of an s-construction is marked with a symbol /I or /O, stating whether the situated instance should be present beforehand and whether it will be modified.</Paragraph> </Section> <Section position="7" start_page="54" end_page="54" type="sub_section"> <SectionTitle> 3 Computational Aspects </SectionTitle> <Paragraph position="0"> Given the characteristics of the SCIM, its expressiveness and its procedural orientation, one cannot ignore the problems that it raises from the computational point of view. Building an implementation of the Situated Constructional Interpretation Model definitely means giving up the idea of conducting a complete exploration of the search space.</Paragraph> <Paragraph position="1"> The main problem is that two s-constructions may lead to contradictory constraints. In other words, one must keep track of all the decisions and explore all the possibilities.</Paragraph> <Paragraph position="2"> The problem is even worse with mutable instances, since some constraints may be satisfied at some moment in one possible interpretation, while being unsatisfied at another moment. This time dependence must be handled very carefully, and adds some complexity to the processing of constraints. However, the model also presents some interesting features, computationally speaking. For instance, it is quite easy to add a weighting layer to the SCIM, in order to simulate expectation, informational potential, or execution cost. Such a layer could be trained to learn how to reach the best interpretations at minimal cost.</Paragraph> </Section> <Section position="8" start_page="54" end_page="55" type="sub_section"> <SectionTitle> 4 Conclusion </SectionTitle> <Paragraph position="0"> In this paper I propose and describe a model of interpretation that is both linguistically and psychologically motivated. This model makes it possible to describe a construction grammar as well as a production system, with three basic notions: schemas, contexts and s-constructions. 
Applications for such a model are wide-ranging, from more integrated dialogue systems to a unified theory of cognition and language.</Paragraph> <Paragraph position="1"> A longer description of the processing architecture would be necessary in order to really test the hypotheses I made in the section &quot;Computational Aspects&quot;; nevertheless, one can already draw a parallel between this model, with its spatial structuring of information, and the structures that neuromimetic models can handle. Also, incomplete exploration of the search space, guided by a cost/gain approach, has previously been proposed as a plausible model of processing for human cognition. More than computational efficiency, the goal of this model is to propose a formalism that would be easier to use for both linguistic and cognitive modeling, in order to observe and act on the simulated processing of language and other cognitive functions.</Paragraph> <Paragraph position="2"> Many of the claims in this paper have yet to be proved through the implementation of the SCIM, and through cognitive modeling using the system. Since many processing models have been built on both construction grammar and production systems, important research results should be easy enough to reuse in the SCIM.</Paragraph> </Section> </Section> </Paper>