<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-0103">
  <Title>A Web-based Instructional Platform for Constraint-Based Grammar Formalisms and Parsing</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 Problems of seminar-style courses
</SectionTitle>
    <Paragraph position="0"> The contents of our core modules are based on a series of previous seminar-style courses, in particular on constraint-based grammar implementation, which also started integrating interactive components and web-based materials into traditional face-to-face teaching. These are described in detail in Section 5. The traditional seminar-style teaching method underlying the courses mentioned therein has a number of inherent problems, however. These problems become particularly pressing when topics as diverse as linguistic theory, grammar implementation, parsing, mathematical foundations of linguistic theory and feature logics are combined in a single course that is addressed to a mixed audience with varying backgrounds in computer science, knowledge representation, artificial intelligence and linguistics, in any combination of these subjects. (Proceedings of the Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics, Philadelphia, July 2002, pp. 19-26. Association for Computational Linguistics.)</Paragraph>
    <Paragraph position="1"> First, the seminar-style teaching format as used in those grammar implementation courses presupposes a fairly coherent audience of linguists with a shared background of linguistic knowledge. Second, since computers are only used as a medium to implement grammars and since the implementation platform is not optimized for web-based training, it is necessary that there be a relatively low number of students per teacher. Third, the theoretical material is in the form of overheads and research papers, which are in electronic form but not easily accessible without the accompanying lecture as part of a seminar-style course. Fourth, the background lectures of the courses lack the support of the kind of graphical, interactive visualization that teaching software can in principle offer. Finally, the courses follow a single path through the materials as determined by the teacher, which the student cannot change according to their specific interests and their prior knowledge.</Paragraph>
    <Paragraph position="2"> We believe that these shortcomings can be overcome by shifting from a seminar-style to a web-based training format in a way that preserves the positive aspects of successful hands-on courses. At the same time, we believe that a successful shift from seminar-style to web-based training must be based on a scientific understanding of the nature and possibilities of web-based learning. In the next section we therefore embed our work in the context of education and collaborative learning technology research.</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Education and collaborative learning technology research
</SectionTitle>
    <Paragraph position="0"> Our perspective on web-based training draws its inspiration primarily from work in building &quot;learning communities&quot; in education research (Lin et al., 1995; Nonaka, 1994), in which:
  1. a precise context is established to introduce tacit knowledge and experience, in this case on subjects in computational linguistics and the traditional disciplines it draws from,
  2. conflicting perspectives are shared, concepts are objectified and submitted to a process of justification and arbitration, and
  3. the concepts are then integrated into the knowledge base as modules upon which further instructional material or grammar implementations can be constructed.</Paragraph>
    <Paragraph position="1"> We thus intend to provide an environment that teaches students by actively encouraging them to participate in research that extends our collective knowledge in this area. In principle, there are no boundaries to the material that could be included in the evolving framework. We intend to make it available as an open-source standard for grammar development and instruction in the hope that this will encourage researchers and educators to contribute modules to it, and to use a feature-structure based approach for their own research and courses.</Paragraph>
    <Paragraph position="2"> Scardamalia and Bereiter (1993) identify seven global characteristics that technologies must have to support this kind of participation:
 Balance: a distinction between public and private and between individual and group knowledge processes. That includes free access to others' work, including implementations of concepts as algorithms or grammars, and opportunities for participants to borrow into their own work ideas that would be prohibitively time-consuming or otherwise too advanced to formulate on their own. Such technologies must also encourage time for personal &quot;reflection and refinement&quot; and anonymous public or private contribution to the knowledge space. The present framework achieves this by providing an open-source setting combined with a web-based instructional tool for self-paced learning and individual design of both the contents and order of the curriculum.</Paragraph>
    <Paragraph position="3"> Contribution and notification: to prevent ideas from being presented in an insulated structure that discourages questioning, debate, or revision. As discussed in Section 4.2, this is achieved by providing extensive linking and annotation of resources using web-compatible metalanguages for integrating modules at the implementational, formal and instructional levels.</Paragraph>
    <Paragraph position="4"> Source referencing: a means of preserving the boundaries of a contributor's idea and its credit as well as a history of prior accounts and antecedents to the idea. In the present framework, this is provided by means of a requirements analysis component that requires contributed modules to identify the contribution by new concepts or resources provided, existing concepts or resources imported for it to work, and an account of existing alternatives with a description of its distinction from them.</Paragraph>
    <Paragraph position="5"> Storage and retrieval: which places contributions in a &quot;communal context&quot; of related contributions by others to encourage joint work between contributors working on problems with significant overlap. The present framework must organize the presentation of existing modules along several thematic dimensions to accomplish this.</Paragraph>
    <Paragraph position="6"> Multiple points of entry: for students/contributors with different backgrounds and levels of experience. Material is made accessible in more basic or fundamental modules by projecting the formal content of the subject into a graphically based common-sense domain at which it can be grasped more intuitively (see Section 4.3). Accessibility in more advanced modules is provided by links specified in the requirements analysis component to more basic modules that the former rely upon.</Paragraph>
    <Paragraph position="7"> Coherence-producing mechanisms: feedback to contributors and framework moderators of modules that are &quot;fading&quot; for lack of attention or further development. These can either be reinstated or reformulated, moved to a private space of more peripheral modules, or deleted outright. This is a way of encouraging activity that is productive, and restricting the chance of confusion or information overload. Such a coherence mechanism must exist within this framework.</Paragraph>
    <Paragraph position="8"> Links to external resources: to situate the justification and discussion of contributions in a wide context. We make use of the web-based training platform ILIAS, which is available as open-source software and offers a high degree of flexibility in terms of the integration of internal and external resources.</Paragraph>
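The requirements analysis component mentioned above under source referencing and multiple points of entry can be pictured as one small record per contributed module. The following Python sketch is only an illustration under assumed names: ModuleRecord, unresolved_imports, prerequisite_modules, and the sample module and concept names are hypothetical and are not taken from the framework itself. It records what a module provides, what it imports, and which alternatives it distinguishes itself from, and it derives the more basic modules that an advanced module could link to.

```python
from dataclasses import dataclass, field


@dataclass
class ModuleRecord:
    """Hypothetical record of a contributed module's requirements analysis."""
    name: str
    provides: set = field(default_factory=set)         # new concepts or resources
    imports: set = field(default_factory=set)          # existing concepts it relies on
    alternatives: dict = field(default_factory=dict)   # alternative -> distinction from it


def unresolved_imports(module, registry):
    """Return imported concepts that no other registered module provides."""
    available = set()
    for other in registry:
        if other.name != module.name:
            available |= other.provides
    return sorted(module.imports - available)


def prerequisite_modules(module, registry):
    """Return names of modules that provide something this module imports."""
    return sorted(
        other.name
        for other in registry
        if other.name != module.name and other.provides.intersection(module.imports)
    )


# Illustrative data only; these module and concept names are made up.
basics = ModuleRecord("feature-structures", provides={"type hierarchy", "appropriateness"})
lists = ModuleRecord("list-signature", provides={"list"}, imports={"type hierarchy", "unification"})
registry = [basics, lists]

print(unresolved_imports(lists, registry))    # ['unification']
print(prerequisite_modules(lists, registry))  # ['feature-structures']
```

A record of this kind would make both uses described above mechanical: unresolved imports flag contributions that are not yet self-contained, and the prerequisite list is a candidate set of links from an advanced module to the more basic ones it relies on.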
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 Integration of the framework
</SectionTitle>
    <Paragraph position="0"> The goal of our current work is to transform previous, seminar-style courses and new input into teaching materials that are fit for web-based training in the general framework outlined in the previous section.</Paragraph>
    <Paragraph position="1"> This clearly involves much more than simply reformatting old teaching materials into web-compatible formats. Instead, it requires an analysis of the contents of the courses, the interleaving and hyperlinking of the textual materials, and the development of graphical, interactive solutions for presenting and interacting with the content of the material. Since the nature of the textual material as such is familiar (instructional notes, reference guides to major sections with indices, system documentation, annotated system source code, and annotated grammar source code), we use the limited space in this paper to highlight the integrated nature of the approach as well as the web-based training specific issues of hyperlinking and visualization.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.1 Integration of linguistic and computational aspects
</SectionTitle>
      <Paragraph position="0"> Our approach is distinguished by its integration of grammars, the parsers that use them and the on-line instructional materials. Compared to the LKB system, which, as mentioned in Section 5.2, has also been used successfully in teaching grammar development, our parsing system, called TRALE, offers a greater range of formal expressive devices. This allows for more readable and compact grammars, which we believe to be of central importance in a teaching context. To illustrate this, we are currently porting the LinGO English Resource Grammar (ERG) from the LKB (on which the ERG was designed) to the TRALE system.</Paragraph>
      <Paragraph position="1"> Given that the scope of our web-based training framework includes an integrated module on parsing, it is also relevant that the TRALE system itself can be relatively compact and transparent at the source-code level, since it exploits its close affinity to the underlying Prolog in which it is implemented. This contrasts with the perspective of Copestake et al. (2001), who concede that the LKB is unsuitable for teaching parsing.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.2 The use of hyperlinks
</SectionTitle>
      <Paragraph position="0"> Several different varieties of links are distinguished within the course material, giving a first-class representation to the transfer of knowledge between the linguistic, computational and mathematical sources that inform this interdisciplinary area. We intend to distinguish the following kinds of links:
 Conceptual/taxonomical: connecting instances of key concepts and terms used throughout the course material with their definitions and provenience;
 Empirical context: connecting instances of design decisions, algorithms and formal definitions to encyclopedic discussions of their linguistic motivation and empirical significance;
 Denotational: connecting instances of constructional terms and issues within linguistics as well as correctness conditions of algorithms to the mathematical definitions that formalize them within the foundations of constraint-based linguistics;
 Operational: connecting mathematical definitions and instances of related linguistic discussions to computational instructional material describing the algorithms used to construct, refute or transform the formal objects representing them in a practical system;
 Implementational: connecting discussions of algorithms to the actual annotated system source code in the TRALE system used to implement them, and mathematical definitions and discussions of linguistic constructions to the actual annotated grammar source code used to represent them in a typical implementation.
 The idea behind this classification is that when more course material is added to the web-based training framework we are proposing, the new material will take these distinctions into account to obtain a conceptually coherent use of hyperlinks throughout the framework.</Paragraph>
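One way to give the five link kinds a first-class representation is to attach the kind to every hyperlink as data. The Python sketch below is a minimal illustration, not the representation used in the framework; LinkKind, Link, links_by_kind and the anchor names are all assumptions introduced here.

```python
from dataclasses import dataclass
from enum import Enum


class LinkKind(Enum):
    """The five kinds of links distinguished in the text."""
    CONCEPTUAL = "conceptual/taxonomical"
    EMPIRICAL = "empirical context"
    DENOTATIONAL = "denotational"
    OPERATIONAL = "operational"
    IMPLEMENTATIONAL = "implementational"


@dataclass(frozen=True)
class Link:
    source: str      # anchor in the course material (hypothetical identifier)
    target: str      # anchor the link points to (hypothetical identifier)
    kind: LinkKind


def links_by_kind(links):
    """Group links by kind, keeping an empty list for kinds that are unused."""
    grouped = {kind: [] for kind in LinkKind}
    for link in links:
        grouped[link.kind].append(link)
    return grouped


# Illustrative anchors only; none of these names come from the course material.
links = [
    Link("notes/lists#nelist", "glossary#type", LinkKind.CONCEPTUAL),
    Link("notes/lists#appropriateness", "foundations#signature", LinkKind.DENOTATIONAL),
    Link("notes/parsing#chart", "trale-source#parser", LinkKind.IMPLEMENTATIONAL),
]

grouped = links_by_kind(links)
print([kind.value for kind, items in grouped.items() if not items])
# ['empirical context', 'operational']
```

Keeping the unused kinds visible, as the last line does, is one simple way to check whether newly added material respects all five distinctions.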
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.3 Visualization
</SectionTitle>
      <Paragraph position="0"> Our three core modules make use of a number of graphical user interfaces: a tool for interleaved visualization and interaction with trees and attribute value matrices, one for the presentation of lexical rules and their interaction, an Emacs-based source-level debugger, and a program for the graphical exploration of the formal foundations of typed feature logic. The first two are extensions of tools we already used for our previous courses, and the third is an extension of the ALE source-level debugger, so we here focus on the last, new development.</Paragraph>
      <Paragraph position="1"> The main goal of the MorphMoulder (MoMo) is to project the formality of its subject, the formal foundations of constraint languages over typed feature structures, onto a graphical level at which it can be grasped more intuitively.4 The transparency of this level is essential for providing multiple points of entry (Section 3) to this fundamentally important module. The MoMo tool allows the user to explore the relationship between the two levels of the formal architecture: the descriptions and the elements described. To this end, the user works with a graphical interface on a whiteboard. Labeled directed graphs representing feature structures can be constructed on the whiteboard from their basic components, nodes and arcs. The nodes are depicted as colored balls, which are assigned types, and the arcs are depicted as arrows that may be labeled by feature names. Once a feature structure has been constructed, the user may examine its logical properties. The three main functions of the MoMo tool allow one to check (1) whether a feature structure complies with a given signature, (2) whether a well-formed feature structure satisfies a description or a set of descriptions, and (3) whether a well-formed feature structure is a model of a description or a set of descriptions. In the context of the course, the functions of MoMo thus lead the user from understanding the well-formedness of feature structures with respect to a signature to an understanding of feature structures in their role as a logical model of a theory. If a student has chosen course modules that include a focus on formal foundations of feature logics or feature logics based linguistic theory, the first introduction to the subject by MoMo can easily be followed up by a course module with rigorous mathematical definitions.</Paragraph>
      <Paragraph position="2"> In constraint-based frameworks, the user declares the primitives of the empirical domain in terms of a type hierarchy with appropriate attributes and attribute values. Consider a signature that licenses lists of various birds, which may then be classified according to certain properties. First of all, the signature needs to comprise a type hierarchy and feature appropriateness conditions for lists. Let type list be an immediate supertype of the types non-empty-list and empty-list in the type hierarchy (henceforth abbreviated as nelist and elist). Let the appropriateness conditions declare the attributes HEAD and TAIL appropriate for (objects of) type nelist, the values of TAIL at nelist be of type list, and the values of HEAD at type nelist be of type bird (for lists of birds). Finally, no attributes are appropriate for the type elist. A typical choice for the interpretation of that kind of signature in constraint-based formalisms is the collection of totally well-typed and sort-resolved feature structures. All nodes of totally well-typed and sort-resolved feature structures are of a maximally specific type (a type with no subtypes), and they have outgoing arcs for all and only those features that are appropriate to their type, with the feature values again obeying appropriateness. Our signature for lists thus declares an ontology of feature structures with nodes of type nelist or elist (but never of type list), where the former must bear the outgoing arcs HEAD and TAIL, and the latter have no outgoing arcs and signal the end of the list. The HEAD values of non-empty lists must be in the denotation of the type bird. (Footnote 4: MoMo is written by Ekaterina Ovchinnikova, U. Tübingen.)</Paragraph>
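The conditions in the preceding paragraph can be made concrete with a small checker. The Python sketch below is not MoMo or TRALE code; it is a minimal illustration in which the dictionaries SUBTYPES and APPROP, the node identifiers, and the function names are assumptions. The bird subtypes parrot, woodpecker and canary follow the example in the text, and giving them no attributes is a simplifying assumption.

```python
# Hypothetical encoding of the bird-list signature; every name here is illustrative.
SUBTYPES = {
    "list": ["nelist", "elist"],
    "bird": ["parrot", "woodpecker", "canary"],
}

# Appropriateness conditions for the maximally specific types:
# type -> {attribute: required value type}.
# Assumption: the bird subtypes carry no attributes in this simplified signature.
APPROP = {
    "nelist": {"HEAD": "bird", "TAIL": "list"},
    "elist": {},
    "parrot": {},
    "woodpecker": {},
    "canary": {},
}


def is_subtype(sub, sup):
    """True if sub equals sup or lies below sup in the type hierarchy."""
    if sub == sup:
        return True
    return any(is_subtype(sub, child) for child in SUBTYPES.get(sup, []))


def signature_violations(nodes, arcs):
    """List the ways a feature structure fails to be totally well-typed and
    sort-resolved. nodes maps node ids to types; arcs maps
    (source id, attribute) pairs to target ids."""
    problems = []
    for node, typ in nodes.items():
        # Sort resolution: only maximally specific types may label nodes.
        if typ not in APPROP:
            problems.append(f"{node}: type {typ} is not maximally specific")
            continue
        required = APPROP[typ]
        present = {attr: tgt for (src, attr), tgt in arcs.items() if src == node}
        # "Only those features": no arc may be inappropriate for the node type.
        for attr in present:
            if attr not in required:
                problems.append(f"{node}: {attr} is not appropriate for {typ}")
        # "All features": every appropriate arc must exist with a suitable value.
        for attr, value_type in required.items():
            if attr not in present:
                problems.append(f"{node}: missing attribute {attr}")
            elif not is_subtype(nodes.get(present[attr]), value_type):
                problems.append(f"{node}: value of {attr} is not of type {value_type}")
    return problems


# A one-element list of birds: well-formed under the signature.
nodes = {"n0": "nelist", "n1": "parrot", "n2": "elist"}
arcs = {("n0", "HEAD"): "n1", ("n0", "TAIL"): "n2"}
print(signature_violations(nodes, arcs))  # []

# Adding a HEAD arc to the elist node violates appropriateness.
bad_nodes = dict(nodes, n3="canary")
bad_arcs = dict(arcs)
bad_arcs[("n2", "HEAD")] = "n3"
print(signature_violations(bad_nodes, bad_arcs))
# ['n2: HEAD is not appropriate for elist']
```

Returning a list of violations rather than a single yes/no answer mirrors the kind of per-node feedback a graphical tool can give, although how MoMo computes its feedback internally is not described here.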
      <Paragraph position="3"> Figure 1 illustrates how the MoMo tool can be used to study the relationship between signatures and the feature structures they license by letting the user construct feature structures and interactively explore whether particular feature structures are well-formed according to the signature. To the left of the whiteboard there are two clickable graphics consoles of possible nodes and arcs from which the user may choose to draw feature structures. The consoles offer nodes of all maximally specific types and arcs of all attributes that are declared in the signature. In the present example, parrot, woodpecker, and canary are the maximally specific sub-types of bird.</Paragraph>
      <Paragraph position="4"> Each color of edge represents a different attribute, and each color of node represents a different type.</Paragraph>
      <Paragraph position="5"> The grayed outlines on edges and nodes indicate that all of the respective edges and nodes in this particular example are licensed by the signature that was provided. The HEAD arc originating at the node of type elist, however, violates the appropriateness conditions of the signature. The feature structure depicted here, therefore, is not well-formed.</Paragraph>
      <Paragraph position="6"> The signature check thus fails on the given feature structure, as indicated by the red light in the upper function console to the right of the whiteboard.</Paragraph>
      <Paragraph position="7"> Similarly, MoMo can graphically depict whether a feature structure satisfies or models a single description or a set of descriptions. To this end, the user may be asked to construct a description that a given feature structure satisfies or models; or she may be asked to construct feature structures that satisfy or model a given description (or set of descriptions). The system gives systematic feedback on the correct or incorrect usage of the syntax of the description language as well as on the extent to which a feature structure satisfies or models descriptions, systematically guiding the user to correct solutions.</Paragraph>
      <Paragraph position="8"> Figure 2 shows a successful satisfiability check of a well-formed feature structure. The feature structure is derived from the one in Figure 1 by removing the incorrect HEAD arc and its substructure from the elist node. The query, asked in a separate window, is whether the feature structure satisfies the constraint (nelist, head:(parrot, color:green), tail:nelist). Since this is the case, the green light on the function console to the right signals success. If we were to perform model checking of the same feature structure against the same constraint, checking would fail, and MoMo would indicate the nodes of the feature structure that do not satisfy the given constraint.</Paragraph>
      <Paragraph position="9"> Figure 2: Graphically evaluating constraint satisfaction of feature structures.</Paragraph>
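The difference between the satisfaction check and the model check can be sketched in a few lines of Python. This is an illustration only, not MoMo's implementation: the encoding of descriptions as nested (type, attributes) pairs, the PARENT table, the colour types and the node identifiers are all assumptions, and the sketch assumes that a feature structure models a description only if every one of its nodes, not just the root, satisfies it. Attribute names are written in upper case here, whereas the constraint in the text uses lower case.

```python
# Hypothetical type hierarchy: each type maps to its immediate supertype.
PARENT = {
    "nelist": "list", "elist": "list",
    "parrot": "bird", "woodpecker": "bird", "canary": "bird",
    "green": "color", "red": "color",
}


def is_subtype(sub, sup):
    """True if sub equals sup or lies below sup in the hierarchy."""
    while sub is not None:
        if sub == sup:
            return True
        sub = PARENT.get(sub)
    return False


# The constraint (nelist, head:(parrot, color:green), tail:nelist)
# encoded as nested (type, {attribute: description}) pairs.
DESC = ("nelist", {
    "HEAD": ("parrot", {"COLOR": ("green", {})}),
    "TAIL": ("nelist", {}),
})


def satisfies(nodes, arcs, node, desc):
    """Check whether the substructure rooted at node satisfies desc."""
    typ, attrs = desc
    if not is_subtype(nodes[node], typ):
        return False
    for attr, sub_desc in attrs.items():
        target = arcs.get((node, attr))
        if target is None or not satisfies(nodes, arcs, target, sub_desc):
            return False
    return True


def non_satisfying_nodes(nodes, arcs, desc):
    """Model check: collect every node at which desc does not hold."""
    return sorted(n for n in nodes if not satisfies(nodes, arcs, n, desc))


# A made-up two-element list: a green parrot followed by a red woodpecker.
nodes = {
    "n0": "nelist", "n1": "parrot", "n2": "green",
    "n3": "nelist", "n4": "woodpecker", "n5": "red", "n6": "elist",
}
arcs = {
    ("n0", "HEAD"): "n1", ("n1", "COLOR"): "n2", ("n0", "TAIL"): "n3",
    ("n3", "HEAD"): "n4", ("n4", "COLOR"): "n5", ("n3", "TAIL"): "n6",
}

print(satisfies(nodes, arcs, "n0", DESC))       # True
print(non_satisfying_nodes(nodes, arcs, DESC))
# ['n1', 'n2', 'n3', 'n4', 'n5', 'n6']
```

In this sketch the root satisfies the constraint, so the satisfaction check succeeds, while the model check fails because nodes such as the parrot node are not of type nelist; this parallels the contrast drawn in the text for Figure 2.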
      <Paragraph position="10"> MoMo's descriptions are a syntactic parallel to TRALE's descriptions, thus introducing the student not only to the syntax and semantics of constraint languages but also to the language that will be used for the implementation of grammars later in the course. The close relationship of description languages also facilitates a comparison of their model-theoretic semantics and the truth conditions of grammars with the structure and semantics of algorithms that use descriptions for constraint resolution and in parsing. Finally, their common structure allows for a tight network of hyperlinks across the boundaries of different course modules and course topics, linking them to a common source of mathematical, implementational and linguistic indices, which explain the usage of common mathematical concepts across the different areas of application of typed feature structures.</Paragraph>
    </Section>
  </Section>
</Paper>