File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/06/w06-1516_intro.xml
Size: 4,136 bytes
Last Modified: 2025-10-06 14:03:59
<?xml version="1.0" standalone="yes"?> <Paper uid="W06-1516"> <Title>SemTAG, the LORIA toolbox for TAG-based Parsing and Generation</Title> <Section position="3" start_page="0" end_page="115" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> In this paper we introduce a toolbox that allows for both parsing and generation with TAG. This toolbox combines existing software and aims at facilitating grammar development. More precisely, this toolbox includes: * XMG: a grammar compiler which supports the generation of a TAG from a factorised TAG (Crabbé and Duchier, 2004), * LLP2 and DyALog: two chart parsers, one with a friendly user interface (Lopez, 2000) and the other optimised for efficient parsing ming language, and a tabular compiler for this language. The DyALog system is well-adapted to the compilation of efficient tabular parsers.</Paragraph> <Paragraph position="1"> 2 XMG, a grammar writing environment for Tree Based Grammars. XMG provides a grammar writing environment for tree based grammars (footnote 3) with three distinctive features. First, XMG supports a highly factorised and fully declarative description of tree based grammars. Second, XMG permits the integration of a semantic dimension into a TAG. Third, XMG is based on well-understood and efficient logic programming techniques. Moreover, it offers a graphical interface for exploring the resulting grammar (see Figure 1).</Paragraph> <Paragraph position="2"> Factorising information. In the XMG framework, a TAG is defined by a set of classes organised in an inheritance hierarchy where classes define tree fragments (using a tree logic) and tree fragment combinations (by conjunction or disjunction). XMG furthermore integrates a sophisticated treatment of names whereby variable scope can be local, global or user-defined (i.e., local to part of the hierarchy).</Paragraph> <Paragraph position="3"> In practice, the resulting framework supports a very high degree of factorisation. 
For instance, a first core grammar (FRAG) for French comprising 4 200 trees was produced from roughly 300 XMG classes.</Paragraph> <Paragraph position="4"> Integrating semantic information. In XMG, classes can be multi-dimensional. That is, they can be used to describe several levels of linguistic knowledge such as, for instance, syntax, semantics or prosody. At present, XMG supports classes including both a syntactic and a semantic dimension. As mentioned above, the syntactic dimension (Footnote 3: Although in this paper we only mention TAG, the XMG framework is also used to develop so-called Interaction Grammars, i.e., grammars whose basic units are tree descriptions rather than trees (Parmentier and Le Roux, 2005).)</Paragraph> <Paragraph position="5"> describe (partial) tree fragments. The semantic dimension, on the other hand, can be used to associate with each tree a flat semantic formula. Such a formula can furthermore include identifiers which corefer with identifiers occurring in the associated syntactic tree. In other words, XMG also provides support for the interface between semantic formulae and tree decorations. Note that the inclusion of semantic information remains optional. That is, it is possible to use XMG to define a purely syntactic TAG.</Paragraph> <Paragraph position="6"> XMG was used to develop a core grammar for French (FRAG) which was found to have 75% coverage on the Test Suite for Natural Language Processing (TSNLP; Lehmann et al., 1996). The FRAG grammar was furthermore enriched with semantic information using another 50 classes describing the semantic dimension (Gardent, 2006). The resulting grammar (SEMFRAG) describes both the syntax and the semantics of the French core constructions.</Paragraph> <Paragraph position="7"> Compiling an XMG specification. 
By building on efficient techniques from logic programming, and in particular on the idea of the Warren Abstract Machine (Aït-Kaci, 1991), the XMG compiler allows for very reasonable compilation times (Duchier et al., 2004). For instance, the compilation of a TAG containing 6 000 trees takes about 15 minutes with a 2.6 GHz Pentium 4 processor and</Paragraph> </Section> </Paper>