<?xml version="1.0" standalone="yes"?> <Paper uid="W98-1429"> <Title>SYSTEM DEMONSTRATION NATURAL LANGUAGE GENERATION WITH ABSTRACT MACHINE</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> SYSTEM DEMONSTRATION NATURAL LANGUAGE GENERATION WITH ABSTRACT MACHINE </SectionTitle> <Paragraph position="0"> {gabr, francez}@cs.technion.ac.il</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> Shuly Wintner, Seminar für Sprachwissenschaft, Universität Tübingen, 72074 Tübingen, Germany </SectionTitle> <Paragraph position="0"> shuly@sfs.nphil.uni-tuebingen.de Abstract: We present a system for Natural Language Generation based on an Abstract Machine approach. Our abstract machine operates on grammars encoded in a unification-based Typed Feature Structure formalism, and is capable of both generation and parsing. For efficient generation, grammars are first inverted to a suitable form, and then compiled into abstract machine instructions. A dual compiler translates the same input grammar into an abstract machine program for parsing. Both generation and parsing programs are executed under the same (chart-based) evaluation strategy. This results in an efficient, bidirectional (parsing/generation) system for Natural Language Processing. Moreover, the system possesses ample debugging features, and thus can serve as a user-friendly environment for bidirectional grammar design and development.</Paragraph> </Section> </Section> <Section position="2" start_page="0" end_page="277" type="metho"> <SectionTitle> 1 Overview </SectionTitle> <Paragraph position="0"> The input to the generation 1 task consists of a logical form, which represents a meaning, and a grammar that governs the generation process. 
The output consists of one or more phrases in the language of the grammar whose meaning is (up to logical equivalence) the given logical form.</Paragraph> <Paragraph position="1"> The system to be demonstrated applies an Abstract Machine (AM) approach to Natural Language Generation, within the framework of Typed Feature Structures (Carpenter, 1992b). Such a machine is an abstraction over an ordinary computer, lying somewhere between regular high-level languages and common hardware architectures. Programming an Abstract Machine has proved fruitful in previous research, most notably as a highly efficient technique for building Prolog compilers (Aït-Kaci, 1991).</Paragraph> <Paragraph position="2"> AMALIA 2 has two compilers that translate grammars into Abstract Machine instructions; the outputs of compilation are AM programs which perform either chart generation or chart parsing, both according to the given grammar. Both tasks use an auxiliary table (chart) to store intermediate processing results. AMALIA has a uniform core engine for bottom-up chart processing, which interprets the given (abstract machine) program and realizes the generation or parsing task. In the case of generation, it is the given semantic meaning whose components are consumed in the process. The only differences between the two processing directions are in the nature of chart items and the interpretation of the final results. Thereby, AMALIA makes dual use of its chart and forms a complete bidirectional natural language system, which is considered an advantage in the literature (Strzalkowski, 1994).</Paragraph> <Paragraph position="3"> The system is capable of very efficient processing, since grammars are precompiled directly into abstract machine instructions, which are subsequently executed over and over.</Paragraph> <Paragraph position="4"> 1 In this work we mean by &quot;generation&quot; what is sometimes also known as &quot;syntactic generation&quot;. Thus, no text
planning, speaker intentions and the like are considered here. Logical forms specified as meanings by input grammars are given in a so-called predicate-argument structure 3. Thus, meanings are built from basic units (feature structures), each having a predicate and (optionally) a number of arguments. Our approach also allows λ-abstractions over predicate-argument constructs, as well as systematic encoding of second- and higher-order functions.</Paragraph> <Paragraph position="5"> Grammars are usually designed in a form oriented towards the analysis of a string and not towards generation from a (usually nested) semantic form. In other words, rules reflect the phrase structure and not the predicate-argument structure. It is therefore useful to transform the grammar so that any given logical form is systematically reflected in the productions. For this purpose, we apply to the input grammar an inversion procedure, based upon 4 (Samuelsson, 1995), to render the rules with the nested predicate-argument structure, corresponding to that of input logical forms. The resultant &quot;inverted&quot; grammar is thus more suitable for performing the generation task. Once the grammar is inverted, the generation process can be directed by the input semantic form; elements of the input are consumed during generation just as words are consumed during parsing. Grammars must satisfy certain requirements in order to be invertible. However, the requirements are not overly restrictive and allow encoding of a variety of natural language grammars.</Paragraph> <Paragraph position="6"> Grammar inversion is performed prior to compilation for generation. The given grammar is enhanced in a way that will ultimately make it possible to reconstruct the words spanned by the semantic forms. To achieve this aim, each rule constituent is extended by an additional special-purpose feature. 
The value of this feature for the rule's head is set to the concatenation of its values in the body constituents, to reflect the original phrase structure of the rule.</Paragraph> <Paragraph position="7"> Figure 1 delineates an overview of AM-based generation. After the grammar is inverted, it is compiled into abstract machine code. At run time, the given logical form is decomposed into meaning components, which initialize the AM chart, and then the generation program is invoked. If generation terminates, it yields a (possibly empty) set of feature structures; a grammar-independent post-processing routine analyzes these structures and retrieves the generated phrases per se.</Paragraph> <Paragraph position="8"> as a unified platform for parsing and generation, elaborating more on the way the two directions are integrated into a single system.</Paragraph> </Section> <Section position="3" start_page="277" end_page="277" type="metho"> <SectionTitle> 2 AMALIA functionality </SectionTitle> <Paragraph position="0"> AMALIA operates on input grammars encoded in a subset of the ALE specification language (Carpenter, 1992a). In particular, AMALIA supports the same type hierarchies as ALE does, with exactly the same specification syntax. This means that the user can specify any bounded-complete partial order as the type hierarchy. In contrast to ALE, AMALIA allows appropriateness loops in the type hierarchy. On the other hand, AMALIA does not support type constraints and relational extensions.</Paragraph> <Paragraph position="1"> AMALIA uses a subset of ALE's syntax for describing totally well-typed, possibly cyclic, non-disjunctive feature structures. Set values, as in ALE, are not supported, but list values are. AMALIA does not respect the distinction between intensional and extensional types (Carpenter, 1992b, Chapter 8). 
Also, feature structures cannot incorporate inequality constraints.</Paragraph> <Paragraph position="2"> AMALIA supports macros in a similar way to ALE. The syntax is the same, and macros can have parameters or call other macros (though not recursively, of course). ALE's special macros for lists are supported by AMALIA. Lexical rules are not supported in this version of AMALIA. AMALIA's syntax for phrase structure rules is similar to ALE's, with the exception of the cats> specification (permitting a list of categories in the body of a rule), which is not supported. AMALIA uses ALE's syntax for describing lexical entries, and allows disjunctive lexical entries, separated by semicolons.</Paragraph> <Paragraph position="3"> * AMALIA is implemented in ANSI-C, augmented by lex and yacc to implement the input acquisition module, and Tcl/Tk to build the graphical user interface. The application is compatible with a variety of platforms, such as Sun and Silicon Graphics workstations running the UNIX operating system, as well as IBM PCs running Windows 95 and Linux. For a detailed description and a complete user's guide of AMALIA, refer to (Wintner, Gabrilovich, and Francez, 1997b).</Paragraph> <Paragraph position="4"> * There are two versions of AMALIA: an interactive, user-friendly program with a graphical user interface, and a non-interactive but more efficient version for batch processing. The former program provides extensive debugging capabilities, and is ideally suited for developing reversible grammars.</Paragraph> <Paragraph position="5"> Figure 2 presents a sample snapshot of the program screen. In the case of generation, the &quot;Input string&quot; field specifies the name of the query file, which contains (an ALE description of) a feature structure representing the input semantic form. In this example, the query file encodes the logical form ∀x(man(x) → dream(x)); the feature structure for this query is shown in the figure over the main program screen. 
The &quot;Messages&quot; window displays the phrases generated (if any). The feature structures that encode these phrases are also displayed graphically, in separate windows (not shown in the figure). In the case of parsing, the &quot;Input string&quot; field contains the word string to be parsed, and the program eventually displays feature structures assigned to this string by the parser (if any).</Paragraph> </Section> </Paper>