<?xml version="1.0" standalone="yes"?> <Paper uid="C82-1035"> <Title>ARBUS, A TOOL FOR DEVELOPING APPLICATION GRAMMARS</Title> <Section position="4" start_page="221" end_page="221" type="metho"> <SectionTitle> GRAMMAR AND PARSER </SectionTitle> <Paragraph position="0"> A grammar is implemented in ARBUS as a set of trees, with a tree for each syntactic category. Each node of a tree (except the root) represents either a terminal word of the language defined, or a category, which then refers to another tree (fig. 1). This is also the way the user must describe a grammar to the system.</Paragraph> <Paragraph position="1"> This representation is a simplified form of transition networks, where each sub-network corresponds to a different syntactic category.</Paragraph> </Section> <Section position="5" start_page="221" end_page="221" type="metho"> <SectionTitle> ARBUS 223 </SectionTitle> <Paragraph position="0"> A tree structure is generally less compact, but absolutely equivalent to a network (by duplicating nodes with multiple parents in the related network). We chose this representation because trees are easier to describe and to visualize interactively. They are also easier to process and to display than unrestricted graphs. And every distinct path in a syntactic tree corresponds to a rewrite rule of the grammar, which is not true in general for transition networks.</Paragraph> <Paragraph position="1"> Any node can be augmented with tests and actions to be performed when coming across the node. These tests and actions are predefined in a library at the disposal of the user, and each one is known under a reference name so that they can be used without having to deal with their actual implementation. For instance, there is an action available to note that a noun phrase is singular, and a test to check later on that the subject of a verb was indeed singular. 
Another action translates a sequence of digits into the corresponding number, etc.</Paragraph> <Paragraph position="2"> These augmentations make it possible to define context-sensitive languages, as one can take the context into account with actions and tests, in order to handle conveniently features such as number agreement between subject and verb. This representation of grammars is then quite similar to Augmented Transition Networks (Woods, 1970), in which tests and actions can be associated with the transitions.</Paragraph> <Paragraph position="3"> The main difference is the use of trees instead of networks to implement a grammar in our system.</Paragraph> <Paragraph position="4"> The parser which will test a grammar by interpreting its representation is also comparable to an ATN parser. It is designed as a top-down, left-to-right parser: when moving through a tree, control is transferred to another tree every time a syntactic category is encountered at a node. This process can be recursive thanks to a pushdown stack. If at a given point there are several possible paths, the parser follows only one, but saves the current state on the stack and will backtrack in case of failure to try the alternatives.</Paragraph> <Paragraph position="5"> If a node is augmented with a test, the transition can be followed only if the test is verified; if there is an action at the node, the action is performed (but will be undone in case of backtracking). The actions could be used to build the parse of a sentence, but in fact the parse-tree produced is simply a trace of the successful transitions through the grammar if the sentence is accepted. This is actually closer to the way a context-free parser operates. 
If a sentence is ambiguous, one version of the parser returns only one analysis; another slower version produces all the possible parses.</Paragraph> <Paragraph position="6"> If the input sentence is not accepted, the parser tries to give a simple and clear diagnosis of the failure and specifies the place in the sentence where it had to give up. But systematic backtracking sometimes makes it difficult to tell exactly what happened; it might be useful to save the whole parse history. Lastly the parser can also run in predictive mode for speech recognition: the grammar is used to constrain possibilities at every step to help lexical recognition.</Paragraph> <Paragraph position="7"> The grammar can also be employed to generate sentences. A special generator using a random function produces sentences according to the current grammar. This quickly gives a broad view of the type of language defined, without using the parser and without having to think up successive sentences to test. The random generator offers then one more facility to examine a grammar and sometimes reveals unforeseen errors in the syntactic rules.</Paragraph> <Paragraph position="8"> So by and large, parsing is done in ARBUS with fairly standard tools which are comparable to other well-known parsers. But the emphasis was put mainly on practical interactive use to develop an application grammar, and most design decisions were taken with this primary goal in mind.</Paragraph> <Paragraph position="9"> 224 D. MEMMI and J. MARIANI</Paragraph> </Section> <Section position="6" start_page="221" end_page="221" type="metho"> <SectionTitle> GRAMMAR EDITOR </SectionTitle> <Paragraph position="0"> To define a grammar, the user describes it to ARBUS in the form of transition trees as seen above. Each tree is to be described by moving through the tree in depth-first fashion from left to right, with the help of a prompting program. The system then builds the corresponding internal representation. 
Actions and tests can also be added on the nodes. But after testing the grammar with the parser, it will often appear necessary to modify the syntax. One must therefore be able to edit the grammar.</Paragraph> <Paragraph position="1"> We designed a specialized grammar editor containing a complete set of display and modification functions. Because of the way the grammar is represented within the system, this editor deals mainly with tree structures. We tried to select a minimal set of primitives that would allow all the necessary modifications while being simple to learn. More complex editing operations may then have to be executed in several steps.</Paragraph> <Paragraph position="2"> The grammar can first be displayed, as a whole or tree by tree, with actions and tests if needed. One can either display the trees themselves, or list all the distinct paths of a tree, which correspond to rewrite rules. The lexicon may also be examined, as well as the list of syntactic categories of the grammar. The lexicon is automatically updated after any modification and thus always shows the current state. One can also look up the catalogue of actions and tests available to the user for augmentations.</Paragraph> <Paragraph position="3"> With the editor one can replace one word by another, whether at a given node, in a whole tree or everywhere in the grammar. To modify the structure of a transition tree, one can delete, insert or replace a node by itself without its offspring, or a node with its offspring (i.e., a sub-tree). It is also possible to save part of a tree to insert it elsewhere. If a new syntactic category is introduced during a modification, the system will detect it and ask for the description of a new transition tree.</Paragraph> <Paragraph position="4"> Augmentations can of course also be modified by adding, deleting or replacing tests and actions at any node. In short, everything in the grammar may be examined and modified. 
When the result seems satisfactory, the grammar can then be saved on file. It may be recalled later for another session of testing and modifications, used for an application, or even be sent to another parsing system.</Paragraph> <Paragraph position="5"> This editor is fairly simple, and more complex functions could be added. But it allows any possible modification of tree structures and already includes a certain number of functions. How to use the editor is then not immediately obvious, and to help the user all editing functions are in fact packaged within a special interactive interface. Modifications will be performed through this interface, which will be responsible for all interactions with the user.</Paragraph> </Section> <Section position="7" start_page="221" end_page="221" type="metho"> <SectionTitle> USER INTERFACE </SectionTitle> <Paragraph position="0"> Because ARBUS is intended primarily to be a development aid, the user interface was designed with particular care and constitutes a sizable part of the whole system. Without this interface, the large number of construction, parsing and editing functions available would have required a detailed instruction manual and a long training period to use the system fully.</Paragraph> <Paragraph position="1"> The basic principle followed in the design of the interface is then to guide the user as much as possible through an interactive dialog at the terminal. The interface totally isolates the user from underlying programs and redefines its own environment regardless of the implementation language. All system functions will be called only by typing commands to the interface, which acts as a command interpreter and executes the corresponding programs.</Paragraph> </Section> <Section position="8" start_page="221" end_page="221" type="metho"> <SectionTitle> ARBUS 225 </SectionTitle> <Paragraph position="0"> The interface is patterned as a tree, in which one can move at will (fig. 
2).</Paragraph> <Paragraph position="1"> This structure makes it possible to limit the number of commands available at each node of the tree, and these commands are displayed as menus on the screen. The menus vary at each step in the dialogue, but the commands are always very simple.</Paragraph> <Paragraph position="2"> If necessary the system will prompt the user and ask precisely for any complementary information required to execute a command. Incorrect input is diagnosed and will cause no error in the program, which simply goes back to the previous step.</Paragraph> </Section> <Section position="9" start_page="221" end_page="221" type="metho"> <SectionTitle> Fig. 2 (interface tree, node labels): TOP LEVEL; CONSTRUCTION; DISPLAY; MODIFICATIONS; PARSING; FILES; AUGMENTATIONS; WORD MODIFICATIONS; STRUCTURE MODIFICATIONS </SectionTitle> <Paragraph position="0"> We tried to classify functions in a clear way, and to split them up into short operations to avoid burdening the user's memory. Any result is displayed at once.</Paragraph> <Paragraph position="1"> There are never more than five or six items to consider at any moment, whether one takes into account the number of commands in a menu or the number of levels in the structure of the interface. The current situation being always indicated on the screen, there is no need to keep track of events and the system requires almost no training before use.</Paragraph> <Paragraph position="2"> For example, during the construction of the grammar, the branches of syntactic trees are displayed node by node while being built, so as to prompt the user and show him the current position. For each new syntactic category, ARBUS will ask for the description of one more tree until the grammar is completed. The system itself takes care of the scheduling of operations, prompts the user accordingly, and automatically builds the lexicon corresponding to the grammar defined. 
The user is thus guided at every step.</Paragraph> <Paragraph position="3"> Automatic grapheme-to-phoneme translation of the vocabulary is also provided for speech recognition grammars. The user can input words in ordinary spelling, and they will be converted internally to phonetic form for phonemic speech recognition. Moreover, pronunciation variants and linking forms are computed (work in progress by F. Néel, M. Eskénazi and J. Mariani). One may therefore define a grammar in phonetic form without any prior phonetic training and without having to do the transcription oneself.</Paragraph> </Section> </Paper>