<?xml version="1.0" standalone="yes"?> <Paper uid="C88-1080"> <Title>A PROCESS-ACTIVATION BASED PARSING ALGORITHM FOR THE DEVELOPMENT OF NATURAL LANGUAGE GRAMMARS</Title> <Section position="2" start_page="590" end_page="590" type="metho"> <SectionTitle> 4. CONTEXTUAL RULES </SectionTitle> <Paragraph position="0"> Rule activation by means of the rule-activation function, together with NOP rules, can be used to handle context-sensitive languages. However, this is entirely done by means of CF productions and the augmentations.</Paragraph> <Paragraph position="1"> A typical CS production is: μ1 A μ2 → μ1 β μ2 where μ1, μ2, β are strings of symbols, and A is a non-terminal symbol. A bottom-up application of such a production is possible if it happens in two steps: 1) identification of the context μ1 β μ2; the right-hand side must match a sequence of sub-trees that covers μ1 β μ2; 2) inside this context we can perform the application of the CF production A → β, building the node A over the sequence of nodes characterized by β. So the complete application of a CS production is made in two steps: the first one concerns context determination, the context being represented by the right-hand side of the CS production; the second step is just the application of a CF production, if and only if the first step has determined the context where the CF production is applicable. 
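The two steps above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the parsing graph is flattened to a list of category names, the production is written as mu1 A mu2 → mu1 beta mu2, and the function name and parameters are hypothetical, not part of SAIL.

```python
def apply_cs_production(nodes, mu1, beta, mu2, lhs):
    """Bottom-up application of  mu1 lhs mu2 -> mu1 beta mu2  on a flat node list."""
    context = mu1 + beta + mu2
    width = len(context)
    for start in range(len(nodes) - width + 1):
        # Step 1: context determination, i.e. match the whole right-hand side.
        if nodes[start:start + width] == context:
            # Step 2: apply the CF production lhs -> beta inside that context.
            lo = start + len(mu1)
            hi = lo + len(beta)
            return nodes[:lo] + [lhs] + nodes[hi:]
    # No context found: the CF production is not applicable.
    return None


result = apply_cs_production(["x", "a", "b", "y"], ["a"], ["b"], ["y"], "A")
# result is ['x', 'a', 'A', 'y']
```

The CF step is never attempted unless the context match succeeds, which mirrors the "if and only if" condition stated above.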
These considerations allow us to say that: step 1 can be performed by the application of a NOP rule using the NOP special categories; in fact this kind of rule is useful in determining the context by defining a NOP rule with production: { &lt;NOP&gt; | &lt;NOP-SE&gt; | &lt;NOP-ASE&gt; } → μ1 β μ2. Step 2 can be performed by the application of an activated rule; in fact, when the rule at step 1 determines the context it can activate an inactive rule with a production A → β, indicating in the call to rule-activation the last node in the sequence β.</Paragraph> <Paragraph position="2"> Now we can give the definition of contextual rule.</Paragraph> <Paragraph position="3"> We say that a rule is contextual if it is a NOP rule with production: { &lt;NOP&gt; | &lt;NOP-SE&gt; | &lt;NOP-ASE&gt; } → w1 ... wn and inside the augmentations there is a rule activation of at least one inactive rule which has a production: where VN is the set of the non-terminal symbols of the grammar.</Paragraph> <Paragraph position="4"> This definition allows a nesting of contextual rules: in fact an activated rule can be a contextual rule itself. In addition, we can activate more than one rule at a time; in this way we can access several contexts inside a main context.</Paragraph> <Paragraph position="5"> We suggest a method to make asynchronous operations possible, i.e., a way for two independent rules to interact with each other in order to perform long distance operations. 
All this is based on the fact that we must be sure that a certain rule will be applied after another and that the earlier rule wants to communicate some information to the other one. To this end we have adopted a communication mechanism, which we call message passing, that is not based on matching, as all the previously explained operations are, but on executing two basic tasks: sending and receiving. The sending task is performed first by the sending rule, which sends a message to a receiving rule; afterwards the receiving rule must perform the receiving task to receive the message. These two tasks are executed by the two rules at two independent times, i.e., when the rules are applied. In the following we denote the sending rule as Rs and the receiving rule as Rr, and we assume they are standard rules: so we denote with SN the node built by Rs and with RN the node built by Rr.</Paragraph> <Paragraph position="6"> We state two different approaches for what a message is: 1) the rules access a global feature structure where they store global features. Each rule can access this structure and any feature value in it; 2) a Message-Box exists where a rule can send a message to another specified rule. The Message-Box is accessible from every rule but the messages are accessible only by receiving rules. A message is composed as follows: a reference to the feature structure of SN: Rs makes its feature structure available to Rr; a sequence of operations, possibly empty, that Rr must execute.</Paragraph> <Paragraph position="7"> It is not necessary that both these items are present in a message. In the case of the global feature structure all the rules have access to it. 
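The message composition and the Message-Box approach described above can be sketched as follows. This is a minimal illustration, not the SAIL implementation: the class names, the field names, and the choice to remove a message from the box once it has been received are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Message:
    sender: str                      # name of the sending rule Rs
    receiver: str                    # name of the receiving rule Rr
    features: Optional[dict] = None  # reference to the feature structure of SN (may be absent)
    operations: list = field(default_factory=list)  # operations Rr must execute (may be empty)


class MessageBox:
    """Accessible from every rule; a message is handed only to its named receiver."""

    def __init__(self):
        self._messages = []

    def send(self, message):
        # Once sent, the message no longer belongs to the sending rule.
        self._messages.append(message)

    def receive(self, rule_name):
        # Only messages addressed to rule_name are returned (and removed, by assumption).
        mine = [m for m in self._messages if m.receiver == rule_name]
        self._messages = [m for m in self._messages if m.receiver != rule_name]
        return mine


box = MessageBox()
sn_features = {"num": "sg"}
box.send(Message("Rs", "Rr", features=sn_features))
received = box.receive("Rr")  # one message whose features refer to sn_features
```

Sending and receiving are two separate calls, so the two rules can run at independent times, with no matching on the graph in between.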
We recall that all the feature structures included in the nodes of the graph are local to their own node. Each rule can store in or get from the global structure features that are global for the sentence: the messages are then feature structures, and the same type of operations allowed on the feature structures of the nodes of the graph is possible on this structure. The Message-Box is a structure referred to by all rules that want to send or receive messages. A rule Rs, building the node SN, sends a message which is automatically inserted in the Message-Box specifying: its name Rs, the receiving rule Rr, a reference to the feature structure of SN which is made available to Rr, and a list of operations, possibly empty, to be performed by Rr. Until the messages are sent, they are the exclusive property of Rs. When they are sent Rs loses its property rights, and only the rule Rr specified in the messages is authorized to get them. In addition, Rr finds in the message a reference to a feature structure, and this structure is available only to it and always local to its own node.</Paragraph> <Paragraph position="8"> Message passing, in either of the two realizations, is a way to facilitate the identification and treatment of existing relations among phrases or parts of them. It is certainly flexible and not expensive because it avoids searches, i.e., matches, inside the graph, and it can be a valid alternative to NOP rules, which require a certain number of matches to find particular nodes in the graph. In fact, if there is no overlapping of the sub-trees rooted in SN and RN, then we can solve relations between SN and RN by applying a proper NOP rule, but, more efficiently, message passing allows us to avoid a certain computational overhead by performing proper operations directly in Rs and Rr.</Paragraph> <Paragraph position="9"> When NOP rules are applied they act upon a structure already built. 
It is also possible to activate rules that perform further building (contextual rules) and/or featuring operations within a context. This process of activation can be nested many times inside a certain structure. This analysis performs a kind of operation that is virtually directed toward the bottom, in depth. If there is a partial or total overlapping between the sub-trees rooted in SN and RN, then, when Rs sends a message, it assumes that Rr will be applied above its node SN; in this way it is possible to evaluate the consequences of certain operations on a structure which is not yet built but could be built. In this case we act toward the top of the parsing structure, through as many levels as we want. In contrast, using NOP rules, we only act on an existing structure representing deeper levels.</Paragraph> <Paragraph position="10"> So we can distinguish two ways of operation for long distance analysis among phrases or parts of them: breadth analysis, using either NOP rules or message passing; depth analysis, which can be top-down with NOP rules or bottom-up with message passing.</Paragraph> <Paragraph position="11"> The mechanism of the messages so described is performed through functions that can be used within the augmentations.</Paragraph> <Paragraph position="12"> /Grishman 1976/, designed to run CGU rules, carrying out the syntactic and semantic analysis in parallel. It is a bottom-up algorithm, and it performs left-to-right scanning and reduction in an immediate constituent analysis. The data structure it works on is a graph where all possible parse trees are connected. The complete parse tree(s) is (are) extracted from the graph in a subsequent step. Therefore, the parser is also able to create structure fragments for ill-formed sentences, thus returning, even in this case, partial analyses. 
This is particularly useful for diagnosis and debugging.</Paragraph> <Paragraph position="13"> Parsing termination occurs in a natural way, when no more rules can be applied and the input string is completely scanned. Before entering the parser a preprocessor scans the sentence from left to right, performs the dictionary look-up for each form in the input string, and returns a structure, the preprocessed sentence, with the syntactic and semantic information taken from the dictionary.</Paragraph> <Paragraph position="14"> The graph is composed of nodes: the nodes can be either terminals or non-terminals. Terminal nodes are built in correspondence to a scanned form, whereas non-terminal ones are built whenever a rule is applied; obviously the rule must not be a NOP rule.</Paragraph> <Paragraph position="15"> As stated above the parser is seen as a processor and it sees the rules as processes. It handles a queue of waiting processes/rules to be executed. When the parser takes a packet, for every rule it builds a process descriptor and inserts it in the queue. We call such a process descriptor an application specification (AS), while the queue is called the application specifications list (ASL). ASs are composed of: a node identifier, the node from which the parser starts the matching; the name of the rule that the parser will apply; and, only in the case of an AS of an activated rule, the context where the named activated rule will be applied, i.e. the nodes that matched the right-hand side of the activating rule (otherwise this item is left empty). ASs in the ASL are ordered depending upon the rule involved in an AS. In general, if standard active rules have to be executed, the ASL is handled with a LIFO policy. 
If we consider the case of NOP rules, then these rules must be ordered before the others, since the feature modifications they may produce can serve as input to other rules of the same packet, which are applied after them. An inactive rule can be activated just for one application by means of the rule-activation function: the activated rules must be applied immediately after the end of the activating rule. So this kind of rule has the highest priority of execution with respect to NOP rules and standard active rules. Rule-activation then inserts an activation specification on the top of the ASL for the activated rule. Summarizing, the rules have the following decreasing priority order of execution: 1) activated rules; 2) active NOP rules; 3) standard active rules. Once a node is created, be it terminal (in correspondence to a scanned form) or non-terminal (in correspondence to a reduction), the parser inserts in the ASL an AS for every rule in the packet corresponding to the category of the newly created node: i.e. the new node is the one specified in every inserted AS. The parser performs all possible reductions, building more than one node if possible, extracting one AS at a time before analyzing the next one. After an AS is extracted from the ASL, the parser gets the specified rule: the first step is to match the right-hand side on the graph. The nodes matching a right-hand side are searched by the matcher: it returns one or more sets of these nodes, called reduction sets. For every reduction set, the application of the current rule is tried. In this way we can connect together all possible parses for a sentence in a unique structure. Termination occurs when the ASL is empty and the preprocessed string is completely scanned.</Paragraph> <Paragraph position="16"> Afterwards the parser returns the graph, from which all parse trees satisfying the following conditions are extracted: a node covers the entire sentence and its category is the root symbol of the grammar. 
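The priority order of the ASL described above (activated rules, then active NOP rules, then standard active rules handled LIFO) can be sketched as below. This is an illustration only: the class names, the `kind` field, and the LIFO handling inside the NOP and activated groups are assumptions of the example, since the text specifies LIFO only for standard active rules.

```python
from dataclasses import dataclass
from typing import Optional

# Smaller rank means earlier execution.
PRIORITY = {"activated": 0, "nop": 1, "standard": 2}


@dataclass
class AS:
    node: int                       # node from which the matcher starts
    rule: str                       # name of the rule to apply
    kind: str                       # "activated", "nop" or "standard" (hypothetical field)
    context: Optional[list] = None  # filled in only for activated rules


class ASL:
    def __init__(self):
        self._items = []  # index 0 is the top of the list

    def insert(self, spec):
        # Put spec above every entry of equal or lower priority (LIFO within a kind).
        rank = PRIORITY[spec.kind]
        pos = next(
            (i for i, s in enumerate(self._items) if PRIORITY[s.kind] >= rank),
            len(self._items),
        )
        self._items.insert(pos, spec)

    def pop(self):
        # Extract the AS on top of the list, or None when the list is empty.
        return self._items.pop(0) if self._items else None


asl = ASL()
asl.insert(AS(1, "r1", "standard"))
asl.insert(AS(1, "r2", "standard"))
asl.insert(AS(1, "n1", "nop"))
asl.insert(AS(1, "a1", "activated", context=[1]))
order = [asl.pop().rule for _ in range(4)]  # ['a1', 'n1', 'r2', 'r1']
```

An empty list returned by `pop` as `None` corresponds to one half of the termination condition; the other half, a completely scanned input, is outside this sketch.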
Here is the complete algorithm of the parser:
* Until the end of the sentence is reached:
   * Scan a form;
   * Build a new terminal node for the scanned form;
   * Interpretation of the node:
      o get the packet corresponding to its category and for every rule in the packet put the AS in the ASL;
   For every AS in the ASL:
   * get the first AS from the top of the ASL;
   * get the rule specified in the AS, which is the current rule, and access the node specified in the AS, which is the current node;
   * starting from the current node, perform the match on the graph using the production of the current rule;
   if at least one reduction set is found then:
      o For every reduction set:
         - Apply the current rule;
      o If a new non-terminal node is built then get the packet corresponding to its category and for every rule in it put the AS in the ASL;
   else:
      o Apply recovery actions of the current rule;
In this algorithm by match we mean the operation of searching the reduction sets, and by 'apply the current rule' we mean the standard rule application starting from the test checking as stated for the CGU model; particular ways of application, e.g., NOP rules, depend on the particular rule definition.</Paragraph> </Section> <Section position="3" start_page="590" end_page="590" type="metho"> <SectionTitle> 7. AN EXAMPLE </SectionTitle> <Paragraph position="0"> The example concerns a simple fragment of an LFG written in SAIL according to the CGU model. Our example is taken from /Kaplan 1982/ and /Winograd 1983/.</Paragraph> <Paragraph position="1"> The lexical entries for this grammar in SAIL are the following: all the fields appearing in the CGUs can be defined; in addition two fields are devoted to the state definition (STATUS field) and the rule type definition, that is, whether the rule is a standard rule, a contextual rule or a NOP rule (CNTXTLORNOPR field). 
The rules are the following: ; getf-pn gets feature values from the parent node The graph built by the parser applying these rules to the sentence 'a girl handed the baby the toys' is equivalent to the c-structure built by the corresponding LFG as shown in /Winograd 1983/. The top node S contains the following feature structure: with the semantic value: (Hand Girl Baby Toys). Comparing the solution of the LFG version with the feature structure and the semantic value of the SAIL version, we find that the LFG solution is equivalent to the above feature structure plus the semantic value.</Paragraph> </Section> <Section position="4" start_page="590" end_page="590" type="metho"> <SectionTitle> 8. THE SAIL INTERFACING SYSTEM </SectionTitle> <Paragraph position="0"> The SAIL Interfacing System (S.I.S.) is the framework where a user can interact with SAIL in developing NL applications. In fact SIS is organized in Interface Levels (I.L.s): in SIS we commonly speak of Interface Level Applications (I.L.A.s), which are the association of an IL with a grammar.</Paragraph> <Paragraph position="1"> If IL-Name is the name of an IL, and G-Name is the name of a grammar which defines a particular language through a dictionary and a set of CGU rules, then the pair &lt;IL-Name, G-Name&gt; defines an ILA inside the SIS: this application is a task performed by that particular IL.</Paragraph> <Paragraph position="2"> In this way the development environment is based on different layers of rules, which are processed by the same parser and can handle the external interface, the particular application, and any request issued by the user. In fact, the grammar of an ILA defines a language which the user can use to send his requests to the system, so that they are caught by the parsing system and immediately satisfied.</Paragraph> <Paragraph position="3"> SIS is structured in 2 main ILs: the Kernel Interface Level (K.I.L.) 
and the Natural Language IL (N.L.I.L.).</Paragraph> <Paragraph position="4"> When the system runs only two ILAs are active and available to the user: the KIL, associated with the Kernel Grammar (K.G.), and the Current Running Interface Level (C.R.I.L.). The KIL is always active because it is the core ILA of SIS and its purpose is to handle the overall system, so when the system is started the user is introduced to the Kernel Interface Level. The Kernel Grammar is a semantic grammar associated with the KIL; it defines a kernel language of commands through which the user can use all the functionality of the system, such as grammar building, parse checking, and running other ILAs.</Paragraph> <Paragraph position="5"> When SAIL starts up, the KIL is also the CRIL, but when the user wants to load as CRIL another ILA defined in the system, for example a NLIL application, then a KIL command allows this and the NLIL becomes the CRIL by loading a grammar associated with the NLIL: in this way the CRIL is updated to the new application and the loaded grammar becomes the current running grammar.</Paragraph> <Paragraph position="6"> A subset of KIL commands defines a language through which the user can examine the parsing structures generated by the parser for all the sentences input until that moment. This tool, named ANAPAR (ANAlysis of PARsing), is useful for grammar and parse checking in developing NL applications. Finally, we want to point out that the particular structure given to SIS enables the user to modify the front-end to SAIL by modifying the corresponding grammar of the KIL; in fact, all the files involved in their definition are accessible to the user, who can modify those files as he wishes, or extend the language by introducing new grammar rules.</Paragraph> </Section> </Paper>