<?xml version="1.0" standalone="yes"?>
<Paper uid="E87-1013">
  <Title>TEXT UNDERSTANDING WITH MULTIPLE KNOWLEDGE SOURCES: AN EXPERIMENT IN DISTRIBUTED PARSING</Title>
  <Section position="4" start_page="0" end_page="75" type="metho">
    <SectionTitle>
2. RATIONALE AND TECHNICAL REQUIREMENTS
</SectionTitle>
    <Paragraph position="0"> Several reasons recommend and support the choice of a distributed approach to text understanding. From a cognitive point of view, it is indubitable that humans perform such an activity incrementally.</Paragraph>
    <Paragraph position="1"> That is, not everything that can be derived from a text is evident from the very beginning. Some features of the text are understood almost automatically and with minimum effort, others require more labor, whereas still others become clear only after a thoughtful process. This increasing depth of processing, which has differential effects on what is understood from the same piece of text, should also be modeled in an automatic system. A distributed architecture comprising a collection of specialized problem solvers (specialists) with different skills and competence, working with different knowledge sources, seems a promising way to achieve incrementality.</Paragraph>
    <Paragraph position="2"> Such an architecture offers several advantages from a technical point of view, too. Among these we mention: the possibility of adopting different techniques and methodologies for each specialist; the ease of designing and debugging; the fact that specialists can be developed in isolation, independently of each other; the possibility of changing one or more specialists without a global restructuring of the whole system; and the robustness that can be achieved by overlapping the capabilities of different specialists.</Paragraph>
    <Paragraph position="3"> The main problem in adopting a distributed approach is that of control, i.e. making the specialists cooperate. As Cullingford (1981: 52) puts it, &quot;... In an ideal system each expert would become available only when needed, run only so long as it had something useful to do and communicate its findings to interested parties in an efficient manner. If an appropriate level of integration could be achieved, one could hope to improve the capabilities of an understanding system by adding new knowledge sources, to reuse experts in different problem domains and to investigate the relative performance degradation due to removing various knowledge sources.&quot;</Paragraph>
    <Paragraph position="4"> In our approach we adopt a form of control based on the interaction of each individual specialist with a central manager, which supervises and directs the overall operation of the system by coordinating the autonomous activities of the specialists (bottom-up approach), and by exploiting its own general problem solving strategies (top-down approach). The prototype distributed parser which has been developed according to the ideas outlined above works in the domain of descriptive text understanding, more precisely computer science literature on operating systems. It takes a natural language text as input and produces as output a semantic representation of its meaning in the BLR/ELR representation language (Fum, Guida, and Tasso, 1984).</Paragraph>
    <Paragraph position="5"> Three main objectives have been taken into account in the design of the parser: Incrementality of parsing and generation of the BLR/ELR. As the parser has to cover a large variety of linguistic features and must rely upon a number of different knowledge sources, it seems appropriate that both analysis of the input text and generation of the BLR/ELR representation are carried out in a step-wise manner through successive additions and refinements.</Paragraph>
    <Paragraph position="6"> Also the structure itself of the BLR/ELR formalism, made up of a collection of propositions appropriately connected together and supplemented with additional information (e.g., about time, quantification, etc.), strongly suggests an incremental approach to parsing.</Paragraph>
    <Paragraph position="7"> Cognitive validity. The parser should not only produce a correct BLR/ELR representation of the input text, but it should also show some degree of linguistic competence in the way it operates internally. In other words, it should provide an acceptable approximation of the basic mental processes that occur in humans.</Paragraph>
    <Paragraph position="8"> Effectiveness. The parser should be capable of operating in an efficient and correct way in non-trivial cases. Moreover, the parser should be easy to design and debug.</Paragraph>
  </Section>
  <Section position="5" start_page="75" end_page="76" type="metho">
    <SectionTitle>
3. A DISTRIBUTED ARCHITECTURE
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="75" end_page="75" type="sub_section">
      <SectionTitle>
3.1 Overall System Architecture
</SectionTitle>
      <Paragraph position="0"> As mentioned above, our distributed parser is constituted by a collection of individual specialists, each one expert in a facet of the parsing problem (e.g., syntactic analysis, disambiguation, reference, semantics, time, quantification, BLR/ELR construction, etc.). Each specialist is an autonomous problem solver, which has its own competence domain, where it can operate with certain and complete knowledge. However, it is assumed that no specialist has enough knowledge and competence to cover the whole parsing activity: all (or most) of them are necessary to successfully complete the parsing of a complex text. Moreover, we assume that specialists may be heterogeneous, i.e., implemented using different technologies (e.g., a deterministic algorithm, a knowledge-based system, etc.).</Paragraph>
      <Paragraph position="1"> Also, they may have partially overlapping competence areas, and even be redundant, i.e. there may be several specialists for the same task (e.g., for syntactic analysis). As we have stated that specialists are independent problem solvers, we also assume that they have no mutual knowledge: they do not know about each other, they do not even know about the existence of other specialists. This assumption is very important to allow a fully independent design of an individual specialist, without bothering about the others. Each specialist can solve a well defined class of problems, and once a problem has been assigned to it, its work can end in one of three outcomes: success, i.e. the problem assigned has been solved and its solution produced; fail, i.e. the specialist has been unable to solve the problem and an alarm message is returned; need-help, i.e. the specialist has been successful in decomposing and partially solving the problem at hand, but it needs help from outside to proceed further in the solution process. In this case, the current problem is suspended, and (sub)problems are generated for which solutions are needed.</Paragraph>
      <Paragraph position="2"> The internal operation of each specialist is not of interest here, as we have assumed that they may be heterogeneous. What is crucial is the interface they show towards the outside, which is expected to be very simple. A specialist may receive a problem to solve, and issue a solution, other problems, or an alarm. It may also receive a solution to one of the (sub)problems it has previously generated, which will be used to resume the solution process of some suspended problem.</Paragraph>
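      <Paragraph> The interface described above can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions: the names Outcome, Reply, ToySpecialist, solve and resume are hypothetical and do not come from the original system, and problems are plain strings rather than BLR/ELR structures.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Outcome(Enum):
    """The three outcomes named in the text."""
    SUCCESS = "success"
    FAIL = "fail"
    NEED_HELP = "need-help"


@dataclass
class Reply:
    """What a specialist hands back (hypothetical structure)."""
    outcome: Outcome
    solution: Optional[str] = None                         # set on SUCCESS
    subproblems: List[str] = field(default_factory=list)   # set on NEED_HELP
    alarm: Optional[str] = None                            # set on FAIL


class ToySpecialist:
    """Hypothetical specialist that always needs one piece of outside help."""

    def __init__(self):
        # Maps each generated subproblem to the suspended problem waiting on it.
        self.suspended = {}

    def solve(self, problem: str) -> Reply:
        if not problem:
            return Reply(Outcome.FAIL, alarm="empty problem")
        sub = "semantics of: " + problem
        self.suspended[sub] = problem      # suspend the current problem
        return Reply(Outcome.NEED_HELP, subproblems=[sub])

    def resume(self, subproblem: str, solution: str) -> Reply:
        # A solution to an earlier subproblem resumes the suspended problem.
        problem = self.suspended.pop(subproblem)
        return Reply(Outcome.SUCCESS, solution=problem + " / " + solution)
```

A caller would invoke solve first and, on a need-help reply, obtain solutions for the returned subproblems elsewhere before calling resume; the specialist itself never names another specialist, in line with the no-mutual-knowledge assumption.</Paragraph>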
    </Section>
    <Section position="2" start_page="75" end_page="76" type="sub_section">
      <SectionTitle>
3.2 Communication and Control Mechanisms
</SectionTitle>
      <Paragraph position="0"> Specialists are not allowed to communicate directly with each other, but only with a cooperation manager, which is in charge of organizing and controlling the overall activity of the parser. It embodies knowledge about: the actual architecture of the system, i.e. how many and which specialists are available; the competence of each individual specialist; how to match problems to specialists in order to exploit in the best way their specific capabilities; how to schedule the activity of the specialists, i.e. which specialists to activate first, taking into account priority and redundancy problems; how to correctly switch messages among specialists.</Paragraph>
      <Paragraph position="1">  The communication between specialists and the cooperation manager occurs according to a fixed protocol which includes three basic types of messages, namely: problems, solutions, and alarms, as already outlined above.</Paragraph>
      <Paragraph position="2"> The working memory of the parser is a partitioned shared memory: each specialist can write only in its own partition, but has full visibility of the entire memory. Clearly, in order to allow specialists to work correctly on the shared memory, it is necessary that a common representation language is adopted, at least for information that may concern more than one specialist.</Paragraph>
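      <Paragraph> One possible reading of this memory discipline is sketched below in Python. The class and method names (SharedMemory, write, read) are assumptions made for illustration, and the sketch takes the text to mean that writing is restricted to a specialist's own partition while reading is unrestricted.

```python
class SharedMemory:
    """Hypothetical partitioned memory: one partition per specialist."""

    def __init__(self, owners):
        # One dictionary per specialist name, e.g. "SES" or "QS".
        self._partitions = {name: {} for name in owners}

    def write(self, writer, partition, key, value):
        # Assumed rule: a specialist may modify only its own partition.
        if writer != partition:
            raise PermissionError(writer + " cannot write into " + partition)
        self._partitions[partition][key] = value

    def read(self, partition, key):
        # Any specialist may inspect any partition.
        return self._partitions[partition].get(key)
```

Under this reading, the common representation language mentioned in the text would constrain the values stored, which the sketch leaves as arbitrary Python objects.</Paragraph>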
      <Paragraph position="3"> The operation of the cooperation manager is basically message-driven: it waits continuously for messages and, as soon as messages arrive from the specialists, they are stored in a buffer and later examined and treated according to some specific policy (e.g., the priority of the messages or their origin may be taken into account). The cooperation manager is in charge of three main activities: it assigns problems to specialists according to their competence, current work load, etc.; it passes solutions to the relevant specialists (i.e., those who issued the (sub)problem to which the solution refers); it manages alarms (e.g., by resorting to alternative specialists with similar or overlapping competence).</Paragraph>
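      <Paragraph> The message-driven behavior just described might be sketched as follows. This is a simplified illustration, not the original implementation: the competence table, the dictionary message layout with "type", "topic", "sender" and "requester" keys, and the first-in-first-out buffer policy are all assumptions, and the manager records its decisions in a log instead of calling real specialists.

```python
from collections import deque


class CooperationManager:
    """Hypothetical message-driven manager that logs its decisions."""

    def __init__(self, competence):
        # competence: topic -> ordered list of specialist names (assumed table)
        self.competence = competence
        self.buffer = deque()
        self.log = []

    def post(self, message):
        # Incoming messages are buffered and examined later.
        self.buffer.append(message)

    def run(self):
        while self.buffer:
            msg = self.buffer.popleft()      # assumed policy: first in, first out
            topic = msg["topic"]
            if msg["type"] == "problem":
                # Assign the problem to the first competent specialist.
                self.log.append(("assign", topic, self.competence[topic][0]))
            elif msg["type"] == "solution":
                # Pass the solution back to whoever issued the (sub)problem.
                self.log.append(("forward", topic, msg["requester"]))
            elif msg["type"] == "alarm":
                # Resort to an alternative specialist with overlapping competence.
                others = [s for s in self.competence[topic] if s != msg["sender"]]
                if others:
                    self.log.append(("reassign", topic, others[0]))
                else:
                    self.log.append(("unsolved", topic, None))
```

A priority-based policy, as the text suggests, could replace the deque with a priority queue without changing the three handling branches.</Paragraph>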
      <Paragraph position="4"> The cooperation manager, however, in addition to the above mentioned message handling capability, also has its own strategies that can override, when needed, the basic message-driven style, thus affecting the overall operation of the parser. These strategies, which embed knowledge about &quot;how to manage the parsing task&quot;, are crucial to the successful activity of the parser if we really want to allow individual specialists to be designed and constructed independently from each other. In fact, as no global strategy is coded in the system, it must be explicitly assigned as an additional competence to the cooperation manager.</Paragraph>
    </Section>
    <Section position="3" start_page="76" end_page="76" type="sub_section">
      <SectionTitle>
3.3 The Specialists
</SectionTitle>
      <Paragraph position="0"> As illustrated above, our distributed parser is well suited to host a large variety of specialists. We will briefly list in the following some of those utilized in the current implementation of the system.</Paragraph>
      <Paragraph position="1"> The Morphology Specialist (MS) is devoted to performing the morphological analysis of each word, i.e. extracting from the Dictionary all the relevant information and determining the appropriate morphological types and variables.</Paragraph>
      <Paragraph position="2"> The Encyclopedia Specialist (ES) is able to access the Encyclopedia for extracting semantic information and world knowledge.</Paragraph>
      <Paragraph position="3"> The Syntax Specialist (SYS) is able to identify the constituents of a sentence and to build up a parse tree. The current version is implemented through a context-free grammar augmented with transformational rules.</Paragraph>
      <Paragraph position="4"> The Semantics Specialist (SES) is devoted to a semantic analysis of a sentence performed only through semantic information, discarding any syntactic processing.</Paragraph>
      <Paragraph position="5"> The Syntax-Semantics Specialist (SSS) is able to complement semantic analysis with available syntactic information (and vice versa) in order, for example, to resolve ambiguities.</Paragraph>
      <Paragraph position="6"> The Time Specialist (TS) is able to attach to each proposition of the BLR/ELR the appropriate temporal information.</Paragraph>
      <Paragraph position="7"> The Reference Specialist (RS) is devoted to analyzing pronominal and anaphoric references.</Paragraph>
      <Paragraph position="8"> The Quantification Specialist (QS) is capable of identifying the appropriate quantifier to attach to each concept in the BLR/ELR.</Paragraph>
      <Paragraph position="9"> The BLR/ELR Generator Specialist (BEGS) is devoted to integrating all the information useful to actually build up the BLR/ELR representation of the meaning of the text.</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="76" end_page="77" type="metho">
    <SectionTitle>
4. EXPERIMENTAL RESULTS
</SectionTitle>
    <Paragraph position="0"> In this section we will briefly illustrate some of the most significant characteristics of the parsing process by means of the analysis of a simple sentence extracted from a text on operating systems. Let us consider the following fragment of text: &quot;... An integer priority is assigned by the scheduler to each process in the ready-queue....&quot; The Cooperation Manager, hereinafter CM, is devoted to organizing the work of the specialists, which are able to solve specific parts of the overall problem. In the current version, the CM largely relies on the BEGS specialist for structuring the parsing process and for generating the BLR/ELR: each sentence in the text is processed one after the other, from left to right. We will omit from this illustration all the details concerning this specialist, as well as other specialists which are not essential for understanding the system operations.</Paragraph>
    <Paragraph position="1"> Moreover, we will not describe how the management of the shared memory and its partitions is actually carried out.</Paragraph>
    <Paragraph position="2"> As already mentioned, the CM can implement several parsing strategies by forcing different ways of organizing the contribution of the specific specialists to the solution of the overall task. It is important to stress that the proposed architecture makes it quite easy to change the strategy adopted. In this example, a semantics-directed parsing will be shown. More specifically, when the sample sentence shown above is considered, the CM will assign the problem of semantically analyzing the sentence to all the specialists potentially capable of performing such an analysis (in the current version, SES and SSS). At the same time, it will assign to QS, RS, and TS the quantification, reference, and temporal analysis task, respectively.</Paragraph>
    <Paragraph position="3"> Appropriate problem messages will be sent to each of them, such as, for example: To: SES From: CM  On: &lt; ... the current sentence ... &gt; Priority: Auxiliary.</Paragraph>
    <Paragraph position="4"> The message requesting semantic analysis from SSS will also have an Auxiliary priority, since more than one specialist is engaged on the same problem and can possibly find a correct answer. When, later on, one of them recognizes its inability to correctly complete the task, it will send back to the CM a message containing an alarm, thereby causing a change in the priority of the semantic analysis problem, which will become Fatal. The other three messages sent by the CM to RS, QS, and TS will have a Fatal priority, because no alternative specialists are able to contribute to the solution of that part of the overall problem.</Paragraph>
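    <Paragraph> The priority rule in this paragraph can be illustrated with a short Python sketch. The class name ProblemRecord and its fields are hypothetical, and the generalization that a problem is Fatal whenever at most one specialist is still working on it is an assumption extrapolated from the two cases described in the text.

```python
class ProblemRecord:
    """Hypothetical bookkeeping entry the CM might keep for one problem."""

    def __init__(self, topic, specialists):
        self.topic = topic
        self.active = set(specialists)   # specialists still working on it
        self.priority = self._priority()

    def _priority(self):
        # Auxiliary while several specialists can still answer, else Fatal.
        return "Auxiliary" if len(self.active) > 1 else "Fatal"

    def on_alarm(self, specialist):
        # An alarm removes the failing specialist and may escalate priority.
        self.active.discard(specialist)
        self.priority = self._priority()
```

With SES and SSS both engaged, semantic analysis starts as Auxiliary and becomes Fatal after an alarm from either of them, whereas a problem given only to TS is Fatal from the start.</Paragraph>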
    <Paragraph position="5"> After these initial problem assignments, the CM enters a suspended state, from which it resumes whenever a message from any of the specialists is received. RS, QS, and TS can generally carry on their activity only after some semantic information about the sentence has been provided. To this purpose, all three of these specialists will send to the CM a message of the kind  On: &lt; ... the current sentence ... &gt; Priority: Fatal.</Paragraph>
    <Paragraph position="6"> The use of a Fatal priority will cause a synchronization of the three specialists with the completion of semantic analysis, since their activity will be suspended until they receive back from the CM a solution message containing an answer to this problem. In this case, the CM has already sent appropriate requests concerning the semantic analysis, and therefore all the activities will remain suspended until completion of the task.</Paragraph>
    <Paragraph position="7"> As noted above, both the SES and SSS specialists are called to give their contribution to semantic analysis. The first that comes up with a complete solution will allow the CM to answer RS, QS, and TS.</Paragraph>
    <Paragraph position="8"> In this specific case SES is able to answer only partially. Semantic information on the concepts in the sentence will be requested through a problem message that the CM will forward to ES. The information that ES is able to extract from the Encyclopedia will include the following fragments:  where the slash indicates that neither quantification nor reference nor temporal information is included yet in the BLR.</Paragraph>
    <Paragraph position="9"> As syntactic information is not taken into account, SES sends an alarm message to the CM, since it is unable to build up the complete solution.</Paragraph>
    <Paragraph position="10"> On the other hand, SSS will succeed by integrating syntactic information provided by SS, and the semantic information shown above. SS also needs morphological information contained in the Dictionary, which will be requested through an appropriate message to the CM. The outcome produced by SSS is a more complete version of the BLR, containing:</Paragraph>
  </Section>
  <Section position="7" start_page="77" end_page="78" type="metho">
    <SectionTitle>
10 ASSIGN (/SCHEDULER/,/PRIORITY/,/PROCESS/)
20 INTEGER (/PRIORITY/)
30 LOC (/PROCESS/,/READY-QUEUE/),
</SectionTitle>
    <Paragraph position="0"> where the ambiguity of considering READY-QUEUE as an argument of ASSIGN or as an argument of the predicate LOC (relative to the preposition &quot;in&quot;) has been resolved by means of semantic agreement between predicates and arguments. This solution will allow the CM to send an answer to the three suspended specialists QS, RS, and TS, which will resume their operation.</Paragraph>
    <Paragraph position="1"> It is interesting to illustrate how QS, RS, and TS can cooperate together.</Paragraph>
    <Paragraph position="2"> QS starts its processing from the logical subject of the sentence, i.e. SCHEDULER. In order to determine whether the definite article should be considered as indicating an anaphoric reference or something else, it will send the following problem message to the CM:  that will be forwarded to RS. Two things could happen at this point: the concept was already mentioned in previous parts of the text and RS will send back the corresponding identifier as a solution, or the concept was not mentioned before in the text, and RS will answer that this is the first occurrence of SCHEDULER. In the former case, QS will quantify the entity with an existential quantification. In the latter case, the need arises of considering also the tense of the verb, which can be provided by TS. The present tense of &quot;is assigned&quot; makes QS decide for a universal quantification (Hess, 1985), i.e. &quot;every scheduler assigns a priority&quot;.</Paragraph>
    <Paragraph position="3"> Assuming the latter interpretation, QS will continue its processing with READY-QUEUE. Again RS will check whether a previous reference exists. Since this is not the case, RS looks for implicit references. The ES can provide an answer to this request, since in the SCHEDULER frame of the Encyclopedia it is stated that &quot;schedulers are associated with waiting-queues, ready-queues, etc.&quot;. READY-QUEUE is then considered to be one of the ready-queues associated with SCHEDULER. Moreover, since SCHEDULER is already universally quantified, it follows that READY-QUEUE is existentially quantified with respect to SCHEDULER, i.e. &quot;for every scheduler there exists a ready-queue&quot;.</Paragraph>
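    <Paragraph> The decision QS makes for SCHEDULER in the walkthrough above can be condensed into a small Python sketch. The function name quantify_definite and its arguments are hypothetical; the sketch covers only the two cases the text describes and deliberately returns None for tenses the text does not discuss.

```python
def quantify_definite(previously_mentioned, tense):
    """Hypothetical rule for a concept introduced by a definite article.

    previously_mentioned: the answer QS would obtain from RS
    tense: the verb tense QS would obtain from TS, e.g. "present"
    """
    if previously_mentioned:
        # Anaphoric reference to an entity already in the text.
        return "existential"
    if tense == "present":
        # First occurrence with a present-tense verb: generic reading.
        return "universal"
    # The text does not say what happens for other tenses.
    return None
```

For the sample sentence, a first occurrence of SCHEDULER with the present-tense verb yields the universal reading adopted in the text.</Paragraph>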
    <Paragraph position="4"> This kind of process is carried on until eventually BEGS integrates all the contributions of the other specialists, producing the following BLR/ELR:  An important aspect of the operation of the CM is worth stressing again: none of the problem messages that the CM receives contains any explicit suggestion about the specialist(s) that should be invoked. It is the specific responsibility of the CM to manage these assignments by means of a specific knowledge base devoted to this task. In the current version of the system this is implemented through a simple rule-based mechanism.</Paragraph>
  </Section>
</Paper>