<?xml version="1.0" standalone="yes"?> <Paper uid="C94-1070"> <Title>THE &quot;WHITEBOARD&quot; ARCHITECTURE: A WAY TO INTEGRATE HETEROGENEOUS COMPONENTS OF NLP SYSTEMS</Title> <Section position="3" start_page="0" end_page="428" type="intro"> <SectionTitle> INTRODUCTION </SectionTitle> <Paragraph position="0"> Speech translation systems must integrate components handling speech recognition, machine translation and speech synthesis. Speech recognition often uses special hardware. More components may be added in the future, for task understanding, multimodal interaction, etc. In more traditional NLP systems, such as MT systems for written texts, there is also a trend towards distributing various tasks on various machines.</Paragraph> <Paragraph position="1"> Sequential architectures \[10, 11\] offer an easy solution, but lead to loss of information and lack of robustness. On the other hand, reports on experiments with blackboard architectures \[16, 13, 20\] show they also have problems.</Paragraph> <Paragraph position="2"> We are exploring an intermediate architecture, in which components are integrated under a coordinator, may be written in various programming languages, may use their own data structures and algorithms, and may run in parallel on different machines. The coordinator maintains in a whiteboard an image of the input and output data structures of each component, at a suitable level of detail. The whiteboard fosters reuse of partial results and avoids wasteful recomputation. Each component process is encapsulated in a manager, which transforms it into a server, communicating with external clients (including the coordinator) via a system of mailboxes. Managers handle the conversions between internal (server) and external (client) data formats. 
This protocol enhances modularity and clarity, because one needs to explicitly and completely declare the appearance of the partial results of the components on the whiteboard.</Paragraph> <Paragraph position="3"> Managers may also make batch components appear as incremental components by delivering outputs in a piecewise fashion, thus taking a first step towards systems simulating simultaneous translation.</Paragraph> <Paragraph position="4"> We have produced a rudimentary architectural prototype, KASUGA, to demonstrate the above ideas.</Paragraph> <Paragraph position="5"> In the first section, our four main guidelines are detailed: (1) record overall progress of components in a whiteboard; (2) let a coordinator schedule the work of components; (3) encapsulate components in managers; and (4) use the managers to simulate incremental processing. In the second section, some high-level aspects of the KASUGA prototype are first described, and a simple demonstration is discussed, in which incremental speech translation is simulated. Lower-level details are then given on some internal aspects.</Paragraph> <Paragraph position="6"> I. THE WHITEBOARD ARCHITECTURE 1. Record overall progress in a whiteboard The whiteboard architecture is inspired by the chart architecture of the MIND system \[8\] and later systems or formalisms for NLP \[1, 5\], as well as by the blackboard architecture, first introduced in HEARSAY-II \[6, 13\] for speech recognition. However, there is a significant difference: the components do not access the whiteboard, and need not even know of its existence.</Paragraph> <Paragraph position="7"> There are 2 main problems with the sequential approach.</Paragraph> <Paragraph position="8"> * P1: loss of information If components are simply concatenated, as in ASURA \[10, 11\], it is difficult for them to share partial results. Information is lost at subsystem interfaces and work has to be duplicated. 
For example, the cited system uses an LR parser to drive speech recognition; but syntactic structures found are discarded when recognition candidates are passed to MT. Complete reparsing is thus needed.</Paragraph> <Paragraph position="9"> * P2: lack of robustness Communication difficulties between subsystems may also damage robustness. During reparsing for MT in ASURA, if no well-formed sentences are found, partial syntactic structures are discarded before semantic analysis; thus there is no chance to translate partially, or to use semantic information to complete the parse.</Paragraph> <Paragraph position="10"> The pure blackboard approach solves P1, but not P2, and introduces four other problems.</Paragraph> <Paragraph position="11"> * P3: control of concurrent access In principle, all components are allowed to access the blackboard: complex protection and synchronization mechanisms must be included, and fast components may be considerably slowed down by having to wait for permission to read or write.</Paragraph> <Paragraph position="12"> * P4: communication overloads The amount of information exchanged may be large. 
If components run on different machines, as is often the case for speech-related components, and may be the case for Example-Based MT components in the future, communication overloads may annihilate the benefit of using specialized or distributed hardware.</Paragraph> <Paragraph position="13"> * P5: efficiency problems As components compute directly on the blackboard, it is a compromise by necessity, and cannot offer the optimal kind of data structure for each component.</Paragraph> <Paragraph position="14"> * P6: debugging problems These are due to the complexity of writing each component with the complete blackboard in mind, and to the parallel nature of the whole computation.</Paragraph> <Paragraph position="15"> In the &quot;whiteboard&quot; approach, the global data structure is hidden from the components, and accessed only by a &quot;coordinator&quot;. (The whiteboard drawing is expanded later.) This simple change makes it possible to avoid problems P3-P6. It also has at least two good points: - It encourages developers to clearly define and publish what their inputs and outputs are, at least to the level of detail necessary to represent them in the whiteboard.</Paragraph> <Paragraph position="16"> - The whiteboard can be the central place where graphical interfaces are developed to allow for easy inspection, at various levels of detail.</Paragraph> <Paragraph position="17"> As long as an NLP system uses a central record accessed only by a &quot;coordinator&quot; and hidden from the &quot;components&quot;, it can be said to use a whiteboard architecture. It remains open what data structures the whiteboard itself should use. As in \[2\], we suggest the use of a time-aligned lattice, in which several types of nodes can be distinguished. In stating our preference for lattices, we must first distinguish them from grids, and then distinguish true lattices from 2 types of quasi-lattice, charts and Q-graphs (fig. 
2 & 3).</Paragraph> <Paragraph position="19"/> <Paragraph position="21"> Grids have no arcs, but nodes corresponding to time spans. A node N spanning \[t1,t2\] is implicitly connected to another node N' spanning \[t'1,t'2\] iff its time span begins no later (t1<=t'1), ends strictly earlier (t2<t'2), and the respective spans (a) are not too far apart and (b) don't overlap too much (t2-max-gap <= t'1 <= t2+max-overlap). max-gap and max-overlap are gapping and overlapping thresholds \[12\]. Because t2<t'2, there can be no cycles.</Paragraph> <Paragraph position="22"> In a lattice, by contrast, nodes and arcs are explicit.</Paragraph> <Paragraph position="23"> Cycles are also forbidden, and there must be a unique first node and a unique last node.</Paragraph> <Paragraph position="24"> Grids have often been used in NLP. For example, the output of the phonetic component of KEAL \[12\] was a word grid, and certain speech recognition programs at ATR produce phoneme grids (see Footnote 1). In general, each node bears a time span, a label, and a score. Grids can also be used to represent an input text obtained by scanning a bad original, or a stenotypy tape \[9\], and to implement some working structures (like that of the Cocke algorithm).</Paragraph> <Paragraph position="25"> However, we will require explicit arcs in order to explicitly model possible sequences, sometimes with associated information concerning sequence probability.</Paragraph> <Paragraph position="26"> Thus raw grids are insufficient for our whiteboards.</Paragraph> <Paragraph position="27"> Two kinds of quasi-lattices have been used extensively, in two varieties. First, chart structures were originally introduced by M. Kay in the MIND system around 1965 \[8\]. In a chart, as understood today (Kay's charts were more general), the nodes are arranged in a row, so that there is always a path between any two given nodes. The arcs bear the information (label, score), not the nodes. 
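Read concretely, the implicit grid connectivity condition above amounts to a simple predicate. Here is a minimal Python sketch; the class, function and example spans are our illustration, not the paper's code:

```python
# Sketch of the implicit grid connectivity test described above.
# GridNode and the example time spans are illustrative, not from the paper.
from dataclasses import dataclass

@dataclass
class GridNode:
    t1: int      # start time of the span
    t2: int      # end time of the span
    label: str

def connected(n: GridNode, m: GridNode, max_gap: int, max_overlap: int) -> bool:
    """N is implicitly connected to M iff N begins no later (t1 <= t'1),
    ends strictly earlier (t2 < t'2), and the spans are neither too far
    apart nor too overlapped: t2 - max_gap <= t'1 <= t2 + max_overlap."""
    return (n.t1 <= m.t1
            and n.t2 < m.t2
            and n.t2 - max_gap <= m.t1 <= n.t2 + max_overlap)

ka = GridNode(0, 10, "ka")
su = GridNode(9, 20, "su")   # overlaps ka by one frame
ga = GridNode(15, 30, "ga")  # gap of five frames after ka

assert connected(ka, su, max_gap=2, max_overlap=2)       # within thresholds
assert not connected(ka, ga, max_gap=2, max_overlap=2)   # gap too large
```

Because the second condition is strict (t2 < t'2), following connections always advances the end time, which is why no cycles can arise.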
Charts are also used by many unification-based natural language analyzers \[14\].</Paragraph> <Paragraph position="28"> Chart structures are unsuitable for representing results on a whiteboard, however, because they are unable to represent alternate sequences. Consider the alternate word sequences of Figure 4. It is not possible to arrange the words in a single row so that all and only the proper sequences can be read out.</Paragraph> <Paragraph position="30"> A second type of quasi-lattice is the Q-graphs of \[15\] and their extension \[17\], the basic data structure for text representation in the METEO \[14\] and TAUM-Aviation \[7\] systems. A Q-graph is a loop-free graph with a unique entry node and a unique exit node. As in charts, the information is carried on the arcs. It consists in labeled or annotated trees. As there may be no path between two nodes, Q-graphs can indeed faithfully represent alternate sequences like those of Figure 4. But in this case it is necessary to use, on more than one arc, identical labels referring to the same span of the input. For representation on a whiteboard, such duplication is a drawback.</Paragraph> <Paragraph position="31"> To simplify bookkeeping and visual presentation, we prefer a representation in which a given label referring to a given span appears in only one place. A true lattice, like that of Figure 5, makes this possible.</Paragraph> <Paragraph position="32"> The decomposition of the lattice in layers seems natural, and leads to more clarity. Each layer contains results of one component, selected to the &quot;appropriate level of detail&quot;. (Footnote 1: \[15, 16\]. By contrast, the HWIM \[20\] system used a &quot;phonetic lattice&quot; on which an extended ATN operated.) (Running header: The &quot;Whiteboard&quot; Architecture: a way to integrate... Boitet & Seligman, COLING-94.) 
Its time-aligned character makes it possible to organize it in such a way that everything which has been computed on a certain time interval at a certain layer may be found in the same region. Each layer has three dimensions: time, depth and label (or &quot;class&quot;). A node at position (i,j,k) corresponds to the input segment of length j ending at time i and is of label k. All realizations of label k corresponding to this segment are to be packed in this node, and all nodes corresponding to approximately equal input segments are thus geometrically clustered.</Paragraph> <Paragraph position="33"> In other words, ambiguities are packed so that dynamic programming techniques may be applied on direct images of the whiteboard. Figure 6 gives an example, where the main NP has been obtained in two ways.</Paragraph> <Paragraph position="35"> The true lattice, then, is our preferred structure for the whiteboard.</Paragraph> <Paragraph position="36"> We said that the whiteboard could be a central place for transparent inspection, at suitable levels of detail. We use the notion of &quot;shaded nodes&quot; for this.</Paragraph> <Paragraph position="37"> - &quot;White&quot; nodes are the real nodes of the lattice. They contain results of the computation of the component associated with their layer: a white node contains at least a label, legal in its layer, such as NP, AP, CARDP, VP... in the example above, and possibly more complex information, as allowed by the declaration of the layer in the whiteboard.</Paragraph> <Paragraph position="38"> - &quot;Grey&quot; nodes may be added to show how the white nodes have been constructed. They don't belong to the lattice structure proper. In the example above, they stand for rule instances, with the possibility of m-->n rules. 
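The (i,j,k) packing described above can be sketched as follows; this is a hypothetical Python rendering (the Layer class and its API are our invention, not the paper's implementation):

```python
# Hypothetical sketch of ambiguity packing in one whiteboard layer:
# one node per (end time i, segment length j, label k), holding every
# derivation of that label over that span.
from collections import defaultdict

class Layer:
    def __init__(self):
        # (i, j, k) -> packed realizations (e.g. rule instances)
        self.nodes = defaultdict(list)

    def add(self, end, length, label, realization):
        self.nodes[(end, length, label)].append(realization)

layer = Layer()
# The same NP over the segment of length 5 ending at time 7,
# obtained in two different ways, is packed into a single node:
layer.add(7, 5, "NP", "NP -> DET N")
layer.add(7, 5, "NP", "NP -> NP PP")
assert len(layer.nodes) == 1   # one node, two packed realizations
```

Since all realizations of a label over a span share one node, a dynamic-programming pass over such a layer never recomputes a span twice.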
In other cases, they may be used to show the correspondences between nodes of two layers.</Paragraph> <Paragraph position="40"/> <Paragraph position="42"> - &quot;Black&quot; nodes may be used to represent finer steps in the computation of the component, e.g. to reflect the active edges of a chart parser.</Paragraph> <Paragraph position="43"> Whiteboard layers are organized in a loop-free dependency graph. Non-linguistic as well as linguistic information can be recorded in appropriate layers. For example, in a multimodal context, the syntactic analyzer might use selected information from a map layer, where pointing, etc. could be recorded.</Paragraph> <Paragraph position="44"> Interlayer dependencies should be declared, with associated constraints, stating for instance that only nodes with certain labels can be related to other layers.</Paragraph> <Paragraph position="45"> Here is an illustration of that idea, without any pretense to propose a whiteboard for a multimodal NLP system. 2. Let a coordinator schedule the components In its simplest form, a coordinator only transmits the results of a component to the next component(s). However, it is in a position to carry out global strategies by filtering low-ranking hypotheses and transmitting only the most promising part of a whiteboard layer to its processing component. Further, if certain components make useful predictions, the coordinator can pass these to other components as constraints, along with input. 3. Encapsulate components in managers Developers of components should be free to choose and vary their algorithms, data structures, programming languages, and possibly hardware (especially so for speech-related components). Our approach is to encapsulate existing components in managers, which hide them and transform them into servers. This strategy has the further advantage of avoiding any direct call between coordinator and components. 
To plug in a new component, one just writes a new manager, a good part of which is generic. A manager has a request box where clients send requests to open or close connections. A connection consists of a pair of in and out mailboxes, with associated locks, and is opened with certain parameters, such as its sleep time and codes indicating pre-agreed import and export formats. The coordinator puts work to do into in-boxes and gets results in corresponding out-boxes.</Paragraph> <Paragraph position="46"> As illustrated in Figure 1 above, a client can open more than one connection with the same manager. For example, an on-line dictionary might be called for displaying &quot;progressive&quot; word for word translation, as well as for answering terminological requests by a human interpreter supervising several dialogues and taking over if needed. And a manager can in principle have several clients.</Paragraph> <Paragraph position="47"> However, this potential is not used in KASUGA.</Paragraph> <Paragraph position="48"> 4. Simulate incremental processing In real life, simultaneous interpretation is often preferred over consecutive interpretation: although it may be less exact, one is not forced to wait, and one can react even before the end of the speaker's utterance. Incremental processing will thus be an important aspect of future machine interpretation systems. For instance, a semantic processor might begin working on the syntactic structures hypothesized for early parts of an utterance while later parts are still being syntactically analyzed \[19\]. Even if a component (e.g., a currently existing speech recognizer) has to get to the end of the utterance before producing any result, its manager may still make its processing appear incremental, by delivering its result piecewise and in the desired order. 
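A minimal sketch of this manager protocol, assuming Python queues as mailboxes and a toy batch component; the names and the piecewise-delivery loop are our illustration, not the paper's actual code:

```python
import queue
import threading

class Manager:
    """Wraps a batch component as a server: a request box for opening
    connections, and a pair of in/out mailboxes per connection.
    Illustrative sketch only, not the paper's protocol."""
    def __init__(self, component):
        self.component = component        # batch function: input -> list of pieces
        self.request_box = queue.Queue()  # where clients would ask for connections

    def open_connection(self):
        inbox, outbox = queue.Queue(), queue.Queue()
        threading.Thread(target=self._serve, args=(inbox, outbox), daemon=True).start()
        return inbox, outbox

    def _serve(self, inbox, outbox):
        while True:
            work = inbox.get()
            if work is None:              # close the connection
                break
            # Deliver the batch result piecewise, so the component
            # appears incremental to its client.
            for piece in self.component(work):
                outbox.put(piece)
            outbox.put(None)              # end-of-output marker

def fake_recognizer(utterance):           # toy batch component
    return utterance.split()

mgr = Manager(fake_recognizer)
inbox, outbox = mgr.open_connection()
inbox.put("kasuga simulates incremental translation")

pieces = []
while (piece := outbox.get()) is not None:
    pieces.append(piece)
# pieces is now ["kasuga", "simulates", "incremental", "translation"]
```

The client (here, playing the coordinator's role) never calls the component directly; it only exchanges messages through the mailbox pair, which is the point of the encapsulation.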
Hence, this organization makes it possible to simulate future incremental components.</Paragraph> <Paragraph position="49"> II. THE KASUGA PROTOTYPE 1. External level The coordinator (KAS.COORD) is written in KEE™, an object-oriented expert system shell with excellent interface-building tools. The whiteboard is declared in KEE's object language. KEE itself is written in Common Lisp.</Paragraph> <Paragraph position="50"> Three components are involved: - speech recognition (SP.REC) providing a 3-level grid, programmed in C \[15\]; - island-driven syntactic chart-parsing (SYNT.AN) deriving words and higher-level syntactic units, programmed in C; - word-for-word translation (WW.TRANS) at the word level, written in C and running on another machine.</Paragraph> <Paragraph position="51"> The managers are written in Lisp, and run independently, in three Unix processes. Each manager and the coordinator can run in different Unix shells. Although WW.TRANS is already accessible as a server on a distant machine, we had to create a manager for it to get the intended behavior. With only these components, it is possible to produce a simple demonstration in which incremental speech translation is simulated and the transparency gained by using a whiteboard is illustrated. The phonemes produced by SP.REC are assembled into words and phrases by SYNT.AN. As this goes on, WW.TRANS produces possible word-for-word translations, which are presented on screen as a word lattice.</Paragraph> <Paragraph position="52"> KASUGA's whiteboard has only three layers: phonemes; source words and phrases; and equivalent target words. At the first layer, the phoneme lattice is represented with phonemes in nodes. At the second layer, we retain only the complete substructures produced by SYNT.AN, that is, the inactive edges. 
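The three-stage flow just described can be sketched as a toy coordinator; all component bodies below are invented stand-ins, not the real SP.REC, SYNT.AN or WW.TRANS:

```python
# Toy coordinator for a KASUGA-style pipeline. The whiteboard is a
# plain dict that only the coordinator reads or writes; the component
# bodies are illustrative stand-ins.
def sp_rec(signal):                 # "speech recognition": signal -> phonemes
    return list(signal)

def synt_an(phonemes):              # "syntactic analysis": phonemes -> words
    return ["".join(phonemes)]

def ww_trans(words):                # "word-for-word translation"
    glossary = {"kasuga": "KASUGA"} # toy dictionary (assumption)
    return [glossary.get(w, w) for w in words]

def coordinate(signal):
    whiteboard = {}                 # hidden from the components themselves
    whiteboard["phonemes"] = sp_rec(signal)
    whiteboard["words"] = synt_an(whiteboard["phonemes"])
    whiteboard["target"] = ww_trans(whiteboard["words"])
    return whiteboard

wb = coordinate("kasuga")
# wb records every stage: phonemes, source words, target words
```

Because each stage's output is recorded in the whiteboard rather than discarded at an interface, later stages (or an inspector GUI) can reuse any intermediate layer.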
Phonemes used in these structures appear again at that layer.</Paragraph> <Paragraph position="53"> In KEE, we define a class of NODES, with subclasses WHITE.NODES, GREY.NODES, PHON.LAYER.NODES, and SYNT.LAYER.NODES in the syntactic layer. NODES have a generic display method, and subclasses have specialized variants (e.g., the placing of white nodes depends on their time interval, while that of grey nodes depends on that of the white nodes they connect).</Paragraph> </Section></Paper>