<?xml version="1.0" standalone="yes"?> <Paper uid="C94-1070"> <Title>THE &quot;WHITEBOARD&quot; ARCHITECTURE: A WAY TO INTEGRATE HETEROGENEOUS COMPONENTS OF NLP SYSTEMS</Title> <Section position="4" start_page="428" end_page="429" type="metho"> <SectionTitle> 2. Internal level </SectionTitle> <Paragraph position="0"> When a manager receives a Make.Connection request from a client, it creates an in box and an out box (and associated locks, used to prevent interference between components), through which information is passed to and from the client. The Make.Connection request includes codes showing in which format(s) the client is expecting to deposit data in the in box and read data from the out box, for that connection.</Paragraph> <Paragraph position="1"> Although data transfer could be programmed more efficiently, e.g. using Unix sockets, our method is more general, as it uses only the file system, and we believe its overhead will be negligible in comparison with the processing times required by the components.</Paragraph> <Paragraph position="2"> For each out box, the client (KASUGA) activates a reader process and the relevant manager activates a writer process. Conversely, for each in box, the client activates a writer process and the manager activates a reader process. A reader process wakes up regularly and checks whether its mailbox is both non-empty and unlocked. If so, it locks the mailbox, reads its contents, empties the mailbox, unlocks it, and goes to sleep again. A writer process, by comparison, wakes up regularly and checks whether its mailbox is both empty and unlocked. If so, it locks the box, fills it with appropriate data, unlocks it, and goes back to sleep. For example, the writer associated with SYNT.AN will deposit in the appropriate out box the image of all the inactive arcs created since the last deposit.
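The reader and writer wake-ups described above can be illustrated with a minimal Python sketch. This is not the original implementation: the class name Mailbox, the helper new_mailbox, the ".lock" file suffix, and the use of exclusive file creation to combine the "unlocked?" check with the locking step are all assumptions made for illustration. Only the protocol itself (writer deposits when the box is empty and unlocked, reader collects when it is non-empty and unlocked) comes from the text.

```python
import os
import tempfile


class Mailbox:
    """File-backed mailbox with a separate lock file (hypothetical layout)."""

    def __init__(self, path):
        self.path = path
        self.lock_path = path + ".lock"  # assumed naming convention
        # Start with an empty, unlocked mailbox.
        with open(self.path, "w"):
            pass

    def _try_lock(self):
        # O_EXCL makes creation fail if the lock file already exists,
        # so testing for "unlocked" and locking happen in one step.
        try:
            fd = os.open(self.lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            return False
        os.close(fd)
        return True

    def _unlock(self):
        os.remove(self.lock_path)

    def _is_empty(self):
        return os.path.getsize(self.path) == 0

    def writer_step(self, data):
        """One writer wake-up: deposit only if the box is empty and unlocked."""
        if not self._try_lock():
            return False
        try:
            if not self._is_empty():
                return False
            with open(self.path, "w") as f:
                f.write(data)
            return True
        finally:
            self._unlock()

    def reader_step(self):
        """One reader wake-up: collect only if the box is non-empty and unlocked."""
        if not self._try_lock():
            return None
        try:
            if self._is_empty():
                return None
            with open(self.path) as f:
                data = f.read()
            # Empty the mailbox so that the writer may deposit again.
            with open(self.path, "w"):
                pass
            return data
        finally:
            self._unlock()


def new_mailbox(name):
    """Create a mailbox in a fresh temporary directory (demonstration helper)."""
    return Mailbox(os.path.join(tempfile.mkdtemp(), name))
```

In a running system each step would sit inside a loop that sleeps between wake-ups (for instance with time.sleep), one loop per reader or writer process. The sketch only shows what happens during a single wake-up.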
SP.REC provides, for each of 40 prerecorded bunsetsu (elementary phrase), a set of about 25 phoneme matrices, one for each phoneme. A matrix cell contains the score for a given phoneme with a given beginning/ending speech frame pair. These matrices are then compared, and 3 other matrices are computed. The top-scoring matrix contains in each cell the top-scoring phone and its score for the corresponding beginning/end. The 2nd-scoring and 3rd-scoring matrices are computed similarly. These three matrices are used to build the first layer of the whiteboard. To build the whiteboard's second layer, an island-driven chart parser is used, where the matrices are considered as initialized charts. The overall best-scoring cell in the top matrix is established as the only anchor, and bi-directional searching is carried out within the (hand-set) limits set by max-gap and max-overlap. A CFG written by J. Hosaka for the ASURA demos is now used as is. Parsing results are converted to syntactic.lattice.N (by our chart-to-lattice filter) and brought into KEE.</Paragraph> <Paragraph position="3"> Then an image lattice, ww.lattice.N, is computed as the whiteboard's third layer, using a C-based on-line J-E dictionary. Each lexical syntactic node gives rise to one English word for each meaning. For example, hai gives yes, yes-sir, the-lungs, ashes, etc.</Paragraph> <Paragraph position="4"> Layers of the whiteboard are represented by KEE &quot;planes&quot;. We can move planes relative to each other; zoom in various ways; put various information in the nodes (label, rule responsible, id, time span, score); expand the nodes; open and close the nodes selectively. [The &quot;Whiteboard&quot; Architecture: a way to integrate... Boitet &amp; Seligman, COLING-94] And we can color the nodes according to their score. It is possible to show or hide various parts of the whiteboard.
In Figure 9, the first layer, the time grid, the lattice lines, and the initial/final lattice nodes have been hidden. Alternatively, we could hide construction (dotted) lines, rule boxes, label boxes, etc. The view of any part of the whiteboard can be changed for emphasis: one can, for instance, interactively select only the nodes above a certain confidence threshold.</Paragraph> <Paragraph position="5"> Overall processing can be interrupted for examination.</Paragraph> <Paragraph position="7"> If this architecture is developed further, KEE could be replaced by a general-purpose, portable interface-building toolkit, in order to avoid the overhead and overspecialization associated with using a complete expert system shell.</Paragraph> <Paragraph position="8"> KAS.COORD writes and reads data to and from the managers in a LISP-like format, and handles the transformation into KEE's internal format. Each manager translates back and forth between that format and whatever format its associated component happens to be using.</Paragraph> <Paragraph position="9"> Hence, formats must be precisely defined. For instance, the edges produced by the speech recognizer are of the form (begin end phoneme score). The nodes and edges of the corresponding phoneme layer in the whiteboard are of the form (node-id begin end phoneme score (in-arcs) (out-arcs)), with arcs being of the form (arc-id origin extremity weight).</Paragraph> </Section> </Paper>