<?xml version="1.0" standalone="yes"?> <Paper uid="C96-1082"> <Title>Research on Architectures for Integrated Speech/Language Systems in Verbmobil</Title> <Section position="4" start_page="486" end_page="487" type="metho"> <SectionTitle> 3 Parallel Parsing </SectionTitle> <Paragraph position="0"> One of our main research interests has been the exploration of performance gains in NLP through parallelization. To this end, we developed a parallel version of the INTARC parser. Although the results so far are not yet as encouraging as we expected, our efforts make for interesting lessons in software engineering. The parallel parser had to obey the following restrictions: Running on our local shared memory multiprocessor (SparcServer1000) with 6 processors, parallelization should be controlled by inserting Solaris-2.4 thread and process control primitives directly into the code. The only realistic choice we had was to translate our parser automatically into C with Chestnut Inc.'s Lisp-to-C-Translator. Since the Lisp functions library is available in C source, we could insert the necessary Solaris parallelization and synchronization primitives into key positions of the involved functions.</Paragraph> <Section position="1" start_page="486" end_page="487" type="sub_section"> <SectionTitle> 3.1 Parallelization Strategy and Preliminary Results </SectionTitle> <Paragraph position="0"> For effective parallelization it is crucial to keep communication between processors to a minimum.</Paragraph> <Paragraph position="1"> Early experiments with a fully distributed chart showed that the effort required to keep the partial charts consistent was much larger than the potential gains of increased parallelism. The chart must therefore be kept as a single data structure in a shared memory processor, where concurrent reads are possible and only concurrent writes have to be serialized with synchronization primitives. 
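The single-chart design can be illustrated with a minimal sketch. It is written in Python for brevity and is not the INTARC implementation, which inserted Solaris primitives into translated C code; the names SharedChart and build_chart and the edge tuples are hypothetical. Readers consult the chart without locking, while writers pass through one mutex:

```python
import threading


class SharedChart:
    """Single chart shared by all worker threads (hypothetical sketch)."""

    def __init__(self):
        self._edges = set()
        self._write_lock = threading.Lock()  # serializes writers only

    def contains(self, edge):
        # Reads take no lock. Assumption of this sketch: membership tests
        # on a built-in set are safe alongside a single locked writer.
        return edge in self._edges

    def add(self, edge):
        # Writes are serialized. Returns True only if the edge was new.
        with self._write_lock:
            if edge in self._edges:
                return False
            self._edges.add(edge)
            return True

    def size(self):
        return len(self._edges)


def build_chart(num_workers=4, edges_per_worker=100):
    """Let several threads insert overlapping edges into one chart."""
    chart = SharedChart()

    def worker(worker_id):
        for i in range(edges_per_worker):
            # Every worker proposes the same "NP" edge for span (i, i + 1),
            # plus one edge that only this worker produces.
            chart.add(("NP", i, i + 1))
            chart.add(("W%d" % worker_id, i, i + 1))

    threads = [threading.Thread(target=worker, args=(w,))
               for w in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return chart
```

With four workers each proposing 100 shared and 100 private edges, the chart ends up holding 500 distinct edges, since a duplicate proposed concurrently is rejected inside the lock.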
An analysis of profiling data shows that even the heavily optimized UG formalism causes between 50% and 70% of the computational load in the serial case.</Paragraph> <Paragraph position="2"> Therefore we provide an arbitrary number of unification workers running in parallel which are fed unification tasks from the top of an agenda sorted by scores. Due to the high optimization level of the sequential parser, load-balancing is fairly poor. Namely, the very fast type check used to circumvent most unifications causes large disparities in the granularity of agenda tasks. Furthermore, pathological examples have been found in which a single unification takes much longer than all other tasks combined.</Paragraph> </Section> </Section> <Section position="5" start_page="487" end_page="488" type="metho"> <SectionTitle> 4 Distributed Control in </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="487" end_page="487" type="sub_section"> <SectionTitle> Verbmobil </SectionTitle> <Paragraph position="0"> The question of control in VM is tightly knit with the architecture of the VM system. As yet, the concept of architecture in VM has been used mostly to describe the overall modularization and the interfaces implied by the data flow between modules. This so-called domain architecture is incomplete in the sense that it does not specify any interaction strategies. Within our research on interactive system architectures we developed a modular communication framework, ICE, in co-operation with the University of Hamburg. 
Now, ICE is the architectural framework of the VM research prototype.</Paragraph> </Section> <Section position="2" start_page="487" end_page="487" type="sub_section"> <SectionTitle> 4.1 The INTARC Architecture </SectionTitle> <Paragraph position="0"> The INTARC architecture as first presented by (Pyka 1992) is a distributed software system that allows for the interconnection of NLSP modules under the principles of incrementality and interactivity. Figure 2 shows the modularization of INTARC-1.3: There is a main broad channel connecting all modules in bottom-up direction, i.e., from signal to interpretation. Furthermore, there are smaller channels connecting several modules, which are used for the top-down interactive disambiguation data flow. Incrementality is required for all modules. ICE assumes that each module has a local memory that is not directly accessible to other modules. Modules communicate explicitly with one another via messages sent over bidirectional channels. This kind of communication architecture is hardly new and confronts us directly with a large number of unresolved issues in distributed problem solving, cf. (Durfee et al. 1989).</Paragraph> <Paragraph position="1"> 
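The communication model that ICE assumes, in which each module keeps private memory and interacts only through messages on a bidirectional channel, can be sketched as follows. This is an illustration in Python, not the ICE API: Channel, producer, consumer and run_modules are hypothetical names, and two queues stand in for the two directions of one channel.

```python
import queue
import threading

STOP = None  # sentinel marking the end of the bottom-up stream


class Channel:
    """Bidirectional channel modeled as one queue per direction."""

    def __init__(self):
        self.up = queue.Queue()    # bottom-up: hypotheses
        self.down = queue.Queue()  # top-down: feedback on each hypothesis


def producer(channel, hypotheses, results):
    accepted = []                   # local memory, invisible to the consumer
    for hyp in hypotheses:
        channel.up.put(hyp)         # emit one hypothesis at a time
        reply = channel.down.get()  # wait for the top-down verdict
        if reply == "accept":
            accepted.append(hyp)
    channel.up.put(STOP)
    results.put(("producer", accepted))


def consumer(channel, lexicon, results):
    parsed = []                     # local memory, invisible to the producer
    while True:
        hyp = channel.up.get()
        if hyp is STOP:
            break
        if hyp in lexicon:
            parsed.append(hyp)
            channel.down.put("accept")
        else:
            channel.down.put("reject")
    results.put(("consumer", parsed))


def run_modules(hypotheses, lexicon):
    """Run both modules as threads and collect their final reports."""
    channel = Channel()
    results = queue.Queue()
    threads = [
        threading.Thread(target=producer, args=(channel, hypotheses, results)),
        threading.Thread(target=consumer, args=(channel, lexicon, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dict(results.get() for _ in threads)
```

Neither function touches the other's list; everything each module learns about the other arrives as a message. The sketch deliberately leaves open who decides what to do next when several modules compete, which is the control problem discussed in this section.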
In the last 20 years there have been numerous architecture proposals for distributed problem solving among computing entities that exchange information explicitly via message passing.</Paragraph> <Paragraph position="2"> None of these models includes explicit strategies or paradigms to tackle the problem of distributed control.</Paragraph> </Section> <Section position="3" start_page="487" end_page="487" type="sub_section"> <SectionTitle> 4.2 Structural Constraints of Verbmobil </SectionTitle> <Paragraph position="0"> Modularity, being a fundamental assumption in VM (Wahlster 1992), still leaves us with two problems: First, modules have to communicate with one another, and second, their local behaviors have to be somehow coordinated into a coherent global, possibly optimal, behavior. Unfortunately, the task of system integration has to obey some structural constraints which are mostly pragmatic in nature:</Paragraph> </Section> <Section position="4" start_page="487" end_page="487" type="sub_section"> <SectionTitle> 1.3 architecture </SectionTitle> <Paragraph position="0"> Some of the modules are very complex software systems in themselves. Because they are highly parameterizable, with control subtly spread over many interacting submodules, understanding and then integrating such systems into a common control strategy can be a very daunting task.</Paragraph> <Paragraph position="1"> Control issues are often very tightly knit with the domain the module is aimed at, i.e., it is very difficult to understand the control strategies used without sound knowledge of the underlying domain. The problem gets even worse if what is to be fine-tuned is the interaction between several complex modules.</Paragraph> <Paragraph position="2"> These two arguments are similar in nature, but differ in the architectural levels that they apply to. 
The former is implementation related, the latter algorithm and theory related.</Paragraph> </Section> <Section position="5" start_page="487" end_page="488" type="sub_section"> <SectionTitle> 4.3 Layers of Control </SectionTitle> <Paragraph position="0"> Modules have to communicate with one another and their local behaviors have to be coordinated into a coherent global, possibly optimal, behavior.</Paragraph> <Paragraph position="1"> In highly distributed systems we generally find the following levels of control: System Control: The minimal set of operating system related actions that each participating module must be able to perform, which will typically include means to start up, reset, monitor, trace and terminate individual modules or the system as a whole.</Paragraph> <Paragraph position="2"> Isolated Local Control: The control strategies used within the module, disregarding any interactions beyond initial input of data and final output of solutions. There is only one thread of control active at any time.</Paragraph> <Paragraph position="3"> Interactive Local Control: Roughly, this can be seen as isolated local control extended with interaction capabilities. Incrementality is given by the possibility of control flowing back to a certain internal state after an output operation. Higher interactivity is made possible by entering a state more often from various points within the module and by adding a new waiting loop to check for any top-down requests.</Paragraph> <Paragraph position="4"> The requirement for anytime behavior is a special case of that (Görz and Kesseler 1994).</Paragraph> <Paragraph position="5"> In our experience the change to interactive control will tremendously increase the complexity of the resulting code. But we are still making the simplifying assumption that the algorithm can be used incrementally, although there are algorithms unsuitable for incremental processing (e.g. 
A*).</Paragraph> <Paragraph position="6"> Incrementality can lead to the demand for a complete redesign of a module. Furthermore we assume that simply by exchanging data and doing simple extensions in the control flow everything will balance out nicely on the system scale, which is enormously naive. Even for the sequential architecture implied by the case of isolated local control, we have to solve a whole plethora of new problems that come along with interactivity: a module that comes close to possessing the &quot;integrated view&quot; of a centralized blackboard control: the dialogue module. So it seems the right place to handle some of the global strategic control issues, like: domain error handling; observing timeout constraints; resolving external ambiguities/unknowns. The fact that the dialogue module exercises a kind of global control does not invalidate what has been said about the infeasibility of central control, because the control exercised by it is very coarse grained. To handle finer grained control issues in any module would take us back to memory and/or communication system contention.</Paragraph> </Section> </Section> </Paper>