<?xml version="1.0" standalone="yes"?>
<Paper uid="C90-2045">
  <Title>An Interactive Japanese Parser for Machine Translation</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 System Structure
</SectionTitle>
    <Paragraph position="0"> The system structure of JAWB is shown in Figure 1. Japanese syntax analysis is divided into two parts: morphological analysis and dependency analysis.</Paragraph>
    <Paragraph position="1"> An input sentence is first segmented into a sequence of linguistic units called bunsetsu, which can be roughly translated into English as phrases. Each bunsetsu, hereafter called a phrase, consists of one or more primitive words. The morphological analyzer analyzes a consecutive sequence of characters and identifies word and phrase boundaries. Japanese morphological analysis is a relatively well established technology (Maruyama et al. 1988) and intervention by the user is seldom required, although the system does provide a facility for this. A Japanese syntactic structure is depicted by modifier-modifiee relationships between phrases.</Paragraph>
    <Paragraph position="2"> The dependency analyzer determines the modifiee of each phrase. This is the most difficult task and normally user interaction takes place at this stage. First, the system determines the modifiee candidates of each phrase by using the grammar rules, and builds a data structure called a constraint network. The grammar rules are based on Constraint Dependency Grammar (Maruyama 1990), and are essentially constraints between modifications. The constraint network holds the modifiee candidates of each phrase, and the grammatical constraints are posed between the candidates.</Paragraph>
    <Paragraph position="3"> The system then proposes the most plausible reading and displays it on the screen along with the other possibilities. If the human user is satisfied with the proposal or does not want to make any decision, he tells the system to 'go ahead' and the proposal is passed through to the transfer component as the unique parsing result. Alternatively, the user can select an arbitrary phrase and choose its modifiee from the rest of the candidates. The system incorporates this information into the constraint network, makes another proposal, and shows it to the user. This process is iterated until no more ambiguity remains. During analysis, the constraint propagation engine keeps the constraint network locally consistent by using the constraint propagation algorithm (Waltz 1975).</Paragraph>
    <Paragraph position="4"> Before the unique parse tree is submitted to the transfer component, JAWB performs some 'post processing' on the tree. This processing includes resolving remaining lexical ambiguities, assigning grammatical relations such as SUBJ and DOBJ, and transforming a passive-voice structure into an active-voice structure. Since making such decisions requires expert knowledge about Japanese linguistics and/or the system's internal structure, it is preferable that this process be carried out automatically. Since correct modifier-modifiee relationships are given at the previous stage, this process makes few errors without human intervention.</Paragraph>
    <Paragraph position="6"> [phrase=1, string=(anataga), cat=np, mcat=pred_modifier, modifiee={%2,3,4,5%}, words=[[string=(anata), syn={%</Paragraph>
    <Paragraph position="7"> [pos=105, string=(you), sem=[sf={hum}, caseframe={}]], [pos=105, string=(far off), sem=[sf={loc,con,abs}, caseframe={}]] %}], [string=(ga), syn=[pos=75, string=(SUBJ)]]]], [phrase=2, string=(kinou), cat=advp, mcat=pred_modifier, modifiee={%3,4,5%},</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Dependency Analysis
</SectionTitle>
    <Paragraph position="0"> Let us consider sentence (1).</Paragraph>
    <Paragraph position="2"> Part of the input to the dependency analyzer for this sentence is shown in Figure 2. A sentence is a sequence of phrases, each of which is represented as a feature structure. Some of the values are enclosed by special brackets {% and %}, representing disjunctions or choice points. Phrase 1 in Figure 2, for example, contains two choice points, one for structural ambiguity (the modifiee slot) and the other for lexical ambiguity (the syn slot of the first word). In Japanese, every phrase except the last one modifies exactly one phrase on its right. 1 Therefore, the modifiee of phrase 1 is one of the four succeeding phrases.</Paragraph>
    <Paragraph position="3"> The grammatical rules that we need here are as follows:  According to the above rules, the modifiee (i.e., the governor) of phrase 1 (you-SUBJ) is either phrase 3 (meet-PAST) or phrase 5 (see-PAST), since phrase 1 is a predicate-modifier and phrases 3 and 5 are predicates. Similarly, phrase 2 can modify either phrase 3 or phrase 5. The values of the modifiee slot of each phrase thus become as follows:</Paragraph>
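    <Paragraph>The filtering step just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names PREDICATES, PRED_MODIFIERS and modifiee_candidates are hypothetical and are not taken from JAWB, and only the single rule used in the text (a predicate modifier modifies a predicate on its right) is encoded.

```python
# Hypothetical miniature of the candidate-filtering step for sentence (1).
# Phrase numbers and categories follow the running example; the names and
# the data layout are illustrative assumptions, not JAWB's actual code.

PREDICATES = {3, 5}        # phrase 3 (meet-PAST) and phrase 5 (see-PAST)
PRED_MODIFIERS = {1, 2}    # phrase 1 (you-SUBJ) and phrase 2 (kinou)
LAST_PHRASE = 5


def modifiee_candidates(phrase):
    """Return the phrases to the right of `phrase` that it may modify."""
    to_the_right = range(phrase + 1, LAST_PHRASE + 1)
    if phrase in PRED_MODIFIERS:
        # A predicate modifier may only modify a predicate.
        return [p for p in to_the_right if p in PREDICATES]
    # Other categories are left unfiltered in this sketch.
    return list(to_the_right)


candidates = {p: modifiee_candidates(p) for p in sorted(PRED_MODIFIERS)}
print(candidates)   # {1: [3, 5], 2: [3, 5]}
```

Starting from every phrase to the right and removing the ones the rule excludes mirrors the way the modifiee slots {%2,3,4,5%} and {%3,4,5%} are narrowed in the text.</Paragraph>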
    <Paragraph position="5"> Because modification links do not cross each other (by rule G3), the cases of phrase 1 modifying phrase 3 and phrase 2 modifying phrase 5 do not co-occur. Therefore, this sentence has three different readings, which correspond to (1-1) to (1-3):  (1-1) (I) saw the man you met yesterday.</Paragraph>
    <Paragraph position="6"> (1-2) You saw the man (I) met yesterday.</Paragraph>
    <Paragraph position="7"> (1-3) Yesterday, you saw the man (I) met.</Paragraph>
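    <Paragraph>The three readings can be checked by brute force. The sketch below enumerates every combination of modifiee candidates and discards those with crossing links. The candidates for phrases 1 and 2 come from the text; the single candidates given to phrases 3 and 4, and the names crosses and readings, are assumptions made for this illustration only.

```python
from itertools import product

# Modifiee candidates for phrases 1 and 2, as derived in the text.
# Phrases 3 and 4 are not discussed there; giving each of them a single
# candidate (3 -> 4, 4 -> 5) is an assumption made for this illustration.
CANDIDATES = {1: [3, 5], 2: [3, 5], 3: [4], 4: [5]}


def crosses(i, head_i, j, head_j):
    """True if link i -> head_i crosses link j -> head_j (with j > i)."""
    return head_j > head_i > j > i


def readings(candidates):
    """Yield every assignment of modifiees in which no two links cross."""
    phrases = sorted(candidates)
    for heads in product(*(candidates[p] for p in phrases)):
        links = list(zip(phrases, heads))
        if not any(
            crosses(i, hi, j, hj)
            for a, (i, hi) in enumerate(links)
            for (j, hj) in links[a + 1:]
        ):
            yield dict(links)


for reading in readings(CANDIDATES):
    print(reading)
```

Only the combination in which phrase 1 modifies phrase 3 and phrase 2 modifies phrase 5 is rejected, leaving three assignments, in the same order as (1-1) to (1-3).</Paragraph>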
    <Paragraph position="8"> The system maintains these readings implicitly by having constraints between choice points. For example, the following constraint ma-</Paragraph>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
[Constraint matrix]
</SectionTitle>
    <Paragraph position="0"> By means of the constraint matrices, the system can defer the generation of individual parse trees until all structural ambiguities are resolved. The number of parse trees may combinatorially explode when the sentence becomes long. For example, sentences with more than 20 phrases are not rare and such sentences may have tens of thousands of parse trees.</Paragraph>
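    <Paragraph>To give a rough sense of the growth, the sketch below counts the analyses that remain when the grammar rules are ignored and only two conditions are kept: every phrase except the last modifies one phrase on its right, and links do not cross. This is an upper bound under those assumptions, not the number of readings JAWB would actually consider, since the grammar rules prune most candidates. The function name is hypothetical.

```python
def count_unconstrained_readings(n):
    """Count non-crossing analyses of n phrases when grammar rules are ignored.

    Every phrase except the last modifies exactly one phrase to its right,
    and links may not cross. If phrase 1 modifies phrase k, phrases 2..k
    form one independent block and phrases k..n form another, which gives
    the recurrence used below.
    """
    counts = [0, 1]    # counts[m] = number of analyses of m phrases
    for m in range(2, n + 1):
        counts.append(sum(counts[k - 1] * counts[m - k + 1]
                          for k in range(2, m + 1)))
    return counts[n]


for n in (5, 10, 20):
    print(n, count_unconstrained_readings(n))
```

For five phrases the count is 14, of which the grammar rules leave only the three readings of sentence (1). The count rises steeply with sentence length, which is why deferring tree generation matters.</Paragraph>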
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
User Interface
</SectionTitle>
      <Paragraph position="0"> The essential portion of the user interface is shown in Figure 3. The system does not display the proposed modifiees of all the phrases at once. Instead, when the user moves the cursor to a phrase by using a mouse, the proposed modifiee and the other possible candidates are highlighted. In the figures, the current phrases pointed to by the cursor are underscored, the proposed modifiees are in reverse video, and the other modifiee candidates are in a shaded box. 2 The number appearing at the lower left corner of each phrase shows the number of modifiee candidates of the current phrase.</Paragraph>
      <Paragraph position="1"> 2 These are in different colors on the real screen.</Paragraph>
      <Paragraph position="2"> If this number is one, the modifiee is uniquely determined. Otherwise, the modifiee of the phrase is ambiguous.</Paragraph>
      <Paragraph position="3"> Figure 3-a shows the screen when the cursor is on phrase 1 (you-SUBJ). Phrase 1 can modify either phrase 3 or phrase 5, and the system's proposal is phrase 5. Figure 3-b shows the screen when the cursor is on phrase 2. By moving the cursor on the phrases, the user can check the current system proposal. If the user is satisfied with it, he indicates this by clicking a special 'go-ahead' icon.</Paragraph>
      <Paragraph position="4"> Otherwise, he has to select the proper candidates.</Paragraph>
      <Paragraph position="5"> The user selects one of the ambiguous phrases by clicking the mouse, moves the cursor to its proper modifiee, and clicks the mouse again. The second click triggers the constraint propagation engine, and the updated situation is displayed instantaneously. Figure 4 shows the situation after the user has instructed the system that phrase 1 modifies phrase 3. The reader may notice that the modifiee of phrase 2 is also determined automatically because of constraint propagation.</Paragraph>
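      <Paragraph>The effect of this selection can be reproduced with a small pairwise filter. The sketch below is a simple arc-consistency style stand-in, offered under the assumption that only the no-crossing constraint between phrases 1 and 2 is in play; it is not claimed to be the propagation algorithm JAWB implements, and the names compatible, propagate and select are hypothetical.

```python
# Domains: modifiee candidates of phrases 1 and 2 in sentence (1).
domains = {1: {3, 5}, 2: {3, 5}}


def compatible(i, head_i, j, head_j):
    """Links i -> head_i and j -> head_j (j > i) are compatible if they do not cross."""
    return not (head_j > head_i > j > i)


def propagate(domains):
    """Drop candidates that lack support in another phrase's domain, to a fixpoint."""
    changed = True
    while changed:
        changed = False
        for i in domains:
            for j in domains:
                if j > i:
                    # Filter both ends of the pair (i, j).
                    keep_i = {hi for hi in domains[i]
                              if any(compatible(i, hi, j, hj) for hj in domains[j])}
                    keep_j = {hj for hj in domains[j]
                              if any(compatible(i, hi, j, hj) for hi in keep_i)}
                    if keep_i != domains[i] or keep_j != domains[j]:
                        domains[i], domains[j] = keep_i, keep_j
                        changed = True
    return domains


def select(domains, phrase, modifiee):
    """Record a user choice and re-establish local consistency."""
    domains[phrase] = {modifiee}
    return propagate(domains)


print(select(domains, 1, 3))   # {1: {3}, 2: {3}}
```

Once phrase 1 is fixed to phrase 3, the candidate 5 for phrase 2 has no compatible partner left and is removed, so phrase 2 is resolved without a second click.</Paragraph>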
      <Paragraph position="6"> During parsing, the user always has the initiative in the interaction. The user knows the exact sources of the structural ambiguity, and he can select any of them to give information to the system. This is in contrast to previous systems, in which the user must answer system-generated queries one by one. The constraint propagation engine ensures that the given information is maximally used in order to minimize further interaction. The user also has the option of saying 'go-ahead' at any time, taking the default choices proposed by the system.</Paragraph>
    </Section>
  </Section>
</Paper>