<?xml version="1.0" standalone="yes"?> <Paper uid="I05-2019"> <Title>eBonsai: An integrated environment for annotating treebanks</Title> <Section position="3" start_page="109" end_page="109" type="metho"> <SectionTitle> 2 Annotating treebanks </SectionTitle> <Paragraph position="0"> Figure 2 shows the workflow for annotating a treebank using eBonsai.</Paragraph> <Paragraph position="1"> 1. An annotator picks a sentence to annotate from plain-text corpora.</Paragraph> <Paragraph position="2"> 2. The MSLR parser (Tanaka et al., 1993) performs syntactic analysis of the sentence. 3. The annotator chooses the correct syntactic structure from the output of the parser. If necessary, structures already in the treebank can be retrieved in this step.</Paragraph> <Paragraph position="3"> 4. The annotator adds the chosen syntactic structure to the treebank.</Paragraph> <Paragraph position="4"> The Japanese grammar used in the MSLR parser has fairly wide coverage: the current system has almost 3,000 grammar rules. As a result, syntactic analysis in step 2 usually produces many candidate structures. These structures are represented by a special data structure called a packed shared forest (PSF) (Tomita, 1986). 
The main role of eBonsai is to help annotators choose the correct structure from the many candidates in step 3.</Paragraph> </Section> <Section position="4" start_page="109" end_page="109" type="metho"> <SectionTitle> 3 Annotation plug-in module </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="109" end_page="109" type="sub_section"> <SectionTitle> 3.1 Overview </SectionTitle> <Paragraph position="0"> The annotation plug-in module helps annotators choose the correct syntactic structure from the set of structures represented by a packed shared forest, which is the output of the MSLR parser.</Paragraph> <Paragraph position="1"> Because the parser generally produces so many syntactic structures, it is impractical to find the correct one by checking all of them. The annotation plug-in module shows a single structure at a time, as shown in Figure 1, and asks annotators to specify a constraint. The system applies the constraint immediately by filtering out the structures that violate it. This process is repeated until a single correct structure is identified. Annotators can specify the following two kinds of constraints: 1. Annotators can specify the destination constituent of a dependent constituent.</Paragraph> <Paragraph position="2"> 2. 
Annotators can specify the correct label of a node.</Paragraph> <Paragraph position="3"> This plug-in module is a reimplementation, on the Eclipse framework, of an annotation tool developed by Okazaki (Okazaki et al., 2001).</Paragraph> </Section> <Section position="2" start_page="109" end_page="109" type="sub_section"> <SectionTitle> 3.2 Example of annotation </SectionTitle> <Paragraph position="0"> The following Japanese sentence serves as an example to explain how the annotation plug-in module is used.</Paragraph> <Paragraph position="1"> /w+w; (yellow paper for warning-ACC) #(!Zt (mailbox-DAT) oSMhU (put, but)|i oMsM (not being paid yet).</Paragraph> <Paragraph position="2"> (I put a yellow paper for warning in the mailbox, but it is not paid yet.) 1. An annotator double-clicks a PSF file name in the Eclipse &quot;Navigator&quot; (the left window) to pick a sentence to annotate.</Paragraph> <Paragraph position="3"> 2. A new window opens and one of the structures is shown as a syntactic tree (the right window).</Paragraph> <Paragraph position="4"> This window is called the &quot;Annotation editor&quot;. The notation &quot;1/9&quot; in the top-left part of the window indicates that the presented structure is the first of nine. 3. A red node (e.g. &quot;&lt; -; - &gt;&quot; (verb phrase)) indicates that this node has other possible label names.</Paragraph> <Paragraph position="5"> 4. Right-clicking the red node pops up a list of possible label names.</Paragraph> <Paragraph position="6"> 5. Annotators can choose the correct label name for the node from the list. In this case, label &quot;&lt; -! - &gt;&quot; will be selected.</Paragraph> <Paragraph position="7"> 6. Label &quot;&lt; -; - &gt;&quot; (verb phrase) then changes to &quot;&lt; -! - &gt;&quot; (noun phrase) in the tree, and its color becomes black at the same time. 
Black label names indicate that there is no other possible label for this node.</Paragraph> <Paragraph position="8"> Now, the number of structures shown in the top-left part of the Annotation editor decreases to 3.</Paragraph> <Paragraph position="9"> 7. A green node (e.g. &quot;&lt; 4 - - &gt;&quot;) indicates that the constituent governed by that node can depend on more than one constituent. 8. Right-clicking node &quot;&lt;4 - - &gt;&quot; pops up a list of possible dependency destinations.</Paragraph> <Paragraph position="10"> 9. Annotators can choose the correct destination from the list. In this case, &quot;#(!ZtoS Mh (put (a yellow paper) in the mailbox)&quot; will be selected.</Paragraph> <Paragraph position="11"> 10. At this point, all nodes have turned black and the number of structures becomes 1. This means that the annotation of this sentence is finished.</Paragraph> </Section> <Section position="3" start_page="109" end_page="109" type="sub_section"> <SectionTitle> 3.3 Other features </SectionTitle> <Paragraph position="0"> The following features are also implemented.</Paragraph> <Paragraph position="1"> + Unlimited Undo/Redo. It is possible to undo/redo even after saving results, by using the popup menu (Figure 3).</Paragraph> <Paragraph position="2"> + Viewing other structures. Items [Previous tree] and [Next tree] in the popup menu show different structures.</Paragraph> <Paragraph position="3"> + Folding constituents. Right-clicking a node and selecting item [Switch folding] folds the structure under the node. Selecting the same item again unfolds the structure.</Paragraph> <Paragraph position="4"> + Copying a part of a structure to the retrieval plug-in module. Item [Copy to search] in the popup menu copies a selected structure to the query input window. 
This feature is described in detail in Section 5.</Paragraph> </Section> </Section> <Section position="5" start_page="109" end_page="111" type="metho"> <SectionTitle> 4 Retrieval plug-in module </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="109" end_page="111" type="sub_section"> <SectionTitle> 4.1 Overview </SectionTitle> <Paragraph position="0"> During annotation, annotators usually specify constraints to narrow the candidates down to the correct structure, based on the meaning of the sentence.</Paragraph> <Paragraph position="1"> However, there are cases in which it is difficult to pin down the correct structure by referring to that sentence alone. In such cases, annotators can ask the system to retrieve sentences with similar structures from the treebank. The retrieval plug-in module provides this functionality: it receives a syntactic structure as a query and returns a list of sentences that include the given structure.</Paragraph> <Paragraph position="2"> The retrieval plug-in module is implemented with the method proposed by Yoshida (Yoshida et al., 2004). The method is based on Yoshikawa's method (Yoshikawa et al., 2001), which was originally proposed for handling XML documents effectively by using relational database (RDB) systems. Yoshida adapted Yoshikawa's method to deal with syntactic structures in the database.</Paragraph> <Paragraph position="3"> Since an XML document can be represented as SQL query and the retrieval is performed.</Paragraph> <Paragraph position="4"> A query involving a large number of nodes generates a longer SQL query, which degrades retrieval speed significantly. Yoshida proposed decomposing an input query into a set of subtrees and translating each subtree into a SQL query.</Paragraph> </Section> <Section position="2" start_page="111" end_page="111" type="sub_section"> <SectionTitle> 4.2 Example of structure retrieval </SectionTitle> <Paragraph position="0"> 1. 
An annotator puts a query tree in the query input window (the upper-left window of Figure 5). The query can be modified in the following ways.</Paragraph> <Paragraph position="1"> + A node label can be changed by left-clicking the node and entering a new label in the input area. A label can contain the wildcard character &quot;%&quot;. + A child node can be added by right-clicking a node and selecting menu item [Add child].</Paragraph> <Paragraph position="2"> 2. Right-clicking in the query input window and selecting a menu item starts retrieval. There are four types of search.</Paragraph> <Paragraph position="3"> + Menu item [Search] retrieves sentences containing a structure which is exactly the same as the query.</Paragraph> <Paragraph position="4"> + Menu item [Partial search] retrieves + Menu item [Narrow search] searches in the previously retrieved sentences instead of in the entire treebank.</Paragraph> <Paragraph position="5"> + Menu item [Partial narrow search] is the combination of [Partial search] and [Narrow search].</Paragraph> <Paragraph position="6"> 3. Retrieval results are shown in the retrieval result list window (the bottom-left window in Figure 5).</Paragraph> <Paragraph position="7"> 4. Clicking a sentence in the list shows the detailed structure of the sentence in the detail window (the right window of Figure 5). The part of the structure that matches the query is colored blue.</Paragraph> <Paragraph position="8"> 5. If more than one substructure in a sentence matches the query, the system shows the total number of matching parts and the number identifying the currently colored part. 
Menu items [Previous match] and [Next match] allow annotators to move to the other matching parts.</Paragraph> </Section> </Section> <Section position="6" start_page="111" end_page="112" type="metho"> <SectionTitle> 5 Interplay between annotation and retrieval </SectionTitle> <Paragraph position="0"> Since both the annotation plug-in module and the retrieval plug-in module are implemented as Eclipse plug-ins, they can easily exchange information with each other. Thanks to this, annotators can copy a part of a syntactic structure shown in the Annotation editor and submit it to the retrieval module as a query. This can be done by the following procedure.</Paragraph> <Paragraph position="1"> 1. Dragging the mouse pointer over the area covering a target syntactic structure selects the structure, whose color changes to blue. 2. Right-clicking in the Annotation editor pops up a command list, and selecting item [Copy to search] copies the selected structure to the query input window.</Paragraph> <Paragraph position="2"> 3. The annotator can modify the query if needed.</Paragraph> <Paragraph position="3"> 4. Right-clicking in the query input window pops up a command list, and selecting one of the search commands performs a search.</Paragraph> </Section> </Paper>