<?xml version="1.0" standalone="yes"?>
<Paper uid="H01-1005">
  <Title>The Annotation Graph Toolkit: Software Components for Building Linguistic Annotation Tools</Title>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2. ARCHITECTURE
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.1 General Architecture
</SectionTitle>
      <Paragraph position="0"> Existing annotation tools are based on a two-level model (Figure 1, top). The systems we demonstrate are based on a three-level model, in which annotation graphs provide a logical level independent of the application and physical levels (Figure 1, bottom).</Paragraph>
      <Paragraph position="1"> The application level represents special-purpose tools built on top of the general-purpose infrastructure at the logical level.</Paragraph>
      <Paragraph position="2"> The system is built from several components which instantiate this model. Figure 2 shows the architecture of the tools currently being developed. Annotation tools, such as the ones discussed below, must provide graphical user interface components for signal visualization and annotation. The communication between components is handled through an extensible event language. An application programming interface for annotation graphs (AG-API) has been developed to support well-formed operations on annotation graphs. This permits applications to abstract away from file format issues, and deal with annotations purely at the logical level.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.2 The Annotation Graph API
</SectionTitle>
      <Paragraph position="0"> The complete IDL definition of the AG-API is provided in the appendix (also online). Here we describe a few salient features of the API.</Paragraph>
      <Paragraph position="1"> The API provides access to internal objects (signals, anchors, annotations, etc.) using identifiers. Identifiers are strings which contain internal structure. For example, an AG identifier is qualified with an AGSet identifier: AGSetId:AGId. Annotations and anchors are doubly qualified: AGSetId:AGId:AnnotationId, AGSetId:AGId:AnchorId. Thus, it is possible to determine, from any given identifier, its membership in the overall data structure. The functioning of the API will now be illustrated with a series of examples. Suppose we have already constructed an AG and now wish to create a new anchor. We might have the following API call: CreateAnchor( &quot;agSet12:ag5&quot;, 15.234, &quot;sec&quot; ); This call would construct a new anchor object and return its identifier: agSet12:ag5:anchor34. Alternatively, if we already have an anchor identifier that we wish to use for this new anchor (e.g. because we are reading previously created annotation data from a file and do not wish to assign new identifiers), then we could have the following API call: CreateAnchor( &quot;agSet12:ag5:anchor34&quot;, 15.234, &quot;sec&quot; ); This call will return agSet12:ag5:anchor34.</Paragraph>
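The identifier scheme described above can be sketched in a few lines. The following Python is an illustrative toy, not the actual AGTK implementation; the class, method names, and id-numbering policy are assumptions made for the sketch:

```python
class AGSet:
    """Toy container mimicking the AG-API's qualified-identifier scheme."""

    def __init__(self, agset_id):
        self.id = agset_id      # e.g. "agSet12:ag5"
        self._next = 0
        self.anchors = {}       # qualified anchor id -> (offset, unit)

    def create_anchor(self, offset, unit, anchor_id=None):
        """Mimics CreateAnchor: mint a fresh id unless one is supplied
        (e.g. when re-reading previously created annotation data)."""
        if anchor_id is None:
            self._next += 1
            anchor_id = f"{self.id}:anchor{self._next}"
        self.anchors[anchor_id] = (offset, unit)
        return anchor_id

def parse_id(qualified_id):
    """Recover membership from a qualified id:
    'agSet12:ag5:anchor34' -> ('agSet12', 'ag5', 'anchor34')."""
    return tuple(qualified_id.split(":"))
```

Because every identifier carries its parents' identifiers, a component holding only the string can locate the object in the overall data structure without extra bookkeeping.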
      <Paragraph position="2"> Once a pair of anchors has been created, it is possible to create an annotation which spans them:</Paragraph>
      <Paragraph position="4"> This call will construct an annotation object and return an identifier for it, e.g. agSet12:ag5:annotation41. We can now add features to this annotation: SetFeature( &quot;agSet12:ag5:annotation41&quot;, &quot;date&quot;, &quot;1999-07-02&quot; ); The implementation maintains indexes on all the features, and also on the temporal information and graph structure, permitting efficient search using a family of functions such as: GetAnnotationSetByFeature( &quot;agSet12:ag5&quot;, &quot;date&quot;, &quot;1999-07-02&quot; );</Paragraph>
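The indexing idea behind SetFeature and GetAnnotationSetByFeature can be illustrated with a minimal inverted index. This is a sketch under assumed names, not the real AG-API internals:

```python
from collections import defaultdict

class AnnotationGraph:
    """Toy feature store: SetFeature keeps an inverted index so
    GetAnnotationSetByFeature avoids scanning every annotation."""

    def __init__(self):
        self.features = defaultdict(dict)   # annotation id -> {feature: value}
        self._index = defaultdict(set)      # (feature, value) -> annotation ids

    def set_feature(self, ann_id, feature, value):
        old = self.features[ann_id].get(feature)
        if old is not None:
            # Keep the index consistent when a feature value is overwritten.
            self._index[(feature, old)].discard(ann_id)
        self.features[ann_id][feature] = value
        self._index[(feature, value)].add(ann_id)

    def get_annotation_set_by_feature(self, feature, value):
        return set(self._index[(feature, value)])
```

The real implementation also indexes temporal information and graph structure; the same pattern (update the index on every mutation, answer queries from the index) applies there as well.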
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.3 A File I/O Library
</SectionTitle>
      <Paragraph position="0"> A file I/O library (AG-FIO) to support creation and export of AG data has been developed. This will eventually handle all widely used annotation formats. Formats currently supported by the AG-FIO library include the TIMIT, BU, Treebank, AIF (ATLAS Interchange Format), Switchboard and BAS Partitur formats.</Paragraph>
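A library that must handle many annotation formats typically dispatches on the format name. The sketch below shows one plausible shape for such a dispatch layer; the registry, function names, and stub reader are assumptions, not the real AG-FIO code:

```python
# Registry mapping a format name (e.g. "TIMIT", "Treebank") to a reader.
FORMAT_READERS = {}

def register_format(name):
    """Decorator that registers a reader function under a format name."""
    def wrap(fn):
        FORMAT_READERS[name] = fn
        return fn
    return wrap

@register_format("TIMIT")
def read_timit(path):
    # A real reader would parse the file into an annotation graph;
    # this stub only records what it was asked to do.
    return {"format": "TIMIT", "path": path}

def load(path, fmt):
    """Load an annotation file, failing loudly on unsupported formats."""
    try:
        reader = FORMAT_READERS[fmt]
    except KeyError:
        raise ValueError(f"unsupported annotation format: {fmt}")
    return reader(path)
```

New formats are then supported by registering one reader, without touching the tools that call load() — which is what lets the toolkit grow toward "all widely used annotation formats".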
    </Section>
    <Section position="4" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.4 Inter-component Communication
</SectionTitle>
      <Paragraph position="0"> Figure 3 shows the structure of an annotation tool in terms of components and their inter-communications.</Paragraph>
      <Paragraph position="1">  The main program is typically a small script which sets up the widgets and provides callback functions to handle widget events. In this example there are four other components which are reused by several annotation tools. The AG and AG-FIO components have already been described. The waveform display component (of which there may be multiple instances) receives instructions to pan and zoom, to play a segment of audio data, and so on. The transcription editor is an annotation component which is specialized for a particular coding task. Most tool customization is accomplished by substituting for this component.</Paragraph>
      <Paragraph position="2"> Both GUI components and the main program support a common API for transmitting and receiving events. For example, GUI components have a notion of a &quot;current region&quot;: the timespan which is currently in focus. A waveform component can change an annotation component's idea of the current region by sending a SetRegion event (Figure 4). The same event can also be used in the reverse direction. The main program routes the events between GUI components, calling the AG-API to update the internal representation as needed. With this communication mechanism, it is a straightforward task to add new commands, specific to the annotation task.</Paragraph>
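The routing of events such as SetRegion can be sketched as follows. This is an illustrative model, not the toolkit's actual code; the class names and the event vocabulary beyond SetRegion are assumptions:

```python
class Component:
    """Toy GUI component that shares a 'current region' via events."""

    def __init__(self, name):
        self.name = name
        self.region = (0.0, 0.0)   # (start, end) timespan in focus
        self.router = None

    def handle_event(self, event, *args):
        if event == "SetRegion":
            self.region = args     # adopt the new (start, end) timespan

    def send(self, event, *args):
        if self.router:
            self.router.route(self, event, *args)

class MainProgram:
    """The main script: wires components together and routes events."""

    def __init__(self, components):
        self.components = components
        for c in components:
            c.router = self

    def route(self, sender, event, *args):
        # Forward the event to every other component; a real tool would
        # also call the AG-API here to update the internal representation.
        for c in self.components:
            if c is not sender:
                c.handle_event(event, *args)
```

Because routing is centralized in the main program, the same SetRegion event works in both directions (waveform to editor and back), and a new annotation-task-specific command is just a new event name that components agree on.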
    </Section>
    <Section position="5" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.5 Reuse of Software Components
</SectionTitle>
      <Paragraph position="0"> The architecture described in this paper allows rapid development of special-purpose annotation tools using common components. In particular, our model of inter-component communication facilitates reuse of software components. The annotation tools described in the next section are not intended for general purpose annotation/transcription tasks; the goal is not to create an &quot;emacs for linguistic annotation&quot;. Instead, they are special-purpose tools based on the general purpose infrastructure. These GUI components can be modified or replaced when building new special-purpose tools.</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3. GRAPHICAL USER INTERFACES
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 A Spreadsheet Component
</SectionTitle>
      <Paragraph position="0"> The first of the annotation/transcription editor components we describe is a spreadsheet component. In this section, we show two tools that use the spreadsheet component: a dialogue annotation tool and a telephone conversation transcription tool.</Paragraph>
      <Paragraph position="1"> Dialogue annotation consists of assigning a field-structured record to each utterance in each speaker turn. A key challenge is to handle overlapping speaker turns and back-channel cues without disrupting the structure of individual speaker contributions. The tool solves these problems and permits annotations to be aligned to a (multi-channel) recording. The records are displayed in a spreadsheet. Clicking on a row of the spreadsheet causes the corresponding extent of audio signal to be highlighted. As an extended recording is played back, annotated sections are highlighted (both waveform and spreadsheet displays).</Paragraph>
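The key point, that explicit per-record timespans let overlapping turns coexist without disrupting each speaker's own sequence, can be illustrated with a toy record set. The field names and sample utterances below are invented for the sketch, not taken from the tool or the TRAINS/DAMSL corpus:

```python
# Each utterance is a field-structured record with its own timespan.
records = [
    {"speaker": "A", "start": 0.0, "end": 2.1, "text": "utterance one"},
    {"speaker": "B", "start": 1.8, "end": 2.0, "text": "uh-huh"},  # overlapping back-channel
    {"speaker": "A", "start": 2.1, "end": 3.5, "text": "utterance two"},
]

def turns_for(speaker):
    """Each speaker's contributions stay in order even when turns overlap."""
    return [r for r in sorted(records, key=lambda r: r["start"])
            if r["speaker"] == speaker]

def records_in_region(start, end):
    """Records whose timespan overlaps a region; this is the lookup that
    drives highlighting when the user clicks a spreadsheet row."""
    return [r for r in records if r["start"] < end and r["end"] > start]
```

Speaker B's back-channel overlaps speaker A's first turn, yet filtering by speaker still yields an undisrupted sequence for A, and a time-region query finds every record under the playback cursor for highlighting.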
      <Paragraph position="2"> Figure 5 shows the tool with a section of the TRAINS/DAMSL corpus [4]. Figure 6 shows another tool designed for transcribing telephone conversations. This latter tool is a version of the dialogue annotation tool, with the columns changed to accommodate the needed fields: in this case, speaker turns and transcriptions. Both of these tools are for two-channel audio files. The audio channel corresponding to the highlighted annotation in the spreadsheet is also highlighted.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 An Interlinear Transcription Component
</SectionTitle>
      <Paragraph position="0"> Interlinear text is a kind of text in which each word is annotated with phonological, morphological and syntactic information (displayed under the word) and each sentence is annotated with a free translation. Our tool permits interlinear transcription aligned to a primary audio signal, for greater accuracy and accountability.</Paragraph>
      <Paragraph position="1"> Whole words and sub-parts of words can be easily aligned with the audio. Clicking on a piece of the annotation causes the corresponding extent of audio signal to be highlighted. As an extended recording is played back, annotated sections are highlighted (both waveform and interlinear text displays).</Paragraph>
      <Paragraph position="2"> The following screenshot shows the tool with some interlinear text from Mawu (a Manding language of the Ivory Coast, West Africa).</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.3 A Waveform Display Component
</SectionTitle>
      <Paragraph position="0"> The tools described above utilize WaveSurfer and Snack, developed by Kåre Sjölander and Jonas Beskow [7, 8]. WaveSurfer allows developers to specify event callbacks through a plug-in architecture. We have developed a plug-in for WaveSurfer that enables the inter-component communication described in this paper.</Paragraph>
      <Paragraph position="1"> In addition to waveforms, it is also possible to show spectrograms and pitch contours of a speech file if the given annotation task requires phonetic analysis of the speech data.</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4. FUTURE WORK
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.1 More GUI Components
</SectionTitle>
      <Paragraph position="0"> In addition to the software components discussed in this paper, we plan to develop more components to support various annotation tasks. For example, a video component is being developed, and it will have an associated editor for gestural coding. GUI components for Conversation Analysis (CA) [6] and CHAT [5] are also planned.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.2 An Annotation Graph Server
</SectionTitle>
      <Paragraph position="0"> We are presently designing a client-side component which presents the same AG-API to the annotation tool, but translates all calls</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.3 Timeline for Development
</SectionTitle>
      <Paragraph position="0"> A general distribution (Version 1.0) of the tools is planned for early summer 2001. Additional components and various improvements will be added in future releases. Source code will be available through a source code distribution service, SourceForge (http://sourceforge.net/projects/agtk/). The schedule for further updates will be posted on our web site: http://www.ldc.upenn.edu/AG/.</Paragraph>
    </Section>
  </Section>
</Paper>