<?xml version="1.0" standalone="yes"?>
<Paper uid="I05-4005">
  <Title>An Integrated Framework for Archiving, Processing and Developing Learning Materials for an Endangered Aboriginal Language in Taiwan</Title>
  <Section position="3" start_page="32" end_page="37" type="metho">
    <SectionTitle>
3 Integrated Framework
</SectionTitle>
    <Paragraph position="0"> This section describes our design and the theoretical framework behind it.</Paragraph>
    <Paragraph position="1"> The project is divided into four major steps: (1) field recording: recording the oral sound data of the Yami language; (2) archiving: editing the sound data and annotating it with metadata; (3) multimedia transformation: analyzing the original data and creating a multimedia Yami dictionary and text description; and (4) e-Learning: creating online Yami language learning materials.</Paragraph>
    <Paragraph position="2"> The framework is designed to meet two requirements of our Yami language archiving project: (1) to build a complete and original archiving database for the Yami language, including speech of various genres, grammar, vocabulary and cultural artifacts;</Paragraph>
    <Paragraph position="3"> (2) to create learning materials in an easy-to-learn environment delivered via the Internet and computer.</Paragraph>
    <Section position="1" start_page="33" end_page="33" type="sub_section">
      <SectionTitle>
3.1 Field Recording
</SectionTitle>
      <Paragraph position="0"> First, the existing records collected by the research team since 1994 will be organized and digitized, along with new field recordings.</Paragraph>
      <Paragraph position="1"> In our project, we will develop an oral speech archiving database to store these oral recordings.</Paragraph>
      <Paragraph position="2"> Each recording will be scanned to find its basic sound characteristics and converted to digital data. The sound characteristics are used for comparing and tracking these recordings. Following a study by Chen (1996) about tone and stress patterns in Asian languages, we will extract information on intonation and stress from the field recordings. This information will later be used to create the learning material. The field recordings are arranged by segments, ranging from words in isolation to &quot;idea units&quot; or &quot;tone units&quot; (Chafe 1979) in continuous speech.</Paragraph>
      <Paragraph position="3"> Once a segment of the field recording has been completed, the original data is stored in the computer and two different types of digital data are created: MP3 data, which will be used for creating the learning materials, and annotated digital data, in which the recordings are separated into phrases with Chinese and English translations. All these data are stored in a relational database with the recording date used as the search key.</Paragraph>
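      <Paragraph> The storage scheme just described can be sketched as a small relational schema. The following is a minimal illustration only, assuming SQLite and hypothetical table and column names (recording, phrase, recorded_on, mp3_path); the project's actual schema is not given here.</Paragraph>
      <Paragraph><![CDATA[
```python
import sqlite3


def build_archive(conn):
    """Create two hypothetical tables: one row per recording, one row per phrase."""
    conn.execute(
        "CREATE TABLE recording ("
        " id INTEGER PRIMARY KEY,"
        " recorded_on TEXT NOT NULL,"  # ISO date string, used as the search key
        " mp3_path TEXT NOT NULL)"     # location of the MP3 copy
    )
    conn.execute(
        "CREATE TABLE phrase ("
        " recording_id INTEGER NOT NULL REFERENCES recording(id),"
        " seq INTEGER NOT NULL,"       # order of the phrase within the recording
        " yami TEXT, chinese TEXT, english TEXT)"
    )
    # Index the recording date because lookups are keyed on it.
    conn.execute("CREATE INDEX idx_recorded_on ON recording(recorded_on)")


def add_recording(conn, recorded_on, mp3_path, phrases):
    """Store one recording and its (yami, chinese, english) phrase triples."""
    cur = conn.execute(
        "INSERT INTO recording (recorded_on, mp3_path) VALUES (?, ?)",
        (recorded_on, mp3_path),
    )
    rec_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO phrase VALUES (?, ?, ?, ?, ?)",
        [(rec_id, i, y, c, e) for i, (y, c, e) in enumerate(phrases)],
    )
    return rec_id


def find_by_date(conn, recorded_on):
    """Return all annotated phrases recorded on the given date, in order."""
    rows = conn.execute(
        "SELECT r.mp3_path, p.seq, p.yami, p.chinese, p.english"
        " FROM recording r JOIN phrase p ON p.recording_id = r.id"
        " WHERE r.recorded_on = ? ORDER BY r.id, p.seq",
        (recorded_on,),
    )
    return rows.fetchall()
```
]]></Paragraph>
      <Paragraph> Keeping phrases in a separate table lets one recording carry any number of translated segments, while the index on the date column supports the date-keyed lookup.</Paragraph>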
      <Paragraph position="4"> The processing of field recordings is considered to be the preparation and preprocessing stage of the Yami language documentation project. The voice database is used to create the archived data and learning materials.</Paragraph>
    </Section>
    <Section position="2" start_page="33" end_page="34" type="sub_section">
      <SectionTitle>
3.2 Archiving
</SectionTitle>
      <Paragraph position="0"> The archiving step begins with editing the voice database and constructing the OLAC metadata for each entity in the voice database.</Paragraph>
      <Paragraph position="1"> The original sound tracks in the field recording database are edited to improve clarity of the sound by using sampling techniques (Kientzle 1998). The edited sounds are stored as the new sound records in the voice database.</Paragraph>
      <Paragraph position="2"> The metadata used for describing the Yami language is the OLAC metadata, an extended Dublin Core set with basic elements for language resources. To meet the requirements of the linguistic community, certain new extension elements are added to the OLAC set following DCMI guidelines (DCMI 2000). To build a proper OLAC metadata set for the Yami language, we have chosen to adopt the OLAC set proposed by Bird and Simons (Bird et al. 2001, Bird &amp; Simons 2003) for this project. Because Yami is primarily an oral language, we use a subset of this OLAC set. The OLAC elements used in this project are: {Title, Creator, Subject, Subject language, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Rights}. These elements were selected to create a common description of the Yami language. Furthermore, our review of the field study materials indicates that this OLAC subset meets the basic requirements for describing the Yami language. The rules for applying these OLAC elements to each recording of the Yami language are: (1) Each OLAC element can be optional and repeatable; (2) Each OLAC element can describe only one single identification or one single range; (3) The data format of each OLAC element follows the rules in DCMI (DCMI 2002).</Paragraph>
      <Paragraph position="3"> Each OLAC element used in describing the Yami language is assigned following the OLAC and ELDP guidelines. For a given Yami language sound track, the OLAC element set is as follows: Title: the Chinese name of the Yami language sound track. A second Title element is used to store the English translation.</Paragraph>
      <Paragraph position="4"> Creator: the Yami speaker who uttered this speech. A second Creator element is used to store his/her Chinese name.</Paragraph>
      <Paragraph position="5"> Subject: the keyword used to classify the content of the Yami language sound track. The keywords and controlled vocabularies are being collected.</Paragraph>
      <Paragraph position="6"> Subject language: the Chinese linguistic description of the Yami language. A second element is the corresponding English description.</Paragraph>
      <Paragraph position="7"> Description: the usage and the multimedia data related to this Yami language sound track.</Paragraph>
      <Paragraph position="8"> Some multimedia data are collected using the Multimedia Transformation step described in Section 3.3.</Paragraph>
      <Paragraph position="9"> Publisher: the research teams and the sponsoring institutions.</Paragraph>
      <Paragraph position="10"> Contributor: the research teams and the person who recorded this sound track.</Paragraph>
      <Paragraph position="11"> Date: the date this sound track was recorded and the date the archiving process was completed. Type: the genre of the content of the Yami language sound track. We are transferring many Yami language linguistic and anthropological terms into DC-type. These DC-type terms will be used as the Type element.</Paragraph>
      <Paragraph position="12"> Format: the digital data type of the Yami language sound track.</Paragraph>
      <Paragraph position="13"> Identifier: the ELDP identifier for this Yami language sound track. We will follow ELDP guidelines to create identifiers for the archived sound track.</Paragraph>
      <Paragraph position="14"> Source: the location of the archiving database and the location for storing the field study draft.</Paragraph>
      <Paragraph position="15"> Language: English and Chinese (traditional and simplified characters). Relation: the related Yami language sound tracks.</Paragraph>
      <Paragraph position="16"> Rights: copyright information of this sound track.</Paragraph>
      <Paragraph position="17"> In the archiving step we will also consider how to build a database of the controlled vocabularies for the Yami language. We will use three sources for the controlled vocabulary in this project: lexicon, primary text and language description.</Paragraph>
      <Paragraph position="18"> The table of OLAC metadata is created in two forms: an XML text table and a relational table. The voice database from the first step is edited and connected to the metadata table.</Paragraph>
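      <Paragraph> As an illustration of the two forms, the sketch below serializes one metadata record both as XML text and as relational rows, and checks that only elements from the chosen subset are used. It is a simplified sketch under stated assumptions: the tag names, the record wrapper element and the row layout are hypothetical and do not follow the official OLAC XML schema or its namespaces.</Paragraph>
      <Paragraph><![CDATA[
```python
import xml.etree.ElementTree as ET

# The OLAC subset named in this section; every element is optional and repeatable.
ALLOWED = (
    "Title", "Creator", "Subject", "Subject language", "Description",
    "Publisher", "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Rights",
)


def check_record(pairs):
    """Reject element names outside the subset; repeats and omissions are fine."""
    for name, _value in pairs:
        if name not in ALLOWED:
            raise ValueError("unknown element: " + name)


def to_xml(record_id, pairs):
    """XML form: one child per (element, value) pair, in the order given."""
    check_record(pairs)
    root = ET.Element("record", {"id": record_id})  # hypothetical wrapper element
    for name, value in pairs:
        # Hypothetical tag naming: lower case, spaces replaced by dots.
        child = ET.SubElement(root, name.lower().replace(" ", "."))
        child.text = value
    return ET.tostring(root, encoding="unicode")


def to_rows(record_id, pairs):
    """Relational form: one (record_id, position, element, value) row per pair."""
    check_record(pairs)
    return [(record_id, i, name, value) for i, (name, value) in enumerate(pairs)]
```
]]></Paragraph>
      <Paragraph> Representing a record as an ordered list of pairs, rather than a dictionary, is what allows an element such as Title to appear twice (for example, once for the Chinese name and once for the English translation).</Paragraph>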
      <Paragraph position="19"> Another goal of this step is to build a Yami language online phrase dictionary. The OLAC metadata are parsed and edited together with the voice database to create this dictionary. We will develop an automatic dictionary-generating program that can process the OLAC metadata and find suitable terms. In addition, we use the grammar and course materials of the Yami language multimedia courseware created by Rau et al. (2005) to build our online multimedia Yami language phrase dictionary.</Paragraph>
      <Paragraph position="20"> When the metadata of a set of the Yami language sound tracks are completed, the results will be published online on our web site. This year, our focus is aligning the OLAC metadata of the Yami language sound tracks with the multimedia courseware by Rau et al. (2005). Later, we will try to use ontology to determine rules for creating metadata automatically and to develop an automatic metadata generator for the Yami language.</Paragraph>
    </Section>
    <Section position="3" start_page="34" end_page="35" type="sub_section">
      <SectionTitle>
3.3 Multimedia Transformation
</SectionTitle>
      <Paragraph position="0"> The Yami language is basically a spoken language, although an orthography is being developed and standardized as texts are collected. To preserve the Yami language, we will use an image database to annotate the language. In addition, each word in Yami is annotated with its orthography stored in a sound database. The purpose of this transformation is to build an image for each Yami word. Therefore, the meaning of the word can be related directly to a picture.</Paragraph>
      <Paragraph position="1"> The reasons why we have chosen to use this approach to annotate the Yami language are as follows:  (1) The Yami language, like all other languages, has culture-specific words and expressions, of which pictures are direct representations.</Paragraph>
      <Paragraph position="2"> (2) The annotated pictures help learners understand the traditional lifestyle on Orchid Island and give them more incentive to learn the language.</Paragraph>
      <Paragraph position="3"> (3) The pictures include many Yami cultural  artifacts. The annotated pictures can thus preserve descriptions of their cultural heritage.</Paragraph>
      <Paragraph position="4"> The steps for multimedia transformation of the Yami language are as follows:  (1) Collect suitable images for building the annotated image database. We will consult many other research teams to borrow Yami images and video recordings.</Paragraph>
      <Paragraph position="5"> (2) Design criteria to choose the images. We will select appropriate images and develop possible connections between Yami expressions and a set of pictures.</Paragraph>
      <Paragraph position="6"> (3) Build a special annotated database and use the Yami language to annotate the image data. The annotation algorithms are based on fuzzy logic (Kecman 2001) or the Coherent Language Model (Jin 2004).</Paragraph>
      <Paragraph position="7"> (4) Build a corresponding mapping relation between a Yami expression and a set of annotated images. The mapping relations are a set of contexts and symbolic tables similar to a set of induction rules.</Paragraph>
      <Paragraph position="8"> (5) Build a sound connection between each Yami word and its phonetic symbols by  using the fuzzy logic learning algorithm. The results of multimedia transformation can be used as a foundation for creating online learning material. The results are stored in a relational multimedia database as well as the XML pages.</Paragraph>
    </Section>
    <Section position="4" start_page="35" end_page="37" type="sub_section">
      <SectionTitle>
3.4 e-Learning
</SectionTitle>
      <Paragraph position="0"> The final task of our project is to find an effective way to teach the Yami language to urban Yami youngsters and other learners of Yami as a second language. To build an open, self-learning environment, we have chosen computer-based and web-based learning. There have been various discussions about how to use information technologies and the web to learn another language. Gerbault (2002) showed that it is viable to set up a proper multimedia environment for learning a language without a teacher's participation. Fujii et al. (2000) demonstrated a project using the Internet as a tool for the teacher to post course materials and create an online learning environment. In addition, Lamb (2005) suggested rethinking pedagogical models for e-learning in terms of the what, the why and the how. E-learning consists of self-access, reference sources, discussion forums, and virtual learning classrooms. The main motives for introducing e-learning include improving students' multimedia learning experience, enhancing learner autonomy and widening participation.</Paragraph>
      <Paragraph position="1"> Finally, e-learning can be controlled primarily by tutors or students, depending on objectives, contents, learning tasks, length/time/place of study, or choice of assessment activities.</Paragraph>
      <Paragraph position="2"> As mentioned in a study by Leung (2003), the computer-based learning environment is very important as a way to help students learn effectively. In order to provide an effective learning environment, Leung (2003) suggested that four contextual issues should be considered in design and implementation of computer-based learning. These issues are topic selection, authenticity, complexity, and multiple perspectives. The design of the web-based computer-assisted learning program for the Yami language takes these four issues into consideration. We outline our design as follows.</Paragraph>
      <Paragraph position="3"> The learning environment in this project is a virtual classroom without teacher participation.</Paragraph>
      <Paragraph position="4"> Students can select the Yami language learning materials prepared by the second author. If a student asks for clues or an explanation of a specific Yami word or expression, a suitable image or video clip is retrieved from the multimedia database. If a student is not familiar with a specific Yami sound, a similar phonetic symbol is provided. The learning materials are arranged in three different settings: a scenario setting, an easy-to-difficult condition setting and a learner's choice setting. The scenario setting uses related scenes in Yami society, such as the flying fish festival, as the main theme of the learning materials. The easy-to-difficult condition setting allows the learner to select different levels of the Yami language materials. The levels are based on word frequencies and complexity of grammar. In the learner's choice setting, the learner can arrange his/her own learning materials. The learning system will give detailed guidelines to explain how to choose the learning materials. If a student wants to learn the Yami language, he/she can choose different learning materials based on his/her interest. The learning materials are designed as theme units with exercises and rubrics for self-assessment. The design of these Yami language exercises is based on a study about the reactions of students to using a web-based system for learning Chinese in Taiwan (Yang, 2001).</Paragraph>
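      <Paragraph> One way the easy-to-difficult ordering could be derived from word frequencies is sketched below. This is an assumption-laden illustration, not the project's method: the scoring formula (mean inverse corpus frequency) and the function names are hypothetical, and grammatical complexity is left out entirely.</Paragraph>
      <Paragraph><![CDATA[
```python
from collections import Counter


def word_frequencies(units):
    """Count how often each word occurs across all learning units."""
    counts = Counter()
    for words in units.values():
        counts.update(words)
    return counts


def difficulty(words, counts):
    """Hypothetical score: mean of 1/frequency, so rarer words score higher."""
    if not words:
        return 0.0
    return sum(1.0 / counts[w] for w in words) / len(words)


def easy_to_difficult(units):
    """Return unit names ordered from easiest to hardest (ties broken by name)."""
    counts = word_frequencies(units)
    return sorted(units, key=lambda name: (difficulty(units[name], counts), name))
```
]]></Paragraph>
      <Paragraph> Here each unit is a list of word tokens keyed by a unit name; a fuller version would combine this score with a measure of grammatical complexity, as the text indicates.</Paragraph>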
      <Paragraph position="5"> We use the annotated image database as a tool to help the learners understand the meaning of Yami words or expressions. To make the pictorial explanation more understandable, an animation clip combined with several images is created to explain them.</Paragraph>
      <Paragraph position="6"> A study by Aist (2002) showed that different designs of oral-reading interactions can help students understand the language better. The learning system will provide several reading modes for students to listen to and practice with. These modes are: to read the entire sentence without interruption, to read the entire sentence by isolating each word, to read a word slowly syllable by syllable, to recue the whole sentence, and to recue the selected words.</Paragraph>
      <Paragraph position="7"> The interface of the proposed learning environment is built on a web server with dynamic web pages. To make the learning environment more efficient, all the learning materials are edited into reusable learning objects. The user interface is adaptive, following the PARLING system of Mich et al. (2004). The proposed framework is illustrated in Figure 1. We implement the proposed framework as a hybrid system with several processes, including: (1) Data collection and formulation: to collect the original Yami language data and to build the metadata and the table for digital archiving.</Paragraph>
      <Paragraph position="8"> (2) System design and analysis: to design and develop suitable computer systems and servers to accommodate the proposed framework.</Paragraph>
      <Paragraph position="9"> (3) Research and construction of proposed framework: to develop each subsystem or database shown in Figure 1, such as the OLAC metadata database, the annotated image database and the Yami language learning materials.</Paragraph>
      <Paragraph position="10"> (4) Assessment and evaluation: to test the effectiveness of the proposed learning materials and to evaluate whether the project goals were accomplished.</Paragraph>
      <Paragraph position="11"> Currently, we are collecting the Yami language materials and building the system server for the proposed framework. We will use a SQL server as the main server to manage the workflow and the documentation logs. A PHP web server with a MySQL server is used for multimedia transformation. Another SQL server is used as the archiving server. The system diagram of the proposed framework is shown in Figure 2.</Paragraph>
      <Paragraph position="12"> Figure 2: Diagram for the proposed framework</Paragraph>
    </Section>
  </Section>
</Paper>