<?xml version="1.0" standalone="yes"?>
<Paper uid="P03-2002">
  <Title>An Ontology-based Semantic Tagger for IE system</Title>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
PERSON
</SectionTitle>
    <Paragraph position="0"> ...</Paragraph>
    <Paragraph position="1"> 3-O:We get|{z} an overdue boat |{z }, missing boat |{z } on the South Coast of Newfoundland |{z }...</Paragraph>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
STATUS MISSING-VESSEL MISSING-VESSEL LOCATION-TYPE
</SectionTitle>
    <Paragraph position="0"> 4-O:They did a radar search |{z }for us in the area |{z }.</Paragraph>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
DETECTION-MEANS LOCATION
</SectionTitle>
    <Paragraph position="0"/>
  </Section>
  <Section position="7" start_page="0" end_page="0" type="metho">
    <SectionTitle>
STATUS-REQUEST STATUS-REQUEST TASK SAR-AIRCRAFT-TYPE DETECTION-MEANS
</SectionTitle>
    <Paragraph position="0"> extraction. The tag below each bold chunk is a domain-specific information automatically generated by the semantic tagger. Chunks like possibility, go, flowing and first light are annotated by using sense tagging outputs. Whereas chunk such as Mr. Joe Blue, the South coast of Newfoundland and Aurora are annotated by the named concept extraction process.</Paragraph>
    <Paragraph position="1"> (Shriberg, 1994) such as repetitions (13-O: Ha, do, is there, is there ...) , omissions and interruptions (3-O: we've been, actually had a ...). And, there is about 3% of transcription errors such as flowing instead of blowing (11-O Figure 1).</Paragraph>
    <Paragraph position="2"> The underlined words are the relevant informations that will be extracted to fill in the IE templates. They are, for example, the incident, its location, SAR resources needed for the mission, the result of the SAR mission and weather conditions.</Paragraph>
  </Section>
  <Section position="8" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Overall system
</SectionTitle>
    <Paragraph position="0"> The information extraction system is a four stage process (Figure 2). It begins with the extraction of words that could be candidates to the extraction (stage I). Then, the semantic tagger annotates the extracted words (stage II). Next, given the context and the semantic tag a word is extracted or rejected (stage III). Finally, the extracted words are used for the coreference resolution and to fill in IE templates (stage IV). The knowledge sources used for the IE task are the SAR ontology and the Wordsmyth dictionary-thesaurus1.</Paragraph>
    <Paragraph position="1"> In this section we describe the extraction of candidates, the SAR ontology design and the topic segmentation which have already been implemented.</Paragraph>
    <Paragraph position="2"> We leave the description of the topic labeling, the selection of relevant words and the template generation to future work. The semantic tagger, is detailed in section 4.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Extraction of candidates
</SectionTitle>
      <Paragraph position="0"> Candidates considered in the semantic tagging process are noun phrases NP, proposition phrases PP, verb phrases VP, adjectives ADJ and adverbs ADV.</Paragraph>
      <Paragraph position="1"> To gather these candidates we used the Brill transformational tagger (Brill, 1992) for the part-of-speech step and the CASS partial parser for the parsing step (Abney, 1994). However, because of the disfluencies (repairs, substitutions and omissions) encountered in the conversations, many errors occurred when parsing large constructions. So, we reduced the set of grammatical rules used by CASS to cover only minimal chunks and discard large constructions such as VP ! VX NP? ADV* or noun  extraction system. Dashed squares represent processes which are not developed in this paper.</Paragraph>
      <Paragraph position="2"> phrases NP ! NP CONJ NP. The evaluation of the semantic tagging process shows that about 14.4% of the semantic annotation errors are partially due to part-of-speech and parsing errors.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Topic segmentation
</SectionTitle>
      <Paragraph position="0"> Topic segmentation takes part to several stages in our IE system (Figure 2). Dialogue-based IE systems have to deal with scattered information and disfluencies. Question-answer pairs, widely used in dialogues, are examples where information is conveyed through consecutive utterances. By dividing the dialog into topical segment, we want to ensure the extraction of coherent and complete key answers. Besides, topic segmentation is a valuable pre-processing for coreference resolution, which is a difficult task in IE. Hence, for the extraction of relevant candidates and the coreference resolution which is part of the template generation stage (Figure 2), we use topic segment as context instead of the utterance or a word window of arbitrary size.</Paragraph>
      <Paragraph position="1"> The topic segmentation system we developed is based on a multi-knowledge source modeled by a hidden Markov model. (N. Boufaden and al., 2001) showed that by using linguistic features modeled by a Hidden Markov Model, it is possible to detect about 67% of topics boundaries.</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.3 The SAR ontology
</SectionTitle>
      <Paragraph position="0"> The SAR ontology is an important component of our IE system. We build it using domain related informations such as airplane names, locations, organizations, detection means (radar search, diving), status of a SAR mission (completed,continuing, planned), instance of maritime incidents (drifting, overdue) and weather conditions (wind, rain, fog). All these informations were gathered from SAR manuals provided by the National Search and Rescue Secretariat (SARManual, 2000) and from a sample of conversations (10 conversations about 10% of the corpus) to enumerate the different status informations.</Paragraph>
      <Paragraph position="1"> Our ontology was designed for two tasks of the  semantic tagging: 1. Annotate with the corresponding concept all the extracted words that are instances of the ontology. This task is achieved by the named concept extraction process (section 4.1).</Paragraph>
      <Paragraph position="2"> 2. For each word not in the ontology, generate  a concept-based representation composed of similarity scores that provide information about the closeness of the word to the SAR domain.</Paragraph>
      <Paragraph position="3"> This is achieved by the sense tagging process (section 4.2).</Paragraph>
      <Paragraph position="4"> In addition to SAR manuals and corpus, we used the IE templates given by the DREV for the design of the ontology. We used a combination of the top-down and bottom-up design approaches (Fridman and Hafner, 1997). For the former, we used the templates to enumerate the questions to be covered by the ontology and distinguish the major top level classes (Figure 4). For the latter, we collected the named entities along with airplane names, vessel types, detection means, alert types and incidents. The taxonomy is based on two hierarchical relations: the is-a relation and the part-of relation. The is-a relation is used for the semantic tagging. Whereas, the  DEF: 1. to experience a sensation of admiration or amazement (often fol. by at): EXA: She wondered at his bravery in combat.</Paragraph>
      <Paragraph position="5"> SYN: marvel SIM: gape, stare, gawk DEF: 2. to be curious or skeptical about something: EXA: I wonder about his truthfulness.</Paragraph>
      <Paragraph position="6"> SYN: speculate (1) SIM: deliberate, ponder, think, reflect, puzzle, conjecture ...</Paragraph>
      <Paragraph position="7">  describing a STATUS-REQUEST concept (8-O Figure 1). The ENT, SYL, PRO, POS, INF, DEF, EXA, SYN, SIM acronyms are respectively the entry, the syllable, the pronunciation, the part-of-speech, inflexion form, textual definition, example, synonim words and similar words fields. To build the SAR ontology we used the information given in the fields DEF, SYN and SIM. Whereas, to compute the similarity scores we used only the information of the DEF field.</Paragraph>
      <Paragraph position="8"> part-of relation will be used in the template generation process.</Paragraph>
      <Paragraph position="9"> The overall ontology is composed of 31 concepts.</Paragraph>
      <Paragraph position="10"> In the is-a hierarchy, each concept is represented by a set of instances and their textual definitions. For each instance we added a set of synonyms and similar words and their textual definitions to increase the size of the SAR vocabulary which was found to be insufficient to make the sense tagging approach effective. null All the synonyms and similar words along with their definitions are provided by the Wordsmyth dictionary-thesaurus. Figure 3 is an example of Wordsmyth entries. Only textual definitions that fit the SAR context were kept. This procedure increases the ontology size from 480 for a total of 783 instances.</Paragraph>
    </Section>
  </Section>
  <Section position="9" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 Semantic tagging
</SectionTitle>
    <Paragraph position="0"> The purpose of the semantic tagging process is to annotate words with domain-specific informations. In our case, domain-specific informations are the concepts of the SAR ontology. We want to determine the concept Ck which is semantically the most appropriate to annotate a word w. Hence, we look for C which has the highest similarity score for the word w as shown in equation 1.</Paragraph>
    <Paragraph position="2"> Basically, our approach is a two part process (figure 2). The named concept extraction is similar to named entity extraction based on gazetteer (MUC, 1991). However it is a more general task since it also recognizes entities such as, aircraft names, boat names and detection means. It uses a finite state automaton and the SAR ontology to recognize the named concepts.</Paragraph>
    <Paragraph position="3"> The sense tagging process generates a basedconcept representation for each word which couldn't be tagged by the named concept extraction process.</Paragraph>
    <Paragraph position="4"> The concept-based representation is a vector of similarity scores that measures how close is a word to the SAR domain. As we mentioned before (section 1), the concept-based representation using similarity scores is a way to get around the problem of small-scale corpora. Because we assume that the closer a word is to an SAR concept, the more relevant it is, this process is a key element for the selection of relevant words (figure 2). In the next two sections, we detail each component of the semantic tagger.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.1 Named concept extraction
</SectionTitle>
      <Paragraph position="0"> This task, like the named entity extraction task, annotates words that are not instances of the ontology. Basically, for every chunk, we look for the first match with an instance concept. The match is based on the word and its part-of-speech. When a match succeeds, the semantic tag assigned is the concept of the instance matched. The propagation of the semantic tag is done by a two level automaton. The first level propagates the semantic tag of the head to the whole chunk. The second level deals with cases where the first level automaton fails to recognize collocations which are instances of the ontology. null These cases occur when : the syntactic parser fails to produce a correct parse. This mainly happens when the part of speech tag isn't correct because of disfluencies encountered in the utterance or because of transcription errors.</Paragraph>
      <Paragraph position="1"> the grammatical coverage is insufficient to parse large constructions.</Paragraph>
      <Paragraph position="2"> Whenever one of these reasons occur, the second level automaton tries to match chunk collocations instead of individual chunks. For example, the chunk Rescue Coordination Centre which is an organization, is an example where the parser produces two NP chunks (NP1:Rescue Coordinationand NP2:Centre) instead of only one chunk. In this case, the first level automaton fails to recognize the organization. However, in the second level automaton, the collocation NP1 NP2 is considered for matching with an instance of the concept organization. Figure 5 shows two output examples of the named concept extraction.</Paragraph>
      <Paragraph position="3"> Finally, if the automaton fails to tag a chunk, it assigns the tag OTHER if it's an NP, OTHER-PROPERTIES if it's a ADJ or ADV and OTHER-STATUS if it's a VP.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.2 Sense tagging
</SectionTitle>
      <Paragraph position="0"> Sense tagging takes place when a chunk is not an instance of the ontology. In this case, the semantic tagger looks for the most appropriate concept to annotate the chunk (equation 1). However, a first step before annotation is to determine what word sense is intended in conversations. Many studies (Resnik, 1999; Lesk, 1986; Stevenson, 2002) tackle the sense tagging problem with approaches based on similarity measures. Sense tagging is concerned with the selection of the right word sense over all the possible word senses given some context or a particular domain. Our assumption is that when conversations are domain-specific, relevant words are too. It means that sense tagging comes back to the problem of selecting the closer word sense with regard to the SAR ontology. This assumption is translated in equation 2.</Paragraph>
      <Paragraph position="2"> (2) Where Nl is the number of positive similarity scores of the w(l) similarity vector. w(l) is the word w given the word sense l. The closer word sense w is the highest mean computed from element of the w(l) similarity vector.</Paragraph>
      <Paragraph position="3"> In what follows, we explain how are generated the similarity vectors and the result of our experiments.</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.3 Similarity vector representation
</SectionTitle>
      <Paragraph position="0"> A similarity vector is a vector where each element is a similarity score between a word(l) (the word w given the sense word l) and a concept Ck from the SAR ontology. The similarity score is based on the overlap coefficient similarity measure (Manning and Schutze, 2001). This measure counts the number of lemmatized content words in common between the textual definition of the word and the concept. It is defined as :</Paragraph>
      <Paragraph position="2"> where Dw(l) and DCk are the sets of lemmatized content words extracted from the textual definitions  gated to the whole chunk for each concept Ck of the SAR ontology; Ck2fincident,detection-means,status. . .g for each instance Ij of Ck; Ij2fbroken,missing,overdue. . .gfor the concept incident for each synonym Si of Ij; Si2fsmach,crack. . .gfor the instance broken</Paragraph>
      <Paragraph position="4"> of the instance for the concept Ck and M the number of concepts in the ontology.</Paragraph>
      <Paragraph position="5"> of w(l) and Ck. The textual definitions are provided by the Wordsmyth thesaurus-dictionary.</Paragraph>
      <Paragraph position="6"> However, since we have represented each concept by a set of instances and their synonyms in the SAR ontology (section 3.3), we modified the similarity measure to take into account the textual definition of concept instances and their synonyms. Basically, we compute the similarity score between w(l) and each synonym Si of a concept instance Ij. Then, the similarity score between w(l) and the instance concept Ij is the median of the resulting similarity vector representing the similarity scores over all the synonyms. Finally, the similarity score between a concept Ck and w(l) is the highest similarity score over all the concept instances. The algorithm describing these steps is given in Figure 6.</Paragraph>
    </Section>
  </Section>
  <Section position="10" start_page="0" end_page="0" type="metho">
    <SectionTitle>
5 Preliminary results and discussion
</SectionTitle>
    <Paragraph position="0"> The evaluation of the semantic tagging process was done on 521 extracted chunks (about 10 conversations). Only relevant chunks where considered for  Mean sim is the mean of the similarity scores. It is the selection criteria used to choose the closest word sense.</Paragraph>
    <Paragraph position="1"> the evaluation. The evaluation criteria is an assessment about the appropriateness of the selected concept to annotate the word. For example, the concept time is appropriate for the word first light, whereas the concept incident is not for the word detachment which is closer to the search unit concept.</Paragraph>
    <Paragraph position="2"> Table 2 shows the recall and precision scores for each component and for the overall semantic tagger. The third column shows the input error rates for each component. The error rate in the first row comprises  ponents of the semantic tagger error rates of the part-of-speech tagger, the parsing and the manual transcription. The error rate in the second row are mostly part-of-speech errors. In spite of the significant error rate, the approach based on partial parsing is effective. The use of a minimal grammar coverage to produce chunks reduced considerably the parsing error rate.</Paragraph>
    <Paragraph position="3"> As far as we know, no previous published work on domain-specific WSD for speech transcriptions has been presented, although, word sense disambiguation is an active research field as demonstrated by SENSEVAL competitions2. Hence it is difficult to compare our results to similar experiments. However, some comparative studies (Maynard and Ananiadou, 1998; Li Shiuan and Hwee Tou, 1997) on domain-specific well-written texts show results ranging from 51,25% to 73,90%. Given the fact that our corpus is composed of speech transcriptions with the effect of increasing parsing errors, we consider our results to be very encouraging.</Paragraph>
    <Paragraph position="4"> Finally, results reported in Table 2 should be regarded as a basis for further improvement. In particular, the selection criteria in the sense tagging process could be improved by considering other measures than the mean of all similarity scores as shown in equation 2.</Paragraph>
  </Section>
class="xml-element"></Paper>