<?xml version="1.0" standalone="yes"?>
<Paper uid="I05-2036">
  <Title>Svetlana.Hensman@comp.dit.ie</Title>
  <Section position="3" start_page="0" end_page="209" type="metho">
    <SectionTitle>
2 System overview
</SectionTitle>
    <Paragraph position="0"> We use a two-step approach for constructing conceptual graph representations of texts: firstly, by using VerbNet and WordNet, we identify the semantic roles in a sentence, and secondly, using these semantic roles and a set of syntactic/semantic rules we construct a conceptual graph.</Paragraph>
    <Paragraph position="1"> To evaluate our algorithms we use test documents from two corpora in different domains: the Reuters-21578 text categorization test collection (Reuters, 1987) and the collection of aviation incident reports provided by the Irish Air Accident Investigation Unit (AAIU) (Air Accident Investigation Unit, 2004). All documents are parsed using Eugene Charniak's maximum-entropy-inspired parser (Charniak, 2000).</Paragraph>
  </Section>
  <Section position="4" start_page="209" end_page="211" type="metho">
    <SectionTitle>
3 Semantic role identification
</SectionTitle>
    <Paragraph position="0"> There are a number of existing approaches for identifying semantic roles. They range from traditional parsing approaches, for example using HPSG grammars and Lexical Functional Grammars, which rely strongly on manually developed grammars and lexicons, to data-driven approaches, for example AutoSlog (Riloff and Schmelzenbach, 1998). In the domain of the Air Traveler Information System (Miller et al., 1996) the authors apply statistical methods to compute the probability that a constituent can fill a semantic slot within a semantic frame. Gildea and Jurafsky (Gildea and Jurafsky, 2002) describe a statistical approach for semantic role labelling using data collected from FrameNet by analysing a number of features such as phrase type, grammatical function, position in the sentence, etc.</Paragraph>
    <Paragraph position="1"> Shi and Mihalcea (Shi and Mihalcea, 2004) propose a rule-based approach for semantic parsing using FrameNet and WordNet. They extract rules from the tagged data provided by FrameNet, which specify the realisation (order and different syntactic features) for the present semantic roles.</Paragraph>
    <Paragraph position="2"> They also create a feature set representation of the sentence and match it to each of the extracted rules. The result is the rule providing the most feature matches. The authors do not provide any information on how they select between different matches with the same score, or if there is any semantic check on suitability of a phrase to realise a semantic role (FrameNet does not provide any restrictions on the semantic roles similar to the selectional restrictions present in VerbNet).</Paragraph>
    <Paragraph position="3"> The approach we propose for semantic role identification uses information about each verb's behaviour, provided in VerbNet, and the WordNet taxonomy to decide whether a phrase can be a suitable match for a semantic role.</Paragraph>
    <Paragraph position="4"> VerbNet (Kipper et al., 2000) is a computational verb lexicon, based on Levin's verb classes, that contains syntactic and semantic information for English verbs. Each VerbNet class defines a list of members, a list of possible thematic roles, and a list of frames (patterns) of how these semantic roles can be realized in a sentence.</Paragraph>
    <Paragraph position="5"> WordNet (Fellbaum, 1998) is an English lexical database containing about 120 000 entries of nouns, verbs, adjectives and adverbs, hierarchically organized in synonym groups (called synsets), and linked with relations such as hypernym, hyponym, holonym and others.</Paragraph>
    <Paragraph position="6"> To identify the semantic roles for a clause in a sentence we identify and match the clause pattern to each of the possible semantic frames for the clause verb (from VerbNet). The result is a list of all possible semantic role assignments, from which we must identify the correct one.</Paragraph>
    <Section position="1" start_page="209" end_page="210" type="sub_section">
      <SectionTitle>
3.1 Constructing sentence patterns for the verbs in a sentence
</SectionTitle>
      <Paragraph position="0"> For each sentence clause we construct a syntactic pattern, which is a flat parse representation that identifies the main verb and the other main categories of the clause. As a sentence can have subordinate clauses, we usually have more than one syntactic pattern per sentence. Each such pattern is processed individually.</Paragraph>
      <Paragraph position="1"> Using a constituency parser (such as Charniak's) is suitable in the majority of cases, but there are some sentences where the correct set of role fillers cannot be identified from the parse tree. For example, for sentences such as "The price of oil will rise by 5 cents by the end of the year."</Paragraph>
      <Paragraph position="2"> the phrase "the price of oil" will be identified as a possible role filler by our system, while the correct result would have "the price" identified as the Attribute and "oil" as the Patient. For such cases the use of a dependency parser (such as a Link Grammar parser or a Functional Dependency Grammar parser) would be required.</Paragraph>
      <Paragraph position="3"> We also address some simple cases of pronominal anaphoric reference. For example, for patterns such as "Iomega Corp said it has laid off over a quarter of its professional and management staff."</Paragraph>
      <Paragraph position="4"> we identify the pronoun "it" as referring to the subject of the verb in the main clause (which here is "Iomega Corp") if they agree in gender and number. In cases where the type of the concept represented by the phrase is known, agreement in type is also required.</Paragraph>
      <Paragraph position="5"> Some cases of intersentential pronominal anaphoric reference are also resolved by analysing the previous sentence context for suitable candidates that agree in gender, number and type. Agreement in type holds if the type of the phrase is compatible with (or the same as) the type of the phrase it references. For example, if "the company" refers to "Iomega Corp", which is listed as an instance of the type organization, then the types of the two phrases are compatible, as company is defined as a sub-type of organization. If agreement in type cannot be assured, the reference is not resolved. The reference is only resolved if there is a single possibility for its resolution.</Paragraph>
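      <Paragraph> The type-agreement test described above can be sketched in Python as follows. This is a minimal illustration, not the system's implementation: the tables SUBTYPE_OF and INSTANCE_OF and all function names are hypothetical, and treating compatibility as symmetric is an assumption. Only the company/organization example comes from the text.
```python
# Hypothetical toy type hierarchy: child type -> parent type.
SUBTYPE_OF = {"company": "organization", "organization": "entity", "person": "entity"}
# Hypothetical instance table: named entity -> its type.
INSTANCE_OF = {"Iomega Corp": "organization", "Mary": "person"}


def is_subtype(candidate, ancestor):
    """Return True if candidate equals ancestor or lies below it in the hierarchy."""
    current = candidate
    while current is not None:
        if current == ancestor:
            return True
        current = SUBTYPE_OF.get(current)
    return False


def types_compatible(type_a, type_b):
    """Assumption: two types agree if they are equal or one is a sub-type of the other."""
    return is_subtype(type_a, type_b) or is_subtype(type_b, type_a)


def resolve_reference(referring_type, candidates):
    """Resolve a reference only if exactly one candidate agrees by type.

    Gender and number agreement are assumed to have been checked already.
    """
    matches = [c for c in candidates
               if c in INSTANCE_OF and types_compatible(referring_type, INSTANCE_OF[c])]
    return matches[0] if len(matches) == 1 else None


print(resolve_reference("company", ["Iomega Corp", "Mary"]))  # Iomega Corp
```
Returning None when zero or several candidates remain mirrors the rule that a reference is resolved only if there is a single possibility.</Paragraph>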
    </Section>
    <Section position="2" start_page="210" end_page="210" type="sub_section">
      <SectionTitle>
3.2 Extracting VerbNet semantic role frames
</SectionTitle>
      <Paragraph position="0"> Each verb can be described in VerbNet as a member of more than one class, and therefore the list of its possible semantic frames is a combination of the semantic frames defined in each of the classes in which it participates.</Paragraph>
      <Paragraph position="1"> We extract all the semantic frames in a class and consider them to be possible semantic frames for each of the verbs that are members of this class. Each verb class also defines a list of selectional constraints for the semantic roles. For example, for all the verbs that are members of the VerbNet class get-13.5.1 one of the possible semantic role frames is:</Paragraph>
      <Paragraph position="3"> The selectional constraints check is implemented using one or a combination of the following techniques: hypernym relations defined in WordNet, pattern matching techniques, syntactic rules and some heuristics.</Paragraph>
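      <Paragraph> The hypernym-based part of the selectional constraint check can be sketched as follows. This is only an illustration under stated assumptions: the HYPERNYM table is a hand-made stand-in for WordNet, and the restriction names are illustrative rather than VerbNet's actual inventory. Pattern matching, syntactic rules and heuristics are not modelled.
```python
# Toy hypernym table standing in for WordNet (hypothetical data): word -> hypernym.
HYPERNYM = {
    "airline": "company",
    "company": "organization",
    "pilot": "person",
    "person": "animate",
}


def hypernym_chain(word):
    """Yield the word itself followed by its successive hypernyms."""
    current = word
    while current is not None:
        yield current
        current = HYPERNYM.get(current)


def satisfies(head_noun, restrictions):
    """A head noun satisfies a role if one of the allowed restriction types
    appears on its hypernym chain; an empty list means the role is unrestricted."""
    if not restrictions:
        return True
    chain = set(hypernym_chain(head_noun))
    return any(r in chain for r in restrictions)


print(satisfies("airline", ["animate", "organization"]))  # True
print(satisfies("airline", ["animate"]))                  # False
```
</Paragraph>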
    </Section>
    <Section position="3" start_page="210" end_page="211" type="sub_section">
      <SectionTitle>
3.3 Matching algorithm
</SectionTitle>
      <Paragraph position="0"> The matching algorithm matches the sentence pattern against each of the possible semantic role frames extracted from VerbNet. We match the constituents before and after the verb in the sentence pattern to the semantic roles before and after the verb in the semantic role frame.</Paragraph>
      <Paragraph position="1"> If the number of the available constituents in the sentence pattern is less than the number of the required slots in the frame, the match fails.</Paragraph>
      <Paragraph position="2"> If there is more than one constituent available to fill a slot in a semantic frame, each of them is considered a different match. If, for a semantic frame, we find a constituent for each of the semantic role slots that complies with the selectional constraints, the algorithm considers this a possible match.</Paragraph>
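      <Paragraph> One way to enumerate the candidate matches described above is sketched below. It assumes that constituents fill slots in their original order on each side of the verb, which the text does not state explicitly; the function names, the accepts callback (a placeholder for the selectional constraint check) and the example frame are all hypothetical.
```python
from itertools import combinations, product


def side_assignments(constituents, roles):
    """All order-preserving ways to fill the role slots on one side of the verb.

    Returns an empty list when there are fewer constituents than slots.
    """
    if len(roles) > len(constituents):
        return []
    return [list(zip(roles, chosen))
            for chosen in combinations(constituents, len(roles))]


def match_frame(pattern, frame, accepts=lambda role, phrase: True):
    """Match a (before, after) sentence pattern against a (before, after) frame.

    Every combination of fillers that passes the constraint check is
    returned as a separate possible match.
    """
    before = side_assignments(pattern[0], frame[0])
    after = side_assignments(pattern[1], frame[1])
    matches = []
    for left, right in product(before, after):
        assignment = left + right
        if all(accepts(role, phrase) for role, phrase in assignment):
            matches.append(dict(assignment))
    return matches


# Illustrative pattern and frame; the frame is not taken from VerbNet.
pattern = (["USAir"], ["Piedmont", "for 69 dlrs cash per share"])
frame = (["Agent"], ["Theme"])
for m in match_frame(pattern, frame):
    print(m)
```
With two constituents available for a single slot, the sketch yields two separate matches, which is the situation the weighting function is later used to disambiguate.</Paragraph>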
      <Paragraph position="3"> Multiple results are identified when there are two or more phrases in a sentence that are possible semantic role realisations, or if there are two or more semantic frames for which matches were found. To select the correct role assignment we use a weighting function that assigns scores to each result and returns the one with the highest score. For each identified role the weighting function adds one point if the role does not have any selectional restrictions, and two points if there are restrictions (including prepositional restrictions).</Paragraph>
      <Paragraph position="4"> The total score for a solution is the sum of the scores for each identified role. The solution with the highest score is selected.</Paragraph>
      <Paragraph position="5"> For example, for the sentence USAir bought Piedmont for 69 dlrs cash per share.</Paragraph>
      <Paragraph position="6"> the algorithm identifies two possible role assignments, of which the first receives the higher score. Therefore, the algorithm returns the first set of role assignments as the result.</Paragraph>
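      <Paragraph> The weighting function can be sketched as follows. The candidate assignments below, including which roles are treated as restricted, are invented for illustration and are not the output reported for the example sentence. How ties are broken is not stated in the text, so the sketch simply keeps the first highest-scoring candidate.
```python
def score(assignment, restricted_roles):
    """One point per identified role, two if the role carries restrictions."""
    return sum(2 if role in restricted_roles else 1 for role in assignment)


def best_assignment(candidates):
    """Return the assignment of the (assignment, restricted_roles) pair
    with the highest score; max() keeps the first one on a tie."""
    return max(candidates, key=lambda c: score(c[0], c[1]))[0]


# Hypothetical candidates for illustration only.
candidates = [
    ({"Agent": "USAir", "Theme": "Piedmont", "Asset": "69 dlrs cash per share"},
     {"Agent", "Asset"}),
    ({"Agent": "USAir", "Theme": "Piedmont"}, {"Agent"}),
]
print(score(*candidates[0]))  # 5
print(score(*candidates[1]))  # 3
```
</Paragraph>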
    </Section>
  </Section>
  <Section position="5" start_page="211" end_page="211" type="metho">
    <SectionTitle>
4 Building conceptual graphs
</SectionTitle>
    <Paragraph position="0"> The conceptual graph representation of the sentence is built through the following steps: firstly, for each of the constituents of the sentence we recursively build a conceptual graph representation; then we link all the conceptual graphs representing the constituents into a single graph; and finally, we resolve the unknown (generic) relations.</Paragraph>
    <Paragraph position="1"> Each of these steps is described in more detail in the following sub-sections.</Paragraph>
    <Section position="1" start_page="211" end_page="211" type="sub_section">
      <SectionTitle>
4.1 Building a conceptual graph representation of a phrase
</SectionTitle>
      <Paragraph position="0"> The first step involves building a conceptual graph for a phrase. Our general assumption is that each lexeme in the sentence is represented using a separate concept; therefore, all nouns, adjectives, adverbs and pronouns are represented using concepts, while the determiners and numbers are used to specify the referent of the relevant concept (thus further specifying the concept).</Paragraph>
      <Paragraph position="1"> Below we illustrate the procedure for building a conceptual graph for some of the most common types of phrases.</Paragraph>
      <Paragraph position="3"> For phrases following this pattern we create two concepts - one for the NN with a concept referent corresponding to the type of the determiner DT, and another concept representing the adjective, and link both of them by an Attribute relation. If the phrase contains more than one adjective, each of them is represented by a separate concept and they are all linked with Attribute relations to the concept representing the noun.</Paragraph>
      <Paragraph position="5"> This pattern represents phrases where the noun is further specified by the SBAR (for example, "The co-pilot, who was acting as a main pilot, landed the plane."). For these patterns a conceptual graph is built for the SBAR, and its head concept, if it is a WHNP phrase (e.g. which or who), is replaced by the concept created for the NP.</Paragraph>
      <Paragraph position="7"> For such prepositional phrases we construct a conceptual graph representing the noun phrase. We also keep track of the preposition heading the prepositional phrase, as it is used to mark the relation between this phrase and the other relevant phrases in the sentence.</Paragraph>
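      <Paragraph> The case of a noun phrase made up of a determiner, one or more adjectives and a noun, as described above, can be sketched as follows. The data layout (dicts for concepts, triples for relations), the function name and the mapping from determiners to referent markers are assumptions made for illustration.
```python
def build_np_graph(tagged_tokens):
    """Build a tiny conceptual graph from (word, tag) pairs of a noun phrase.

    Returns (concepts, relations). Each concept is a dict with a type and a
    referent; each relation is a (label, source_index, target_index) triple.
    """
    determiner = next((w for w, t in tagged_tokens if t == "DT"), None)
    noun = next(w for w, t in tagged_tokens if t == "NN")
    # Assumed mapping from determiner to referent marker.
    referent = {"the": "#", "a": "*", "an": "*"}.get(determiner, "*")
    concepts = [{"type": noun, "referent": referent}]
    relations = []
    for word, tag in tagged_tokens:
        if tag == "JJ":
            # Each adjective becomes its own concept linked to the noun.
            concepts.append({"type": word, "referent": "*"})
            relations.append(("Attribute", 0, len(concepts) - 1))
    return concepts, relations


concepts, relations = build_np_graph(
    [("the", "DT"), ("large", "JJ"), ("red", "JJ"), ("plane", "NN")])
print(concepts)
print(relations)
```
</Paragraph>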
    </Section>
    <Section position="2" start_page="211" end_page="211" type="sub_section">
      <SectionTitle>
4.2 Attaching all constituents to the verb
</SectionTitle>
      <Paragraph position="0"> Once the graphs for each of the constituents are constructed they are linked together in a single conceptual graph. As each of them describes some aspect of the concept represented by the verb, we link them to that concept.</Paragraph>
      <Paragraph position="1"> If a semantic role was already identified for the constituent during the previous phase, that role is used as the relation between the CG representing the constituent and the verb. If the constituent does not have any semantic role identified, a relation with a generic label is used, which allows us to build the structure of the CG concentrating on the concepts involved, and to resolve the generic labels at a later stage. The generic labels used are either REL or, in the case of prepositional phrases headed by a preposition prep, REL prep (e.g. REL on).</Paragraph>
    </Section>
    <Section position="3" start_page="211" end_page="211" type="sub_section">
      <SectionTitle>
4.3 Resolving unknown relations
</SectionTitle>
      <Paragraph position="0"> Finally we resolve some of the unknown (generic) relations in the conceptual graph. We keep a database of the most common syntactic realisations of relations between concepts with specific types.</Paragraph>
      <Paragraph position="1"> An example of a relation correction rule is: Flight REL from City -&gt; Flight Source City where the left part of the rule represents the two concepts linked by a generic relation and the right side represents the graph after the modification.</Paragraph>
      <Paragraph position="2"> All generic relations present after this step must be manually resolved by the user. The system offers help by suggesting possible relations introduced by a preposition. For example, the preposition for can indicate Beneficiary (e.g. a book for Mary), Duration (e.g. for three hours), etc.</Paragraph>
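      <Paragraph> Generic labels, correction rules and preposition-based suggestions can be sketched together as follows. The table layout and function names are hypothetical; the single rule and the suggestions for the preposition for are the ones given in the text.
```python
# Hypothetical rule table: (left type, generic label, right type) -> relation.
CORRECTION_RULES = {
    ("Flight", "REL from", "City"): "Source",
}
# Relations a preposition may introduce, offered to the user as suggestions.
SUGGESTIONS = {"for": ["Beneficiary", "Duration"]}


def generic_label(preposition=None):
    """REL for plain constituents, 'REL prep' for prepositional phrases."""
    return "REL" if preposition is None else "REL " + preposition


def resolve(left_type, label, right_type):
    """Return the corrected relation, or the generic label if no rule applies."""
    return CORRECTION_RULES.get((left_type, label, right_type), label)


def suggest(label):
    """Suggest candidate relations for an unresolved 'REL prep' label."""
    parts = label.split(" ", 1)
    return SUGGESTIONS.get(parts[1], []) if len(parts) == 2 else []


print(resolve("Flight", generic_label("from"), "City"))  # Source
print(suggest(generic_label("for")))  # ['Beneficiary', 'Duration']
```
</Paragraph>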
    </Section>
  </Section>
  <Section position="6" start_page="211" end_page="212" type="metho">
    <SectionTitle>
5 Query representation
</SectionTitle>
    <Paragraph position="0"> The representation of questions differs from the representation of declarative sentences and deserves special attention. For sentences representing questions we try to identify the statement that corresponds to the question and then construct the conceptual graph in a similar way as for declarative sentences.</Paragraph>
    <Section position="1" start_page="212" end_page="212" type="sub_section">
      <SectionTitle>
5.1 Yes/No questions
</SectionTitle>
      <Paragraph position="0"> We process simple yes/no questions (questions that require a yes/no answer) that are constructed by subject-verb inversion by applying a transformation that converts the question into a declarative sentence.</Paragraph>
    </Section>
    <Section position="2" start_page="212" end_page="212" type="sub_section">
      <SectionTitle>
5.2 Wh questions
</SectionTitle>
      <Paragraph position="0"> The parse tree of a sentence expressing a wh question has the following general structure: SBARQ -&gt;WH phrase SQ ? where the WH phrase is either WHNP, WHADVP or WHPP and represents the concept that triggers the query. The SQ represents the rest of the sentence.</Paragraph>
      <Paragraph position="1"> Similarly to yes/no questions, these types of questions are also transformed into declarative sentences. The wh word (e.g. who, what, where, when) is represented by a generic concept. The relation that attaches this concept to the verb depends on the type of the wh phrase and can be one of the following: WHNP: These phrases are headed by the wh question words who, what or which. The relation between the wh phrase and the verb is either identified by applying a suitable semantic frame for this verb, or it is a generic one, REL.</Paragraph>
      <Paragraph position="2"> WHADVP: These phrases represent an adverbial modifier for time, place or manner. If the phrase marked as WHADVP is where, the relation is locative; if it is when, the relation is temporal; and if it is how, the relation is manner. WHPP: Such phrases are not processed by our system.</Paragraph>
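      <Paragraph> The choice of relation for the generic wh concept can be sketched as a small lookup. The function name and its frame_relation parameter (the role found by semantic frame matching, if any) are hypothetical; the where/when/how mapping is the one listed above.
```python
# Relation implied by the wh word of a WHADVP phrase, as listed in the text.
WHADVP_RELATION = {"where": "locative", "when": "temporal", "how": "manner"}


def wh_relation(phrase_type, wh_word, frame_relation=None):
    """Pick the relation linking the generic wh concept to the verb.

    Returns None for phrase types that are not handled (e.g. WHPP).
    """
    if phrase_type == "WHNP":
        # Use the role from the semantic frame when one was found, else REL.
        return frame_relation or "REL"
    if phrase_type == "WHADVP":
        return WHADVP_RELATION.get(wh_word.lower())
    return None


print(wh_relation("WHADVP", "Where"))       # locative
print(wh_relation("WHNP", "who", "Agent"))  # Agent
print(wh_relation("WHNP", "what"))          # REL
```
</Paragraph>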
    </Section>
  </Section>
</Paper>