<?xml version="1.0" standalone="yes"?>
<Paper uid="C92-3161">
  <Title>Shalt2 a Symmetric Machine Translation System with</Title>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3. Parser
</SectionTitle>
    <Paragraph position="0"> The Shalt2 parser first generates a DS with packed structural ambiguities for a given input sentence. It calls a PEG parser, or Tomita's LR parser [12] for PUG, and then calls a DS converter to map the PEG parse tree or the PUG feature structure to a DS. Next, mapping rules are applied to the DS so that lexical and structural mappings are associated with each node and arc in the DS. Figure 2 shows a DS with mapping constraints for the sentence &quot;Keep the diskette in the drive,&quot; where the verb &quot;keep&quot; has five word senses: *execute, *guard, *hold, *employ, and *own. It is clear that we will end up with ten distinct conceptual representations if we evaluate all the mapping rules, and in general, combinatorial explosion could easily make the parser impractical. The key to our parser is to view the mapping rules as constraints rather than procedural rules; this approach is called delayed composition of conceptual representation.</Paragraph>
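    <Paragraph> The combinatorial argument above can be illustrated with a minimal sketch. It assumes that the ten readings arise from the five senses of &quot;keep&quot; combined with two attachment sites for &quot;in the drive&quot;; the attachment labels and function names are hypothetical and do not come from Shalt2.

```python
from itertools import product

# Word senses of "keep" as listed in the text.
KEEP_SENSES = ["*execute", "*guard", "*hold", "*employ", "*own"]

# Assumption: two attachment sites for "in the drive" (verb or noun).
# The text only gives the total of ten readings; these labels are hypothetical.
PP_ATTACHMENTS = ["attach-to-keep", "attach-to-diskette"]


def eager_readings(senses, attachments):
    """Enumerate every (sense, attachment) pair, as eager rule evaluation would."""
    return list(product(senses, attachments))


def packed_size(senses, attachments):
    """Number of alternatives stored when the choices stay packed in the DS."""
    return len(senses) + len(attachments)


readings = eager_readings(KEEP_SENSES, PP_ATTACHMENTS)
print(len(readings))                             # 10
print(packed_size(KEEP_SENSES, PP_ATTACHMENTS))  # 7
```

    Eager evaluation grows with the product of the alternatives, while keeping the mappings as constraints on nodes and arcs grows only with their sum, which is the motivation for delaying composition.</Paragraph>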
    <Paragraph position="1"> A sentence analyzer called SENA [14] disambiguates the DS by using a constraint solver, JAUNT [5], and a case base. (Footnote on the PUG feature structure: Tomita's local ambiguity packing [12] is used to obtain a feature structure with packed structural ambiguities.) JAUNT applies grammatical constraints (for instance, modifier-modifiee links between nodes do not cross one another) and semantic constraints (such as selectional restrictions, functional control, and other NL object identity constraints detected by the context analyzer) uniformly to a DS that has ambiguities, and calculates pairwise consistency efficiently for each combination of nodes. Finally, the case base provides preferential knowledge to favor one pair of nodes over all other consistent pairs. The disambiguation process can be summarized as follows: 1. For each conflicting arc in the DS, calculate the &quot;distance&quot; [6] between the two nodes in the arc by using a case base.</Paragraph>
    <Paragraph position="2">  2. Leave the arc with the minimal distance and eliminate all the other conflicting arcs. Each NL object associated with a matching node in the case base also gives a higher preference to the same class of instance over the other instances in a node.</Paragraph>
    <Paragraph position="3"> 3. Propagate the deletion of arcs to other nodes and arcs in the DS. Eliminate nodes and arcs that are no longer valid in the DS.</Paragraph>
    <Paragraph position="4"> 4. Apply the above steps until there are no conflicting arcs in the DS.</Paragraph>
    <Paragraph position="5"> The resulting DS has no structural ambiguity. Remaining lexical ambiguities are similarly resolved, because we can also determine which pair of NL objects connected with an arc has the minimal distance in the case base. Our case base for computer manuals would support the &quot;diskette -PPADJUNCT- drive&quot; arc and the *hold-1 instance with diskette as its DOBJECT.</Paragraph>
    <Paragraph position="6"> Therefore, the parser will eventually return</Paragraph>
    <Paragraph position="8"> as a disambiguated result. Nagao [6] discusses more sophisticated techniques such as scheduling sets of arcs to be disambiguated, backtracking, and relaxation of case base matching by means of an is-a hierarchy.</Paragraph>
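    <Paragraph> The four disambiguation steps above can be sketched as follows. This is a simplified illustration, not the SENA or JAUNT implementation: arcs are plain tuples, conflicts are given as explicit sets, the distances are invented numbers standing in for a case base lookup, and propagation is reduced to dropping deleted arcs from the remaining conflict sets.

```python
def disambiguate(arcs, conflict_sets, distance):
    """Keep the minimal-distance arc in each conflict set (steps 1-4, simplified)."""
    alive = set(arcs)
    pending = [set(c) for c in conflict_sets]
    while pending:
        # Step 3 (simplified): forget arcs that were already eliminated.
        pending = [c.intersection(alive) for c in pending]
        pending = [c for c in pending if len(c) > 1]
        if not pending:
            break
        group = pending.pop(0)
        # Steps 1 and 2: score each conflicting arc and keep the closest one.
        best = min(sorted(group), key=distance)
        alive.difference_update(group - {best})
    return alive


# Arcs are (head, relation, dependent); names follow the example sentence.
ARCS = [
    ("keep", "DOBJECT", "diskette"),
    ("keep", "PPADJUNCT", "drive"),
    ("diskette", "PPADJUNCT", "drive"),
]
# The two PPADJUNCT arcs compete for the same dependent.
CONFLICTS = [[ARCS[1], ARCS[2]]]
# Hypothetical distances; a real case base would compute these.
CASE_BASE_DISTANCE = {ARCS[0]: 0.1, ARCS[1]: 0.7, ARCS[2]: 0.2}

result = disambiguate(ARCS, CONFLICTS, CASE_BASE_DISTANCE.get)
print(sorted(result))
```

    With these assumed distances the &quot;diskette -PPADJUNCT- drive&quot; arc survives, matching the preference the text attributes to a case base for computer manuals.</Paragraph>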
    <Paragraph position="9"> Finally, a context analyzer is called to resolve anaphora, identity of definite nouns, and implicit identity between NL objects. It stores the DS in the working space, where references to preceding instances are represented by the links between instances in the DSs. These inter-sentential links are used to determine the scopes of adverbs such as &quot;also&quot; and &quot;only&quot;.</Paragraph>
    <Paragraph position="10"> For example, if the phrase &quot;role of an operator&quot; appears in a text, the word &quot;operator&quot; could be a person who operates a machine or a logical operator for computation, but there is not enough information to resolve this ambiguity at that point. In such cases, creating referential links in a forest of DSs could lead us to find evidence for disambiguating these two meanings. The scope of an adverb, such as &quot;also,&quot; is determined by identifying repeated NL objects and newly introduced NL objects, where the latter are more likely to fall within the scope of the adverb.</Paragraph>
    <Paragraph position="11"> The context analyzer uses a similar method to resolve lexical ambiguities that the sentence analyzer left unresolved when the case base failed to provide enough information.</Paragraph>
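    <Paragraph> The adverb-scope heuristic described above can be sketched in a few lines. This is only an illustration under the assumption that NL objects can be compared by a class label; the function name and the example objects are hypothetical.

```python
def scope_candidates(previous_objects, current_objects):
    """Split the current sentence's NL objects into new and repeated ones.

    Per the text, newly introduced objects are the more likely scope of "also".
    """
    seen = set(previous_objects)
    new = [o for o in current_objects if o not in seen]
    repeated = [o for o in current_objects if o in seen]
    return new, repeated


# Hypothetical NL objects from a preceding sentence and the current one.
new, repeated = scope_candidates(
    ["*diskette", "*drive"],
    ["*diskette", "*printer"],
)
print(new, repeated)
```

    A real context analyzer would follow the referential links stored in the working space rather than compare labels, so this shows only the repeated-versus-new distinction.</Paragraph>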
    <Paragraph position="12"> Footnote on selectional restrictions: We use about 20 of the semantic features described in the LDOCE. The restrictions imposed by the features are rather &quot;loose,&quot; and are used to eliminate only unlikely combinations of word senses. ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992 1037 PROC. OF COLING-92, NANTES, AUG. 23-28, 1992</Paragraph>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4. Concept Mapper
</SectionTitle>
    <Paragraph position="0"> Given a conceptual representation, which is an output from the parser, and a target language, the concept mapper tries to discover another conceptual representation that has a well-defined mapping to a DS while keeping the semantic content as intact as possible. This process is called conceptual transfer. If the given conceptual representation already has a well-defined mapping to a DS, the concept mapper does nothing and Shalt2 works like an interlingual MT system. It is important that conceptual transfer be tied to the mapping to a DS, because there are generally many conceptual representations with a similar semantic content. The existence of a well-defined mapping not only guarantees that the generator can produce a sentence in the target language, but also effectively eliminates unsuccessful paraphrasing.</Paragraph>
    <Paragraph position="1"> In addition to the paraphrasing rules mentioned earlier, the concept mapper uses the following general rules for conceptual transfer. The paraphrasing rules are composed to make a complex mapping.</Paragraph>
    <Paragraph position="2"> * Projection: Map an NL object with a filled slot s to an instance of the same class with the unfilled slot s. Projection corresponds to deletion of a slot s.</Paragraph>
    <Paragraph position="3"> * Generalization: Map an NL object of a class X to an instance of one of the superclasses of X.</Paragraph>
    <Paragraph position="4"> * Specialization: Map an NL object of a class X to an instance of one of the subclasses of X.</Paragraph>
    <Paragraph position="5"> As an example, a projection rule is frequently used when we translate English nouns into Japanese ones, as in the following example:
      diskette       (*diskette (:num (*sg)))
      diskettes      (*diskette (:num (*pl)))
      a diskette     (*diskette (:num (*sg)) (:def (*indef)))
      the diskettes  (*diskette (:num (*pl))</Paragraph>
    <Paragraph position="7"> Here, the four English noun phrases above are usually translated by the same Japanese noun phrase (the fifth one), which does not carry any information on :num and :def. We provide a paraphrasing rule for translation in the opposite direction such that any instance of *object can obtain appropriate :num and :def fillers. The parser, however, is responsible for determining these fillers in most cases. In general, the designer of semi-equivalent rules for translation in one direction has to provide a way of inferring missing information for translation in the opposite direction. Generalization and specialization rules are complementary and can be paired to become equivalent rules when a specialization rule for any instance of a class x is unambiguous. That is, without losing any fillers, one can always choose an instance of a subclass y to which x can be uniquely mapped. A generalization from each y to x provides the opposite mapping.</Paragraph>
    <Paragraph position="8"> Footnote on the general rules: These are semi-equivalent rules. Equivalent rules have higher priority when the rules are to be applied. Footnote on the Japanese noun phrase: One exception is that deictic noun phrases are translated when we use the Japanese counterpart &quot;-~C/3 ~&quot; for the determiner &quot;the&quot;.</Paragraph>
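    <Paragraph> The three general rules can be sketched on a toy representation of NL objects. This is a minimal illustration, not the Shalt2 concept mapper: an NL object is modeled as a (class, slots) pair, the is-a table contains only the one link suggested by the text, generalization is limited to the direct superclass, and the filler names *sg and *indef are assumed readings of the example.

```python
# Hypothetical is-a table; only *diskette and *object appear in the text.
SUPERCLASS = {"*diskette": "*object"}


def project(obj, slot):
    """Projection: same class, with the given slot left unfilled."""
    cls, slots = obj
    return (cls, {k: v for k, v in slots.items() if k != slot})


def generalize(obj):
    """Generalization: map to an instance of the direct superclass."""
    cls, slots = obj
    return (SUPERCLASS[cls], dict(slots))


def specialize(obj, subclass):
    """Specialization: map to an instance of a chosen subclass, keeping fillers."""
    cls, slots = obj
    if SUPERCLASS.get(subclass) != cls:
        raise ValueError(f"{subclass} is not a subclass of {cls}")
    return (subclass, dict(slots))


# "a diskette" with number and definiteness fillers.
a_diskette = ("*diskette", {":num": "*sg", ":def": "*indef"})

# Projecting away both slots gives the form shared by the Japanese noun phrase.
bare = project(project(a_diskette, ":num"), ":def")
print(bare)

# Generalization followed by an unambiguous specialization round-trips.
print(specialize(generalize(bare), "*diskette") == bare)
```

    Projection discards fillers, which is why the text calls it semi-equivalent, whereas the generalization and specialization pair loses nothing when the subclass choice is unique.</Paragraph>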
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
5. Grammar-Based Sentence Generator
</SectionTitle>
    <Paragraph position="0"> Recent investigation of unification grammars and their bi-directionality [8, 9, 10] has enabled us to design and implement a grammar-based generator. Our generator uses a PUG grammar, which is also used by the parser, to traverse or decompose a feature structure obtained from a DS in order to find a sequence of grammar rule applications that eventually leads to lexical rules for generating a sentence. The generation algorithm is based primarily on Wedekind's algorithm, but is modified for PUG.</Paragraph>
    <Paragraph position="1"> The current implementation of our generator lacks fine control of word ordering, honorific expressions, and style preference. We are developing a set of discourse parameters to associate preferences with the grammar rules to be tried, so that specific expressions are favored by the parameter settings.</Paragraph>
  </Section>
</Paper>