<?xml version="1.0" standalone="yes"?> <Paper uid="E93-1019"> <Title>Rule-based Acquisition and Maintenance of Lexical and Semantic Knowledge *</Title> <Section position="4" start_page="149" end_page="150" type="intro"> <SectionTitle> 2 The Knowledge Acquisition and </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="149" end_page="150" type="sub_section"> <SectionTitle> Maintenance Problems </SectionTitle> <Paragraph position="0"> At the Center for Machine Translation, we use Lexical Functional Grammar (LFG) [Kaplan and Bresnan, 1982] as a basis for our syntactic grammars as well as our linking rules [Levin, 1987] for mapping syntactic functions to and from semantic roles. We refer to the latter as &quot;mapping rules&quot;. These mapping rules are used in conjunction with a domain model to build interlingua text representations (ILT) or to generate from them. The use of ILT is characteristic of the CMT approach to Knowledge-Based Machine Translation [Goodman, 1991; Mitamura et al., 1991; Frederking et al., 1992].</Paragraph> <Paragraph position="1"> Given the emphasis placed on the lexicon in LFG in both syntax and semantics, and the extensive domain knowledge required for our translation system, we place a great deal of importance on the lexicon and on finding easy methods to acquire, maintain, view, store and reuse the lexical information. COOL is a tool we are developing and using on the ESTRATO project for accomplishing these tasks.</Paragraph> <Paragraph position="2"> The knowledge acquisition and maintenance tasks can be rather cumbersome. Acquiring thousands of new semantic concepts and placing them into the top-level semantic hierarchy by hand is tedious and error-prone. The same applies to adding English and Spanish words.
Once the run-time knowledge sources for the various NLP modules have been acquired, maintaining consistency among the lexical and semantic files (phrasal-noun list, glossary, morpho-syntactic lexicons, word-to-concept mappings and the semantic concepts) is difficult. The NLP modules require different lexical and semantic knowledge in varying formats. All modules share some information that must be kept consistent, such as the part of speech and the word sense. The concept name must be the same for the run-time semantic knowledge source, the Spanish run-time lexical knowledge source, and the English run-time lexical knowledge source. Both acquiring the knowledge and maintaining its consistency are prone to human error.</Paragraph> <Paragraph position="3"> One of the requirements of ESTRATO is that a non-linguist lexicographer be able to acquire and maintain as much of the lexical information as possible. A-COOL allows the semi-automatic creation of NLP lexical knowledge from lexicographic information supplied by a non-linguist.</Paragraph> <Paragraph position="4"> At present, linguists must do some of the lexical acquisition work, such as providing semantic class information and some specialized syntactic information for closed-class items, adjectives and verbs. Because a Spanish word and an English word do not always map one-to-one to the same concept [Talmy, 1972; Talmy, 1985], the lexical entries in such cases can only be produced semi-automatically. Linguists must also provide collocational information in the lexicon relevant to lexical selection [Mel'čuk et al., 1984].</Paragraph> </Section> </Section> </Paper>