<?xml version="1.0" standalone="yes"?>
<Paper uid="C96-1013">
  <Title>Concept clustering and knowledge integration from a children's dictionary</Title>
  <Section position="1" start_page="0" end_page="55" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> Knowledge structures called Concept Clustering Knowledge Graphs (CCKGs) are introduced along with a process for their construction from a machine readable dictionary. CCKGs contain multiple concepts interrelated through multiple semantic relations, together forming a semantic cluster represented by a conceptual graph.</Paragraph>
    <Paragraph position="1"> The knowledge acquisition is performed on a children's first dictionary. The concepts involved are general and typical of a daily life conversation. A collection of conceptual clusters together can form the basis of a lexical knowledge base, where each CCKG contains a limited number of highly connected words giving useful information about a particular domain or situation.</Paragraph>
    <Paragraph position="2"> 1 Introduction When constructing a Lexical Knowledge Base (LKB) useful for Natural Language Processing, the source of information from which knowledge is acquired and the structuring of this information within the LKB are two key issues. Machine Readable Dictionaries (MRDs) are a good source of lexical information and have been shown to be applicable to the task of LKB construction (Dolan et al., 1993; Calzolari, 1992; Copestake, 1990; Wilks et al., 1989; Byrd et al., 1987). Often though, a localist approach is adopted whereby the words are kept in alphabetical order with some representation of their definitions in the form of a template or feature structure. Effort in finding connections between words is seen in work on automatic extraction of semantic relations from MRDs (Ahlswede and Evens, 1988; Alshawi, 1989; Montemagni and Vanderwende, 1992). Additionally, effort in finding words that are close semantically is seen by the current interest in statistical techniques for word clustering, looking at cooccurrences of words in text corpora or dictionaries (Church and Hanks, 1989; Wilks et al., 1989; Brown et al., 1992; Pereira et al., 1995).</Paragraph>
    <Paragraph position="3"> Inspired by research in the areas of semantic relations, semantic distance, and concept clustering, and using Conceptual Graphs (Sowa, 1984) as our knowledge representation, we introduce Concept Clustering Knowledge Graphs (CCKGs). Each CCKG will start as a Conceptual Graph representation of a trigger word and will expand following a search algorithm to incorporate related words and form a Concept Cluster. The concept cluster in itself is interesting for tasks such as word disambiguation, but the CCKG will give more to that cluster. It will give the relations between the words, making the graph in some aspects similar to a script (Schank and Abelson, 1975). However, a CCKG is generated automatically and does not rely on primitives but on an unlimited number of concepts, showing objects, persons, and actions interacting with each other. This interaction will be set within a particular domain, and the trigger word should be a key word of the domain to represent. If that process were carried out for the whole dictionary, we would obtain an LKB divided into multiple clusters of words, each represented by a CCKG. Then during text processing, for example, a portion of text could be analyzed using the appropriate CCKG to find implicit relations and help in understanding the text.</Paragraph>
    <Paragraph position="4"> Our source of knowledge is the American Heritage First Dictionary¹, which contains 1800 entries and is designed for children of age six to eight. It is made for young people learning the structure and the basic vocabulary of their language. In comparison, an adult's dictionary is more of a reference tool which assumes knowledge of a large basic vocabulary, while a learner's dictionary assumes a limited vocabulary but still some very sophisticated concepts. Using a children's dictionary allows us to restrict our vocabulary, but still work on general knowledge about day to day concepts and actions.</Paragraph>
    <Paragraph position="5"> In the following sections, we first present the transformation steps from the definitions into conceptual graphs, then we elaborate on the integration process, and finally, we close with a discussion.</Paragraph>
    <Paragraph position="6"> ¹ Copyright ©1994 by Houghton Mifflin Company.</Paragraph>
    <Paragraph position="7"> Reproduced by permission from THE AMERICAN HERITAGE FIRST DICTIONARY.</Paragraph>
  </Section>
</Paper>