<?xml version="1.0" standalone="yes"?> <Paper uid="A92-1033"> <Title>Overview of Natural Language Processing of Captions for Retrieving Multimedia Data</Title> <Section position="3" start_page="0" end_page="231" type="intro"> <SectionTitle> 2 Methodology </SectionTitle> <Paragraph position="0"> The information retrieval system we have developed is based on two stages: a coarse-grain match to reduce the list of possible information for a later fine-grain match (Rau 1987).</Paragraph> <Paragraph position="1"> Three tasks that we deemed essential for this system included the ability to represent and produce a logical form of the caption, the ability to generate keywords from the logical form, and the ability to load in previously stored caption logical forms for matching against the query logical form.</Paragraph> <Section position="1" start_page="0" end_page="231" type="sub_section"> <SectionTitle> 2.1 NL Parser </SectionTitle> <Paragraph position="0"> We have used an existing natural language processing program, the DBG Message Understanding System (Montgomery et al. 1989), as a starting point. This program was developed for understanding dialog conversations. To accommodate the existing captions at NWC, we had to make modifications to the grammar, functional parser, and template processor.</Paragraph> <Paragraph position="1"> The grammar rules were changed to enable parsing of punctuation, descriptive noun phrases, dates, geographic locations, and numeric and descriptive vehicle designations.</Paragraph> <Paragraph position="2"> Additional rules were introduced to handle theme-oriented phrases as opposed to agent-initiated sentences. The structure of functional parse output was altered to accommodate mapping into the type hierarchy. Specifically, tokens were introduced to allow linking together words based on syntactic relationships.
The resulting output structure appears similar to slot-assertion notation.</Paragraph> <Paragraph position="3"> In the original DBG system, the template processor produced frame structures for a semantic analysis of the sentence. This portion of the system was redone using an object-oriented programming methodology. We have created a single type hierarchy to hold both nouns and verbs.</Paragraph> <Paragraph position="4"> Producing the logical form is a matter of mapping the predicate expressions from the functional parse output into the type hierarchy. Methods are used to set inner cases for both nouns and verbs (e.g., theme, agent, location, etc.); set modifiers for nouns and verbs (e.g., adjectives and adverbs); set correlations between classes (e.g., part_of, has_part, program_about, etc.); and generate the logical form from class instances and associated slot values.</Paragraph> </Section> <Section position="2" start_page="231" end_page="231" type="sub_section"> <SectionTitle> 2.2 Generating the Keywords </SectionTitle> <Paragraph position="0"> Keyword records to be used in the coarse-grain match are obtained from the type hierarchy directly rather than from the logical form output. An instance of a class uses the class name as the keyword. The keyword is based on logically proper names, not definite descriptions as described by Frixione et al. (1989). Methods are defined for caching keyword records containing the caption identifier and any case information to a keyword file for each class instance.</Paragraph> <Paragraph position="1"> Each class has a keyword file maintained in sorted order.</Paragraph> </Section> <Section position="3" start_page="231" end_page="231" type="sub_section"> <SectionTitle> 2.3 Matching </SectionTitle> <Paragraph position="0"> Once an English query is instantiated within the type hierarchy to reflect the query logical form, the instances indicate which class and subclass keyword files need to be examined in the coarse-grain match. 
The corresponding keyword files are read, and the keyword records are intersected using the caption-id as the unique identifier. In the future, case information will be used at query time for specifying the role for a word (e.g., initiator of an action as opposed to the recipient) and treated as a filter in selecting the appropriate case records within the keyword file. Caption-ids whose intersection score exceeds a coarse-grain match threshold become eligible for fine-grain matching.</Paragraph> <Paragraph position="1"> Fine-grain matching entails mapping the logical form for a stored parsed caption back into the type hierarchy and matching it against the query instances within the hierarchy.</Paragraph> <Paragraph position="2"> Figure 1 shows the appearance of the type hierarchy with the existence of both the query &quot;missile on stand&quot; and caption 262865, &quot;Sidewinder AIM 9R missile on stand,&quot; within it. the query instance for the class &quot;AIM-9R.&quot; Matching of relationships is currently based on exact matching. The matching process is being modified to allow relationship matching based on a predefined set of relationships. Caption-ids with match scores exceeding a fine-grain match threshold are presented to the user.</Paragraph> </Section> </Section> </Paper>
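<!--
Editorial sketch (not part of the original paper). The coarse-grain match
described in Section 2.3 might be approximated as follows. The paper does not
define the intersection score, so this sketch assumes it is the fraction of
query classes for which a caption has at least one keyword record. The names
PARENT, KEYWORD_FILES, with_subclasses and coarse_match, the case labels, and
the caption ids 100001 and 100002 are hypothetical; only caption 262865 and
the classes "missile", "stand" and "AIM-9R" come from the text.

```python
from collections import defaultdict

# Hypothetical type hierarchy: child class mapped to its parent class.
PARENT = {"AIM-9R": "missile", "missile": "weapon", "stand": "equipment"}

# Hypothetical in-memory stand-in for the per-class keyword files:
# class name mapped to sorted (caption_id, case) records.
KEYWORD_FILES = {
    "AIM-9R": [(262865, "theme")],
    "missile": [(100001, "theme")],
    "stand": [(100002, "theme"), (262865, "location")],
}


def with_subclasses(cls):
    """Return cls together with every class that descends from it."""
    found = {cls}
    changed = True
    while changed:
        changed = False
        for child, parent in PARENT.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def coarse_match(query_classes, threshold):
    """Return caption ids whose assumed score strictly exceeds threshold."""
    hits = defaultdict(set)  # caption_id mapped to the query classes it matched
    for q in query_classes:
        for cls in with_subclasses(q):
            for caption_id, _case in KEYWORD_FILES.get(cls, []):
                hits[caption_id].add(q)
    total = len(query_classes)
    # Assumed score: share of query classes matched by the caption.
    return sorted(cid for cid, matched in hits.items()
                  if len(matched) / total > threshold)


candidates = coarse_match(["missile", "stand"], threshold=0.5)
print(candidates)
```

In this toy data only caption 262865 has records for both query classes, so
it alone would pass on to fine-grain matching. Case information is carried in
each record but ignored here, mirroring the paper's note that case filtering
is future work.
-->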