<?xml version="1.0" standalone="yes"?> <Paper uid="H89-1008"> <Title>RAPID PORTING OF THE PARLANCE(tm) NATURAL LANGUAGE INTERFACE</Title> <Section position="4" start_page="83" end_page="84" type="metho"> <SectionTitle> THE PARLANCE INTERFACE </SectionTitle> <Paragraph position="0"> The Parlance interface from BBN Systems and Technologies Corporation is an English language database front end. It has a number of component parts: a graphical user interface, a language understander that translates English queries into database commands for relational database systems such as Oracle and VAX Rdb, a control structure for interacting with the user to clarify ambiguous queries or unknown words, and a dbms driver to call the database system to execute database commands and to return retrieved data to the user.</Paragraph> <Paragraph position="1"> The Parlance system uses several domain-dependent knowledge bases: 1. A domain model, which is a class-and-attribute representation of the concepts and relationships that the Parlance user might employ in queries.</Paragraph> <Paragraph position="2"> 2. A mapping from this domain model to the database, which specifies how to find particular classes and attributes in terms of the database tables and fields of the underlying dbms.</Paragraph> <Paragraph position="3"> 3. A vocabulary, containing the lexical syntax and semantics of words and phrases that someone might use to talk about the classes and attributes.</Paragraph> <Paragraph position="4"> 4. Miscellaneous additional information about how retrieved information is to be printed out (for example, column headers that are different from field names in the database).</Paragraph> <Paragraph position="5"> The Learner is used to create these knowledge bases.</Paragraph> <Paragraph position="6"> The following queries illustrate the kinds of questions that one can ask the Parlance system after it is configured for the Navy database: What's the maximum beam of the Kitty Hawk? 
Show me the ships with a personnel resource readiness of C3. List the ships that are C1 or C2.</Paragraph> <Paragraph position="7"> Is the Frederick conducting ISE in San Diego? How many ships aren't NTDS capable? Which classes have a larger fuel capacity than the Wichita? How many submarines are in each geo region?</Paragraph> <Paragraph position="8"> Are there any harpoon capable C1 ships deployed in the Indian Ocean whose ASW rating is MI? List them.</Paragraph> <Paragraph position="9"> Show the current employment of the carriers that are C3 or worse, sorted by overall readiness.</Paragraph> <Paragraph position="10"> Where is the Carl Vinson? What are the positions of the friendly subs?</Paragraph> </Section> <Section position="5" start_page="84" end_page="85" type="metho"> <SectionTitle> THE LEARNER </SectionTitle> <Paragraph position="0"> The Learner is a software tool that creates the domain-dependent knowledge bases that the Parlance system needs. It &quot;learns&quot; what Parlance needs to know from several sources: 1. The database system itself (i.e., the dbms catalogue that describes the database structure, and the values in various fields of the database).</Paragraph> <Paragraph position="1"> 2. A human teacher (who is probably a database administrator, someone familiar with the structure of the database, but who is not a computational linguist or AI expert).</Paragraph> <Paragraph position="2"> 3. A core domain model and vocabulary that are part of the basic Parlance system.</Paragraph> <Paragraph position="3"> 4. Inferences (about such things as morphological and syntactic features) that the Learner makes (subject to correction and modification by the teacher).</Paragraph> <Paragraph position="4"> Figure 1 shows the input and output structure of the Learner. 
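As a concrete picture of what the Learner must produce, the following Python sketch shows one way the four knowledge bases described above could be represented as data structures. All of the names, tables, and fields here are invented for illustration; the paper does not describe Parlance's actual internal representation.

```python
# Hypothetical sketch of Parlance-style knowledge bases. Every identifier,
# table, and field name below is invented; this is an illustration of the
# four knowledge-base kinds, not BBN's actual representation.

# 1. Domain model: classes and their attributes.
domain_model = {
    "ship": {"attributes": ["name", "beam", "fuel capacity", "overall readiness"]},
    "class": {"attributes": ["name", "fuel capacity"]},
}

# 2. Mapping from model concepts onto tables and fields of the dbms.
db_mapping = {
    ("ship", "beam"): ("SHIPS", "BEAM_FT"),
    ("ship", "overall readiness"): ("READINESS", "OVERALL"),
}

# 3. Vocabulary: words with lexical syntax/semantics tied to concepts.
vocabulary = {
    "vessel": {"pos": "noun", "plural": "vessels", "denotes": "ship"},
    "beam": {"pos": "noun", "denotes": ("ship", "beam")},
}

# 4. Presentation information, e.g. column headers that differ from
#    the field names in the database.
display_info = {("SHIPS", "BEAM_FT"): {"header": "Beam (ft)"}}

def field_for(concept, attribute):
    """Look up the table and field that store a model attribute."""
    return db_mapping[(concept, attribute)]

print(field_for("ship", "beam"))  # prints ('SHIPS', 'BEAM_FT')
```

The point of the separation is that the domain model and vocabulary speak in the user's terms, while only the mapping knows about tables and fields, so restructuring the database touches only one knowledge base.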
We call the process of using the Learner &quot;configuring&quot; Parlance for a particular application.</Paragraph> <Paragraph position="5"> The human teacher uses the Learner by stepping through a series of menus and structured forms. The Learner incrementally builds a structure that can be output as the knowledge bases shown in Figure 1.</Paragraph> <Paragraph position="6"> The teacher chooses particular actions and is led through steps which elicit related information that Parlance must know. For example, when the teacher designates that a particular table or set of tables belongs to a class named &quot;ship&quot;, the Learner immediately allows the teacher to give synonyms for this class, such as &quot;vessel&quot;. The Learner will then infer that the plural form of the synonym is &quot;vessels&quot;, instead of making the teacher supply the plural form, although the teacher can easily correct the Learner if the word has an irregular plural.</Paragraph> <Paragraph position="7"> Whenever information is optional, the teacher can decline to specify it at the first opportunity, and can later initiate an action to provide it. 
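The plural-inference behavior described above, with the teacher able to override the guess for irregular words, can be sketched as a few regular-morphology rules plus an exception table. This is a minimal illustration, not BBN's actual morphology code; the rules and override mechanism are assumptions.

```python
def infer_plural(noun, overrides=None):
    """Guess a regular English plural. A teacher-supplied `overrides`
    table corrects irregular words, mirroring how the Learner lets the
    teacher amend its morphological inferences."""
    overrides = overrides or {}
    if noun in overrides:
        return overrides[noun]
    if noun.endswith(("s", "x", "z", "ch", "sh")):
        return noun + "es"
    if noun.endswith("y") and noun[-2:-1] not in "aeiou":
        return noun[:-1] + "ies"
    return noun + "s"

print(infer_plural("vessel"))                        # prints vessels
print(infer_plural("child", {"child": "children"}))  # prints children
```

With this division of labor the teacher only types a plural when the inference is wrong, which is one of the small conveniences that make configuration fast.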
Both required and optional information can be changed by the teacher using the Learner's editing capabilities.</Paragraph> <Paragraph position="8"> The ability to assign names freely, the freedom to do many operations in the sequence that makes the most sense to the person using the Learner, and the fact that the Learner expresses instructions and choices in database terms wherever possible, make it easy for database administrators who are not computational linguists or AI experts to configure the Parlance interface.</Paragraph> </Section> <Section position="6" start_page="85" end_page="87" type="metho"> <SectionTitle> CONFIGURING PARLANCE </SectionTitle> <Paragraph position="0"> Before the Learner existed, Parlance configurations were created &quot;by hand&quot;.</Paragraph> <Paragraph position="1"> That is, highly skilled personnel had to use a separate set of programs (including a Lisp editor) to create the appropriate configuration files.</Paragraph> <Paragraph position="2"> Figure 2 compares this by-hand configuration process with the first experience using the Learner on the Navy database. The two examples used different databases, but in each case we began with a large set of sample queries in the target domain, and periodically tested the developing configuration by running those queries through the Parlance system. We measured our progress by keeping track of the number of those queries the system could understand as the configuration process went on. Figure 2 actually considerably understates the productivity enhancement realized with the Learner, because the personnel database used for the by-hand configuration was much smaller and less complex than the Navy database.</Paragraph> <Paragraph position="3"> The Navy database used to test the first version of the Learner was considerably restructured and enlarged, and we had an opportunity to configure Parlance for the newer database. 
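The progress metric described above, the share of a fixed sample-query set that the developing configuration can understand, can be sketched as a tiny harness. The stand-in parser here is a toy; in practice each query would be run through the actual Parlance system.

```python
def coverage(parse, queries):
    """Fraction of a fixed sample-query set that a configuration
    understands. `parse` is any callable returning True when the system
    produces an interpretation (a stand-in for running Parlance)."""
    understood = sum(1 for q in queries if parse(q))
    return understood / len(queries)

# Toy stand-in parser: "understands" a query if it mentions a known word.
known = {"ship", "ships", "beam", "readiness"}
toy_parse = lambda q: any(w in known for w in q.lower().split())

sample = [
    "What's the maximum beam of the Kitty Hawk?",
    "List the ships that are C1 or C2.",
    "Where is the Carl Vinson?",
    "How many submarines are in each geo region?",
]
print(coverage(toy_parse, sample))  # prints 0.5 for this toy setup
```

Tracking this single number while configuring gives the kind of progress curve that Figure 2 plots for the by-hand and Learner-based configurations.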
Since we had a new, improved version of the Learner, we chose to configure Parlance to the second version of the Navy database &quot;from scratch&quot;, rather than by building on the results of the first configuration. This gave us an opportunity to measure the effort required to use the Learner to do a much larger system configuration, since the size of the target database (measured in terms of the number of fields) had nearly tripled.</Paragraph> <Paragraph position="4"> The results in Figure 3 and its accompanying notes show that the Learner robustly scaled up to the task, and that the time required to perform the configuration increased much less than the number of fields in the database, the vocabulary size, or any other simple metric of size. In fact, for a modest 1/3 increase in configuring effort, a configuration roughly 3 times larger was created.</Paragraph> <Paragraph position="5"> (Figure 3 columns: personnel, 1st Navy, 2nd Navy.) Notes to accompany Figure 3: (0) Changes in the underlying system since this configuration was created make it impossible to measure some of the numbers in this column accurately, so the numbers dealing with vocabulary are estimates.</Paragraph> <Paragraph position="6"> (1) Records were not kept at the time this configuration was created, but the configuration happened over a period of months.</Paragraph> <Paragraph position="7"> (2) This level of effort includes not just time spent using the Learner but also time required to understand the domain, and to do some testing and revision. 
About 60% of this time was spent using the development version of the Learner.</Paragraph> <Paragraph position="8"> (3) Records were not kept at the time this configuration was done, but it involved many person-months.</Paragraph> <Paragraph position="9"> (4) This is an estimate which includes inflected forms of regular words and words that were acquired directly from database fields.</Paragraph> <Paragraph position="10"> (5) This includes words read from the database and all words directly represented in the vocabulary; it excludes inflected forms of morphologically regular words.</Paragraph> <Paragraph position="11"> (6) This is a rough measure of the semantic complexity of the domain, since it excludes words that are abbreviations or synonyms.</Paragraph> </Section> </Paper>