<?xml version="1.0" standalone="yes"?> <Paper uid="P81-1011"> <Title>Natural Language Processing: The REL System as</Title> <Section position="3" start_page="0" end_page="39" type="intro"> <SectionTitle> NLI EVALUATIONS </SectionTitle> <Paragraph position="0"> A. METHODOLOGY AND SOME RESULTS It had been widely taken for granted some time ago that an NLI is as good as its grammar, and a grammar is as good as it is extensive. The specific needs of users, the requirements of special tasks and the like took a back seat. The nature of human discourse was yet to be explored. Happily, we have been in a different situation for some time. When the REL [5, 5, 7] system was getting into a reasonably sturdy shape with respect to speed and bugs, I started planning experiments to test it. There was important literature about discourse, especially in sociology, such as the work of Schegloff.</Paragraph> <Paragraph position="1"> It was thus clear that successful NLI experiments had to be based on knowledge of human discourse. It was also clear that that was the way to make the interface more natural. This assumption has already been fruitful: the NL interface in POL [9], a successor to REL, has already been extensively improved as a result of the REL-related experiments.</Paragraph> <Paragraph position="2"> Experiments were made in three modes: in addition to face-to-face (F-F) and human-to-computer, terminal-to-terminal (T-T) communication was examined, since at present that is the only practical mode of accessing the computer. Through early 1980, over 80 subjects, 80,000 words, and over 50 hours were analyzed in great detail. In the fall of 1980, another 13 subjects were tested in the computational mode only, adding approximately 20 hours. From the start, the experiments were encouraging, although limited to two modes: F-F and T-T.
Interactions not only showed a great deal of structure but extensive similarities in both modes, the most important being the constancy of the percentage of words in sentences (about 70%); the length of sentences (about 7 words); the existence of fragments (70% of messages in F-F and 50% in T-T containing them); and phatics (10% of total for F-F and 5% for T-T). Thus similarities between the modes were a candidate for consideration in experiments in the computational mode, the T-T mode being seemingly quite far removed from natural F-F. The sentence having historically been the unit of analysis (and since phatics were considered of lesser importance from the computational view, although of great interest in general), my attention turned to fragments. REL allowed for three non-sentence type structures: &quot;NP?&quot; (including number parsed into NP); &quot;all/none or number&quot; answers; and definitions introducible by the user which make it possible to include individual knowledge and terminology.</Paragraph> <Paragraph position="3"> The analysis of F-F and T-T protocols, however, showed the existence of other fragment categories, finally analyzed into a dozen categories (see [8]). Since they constitute a considerable amount of F-F conversations and even T-T protocols, they clearly had to be watched for in computational experiments.</Paragraph> <Paragraph position="4"> The experiments for actually observing user-system interaction were conducted in the winter term of 1979/80 and produced 21 protocols, the analysis of which was compared with results of eight F-F and four T-T experiments. Another 13 computational experiments done in the fall confirmed the results of the earlier ones. The task in all three modes was a real one: loading cargo onto a ship, the data coming from the actual environment of loading U.S. Navy ships by a group in San Diego, California.
In the F-F and T-T experiments, two persons were involved: one given cargo items to be loaded, the other information about decks (details in [8]). In the computational mode (H-C) the ship data was in the computer and the list of cargo to be loaded was handed to the subjects, all with Caltech background. Details being available elsewhere and space limited here, only some major results are given here. Table 1 shows the comparison of the three modes.</Paragraph> <Paragraph position="5"> As can be seen, several statistics show similarities: sentence length, message length, fragment length, percentage of words in sentences and fragments. The closeness of the average of messages in T-T and parsed and nonparsed inputs in H-C is striking.</Paragraph> <Paragraph position="6"> Table 2 (the meaning of abbreviations is given below the table) deals with fragments. It is mostly self-explanatory, as is the absence of definitions from F-F and T-T (although some abbreviations used there fall in this category) and the absence of some other categories from T-T and H-C. At least two comments, however, are necessary. The surprisingly low use of terse questions in H-C may be accounted for by the tendency toward a formal style in computational interaction. The definitions used were often of quite complex character, although far fewer than could be hoped for, due apparently to lack of familiarity with this capability.</Paragraph> <Paragraph position="7"> The complex character of definitions undoubtedly had some effect on the length of sentences in the H-C mode.</Paragraph> <Paragraph position="8"> the other speaker's string. Often an NP, but it may be an elliptical structure of various forms.</Paragraph> <Paragraph position="9"> ADD (Added Information): An elliptical structure, often NP, used to clarify or complete a previous utterance, often one's own, e.g., &quot;It doesn't say anything here about weight, or breaking things down.
Except for crushables.&quot;, &quot;It's smaller.</Paragraph> <Paragraph position="10"> 36&quot;x20&quot;x17&quot;.&quot; Spelling out words was included here.</Paragraph> <Paragraph position="11"> CORR (Correction): This may be done by either speaker.</Paragraph> <Paragraph position="12"> If done by the same speaker it is related to false start, but semantic considerations suggest a correction, e.g., &quot;Those are 30, uh, 48 length by 40 width by 14 height.&quot; COMP (Completion): Completion of the other speaker's utterance, distinguished from interruption by the cooperative nature of the utterance, e.g., &quot;A: I've got a lot of...I've got B: two pages. A: Yeah.&quot; SELF (Talking to Oneself): Mutterings, even to the point of undecipherability, not intended for the other person.</Paragraph> <Paragraph position="13"> TR (Terse Reply): An elliptical reply, often NP, e.g., &quot;No.&quot;, &quot;Probably meters.&quot;, &quot;50 and 7.62.&quot; TQ (Terse Question): An elliptical question, often NP, e.g., &quot;Why?&quot;, &quot;How about pyrotechnics?&quot;, &quot;Which ones?&quot; TI (Terse Information): A rather elusive category, neither question, reply nor command, an elliptical statement but one often requiring an action.</Paragraph> <Paragraph position="14"> FS (False Start): These are also abandoned utterances, but immediately followed by usually syntactically and semantically related ones, e.g., &quot;They may, they may be identical classes.&quot;, &quot;Well, the height, the next largest height I've got is 34.&quot; TRUN (Truncated): An incomplete utterance, voluntarily</Paragraph> <Paragraph position="16"> name is borrowed from Malinowski's term &quot;phatic communion&quot; with which he referred to those vocal utterances that serve to establish social relations rather than the direct purpose of communication.</Paragraph> <Paragraph position="17"> This term has been broadened to include all fragments which help keep the channel of communication open, such as
&quot;Well&quot;, &quot;Wait&quot;, and even &quot;You turkey&quot;. Two subcategories of phatics are:</Paragraph> </Section> </Paper>