<?xml version="1.0" standalone="yes"?> <Paper uid="C90-2066"> <Title>WHY HUMAN TRANSLATORS STILL SLEEP IN PEACE? (FOUR ENGINEERING AND LINGUISTIC GAPS IN NLP)</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> WHY HUMAN TRANSLATORS STILL SLEEP IN PEACE? (FOUR ENGINEERING AND LINGUISTIC GAPS IN NLP) </SectionTitle> <Paragraph position="0"/> </Section> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> ABSTRACT </SectionTitle> <Paragraph position="0"> Because they will keep their job quite for a few.</Paragraph> <Paragraph position="1"> This paper has been inspired by a recent editorial in the Financial Times, which gives a discouraging overview of commercial natural language processing systems ('the computer that can sustain a natural language conversation... is unlikely to exist for several decades'). Computational linguists are not so much concerned with applications, but computer scientists have the ultimate objective of building systems that can 'increase the acceptability of computers in everyday situations.' Eventually, linguists as well would profit from a significant breakthrough in natural language processing.</Paragraph> <Paragraph position="2"> This paper is a brief dissertation on four engineering and linguistic issues we believe critical for a more striking success of NLP: extensive acquisition of the semantic lexicon, formal methods for evaluating system performance, development of shell systems for rapid prototyping and customization, and finally a more linguistically motivated approach to word categorization.</Paragraph> </Section> <Section position="3" start_page="0" end_page="383" type="metho"> <SectionTitle> THE ENTANGLED FOREST </SectionTitle> <Paragraph position="0"> In the last decade, formal methods to express syntactic and semantic knowledge (whether in an integrated fashion or not) have proliferated to form an entangled forest.
Newcomers seem to prefer inventing a brand-new method, or at least a brand-new name, rather than trying to make sense of the dozens of existing formalisms (the *-grammars, *-unification grammars, etc.). Semantic languages are relatively fewer, but even fewer are the commonly agreed principles about the type and quality of the language phenomena to be expressed.</Paragraph> <Paragraph position="1"> The perspectives under which linguists and computer scientists proceed in their work also differ: linguists and psychologists are concerned with the nature of human communication, and use the computer as a tool to model very specific, and yet meaningful, aspects of language. To them, any phenomenon is worth looking at, no matter how frequent, because the focus is on humans, not on computers.</Paragraph> <Paragraph position="2"> Computer scientists are interested in building computer programs that can ultimately be useful in some relevant field of social life, such as machine translation, information retrieval, tutoring, etc. In order for an NLP system to be successful, it must cover the majority of the language phenomena that are prominent in a given application. Coverage here is a primary demand, because the focus is on the use of computers, not on the modeling of mind. I believe that failing to state these differences clearly has been a source of misunderstanding and scarce cooperation. Recently Jacobs pointed out (Jacobs 1989) that linguists measure the power of a parser against pathological cases, and this very fact 'has been damaging to natural language processing as a field'.
Linguists may as well complain that the proliferation of NLP papers listing in detail the computational features of 'THE SYSTEM X' and claiming some 5% better performance has been damaging to computational linguistics as a field.</Paragraph> <Paragraph position="3"> The author of this paper does not consider her past (and current) work untouched by these criticisms, but wishes that some more explicit and general re-thinking be shared by the computational linguistics and natural language processing community. This paper was inspired by a recent editorial in the Financial Times (Cookson 1989) that presents an overview of commercial and research systems based on NLP technology. The panorama of commercial systems is quite discouraging: the editorial is strewn with such sentences as 'not yet robust enough', 'their grammatical coverage is modest', 'no computer has the background knowledge to resolve enough... linguistic ambiguities', and concludes: 'the computer that can sustain a natural free-flowing conversation on a subject of your choice is unlikely to exist for several decades.' On the other side, the author highlights several times the importance of this discipline and its possible applications. He also quotes the UK bank's innovation manager David Barrow, who says 'Natural language processing will be a key technology in increasing the acceptability of computers in everyday situations'.</Paragraph> <Paragraph position="4"> Yet natural language processing began to emerge as a discipline around 1950. Progress has certainly been made, but it is not striking with respect to other equally mature disciplines. Why is that? The reader of this paper should be aware by now that he has run across one of those where-are-we-now-and-where-are-we-going kinds of papers; but we hope he will keep following us in a brief walk along the rough pathway of NLP.
But please remember... some (not all) viewpoints expressed hereafter would seem narrow-minded if applied to computational linguistics, but are perfectly reasonable if the perspective is robust NLP.</Paragraph> <Paragraph position="5"> In my view, the major obstacles to a wider adoption of NLP systems can be identified with four engineering and linguistic 'gaps'. The engineering gaps are: 1. Lack of formal evaluation methods (Section 1); 2. Lack of tools and engineering techniques for rapid prototyping and testing of NLP modules (Section 4).</Paragraph> <Paragraph position="6"> The linguistic gaps are: 1. Poor encoding of the semantic lexicon (Section 2); 2. Poorly motivated models of word categorization (Section 3).</Paragraph> <Paragraph position="7"> This paper has two strictly related guideline ideas, which I would like to state at the beginning: 1. Breadth is more important than depth. In evaluating the pros and cons of linguistic and computer methods for NLP we should always keep in mind their breadth. Methods that cannot be applied extensively and systematically are simply uninteresting. It is perfectly reasonable, and in fact very useful (despite what Hans Karlgren thinks about it), to work on sub-languages, provided that the experiments we set up to process such domains are reproducible on any other sub-domain. It is perfectly reasonable to define very fine-grained knowledge representation and manipulation frameworks to express deep</Paragraph> <Paragraph position="8"> language phenomena, provided we can demonstrate that such knowledge can be encoded on an extensive basis. As long as the field of linguistic knowledge representation neglects the related issues of knowledge identification and acquisition, we cannot hope for a breakthrough in NLP.</Paragraph> <Paragraph position="9"> 2. Domain-dependency is not so bad. One of the early errors in AI was the attempt to devise general purpose methods for general purpose problems.
Expert systems have been successful, but they lie at the other extreme. Current AI research seeks a better compromise between generality and knowledge power.</Paragraph> <Paragraph position="10"> Linguistic knowledge is very vast, and a full codification is unrealistic for the time being. I believe that a central issue is to accept the unavoidable reality of domain-dependent linguistic knowledge, and to seek generalizable methods to acquire such knowledge. As discussed in Section 3, I also believe that useful linguistic insights can be gathered from the study of language sub-domains.</Paragraph> <Paragraph position="11"> 1. THE 'TRULY VIABLE' APPROACH Let us maintain our forest-and-path metaphor. Why is it so difficult to get oriented? The cunning reader of technical papers might have noticed a very frequent concluding remark: 'we demonstrated that XYZ is a viable approach to sentence (discourse, anaphora) analysis (generation)'.</Paragraph> <Paragraph position="12"> But what is 'viable'? Other disciplines have developed models and experiments to evaluate a system: one could never claim XYZ viable without a good deal of tables, figures and graphs. Why is it so difficult in the field of NLP? Very few papers have been published on the evaluation of NLP systems. Some well-documented reports on large NLP projects provide such performance figures as accuracy, intelligibility and usability; however, these figures are not uniformly defined and measured. One good example is the Japanese project (Nagao 1988). The evaluation is performed by humans, applying some scoring to the system output (e.g.
translation quality).</Paragraph> <Paragraph position="13"> Other papers provide a list of language phenomena dealt with by their systems, or an excerpt of sentence types the system is able to process.</Paragraph> <Paragraph position="14"> These results give at best some feeling about the real power of a system, but by no means can they be taken as a formal performance measure.</Paragraph> <Paragraph position="15"> Two papers address the problem of performance evaluation in a systematic way: (Guida 1986) and (Read 1988). The approaches are rather different. Guida and Mauri attempt an application of standard performance evaluation methods to the NLP discipline, introducing a formal expression for the performance measure of an NLP system. This is a hard task, as comes out in the last section of their paper, where the formula is applied to a simple system. Nevertheless, we believe this work to be seminal: formal methods are the most suitable for a uniform evaluation of NLP systems. In (Read 1988) a 'sourcebook approach' is pursued. The authors propose a fine-grained cataloguing of language phenomena, to be used as a reference for the evaluation of NLP systems. This method in our view is not in contrast with, but rather complementary to, a formal evaluation. However, the final results of this research are not readily available as yet. A second remark is that in measuring the competence of a system, linguistic issues should be weighed by the 'importance' they have in a given application. It is unrealistic to expect that a system address every possible phenomenon, but it must be able to address those phenomena that are prominent in the application domain.</Paragraph> <Paragraph position="16"> One interesting question is: how do we evaluate the linguistic closure of a sub-language?
Here is a list of measures, which have the interesting (to me) feature of being acquirable with the use of computers: 1. Identification of the sub-language, by a plot of distinct root-form types per corpus size;</Paragraph> <Paragraph position="17"> 2. Identification of contexts, by an analysis of word co-occurrences, and identification of semantic relations, by an analysis of functional words;</Paragraph> <Paragraph position="18"> 3. Measures of complexity, to predict the computational tractability of a corpus.</Paragraph> <Paragraph position="19"> Some of these measures are listed in (Kittredge 1987), e.g. presence of copula, conjunctions, quantifiers, long nominal compounds, etc.</Paragraph> <Paragraph position="20"> Others are suggested in the very interesting studies on readability originated by (Flesch 1946). To our knowledge these methods have never been applied to the study of linguistic closure in NLP, even though they have reached a remarkable precision at measuring the effect of sentence structure and choice of words on language comprehension by humans (and consequently by computers).</Paragraph> </Section> <Section position="4" start_page="383" end_page="383" type="metho"> <SectionTitle> 2. THE WORLD IN A BOX </SectionTitle> <Paragraph position="0"> Language resides in the lexicon: word knowledge is world knowledge. One of the major limitations of current NLP systems is a poor encoding of lexical semantic knowledge: the world fits in a small box.</Paragraph> <Paragraph position="1"> The problem with lexica is twofold. First, there is no shared agreement about the type and quality of the phenomena to be described in a lexicon. In (Evens 1988) three major competing approaches to meaning representation in lexica are listed: relational semantics, structural semantics and componential/feature analysis.
In (Leech 1981) 7 types of meaning are distinguished.</Paragraph> <Paragraph position="2"> Relational semantics, except for the type and number of conceptual relations (or cases) to be used, shows some uniformity among its supporters as regards the structure of the lexicon and the way this information is used to perform semantic analysis. The other approaches highlight much deeper phenomena than the semantic relations between the words in a sentence, but it is a hard task to induce from the literature any firm principle or shared agreement on the type of information to be represented.</Paragraph> <Paragraph position="3"> In (Velardi forthcoming) a more detailed cataloguing of the meaning types found in the NLP literature is attempted. It is shown that all types of semantic knowledge are in principle useful for the purpose of language understanding applications, but cannot be acquired on an extensive basis, because the primary sources of such knowledge are linguists and psycholinguistic experiments.</Paragraph> <Paragraph position="4"> Again, relational semantics is somehow more intuitive than other methods and it is easier to acquire, because it can be induced using the evidence provided by texts rather than deduced from pre-defined conceptual primitives. But even then, acquiring more than a few hundred word definitions becomes a prohibitive task because of consistency, completeness, and boredom problems.</Paragraph> <Paragraph position="5"> Some work on computer-aided acquisition of lexica has been presented (e.g.</Paragraph> <Paragraph position="7"> 1987); during IJCAI 1989, a workshop was held on this topic (Zernik 1989b). All the above works use corpora or on-line dictionaries as a source of semantic learning, but the methodologies employed to manipulate these texts are very different and still inadequate to the task.
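The corpus-based route to lexical acquisition mentioned above typically starts from word co-occurrence statistics: words that repeatedly appear in the same contexts are candidates for related senses or semantic relations. A minimal sketch of such a context analysis follows; the toy corpus, the stopword list and the window size are illustrative assumptions, not taken from any of the cited systems:

```python
from collections import Counter

# Toy corpus standing in for a domain sub-language (an assumption:
# a real experiment would use a large domain corpus).
corpus = [
    "the disk stores the file",
    "the file is copied to the disk",
    "the user deletes the file from the disk",
    "the user opens the file",
]

STOPWORDS = {"the", "is", "to", "from", "a"}

def cooccurrences(sentences, window=4):
    """Count how often two content words appear within `window`
    tokens of each other -- a crude stand-in for context analysis."""
    pairs = Counter()
    for s in sentences:
        words = [w for w in s.split() if w not in STOPWORDS]
        for i, w in enumerate(words):
            for v in words[i + 1 : i + window]:
                # store each pair in a canonical (sorted) order
                pairs[tuple(sorted((w, v)))] += 1
    return pairs

pairs = cooccurrences(corpus)
print(pairs.most_common(3))
```

Even on this toy input the strongest association ('disk' with 'file') emerges from frequency alone, which is the intuition behind inducing lexical relations from texts rather than deducing them from predefined primitives.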
Personally, we believe corpora to be a more adequate source of information than dictionaries:</Paragraph> <Paragraph position="8"> on-line dictionaries are not easily available to the scientific community; dictionaries mostly include taxonomic information, which is hard to extract because of circularity and consistency problems, and because there is no clear method to extract and describe multiple senses in the absence of examples; the information is not uniform within a given dictionary, and may be very different from dictionary to dictionary depending upon their purpose (e.g. etymological dictionaries, style dictionaries, etc.); most of all, the information in dictionaries is very general, whereas NLP often requires domain-specific categories and definitions.</Paragraph> <Paragraph position="9"> Corpora provide rich examples of word uses, including idioms and metonymies. It is possible to identify different senses of a word by a context analysis (Velardi 1989a) (Jacobs 1988). In addition, if the corpus used for lexical acquisition comes from the application domain, one can derive a catalogue of relevant language issues. In any case, research on both corpora and dictionaries is very promising, and hopefully will provide in the near future more insight and experimental support to meaning theories.</Paragraph> </Section> <Section position="5" start_page="383" end_page="383" type="metho"> <SectionTitle> 3. THE &quot;IS A&quot; DILEMMA </SectionTitle> <Paragraph position="0"> The core of any meaning representation method is a conceptual hierarchy, the IS_A hierarchy. People with experience of this know how time-consuming, and unrewarding, the task of arranging words in a plausible hierarchy is. The more concepts you put in, the more entangled the hierarchy becomes, and nobody is ever fully satisfied.
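The inheritance machinery at stake here is easy to state in code: if Z is-a X, and X carries feature Y, then Z carries Y. A minimal sketch follows; the concept names and features are illustrative, not drawn from any cited system:

```python
# A minimal IS_A hierarchy with property inheritance: a node's
# features are its own plus everything inherited along is-a links.
class Concept:
    def __init__(self, name, parent=None, **features):
        self.name = name
        self.parent = parent      # the is-a link (None at the root)
        self.own = features       # features asserted on this node

    def feature(self, key):
        """Look up a feature locally, then climb the is-a chain."""
        node = self
        while node is not None:
            if key in node.own:
                return node.own[key]
            node = node.parent
        return None

physical_object = Concept("PHYSICAL_OBJECT", movable=True)
disk = Concept("DISK", parent=physical_object, stores="data")

# DISK inherits `movable` from PHYSICAL_OBJECT via the is-a link.
print(disk.feature("movable"), disk.feature("stores"))
```

This is exactly the objectivist scheme criticized below: membership and feature sharing are all the hierarchy can express, which is what makes it so hard to fit real word categories into it.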
In (Nirenburg 1987) a system is presented to assist humans in entering and maintaining the consistency of a type hierarchy.</Paragraph> <Paragraph position="1"> But this does not alleviate the inherent complexity of grouping concepts into classes.</Paragraph> <Paragraph position="2"> One could maintain that type hierarchies in NLP systems should not mimic human conceptual primitives, but rather are a computer method to express semantic knowledge in a compact form and to simulate some very partial reasoning activity. Even under this conservative perspective, it is quite natural for the human hierarchy builder to try to make sense of his own taxonomic activity (and get confused) rather than stay with what the specific application requires. Why not introduce such categories as MENTAL_ACT and SOCIAL_PHENOMENON even though the texts to be processed deal only with files and disks? Several institutions have devoted large efforts to the definition of IS_A hierarchies for NLP. Some of these hierarchies are claimed to be 'general-purpose': to me, this claim is a minus rather than a plus.</Paragraph> <Paragraph position="3"> NLP systems have often been presented as models of human activities. Now, our taxonomic activity is precisely one good example of an activity that works very differently in humans than in computers. In computers, hierarchies are used to assert that, if X has the feature Y, and Z is-a X, then Z has the feature Y. Things are in the same category iff they have certain properties in common. This is an objectivist view of categorization that several studies have proved inadequate to model human behavior. Objectivism has been argued against in experimental studies by psychologists, anthropologists, and linguists. In his beautiful book (Lakoff 1987), Lakoff lists several phenomena relevant to the activity of categorization, such as family resemblance, centrality, generativity, chaining, conceptual and functional embodiment, etc.
Only the first of these phenomena has to do with the classical theory of property inheritance. But Lakoff shows that the elements of a category can be related without sharing any common property. The title of his book, 'Women, Fire and Dangerous Things', is an example of apparently unrelated members of a single category in an aboriginal language of Australia. The categorization principle that relates these elements is called by Lakoff the domain-of-experience principle. Woman and fire are associated in myth. Fighting and fighting implements are in the same domain of experience as fire, and hence are in the same class.</Paragraph> <Paragraph position="4"> Birds also are in the same class, because they are believed to be the spirits of dead human females.</Paragraph> <Paragraph position="5"> Other elements are 'called' into a class by a chaining principle: element x calls element y, which calls z, etc.</Paragraph> <Paragraph position="6"> It is outside the scope of this paper to summarize, or even list, the findings of Lakoff and other researchers on human taxonomic activity. However, the literature provides evidence and matter for thought concerning the inadequacy of property inheritance as a method to structure linguistic knowledge in NLP systems.</Paragraph> <Paragraph position="7"> But even if we stay with property inheritance, we should at least abandon the idea of seeking general-purpose taxonomies. Again, corpora are a useful basis to study categorization in sub-worlds. Categories in dictionaries are the result of a conceptualization effort by a linguist. Corpora instead are a 'naive' example of a culturally homogeneous group of people, that draws largely unconsciously on its knowledge of the use, and meaning, of words.
Corpora are more interesting than dictionaries for the study of categorization, just like tribes are more interesting than 'civilized' cultures to anthropologists.</Paragraph> <Paragraph position="8"> </Paragraph> </Section> <Section position="6" start_page="383" end_page="383" type="metho"> <SectionTitle> 4. GET ACCUSTOMED TO CUSTOMIZATION </SectionTitle> <Paragraph position="0"> The main obstacles to a wider adoption of NLP systems in such activities as information retrieval and automatic translation are reliability and customization. These two issues are clearly related: NLP systems make errors not because the programs have bugs, but because their knowledge bases are very limited. To cope with poor knowledge encoding, ad-hoc techniques are widely adopted, even though the use of ad-hoc techniques is not advertised in papers, for obvious reasons. Ad-hoc techniques are the main cause of long customization times when switching from one application domain to a slightly different one.</Paragraph> <Paragraph position="1"> Customization and reliability are in turn related to what we have said so far: we can't predict the time spent on customization, as happens in database systems, because methods for knowledge acquisition and knowledge structuring do not exist or are far from being assessed; and we can't evaluate reliability, because there are no formal evaluation methods for NLP systems.</Paragraph> <Paragraph position="2"> Again, we come to the same problems. But if we must forcefully abandon the idea of general purpose language processors, at least we should equip ourselves with shell systems and human-computer interfaces that can assist humans in the creation, testing and maintenance of all the data-entry activities implied by NLP systems. This paper has shown that in semantics there are as yet no assessed theories. In syntax we have too many, but they are not systematically tested. Shells and interfaces are useful for: 1.
performing a wider experimentation of different theories; 2. making the data-entry activity by humans more constrained, or at least supervised; 3. rendering the customization activity to some extent forecastable; 4. ensuring consistency with the linguistic principles embraced by the system designers.</Paragraph> <Paragraph position="3"> In the field of Expert Systems, shells began to appear when the expert system technology was well assessed. Maybe shells and interfaces have been disregarded so far by the computational linguistics community because they are felt immature, given the state of the art, or just because we are so much attached to the idea of encoding the world... However, several activities concerned with NLP systems can be computerized or computer-assisted. We already mentioned the work by Nirenburg et al. to assist the creation of a concept ontology. A special extension of this system is under experimentation to guide the acquisition of a relational lexicon (Nirenburg 1989). Other systems have been presented for the prototyping and testing of syntactic parsers (Briscoe 1987) (Boguraev 1988) (Marotta 1990).</Paragraph> </Section> <Section position="7" start_page="383" end_page="383" type="metho"> <SectionTitle> 5. I DON'T HAVE THE READY RECIPE </SectionTitle> <Paragraph position="0"> I know you knew it! Where-are-we-now papers never offer a panacea. This is a position paper: it did not present solutions; rather, it pointed to problems and, where available, to current promising research (rather immodestly, some is our own). The following is a summary list of what the author considers her own guidelines for future work: Performance evaluation: never say a method is 'viable' if you can't prove it formally.</Paragraph> <Paragraph position="1"> Lexical semantics: don't seek the 'real meaning' of things. Use the evidence provided by on-line corpora as a source, and test-bed, of lexical acquisition methods.</Paragraph> <Paragraph position="2">
Categorization: property inheritance is inadequate. Is it possible to implement on a computer some of the observed human mechanisms of categorization? Customization: general purpose systems are unrealistic. Build shell and interface systems to allow for a faster and well-assisted customization activity.</Paragraph> </Section> </Paper>