File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/81/p81-1025_metho.xml
Size: 10,573 bytes
Last Modified: 2025-10-06 14:11:25
<?xml version="1.0" standalone="yes"?> <Paper uid="P81-1025"> <Title>PERSPECTIVES ON PARSING ISSUES</Title>
<Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> IS SIMULATION OF HUMAN PROCESSING IMPORTANT? </SectionTitle>
<Paragraph position="0"> Yes, very much so, even if all you are interested in is a good computer program. The reason why was neatly captured in Principles of Artificial Intelligence: &quot;language has evolved as a communication medium between intelligent beings&quot; (Nilsson, p. 2). That is, natural language usage depends on the fact that certain things can be left ambiguous, left vague, or just left out, because the hearer knows almost as much as the speaker.</Paragraph>
<Paragraph position="1"> Natural language has been finely tuned to the communicative needs of human beings. We may have to adapt to the limitations of our ears and our vocal cords, but we have otherwise been the masters of our language. This is true even if there is an innate universal grammar (which I don't believe in). A universal grammar applies few constraints to our use of ellipsis, ambiguity, anaphora, and all the other aspects of language that make language an efficient means for information transfer, and a pain for the programmer.</Paragraph>
<Paragraph position="2"> Because language has been fitted to what we do best, I believe it is improbable that there exist processes for dealing with it very unlike the ones people use. Therefore, while I have no intention of trying to model reaction-time data points, I do find human behavior important for two kinds of information. First, what do people do well, how do they do it, and how does language use depend on it? Second, what do people do poorly, and how does language use get around it? The question &quot;How can we know what human processing is really like?&quot; is a non-issue. We don't have to know what human processing is really like. But if people can understand texts that leave out crucial background facts, then our programs have to be able to infer those facts.</Paragraph>
<Paragraph position="3"> If people have trouble understanding information phrased in certain ways, then our programs have to phrase it in ways they can understand. At some level of description, our programs will have to be &quot;doing what people do,&quot; i.e., filling in certain kinds of blanks, leaving out certain kinds of redundancies, and so on. But there is no reason for computational linguists to worry about how deeply their programs correspond to human processes.</Paragraph> </Section>
<Section position="4" start_page="0" end_page="105" type="metho"> <SectionTitle> WILL PARALLEL PROCESSING CHANGE THINGS? </SectionTitle>
<Paragraph position="0"> People have been predicting (and waiting for) great benefits from parallelism for some time. Personally, I believe that most of the benefits will come in the area of interpretation, where large-scale memory searches, such as the ones Scott Fahlman has been worrying about, are involved.</Paragraph>
<Paragraph position="1"> And, if anything, improvements in the use of semantics will decrease the attractiveness of syntactic parsing.</Paragraph>
<Paragraph position="2"> But I also think that there are not that many gains to be had from parallel processing. Hash codings, discrimination trees, and so on, already yield reasonably constant speeds for looking up data. It is an inconvenience to have to deal with such things, but not an insurmountable obstacle. Our real problems at the moment are how to get our systems to make decisions, such as &quot;Is the question 'How many times has John asked you for money?' rhetorical or not?&quot; We are limited not by the number of processors, but by not knowing how to do the job.</Paragraph> </Section>
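The point just made about lookup speed can be sketched concretely. What follows is a minimal illustration, assuming Python, with a built-in dict standing in for a hash coding and a nested-dict trie standing in for a discrimination tree; the tiny lexicon is invented for the example and is not from any system discussed here.

    # Hypothetical three-word lexicon mapping words to categories.
    lexicon = {"swerve": "verb", "bridge": "noun", "abutment": "noun"}

    def hash_lookup(word):
        # Hash coding: average-case O(1); cost does not grow with lexicon size.
        return lexicon.get(word)

    def make_trie(entries):
        # Discrimination tree: branch on one character at a time.
        root = {}
        for word, category in entries.items():
            node = root
            for ch in word:
                node = node.setdefault(ch, {})
            node["$"] = category  # "$" marks end-of-word and stores the category
        return root

    def trie_lookup(trie, word):
        # O(len(word)), independent of how many entries the tree holds.
        node = trie
        for ch in word:
            if ch not in node:
                return None
            node = node[ch]
        return node.get("$")

    trie = make_trie(lexicon)
    assert hash_lookup("abutment") == trie_lookup(trie, "abutment") == "noun"

Either way the cost of a lookup stays essentially flat as the lexicon grows, which is why adding processors buys little for this part of the job.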
<Section position="5" start_page="105" end_page="105" type="metho"> <SectionTitle> THE LINGUISTIC PERSPECTIVE: HAVE OUR TOOLS AFFECTED US? </SectionTitle>
<Paragraph position="0"> Yes, and adversely. To partially contradict my statements in the last paragraph, we've been overly concerned with how to do things with existing hardware and software. And we've been too impressed by the success computer science has had with syntax-driven compilation of programming languages. It is certainly true that work on grammars, parsers, code generators, and so on has changed compiler generation from massive multi-man-year endeavors to student course projects. If compiler technology has benefited so much from syntactic parsers, why can't computational linguistics? The problem here is that the technology has not done what people think it has. It has allowed us to develop modern, well-structured, task-oriented languages, but it has not given us natural ones. Anyone who has had to teach an introductory programming course knows that.</Paragraph>
<Paragraph position="1"> High-level languages, though easier to learn than machine language, are very different from human languages, such as English or Chinese.</Paragraph>
<Paragraph position="2"> Programming languages, to readjust Nilsson's quote, are developed for communication between morons. All the useful features of language, such as ellipsis and ambiguity, have to be eliminated in order to use the technology of syntax-driven parsing. Compilers do not point the way for computational linguistics. They show instead what we get if we restrict ourselves to simplistic methods.</Paragraph> </Section>
<Section position="6" start_page="105" end_page="105" type="metho"> <SectionTitle> DO WE PARSE CONTEXT-FREELY? </SectionTitle>
<Paragraph position="0"> My working assumption is that the syntactic knowledge used in comprehension is at most context-free and probably a lot less, because of memory limitations.</Paragraph>
<Paragraph position="1"> This is mostly a result of semantic heuristics taking over when constructions become too complex for our cognitive chunking capacities. But this is not a critical assumption for me.</Paragraph>
<Paragraph position="2"> INTERACTIONS Since I don't believe in the pure grammatical approach, I have to replace this last set of questions with questions about the relationship between our knowledge (linguistic and otherwise) and the procedures for applying it. Fortunately, the questions still make sense after this substitution.</Paragraph> </Section>
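A minimal sketch may make the two preceding points concrete: context-free machinery is cheap to build, and natural language, unlike a programming language, happily leaves it with more than one answer. The toy chart parser below, written in Python for illustration, counts the parses a small context-free grammar in Chomsky normal form assigns to a phrase; the grammar and the phrase "old men and women" are invented for this example and appear nowhere in the paper.

    from collections import defaultdict

    # Toy grammar in Chomsky normal form (all rules binary or lexical).
    binary = [("NP", "ADJ", "NP"),    # NP -> ADJ NP   ("old" modifies what follows)
              ("NP", "NP", "CNP"),    # NP -> NP CNP   (coordination)
              ("CNP", "CONJ", "NP")]  # CNP -> CONJ NP
    lexical = {"old": ["ADJ"], "and": ["CONJ"], "men": ["NP"], "women": ["NP"]}

    def count_parses(words, root="NP"):
        n = len(words)
        # chart[i][j][cat] = number of ways cat derives words[i:j] (CYK-style).
        chart = [[defaultdict(int) for _ in range(n + 1)] for _ in range(n + 1)]
        for i, w in enumerate(words):
            for cat in lexical.get(w, []):
                chart[i][i + 1][cat] += 1
        for span in range(2, n + 1):
            for i in range(n - span + 1):
                j = i + span
                for k in range(i + 1, j):
                    for parent, left, right in binary:
                        chart[i][j][parent] += chart[i][k][left] * chart[k][j][right]
        return chart[0][n][root]

    print(count_parses("old men and women".split()))  # prints 2: two readings

A compiler writer would redesign the grammar until every input had exactly one parse; natural language leaves the count at two and lets semantics pick the reading, which is just the division of labor argued for above.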
<Section position="7" start_page="105" end_page="106" type="metho"> <SectionTitle> DO OUR ALGORITHMS AFFECT OUR KNOWLEDGE STRUCTURES? </SectionTitle>
<Paragraph position="0"> Of course. In fact, it is often hard to decide whether some feature of a system is a knowledge structure or a procedural factor. For example, is linear search a result of data structures or procedure designs? CAN WE TEST ALGORITHMS/KNOWLEDGE STRUCTURES SEPARATELY? We do indeed try experiments based on the shape of knowledge structures, independently of how they are used (but I think that most such experiments have been inconclusive). I'm not sure what it would mean, however, for a procedure to be validated independently of the knowledge structures it works with, since until the knowledge structures were right, you couldn't tell whether the procedure was doing the right thing or not.</Paragraph>
<Paragraph position="1"> WHY DO WE SEPARATE RECOGNITION AND PRODUCTION? If I were trying to deal with this question on grammatical grounds, I wouldn't know what it meant.</Paragraph>
<Paragraph position="2"> Grammars are not processes and hence have no direction.</Paragraph>
<Paragraph position="3"> They are abstract characterizations of the set of well-formed strings. From certain classes of grammars one can mechanically build recognizers and random generators, as sketched below. But such machines are not the grammars, and a recognizer is manifestly not the same machine as a generator, even though the same grammar may underlie both.</Paragraph>
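Here is a minimal illustration of that point, with Python assumed and a toy grammar invented for the purpose: one grammar table, two mechanically built machines.

    import random

    GRAMMAR = {
        "S":  [["NP", "VP"]],
        "NP": [["the", "N"]],
        "VP": [["V", "NP"]],
        "N":  [["car"], ["bridge"], ["abutment"]],
        "V":  [["struck"], ["saw"]],
    }

    def generate(symbol="S"):
        # Random generator: expand symbols top-down, choosing rules at random.
        if symbol not in GRAMMAR:                 # terminal word
            return [symbol]
        rule = random.choice(GRAMMAR[symbol])     # pick one expansion
        return [w for part in rule for w in generate(part)]

    def recognize(words, symbols=("S",)):
        # Recognizer: can this symbol sequence derive exactly these words?
        if not symbols:
            return not words
        first, rest = symbols[0], symbols[1:]
        if first not in GRAMMAR:                  # terminal: must match next word
            return bool(words) and words[0] == first and recognize(words[1:], rest)
        return any(recognize(words, tuple(rule) + rest) for rule in GRAMMAR[first])

    sentence = generate()   # e.g. ['the', 'car', 'struck', 'the', 'abutment']
    print(" ".join(sentence), "->", recognize(sentence))   # always True

Both functions consult the same GRAMMAR table, but neither function is that table; direction belongs to the machines, not to the grammar.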
<Paragraph position="4"> Suppose we rephrase the question as &quot;Why do we have separate knowledge structures for interpretation and production?&quot; This presupposes that there are separate knowledge structures, and in our current systems this is only partially true.</Paragraph>
<Paragraph position="5"> Interpretation and production programs abound in ad hoc procedures that share very little near the language end. The interpreters are full of methods for guessing at meanings, filling in the blanks, predicting likely follow-ups, and so on. The generators are full of methods for eliminating contextual items, picking appropriate descriptors, choosing pronouns, and so on. Each has a very different set of problems to deal with. On the other hand, our interpreters and generators do share what we think is the important stuff, the world knowledge, without which all the other processing wouldn't be worth a partridge in a parse tree. The world knowledge says what makes sense in understanding and what is important to talk about.</Paragraph>
<Paragraph position="6"> Part of the separation of interpretation and generation occurs when the programs for each are developed by different people. This results in unrealistic systems that write what they can't read and read what they can't write. Someday we'll have a good model of how knowledge the interpreter gains about a new word is converted into knowledge the generator can use to validly pick that word in production. This will have to account for how we can interpret words without being ready to use them.</Paragraph>
<Paragraph position="7"> For example, from a sentence like &quot;The car swerved off the road and struck a bridge abutment,&quot; we can infer that an abutment is a noun describing some kind of outdoor physical object, attachable to a bridge. This would be enough for interpretation, but obviously the generator will need to know more about what an abutment is before it could confidently say &quot;Oh, look at the cute abutment!&quot; A final point on sharing. There are two standard arguments for sharing at least grammatical information. One is to save space, and the other is to maintain consistency. Without claiming that sharing doesn't occur, I would like to point out that both arguments are very weak. First, there is really not a lot of grammatical knowledge, compared with all the other knowledge we have about the world, so not that much space would be saved if sharing occurred. Second, if the generator derives its linguistic knowledge from the parser's data base, then we'll have as much consistency as we could measure in people anyway.</Paragraph> </Section> </Paper>