<?xml version="1.0" standalone="yes"?> <Paper uid="P84-1026"> <Title>SYNTACTIC AND SEMANTIC PARSABILITY</Title> <Section position="3" start_page="0" end_page="112" type="intro"> <SectionTitle> I. INTRODUCTION </SectionTitle> <Paragraph position="0"> Parsing as standardly defined is a purely syntactic matter. Dictionaries describe parsing as analysing a sentence into its elements, or exhibiting the parts of speech composing the sentence and their relation to each other in terms of government and agreement. But in practice, as soon as parsing a natural language (NL) is under discussion, people ask for much more than that. Let us distinguish three kinds of algorithm operating on strings of words:

recognition
output: a decision concerning whether the string is a member of the language or not

parsing
output: a syntactic analysis of the string (or an error message if the string is not in the language)

translation
output: a translation (or set of translations) of the string into some language of semantic representation (or an error message if the string is not in the language)

Much potential confusion will be avoided if we are careful to use these terms as defined. However, further refinement is needed. What constitutes a &quot;syntactic analysis of the string&quot; in the definition of parsing? In applications development work and when modeling the whole of the native speaker's knowledge of the relevant part of the language, we want ambiguous sentences to be represented as such, and we want Time flies like an arrow to be mapped onto a whole list of different structures. For rapid access to a database or other back-end system in an actual application, or for modeling a speaker's performance in a conversational context, we will prefer a program that yields one syntactic description in response to a given string presentation. 
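The three-way distinction can be made concrete in a small sketch. Everything here is a hypothetical illustration (the toy language, its analyses, and the function names are not from the paper); the point is only that the three kinds of algorithm differ in what they return for the same input string:

```python
# Illustrative toy language of two "sentences", each paired with a
# syntactic analysis and a semantic representation (both invented).
LANGUAGE = {
    "a b": ("(S (NP a) (VP b))", "PRED(a, b)"),
    "b a": ("(S (NP b) (VP a))", "PRED(b, a)"),
}

def recognize(s):
    """Recognition: just a membership decision."""
    return s in LANGUAGE

def parse(s):
    """Parsing: a syntactic analysis, or an error message."""
    return LANGUAGE[s][0] if s in LANGUAGE else "error: not in language"

def translate(s):
    """Translation: a semantic representation, or an error message."""
    return LANGUAGE[s][1] if s in LANGUAGE else "error: not in language"
```

For the string "a b" the recognizer returns a bare truth value, the parser returns the bracketed structure, and the translator returns the semantic formula; for a nonsentence, the latter two return an error message, as in the definitions above.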
Thus we need to refer to two kinds of algorithm:

all-paths parser
output: a list of all structural descriptions of the string that the grammar defines (or an error message if the string is not in the language)

one-path parser
output: one structural description that the grammar defines for the string (or an error message if the string is not in the language)

By analogy, we will occasionally want to talk of all-paths or one-path recognizers and translators as well.</Paragraph> <Paragraph position="1"> There is a crucial connection between the theory of parsing and the theory of languages. There is no parsing without a definition of the language to be parsed. This should be clear enough from the literature on the definition and parsing of programming languages, but for some reason it is occasionally denied in the context of the much larger and richer multi-purpose languages spoken by humans. I frankly cannot discern a sensible interpretation of the claims made by some artificial intelligence researchers about parsing a NL without having a defined syntax for it. Assume that some program P produces finite, meaningful responses to sentences from some NL L over some terminal vocabulary T, producing error messages of some sort in response to nonsentences. It seems to me that automatically we have a generative grammar for L. Moreover, since L is clearly recursive, we can even enumerate the sentences of L in canonical order. 
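Such a canonical-order enumerator can be sketched as follows. The grammaticality tester standing in for the program P is a toy assumption (here it accepts strings of the form a^n b^n), as is every name in the sketch:

```python
from itertools import count, product

def enumerate_language(terminals, is_grammatical, max_length=None):
    """Enumerate a recursive language in canonical order: strings over
    the terminal vocabulary by increasing length, alphabetically within
    each length, keeping those the tester accepts. `is_grammatical`
    plays the role of the program P in the text."""
    lengths = count(1) if max_length is None else range(1, max_length + 1)
    for n in lengths:
        for words in product(sorted(terminals), repeat=n):
            sentence = " ".join(words)
            if is_grammatical(sentence):
                yield sentence

def toy_p(s):
    """Toy stand-in for P: accept a^n b^n, n >= 1."""
    words = s.split()
    n = len(words) // 2
    return (n > 0 and len(words) == 2 * n
            and words[:n] == ["a"] * n and words[n:] == ["b"] * n)

first_three = []
for sent in enumerate_language({"a", "b"}, toy_p):
    first_three.append(sent)
    if len(first_three) == 3:
        break
# first_three == ["a b", "a a b b", "a a a b b b"]
```

Note that the generator never terminates on its own when the language is infinite; it is a canonical enumeration, not a decision procedure.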
One algorithm to do this simply enumerates the strings over the terminal vocabulary in order of increasing length and in alphabetical order within a given string-length, and for each one, tests it for grammaticality using P, and adds it to the output if no error message is returned.</Paragraph> <Paragraph position="2"> Given that parsability is thus connected to definability, it has become standard not only for parser-designers to pay attention to the grammar for the language they are trying to parse, but also for linguists to give some thought to the parsability claims entailed by their linguistic theory. This is all to the good, since it would hardly be sensible for the study of NL's to proceed for ever in isolation from the study of ways in which they can be used by finite organisms.</Paragraph> <Paragraph position="3"> Since 1978, following suggestions by Stanley Peters, Aravind Joshi, and others, developed most notably in the work of Gerald Gazdar, there has been a strong resurgence of the idea that context-free phrase structure grammars could be used for the description of NL's. A significant motivation for the original suggestions was the existence of already known high-efficiency algorithms (recognition in deterministic time proportional to the cube of the string length) for recognizing and parsing context-free languages (CFL's).</Paragraph> <Paragraph position="4"> This was not, however, the motivation for the interest that significant numbers of linguists began to show in context-free phrase structure grammars (CF-PSG's) from early 1979. Their motivation was in nearly all cases an interest sparked by the elegant solutions to purely linguistic problems that Gazdar and others began to put forward in various articles, initially unpublished working papers. 
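The cubic-time recognition result for CFL's mentioned above is exemplified by tabular algorithms such as CYK. The following is a minimal sketch under stated assumptions: the grammar must be in Chomsky normal form, and the toy grammar and all names here are illustrative, not from the paper:

```python
def cyk_recognize(words, lexical, binary, start="S"):
    """CYK recognition for a CNF grammar: time proportional to the cube
    of the string length (times a grammar-size factor). `lexical` maps
    each word to the nonterminals deriving it; `binary` maps each
    nonterminal to its (B, C) right-hand sides."""
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the nonterminals deriving words[i..j] inclusive
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, w in enumerate(words):
        table[i][i] = set(lexical.get(w, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):  # try every split point
                for a, rhss in binary.items():
                    for b, c in rhss:
                        if b in table[i][k] and c in table[k + 1][j]:
                            table[i][j].add(a)
    return start in table[0][n - 1]

# Toy CNF grammar, purely illustrative: S -> NP VP, VP -> V NP;
# "flies" is ambiguously a noun or a verb.
LEXICAL = {"time": {"NP"}, "flies": {"NP", "V"}, "arrows": {"NP"}}
BINARY = {"S": [("NP", "VP")], "VP": [("V", "NP")]}
```

With this grammar, `cyk_recognize(["time", "flies", "arrows"], LEXICAL, BINARY)` succeeds while a string like "flies time" is rejected; the three nested loops over span, start position, and split point are where the cubic bound comes from.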
We have now seen nearly half a decade of work using CF-PSG to successfully tackle problems in linguistic description (the Coordinate Structure Constraint (Gazdar 1981e), the English auxiliary system (Gazdar et al. 1982), etc.) that had proved somewhat recalcitrant even for the grossly more powerful transformational theories of grammar that had formerly dominated linguistics. The influence of the parsing argument on linguists has probably been overestimated. It seems to me that when Gazdar (1981b, 267) says our grammars can be shown to be formally equivalent to what are known as the context-free phrase structure grammars [which] has the effect of making potentially relevant to natural language grammars a whole literature of mathematical results on the parsability and learnability of context-free phrase structure grammars he is making a point exactly analogous to the one made by Robert Nozick in his book Anarchy, State and Utopia, when he says of a proposed social organization (1974, 302): We seem to have a realization of the economists' model of a competitive market. This is most welcome, for it gives us immediate access to a powerful, elaborate, and sophisticated body of theory and analysis.</Paragraph> <Paragraph position="5"> We are surely not to conclude from this remark of Nozick's that his libertarian utopia of interest groups competing for members is motivated solely by a desire to have a society that functions like a competitive market. The point is one of serendipity: if a useful theory turns out to be equivalent to one that enjoys a rich technical literature, that is very fortunate, because we may be able to make use of some of the results therein.</Paragraph> <Paragraph position="6"> The idea of returning to CF-PSG as a theory of NL's looks retrogressive until one realizes that the arguments that had led linguists to consign CF-PSG's to the scrap-heap of history can be shown to be fallacious (cf. especially Pullum and Gazdar (1982)). 
In view of that development, I think it would be reasonable for someone to ask whether we could not return all the way to finite-state grammars, which would give us even more efficient parsing (guaranteed deterministic linear time). It may therefore be useful if I briefly reconsider this question, first dealt with by Chomsky nearly thirty years ago.</Paragraph> </Section> </Paper>