<?xml version="1.0" standalone="yes"?> <Paper uid="J85-4002"> <Title>PHRED: A GENERATOR FOR NATURAL LANGUAGE INTERFACES 1</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 THE PHRED KNOWLEDGE BASE </SectionTitle> <Paragraph position="0"> The knowledge base shared by the phrasal analyzer (PHRAN) and phrasal generator (PHRED) consists of pattern-concept pairs, where the pattern contains a linguistic structure and the concept its internal representation. While this representation may be classified as within the systemic/functional tradition (cf. Halliday 1968, Kay 1979) the implementation of the PHRED knowledge base differs in certain important details. The use of the PC pair in PHRED may be distinguished from some other language production mechanisms (McDonald 1980, Mann and Matthiessen 1983, McKeown 1982) in which grammatical information and conceptual information are separated: The &quot;pattern&quot; component of each PC pair may include conceptual information, and the properties associated with each PC pair may combine linguistic and conceptual attributes. 
Like the systems described above, however, PHRED uses these properties for indexing and applying each pattern, particularly using information about agreement among constituents of the pattern and relationships between properties of constituents and properties of the entire pattern.</Paragraph> <Paragraph position="1"> The following is a simple example of a pattern-concept pair, representing some of the knowledge about the use of the verb remove:</Paragraph> <Paragraph position="3"> Specifications of components of the pattern in angle brackets (< >) include linguistic information (root = remove) or conceptual categories (agent, container) or a combination of linguistic and conceptual specifications.</Paragraph> <Paragraph position="4"> Additional information associated with each PC pair determines the correspondences between elements of the conceptual structure and constituents of the linguistic structure: The special &quot;value&quot; indicator designates the association of a property of the PC pair with a property of one of its constituents, specified by number. Thus &quot;tense = (value 2 tense)&quot; implies that the tense of the pattern is the tense of the second constituent, the verb.</Paragraph> <Paragraph position="5"> &quot;cont = (value 5)&quot; indicates that the token unified with the variable &quot;?cont&quot; in the conceptual template corresponds to the fifth constituent, the object of from. The above PC pair can be used by PHRED, depending on the concept being expressed, to produce the sentence You should remove the files from your directory, or the infinitive phrase to remove a file from the top level directory. 
The final output is determined by the combination of this PC pair with the input attributes and one or more ordering patterns, which embody general linguistic constraints and constraints on surface order.</Paragraph> <Paragraph position="6"> In addition to the linguistic patterns and associated conceptual representation, PC pairs contain a set of properties, or attributes, and other information that guides their use. Some of this information, such as &quot;tense = (value 2 tense)&quot; above, is used to determine correspondences between a pattern and its constituents. Other properties are used for indexing purposes. There is also a facility for &quot;escapes&quot;, or the ability to call a special procedure from within the declarative knowledge representation. While this facility was often exploited in early versions of PHRAN, it is problematic for knowledge bases shared with PHRED. Procedures called during analysis are seldom useful to the generator or vice versa. Therefore such procedure calls have seldom been used in PHRED, and an attempt has been made to encode all knowledge in a declarative form that can be used by both the generator and the analyzer.</Paragraph> <Paragraph position="7"> The &quot;pattern&quot; part of the PC pairs is a list of constituents, where each constituent in a pattern is generally described either as a pattern of speech (p-o-s) or as a member of a descriptive category (e.g., person, physical object). Patterns may also be formed by conjunction and disjunction of other patterns and may contain specifications of constraints. For example, the constituent <and root = remove voice = active form = infinitive> is a single-constituent pattern that would generate the infinitive verb to remove, while <and p-o-s = noun-phrase <or person physob>> Computational Linguistics, Volume 11, Number 4, October-December 1985 221 Paul S. 
Jacobs PHRED: A Generator for Natural Language Interfaces represents a noun-phrase that refers to a person or physical object.</Paragraph> <Paragraph position="8"> Patterns are used to represent lexical entries, determiners and particles which refer to nothing, as well as very specific phrases which refer to particular objects.</Paragraph> <Paragraph position="9"> The pattern <word = the> <word = big> <word = apple> represents the phrase the big apple used to refer to New York City. This phrase can also be produced by the general pattern <p-o-s = article> <p-o-s = np2> when used to refer to an apple.</Paragraph> <Paragraph position="10"> Specialized linguistic constructs are often partially frozen patterns that behave as a particular grammatical unit. The phrase kick the bucket behaves as a verb that conjugates but does not passivize. It corresponds to the pattern <and p-o-s = verb root = kick> <word = the> <word = bucket> which functions as an intransitive verb.</Paragraph> <Paragraph position="11"> Part of the knowledge associated with a pattern-concept pair is the correspondence between the properties of the pattern's constituents and the properties of the entire pattern. Associated with the kick the bucket pattern above is the knowledge that the person, number, and tense of the pattern correspond to the person, number and tense of the first constituent, the form of the verb kick. In generation, this results in the recursive application of constraints from a pattern to its components: To generate a past-tense verb meaning died, the system will operate recursively on the pattern above to generate a past-tense form of kick.</Paragraph> <Paragraph position="12"> Patterns do not necessarily represent a fixed word order. For example, in <person> <root = tell> <person> <word = to> <word = get> <word = lost> the pattern retains its meaning when used in a passive form or infinitive phrase. 
Such patterns are used in combination with ordering patterns, which control the various ways in which a pattern may be linguistically realized. An example of an ordering pattern that could be used in conjunction with the get lost pattern above is the passive infinitive ordering, used to produce, for example, the man to be told to get lost or the file to be removed from the current directory:</Paragraph> <Paragraph position="14"> The &quot;#2&quot; and &quot;#3&quot; within the ordering pattern indicate that the constraints on the second and third constituents of the coordinated pattern are conjoined with the first and second constituents of the ordering pattern, respectively. The &quot;#rest&quot; indicates where additional constituents are generally inserted. This information guides the combination of the ordering pattern with other PC pairs.</Paragraph> <Paragraph position="15"> An extra set of angle brackets is used to mark a constituent that is optional to the pattern, such as the by phrase.</Paragraph> <Paragraph position="16"> The &quot;p-o-s = inf-phrase&quot; property specifies that the pattern produces an infinitive phrase, and the &quot;forms = (passive-s)&quot; property restricts the use of this ordering to patterns which have &quot;passive-s&quot; among their forms.</Paragraph> <Paragraph position="17"> Patterns that have an unspecified word order do not have a &quot;p-o-s&quot; attribute, and thus do not produce a particular pattern of speech independently. These are combined by PHRED with ordering patterns to allow for idioms or expressions which may appear in various forms, such as bury the hatchet in The hatchet was buried at Appomattox. The same effect could be accomplished without ordering rules by increasing the number of fixed-word-order patterns combinatorially. 
The use of the ordering patterns, however, has a certain elegance as well as a practical value: it allows the specification of certain specialized constructs as relations among particular constituents, regardless of where the constituents appear in the actual output. In this case, the specialized meaning of telling someone to get lost is effectively represented by the relationship between the verb tell and its complement to get lost. This meaning may be realized in a variety of forms; for example, the combination of the get lost pattern with a passive ordering may produce the sentence John was told by Bill to get lost.</Paragraph> <Paragraph position="18"> * While there are similarities between the ordering rules used by PHRED and transformational grammar rules, there are some important differences: PHRED assumes no syntactic derivation; rather, the final ordering of a pattern of speech is produced by combining a set of linguistic patterns. Furthermore, there is no strict sequence in which the patterns must be applied: A given ordering pattern may be chosen either before or after a pattern with which it is to be combined. The combination of ordering patterns is constrained by the interactions among the properties of the patterns, instead of by controlling the order in which they are used. In this way PHRED is more flexible than other systems that handle word order as a final phase of the generation process (cf. Goldman 1975).</Paragraph> <Paragraph position="19"> The pattern-concept pair representation falls into a class of linguistic representations known as feature systems, including lexical functional grammar (Bresnan 1982), functional grammar (Kay 1979) and functional unification grammar (Kay 1984). 
These systems, which developed in parallel, may be described using a common notation, and vary mostly in the way in which they are typically applied. Pattern-concept pairs have been applied primarily to the problem of representing the specialized linguistic knowledge that seems necessary to use language as a communicative tool. This emphasis causes minor variations to seem important. For example, most unification grammar implementations require that a syntactic category be among the features, or attributes, of every linguistic pattern. The omission of this requirement for pattern-concept pairs facilitates the representation of patterns that have a specialized meaning but do not have rigid surface structures. This is illustrated by the get lost pattern and by the specialized knowledge about borrar and escribe in the UNIX domain.</Paragraph> <Paragraph position="20"> The next section describes how the knowledge base described here is utilized as part of a real-time generation system.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 THE GENERATION PROCESS </SectionTitle> <Paragraph position="0"> The production of an utterance in PHRED is a recursive process which can be divided into three phases: * Fetching is the retrieval of pattern-concept pairs from the knowledge base.</Paragraph> <Paragraph position="1"> * Restriction consists of validating a potential pattern-concept pair to confirm that it fulfills a given set of constraints and adding new constraints to the pattern.</Paragraph> <Paragraph position="2"> * Interpretation is the generation of lexical items that match the constraints of the restricted pattern.</Paragraph> <Paragraph position="3"> The generation algorithm implemented in PHRED is similar to those used in other unification-based systems (cf. McKeown 1982, Appelt 1983). 
Because of the expectation that PHRED would serve as part of a real-time interface, however, the system was designed to avoid the expensive unification process. Thus the fetching phase of PHRED accomplishes much of the task of checking the constraints of a pattern against the constraints to be satisfied, a function that could be performed by unification. The more time-consuming unification process is applied only after the fetching phase has produced a candidate pattern.</Paragraph> <Paragraph position="4"> A second important aspect of the PHRED algorithm, also addressed to the problem of avoiding unnecessary computation, is the overall strategy for handling alternative patterns. Once the fetching mechanism has retrieved a pattern, PHRED uses this pattern unless it is found to violate a constraint. This is similar to the strategy implemented in MUMBLE (McDonald 1980), which also avoids comparisons of linguistic structures of comparable validity. Unlike MUMBLE, PHRED does limited back-tracking under some circumstances. The backtracking mechanism, however, relies on the fact that the fetching mechanism generally produces some useful patterns and that most constraint violations are due to incorrect selections among ordering patterns.</Paragraph> <Paragraph position="5"> Each of these phases and its role in the generation process will now be discussed in further detail.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.1 FETCHING </SectionTitle> <Paragraph position="0"> While PHRAN and PHRED use the same knowledge structures, the way in which these structures are accessed for the purpose of generation naturally differs from their access by the analyzer. PHRAN must recognize a set of lexemes as possibly corresponding to a pattern and thereby retrieve an appropriate pattern-concept pair from the knowledge base. 
PHRED, on the other hand, accesses the knowledge base by fetching pattern-concept pairs whose template fits the concept and constraints to be expressed.</Paragraph> <Paragraph position="1"> Because fetching can be a time-consuming part of the generation process, it is important for the fetching mechanism to operate efficiently, but also to produce only those PC pairs likely to be useful. For this purpose, PHRED uses a hashing scheme designed to produce an ordering, or stream, of candidate patterns with a minimum of computation. Specifically, it performs some quick computation to select a sequence of PC pairs that might be of help in constructing a particular utterance.</Paragraph> <Paragraph position="2"> These pairs are then considered as PHRED continues its work. As the generator uses the first available appropriate utterance rather than evaluate all potential candidates, the ordering of this stream influences the choice process as well as the number of patterns ultimately considered.</Paragraph> <Paragraph position="3"> The implementation of PHRED permits conceptual attributes to influence the search of linguistic alternatives, but separates this process from other aspects of language planning. High-level text goals are not included in the knowledge structures that influence the fetching process. In this regard, the system within which PHRED operates does not promote the desirable interaction among text planning and structural choices, as suggested by Appelt (1982) and Danlos (1984). Higher-level planning in the UNIX Consultant, for modularity, is performed by a separable planning component.</Paragraph> <Paragraph position="4"> The role of fetching in PHRED is to provide access to the pattern-concept pairs in the knowledge base. The input to the fetching mechanism is a set of constraints and conceptual attributes. Using this input as a guide, the fetching mechanism chooses PC pairs that serve as building blocks for the language produced. 
The pattern components of these PC pairs may include general patterns and ordering patterns as well as specialized phrases and lexical choices.</Paragraph> <Paragraph position="5"> In producing the phrase the file filename to be removed from the current directory, the fetching stage is given the following input:</Paragraph> <Paragraph position="7"> (to (not (concept (inside-of (object current-directory)))))) From this input, the fetching mechanism must retrieve the remove pattern shown earlier as well as the ordering patterns necessary to produce a passive infinitive phrase.</Paragraph> <Paragraph position="8"> The design of the hashing scheme that accomplishes this retrieval is based on the following reasoning: The input to the fetching mechanism may be described at least in part by a set of conceptual and linguistic properties, as may the pattern-concept pairs in the data base.</Paragraph> <Paragraph position="9"> The process of restriction, described in section 3.2, relies heavily on matching these two sets of properties. This process may therefore be expedited by computing an address in memory that &quot;points&quot; to PC pairs with a particular set of attributes. Since there are combinatorially many such sets, there must be (1) a large number of addresses, or &quot;buckets&quot;, and (2) an effective means of selecting which sets of &quot;important&quot; properties to use in computing each address.</Paragraph> <Paragraph position="10"> The selection of &quot;important&quot; properties is determined as follows: All conceptual attributes, including those included within the concept part of the input, are considered important, and the linguistic attributes used for each p-o-s type are specified in the knowledge base. 
The fetching mechanism first searches buckets found through large sets of attributes, then buckets that correspond to smaller sets of attributes. The idea of this process is to consider first the PC pairs that most closely fit the input to the fetching mechanism. Since a hash into an empty bucket takes very little time, there is no great loss of time efficiency in using a fairly large number of hashes.</Paragraph> <Paragraph position="11"> Although the access to a PC pair through multiple buckets requires some additional space, this space is negligible compared to the size of the knowledge structures themselves. The fetching component of PHRED, like the other parts of the system, is geared towards simplicity and uniformity. In spite of some of the differences among, for example, the selection of a verb, the choice of a referring expression and the selection of an article that agrees with its head noun, the same method is used for fetching in all three cases. The same hashing scheme is employed also to retrieve ordering patterns from the knowledge base. Such orderings can be effectively retrieved from their attributes in the same manner that any other PC pair is fetched. Thus, while the nature of the knowledge contained in the attributes of a lexical structure is arguably different from the knowledge within an ordering PC pair, these different types of knowledge may be accessed through the same mechanism. 
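</Paragraph> <Paragraph> The bucket scheme described above can be illustrated with a minimal Python sketch. It is not PHRED's implementation: the class name PatternStore, the pattern names, and the decision to file each PC pair under every subset of its attributes are assumptions made for illustration only (the system itself selects which sets of important properties to hash on). The sketch shows just the two ideas stated in the text: buckets reached through larger attribute sets are searched first, and candidates are delivered as a stream on demand.

```python
from itertools import combinations


class PatternStore:
    """Toy index (hypothetical): each PC pair is filed under every
    non-empty subset of its attributes."""

    def __init__(self):
        self.buckets = {}

    def add(self, name, attributes):
        attrs = sorted(attributes)
        for size in range(1, len(attrs) + 1):
            for subset in combinations(attrs, size):
                self.buckets.setdefault(frozenset(subset), []).append(name)

    def fetch(self, wanted):
        """Yield candidate names lazily, probing buckets keyed by the
        largest attribute sets first; empty buckets cost one lookup."""
        attrs = sorted(wanted)
        seen = set()
        for size in range(len(attrs), 0, -1):
            for subset in combinations(attrs, size):
                for name in self.buckets.get(frozenset(subset), []):
                    if name not in seen:
                        seen.add(name)
                        yield name


store = PatternStore()
store.add("remove", {"state-change", "location", "inside-of", "not-inside-of"})
store.add("delete-file",
          {"state-change", "location", "inside-of", "not-inside-of", "file"})
store.add("passive-infinitive", {"p-o-s=inf-phrase", "voice=passive"})

request = {"state-change", "location", "inside-of", "not-inside-of", "file"}
candidates = list(store.fetch(request))
print(candidates)  # prints ['delete-file', 'remove']
```

In this sketch the more specific entry is returned first because it sits in the bucket keyed by the full set of requested attributes, and the unrelated ordering pattern is never reached. Because fetch is a generator, later buckets are probed only if the caller asks for another candidate.</Paragraph> <Paragraph>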
The principle behind this uniformity is that the level of specificity of the knowledge required to realize particular concepts and constraints cannot be predetermined; thus general and specific knowledge should be accessed in the same fashion.</Paragraph> <Paragraph position="12"> The main loop of PHRED passes to the fetching component the set of constraints a PC pair must satisfy.</Paragraph> <Paragraph position="13"> Typically, if there is a specific phrase, structural formula or other pattern that directly satisfies these constraints, it will appear in the stream before more general patterns.</Paragraph> <Paragraph position="14"> A pattern of unfixed word order will generally appear in the stream before an ordering pattern, because the ordering patterns tend to have few or no conceptual attributes. Most often, the unfixed pattern is chosen based on the concept passed to the fetching mechanism, while the ordering pattern is chosen to select an ordering that produces the appropriate pattern-of-speech. The manner in which these patterns are combined is discussed in section 3.2. The fetching mechanism is repeatedly called to return patterns from the stream until all possible constraints are satisfied. For example, to produce the phrase ... not to remove the file, a negative ordering, infinitive, and remove pattern must all be fetched before the phrase can be restricted.</Paragraph> <Paragraph position="15"> The construction of hash keys based on successively smaller sets of attributes assures that the PC pairs whose concept most closely matches the input concept will be considered first. The fetching mechanism produces a stream of pattern-concept pairs which are returned one at a time as they are requested by the generator. The rest of the program is insulated from the retrieval process. 
This way, some of the hashing computation can be postponed until it is required.</Paragraph> <Paragraph position="16"> In the case of the remove example given above, the PC pair is indexed according to a combination of the semantic attributes &quot;state-change&quot;, &quot;location&quot;, &quot;inside-of&quot;, and &quot;not-inside-of&quot;. This combination is used at the time the PC pair is read in to determine which buckets should include the PC pair. The indexing mechanism ignores variables (e.g., &quot;?actor&quot;). During generation, a bucket indicating this PC pair will be found, based on the same semantic attributes. Some empty buckets, based on different combinations of attributes, will be searched also. A bucket including the passive infinitive ordering pattern is found by using the p-o-s and voice attributes.</Paragraph> <Paragraph position="17"> Buckets that correspond to more complete sets of attributes are searched first. 
For example, if the &quot;delete&quot; pattern were constrained to be used only for the deletion of files, it would be retrieved before the remove PC pair because the bucket identified by the conjunction of the &quot;file&quot; attribute of file1 with the other semantic attributes of the concept is searched first.</Paragraph> <Paragraph position="18"> A simple pattern, such as the word the, does not really have a concept associated with it, and thus is indexed according to sets of its linguistic attributes only: A search for a definite article would find a bucket based on the properties &quot;p-o-s = article&quot; and &quot;ref = def&quot; and would thereby yield the pattern for &quot;the&quot;.</Paragraph> <Paragraph position="19"> The fetching component of PHRED constitutes about 10K bytes of object code, one tenth of the total program.</Paragraph> <Paragraph position="20"> A profile of PHRED shows that more than half of the CPU time consumed by the generator is spent in the fetching process. Earlier versions of the program, which did no ordering of candidate patterns in the fetching phases, spent less time fetching but more time overall.</Paragraph> <Paragraph position="21"> When the fetching mechanism retrieves a pattern which has the appropriate &quot;p-o-s&quot; attribute, control is passed to the restriction phase. This phase is considered below.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.2 RESTRICTION </SectionTitle> <Paragraph position="0"> Each time a candidate pattern is returned from the stream by the fetching mechanism, it is passed to the restriction phase, along with any other unfixed-order patterns which have been retrieved. The restriction mechanism creates an instance of the pattern, adding new constraints to the pattern constituents while verifying that the PC pair meets the constraints given. 
There are three main aspects of this process: * unification of the variables within the PC pair's conceptual template and its associated properties with the target concept and properties, * elaboration of the pattern constituents to include properties from corresponding properties in the pattern indicated by the &quot;value&quot; marker, and * combination of the properties of constituents among the pattern and ordering patterns.</Paragraph> <Paragraph position="1"> The following is an example of an instance of the</Paragraph> <Paragraph position="3"> This PC pair is the product of applying the restriction process twice in succession, once to the passive infinitive ordering and once to the remove pattern. Unification has occurred to bind the variables &quot;?cont&quot; and &quot;?remobject&quot;. Elaboration has added the tokens bound to these variables to the individual constituents. Combination of the remove pattern with the passive infinitive ordering has produced a pattern whose constituents are specified by the conglomeration of constraints of the PC pairs used.</Paragraph> <Paragraph position="4"> Any of these three aspects of the restriction phase may result in failure. In the above example, unification would fail in an attempt to bind the multiple occurrences of &quot;?cont&quot; to different tokens, or if some variable binding violated an input constraint. Elaboration results in failure if a property to be added to a constituent does not fit the other properties. For example, if &quot;directory1&quot; in the example is not a container, the pattern would be judged inappropriate. Combination could likewise result in failure if the constraints from the ordering rule were incompatible with those from the remove pattern, for example, if it had no passive form.</Paragraph> <Paragraph position="5"> Properties marked by &quot;value&quot; in the PC pair are treated as variables and unified along with the other properties. 
If these variables remain unbound throughout the restriction process, however, the pattern retains the property with its &quot;value&quot; marker. This is necessary for future stages of the production process to obtain the property on demand. For example, a noun-phrase pattern in Spanish, where there is gender agreement between the subject of a passive infinitive phrase and the past participle, maintains the &quot;gender = (value 2)&quot; property to reflect that the gender of the NP is the gender of its NP2. This property is not determined until the head noun is chosen, after which it can be retrieved through the NP if necessary.</Paragraph> <Paragraph position="6"> Restriction uses about 60% of the code of the generator and most of the CPU time not consumed by fetching.</Paragraph> <Paragraph position="7"> The bulk of this time is spent doing repeated unification when a large number of patterns are required. Because the nature of the knowledge structures in the system seems to require such unification, the fetching mechanism, as described in section 3.1, is designed to prevent the consideration of patterns which might lead to failure during unification.</Paragraph> <Paragraph position="8"> The next step in the generation process, after restriction, is to go through each constituent of the restricted pattern and invoke the generation process on the individual constituents, if necessary. 
This phase is described in the next section.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.3 INTERPRETATION </SectionTitle> <Paragraph position="0"> The third major phase in PHRED is interpretation, the application of constraints to a restricted pattern to produce a surface structure suitable for output.</Paragraph> <Paragraph position="1"> The process of interpreting a given constituent may have three possible results: 1) the successful completion of an element of surface structure, 2) the recursive application of the fetch-restrict-interpret sequence on the given constituent, or 3) failure, if the generator is unable to produce a specified pattern of speech.</Paragraph> <Paragraph position="2"> The first result occurs when the pattern provides a complete specification of a word or words for output, such as the big apple, which is specified by the pattern <word = the> <word = big> <word = apple> The second case occurs if a constituent contains a more general set of constraints, for example, <and p-o-s = verb root = remove tense = past> which requires another recurrence of the fetch-restrict-interpret sequence.</Paragraph> <Paragraph position="3"> In the third result, where no output produces the desired pattern of speech subject to the constraints given by the uninterpreted pattern, the system must back up to select an alternate pattern. To be efficient, the system must utilize as much as possible the patterns already selected. If the constituent that fails in the interpretation phase is optional to the pattern to which it belongs, it is deleted. Otherwise, failure results in backing up to the level where the failed pattern was fetched, getting another pattern from the stream, and attempting restriction of the new pattern. Most often this new pattern will be an ordering rule, and most of the failed pattern will be used in the restriction of the ordering pattern. 
A simple case of this is where the generator fails to produce a pattern of speech for the subject of a sentence and instead generates a passive sentence. In this case the restricted version of the PC pair as it was before the combination with the active ordering pattern is backed up on a stack so that the passive ordering can be tried.</Paragraph> <Paragraph position="4"> Failure during interpretation is rare, and generally results from an insufficiency of the knowledge base in producing a reference. While a better model of the generation process might allow for the anticipation of such failures, such anticipation would in general require decisions considerably more complex than those made by PHRED. This complexity would be underutilized in light of the infrequency with which back-up is necessary.</Paragraph> <Paragraph position="5"> Although the back-up algorithm employed in these failures is time-consuming, it increases the likelihood that some successful utterance will be produced.</Paragraph> <Paragraph position="6"> The agreement of constituents within a pattern is assured during the interpretation phase. A constituent that must agree with another has a form such as the following: <and p-o-s = verb root = remove tense = past number = (matches 1) person = (matches 1)> This specifies a past tense form of remove that matches its subject in person and number. Interpretation results in the substitution of properties from the matched constituent to produce, for example, <and p-o-s = verb root = remove tense = past number = singular person = third> In English there are only limited forms of agreement.</Paragraph> <Paragraph position="7"> There are few examples where it passes from right to left, such as in subject-aux inversion where the verb agrees with a subject that follows it. In other languages agreement within a pattern may be much more complex. 
In the Spanish example Juan les habló a sus amigos ('John spoke to his friends'), the indirect pronoun les, which precedes the verb, agrees with the indirect object, which follows the verb.</Paragraph> <Paragraph position="8"> In all cases PHRED can ensure proper agreement if some order of interpreting the constituents allows the correct application of constraints. The surface order of the constituents is the default order for their interpretation, but interpretation of a constituent where necessary is done only after that of constituents with which it must agree. In English, nouns within noun phrases are interpreted before their attached determiners, because the determiner must sometimes agree in number with the head noun. In more inflected languages verbs must generally be produced last.</Paragraph> <Paragraph position="9"> Anaphora are handled specially during interpretation.</Paragraph> <Paragraph position="10"> In the case of constituents for which PHRED has already produced references, the generator applies a set of heuristics that will remove the constituent entirely if it is not necessary to the utterance, pronominalize, or regenerate the entire constituent. The principal heuristics are 1) if the anaphoric constituent is optional, remove it from the current pattern, and 2) pronominalize other anaphoric constituents wherever possible.</Paragraph> <Paragraph position="11"> There are of course many cases in which an alternative reference would be preferable, but the method used by PHRED is generally effective in producing coherent references. The heuristics lead, for example, to the production of Mary was told by John that he wanted the book to be given to him rather than Mary was told by John that John wanted the book to be given to John by Mary. 
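</Paragraph> <Paragraph> The two principal heuristics can be sketched in a few lines of Python. This is an illustrative reconstruction under stated assumptions, not PHRED's code: the function name realize, the tuple layout for constituents, the case labels, and the pronoun table are all hypothetical. The sketch drops an optional constituent whose referent has already been expressed, pronominalizes other repeated referents when a pronoun is available, and otherwise regenerates the full constituent.

```python
def realize(constituents, mentioned, pronouns):
    """constituents: list of (referent, words, optional, case) tuples.
    mentioned: set of referents already expressed in the utterance.
    pronouns: hypothetical table mapping (referent, case) to a pronoun."""
    output = []
    for referent, words, optional, case in constituents:
        if referent is not None and referent in mentioned:
            if optional:
                continue  # heuristic 1: remove optional anaphoric constituents
            # heuristic 2: pronominalize; fall back to the full phrase
            words = pronouns.get((referent, case), words)
        elif referent is not None:
            mentioned.add(referent)
        output.append(words)
    return " ".join(output)


# Embedded clause of the example; "john" and "mary" were produced earlier.
clause = [
    (None, "that", False, None),
    ("john", "John", False, "subject"),
    (None, "wanted", False, None),
    ("book", "the book", False, "object"),
    (None, "to be given to", False, None),
    ("john", "John", False, "object"),
    ("mary", "by Mary", True, "object"),
]
pronouns = {("john", "subject"): "he", ("john", "object"): "him"}
text = realize(clause, {"john", "mary"}, pronouns)
print(text)  # prints: that he wanted the book to be given to him
```

The optional by phrase disappears and both later mentions of John become pronouns, which reproduces the embedded clause of the preferred sentence above.</Paragraph> <Paragraph>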
It is apparent that these heuristics would break down in the generation of longer texts, a task for which neither PHRED nor the PHRAN/PHRED knowledge base was designed.</Paragraph> <Paragraph position="12"> The interpretation mechanism occupies about 20% of the code of the generator, and requires a small amount of time relative to the rest of the program.</Paragraph> <Paragraph position="13"> This discussion has described the overall design of PHRED and presented some details of its implementation. The next section traces an example of the generation process and discusses the role of each of the three phases considered here.</Paragraph> </Section> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 A DETAILED EXAMPLE </SectionTitle> <Paragraph position="0"> Below is a trace of PHRED while generating the sentence, Typing &quot;rm filename&quot; causes the file filename to be removed from the current directory. This is a fairly simple example, but demonstrates well the process used by PHRED to produce an output. At each step in the trace, the generator prints out which phase it is going through, and what the input to that phase is. Ellipses (...) are used to indicate information that has been omitted because it duplicates other material. As earlier in the text, symbols preceded by a question mark indicate variables, such as &quot;?actor&quot;. Symbols surrounded by asterisks, e.g., &quot;*user*&quot;, are tokens that have special processing implications in the UNIX Consultant. Other special tokens are indicated by atoms followed by numerals, such as &quot;file1&quot;.</Paragraph> <Paragraph position="1"> The input to the generator is the concept which the UNIX Consultant has chosen to express, in response to a question about removing files in UNIX. 
The concept represents UC's knowledge that using the 'rm' command is an established plan (here &quot;planfor&quot;) for deleting a file (here There are a number of patterns that could potentially be used to express the concept that an action is a plan for something.</Paragraph> <Paragraph position="2"> Two of the possible constructs in the PHRED knowledge base are an imperative, e.g., Use &quot;rm&quot; to delete a file, and a future or present tense declarative, e.g., &quot;Rm&quot; will delete a file. In this case, PHRED selects another pattern with the verb cause. The stream of candidate patterns includes first the constructs found in a bucket reached through the &quot;planfor&quot; concept, followed by other sentence-level PC pairs. In examples such as this one, where PHRED's fetching mechanism reaches several constructs through the same bucket, the generator selects a random order in the stream for the alternatives. For this example, therefore, a random selection ultimately determines the form of the output.</Paragraph> <Paragraph position="3"> After the selection is made, the restriction process is applied to the first pattern.</Paragraph> <Paragraph position="5"> The restriction process here results in the addition of the appropriate conceptual components to the constituents of the restricted pattern. The conceptual content of the first and third constituents, which will produce a gerund phrase and passive infinitive phrase, respectively, have been added. This results from the unification of the variables &quot;method&quot; and &quot;result&quot; in the list of properties above and the elaboration of the constituents specified by the terms &quot;(value 1)&quot; and &quot;(value 3)&quot; attached to these variables. 
Combination with an active sentence pattern adds the subject-verb agreement, and the restricted pattern enters the interpretation phase:</Paragraph> <Paragraph position="7"> At this point the generator has successfully applied the input concept to restrict the surface structure chosen, and recursively interprets this structure, starting with the Since there is no pattern that directly generates a gerund phrase (here &quot;p-o-s = act-phrase&quot;) with the given concept, the fetch above yields an ordering pattern which can be used for combination with other patterns to produce the final phrase. Thus another fetch is performed before any restriction is done, this time without the &quot;p-o-s&quot; attribute.</Paragraph> <Paragraph position="9"> PHRED searches for a way of expressing the &quot;mtrans&quot;, or communicative transfer, of the 'rm' command to the operating system. The hashing mechanism gives preference to the terms for technical transmission of commands, because the concepts associated with these terms match the input concept more closely, but a problematic pattern still results: This pattern fails during unification because it requires that the command not have arguments, something which the fetching mechanism failed to detect because the bucket that includes the pattern is found by considering less specific attributes. This failure is illustrative of a class of examples where PHRED's hashing mechanism, in short-cutting the complexity of unification, picks the wrong pattern.</Paragraph> <Paragraph position="10"> With the gerund ordering pattern still being saved, the fetching mechanism is called again for another candidate. The pattern returned here by the fetching mechanism is the next one in the stream after the failed &quot;do&quot; pattern. 
This new pattern, with the verb type, is then passed Unification of the variables in the above PC-pair with those in the input concept is followed by elaboration of the constituents and combination with the gerund ordering pattern. This yields the following result: The combination of the &quot;type&quot; pattern with the gerund ordering satisfies the necessary constraints, producing a two-constituent pattern which then proceeds to the interpretation phase: p-o-s = verb form = progressive root = type This fetch uses a hash on the root and form of the verb given to retrieve the progressive form typing, whose properties unify trivially with the given constraints: root = type p-o-s = verb root = type form = progressive The word typing and its properties are now completely specified, so no further restriction is needed. The next constituent in the gerund phrase, the noun phrase that describes the command 'rm', is thus passed to the interpretation mechanism: The pattern selected for the command is a specific formula for expressing commands to UNIX, the command name followed by its arguments, in quotes: The formula for producing 'rm filename' is straightforward, and results in very little additional work by the generator:</Paragraph> <Paragraph position="12"> Having completed the clause Typing &quot;rm filename&quot; the generator now returns to the highest level of the surface structure to finish the sentence. The next constituent in this surface structure is the conjugated form of the verb cause: ***INTERPRETING*** PATTERN: <p-o-s = verb root = cause person = (matches 1) number = (matches 1)> The interpretation mechanism finds the person and number of the first constituent of the surface structure. Since this is a singular gerund phrase, it has the third person and singular properties. 
These are then used in fetching the appropriate verb form: ***FETCHING*** p-o-s = verb root = cause form = basic person = third number = singular As with typing, hashing results in the retrieval of the correct verb, and restriction is a simple process: Having completed the specification of the verb causes, PHRED continues its depth-first interpretation with the third and final top-level constituent, the infinitive phrase: The first fetch in this case again brings the ordering pattern, the second the &quot;remove&quot; pattern. The restriction process is applied first to the &quot;remove&quot; pattern:</Paragraph> <Paragraph position="14"> At this point, the generator is producing an expression for the passive infinitive phrase following the verb causes. After unification and elaboration of the pattern above, the pattern is then combined with the ordering pattern for the passive infinitive phrase, a somewhat more specialized pattern than is necessary for the construction of such phrases. The restriction process results in the determination of the final ordering of the constituents, and another round of restriction: Having completed the restriction of the infinitive, PHRED passes control to the interpretation mechanism, which then proceeds to generate each part of the infinitive phrase pattern: As the interpretation starts with the first constituent of the infinitive phrase, PHRED now must produce a reference to the specified file. To do this, it expands the token &quot;file1&quot; to get the necessary information from its attributes.</Paragraph> <Paragraph position="15"> This pattern is the default reference for files, which is superseded when more information about a given file must be conveyed. The noun phrase now reaches the interpretation phase, resulting in the simple verification that its constituents are concept .... p-o-s = noun-phrase ref = def Having completed the reference, the system now continues with the infinitive phrase. 
The second constituent of the infinitive phrase is the infinitive of the verb The third constituent of the passive infinitive phrase is the past participle of the verb remove, which is interpreted next. This process similarly results in the completed The final constituent of the infinitive phrase and of the sentence is the optional prepositional phrase specifying from where the file is being deleted. The extra angle brackets in the pattern below indicate to the interpretation mechanism that if it fails to produce a reference or if the reference in the prepositional phrase is anaphoric, the entire constituent may be omitted: Unlike the previous noun phrase, there is no specific structural formula for referring to the current directory. PHRED thus uses a general noun phrase pattern: &quot;Consonance&quot; here is the property used to handle the distinction between a and an, which depends on the choice of noun.</Paragraph> <Paragraph position="16"> &quot;Hard&quot; consonance is used for nouns or adjectives beginning with a consonant sound, and &quot;soft&quot; for those beginning with a vowel sound. For definite articles, the property is not used.</Paragraph> <Paragraph position="17"> Elaboration of the pattern above results in a two-constituent pattern to be interpreted, the second constituent of which must refer to the current-directory concept.</Paragraph> <Paragraph position="18"> While there is no special noun phrase for referring to the current-directory concept, there are special noun constructs. 
PHRED selects randomly between two ways of referring to this concept, current directory and working directory.</Paragraph> <Paragraph position="19"> p-o-s = noun number = singular person = third concept = current-directory The reference selected for the directory is the compound noun current directory.</Paragraph> <Paragraph position="20"> This is interpreted before the article within the noun phrase, since articles are produced after head nouns to ensure The interpretation mechanism judges the noun compound to be completed, and the final determiner is then interpreted: After the final part of the surface structure is complete, a walk through the surface structure tree is used to produce the final output: Typing 'rm filename' causes the file filename to be removed from the current directory.</Paragraph> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> 5 COMPARISON WITH OTHER RESEARCH </SectionTitle> <Paragraph position="0"> PHRED differs in design from most other natural language generation systems because of its conception as a generator to accompany PHRAN as part of a language interface. The application of specialized phrasal knowledge seems to be an effective means of satisfying the demands on a generator in a domain such as that of the UNIX Consultant. The use of a declarative knowledge base shared between analyzer and generator has helped to make the system practical and easily extensible.</Paragraph> <Paragraph position="1"> PHRED's simplicity and the speed with which it applies this knowledge have made it well-suited for use in real-time natural language interfaces.</Paragraph> <Paragraph position="2"> Primarily for historical reasons, most research in computational linguistics has focused on rules governing syntax. In language analysis, it is often practical to design systems whose principal function is to apply and 
test such rules by determining the grammaticality of the input. Such systems generally use compositional rules, if any, for determining the semantic content of the input.</Paragraph> <Paragraph position="3"> The task of language generation, however, is inextricably tied to the appropriateness of the linguistic output as well as to its grammaticality. Because of this, work in generation focuses not on the representation of core syntactic rules but on the means by which a choice is made among syntactic and lexical constructs. Compositional rules generally fail to constrain this choice adequately. For this reason systems which are designed for language generation have often employed either special choice systems of the type found in systemic grammar (Halliday 1968), or have had pattern-based grammars of the type found in PHRAN/PHRED and in unification grammar (Kay 1984), which require a sophisticated mechanism for dealing with the interaction of the patterns. Thus PHRAN/PHRED is the first interface in a natural language-based artificial intelligence system to use an entirely common representation and knowledge base for linguistic knowledge employed in both analysis and production.</Paragraph> <Paragraph position="4"> The declarative pattern-concept pair representation, its theory, and its role in PHRED, are considered in the discussion that follows.</Paragraph> </Section> <Section position="7" start_page="0" end_page="0" type="metho"> <SectionTitle> 5.1 THE PC PAIR </SectionTitle> <Paragraph position="0"> The pattern-concept pair representation differs on the surface from traditional grammars because the grammar is embedded implicitly in the knowledge structures.</Paragraph> <Paragraph position="1"> These knowledge structures often require the combination of a number of patterns to produce an utterance. 
In this way the representation is comparable to unification grammar, which contains patterns associated with functional descriptions. The restriction process described in this paper is similar to the unification procedure in TELEGRAM (Appelt 1983), which employs a unification grammar. One difference between PHRED's knowledge structures and those in unification grammar is that conceptual attributes of the PC pairs, as well as functional attributes, or properties, are used to constrain a pattern. Unification grammar, like most feature systems, generally fosters the separation of conceptual and functional components.</Paragraph> <Paragraph position="2"> Another distinction is that, in unification grammar, the syntactic category is given special status; in pattern-concept pairs it is treated as an attribute, and does not necessarily have to be specified for every pattern. This is important for patterns that can be used in conjunction with many different orderings to produce a variety of syntactic structures.</Paragraph> <Paragraph position="3"> A general difference between the PC pair and other representations lies in the level of specificity of the patterns. The PC pair makes it easy to encode specialized phrases and constructs to be used by the generator. It allows the generator to apply the same mechanisms to both general and specific constructions, and to choose PC pairs based on their conceptual attributes. This is, naturally, a distinction based on how the pattern-concept pairs are used rather than on their basic structure. The same result might well be achieved within the basic framework of lexical functional grammar or unification grammar.</Paragraph> <Paragraph position="4"> Semantic grammar (Burton 1976) is another representation scheme which, like that of PHRED, facilitates the use of semantic attributes in language processing. There are versions of such grammars that allow for varying degrees of interaction between syntax, semantics, and pragmatics. 
PHRED differs from true semantic grammars primarily in that it facilitates the interaction of the more general patterns with the more specialized. Semantic grammars are often too constrained to be adapted to a new domain. Many of the knowledge structures in PHRED, by comparison, are general enough so that much of the linguistic knowledge used within the UNIX domain existed in the PHRAN/PHRED knowledge base before UC was even conceived.</Paragraph> <Paragraph position="5"> The pattern-concept pair representation has developed in parallel with research on idiomatic and specialized use of language, done primarily by cognitive linguists. Similar ideas may be found in a variety of grammatical theories emphasizing the study of levels of linguistic and conceptual knowledge and the relations between them (cf. Lockwood 1972, Makkai 1972). The concept of units of meaning linked to lexical units is described, for example, by Pike (1962) and Lamb (1973).</Paragraph> <Paragraph position="6"> Much of the work on specialized language questions the cognitive validity of traditional generative theories of grammar. Chafe (1968) identifies certain idioms, such as by and large and all of a sudden, which would be ungrammatical were they not given special status as idiomatic constructions. Other expressions, such as kick the bucket, are grammatical, but have a meaning that is not determined by any compositional relationship among their components. 
Chafe argues that these idiomatic constructs sufficiently pervade everyday language to warrant an approach to language that handles these constructs not as special cases or exceptions but as an integral part of a language.</Paragraph> <Paragraph position="7"> Becker (1975) presents the idea of the phrasal lexicon as a means of handling canned and idiomatic phrases.</Paragraph> <Paragraph position="8"> Becker identifies in particular a range of phrases which are grammatical and even comprehensible via compositional rules, yet which suggest specialized contextual knowledge. The expression It only hurts when I laugh can theoretically be handled using traditional theories of grammar, but treating it as such would be ignoring an important component of the expression's meaning. The existence of such expressions, which involve either partially or entirely specialized knowledge, has generally been treated as of minor importance in computational theories of language. However, a cognitively realistic representation must take into account the role of both general syntactic knowledge and specialized knowledge about particular phrases.</Paragraph> <Paragraph position="9"> While these arguments are directed at developing cognitively valid theories of linguistic representation, the handling of idiomatic constructs and of specialized phrasal knowledge has a substantial influence on the robustness and efficiency of a system. If specialized linguistic knowledge is indeed as pervasive as Chafe argues, a system that deals only with &quot;core&quot; grammatical and productive constructs will handle but a small portion of a language. A generator working within such a system would be severely limited in the range of utterances it could produce and in its ability to produce an output appropriate to a given context. 
On the other hand, failing to take advantage of linguistic generalizations can introduce redundancy and possibly inefficiency into the knowledge base. Robust and efficient language processing therefore demands a representation that takes advantage of both specialized idiomatic and general syntactic knowledge. Experience with the UNIX Consultant has suggested that the interaction of specialized and general linguistic knowledge is important for a natural language interface. This interaction is accomplished in PHRED by allowing the generator to combine ordering patterns with patterns used to relate linguistic constructs to their particular meanings.</Paragraph> <Paragraph position="10"> Fillmore (1979) gives arguments for the idea of the structural formula, a phrase or construction that cannot be described strictly as the composition of its components but may still have a certain degree of structural freedom. Fillmore presents &quot;<Time unit> in and <Time unit> out&quot; as an example of such a formula, manifest in expressions such as day in and day out and week in and week out. More recently, Fillmore and others extend this idea to a theory of grammatical constructions (cf. Fillmore, Kay, and O'Connor 1984; Lakoff 1984), focusing on expressions that exhibit certain regularities and obey some grammatical constraints but whose behavior cannot be determined by &quot;core&quot; grammar. 
Examples of such expressions are let alone, as in He didn't make first lieutenant, let alone general, and the deictic there, as in There goes Harry, shooting his mouth off again. Fillmore, Kay, and O'Connor point out the difference between attempting to develop a minimal base of knowledge from which a linguistic competence can be computed, and attempting to develop a knowledge base that represents how human linguistic knowledge is in fact stored.</Paragraph> <Paragraph position="11"> As an example of this distinction, consider the division drawn by Fillmore, Kay, and O'Connor between idioms of decoding, such as kick the bucket, and spill the beans, and idioms of encoding only, such as answer the door, and wide awake. All of these are grammatical idioms; that is, they have a syntactic structure and word order compatible with core grammatical constructs. The idioms of decoding, however, require specialized knowledge both for the comprehension of their meaning and their appropriate use. The idioms of encoding could possibly be comprehended using knowledge about their components only, but specialized knowledge is required to predict their use. Whether this specialized knowledge is to be stored in a given representational model therefore depends on what problem the model is addressing: competence, comprehension, or production. 
We have thus distinguished three potential classes of linguistic knowledge: 1) the knowledge required to determine the membership of a given phrase or sentence in a language, 2) the knowledge necessary to determine the meaning of a phrase, and 3) the knowledge that determines appropriate use of the phrase.</Paragraph> <Paragraph position="12"> Computational linguistics has emphasized the first class, and thus many systems have attempted to define the second and third knowledge classes by adding auxiliary knowledge to a grammar for a linguistic competence.</Paragraph> <Paragraph position="13"> The PHRAN/PHRED pattern-concept pair representation, on the other hand, attempts to subsume the three classes into a single framework. Since the goal of PHRAN and PHRED is proficient analysis and use of language, the distinction between grammatical and extra-grammatical idioms becomes of minor importance. It seems counterintuitive to treat phrases such as all of a sudden as of a different nature from kick the bucket simply because the former is extragrammatical. Further, the emphasis on the ability to compute a linguistic competence using a small set of rules is diminished. If specialized knowledge about a given phrase is required for its appropriate use, there is no reason why this knowledge cannot also be used for its syntactic analysis, even if, in a system that performs analysis alone, such knowledge would be redundant.</Paragraph> <Paragraph position="14"> Consider the phrase answer the door. A pure syntactic analyzer would require no special knowledge to recognize the construct as a valid verb phrase. It is possible as well that the meaning of the phrase could be determined based on the structure of the verb phrase and its constituents. 
However, in order for PHRED to give the phrase its deserved distinction from respond to the door or other less appropriate utterances, special knowledge, that answer the door means to open a door in response to a knock or doorbell, is required. Since this knowledge is encoded into the common knowledge base, it may also be used by PHRAN to determine the meaning of the phrase.</Paragraph> <Paragraph position="15"> The development of a knowledge base for the purposes of both language analysis and language production therefore changes the nature of the linguistic knowledge base and its use. Information that is redundant when considered from a formal linguistic standpoint may be important for a particular aspect of language processing. Such specialized knowledge may then be used by other components of the system. Thus the emphasis in the PHRAN/PHRED representation is on the storage of such redundant information rather than on its computation.</Paragraph> <Paragraph position="16"> Specialized knowledge about phrases and constructions is an integral part of the knowledge base and is used preferentially to general knowledge which requires more computation, both for analysis and production.</Paragraph> <Paragraph position="17"> Of course, fundamental differences between analysis and generation still exist in PHRAN and PHRED. While the two programs have a shared knowledge base, they have entirely independent methods of accessing and applying their linguistic knowledge. PHRAN accesses patterns by recognizing sequences of constituents; PHRED must select a pattern based on the concept it is to express and the constraints which the pattern must satisfy. The PHRED approach to language generation is committed to the representation of linguistic knowledge in a declarative form which can be shared by the analyzer. 
The knowledge structures used by the generator are the same as those used by the analyzer, but the process that makes use of this knowledge to produce an utterance still reflects the basic choice problem.</Paragraph> <Paragraph position="18"> The appropriateness of natural language output seems enhanced by the pattern-concept pair representation.</Paragraph> <Paragraph position="19"> Much of the knowledge used to produce language, particularly in specialized domains, is specialized knowledge. A natural language program that treats grammatical constructions and canned or idiomatic phrases independently of &quot;core&quot; grammar requires special rules and procedures to make use of such phrases. In PHRED specialized constructs are selected and produced using the same mechanism as the more productive constructs, facilitating the interaction of linguistic knowledge of varying levels of generality. In this way a wider range of appropriate utterances may be produced from a given conceptual form.</Paragraph> <Paragraph position="20"> This discussion has focussed on the general representational aspects of PHRED. The next section concentrates on the details which relate specifically to other generation systems.</Paragraph> <Paragraph position="21"> 5.2. PHRED AND OTHER GENERATION SYSTEMS PHRED differs from other generation systems primarily in the way it applies its knowledge to the generation task. Many language generation systems used in conjunction with large programs separate the linguistic knowledge base and lexicon from the conceptual knowledge base of the system (McDonald 1980, Mann and Matthiessen 1983, McKeown 1982). This has a variety of advantages, particularly the ability to develop and modify one module without affecting another. It also has the disadvantage of inhibiting the use of conceptual information by the generator, or of requiring redundant representation of such information, unless the modules are specifically designed to utilize common knowledge. 
In PHRED, linguistic knowledge, e.g., pattern-concept pairs, is maintained separately from world knowledge, e.g., knowledge about the UNIX domain, to permit such advantages as the interchangeability of English and Spanish knowledge bases in UC. However, the generator may access the conceptual knowledge base of the system and such knowledge may interact with the syntactic knowledge.</Paragraph> <Paragraph position="22"> For example, the verbs remove and delete are synonymous when used to refer to actions on files, but delete may not generally be used with physical objects. PHRED restricts the use of delete during elaboration by examining the semantic nature of its object. If the object is not a file, the use of delete to refer to the action of removing it is prohibited.</Paragraph> <Paragraph position="23"> Certain other complete natural language systems, like PHRAN/PHRED, exploit knowledge shared between analyzer and generator. The HAM-ANS question-answerer (Wahlster et al. 1983, Busemann 1984) makes use of a shared lexicon. The VIE-LANG system (Steinacker and Buchberger 1983) shares a &quot;syntacticosemantic&quot; lexicon, but the generator accesses this lexicon using a discrimination net with specialized choice knowledge. A notable difference in implementation between PHRED and other generators is in the fetching mechanism. The division of the choice problem into an initial biasing and an evaluation component allows PHRED to bias its construction of utterances using a specialized hashing scheme. This has proven a boon for both simplicity and efficiency, as some of the rules which govern choice are carried out by a simple hashing process and thus fewer patterns reach the restriction phase. 
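The division into a biasing step and an evaluation step can be illustrated with a small Python sketch. It is not PHRED's code: the class PatternIndex, the function choose, the attribute names, and the pattern names are all hypothetical, and the accepts callback merely stands in for the restriction phase. The sketch orders candidates by the number of buckets that yield them, as the paper describes for the hashing mechanism, and breaks ties by insertion order rather than randomly.

```python
from collections import Counter, defaultdict


class PatternIndex:
    # Hypothetical hash index: each (attribute, value) pair is a bucket.
    def __init__(self):
        self.buckets = defaultdict(list)

    def add(self, name, attributes):
        for item in attributes.items():
            self.buckets[item].append(name)

    def fetch(self, request):
        # Biasing step: count how many buckets yield each pattern and
        # return the candidates with the most-reached patterns first.
        hits = Counter()
        for item in request.items():
            for name in self.buckets.get(item, []):
                hits[name] += 1
        return [name for name, _ in hits.most_common()]


def choose(index, request, accepts):
    # Evaluation step: take the first candidate in the stream that the
    # (more expensive) check accepts.
    for name in index.fetch(request):
        if accepts(name):
            return name
    return None


index = PatternIndex()
index.add('do', {'concept': 'mtrans', 'object': 'command'})
index.add('type', {'concept': 'mtrans', 'object': 'command'})
index.add('tell', {'concept': 'mtrans'})

request = {'concept': 'mtrans', 'object': 'command'}
# Stand-in for unification failing on the 'do' pattern, as in the trace.
accepts = lambda name: name != 'do'
print(index.fetch(request))
print(choose(index, request, accepts))
```

In this toy setting the cheap counting step already places the two command-specific patterns ahead of the general one, so the costly check is applied to only two candidates before one is accepted.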
The basic choice mechanism as implemented in PHRED therefore encompasses two different phenomena, which may be viewed as predisposition and selection.</Paragraph> <Paragraph position="24"> Predisposition is the process by which access to a knowledge base is influenced by various factors - such as the context, the concept to be expressed, or specific constraints on the desired output - to influence the order or priority in which elements of the knowledge base are considered. Selection is the evaluation of an element from the knowledge base. Intuitively, predisposition is the underlying access process that influences the likelihood of considering a particular word or phrase; selection is the judgement process which determines whether the word or phrase is appropriate. This resembles the notion of &quot;register&quot; in the systemic tradition (cf. Halliday 1978), but the biasing is not limited to situational influences.</Paragraph> <Paragraph position="25"> There are three motivations for a design that provides for both a predisposition and a selection phase of the choice process. First, a system that employs as its principal choice mechanism, for example, a discrimination net such as Goldman's (Goldman 1975) or a unification scheme such as McKeown's (McKeown 1982) may apply its choice algorithm to many unlikely candidates, sometimes causing inefficiency. For example, the system might consider the verbs smoke and inhale every time it chooses the verb breathe. A fast indexing mechanism that quickly selects candidates trims the time spent evaluating inappropriate choices.</Paragraph> <Paragraph position="26"> The second motivating force lies in the distinction between utterances that are technically correct in expressing a given concept and those that are generally appropriate to a given context. John inhaled air is 
technically correct but generally inappropriate in place of John breathed. This type of distinction can be embedded in a choice mechanism by attempting to axiomatize the rules that determine appropriateness, or it can be embedded in a predisposition mechanism which happens to order the choices according to the context. Predisposition thus provides a means for biasing choice without blurring the distinction between correctness and appropriateness.</Paragraph> <Paragraph position="27"> The third motivation is cognitive validity. The predisposition-selection distinction fits the intuition that people have when they hear an unusual sentence: It's okay but I wouldn't say it. In the example of breathe and inhale air both utterances may fit the input conceptualization, but fluent speakers tend to choose the former. Fluent speakers also bias their predisposition mechanisms according to the nature and formality of the context. Pawley and Syder (1980) find that one of the differences between native and non-native speakers of a language is that non-native speakers take a long time to develop the predisposition component necessary for fluency. Chafe (1984) has pointed out some of the influential factors in the variations between spoken and written, or informal and formal, language. While some of this work is still in its early stages, the evidence strongly suggests a contextual biasing component distinct from the selection or evaluation phase of production.</Paragraph> <Paragraph position="28"> The goal behind the PHRED indexing scheme is to incorporate as much of the choice problem as possible into the fetching, or predisposition, phase. Some language generators (Goldman 1975, McDonald 1980) use indexing tools that model choice as a multistage evaluation or decision-making process. The division of this process in PHRED into an &quot;automatic&quot; biasing component and a judgment component has some practical advantages. 
The hashing algorithm which drives the fetching mechanism orders the stream of patterns retrieved before any of them is actually evaluated, and thus the more time-consuming restriction process is spared having to apply heuristics to make certain choices. For example, a general heuristic used by a number of language generators can be expressed as &quot;Choose the most specific pattern which matches the input constraints&quot;. In PHRED, this heuristic is realized by the hashing mechanism, which orders candidate patterns in terms of the number of buckets that yield them. In this way the sentence John asked Bill to leave is generally produced without considering the alternative John informed Bill that he wanted him to leave.</Paragraph> <Paragraph position="29"> Appelt (1982) has presented language generation as the multi-level process of planning utterances to satisfy multiple goals. A division in this multi-stage process can be made between the task domain and the linguistic domain, i.e., between the system level and the interface level. PHRED operates at the interface level. User input to the UNIX consultant system is first analyzed by PHRAN, producing a conceptual knowledge structure which motivates the system's response (Wilensky, Arens, and Chin 1984). The planning component of the system exists entirely within the task domain of UC. Independent of the language being used, the UC planner makes the choice of illocutionary act, speech act, and the message to be conveyed. PHRED expresses the message in natural language.</Paragraph> <Paragraph position="30"> While the ability to handle complex problems in language planning, such as the generation of references requiring knowledge about the hearer's knowledge, might be desirable even at the PHRED level, it is difficult to perform such planning within a real-time system. It is both counter-intuitive and inefficient to treat language production as primarily a reasoning process involving complex inference mechanisms. 
In fact, the need for such reasoning in language production seems rare. Thus the UC system draws a convenient, if arbitrary, division between the choices of responses and speech acts made by the UC planner and the lexical and structural choices made by PHRED.</Paragraph> <Paragraph position="31"> Other systems, such as Penman (Mann 1983) and TEXT (McKeown 1982), attack the problem of generating coherent multisentential text. This task requires that linguistic rules governing reference and focus influence the process of deciding what to say. PHRED is not well equipped for this problem. While PHRED produces multisentential text when UC passes it successive concepts to express, it has no knowledge of coherence.</Paragraph> <Paragraph position="32"> Nor is there substantial communication between the PHRED level of production and the higher levels of language planning. Such communication, as described by Appelt (1982), would allow the generator to subsume multiple UC goals. In PHRED and UC, much of the process of producing utterances is treated not as planning per se but as the application of prestored knowledge about how language is used. The distinction between this prestored knowledge and general planning is analogous to the difference between compiled and interpreted code in programs. More research is required on how knowledge is compiled and on how prestored knowledge about patterns of speech can be used in conjunction with general knowledge about planning.</Paragraph> <Paragraph position="33"> This discussion has described some of the advantages of the PHRED approach to language generation, as well as some of the areas not really addressed in PHRED. The next section considers some of the promising ways in which the research described here can be extended.</Paragraph> </Section> </Paper>