<?xml version="1.0" standalone="yes"?> <Paper uid="J83-2001"> <Title>Natural Language Access to Data Bases: Interpreting Update Requests</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 Authors' current address: Teknowledge Inc., 525 University Avenue, Palo Alto, CA 94301 </SectionTitle> <Paragraph position="0"> The provision of update capabilities introduces problems not seen in handling queries. These problems arise because the user is phrasing his requests with respect to his view of the data base, which may be a simplification or transformation of the actual data base structure. While a well-formed query expressed in terms of the user's view of the data base will always result in the same answer, regardless of how the query may be mapped into the actual data base structure for execution, this is not the case for an update expressed on a view.</Paragraph> <Paragraph position="1"> Since updates request modification of the content of the data base, different mappings of the update request into the actual data base structure may result in different effects. Some of these effects may be undesirable or unanticipated. Specifically, the user may make requests that are impossible (cannot be performed in any way, due to hidden constraints on the data base), ambiguous (can be performed in several ways), or pathological (can be performed only in ways that cause unanticipated side effects). While human speakers would intuitively reject these unusual readings, a computer program may be unable to distinguish them from more appropriate ones. Copyright 1983 by the Association for Computational Linguistics. Permission to copy without fee all or part of this material is granted provided that the copies are not made for direct commercial advantage and the Journal reference and this copyright notice are included on the first page. To copy otherwise, or to republish, requires a fee and/or specific permission. 0362-613X/83/020057-12$03.00 American Journal of Computational Linguistics, Volume 9, Number 2, April-June 1983. James Davidson and S. Jerrold Kaplan, Natural Language Access to Data Bases: Interpreting Update Requests.</Paragraph> <Paragraph position="2"> For example, a simple request to &quot;Change the teacher of CS345 from Smith to Jones&quot; might be carried out by altering the number of a course that Jones already teaches to be CS345, by changing Smith's name to be Jones, or by modifying a &quot;teaches&quot; link in the data base. While all of these may literally carry out the update, they may implicitly cause unanticipated changes such as altering Jones's salary to be Smith's.</Paragraph> <Paragraph position="3"> Our approach to this problem is to treat updates as requesting that the data base be put into a self-consistent state in which the request is satisfied; the problem is then to select the most desirable of (potentially) several such states. The most desirable such state is considered to be the &quot;nearest&quot; one to the current state (in the sense that it involves the least disruption). A set of domain-independent heuristics is used to rank the potential changes along these dimensions. This process may be guided by various linguistic considerations, such as the difference between transparent and opaque readings of the user's request, the distinction between the sense and reference of referring expressions, and the interpretation of counterfactual conditionals.</Paragraph> <Paragraph position="4"> This paper describes a system, PIQUE, which implements this approach by retaining a model of the user's view and considering possible methods of performing the update in light of the model. Given an update request, the system generates the set of possible changes to the underlying data base that will literally fulfill the request. 
These candidate changes are then evaluated as to their effects on the user's view, the underlying data base, and the data base constraints. If possible, an appropriate one is selected; otherwise an informative message is presented to the user.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 2. The Problem </SectionTitle> <Paragraph position="0"> As a hypothetical example of the problems that can arise during updates, consider a relational data base of employees, salaries, departments, and managers, consisting of two relations: Q2: Change Brown's manager from Jones to Baker.</Paragraph> <Paragraph position="1"> R2: Done.</Paragraph> <Paragraph position="2"> The system has apparently fulfilled the user's request. Q3: What is the average salary paid to Jones's employees? R3: $0.</Paragraph> <Paragraph position="3"> Q4: List Jones's employees.</Paragraph> <Paragraph position="5"> From these responses, the user realizes that something has gone wrong.</Paragraph> <Paragraph position="6"> Q5: List the employees and their managers.</Paragraph> <Paragraph position="7"> The user sees that the system has made two unanticipated changes - changing Smith's and Pullum's managers - in addition to the one that was requested. From the user's point of view, his request is meaningful and unambiguous. He sees a set of values, and asks to change one of them. (He might not even know that employees and managers are linked via their departments.) The problem lies in the fact that his update request can be performed in two ways: a) by making the manager of the Sales department be Baker;</Paragraph> <Paragraph position="8"> b) by moving Brown from the Sales department to the Marketing department. Both of these literally fulfill the request. 
The system, lacking any means for deciding between these, has apparently chosen (a), making Baker the manager of the Sales department, with the unanticipated effect that two other employees have had their managers changed.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 3. A More Formal Characterization </SectionTitle> <Paragraph position="0"> This problem can be explained more formally. Given a data base structure, define the user's view function F as the transformation that is applied to the data base to yield the conceptualization with which the user works. For instance, in the example in section 2, the view function, as defined by Q1, is a transformation consisting of a join and a projection, which is applied to the original two files to yield a single new file with only two attributes. Define the user's view as the result of applying the view function to a given state of the data base; in the example, this produces a file with five entries, as shown in R1.</Paragraph> <Paragraph position="1"> A user's update request (call it u) is a request to update the view. In the example, the request is stated in Q2. Since the view is only 'virtual' (derived from the data), we cannot modify it directly, but must make changes to the underlying data base. Call the result of translating the update request to the data base level T(u). The object is to find the change to the underlying data base that comes closest to having the desired effect on the user's view. 
That is, we want the translation T(u) that produces a revised data base such that, when the view function is applied to that data base, the result is the view requested by the user.</Paragraph> <Paragraph position="2"> In graphical terms: D represents the initial state of the data base, D' the state that results after applying the translation T(u).</Paragraph> <Paragraph position="4"> In mathematical terms, the mapping F from the underlying data base D to the user's view F(D) induces a homomorphism. Loosely defined, a homomorphism is a function that preserves the structure of its arguments under given operations. In this case, the operations are changes to the underlying data base, and corresponding changes to the user's view. The difficulties with updates expressed on the view, rather than the underlying data base, arise from the characteristics of the inverse of this homomorphism: elements in the user's view (states of the &quot;conceptual&quot; data base) map under F^-1 into a set of states of the underlying data base. This set may be empty (if the view update cannot be accomplished in any way), or have many elements (in the case of a request that is ambiguous with respect to D). If the mapping F is invertible, i.e. F^-1 is also a function, then an isomorphism is induced. In this case, each requested update will have a single, unambiguous interpretation in the underlying data base, and the difficulties addressed here do not arise.</Paragraph> <Paragraph position="5"> However, in general this is not the case.</Paragraph> <Paragraph position="6"> The ideal update translation will produce a state of the data base that, when transformed by the user's view function, exactly yields the revised state that he requested. In actuality, our implementation will consider changes to the data base that literally fulfill the user's request but may not yield precisely the intended view u(F(D)). 
In the example, there were two translations of the user's request; update (b) yielded the exact view, update (a) a different one.</Paragraph> <Paragraph position="7"> 4. Description of the PIQUE System We have implemented a prototype system (PIQUE) that addresses this problem by processing update requests in four phases.</Paragraph> <Paragraph position="8"> (1) Decide what the user's current view of the data base is.</Paragraph> <Paragraph position="9"> The system maintains an ongoing model of the user's conception of the data base, derived from the dialogue.</Paragraph> <Paragraph position="10"> (2) Use the view to generate a set of candidate updates T(u), which perform the update.</Paragraph> <Paragraph position="11"> When an update comes in, it is assumed to be an update to the user's view. That is, the user requests changes with respect to his conceptualization of the data base. The candidate translations are updates to the data base, each of which literally accomplishes the user's request.</Paragraph> <Paragraph position="12"> (3) Use a set of ordering heuristics to rank these candidates, in terms of how accurately they fulfill the user's request.</Paragraph> <Paragraph position="13"> These candidates are evaluated according to the ordering heuristics, to measure how much impact they have on the user's view. For example, a candidate that causes side effects (unrequested changes to the user's view) is ranked lower than one that does not cause such side effects.</Paragraph> <Paragraph position="14"> &quot;Pragmatic&quot; information contained in the data base schema is also used in making the decision.</Paragraph> <Paragraph position="15"> (4) Take action, depending on the number of candidates and their ranking.</Paragraph> <Paragraph position="16"> When the candidates have been ranked, action is taken. 
This might consist of performing one of the candidates, offering a choice to the user, or explaining why the update cannot be performed at all.</Paragraph> <Paragraph position="17"> These phases are considered in turn.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.1 Inferring the user's view </SectionTitle> <Paragraph position="0"> The user of a natural language data base system typically has a conception of the data base that is a subset of the relations, attributes, connections, and records actually present. In order to interpret updates correctly, the system must take into account the user's current conception of the data base. Our approach is to build a user model based on the concepts of which the user has indicated an awareness, those that have occurred in his queries and updates.</Paragraph> <Paragraph position="1"> This is implemented by making use of the connection graphs corresponding to the user's inputs. A system that processes natural language inputs must find paths through the data base, defined by operations such as joins, which connect the concepts mentioned in the input. (The LADDER system, for example, provides this service with the help of navigation information stored in a separate structural schema.) This set of paths is called the connection graph.</Paragraph> <Paragraph position="2"> The importance of this work is that the connection graph provides a good model for the structure of the user's view. That is, each query implicitly induces a view of the data base that the user holds, at least until the next input. When an update is received, it can be checked for compatibility with the current view, to see if it could be an attempt to update that view. 
This compatibility test basically checks to see whether the concepts and relationships mentioned in the update are completely contained in the view. (The actual matching criterion is more complicated than simple inclusion, but this will serve for explanatory purposes.) If the update and view are compatible, the user is assumed to be continuing an interaction with that view.</Paragraph> <Paragraph position="3"> Consider the example of section 2. The user poses a query that mentions employees and their managers.</Paragraph> <Paragraph position="4"> He then makes an update request of a similar form.</Paragraph> <Paragraph position="5"> Because the update request is compatible with the view induced by the previous query, the user is assumed to be referring to that view and to be asking to change it. Note that although departments are needed in the connection graph, they are not mentioned by the user, and therefore do not appear in the view.</Paragraph> <Paragraph position="6"> Views are stacked as the dialogue progresses, and updates can be checked for compatibility with all previous views (most recent first). This enables the system to correctly handle a situation in which a user returns to a previous view for further work.</Paragraph> <Paragraph position="7"> Note that an update also induces a connection graph, just as a query does. If an update request is not compatible with any of the views defined previously, the connection graph for the update itself can be used to define the view. This occurs if the user is making an update unrelated to any of the information that he has examined. In this case, the view must be inferred from the update alone. 
Thus, to return to the example of section 2, &quot;Change Brown's manager from Jones to Baker&quot; might be meaningful even if the user has not previously asked about these things.</Paragraph> <Paragraph position="8"> This strategy is conservative, in that the only concepts that will appear in views are those of which the user has indicated at least some awareness. As a result, the system will never assume a view that is more complex than the one actually held by the user, and thus will never mislead him by introducing a new concept during a response or explanation. The errors that occur will consist of underestimating the user's familiarity with the data base; the system will tend to be pedantic, rather than mysterious.</Paragraph> <Paragraph position="9"> This strategy also provides a notion of focus: as the user discusses different parts of the data base, the view changes automatically. This is important, because the notion of side effect changes as the user's focus changes. Changes occurring to previous views are less important than changes occurring to the current view.</Paragraph> <Paragraph position="10"> The concept of user modelling is well known in artificial intelligence (Mann et al. 1977). A common approach is to record an explicit list of the things the user knows (for example, Cohen 1978). Our model, however, is much simpler. Given the role of the view information in the inferencing heuristics, this model is adequate for our purposes. Davidson (1982) discusses the issue of modeling in more detail.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.2 Generating candidate updates </SectionTitle> <Paragraph position="0"> One of the crucial steps of the algorithm described above is the generation of candidate updates that can then be evaluated for plausibility. 
In most cases, an infinite number of changes to the data base are possible that would literally carry out the request (mainly by creating and inserting &quot;dummy&quot; values and links). However, this process can be simplified by generating only candidate updates that can be directly derived from the user's phrasing of the request. This limitation is justified by observing that most reasonable updates correspond to different readings of expressions in referentially opaque contexts.</Paragraph> <Paragraph position="1"> A referentially opaque context is one in which two expressions that refer to the same real world concept cannot be interchanged in the context without changing the meaning of the utterance (Quine 1971). Natural language data base updates often contain opaque contexts.</Paragraph> <Paragraph position="2"> For example, consider that a particular individual (in a suitable data base) may be referred to as &quot;Dr.</Paragraph> <Paragraph position="3"> Smith&quot;, &quot;the instructor of CS100&quot;, &quot;the youngest assistant professor&quot;, or &quot;the occupant of Room 424&quot;. While each of these expressions may identify the same data base record (that is, they have the same extension), they suggest different methods for locating that record (their intensions differ). In the context of a data base query, where the goal is to unambiguously specify the response set (extension), the method by which they are accessed (the intension) does not normally affect the response. Updates, on the other hand, are often sensitive to the substitution of extensionally equivalent referring expressions. &quot;Change the instructor of CS100 to Dr. Jones.&quot; may not be equivalent to &quot;Change the youngest assistant professor to Dr. Jones.&quot; or &quot;Change Dr. 
Smith to Dr. Jones.&quot; Each of these may imply different updates to the underlying data base.</Paragraph> <Paragraph position="4"> For operating with an expression in an opaque context, therefore, we must consider the sense of the expression, in addition to its referent (Frege 1952). In a data base system, this sense is embodied in the procedure used to evaluate the referring expression; the referent is the entity obtained via this evaluation. A request for a change to a referring expression is thus not specifically a request to perform a substitution on the referent of the expression, but rather a request to change the data base so that the sense of the expression now has a new referent. That is, after the update, evaluating the same procedure should yield the new (requested) result.</Paragraph> <Paragraph position="5"> For example, consider a data base of ships, ports, and docks, where ships are associated with docks, and docks with ports. Assume that there is currently a ship named Totor in dock 12 in Naples (and no other ship in Naples), and consider the following updates: Change Totor to Pequod.</Paragraph> <Paragraph position="6"> Change the ship in dock 12 to Pequod.</Paragraph> <Paragraph position="7"> Change the ship in Naples to Pequod.</Paragraph> <Paragraph position="8"> The referring expressions (italicized) have the same referent in all three cases, but the senses differ. The expression &quot;Totor&quot; is resolved by means of a lookup in the ships relation; &quot;the ship in dock 12&quot; requires a join between the ships and docks relations; &quot;the ship in Naples&quot; requires a join between all three relations.</Paragraph> <Paragraph position="9"> Consider the ways of performing each request, as indicated by the sense of the referring expression.</Paragraph> <Paragraph position="10"> The first version can be implemented only by making a direct substitution on the ships relation, corresponding to renaming the ship. 
The second admits this possibility, but also the possibility of moving a new ship into the dock (if there is already a ship named Pequod).</Paragraph> <Paragraph position="11"> The third allows the first two, plus the possibility of moving a different dock into Naples (if there is a dock somewhere else with the Pequod in it). (This will later be ruled out for other reasons, as explained in the next section, but cannot be excluded on purely linguistic grounds.) Thus, the particular referring expression selected by the user motivates a set of possible actions that may be appropriately taken, but does not directly indicate which is intended or preferred.</Paragraph> <Paragraph position="12"> This characteristic of natural language updates suggests that the generation of candidate updates can be performed as a language driven inference (Kaplan 1978) without severely limiting the class of updates to be examined. Language driven inference is a style of natural language processing in which the inferencing process is driven (and hence limited) by the phrasing of the user's request.</Paragraph> <Paragraph position="13"> In this instance, the candidate updates are generated by examining the referring expression presented in the update request. The procedure implied by this expression follows an &quot;access path&quot; through the data base structure. 
The candidate updates computed by the program consist of changing links or pointers along that path, or substituting values in the final record(s) identified.</Paragraph> <Paragraph position="14"> For example, consider the structure of the &quot;ships&quot; data base: The candidate translations for the third request (changing &quot;the ship in Naples&quot;) correspond to the following changes to the data base: (1) making a change to the Ships file (i.e., renaming the ship); (2) changing link (b) (moving a new ship into the dock); (3) changing link (a) (moving a new dock into the port).</Paragraph> <Paragraph position="15"> If the expression &quot;the ship in dock 12&quot; were used, only options 1 and 2 would be generated; similarly, if &quot;Totor&quot; were used, only option 1 would be generated.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.3 The selection of appropriate updates </SectionTitle> <Paragraph position="0"> At first examination, it would seem to be necessary to incorporate a semantic model of the domain to select an appropriate update from among the candidate updates. While this approach would surely be effective, the overhead required to encode, store, and process this knowledge for each individual data base may be prohibitive in practical applications. In general, the required information might not be available. What is needed is a general set of heuristics that will select an appropriate update in a reasonable majority of cases, without specific knowledge of the domain.</Paragraph> <Paragraph position="1"> The heuristics that are applied to rank the candidate updates are based on the idea that the most appropriate one is likely to cause the minimum disruption to the user's conception of the data base. This concept is developed formally in the work of Lewis, presented in his 1973 book, Counterfactuals. 
In this work, Lewis examines the meaning and formal representation of such statements as &quot;If kangaroos had no tails, they would topple over.&quot; (p. 8) He argues that to evaluate the correctness of this statement (and similar counterfactual conditionals) it is necessary to construct in one's mind the possible world minimally different from the real world that could potentially contain the conditional (the &quot;nearest&quot; consistent world). He points out that this hypothetical world does not differ only in that kangaroos don't have tails, but also reflects other changes required to make that world plausible. Thus he rejects the idea that in the hypothetical world kangaroos might use crutches (as not being minimally different), or that they might leave the same tracks in the sand (as being inconsistent).</Paragraph> <Paragraph position="2"> The application of this work to processing natural language data base updates is to regard each transaction as presenting a &quot;counterfactual&quot; state of the world, and request that the &quot;nearest&quot; reasonable world in which the counterfactual is true be brought about.</Paragraph> <Paragraph position="3"> For example, the request &quot;Change the teacher of CS345 from Smith to Jones.&quot; might correspond to the counterfactual &quot;If Jones taught CS345 instead of Smith, how would the data base be different?&quot; along with a speech act requesting that the data base be put in this new state.</Paragraph> <Paragraph position="4"> To select this nearest world, three sources of information are used: (a) the side effects entailed by the different candidates; (b) pragmatic information contained in the data base schema; (c) semantic constraints attached to the data base schema.</Paragraph> <Paragraph position="5"> (a) 
Side effects As illustrated in the example of section 2, updates may have effects on the user's view and the data base beyond those literally requested. Using the rationale of &quot;minimal disruption&quot;, updates that do not have side effects are preferable to those that do. For each candidate, we consider the number and type of side effects caused, and rank the candidates accordingly. In data base management terms, the update with the fewest side effects on the user's data sub-model is selected as the most appropriate.</Paragraph> <Paragraph position="6"> Considering the example from section 2, note that the two candidates have different effects on the user's view. The one that was actually performed - candidate (a), changing the name of the manager of the Sales department - also changes two other values in the view. The other candidate - (b), moving Brown to the Marketing department - does not have these effects. Therefore, the latter more exactly fulfills the user's request, and would be preferred.</Paragraph> <Paragraph position="7"> The side effects that actually occur for a particular candidate are in a sense accidental, in that they depend on the particular state of the data base. For example, the number of side effects caused by changing the manager of the Sales department depends upon how many other employees happen to work in that department. To avoid this property of contingency, a more stable approach is to consider what side effects could result from performing the given candidate in any state of the data base. This set of potential side effects can be determined by analyzing the restrictions in the data base schema concerning the cardinality and dependency of relationships between entities. 
The significance of this concept is that the constraints on cardinality and dependency may be strong enough to ensure that the set of potential side effects (and hence the set of actual ones) is empty - indicating that the given candidate does not have any side effects in the current state, and more important, could not have side effects in any state.</Paragraph> <Paragraph position="8"> Consider once again the example of section 2. Of the two updates, (a) causes actual side effects, (b) doesn't. A stronger reason for preferring (b) is that it cannot cause side effects to the user's view, regardless of the state of the data base. To see this, note that the cardinality of the relationship between employees and departments is typically N:1 - each employee works for only one department. Thus, an employee can have only one manager, and moving the employee to a new department cannot cause any changes to this aspect of the view beyond the one requested. The potential side effects of (a) consist of changes to the managers of employees other than Brown; the two actual side effects are an example of this.</Paragraph> <Paragraph position="9"> This calculation can be generalized, by considering a graphical representation of the view, in which nodes represent relations, and arcs stand for the joins (relationships) between relations. For relationships that are N:1 as in the example, the arc is labeled to indicate the direction of the functional determination.</Paragraph> <Paragraph position="10"> Thus, the graph for the example would be: The view graph can be used to evaluate the side effects for each translation, with the following rule: Consider the value or link being changed by the translation in question, and the relation of which it is a part. 
If that relation is a root of the view graph, that is, if there exist paths following the arrows, from the relation in question to all the other relations of the graph, then the translation will not have any side effects.</Paragraph> <Paragraph position="11"> For the example in question, translation (a) consisted of a change to Departments, while (b) entailed a change to Employees (to move Brown to another department). In the graph, Employees is a root, but Departments is not - the link from Departments to Employees runs the &quot;wrong&quot; way. Thus, the translation (b) cannot entail side effects, although (a) may; this is consistent with the previous observation.</Paragraph> <Paragraph position="12"> In Davidson (1983), this analysis is carried further and developed more formally. We identify a number of different types of side effects and establish graphical conditions for the presence and absence of each. Further, theorems are introduced concerning comparison of side effects for different translations, and the optimality of certain translations is proved.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="sub_section"> <SectionTitle> </SectionTitle> <Paragraph position="0"> In the ranking of candidates for appropriateness, only potential side effects are considered. Explanations, when needed, are phrased with respect to actual side effects, if any exist, otherwise to potential ones.</Paragraph> <Paragraph position="2"> (b) Pragmatic information There may be information in the data base schema to help the selection among candidate updates. 
For example, certain attributes and links in the schema may be designated at design time as static, indicating that they rarely change, or dynamic, indicating that they frequently change. This information is used during implementation to select methods for accessing the information. It may also be of use when ranking candidate updates.</Paragraph> <Paragraph position="3"> Considering the last example from section 4.2, we note that one of the candidates changes the ship by moving a new dock into Naples. This is consistent within the data base and fulfills the update request; but the data base schema would indicate that such a change is unlikely (because the location of a dock is a static attribute), and this candidate's desirability would be downgraded. Similarly, there may be general rules that names change less often than other attributes.</Paragraph> <Paragraph position="4"> Note that this information is merely heuristic; if the only candidate is one that involves such a change, it will be performed.</Paragraph> <Paragraph position="5"> (c) Semantic constraints The schema will often contain semantic constraints that restrict the allowable states of the data base. Examples of these are functional dependencies (for example, &quot;Two employees cannot have the same employee number.&quot;), range constraints (&quot;No employee can make more than $45,000.&quot;), and existence constraints (&quot;If an employee works in a particular department, there must be a record for that department in the departments relation.&quot;).</Paragraph> <Paragraph position="6"> These figure in the process of update interpretation, in the elimination of candidates that are otherwise acceptable. 
In the example of section 4.2, if there is already a ship named Pequod in the data base, the renaming change could cause a name conflict, resulting in the rejection of this candidate.</Paragraph> <Paragraph position="7"> Whereas the pragmatic information discussed above was heuristic, the semantic constraints are absolute.</Paragraph> <Paragraph position="8"> Candidates that violate semantic constraints will never be performed. However, it is still advantageous to generate and consider these candidates, since it is often possible to formulate a meaningful explanation for the user about the nonfulfillment of the request.</Paragraph> <Paragraph position="9"> Our current ordering heuristics incorporate these sources of information. In increasing order of preference, they are: - updates that violate semantic constraints associated with the data base; - updates that violate pragmatic guidelines; - updates with side effects on the user's current view; - updates with no side effects.</Paragraph> <Paragraph position="10"> While this approach can certainly fail in cases where complex domain semantics rule out the &quot;simplest&quot; change, in most cases it is sufficient to select a reasonable update from among the various possibilities. Consider again Lewis' &quot;Counterfactual&quot; framework. We see that the method of generating candidates discussed in section 4.2 defines the accessibility of different states of the world (data base); the semantic constraints define consistency; pragmatic constraints and side effect information are measures of distance between states of the data base.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.4 Action taken </SectionTitle> <Paragraph position="0"> If one candidate is better than the others, it is performed. If there are a number of candidates that cannot be distinguished by the heuristic ranking, the user is told about them and offered a choice. 
If no candidate is admissible (because, for instance, all candidates violate semantic constraints on the data base), the user is so informed.</Paragraph> <Paragraph position="1"> In a number of cases, circumstances must be explained to the user. For instance, if the candidate actually performed has side effects, the user must be notified. If a semantic constraint is violated, the user must be told how.</Paragraph> <Paragraph position="2"> Our approach to explanation assumes that the user is familiar only with his own view of the data base, and so all explanations must be phrased with respect to this understanding (following McKeown 1979).</Paragraph> <Paragraph position="3"> Therefore, options are presented in terms of their effects on the user's view (rather than the actual changes proposed), and violations of semantic constraints are discussed with respect to attributes that the user has already seen. In this way, we ensure that explanations are always comprehensible.</Paragraph> <Paragraph position="4"> 5. Examples of the System in Operation PIQUE runs in INTERLISP (Teitelman 1978) on the DEC System-20 at SRI International as part of the KBMS system (Wiederhold et al. 1981). The natural language parser is written in LIFER, a semantic grammar system designed by Gary Hendrix (1977). The data base access uses SODA, a LISP-compatible variant of the relational calculus developed by Bob Moore (1979); the SODA interpreter used was written by Bil Lewis, and has been modified and extended by Jim Davidson to handle updates.</Paragraph> <Paragraph position="5"> Note that some of the information printed by the current system is presented merely for pedagogical purposes, to show the intermediate stages of the computation. In the course of a real run, such information (shown indented in the transcripts below) would be suppressed. 
The user's input is preceded by >.</Paragraph> <Paragraph position="6"> Assume a sample data base containing the following information: Individual employees, with salary, department, and 1. Example of an update performed using side effect heuristics.</Paragraph> <Paragraph position="7"> Consider once again the example shown in section 2, this time with the real system: The program now generates the candidate updates. As indicated, there are two of these.</Paragraph> <Paragraph position="8"> The possible ways of performing the update: 1. In the relation DMLD change the MGR attr of the tuple</Paragraph> </Section> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> DEPT MGR LOC DIV SALES JONES SF I </SectionTitle> <Paragraph position="0"> to the value BAKER 2. In the relation ESD change the DEPT attr of the tuple</Paragraph> </Section> <Section position="7" start_page="0" end_page="0" type="metho"> <SectionTitle> EMP SAL DEPT BROWN 25 SALES </SectionTitle> <Paragraph position="0"> to the value MKTING Note that the update of changing the DEPT attribute of the tuple (MKTING BAKER) to Sales would make Baker one of the managers of Brown. But, it would also leave Jones as a manager of Brown, and thus does not fulfill the user's request, even literally. For this reason, it has not even been considered by the system. These candidates are then evaluated, in terms of the accuracy with which they fulfill the user's request. In the absence of any strong preference for either one, the decision is made on the basis of potential side effects to the user's view.</Paragraph> <Paragraph position="1"> These translations have the following effects: 1. 
Effects are: In the view: potentially changing the MGR of other EMPS.</Paragraph> </Section> <Section position="8" start_page="0" end_page="0" type="metho"> <SectionTitle> 2. Effects are: </SectionTitle> <Paragraph position="0"> None.</Paragraph> <Paragraph position="1"> The program concludes that update (2) is superior to (1), since (1) has the possibility of changing the manager of other employees. (In actuality, (1) would change the managers of both Smith and Pullum.) Desired translation is: 2.</Paragraph> <Paragraph position="2"> Revised view is:</Paragraph> </Section> <Section position="9" start_page="0" end_page="0" type="metho"> <SectionTitle> EMP MGR ADAMS FISHER WHITE BAKER BROWN BAKER SMITH JONES PULLUM JONES </SectionTitle> <Paragraph position="0"> The result accords with the user's wishes; only the requested change has been made to his view.</Paragraph> </Section> <Section position="10" start_page="0" end_page="0" type="metho"> <SectionTitle> >(MOVE ADAMS FROM SD TO LA) </SectionTitle> <Paragraph position="0"> The system interprets this as a request to change Adams' location.</Paragraph> <Paragraph position="1"> The possible ways of performing the update: 1. In the relation DMLD change the LOC attr of the tuple</Paragraph> </Section> <Section position="11" start_page="0" end_page="0" type="metho"> <SectionTitle> DEPT MGR LOC DIV INVNTRY FISHER SD I </SectionTitle> <Paragraph position="0"> to the value LA 2. 
In the relation ESD change the DEPT attr of the tuple</Paragraph> </Section> <Section position="12" start_page="0" end_page="0" type="metho"> <SectionTitle> EMP SAL DEPT ADAMS 30 INVNTRY </SectionTitle> <Paragraph position="0"> to the value MKTING Two candidates are identified, corresponding to (1) physically moving the department to a different location, or (2) reassigning the employee.</Paragraph> <Paragraph position="1"> Now, the candidates are evaluated.</Paragraph> <Paragraph position="2"> These translations have the following side effects on the view: 1. Effects are: Violation of pragmatic constraints.</Paragraph> </Section> <Section position="13" start_page="0" end_page="0" type="metho"> <SectionTitle> 2. Effects are: </SectionTitle> <Paragraph position="0"> None.</Paragraph> <Paragraph position="1"> The &quot;location&quot; attribute of the DMLD relation, representing the location of the department, is marked in the data base schema as &quot;static&quot;, indicating that it rarely changes. Thus, update (1) is unlikely. The system detects this. Note that update (1) also has potential side effects on the user's view, but the violation of the pragmatic constraint is a stronger reason for rejection.</Paragraph> <Paragraph position="2"> Desired translation is: 2.</Paragraph> <Paragraph position="3"> The program generates the ways of performing the update. There is only one of these.</Paragraph> <Paragraph position="4"> The possible ways of performing the update: 1. In the relation EN change the EMPNO attr of the tuple</Paragraph> </Section> <Section position="14" start_page="0" end_page="0" type="metho"> <SectionTitle> EMP EMPNO SMITH 222 </SectionTitle> <Paragraph position="0"> to the value 103.</Paragraph> <Paragraph position="1"> [The effects engendered by this candidate are now listed; the candidate would violate a semantic data base constraint.] These translations have the following effects: 1. Effects are: Violation of semantic constraints. 
The system now tells the user what has happened, explaining why the update couldn't be performed, and how the semantic constraint would be violated. This update could not be performed, because of semantic constraints: The EMPNO value of 103 has already been assigned to the tuple</Paragraph> </Section> <Section position="15" start_page="0" end_page="0" type="metho"> <SectionTitle> EMP EMPNO ADAMS 103 </SectionTitle> <Paragraph position="0"> which has the DEPT value of INVNTRY.</Paragraph> <Paragraph position="1"> This update would violate the functional dependency EMPNO --> EMP.</Paragraph> <Paragraph position="2"> Note that, without the DEPT value printed out, the user may not realize why he cannot see the (ADAMS 103) tuple. The explanation is thus phrased with respect to the user view.</Paragraph> <Paragraph position="3"> 4. Example of a genuinely ambiguous update.</Paragraph> <Paragraph position="4"> Now, a dialogue concerning a different part of the data base:</Paragraph> </Section> </Paper>