<?xml version="1.0" standalone="yes"?>
<Paper uid="W98-1410">
  <Title>Macroplanning with a Cognitive Architecture for the Adaptive Explanation of Proofs</Title>
  <Section position="3" start_page="88" end_page="89" type="metho">
    <SectionTitle>
2 ACT-R: A Cognitive Architecture
</SectionTitle>
    <Paragraph position="0"> In cognitive science, there is a consensus that production systems are an adequate framework to describe the functionality of the cognitive apparatus. Production systems that model human cognition are called cognitive architectures. In this section we describe the cognitive architecture ACT-R1 \[Anderson, 1993\], which is well suited for user-adaptive explanation generation because of its conflict resolution mechanism. Further examples of cognitive architectures are SOAR \[Newell, 1990\] and EPIC \[Meyer and Kieras, 1997\].</Paragraph>
    <Paragraph position="1"> ACT-R has two types of knowledge bases, or memories, to store permanent knowledge in: declarative and procedural representations of knowledge are explicitly separated into the declarative memory and the procedural production rule base, but are intimately connected.</Paragraph>
    <Paragraph position="2"> Procedural knowledge is represented in production rules (or simply: productions) whose conditions and actions are defined in terms of declarative structures. A production can only apply if its conditions are satisfied by the knowledge currently available in the declarative memory. An item in the declarative memory is annotated with an activation that influences its retrieval. The application of a production modifies the declarative memory, or it results in an observable event.</Paragraph>
    <Paragraph position="3"> The set of applicable productions is called the conflict set. A conflict resolution heuristic derived from a rational analysis of human cognition determines which production in the conflict set will eventually be applied.</Paragraph>
    <Paragraph position="4"> In order to allow for a goal-oriented behavior of the system, ACT-R manages goals in a goal stack. The current goal is the one on top of the stack. Only productions that match the current goal are applicable.</Paragraph>
    <Section position="1" start_page="88" end_page="89" type="sub_section">
      <SectionTitle>
2.1 Declarative Knowledge
</SectionTitle>
      <Paragraph position="0"> Declarative knowledge is represented in terms of chunks in the declarative memory. An example is the chunk factFsubsetG, which encodes the fact that F ⊆ G, where subset-fact is a concept and F and G are contextual chunks associated to factFsubsetG:

factFsubsetG
  isa   subset-fact
  set1  F
  set2  G

Chunks are annotated with continuous activations that influence their retrieval. The activation Ai of a chunk Ci is defined as</Paragraph>
      <Paragraph position="2"> Ai = Bi + Σj Wj Sji    (1)

where Bi is the base-level activation, Wj is the weighting of a contextual chunk Cj, and Sji is the strength of the association of Ci with Cj. In Bi, which is defined such that it decreases logarithmically when Ci is not used, ACT-R models the forgetting of declarative knowledge.

1 Actually, I am discussing ACT-R 4.0, which includes some substantial changes compared to older versions. The acronym ACT denotes adaptive control of thought; R refers to the rational analysis that influenced the theory.</Paragraph>
      <Paragraph position="3"> Note that the definition of the activation establishes a spreading of activation to adjacent chunks, but not further; multi-link spread is not supported.</Paragraph>
      <Paragraph position="4"> The constraint on the capacity of the human working memory is approached by defining a retrieval threshold τ, where only those chunks Ci can be matched whose activation Ai is higher than τ. Chunks with an activation less than τ are considered forgotten.</Paragraph>
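The activation equation (1) and the retrieval threshold can be sketched in a few lines of Python; all numbers and names below are illustrative assumptions, not values from ACT-R:

```python
# A minimal sketch of activation equation (1), Ai = Bi + sum_j Wj*Sji,
# together with the retrieval threshold tau described above.
def activation(base, context):
    """base: base-level activation Bi;
    context: (Wj, Sji) pairs for the contextual chunks Cj."""
    return base + sum(w * s for w, s in context)

def retrievable(a, tau):
    # Chunks whose activation does not exceed tau count as forgotten.
    return a > tau

# Chunk factFsubsetG with contextual chunks F and G (assumed numbers).
a = activation(1.0, [(0.5, 2.0), (1.0, 0.5)])
print(a)                    # 2.5
print(retrievable(a, 0.0))  # True
```

Under this reading, forgetting is simply the base-level term decaying until the total activation drops below τ.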
      <Paragraph position="5"> New declarative knowledge is acquired when a new chunk is stored in the declarative memory, as is always the case when a goal is popped from the goal stack. The application of a production may also cause a new chunk to be stored if required by the production's action part.</Paragraph>
    </Section>
    <Section position="2" start_page="89" end_page="89" type="sub_section">
      <SectionTitle>
2.2 Procedural Knowledge
</SectionTitle>
      <Paragraph position="0"> The operational knowledge of ACT-R is formalized in terms of productions. Productions generally consist of a condition part and an action part, and can be applied if the condition part is fulfilled.</Paragraph>
      <Paragraph position="1"> In ACT-R both parts are defined in terms of chunk patterns. The condition is fulfilled if its first chunk pattern matches the current goal and the remaining chunk patterns match chunks in the declarative memory. An example for a production is:

IF   the current goal is to show that x ∈ S2, and it is known that x ∈ S1 and S1 ⊆ S2
THEN conclude that x ∈ S2 by the definition of ⊆.

Similar to the base-level activation of chunks, the strength of a production is defined such that it decreases logarithmically when the production is not used. The time spent to match a production with a chunk depends on the activation of the chunk.2 It is defined such that it is negative exponential to the sum of the activation of the chunk and the strength of the production. Hence, the higher the activation of the chunk and the strength of the production, the faster the production matches the chunk. Since the activation must be greater than the retrieval threshold τ, τ constrains the time maximally available to match a production with a chunk.</Paragraph>
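One hypothetical reading of the "negative exponential" match-time relation is sketched below; the scaling constant `scale` is our assumption, as the text gives no concrete functional form:

```python
import math

# Assumed form of the match-time relation described above: the time to
# match a production p against a chunk i falls off exponentially with the
# sum of the chunk's activation Ai and the production's strength Sp.
def match_time(activation_i, strength_p, scale=1.0):
    return scale * math.exp(-(activation_i + strength_p))

# Higher activation and higher strength both yield a faster match.
print(match_time(2.0, 1.0) < match_time(1.0, 1.0) < match_time(1.0, 0.5))
```

This reproduces the qualitative claim in the text: raising either the activation or the strength shortens the match time.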
      <Paragraph position="2"> The conflict resolution heuristic starts from assumptions on the probability P that the application of the current production leads to the goal and on the costs C of achieving that goal by this means. Moreover, G is the time maximally available to fulfill the goal. The net utility E of the application of a production is defined as</Paragraph>
      <Paragraph position="4"> E = PG - C    (2)

We do not go into detail on how P, G and C are calculated. For the purposes of this paper, it is sufficient to note that G only depends on the goal, but not on the production, and that the costs C depend among other things on the time to match a production. The faster the production matches, i.e. the stronger it is and the greater the activations of the matching chunks are, the lower the costs.</Paragraph>
      <Paragraph position="5"> To sum up, in ACT-R the choice of a production to apply is as follows:

1. The conflict set is determined by testing the match of the productions with the current goal.
2. The production p with the highest utility is chosen.</Paragraph>
      <Paragraph position="6"> 3. The actual instantiation of p is determined via the activations of the corresponding chunks. If no instantiation is possible (because of τ), p is removed from the conflict set and the algorithm resumes at step 2; otherwise the instantiation of p is applied.</Paragraph>
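The three-step choice can be sketched as follows; the dictionary layout and the numbers are our assumptions, only the utility formula E = PG - C and the threshold test come from the text:

```python
# Sketch of conflict resolution: pick the highest-utility production
# (step 2), falling back when no instantiation clears the threshold (step 3).
def choose_production(conflict_set, G, tau):
    """conflict_set: dicts with 'name', 'P', 'C', 'chunk_activations'."""
    by_utility = sorted(conflict_set, key=lambda p: p['P'] * G - p['C'],
                        reverse=True)
    for p in by_utility:                                  # step 2
        if all(a > tau for a in p['chunk_activations']):  # step 3
            return p['name']
        # no instantiation possible: p is dropped, try the next best
    return None

rules = [
    {'name': 'p_general',  'P': 0.95, 'C': 0.1, 'chunk_activations': [-1.0]},
    {'name': 'p_specific', 'P': 0.8,  'C': 0.2, 'chunk_activations': [0.8]},
]
print(choose_production(rules, G=1.0, tau=0.0))  # p_specific
```

Here `p_general` has the higher utility, but its only matching chunk lies below τ, so the algorithm falls back to `p_specific`, exactly as in step 3.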
      <Paragraph position="7"> 2 In this context, time does not mean the CPU time needed to calculate the match, but the time a human would need for the match according to the cognitive model.</Paragraph>
      <Paragraph position="9"> ACT-R provides a learning mechanism, called knowledge compilation, which allows for the learning of new productions. We are currently exploring this mechanism for its utility for the explanation of proofs.</Paragraph>
      <Paragraph position="11"/>
    </Section>
  </Section>
  <Section position="4" start_page="89" end_page="92" type="metho">
    <SectionTitle>
3 The Architecture of P.rex
</SectionTitle>
    <Paragraph position="0"> P.rex is planned as a generic explanation system that can be connected to different theorem provers. It adopts the following features of the interactive proof development environment ΩMEGA \[Benzmüller et al., 1997\]: * Mathematical theories are organized in a hierarchical knowledge base. Each theory in it may contain axioms, definitions, theorems along with proofs, as well as proof methods, and control rules that determine how to apply proof methods.</Paragraph>
    <Paragraph position="1"> * A proof of a theorem is represented in a hierarchical data structure called proof plan data structure (PDS). The PDS makes explicit the various levels of abstraction by providing several justifications for a single proof node, where each justification belongs to a different level of abstraction. The least abstract level corresponds to a proof in Gentzen's natural deduction (ND) calculus \[Gentzen, 1935\]. Candidates for higher levels are proof plans, where justifications are mainly given by more abstract proof methods that belong to the theorem's mathematical theory or to an ancestor theory thereof.</Paragraph>
    <Paragraph position="2"> An example for a PDS is given below on the left. Each line consists of four elements (label, antecedent, succedent, and justification) and describes a node in the PDS. The label is used as a reference for the node. The antecedent is a list of labels denoting the hypotheses under which the formula in the node, the succedent, holds. This relation between antecedent and succedent is denoted by ⊢.</Paragraph>
    <Paragraph position="4"> The derivation of the fact in the node is given by its justification. A justification consists of a rule and a list of labels, the premises of the node. Ji denotes an unspecified justification. HYP and Def∪ stand for a hypothesis and the definition of ∪, respectively. L3 has two justifications on different levels of abstraction: the least abstract justification with the ND-rule CASE (i.e. the rule for case analyses) and the more abstract justification with the rule ∪-Lemma that stands for an already proven lemma about a property of ∪. By agreement, if a node has more than one justification, these are sorted from most abstract to least abstract.</Paragraph>
    <Paragraph position="5"> The proof is as follows: From a ∈ U ∨ a ∈ V we can conclude that a ∈ U ∪ V by the ∪-Lemma. If we do not know the ∪-Lemma, we can come to the conclusion by considering the case analysis with the cases that a ∈ U or a ∈ V, respectively. In each case, we can derive that a ∈ U ∪ V by the definition of ∪.</Paragraph>
    <Paragraph position="6"> A formal language for specifying PDSs is the interface by which theorem provers can be connected to P.rex. An overview of the architecture of P.rex is provided in Figure 1. The crucial component of the system is the dialog planner. It is based on ACT-R, i.e. its operators are defined in terms of productions and the discourse history is represented in the declarative memory by storing conveyed information as chunks (details are given in Section 4). Moreover,</Paragraph>
    <Paragraph position="8"> the presumed declarative and procedural knowledge of the user is encoded in the declarative memory and the production rule base, respectively. In order to explain a particular proof, the dialog planner first assumes the user's supposed cognitive state by updating its declarative and procedural memories. This is done by looking up the user's presumed knowledge in the user model, which was recorded during a previous session. An individual model for each user persists between the sessions. The user model contains assumptions on the knowledge of the user that are relevant to proof explanation. In particular, it makes assumptions on which mathematical theories the user knows, which definitions, proofs, proof methods and mathematical facts he knows, and which productions he has already learned. After updating the declarative and procedural memories, the dialog planner sets the global goal to show the conclusion of the PDS's theorem. ACT-R tries to fulfill this goal by successively applying productions that decompose or fulfill goals. Thereby, the dialog planner not only produces a multimodal dialog plan (see Section 4.1), but also traces the user's cognitive states in the course of the explanation. This allows the system both to always choose an explanation adapted to the user (see Section 4.2) and to react to the user's interactions in a flexible way: the dialog planner analyzes the interaction in terms of applications of productions and then plans an appropriate response. The dialog plan produced by the dialog planner is passed on to the multimodal presentation component, which supports the modalities graphics, text, and speech. It consists of the following subcomponents: A multimodal microplanner (to be designed) plans the scope of the sentences and their internal structure, as well as their graphical arrangement. It also decides whether a graphical or a textual realization is preferred. 
Textual parts are passed on to a linguistic realizer that generates the surface sentences. Then a layout component (to be designed) displays the text and graphics, while a speech system outputs the sentences in speech. Hence, the system should provide the user with text and graphics, as well as spoken output. The metaphor we have in mind is the teacher who explains what he is writing on the board. An analyzer (to be designed) receives the user's interactions and passes them on to the dialog planner.</Paragraph>
    <Paragraph position="10"/>
  </Section>
  <Section position="5" start_page="92" end_page="93" type="metho">
    <SectionTitle>
4 The Dialog Planner
</SectionTitle>
    <Paragraph position="0"> In the community of NLG, there is a broad consensus that the generation of natural language should be done in three major steps \[Reiter, 1994\]. First a macroplanner (text planner) determines what to say, i.e. content and order of the information to be conveyed. Then a microplanner (sentence planner) determines how to say it, i.e. it plans the scope and the internal structure of the sentences.</Paragraph>
    <Paragraph position="1"> Finally, a realizer (surface generator) produces the surface text. In this classification, the dialog planner is a macroplanner for managing dialogs.</Paragraph>
    <Paragraph position="2"> As Wahlster et al. argued, such a three-staged architecture is also appropriate for multimodal generation \[Wahlster et al., 1993\]. By defining the operators and the dialog plan such that they are independent of the communication mode, our dialog planner plans text, graphics and speech.</Paragraph>
    <Paragraph position="3"> Since the dialog planner in P.rex is based on ACT-R, the plan operators are defined as productions. A goal is the task to show the fact in a node n of the PDS. A production fulfills the goal directly by communicating the derivation of the fact in n from already known facts, or splits the goal into new subgoals such as to show the facts in the premises of n. The derivation of a fact is conveyed by so-called mathematics communicating acts (MCAs) and accompanied by storing the fact as a chunk in the declarative memory. Hence the discourse history is represented in the declarative memory. ACT-R's conflict resolution mechanism and the activation of the chunks ensure an explanation tailored to the user. The produced dialog plan is represented in terms of MCAs.</Paragraph>
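The macroplanning loop just described can be sketched as follows; the function and data names are ours, and the discourse history is reduced to a plain set standing in for the chunk store:

```python
# Minimal sketch of the macroplanning loop: a goal to show a node is either
# fulfilled by a Derive MCA, when all its premises are already in the
# discourse history, or split into subgoals for the unknown premises.
def plan(node, premises_of, discourse_history, mcas):
    unknown = [p for p in premises_of.get(node, [])
               if p not in discourse_history]
    if not unknown:
        mcas.append(('Derive', premises_of.get(node, []), node))
        discourse_history.add(node)      # conveyed facts become chunks
    else:
        for p in unknown:                # push subgoals, then retry node
            plan(p, premises_of, discourse_history, mcas)
        plan(node, premises_of, discourse_history, mcas)

premises = {'L3': ['L0'], 'L0': []}      # toy two-node PDS fragment
history, mcas = set(), []
plan('L3', premises, history, mcas)
print(mcas)  # [('Derive', [], 'L0'), ('Derive', ['L0'], 'L3')]
```

The recursion mirrors the text: subgoals for unknown premises are explained first, and each conveyed fact enters the discourse history so it is never re-explained.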
    <Section position="1" start_page="92" end_page="92" type="sub_section">
      <SectionTitle>
4.1 Mathematics Communicating Acts
</SectionTitle>
      <Paragraph position="0"> Mathematics communicating acts (MCAs) are the primitive actions planned by the dialog planner.</Paragraph>
      <Paragraph position="1"> They are derived from PROVERB's proof communicative acts \[Huang, 1994\]. MCAs are viewed as speech acts that are independent of the modality to be chosen. Each MCA can at least be realized as a portion of text. Moreover, some MCAs manifest themselves in the graphical arrangement of the text (see below for examples).</Paragraph>
      <Paragraph position="2"> In P.rex we distinguish between two types of MCAs: MCAs of the first type, called derivational MCAs, convey a step of the derivation. An example for a derivational MCA with a possible verbalization is: (Derive :Reasons (a ∈ U, U ⊆ V) :Conclusion a ∈ V :Method Def⊆) &amp;quot;Since a is an element of U and U is a subset of V, a is an element of V by the definition of subset.&amp;quot; A graphical realization is shown in Figure 2(a).</Paragraph>
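A textual realization of such a derivational MCA can be sketched as a simple template; the function name and template wording are our assumptions, standing in for the microplanner and realizer:

```python
# Hypothetical textual realizer for a Derive MCA. The MCA itself is
# modality-independent; only this function commits to a verbalization.
def verbalize_derive(reasons, conclusion, method):
    return "Since %s, %s by %s." % (" and ".join(reasons), conclusion, method)

s = verbalize_derive(["a is an element of U", "U is a subset of V"],
                     "a is an element of V", "the definition of subset")
print(s)
# Since a is an element of U and U is a subset of V, a is an element of V
# by the definition of subset.
```

A graphical realizer would consume the same MCA but render it as an arrangement of formulas instead of a sentence, which is what keeps the dialog plan modality-independent.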
      <Paragraph position="3"> MCAs of the second type, called structural MCAs, communicate information about the structure of a proof. For example, case analyses are introduced by:</Paragraph>
    </Section>
    <Section position="2" start_page="92" end_page="93" type="sub_section">
      <SectionTitle>
(Case-Analysis :Goal ψ :Cases (φ1, φ2))
</SectionTitle>
      <Paragraph position="0"> &amp;quot;To prove C/, let us consider the two cases by assuming T1 and %02.&amp;quot; Unless the two cases only enclose a few steps each, the graphical realization shown in Figure 2(b) should be preferred for the visual presentation.</Paragraph>
    </Section>
    <Section position="3" start_page="93" end_page="93" type="sub_section">
      <SectionTitle>
4.2 Plan Operators
</SectionTitle>
      <Paragraph position="0"> Operational knowledge concerning the presentation is encoded as productions in ACT-R that are independent of the modality to be chosen. In this paper, we concentrate on productions which allow for the explanation of a proof. We omit productions to react to the user's interactions.</Paragraph>
      <Paragraph position="1"> Each production either fulfills the current goal directly or splits it into subgoals. Let us assume that the following nodes are in the current PDS:</Paragraph>
    </Section>
    <Section position="4" start_page="93" end_page="93" type="sub_section">
      <SectionTitle>
Label Antecedent Succedent Justification
</SectionTitle>
      <Paragraph position="0"/>
      <Paragraph position="2"> An example for a production is:

(P1) IF   the current goal is to show Γ ⊢ ψ
          and R is the most abstract known rule justifying the current goal
          and Δ1 ⊢ φ1, ..., Δn ⊢ φn are known
     THEN produce MCA (Derive :Reasons (φ1,...,φn) :Conclusion ψ :Method R)
          and pop the current goal (thereby storing Γ ⊢ ψ in the declarative memory)

By producing the MCA the current goal is fulfilled and can be popped from the goal stack. An example for a production decomposing the current goal into several subgoals is:

(P2) IF   the current goal is to show Γ ⊢ ψ
          and R is the most abstract known rule justifying the current goal
          and Φ = {φi | Δi ⊢ φi is unknown for 1 ≤ i ≤ n} ≠ ∅
     THEN for each φi ∈ Φ push the goal to show Δi ⊢ φi

Note that the conditions of (P1) and (P2) only differ in the knowledge of the premises φi for rule R. (P2) introduces the subgoals to prove the unknown premises in Φ. As soon as those are derived, (P1) can apply and derive the conclusion.</Paragraph>
      <Paragraph position="3"> Now assume that the following nodes are in the current PDS:</Paragraph>
      <Paragraph position="5"> (P3) IF   the current goal is to show Γ ⊢ ψ
          and CASE is the most abstract known rule justifying the current goal
          and Γ ⊢ φ1 ∨ φ2 is known
          and Γ,H1 ⊢ ψ and Γ,H2 ⊢ ψ are unknown
     THEN push the goals to show Γ,H1 ⊢ ψ and Γ,H2 ⊢ ψ
          and produce MCA (Case-Analysis :Goal ψ :Cases (φ1,φ2))

This production introduces new subgoals and motivates them by producing the MCA. Since more specific rules treat common communicative standards used in mathematical presentations, they are assigned a higher strength than more general rules. Therefore, the strength of (P3) is higher than the strength of (P2), since (P3) has fewer variables. Moreover, it is supposed that each user knows all natural deduction (ND) rules. This is reasonable, since ND-rules are the least abstract possible logical rules in proofs. Hence, for each production p that is defined such that its goal is justified by an ND-rule in the PDS, the probability Pp that the application of p leads to the goal to explain that proof step equals one. Therefore, since CASE is such an ND-rule, P(P3) = 1.</Paragraph>
      <Paragraph position="6"> In order to elucidate how a proof is explained by P.rex, let us consider the following situation: The following nodes are in the current PDS:</Paragraph>
    </Section>
    <Section position="5" start_page="93" end_page="93" type="sub_section">
      <SectionTitle>
Label Antecedent Succedent Justification
</SectionTitle>
      <Paragraph position="0"/>
      <Paragraph position="2"> * the current goal is to show the fact in L3,
* the rules HYP, CASE, Def∪, and ∪-Lemma are known,
* the fact in L0 is known, the facts in H1, L1, H2, and L2 are unknown.

The only applicable production is (P1). Since ∪-Lemma is more abstract than CASE and both are known, it has a higher activation and thus is chosen to instantiate (P1). Hence, the dialog planner produces the MCA (Derive :Reasons (a ∈ U ∨ a ∈ V) :Conclusion a ∈ U ∪ V :Method ∪-Lemma) that could be verbalized as &amp;quot;Since a ∈ U or a ∈ V, a ∈ U ∪ V by the ∪-Lemma.&amp;quot; Suppose now that the user interrupts the explanation, throwing in that he did not understand this step. Then the system invokes productions that account for the following: The assumption that ∪-Lemma is known is revised by decreasing its base-level activation (cf. equation 1). Similarly, the just stored chunk for ⊢ a ∈ U ∪ V is erased from the declarative memory. Then the goal to show ⊢ a ∈ U ∪ V is again pushed onto the goal stack.</Paragraph>
      <Paragraph position="3"> Now, since CASE is the most abstract known rule justifying the current goal, both decomposing productions (P2) and (P3) are applicable. Recall that the conflict resolution mechanism chooses the production with the highest utility E (cf. equation 2). Since P(P3) = 1 and Pp ≤ 1 for all</Paragraph>
      <Paragraph position="5"> productions p, and since both productions match the same chunks but (P3) is the stronger production, C(P3) &lt; C(P2). Thus E(P3) = P(P3)G - C(P3) &gt; P(P2)G - C(P2) = E(P2).</Paragraph>
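The comparison can be checked with assumed numbers (G, P(P2), and the costs are illustrative; the text only fixes P(P3) = 1, a shared G, and C(P3) &lt; C(P2)):

```python
# Worked check of E = P*G - C for the two competing productions.
G = 20.0                        # assumed time available for the goal
P = {'P2': 0.9, 'P3': 1.0}      # P(P3) = 1 per the ND-rule argument
C = {'P2': 3.0, 'P3': 2.0}      # (P3) is stronger, so it is cheaper
E = {p: P[p] * G - C[p] for p in P}
print(E['P3'] > E['P2'])  # True: (P3) wins the conflict resolution
```

Both factors push the same way here; even if the costs were equal, P(P3) = 1 alone would already give (P3) the higher utility.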
      <Paragraph position="7"> Therefore, the dialog planner chooses (P3) for the explanation, thus producing the MCA (Case-Analysis :Goal a ∈ U ∪ V :Cases (a ∈ U, a ∈ V)) that could be realized as &amp;quot;To prove a ∈ U ∪ V let us consider the two cases by assuming a ∈ U and a ∈ V,&amp;quot; and then explains both cases. This dialog could take place as follows: P.rex: Since a ∈ U or a ∈ V, a ∈ U ∪ V by the ∪-Lemma.</Paragraph>
      <Paragraph position="8"> User: Why does this follow? P.rex: To prove a ∈ U ∪ V let us consider the two cases by assuming a ∈ U and a ∈ V. If a ∈ U, then a ∈ U ∪ V by the definition of ∪. Similarly, if a ∈ V, then a ∈ U ∪ V. This example shows how a production and an instantiation are chosen by P.rex. While the example elucidates the case that a more detailed explanation is desired, the system can similarly choose a more abstract explanation if needed. Hence, modeling the addressee's knowledge in ACT-R allows P.rex to explain the proof adapted to the user's knowledge by switching between the levels in the PDS as needed.</Paragraph>
    </Section>
  </Section>
</Paper>