<?xml version="1.0" standalone="yes"?>
<Paper uid="P96-1026">
  <Title>Two Sources of Control over the Generation of Software Instructions*</Title>
  <Section position="5" start_page="192" end_page="192" type="metho">
    <SectionTitle>
4 Coding features
</SectionTitle>
    <Paragraph position="0"> Our lexico-grammatical coding was done using the networks and features of the Nigel grammar (Halliday, 1985). We focused on four main concerns, guided by previous work on instructional texts, e.g., (Lehrberger, 1986; Plum et al., 1990; Ghadessy, 1993; Kosseim and Lapalme, 1994).</Paragraph>
    <Paragraph position="1"> * Relations between processes: to determine whether textual cohesion was achieved through conjunctives or through relations implicit in the task structure elements.</Paragraph>
    <Paragraph position="2"> Among the features considered were clause dependency and conjunction type.</Paragraph>
    <Paragraph position="3"> * Agency: to see whether the actor performing or enabling a particular action is clearly identified, and whether the reader is explicitly addressed. We coded here for features such as voice and agent types.</Paragraph>
    <Paragraph position="4"> * Mood, modality and polarity: to find out the extent to which actions are presented to the reader as being desirable, possible, mandatory, or prohibited. We coded for both true and implicit negatives, and for both personal and impersonal expressions of modality.</Paragraph>
    <Paragraph position="5"> * Process types: to see how the domain is construed in terms of actions on the part of the user and the software. We coded for sub-categories of material, mental, verbal and relational processes.</Paragraph>
  </Section>
  <Section position="6" start_page="192" end_page="193" type="metho">
    <SectionTitle>
5 The Corpus
</SectionTitle>
    <Paragraph position="0"> The analysis was conducted on the French version of the Macintosh MacWrite manual (Kaehler, 1983).</Paragraph>
    <Paragraph position="1"> The manual is derived from an English source by a process of adaptive translation (Sager, 1993), i.e., one which localises the text to the expectations of the target readership. The fact that the translation is adaptive rather than literal gives us confidence in using this manual for our analysis. 1 Furthermore, we know that Macintosh documentation undergoes thorough local quality control. It certainly conforms to the principles of good documentation established by current research on technical documentation and on the needs of end-users, e.g., (Carroll, 1994; Hammond, 1994), in that it supplies clear and concise information for the task at hand. Finally, we have been assured by French users of the software that they consider this particular manual to be well written and to bear no unnatural trace of its origins.</Paragraph>
    <Paragraph position="2"> Technical manuals within a specific domain constitute a sublanguage, e.g., (Kittredge, 1982; Sager et al., 1980). An important defining property of a sublanguage is that of closure, both lexical and syntactic. Lexical closure has been demonstrated by, for example, (Kittredge, 1987), who shows that after as few as the first 2000 words of a sublanguage text, the number of new word types increases little if at all. Other work, e.g., (Biber, 1988; Biber, 1989) and (Grishman and Kittredge, 1986) illustrates the property of syntactic closure, which means that generally available constructions just do not occur in this or that sublanguage. In the light of these results, we considered a corpus of 15000 words to be adequate for our purposes, at least for an initial analysis.</Paragraph>
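    <Paragraph>
The notion of lexical closure can be illustrated with a simple vocabulary-growth count. The Python sketch below is only an illustration under stated assumptions, not the procedure used by Kittredge or in this study: the function name new_types_per_window, the whitespace tokenisation, the lower-casing and the window size are all hypothetical choices. It counts how many previously unseen word types appear in each successive window of tokens; in a closed sublanguage these counts would be expected to fall towards zero.

```python
def new_types_per_window(tokens, window=2000):
    """Return, for each successive window of tokens, the number of new word types."""
    if not window > 0:
        raise ValueError("window must be positive")
    seen = set()      # word types encountered in earlier windows
    counts = []
    for start in range(0, len(tokens), window):
        chunk = {t.lower() for t in tokens[start:start + window]}
        new = chunk - seen          # types not seen in any earlier window
        counts.append(len(new))
        seen |= new
    return counts


# Toy example: a hypothetical token list and a deliberately tiny window.
sample = "select the text then cut the text then paste the text".split()
growth = new_types_per_window(sample, window=4)
```

On a real corpus the window would be far larger (for instance the 2000 words mentioned above), and a proper tokeniser for French would replace the whitespace split.
    </Paragraph>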
    <Paragraph position="3"> The MacWrite manual is organised into three chapters, corresponding to the three different sections identified earlier: a tutorial, a series of step-by-step instructions for the major word-processing tasks, and a ready-reference summary of the commands. We omitted the tutorial because the generation of such text is not our concern, retaining the other two chapters which provide the user with generic instructions for performing relevant tasks, and descriptions of the commands available within MacWrite. The overlap in information between the two chapters offers opportunities to observe differences in the linguistic expressions of the same task structure elements in different contexts.</Paragraph>
    <Paragraph position="4"> 1 We would have preferred to use a manual which originated in French to exclude all possibility of interference from a source language, but this proved impossible. Surprisingly, it appears that large French companies often have their documents authored in English by francophones and subsequently translated into French.</Paragraph>
    <Paragraph position="5"> One large French software house that we contacted does author its documentation in French, but had registered considerable customer dissatisfaction with its quality.</Paragraph>
    <Paragraph position="6"> We decided, therefore, that their material would be unsuitable for our purposes.</Paragraph>
  </Section>
  <Section position="7" start_page="193" end_page="194" type="metho">
    <SectionTitle>
6 Task Structure
</SectionTitle>
    <Paragraph position="0"> Task structure is constituted by five types of task elements, which we define below. We used the notion of task structure element both as a contextual feature for the analysis and to determine the segmentation of the text into units. Each unit is taken to be the expression of a single task element.</Paragraph>
    <Paragraph position="1"> Our definition of the task elements is based on the concepts and relations commonly chosen to represent a task structure (a goal and its associated plan), e.g., (Fikes and Nilsson, 1971; Sacerdoti, 1977), and on related research, e.g., (Kosseim and Lapalme, 1994). Our generator produces instructions from an underlying semantic knowledge base which uses this representation (Paris et al., 1995). To generate an instruction for performing a task is to choose some task elements to be expressed and linearise them so that they form a coherent set for a given goal the user might have. We distinguish the following elements, and provide examples of them in Figure 1: 2 goals: actions that users will adopt as goals and which motivate the use of a plan.</Paragraph>
    <Paragraph position="2"> functions: actions that represent the functionality of an interface object (such as a menu item). A function is closely related to a goal, in that it is also an action that the user may want to perform. However, the function is accessed through the interface object, and not through a plan.</Paragraph>
    <Paragraph position="3"> 2 The text in parentheses in the Figure is part of the linguistic context of the task element rather than the element itself.</Paragraph>
    <Paragraph position="4"> constraints and preconditions: states which must hold before a plan can be employed successfully. The domain model distinguishes constraints (states which cannot be achieved through planning) and preconditions (states which can be achieved through planning). We do not make this distinction in the linguistic analysis and regroup these related task structure elements under one label. We decided to proceed in this way to determine at first how constraints in general are expressed.</Paragraph>
    <Paragraph position="5"> Moreover, it is not always clear from the text which type of constraint is expressed. Drawing overly fine distinctions in the corpus analysis at this point, in the absence of a test for assigning a unit to one of these constraint types, would have rendered the results of the analysis more subjective and thus less reliable.</Paragraph>
    <Paragraph position="6"> results: states which arise as planned or unplanned effects of carrying out a plan. While it might be important to separate planned and unplanned effects in the underlying representation, we again abstract over them in the lexico-grammatical coding.</Paragraph>
    <Paragraph position="7">  sub-steps: actions which contribute to the execution of the plan. If the sub-steps are not primitive, they can themselves be achieved through other plans.</Paragraph>
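    <Paragraph>
The five task element types and the segmentation of text into units can be pictured as a small data structure. The Python sketch below is an assumption made purely for illustration; it is not the representation used in the authors' knowledge base or generator. The names TaskElement, Unit and segment_labels are hypothetical, and the sample units are invented English glosses rather than corpus material.

```python
from dataclasses import dataclass
from enum import Enum


class TaskElement(Enum):
    """The five task element types used as labels in the analysis."""
    GOAL = "goal"
    FUNCTION = "function"
    CONSTRAINT = "constraint"  # groups constraints and preconditions
    RESULT = "result"          # groups planned and unplanned effects
    SUBSTEP = "sub-step"


@dataclass(frozen=True)
class Unit:
    """One segmented text unit, taken to express a single task element."""
    text: str
    element: TaskElement


def segment_labels(units):
    """Return the sequence of task element labels for a list of units."""
    return [u.element.value for u in units]


# Hypothetical English glosses; the corpus itself is French.
units = [
    Unit("To delete a word", TaskElement.GOAL),
    Unit("select it", TaskElement.SUBSTEP),
    Unit("The word disappears", TaskElement.RESULT),
]
labels = segment_labels(units)
```

Keeping constraints and preconditions under one label, and planned and unplanned effects under another, mirrors the abstraction described in the definitions above.
    </Paragraph>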
  </Section>
  <Section position="8" start_page="194" end_page="196" type="metho">
    <SectionTitle>
7 The Coding Procedure
</SectionTitle>
    <Paragraph position="0"> No tools exist to automate a functional analysis of text, which makes coding a large body of text a time-consuming task. We first performed a detailed coding of units of text on approximately 25% of the corpus, or about 400 units, using the WAG coder (O'Donnell, 1995), a tool designed to facilitate a functional analysis.</Paragraph>
    <Paragraph position="1"> We then used a public-domain concordance program, MonoConc (Barlow, 1994), to verify the representativeness of the results. We enumerated the realisations of those features that the first analysis had shown as marked, and produced KWIC (key word in context) listings for each set of realisations. We found that the second analysis corroborated the results of the first, consistent with the nature of sublanguages.</Paragraph>
    <Paragraph position="2">  We examined the correlations between lexico-grammatical realisations and task elements and communicative purpose. The results are best expressed using tables generated by WAG: given any system, WAG splits the codings into a number of sets, one for each feature in that system. Percentages and means are computed, and the sets are compared statistically, using the standard T-test. WAG displays the results with an indicator of how statistically significant a value is compared to the combined means in the other sets. The counts were all done using the local mean, that is, the feature count is divided by the total number of codings which select that feature's system. Full definitions of the features can be found in (Halliday, 1985; Bateman et al., 1990).</Paragraph>
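    <Paragraph>
The local mean described above can be sketched in a few lines of Python. This is one illustrative reading of the description, not WAG's actual implementation: the function name local_means, the data layout (one dictionary per coded unit, mapping a system name to the feature selected in it) and the sample codings are all assumptions made for the example. The significance test is left out.

```python
from collections import Counter


def local_means(codings, system):
    """Proportion of each feature among the codings that select `system`.

    Codings that make no selection in the system are excluded from the
    denominator, which is what makes the mean 'local'.
    """
    selected = [c[system] for c in codings if system in c]
    if not selected:
        return {}
    counts = Counter(selected)
    total = len(selected)
    return {feature: n / total for feature, n in counts.items()}


# Hypothetical codings: one dict per unit, system name -> chosen feature.
codings = [
    {"mood": "imperative", "polarity": "positive"},
    {"mood": "declarative", "polarity": "positive"},
    {"mood": "imperative"},
    {"polarity": "negative"},  # a unit with no mood selection
]
mood_means = local_means(codings, "mood")
```

In this toy data only three of the four units select the mood system, so the proportions for mood are computed over three codings rather than four.
    </Paragraph>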
    <Paragraph position="3"> In some cases, the type of task element is on its own sufficient to determine, or at least strongly constrain, its linguistic realisation. The limited space available here allows us to provide only a small number of examples, shown in Figure 2. We see that the use of modals is excluded in the expression of function, result and constraint, whereas goal and substep do admit modals. As far as the polarity system is concerned, negation is effectively ruled out for function, goal and substep. Finally, with respect to the mood system, only substep can be realised through imperatives.</Paragraph>
    <Paragraph position="4">  In other cases, however, we observe a diversity of realisations. We highlight here three cases: modality in goal, polarity in constraint, and mood in substep.</Paragraph>
    <Paragraph position="5"> In such cases, we must appeal to another source of control over the apparently available choices. We have looked to the construct of genre (Martin, 1992) to provide this additional control, on two grounds: (1) since genres are distinguished by their communicative purposes, we can view each of the functional sections already identified as a distinct genre; (2) genre is presented as controlling text structure and realisation. In Martin's view, genre is defined as a staged, goal-oriented social process realised through register, the context of situation, which in turn is realised in language to achieve the goals of a text.</Paragraph>
    <Paragraph position="6"> Genre is responsible for the selection of a text structure in terms of task elements. As part of the realisation process, generic choices preselect a register associated with particular elements of text structure, which in turn preselect lexico-grammatical features.</Paragraph>
    <Paragraph position="7"> The coding of our text in terms of genre and task elements thus allows us to establish the role played by genre in the realisations of the task elements. It will also allow us to determine the text structures appropriate in each genre, a study we are currently undertaking. This is consistent with other accounts of text structure for text generation in technical domains, e.g., (McKeown, 1985; Paris, 1993; Kittredge et al., 1991).</Paragraph>
    <Paragraph position="8"> For those cases where the realisation remains under-determined by the task element type, we conducted a finer-grained analysis, by overlaying a genre partition on the undifferentiated data. We distinguished earlier two genres with which we are concerned: ready-reference and step-by-step. In the manual analysed, we recognised two more specific communicative purposes in the step-by-step section: to enable the reader to perform a task, and to increase the reader's knowledge about the task, the way to achieve it, or the properties of the system as a whole. Because of their distinct communicative purposes, we again feel justified in calling these genres. We label them respectively procedure and elaboration. The intention that the reader should recognise the differences in function of each section is underscored by the use of distinctive typographical devices, such as fonts and lay-out. The first step at this stage of the analysis was to establish whether there was an effective overlap in task elements among the three genres under consideration. The results of this step are shown in Figure 3. Sub-step and goal are found in all three genres, while constraint, result and function occur in both ready-reference and elaboration but are absent from procedure. The next step was to undertake a comparative analysis of the lexico-grammatical features found in the three genres. This analysis indicated that the language employed in these different sections of the text varies greatly. We summarise here the two genres that are strongly contrasted: procedure and ready-reference. Elaboration shares features with both of these.</Paragraph>
    <Paragraph position="9"> procedure: The top-level goal of the user is expressed as a nominalisation. Actions to be achieved by the reader are almost exclusively realised by imperatives, directly addressing the reader. These actions are mostly material directed actions, and there are no causatives. Few modals are employed, and, when they are, it is to express obligation impersonally. The polarity of processes is always positive. Procedure employs mostly independent clauses, and, when clause complexes are used, the conjunctions are mostly purpose (linking a user goal and an action) and alternative (linking two user actions or two goals).</Paragraph>
    <Paragraph position="10"> ready-reference: In this genre, all task elements are always realised through clauses. The declarative mood predominates, with few imperatives addressing the reader. Virtually all the causatives occur here. On the dimension of modality, the emphasis is on personal possibility, rather than obligation, and on inclination. We find in this genre most of the verbal processes, entirely absent from procedure.</Paragraph>
    <Paragraph position="11"> Ready-reference is more weighted than procedure towards dependent clauses, and is particularly marked by the presence of temporal conjunctions. The analysis so far demonstrates that genre, like task structure, provides some measure of control over the linguistic resources but that neither of these alone is sufficient to drive a generation system. The final step was therefore to look at the realisations of the task elements differentiated by genre, in cases where the realisation was not strongly determined by the task element.</Paragraph>
    <Paragraph position="12"> We refer the reader back to Figure 2, and the under-constrained cases of modality in goal, polarity in constraint, and mood in substep. Figure 4 shows the realisations of the task element goal with respect to the modal system, which brings into sharp relief the absence of modality from procedure. Figure 5 presents the realisations by genre of the polarity system for constraint. We observe that only positive polarity occurs in ready-reference. Finally, we note from Figure 6 that the realisation of sub-steps is heavily loaded in favour of imperatives in procedure.</Paragraph>
    <Paragraph position="13"> These figures show that genre does indeed provide useful additional control over the expression of task elements, which can be exploited by a text generation system. Neither task structure nor genre alone is sufficient to provide this control, but, taken together, they offer a real prospect of adequate control over the output of a text generator.</Paragraph>
  </Section>
</Paper>