<?xml version="1.0" standalone="yes"?> <Paper uid="P99-1058"> <Title>A semantically-derived subset of English for hardware verification</Title> <Section position="5" start_page="452" end_page="454" type="metho"> <SectionTitle> 4 Defining a Controlled Language </SectionTitle> <Paragraph position="0"> Even confining our attention to hardware specifications of the level of complexity examined so far, we can conclude that there are some kinds of English locutions which will map rather directly into CTL, whereas others have a much less direct relation. What is the nature of this indirect relation? Our claim in this paper is that we can give semantically-oriented characterisations of the relation between complexity in English sentences and their suitability for inclusion in a controlled language for hardware verification. Moreover, this semantic orientation yields a hierarchy of subsets of English. (This hierarchy is a theoretical entity constructed for our specific purposes, of course, not a general linguistic hypothesis about English.) Our first step in developing an English-to-CTL conversion system was to build a prototype based on the Alvey Natural Language Tools Grammar (Grover et al., 1993). The Alvey grammar is a broad-coverage grammar of English using GPSG-style rules, and maps into an event-based, unscoped semantic representation.</Paragraph> <Paragraph position="1"> For this application, we used a highly restricted lexicon and simplified the grammar in a number of ways (for example: fewer coordination rules; no deontic readings of modals). Tidhar (1998) reports an initial experiment in taking the semantic output generated from a small set S0 of English specifications, and converting it into CTL. Given that the Alvey grammar will produce plausible semantic readings for a much larger set S', the challenge is to characterise an intermediate set S, with S0 ⊆ S ⊆ S', that would admit a translation φ into formulas of CTL. 
Let's assume that we have a reverse translation φ⁻¹ from CTL to English; then we would like S = range(φ⁻¹).</Paragraph> <Section position="1" start_page="453" end_page="453" type="sub_section"> <SectionTitle> 4.1 Transliteration </SectionTitle> <Paragraph position="0"> Now suppose that φ⁻¹ is a literal translation from CTL to English. That is, we recurse on the formulas of CTL, choosing a canonical lexical item or phrase in English as a direct counterpart to each constituent of the CTL formula. In fact, we have implemented such a translation as a DCG, ctl2eng. To illustrate, ctl2eng maps the formula (2) into (5):
(5) globally if i is high then after 1 cycle if i is low then o is low and after 1 cycle o is high and after 1 cycle o is low
Let φₜ⁻¹ be the function defined by ctl2eng; then we call L₁ = range(φₜ⁻¹) the canonical transliteration level of English. We can be confident that it is possible to build a translation φₜ which will map any sentence in L₁ into a formula of CTL. L₁ can be trivially augmented by adding near-synonymous lexical and syntactic variants. For example, i is high can be replaced by signal i holds, and after 1 cycle ... by 1 cycle later .... This adds no semantic complexity. We call this language (notated L₁⁺) the augmented transliteration level.</Paragraph> <Paragraph position="1"> One potential problem with defining φₜ in this way is that the sentences generated by ctl2eng soon become structurally ambiguous. We can solve this either by generating unambiguous paraphrases, or by analysing the relevant class of ambiguities and making sure that φₜ is able to provide all relevant CTL interpretations.</Paragraph> <Paragraph position="2"> These languages contain only sentences. Hardware specifications often have the form of multi-sentence discourses, however. 
Such discourses, and the additional phenomena they introduce, occur at higher levels of our language hierarchy, and we presently lack any detailed analysis of them in the terms of this paper.</Paragraph> </Section> <Section position="2" start_page="453" end_page="453" type="sub_section"> <SectionTitle> 4.2 Compositional indirect semantics </SectionTitle> <Paragraph position="0"> richer than the intended CTL translation.</Paragraph> <Paragraph position="1"> The best way to explain these notions is by way of some examples. First, consider expressions like the nouns pulse, edge and the verbs rise, fall. These refer to certain kinds of event. For example, an edge denotes the event where a signal changes between two distinct states; from high at time t to low at time t + 1 or conversely. In CTL, the notion of an edge on signal i corresponds approximately to the following expression:⁵
(6) (i ∧ AX¬i) ∨ (¬i ∧ AXi)
Similarly, a pulse can be analysed in terms of a rising edge followed by a falling edge.</Paragraph> <Paragraph position="2"> What do we mean by saying that there is a compositional mapping of locutions at this level to CTL? Our claim is that they can be algorithmically converted into pure CTL without reference to unbounded context. What do we mean by saying that these English expressions involve a richer ontology than CTL? If compositional mapping holds, then clearly we are not forced to augment the standard models for CTL in order to interpret them (although this route might be desirable for other reasons). Rather, we are saying that the 'natural' ontology for these expressions is richer than that allowed for CTL, even if reduction is possible. 
</Paragraph> </Section> <Section position="3" start_page="453" end_page="454" type="sub_section"> <SectionTitle> 4.3 Non-compositional indirect semantics </SectionTitle> <Paragraph position="0"> We consider the conversion to involve non-compositional indirect semantics when there is some aspect of non-locality in the domain of the translation function. That is, some form of inference is required--probably involving domain-specific axioms or general temporal axioms--in order to obtain a CTL formula from the English expression.</Paragraph> <Paragraph position="1"> Here are two examples. The first comes from sentence (3a), where the use of eventually might normally be taken to correspond directly to the CTL operator AF. However, because of the domain of (3a)--a handshaking protocol, evidenced by the use of the verbs acknowledge and request--it is in fact more accurate to require an extra AX in the CTL. This ensures that the three transitions cannot occur at the same time.</Paragraph> <Paragraph position="2"> ⁵ Approximately, in the sense that one cannot simply substitute this expression arbitrarily into a larger formula, as it depends on the syntactic context--for example, whether it occurs in the antecedent or consequent of an implication.
Table 2:
level | expressiveness | examples
L₁ | pure CTL | i is high; after 1 cycle
L₁⁺ | pure CTL | i holds; 1 cycle later
L₂ | extended CTL | i rises; there is a pulse of unit duration
L₃ | full SR? | r is eventually acknowledged
</Paragraph> <Paragraph position="3"> We see here an example of domain-specific interpretation conventions that our system needs to be aware of. 
Clearly, it must incorporate them in such a way that users are still able to reliably predict how the system will react to their English specifications.</Paragraph> <Paragraph position="4"> The second example is
(7) From one cycle after i changes until it changes again x and y are different.</Paragraph> <Paragraph position="5"> In this case there is an interaction between a non-local linguistic phenomenon and something specific to the CTL conversion, namely how to make the right connection between the first and the second changes.</Paragraph> </Section> <Section position="4" start_page="454" end_page="454" type="sub_section"> <SectionTitle> 4.4 Language hierarchy </SectionTitle> <Paragraph position="0"> Table 2 summarises the main proposals of this section. The left-hand column lists the hierarchy of postulated sublanguages, in increasing order of semantic expressiveness. The middle column tries to calibrate this expressiveness. By 'extended CTL', we mean a superset of CTL which is syntactically augmented to allow formulas such as rise(p), fall(p), discussed earlier, and pulse(p, v, n), where p is an atom, v is a Boolean indicating a high or low value, and n is a natural number indicating duration. The semantic clauses would have to be correspondingly augmented--as carried out for example by Nelken and Francez (1996), for rise(p) and fall(p). By 'full SR', we are hypothesising that it would be necessary to invoke a general semantic representation language for English.</Paragraph> <Paragraph position="1"> We have constructed a context-free grammar for L₂, in order to obtain a concrete approximation to a controlled subset of English for expressing specifications. There are two cautionary observations. First, as just indicated, L₂ maps directly not into CTL, but into extended CTL. Second, our grammar for L₂ ignores some subtleties of English syntax and morphology. 
Examples include subject-verb agreement, modal auxiliary subcategorisation, varieties of verb phrase modification by adverbs, and forms of anaphora.</Paragraph> <Paragraph position="2"> These defects in our CFG for L₂ are not fundamental problems, however. The device of using the ctl2eng mapping to define a sublanguage is a specific methodology for finding a semantically motivated sublanguage. As such it is only an approximation to the language that we wish our system to deal with. This CFG is not the grammar used by our parser (which can, in fact, deal with many of the details of English syntax just mentioned). We may, therefore, introduce a language L₂⁺ which corrects the grammatical errors of L₂ and extends it with some degree of anaphora and ellipsis.</Paragraph> <Paragraph position="3"> We note that it would be useful to have a firmer theoretical grasp on the relations between our sublanguages; we have ongoing work in this area.</Paragraph> </Section> </Section> </Paper>