<?xml version="1.0" standalone="yes"?>
<Paper uid="C92-2072">
  <Title>Towards Robust PATR</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> We report on the initial stages of development of a robust parsing system, to be used as part of The Editor's Assistant, a program that detects and corrects textual errors and infelicities in the area of syntax and style. Our mechanism extends the standard PATR-II formalism by indexing the constraints on rules and abstracting away control of the application of these constraints. This allows independent specification of grouping and ordering of the constraints, which can improve the efficiency of processing, and in conjunction with information specifying whether constraints are necessary or optional, allows detection of syntactic errors.</Paragraph>
    <Paragraph position="1"> Introduction The Editor's Assistant \[Dale 1989, 1990\] is a rule-based system which assists a copy editor in massaging a text to conform to a house style. The central idea is that publishers' style rules can be maintained as rules in a knowledge base, and a special inference engine that encodes strategies for examining text can be used to apply these rules. The program then operates by interactively detecting and, where possible, offering corrections for those aspects of a text which do not conform to the rules in the knowledge base.</Paragraph>
    <Paragraph position="2"> The expert-system-like architecture makes it easy to modify the system's behaviour by adding new rules or switching rule bases for specific purposes.</Paragraph>
    <Paragraph position="3"> Our previous work in this area has been oriented towards the checking of low-level details in text: for example, the format and punctuation of dates, numbers and numerical values; the punctuation and use of abbreviations; and the typefaces and abbreviations to be used for words from foreign languages. In this paper, we describe some recent work we have carried out in extending this mechanism to deal with syntactic errors; this has led us to a general mechanism for robust parsing which is applicable outside the context of our own work.</Paragraph>
    <Paragraph position="4"> °E-mail address is S.Douglas@ed.ac.uk.</Paragraph>
    <Paragraph position="5"> †Also of the Department of Artificial Intelligence at the University of Edinburgh; e-mail address is R.Dale@ed.ac.uk.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Syntactic Errors
Categories of Errors
</SectionTitle>
      <Paragraph position="0"> Ultimately, the aim of The Editor's Assistant is to deal with real language: unrestricted natural language text in all its richness, with all its idiosyncrasies. The system is therefore an experiment in what we call intelligent text processing: an intersection of techniques from natural language processing and from more mundane text processing applications, with the intelligence being derived from the addition of language sensitivity to the basic text processing mechanisms.</Paragraph>
      <Paragraph position="1"> Many of the corrections made routinely in the course of human proofreading require subtleties of semantic and pragmatic expertise that are simply beyond current resources to emulate. However, examination of common syntactic errors and infelicities, both as described in the literature (see, for example, \[Miller 1986\]) and as appearing in data we have analysed, has led us to distinguish a number of tractable error types, and we have based the development of our system on the various requirements imposed by these classes. The error types are defined very much with processing requirements in mind; orthogonal categorisations are of course possible. We give summary descriptions of these classes here; examples are provided in Figure 1.</Paragraph>
      <Paragraph position="2"> Constraint Violation Errors: These involve what, in most contemporary syntactic theories, are best viewed as the violation of constraints on feature values. All errors in agreement fall into this category.</Paragraph>
      <Paragraph position="3"> Lexical Confusion: These involve the confusion of one lexical item with another. We specifically include in this category cases where a word containing an apostrophe is confused with a similar word that does not, or vice versa.</Paragraph>
      <Paragraph position="4"> Syntactic Awkwardness: We include here cases where the problem is either stylistic or likely to cause processing problems for the reader. Note that these 'errors' are not syntactically incorrect, but are constructions which, if overused, may result in poor writing, and as such are often included in style-checker 'hit-lists'; thus, we would include multiple embedding constructions, poten-
\[ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992; PROC. OF COLING-92, NANTES, AUG. 23-28, 1992\]</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Constraint Violation Errors:
</SectionTitle>
      <Paragraph position="0"> (1) Subject-verb number disagreement: a. *John and Mary runs.</Paragraph>
      <Paragraph position="1"> b. *The dogs runs.</Paragraph>
      <Paragraph position="2"> (2) Premodifier-noun number disagreement: a. *This dogs runs.</Paragraph>
      <Paragraph position="3"> b. *All the dog run.</Paragraph>
      <Paragraph position="4"> (3) Subject-complement number disagreement: a. *There is live dogs here.</Paragraph>
      <Paragraph position="5"> b. *There are a dog.</Paragraph>
      <Paragraph position="6"> (4) Wrong pronoun case: a. *He and me ran to the dog. b. *This stays between you and I. (5) Wrong indefinite article: a. *A apple and an rotten old pear. b. A NeXT workstation and *a NEC laptop.1 Lexical Confusion: (6) Confusion of its and it's: a. *Its late.</Paragraph>
      <Paragraph position="7"> b. *The dog ate it's bone.</Paragraph>
      <Paragraph position="8"> (7) Confusion of there, their, and they're: a. *Their is a dog here.</Paragraph>
      <Paragraph position="9"> b. *They're is a dog here.</Paragraph>
      <Paragraph position="10"> c. *There dog was cold.</Paragraph>
      <Paragraph position="11"> d. *They're dog was cold.</Paragraph>
      <Paragraph position="12"> e. *There here now.</Paragraph>
      <Paragraph position="13"> f. *Their here now.</Paragraph>
      <Paragraph position="14"> (8) Confusion of possessive 's and plural s: a. *The dog's are cold.</Paragraph>
      <Paragraph position="15"> b. *The boy ate the dogs biscuit. Syntactic Awkwardness: (9) Too many prepositional phrases: a. The boy gave the dog in the window at the end with the red collar with the address on the back of it a biscuit.</Paragraph>
      <Paragraph position="16"> (10) Passive constructions: a. The boy was seen by the dog. Missing or extra elements: (11) Unpaired delimiters: a. *The dog, which was in the garden was quiet. (12) Missing delimiters: a. *The dog, I think was in the garden. b. *In the garden dogs are a menace. (13) Missing list separators: a. *There were two dogs three cats and a canary. (14) Double syntactic function: a. *It seems to be is a dog.</Paragraph>
      <Paragraph position="17"> b. *I think know I've been there before.  tially ambiguous syntactic structures, and garden path sentences in this category. These problems are detectable by simple counting or recognition of syntactic forms.</Paragraph>
      <Paragraph position="18"> Missing or Extra Elements: These are cases where elements (either words or punctuation symbols) are omitted or mistakenly included in a text. An interesting sub-category here, which is surprisingly frequent, is the presence of two constituents which serve the same or a similar purpose; by analogy with double-word errors (where a word appears twice in succession when only one occurrence was intended), we refer to these as cases of double syntactic function. The errors dealt with in this paper all fall into the first class, i.e. those that can be seen as breaking constraints on feature values. At the end of the paper we make some observations on how the mechanism can be extended to the other classes.</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Previous Work
</SectionTitle>
      <Paragraph position="0"> Of course, there exists a significant body of work dealing with computational approaches to syntactic errors like those just discussed. Broadly, work dealing with ungrammatical input falls into two categories: approaches where the principal objective is to determine what meaning the speaker intended, and approaches where the principal objective is to construct an appropriate correction. The first kind of approach is most appropriate in the development of natural language interfaces, where syntactic dysfluencies can often be ignored if the user's intentions can be determined by means of other evidence. However, these approaches (in the simplest cases, based on detecting content words) are inappropriate where the system must also propose a correction for the hypothesised error.</Paragraph>
      <Paragraph position="1"> Of the different techniques that have been proposed under the second category, the most useful is that usually referred to as relaxation. This is a rather elegant method for extending a grammar's coverage to include ill-formed input, while retaining a principled connection between the constructions accepted by the more restrictive grammar and those accepted by the extended one. If a grammar expresses information in terms of constraints or conditions on features, a slightly less restrictive grammar can be constructed by relaxing some subset of these constraints. Work commonly referred to in this context includes Kwasny and Sondheimer \[1981\] and Weischedel and Black \[1980\], but very many systems use some kind of relaxation process, whether of syntactic or semantic constraints. The most well known is IBM's work on the Epistle and Critique systems \[Heidorn et al. 1982; Jensen et al. 1983; Richardson and Braden-Harder 1988\].</Paragraph>
      <Paragraph position="2"> 1 In British English, NEC is spelled out, rather than being pronounced like the word neck; thus, the correct form here is an NEC. Epistle parses text in a left-to-right, bottom-up fashion, using grammar rules written using an augmented phrase structure grammar (APSG). In APSG, each grammar rule looks like a conventional context-free phrase structure rule, but may have arbitrary tests and actions specified on both sides of the rule. So, for example, we might have a rule like the following:</Paragraph>
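The rule figure itself has not survived in this transcription. From the description in the next paragraph, an APSG-style rendering of it would take roughly the following shape (a reconstruction from the surrounding prose, not the paper's actual figure):

```
NP VP (number of NP agrees with number of VP)
    --> VP (SUBJECT = NP)
```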
      <Paragraph position="4"> This rule states that a noun phrase followed by a verb phrase together form a VP,2 provided the number of the NP and the original VP agree. The resulting VP structure then has the original NP as the value of its SUBJECT attribute.</Paragraph>
      <Paragraph position="5"> Using rules like these, the system attempts to parse a sentence as if it were completely grammatical. Then, if no parse is found, the system relaxes some conditions on the rules and tries again; if a parse is now obtained, the system can hypothesise the nature of the problem on the basis of the particular condition that was relaxed. Thus, if the above rule was used in analysing the sentence Either of the models are acceptable, no parse would be obtained, since the number of the NP Either of the models is singular whereas the number of the VP are acceptable is plural. However, if the number agreement constraint is relaxed, a parse will be obtained; the system can then suggest that the source of the ungrammaticality is the lack of number agreement between subject and verb.</Paragraph>
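Epistle's relax-and-retry control loop can be sketched as follows (a minimal Python sketch; the feature representation, constraint names, and function names are invented for illustration and are not Epistle's actual data structures):

```python
# Sketch of Epistle-style relax-and-retry diagnosis: try a strict parse,
# and on failure relax one constraint at a time to locate the error.

def parse(features, constraints, relaxed=frozenset()):
    """Return True if every constraint not in `relaxed` is satisfied."""
    return all(check(features)
               for name, check in constraints.items()
               if name not in relaxed)

def diagnose(features, constraints):
    """Report which single relaxation makes the parse go through."""
    if parse(features, constraints):
        return None  # grammatical: nothing to report
    for name in constraints:
        if parse(features, constraints, relaxed={name}):
            return name  # hypothesised source of the error
    return "no parse"

# "Either of the models are acceptable": singular subject, plural verb.
constraints = {
    "number_agreement": lambda f: f["np_num"] == f["vp_num"],
}
print(diagnose({"np_num": "sing", "vp_num": "plur"}, constraints))
# prints: number_agreement
```

Relaxing one constraint at a time is of course the simplest regime; the paper's target errors such as (16c) below require relaxing several constraints at once.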
      <Paragraph position="6"> One thing that must be borne in mind when considering the merits and demerits of relaxation methods is that they depend crucially on how much of the particular grammar's information is expressed as constraints on feature values. Where the basic form of a grammar is, say, complex phrase structure rules, the use of features may be confined to checking of number and person agreement. If, on the other hand, more of the informative content of the grammar is represented as constraints, as in recently popular unification-based grammars \[Shieber 1986\], relaxation can be used to transform grammars to less closely related ones.</Paragraph>
      <Paragraph position="7"> In the remainder of this paper, we show how a unification-based formalism, PATR-II, may be extended by a declarative specification of relaxations so that it can be used flexibly for detecting syntactic errors. Under one view, what we are doing here is rationally reconstructing the Epistle system within a unification-based framework. A useful consequence of this exercise is that the adoption of a declarative approach to the specification of relaxations makes it much easier to explore different processing regimes for handling syntactic errors.</Paragraph>
      <Paragraph position="8"> 2 This second, higher-level VP plays the role of what we would normally think of as an S node.</Paragraph>
      <Paragraph position="10"/>
    </Section>
    <Section position="4" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Making PATR Robust
The Basic Mechanism
</SectionTitle>
      <Paragraph position="0"> In this section, we describe an experimental system, written in Prolog, that is designed to support the mechanisms necessary to apply PATR-type rules to solve constraints selectively. The major components of the system are (a) the parsing mechanism; (b) the underlying PATR system; and (c) the rule application mechanism that mediates between these two.</Paragraph>
      <Paragraph position="1"> The parser encodes the chosen strategy for applying particular grammar rules in a particular order. At this stage, the parser is not a crucial component of the system; all we require is that it apply rules in a bottom-up fashion. Accordingly, we use a simple shift-reduce mechanism. The parser will be the focus for many of the proposed extensions discussed later; in particular, we are in the process of implementing a chart-based mechanism to allow handling of errors resulting from missing or extra elements.</Paragraph>
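The shift-reduce regime just described can be sketched as follows (a toy Python rendering over bare category symbols; the grammar here is made up for illustration, and the paper's own implementation is in Prolog over feature structures):

```python
# Minimal shift-reduce loop: shift categories onto a stack and greedily
# reduce whenever the top of the stack matches a rule's right-hand side.

RULES = {("det", "n"): "np",    # determiner + noun -> NP
         ("v",): "vp",          # intransitive verb -> VP
         ("np", "vp"): "s"}     # NP + VP -> S

def shift_reduce(categories):
    stack = []
    for cat in categories:
        stack.append(cat)                  # shift
        changed = True
        while changed:                     # reduce as long as any rule fires
            changed = False
            for rhs, lhs in RULES.items():
                if len(stack) >= len(rhs) and tuple(stack[-len(rhs):]) == rhs:
                    del stack[-len(rhs):]  # pop the matched RHS
                    stack.append(lhs)      # push the LHS category
                    changed = True
    return stack

print(shift_reduce(["det", "n", "v"]))  # prints: ['s']
```

A complete parse leaves a single start symbol on the stack; anything else signals failure, which is where the relaxation machinery below takes over.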
      <Paragraph position="2"> The basic PATR system provides a unification based mechanism for solving sets of constraints on feature structures. A PATR rule corresponding to the grammar rule discussed in the context of Epistle above is shown in Figure 2.</Paragraph>
      <Paragraph position="3"> It is fairly obvious that, given some mechanism that allows us to remove the final constraint in this rule, we can emulate the behaviour of the Epistle system.</Paragraph>
      <Paragraph position="4"> In our model, the rule application mechanism provides the interface between the parsing mechanism, which accesses the lexicon and decides the order in which to try rules, and the PATR system. To see how this works, we will consider a slightly more complex rule, shown in Figure 3; the use of the numbers on the constraints will be explained below.</Paragraph>
      <Paragraph position="5"> Given this rule, a constituent of category NP will be found given two lexical items which are respectively a determiner and a noun, provided all the constraints numbered 1 through 6 are found to hold. Note the constraint numbered 4: we suppose that the features addressed by (X1 agr precedes) and (X2 agr begins) may have the values vowel and consonant. This allows us to specify the appropriate restrictions on the use of the two forms a and an.3  3 Of course, the implication here that a is used before words beginning with a consonant and an is used before words beginning with a vowel is an oversimplification. There are also, of course, other means by which this constraint could be checked; however, we include it here as a constraint on the application of the rule for expository purposes.</Paragraph>
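Figure 3 is likewise missing from this transcription. From the description above and the constraint numbers used in the following sections (4 for the article form, 5 for number agreement, 6 for feature transport), the indexed NP rule would look roughly like this (a reconstruction in the paper's parenthesized feature-path notation, not the original figure):

```
X0 -> X1 X2
  1: (X1 cat) = det
  2: (X2 cat) = n
  3: (X0 cat) = np
  4: (X1 agr precedes) = (X2 agr begins)
  5: (X1 agr num) = (X2 agr num)
  6: (X0 agr num) = (X2 agr num)
```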
    </Section>
    <Section position="5" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Relaxing Constraints
</SectionTitle>
      <Paragraph position="0"> Given the rule in Figure 3, and a standard parsing mechanism, there will be no problem in parsing correct NPs like these dogs. However, consider our target errors in (16a-c):  (16) a. *this dogs b. *an dog c. *an dogs  Example (16a) exhibits premodifier-noun number disagreement; (16b) exhibits use of the wrong indefinite article; and (16c) contains both of these errors. If the parser is to make any sense of these strings, we must introduce a more elaborate control structure. Premodifier-noun number agreement is enforced by constraint 5; constraint 4 enforces the use of the proper indefinite article. We need to be able to relax constraint 5 to parse (16a), and to relax constraint 4 to parse (16b); to parse (16c), we want to relax both constraints 5 and 4 at once.</Paragraph>
      <Paragraph position="1"> &amp;quot;\[b deal with this, we make use of the notion of a relaxation level Instead of applying all con strafers associated with a rule, we specify for evcry rule, at any given relaxation level, those constraints that are necessary and those that are optional. At relaxation level 0, which is equivalent to thc bohaviour of the standard PATR system, all constraints are deemed nece.~ary. At relaxation level 1, however, constraints 4 and 5 are optional. Optional constraints, if violated, need not result in a failed parse, but do correspond to particular errors.</Paragraph>
      <Paragraph position="2"> The algorithm in Figure 4 applies all constraints appropriately, given a specification as just described. Here, N is the set of necessary constraints and O is the set of optional constraints, both for a given relaxation level L; R is the set of constraints which have to be relaxed in order for the rule to be used. R will always be a subset of O, of course; we return the actual value of R as a result of parsing with the rule. The outer conditional ensures that all the necessary constraints are satisfied. The inner conditional takes appropriate action for each relaxable constraint whether or not it is satisfied: if the</Paragraph>
      <Paragraph position="3"> When applying rule r at relaxation level L: N ← necessary constraints on r at L; O ← optional constraints on r at L; R ← {}</Paragraph>
      <Paragraph position="5"> if all n ∈ N can be solved, then incorporate any instantiations required, and for each oi ∈ O do: if oi can be solved, then incorporate any instantiations required, else R ← R ∪ {oi}; return R.  constraint is satisfied, it has exactly the same effect as a necessary constraint; if not, the constraint is recorded as having been relaxed.</Paragraph>
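In Python, the algorithm of Figure 4 might be rendered as follows (a sketch only: constraints are modelled as nullary predicates, and the instantiation side effects of real unification are elided; all names are illustrative):

```python
# Sketch of the Figure 4 rule-application algorithm: necessary
# constraints must all hold; optional constraints that fail are
# recorded in R rather than blocking the parse.

def apply_rule(necessary, optional):
    """`necessary` and `optional` are lists of (index, predicate) pairs.
    Return the set R of relaxed constraint indices, or None if a
    necessary constraint fails (the rule cannot be used at all)."""
    if not all(check() for _, check in necessary):
        return None            # outer conditional: no parse with this rule
    relaxed = set()
    for index, check in optional:
        if not check():        # inner conditional: violated optional
            relaxed.add(index) # constraint is relaxed and recorded
    return relaxed

# *this dogs: det/noun number agreement (constraint 5) fails,
# article-form constraint 4 holds.
necessary = [(1, lambda: True), (2, lambda: True), (3, lambda: True)]
optional = [(4, lambda: True), (5, lambda: False)]
print(apply_rule(necessary, optional))  # prints: {5}
```

The returned R is exactly what the next paragraph uses to generate an error message.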
      <Paragraph position="6"> Once parsing is complete, the information in R can then be used to generate an appropriate error message. The operation of this algorithm is supported by explicitly indexing each constraint within a rule, as in Figure 3, and abstracting out the specification of which constraints may be relaxed at a given relaxation level. The constraint application specification for the NP rule is given in Figure 5.</Paragraph>
    </Section>
    <Section position="6" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Grouping Constraints
</SectionTitle>
      <Paragraph position="0"> This is not the whole story, however. Consider the NP this dogs, which would be correctly parsed at relaxation level 1 as exhibiting premodifier-noun number disagreement under the system described so far.</Paragraph>
      <Paragraph position="1"> The instantiation of X0 resulting from this rule application would be \[cat: np, agr: \[num: plu\]\]. Note in particular that (X0 agr num) has the value plu. This results from the solution of constraint 6, which is one of the necessary constraints at relaxation level 1 as specified in Figure 5. This 'feature transport' constraint propagates the number of the head noun to the superordinate noun phrase. It is not appropriate to perform such a propagation under the current circumstances, however, because once a case of premodifier-noun number disagreement has been identified, we cannot tell whether it is the number of the noun or the number of the determiner that is in error. One might argue that one of the two is more likely than the other, but such a heuristic belongs in the mechanism that offers replacements rather than in the relaxation mechanism itself. If the number of the noun is always propagated to the noun phrase, spurious error reports may emerge in subsequent parsing: for example, in the text This dogs runs, a subject-verb number disagreement will be flagged in addition to the premodifier-noun number disagreement error. This will be at best misleading. We would like to be able to express the intuition that it is not really meaningful to apply constraint 6 if constraint 5 has failed; these constraints should be grouped together, to be applied together or not at all. So we introduce an addition to the specification for relaxation level 1, shown in Figure 6.</Paragraph>
      <Paragraph position="2"> We refer to a group of constraints to be relaxed together or not at all, plus the error message that corresponds to the failure of the group of constraints, as a relaxation package. The algorithm of Figure 4 has been adapted to apply such relaxation packages, resulting in the algorithm in Figure 7. Here, R is the set of relaxation packages required in order to complete the parse.</Paragraph>
      <Paragraph position="3"> Note that if all the constraints in a relaxation package can be applied successfully, they have exactly the same effect as necessary ones, in terms of contributing to the building of structure. Thus, if the number agreement constraint 5 is satisfied, as in the case of the text an dog, then the associated feature percolation constraint, 6, will add the feature (agr num) to X0, with value (X2 agr num).</Paragraph>
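The relaxation-package version of the algorithm (Figure 7) can be sketched in the same style as before (again a Python illustration with invented names; unification side effects are elided, which is what lets the all-or-nothing package semantics reduce to a simple test here):

```python
# Sketch of the Figure 7 algorithm: optional constraints are grouped
# into relaxation packages, each labelled with its error message; a
# package is applied in full or relaxed in full.

def apply_rule_with_packages(necessary, packages):
    """`necessary` is a list of predicates; `packages` maps an error
    label to the list of grouped constraint predicates. Return the set
    R of relaxed package labels, or None if a necessary constraint
    fails."""
    if not all(check() for check in necessary):
        return None
    relaxed = set()
    for label, constraints in packages.items():
        if not all(check() for check in constraints):
            relaxed.add(label)  # relax the whole package together
    return relaxed

# *this dogs: agreement constraint 5 fails, so the feature-transport
# constraint 6 grouped with it is withheld as well.
packages = {
    "premodifier-noun number disagreement":
        [lambda: False,   # constraint 5: det/noun number agreement
         lambda: True],   # constraint 6: propagate number to the NP
}
print(apply_rule_with_packages([lambda: True], packages))
# prints: {'premodifier-noun number disagreement'}
```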
    </Section>
    <Section position="7" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Ordering Constraints
</SectionTitle>
      <Paragraph position="0"> In the previous section, we altered the mechanism to allow for the fact that it is not meaningful to apply some constraints if others have failed; in the worst case, this avoided confusing error diagnoses.</Paragraph>
      <Paragraph position="1"> Even if no such confusion would result, however, considerable efficiency gains can be made by ordering constraints in such a way as to minimise unnecessary structure building. A similar point is made by Uszkoreit \[1991\], who talks of the need for a flexible control strategy for efficient unification-based parsers, to ensure that the conditions that are most likely to fail are tried first. (Figure 7: When applying rule r at relaxation level L: N ← necessary constraints on r at L; O ← relaxation packages for r at L; R ← {}; if all n ∈ N can be solved, then incorporate any instantiations; for each relaxation package Pi ∈ O do: if all constraints ci ∈ Pi can be solved, then incorporate any instantiations, else R ← R ∪ {Pi}; return R.)</Paragraph>
      <Paragraph position="2"> Ideally, the ordering of constraints would be derived automatically from other information; but it is unclear how this would be done. Currently, we make use of one central ordering principle: (18) Category constraints on RHS items come first. In the bottom-up parsing system we use, all RHS items will be instantiated with feature structures corresponding to lexical entries, or to syntactic categories built up by rule from lexical entries; it is a discipline on our lexicon and our structure building rules that all such feature structures will have a cat feature. This means that a query about the cat value will involve no structure building. However, if, before checking the category, we were to enquire about the (agr num) feature, we might involve ourselves in some unnecessary structure building, because if applied to a feature structure that does not have an (agr num) feature, what was thought of as a conditional constraint will in fact result in structure building. For example, the constraint in (19) applied against the structure in (20) will result in the structure shown in (21); this is clearly not desirable.</Paragraph>
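Ordering principle (18) can be sketched as a simple sort key over a rule's indexed constraints (an illustrative Python rendering; the paper gives its full ordering in Figure 8, which has not survived in this transcription):

```python
# Sort a rule's constraints so that category checks on RHS items come
# first: since every lexical or derived feature structure is guaranteed
# a `cat` feature, those checks can never trigger structure building.

def order_constraints(constraints):
    """`constraints` is a list of (index, feature_path) pairs, where a
    feature_path is a tuple such as ("X1", "cat") or ("X1", "agr", "num").
    X0 is the rule's LHS, so only X1, X2, ... count as RHS items."""
    def key(constraint):
        _, path = constraint
        is_rhs_cat = path[0] != "X0" and path[1:] == ("cat",)
        return 0 if is_rhs_cat else 1  # stable sort keeps relative order
    return sorted(constraints, key=key)

rule = [(4, ("X1", "agr", "precedes")),
        (1, ("X1", "cat")),
        (5, ("X1", "agr", "num")),
        (2, ("X2", "cat"))]
print([i for i, _ in order_constraints(rule)])  # prints: [1, 2, 4, 5]
```

Because Python's sort is stable, constraints within each band keep their declared order, so the grammar writer's finer-grained ordering survives the reordering.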
      <Paragraph position="4"> (20) \[cat: conjunction, lex: and\]  These considerations give rise to the ordering of constraints given in Figure 8; we assume that when the algorithm in Figure 7 tests whether all members of a constraint set can be solved, the constraints are solved in the order given in the specification, and the test halts as soon as any member of the constraint set cannot be solved.  We have argued that combining the relaxation technique for syntactic error correction with a grammar (such as is found in recent unification formalisms) that expresses most of its information in the form of constraints provides a good starting point for a flexible mechanism for detecting and correcting syntactic errors. Our work in this area so far raises a number of interesting questions which need to be pursued further.</Paragraph>
      <Paragraph position="5"> Dependencies between Constraints: As we have seen, the ordering of constraints in the relaxation specifications is very important. However, the particular role a specific constraint performs will of course depend on the particular parsing strategy being used. Ideally, we would like to generate the ordering information automatically, although it is not entirely clear how this might be done. One source of some ordering constraints might come from using typed feature structures in the lexicon, so that the rule application mechanism can determine ahead of time what the primary source of information is. Another approach might be to require the grammar writer to specify the constraints on rules as belonging to specific categories, and then to allow the rule application mechanism to impose a predefined ordering between categories; in particular, the most troublesome constraints are those which transport feature values around a structure, since they may transport the wrong values, as we saw in the example discussed earlier.</Paragraph>
      <Paragraph position="6"> Generation of Replacement Text: A topic we have not addressed in the present paper is the generation of corrections for hypothesised errors. The result of parsing using relaxation provides sufficient information to generate such replacements, but once again we need to maintain information about the dependencies between elements of a structure so that, when a new structure is created, any conflicts that arise can be resolved: for example, if generating a correction involves changing the num feature of a noun from plural to singular, we need to encode the information that the lex feature is dependent upon the num feature and some specification of the root form, so that the replacement mechanism knows which features take priority and which may be overridden.</Paragraph>
      <Paragraph position="7"> Deciding between Error Hypotheses: When a constraint unifying two incompatible values v1 and v2 has to be relaxed, then in the absence of further information there are two equally likely error hypotheses: one, that v1 is the correct value and v2 is wrong, and the other that v2 is correct and v1 is wrong. However, there are two types of situation in which further information available during parsing may enable one hypothesis to be preferred.</Paragraph>
      <Paragraph position="8"> The first is where the absolute likelihood of one error seems greater than that of the other. For example, in the case of the noun phrase these dog it might prove to be much more likely for a writer to mistakenly omit the single letter s than to choose the wrong determiner, which involves a change of two letters; there may be a quantifiable difference between the assumptions behind the two hypotheses. The second is where a number of possible errors are linked, for example if the whole sentence was These dog are fierce. Here, two possible errors involving different rules are interdependent, and once again it is possible to argue that one error hypothesis requires a quantifiably different set of assumptions; here, both these and are would have to be wrong if dog were to be assumed correct.</Paragraph>
      <Paragraph position="9"> &amp;quot;lb a certain extent, it may be possible to rely on unilication to deal with these confliet.~. The relaxation package dealing with the noun phr~ number disagreement might 'hold its fire'-not signal an error immediately- -leaving the number feature of the noun phrase uninstantiatcd. Then there will be no clash with the number of the verb phrmse, which will be propagated down to the noun phrase.</Paragraph>
      <Paragraph position="10"> It may be possible to hook this value up to the subsequent processing of the error suggestion from the noun phrase rule.</Paragraph>
      <Paragraph position="11"> Alternatively, the idea that there are a number of assumptions behind a given error hypothesis could be formalised, perhaps by using an ATMS \[de Kleer 1986a, 1986b\] to keep track of inconsistencies. Hypotheses could be weighted both by their absolute likelihood and the contextual evidence (i.e., the number and weight of related errors consistent and inconsistent with the hypotheses).</Paragraph>
      <Paragraph position="12"> Much depends on where during the parsing process errors arise and are notified, and so detailed consideration of this issue has been deferred until our chart parser extension to this system has been explored.</Paragraph>
      <Paragraph position="13"> Levels of Relaxation: The examples we have provided have only explicitly mentioned one level of relaxation. One can imagine situations where other, further levels of relaxation are available.</Paragraph>
      <Paragraph position="14"> In particular, note that, since categorial information can be specified by means of constraints, we can also consider handling instances of words misspelled as words of other syntactic categories by means of the same mechanism; relaxing category feature constraints might be an appropriate candidate for a further level of relaxation. There is of course the question of how one decides what relaxations should be available at what levels; determining this requires more detailed statistical analysis of the frequencies of different kinds of errors. It is also likely to be required that individual error rules, spread across a number of grammar rules, be capable of being treated as a unit, that is, switched on or off together, orthogonal to the idea of relaxation levels.</Paragraph>
      <Paragraph position="15"> Different Kinds of Relaxation: In the foregoing, we assumed that relaxing a constraint simply meant removing it. There are other notions of constraint relaxation that could be used, of course; for example, if a constraint assigns a value to some feature, we could relax this constraint by assigning a less specific value to that feature. There may be other cases where we would want to generalise the notion of relaxation to include the possibility that a constraint could be replaced by a quite different constraint.</Paragraph>
      <Paragraph position="16"> Conclusions and Future Work We have described a simple extension to the PATR-II formalism which allows us to provide declarative specifications of possible relaxations on rules. This provides a good starting point for a flexible mechanism for detecting and correcting syntactic errors. One reason for this is that relaxation provides a precise and systematic way of specifying the relationship between 'errorful' and 'correct' forms, making it easier to generate suggestions for corrections. A second reason is that the very uniform representation of linguistic information will allow flexible strategies for relaxation to be applied; this is particularly important when dealing with text that may contain unpredictable errors.</Paragraph>
      <Paragraph position="17"> As we have shown, the mechanism described here can be applied straightforwardly to Constraint Violation Errors as described at the beginning of the paper.</Paragraph>
      <Paragraph position="18"> At the moment we have a rather ad hoc mechanism that deals with cases of Lexical Confusion by providing alternative lexical entries in the case of parse failure, but this needs to be integrated better with the relaxation mechanism. Cases of Syntactic Awkwardness simply require the addition of a critic that walks over the structures produced by the parser.</Paragraph>
      <Paragraph position="19"> The major focus of our current work is the replacement of the shift-reduce parser by a chart parser, to enable us to handle cases of Missing or Extra Elements.</Paragraph>
    </Section>
  </Section>
</Paper>