File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/97/p97-1020_metho.xml
Size: 21,638 bytes
Last Modified: 2025-10-06 14:14:37
<?xml version="1.0" standalone="yes"?> <Paper uid="P97-1020"> <Title>Deriving Verbal and Compositional Lexical Aspect for NLP Applications</Title> <Section position="4" start_page="151" end_page="151" type="metho"> <SectionTitle> 2 Lexical Aspect </SectionTitle> <Paragraph position="0"> Following Olsen (To appear in 1997), we distinguish between lexical and grammatical aspect, roughly the situation and viewpoint aspect of Smith (1991).</Paragraph> <Paragraph position="1"> Lexical aspect refers to the type of situation denoted by the verb, alone or combined with other sentential constituents. Grammatical aspect takes these situation types and presents them as imperfective (John was winning the race/loving his job) or perfective (John had won/loved his job). Verbs are assigned to lexical aspect classes, as in Table 1 (cf. (Brinton, 1988, p. 57), (Smith, 1991)), based on their behavior in a variety of syntactic and semantic frames that focus on their features. 1 A major source of the difficulty in assigning lexical aspect features to verbs is the ability of verbs to appear in sentences denoting situations of multiple aspectual types. Such cases arise, e.g., in the context of foreign language tutoring (Dorr et al., 1995b; Sams, 1995; Weinberg et al., 1995), where a 'bounded' interpretation for an atelic verb, e.g., march, may be introduced by a path PP (to the bridge or across the field) or by an NP (the length of the field): (1) The soldier marched to the bridge.</Paragraph> <Paragraph position="2"> The soldier marched across the field.</Paragraph> <Paragraph position="3"> The soldier marched the length of the field.</Paragraph> <Paragraph position="4"> Some have proposed, in fact, that aspectual classes are gradient categories (Klavans and Chodorow, 1992), or that aspect should be evaluated only at the clausal or sentential level (esp. 
(Verkuyl, 1993); see (Klavans and Chodorow, 1992) for NLP applications).</Paragraph> <Paragraph position="5"> Olsen (To appear in 1997) showed that, although sentential and pragmatic context influence aspectual interpretation, input to the context is constrained in large part by verbs' aspectual information. In particular, she showed that the positively marked features did not vary: [+telic] verbs such as win were always bounded, for example. In contrast, the negatively marked features could be changed by other sentence constituents or pragmatic context: [-telic] verbs like march could therefore be made [+telic].</Paragraph> <Paragraph position="6"> Similarly, stative verbs appeared with event interpretations, and punctiliar events as durative. (Footnote 1: Two additional categories are identified by Olsen (To appear in 1997): Semelfactives (cough, tap) and Stage-level states (be pregnant). Since they are not assigned templates by either Dowty (1979) or Levin and Rappaport Hovav (To appear), we do not discuss them in this paper.)</Paragraph> <Paragraph position="7"> Olsen therefore proposed that aspectual interpretation be derived through monotonic composition of marked privative features [+/Ø dynamic], [+/Ø durative] and [+/Ø telic], as shown in Table 2 (Olsen, To appear in 1997, pp. 32-33).</Paragraph> <Paragraph position="8"> With privative features, other sentential constituents can add to features provided by the verb but not remove them. On this analysis, the activity features of march ([+durative, +dynamic]) propagate to the sentences in (1), with [+telic] added by the NP or PP, yielding an accomplishment interpretation. 
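The monotonic, add-only behaviour of privative features described here can be illustrated with a small Python sketch. It is an illustration under stated assumptions, not Olsen's formalism: each aspectual class is modelled as the set of features it marks positively (the four sets are read off the class descriptions in this paper, since Table 2 is not reproduced here), and the names compose and classify are hypothetical.

```python
# Minimal sketch (assumption): a privative feature is either present or absent,
# so an aspectual class is simply the set of features it marks positively.
STATE = frozenset({"durative"})
ACTIVITY = frozenset({"durative", "dynamic"})
ACCOMPLISHMENT = frozenset({"durative", "dynamic", "telic"})
ACHIEVEMENT = frozenset({"dynamic", "telic"})

CLASS_NAMES = {
    STATE: "state",
    ACTIVITY: "activity",
    ACCOMPLISHMENT: "accomplishment",
    ACHIEVEMENT: "achievement",
}


def compose(verb_features, *constituent_features):
    """Monotonic composition: constituents may add features, never remove them."""
    result = frozenset(verb_features)
    for extra in constituent_features:
        result = result | frozenset(extra)
    return result


def classify(features):
    """Map a feature set back to a class name (hypothetical helper)."""
    return CLASS_NAMES.get(frozenset(features), "unclassified")


# 'march' is an activity; a path PP such as 'to the bridge' contributes +telic.
march = ACTIVITY
to_the_bridge = {"telic"}
print(classify(march), "->", classify(compose(march, to_the_bridge)))
```

Because composition is a set union, a state can gain [+dynamic] and [+telic] but can never lose [+durative], which is why this model does not let a state become an achievement.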
The feature specification of this compositionally derived accomplishment is therefore identical to that of a sentence containing a telic accomplishment verb, such as produce in (2).</Paragraph> <Paragraph position="9"> (2) The commander produced the campaign plan.</Paragraph> <Paragraph position="10"> Dowty (1979) explored the possibility that aspectual features in fact constrained possible units of meaning and ways in which they combine. In this spirit, Levin and Rappaport Hovav (To appear) demonstrate that limiting composition to aspectually described structures is an important part of an account of how verbal meanings are built up, and what semantic and syntactic combinations are possible. We draw upon these insights in revising our LCS lexicon in order to encode the aspectual features of verbs. In the next section we describe the LCS representation used in a database of 9000 verbs in 191 major classes. We then describe the relationship of aspectual features to this representation and demonstrate that it is possible to determine aspectual features from LCS structures, with minimal modification. We demonstrate composition of the LCS and corresponding aspectual structures by using examples from NLP applications that employ the LCS database.</Paragraph> </Section> <Section position="5" start_page="151" end_page="153" type="metho"> <SectionTitle> 3 Lexical Conceptual Structures </SectionTitle> <Paragraph position="0"> We adopt the hypothesis explored in Dorr and Olsen (1996) (cf. (Tenny, 1994)), that lexical aspect features are abstractions over other aspects of verb semantics, such as those reflected in the verb classes in Levin (1993). Specifically, we show that a privative model of aspect provides an appropriate diagnostic for revising lexical representations: aspectual interpretations that arise only in the presence of other constituents may be removed from the lexicon and derived compositionally. 
Our modified LCS lexicon then allows aspect features to be determined algorithmically both from the verbal lexicon and from composed structures built from verbs and other sentence constituents, using uniform processes and representations. This project on representing aspectual structure builds on previous work, in which verbs were grouped automatically into Levin's semantic classes (Dorr and Jones, 1996; Dorr, To appear) and assigned LCS templates from a database built as Lisp-like structures (Dorr, 1997). The assignment of aspectual features to the classes in Levin was done by hand inspection of the semantic effect of the alternations described in Part I of Levin (Olsen, 1996), with automatic coindexing to the verb classes (see (Dorr and Olsen, 1996)). Although a number of Levin's verb classes were aspectually uniform, many required subdivisions by aspectual class; most of these divided atelic &quot;manner&quot; verbs from telic &quot;result&quot; verbs, a fundamental linguistic distinction (cf. (Levin and Rappaport Hovav, To appear) and references therein). Examples are discussed below.</Paragraph> <Paragraph position="1"> Following Grimshaw (1993), Pinker (1989), and others, we distinguish between semantic structure and semantic content. Semantic structure is built up from linguistically relevant and universally accessible elements of verb meaning. Borrowing from Jackendoff (1990), we assume semantic structure to conform to well-formedness conditions based on Event and State types, further specialized into primitives such as GO, STAY, BE, GO-EXT, and ORIENT. We use Jackendoff's notion of field, which carries Loc(ational) semantic primitives into non-spatial domains such as Poss(essional), Temp(oral), Ident(ificational),</Paragraph> <Paragraph position="2"> Circ(umstantial), and Exist(ential). 
We adopt a new primitive, ACT, to characterize certain activities (such as march) which are not adequately distinguished from other event types by Jackendoff's GO primitive. 2 Finally, we add a manner component to distinguish among verbs in a class, such as the motion verbs run, walk, and march. (Footnote 2: Jackendoff (1990) augments the thematic tier of Jackendoff (1983) with an action tier, which serves to characterize activities using additional machinery. We choose to simplify this characterization by using the ACT primitive rather than introducing yet another level of representation.)</Paragraph> <Paragraph position="3"> Consider march, one of Levin's Run verbs (51.3.2): 3 we assign it the template in (3)(i), with the corresponding Lisp format shown in (3)(ii):</Paragraph> <Paragraph position="5"> This list structure recursively associates arguments with their logical heads, represented as primitive/field combinations, e.g., ACTLoc becomes (act loc ...) with a (thing 1) argument. Semantic content is represented by a constant in a semantic structure position, indicating the linguistically inert and non-universal aspects of verb meaning (cf. (Grimshaw, 1993; Pinker, 1989; Levin and Rappaport Hovav, To appear)); in this case, the manner component is represented by march. The numbers in the lexical entry are codes that map between LCS positions and their corresponding thematic roles (e.g., 1 = agent). The * marker indicates a variable position (i.e., a non-constant) that is potentially filled through composition with other constituents.</Paragraph> <Paragraph position="6"> In (3), (thing 1) is the only argument. However,</Paragraph> <Paragraph position="7"> other arguments may be instantiated compositionally by the end-NLP application, as in (4) below, 
for the sentence The soldier marched to the bridge: In the next sections we outline the aspectual properties of the LCS templates for verbs in the lexicon and illustrate how LCS templates compose at the sentential level, demonstrating how lexical aspect feature determination occurs via the same algorithm at both verbal and sentential levels.</Paragraph> </Section> <Section position="6" start_page="153" end_page="154" type="metho"> <SectionTitle> 4 Determining Aspect Features from the LCS Structures </SectionTitle> <Paragraph position="0"> The components of our LCS templates correlate strongly with aspectual category distinctions. An exhaustive listing of aspectual types and their corresponding LCS representations is given below. The !! notation is used as a wildcard which is filled in by the lexeme associated with the word defined in the lexical entry, thus producing a semantic constant.</Paragraph> <Paragraph position="1"> (5) (i) States: (be ident/perc/loc (thing 2) ... (by !! 26)) (ii) Activities: (Footnote text, fragment: ... distinctions, but are not articulated enough to capture other distinctions among verbs required by a large-scale application.)</Paragraph> <Paragraph position="2"> Since the verb classes (state, activity, etc.) are abstractions over feature combinations, we now discuss each feature in turn.</Paragraph> <Section position="1" start_page="153" end_page="153" type="sub_section"> <SectionTitle> 4.1 Dynamicity </SectionTitle> <Paragraph position="0"> The feature [+dynamic] encodes the distinction between events ([+dynamic]) and states ([Ødynamic]). Dynamicity, arguably &quot;the most salient distinction&quot; in an aspect taxonomy (Dahl, 1985, p. 28), is encoded at the topmost level of the LCS. 
Events are characterized by go, act, stay, cause, or let, whereas States are characterized by go-ext or be, as illustrated in (6).</Paragraph> <Paragraph position="1"> (6) (i) Achievements: decay, rust, redden (45.5) (go ident (* thing 2) (toward ident (thing 2) (at ident (thing 2) (!!-ed 9)))) (ii) Accomplishments: dangle, suspend (9.2) (cause (* thing 1) (be ident (* thing 2) (at ident (thing 2) (!!-ed 9)))) (iii) States: contain, enclose (47.8) (be loc (* thing 2) (in loc (thing 2) (* thing 11)) (by !! 26)) (iv) Activities: amble, run, zigzag (51.3.2) (act loc (* thing 1) (by !! 26))</Paragraph> </Section> <Section position="2" start_page="153" end_page="153" type="sub_section"> <SectionTitle> 4.2 Durativity </SectionTitle> <Paragraph position="0"> The [+durative] feature denotes situations that take time (states, activities and accomplishments). Situations that may be punctiliar (achievements) are unspecified for durativity ((Olsen, To appear in 1997) following (Smith, 1991), inter alia). In the LCS, durativity may be identified by the presence of act, be, go-ext, cause, and let primitives, as in (7); these are lacking in the achievement template, shown in (8).</Paragraph> </Section> <Section position="3" start_page="153" end_page="154" type="sub_section"> <SectionTitle> 4.3 Telicity </SectionTitle> <Paragraph position="0"> Telic verbs denote a situation with an inherent end or goal. Atelic verbs lack an inherent end, though,</Paragraph> <Paragraph position="1"> as (1) shows, they may appear in telic sentences with other sentence constituents. In the LCS, [+telic] verbs contain a Path of a particular type or a constant (!!) 
in the rightmost leaf-node argument.</Paragraph> <Paragraph position="2"> Some examples are shown below:</Paragraph> <Paragraph position="4"> In the first case the special path component,</Paragraph> <Paragraph position="5"> toward or away_from, is the telicity indicator; in the next three, the (uninstantiated) constant in the rightmost leaf-node argument; and, in the last case, the special (instantiated) constant exist.</Paragraph> <Paragraph position="6"> Telic verbs include:</Paragraph> <Paragraph position="8"> Examples of atelic verbs are given in (11). The (a)telic representations are especially in keeping with the privative feature characterization of Olsen (1994; To appear in 1997): telic verb classes are homogeneously represented, in that the LCS has a path of a particular type, i.e., a &quot;reference object&quot; at an end state. Atelic verbs, on the other hand, do not have homogeneous representations.</Paragraph> <Paragraph position="9"> (11) (i) Activities: appeal, matter (31.4) (act perc (* thing 1) (on perc (* thing 2)) (by !! 26)) (ii) States: wear (41.3.1) (be loc (* !! 2) (on loc (!! 2) (* thing 11)))</Paragraph> </Section> </Section> <Section position="7" start_page="154" end_page="154" type="metho"> <SectionTitle> 5 Modifying the Lexicon </SectionTitle> <Paragraph position="0"> We have examined the LCS classes with respect to identifying aspectual categories and determined that minor changes to 101 of 191 LCS class structures (213/390 subclasses) are necessary, including substituting act for go in activities and removing Path constituents that need not be stated lexically. For example, the original database entry for class 51.3.2 is:</Paragraph> <Paragraph position="2"> This is modified to yield the following new database entry: (13) (act loc (* thing 1) (by march 26)) The modified entry is created by changing go to act and removing the ((* toward 5) ...) constituent. 
Modification of the lexicon to conform to aspectual requirements took 3 person-weeks, requiring 1370 decision tasks at 4 minutes each: three passes through each of the 390 subclasses to compare the LCS structure with the templates for each feature (substantially complete) and one pass to change 200 LCS structures to conform with the templates.</Paragraph> <Paragraph position="3"> (Fewer than ten classes need to be changed for durativity or dynamicity, and approximately 200 of the 390 subclasses for telicity.) With the changes we can automatically assign aspect to some 9000 verbs in existing classes. Furthermore, since 6000 of the verbs were classified by automatic means, new verbs would receive aspectual assignments automatically as a result of the classification algorithm.</Paragraph> <Paragraph position="4"> We are aware of no attempt in the literature to determine aspectual information on a similar scale; in part, we suspect, because of the difficulty of assigning features to verbs, since they appear in sentences denoting situations of multiple aspectual types. Based on our experience handcoding small sets of verbs, we estimate that generating aspectual features for 9000 entries would require 3.5 person-months (four minutes per entry), with 1 person-month for proofing and consistency checking, given unclassified verbs, organized, say, alphabetically.</Paragraph> </Section> <Section position="8" start_page="154" end_page="156" type="metho"> <SectionTitle> 6 Aspectual Feature Determination for Composed LCS's </SectionTitle> <Paragraph position="0"> The modifications described above reveal similarities between verbs that carry a lexical aspect feature as part of their lexical entry and sentences that have features as a result of LCS composition. 
Consequently, the algorithm that we developed for verifying aspectual conformance of the LCS database is also directly applicable to aspectual feature determination in LCSs that have been composed from verbs and other relevant sentence constituents. LCS composition is a fundamental operation in two applications for which the LCS serves as an interlingua: machine translation (Dorr et al., 1993) and foreign language tutoring (Dorr et al., 1995b; Sams, 1993; Weinberg et al., 1995). Aspectual feature determination applies to the composed LCS by first assigning unspecified feature values (atelic [ØT], non-durative [ØR], and stative [ØD]) and then monotonically setting these to positive values according to the presence of certain constituents.</Paragraph> <Paragraph position="1"> The formal specification of the aspectual feature determination algorithm is shown in Figure 1. [Figure 1, partially recovered: Given an LCS representation L: 1. Initialize: T(L) := [ØT], D(L) := [ØD], R(L) := [ØR]. 2. If top node of L ∈ {CAUSE, LET, GO} ...] The first step initializes all aspectual values to be unspecified. Next, the top node is examined for membership in a set of telicity indicators (CAUSE, LET, GO); if there is a match, the LCS is assumed to be [+T]. In this case, the top node is further checked for membership in sets that indicate dynamicity [+D] and durativity [+R]. Then the top node is examined for membership in a set of atelicity indicators (ACT, BE, STAY); if there is a match, the LCS is further examined for inclusion of a telicizing component, i.e., TO, TOWARD, FORTemp. The LCS is assumed to be [ØT] unless one of these telicizing components is present. In either case, the top node is further checked for membership in sets that indicate dynamicity [+D] and durativity [+R]. 
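The steps just described can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the authors' implementation: an LCS is represented as a nested tuple whose first element is the primitive, the function names are hypothetical, and the dynamicity and durativity sets are taken from the prose of Sections 4.1 and 4.2 because the corresponding lines of Figure 1 are not available in this text.

```python
# Top-node sets named in the prose description of the algorithm.
TELIC_TOP = {"cause", "let", "go"}
ATELIC_TOP = {"act", "be", "stay"}
# Assumption: these two sets follow Sections 4.1 and 4.2, not Figure 1 itself.
DYNAMIC_TOP = {"go", "act", "stay", "cause", "let"}
DURATIVE_TOP = {"act", "be", "go-ext", "cause", "let"}


def internal_nodes(lcs):
    """Yield every tuple-valued sub-structure below the top node."""
    for child in lcs[1:]:
        if isinstance(child, tuple) and child:
            yield child
            yield from internal_nodes(child)


def is_telicizer(node):
    """True for TO, TOWARD, or FOR in the temporal field."""
    head = node[0]
    if head in ("to", "toward"):
        return True
    return head == "for" and len(node) > 1 and node[1] == "temp"


def aspect(lcs):
    """Return telicity (T), dynamicity (D) and durativity (R) as booleans.

    False stands for the unspecified value; True for the positive value.
    """
    top = lcs[0]
    telic = False  # step 1: everything starts unspecified
    if top in TELIC_TOP:
        telic = True
    elif top in ATELIC_TOP:
        telic = any(is_telicizer(node) for node in internal_nodes(lcs))
    return {"T": telic, "D": top in DYNAMIC_TOP, "R": top in DURATIVE_TOP}


# The modified lexical entry for march, and the composed LCS for
# 'The soldier marched for an hour', both written as nested tuples.
MARCH = ("act", "loc", ("thing", 1), ("by", "march", 26))
MARCH_FOR_AN_HOUR = (
    "act", "loc", ("soldier",), ("by", "march"),
    ("for", "temp", ("*head*",), ("hour",)),
)

print(aspect(MARCH))
print(aspect(MARCH_FOR_AN_HOUR))
```

Under these assumptions the bare march entry comes out atelic, dynamic and durative, while the composed structure becomes telic solely because of the added (for temp ...) constituent; the same function serves both the lexical entry and the composed LCS.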
Finally, the results of telicity, dynamicity, and durativity assignments are returned.</Paragraph> <Paragraph position="2"> The advantage of using this same algorithm for determination of both verbal and sentential aspect is that it is possible to use the same mechanism to perform two independent tasks: (1) Determine inherent aspectual features associated with a lexical item; (2) Derive non-inherent aspectual features associated with combinations of lexical items.</Paragraph> <Paragraph position="3"> Note, for example, that adding the path to the bridge to the [Øtelic] verb entry in (3) establishes a [+telic] value for the sentence as a whole, an interpretation made available by the same algorithm that identifies verbs as telic in the LCS lexicon: In our applications, access to both verbal and sentential lexical aspect features facilitates the task of lexical choice in machine translation and interpretation of students' answers in foreign language tutoring. For example, our machine translation system selects appropriate translations based on the matching of telicity values for the output sentence, whether or not the verbs in the languages match in telicity.</Paragraph> <Paragraph position="4"> The English atelic manner verb march and the telic PP across the field from (1) are best translated into Spanish as the telic verb cruzar with the manner marchando as an adjunct: Similarly, in changing the Weekend Verbs (i.e.,</Paragraph> <Paragraph position="5"> December, holiday, summer, weekend, etc.) template to telic, we make use of the measure phrase (for temp ...) which was previously available,</Paragraph> <Paragraph position="6"> though not employed, as a mechanism in our database. 
Thus, we now have a lexicalized example of 'doing something for a certain time' that has a representation corresponding to the canonical telic frame V for an hour phrase, as in The soldier marched for an hour: (16) (act loc (soldier) (by march) (for temp (*head*) (hour))) This same telicizing constituent, which is compositionally derived in the crawl construction, is encoded directly in the lexical entry for a verb such as December: (17) ... (for temp (*head*) (december 31))) This lexical entry is composed with other arguments to produce the LCS for John Decembered at the new cabin: (18) (stay loc (john) (at loc (john) (cabin (new))) (for temp (*head*) (december))) This same LCS would serve as the underlying representation for the equivalent Spanish sentence, which uses an atelic verb estar 4 in combination with a temporal adjunct durante el mes de Diciembre: John estuvo en la cabaña nueva durante el mes de Diciembre (literally, John was in the new cabin during the month of December).</Paragraph> <Paragraph position="7"> The monotonic composition permitted by the LCS templates is slightly different from that permitted by the privative feature model of aspect (Olsen, 1994; Olsen, To appear in 1997). (Footnote 4: Since estar may be used with both telic (estar alto) and atelic (estar contento) readings, we analyze it as atelic to permit appropriate composition.)</Paragraph> <Paragraph position="8"> For example, in the LCS, states may be composed into an achievement or accomplishment structure, because states are part of the substructure of these classes (cf. templates in (6)). They may not, however, appear as activities. The privative model in Table 2 allows states to become activities and accomplishments, by adding [+dynamic] and [+telic] features, but they may not become achievements, since removal of the [+durative] feature would be required. The nature of the alternations between states and events is a subject for future research.</Paragraph> </Section> </Paper>