File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/73/c73-1004_metho.xml
Size: 23,180 bytes
Last Modified: 2025-10-06 14:11:05
<?xml version="1.0" standalone="yes"?> <Paper uid="C73-1004"> <Title>CONSTRUCTIBLE REPRESENTATIONS FOR TWO SEMANTIC RELATIONS</Title> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> 30 FRANK G. PAGAN </SectionTitle> <Paragraph position="0"> It is far beyond the scope of the paper tO consider the development and implementation of practical programs for building large semantic structures. It should also be emphasized that the aim is not the development of linguistic theories or the presentation of particular lexical structures. While the linguistic positions adopted in the following section are consistent with much of the existing literature, they are primarily intended to provide a sample context in which the role of constructibility may be illustrated.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2. HYPONYMY AND COMPATIBILITY </SectionTitle> <Paragraph position="0"> The term hyponymy has been used by J. LYoNs (1968) to denote the common but somewhat vague notion of &quot;semantic inclusion&quot; A hyponymy p of a word sense q is a word sense whose meaning is a &quot;subset &quot; of the meaning of q; for example, scarlet is a hyponym of red, and tulip is a hyponym of flower. Intuitively, ifp is a hyponym of q, then it makes sense to say that p is a kind of q. Synonymy of word senses may be defined in terms of hyponymy: p is a synonym of q if and only if p is a hyponym of q and q is a hyponym of p. The hyponymy relation is evidently reflexive and transitive.</Paragraph> <Paragraph position="1"> The importance of hyponymy has been generally recognized by workers in computational semantics (see, for example, M. 1~. QuII.-I.IAN, 1969, and P,.. M. SCHWARCZ, J. F. BrinGeR, 1~. F. SIMMONS, 1970). Quillian's system (M. R. QUILI.IAN, 1969) is a generalization of a scheme for representing lexical definitions of word senses in terms of supersets and properties, where the superset relation is simply the inverse of hyponymy. When the superset of some sense is modified by the relevant properties, an expression is obtained which is a definition of paraphrase of the given sense. As a highly simplified example, suppose that the superset of bachelor is man, with the corresponding property unmarried, and that the superset of youth is also man, with the corresponding property young. A simple calculation would then mark the expressions zoung bachelor and unmarried youth as paraphrases, since both are equivalent to the expression young unmarried man. Although I have not mentioned many of the details and problems involved (such as the possible forms properties may take), it seems clear that a representation for hyponymy is necessary for such important semantic processes as transformation between equivalent linguistic expressions.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> CONSTRUCTIBLE REPRESENTATIONS FOR TWO SEMANTIC RELATIONS 31 </SectionTitle> <Paragraph position="0"> Turning to the compatibility relation, the simplest case involves combinations of adjective senses and noun senses to form expressions which may be either meaningful (hungry man) or meaningless (hungry theory). The two senses may be said to be compatible if their combination is meaningful. It turns out that little can be done with regard to the representation of this relation unless its definition is extended as follows: two adjective senses are compatible if and only if there are noun senses compatible with both. It is useful here to think of noun senses as &quot;atoms &quot; or &quot;constants &quot; and adjective senses as &quot;predicates &quot; which take noun senses as arguments (this is similar to Fillmore's approach \[C.J. FII.LMOI~, 1969\]). The second definition of compatibility then involves the intersections of the domains of predicates. This relation, which is reflexive and symmetric, will be the more important as far as constructibility is concerned.</Paragraph> <Paragraph position="1"> * As for the use of compatibility, on the other hand, the first definition is the important one. First, in a linguistic or computational system involving the generation or analysis of sentences, it would be useful in excluding meaningless constructions of word senses, and it can be argued that this capability is a fundamental one for any such system.</Paragraph> <Paragraph position="2"> Secondly, the relation provides the ability to disambiguate words in the context of their combination with other words. For example, suppose that hard-has two senses written as hard (difficult) and hard (not soft), and that ball has the two senses ball (round object) and ball (social event).</Paragraph> <Paragraph position="3"> Suppose further that hard (di~cult) is compatible with question and incompatible with both senses of ball, and that hard (not soft) is incompatible with question and ball (social event) but compatible with ball (round object). Then, in the expression* hard question, hard must mean hard (diffcult). Similarly, in the expression hard ball, hard must mean hard (not soft) and ball must mean ball (round object), since the other three combinations are all incompatible.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 3. STRUCTURES. FOR HYPONYMY </SectionTitle> <Paragraph position="0"> To illustrate some classes of structures for the hyponymy relation, a hypothetical set of seven word senses a, b .... , g will be used. Their hyponymy relationships are given in Fig. 1 in the form of a matrix: a sense x is a hyponym of a sense y if and only if there is a 1 in the position whose row corresponds to x and whose column corresponds to y.</Paragraph> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> 32 FRANK G. PAGAN </SectionTitle> <Paragraph position="0"> a b c d e f g</Paragraph> <Paragraph position="2"> The graph structure which directly corresponds to this matrix is given in Fig. 2.</Paragraph> <Paragraph position="4"> In an F-graph such as this, each node corresponds to a sense, and the interpretation of the structure is that x is a hyponym of y if and only if there is an arc from x's node to y's node (stricdy speaking, since hyponymy is reflexive, there should be a loop at each node, but these may be omitted without loss of information).</Paragraph> </Section> <Section position="7" start_page="0" end_page="0" type="metho"> <SectionTitle> CONSTRUCTIBLE REPRESENTATIONS FOR TWO SEMANTIC RELATIONS 33 </SectionTitle> <Paragraph position="0"> An obvious simplification is otbained by making use of the transitivity of hyp0nymy in the interpretation of the structure. In a G-graph (Fig. 3), then, the nodes are unchanged from the corresponding Fgraph, but now x is a hyponym of/, if and only if there is a path from x's node to/s node.</Paragraph> <Paragraph position="1"> An F-graph is the transitive closure of its corresponding G-graph, and the G-graph has no &quot;redundant &quot; arcs; i.e., arcs from x to y and from y to z imply that there is no arc from x to z. This, together with the fact that synonyms are represented by separate nodes, implies that G-graphs are not necessarily unique; for example, the G-graph in Fig. 4 is equivalent to that in Fig. 3.</Paragraph> <Paragraph position="3"/> </Section> <Section position="8" start_page="0" end_page="0" type="metho"> <SectionTitle> 34 FRANK G. PAGAN </SectionTitle> <Paragraph position="0"> This non-uniqueness may be avoided by transforming the G-graph into an H-graph, in which synonyms are associated with the same node but the interpretation is otherwise unchanged. All H-graphs are acydic, since all the senses on a cycle would have identical hyponymy relationships with all other senses, i.e., they would be synonyms. Thus the nodes form a partially ordered set, and/-/-graphs may be represented by lattice-like Hasse diagrams (D. E. P,.OTr~RrORD, 1965) where each edge is assumed to be directed towards the top of the page, although the arrows are not explicitly present (Fig. 5).</Paragraph> <Paragraph position="1"> It may easily be seen that H-graphs are in one-to-one correspondence with the set of all possible hyponymy matrices (transitive, square, Boolean matrices). They are thus a fully general class of structures, and the question arises as to whether the set of linguistically significant structures is actually more restrictive than this. Judging from the exsting literature in pure and applied linguistics, there would seem to be a consensus that hyponymy graphs should in fact be tree-structured.</Paragraph> <Paragraph position="2"> This is apparently the case in Quillian's system (M. P~. QmLLIAN, 1969), for example, and some experimental support for the psychological reality of this type of structure is given in A. M. CotuNs, M. 1~. QUttHAN (1969). T. G. B~.W.R and P. S. R.OSSNRAU~ (1970) have also argued for the adequacy of tree structures in this context. In terms of H-graphs, the necessary restriction is obtained by requiring that any word sense may be an immediate hyponym of at most one other sense; i.e., it may not be a hyponym of two &quot; disjoint&quot; senses.</Paragraph> </Section> <Section position="9" start_page="0" end_page="0" type="metho"> <SectionTitle> CONSTRUCTIBLE REPRESENTATIONS FOR TWO SEMANTIC RELATIONS 35 </SectionTitle> <Paragraph position="0"> The class of structures so obtained may be termed H-forests, since they are a proper subset of the/-/-graphs.</Paragraph> <Paragraph position="1"> Although it has been generally adopted in existing models, the restriction to H-forests is not without its dit~iculties. The H-graph in Fig. 6 represents an intuitively correct set of hyponymy relationships which is inconsistent with the H-forests.</Paragraph> <Paragraph position="2"> An interesting but open question is whether such examples can be considered to be rare in natural language, occurring only in very special cases such as kinship terminology. A related question concerns the validity of the assumption that a given sense is necessarily an immediate hyponym of at most one other sense, so that hyponymy, defined on this basis, must be tree-structured. It is clear that with the present state of our knowledge a linguistic case can be made for both H-graphs and /-/-forests as the appropriate class of structures for the hyponymy relation. null Turning now to the constructibility of the various possible classes of hyponymy structures, it is not necessary to consider F-graphs and G-graphs, since they m:iy be dismissed in favor of H-graphs purely on the grounds of their mathematical properties and economy of representation. Both H-graphs and H-forests are constructible insofar as one can devise semi-automatic discovery procedures which will, in principle, construct them from basic data obtained from an informant.</Paragraph> </Section> <Section position="10" start_page="0" end_page="0" type="metho"> <SectionTitle> 36 FRANK G. PAGAN </SectionTitle> <Paragraph position="0"> This &quot;basic data &quot; would consist of judgments of the relationships between particular pairs of senses: either the first is a hyponym of the second, the second is a hyponym of the first, they are synonyms, or they are related in none of these ways.</Paragraph> <Paragraph position="1"> The H-graph or H-forest would be constructed by a process of monotonic refinement; i.e., after the incorporation of each additional word sense, the new structure would be a recognizable generalization of the previous structure. More specifically, suppose that Hi and Hi+l are the current structures after i and i + 1 senses, respectively, x and y are any two senses previously processed, and the distance between two senses on the same path is the number of arcs separating them.</Paragraph> <Paragraph position="2"> Then the following statements would hold: a) If x and y are at the same node in H i, then they are at the same node in Hi+l; b) x and y are not at the same node in H i iff they are not at the same node in H~+I; c) x is above y in Hi iff x is above y in Hi+l;</Paragraph> <Paragraph position="4"> f) x and y are not on the same path in Hi iff they are not on the same path in Hi+l.</Paragraph> <Paragraph position="5"> The remaining requirements for constructibility are those of input efficiency and consistency maintenance. An obvious point of reference for efficiency is the maximum number of inputs which would be required irrespective of the algorithm and the set of word senses used. 1 N 2, where N is the number of senses, This is approximately given by and would be reached in cases where inputs must be obtained for all pairs of senses. The transitivity of hyponymy would allow H-graphs to be constructed with better efficiency than this; for example, if a sense z were found to be a hyponym of y, and y were known to be a hyponym of x, then it could be inferred or predicted that z is a hyponym of x with no additional input. H-forests could be constructed with better efficiency than H-graphs since they are a much more predictive class of structures; for example, if z were found to be &quot; unrelated&quot; at the root of a tree, then it would be predicted to be &quot;unrelated&quot; everywhere in the tree.</Paragraph> <Paragraph position="6"> Because the H-graphs are a fully general class of structures, inconsistency of a set of inputs with respect to them could be due only to</Paragraph> </Section> <Section position="11" start_page="0" end_page="0" type="metho"> <SectionTitle> CONSTRUCTIBLE REPRESENTATIONS FOR TWO SEMANTIC RELATIONS - 37 </SectionTitle> <Paragraph position="0"> a violation of transitivity, and the procedure could check for such occurences automatically. In the case of H-forests, there is more scope for inconsistency, since any valid H-graph which is not also an H-forest represents a set of information which is inconsistent with Hforests. The procedure would have to obtain redundant inputs in order to maintain a consistency check in this case. The details of testing for and correcting inconsistencies would depend upon the properties of the particular discovery procedure used.</Paragraph> <Paragraph position="1"> To summarize this section, I have attempted to place in perspective the plausible classes of structures for hyponymy. Both H-graphs and H-forests satisfy the basic requirement for constructibility. The construction of the latter, however, would be more efficient, so that it would be disadvantageous to use a discovery procedure for H-graphs in general if H-forests alone are linguistically adequate.</Paragraph> </Section> <Section position="12" start_page="0" end_page="0" type="metho"> <SectionTitle> 4. STRUCTURES FOR COMPATIBILITY </SectionTitle> <Paragraph position="0"> The variety of structure classes for the compatibility relation turns out to be richer than in the case of hyponymy. To illustrate some of these representations, suppose that the compatibility relationships among the seven-word senses a, b ..... g are as shown in the matrix of Fig. 7.</Paragraph> <Paragraph position="1"> a b c d e f g</Paragraph> <Paragraph position="3"> Since the relation is symmetric, it is quite natural to represent it as an undirected graph (an A-graph) where the nodes correspond to senses and the edges indicate compatibility. The A-graph corresponding to For large sets of senses, A-graphs would clearly be unwieldy and uneconomic, and a more compact representation would be highly desirable. One particularly interesting transformation of the matrix or A-graph is obtained by grouping senses with the same compatibility properties with respect to all the other senses and placing them at the same node. The interpretation of such a C-graph is that two senses are compatible if and only if they are associated with the same or adjacent nodes. The C-graph for the present example is given in Fig. 9. It can be easily shown that every C-graph is unique and that this class of structures is fully general.</Paragraph> </Section> <Section position="13" start_page="0" end_page="0" type="metho"> <SectionTitle> CONSTRUCTIBLE REPRESENTATIONS FOR TWO SEMANTIC RELATIONS 39 </SectionTitle> <Paragraph position="0"> Turning now to more restrictive structures, a more compact representation can sometimes be obtained by transforming the undirected C-graph into a directed D-graph with the same set of nodes but with the interpretation that two senses are compatible if and only if either they are at the same node or there exists a path between their nodes. In analogy to the situation for/-/=graphs, the nodes of a D-graph are partially ordered. Fig. 10 shows a D-graph for the example in the form of a Hasse diagram. The D-graphs are not a fully general class of structures; i.e., there are C-graphs with no corresponding D-graphs. Neither are they necessarily unique: it can be shown that some C-graphs have more than one equivalent D-graph.</Paragraph> <Paragraph position="1"> The linguistic significance of these structures lies in their strong similarity to systems of semantic markers or features (corresponding to nodes) related by redundancy rules (corresponding to edges) (J. J. KATZ, J. A. FODOR, 1963). In this case, a feature is defined as the set of senses which, in other formulations, would be said to have the feature. The relationships among features, and hence the compatibility properties of word senses, have often been represented by structures similar to/9= graphs; moreover, a survey of the relevant theoretical and computational literature (e,g., tL. C. SCHANK, 1969; F. SOMMF.RS, 1963) would reveal that tree-like structures have nearly always been considered to be adequate for this purpose. The graph of Fig. 10 is in fact such a D-forest. A given set of compatibility relationships can be represented by at most one D-forest.</Paragraph> <Paragraph position="2"> The constructibility criterion also bears strongly on the relative merits of these structure classes. Considering the C-graphs first, it 40 FRANK G. PAGAN may easily be shown that this class satisfies all the requirements of constructibility except the crucial one of effciency. Assuming the availability of basic data in the form of yes-or-no judgments of the compatibility of unordered pairs of word senses, a C-graph could be built by a monotonic refinement process: at each stage the current sense would either be added to an existing node or form a new node, possibly in conjunction with the &quot;splitting &quot; of other nodes. Efficiency, however, is the worst possible because the class of structures is fully general, and the judgements obtained for the compatibility of the current sense with some subset of the previous senses have no predictive value for its compatibility with any of the other senses in the graph.</Paragraph> <Paragraph position="3"> The lack of effciency in constructing C-graphs can be regarded as a corollary of the fact that inconsistencies are impossible2 They are possible in the case of D-graphs, however, and thus the assumption that they would not arise would make some judgments unnecessary after others became known. But D-graphs fail to be constructible because of their non-uniqueness. The D-graphs shown in Fig. 11 and Fig. 12 are equivalent, and that shown in Fig. 13 represents the same information except that it contains an additional word sense.</Paragraph> </Section> <Section position="14" start_page="0" end_page="0" type="metho"> <SectionTitle> CONSTRUCTIBLE REPRESENTATIONS FOR TWO SEMANTIC RELATIONS 41 </SectionTitle> <Paragraph position="0"/> </Section> <Section position="15" start_page="0" end_page="0" type="metho"> <SectionTitle> 42 FRANK G. PAGAN </SectionTitle> <Paragraph position="0"> The latter graph is a simple modification of Fig. 11 but not of Fig.</Paragraph> <Paragraph position="1"> 12, and there is no graph equivalent to Fig. 13 which is easily obtained from Fig. 12. Since a given discovery procedure could construct either of the first two graphs depending upon the order in which the senses were incorporated, the process would in general involve complete reconstruction rather than monotonic refinement.</Paragraph> <Paragraph position="2"> The D-forests, finally, are constructible in all respects. Monotonic refinement is characterized in this case by the following statements, where x and y again are any two senses previously processed and D~ and D~+ 1 are the current forests after i and i q- 1 senses, respectively: a) if x and y are at the same node in D,., then they are at the same or adjacent nodes in Di+l; b) if x is a distance d above y in Di, then x is a distance d or d -q- 1 above y in D~+I; c) if x is a distance d above y in Di+l, then x is a distance d or d-- 1 above y in D~; d) x and y are not on the same path in Di iff they are not on the same path in Di+l.</Paragraph> <Paragraph position="3"> A D-forest is a highly predictive structure; i.e., a judgment involving one of its word senses will in general predict the outcome of many other judgments. A detailed estimation of the efficiency of construction would involve the order of the senses, the configuration of the final structure, and the node-scanning strategy used in the discovery algorithm. It is sufficient here to recall, however, that nodes correspond to linguistic features, and hence one would expect that if the number of senses processed were very large, then the number of nodes would be much smaller. Thus all the nodes would appear relatively early in the construction process, and this leads to the conclusion that the total number of judgments would be a nearly linear function of the number of senses. In order to detect inconsistencies, however, some of this efficiency would have to be sacrificed and redundant judgments taken.</Paragraph> <Paragraph position="4"> To sum up this section, the series of structure classes considered for representing compatibility has led to one class, the D-forests, which is constructible. These structures also have several important properties in common with other models in theoretical and computational linguistics. null</Paragraph> </Section> <Section position="16" start_page="0" end_page="0" type="metho"> <SectionTitle> CONSTRUCTIBLE REPRESENTATIONS FOR TWO SEMANTIC RELATIONS 43 5. CONCLUSION </SectionTitle> <Paragraph position="0"> This paper has presented two examples of the use of constructibility as a design criterion for lexical or semantic structures of the type used in natural language processing. This criterion is essential if we ever hope to implement computational systems with broad linguistic competence. It is very encouraging that in the two cases considered the approach has led to structures basically equivalent to those which have actually been used. Although these previous models have generally lacked the property of constructibility, then, it is suggested that this is a fault of design which can be overcome.</Paragraph> </Section> class="xml-element"></Paper>