<?xml version="1.0" standalone="yes"?> <Paper uid="C00-1065"> <Title>Hypertags</Title> <Section position="2" start_page="446" end_page="447" type="metho"> <SectionTitle> 1 Brief Overview of LTAGs </SectionTitle> <Paragraph position="0"> An LTAG consists of a finite set of elementary trees of finite depth. Each elementary tree must &quot;anchor&quot; one or more lexical item(s). The principal anchor is called the &quot;head&quot;; other anchors are called &quot;co-heads&quot;. Every leaf of an elementary tree is either an &quot;anchor&quot;, a &quot;foot node&quot; (noted *) or a &quot;substitution node&quot; (noted ↓). These trees are of two types: auxiliary or initial [3]. A tree has at most one foot node. A tree with a foot node is an auxiliary tree. Trees that are not auxiliary are initial. Elementary trees combine by two operations, substitution and adjunction; we do not develop this point since it is orthogonal to our concern, and refer the reader to Joshi (87) for more details. Morphosyntactic features are encoded in atomic feature structures associated with nodes in elementary trees, in order to handle phenomena such as agreement.</Paragraph> <Paragraph position="1"> Moreover, linguistic constraints on the well-formedness of elementary trees have been formulated: * Predicate Argument Cooccurrence Principle: there must be a leaf node for each realized argument of the head of an elementary tree.</Paragraph> <Paragraph position="2"> * Semantic consistency: no elementary tree is semantically void. * Semantic minimality: an elementary tree corresponds to at most one semantic unit. Figure 1 shows a non-exhaustive set of supertags (i.e. elementary trees) which can be assigned to &quot;beats&quot; [4], which is a verb in trees α1 (canonical tree), α2 (object extraction), β1 (object relative) and β2 (subject relative), and a noun in tree α3. 
So an LTAG can be seen as a large dictionary in which, in addition to traditional POS, lexical entries are associated with several structures encoding their morphological as well as some of their syntactic properties, these structures being very similar to small constituent trees.</Paragraph> <Paragraph position="3"> The idea of underspecifying constituent trees (and thus elementary trees) is not new. Several solutions have been proposed in the past. We now investigate how these solutions could be used to encode a set of supertags in a compact manner.</Paragraph> <Section position="1" start_page="446" end_page="447" type="sub_section"> <SectionTitle> 2.1 Parse forest </SectionTitle> <Paragraph position="0"> Since elementary trees are constituent structures, one could represent a set of elementary trees with a graph instead of a tree (cf. Tomita (91)). This approach is not particularly interesting though. For example, if one considers the trees α1 and β1 from Figure 1, it is obvious that they hardly have any structural information in common, not even the category of their root.</Paragraph> <Paragraph position="1"> Therefore, representing these two structures in a graph would not help. Moreover, packed structures are notoriously difficult to manipulate and yield unreadable output.</Paragraph> </Section> <Section position="2" start_page="447" end_page="447" type="sub_section"> <SectionTitle> 2.2 Logical formulae </SectionTitle> <Paragraph position="0"> With this approach, developed for instance in Kallmeyer (99), a tree can be represented by a logical formula, where each pair of nodes is either in a relation of dominance or in a relation of precedence. This makes it possible to resort to first-order logic to represent a set of trees by underspecifying dominance and/or precedence relations. Unfortunately, this yields an output which is difficult to read. Also, the approach relies only on mathematical properties of trees (i.e. 
it has no linguistic motivation).</Paragraph> </Section> <Section position="3" start_page="447" end_page="447" type="sub_section"> <SectionTitle> 2.3 Linear types of trees </SectionTitle> <Paragraph position="0"> This approach, introduced in Srinivas (97) and used in other work (e.g. Halber (99)), is more specific to TAGs. The idea is to relax constraints on the order of nodes in a tree as well as on internal nodes. A linear type consists of a 7-tuple &lt;A,B,C,D,E,F,G&gt; where A is the root of the tree, B is the category of the anchor, C is the lexical anchor, D is a set of nodes which can receive an adjunction, E is a set of co-anchors, F is a set of nodes marked for substitution, and G is a potential foot node (or nil if the tree is initial). In addition, elements of E and F are marked - if they are to the left of the anchor, + if they are to the right.</Paragraph> <Paragraph position="2"> [Figure 2: two trees with the same linear type] For example, the tree N0donneN1àN2 for &quot;Jean donne une pomme à Marie&quot; (J. gives an apple to M.) and the tree N0donneàN2N1 for &quot;Jean donne à Marie une pomme&quot; (J. gives M. an apple), which are shown in Figure 2, yield the unique linear type (a): (a) &lt;S,V,donne,{S,V,PP},{à+},{N0-,N1+,N2+},nil&gt; (b) &lt;S,V,gives,{S,V,PP},{to+},{N0-,N1+,N2+},nil&gt; This approach is robust, but not really linguistic: it allows reference to trees that are not initially in the grammar. For instance, the linear type (b) will correctly allow the sentence &quot;John gives an apple to Mary&quot;, but will also incorrectly allow &quot;*John gives to Mary an apple&quot;. Moreover, linear types are not easily readable [5]. Finally, trees that have more structural differences than just the ordering of branches will yield different linear types. So, the tree N0giveN1toN2 (J. gives an apple to M.) yields the linear type (b), whereas the tree N0giveN2N1 (J. gives M. 
an apple) yields a different linear type (c), and thus both linear types should label &quot;gives&quot;. Therefore, it is impossible to label &quot;gives&quot; with one unique linear type.</Paragraph> <Paragraph position="3"> (c) &lt;S,V,gives,{S,V},{},{N0-,N1+,N2+},nil&gt;</Paragraph> </Section> <Section position="4" start_page="447" end_page="447" type="sub_section"> <SectionTitle> 2.4. Partition approach </SectionTitle> <Paragraph position="0"> This approach, which we have investigated, consists in building equivalence classes to partition the grammar; each lexical item then anchors one class instead of a set of trees. But building such a partition is prohibitively costly: a wide-coverage grammar for French contains approx. 5000 elementary trees (cf. Abeillé &amp; al.</Paragraph> <Paragraph position="1"> (99), (00b)), which means that we have 2^5000 possible subsets. Also, it does not work from a linguistic point of view: (a) Quand Jean a brisé la glace ? (When did J. break the ice?) (b) Jean a brisé la glace (J. broke the ice) (c) Quelle chaise Jean a brisé ce matin ? (Which chair did J. break this morning?) In (a), brisé potentially anchors N0briseN1 (canonical transitive), WhN0brise (object extraction) and N0BriseGlace (tree for the idiom). But in (b), we would like brisé not to anchor WhN0brise since there is no Wh element in the sentence; therefore these three trees should not belong to the same equivalence class. We can have ClassA={N0briseN1,N0BriseGlace} and ClassB={WhN0brise}. But then, in (c), brisé potentially anchors WhN0brise and N0briseN1 but not N0BriseGlace, since glace does not appear in the sentence. So N0briseN1 and N0BriseGlace should not be in the same equivalence class. This hints that the only realistic partition of the grammar would be the one where each class contains only one tree, which is pretty useless.</Paragraph> </Section> </Section> <Section position="3" start_page="447" end_page="451" type="metho"> <SectionTitle> 4. 
Exploiting a MetaGrammar </SectionTitle> <Paragraph position="0"> Candito (96), (99) has developed a tool to generate elementary trees semi-automatically. She uses an additional layer of linguistic description, called the metagrammar (MG), which imposes a general organization for syntactic information in a three-dimensional hierarchy: [Footnote 5: This type of format was considered as a step towards creating a treebank for French (cf. Abeillé &amp; al. (00a)), but unfortunately proved impossible to annotate manually.]</Paragraph> <Paragraph position="1"> * Dimension 1: initial subcategorization * Dimension 2: redistribution of functions and transitivity alternations * Dimension 3: surface realization of arguments, clause type and word order. Each terminal class in dimension 1 describes a possible initial subcategorization (i.e. a tree family). Each terminal class in dimension 2 describes a list of ordered redistributions of functions (e.g. it allows an argument to be added for causatives). Finally, each terminal class in dimension 3 represents the surface realization of a (final) function (e.g. cliticized, extracted ...). Each class in the hierarchy corresponds to the partial description of a tree (cf. Rogers &amp; Vijay-Shanker (94)). An elementary tree is generated by inheriting from one terminal class in dimension 1, from one terminal class in dimension 2 and from n terminal classes in dimension 3 (where n is the number of arguments of the elementary tree) [6]. The hierarchy is partially handwritten. Then crossings of linguistic phenomena (e.g. passive + extraction), terminal classes, and from there elementary trees are generated automatically offline. This yields a grammar which can then be used to parse online. 
When the grammar is generated, it is straightforward to keep track of the terminal classes each elementary tree inherited from: Figure 3 shows seven elementary trees which can supertag &quot;donne&quot; (gives), as well as the inheritance patterns [7] associated with each of these supertags. All the examples below refer to this figure.</Paragraph> <Paragraph position="2"> The key idea then is to represent a set of elementary trees by a disjunction for each dimension of the hierarchy. Therefore, a hypertag consists of three disjunctions (one for dimension 1, one for dimension 2 and one for dimension 3).</Paragraph> <Paragraph position="3"> The cross-product of the disjunctions can then be performed automatically, and from there the set of elementary trees referred to by the hypertag will be automatically retrieved. [Footnote 6: The idea of using the MG to obtain a compact representation of a set of supertags was briefly sketched in Candito (99) and Abeillé &amp; al. (99), by resorting to MetaFeatures, but the approach here is slightly different since only information about the classes in the hierarchy is used.]</Paragraph> <Paragraph position="4"> [Footnote 7: We call inheritance patterns the structure used to store all the terminal classes a tree has inherited from.]</Paragraph> <Paragraph position="5"> We will now illustrate this, first by showing how hypertags are built, and then by explaining how a set of trees (and thus of supertags) is retrieved from the information contained in a hypertag.</Paragraph> <Section position="1" start_page="448" end_page="449" type="sub_section"> <SectionTitle> 4.1 Building hypertags : a detailed example </SectionTitle> <Paragraph position="0"> Let us start with a simple example where we want &quot;donner&quot; to be assigned the supertags α1 (J. donne une pomme à M. / J. gives an apple to M.) and α2 (J. donne à M. une pomme / J. gives M. an apple). 
In Figure 3, one notices that these two trees inherit from exactly the same classes: the relative order of the two complements is left unspecified in the hierarchy, so one and the same description yields both trees. In this case, the hypertag is thus simply identical to the inheritance pattern of these two trees: Dim. 1: n0vn1(àn2); Dim. 2: no redistribution; Dim. 3: subj: nominal-canonical, obj: nominal-canonical, à-obj: nominal-canonical. Let us now add tree α3 (J. donne une pomme / J. gives an apple) to this hypertag. This tree had its second object declared empty in dimension 2 (thus it inherits only two terminal classes from dimension 3, since only two of its arguments are realized). The hypertag now becomes [8]: Dim. 1: n0vn1(àn2); Dim. 2: no redistribution OR à-obj-empty; Dim. 3: subj: nominal-canonical, obj: nominal-canonical, à-obj: nominal-canonical. Let us now add the tree β4 for the object relative to this hypertag. This tree has been generated by inheriting in dimension 3 from the terminal class &quot;nominal-inverted&quot; for its subject and from the class &quot;relativized-object&quot; for its object. This information is simply added to the hypertag, which now becomes: Dim. 1: n0vn1(àn2); Dim. 2: no redistribution OR à-obj-empty; Dim. 3: subj: nominal-canonical OR nominal-inverted, obj: nominal-canonical OR relativized-object, à-obj: nominal-canonical. Also note that for this last example the structural properties of β4 are quite different from those of α1, α2 and α3 (for instance, it has a root of category N and not S). But this has little importance since the generalization is made in linguistic terms without explicitly relying on the shape of trees.</Paragraph> <Paragraph position="1"> It is also clear that hypertags are built in a monotonic fashion: each supertag added to a hypertag just adds information. Hypertags allow each word to be labelled with a unique structure [9] 
and contain rich syntactic and functional information about lexical items (in our example, the word donne/gives). They are linguistically motivated, but also yield a readable output. They can be enriched or modified by human annotators, or easily fed to a parser or shallow parser. [Footnote 8: What has been added to a hypertag is shown in bold characters.]</Paragraph> <Paragraph position="2"> [Footnote 9: We presented a simple example for the sake of clarity, but traditional POS ambiguity is handled in the same way, except that disjunctions are then added in dimension 1 as well.]</Paragraph> </Section> <Section position="2" start_page="449" end_page="449" type="sub_section"> <SectionTitle> 4.2 Retrieving information from hypertags </SectionTitle> <Paragraph position="0"> Retrieving information from hypertags is straightforward. For example, to recover the set of supertags contained in a hypertag, one just needs to perform the cross-product between the three dimensions of the hypertag, as shown in Figure 4, in order to obtain all inheritance patterns. These inheritance patterns are then matched with the inheritance patterns contained in the grammar (i.e. the right column in Figure 3) to recover all the appropriate supertags.</Paragraph> <Paragraph position="1"> Inheritance patterns which are generated but do not match any existing tree in the grammar are simply discarded.</Paragraph> <Paragraph position="2"> We observe that the four supertags α1, α2, α3 and β4, which we had explicitly added to the hypertag in 4.1, are correctly retrieved. The supertags β5, β6 and β7 are also retrieved, although we never explicitly added them to the hypertag. 
But if a word can anchor the first four trees, then it will also necessarily anchor the last three: for instance, we had added to the hypertag the canonical tree without a second object realized (tree α3), as well as the tree for the object relative with a second object realized (tree β4), so it is expected that the tree for the object relative without a second object realized (tree β6) can be retrieved from the hypertag even though we never explicitly added it. In fact, the automatic crossing of disjunctions in the hypertag ensures consistency.</Paragraph> <Paragraph position="3"> Also note that no particular mechanism is needed in dimension 3 to handle arguments which are not realized: if à-obj-empty is inherited from dimension 2, then only subject and object will inherit from dimension 3 (since only arguments that are realized inherit from that dimension when the grammar is generated).</Paragraph> <Paragraph position="4"> Information can be modified at runtime in a hypertag, depending on the context of lexical items. For example, &quot;relativized-object&quot; can be suppressed in dimension 3 from the hypertag shown in Figure 4 if no Wh element is encountered in the sentence.</Paragraph> <Paragraph position="5"> Then, the correct set of supertags will still be retrieved from the hypertag by automatic crossing (that is, trees α1, α2 and α3), since the other inheritance patterns generated will not refer to any tree in the grammar (here, no tree inherits subject:nominal-inverted in dimension 3 without also inheriting object:relativized-object).</Paragraph> </Section> <Section position="3" start_page="449" end_page="451" type="sub_section"> <SectionTitle> 4.3 Practical use </SectionTitle> <Paragraph position="0"> We have seen that an LTAG can be seen as a dictionary in which each lexical entry is associated with a set of elementary trees. With hypertags, each lexical entry is now paired with one unique structure. 
Therefore, automatically hypertagging a text is easy and involves a simple dictionary lookup. The equivalent of finding the &quot;right&quot; supertag for each lexical item in a text (i.e. reducing ambiguity) then consists in dynamically removing information from hypertags (i.e. suppressing elements in disjunctions).</Paragraph> <Paragraph position="1"> This can be achieved by specific rules, which are currently being developed. The resulting output can then easily be manually annotated in order to build a gold-standard corpus: manually removing linguistically relevant pieces of information from a disjunction in a single structure is simpler than dealing with a set of trees. In addition to obvious advantages in terms of display (tree structures, especially when presented in a non-graphical way, are unreadable), the task itself becomes easier because topological problems are solved automatically: annotators need only answer questions such as &quot;does this verb have an extracted object?&quot; or &quot;is the subject of this verb inverted?&quot; to decide which terminal class(es) must be kept [10]. We believe that these questions are easier to answer than &quot;Which of these trees have a node N1 marked wh+ at address 1.1?&quot; (for an extracted object).</Paragraph> <Paragraph position="2"> Moreover, supertagged texts are difficult to use outside of an LTAG framework, contrary to hypertagged texts, which contain higher-level general linguistic information. An example would be searching and extracting syntactic data on a large scale: suppose one wants to extract all the occurrences where a given verb V has a relativized object. 
To do so on a hypertagged text simply involves performing a &quot;grep&quot; on all lines containing a V whose hypertag contains dimension 3 : obj:relativized-object, without knowing anything about the LTAG framework.</Paragraph> <Paragraph position="3"> Performing the same task with a supertagged text involves knowing how LTAGs encode relativized objects in elementary trees and scanning potential trees associated with V. Another example would be using a hypertagged text as input to a parser based on a framework other than LTAGs: for instance, information in hypertags could be used by an LFG parser to constrain the construction of an F-structure, whereas it is unclear how this could be achieved with supertags.</Paragraph> <Paragraph position="4"> [Footnote 10: This of course implies that one must be very careful in choosing evocative names for terminal classes.]</Paragraph> <Paragraph position="5"> The need to &quot;featurize&quot; supertags, in order to pack ambiguity and add functional information, has also been discussed for text generation in Danlos (98) and more recently in Srinivas &amp; Rambow (00). It would be interesting to compare their approach with that of hypertags.</Paragraph> <Paragraph position="6"> Conclusion: We have introduced the notion of hypertags.</Paragraph> <Paragraph position="7"> Hypertags make it possible to assign one unique structure to lexical items. Moreover, this structure is readable, linguistically and computationally motivated, and contains much richer syntactic information than traditional POS; thus a hypertagger would be a good candidate as the front end of a parser. It makes it possible in practice to build large annotated resources which are useful for extracting syntactic information on a large scale, without being dependent on a given grammatical formalism.</Paragraph> <Paragraph position="8"> We have shown how hypertags are built and how information can be retrieved from them. 
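The building and retrieval steps recalled here can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the implementation used for the grammar described in this paper: the toy grammar stands in for Figure 3 (the inheritance pattern listed for beta6 is assumed), class names are ASCII simplifications of those used in 4.1, and REALIZED, pattern, build_hypertag, expand and retrieve are hypothetical names.

```python
from itertools import product

CANON = "nominal-canonical"
INVERTED = "nominal-inverted"
RELATIVIZED = "relativized-object"

# Assumption: functions realized in dimension 3 for each dimension-2 class.
REALIZED = {
    "no-redistribution": ("subj", "obj", "a-obj"),
    "a-obj-empty": ("subj", "obj"),
}


def pattern(dim1, dim2, dim3):
    """Return a hashable inheritance pattern for one elementary tree."""
    return (dim1, dim2, frozenset(dim3.items()))


# Hypothetical toy grammar: tree name mapped to its inheritance pattern.
# alpha1 and alpha2 share one pattern, as stated in section 4.1.
GRAMMAR = {
    "alpha1": pattern("n0vn1(an2)", "no-redistribution",
                      {"subj": CANON, "obj": CANON, "a-obj": CANON}),
    "alpha2": pattern("n0vn1(an2)", "no-redistribution",
                      {"subj": CANON, "obj": CANON, "a-obj": CANON}),
    "alpha3": pattern("n0vn1(an2)", "a-obj-empty",
                      {"subj": CANON, "obj": CANON}),
    "beta4": pattern("n0vn1(an2)", "no-redistribution",
                     {"subj": INVERTED, "obj": RELATIVIZED, "a-obj": CANON}),
    # Assumed pattern for the object relative without a second object.
    "beta6": pattern("n0vn1(an2)", "a-obj-empty",
                     {"subj": INVERTED, "obj": RELATIVIZED}),
}


def build_hypertag(patterns):
    """Merge inheritance patterns monotonically into one disjunction per dimension."""
    hyper = {"dim1": set(), "dim2": set(), "dim3": {}}
    for dim1, dim2, dim3 in patterns:
        hyper["dim1"].add(dim1)
        hyper["dim2"].add(dim2)
        for function, realization in dim3:
            hyper["dim3"].setdefault(function, set()).add(realization)
    return hyper


def expand(hyper):
    """Cross the three disjunctions and yield every candidate inheritance pattern."""
    for dim1, dim2 in product(sorted(hyper["dim1"]), sorted(hyper["dim2"])):
        functions = REALIZED[dim2]
        choices = [sorted(hyper["dim3"].get(f, ())) for f in functions]
        for combo in product(*choices):
            yield pattern(dim1, dim2, dict(zip(functions, combo)))


def retrieve(hyper, grammar):
    """Keep the trees whose pattern was generated; other candidates are discarded."""
    candidates = set(expand(hyper))
    return sorted(name for name, pat in grammar.items() if pat in candidates)
```

With this toy grammar, building a hypertag from the patterns of alpha1, alpha3 and beta4 and calling retrieve on it also returns alpha2 and beta6. Discarding relativized-object from the obj disjunction before retrieval leaves only alpha1, alpha2 and alpha3, which mirrors the behaviour described in 4.2.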
Further work will investigate how hypertags can be combined directly.</Paragraph> </Section> </Section> </Paper>