<?xml version="1.0" standalone="yes"?> <Paper uid="W04-1504"> <Title>Axiomatization of Restricted Non-Projective Dependency Trees through Finite-State Constraints that Analyse Crossing Bracketings</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 2 Prerequisites </SectionTitle> <Paragraph position="0"> We define the string representation for D-trees with axioms given as extended regular expressions. The intersection of the languages described by these axioms is the set of valid representations. In the following, we define the alphabets and regular expressions used in the axioms.</Paragraph> <Paragraph position="1"> Assume that a3 is the set of category labels for the arcs, and that a0, a1, and a2 are parameters specifying upper bounds for the proper bracketing depth, nested crossing depth, and non-projectivity depth, respectively. These depth measures will be defined precisely in the appropriate context.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.1 Alphabets </SectionTitle> <Paragraph position="0"> In Figure 2, several different kinds of symbols are involved. They belong to the following alphabets:</Paragraph> <Paragraph position="2"> The union of these alphabets is denoted by a46. The union of the alphabets a26 a27 and a26a40a42 is a26.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.2 Regular Expressions </SectionTitle> <Paragraph position="0"> Let a9 and a5 be sets of strings, and a47 a positive integer. In the axioms we will use the following regular operations: Kleene closure (a9a49a48), finite iteration</Paragraph> <Paragraph position="2"> with this precedence order. The semantics of these operations is defined in the usual way. Parentheses (a31a58a33) are used for grouping. 
The boxed dot a59 denotes the language a31a58a46a60a52a60a19 #a24a61a33a48.</Paragraph> <Paragraph position="3"> A context restriction of a center a62 in contexts</Paragraph> <Paragraph position="5"> is a regular operation where a62 is a subset of a46a64a48 and each context a63 a12, a0a65a13 a17a11a13a66a47, is of the form a67 a12 a68 a12, where a67 a12a37 a68 a12a15a69 a46a70a48. The operation is expressed using the notation</Paragraph> <Paragraph position="7"> and it defines the set of all strings a76 a36 a46 a48 such that, for every possible a2 a37a78a77 a36 a46 a48 and a39a79a36 a62, for which</Paragraph> <Paragraph position="9"> The axioms in this paper produce a set of quite small automata, and the satisfiability and usability of this lazy finite-state system for choosing a representation have been tested with real D-trees. (After these tests, the axioms presented here have undergone some editing that hopefully has not introduced typos or bugs.)</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.3 The Basic Structure </SectionTitle> <Paragraph position="0"> Axiom 1. (a) The string begins and ends with a node boundary (#). (b) Between every two boundaries there exists at least one word token or the wall</Paragraph> <Paragraph position="2"> are always separated by a node boundary.</Paragraph> <Paragraph position="3"> Axiom 2. (a) There are no two similar square brackets in a node. (b) The color indices a17 of closing brackets increase monotonically when we move from a right bracket towards the closest node boundary (#) on the left.</Paragraph> <Paragraph position="4"> Axiom 3. (a) All the labels a39a81a36 a26 belong to some surrounding bracket. (b) Each left (right) bracket has some label that is attached to it.</Paragraph> <Paragraph position="5"> Axiom 4. (a) Angle brackets a9a7a6a83a55a86a9a11a8 do not have more than one label attached to them. 
(b) Within each node, no angle bracket a36a12 (a34a87a12) occurs inside a square bracket [a12 (]a12) having the same color a17. Axiom 5. (a) The wall (a14a15a14a16a14a16a14a16a14) and all the word tokens</Paragraph> <Paragraph position="7"> the color selectors a88 a36 a16 occur after a word token or the wall.</Paragraph> <Paragraph position="8"> Axiom 6. (a) There is one and only one ungoverned node. (b) All other nodes are governed by some node. (c) No node depends immediately on more than one other node. (d) No node depends on itself. These axioms are presented more formally as the following regular expressions:</Paragraph> <Paragraph position="10"/> </Section> </Section> </Paper>