<?xml version="1.0" standalone="yes"?> <Paper uid="W06-1614"> <Title>Is it Really that Difficult to Parse German?</Title> <Section position="4" start_page="111" end_page="112" type="intro"> <SectionTitle> 2 Grammatical Features of German </SectionTitle> <Paragraph position="0"> Three distinctive grammatical features make syntactic annotation and parsing of German particularly challenging: the placement of the finite verb, flexible phrase ordering, and the presence of discontinuous constituents. These features are discussed in the following subsections.</Paragraph> <Section position="1" start_page="111" end_page="111" type="sub_section"> <SectionTitle> 2.1 Finite Verb Placement </SectionTitle> <Paragraph position="0"> In German, the placement of finite verbs depends on the clause type. In non-embedded assertion clauses, the finite verb occupies the second position in the clause, as in (1a). In yes/no questions, as in (1b), the finite verb appears clause-initially, whereas in embedded clauses it appears clause-finally, as in (1c).</Paragraph> <Paragraph position="1"> (1) a. Peter wird das Buch gelesen haben. b. Wird Peter das Buch gelesen haben? c. ... dass Peter das Buch gelesen haben wird. '... that Peter will have read the book.' Regardless of the particular clause type, any cluster of non-finite verbs, such as gelesen haben in (1a) and (1b) or gelesen haben wird in (1c), appears at the right periphery of the clause.</Paragraph> <Paragraph position="2"> The discontinuous positioning of the verbal elements in verb-first and verb-second clauses is the traditional reason for structuring German clauses into so-called topological fields (Drach, 1937; Erdmann, 1886; Höhle, 1986). The positions of the verbal elements form the Satzklammer (sentence bracket), which divides the sentence into a Vorfeld (initial field), a Mittelfeld (middle field), and a Nachfeld (final field). The Vorfeld and the Mittelfeld are divided by the linke Satzklammer (left sentence bracket), which is realized by the finite verb or (in verb-final clauses) by a complementizer field.
The rechte Satzklammer (right sentence bracket) is realized by the verb complex and consists of verbal particles or sequences of verbs.</Paragraph> <Paragraph position="3"> This right sentence bracket is positioned between the Mittelfeld and the Nachfeld. Thus, the theory of topological fields states the fundamental regularities of German word order.</Paragraph> <Paragraph position="4"> The topological field structures in (2) for the examples in (1) illustrate the assignment of topological fields for different clause types.</Paragraph> <Paragraph position="5"> (2) a. [VF [NP Peter]] [LK wird] [MF [NP das Buch]] [RK [VC gelesen haben.]] b. [LK Wird] [MF [NP Peter] [NP das Buch]] [RK [VC gelesen haben?]] c. [LK [CF dass]] [MF [NP Peter] [NP das Buch]] [RK [VC gelesen haben wird.]] (2a) and (2b) are made up of the following fields: LK (for: linke Satzklammer) is occupied by the finite verb. MF (for: Mittelfeld) contains adjuncts and complements of the main verb. RK (for: rechte Satzklammer) is realized by the verbal complex (VC). Additionally, (2a) realizes the topological field VF (for: Vorfeld), which contains the sentence-initial constituent. The left sentence bracket (LK) in (2c) is realized by a complementizer field (CF) and the right sentence bracket (RK) by a verbal complex (VC) that contains the finite verb wird.</Paragraph> </Section> <Section position="2" start_page="111" end_page="112" type="sub_section"> <SectionTitle> 2.2 Flexible Phrase Ordering </SectionTitle> <Paragraph position="0"> The second noteworthy grammatical feature of German concerns its flexible phrase ordering. In (3), any of the three complements and adjuncts of the main verb (ge)lesen can appear sentence-initially. b. Gestern hat der Mann den Roman gelesen. c.
Den Roman hat der Mann gestern gelesen. In addition, the ordering of the elements that occur in the Mittelfeld is also free, so that there are two possible linearizations for each of the examples in (3a)-(3c), yielding a total of six distinct orderings for the three complements and adjuncts. Due to this flexible phrase ordering, the grammatical functions of constituents in German, unlike in English, cannot be deduced from the constituents' location in the tree. As a consequence, parsing approaches to German need to be based on treebank data that combine constituent structure with grammatical functions, both for parsing and for evaluation.</Paragraph> </Section> <Section position="3" start_page="112" end_page="112" type="sub_section"> <SectionTitle> 2.3 Discontinuous Constituents </SectionTitle> <Paragraph position="0"> A third characteristic feature of German syntax that poses a challenge for syntactic annotation and for parsing is the treatment of discontinuous constituents. 'Peter is said to have recommended to the man to read the novel.' (4) shows an extraposed relative clause that is separated from its head noun den Roman by the non-finite verb gelesen. (5) is an example of an extraposed non-finite VP complement that forms a discontinuous constituent with its governing verb empfohlen because of the intervening non-finite auxiliary haben. Such discontinuous structures occur frequently in both treebanks and are handled differently in the two annotation schemes, as will be discussed in more detail in the next section.</Paragraph> </Section> </Section></Paper>