<?xml version="1.0" standalone="yes"?>
<Paper uid="E95-1034">
  <Title>Integrating &quot;Free&quot; Word Order Syntax and Information Structure</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> In this paper, I present a categorial formalism, Multiset CCG (based on Combinatory Categorial Grammars (Steedman, 1985; Steedman, 1991)), that captures the syntax and context-dependent interpretation of &quot;free&quot; word order in languages such as Turkish. Word order variation in relatively free word order languages, such as Czech, Finnish, German, Japanese, Korean, and Turkish, is used to convey distinctions in meaning that go beyond traditional propositional semantics. The word order in these languages serves to structure the information being conveyed to the hearer, e.g. by indicating the topic and the focus of the sentence (as will be defined in the next section).</Paragraph>
    <Paragraph position="1"> In fixed word order languages such as English, these are indicated largely through intonation and stress rather than word order.</Paragraph>
    <Paragraph position="2"> The context-appropriate use of &quot;free&quot; word order is of considerable importance in developing practical applications in natural language generation, machine translation, and machine-assisted translation. I have implemented a database query system in Prolog, described in (Hoffman, 1994), which uses Multiset CCG to interpret and generate Turkish sentences with context-appropriate word orders. Here, I concentrate on further developing the formalism, especially to handle complex sentences.</Paragraph>
    <Paragraph position="3"> *I would like to thank Mark Steedman, Ellen Prince, and the support of NSF Grant SBR 8920230.</Paragraph>
    <Paragraph position="4"> There have been other formalisms that integrate information structure into the grammar for &quot;free&quot; word order languages, e.g. (Sgall et al., 1986; Engdahl/Vallduvi, 1994; Steinberger, 1994).</Paragraph>
    <Paragraph position="5"> However, I believe my approach is the first to tackle complex sentences with embedded information structures and discontinuous constituents. Multiset CCG can handle free word order among arguments and adjuncts in all clauses, as well as word order variation across clause boundaries, i.e. long distance scrambling. The advantage of using a combinatory categorial formalism is that it provides a compositional and flexible surface structure, which allows syntactic constituents to correspond easily with information structure units. A novel characteristic of this approach is that the context-appropriate use of word order is captured by compositionally building the predicate-argument structure (AS) and the information structure (IS) of a sentence in parallel. After presenting the motivating Turkish data in Section 2, I present a competence grammar for Turkish in Section 3 that captures the basic syntactic and semantic relationships between predicates and their arguments or adjuncts while allowing &quot;free&quot; word order. This grammar, which derives the predicate-argument structure, is then integrated with the information structure in Section 4. In Section 5, the formalism is extended to account for complex sentences and long distance scrambling.</Paragraph>
  </Section>
</Paper>