<?xml version="1.0" standalone="yes"?>
<Paper uid="W00-1307">
  <Title>A Uniform Method of Grammar Extraction and Its Applications</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Various grammar frameworks have been proposed for natural languages. We take Lexicalized Tree-adjoining Grammars (LTAGs) as representative of a class of lexicalized grammars. LTAGs (Joshi et al., 1975) are appealing for representing various phenomena in natural languages because of their linguistic and computational properties. In the last decade, LTAG has been used in several aspects of natural language understanding (e.g., parsing (Schabes, 1990; Srinivas, 1997), semantics (Joshi and Vijay-Shanker, 1999; Kallmeyer and Joshi, 1999), and discourse (Webber and Joshi, 1998)) and a number of NLP applications (e.g., machine translation (Palmer et al., 1998), information retrieval (Chandrasekar and Srinivas, 1997), and generation (Stone and Doran, 1997; McCoy et al., 1992)). This paper describes a system that extracts LTAGs from annotated corpora (i.e., Treebanks).</Paragraph>
    <Paragraph position="1"> Much work has been done on extracting context-free grammars (CFGs) (Shirai et al., 1995; Charniak, 1996; Krotov et al., 1998). However, extracting LTAGs is more complicated than extracting CFGs because the two formalisms differ in several respects.</Paragraph>
    <Paragraph position="2"> First, the primitive elements of an LTAG are lexicalized tree structures (called elementary trees), not context-free rules (which can be seen as trees with depth one). Therefore, an LTAG extraction algorithm needs to examine a larger portion of a phrase structure to build an elementary tree. Second, the composition operations in LTAG are substitution (the same as the one in a CFG) and adjunction. It is the operation of adjunction that distinguishes LTAG from all other formalisms. Third, unlike in CFGs, the parse trees (also known as derived trees in LTAG) and the derivation trees (which describe how elementary trees are combined to form parse trees) are different in the LTAG formalism, in the sense that a parse tree can be produced by several distinct derivation trees. Therefore, to provide training data for statistical LTAG parsers, an LTAG extraction algorithm should also build derivation trees.</Paragraph>
    <Paragraph position="3"> For each phrase structure in a Treebank, our system creates a fully bracketed phrase structure, a set of elementary trees and a derivation tree. The data produced by our system have been used in several NLP tasks. We report experimental results on two of those applications and compare our approaches with related work.</Paragraph>
  </Section>
</Paper>