<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-2002">
  <Title>A Framework for Incorporating Alignment Information in Parsing</Title>
  <Section position="2" start_page="0" end_page="9" type="abstr">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Much of the current research into probabilistic parsing is founded on probabilistic context-free grammars (PCFGs) (Collins, 1999; Charniak, 2000; Charniak, 2001). For instance, consider the parse tree in Figure 1. One way to decompose this parse tree is to view it as a sequence of applications of CFG rules. For this particular tree, we could view it as the application of rule &amp;quot;NP → NP PP,&amp;quot; followed by rule &amp;quot;NP → DT NN,&amp;quot; followed by rule &amp;quot;DT → that,&amp;quot; and so forth. Hence instead of analyzing P(tree), we deal with the more modular:</Paragraph>
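    <Paragraph position="1"> The equation that stood here was lost in extraction; a reconstruction consistent with the surrounding text, writing r_1, ..., r_n for the sequence of CFG rule applications that builds the tree:

```latex
P(\mathit{tree}) = P(r_1, r_2, \ldots, r_n)
```
</Paragraph>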
    <Paragraph position="2"> Obviously this joint distribution is just as difficult to assess and compute with as P(tree). However, there exist cubic-time algorithms to find the most likely parse if we assume that all CFG rule applications are marginally independent of one another. In other words, we need to assume that the above expression is equivalent to the following:</Paragraph>
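    <Paragraph position="3"> The equation originally here is also missing from this extraction; under the independence assumption just stated, the joint factors as:

```latex
P(r_1, r_2, \ldots, r_n) = \prod_{i=1}^{n} P(r_i)
```
</Paragraph>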
    <Paragraph position="4"> It is straightforward to assess the probability of the factors of this expression from a corpus using relative frequency. Then, using these learned probabilities, we can find the most likely parse of a given sentence using the aforementioned cubic algorithms.</Paragraph>
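The relative-frequency estimation just described can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the rule encoding, function name, and toy corpus are invented. Each rule probability is the count of the rule divided by the count of its left-hand side.

```python
from collections import Counter

def estimate_pcfg(corpus):
    """Relative-frequency PCFG estimation: P(lhs -> rhs) = count(rule) / count(lhs).

    `corpus` is a list of parse trees, each represented as the sequence
    of (lhs, rhs) rule applications that derives it.
    """
    rule_counts, lhs_counts = Counter(), Counter()
    for tree in corpus:
        for lhs, rhs in tree:
            rule_counts[(lhs, rhs)] += 1
            lhs_counts[lhs] += 1
    return {rule: count / lhs_counts[rule[0]]
            for rule, count in rule_counts.items()}

# Toy corpus using the rules from Figure 1 (invented for illustration).
corpus = [
    [("NP", ("NP", "PP")), ("NP", ("DT", "NN")), ("DT", ("that",))],
    [("NP", ("DT", "NN")), ("DT", ("that",))],
]
probs = estimate_pcfg(corpus)
# NP appears 3 times as a left-hand side, so P(NP -> DT NN) = 2/3.
```

The estimates are maximum-likelihood under the independence assumption above; with these in hand, a cubic-time chart parser (e.g. CKY) can recover the highest-probability tree.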
    <Paragraph position="5"> The problem, of course, with this simplification is that although it is computationally attractive, it is usually too strong an independence assumption. To mitigate this loss of context without sacrificing algorithmic tractability, researchers typically annotate the nodes of the parse tree with contextual information. For instance, it has been found useful to annotate nodes with their parent labels (Johnson, 1998), as shown in Figure 2. In this case, we would be learning probabilities like P(PP-NP → IN-PP NP-PP). The choice of which annotations to use is one of the main features that distinguishes parsers based on this approach. Generally, this approach has proven quite effective in producing English phrase-structure grammar parsers that perform well on the Penn Treebank.</Paragraph>
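Parent annotation of the kind shown in Figure 2 can be expressed as a short tree transform. The sketch below is hypothetical (the tuple-based tree encoding and the example tree are not from the paper): each nonterminal label is suffixed with its parent's label, so an NP directly under a PP becomes NP-PP.

```python
def annotate_parents(tree, parent="TOP"):
    """Relabel each nonterminal with its parent category (Johnson, 1998).

    Trees are nested tuples (label, child, ...); leaves are plain strings.
    """
    if isinstance(tree, str):  # a leaf word is left unchanged
        return tree
    label, *children = tree
    return (f"{label}-{parent}",) + tuple(
        annotate_parents(child, label) for child in children
    )

# Hypothetical tree for illustration only.
t = ("NP",
     ("NP", ("DT", "that"), ("NN", "dog")),
     ("PP", ("IN", "in"), ("NP", ("NN", "boots"))))
annotated = annotate_parents(t)
# The inner NP becomes "NP-NP", the PP becomes "PP-NP", and so on.
```

Running relative-frequency estimation over such transformed trees yields the parent-conditioned rule probabilities discussed above, at the cost of splitting each category into several sparser ones.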
    <Paragraph position="6"> One drawback of this approach is that it is somewhat inflexible. Because we are adding probabilistic context by changing the data itself, we make our data increasingly sparse as we add features. Thus we are constrained from adding too many features, because at some point we will not have enough data to sustain them. Hence in this approach, feature selection is not merely a matter of including good features. Rather, we must strike a delicate balance between how much context we want to include versus how much we dare to partition our data set. This poses a problem when we have spent time and energy to find a good set of features that work well for a given parsing task on a given domain.</Paragraph>
    <Paragraph position="7"> For a different parsing task or domain, our parser may work poorly out of the box, and it is no trivial matter to evaluate how we might adapt our feature set for this new task. Furthermore, if we gain access to a new source of feature information, it is unclear how to incorporate such information into such a parser.</Paragraph>
    <Paragraph position="8"> Specifically, in this paper we are interested in seeing how the cross-lingual information contained in sentence alignments can help the performance of a parser. We have a small gold-standard corpus of shallow-parsed parallel sentences (in English, French, and German) from the Europarl corpus. Because of the difficulty of testing new features using PCFG-based parsers, we propose a new probabilistic parsing framework that allows us to flexibly add features. [Figure caption: entry (i, j) = true iff span (i, j) is a constituent in the tree.]</Paragraph>
    <Paragraph position="9"> The closest relative of our framework is the maximum-entropy parser of Ratnaparkhi (Ratnaparkhi, 1997). Both frameworks are bottom-up, but while Ratnaparkhi's views parse trees as the sequence of applications of four different types of tree construction rules, our framework strives to be somewhat simpler and more general.</Paragraph>
  </Section>
</Paper>