<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-0309">
  <Title>The information-processing difficulty of incremental parsing</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
2 Entropy Reduction
</SectionTitle>
    <Paragraph position="0"> The idea of the entropy reduction of a word is that uncertainty about grammatical continuations fluctuates as new words come in. The ERH is the proposal that fluctuations in this value be taken as psycholinguistic predictions. This proposal is founded on the possibility of viewing nonterminal symbols in probabilistic grammars as random variables. For instance, in the rules given below, 0.87 NP -> the boy 0.13 NP -> the tall boy the nonterminal NP can be viewed as a random variable that has two alternative outcomes. Indeed, nonterminals generally in probabilistic context-free phrase structure grammars (PCFGs) can be viewed this way. Since their outcomes are discrete, their entropy H is easily calculated</Paragraph>
    <Paragraph position="1"> H(NP) = -(0.87 log2 0.87 + 0.13 log2 0.13) ≈ 0.56 bits </Paragraph>
    <Paragraph position="2"> There is just over half a bit of uncertainty about how NP is going to rewrite, because the outcome is so heavily weighted towards the first alternative.</Paragraph>
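The figure of "just over half a bit" follows directly from the standard definition of discrete entropy. A minimal sketch (ordinary Shannon entropy, not code from the paper), using the two NP rule probabilities quoted above:

```python
import math

# Rule probabilities for the two NP rewrites quoted in the text
probs = [0.87, 0.13]

# Discrete entropy in bits: H = -sum_i p_i * log2(p_i)
H = -sum(p * math.log2(p) for p in probs)
print(round(H, 3))  # prints 0.557
```

The heavy weighting toward the first alternative is what keeps the value well below the 1 bit a fair coin flip would carry.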
    <Paragraph position="3"> In this simple example there is no recursion, so the generated language is finite. To obtain the uncertainty about infinite PCFG languages, a recursive relation due to Grenander (1967) can be used to calculate the entropy of the start symbol S which begins all derivations.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.1 Entropy of nonterminals in a PCFG
</SectionTitle>
      <Paragraph position="0"> Grenander's theorem is a recurrence relation that gives the entropy of each nonterminal in a PCFG G as the sum of two terms. Let the set of production rules in G be P and the subset rewriting nonterminal x be P(x). Denote by p_r the probability of a rule r having daughters x_j1, x_j2, .... Then</Paragraph>
      <Paragraph position="1"> H(x) = h(x) + sum over r in P(x) of p_r ( H(x_j1) + H(x_j2) + ... ), where h(x) = -sum over r in P(x) of p_r log2 p_r </Paragraph>
      <Paragraph position="2"> The first term, lowercase h, is simply the definition of entropy for a discrete random variable. The second term, uppercase H, is the recurrence. It expresses the intuition that derivational uncertainty is propagated from children to parents.</Paragraph>
      <Paragraph position="3"> For PCFGs that define a probability distribution, the solution to this recurrence can be written as the matrix equation H = (I - A)^-1 h, where I is the identity matrix, h is the vector of the h(x_i), and A is a matrix whose (i,j)th component gives the expected number of nonterminals of type j resulting from nonterminals of type i.</Paragraph>
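As a sketch of how the recurrence behaves, the following toy grammar (my own illustrative example, not one from the paper) solves H = h + A H, which is equivalent to H = (I - A)^-1 h, by fixed-point iteration rather than explicit matrix inversion:

```python
import math

def h(probs):
    # Entropy of the rule choice at one nonterminal, in bits
    return -sum(p * math.log2(p) for p in probs)

# Toy PCFG (an assumed example), nonterminals indexed S=0, NP=1, VP=2:
#   S  -> NP VP        (1.0)
#   NP -> the boy      (0.87) | the tall boy (0.13)
#   VP -> runs         (0.6)  | sleeps       (0.4)
hs = [h([1.0]), h([0.87, 0.13]), h([0.6, 0.4])]

# A[i][j]: expected number of j-type daughters per expansion of nonterminal i
A = [[0, 1, 1],
     [0, 0, 0],
     [0, 0, 0]]

# Iterate H <- h + A H; this converges when the PCFG is consistent
# (spectral radius of A below 1), matching H = (I - A)^-1 h
H = hs[:]
for _ in range(100):
    H = [hs[i] + sum(A[i][j] * H[j] for j in range(3)) for i in range(3)]

print(round(H[0], 3))  # prints 1.528: h(S) plus the NP and VP uncertainties
```

Because this grammar is non-recursive, the entropy of S is just the sum of its daughters' entropies; with recursive rules the same iteration (or the matrix inverse) still applies.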
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.2 Incomplete sentences
</SectionTitle>
      <Paragraph position="0"> Grenander's theorem supplies the entropy for any PCFG nonterminal in one step by inverting a matrix. To determine the contribution of a particular word, one would like to be able to look at the change in uncertainty about compatible derivations as a given prefix string is lengthened. When this set, the set of derivations generating a given string w = w_0 w_1 ... w_n as a left prefix, is finite, it can be expressed as a list. In the case of a recursive grammar this set is not finite and some other representation is necessary.</Paragraph>
      <Paragraph position="1"> Lang and Billot (1974; 1988; 1989) observe that the incremental state of a parser can be described by another, related grammar. They view parsing as the intersection of a grammar with a regular language, of which ordinary strings are but the simplest examples. This perspective readily accommodates incomplete sentences as regular languages whose members all have the same initial n words but continue with all possible words of the terminal vocabulary, for all possible lengths. If L(G) is the language of the grammar G, parsing an initial substring w is the intersection L(G) ∩ w .* where the period denotes any terminal symbol of G and the Kleene star indicates any number of repetitions.</Paragraph>
      <Paragraph position="3"> The result of this intersection is a new context-free grammar describing just the derivations whose yield begins with the string w. By generalizing the input from a single string to a regular set of strings, the grammatical continuations can be captured in the new, output grammar. These grammars are easily read off of chart parsers' internal data structures by attaching position indices to nonterminal names, thus distinguishing recognized constituents in different positions.</Paragraph>
      <Paragraph position="4"> The uncertainty associated with the start symbol of this new, resultant grammar is the conditional entropy H(S | w_1, w_2, ..., w_n). The entropy reduction of word w_{n+1} is then the downward change in this value as the string w is made one word longer.</Paragraph>
      <Paragraph position="5"> The proposal of the ERH is that these changes measure the disambiguation work the comprehender has performed by ruling out possible syntactic analyses.</Paragraph>
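When the set of compatible derivations is finite, the conditional entropies, and hence the per-word reductions, can be computed directly from the list. A small sketch under assumed sentence probabilities (an unambiguous toy language of my own, not one from the paper):

```python
import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Finite toy language (assumed probabilities): yield -> probability
language = {
    ("the", "boy", "runs"): 0.4,
    ("the", "tall", "boy", "runs"): 0.3,
    ("a", "dog", "sleeps"): 0.3,
}

def prefix_entropy(prefix):
    # Keep only derivations whose yield begins with the prefix, renormalize;
    # for this unambiguous language this equals H(S | w_1 ... w_n)
    compatible = {s: p for s, p in language.items() if s[:len(prefix)] == prefix}
    total = sum(compatible.values())
    return entropy([p / total for p in compatible.values()])

sentence = ("the", "boy", "runs")
H = prefix_entropy(())  # uncertainty before any word is seen
for n in range(len(sentence)):
    H_next = prefix_entropy(sentence[:n + 1])
    print(sentence[n], round(H - H_next, 3))  # entropy reduction per word
    H = H_next
```

Here "the" rules out one sentence and "boy" rules out another, so each carries a positive reduction, while "runs" eliminates nothing and reduces the entropy by zero, mirroring the disambiguation-work reading of the ERH.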
      <Paragraph position="6"> SUBJECT > DIR. OBJECT > INDIR. OBJECT > OBLIQUE > GENITIVE > OCOMP</Paragraph>
    </Section>
  </Section>
</Paper>