File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/96/c96-2139_intro.xml

Size: 8,106 bytes

Last Modified: 2025-10-06 14:06:02

<?xml version="1.0" standalone="yes"?>
<Paper uid="C96-2139">
  <Title>Full-text processing: improving a practical NLP system based on surface information within the context</Title>
  <Section position="4" start_page="824" end_page="825" type="intro">
    <SectionTitle>
2 Framework
</SectionTitle>
    <Paragraph position="0"> 1. Generating a context model that consists of parsed trees of each sentence in a source text 2. Refining the context model by assigning a single unified parse tree to each sentence in the text 3. Resolving the problems in each sentence in the context model and generating a final analysis for each sentence in the text The respective procedures for these steps are described in the following three subsections.</Paragraph>
    <Section position="1" start_page="824" end_page="824" type="sub_section">
      <SectionTitle>
2.1 Generation of a simple context model
</SectionTitle>
      <Paragraph position="0"> In order to refer to context information that consists of data on multiple sentences in a text, it is essential to construct some context model; the first step of the full-text processing method is therefore to construct a context model by analyzing each sentence in an input text. To avoid any errors that may occur during transformation into any other representations, such as a logical representation, we stayed with surface structures, and to preserve the robustness of this framework, we used only a set of parsed trees as a context model. Thus, each sentence of an input text is processed by a syntactic parser in the first step, and the position of each instance of every lemma, its morphological information, and its modifiee-modifier relationships with other content words are extracted from the parser output and stored to construct a context model, as shown in Figure 1. In addition, if any on-line knowledge resources are available, information extracted from the resources is also stored in the context model. For example, information on synonyms extracted from an on-line thesaurus dictionary and information on word sense and structural disambiguation extracted from an example base, such as one described in (Uramoto, 1991) and (Nagao, 1990), may be added to the context model.</Paragraph>
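The stored information described above can be pictured with a minimal Python sketch. It is an illustration only, not the data structure of the system in the paper: the class name ContextModel, its field names, and the input format (lemma and part-of-speech pairs plus index pairs for modifier-modifiee links) are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ContextModel:
    """Hypothetical container for surface information taken from parsed trees."""
    # lemma -> list of (sentence index, word index) positions
    positions: dict = field(default_factory=lambda: defaultdict(list))
    # lemma -> set of part-of-speech tags observed for that lemma
    morphology: dict = field(default_factory=lambda: defaultdict(set))
    # (modifier lemma, modifiee lemma) -> number of occurrences
    relations: dict = field(default_factory=lambda: defaultdict(int))

    def add_sentence(self, sent_idx, words, links):
        """Store one parsed sentence.

        words: list of (lemma, pos_tag) for the content words (assumed format).
        links: list of (modifier_index, modifiee_index) into words.
        """
        for word_idx, (lemma, tag) in enumerate(words):
            self.positions[lemma].append((sent_idx, word_idx))
            self.morphology[lemma].add(tag)
        for modifier_idx, modifiee_idx in links:
            pair = (words[modifier_idx][0], words[modifiee_idx][0])
            self.relations[pair] += 1


model = ContextModel()
model.add_sentence(
    0,
    [("open", "VERB"), ("large", "ADJ"), ("file", "NOUN")],
    [(1, 2), (2, 0)],
)
model.add_sentence(1, [("save", "VERB"), ("file", "NOUN")], [(1, 0)])
```

Keeping plain counts of surface relations, rather than a logical form, mirrors the choice stated in the text to stay with surface structures; the counting itself is an assumption of this sketch.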
    </Section>
    <Section position="2" start_page="824" end_page="825" type="sub_section">
      <SectionTitle>
2.2 Refinement of the context model
</SectionTitle>
      <Paragraph position="0"> In the first step, a syntactic parser may not always generate a single unified parse tree for each sentence in the source text. A syntactic parser with general grammar rules is often unable to analyze not only sentences with grammatical errors and ellipses, but also long sentences, owing to their complexity.[1] Thus, it is indispensable to establish a correct analysis for such a sentence. Information extracted from complete parses of well-formed sentences[2] in a context model can be used to complete incomplete parses, in the form of partially parsed chunks that a bottom-up parser outputs for ill-formed sentences, by using a previously described method (Nasukawa, 1995). [Footnote 1: In texts from a restricted domain, such as computer manuals, most sentences are grammatically correct. However, even a well-established syntactic parser usually fails to generate a unified parsed structure for about 10 to 20 percent of all the sentences in such texts, and the failure in syntactic analysis leads to a failure in the final output.]</Paragraph>
      <Paragraph position="1"> On the other hand, for some sentences in a text, such as Time flies like an arrow, a syntactic parser may generate more than one parse tree, owing to the presence of words that can be assigned to more than one part of speech, or to the presence of complicated coordinate structures, or for various other reasons. In attempting to select the correct parse of such a sentence, one can use the types of the previous and subsequent sentences or phrases (such as sentence, noun phrase, verb phrase, and so on) and the modifier-modifiee patterns in the context model.</Paragraph>
      <Paragraph position="2"> Therefore, in the second step, the context model generated in the first step is refined by referring to information in the context model. First, the most preferable candidate parses are selected for sentences with multiple parses by referring to information on each sentence in the context model for which a parser generated a single unified parse. Then, partial parses of ill-formed sentences are completed by referring to information on well-formed sentences in the context model.</Paragraph>
      <Paragraph position="3"> [Footnote 2: In this paper, a "well-formed sentence" means one that is parsed as one or more than one unified structure, and an "ill-formed sentence" means one that cannot be parsed as a unified structure.]</Paragraph>
      <Paragraph position="4"> The algorithm for multiple parse selection based on the context model is as follows: 1. In each candidate parse of a sentence with multiple candidate parses, assign a score for each modifier-modifiee relationship that is found in the context model, and add up the scores to assign a preference value to the candidate parse.</Paragraph>
      <Paragraph position="5"> 2. Select the parse or parses with the highest preference value. If more than one parse has the highest preference value, go to the next step with those parses; otherwise, leave this procedure.</Paragraph>
      <Paragraph position="6"> 3. Assign a preference value to each remaining candidate parse that has the same type of root node (such as noun phrase, verb phrase, or sentence) as the parse of the preceding sentence or the next sentence.</Paragraph>
      <Paragraph position="7"> 4. Select the parse or parses with the highest preference value. If more than one parse has the highest preference value, go to the next step with those parses; otherwise, leave this procedure.</Paragraph>
      <Paragraph position="8"> 5. Assign a preference value to each remaining candidate parse based on heuristic rules that assign scores to structures according to their grammatical preferability. 6. Select the parse or parses with the highest preference value. If more than one parse has the highest preference value, select the first parse in the list of the remaining candidate parses.</Paragraph>
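The six selection steps above form a cascade of tie-breakers, which can be sketched in Python as follows. This is a hedged illustration, not the paper's implementation: the function names keep_best and select_parse, the dictionary layout of a candidate parse, the use of relation counts as scores, and the score of 1 for a matching root type are all assumptions, since the text does not specify concrete score values.

```python
def keep_best(candidates, score):
    """Return the candidates sharing the highest score, in their original order."""
    scores = [score(c) for c in candidates]
    best = max(scores)
    return [c for c, s in zip(candidates, scores) if s == best]


def select_parse(candidates, context_relations, neighbor_roots, heuristic_score):
    """Pick one parse from a non-empty list of candidate parses.

    candidates: list of dicts with keys "root" and "relations" (assumed layout).
    context_relations: dict mapping (modifier, modifiee) to a count.
    neighbor_roots: set of root node types of the preceding and next sentences.
    heuristic_score: function giving a grammatical preference score to a parse.
    """
    # Steps 1-2: sum a score for every relation that is found in the context model.
    remaining = keep_best(
        candidates,
        lambda c: sum(context_relations.get(rel, 0) for rel in c["relations"]),
    )
    if len(remaining) == 1:
        return remaining[0]
    # Steps 3-4: prefer parses whose root type matches a neighbouring sentence.
    remaining = keep_best(remaining, lambda c: 1 if c["root"] in neighbor_roots else 0)
    if len(remaining) == 1:
        return remaining[0]
    # Steps 5-6: apply heuristic preferences; a remaining tie goes to the first parse.
    return keep_best(remaining, heuristic_score)[0]


candidates = [
    {"name": "like-as-verb", "root": "S", "relations": [("fly", "like"), ("arrow", "like")]},
    {"name": "fly-as-verb", "root": "S", "relations": [("time", "fly"), ("arrow", "fly")]},
]
chosen = select_parse(candidates, {("time", "fly"): 2}, {"S"}, lambda c: 0)
```

In this usage example the context model has seen the pair (time, fly) elsewhere in the text, so the reading in which fly is the verb is chosen at the first stage and the later tie-breakers are never consulted.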
      <Paragraph position="9"> The procedure of completing partial parses of an ill-formed sentence consists of two steps: 1. Inspecting and restructuring of each partial parse: The part of speech and the modifiee-modifier relationships with other words are inspected for each word in a partial parse. If the part of speech and the modifiee-modifier relationships with other words are different from those in the context model, the partial parse is restructured according to the information in the context model.</Paragraph>
      <Paragraph position="10"> 2. Joining of partial parses: If the partial parses were not unified into a single structure in the previous step, they are joined together on the basis of modifier-modifiee relationship patterns in the context model so that a unified parse is obtained.</Paragraph>
    </Section>
    <Section position="3" start_page="825" end_page="825" type="sub_section">
      <SectionTitle>
2.3 Problem resolution for each sentence in the context model
</SectionTitle>
      <Paragraph position="0"> Finally, in the third step, each sentence in the context model is analyzed individually, and its ambiguities and context-dependent problems are resolved by referring to information on other sentences in the context model. The next section describes the procedures for problem resolution, and explains their effectiveness in improving machine translation output.</Paragraph>
    </Section>
  </Section>
</Paper>