<?xml version="1.0" standalone="yes"?>
<Paper uid="C92-1008">
  <Title>ON TEXT COHERENCE PARSING Udo Hahn Albert-Ludwigs-Universität Freiburg Linguistische Informatik / Computerlinguistik</Title>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 MOTIVATING THE NEED FOR TEXT
COHERENCE PARSING
</SectionTitle>
    <Paragraph position="0"> The model of text structure parsing we propose draws a careful distinction between text cohesion and text coherence phenomena. To illustrate text cohesion mechanisms in natural language texts, consider the following text passage: [1] The Delta-X from ZetaMachines Inc. is a computer system that runs Unix V.3.</Paragraph>
    <Paragraph position="1"> [2] The system is based on a 68020 processor.</Paragraph>
    <Paragraph position="2"> [3] It has a 12-inch monochrome display and an integrated telephone handset and built-in modem.</Paragraph>
    <Paragraph position="3"> [4] Internally, there's a 40-megabyte hard disk, a 1.2-megabyte 5 1/4-inch floppy disk drive, 4.5 megabytes of RAM, three RS-232C ports, and an ST-506 port.</Paragraph>
    <Paragraph position="4"> Repeated occurrences of various text cohesion phenomena are illustrated by nominal anaphora ('The system' in [2]) and pronominal anaphora ('It' in [3]), both referring to the unique antecedent Delta-X (in [1]), while 'Internally, there's a ... hard disk' (in [4]) is linked to Delta-X via textual ellipsis. The basic cohesion among these sentences yields the common thematic background for constantly elaborating on a single topic (Delta-X). An appropriate text parser should, first of all, recognize these multiple cohesion phenomena and produce something like the following representation structures (indicated by [...]R):
      Delta-X &lt; external storage devices: { 1.2-megabyte 5 1/4-inch floppy disk drive } &gt;
      Delta-X &lt; main memory: { 4.5 megabytes of RAM } &gt;
      Delta-X &lt; ports: { 3 RS-232C ports } &gt;
      Delta-X &lt; ports: { ST-506 port } &gt;
      ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992 26
      What is still lacking is a representation facility which characterizes this sequence of single assertions constantly referring to a single topic (Delta-X) as constituting a coherent whole. Recognizing linguistic forms of text coherency and providing appropriate thematic grouping operators for text knowledge bases is what text coherence parsing mainly is about. Even if parsers perfectly recognized and normalized all occurrences of text cohesion phenomena in texts, missing recognition capabilities for text coherence phenomena would nevertheless produce under-structured, incoherent text knowledge bases in the sense that global pragmatic indicators of discourse bracketing would be lacking.</Paragraph>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 BASIC TEXT COHERENCE PATTERNS
</SectionTitle>
    <Paragraph position="0"> In this section, we informally describe the basic patterns of text coherence focused on in this paper. According to Daneš [1974], three categories of thematic development can be distinguished: Constant Theme. This pattern is characterized by the constant elaboration of one specific topic within a text (passage) by considering several of its conceptual facets. The following two paragraphs serve to illustrate this major pattern of thematic progression (the reference points to the constant theme (Delta-X) are indicated by italics): [T1.1]. The Delta-X from ZetaMachines Inc. is a multiuser, multitasking computer system that runs Unix V.3 and comes complete with most of the software needed for business applications. The combination host computer/workstation is based on a 68020 processor, with dual 68000 processors providing peripheral processing. It has a 12-inch monochrome display and an integrated telephone handset and built-in modem.</Paragraph>
    <Paragraph position="1"> Internally, there's a 40-megabyte hard disk, a 1.2-megabyte 5 1/4-inch floppy disk drive, 4.5 megabytes of RAM, a network controller, three RS-232C ports, and an ST-506 port.</Paragraph>
    <Paragraph position="2"> Continuous Thematization of Rhemes. In contrast to constant themes, this pattern realizes a continuous shift of topics (visualized by bold italics). The process starts with a theme and some comment on that theme which we shall call rheme (actually, an elaboration on one of its conceptual facets). Now this rheme is focused on as the next theme that is elaborated by a corresponding rheme, etc.: [T1.2]. The $12,000 Delta-X host/workstation can be supplied from ZetaMachines Inc., 2999 State St., Santa Barbara, CA 93105. ZetaMachines' sales manager, Brian Wilson, says that they also plan to market the Gamma-Z, a CAD/CAM workstation based on a Connection Machine architecture. The underlying theoretical foundations are due to D. Hillis, a former M.I.T. student who first developed an experimental prototype based on connectionist principles.</Paragraph>
    <Paragraph position="3"> Derived Theme. Global text structure can also be introduced by a variety of topics which share conceptual commonalities (facets) at the knowledge representation level (this need not necessarily be paralleled by properties actually mentioned in the text!) without the general concept being explicitly stated in the text. Technically this is realized by a set of subordinates or instances of a common (only implicit) superordinate/prototype. Suppose the illustrative text [T1], composed of its two constituent parts from above, [T1.1] and [T1.2], is augmented by several paragraphs dealing with Gamma-Z and Sigma-P machines on a similar level of detail as those passages which consider the Delta-X in [T1]: [T2]. The Delta-X from ZetaMachines ... [T1.1, T1.2] The Gamma-Z is a MS-DOS machine. Peripheral devices include an 8-inch color display, a matrix printer, and a keyboard ....</Paragraph>
    <Paragraph position="4"> The Sigma-P system makes available a lot of desirable application software such as a database system, word processing, and a variety of games ....</Paragraph>
    <Paragraph position="5"> This text implicitly has workstation as a derived theme, since that is the immediate prototype concept of those three instances (Delta-X, Gamma-Z, Sigma-P) explicitly mentioned in [T2].</Paragraph>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 THE KNOWLEDGE SOURCES
INVOLVED IN TEXT PARSING
</SectionTitle>
    <Paragraph position="0"> This section deals with the knowledge sources involved in actually parsing a text. Basically (see Figure 1), these are constituted by the PARSE BULLETIN, a blackboard-type memory which records the single events of the parsing process, the DOMAIN KNOWLEDGE BASE, which contains the domain-specific background knowledge needed for the parse, and various EXPERTs for actually driving the parse through the text grammar specifications they incorporate (cf. Hahn [1990] for a more comprehensive presentation).</Paragraph>
    <Paragraph position="1"> The PARSE BULLETIN has a flat list structure.</Paragraph>
    <Paragraph position="2"> It records the sequence of text tokens as they appear in the text and, if relevant (see below), notes their class identifiers (FRAME item, ADJective, etc.). More important, constructive parsing activities based on operations of the knowledge base and the parser are indicated at several positions (so-called parse points) in the PARSE BULLETIN. The type of operation being performed is indicated by a particular parse descriptor.</Paragraph>
    <Paragraph position="3"> Some are internal to the management of the knowledge base, e.g., DEFACF (default concept activation), while others indicate grammatical relations recognized by the parser, such as NounATT (conceptual attribution relations between nouns) and AdjATT (conceptual attribution relations between adjectives and nouns). The items affected by an operation form a so-called parse tuple.</Paragraph>
    <Paragraph position="4"> The parser does not consider every token it receives from the input text at the same level of detail.</Paragraph>
    <Paragraph position="5"> Instead, it distinguishes between words which are significant to its performance (conceptually relevant ones, such as nouns or adjectives which denote concepts in the domain knowledge base, or linguistically relevant ones, such as negation particles, certain conjunctions, quantifiers, etc.), and those that are not (among them a wide variety of semantically indifferent nouns, verbs, particles, etc., each of which is assigned the class identifier NIL). The latter are simply discarded from further analysis, while the former are assigned lexicalized grammar specifications. The parser has thus been tuned towards partial parsing in a spirit similar to that advocated by Schank et al. [1980] and achieves text understanding primarily on a terminological level of knowledge representation.</Paragraph>
    <Paragraph position="7"> The DOMAIN KNOWLEDGE BASE (KB for short) contains frame representation structures.</Paragraph>
    <Paragraph position="8"> Each frame identifier (in bold face) is assigned a list of slots (enclosed by angle brackets). These slots are associated with two different kinds of slot fillers. Permitted slot fillers are enclosed in square brackets, [a-frame name], which characterizes the range of possible slot fillers by all those frames which are a subordinate or an instance of frame name. Actual slot fillers are enclosed in curly braces and can be taken as facts either known a priori to the system or acquired continuously from the text as its understanding proceeds during the parse.</Paragraph>
    <Paragraph position="9"> In addition, each concept has attached to it an activation weight counter. The values of the weight factors are enclosed by vertical bars attached to each item; if no bars explicitly occur, a zero weight is assumed.</Paragraph>
    <Paragraph position="10"> Activation weights are incremented (starting from zero-level activation) whenever a noun denoting its associated concept occurs in the text, and whenever structure-building operations in KB affect that concept. The manipulation of activation weights serves several purposes, the major one being their use as an indicator of salience of concepts during the text condensation phase, during which text summaries are generated from the text representation structures resulting from the text parse [Reimer &amp; Hahn 1988].</Paragraph>
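    <Paragraph> The frame structures just described (slots with a permitted range, actual fillers, and an activation weight per concept) can be pictured with a small data structure. The following Python sketch is only an illustration under stated assumptions: the class and method names (Slot, Frame, KnowledgeBase, mention, fill) are hypothetical, frames are assumed to have at most one superordinate, and each relevant event is assumed to raise the weight by exactly one. None of this is taken from the original system.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Slot:
    """A slot with its permitted range and the fillers actually known."""
    permitted: str                       # frame whose subordinates/instances may fill the slot
    actual: set = field(default_factory=set)


@dataclass
class Frame:
    name: str
    parent: Optional[str] = None         # superordinate frame (assumption: at most one)
    slots: dict = field(default_factory=dict)
    weight: int = 0                      # activation weight, zero unless stated otherwise


class KnowledgeBase:
    def __init__(self):
        self.frames = {}

    def add(self, frame):
        self.frames[frame.name] = frame

    def is_a(self, name, ancestor):
        """True if name is ancestor itself or a subordinate/instance of it."""
        while name is not None:
            if name == ancestor:
                return True
            name = self.frames[name].parent
        return False

    def mention(self, name):
        """A noun denoting the concept occurred in the text."""
        self.frames[name].weight += 1

    def fill(self, frame_name, slot_name, filler):
        """Slot filling; refused unless the filler lies in the permitted range."""
        slot = self.frames[frame_name].slots[slot_name]
        if not self.is_a(filler, slot.permitted):
            return False
        slot.actual.add(filler)
        # A structure-building operation affects the concept, so its weight grows.
        self.frames[frame_name].weight += 1
        return True
```

With a frame Delta-X whose CPU slot permits subordinates of processor, filling the slot with 68020 succeeds and raises the weight of Delta-X, while an attempt to fill it with a display frame is refused.</Paragraph>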
    <Paragraph position="11"> The text grammar is composed of a set of distributed grammar experts, each one responsible for some specific linguistic function (e.g., concept attribution via nominal, adjectival or prepositional phrases, anaphora).</Paragraph>
    <Paragraph position="12"> Each expert is characterized by a unique EXPERT NAME and is activated by a message event, i.e., by receiving a message text which may contain some parameters. In order to check its competence in contributing to the parse, pre-conditions composed of complex test predicates are evaluated. If these pre-conditions hold for that expert, the post-conditions immediately apply, i.e. messages are sent to qualified actors (to other grammar experts, to the domain KB or to the bulletin).</Paragraph>
  </Section>
  <Section position="7" start_page="0" end_page="0" type="metho">
    <SectionTitle>
5 A DISTRIBUTED MODEL OF TEXT
COHERENCE PARSING
</SectionTitle>
    <Paragraph position="0"> In this paper, we shall not go into the details of phrasal, clausal, and text cohesion parsing (cf. Hahn [1989] for an in-depth consideration of related technical issues).</Paragraph>
    <Paragraph position="1"> Instead, we assume that these preliminary activities have already been carried out properly and that some initial structural representation is already available from the bulletin. These requirements are fulfilled in the snapshot of the PARSE BULLETIN in Figure 1, taken after all local parsing events have terminated; this characterizes a state ready to turn to the activation of global text structure computing experts.</Paragraph>
    <Paragraph position="2"> We here consider the end of the paragraph (denoted by the symbol 0 and the class identifier EOP) as an anchoring point for coherence computation. It is motivated by the observation that, at least in the sublanguage domain we are currently working in, major topic movements occur predominantly at paragraph boundaries. This coincides with linguistic evidence for the (text)grammatical status of paragraphs [Hinds 1979, Giora 1983b, and Zadrozny &amp; Jensen 1991]. Therefore, the proper recognition of textual macro structures is always initialized at the end of a paragraph.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.1 Considering Constant Theme
</SectionTitle>
      <Paragraph position="0"> Constant theme is a coherence pattern which is characterized by multiple occurrences of a single frame in the PARSE BULLETIN within one paragraph. Most of its occurrences, in turn, are accompanied by a slot and/or slot filler indicating that some knowledge base operation with respect to frame has been carried out in KB (e.g., slot filling as indicated by NounATT or AdjATT, for which we shall introduce the LC* descriptor as a convenient shorthand notation). It is the continuous elaboration of that particular concept that makes the corresponding text passage coherent. While the bulletin maintains the sequential order of these operations, KB provides the conceptual background for continuous references to the same frame object.</Paragraph>
      <Paragraph position="1"> Figure 2 visualizes the description for constant theme; the DOMAIN KNOWLEDGE BASE window displays all properties of frame dealt with in a text (passage) in the shadowed area of the frame box, while those not mentioned in the text are in the remaining white part. Consequently, it is neither necessary that all slots of a frame available in the knowledge base be referred to in the text (as with slot n+1, ..., slot m), nor that there be any ordering constraint relating single slots of a frame in KB to the sequence of slot filling operations in the PARSE BULLETIN.</Paragraph>
      <Paragraph position="2">  The general pattern from Figure 2 is already present in Figure 1. This contains a description of the parsing results of the first paragraph of text [T1.1]. The entries in the PARSE BULLETIN have been worked out by experts for linguistic phenomena on the local level of phrasal, sentence and text cohesion analysis.</Paragraph>
      <Paragraph position="3"> For the purpose of constant theme computation, we need only consider those entries whose parse descriptor designates manipulations of slots or slot values of some frame (LC*-type descriptors, such as NounATT or AdjATT). Other descriptors are irrelevant here and have been left out on purpose in Figure 1. From this we construct the set THEMES. It consists of triples ( frame, slot, bullpos ) where frame is the name of a frame, and slot is the name of a slot of that frame, both co-occurring as lexical parameters of some parse tuple in the PARSE BULLETIN with a LC*-type parse descriptor; bullpos gives the parse point in the PARSE BULLETIN where frame and slot occur instantaneously. With respect to Figure 1, THEMES is given by: [...] When considering THEMES, we want the criterion for constant theme to be specified in a way that accounts for the fact that up to parse point '037' each slot (value) manipulation refers to one particular theme (Delta-X). Between parse point '039' and '046' there is a minor thematic distortion in that there is no proper reference to that theme, although slots are mentioned which are associated with other concepts. However, from parse point '046' onward the already established theme is taken up again till the end of the paragraph. In conclusion, Delta-X seems to be a proper candidate for consideration as a constant theme of that paragraph.¹ Figure 1 provides a snapshot of the pre-conditions that are encountered by the CT EXPERT, the coherence expert for ConstantTheme. Running twice, supplied with different parameters, it works out the results alluded to above. The grammatical knowledge needed for the determination of a constant theme is incorporated in its pre-condition part. This expression is evaluated TRUE iff constant-theme produces some theme and an associated non-empty set RHEMES related to theme; otherwise it is FALSE.
The conditions for a constant theme can now be stated more precisely:</Paragraph>
      <Paragraph position="5">
        (a) testpos &lt; textpos &amp;
        (b) ( textpos, 0, EOP ) is in the PARSE BULLETIN² &amp;
        (c) ( prepos, 0, EOP ) is also in the PARSE BULLETIN such that prepos &lt; textpos and such that no other triple with '0' as text item intervenes between prepos and textpos in the PARSE BULLETIN &amp;
        (d) newpos ∈ [max( prepos, testpos )+1, textpos-1] &amp;
        (e) theme is a frame in the DOMAIN KNOWLEDGE BASE &amp;
        (f) ∀ k_i ∈ [max( prepos, testpos )+1, newpos-1]: ( theme, slot, k_i ) ∈ THEMES ⟹ slot ∈ RHEMES &amp;
        (g) ¬∃ k' ∈ [max( prepos, testpos )+1, newpos-1]:
            (α) alt_theme (distinct from theme) is a frame in the DOMAIN KNOWLEDGE BASE &amp;
            (β) ( alt_theme, slot', k' ) ∈ THEMES &amp;
            (γ) ¬∃ ts_k' ∈ THEMES: ts_k' = ( theme, slot, k' ) &amp;
        (h) |RHEMES| &gt; 2 &amp;
        (i) newpos is maximal in the sense that [...]

        ¹ [...] specification given below only holds for the specific sample text referred to throughout this paper. Instead, it should indicate that, although the basic idea of thematic progression patterns is overwhelmingly simple, real-life texts tend to be less homogeneous with respect to these patterns than one may consider under clean laboratory conditions. Thus, formal descriptions have to be inherently robust towards such local forms of digressions.
        ² References to entries in the PARSE BULLETIN have the format ( ParsePoint, ParseTuple, ParseDescriptor ).</Paragraph>
      <Paragraph position="6"> Some comments related to this specification: (a) The parameters supplied to constant-theme span the spatial extension in PARSE BULLETIN which is searched for a constant theme; textpos always denotes the end of the current paragraph, i.e. the upper bound of the search area, while testpos delimits its lower bound.</Paragraph>
      <Paragraph position="7">  (b) The parse point characterized by textpos must contain the end-of-paragraph symbol 0.</Paragraph>
      <Paragraph position="8"> (c) Since testpos may be any arbitrary parse point preceding textpos, prepos denotes the parse point in PARSE BULLETIN that contains the end-of-paragraph symbol occurring right before the one on parse point textpos.</Paragraph>
      <Paragraph position="9"> (d) After fixing the search interval in the bulletin for which a constant theme is going to be computed, newpos allows for various choices as to how far a constant theme may actually extend in that interval. (e) theme may be any frame from KB.</Paragraph>
      <Paragraph position="10"> (f) A theme is related to its various rhemes according to the following condition: at each bulletin position (k) where theme occurs in THEMES within the interval delimited by newpos, its associated slot (single rheme) is assigned to the set RHEMES.</Paragraph>
      <Paragraph position="11"> (g) To guarantee that theme is the only topic dealt with in the text, we also require that no alt_theme different from theme occur in the chosen interval such that it also forms part of THEMES; (γ) accounts for more complicated cases where both alt_theme and theme may occur at the same parse point. (h) To rule out insignificant occurrences of theme, the cardinality of RHEMES must exceed a certain level.</Paragraph>
      <Paragraph position="12"> (i) The maximality criterion for newpos rules out choosing too small values of newpos.</Paragraph>
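      <Paragraph> Conditions (a) to (i) can be read as a search over candidate values of newpos. The following Python sketch is one possible rendering, not the original implementation. It rests on several assumptions: THEMES is a set of (frame, slot, bullpos) triples, eop_positions is a hypothetical set of parse points holding an end-of-paragraph entry, prepos defaults to 0 for the first paragraph, every first component of a THEMES triple is taken to be a KB frame (so condition (e) is not checked separately), the threshold of condition (h) is exposed as a parameter, and the candidate theme is passed in by the caller. The helper ct_expert mirrors the repeated activation of CT EXPERT with testpos set to the previous newpos.

```python
def constant_theme(themes, eop_positions, textpos, testpos, theme, min_rhemes=2):
    """Return (theme, rhemes, newpos) or None, following conditions (a)-(i)."""
    # (a) lower bound precedes the paragraph end; (b) textpos holds an EOP entry
    if not (textpos > testpos and textpos in eop_positions):
        return None
    # (c) prepos: closest end-of-paragraph entry before textpos (0 if there is none)
    prepos = max((p for p in eop_positions if textpos > p), default=0)
    lower = max(prepos, testpos) + 1
    # (d) and (i): try the largest admissible newpos first
    for newpos in range(textpos - 1, lower - 1, -1):
        # triples whose parse point lies in [lower, newpos-1]
        window = [t for t in themes if newpos > t[2] >= lower]
        theme_points = {k for f, _, k in window if f == theme}
        # (g): reject if some parse point carries a competing frame but not theme
        if any(f != theme and k not in theme_points for f, _, k in window):
            continue
        # (f): collect the slots elaborated for theme
        rhemes = {s for f, s, _ in window if f == theme}
        # (h): require enough distinct rhemes
        if len(rhemes) > min_rhemes:
            return theme, rhemes, newpos
    return None


def ct_expert(themes, eop_positions, textpos, theme):
    """Rerun constant_theme until the paragraph has been covered."""
    groups, testpos = [], 0
    while True:
        hit = constant_theme(themes, eop_positions, textpos, testpos, theme)
        if hit is None:
            break
        groups.append(hit)
        newpos = hit[2]
        if newpos + 1 >= textpos:
            break
        testpos = newpos          # resume the search behind the interruption
    return groups
```

Given a bulletin in which Delta-X slots are filled up to parse point 37, a competing frame alone occupies parse point 39, and Delta-X is taken up again from parse point 46 to the paragraph end at 55, ct_expert yields two groups ending at newpos 39 and 54, which is the behaviour the specification is meant to capture.</Paragraph>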
      <Paragraph position="13"> Let us now consider an example of the computation processes involved in actual coherence parsing (see Figure 1).</Paragraph>
      <Paragraph position="14"> Various coherence experts start execution upon consumption of the 0 symbol (indicating the end of a paragraph) by the administration expert of the parser, but we shall limit our attention to CT EXPERT (since the others will eventually starve). After receiving Check-CT( EOP, 055, 000 ) as its starting message, constant-theme is supplied with initial parameters: textpos = 055, testpos = 000. Obviously, prepos = 000, since the analysis starts for the first paragraph of the text. newpos may now range from '001' to '054'. Let us consider Delta-X as theme. (This is a proper choice. If improper choices were made, constant-theme would not produce a significant result.) The choice for newpos must accommodate the temporary breakdown of the selected theme beginning from position '039', since we have k' = 039 ∈ [001, 054] with alt_theme = 68000-1 (or 68000-2) in THEMES and no proper triple ( Delta-X, slot, 039 ) as required by condition g(γ) above. So newpos has to be adjusted properly to the parse point '039', at which point the constant theme pattern for Delta-X eventually terminates for the first time. This produces:</Paragraph>
      <Paragraph position="16"> constant-theme( 055, 000 ) = ( Delta-X, {manufacturer, usage mode, operating mode, operating system, application domain, CPU, processors}, 039 ) and CT EXPERT issues a CT-Group reading to KB incorporating the constant theme together with its associated rhemes. Since the PARSE BULLETIN has not exhaustively been investigated with respect to its coherence data (newpos+1 &lt; textpos), CT EXPERT resumes execution, now starting with a second set of parameters: textpos = 055, testpos = 039 (see the second expert placed into the foreground in Figure 1). Again, prepos = 000, but due to the new testpos parameter newpos is now in the interval [40, 54]. The evaluation of constant-theme( 055, 039 ) starts with a proper choice of newpos = 054. testpos+1 excludes 68000-1 (68000-2) from further consideration. Finally, we obtain constant-theme( 055, 039 ) = ( Delta-X, {i/o devices, peripheral devices, communication devices}, 054 ). Note that the occurrence of display-1 at parse point '046' does not conflict with criterion (g), since we also have Delta-X (thematically related to i/o devices and peripheral devices) at that parse point (cf. criterion g(γ)). Since the end of the paragraph has been reached, the coherence computation process halts.</Paragraph>
      <Paragraph position="17"> Figure 3 represents the effects of grouping a constant theme and the rhemes referred to in the text passage (cf. [055.1] and [055.2]) by the shadowed area of the (frame) box. This indicates that the grouped items are treated coherently in a text passage.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.2 Remarks on Continuous Thematization of Rhemes and Derived Theme
</SectionTitle>
      <Paragraph position="0"> Similarly, formal descriptions have been worked out for the other two basic text coherence patterns mentioned above. Instead of a full treatment, we give two rather informal sketches of the underlying regularities as they have been incorporated into our framework. Continuous thematization of rhemes most significantly departs from the constant theme schema just outlined (in fact, both are mutually exclusive) in that the former incorporates a continuous shift of the topics being considered. Figure 4 illustrates this permanent change of issues in a text. The PARSE BULLETIN contains a sequence of local theme-rheme pairs with frameTi being the current local theme and slotfillerTi being its associated local rheme. Text coherence is due to the fact that the current local rheme (slotfillerTi) becomes the next local theme (frameTi+1). This rheme-specific connectivity criterion is stressed by the double-sided black arrows in the DOMAIN KNOWLEDGE BASE which link the immediately preceding rheme to its identical theme successor, while local theme-rheme connections are indicated by the one-sided grey arrows which go from the local theme to its associated local rheme.</Paragraph>
    </Section>
  </Section>
  <Section position="8" start_page="0" end_page="0" type="metho">
    <SectionTitle>
</SectionTitle>
    <Paragraph position="0"> A sequence of local theme-rheme pairs fulfilling the rheme-specific connectivity criterion in terms of overlapping parameters (current rheme becomes next theme) constitutes what is here called continuous thematization of rhemes, i.e. a global theme-rheme cluster.</Paragraph>
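    <Paragraph> The connectivity criterion (current rheme becomes next theme) lends itself to a simple linear scan. The Python sketch below is an informal illustration, not the formal description referred to in the paper: it assumes the local theme-rheme pairs are already available as an ordered list of (theme, rheme) tuples, and the function name rheme_chains and the parameter min_links (the minimum number of pairs that counts as a cluster) are hypothetical.

```python
def rheme_chains(pairs, min_links=2):
    """Split ordered (theme, rheme) pairs into maximal runs in which each
    rheme is taken up as the theme of the following pair."""
    chains, current = [], []
    for theme, rheme in pairs:
        # A pair whose theme is not the preceding rheme breaks the chain.
        if current and current[-1][1] != theme:
            if len(current) >= min_links:
                chains.append(current)
            current = []
        current.append((theme, rheme))
    if len(current) >= min_links:
        chains.append(current)
    return chains
```

For the sample passage [T1.2], the pairs Delta-X / ZetaMachines, ZetaMachines / Gamma-Z, Gamma-Z / Connection Machine architecture, and Connection Machine architecture / D. Hillis would form one such cluster.</Paragraph>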
    <Paragraph position="1">  [Figure 4 excerpt: Conn. Machine architecture - developer - D. Hillis] The third pattern further generalizes the results of the foregoing coherence computations on the paragraph level and extends them over various (adjacent) paragraphs and possibly over the whole text. Consider a series of paragraphs, each one dealing exclusively with one special topic (see Figure 5 below). The first paragraph deals with frameT1, the second one elaborates on frameT2, etc. A derived theme can be computed when all these different (sub)topics can be linked to the most specific general (super)topic (frameT). In technical terms, these subtopics are all instances of that supertopic. Text [T2] illustrates this phenomenon: there are three paragraphs whose major topics are Delta-X, Gamma-Z, and Sigma-P; a conceptual generalization step links them to the derived theme workstation. In Figure 5 this relationship is indicated by the arrows pointing from each subtopic (of a single paragraph) to its supertopic, thematically characterizing these paragraphs on a more general level of conceptualization.</Paragraph>
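    <Paragraph> Computing a derived theme amounts to finding the most specific frame that is a superordinate of every paragraph topic. The following Python sketch shows one way to do this under simplifying assumptions: the hierarchy is given as a hypothetical mapping parent_of from each frame to its single immediate superordinate, it contains no cycles, and the paragraph topics have already been determined (for instance as constant themes). The sample hierarchy entries other than the three workstation instances are invented for illustration.

```python
def ancestors(parent_of, frame):
    """Superordinates of frame, nearest first."""
    chain = []
    while frame in parent_of:
        frame = parent_of[frame]
        chain.append(frame)
    return chain


def derived_theme(parent_of, topics):
    """Most specific frame that is a superordinate of every topic, or None."""
    topics = list(topics)
    if not topics:
        return None
    # Walk upwards from the first topic; the first shared frame is the most specific.
    for candidate in ancestors(parent_of, topics[0]):
        if all(candidate in ancestors(parent_of, t) for t in topics[1:]):
            return candidate
    return None
```

With Delta-X, Gamma-Z and Sigma-P all recorded as instances of workstation, derived_theme returns workstation; for topics without any shared superordinate it returns None.</Paragraph>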
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.3 The Merits of Text Coherence Parsing
</SectionTitle>
      <Paragraph position="0"> Among the many advantages of having text coherence phenomena under computational control, we here emphasize their potential for information retrieval dialogs. Evidence for this comes from our experiments with TOPOGRAPHIC, an interactive graphical interface to TOPIC's text knowledge bases [Thiel &amp; Hammwöhner 1987]. In particular, we observed a close functional relationship between the selection of particular coherence patterns and particular search states during the retrieval process which is performed on network representations of text summaries, so-called text graphs: 1) Constant Theme coherently characterizes a variety of facts related to one particular topic. A CT-based search operation enhances the user's knowledge of that topic by presenting facets (or data related to those facets) the user is probably not aware of, although they may be relevant to the solution of his or her problem.</Paragraph>
    </Section>
  </Section>
  <Section position="9" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2) Continuous Thematization of Rhemes
</SectionTitle>
    <Paragraph position="0"> links a set of formerly unrelated topics by a coherent line of conceptual dependencies (current rheme becomes next theme).</Paragraph>
  </Section>
  <Section position="10" start_page="0" end_page="0" type="metho">
    <SectionTitle>
</SectionTitle>
    <Paragraph position="0"> A CTR-based search operation therefore provides the basis for thematic associations and stimulates previously unconsidered lines of reasoning by thematically constrained browsing.</Paragraph>
    <Paragraph position="1"> 3) Derived Theme groups hierarchically related topics and thus may enhance the knowledge of alternatives of the particular topic (and facts related to it) under focused attention of the user (by way of stimulating comparisons, recognizing information gaps, etc.).</Paragraph>
  </Section>
</Paper>