<?xml version="1.0" standalone="yes"?> <Paper uid="P85-1014"> <Title>New Approaches to Parsing Conjunctions Using Prolog</Title> <Section position="2" start_page="122" end_page="124" type="metho"> <SectionTitle> 2. Identical Leading Substrings. </SectionTitle> <Paragraph position="0"> The second case occurs when the two (non-empty) difference lists have identical leading non-empty substrings. Then the conjoined string is identical to the concatenation of that leading substring with the linearization of the rest of the two difference lists. For example, consider the linearization of the two fragments &quot;likes Mary&quot; and &quot;likes Jill&quot; as shown in fig. 13.</Paragraph>
<Paragraph position="1"> {likes Mary, likes Jill} which can be linearized as: {likes X} where X is the linearization of the strings {Mary, Jill}. Fig. 13: Example of identical leading substrings. 3. Conjoining.</Paragraph>
<Paragraph position="2"> The last case occurs when the two pairs of (non-empty) difference lists have no common leading substring. Here, the conjoined string will be the concatenation of the conjunction of one of the pairs from the candidate set with the conjoined string resulting from the linearization of the two strings with their respective candidate substrings deleted. For example, consider the linearization of the two sentences &quot;John likes Mary&quot; and &quot;Bill likes Jill&quot; as shown in fig. 14: {John likes Mary, Bill likes Jill}. Given that the selected candidate pair is {John, Bill}, the conjoined string would be the conjunction of the pair {John, Bill} followed by the linearization of the strings {likes Mary, likes Jill}. There are some implementation details that differ between parsing and generating (see appendix A). However, the three cases are the same for both. We can illustrate the above definition by showing what linearizations the system would produce for an example sentence. Consider the sentence &quot;John and Bill liked Mary&quot; (fig. 15): {John and Bill liked Mary} would produce the strings:</Paragraph>
<Paragraph position="3"> {John and Bill liked Mary, John and Bill liked Mary} with candidate set {}</Paragraph>
<Paragraph position="4"> {John liked Mary, Bill liked Mary} with candidate set {(John, Bill)} {John Mary, Bill liked Mary} with candidate set {(John, Bill liked)} All of the strings are then passed to the predicate findequivalences, which should pick out the second pair of strings as the only grammatically correct linearization.</Paragraph>
<Section position="1" start_page="122" end_page="124" type="sub_section"> <SectionTitle> Finding Equivalences </SectionTitle>
<Paragraph position="0"> Goodall's definition of equivalence was that two terminal strings were said to be equivalent if they had the same left and right contexts. Furthermore, we had previously asserted that the equivalent pairs could be produced without searching the whole RPM. For example, consider the equivalent terminal strings in the two sentences &quot;Alice saw Bill&quot; and &quot;Mary saw Bill&quot;:</Paragraph>
<Paragraph position="1"> {Alice saw Bill, Mary saw Bill} would produce the equivalent pairs: {Alice saw Bill, Mary saw Bill} * If there exist two terminal strings X and Y such that X = χxΩ and Y = χyΩ, then χ and Ω should be the strongest possible left and right contexts respectively, provided x and y are both nonempty.
In the above example, χ = nil and Ω = &quot;saw Bill&quot;, so the first and the third pairs produced are redundant.</Paragraph>
<Paragraph position="2"> In general, a pair of terminal strings is redundant if it has the form (uv, uw) or (uv, zv), in which case it may be replaced by the pair (v, w) or (u, z) respectively.</Paragraph>
<Paragraph position="3"> * In Goodall's definition any two terminal strings themselves are also a pair of equivalent terminal strings (when χ and Ω are both null). We exclude this case as it produces simple string concatenation of sentences.</Paragraph>
<Paragraph position="4"> The above restrictions imply that in fig. 16 the only remaining equivalent pair ({Alice, Mary}) is the correct one for this example.</Paragraph>
<Paragraph position="5"> However, before finding equivalent pairs for two simple sentences, the process of finding equivalences must check that the two sentences are actually grammatical. We assume that a recognizer/parser (e.g. a predicate parse(S E)) already exists for determining the grammaticality of simple sentences. Since the process only requires a yes/no answer to grammaticality, any parsing or recognition system for simple sentences can be used.</Paragraph>
<Paragraph position="6"> We can now specify a predicate findcandidates(X Y S1 S2) that holds when {X, Y} is an equivalent pair from the two grammatical simple sentences {S1, S2} as follows (fig. 17): findcandidates(X and Y in S1 and S2) if parse(S1 nil) and parse(S2 nil) and equiv(X Y S1 S2) where equiv is defined as:</Paragraph>
<Paragraph position="8"> if append3(Chi X Omega X1) and terminals(X) and append3(Chi Y Omega Y1) and terminals(Y) where append3(L1 L2 L3 L4) holds when L4 is equal to the concatenation of L1, L2 and L3. terminals(X) holds when X is a list of terminal symbols only. Fig. 17: Logic definition of Findcandidates. Then the predicate findequivalences is simply defined as (fig. 18): findequivalences(X and Y in S1 and S2) if findcandidates(X and Y in S1 and S2) and not redundant(X Y) where redundant implements the two restrictions described. Fig. 18: Logic definition of Findequivalences. Comparison with MSGs. The following table (fig. 19) gives the execution times in milliseconds for the parsing of some sample sentences mostly taken from Dahl and McCord [1983]. Both systems were executed using Dec-20 Prolog. The times shown for the MSG interpreter are based on the time taken to parse and build the syntactic tree only; the time for the subsequent transformations was not included.</Paragraph>
<Paragraph position="9"> Sample sentences | MSG system | RPM device
Each man ate an apple and a pear | 662 | 292
John ate an apple and a pear | 613 | 233
A man and a woman saw each train
Each man and each woman ate an apple
John saw and the woman heard a man that laughed
John drove the car through and completely demolished a window
The woman who gave a book to John and drove a car through a window laughed
John saw the man that Mary saw and Bill gave a book to laughed
John saw the man that heard the woman that laughed and saw Bill
The man that Mary saw and heard gave an apple to each woman
John saw a and Mary saw the red
From the timings we can conclude that the proposed device is comparable to the MSG system in terms of computational efficiency. However, there are some other advantages such as: * Transparency of the grammar: there is no need for phrasal rules such as &quot;S -> S and S&quot;. The device also allows non-phrasal conjunction.</Paragraph>
<Paragraph position="10"> * Since no special grammar or particular phrase marker representation is required, any parser can be used; the device only requires an accept/reject answer. * The specification is not biased with respect to parsing or generation. The implementation is reversible, allowing it to generate any sentence it can parse and vice versa.</Paragraph>
<Paragraph position="11"> * Modularity of the device. The grammaticality of sentences with conjunction is determined by the definition of equivalence. For instance, if needed we can filter the equivalent terminals using semantics. A Note on SYSCONJ. It is worthwhile to compare the phrase marker approach to the ATN-based SYSCONJ mechanism. Like SYSCONJ, our analysis is extragrammatical: we do not tamper with the basic grammar, but add a new component that handles conjunction. Unlike SYSCONJ, our approach is based on a precise definition of &quot;equivalent phrases&quot; that attempts to unify under one analysis many different types of coordination phenomena. SYSCONJ relied on a rather complicated, interrupt-driven method that restarted sentence analysis in some previously recorded machine configuration, but with the input sequence following the conjunction. This captures part of the &quot;multiple planes&quot; analysis of the phrase marker approach, but without a precise notion of equivalent phrases. Perhaps as a result, SYSCONJ handled only ordinary conjunction, and not respectively or gapping readings.
In our approach, a simple change to the linearization process allows us to handle gapping.</Paragraph>
<Paragraph position="12"> Extensions to the Basic Device. The device described in the previous section is a simplified version for rough comparison with the MSG interpreter. However, the system can easily be generalized to handle multiple conjuncts. The only additional phase required is to generate templates for multiple readings. Also, gapping can be handled just by adding clauses to the definition of linearize, which allows a different path from that of fig. 8 to be taken.</Paragraph>
<Paragraph position="13"> The simplified device permits some examples of ungrammatical sentences to be parsed as if correct (fig. 5). The modularity of the system allows us to constrain the definition of equivalence still further. The extended definitions in Goodall's draft theory were not included in his thesis [Goodall84], presumably because it was not constrained enough. However, in his thesis he proposes another definition of grammaticality using RPMs. This definition can be used to constrain equivalence still further in our system at a loss of some efficiency and generality. For example, the required additional predicate will need to make explicit use of the combined RPM. Therefore, a parser will need to produce an RPM representation as its phrase marker. The modifications necessary to produce the representation are shown in appendix B.</Paragraph> </Section> </Section>
<Section position="3" start_page="124" end_page="124" type="metho"> <SectionTitle> Acknowledgements </SectionTitle> <Paragraph position="0"> This work describes research done at the Artificial Intelligence Laboratory of the Massachusetts Institute of Technology.
Support for the Laboratory's artificial intelligence research has been provided in part by the Advanced Research Projects Agency of the Department of Defense under Office of Naval Research contract N00014-80-C-0505.</Paragraph> <Paragraph position="1"> The first author is also funded by a scholarship from the</Paragraph> </Section> </Paper>
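<!-- The equivalence-finding step specified in figs. 17 and 18 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' Prolog implementation: sentences are lists of word tokens, every token counts as a terminal, x and y are required to be non-empty, the parse argument is a hypothetical stand-in for the recognizer (it accepts everything by default), and the helper name splits3 is illustrative (it plays the role of append3).

```python
def splits3(tokens):
    """Yield every (chi, x, omega) such that chi + x + omega == tokens.

    Plays the role of append3 in fig. 17 (the helper name is hypothetical).
    """
    n = len(tokens)
    for i in range(n + 1):
        for j in range(i, n + 1):
            yield tokens[:i], tokens[i:j], tokens[j:]


def equiv(s1, s2):
    """Yield (x, y, chi, omega) where s1 == chi+x+omega and s2 == chi+y+omega.

    Assumption: x and y must both be non-empty.
    """
    for chi, x, omega in splits3(s1):
        for chi2, y, omega2 in splits3(s2):
            if x and y and chi == chi2 and omega == omega2:
                yield x, y, chi, omega


def findcandidates(s1, s2, parse=lambda s: True):
    """Candidate pairs exist only when both simple sentences are grammatical."""
    if parse(s1) and parse(s2):
        yield from equiv(s1, s2)


def redundant(x, y):
    """True for pairs of the form (uv, uw) or (uv, zv) with non-empty parts."""
    if len(x) < 2 or len(y) < 2:
        return False
    return x[0] == y[0] or x[-1] == y[-1]


def findequivalences(s1, s2, parse=lambda s: True):
    """Return the candidate pairs that survive both restrictions."""
    pairs = []
    for x, y, chi, omega in findcandidates(s1, s2, parse):
        whole_sentences = not chi and not omega  # excluded: plain concatenation
        if not whole_sentences and not redundant(x, y):
            pairs.append((x, y))
    return pairs


print(findequivalences("Alice saw Bill".split(), "Mary saw Bill".split()))
```

For the Alice/Mary example in the text, the only pair that survives both restrictions in this sketch is (["Alice"], ["Mary"]). Unlike redundant(X Y) in fig. 18, the sketch applies the second restriction inside findequivalences, because it needs the contexts chi and omega; that placement is a design choice of the sketch. -->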