<?xml version="1.0" standalone="yes"?>
<Paper uid="E89-1033">
<Title>Interactive Incremental Chart Parsing</Title>
<Section position="4" start_page="0" end_page="0" type="metho">
<SectionTitle> 3 Discussion </SectionTitle>
<Paragraph position="0"/>
<Section position="1" start_page="0" end_page="0" type="sub_section">
<SectionTitle> 3.1 Top-Down Parsing </SectionTitle>
<Paragraph position="0"> The algorithm given in section 2.4.2 could be modified to top-down parsing by changing the predictor (see e.g. Wirén 1988) and by having Move-Vertex/RightHalf not move active looping edges (vt.Aloop), since, in top-down parsing, these &quot;belong&quot; to the left portion of the chart where they were predicted.</Paragraph>
<Paragraph position="1"> In general, the algorithm works better bottom-up than top-down because bottom-up predictions are made &quot;locally&quot; at the starting vertex of the triggering (inactive) edge in question. Therefore, a changed preterminal edge will typically have its dependants locally, and, as a consequence, the whole update can be kept local. In top-down parsing, on the other hand, predictions are &quot;forward-directed&quot;, being made at the ending vertex of the triggering (active) edge. As a result, an update will, in particular, cause all predicted and combined edges after the change to be removed. The reason is that the forward-directed predictions have generated active and inactive edges, the former of which have in turn generated forward-directed predictions, and so on through the chart.</Paragraph>
<Paragraph position="2"> [12] One edge subsumes another edge if and only if the first three elements of the edges are identical and the fourth element of the first edge subsumes that of the second edge. For a definition of subsumption, see Shieber (1986:14).</Paragraph>
<Paragraph position="3"> [13] Note that this condition is tested by the unification, which specifically ensures that D(&lt;Xm cat&gt;) = E(&lt;Y0 cat&gt;).</Paragraph>
<Paragraph position="4"> On the one hand, one might accept this, arguing that this is simply the way top-down parsing works: it generates forward-directed hypotheses based on the preceding context, and if we change the preceding context, the forward hypotheses should change as well. Also, it is still somewhat better behaved than exhaustive reanalysis from the point of the change.</Paragraph>
<Paragraph position="5"> On the other hand, the point of incremental parsing is to keep updates local, and if we want to take this seriously, it seems like a waste to destroy possibly usable structure to the right of the change. For example, in changing the sentence &quot;Sarah gave Kim a green apple&quot; to &quot;Sarah gave a green apple to Kim&quot;, there is no need for the phrase &quot;a green apple&quot; to be reanalysed.</Paragraph>
<Paragraph position="6"> One approach to this problem would be for the edge-removal process to introduce a &quot;cut&quot; whenever a top-down prediction having some dependant edge is encountered, mark it as &quot;uncertain&quot;, and repeatedly, at some later points in time, try to find a new source for it. Eventually, if such a source cannot be found, the edge (along with its dependants) should be &quot;garbage-collected&quot;, because there is no way for the normal update machinery to remove an edge without a source (except for preterminal edges).</Paragraph>
<Paragraph position="7"> In sum, it would be desirable to retain the open-endedness of chart parsing also with respect to rule invocation while still providing for efficient incremental update. However, the precise strategy for best achieving this remains to be worked out (also in the light of a fully testable interactive system).</Paragraph>
</Section>
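As an illustration of the bottom-up locality argument above, the following is a minimal Python sketch, not the paper's implementation: it assumes each edge records the edges built from it (the Edge fields and function name are illustrative), and removes a changed preterminal edge together with its dependency closure, leaving the rest of the chart untouched.

    from dataclasses import dataclass, field

    @dataclass(eq=False)              # identity-based hashing so edges can live in sets
    class Edge:
        start: int                    # starting vertex
        end: int                      # ending vertex
        label: str                    # category or dotted-rule label
        dependants: set = field(default_factory=set)   # edges predicted/combined from this one

    def remove_with_dependants(edge, chart):
        """Remove an edge and, transitively, every edge that depends on it."""
        pending = [edge]
        while pending:
            e = pending.pop()
            if e in chart:
                chart.discard(e)
                pending.extend(e.dependants)

In such a scheme the dependant sets would be maintained by the predictor and the fundamental rule as edges are built, so that editing the word spanning one pair of vertices disturbs only its local closure.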
<Section position="2" start_page="0" end_page="0" type="sub_section">
<SectionTitle> 3.2 Alternative Ways of Determining Affected Edges </SectionTitle>
<Paragraph position="0"> Henry Thompson (personal communication 1988) has pointed out that, instead of computing sets of dependants from source edges, it might suffice simply to record the latter, provided that the frequency of updates is small and the total number of edges is not too large. The idea is to sweep the whole edge space each time there is an update, repeatedly deleting anything with a non-existent source edge, and iterating until one gets through a whole pass with no new deletions.</Paragraph>
</Section>
<Section position="3" start_page="0" end_page="0" type="sub_section">
<SectionTitle> 3.2.2 Maintain Neither Sources Nor Dependencies </SectionTitle>
<Paragraph position="0"> If we confine ourselves to bottom-up parsing, and if we accept that an update will unconditionally cause all edges in the dependency closure to be removed (not allowing the kind of refinements discussed in footnote 10), it is in fact not necessary to record sources or dependencies at all. The reason is that, in effect, removing all dependants of all preterminal edges extending between vertices vl, ..., vr+1 in the bottom-up case amounts to removing all edges that extend somewhere within this interval (except for bottom-up predictions at vertex vr+1 which are triggered by edges outside of the interval). Given a suitable matrix representation for the chart (where edges are simultaneously indexed with respect to starting and ending vertices), this may provide for a very efficient solution.</Paragraph>
</Section>
<Section position="4" start_page="0" end_page="0" type="sub_section">
<SectionTitle> 3.2.3 Maintain Dependencies between Features </SectionTitle>
<Paragraph position="0"> There is a trade-off between updating as local a unit as possible and the complexity of the algorithm for doing so. Given a complex-feature-based formalism like PATR, one extreme would be to maintain dependencies between feature instances of the chart instead of between chart edges. In principle, this is the approach of the Synthesizer Generator (Reps and Teitelbaum 1987), which adopts attribute grammar for the language specification and maintains dependencies between the attribute instances of the derivation tree.</Paragraph>
</Section>
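The sweep strategy and the interval-based removal just described can be sketched in Python as follows. This is a rough sketch under stated assumptions, not the actual data structures: edge.sources and e.trigger_start are assumed attributes, and the matrix is assumed to be indexable as matrix[i][j].

    def sweep(chart):
        """Thompson-style sweep: 'chart' is a set of edges; 'edge.sources'
        lists the edges an edge was built from (empty for preterminal edges).
        Delete edges whose source has disappeared, iterating until a whole
        pass produces no further deletions."""
        changed = True
        while changed:
            changed = False
            for edge in list(chart):
                if any(src not in chart for src in edge.sources):
                    chart.discard(edge)
                    changed = True
        return chart

    def remove_interval(matrix, left, right):
        """Bookkeeping-free bottom-up removal: matrix[i][j] holds the set of
        edges spanning vertices i..j. Remove every edge lying wholly inside
        the changed interval [left, right+1], except bottom-up predictions
        looping at vertex right+1 whose triggering edge starts at that vertex
        and hence lies outside the interval (identified here by the assumed
        attribute e.trigger_start)."""
        for i in range(left, right + 2):
            for j in range(i, right + 2):
                if i == j == right + 1:
                    matrix[i][j] = {e for e in matrix[i][j]
                                    if e.trigger_start >= right + 1}
                else:
                    matrix[i][j] = set()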
<Section position="5" start_page="0" end_page="0" type="sub_section">
<SectionTitle> 3.3 Lexical Component </SectionTitle>
<Paragraph position="0"> An approach to the lexical component which seems particularly suitable with respect to this type of parser, and which is adopted in the actual implementation, is the letter-tree format.[14] This approach takes advantage of the fact that words normally are entered from left to right, and supports the idea of a dynamic pointer which follows branches of the tree as a word is entered, immediately calling for reaction when an illegal string is detected. In particular, this makes it possible to distinguish an incomplete word from a (definitely) illegal word. Another advantage of this approach is that one may easily add two-level morphology (Koskenniemi 1983) as an additional filter. A radical approach, not pursued here, would be to employ the same type of incremental chart-parsing machinery at the lexical level as we do at the sentence level.</Paragraph>
<Paragraph position="1"> [14] Trie, according to the terminology of Aho, Hopcroft, and Ullman (1987:163).</Paragraph>
</Section>
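A minimal Python sketch of such a letter-tree lexicon with a dynamic pointer follows; the class and method names are illustrative, not those of the actual implementation.

    class LexTrie:
        def __init__(self, words):
            self.root = {}
            for w in words:
                node = self.root
                for ch in w:
                    node = node.setdefault(ch, {})
                node[None] = True            # end-of-word marker

        def classify(self, prefix):
            """Return 'word', 'incomplete', or 'illegal' for the typed prefix."""
            node = self.root
            for ch in prefix:                # the "dynamic pointer" walk
                if ch not in node:
                    return "illegal"         # no branch: react immediately
                node = node[ch]
            return "word" if None in node else "incomplete"

    # Usage:
    #   lex = LexTrie(["sarah", "gave", "kim", "a", "green", "apple", "to"])
    #   lex.classify("gre")   -> "incomplete"
    #   lex.classify("green") -> "word"
    #   lex.classify("grx")   -> "illegal"

In an interactive scanner the pointer would be retained across keystrokes rather than re-walking the prefix on every character.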
<Section position="6" start_page="0" end_page="0" type="sub_section">
<SectionTitle> 3.4 Dependencies across Sentences </SectionTitle>
<Paragraph position="0"> Incremental parsing would be even more beneficial if it were extended to handle dependencies across multiple sentences, for example with respect to noun phrases. Consider a language-sensitive text editor, the purpose of which would be to keep track of an input text and to detect (and perhaps correct) certain linguistic errors. A change in one sentence often requires changes also in the surrounding text, as in the following examples:</Paragraph>
<Paragraph position="1"> The house is full of mould. It has been judged insanitary by the public health committee. They say it has to be torn down.</Paragraph>
<Paragraph position="2"> The salmon jumped. It likes to play.</Paragraph>
<Paragraph position="3"> In the first example, changing the number of &quot;house&quot; forces several grammatical changes in the subsequent sentences, requiring reanalysis. In the second example, changing &quot;it (likes)&quot; to &quot;they (like)&quot; constrains the noun phrase of the previous sentence to be interpreted as plural, which could be reflected, for example, by putting the edges of the singular analysis to sleep.</Paragraph>
<Paragraph position="4"> Cross-sentence dependencies require a level of incremental interpretation and a database with non-monotonic reasoning capabilities. For a recent approach in this direction, see Zernik and Brown.</Paragraph>
<Paragraph position="5"> It is planned to maintain a dynamic agenda of update tasks (either at the level of update functions or, preferably, at the level of individual edges), removing tasks which are no longer needed because the user has made them obsolete (for example by immediately deleting an inserted text).</Paragraph>
<Paragraph position="6"> In the long run, an interactive parsing system probably has to have some built-in notion of time, for example through time-stamped editing operations and (adjustable) strategies for the timing of update operations.</Paragraph>
</Section>
</Section>
<Section position="5" start_page="0" end_page="0" type="metho">
<SectionTitle> 4 Interactive Parsing </SectionTitle>
<Paragraph position="0"> This section outlines how the incremental parser is embedded in an interactive parsing system called LIPS.[15] Figure 1 shows the main components of the system. The user types a sentence into the editor (a Xerox TEDIT text editor). The words are analysed on-line by the scanner and handed over to the parser proper, which keeps the chart consistent with the input sentence. Unknown words are marked as illegal in the edit window. The system displays the chart incrementally, drawing and erasing individual edges in tandem with the parsing process.</Paragraph>
</Section>
<Section position="6" start_page="0" end_page="0" type="metho">
<SectionTitle> 5 Conclusion </SectionTitle>
<Paragraph position="0"> This paper has demonstrated how a chart parser could, by simple means, be augmented to perform incremental parsing, and has suggested how this system in turn could be embedded in an interactive parsing system. Incrementality and interactivity are two independent properties, but, in practice, an incremental system that is not interactive would be pointless, and an interactive system that is not incremental would at least be less efficient than it could be. Although exhaustive recomputation can be fast enough for small problems, incrementality is ultimately needed in order to cope with longer and more complex texts.</Paragraph>
<Paragraph position="1"> In addition, incremental parsing brings to the system a certain &quot;naturalness&quot;: analyses are put together piece by piece, and there is a built-in correlation between the amount of processing required for a task and its difficulty. &quot;Easy things should be easy...&quot; (Alan Kay).</Paragraph>
</Section>
</Paper>