<?xml version="1.0" standalone="yes"?> <Paper uid="C86-1028"> <Title>Lexicase Parsing: A Lexicon-driven Approach to Syntactic Analysis</Title> <Section position="8" start_page="130" end_page="132" type="concl"> <SectionTitle> (6) Output </SectionTitle> <Paragraph position="0"> The output of the algorithm is zero or more syntactic analyses of the input sentence, but at the same time it can be considered an intensional semantic representation: it presents all the semantic distinctive features for each word, and specifies the head-modifier and semantic implication relations between each linked pair of words.</Paragraph> <Paragraph position="1"> The 'extensional' meaning of the sentence then is just the range of external situations which are compatible with the intension, the lexical meanings and interrelationships characterized by this structure. Lexicase is very well suited to characterizing this intensional semantic representation because it formally defines the range of possible lexical linkages. The structure is simple yet rich enough to in principle carry enough information to serve as the input to a knowledge extraction or machine translation system.</Paragraph> <Section position="1" start_page="130" end_page="131" type="sub_section"> <SectionTitle> 6.1 Words </SectionTitle> <Paragraph position="0"> (1) Prepositions: Link each preposition by contextual features with an accessible N, V, or P. Prepositions are linked first because they link with N's, V's, or other P's to form PP's which delimit closed domains whose internal non-head constituents are then inaccessible to connections with external elements. Subsequent parsing stages then search inside of or outside of these domains, but do not need to consider links between PP-internal non-heads and PP-external lexical items.</Paragraph> <Paragraph position="1"> (2) Verbs: Verbs are linked with their attributes to form clauses or sentences. 
Note that in the lexicase framework, 'sentence' refers to any verb-headed construction, regardless of the finiteness of its verbal head or its position in the tree. The searching proceeds from left to right in English, but would scan from right to left in a verb-final left-branching language such as Japanese. In a dependency grammar framework such as lexicase, a (verbal) sentence is defined as a verb together with its syntactic dependents. A sentence is the basic unit of syntax because it is the maximum domain of dependencies. Once a sentence unit has been established in this way, subsequent parsing stages can ignore links between sentence-internal and sentence-external items.</Paragraph> <Paragraph position="2"> (3) Nouns: Nouns are linked with their dependents to form Noun Phrases. Noun Phrases and Sentences ('verb phrases') are the syntactically and semantically basic sentence constituents. Like other head items, nouns establish domains whose non-head constituents are inaccessible to external links, so that cross-domain linkages can be ignored on subsequent passes, thereby radically limiting the number of pairs of items that have to be considered on each subsequent pass and again cutting down on computation time.</Paragraph> <Paragraph position="3"> (4) Determiners: Link each Determiner with an accessible Noun. In English, the Determiner marks the left boundary of a Noun Phrase. 
Linking the N and its Det establishes one boundary of the NP, and subsequent parsing can ignore links between elements inside this domain and elements outside it.</Paragraph> <Paragraph position="4"> (5) Adjectives: Link each Adjective with an adjacent Noun.</Paragraph> <Paragraph position="5"> Because previous passes will have already delimited major constituent boundaries and radically narrowed the set of possible connections, very little checking will need to be done to link an Adjective with the correct head Noun.</Paragraph> <Paragraph position="6"> (6) Adverbs: Link each Adverb with a head Verb or Adjective.</Paragraph> <Paragraph position="7"> Structural ambiguity is most likely to appear in connection with alternate attachments of PP's and Adverbs with other words in a sentence. By saving Adverb linking until near the end of the parsing sequence, we establish domains of inaccessibility which greatly reduce the number of possible Adverb attachment points which need to be considered.</Paragraph> </Section> <Section position="2" start_page="131" end_page="131" type="sub_section"> <SectionTitle> 6.2 Coordination </SectionTitle> <Paragraph position="0"> Link each conjunction with one or more major constituents (S, NP, PP, AdjP, or AdvP) on each side. At this point, all the major constituents have already been established, so the conjunction linking procedure needs to consider only the head word of each major constituent. 
Since every conjunction will at this time be either at the highest level, that is, linkable only to the immediate constituents of the sentence, or inside the domain of some other construction, the number of linking choices will be extremely limited.</Paragraph> </Section> <Section position="3" start_page="131" end_page="132" type="sub_section"> <SectionTitle> 6.3 Orphanage </SectionTitle> <Paragraph position="0"> Link all remaining upwardly unlinked Nouns, Determiners, Adjectives, Adverbs, Prepositions, and Verbs with an accessible 'elder sister' (or 'regent' [12]). At this point unattached lexical items will be found only embedded inside of other constructions, with very few accessible attachment possibilities to consider (usually only one).</Paragraph> <Paragraph position="1"> Thus there will generally be no backtracking and stacking required.</Paragraph> <Paragraph position="2"> The exception will be Adverbs and PP's, which account for most of the structural ambiguity likely to be encountered. By saving these alternative connection possibilities until near the end of the parsing process, we minimize the amount of computation that has to be done 'on top of' the alternative structures produced at this stage.</Paragraph> <Paragraph position="3"> 7. Overall assessment and conclusion The parsing approach we advocate here is in principle very simple because lexicase requires no rules for normal parsing situations at all, and is based on linguistic principles designed to maximize the generality and simplicity of descriptions. It has no deep structure or transformations; instead, 'transformed' and 'untransformed' lexical entries are listed separately in the lexicon, thereby placing the parsing burden on memory rather than processing. 
Since Lexicase automatically determines which items are relevant to the satisfaction of particular contextual requirements, no feature percolation or feature copying mechanism is needed to move features around in a tree to get them into a position where they are accessible to related items.</Paragraph> <Paragraph position="4"> Lexicase parsing is bottom-up in the sense that it begins with individual words rather than some 'root node' S. It scans from left to right or vice versa, depending on whether the language is verb-initial, verb-medial, or verb-final, but in fact it is a mechanism which works from head to dependent rather than primarily from one end or the other. Since it forms constituents from heads and dependents at all levels simultaneously, it thus incorporates virtues of both top-down and bottom-up parsers. Lexicase accomplishes this by only making links allowed or required by contextual features of head lexical items, and since the 'overall structure of the sentence' is determined by just these features, it is not possible to make links which are not compatible with this overall structure.</Paragraph> <Paragraph position="5"> Since lexicase has no Phrase Structure rules, a lexicase parser cannot blunder into the loops caused by left-recursive rules. Lexicase generates linguistically correct structures: they directly represent head-attribute relationships, they characterize the concept of grammatical relatedness, they allow various other important generalizations to be captured, and they account adequately for speakers' intuitions.</Paragraph> </Section> </Section> </Paper>
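<!-- The pruning effect of closed domains described in Section 6 (non-head members of a PP, sentence, or NP become inaccessible to external items) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function name accessible_pairs, the (head, members) representation of a domain, the restriction to non-overlapping domains, and the example sentence. It only counts candidate pairs and is not the lexicase parser itself.

```python
from itertools import combinations


def accessible_pairs(n_words, domains):
    """Count the word pairs a later linking pass would still have to consider.

    n_words: number of words in the sentence (indices 0..n_words-1).
    domains: list of (head_index, member_indices) tuples, one per closed
             domain; member_indices includes the head. Domains are assumed
             not to overlap (a simplification made for this sketch).
    """
    domain_of = {}   # word index: id of the closed domain containing it
    hidden = set()   # non-head members, invisible from outside their domain
    for dom_id, (head, members) in enumerate(domains):
        for m in members:
            domain_of[m] = dom_id
            if m != head:
                hidden.add(m)

    count = 0
    for i, j in combinations(range(n_words), 2):
        if i in hidden or j in hidden:
            # A hidden word may only pair with words of its own domain.
            if domain_of.get(i) == domain_of.get(j):
                count += 1
        else:
            count += 1
    return count


# Hypothetical example sentence, indexed left to right from 0.
words = ["the", "dog", "slept", "in", "the", "old", "barn"]
before = accessible_pairs(len(words), [])
# Close the PP headed by "in" (index 3) over "in the old barn".
after = accessible_pairs(len(words), [(3, [3, 4, 5, 6])])
print(before, after)
```

For the seven-word example this prints 21 12: closing the single PP removes nine of the 21 candidate pairs, which is the kind of reduction the early preposition pass is meant to achieve. -->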