<?xml version="1.0" standalone="yes"?>
<Paper uid="C96-1049">
  <Title>Lean Formalisms, Linguistic Theory, and Applications. Grammar Development in ALEP.</Title>
  <Section position="5" start_page="289" end_page="290" type="evalu">
    <SectionTitle>
5 'Efficiency' and Performance
</SectionTitle>
    <Paragraph position="0"> This section addresses the topic of efficiency and summarizes a number of points that contribute specifically to it.</Paragraph>
    <Paragraph position="1"> * ALEP is designed to support efficiency as far as the formalism ('lean' approach) is concerned.</Paragraph>
    <Paragraph position="2"> Formal constructs known to be computationally expensive are not available (see footnote 3).</Paragraph>
    <Paragraph position="3"> * Refinement (mentioned already) is a monotonic application of phrase structure rules and lexical entries to further featurize (flesh out with features) a linguistic structure established in analysis.</Paragraph>
    <Paragraph position="4"> If Q1 is the linguistic output structure of the analysis, then Q2 is the output structure of 'refinement' if Q1 subsumes Q2, i.e. every local tree in Q2 and every lexical entry in Q2 is subsumed by a corresponding local tree and a corresponding lexical entry in Q1.</Paragraph>
    <Paragraph position="5"> Any non-deterministic backtracking algorithm (depth-first) is badly affected by ambiguities, as it has to redo large amounts of work repeatedly. In terms of lingware development this means that lexical ambiguities have to be avoided for analysis. As, on the other hand, lexicalist theories result in an abundance of lexical ambiguities, 'refinement' is a relief. Optimal distribution of information over analysis and refinement results in a gain of efficiency by several factors of magnitude.</Paragraph>
    <Paragraph position="6"> * Head selections: ALEP allows for user-defined parsing head declarations as &amp;quot;the most appropriate choice of head relations is grammar dependent&amp;quot; (Alshtl), p. 318. On the basis of the user-defined head relations, the reflexive transitive closure over head relations is calculated. It has to be ensured that the derived relations are as compact as possible. Optimal choice of head relations pays off in a gain in efficiency by several factors of magnitude.</Paragraph>
    <Paragraph position="7"> Footnote 3: It should have been shown in the previous sections that felicitous descriptions are possible anyway.</Paragraph>
    <Paragraph position="8"> * Keys: Keys are values of attributes within linguistic descriptions defined by path declarations. Keys allow for indexation and efficient retrieval of rules and lexical entries. This becomes extremely relevant for larger-scale resources. A key declaration, which the grammar developer may supply, identifies the atomic value that is to serve as a key. Optimal keys again result in a substantial gain in efficiency.</Paragraph>
    <Paragraph position="9"> * Last but not least, tuning the grammars with a view to efficiency has contributed to the current performance of the system.</Paragraph>
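    <Paragraph> The reflexive transitive closure over user-defined head relations, mentioned in the list above, can be illustrated with a minimal Python sketch. This is not ALEP code: the function name head_closure, the dictionary representation of head declarations, and the category names are all assumptions made purely for illustration.

```python
def head_closure(head_of):
    """Reflexive transitive closure of a declared head relation.

    head_of maps a category name to the set of categories declared as its
    possible heads (a hypothetical representation, not ALEP's own).
    """
    # Collect every category that occurs as a mother or as a head.
    cats = set(head_of)
    for heads in head_of.values():
        cats |= set(heads)

    # Reflexive step: every category is related to itself,
    # plus its directly declared heads.
    closure = {c: {c} | set(head_of.get(c, ())) for c in cats}

    # Transitive step: keep adding heads of heads until nothing changes.
    changed = True
    while changed:
        changed = False
        for c in cats:
            reachable = set()
            for h in closure[c]:
                reachable |= closure[h]
            if not reachable.issubset(closure[c]):
                closure[c] |= reachable
                changed = True
    return closure


# Hypothetical declarations: S is headed by VP, VP by V, NP by N.
declared = {"S": {"VP"}, "VP": {"V"}, "NP": {"N"}}
closure = head_closure(declared)
print(sorted(closure["S"]))  # ['S', 'V', 'VP']
```

    In this sketch, the number of pairs in the resulting table grows with every additional declared head relation, which is one way to read the requirement that the derived relations be as compact as possible.</Paragraph>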
    <Paragraph position="10"> In the following we give some actual figures that may illustrate performance. These figures are not meant to be an exact measurement, as exact measurements are not available. To give an indication, it may be said that ALL the phenomena which increase indeterminism in a grammar of German are covered: all forms of the articles ('die', 'der') and homomorphous relative pronouns, all readings of verbs (all frames, all syntactic realizations of complements), semantic readings, prepositions and homomorphous prefixes, PPs as nominal adjuncts, as preadjectival complements, as adjuncts to adverbs, as VP adjuncts, valent nouns (with optional complementation), all readings of German 'sein', coordination, N -~ N combinations, relatives, Nachfeld.</Paragraph>
    <Paragraph position="11"> One result of the corpus investigation was that 95% of the sentences in the corpus have between 5 and 40 words. The grammar is able to parse sentences with up to 40 words in 120 secs. The following are corpus examples containing time-consuming parse problems.</Paragraph>
    <Paragraph position="12"> Input: In den Wochen vor Weihnachten konnte der stolze Vorsitzende der zu Daimler-Benz gehoerenden Deutsche Aerospace AG ein Jahresergebnis, das alle Erwartungen uebertraf, verkuenden.</Paragraph>
    <Paragraph position="13"> (Comment: In the weeks before X-mas the proud head of the Deutsche Aerospace AG, which belongs to Daimler-Benz, could announce an annual statement of accounts which exceeds all expectations.)
            total    RWordSeg  RLiftAna  Refine
    sol: 1  34.450   0.380     34.070    0.000
    Input: Dieser Erfolg ueberrascht in zwei Hinsichten.</Paragraph>
    <Paragraph position="14"> (Comment: This success is surprising in two respects.)
            total    RWordSeg  RLiftAna  Refine
    sol: 1  1.910    0.130     1.780     0.000
    For industrial purposes this may still be too slow, but we think that the figures show that the system is not so far away from reality.</Paragraph>
  </Section>
</Paper>