<?xml version="1.0" standalone="yes"?> <Paper uid="H01-1035"> <Title>Inducing Multilingual Text Analysis Tools via Robust Projection across Aligned Corpora</Title> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 3. DATA RESOURCES </SectionTitle> <Paragraph position="0"> The data sets used in these experiments included the English-French Canadian Hansards, the English-Chinese Hong Kong Hansards, and the parallel Czech-English Reader's Digest collection. In addition, multiple versions of the Bible were used, including the French Douay-Rheims Bible, the Spanish Reina Valera Bible, and three English Bible versions (King James, New International and Revised Standard), automatically verse-aligned in multiple pairings.</Paragraph> <Paragraph position="1"> All corpora were automatically word-aligned by the now publicly available EGYPT system (Al-Onaizan et al., 1999), based on IBM's Model 3 statistical MT formalism (Brown et al., 1990). The tagging and bracketing tasks utilized approximately 2 million words in each language, with the sample sizes for morphology induction given in Table 3. All word alignments utilized strictly raw-word-based model variants for English/French/Spanish/Czech and character-based model variants for Chinese, with no use of morphological analysis or stemming, POS tagging, bracketing or dictionary resources.</Paragraph> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> 4. PART-OF-SPEECH TAGGER INDUCTION </SectionTitle> <Paragraph position="0"> Part-of-speech tagging is the first of four applications covered in this paper. The goal of this work is to project POS analysis capabilities from one language to another via word-aligned parallel bilingual corpora. To do so, we use an existing POS tagger (e.g. Brill, 1995) to annotate the English side of the parallel corpus. Then, as illustrated in Figure 1 for Chinese and French, the raw tags are transferred via the word alignments, yielding an extremely noisy initial training set for the 2nd language. The third crucial step is to generalize from these noisy projected annotations in a robust way, yielding a stand-alone POS tagger for the new language that is considerably more accurate than the initial projected tags.</Paragraph> <Paragraph position="1"> Additional details of this algorithm are given in Yarowsky and Ngai (2001). Due to lack of space, the following sections will serve primarily as an overview of the algorithm and its salient issues.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.1 Part-of-speech Projection Issues </SectionTitle> <Paragraph position="0"> First, because of considerable cross-language differences in fine-grained tagset inventories, this work focuses on accurately assigning core POS categories (e.g. noun, verb, adverb, adjective), with additional distinctions in verb tense, noun number and pronoun type as captured in the English tagset inventory. Although impoverished relative to some languages, and incapable of resolving details such as grammatical gender, this Brown-corpus-based tagset granularity is sufficient for many applications. Furthermore, many finer-grained part-of-speech distinctions are resolved primarily by morphology, as handled in Section 7. Finally, if one desires to induce a finer-grained tagging capability for case, for example, one should project from a reference language such as Czech, where case is lexically marked.</Paragraph>
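For concreteness, the following minimal sketch illustrates the direct transfer step, including the positional subscripting of phrasal (1-to-N) alignments discussed below. It is an illustration only, not the system's actual implementation; the alignment representation and the treatment of unaligned words are assumptions.

```python
# Sketch of direct POS-tag projection across word alignments.
# Assumes alignments come as (english_index, foreign_index) pairs
# from an aligner such as EGYPT/GIZA; all names here are illustrative.
from collections import defaultdict

def project_tags(english_tags, alignments, n_foreign):
    """Transfer English tags onto aligned foreign words: 1-to-1 links
    copy the tag; phrasal (1-to-N) links get positional subscripts
    (e.g. NNS_1, NNS_2) for later re-mapping to core tags."""
    e_to_f = defaultdict(list)          # English index -> foreign indices
    f_to_e = defaultdict(list)          # foreign index -> English indices
    for e_i, f_i in alignments:
        e_to_f[e_i].append(f_i)
        f_to_e[f_i].append(e_i)

    projected = [None] * n_foreign      # the noisy initial annotation
    for f_i in range(n_foreign):
        if len(f_to_e[f_i]) != 1:
            continue                    # unaligned / N-to-1: leave untagged
        e_i = f_to_e[f_i][0]
        tag = english_tags[e_i]
        if len(e_to_f[e_i]) == 1:       # straightforward 1-to-1 projection
            projected[f_i] = tag
        else:                           # phrasal alignment: subscript by position
            pos = sorted(e_to_f[e_i]).index(f_i) + 1
            projected[f_i] = f"{tag}_{pos}"
    return projected
```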
<Paragraph position="1"> Figure 3 illustrates six scenarios encountered when projecting POS tags from English to a language such as French. The first two show straightforward 1-to-1 projections, which are encountered in roughly two-thirds of English words. Phrasal (1-to-N) alignments offer greater challenges, as typically only a subset of the aligned words accept the English tag. To distinguish these cases, we initially assign position-sensitive phrasal parts-of-speech via subscripting (e.g. Les/NNS1 lois/NNS2), and subsequently learn a probabilistic mapping to core, non-phrasal parts of speech (e.g. P(DT | NNS1)) that is used along with tag sequence and lexical prior models to re-tag these phrasal POS projections.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.2 Noise-robust POS Tagger Training </SectionTitle> <Paragraph position="0"> Even at the relatively low tagset granularity of English, direct projection of core POS tags onto French achieves only 76% accuracy using EGYPT's automatic word alignments (as shown in Table 1). Part of this deficiency is due to word-alignment error; when word alignments were manually corrected, direct projection core-tag accuracy increased to 85%. Also, standard bigram taggers trained on the automatically projected data achieve only modest success at generalization (86% when reapplied to the noisy training data). More highly lexicalized learning algorithms exhibit even greater potential for overmodeling the specific projection errors of this data.</Paragraph>

Table 1: POS tagging accuracy (French core tags / French full tags / Chinese core tags / Chinese full tags):
(a) Direct transfer (on auto-aligned data): .76 / .69 / N/A / N/A
(b) Direct transfer (on hand-aligned data): .85 / .78 / N/A / N/A
(c) Standard bigram model (on auto-aligned data): .86 / .82 / .82 / .68
(d) Noise-robust bigram induction (on auto-aligned data): .96 / .93 / .94 / .91
(e) Fully supervised bigram training (on gold standard): .97 / .96 / .98 / .97

<Paragraph position="1"> Thus our research has focused on noise-robust techniques for distilling a conservative but effective tagger from this challenging raw projection data. In particular, we modify standard n-gram modeling to separate the training of the tag sequence model P(t_i | t_{i-1}, ...) from the lexical prior models P(t | w), and apply different confidence weighting and signal amplification techniques to both.</Paragraph> <Paragraph position="2"> Figure 4 illustrates the process of hierarchically smoothing the lexical prior model P(t | w). One motivating empirical observation is that words in French, English and Czech have a strong tendency to exhibit only a single core POS tag (e.g. N or V), and very rarely have more than 2. In English, with relatively high P(POS | w) ambiguity, only 0.37% of the tokens in the Brown Corpus are not covered by a word type's two most frequent core tags, and in French the percentage of tokens is only 0.03%. Thus we employ an aggressive re-estimation in favor of this bias, amplifying the model probability of the majority POS tag, and reducing or zeroing the model probability of 2nd or lower ranked core tags proportional to their relative frequency with respect to the majority tag. This process is then applied recursively, similarly amplifying the probability of the majority subtags within each core tag. Further details, including the handling of 1-to-N phrasal alignment projections, are given in Yarowsky and Ngai (2001).</Paragraph>
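The aggressive re-estimation itself can be sketched as follows; the exact amplification and zeroing formula is not specified in this overview, so the proportional weighting and cutoff below are assumptions chosen only to illustrate the bias toward the majority tag:

```python
# Sketch of majority-tag amplification for the lexical priors P(t|w).
# tag_counts holds noisy projected tag counts for a single word type;
# the zero_ratio cutoff is an assumed parameter, not the paper's.
from collections import Counter

def amplify_priors(tag_counts, zero_ratio=0.1):
    counts = Counter(tag_counts)
    (majority_tag, majority_n), = counts.most_common(1)
    weights = {}
    for tag, n in counts.items():
        ratio = n / majority_n          # relative frequency vs. majority tag
        if ratio < zero_ratio:
            continue                    # zero out low-ranked (likely noise) tags
        weights[tag] = n * ratio        # further downweight minority tags
    z = sum(weights.values())
    return {tag: w / z for tag, w in weights.items()}

# e.g. {'NOUN': 80, 'VERB': 15, 'ADJ': 5} -> NOUN rises from .80 to ~.97,
# VERB falls to ~.03, and ADJ is zeroed outright.
```

The same re-estimation can then be applied recursively to the subtag distributions within each core tag.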
<Paragraph position="4"> In contrast, the training of the tag sequence model P(t_i | t_{i-1}, t_{i-2}, ...) focuses on confidence weighting and filtering of projected training subsequences. The contribution of each candidate training sentence is weighted proportionally with both its EGYPT/GIZA sentence-level alignment score and an agreement measure between the projected tags and the 1st-iteration lexical priors, a rough measure of alignment reasonableness. Given the observed bursty distribution of alignment errors in the corpus, this downweighting of low-confidence alignment regions substantially improves sequence model quality with tolerable reduction in training volume.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.3 Evaluation of POS Tagger Induction </SectionTitle> <Paragraph position="0"> As shown in Table 1, performance is evaluated on two evaluation data sets, including an independent 200K-word hand-tagged French dataset provided by Université de Montréal, which is used to gauge stand-alone tagger performance. Signal amplification and noise reduction techniques yield a 71% error reduction, achieving a core tagset accuracy of 96%, closely approaching the upper-bound 97% performance of an equivalent bigram model trained directly on an 80% subset of the hand-tagged evaluation set (using 5-fold cross-validation). Thus robust training on 500K words of very noisy but automatically-derived tag projections can approach the performance obtained by fully supervised learning on 80K words of hand-tagged training data.</Paragraph> </Section> </Section> <Section position="7" start_page="0" end_page="0" type="metho"> <SectionTitle> 5. NOUN PHRASE BRACKETER INDUCTION </SectionTitle> <Paragraph position="0"> Our empirical studies show that there is a very strong tendency for noun phrases to cohere as a unit when translated between languages, even when undergoing significant internal re-ordering. This strong noun-phrase cohesion even tends to hold for relatively free word order languages such as Czech, where both native speakers and parallel corpus data indicate that nominal modifiers tend to remain in the same contiguous chunk as the nouns they modify. This property allows collective word alignments to serve as a reliable basis for bracket projection as well.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 5.1 BaseNP Projection Methodology </SectionTitle> <Paragraph position="0"> The projection process begins by automatically tagging and bracketing the English data, using Brill (1995) and Ramshaw & Marcus (1994), respectively.</Paragraph> <Paragraph position="1"> As illustrated in Figure 5, each word within an English noun phrase is then subscripted with the number of its NP in the sentence, and this subscript is projected onto the aligned French (or Chinese) words. In the most common case, the corresponding French/Chinese noun phrase is simply the maximal span of the projected subscript.</Paragraph>
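This common case reduces to taking, for each NP index, the maximal span of foreign positions that received that subscript; a minimal sketch (the data layout is our assumption, for illustration only) follows:

```python
# Sketch of baseNP projection: each English word carries the index of
# its NP; the index is projected across the word alignment, and the
# foreign NP is taken as the maximal span of each projected index.
def project_np_spans(np_ids, alignments):
    """np_ids[e_i] is the NP index of English word e_i (None outside
    any NP); alignments are (english_index, foreign_index) pairs.
    Returns {np_index: (start, end)} foreign spans, inclusive."""
    positions = {}                      # NP index -> projected foreign positions
    for e_i, f_i in alignments:
        np = np_ids[e_i]
        if np is not None:
            positions.setdefault(np, []).append(f_i)
    # the maximal span of the projected subscript tolerates internal re-ordering
    return {np: (min(ps), max(ps)) for np, ps in positions.items()}
```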
<Paragraph position="2"> Figure 6 shows some of the projection challenges encountered.</Paragraph> <Paragraph position="3"> Nearly all such cases of interwoven projected NPs are due to alignment errors, and a strong inductive bias towards NP cohesion was utilized to resolve these incompatible projections.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 5.2 BaseNP Training Algorithm </SectionTitle> <Paragraph position="0"> For stand-alone tool development, the Ramshaw & Marcus IOB bracketing framework and a fast transformation-based learning system (Ngai and Florian, 2001) were applied to the noisy baseNP-projected data described above.</Paragraph> <Paragraph position="1"> As with POS tagger induction, bracketer induction is improved by focusing training on the highest quality projected data and excluding regions with the strongest indications of word-alignment error. Thus sentences with the lowest 25% of Model-3 alignment scores were excluded from training, as were sentences where projected bracketings overlapped and conflicted (also an indicator of alignment errors); a sketch of this filtering is given at the end of this section. Data with lower-confidence POS tagging were not filtered, however, as this filtering reduces robustness when the stand-alone bracketers are applied to noisy tagger output. Additional details are provided in Yarowsky and Ngai (2001).</Paragraph> <Paragraph position="2"> Current efforts to further improve the quality of the training data include the use of iterative EM bootstrapping techniques. Separate projection of bracketings from aligned parallel data in a 3rd language also shows promise for providing independent supervision, which can further help distinguish consensus signal from noise.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 5.3 BaseNP Projection Evaluation </SectionTitle> <Paragraph position="0"> Because no bracketed evaluation data were available to us for French or Chinese, a third party fluent in these languages hand-bracketed a small, held-out 40-sentence evaluation set in both languages, using a set of bracketing conventions that they felt were appropriate for the languages. Table 2 shows the performance relative to these evaluation sets, as measured by exact-match bracketing precision (Pr), recall (R) and F-measure (F).</Paragraph> <Paragraph position="1"> It is important to note, however, that many decisions regarding BaseNP bracketing conventions are essentially arbitrary, and agreement rates between additional human judges on these data were measured at 64% and 80% for French and Chinese respectively.</Paragraph> <Paragraph position="2"> Since the translingual projections are essentially unsupervised and have no data on which to mimic arbitrary conventions, it is also reasonable to evaluate the degree to which the induced bracketings are deemed acceptable and consistent with the arbitrary gold standard (e.g. no crossing brackets). To this end, an additional pool of 3 judges was asked to further adjudicate the differences between the gold standard and the projection output, annotating such situations as either acceptable/compatible or unacceptable/incompatible.</Paragraph> <Paragraph position="3"> Overall, these translingual projection results are quite encouraging. For Chinese, they are similar to Wu's 78% precision result for translingual-grammar-based NP bracketing, and especially promising given that no word segmentation (only raw characters) was used. For French, the increase from 59% to 91% F-measure for the stand-alone induced bracketer shows that the training algorithm is able to generalize successfully from the noisy raw projection data, distilling a reasonably accurate (and transferable) model of baseNP structure from this high degree of noise.</Paragraph> </Section> </Section>
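The training-data filtering referenced in Section 5.2 can be sketched as follows; the quartile cutoff matches the description above, while the conflict test and data layout are our assumptions:

```python
# Sketch of training-data filtering for bracketer induction: drop the
# lowest-scoring 25% of sentences by Model-3 alignment score, and drop
# sentences whose projected NP spans overlap (a sign of alignment error).
def filter_training(sentences, score_quantile=0.25):
    """sentences: dicts with 'score' (sentence alignment score) and
    'spans' (list of projected (start, end) NP spans). Returns the kept subset."""
    cutoff = sorted(s["score"] for s in sentences)[
        int(len(sentences) * score_quantile)]
    kept = []
    for s in sentences:
        if s["score"] < cutoff:
            continue                    # low-confidence alignment region
        spans = sorted(s["spans"])
        # baseNPs do not nest, so any overlap between adjacent sorted
        # spans marks conflicting (interwoven) projections
        overlapping = any(b_start <= a_end
                          for (a_start, a_end), (b_start, b_end)
                          in zip(spans, spans[1:]))
        if not overlapping:
            kept.append(s)
    return kept
```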
<Section position="8" start_page="0" end_page="0" type="metho"> <SectionTitle> 6. NAMED ENTITY TAGGER INDUCTION </SectionTitle> <Paragraph position="0"> Multilingual named entity tagger induction is based on the extended combination of the part-of-speech and noun-phrase bracketing frameworks. The entity class tags used for this study were FNAME, LNAME, PLACE and OTHER (other entities, including organizations). They were derived from an anonymously donated MUC-6 named entity tagger applied to the English side of the French-English Canadian Hansards data.</Paragraph> <Paragraph position="1"> Initial classification proceeds on a per-word basis, using an aggressively smoothed transitive projection model similar to those described in Section 7. For a given second-language word FW and all English words EW_i aligned to it: P(class | FW) = Σ_i P(class | EW_i) × P(EW_i | FW).</Paragraph> <Paragraph position="3"> The co-training-based algorithm given in Cucerzan and Yarowsky (1999) was then used to train a stand-alone named entity tagger from the projected data. Seed words for this algorithm were those French words that were both POS-tagged as proper nouns and had an above-threshold entity-class confidence from the lexical projection models.</Paragraph> <Paragraph position="4"> Performance was measured in terms of per-word entity-type classification accuracy on the French Hansard test data, using the 4-class inventory listed above. Classification accuracy of raw tag projections was only 64% (based on automatic word alignment).</Paragraph> <Paragraph position="5"> In contrast, the stand-alone co-training-based tagger trained on the projections achieved 85% classification accuracy, illustrating its effectiveness at generalization in the face of projection noise. Notably, most of its observed errors can be traced to entity classification errors from the original English tagger. In fact, when evaluated on the English translation of the French test data set, the English tagger only achieved 86% classification accuracy on this directly comparable data set. It appears that the projection-induced French tagger achieves performance nearly as high as its original training source.</Paragraph> <Paragraph position="6"> Thus further improvements should be expected from higher quality English training sources.</Paragraph> </Section>
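A minimal rendering of this per-word transitive classification follows; the probability tables are assumed inputs, and the aggressive smoothing of the original model is omitted here:

```python
# Sketch of transitive named-entity projection: the class distribution
# of a foreign word is a mixture of the class distributions of the
# English words aligned to it, weighted by alignment probability.
CLASSES = ("FNAME", "LNAME", "PLACE", "OTHER")

def project_entity_class(fw, p_ew_given_fw, p_class_given_ew):
    """p_ew_given_fw[fw]: {english_word: P(ew|fw)};
    p_class_given_ew[ew]: {entity_class: P(class|ew)},
    e.g. derived from the English MUC-6 tagger's output."""
    scores = {c: 0.0 for c in CLASSES}
    for ew, p_align in p_ew_given_fw[fw].items():
        for c in CLASSES:
            scores[c] += p_class_given_ew.get(ew, {}).get(c, 0.0) * p_align
    return max(scores, key=scores.get)  # most probable of the 4 classes
```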
<Section position="9" start_page="0" end_page="0" type="metho"> <SectionTitle> 7. MORPHOLOGICAL ANALYSIS INDUCTION </SectionTitle> <Paragraph position="0"> Bilingual corpora can also serve as a very successful bridge for aligning complex inflected word forms in a new language with their root forms, even when the two surface forms are quite dissimilar or highly irregular.</Paragraph> [Figure 7: croyaient, croissant / croire, croître] <Paragraph position="1"> As illustrated in Figure 7, the association between a French verbal inflection (croyant) and its correct root (croire), rather than a similar competitor (croître), can be identified by a single-step transitive association via an English bridge word (believing). However, in the case of morphology induction, such direct associations are relatively rare, given that inflections in a second language tend to associate with similar tenses in English while singular/infinitive forms tend to associate with analogous singular/infinitive forms; thus croyaient (believed) and its root croire have no direct English link in our aligned corpus.</Paragraph> <Paragraph position="2"> However, Figure 2 (first page) illustrates that an existing investment in a lemmatizer for English can help bridge this gap by joining a multi-step transitive association croyaient → believed → believe → croire. Figure 8 illustrates how this transitive linkage via English lemmatization can potentially be utilized for all other English lemmas (such as THINK) with which croyaient and croire also associate, offering greater potential coverage and robustness via multiple bridges.</Paragraph> <Paragraph position="3"> Formally, these multiple transitive linkages can be modeled as shown below, by summing over all English lemmas (e_lemma) with which either a candidate foreign inflection (f_infl) or its root (f_root) exhibits an alignment in the parallel corpus: P_MProj(f_root | f_infl) = Σ_{e_lemma} P(f_root | e_lemma) × P(e_lemma | f_infl). This projection/bridge-based similarity measure P_MProj(f_root | f_infl) can be quite effective on its own, as shown in the MProj-only entries in Table 3 (for multiple parallel corpora in 3 different languages), especially when restricted to the highest-confidence subset of the vocabulary (5.2% to 77.9% in these data) for which the association exceeds simple fixed probability and frequency thresholds. When estimated using a 1.2 million word subset of the French Hansards, for example, the MProj measure alone achieves 98.5% precision on 32.7% of the inflected French verbs in the corpus (constituting 97.6% of the tokens in the corpus). Unlike traditional string-transduction-based morphology induction methods, where irregular verbs pose the greatest challenges, these typically high-frequency words are often the best-modelled data in the vocabulary, making these multilingual projection techniques a natural complement to existing models.</Paragraph>
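This bridge-based scoring is straightforward to express in code; in the minimal sketch below, the two conditional probability tables (estimated from the lemmatized, word-aligned corpus) are assumed inputs:

```python
# Sketch of the MProj bridge similarity: candidate (root, inflection)
# pairings are scored by summing over English bridge lemmas.
def mproj_score(f_infl, f_root, p_f_given_e, p_e_given_f):
    """P_MProj(f_root|f_infl) = sum_e P(f_root|e_lemma) * P(e_lemma|f_infl)."""
    bridges = set(p_e_given_f.get(f_infl, {})) | {
        e for e, targets in p_f_given_e.items() if f_root in targets}
    return sum(p_f_given_e.get(e, {}).get(f_root, 0.0) *
               p_e_given_f.get(f_infl, {}).get(e, 0.0)
               for e in bridges)

def best_root(f_infl, candidate_roots, p_f_given_e, p_e_given_f):
    # e.g. prefers croire over croître for croyaient when both BELIEVE
    # and THINK serve as bridge lemmas (cf. Figure 8)
    return max(candidate_roots, key=lambda r:
               mproj_score(f_infl, r, p_f_given_e, p_e_given_f))
```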
<Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 7.1 Trie-based Morphology Models </SectionTitle> <Paragraph position="0"> The high precision on the MProj-covered subset also makes these partial pairings effective training data for robust supervised algorithms that can generalize the string transformation behavior to the remaining uncovered vocabulary. While any supervised morphological analysis technique is applicable here, we employ a trie-based modeling technique in which the probability of a given stem change (from the inventory observed in the MProj-paired training data) is modeled hierarchically using variable suffix context, estimating P(inflection → root | h_i) for nested variable-length suffix histories h_i, as described in Yarowsky and Wicentowski (2000).</Paragraph> <Paragraph position="1"> An important property of the trie-based models is their effectiveness at clustering words that exhibit similar morphological behavior, both reducing model size and facilitating generalization to previously unseen examples. This property is illustrated in Figure 9, showing a sample (inflection → root) trie branch for French verbal inflections, with suffix histories h = 'oie', h = 'noie', h = 'roie', etc.</Paragraph> <Paragraph position="4"> Note that the relative probabilities of the competing analyses ie → ir and ie → yer differ substantially for different suffix histories, and that there are subexceptions that tend to cluster by affix history. This allows for the successful analysis of 8 of the 9 italicized test words that had not been seen in the bilingual projection data or for which the MProj model yielded no root candidate above threshold.</Paragraph> [Figure 9: (inflection → root) probabilities P(inflection → root | h_i) for variable-length suffix histories h_i. MTrie analyses on test data are given in italics.] <Paragraph position="6"> Table 3 illustrates the performance of a variety of morphology induction models. When using the projection-based MProj and trie-based MTrie models together (with the latter extending coverage to words that may not even appear in the parallel corpus), full verb lemmatization precision on the 1.2M-word Hansard subset exceeds 99.5% (by type) and 99.9% (by token), with 95.8% coverage by type and 99.8% coverage by token. A backoff model based on Levenshtein distance and distributional context similarity handles the relatively small percentage of cases where MProj and MTrie together are not sufficiently confident, bringing the system to 100% coverage with a small drop in precision to 97.9% (by type) and 99.8% (by token) on the unrestricted space of inflected verbs observed in the full French Hansards. As shown in Section 7.3, performance is strongly correlated with the size of the initial aligned bilingual corpus, with a larger Hansard subset of 12M words yielding 99.4% precision (by type) and 99.9% precision (by token). Performance on Czech is discussed in Section 7.3.</Paragraph> </Section>
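A much-simplified stand-in for this hierarchically smoothed trie is sketched below; the interpolation scheme is our assumption, and the actual model of Yarowsky and Wicentowski (2000) differs in its smoothing details:

```python
# Sketch of a suffix-history trie over stem changes: every suffix of a
# training inflection is credited with its observed (inflection -> root)
# stem change, and prediction interpolates from short to long histories.
from collections import defaultdict

class SuffixTrie:
    def __init__(self, alpha=0.7):
        self.counts = defaultdict(lambda: defaultdict(float))
        self.alpha = alpha              # weight of the longer history at each level

    def add(self, inflection, change):
        # credit every suffix history, e.g. for 'noie': '', 'e', 'ie', 'oie', 'noie'
        for k in range(len(inflection) + 1):
            self.counts[inflection[len(inflection) - k:]][change] += 1.0

    def prob(self, inflection, change):
        p = 0.0
        for k in range(len(inflection) + 1):
            hist = self.counts[inflection[len(inflection) - k:]]
            total = sum(hist.values())
            if total == 0:
                break                   # no training word shares this longer suffix
            p = (1 - self.alpha) * p + self.alpha * hist[change] / total
        return p

# Competing analyses such as ie -> ir vs. ie -> yer (cf. Figure 9) thus
# receive different probabilities under different suffix histories.
```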
<Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 7.2 Morphology Induction via Aligned Bibles </SectionTitle> <Paragraph position="0"> Performance using even small parallel corpora (e.g. a 120K-word subset of the French Hansards) still yields a respectable 93.2% (type) and 98.9% (token) precision on the verb-lemmatization test set for the full Hansards. Given that the Bible is actually larger (approximately 300K words, depending on version and language) and available on-line or via OCR for virtually all languages (Resnik et al., 2000), we also conducted several experiments on Bible-based morphology induction, further detailed in Table 3.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 7.2.1 Boosting Performance via Multiple Parallel Translations </SectionTitle> <Paragraph position="0"> Even though at most one translation of the Bible is typically available in a given foreign language, numerous English Bible versions are freely available, and a performance increase can be achieved by simultaneously utilizing alignments to each English version.</Paragraph> <Paragraph position="1"> As illustrated in Figure 10, different aligned Bible pairs may exhibit (or be missing) different full or partial bridge links for a given word (due both to different lexical usage and to poor textual parallelism in some text regions or version pairs). However, P(f_root | e_lemma) and P(e_lemma | f_infl) need not be estimated from the same Bible pair. Even if one has only one Bible in a given source language, each alignment with a distinct English version yields new bridging opportunities, with no additional resources needed on the source-language side. The baseline approach (evaluated here, and sketched at the end of Section 7.2) is simply to concatenate the different aligned versions together. While word-pair instances translated the same way in each version will be repeated, this rather reasonably reflects the increased confidence in this particular alignment. An alternate model would weight version pairs differently based on otherwise-measured translation faithfulness and alignment quality between the version pairs; doing so would help decrease noise. Increasing from 1 to 3 English versions reduces the type error rate (at full coverage) by 22% on French and</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 7.2.2 Bridge Languages </SectionTitle> <Paragraph position="0"> Once lemmatization capabilities have been successfully projected to a new language (such as French), this language can then serve as an additional bridging source for morphology induction in a third language (such as Spanish), as illustrated in Figure 11. This can be particularly effective if the two languages are very similar (as in Spanish-French) or if their available Bible versions are close translations of a common source (e.g. the Latin Vulgate Bible). As shown in Table 3, using the previously analyzed French Bible as a bridge for Spanish achieves performance (97.4% precision) comparable to the use of 3 parallel English Bible versions.</Paragraph> </Section>
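The concatenation baseline of Section 7.2.1 amounts to pooling bridge-link counts across version pairs; a minimal sketch (the count representation and the optional per-version weights are assumptions) follows:

```python
# Sketch of pooling bridge links across multiple aligned version pairs:
# the concatenation baseline sums raw counts, so links attested in
# several versions accrue extra confidence; unequal weights would
# implement the alternate faithfulness-weighted model.
from collections import defaultdict

def pool_alignment_counts(version_counts, weights=None):
    """version_counts: one {(foreign_word, english_lemma): count} dict
    per aligned version pair. Returns the pooled count table."""
    pooled = defaultdict(float)
    for i, counts in enumerate(version_counts):
        w = 1.0 if weights is None else weights[i]  # concatenation: w = 1
        for pair, n in counts.items():
            pooled[pair] += w * n
    return pooled
```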
<Section position="5" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 7.3 Morphology Induction: Observations </SectionTitle> <Paragraph position="0"> This section includes additional detail regarding the morphology induction experiments, supplementing the previous details and analyses given in Section 7 and Table 3.</Paragraph> <Paragraph position="1"> • Induction performance using the French Bible as the bridge source is evaluated on the full test verb set extracted from the French Hansards. The strong performance when trained only on the Bible illustrates that even a small single text in a very different genre can provide effective transfer to modern (conversational) French. While the observed genre- and topic-sensitive vocabulary differs substantially between the Bible and the Hansards, the observed inventories of stem changes and suffixation actually have large overlap, as do the sets of observed high-frequency irregular verbs. Thus the inventory of morphological phenomena seems to translate better across genre than do lexical choice and collocation models.</Paragraph> <Paragraph position="2"> • Over 60% of errors are due to gaps in the candidate root lists. Currently the candidate root lists are derived automatically by applying the projected POS models and selecting any word whose probability of being an uninflected verb exceeds a generous threshold and which also ends in a canonical verb suffix. False positives are easily tolerated (less than 5% of errors are due to spurious non-root competitors), but with missing roots the algorithms are forced either to propose previously unseen roots or to align to the closest previously observed root candidate. Thus, while no non-English dictionary was used in the computation of these results, a dictionary-derived inventory of potential roots would substantially improve performance, increasing coverage and decreasing noise from competing non-roots and spelling errors.</Paragraph> <Paragraph position="3"> • Performance in all languages has been significantly hindered by low-accuracy parallel-corpus word alignments using the original Model-3 GIZA tools. Use of Och and Ney's recently released and enhanced GIZA++ word-alignment models (Och and Ney, 2000) should improve performance for all of the applications studied in this paper, as would iterative realignment using richer alignment features (including lemma and part-of-speech) derived from this research.</Paragraph> <Paragraph position="4"> • The current somewhat lower performance on Czech is due to several factors, including (a) very low-accuracy initial word alignments, due to the often non-parallel translations of the Reader's Digest sample and the failure of the initial word-alignment models to handle the highly inflected Czech morphology; (b) the small size of the Czech parallel corpus (less than twice the length of the Bible); and (c) the common occurrence in Czech of two very similar perfective and non-perfective root variants (e.g. odolávat and odolat, both of which mean to resist). A simple monolingual dictionary-derived list of canonical roots would resolve ambiguity regarding which is the appropriate target.</Paragraph> <Paragraph position="5"> • Many of the errors are due to all (or most) inflections of a single verb mapping to the same incorrect root. But for many applications where the function of lemmatization is to cluster equivalent words (e.g. stemming for information retrieval), the choice of label for the lemma is less important than correctly linking the members of the lemma.</Paragraph> <Paragraph position="6"> • Given that large quantities of parallel text currently exist in translation bureau archives and OCR-able books, not to mention the increasing online availability of bitext on the web, the natural growth of available bitext quantities should continue to support performance improvement.</Paragraph> <Paragraph position="7"> • The system analysis examples shown in Table 4 are representative of model performance and were selected to illustrate the range of encountered phenomena. All system evaluation is based on the task of selecting the correct root for a given inflection (for which there is a long lexicography-based consensus regarding the "truth"). In contrast, the descriptive analysis of any such pairing is very theory-dependent, without standard consensus. The "TopBridge" column shows the strongest English bridge lemma utilized in the mapping (typically one of many potential bridge lemmas).</Paragraph> <Paragraph position="8"> These results are quite impressive in that they are based on essentially no language-specific knowledge of French, Spanish or Czech. In addition, the multilingual bridge algorithm is surface-form independent, and can just as readily handle obscure infixational or reduplicative morphological processes.</Paragraph> </Section> </Section> </Paper>