<?xml version="1.0" standalone="yes"?> <Paper uid="W03-0431"> <Title>Meta-Learning Orthographic and Contextual Models for Language Independent Named Entity Recognition</Title> <Section position="6" start_page="0" end_page="0" type="evalu"> <SectionTitle> 5 Results </SectionTitle> <Paragraph position="0"> In several instances, n-grams captured information about structural phenomena. For example, the bi-gram 'ae' occurred in an entity in approximately 96% of instances in the English training set and 91% in the German set, showing that the compulsion to not assimilate old forms of names such as 'Israel' and 'Michael' to something like 'Israil' and 'Michal' is more emergent than the constraint to maintain form. An example of a bi-gram indicating a word from a foreign language with a different phonology is 'cz', representing the voiced palatal fricative, which is not commonly used in English or German. The fact that two characters were needed to represent one underlying phoneme in itself suggests this. Within English, the suffix 'gg' always indicates a named entity, with the exception of the word 'egg', which has retained both g's in accordance with the English constraint that content words be three or more letters long. All other words with an etymological history of a 'gg' suffix, such as 'beg', have assimilated to the shorter form.</Paragraph> <Paragraph position="1"> The meta-learning strategy improved the German test set Fβ=1 score by 9.06 over a vote across the classifiers.</Paragraph> <Paragraph position="2"> For the English test set, this improvement was 0.40.</Paragraph> </Section> </Paper>