File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/04/w04-3229_concl.xml

Size: 2,488 bytes

Last Modified: 2025-10-06 13:54:31

<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-3229">
  <Title>A Resource-light Approach to Russian Morphology: Tagging Russian using Czech resources</Title>
  <Section position="7" start_page="0" end_page="0" type="concl">
    <SectionTitle>
6 Ongoing Research
</SectionTitle>
    <Paragraph position="0"> We are currently working on improving both morphological analysis and tagging. We would like to improve the recall of the filters that follow morphological analysis, e.g., by using the n maximal values instead of only one, or by using some basic knowledge of derivational morphology. We are incorporating phonological conditions on stems into the guesser module and are trying to handle morphological phenomena specific to Russian, e.g., verb reflexivization. However, we try to stay as language independent as possible (at least within the Slavic languages) and to keep the language-dependent components to a minimum.</Paragraph>
    <Paragraph position="1"> Currently, we are working on more sophisticated russifications that would still be easily portable to other languages. For example, instead of omitting auxiliaries randomly, we want to use the syntactic information present in the Prague Dependency Treebank to omit only the 'right' ones.</Paragraph>
    <Paragraph position="2"> If possible, we would like to avoid throwing away the Czech emission probabilities entirely, because our intuition tells us that there are useful lexical similarities between Russian and Czech, and that some suitable process of cognate detection will allow us to transfer information from the Czech to the Russian emission probabilities. Just as a knowledge of English words is sometimes helpful (modulo sound changes) when reading German, a knowledge of the Czech lexicon should be helpful (modulo character set issues) when reading Russian. We are seeking the right way to operationalize this intuition in our system, bearing in mind that we want an algorithm general enough to make the method portable to other languages, for which we assume we have neither the time nor the expertise to undertake knowledge-intensive work. A potentially suitable cognate algorithm is described by Kondrak (2001).</Paragraph>
    <Paragraph position="3"> Finally, we would like to extend our work to Slavic languages for which even fewer resources are available than for Russian, such as Belarusian, since this was the original motivation for undertaking the work.</Paragraph>
  </Section>
</Paper>