<?xml version="1.0" standalone="yes"?>
<Paper uid="W98-1112">
  <Title>Categories \] % Alignable PN Precision Recall Person \]LO0% Place Organisation Law Title Publication</Title>
  <Section position="4" start_page="106" end_page="106" type="metho">
    <SectionTitle>
5 Discussion
</SectionTitle>
    <Paragraph position="0"> We have not yet calculated how many of the detected collocations are included in the glossary, although it has become clear that a high proportion of them were not considered by the translators who created the dictionary. The translators tend to include only collocations that have a clear terminological appearance. It is hard to discriminate between general-language collocations and domain-specific terminology, and this discussion is beyond the scope of this paper.</Paragraph>
    <Paragraph position="1"> The correspondence table of Spanish and Basque grammatical patterns is at present problematic. This is due to the lack of morphological information in the output of the Basque lemmatizer. Basque is an agglutinative language in which postpositions and other functional elements are added as suffixes. The information such suffixes provide is not shown by the lemmatizer, and this inevitably hinders the efficiency of the correspondence table. However, we are confident that the outputs of future versions of the Basque and Spanish lemmatizers will converge, because both are currently being developed within the same project team. When their output becomes more homogeneous, the efficiency of the correspondence table will be greatly increased.</Paragraph>
  </Section>
</Paper>