<?xml version="1.0" standalone="yes"?>
<Paper uid="W94-0102">
  <Title>AMALGAM: Automatic Mapping Among Lexico-Grammatical Annotation Models</Title>
  <Section position="3" start_page="16" end_page="16" type="intro">
    <SectionTitle>
Anticipated Results
</SectionTitle>
    <Paragraph position="0"> The tangible 'deliverables' of use to the Speech and Language research community include: • Final implementations of algorithms for mapping between pairs of tagsets • Final implementations of algorithms for mapping be- The mapping software, Multi-Tagged Corpus and Multi-Treebank (along with postediting handbooks and documentation) will be delivered to ICAME and the Oxford Text Archive for public distribution; they will also be available for incorporation into the SEC Speech Database. Reports on the findings of the three stages of investigation will be made widely available to all interested parties through SALT and ELSNET (the UK and European Networks of Excellence) and through other channels, including conference presentations and journal papers. The implemented mapping algorithms will be made widely available to the UK and international speech and language research community. They will allow research groups that use corpus-based training data to make use of other corpora straightforwardly, without substantial modifications. All current and future users of corpora will thus have a much expanded resource.</Paragraph>
    <Paragraph position="1"> The Multi-Tagged Corpus and the Multi-Treebank will be distributed, along with the main Spoken English Corpus, through ICAME. They will also be available for incorporation into the SEC Speech Database currently being created by Gerry Knowles and Peter Roach, further enhancing the SEC as a general research resource. Both the Multi-Treebank and the Multi-Tagged Corpus will potentially be used by speech and language technology groups for many research and teaching purposes, including: training data for speech recognisers, optical text recognisers, word processor text-critiquing systems, machine translation systems, natural language interfaces, and NLP applications generally; and for providing examples for English Language Teaching (ELT) grammar textbooks and training material. In addition, the Multi-Treebank may be used as a testbed and benchmark for parsers (explored in the workplan). It would also be a rich resource for grammar-learning experiments, a research topic of growing interest (see e.g. [8], [11], [16], [33]). We envisage supplying the computational linguistics research community with a valuable research resource, and the ACL Workshop will be an invaluable opportunity for us to survey potential customer requirements and preferences.</Paragraph>
  </Section>
</Paper>