<?xml version="1.0" standalone="yes"?>
<Paper uid="E06-1047">
  <Title>Parsing Arabic Dialects</Title>
  <Section position="9" start_page="374" end_page="375" type="concl">
    <SectionTitle>
8 Summary of Results and Discussion
</SectionTitle>
    <Paragraph position="0"> We have built three frameworks for leveraging MSA corpora and explicit knowledge about the lexical, morphological, and syntactic differences between MSA and LA for parsing LA. The results on TEST are summarized in Table 7, where performance is given as absolute and relative reduction in labeled F-measure error (i.e., 100 - F). We see that some important improvements in parsing quality can be achieved. We also remind the reader that on the ATB, state-of-the-art performance is currently about 75% F-measure. [Table 7: error reduction in F-measure over baseline (using MSA parser on LA test corpus); all numbers are for best obtained results using that method.]</Paragraph>
    <Paragraph position="1"> There are several important ways in which we can expand our work. For the sentence-transduction approach, we plan to explore the use of a larger set of permutations; to use improved language models on MSA (such as language models built on genres closer to speech); to use lattice parsing (Sima'an, 2000) directly on the translation lattice; and to integrate this approach with the treebank transduction approach. For the treebank and grammar transduction approaches, we would like to explore more systematic syntactic, morphological, and lexico-syntactic transformations. We would also like to explore the feasibility of inducing the syntactic and morphological transformations automatically. Specifically for the treebank transduction approach, it would be interesting to apply an LA language model for the lexical substitution phase as a means of pruning out implausible word sequences.</Paragraph>
    <Paragraph position="2"> For all three approaches, one major impediment to obtaining better results is the disparity in genre and domain, which affects the overall performance.</Paragraph>
    <Paragraph position="3"> This may be bridged by finding MSA data that is more in the domain of the LA test corpus than the MSA treebank.</Paragraph>
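    <Paragraph position="4"> The error-reduction measure used for Table 7 can be illustrated with a minimal sketch. It assumes that the relative reduction is the absolute reduction divided by the baseline error, with error defined as 100 - F as stated above. The function name error_reduction and the two input scores are hypothetical and are not taken from Table 7.

```python
def error_reduction(baseline_f, system_f):
    """Return (absolute, relative) reduction in F-measure error.

    Error is defined as 100 - F. The relative reduction is assumed to be
    the absolute reduction expressed as a percentage of the baseline error.
    """
    baseline_error = 100.0 - baseline_f
    system_error = 100.0 - system_f
    absolute = baseline_error - system_error      # in F-measure points
    relative = 100.0 * absolute / baseline_error  # percent of baseline error
    return absolute, relative


# Hypothetical scores for illustration only; they are not results from the paper.
absolute, relative = error_reduction(baseline_f=50.0, system_f=60.0)
print(f"absolute: {absolute:.1f} points, relative: {relative:.1f}%")
# prints "absolute: 10.0 points, relative: 20.0%"
```

With these made-up inputs, a 10-point gain in F-measure removes one fifth of the baseline error, which is why the same absolute gain counts for more when the baseline error is small.</Paragraph>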
  </Section>
</Paper>