<?xml version="1.0" standalone="yes"?> <Paper uid="C96-2208"> <Title>The Automatic Extraction of Open Compounds from Text Corpora</Title> <Section position="7" start_page="1145" end_page="1145" type="concl"> <SectionTitle> 5 Conclusion </SectionTitle> <Paragraph position="0"> This paper has presented an algorithm for data preparation and open compound extraction. The competitive selection and unified selection of rightward and leftward sorted strings play an important role in improving the accuracy of the extraction. In the experiment, we applied Thai spelling rules to restrict the search path for string counts. This process can exclude some types of spelling irregularities. By adjusting the threshold value, we can extract suitable entries for open compound registration regardless of the size of the input file. Furthermore, our method ensures the extraction of new words from text in a language that has no explicit word boundaries, such as Thai.</Paragraph> </Section> </Paper>