<?xml version="1.0" standalone="yes"?>
<Paper uid="A94-1025">
<Title>A robust category guesser for Dutch medical language</Title>
<Section position="10" start_page="153" end_page="153" type="concl">
<SectionTitle> 9 Discussion </SectionTitle>
<Paragraph position="0"> A choice had to be made between keeping more potential analyses, which makes it likely that the correct one is among them, and restricting the cohort to a single analysis (or a limited set of analyses), which may be incorrect. As a general strategy, we prefer to restrict the search space as early as possible at all levels of the language understanding system. Otherwise, useless hypotheses will be propagated through the whole system, causing a combinatorial explosion. However, this attitude can lead to the rejection of valid solutions and, in the worst case, can cause a complete failure of the language understanding system.</Paragraph>
<Paragraph position="1"> A possible optimisation is to store the medical suffixes and endstrings in their inflected forms. They could be integrated into the existing full-form dictionary. To accelerate the decomposition phases, the morphemes or strings could be stored in reversed order.</Paragraph>
<Paragraph position="2"> These reorganisations of the data structures also influence the high-level algorithm (cf. section 7).</Paragraph>
<Paragraph position="3"> Footnote 7: Sometimes the surface form alone does not permit an unequivocal categorization (for instance, in principle, a Dutch noun is formally identical to the first person singular present of a regular verb).</Paragraph>
<Paragraph position="4"> Since all the words, suffixes and endstrings would be stored in the database as full forms, the inflectional analyser (cf. section 3.2) would be needed merely for the computation of a hypothetical canonical form and its syntactic characteristics when applying the catch-all rule.
This would undoubtedly make the category guesser as a whole run faster.</Paragraph>
<Paragraph position="5"> As a corollary, the architecture of the entire component becomes simpler and more homogeneous.</Paragraph>
</Section>
</Paper>