<?xml version="1.0" standalone="yes"?> <Paper uid="P98-1048"> <Title>Sylvain_Delisle @uqtr.uquebec.ca</Title> <Section position="2" start_page="0" end_page="307" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Any language processing program (in our case, a top-down parser that outputs only the first tree it finds) must decide which processing strategy, or rule ordering, is most appropriate for the problem (i.e. the string) at hand.</Paragraph> <Paragraph position="1"> Given the size and intricacy of the rule base and the goal (to optimise a parser's precision, or recall, or even its speed), this becomes a complex decision problem. Without precise knowledge of the kinds of texts that will be processed, these decisions can at best be educated guesses. In the parser we used, they were made with the help of hand-crafted heuristic rules, which are briefly presented in section 2.</Paragraph> <Paragraph position="2"> Even when the texts are available to fine-tune the parser, it is not obvious how these decisions are to be made from texts alone. Indeed, the decisions are often expressed as rules stated in terms that are not directly or easily available from the text (e.g. non-terminals of the grammar of the language in which the texts are written). Hence, any technique that can automatically or semi-automatically adapt such rules to the corpus at hand will be valuable. As is often the case, there may be a linguistic shift in the kinds of texts that are processed, especially if the linguistic task is as general as parsing. It is then worthwhile to adapt the &quot;version&quot; of the parser to the corpus at hand.</Paragraph> <Paragraph position="3"> We report on an experiment that targets this kind of adaptability. We use machine learning as the artificial intelligence technique for achieving adaptability.
We cast the task described above as a classification task: which of the parser's top-level rules is most appropriate for launching the parse of the current input string? Although we restricted ourselves to a subset of a parser, our objective is broader than just applying an existing learning system to this problem. What is interesting is: a) the definition of the attributes in terms of which examples are described, so that the attributes are both obtainable automatically from the text and lead to good rules (this is called &quot;feature engineering&quot;); b) the selection of the most interesting learned rules; c) the incorporation of the learned rules into the parser; d) the evaluation of the performance of the learned rules after they have been incorporated into the parser. We report here the lessons learned from following this whole cycle, and we suggest the cycle as a methodology for the adaptive optimisation of language processing programs.</Paragraph> </Section> </Paper>