<?xml version="1.0" standalone="yes"?> <Paper uid="P99-1009"> <Title>Man* vs. Machine: A Case Study in Base Noun Phrase Learning</Title> <Section position="6" start_page="69" end_page="70" type="concl"> <SectionTitle> 6 Conclusions and Future Work </SectionTitle> <Paragraph position="0"> In this paper we have described research we undertook to ascertain how well people can perform compared to a machine at learning linguistic information from an annotated corpus, and, more importantly, to begin to explore the differences in learning behavior between human and machine. Although people did not match the performance of the machine-learned annotator, it is interesting that these &quot;language novices&quot;, with almost no training, were able to come fairly close, learning a small number of powerful rules in a short amount of time on a small training set. This challenges the claim that machine learning offers portability advantages over manual rule writing, given that relatively unmotivated people can nearly match the best machine performance on this task in so little time, at a labor cost of approximately US$40.</Paragraph> <Paragraph position="1"> We plan to take this work in a number of directions. First, we will further explore whether people can meet or beat the machine's accuracy at this task. We have identified one major weakness of human rule writers: capturing information about low-frequency events. It is possible that by providing the person with sufficiently powerful corpus analysis tools to aid in rule writing, we could overcome this problem.</Paragraph> <Paragraph position="2"> We ran all of our human experiments on a fixed training corpus size. It would be interesting to compare how human performance varies as a function of training corpus size with how machine performance varies.</Paragraph> <Paragraph position="3"> There are many ways to combine human corpus-based knowledge extraction with machine learning. 
One possibility would be to combine the human and machine outputs. Another would be to have the human start with the output of the machine and then learn rules to correct the machine's mistakes. We could also have a hybrid system where the person writes rules with the help of machine learning. For instance, the machine could propose a set of rules and the person could choose the best one. We hope that by further studying both human and machine knowledge acquisition from corpora, we can devise learning strategies that successfully combine the two approaches, and by doing so, further improve our ability to extract useful linguistic information from online resources.</Paragraph> </Section> </Paper>