<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-0410">
  <Title>Semi-supervised Verb Class Discovery Using Noisy Features</Title>
  <Section position="8" start_page="0" end_page="0" type="concl">
    <SectionTitle>
7 Conclusions and Future Work
</SectionTitle>
    <Paragraph position="0"> We have explored manual, unsupervised, and semi-supervised methods for feature selection in a clustering approach for verb class discovery. We find that manual selection of a subset of features based on the known classification performs better than using a full set of noisy features, demonstrating the potential benefit of feature selection in our task. An unsupervised method we tried (Dash et al., 1997) did not prove useful, because of the problem of having no consistent threshold for feature inclusion. We instead proposed a semi-supervised method in which a seed set of verbs is chosen for training a supervised classifier, from which the useful features are extracted for use in clustering. We showed that this feature set outperformed both the full and the manually selected sets of features on all three of our clustering evaluation metrics. Furthermore, the method is relatively insensitive to the precise make-up of the selected seed set.</Paragraph>
    <Paragraph position="1"> As successful as our seed set of features is, it still does not achieve the accuracy of a supervised learner. More research is needed on the definition of the general feature space, as well as on the methods for selecting a more useful set of features for clustering. Furthermore, we might question the clustering approach itself, in the context of verb class discovery. Rather than trying to separate a set of new verbs into coherent clusters, we suggest that it may be useful to perform a nearest-neighbour type of classification using a seed set, asking for each new verb &quot;is it like these or not?&quot; In some ways our current clustering task is too easy, because all of the verbs are from one of the target classes. In other ways, however, it is too difficult: the learner has to distinguish multiple classes, rather than focus on the important properties of a single class. Our next step is to explore these issues, and investigate other methods appropriate to the practical problem of grouping verbs in a new language.</Paragraph>
  </Section>
</Paper>