File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/00/w00-0717_intro.xml
Size: 1,197 bytes
Last Modified: 2025-10-06 14:01:01
<?xml version="1.0" standalone="yes"?> <Paper uid="W00-0717"> <Title>Inducing Syntactic Categories by Context Distribution Clustering</Title> <Section position="4" start_page="0" end_page="0" type="intro"> <SectionTitle> 2 Previous Work </SectionTitle> <Paragraph position="0"> Previous work falls into two categories. A number of researchers have obtained good results using pattern recognition techniques. Finch and Chater (1992, 1995) and Schütze (1993, 1997) use a set of features derived from the co-occurrence statistics of common words, together with standard clustering and information extraction techniques. For sufficiently frequent words, this method produces satisfactory results.</Paragraph> <Paragraph position="1"> Brown et al. (1992) use a very large amount of data and a well-founded information-theoretic model to induce large numbers of plausible semantic and syntactic clusters. Both approaches share two flaws: they cannot deal well with ambiguity, though Schütze partially addresses this issue, and they do not cope well with rare words. Since rare and ambiguous words are very common in natural language, these limitations are serious.</Paragraph> </Section> </Paper>