<?xml version="1.0" standalone="yes"?> <Paper uid="W97-0116"> <Title>Acquiring German Prepositional Subcategorization Frames from Corpora</Title> <Section position="5" start_page="163" end_page="164" type="relat"> <SectionTitle> 4 Related Work </SectionTitle> <Paragraph position="0"> The automatic extraction of English subcategorization frames has been considered in (Brent, 1991; Brent, 1993), where a procedure is presented that takes untagged text as input and generates a list of verbal subcategorization frames. The procedure uses a very simple heuristic to identify verbs; the syntactic types of nearby phrases are identified by relying on local morpho-syntactic cues. Once potential verbs and SFs are identified, a final component attempts to determine when a lexical form occurs with a cue often enough that it is unlikely to be due to errors; an automatically computed error rate is used to filter out potentially erroneous cues. Prepositional frames are not considered, since, according to the author, &quot;it is not clear how a machine learning system could do this [determine which PPs are arguments and which are adjuncts].&quot; In (Manning, 1991) another method is introduced for producing a dictionary of English verbal subcategorization frames. This method makes use of a stochastic tagger to determine part of speech, and a finite state parser which runs on the output of the tagger, identifying auxiliary sequences, noting putative complements after verbs and collecting histogram-type frequencies of possible SFs. The final component assesses the frames encountered by the parser by using the same model as (Brent, 1993), with the error rate set empirically. 
Prepositional verbal frames are learned by the system by relying on PPs as cues for subcategorization; since the system cannot differentiate between complement and adjunct prepositional cues, it learns frequent prepositional adjuncts as well.</Paragraph> <Paragraph position="1"> In order to evaluate the acquired dictionary, Manning compares the frames obtained for 40 random verbs to those in a published dictionary, yielding for these verbs overall precision and recall rates of 90% and 43%, respectively. However, if only the prepositional frames listed for these verbs are considered, the rates drop to approximately 84% and 25%, respectively.</Paragraph> <Paragraph position="2"> In the experiment described, the error bounds for the filtering procedure were chosen with the aim of &quot;get[ting] a highly accurate dictionary at the expense of recall.&quot; His system did not consider nominal and adjectival frames. (Carrol and Rooth, 1997) present a learning procedure for English subcategorization information. Unlike previous approaches, it is based on a probabilistic context free grammar. The system uses expected frequencies of head words and frames--calculated using a hand-written grammar and occurrences in a text corpus--to iteratively estimate probability parameters for a PCFG using the expectation maximization algorithm. These parameters are used to characterize verbal, nominal and adjectival SFs. The model does not distinguish between complement and adjunct prepositional cues.</Paragraph> </Section> </Paper>