<?xml version="1.0" standalone="yes"?>
<Paper uid="A00-2031">
<Title>Assigning Function Tags to Parsed Text*</Title>
<Section position="4" start_page="236" end_page="236" type="metho">
<SectionTitle>3 Experiment</SectionTitle>
<Paragraph position="0">In the training phase of our experiment, we gathered statistics on the occurrence of function tags in sections 2-21 of the Penn treebank.</Paragraph>
<Paragraph position="1">Specifically, for every constituent in the treebank, we recorded the presence of its function tags (or lack thereof) along with its conditioning information. From this we calculated the empirical probabilities of each function tag referenced in section 2 of this paper. Values of λ were determined using EM on the development corpus (treebank section 24).</Paragraph>
<Paragraph position="2">To test, then, we simply took the output of our parser on the test corpus (treebank section 23) and applied a postprocessing step to add function tags. For each constituent in the tree, we calculated the likelihood of each function tag according to the feature tree in figure 4, and for each category (see figure 2) we assigned the most likely function tag (which might be the null tag).</Paragraph>
<Paragraph position="3">Footnote 2: The reader will note that the 'features' listed in the tree are in fact not boolean-valued; each node in the given tree can be assumed to stand for a chain of boolean features, one per potential value at that node, exactly one of which will be true.</Paragraph>
</Section>
<Section position="5" start_page="236" end_page="237" type="metho">
<SectionTitle>4 Evaluation</SectionTitle>
<Paragraph position="0">To evaluate our results, we first need to determine what is 'correct'. The definition we chose is to call a constituent correct if there exists in the correct parse a constituent with the same start and end points, label, and function tag (or lack thereof). Since we treated each of the four function tag categories as a separate feature for the purpose of tagging, evaluation was also done on a per-category basis.</Paragraph>
<Paragraph position="1">The denominator of the accuracy measure should be the maximum possible number we could get correct. In this case, that means excluding those constituents that were already wrong in the parser output; the parser we used attains 89% labelled precision-recall, so roughly 11% of the constituents are excluded from the function tag accuracy evaluation. (For reference, we have also included the performance of our function tagger directly on treebank parses; the slight gain that resulted is discussed below.) Another consideration is whether to count non-tagged constituents in our evaluation. On the one hand, we could count as correct any constituent with the correct tag as well as any correctly non-tagged constituent, and use as our denominator the number of all correctly-labelled constituents. (We will henceforth refer to this as the 'with-null' measure.) On the other hand, we could count only constituents with the correct tag, and use as our denominator the total number of tagged, correctly-labelled constituents. We believe the latter number ('no-null') to be a better performance metric, as it is not overwhelmed by the large number of untagged constituents. Both are reported below.</Paragraph>
</Section>
</Paper>
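<!--
A minimal Python sketch of the counting step described in section 3: tally each
function tag (or its absence) against every prefix of its conditioning history,
then normalize to empirical probabilities. The data layout and the names
`constituents` and `num_features` are assumptions for illustration, not from
the paper.

from collections import defaultdict

def train_tag_model(constituents, num_features):
    """constituents: iterable of (features, tag) pairs, where `features` is a
    tuple of conditioning values ordered as in the feature tree and `tag` is
    a function tag or None for an untagged constituent (layout assumed)."""
    # counts[i] maps a history prefix of length i to counts over tags.
    counts = [defaultdict(lambda: defaultdict(int))
              for _ in range(num_features + 1)]
    for features, tag in constituents:
        for i in range(num_features + 1):
            counts[i][features[:i]][tag] += 1
    # Normalize each history's counts into an empirical tag distribution.
    probs = []
    for level in counts:
        probs.append({h: {t: c / sum(tc.values()) for t, c in tc.items()}
                      for h, tc in level.items()})
    return probs
-->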
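<!--
The λ values are interpolation weights estimated by EM on held-out data
(treebank section 24 in the paper). Below is a sketch of one standard way to
do this: EM re-estimation of a fixed weight per history length, assuming the
`probs` structure from the sketch above. This is a simplification; the paper
does not spell out the exact parameterization of the weights.

def estimate_lambdas(dev_data, probs, iters=20):
    """Re-estimate mixture weights over the interpolation levels by EM."""
    n = len(probs)
    lam = [1.0 / n] * n
    for _ in range(iters):
        expected = [0.0] * n
        for features, tag in dev_data:
            # Posterior responsibility of each level for this observation.
            contrib = [lam[i] * probs[i].get(features[:i], {}).get(tag, 0.0)
                       for i in range(n)]
            z = sum(contrib)
            if z > 0.0:
                for i in range(n):
                    expected[i] += contrib[i] / z
        total = sum(expected)
        lam = [e / total for e in expected]
    return lam
-->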
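<!--
At test time, each parsed constituent receives, for each of the four tag
categories, its most likely tag, which may be the null tag. A sketch of that
postprocessing step, assuming the `probs` and `lam` structures above; the
`node.features` attribute and `tags_by_category` mapping are hypothetical
interfaces, not from the paper.

def interpolated_score(features, tag, probs, lam):
    """Interpolated probability of `tag` given the conditioning history."""
    return sum(lam[i] * probs[i].get(features[:i], {}).get(tag, 0.0)
               for i in range(len(lam)))

def assign_function_tags(nodes, probs, lam, tags_by_category):
    """Attach the highest-scoring tag per category; None is the null tag."""
    for node in nodes:
        node.function_tags = []
        for category, tags in tags_by_category.items():
            best = max(tags + [None],
                       key=lambda t: interpolated_score(node.features, t,
                                                        probs, lam))
            if best is not None:
                node.function_tags.append(best)
-->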
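<!--
The two accuracy measures of section 4 can be made concrete as follows. The
sketch assumes gold and predicted tags are keyed by (start, end, label,
category) and that `predicted` contains only constituents whose span and label
the parser got right, mirroring the exclusion described above; reading
'tagged' as gold-tagged for the no-null denominator is our assumption.

def tagging_accuracy(gold, predicted):
    """Return the 'with-null' and 'no-null' accuracy measures."""
    wn_correct = wn_total = nn_correct = nn_total = 0
    for key, gold_tag in gold.items():
        if key not in predicted:
            continue  # constituent already wrong in the parser output
        hit = predicted[key] == gold_tag
        wn_total += 1
        wn_correct += hit
        if gold_tag is not None:  # only gold-tagged constituents count here
            nn_total += 1
            nn_correct += hit
    with_null = wn_correct / wn_total if wn_total else 0.0
    no_null = nn_correct / nn_total if nn_total else 0.0
    return with_null, no_null
-->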