<?xml version="1.0" standalone="yes"?> <Paper uid="C90-1005"> <Title>Tagging for Learning: Collecting Thematic Relations from Corpus</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Two text processing problems rely heavily on co-occurrence patterns: the way that words appear together, possibly idiosyncratically. First, statistically weighted co-occurrence information can assist in the &quot;bracketing&quot; of noun groups, which can otherwise lead to a combinatoric explosion of parse trees [1].</Paragraph> <Paragraph position="1"> Second, co-occurrence relations can provide evidence of semantic information for thematic-role assignment, an important task that is otherwise fraught with inaccuracy. Only co-occurrence patterns collected over a corpus can help to determine which is the object and which is the recipient in PAID DIVIDEND (IS SECURE) vs. PAID SHAREHOLDERS (ARE SATISFIED). A sufficiently rich lexicon would include the semantic preferences for distinguishing these thematic roles, but such a lexicon does not yet exist.</Paragraph> <Paragraph position="2"> Co-occurrence patterns are a means of probing a global corpus for clues that help resolve ambiguity at the local sentence level. Patterns such as PAID TO SHAREHOLDERS and PAID THEM THE DIVIDEND are detected in the corpus at large. Through these latter examples, in which the distinction between recipient and object relative to the dative verb PAY is made explicit, the former cases, in which the relation is implicit, can be resolved.</Paragraph> <Paragraph position="3"> In contrast to previous work, which addressed the identification of surface relations, i.e., SVO triples [2], our work addresses the acquisition of semantic relations, focusing on the assignment of thematic roles. This task (i.e., 
tagging for acquisition) requires high reliability, and so it relies less on statistical properties and more on deterministic local marking.</Paragraph> <Paragraph position="4"> In this paper we discuss a technique for parsing and semantically analyzing complex sentences with the aid of co-occurrence relations, and show how these relations are acquired from a tagged corpus.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 1.1 The Phenomenon </SectionTitle> <Paragraph position="0"> Consider, for example, the sentence below, taken from the Dow-Jones newswire:</Paragraph> </Section> </Section> </Paper>