<?xml version="1.0" standalone="yes"?> <Paper uid="E99-1027"> <Title>Specifying a shallow grammatical representation</Title> <Section position="3" start_page="204" end_page="205" type="intro"> <SectionTitle> 2 ENGCG tag set </SectionTitle> <Paragraph position="0"> Descriptions of the morphological tags used by the English Constraint Grammar (ENGCG) tagger are available in several publications. Brief descriptions can be found in several recent ACL conference proceedings by Voutilainen and his colleagues (e.g.</Paragraph> <Paragraph position="1"> EACL93, ANLP94, EACL95, ANLP97, ACL-EACL97). An in-depth description is given in Karlsson et al., eds., 1995 (chapters 3-6). Here, only a brief sample is given.</Paragraph> <Paragraph position="2"> ENGCG tagging is a two-phase process. First, a lexical analyser assigns one or more alternative analyses to each word. The following is a morphological analysis of the sentence "The raids were coordinated under a recently expanded federal program": [morphological analysis listing not preserved]</Paragraph> <Paragraph position="3"> Each indented line constitutes one morphological analysis. Thus "program" is five-way ambiguous after ENGCG morphology. The disambiguation part of the ENGCG tagger then removes those alternative analyses that are contextually illegitimate according to the tagger's hand-coded constraint rules (Voutilainen 1995). The remaining analyses constitute the output of the tagger. Overall, this tag set represents about 180 different analyses when certain optional auxiliary tags (e.g. verb subcategorisation tags) are ignored.</Paragraph> </Section> </Paper>