<?xml version="1.0" standalone="yes"?>
<Paper uid="C02-1071">
  <Title>Integrating Shallow Linguistic Processing into a Unification-based</Title>
  <Section position="3" start_page="0" end_page="3" type="intro">
    <SectionTitle>
syntactic and semantic analysis of the sentences
</SectionTitle>
    <Paragraph position="0"> it processes; however, it fails to produce a result when the linguistic structure being processed and/or words in the input sentences fall beyond the coverage of the grammatical resources. Natural Language Processing (NLP) systems with monolithic grammars, in addition, have to deal with a huge search space due to several sources of non-determinism (i.e. ambiguity). This is particularly true of broad-coverage unification-based grammars where all dimensions of linguistic information are interleaved, as theories such as HPSG propose. Lack of robustness and inefficient processing make such systems inadequate for practical applications, e.g. Natural Language Interfaces (NLI).</Paragraph>
    <Paragraph position="1"> This paper presents an NLP system which integrates a linguistic (as opposed to data-driven) Part-of-Speech (PoS) tagger and chunker as a preprocessing module of a broad-coverage unification-based grammar of Spanish.</Paragraph>
    <Paragraph position="2"> By integrating shallow and deep processing, the efficiency of the overall analysis process improves significantly, since we can release the parser from certain tasks that may be efficiently and reliably dealt with by computationally less expensive techniques. The integration of shallow processing, in addition, provides the unification-based grammar with larger coverage for syntactic structures and allows us to implement default lexical entry templates for virtually unlimited lexical coverage while avoiding an increase in ambiguity.</Paragraph>
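    <Paragraph> The default-template idea mentioned above can be sketched as follows. This is a minimal illustration, not the system's implementation: the lexicon, the template table, the tag names and the function lexical_entries are all hypothetical, and the sketch assumes that the tag chosen by the tagger selects the template. Under that assumption a word missing from the lexicon receives a single entry built from its PoS tag, rather than one entry per open-class category.
```python
# Hypothetical hand-written lexicon: word form -> list of lexical entries.
LEXICON = {
    "come": [{"form": "come", "cat": "verb", "subcat": ["np"]}],
}

# Hypothetical default templates, one per PoS tag produced by the tagger.
DEFAULT_TEMPLATES = {
    "NOUN": {"cat": "noun", "subcat": []},
    "VERB": {"cat": "verb", "subcat": ["np"]},
    "ADJ": {"cat": "adjective", "subcat": []},
}


def lexical_entries(form, pos_tag):
    """Return lexicon entries for form, or one default entry built from pos_tag."""
    if form in LEXICON:
        return LEXICON[form]
    template = DEFAULT_TEMPLATES.get(pos_tag)
    if template is None:
        return []  # no template for this tag: the word stays uncovered
    # Copy the template so entries for different words do not share state.
    entry = dict(template, form=form, subcat=list(template["subcat"]))
    return [entry]
```
A known word keeps its hand-written entries; an unknown word tagged NOUN yields exactly one noun entry.</Paragraph>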
    <Paragraph position="3"> The system we present is inspired by (Abney,</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.1 Coverage of the Grammar
</SectionTitle>
      <Paragraph position="0"> The range of linguistic phenomena that the grammar handles includes: all types of subcategorization structures, determination (simple and complex), full coverage of agreement (subject-verb, subject-attribute, agreement within the NP), null-subjects (pro-drop, impersonal sentences), compound tenses and periphrastic forms, clausal complements (completive clauses and indirect questions), control and raising structures, support verb constructions, passive constructions (with the copula, with or without the 'by-agent' complement, and reflexive passive), modifiers of verbs, nouns, adjectives and adverbs, negation, sentential adjuncts, topicalization, relative and interrogative clauses, surface word order variation, coordination (binary, enumeration and coordination of unlike categories), clitics (clitic-NP alternation, clitic doubling, clitic climbing, enclitics), NPs with no noun-head, non-sentential input strings and special constructions (numbers, dates, ...).</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="2" type="sub_section">
      <SectionTitle>
2.2 The ALEP Architecture
</SectionTitle>
      <Paragraph position="0"> ALEP distinguishes preprocessing operations and linguistic processing operations. The former (Text Handling (TH) and orthographemic analyses) account for surface properties of input text (document formatting, delimitation of textual structural elements, orthographemic aspects of morphology), while the latter (parsing and refinement) deal with its non-surface properties (morphosyntactic analysis, constituent structure, semantic representation). A special rule-based operation, Lifting, interfaces the output of the preprocessing operation with the parsing operation.</Paragraph>
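      <Paragraph> The order of operations described above can be pictured as a simple function pipeline. The sketch below is only a schematic: every function name and data shape in it is a hypothetical stand-in, and it does not reproduce ALEP's actual interfaces or rule languages.
```python
def text_handling(text):
    """Preprocessing: split raw text into tokens (surface segmentation only)."""
    return text.split()


def orthographemic_analysis(tokens):
    """Preprocessing: attach a toy normalized form to each token."""
    return [{"surface": t, "norm": t.lower()} for t in tokens]


def lifting(analyses):
    """Interface step: turn preprocessing output into parser input items."""
    return [{"ld": a["norm"], "surface": a["surface"]} for a in analyses]


def parse(items):
    """Linguistic processing: build a (here trivially flat) structure."""
    return {"cat": "s", "daughters": items}


def refine(tree):
    """Linguistic processing: decorate the structure without removing information."""
    decorated = dict(tree)
    decorated["sem"] = [d["ld"] for d in tree["daughters"]]
    return decorated


def analyse(text):
    """Run the stages in the order the architecture prescribes."""
    analyses = orthographemic_analysis(text_handling(text))
    return refine(parse(lifting(analyses)))
```
The point of the sketch is the ordering: surface-level stages run first, Lifting converts their output, and only then do parsing and refinement apply.</Paragraph>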
    </Section>
    <Section position="3" start_page="2" end_page="3" type="sub_section">
      <SectionTitle>
2.3 The ALEP Linguistic Formalism
</SectionTitle>
      <Paragraph position="0"> The ALEP linguistic formalism has been developed on the basis of the specifications result- A distinctive feature of the ALEP processing architecture is the division of the analysis task into two sub-tasks: 'parsing', which builds up a complete but shallow phrase structure tree, and 'refinement', which traverses the structure top-down, thus monotonically performing feature decoration, typically with semantic information.</Paragraph>
      <Paragraph position="1"> 1991). It is a so-called "lean" formalism compilable into first-order (Prolog) terms and thus avoiding computationally expensive formal devices. An ALEP grammar is implemented by specifying lexical entries and grammar rules, based on a type system that constitutes a monotonic simple type hierarchy with appropriateness conditions. Lexical entries are based on the data structure Linguistic Description (LD), collecting constraints on the type system. The lexical component of our grammar plays a crucial role in the grammatical description needed for processing. It is a highly lexicalized grammar where linguistic phenomena, such as subject-verb agreement, subcategorization, modification, control relations, etc., traditionally dealt with by means of specialized phrase structure rules, are treated in the lexicon. Grammar rules are thus reduced to a small set of binary-branching context-free phrase structure rules, which are based on the data structure Linguistic Structure (LS).</Paragraph>
      <Paragraph position="2"> The approach adopted in the grammar we present follows HPSG proposals (Pollard and Sag, 1994).</Paragraph>
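      <Paragraph> To make the division of labour between lexicon and rules concrete, here is a minimal sketch of feature-structure unification with one binary rule. The representation (nested Python dictionaries, a unify helper, a head_subject_rule function and the toy entries) is an assumption made purely for illustration; it ignores typing and appropriateness conditions and is not the ALEP formalism itself.
```python
def unify(a, b):
    """Unify two feature structures (nested dicts or atoms); None on clash."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for feature, value in b.items():
            if feature in result:
                merged = unify(result[feature], value)
                if merged is None:
                    return None
                result[feature] = merged
            else:
                result[feature] = value
        return result
    return a if a == b else None


# Toy lexical entries: agreement constraints live in the lexicon, not in the rule.
NOUN_PHRASE = {"cat": "np", "agr": {"num": "sg", "per": "3"}}
VERB_SG = {"cat": "v", "subj": {"cat": "np", "agr": {"num": "sg", "per": "3"}}}
VERB_PL = {"cat": "v", "subj": {"cat": "np", "agr": {"num": "pl", "per": "3"}}}


def head_subject_rule(subject, head):
    """Binary rule: succeed only if the subject unifies with what the head selects."""
    if "subj" not in head:
        return None
    selected = unify(head["subj"], subject)
    if selected is None:
        return None
    return {"cat": "s", "daughters": [selected, head]}
```
The rule itself says nothing about number or person; a singular subject combines with VERB_SG and is rejected by VERB_PL solely because of the constraints carried by the lexical entries.</Paragraph>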
    </Section>
  </Section>
</Paper>