<?xml version="1.0" standalone="yes"?> <Paper uid="C90-1019"> <Title>Deep Sentence Understanding in a Restricted Domain*</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle>
<Paragraph position="0"> We present here the current prototype of the text understanding system HÉLÈNE. The objective of this system is to achieve a deep understanding of small reports dealing with a restricted domain (here, patient discharge summaries in a medical specialty). This means that HÉLÈNE should rely on an extensive description of all types of required knowledge, which of course implies a deep domain knowledge base, including the necessary common-sense knowledge.</Paragraph>
<Paragraph position="1"> Precise understanding must rely not only on complete domain knowledge, but also on sufficient syntactic information. This is why HÉLÈNE includes a full syntactic module, whose task is to provide the semantic construction module with the (deep) structures of sentences.</Paragraph>
<Paragraph position="2"> One problem with syntactic processing is that it gives rise to numerous ambiguities. These ambiguities are filtered on semantic grounds by a disambiguation module that does not build any semantic representation.</Paragraph>
<Paragraph position="3"> *Supported by AIM project 1003 and PRC Communication Homme-Machine, Pôle Langage Naturel. Semantic construction is concerned with the recognition of domain entities that can be expressed by word groups. We thus had to adopt a lexical semantics approach compatible with such descriptions. Domain entities, once instantiated, provide the basis on which a model of the current state of the world (here, the patient's state) is built. The same lexical semantic information is used both to help syntactic processing and, in a more extensive way, to access domain models in order to build semantic representations.</Paragraph>
<Paragraph position="4"> The prototype includes the following main modules: The syntactic module implements the Lexical-Functional Grammar formalism [7]. The parser builds c-structure and f-structure bottom-up in parallel on a chart, so that f-structure validity can constrain c-structure construction.</Paragraph>
<Paragraph position="5"> Ambiguous attachments are submitted to the disambiguation module for evaluation and ranking. This module applies a set of general heuristic rules that operate on the semantic definitions of the LFG predicates.</Paragraph>
<Paragraph position="6"> Semantic construction relies on dynamic domain models that integrate common sense. LFG predicates are characterized by semantic components that point to parts of the knowledge base.</Paragraph>
<Paragraph position="7"> The prototype runs in Common Lisp (VAX Lisp) and K, a proprietary language embedded in Common Lisp. The remaining sections describe these modules in more detail.</Paragraph>
<Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2 Parsing with a Lexical-Functional Grammar </SectionTitle>
<Paragraph position="0"> We chose to implement the LFG framework for several reasons. Being a linguistic theory, it should provide better foundations for principled syntactic coverage. A formalism with a context-free backbone was also easier to implement. Furthermore, LFG extracts from a sentence a predicate-argument structure, which constitutes a good starting point for semantic processing. Our implementation of LFG does not yet include a schema for long-distance dependencies (or functional uncertainty), nor coordination. It allows cyclic f-structures.</Paragraph>
<Paragraph position="1"> Our parser uses a chart to build both c-structure and f-structure. Incomplete and complete constituents are represented by active and inactive cs-edges, while incomplete and complete f-structures are placed on active and inactive fs-edges. The parsing strategy is bottom-up, left-to-right (left-corner). Top-down parsing is also available, as well as right-to-left parsing. LFG schemas are evaluated as soon as possible: equational (construction) schemas are evaluated when encountered, while constraint schemas (existential, equational, and negations of those) are kept on fs-edges until they can be evaluated. When fs-edges are combined, remaining constraints are propagated to the resulting fs-edge. Each new active f-structure is tested for consistency and coherence. Furthermore, the value of a closed function is tested for completeness (this should be revised if a schema for long-distance dependencies is implemented). When a constraint is violated, its fs-edge is flagged as invalid.</Paragraph>
<Paragraph position="2"> Grammar rules are described as regular expressions which are compiled into (reversible) transition networks. Each arc of those networks is labelled with a category and a disjunction of conjunctions of schemas. A model of hierarchical lexical entry representation has been developed, with data-driven lexical rules; it is not currently coupled to the parser and will not be presented here. The prototype instead uses a simple word list whose entries correspond to what this model would produce.</Paragraph>
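<Paragraph position="3"> To make the consistency test mentioned above concrete, the following Common Lisp sketch (our illustration; the paper does not show its data structures) represents f-structures as nested association lists and merges them by unification. The attribute names and the :FAIL convention are our assumptions, and the sketch ignores the cyclic f-structures the real system allows.

;; F-UNIFY merges two f-structures attribute by attribute and returns
;; :FAIL when the same attribute carries clashing atomic values -- the
;; kind of inconsistency that lets f-structure validity prune
;; c-structure construction.
(defun f-unify (f1 f2)
  (cond ((null f1) f2)
        ((null f2) f1)
        ((equal f1 f2) f1)
        ((and (consp f1) (consp f2)
              (every #'consp f1) (every #'consp f2))
         ;; Both are attribute-value lists: merge slot by slot.
         (let ((result (copy-alist f1)))
           (dolist (pair f2 result)
             (let ((slot (assoc (car pair) result)))
               (if slot
                   (let ((sub (f-unify (cdr slot) (cdr pair))))
                     (when (eq sub :fail) (return :fail))
                     (setf (cdr slot) sub))
                   (push (cons (car pair) (cdr pair)) result))))))
        (t :fail)))

;; (f-unify '((subj . ((num . sg)))) '((subj . ((pers . 3)))))
;;   => ((SUBJ . ((PERS . 3) (NUM . SG))))
;; (f-unify '((tense . past)) '((tense . present)))
;;   => :FAIL
</Paragraph>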
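<Paragraph position="4"> Constraint schemas, unlike construction schemas, cannot always be decided when first encountered. Continuing the same sketch (again our assumption, reusing F-UNIFY above, with deferred constraints reduced to simple attribute/value pairs), one way such constraints could sit on an fs-edge and be re-checked whenever edges combine is:

;; An fs-edge pairs a partial f-structure with constraints that could
;; not yet be decided, e.g. a constraining equation on an attribute
;; that has no value so far.
(defstruct fs-edge
  fstruct       ; partial f-structure (an alist, as above)
  constraints)  ; deferred constraints, here simply (attr . value)

(defun check-constraint (fstruct attr value)
  ;; :PENDING while ATTR is unbound; T or :FAIL once it has a value.
  (let ((slot (assoc attr fstruct)))
    (cond ((null slot) :pending)
          ((equal (cdr slot) value) t)
          (t :fail))))

(defun combine-edges (e1 e2)
  ;; Unify the two f-structures, then re-evaluate the propagated
  ;; constraints: satisfied ones are dropped, undecidable ones are
  ;; kept on the new edge, and a violated one invalidates the
  ;; combination (returning NIL plays the role of flagging the edge
  ;; invalid).
  (let ((fs (f-unify (fs-edge-fstruct e1) (fs-edge-fstruct e2))))
    (when (eq fs :fail) (return-from combine-edges nil))
    (let ((kept '()))
      (dolist (c (append (fs-edge-constraints e1)
                         (fs-edge-constraints e2)))
        (case (check-constraint fs (car c) (cdr c))
          (:fail (return-from combine-edges nil))
          (:pending (push c kept))))
      (make-fs-edge :fstruct fs :constraints kept))))
</Paragraph>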
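<Paragraph position="5"> As an illustration of the rule encoding described above (the state layout and the ASCII schema notation, with ^ and ! standing for the LFG up and down metavariables, are our invention, not the system's actual format), a rule such as NP -> DET ADJ* N might compile into:

;; A compiled network: each entry is (state arc...), each arc is
;; (category schemas next-state).  The ADJ arc loops on state 1,
;; encoding the Kleene star; state 2 is final.
(defparameter *np-network*
  '((0 (det ((^ = !)) 1))
    (1 (adj ((! in (^ adjunct))) 1)
       (n   ((^ = !)) 2))))

(defun network-accepts-p (network state categories final)
  ;; Recognize a category sequence against the network.  Schemas are
  ;; ignored here; the real parser evaluates them as arcs are traversed.
  (if (null categories)
      (eql state final)
      (some (lambda (arc)
              (and (eql (first arc) (first categories))
                   (network-accepts-p network (third arc)
                                      (rest categories) final)))
            (rest (assoc state network)))))

;; (network-accepts-p *np-network* 0 '(det adj adj n) 2) => T
</Paragraph>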
<Paragraph position="6"> The prototype uses a small French grammar that contains 14 networks, equivalent to 90 rules. It was assembled by borrowing from the literature [3,10] and completed with the help of grammar manuals and confrontation with real texts. It has the particularity of building cyclic f-structures for constructions where a head plays a role inside an adjunct. This is how we process attributive adjectives, participial phrases, and (in a very limited way) relative phrases.</Paragraph>
</Section> </Section> </Paper>