File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/82/c82-1005_metho.xml
Size: 11,576 bytes
Last Modified: 2025-10-06 14:11:25
<?xml version="1.0" standalone="yes"?> <Paper uid="C82-1005"> <Title>TRANSFORMATION OF NATURAL LANGUAGE INTO LOGICAL FORMULAS</Title> <Section position="3" start_page="29" end_page="29" type="metho"> <SectionTitle> CATN AS A TOOL </SectionTitle> <Paragraph position="0"> A Cascaded Augmented Transition Network (CATN) consists of two or more &quot;cascades&quot; which successively process the same information. Each of them is an ATN grammar (1) which has, in addition, a new action called TRANSMIT. The TRANSMIT action may be set on any arc and causes a piece of information to be sent from the current &quot;cascade&quot; to the lower one. Whenever a TRANSMIT occurs, all information about the current &quot;cascade&quot; is saved on the stack while the parser operates on the lower &quot;cascade&quot; until new information or data is required. Then the higher &quot;cascade&quot; is reactivated from the point at which it was stopped.</Paragraph> <Paragraph position="1"> The two stages of our parsing system correspond to the CATN cascades. In the present realisation the structure popped from the syntactical stage is TRANSMITted into semantic interpretation, because the free word order of Polish sentences prohibits another solution. In particular, the positions of the subject and the main verb in the sentence may vary.</Paragraph> <Paragraph position="2"> If the second stage is not able to find an appropriate interpretation for the syntactical structure, the first stage is activated to build an alternative parsing. When no alternative parsing can be built, the parser fails.</Paragraph> <Paragraph position="3"> In the other implementation of CATN we used Earley's algorithm, a well-known context-free parsing method (10). In this case the syntactical analyser produces all possible parsings at once. 
The semantical interpreter has to verify them and reject each meaningless parsing.</Paragraph> <Paragraph position="4"> THE FIRST STAGE - SYNTACTICAL ANALYSIS The surface structure of a sentence is obtained after the First Stage of the parser has been applied to an utterance. This means that such elements as VERB/ACTION, SUBJECT, OBJECT (direct and indirect), PREPOSITION PHRASES etc. are identified.</Paragraph> <Paragraph position="5"> The Polish natural language is a typical example of a flexional language. One of its most characteristic features is free word order in a sentence. It is very important for the parser to know each lexical parameter of nouns, adjectives, adverbs, numerals, prepositions etc. These parameters are number, gender, case, person and degree.</Paragraph> </Section> <Section position="4" start_page="29" end_page="29" type="metho"> <SectionTitle> TRANSFORMATION OF NATURAL LANGUAGE INTO LOGICAL FORMULAS </SectionTitle> <Paragraph position="0"> They are carried over the whole phrase and determine the role of the phrase in the sentence. The flexional form of the main verb also influences the construction of the sentence. In particular, the flexional properties of the main verb can help the parser to find the subject and the direct object.</Paragraph> <Paragraph position="1"> These problems and several others, such as the post-modifier problem, wh-movement, conjunction, etc., were solved successfully.</Paragraph> <Paragraph position="2"> The syntactical analysis covers a wide subset of the Polish language, e.g. simple affirmative sentences and questions, complements and relative clauses, and certain types of complex sentences. We had to take into account a number of special properties of the medical dialect which rarely occur in common conversation. The grammar is able to parse not only common Polish but &quot;medical&quot; Polish as well. This means, among other things, a great many participles, gerunds, modal verbs (e.g. 
moze - could, powinien - should) and vague adverbs (e.g.</Paragraph> <Paragraph position="3"> prawdopodobnie - probably, czesto - frequently, rzadko - rarely, czasami - sometimes).</Paragraph> <Paragraph position="4"> The syntactical analyser transforms an input sentence into an unflexional and ordered form. Some examples of the output of the First Stage are given below. The I-mark divides the whole sentence into phrases. An empty place between two I-marks indicates a missing phrase. The S and END flags indicate the beginning and the end of each simple clause in the sentence. If the DCL flag occurs just after the S-mark in the top-level clause, the sentence is treated as an assertion. In a question there are one or more question words instead. The MODIFIERS flag divides a direct object (if any) into the main phrase and post-modifiers. This last flag is an important one because the head word of a direct object phrase may be a predicative element of the clause.</Paragraph> <Paragraph position="5"> (e.g. byc przyczyna - to be a cause). Note that a predicative element of the top-level clause becomes the main predicative element of the whole sentence.</Paragraph> <Paragraph position="6"> alkohol podany doustnie powoduje wzmozone wydzielanie gastryny.</Paragraph> <Paragraph position="7"> (alcohol given per os causes increased secretion of gastrin.) Nevertheless, because such information is not sufficient, an interpretation in the Second Stage is needed.</Paragraph> <Paragraph position="8"> L. BOLC and T. STRZALKOWSKI The First Stage contains the main ATN net named SENTENCE, which can process Polish natural sentences. There are four special subnets: NOUN_PHR, ADJ_PHRA, ADV_PHRA, Q_EXPR, which recognize different types of phrases, namely 
nominal phrases, adjectival phrases, adverbial phrases and question expressions respectively.</Paragraph> <Paragraph position="9"> The First Stage uses a syntactical dictionary which contains the flexional forms of the words. THE SECOND STAGE - SEMANTICAL INTERPRETATION When the syntactical analysis has been completed, the Second Stage of the parser tries to find a semantical interpretation of the syntactical structure. The main predicative element of this structure (e.g. VERB/ACTION or OBJECT) creates one or more instances of a framework describing an event. That framework looks like a pattern-concept pair (8), (12); nevertheless there are more frame-indicating verbs (7).</Paragraph> <Paragraph position="10"> For example, the following verbs and verb expressions: powodowac (cause), stymulowac (stimulate), prowadzic do (conclude), byc przyczyna (to be a cause), byc skutkiem (to be a result), etc. refer to the conceptualization #IMPLY, and podac (to give), stosowac (to apply), etc. to the conceptualization #APPLY.</Paragraph> <Paragraph position="11"> The pattern determines which phrases may be expected around the predicate and which of them must occur. The interpretation process is driven by such a pattern, so it is called expectation-driven. It may be called structure-driven too, because there are structural conditions in the pattern which must hold true during parsing.</Paragraph> <Paragraph position="12"> A concept is a notation that represents the meaning of a clause.</Paragraph> <Paragraph position="13"> Together this pair associates different forms of an utterance with its meaning.</Paragraph> <Paragraph position="14"> The #APPLY conceptualization looks like:</Paragraph> </Section> <Section position="5" start_page="29" end_page="29" type="metho"> <SectionTitle> (APPLY TYPE TREATMENT </SectionTitle> <Paragraph position="0"/> <Paragraph position="2"> where TYPE is an indicator which points out that the described event is a treatment. 
AGT, OBJ, MANNER determine that there may be three phrases around the predicate, but only one of them must occur in an utterance (OBL means an obligatory parameter, OPT an optional one). None of these phrases may have a preposition before it - (). The AGT-phrase (the agent that applies something) must be a human; the OBJ-phrase (the object which is applied) must be a medicament; the MANNER slot may be filled when the manner of application is specified (e.g. doustnie - per os). The CONCEPT indicator describes the way an atomic formula has to be built. As can be seen above, we shall receive a 5-ary</Paragraph> </Section> <Section position="6" start_page="29" end_page="29" type="metho"> <SectionTitle> TRANSFORMATION OF NATURAL LANGUAGE INTO LOGICAL FORMULAS </SectionTitle> <Paragraph position="0"> predicate called #APPLY whose arguments will be constructed during the interpretation process. The BUILDQ function is a special ATN form which provides BUILDing of Quoted expressions (see (1) for details).</Paragraph> <Paragraph position="1"> The filling of frame slots is done after the syntactical and semantical requirements have been satisfied. When the whole pattern has been completed, an atomic formula is generated. Therefore, the interpretation process is an attempt to squeeze the syntactical structure of a sentence into one or more instances of the framework of an event. Besides the main predicate(s), a great deal of additional information is joined to the output formula. These facts are stored partly in pattern-concept pairs and partly in expert subnets of the interpreter. They constitute the system knowledge. It is necessary for the system to have such knowledge because no real text corpus is able to describe a domain of the real world completely.</Paragraph> <Paragraph position="2"> A great deal of context information may also be used from the special context stack. 
It helps to solve the problems of pronoun reference and ellipsis.</Paragraph> <Paragraph position="3"> If the &quot;squeezing&quot; cannot be done, the First Stage is activated again.</Paragraph> <Paragraph position="4"> In addition, a semantical dictionary is attached to the Second Stage. It keeps all patterns of the frameworks mentioned above. It also contains some special entities indicating the correspondence between verbs and patterns.</Paragraph> <Paragraph position="5"> The Second Stage also contains the main ATN net named FORMULA. It guides the interpretation process and controls the semantical correctness of utterances. There are also some expert nets which can recognize special medical expressions (e.g. names of diseases and symptoms, organs, treatments, etc.). These subnets are a changeable part of the system and they determine the system knowledge. The expert subnets may communicate with the main net through the middle level of the interpreter - the CASES net. This net handles nominal phrase structures, e.g. prepositions, conjunctions and post-modifiers.</Paragraph> <Paragraph position="6"> The Second Stage produces a formula of the First Order Predicate Calculus corresponding to the input sentence. The formula has an implicative form where the main predicate of the utterance is the conclusion and the other generated facts are premises.</Paragraph> <Paragraph position="7"> Two generated formulas are given below. The first of them is an assertion; the other denotes a question. They are in LISP notation, so a clarification is needed. The IMPLSYM and KONJSYM marks are the logical operators IMPLY (=>) and AND (&amp;). An integer just after the KONJSYM mark indicates how many conjuncts were joined. Each predicate name is preceded by a hash-mark (#) and followed by an integer indicating the number of arguments. 
Arguments look like a pair or triple which determines the type of the argument, the name of a variable and a constant (if any) respectively.</Paragraph> <Paragraph position="8"> The parser can also produce other kinds of formal representation of natural language.</Paragraph> </Section> </Paper>
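The expectation-driven filling of the #APPLY frame described under THE SECOND STAGE can be sketched in Python. This is a minimal illustration only, not the authors' ATN/LISP implementation. The dictionary layout, the function name `interpret_apply`, the choice of OBJ as the single obligatory slot, the word "lekarz" and the toy type lexicon are all assumptions made for the example. The original frame listing is not recoverable from the text, so the sketch builds a predicate from the three named slots only, not the 5-ary form mentioned in the paper.

```python
from typing import Dict, Optional, Tuple

# Hypothetical pattern for the #APPLY conceptualization. The slot names
# (AGT, OBJ, MANNER) and the obligatory/optional distinction come from the
# text; which slot is obligatory and the type labels are assumptions.
APPLY_PATTERN = {
    "AGT": {"required": False, "type": "HUMAN"},
    "OBJ": {"required": True, "type": "MEDICAMENT"},
    "MANNER": {"required": False, "type": "MANNER"},
}

# Toy semantic lexicon (assumed); it stands in for the semantical dictionary.
LEXICON = {"lekarz": "HUMAN", "alkohol": "MEDICAMENT", "doustnie": "MANNER"}


def interpret_apply(phrases: Dict[str, str]) -> Optional[Tuple]:
    """Try to squeeze syntactic phrases into the #APPLY frame.

    Returns an atomic formula as a tuple, or None when the pattern cannot be
    satisfied (the point at which the First Stage would be activated again).
    """
    args = []
    for slot, spec in APPLY_PATTERN.items():
        word = phrases.get(slot)
        if word is None:
            if spec["required"]:
                return None  # obligatory phrase is missing
            # No constant available: emit a (type, variable) pair.
            args.append((spec["type"], slot.lower()))
            continue
        if LEXICON.get(word) != spec["type"]:
            return None  # semantic restriction on the slot is violated
        # Constant available: emit a (type, variable, constant) triple.
        args.append((spec["type"], slot.lower(), word))
    # Predicate name, number of arguments, then the arguments themselves.
    return ("#APPLY", len(args), *args)
```

Calling `interpret_apply({"OBJ": "alkohol", "MANNER": "doustnie"})` fills the OBJ and MANNER slots and leaves AGT as an unbound variable, whereas a call that lacks the assumed obligatory OBJ phrase returns `None`.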