<?xml version="1.0" standalone="yes"?> <Paper uid="C94-1102"> <Title>Non-directionality and Self-Assessment in an Example-based System Using Genetic Algorithms</Title> <Section position="4" start_page="618" end_page="619" type="metho"> <SectionTitle> 3 Experimentation </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="618" end_page="619" type="sub_section"> <SectionTitle> 3.1 Experiments </SectionTitle> <Paragraph position="0"> We tested the performance of the system for analysis, generation and non-directional completion.</Paragraph> <Paragraph position="1"> For analysis, a board is extracted from the data base (call it the reference board). A new board is built by associating the string part of the reference board with a variable as its tree part. It becomes the input to the system. Of course, the reference board is eliminated from the database.</Paragraph> <Paragraph position="2"> A first measure is given by the system itself: it is the fitness of the output, which is the distance between the output and the input. A second measure is the distance between the output and the reference board, which reflects the absolute quality of the output. Moreover, run-times have been measured.</Paragraph> <Paragraph position="3"> This procedure was carried out for each board of the data base so that average values could be computed. There were 225 boards in the data base. For generation, the same procedure was applied, but, of course, the tree part is kept in building the input board. 
For non-directional completion, an incomplete board is automatically built by inserting variables at random positions in the string and tree parts of the reference board.</Paragraph> </Section> <Section position="2" start_page="619" end_page="619" type="sub_section"> <SectionTitle> 3.2 Results </SectionTitle> <Paragraph position="0"> Analysis. Analysis gives an average error of about 9.2 elements relative to the exact output after thirty generations. The average number of elements (nodes and words) in a board is 24.5; hence, the error rate is 38%, not a very good result. The fitness gives the average number of wrong words in the average string output by the system: around 3.2 words for an 8.5-word-long sentence. Generation. Generation gives better results than analysis. The average error in the tree only is 1.1 nodes for 16-node trees, and the absolute error rate falls to 12%. However, as expected, generation is slower than analysis because more tree distance computations are performed.</Paragraph> <Paragraph position="1"> Non-directional completion. The results for non-directional completion must be considered as purely illustrative, because the form of boards for non-directional completion is unrestricted. As could be expected, because no part is complete in the input, quality is worse than for analysis and generation, although fitness appears to be quite good.</Paragraph> </Section> </Section> <Section position="5" start_page="619" end_page="619" type="metho"> <SectionTitle> 3.3 Discussion </SectionTitle> <Paragraph position="0"> We will now discuss the advantages and drawbacks of our system.</Paragraph> <Paragraph position="1"> 3.3.1 Non-directionality The general function of the system is to build a complete sentence and its complete associated syntactic tree from a partially specified sentence and a partially specified tree. Hence, analysis and generation turn out to be only particular cases of this general operation. This feature is what we called non-directionality. It is more general than bi-directionality. 
So far, we are not aware of any natural language processing system having this property.</Paragraph> <Paragraph position="2"> From the applications point of view, non-directionality allows one to envisage linguistically founded editing operations. For example, suppose we would like to replace "refund the fee" with "pay the fee back" throughout a text. We would like the operation to apply for any tense of the verb. The following board could be used to retrieve all possible candidates. It says that we want a verbal phrase (structural constraint) and that the substring "fee" must appear (string constraint). Of course, to perform such an operation, we would not advise the use of genetic algorithms...</Paragraph> <Paragraph position="4"/> </Section> <Section position="6" start_page="619" end_page="619" type="metho"> <SectionTitle> 3.3.2 Assessment </SectionTitle> <Paragraph position="0"> Because parts of the input may be modified in the output, assessment is necessary. The system delivers a score which is not directly connected to the knowledge of the system. It is the distance between the input and the output. Minimising this distance is precisely the task of the system. As this score is a theoretical metric between structures, it is not tied to a particular representation. It could be applied to evaluate similar systems using different representations, for example dependency structures.</Paragraph> <Paragraph position="1"> Despite the previous points, important criticisms can still be addressed to the current system.</Paragraph> <Paragraph position="2"> Experiments carried out with input sentences from outside the data base have shown that the system has a normalising effect: outputs are cast to resemble sentences and trees from the database. This is a negative effect if a free-input system is wanted. 
But, if a large enough data base is built and if standardisation is required, as is the case with technical documents in many companies, this may be seen as a positive feature.</Paragraph> </Section> </Paper>