<?xml version="1.0" standalone="yes"?> <Paper uid="W02-1702"> <Title>Cascading XSL filters for content selection in multilingual document generation</Title> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> 1 CSA - Parallel selection - Phase 1 </SectionTitle> <Paragraph position="0"> In the parallel selection phase, two of the three specific user aspects are taken into account: subject and language. These aspects identify the relevant XML master document in the chosen language (as illustrated in Figure 2). There is one master document for each subject covered by the system, and each master document contains parallel, aligned versions of the texts in each language (English, Spanish and Basque). As a result of this first filtering phase, the appropriate language division of the master document is selected. This text division is the input for the subsequent filtering phases, in which particular segments of the document are selected or discarded.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 CSA - Horizontal filtering - Phase 2 </SectionTitle> <Paragraph position="0"> The horizontal filtering phase concerns the remaining third user aspect, moment in time, which is used to adapt the generated text to the current point in the learning plan. This aspect cuts horizontally across the parallel selection of the previous section.</Paragraph> <Paragraph position="1"> The master document is structured in accordance with a set of course scheduling parameters. Each day, and each learning unit within the day, is associated with a corresponding set of learning entities in the XML master document.</Paragraph> <Paragraph position="2"> In this way, the generated document can be targeted at learning unit 1 of day 1, or at any other day or unit. The XML master file also contains some informative elements that the reader may need to know even before the course starts or after it has finished. 
These are also generated when certain specific user aspects are activated. Figure 3 shows a graphical representation of horizontal filtering.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 CSA - Vertical filtering - Phase 3 </SectionTitle> <Paragraph position="0"> The final phase, vertical filtering, covers five user aspects: level of expertise, reason to read, professional background, opinion or motivation, and time available. These five aspects determine which parts of the previously selected and filtered discourse tree are kept.</Paragraph> <Paragraph position="1"> Nuclei are always retained because they are, by definition, irreplaceable segments of the text and convey the main message.</Paragraph> <Paragraph position="2"> Satellites are the segments of the text that are subject to the algorithm's selection process. The set of discrimination rules applied in this first version of the content selection algorithm is described below. These rules are applied at successive levels of filtering, and therefore have a cascading effect. RST covers an indefinite number of relation-satellites (Knott, 1995), which have been classified by Hovy and Maier (1997), but we mention only the set of relation-satellites used in the master document taken as an example.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.1 Vertical filter - Level of expertise </SectionTitle> <Paragraph position="0"/> <Paragraph position="2"> discard example, exercise, background and preparation relation-satellites; Rationale for the rule: Any user with a null or basic level of expertise on the selected subject will need all the information available to understand the text. 
Conversely, a user with a medium or high level of expertise will not require example, exercise, background, preparation and similar relation-satellites.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.2 Vertical filter - Reason to read </SectionTitle> <Paragraph position="0"> If reason_to_read = &quot;to get an idea&quot; Then discard exercise and elaboration (all the types of elaboration: textual elaboration, link elaboration and image elaboration) relation-satellites; If reason_to_read = &quot;to get deep into it&quot; Then no relation-satellite is discarded; Rationale: Any user wishing to deepen their knowledge of the selected subject will need additional information. Conversely, a user who just wants to get an idea does not need exercise, elaboration or similar relation-satellites, which often require a more active role on the part of the user.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.3 Vertical filter - Professional background </SectionTitle> <Paragraph position="0"> If job_studies = &quot;not related subject&quot; Then no relation-satellite is discarded; If job_studies = &quot;related subject&quot; Then discard background and preparation relation-satellites; Rationale: Any user whose professional background is not related to the subject will need all the additional supporting text to understand its meaning. 
Conversely, if the user's background is related to the selected subject, we may assume that background, preparation and similar relation-satellites will be unnecessary.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.4 Vertical filter - Opinion or motivation </SectionTitle> <Paragraph position="0"> If opinion_motivation = &quot;against&quot; or opinion_motivation = &quot;without an opinion or motivation&quot; Then no relation-satellite is discarded; If opinion_motivation = &quot;in favour&quot; Then discard motivate, antithesis, concession and justify relation-satellites; Rationale: A motivated or favourable user will not require additional motivation; therefore, the motivate, antithesis, concession, justify and similar relation-satellites are discarded, since their role is to bring the user's opinion round in favour of the course material.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.5 Vertical filter - Time available </SectionTitle> <Paragraph position="0"> If time_available = &quot;a little bit of time&quot; Then discard all the relation-satellites; If time_available = &quot;quite some time&quot; Then discard exercise relation-satellites; If time_available = &quot;enough time&quot; Then no relation-satellite is discarded; Rationale: Time availability is a crucial user aspect. If the user is in a rush or has little time, the system has to provide only the most elementary information. In that case, only nuclei are generated. If the user has a bit more time, but not much, exercises are not offered, since they are usually quite time-consuming and require the active participation of the user. 
Finally, if the user has plenty of time, all the additional information is delivered.</Paragraph> <Paragraph position="1"> 3.6 Final comments on vertical filters Cascading filters apply to the relation-satellites that are still active after the previous phases in the generation process. When a vertical filter (3) tries to discard a relation-satellite that was already discarded at a previous phase (2 or 1), there is nothing to act upon, but this has no consequence, since the CSA simply continues filtering the remaining text. Thus, the order in which the vertical filters are applied is not relevant.</Paragraph> <Paragraph position="2"> After the filtering process has been successfully completed, there is still a final presentation task. A good presentation is, in our opinion, one that provides the student with an optimal version of the document to read, understand and fruitfully assimilate.</Paragraph> </Section> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 Implementation </SectionTitle> <Paragraph position="0"> The JavaScript code manages the user aspects (one of the inputs of the algorithm) and the application of the cascading filters (the CSA).</Paragraph> <Paragraph position="1"> Depending on the user aspects given by the user, the variables sXSL1 to sXSL5 take the value of the filter to be applied for each user aspect (level of expertise, reason to read, background, opinion or motivation and time available).</Paragraph> <Paragraph position="2"> The sResult variable contains the XML file, whose content changes after each filter is applied. Table 3 shows the code that executes a filter.</Paragraph> <Paragraph position="3"> objData.loadXML(sResult); objStyle.load(sXSL1); sResult=objData.transformNode(objStyle); Each XSL filter passes an element on to the next vertical filter, or withholds it, according to the rules described above. 
Table 4 shows how this is done with the relation-satellite</Paragraph> </Section> </Paper>