<?xml version="1.0" standalone="yes"?> <Paper uid="W03-1602"> <Title>Text Simplification for Reading Assistance: A Project Note</Title> <Section position="3" start_page="0" end_page="1" type="metho"> <SectionTitle> 2 Research issues and our approach </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="1" type="sub_section"> <SectionTitle> 2.1 Readability assessment </SectionTitle> <Paragraph position="0"> The process of text simplification for reading assistance can be decomposed into the following three subprocesses: a. Problem identification: identify which portions of a given text will be difficult for a given user to read, b. Paraphrase generation: generate candidate paraphrases of the identified portions, and c. Evaluation: re-assess the resultant texts to choose the one in which the problems have been resolved. Given this decomposition, it is clear that one of the key issues in reading assistance is assessing the readability or comprehensibility of text, because such assessment is involved in both subprocesses (a) and (c). Readability assessment is undoubtedly a difficult issue (Williams et al., 2003). In this project, however, we argue that if one targets only a particular population segment, and if an adequate collection of data is available, then corpus-based empirical approaches may well be feasible. We have already shown that one can collect such readability assessment data by conducting survey questionnaires targeting teachers at schools for the deaf.</Paragraph> <Paragraph position="1"> In this paper, we use the terms readability and comprehensibility interchangeably, while strictly distinguishing them from the legibility of each fragment (typically, a sentence or paragraph) of a given text.</Paragraph>
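<Paragraph position="2"> To make the three-subprocess decomposition above concrete, the following toy Python sketch wires (a)-(c) together in a single loop. The word lists, the frequency table, and the scoring function are all invented for illustration; they are not components of the system described in this paper.

# Toy sketch of the simplification loop: (a) identify difficult words,
# (b) generate candidate paraphrases, (c) evaluate and keep the best.
SYNONYMS = {"comfort": ["console", "soothe"]}      # invented lexicon
DIFFICULT = {"comfort"}                            # words assumed hard
FREQ = {"console": 2, "soothe": 1, "comfort": 1}   # invented frequencies

def identify(words):                               # (a)
    return [i for i, w in enumerate(words) if w in DIFFICULT]

def generate(words, i):                            # (b)
    return [words[:i] + [s] + words[i + 1:] for s in SYNONYMS.get(words[i], [])]

def readability(words):                            # (c) crude proxy score
    return sum(FREQ.get(w, 3) for w in words) / len(words)

def simplify(sentence):
    words = sentence.split()
    for i in identify(words):
        candidates = generate(words, i)
        if candidates:
            words = max(candidates, key=readability)
    return " ".join(words)

print(simplify("he tried to comfort her"))         # -> he tried to console her
</Paragraph>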
</Section> <Section position="2" start_page="1" end_page="1" type="sub_section"> <SectionTitle> 2.2 Paraphrase acquisition </SectionTitle> <Paragraph position="0"> One useful finding from the aforementioned surveys is that a broad range of paraphrases can improve the readability of text. A reading assistance system should therefore be able to generate a sufficiently wide variety of paraphrases of a given input. To create such a system, one needs to feed it a large collection of paraphrase patterns. Fortunately, the acquisition of paraphrase patterns has been actively studied in recent years: - Manual collection of paraphrases in the context of language generation, e.g. (Robin and McKeown, 1996), - Derivation of paraphrases through existing lexical resources, e.g. (Kurohashi et al., 1999), - Corpus-based statistical methods inspired by the work on information extraction, e.g. (Jacquemin, 1999; Lin and Pantel, 2001), and - Alignment-based acquisition of paraphrases from comparable corpora, e.g. (Barzilay and McKeown, 2001; Shinyama et al., 2002; Barzilay and Lee, 2003).</Paragraph> <Paragraph position="1"> A remaining issue is how effectively these methods contribute to the generation of paraphrases in our application-oriented context.</Paragraph> </Section> <Section position="3" start_page="1" end_page="1" type="sub_section"> <SectionTitle> 2.3 Paraphrase representation </SectionTitle> <Paragraph position="0"> One finding of previous studies on paraphrase acquisition is that automatically acquiring candidate paraphrases is quite feasible for various types of source data, but the acquired collections tend to be rather noisy and need manual cleaning, as reported in, for example, (Lin and Pantel, 2001). Given this, it is important to devise an effective way of facilitating manual correction, together with a standardized scheme for representing and storing paraphrase patterns as shared resources.</Paragraph> <Paragraph position="1"> Our approach is (a) to first define a fully expressive formalism for representing paraphrases at the level of tree-to-tree transformation, and (b) to devise an additional layer of representation on top of it that is designed to facilitate the hand-coding of transformation rules.</Paragraph> </Section> <Section position="4" start_page="1" end_page="1" type="sub_section"> <SectionTitle> 2.4 Post-transfer text revision </SectionTitle> <Paragraph position="0"> In paraphrasing, the morpho-syntactic information of a source sentence should remain accessible throughout the transfer process, since a morpho-syntactic transformation can itself often be a motivation or goal of paraphrasing. Therefore, an approach such as semantic transfer, where morpho-syntactic information is highly abstracted away as in (Dorna et al., 1998; Richardson et al., 2001), does not suit this task. Even granting that the morpho-syntactic stratum is an optimal level of abstraction for representing paraphrasing/transfer patterns, one must recall that semantic-transfer approaches such as those cited above were motivated mainly by the need to reduce the complexity of transfer knowledge, which can become unmanageable in morpho-syntactic transfer.</Paragraph> <Paragraph position="1"> Our approach to this problem is to (a) leave the description of each transfer pattern underspecified and (b) implement the knowledge about linguistic constraints that are independent of any particular transfer pattern separately from the transfer knowledge, as sketched below. There is a wide range of such transfer-independent linguistic constraints; constraints on morpheme connectivity, verb conjugation, word collocation, and tense and aspect forms in relative clauses are typical examples.</Paragraph>
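<Paragraph position="2"> The separation just described can be pictured with the following minimal Python sketch. The single rule, the "yesterday" cue, and the conjugation table are invented for illustration; the sketch only indicates how an underspecified transfer rule and a transfer-independent repair module divide the work.

# Toy sketch: an underspecified rule proposes a rewrite at the lemma level;
# a rule-independent module repairs conjugation afterwards.
RULE = ("burst into tears", "cry")                  # underspecified target

PAST = {"cry": "cried", "laugh": "laughed"}         # transfer-independent

def apply_rule(sentence, rule):
    src, tgt = rule
    return sentence.replace(src, tgt)               # may break morphology

def fix_conjugation(sentence):
    words = sentence.split()
    if "yesterday" in words:                        # crude past-tense cue
        words = [PAST.get(w, w) for w in words]
    return " ".join(words)

raw = apply_rule("she burst into tears yesterday", RULE)
print(fix_conjugation(raw))                         # -> she cried yesterday
</Paragraph>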
<Paragraph position="3"> These four issues can be considered different aspects of the overall question of how one can make the development and maintenance of a gigantic resource for paraphrasing tractable. (1) The introduction of readability assessment would free us from worrying about the purpose served by each paraphrasing rule during paraphrase acquisition. (2) Paraphrase acquisition is obviously indispensable for scaling up the resource. (3) A good formalism for representing paraphrasing rules would facilitate their manual refinement and maintenance. (4) Post-transfer error detection and revision would make the system tolerant of flaws in paraphrasing rules.</Paragraph> <Paragraph position="4"> While many researchers have addressed the issue of paraphrase acquisition, reporting promising results as cited above, the other three issues have been left relatively unexplored despite their significance in the above sense. Motivated by this context, in the rest of this paper we address these three remaining issues.</Paragraph> </Section> </Section> <Section position="4" start_page="1" end_page="1" type="metho"> <SectionTitle> 3 Readability assessment </SectionTitle> <Paragraph position="0"> To the best of our knowledge, there have been no reports on research to build a computational model of the language proficiency of deaf people, except for the notable work of Michaud and McCoy (2001). As a subpart of their research aimed at developing the ICICLE system (McCoy and Masterman, 1997), a language-tutoring application for deaf learners of written English, Michaud and McCoy developed an architecture called SLALOM for modeling the writing proficiency of a user. SLALOM is designed to capture the stereotypical linear order of acquisition within certain categories of morphological and/or syntactic features of language. Unfortunately, the modeling method used in SLALOM cannot be directly applied to our domain, for three reasons.</Paragraph> <Paragraph position="1"> - Unlike writing tutoring, in reading assistance the target sentences are in principle unlimited. We therefore need to take a wider range of morpho-syntactic features into account.</Paragraph> <Paragraph position="2"> - SLALOM is not designed to capture the difficulty of combinations of morpho-syntactic features, which is essential to take into account in reading assistance.</Paragraph> <Paragraph position="3"> - Given the need to consider feature combinations, the simple linear order model assumed in SLALOM is unsuitable.</Paragraph> <Section position="1" start_page="1" end_page="1" type="sub_section"> <SectionTitle> 3.1 Our approach: We ask teachers </SectionTitle> <Paragraph position="0"> To overcome these deficiencies, we took a different approach: we designed a survey questionnaire targeting teachers at schools for the deaf, and have been collecting readability assessment data.</Paragraph> <Paragraph position="1"> In this questionnaire, we ask the teachers to compare the readability of a given sentence with that of its paraphrases. The use of paraphrases is of critical importance in our questionnaire, since it makes manual readability assessment significantly easier and more reliable. We targeted teachers of Japanese or English literacy at schools for the deaf for the following reasons. Ideally, this sort of survey would be carried out on the population segment in question, i.e., deaf students in our study. In fact, pedagogists and psycholinguists have made tremendous efforts to examine the language proficiency of deaf students by giving them proficiency tests. Such efforts are very important, but they have had difficulty in capturing enough of the picture to develop a comprehensive and implementable model of this population's reading proficiency, because extensive language proficiency testing is expensive.</Paragraph> <Paragraph position="2"> In contrast, our approach is an attempt to model the knowledge of experts in this field (i.e., teachers of deaf students).
The targeted teachers not only have rich experiential knowledge of the language proficiency of their students but are also highly skilled in paraphrasing to aid their students' comprehension.</Paragraph> <Paragraph position="3"> Since such knowledge gleaned from individual experience already has some generality, extracting it through a survey should be less costly, and thus more comprehensive, than investigation based on language proficiency testing.</Paragraph> <Paragraph position="4"> In the questionnaire, each question consists of several paraphrases, as shown in Figure 1 (a), where (A) is a source sentence, and (B) and (C) are paraphrases of (A). Each respondent was asked to assess the relative readability of the paraphrases given for each source sentence, as shown in Figure 1 (b).</Paragraph> <Paragraph position="5"> Here, the respondent judged sentence (A) to be the most difficult and judged (B) and (C) to be comparable.</Paragraph> <Paragraph position="6"> A judgment that sentence s_i is more readable than sentence s_j means that s_i is judged likely to be understood by a larger subset of students than s_j.</Paragraph> <Paragraph position="7"> We asked the respondents to annotate the paraphrases with format-free comments giving the reasons for their judgments, alternative paraphrases, etc., as shown in Figure 1 (b).</Paragraph> <Paragraph position="8"> To make our questionnaire efficient for model acquisition, we had to carefully control the variation in the paraphrases. To do so, we first selected around 50 morpho-syntactic features that are considered influential in sentence readability for deaf people. For each of these features, we collected several simple example sentences from various sources (literacy textbooks, grammar references, etc.). We then manually produced several paraphrases of each collected sentence so as to remove from each paraphrase the feature characterizing the source sentence. For example, in Figure 1, the feature characterizing sentence (A) is a non-restrictive relative clause (i.e., sentence (A) was selected as an example of this feature); neither (B) nor (C) has this feature. We also controlled the lexical variety to minimize the effect of lexical factors on readability, restricting the vocabulary to a top-2000 basic word set (NIJL, 1991).</Paragraph> </Section> <Section position="2" start_page="1" end_page="1" type="sub_section"> <SectionTitle> 3.1.3 Administration </SectionTitle> <Paragraph position="0"> We administered a preliminary survey targeting three teachers. Through this survey, we observed that (a) the teachers largely agreed in their assessments of relative readability, (b) their format-free comments indicated that the observed differences in readability were largely explainable in terms of the morpho-syntactic features we had prepared, and (c) a larger-scale survey was needed to obtain a statistically reliable model. Based on these observations, we conducted a more comprehensive survey, in which we prepared 770 questions and sent questionnaires, each containing a random set of 240 of them, to teachers of Japanese or English literacy at 50 schools for the deaf. We asked them to evaluate as many as possible, anonymously.
We obtained 4080 responses in total (8.0 responses per question).</Paragraph> </Section> <Section position="3" start_page="1" end_page="1" type="sub_section"> <SectionTitle> 3.2 Readability ranking model </SectionTitle> <Paragraph position="0"> The task of ranking a set of paraphrases can be decomposed into comparisons between pairs of elements combinatorially selected from the set. We therefore consider the problem of judging which of a given pair of paraphrase sentences is more readable/comprehensible for deaf students. More specifically, given a paraphrase pair (s_1, s_2), the task is to classify it into one of three classes: s_1 is more readable than s_2, s_2 is more readable than s_1, or the two are comparable.</Paragraph> <Paragraph position="1"> Once the problem is formulated this way, we can use various existing techniques for classifier learning. So far, we have examined a method using the support vector machine (SVM) classification technique.</Paragraph> <Paragraph position="2"> A training/testing example is a paraphrase pair annotated with the degrees of readability assigned by the respondents; each degree is mapped to a real value in [0, 1] so that the lowest degree maps to 0 and the highest maps to 1. For example, the degree of readability assigned to (A) in Figure 1 (b) maps to around 0.1, whereas that assigned to (B) maps to around 0.9.</Paragraph> </Section> <Section position="4" start_page="1" end_page="1" type="sub_section"> <SectionTitle> 3.3 Evaluation and discussion </SectionTitle> <Paragraph position="0"> To evaluate the two modeling methods, we conducted a ten-fold cross validation on the set of 4055 paraphrase pairs derived from the 770 questions used in the survey. To create a feature vector space, we used 355 morpho-syntactic features. Feature annotation was done semi-automatically with the help of a morphological analyzer and a dependency parser.</Paragraph> <Paragraph position="1"> The task was to classify a given paraphrase pair into either left, right, or comparable. The model achieved 95% precision with 89% recall. This result confirmed that the data we collected through the questionnaires were reasonably noiseless and thus generalizable. Furthermore, both models exhibited a clear trade-off between recall and precision, indicating that their output scores can be used as a confidence measure.</Paragraph>
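<Paragraph position="2"> A minimal sketch of the pairwise formulation of Section 3.2 is given below, assuming scikit-learn. The three toy features, the training pairs, and the feature-difference encoding are ours for illustration (the comparable class is omitted for brevity); the actual model uses 355 morpho-syntactic features and the survey data described above.

# Pairwise readability classification with a linear SVM (illustrative only).
from sklearn.svm import SVC

# Each sentence is a toy binary feature vector, e.g.
# [has_relative_clause, has_cleft_construction, has_passive].
pairs = [
    ([1, 0, 0], [0, 0, 0]),   # s1 harder -> right side more readable
    ([0, 1, 0], [0, 0, 0]),
    ([0, 0, 0], [1, 1, 0]),   # s2 harder -> left side more readable
    ([0, 0, 0], [0, 1, 1]),
]
labels = ["right", "right", "left", "left"]

# Encode a pair as the difference of its feature vectors, a common trick
# for reducing pairwise ranking to ordinary classification.
X = [[a - b for a, b in zip(s1, s2)] for s1, s2 in pairs]

model = SVC(kernel="linear")
model.fit(X, labels)

test = [[1 - 0, 1 - 0, 0 - 0]]   # s1 has two hard features, s2 none
print(model.predict(test))       # expected: ['right']
</Paragraph>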
</Section> </Section> <Section position="5" start_page="1" end_page="1" type="metho"> <SectionTitle> 4 Paraphrase representation </SectionTitle> <Paragraph position="0"> We represent paraphrases as transfer patterns between dependency trees. In this section, we propose a three-layered formalism for representing such transfer patterns.</Paragraph> <Section position="1" start_page="1" end_page="1" type="sub_section"> <SectionTitle> 4.1 Types of paraphrases of concern </SectionTitle> <Paragraph position="0"> There are various levels of paraphrase, as the following examples demonstrate: (1) a. She burst into tears, and he tried to comfort her. b. She cried, and he tried to console her.</Paragraph> <Paragraph position="1"> (2) a. It was a Honda that John sold to Tom. b. John sold a Honda to Tom. c. Tom bought a Honda from John.</Paragraph> <Paragraph position="2"> (3) a. They got married three years ago. b. They got married in 2000.</Paragraph> <Paragraph position="3"> Lexical vs. structural paraphrases: Example (1) includes paraphrases of the single word "comfort" and of the canned phrase "burst into tears". The sentences in (2), on the other hand, exhibit structural, and thus more general, patterns of paraphrasing. Both types of paraphrases, lexical and structural, are considered useful for many applications, including reading assistance, and thus should be within the scope of our discussion.</Paragraph> <Paragraph position="4"> Atomic vs. compositional paraphrases: The process of paraphrasing (2a) into (2c) is compositional because it can be decomposed into two subprocesses, (2a) to (2b) and (2b) to (2c). In developing a resource for paraphrasing, we need only cover non-compositional (i.e., atomic) paraphrases; compositional paraphrases can be handled if an additional computational mechanism for combining atomic paraphrases is devised.</Paragraph> <Paragraph position="5"> Meaning-preserving vs. reference-preserving paraphrases: It is also useful to distinguish reference-preserving paraphrases from meaning-preserving ones. Example (3) above is of the reference-preserving type. This type of paraphrasing requires the computation of reference to objects outside the discourse and is thus excluded from our scope for the present purpose.</Paragraph> </Section> <Section position="2" start_page="1" end_page="1" type="sub_section"> <SectionTitle> 4.2 Dependency trees (MDSs) </SectionTitle> <Paragraph position="0"> Previous work on transfer-based machine translation (MT) suggests that dependency-based representations have the advantage of facilitating syntactic transformation operations (Meyers et al., 1996; Lavoie et al., 2000). Following this work, we adopt dependency trees as the internal representations of target texts.</Paragraph> <Paragraph position="1"> We suppose that a dependency tree consists of a set of nodes, each of which corresponds to a lexeme or compound, and a set of edges, each of which represents the dependency relation between its end points. We call such a dependency tree a morpheme-based dependency structure (MDS). Each node in an MDS is annotated with an open set of typed features that indicate morpho-syntactic and semantic information. We also assume a type hierarchy of dependency relations consisting of an open set of dependency classes, including dependency, compound, parallel, appositive and insertion.</Paragraph>
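<Paragraph position="2"> As an illustration, an MDS as just described can be realized by a structure like the following minimal Python sketch. The field names, the helper method, and the example tree are ours for illustration; they are not KURA's actual data model.

# Minimal sketch of an MDS: nodes carry an open feature set; edges carry
# a dependency class drawn from a fixed set of relation types.
from dataclasses import dataclass, field

DEP_CLASSES = {"dependency", "compound", "parallel", "appositive", "insertion"}

@dataclass
class MDSNode:
    lexeme: str                                    # lexeme or compound
    features: dict = field(default_factory=dict)   # open set of typed features
    children: list = field(default_factory=list)   # (relation, MDSNode) pairs

    def add(self, relation, child):
        assert relation in DEP_CLASSES             # relations must be typed
        self.children.append((relation, child))
        return child

# "She cried": the verb governs its subject via a dependency edge.
root = MDSNode("cry", {"pos": "verb", "tense": "past"})
root.add("dependency", MDSNode("she", {"pos": "pronoun", "case": "nominative"}))
print(root.lexeme, [(r, c.lexeme) for r, c in root.children])
</Paragraph>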
</Section> <Section position="3" start_page="1" end_page="1" type="sub_section"> <SectionTitle> 4.3 Three-layered representation </SectionTitle> <Paragraph position="0"> Previous work on transfer-based MT systems (Lavoie et al., 2000; Dorna et al., 1998) and on alignment-based transfer knowledge acquisition (Meyers et al., 1996; Richardson et al., 2001) has shown that transfer knowledge is best represented by declarative structure mapping (transformation) rules, each of which typically consists of a pair of source and target partial structures, as in the middle layer of Figure 2.</Paragraph> <Paragraph position="1"> [Figure 2: The three layers of rule representation. A simplified MDS transfer rule such as "N shika V-nai → V no wa N dake da." (someone does V to nothing but N → it is only to N that someone does V), produced by rule editing, is compiled by translation into a full MDS transfer rule, e.g. sp_rule(108, negation, RefNode) :- match(RefNode, X4=[pos:postp, lex:shika]), ...]</Paragraph> <Paragraph position="2"> Adopting such a tree-to-tree style of representation, however, one has to address the trade-off between expressiveness and comprehensibility. One wants a formalism for structural transformation patterns that is powerful enough to represent a sufficiently broad range of paraphrase patterns. However, highly expressive formalisms make it difficult to create and maintain rules manually.</Paragraph> <Paragraph position="3"> To mediate this trade-off, we devised a new layer of representation added on top of the tree-to-tree pattern representation layer, as illustrated in Figure 2. At this new layer, we use an extended natural language to specify transformation patterns. The language is designed to facilitate the task of hand-coding transformation rules. For example, to define the tree-to-tree transformation pattern given in the middle of Figure 2, a rule editor needs only to specify its simplified form: (4) N shika V-nai → V no wa N dake da.</Paragraph> <Paragraph position="4"> (Someone does V to nothing but N → It is only to N that someone does V) A rule of this form is then automatically translated into a fully specified tree-to-tree transformation rule. We call a rule of the latter form an MDS rewriting rule (SR rule), and a rule of the former form a simplified SR rule (SSR rule).</Paragraph> <Paragraph position="5"> The idea is that most of the specification of an SR rule can usually be abbreviated if a means of automatically completing it is provided. We use a parser and macros to do so; namely, the rule translator completes an SSR rule by macro expansion and parsing to produce the corresponding SR rule specification, as sketched at the end of this subsection. The advantages of introducing the SSR rule layer are the following:</Paragraph> <Paragraph position="6"> - The SSR rule formalism allows a rule writer to edit rules with an ordinary text editor, which makes rule editing much more efficient than providing her/him with a complex GUI-based tool for editing SR rules directly.</Paragraph> <Paragraph position="7"> - The use of the extended natural language also improves the readability of rules for rule writers, which is particularly important in group work.</Paragraph> <Paragraph position="8"> - To parse SSR rules, one can use the same parser as that used to parse input texts. This also improves the efficiency of rule development, because it significantly reduces the burden of keeping the POS-tag set used for parsing input consistent with that used for rule specifications.</Paragraph> <Paragraph position="9"> The SSR rule layer shares its underlying motivations with the formalism reported by Hermjakob et al. (2002). Our formalism, however, is considerably extended, so as to be licensed by the expressiveness of the SR rule representation and to allow annotation with various types of rule applicability conditions, including constraints on arbitrary features of nodes, structural constraints, logical specifications such as disjunction and negation, closures of dependency relations, optional constituents, etc.</Paragraph> <Paragraph position="10"> The two layers of paraphrase representation are fully implemented in our paraphrasing engine KURA (Takahashi et al., 2001), coupled with a third layer for processing MDSs (the bottom layer illustrated in Figure 2). The whole KURA system and part of the transfer rules implemented on it (see Section 5 below) are available at http://cl.aistnara.ac.jp/lab/kura/doc/.</Paragraph>
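<Paragraph position="11"> To convey the flavor of this two-layer arrangement, the toy Python sketch below "compiles" a simplified string-level rule with slot variables into a crude structural source/target pattern and applies it. The slot syntax, the tokenizer, and the rule format are invented for illustration and are far simpler than the SSR/SR machinery of KURA.

# Toy sketch: compile a simplified rule into a structural pattern pair,
# then apply it by matching literals and binding slot variables (N, V).
def parse_pattern(pattern):
    # an uppercase token is a slot variable; anything else is a literal
    return [("slot", t) if t.isupper() else ("lit", t) for t in pattern.split()]

def compile_rule(ssr):
    src, tgt = (parse_pattern(p.strip()) for p in ssr.split("->"))
    return src, tgt                       # the compiled, "SR-like" rule

def apply_rule(rule, tokens):
    src, tgt = rule
    if len(tokens) != len(src):
        return None
    bindings = {}
    for (kind, value), tok in zip(src, tokens):
        if kind == "lit" and value != tok:
            return None                   # literal mismatch: rule not applicable
        if kind == "slot":
            bindings[value] = tok         # bind slot variable to the token
    return [bindings[v] if k == "slot" else v for k, v in tgt]

rule = compile_rule("N shika V nai -> V no wa N dake da")
print(apply_rule(rule, ["sushi", "shika", "tabe", "nai"]))
# -> ['tabe', 'no', 'wa', 'sushi', 'dake', 'da']
</Paragraph>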
</Section> </Section> <Section position="6" start_page="1" end_page="1" type="metho"> <SectionTitle> 5 Post-transfer error detection </SectionTitle> <Paragraph position="0"> What kinds of transfer errors tend to occur in lexical and structural paraphrasing? To find out, we conducted a preliminary investigation. This section summarizes the results; see (Fujita and Inui, 2002) for further details.</Paragraph> <Paragraph position="1"> We implemented over 28,000 transfer rules for Japanese paraphrases on the KURA paraphrasing engine, based on the rules previously reported in (Sato, 1999; Kondo et al., 1999; Kondo et al., 2001; Iida et al., 2001) and on existing lexical resources such as thesauri and case frame dictionaries. The implemented rules ranged from lexical paraphrases, such as those that replace a word with a synonym, to syntactic/structural paraphrases, such as those that remove a cleft construction from a sentence, divide a sentence, etc. We then fed KURA a set of 1,220 sentences randomly sampled from newspaper articles and obtained 630 transferred output sentences.</Paragraph> <Paragraph position="2"> The following are the tendencies we observed:</Paragraph> <Paragraph position="3"> - The transfer errors observed in the experiment exhibited a wide variety, ranging from morphological errors to semantic and discourse-related ones.</Paragraph> <Paragraph position="4"> - Most types of errors tended to occur regardless of the type of transfer. This suggests that an error detection module specialized for a particular error type would work across different types of transfer.</Paragraph> <Paragraph position="5"> - The most frequent error type involved inappropriate conjugation forms of verbs. This, however, is a matter of morphological generation and can be easily resolved.</Paragraph> <Paragraph position="6"> - Errors involving verb valency and selectional restrictions also tended to be frequent and fatal, and thus deserve priority as a research topic.</Paragraph> <Paragraph position="7"> - The next most frequent error type was related to differences in meaning between near-synonyms. Errors of this type, however, could often be detected by a model capable of detecting errors of verb valency and selectional restriction.</Paragraph> <Paragraph position="8"> Based on these observations, we concluded that the detection of incorrect verb valences and verb-complement co-occurrences was one of the most serious problems and should be given priority as a research topic. We are now conducting experiments on empirical methods for detecting this type of error (Fujita et al., 2003).</Paragraph> </Section> </Paper>