File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/03/n03-2013_metho.xml
Size: 5,382 bytes
Last Modified: 2025-10-06 14:08:16
<?xml version="1.0" standalone="yes"?> <Paper uid="N03-2013"> <Title>Automatic Expansion of Equivalent Sentence Set Based on Syntactic Substitution</Title> <Section position="3" start_page="0" end_page="1" type="metho"> <SectionTitle> 2 Acquisition of Paraphrasing Rules: Hierarchical Phrase Alignment </SectionTitle> <Paragraph position="0"> Hierarchical Phrase Alignment (HPA) is based on the assumption that an &quot;equivalent phrase pair has the same information and the same grammatical role.&quot; We decompose this assumption into the following two conditions for computation. The words in the phrase pair correspond, with no deficiency and no excess.</Paragraph> <Paragraph position="1"> The phrases are of the same syntactic category.</Paragraph> <Paragraph position="2"> HPA is therefore the task of extracting phrase pairs that satisfy these two conditions. The procedure is summarized as follows.</Paragraph> <Paragraph position="3"> 1. Tag and parse two equivalent sentences.</Paragraph> <Paragraph position="4"> 2. Extract corresponding words (called word links) between the sentences. In this paper, we regard identical words and words that belong to the same group in a thesaurus as word links.</Paragraph> <Paragraph position="5"> 3. Check all combinations of syntactic nodes between the sentences. If a node pair satisfies the above two conditions, output the pair as an equivalent phrase. That is, if no word in either phrase links to a word outside the other phrase, and the nodes have the same category, the phrase pair is regarded as equivalent. Figure 1 shows an example of equivalent phrase extraction from source equivalent sentences. The upper sentence is interrogative, the lower sentence is imperative, and they have the same meaning. For example, the upper phrase &quot;get me&quot; is a VP and contains two word links. However, no node in the lower sentence contains only the links 'get' and 'me'. 
On the other hand, the upper phrase &quot;get me a taxi&quot; contains four word links that correspond to the lower phrase &quot;get a taxi for me,&quot; and the two phrases have the same syntactic category. Therefore, the node pair VP(4) is regarded as an equivalent phrase.</Paragraph> <Paragraph position="6"> By iterating the above process, HPA extracts eight nodes as equivalents from the source sentences shown in Figure 1. Excluding the identical phrases, the following three phrase pairs are acquired as equivalent phrases.</Paragraph> <Paragraph position="7"> (1) &quot;get me a taxi&quot; and &quot;get a taxi for me&quot;; (2) &quot;10 in the morning&quot; and &quot;10 a.m.&quot;; (3) &quot;at 10 in the morning&quot; and &quot;at 10 a.m.&quot; HPA can extract phrasal correspondences from source equivalent sentences even if their sentence structures are significantly different. In addition, because node pairs have to be in the same syntactic category, unparaphrasable correspondences, such as &quot;morning&quot; and &quot;a.m.,&quot; are ignored even though they have word links.</Paragraph> </Section>
<Section position="4" start_page="1" end_page="1" type="metho"> <SectionTitle> 3 Expansion of Equivalent Sentence Set </SectionTitle> <Paragraph position="0"> The equivalent phrases extracted by HPA are substitutable with one another because they are semantically and grammatically equivalent. Therefore, they are regarded as bi-directional paraphrasing rules. When we paraphrase from any N sentences, target equivalent sentences are generated by the following procedure, where Steps 1 to 3 correspond to the acquisition phase and Steps 4 and 5 correspond to the generation phase.</Paragraph> <Paragraph position="2"> The original method of HPA has two additional features: 1) ambiguity of parsing is resolved by comparing the parse trees of the input sentences, and 2) partial parsing is employed to analyze irregular sentences. Details are described in (Imamura, 2001). Figure 1. English Equivalent Sentences: &quot;Would you get me a taxi at 10 in the morning?&quot; and &quot;Please get a taxi for me at 10 a.m.&quot; (The lines between the sentences denote word links, the trees denote parsing results, and the numbers on the nodes denote corresponding equivalent phrases.)</Paragraph> <Paragraph position="3"> 1. First, select one sentence from the source equivalent sentence set.</Paragraph> <Paragraph position="4"> 2. Run HPA between the selected sentence and each of the remaining (N - 1) sentences, and extract equivalent phrases.</Paragraph> <Paragraph position="5"> 3. Repeat Steps 1 and 2 for all combinations of the source sentences. This yields all phrases that make up the source set, together with their paraphrasing rules. 4. Next, select one tree created by HPA from the source equivalent sentence set, and trace the tree top-down. If a node registered in the paraphrasing rules is found, substitute the equivalent phrase for the node. Substitution is done recursively until it reaches a leaf.</Paragraph> <Paragraph position="6"> 5. Repeat Step 4 with all sentences in the source set. For example, when the source equivalent sentence set contains only the two sentences shown in Figure 1, the following six sentences are generated. Our method generates all sentences constructed from the phrases of N sentences.</Paragraph> </Section> </Paper>
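The acquisition and generation phases described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two parse trees are hand-built toy parses of the Figure 1 sentences, the one-entry thesaurus is hypothetical, and all function names (`hpa`, `acquire`, `expand`, `generate`) are invented for illustration. The first condition is interpreted here as "both phrases touch exactly the same non-empty set of word links", which is one possible reading of "no deficiency and no excess". Words are assumed to be unique within a sentence, and the recursion assumes that no node shares both its category and its link set with one of its own ancestors.

```python
from itertools import product

# A node is (category, child, ...); a child is a node or a word (str).
# Hand-built toy parses (an assumption), not the parser output of the paper.
UPPER = ("S", "would", "you",
         ("VP",
          ("VP", ("VP", "get", "me"), ("NP", "a", "taxi")),
          ("PP", "at", ("NP", "10", ("PP", "in", "the", "morning")))))
LOWER = ("S", "please",
         ("VP",
          ("VP", "get", ("NP", "a", "taxi"), ("PP", "for", "me")),
          ("PP", "at", ("NP", "10", "a.m."))))

THESAURUS = {"morning": "time-am", "a.m.": "time-am"}  # hypothetical group id


def words(node):
    """Return the words under a node, left to right."""
    out = []
    for child in node[1:]:
        out.extend([child] if isinstance(child, str) else words(child))
    return out


def phrase_nodes(node):
    """Yield every phrase (non-leaf) node of a tree, top-down."""
    yield node
    for child in node[1:]:
        if not isinstance(child, str):
            yield from phrase_nodes(child)


def word_links(a_tree, b_tree):
    """Link identical words and words in the same thesaurus group."""
    return {(u, l) for u, l in product(words(a_tree), words(b_tree))
            if u == l or (u in THESAURUS and THESAURUS[u] == THESAURUS.get(l))}


def hpa(a_tree, b_tree):
    """Return the node pairs that satisfy both equivalence conditions."""
    links = word_links(a_tree, b_tree)
    pairs = []
    for a, b in product(phrase_nodes(a_tree), phrase_nodes(b_tree)):
        wa, wb = set(words(a)), set(words(b))
        from_a = {lk for lk in links if lk[0] in wa}  # links touching phrase a
        into_b = {lk for lk in links if lk[1] in wb}  # links touching phrase b
        # Condition 1: same non-empty link set. Condition 2: same category.
        if from_a and from_a == into_b and a[0] == b[0]:
            pairs.append((a, b))
    return pairs


def acquire(trees):
    """Steps 1-3: run HPA on every ordered pair of distinct source trees."""
    rules = {}
    for a_tree, b_tree in product(trees, repeat=2):
        if a_tree != b_tree:
            for a, b in hpa(a_tree, b_tree):
                rules.setdefault(a, {a}).add(b)
    return rules


def expand(node, rules):
    """Step 4: return all word tuples reachable by top-down substitution."""
    results = set()
    for alt in rules.get(node, {node}):  # the node itself plus its equivalents
        parts = [[(c,)] if isinstance(c, str) else expand(c, rules)
                 for c in alt[1:]]
        for combo in product(*parts):
            results.add(sum(combo, ()))  # concatenate the child word tuples
    return results


def generate(trees):
    """Step 5: expand every source tree and collect the sentences."""
    rules = acquire(trees)
    return {" ".join(ws) for tree in trees for ws in expand(tree, rules)}


sources = {" ".join(words(t)) for t in (UPPER, LOWER)}
new_sentences = sorted(generate([UPPER, LOWER]) - sources)
for sentence in new_sentences:
    print(sentence)
```

On these toy trees the sketch pairs "get me a taxi" with "get a taxi for me" but finds no partner for "get me", and it produces six sentences beyond the two sources. Rules are stored per node rather than as transitive equivalence classes, so with more than two source sentences a fuller implementation would probably need to merge the classes first.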