<?xml version="1.0" standalone="yes"?> <Paper uid="P06-2034">
<Title>Discriminative Reranking for Semantic Parsing</Title>
<Section position="5" start_page="264" end_page="266" type="metho"> <SectionTitle> 3 Features for Reranking SAPTs </SectionTitle>
<Paragraph position="0"> In our setting, reranking models discriminate between SAPTs that can lead to correct MRs and those that cannot. Intuitively, both syntactic and semantic features describing the syntactic and semantic substructures of a SAPT should be good indicators of the SAPT's correctness.</Paragraph>
<Paragraph position="1"> The syntactic features introduced by Collins (2000) for reranking syntactic parse trees have proven successful for both English and Spanish (Cowan and Collins, 2005). We examine whether these syntactic features can be adapted for semantic parsing by creating similar semantic features. In the following section, we first briefly describe the syntactic features introduced by Collins (2000), and then introduce two adapted semantic feature sets. A SAPT in CLANG is shown in Figure 3 and is used to illustrate the features throughout this section.</Paragraph>
<Paragraph position="2"> In Figure 3, brackets and commas are written as -LRB-, -RRB-, and COMMA for a clearer description of features, and the NULL semantic labels are not shown. The head of the rule &quot;PRN-POINT → -LRB-POINT NP-NUM1 COMMA NP-NUM2 -RRB-&quot; is -LRB-POINT. The semantic labels NUM1 and NUM2 are meta concepts in CLANG specifying the semantic role filled, since NUM can fill multiple semantic roles in the predicate POINT.</Paragraph>
<Section position="1" start_page="265" end_page="265" type="sub_section"> <SectionTitle> 3.1 Syntactic Features </SectionTitle>
<Paragraph position="0"> All syntactic features introduced by Collins (2000) are included for reranking SAPTs. While a full description of these features is beyond the scope of this paper, we briefly introduce several feature types here to prepare for the semantic features introduced later.</Paragraph>
<Paragraph position="1"> 1. Rules. These are the counts of unique syntactic context-free rules in a SAPT. The example in Figure 3 has the feature f(PRN → -LRB- NP COMMA NP -RRB-)=1.</Paragraph>
<Paragraph position="2"> 2. Bigrams. These are the counts of unique bigrams of syntactic labels in a constituent. Each bigram feature also records the syntactic label of the constituent and the bigram's direction (left or right) relative to the head of the constituent. The example in Figure 3 has the feature f(NP COMMA, right, PRN)=1.</Paragraph>
<Paragraph position="3"> 3. Grandparent Rules. These are the same as Rules, but also include the syntactic label above a rule. The example in Figure 3 has the feature f([PRN → -LRB- NP COMMA NP -RRB-], NP)=1, where NP is the syntactic label above the rule &quot;PRN → -LRB- NP COMMA NP -RRB-&quot;.</Paragraph>
<Paragraph position="4"> 4. Grandparent Bigrams. These are the same as Bigrams, but also include the syntactic label above the constituent containing a bigram. The example in Figure 3 has the feature f([NP COMMA, right, PRN], NP)=1, where NP is the syntactic label above the constituent PRN.</Paragraph>
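<Paragraph position="5"> To make the four feature types above concrete, the sketch below (not from the paper) counts them over a toy tree. The Node class, the head-index convention, and the simplified treatment of bigrams that span the head are assumptions made only for this illustration; Collins (2000) defines the full feature set.

from collections import Counter

class Node:
    def __init__(self, syn, children=None, head=0):
        self.syn = syn               # syntactic label, e.g. "PRN"
        self.children = children or []
        self.head = head             # index of the head child

def syntactic_features(node, parent="TOP", feats=None):
    feats = Counter() if feats is None else feats
    if not node.children:
        return feats
    kids = [c.syn for c in node.children]
    rule = node.syn + " -> " + " ".join(kids)
    feats["Rule: " + rule] += 1
    feats["GrandparentRule: [" + rule + "], " + parent] += 1
    for i in range(len(kids) - 1):
        # direction of the bigram relative to the head child; bigrams that
        # span the head are treated in a simplified way in this sketch
        direction = "right" if i > node.head else "left"
        bigram = kids[i] + " " + kids[i + 1]
        feats["Bigram: " + bigram + ", " + direction + ", " + node.syn] += 1
        feats["GrandparentBigram: [" + bigram + ", " + direction + ", "
              + node.syn + "], " + parent] += 1
    for child in node.children:
        syntactic_features(child, node.syn, feats)
    return feats

# The PRN constituent of Figure 3: PRN -> -LRB- NP COMMA NP -RRB-, headed by -LRB-.
prn = Node("PRN", [Node("-LRB-"), Node("NP"), Node("COMMA"), Node("NP"), Node("-RRB-")])
for feat, count in syntactic_features(prn, parent="NP").items():
    print(feat, count)

Run on the PRN constituent of Figure 3, this produces, among others, the Rule, Bigram, Grandparent Rule, and Grandparent Bigram features used as examples above.</Paragraph>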
</Section>
<Section position="2" start_page="265" end_page="266" type="sub_section"> <SectionTitle> 3.2 Semantic Features </SectionTitle>
<Paragraph position="0"> A similar semantic feature type is introduced for each syntactic feature type used by Collins (2000) by replacing syntactic labels with semantic ones (the semantic label NULL is not included). The corresponding semantic feature types for the features in Section 3.1 are: 1. Rules. The example in Figure 3 has the feature f(POINT → POINT NUM1 NUM2)=1.</Paragraph>
<Paragraph position="1"> 2. Bigrams. The example in Figure 3 has the feature f(NUM1 NUM2, right, POINT)=1, where the bigram &quot;NUM1 NUM2&quot; appears to the right of the head POINT.</Paragraph>
<Paragraph position="2"> 3. Grandparent Rules. The example in Figure 3 has the feature f([POINT → POINT NUM1 NUM2], POINT)=1, where the last POINT is the semantic label above the semantic rule &quot;POINT → POINT NUM1 NUM2&quot;.</Paragraph>
<Paragraph position="3"> 4. Grandparent Bigrams. The example in Figure 3 has the feature f([NUM1 NUM2, right, POINT], POINT)=1, where the last POINT is the semantic label above the POINT associated with PRN.</Paragraph>
<Paragraph position="4"> Purely-syntactic structures exist in SAPTs where no meaning composition is involved, such as the expansions from NP to PRN and from PP to &quot;TO NP&quot; in Figure 3. One possible drawback of the semantic features derived directly from SAPTs, as in Section 3.2.1, is that they can include features involving no meaning composition, which are intuitively not very useful. For example, the nodes with the purely-syntactic expansions mentioned above would trigger a semantic rule feature in which the meaning is unchanged (from POINT to POINT). Another possible drawback of these features is that features covering broader context can fail to capture the real high-level meaning composition. For example, the Grandparent Rule example in Section 3.2.1 has POINT as the semantic grandparent of a POINT composition, rather than the real grandparent, ACTION.PASS.</Paragraph>
<Paragraph position="5"> To address these problems, another semantic feature set is introduced by deriving semantic features from trees in which the purely-syntactic nodes of SAPTs are removed (the resulting tree for the SAPT in Figure 3, with syntactic labels omitted, is shown in Figure 4). In this tree representation, the example in Figure 4 would have the Grandparent Rule feature f([POINT → POINT NUM1 NUM2], ACTION.PASS)=1, where ACTION.PASS now appears as the semantic label above the rule.</Paragraph>
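<Paragraph position="6"> As a rough illustration of the tree transformation just described, the sketch below (not the authors' code) collapses purely-syntactic SAPT nodes before semantic features are extracted. The SAPTNode class is invented for this example, and the removal criterion used here, collapsing a node when exactly one child carries a non-NULL semantic label identical to the node's own so that no meaning is composed, is one plausible reading of the description above rather than a definition quoted from the paper.

class SAPTNode:
    def __init__(self, syn, sem, children=None):
        self.syn = syn                   # syntactic label, e.g. "NP"
        self.sem = sem                   # semantic label, e.g. "POINT" or "NULL"
        self.children = children or []

def remove_purely_syntactic(node):
    # Recursively build the semantic tree: NULL leaves are pruned and
    # purely-syntactic nodes are collapsed into their single meaningful child.
    kids = [remove_purely_syntactic(c) for c in node.children]
    kids = [k for k in kids if k is not None]
    if node.sem == "NULL" and not kids:
        return None                      # a leaf carrying no meaning disappears
    meaningful = [k for k in kids if k.sem != "NULL"]
    if len(meaningful) == 1 and meaningful[0].sem == node.sem:
        return meaningful[0]             # no meaning composition: drop this node
    return SAPTNode(None, node.sem, kids)

Semantic features could then be extracted from the resulting tree with the same procedure sketched in Section 3.1, using semantic labels in place of syntactic ones.</Paragraph>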
</Section> </Section>
<Section position="6" start_page="266" end_page="267" type="metho"> <SectionTitle> 4 Experimental Evaluation </SectionTitle> <Paragraph position="0"/>
<Section position="1" start_page="266" end_page="267" type="sub_section"> <SectionTitle> 4.1 Experimental Methodology </SectionTitle>
<Paragraph position="0"> Two corpora of natural language sentences paired with MRs were used in the reranking experiments.</Paragraph>
<Paragraph position="1"> For CLANG, 300 pieces of coaching advice were randomly selected from the log files of the 2003 RoboCup Coach Competition. Each formal instruction was translated into English by one of four annotators (Kate et al., 2005). The average length of a natural language sentence in this corpus is 22.52 words. For GEOQUERY, 250 questions were collected by asking undergraduate students to generate English queries for the given database. Queries were then manually translated into logical form (Zelle and Mooney, 1996). The average length of a natural language sentence in this corpus is 6.87 words.</Paragraph>
<Paragraph position="2"> We adopted standard 10-fold cross validation for evaluation: 9/10 of the whole dataset was used for training (the training set), and 1/10 for testing (the test set). To train a reranking model on a training set, a separate &quot;internal&quot; 10-fold cross validation over the training set was employed to generate n-best SAPTs for each training example using a baseline learner: the training set was again split into 10 folds, with 9/10 used for training the baseline learner and 1/10 for producing the n-best SAPTs used to train the reranker. Reranking models trained in this way ensure that the n-best SAPTs for each training example are not generated by a baseline model that has already seen that example. To test a reranking model on a test set, a baseline model trained on the whole training set was used to generate n-best SAPTs for each test example, and the reranking model trained as above was then used to choose the best SAPT from the candidate SAPTs.</Paragraph>
<Paragraph position="3"> The performance of semantic parsing was measured in terms of precision (the percentage of completed MRs that were correct), recall (the percentage of all sentences whose MRs were correctly generated), and F-measure (the harmonic mean of precision and recall). Since even a single mistake in an MR can totally change the meaning of an example (e.g., having OUR in an MR instead of OPPONENT in CLANG), no partial credit was given for examples with partially-correct SAPTs.</Paragraph>
<Paragraph position="4"> The averaged perceptron (Collins, 2002a), which has been successfully applied to several tagging and parsing reranking tasks (Collins, 2002c; Collins, 2002a), was employed for training reranking models. To choose the correct SAPT of a training example, as required for training the averaged perceptron, we selected a SAPT that results in the correct MR; if multiple such SAPTs existed, the one with the highest baseline score was chosen. Since no partial credit was awarded in evaluation, a training example was discarded if it had no correct SAPT. Rerankers were trained on the 50-best SAPTs provided by SCISSOR, and the number of perceptron iterations over the training examples was limited to 10. Typically, to avoid over-fitting, reranking features are filtered by removing those occurring in fewer than some minimal number of training examples. We only removed features that never occurred in the training data, since experiments with higher cut-offs failed to show any improvements.</Paragraph>
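<Paragraph position="5"> As an illustration of the training procedure described above, the sketch below implements an averaged-perceptron reranker in the style of Collins (2002a); it is a reconstruction for exposition, not the authors' code. The candidate feature dictionaries, the choice of gold SAPT (the correct-MR candidate with the highest baseline score), and the prior removal of examples without any correct SAPT are assumed to be handled upstream.

from collections import defaultdict

def train_reranker(train_data, n_iters=10):
    # train_data: list of (candidates, gold_index) pairs, where candidates is a
    # list of feature dicts for the n-best SAPTs of one sentence and gold_index
    # points at the chosen correct SAPT.
    w = defaultdict(float)           # current weight vector
    w_sum = defaultdict(float)       # running sum of weights for averaging
    count = 0
    for _ in range(n_iters):
        for candidates, gold_index in train_data:
            # score each candidate under the current weights
            scores = [sum(w[f] * v for f, v in feats.items()) for feats in candidates]
            pred = max(range(len(candidates)), key=lambda i: scores[i])
            if pred != gold_index:
                # standard perceptron update toward the gold SAPT's features
                for f, v in candidates[gold_index].items():
                    w[f] += v
                for f, v in candidates[pred].items():
                    w[f] -= v
            for f, v in w.items():
                w_sum[f] += v
            count += 1
    return {f: v / count for f, v in w_sum.items()}   # averaged weights

At test time, the candidate with the highest score under the averaged weights would be returned as the final SAPT.</Paragraph>
</Section> </Section> </Paper>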