<?xml version="1.0" standalone="yes"?> <Paper uid="P06-2034"> <Title>Discriminative Reranking for Semantic Parsing</Title> <Section position="4" start_page="263" end_page="264" type="intro"> <SectionTitle> 2 Background </SectionTitle> <Paragraph position="0"/>
<Section position="1" start_page="263" end_page="263" type="sub_section"> <SectionTitle> 2.1 Application Domains </SectionTitle> <Paragraph position="0"> 2.1.1 CLANG: the RoboCup Coach Language RoboCup (www.robocup.org) is an international AI research initiative using robotic soccer as its primary domain. In the Coach Competition, teams of agents compete on a simulated soccer field and receive advice from a team coach in a formal language called CLANG. In CLANG, tactics and behaviors are expressed in terms of if-then rules. As described in Chen et al. (2003), its grammar consists of 37 non-terminal symbols and 133 productions. Negation and quantifiers like all are included in the language. Below is a sample rule with its English gloss: ((bpos (penalty-area our)) (do (player-except our {4}) (pos (half our)))) &quot;If the ball is in our penalty area, all our players except player 4 should stay in our half.&quot; 2.1.2 GEOQUERY: a DB Query Language GEOQUERY is a logical query language for a small database of U.S. geography containing about 800 facts. The GEOQUERY language consists of Prolog queries augmented with several meta-predicates (Zelle and Mooney, 1996). Negation and quantifiers like all and each are included in the language. Below is a sample query with its English gloss: answer(A,count(B,(city(B),loc(B,C), const(C,countryid(usa))),A)) &quot;How many cities are there in the US?&quot;</Paragraph> </Section>
<Section position="2" start_page="263" end_page="264" type="sub_section"> <SectionTitle> 2.2 SCISSOR: the Baseline Model </SectionTitle> <Paragraph position="0"> SCISSOR is based on a fairly standard approach to compositional semantics (Jurafsky and Martin, 2000). First, a statistical parser is used to construct a semantically-augmented parse tree (SAPT) that captures the semantic interpretation of individual words and the basic predicate-argument structure of a sentence. Next, a recursive deterministic procedure composes the meaning representation (MR) of each parent node from the MRs of its children, following the tree structure.</Paragraph>
<Paragraph position="1"> Figure 1 shows the SAPT for a simple natural language phrase describing the concept PLAYER in CLANG. Each internal node in the parse tree is annotated with a semantic label (shown after a dash) representing a concept in the application domain; a node that is semantically vacuous in the domain is assigned the semantic label NULL. The semantic labels on words and non-terminal nodes represent the meanings of these words and constituents, respectively. For example, the word our represents a TEAM concept in CLANG with the value our, whereas the constituent OUR PLAYER 2 represents a PLAYER concept. Some concepts, such as TEAM and UNUM (uniform number), are types that take no arguments, while others, which we refer to as predicates, take an ordered list of arguments; PLAYER, for example, requires both a TEAM and a UNUM as its arguments.</Paragraph>
<Paragraph position="2"> SAPTs are given to a meaning composition process that builds MRs guided by both the tree structure and the domain's predicate-argument requirements. In Figure 1, the MRs of our and 2 fill the arguments of PLAYER to generate the MR of the whole constituent, PLAYER(OUR,2).</Paragraph>
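<Paragraph> As a concrete illustration of this composition step, the short Python sketch below (our own illustration, not SCISSOR's actual code; the Node class, the toy predicate-argument table, and the compose function are assumptions made for the example) recursively fills a predicate's argument slots with the MRs of its children, producing the MR of the constituent in Figure 1.

# Illustrative sketch of SAPT meaning composition (not the original SCISSOR code).
# Assumed toy predicate-argument table: PLAYER takes a TEAM and a UNUM, in that order.
PREDICATE_ARGS = {"PLAYER": ["TEAM", "UNUM"]}

class Node:
    def __init__(self, label, word=None, children=None):
        self.label = label            # semantic label, e.g. PLAYER, TEAM, UNUM, NULL
        self.word = word              # lexical value at leaves, e.g. "our", "2"
        self.children = children or []

def compose(node):
    """Compose the MR of a node bottom-up from the MRs of its children."""
    if not node.children:             # leaf: the MR is the word itself
        return None if node.label == "NULL" else node.word
    child_mrs = [(c.label, compose(c)) for c in node.children if c.label != "NULL"]
    if node.label in PREDICATE_ARGS:  # predicate node: fill its argument slots in order
        args = [next(mr for lab, mr in child_mrs if lab == arg_type)
                for arg_type in PREDICATE_ARGS[node.label]]
        return "%s(%s)" % (node.label, ",".join(args))
    return child_mrs[0][1]            # type node: pass the single child's MR upward

# SAPT for "our player 2": the head word names the PLAYER predicate,
# while "our" supplies the TEAM argument and "2" the UNUM argument.
sapt = Node("PLAYER", children=[Node("TEAM", word="our"),
                                Node("PLAYER", word="player"),
                                Node("UNUM", word="2")])
print(compose(sapt))                  # -> PLAYER(our,2)
</Paragraph>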
<Paragraph position="3"> SCISSOR is implemented by augmenting Collins' (1997) head-driven parsing model II to incorporate the generation of semantic labels on internal nodes. In a head-driven parsing model, a tree is generated by recursively expanding non-terminals with grammar rules.</Paragraph>
<Paragraph position="4"> To deal with the sparse-data problem, the expansion of a non-terminal (parent) is decomposed into primitive steps: a child is chosen as the head and is generated first, and then the other children (modifiers) are generated independently, constrained by the head. Here, we describe only the changes made to SCISSOR for reranking; for a full description of SCISSOR, see Ge and Mooney (2005).</Paragraph>
[Table 1: Extended back-off levels for the parameter PL1(Li|...), using the same notation as in Ge and Mooney (2005). The symbols P, H, and Li are the semantic labels of the parent, the head, and the ith left child; w is the head word of the parent; t is the semantic label of the head word; d is the distance between the head and the modifier; and LC is the left semantic subcat.]
<Paragraph position="5"> In SCISSOR, the generation of semantic labels on modifiers is constrained by semantic subcategorization frames, for which data can be very sparse. An example of a semantic subcat in Figure 1 is that the head PLAYER associated with NN requires a TEAM as its modifier. Although this constraint improves SCISSOR's precision, which is important for semantic parsing, it also limits its recall. To generate enough candidate SAPTs for reranking, we extended the back-off levels for the parameters generating the semantic labels of modifiers. The new set is shown in Table 1, using the parameters for the generation of left-side modifiers as an example. Back-off levels 4 and 5 are newly added by removing the constraints imposed by the semantic subcat. Although the best SAPTs found by the model may not be as precise as before, we expect reranking to improve the results and rank correct SAPTs higher.</Paragraph> </Section>
<Section position="3" start_page="264" end_page="264" type="sub_section"> <SectionTitle> 2.3 The Averaged Perceptron Reranking Model </SectionTitle> <Paragraph position="0"> The averaged perceptron (Collins, 2002a) has been successfully applied to several tagging and parsing reranking tasks (Collins, 2002c; Collins, 2002a), and in this paper we employ it to rerank the semantic parses generated by the base semantic parser SCISSOR. The model is composed of three parts (Collins, 2002a): a set of candidate SAPTs GEN, here the top n SAPTs of a sentence produced by SCISSOR; a function Φ that maps a sentence x and its SAPT y into a feature vector Φ(x,y) ∈ R^d; and a weight vector W associated with the set of features. Each feature in a feature vector is a function on a SAPT that maps the SAPT to a real value. The SAPT with the highest score under the parameter vector W is output, where the score is the inner product Φ(x,y) · W.</Paragraph>
<Paragraph position="1"> The perceptron training algorithm for estimating the parameter vector W is shown in Figure 2; for a full description of the algorithm, see Collins (2002a).</Paragraph>
[Figure 2: The perceptron training algorithm. Inputs: a set of training examples (xi, y*i), i = 1...n, where xi is a sentence and y*i is the candidate SAPT with the highest similarity score to the gold-standard SAPT.]
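<Paragraph> To make the reranking setup concrete, here is a minimal Python sketch of a perceptron reranker under our own assumptions (the data layout, feature vectors, and function names are invented for illustration and are not the authors' implementation): training uses the standard mistake-driven updates, candidates are scored by the inner product Φ(x,y) · W, and the averaged variant discussed below returns the average of the weight vectors accumulated during training.

import numpy as np

def train_perceptron(train_data, num_features, epochs=10, average=True):
    """Perceptron reranker sketch (assumed data layout, not the authors' code).

    train_data: list of (candidates, gold_index) pairs, where candidates is an
    n-best list from the base parser, each candidate given as a feature vector
    Phi(x, y) of length num_features, and gold_index marks the candidate SAPT
    most similar to the gold standard.
    """
    w = np.zeros(num_features)
    w_sum = np.zeros(num_features)            # running sum for the averaged variant
    steps = 0
    for _ in range(epochs):
        for candidates, gold_index in train_data:
            feats = np.array(candidates)
            predicted = int(np.argmax(feats @ w))   # current best candidate under W
            if predicted != gold_index:             # mistake-driven update
                w += feats[gold_index] - feats[predicted]
            w_sum += w
            steps += 1
    return w_sum / steps if average else w

def rerank(candidates, w):
    """Return the index of the candidate SAPT with the highest score Phi(x,y) . W."""
    return int(np.argmax(np.array(candidates) @ w))

With the top n SAPTs per sentence from SCISSOR as GEN, rerank(candidates, w) simply returns the candidate whose features score highest under the learned weights.
</Paragraph>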
<Paragraph position="2"> The averaged perceptron, a variant of the perceptron algorithm, is often used at test time to reduce generalization error on unseen examples: the parameter vector used in testing is the average of the parameter vectors generated during training.</Paragraph> </Section> </Section> </Paper>