<?xml version="1.0" standalone="yes"?> <Paper uid="N01-1003"> <Title>SPoT: A Trainable Sentence Planner</Title> <Section position="10" start_page="21" end_page="21" type="concl"> <SectionTitle> 7 Discussion </SectionTitle> <Paragraph position="0"> We have presented SPoT, a trainable sentence planner.</Paragraph> <Paragraph position="1"> SPoT re-conceptualizes the sentence planning task as consisting of two distinct phases: (1) a very simple sentence plan generator (SPG) that generates multiple candidate sentence plans using weighted randomization; and (2) a sentence plan ranker (SPR), trainable from examples via human feedback, whose job is to rank the candidate sentence plans and select the highest-ranked plan. Our results show that: • SPoT's SPR selects sentence plans that are, on average, only 5% worse than the sentence plan(s) ranked best by human judges.</Paragraph> <Paragraph position="2"> • SPoT's SPR selects sentence plans that are, on average, 36% better than those of a random SPR that selects randomly among the candidate sentence plans.</Paragraph> <Paragraph position="3"> We validated these results in an independent experiment in which 60 subjects evaluated the quality of different realizations of a given turn. (Recall that our trainable sentence planner was trained on the scores of only two human judges.) This evaluation revealed that the choices made by SPoT were not statistically distinguishable from the choices ranked at the top by the two human judges. More importantly, they were also statistically indistinguishable from the current hand-crafted, template-based output of the AT&T Communicator system, which has been developed and fine-tuned over an extended period of time (whereas SPoT is based on judgments that took about three person-days to make). SPoT was also rated better than two rule-based versions of our SPG which we developed as baselines. All systems outperformed the random choice. 
We will report on these results in more detail in a future publication.</Paragraph> <Paragraph position="4"> In future work, we intend to build on the work reported in this paper in several ways. First, we believe that we could use additional features as predictors of the quality of a sentence plan, including features based on the discourse context and features that encode relationships between the sp-tree and the DSyntS. We will also expand the capabilities of the SPG to cover sentence planning tasks beyond sentence scoping, and apply the methods described here to retrain SPoT for the extended SPG.</Paragraph> </Section></Paper>
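The two-phase generate-and-rank architecture summarized above can be sketched in a few lines. The sketch below is a minimal illustration under stated assumptions: the operation names, weights, and linear feature scorer are invented for exposition and are not the paper's actual SPG operations or trained SPR; in SPoT the ranker is learned from human judges' ratings rather than hand-set weights.

```python
# Minimal sketch of a generate-and-rank sentence planner in the spirit of
# SPoT: an SPG stand-in that produces candidate plans by weighted
# randomization, and an SPR stand-in that ranks them with a feature scorer.
# All names and weights here are illustrative assumptions.
import random

# Hypothetical combination operations for adjacent propositions.
OPS = ["MERGE", "SOFT-MERGE", "CONJUNCTION", "PERIOD"]
OP_WEIGHTS = [0.4, 0.3, 0.2, 0.1]  # sampling weights (assumed, not from the paper)

def generate_candidates(propositions, n=10, seed=0):
    """SPG stand-in: build n candidate plans, each a weighted-random
    choice of one operation per adjacent pair of propositions."""
    rng = random.Random(seed)
    return [
        [rng.choices(OPS, OP_WEIGHTS)[0] for _ in range(len(propositions) - 1)]
        for _ in range(n)
    ]

def score(plan, feature_weights):
    """SPR stand-in: linear score over operation-count features.
    In SPoT this scoring function is trained from human feedback."""
    return sum(feature_weights.get(op, 0.0) * plan.count(op) for op in set(plan))

def select_best(propositions, feature_weights):
    """Rank all candidates and return the highest-scoring plan."""
    candidates = generate_candidates(propositions)
    return max(candidates, key=lambda p: score(p, feature_weights))

# Example run with hand-set (illustrative) feature weights.
weights = {"MERGE": 1.0, "SOFT-MERGE": 0.5, "CONJUNCTION": 0.2, "PERIOD": -0.1}
best = select_best(["p1", "p2", "p3", "p4"], weights)
print(best)  # highest-scoring candidate plan, one operation per adjacent pair
```

The design point the sketch preserves is the division of labor: the generator stays simple and over-generates, and all linguistic judgment is pushed into the ranker, which is the only component that needs training data.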