<?xml version="1.0" standalone="yes"?>
<Paper uid="W98-1433">
  <Title>GoalGetter: Generation of Spoken Soccer Reports</Title>
  <Section position="1" start_page="0" end_page="292" type="metho">
    <SectionTitle>
SYSTEM DEMONSTRATION
GOALGETTER: GENERATION OF SPOKEN SOCCER REPORTS
Mariët Theune and Esther Klabbers
IPO, Center for Research on User-System Interaction,
</SectionTitle>
    <Paragraph position="0"> Abstract - In this paper we describe a demonstration of the GoalGetter system, which generates spoken soccer reports (in Dutch) on the basis of tabular data. Two types of speech output are available. The demo runs via the web. It includes the possibility of 'creating your own match' and having GoalGetter generate a report on this match.
1. About the system
The GoalGetter system is a Data-to-Speech system which generates spoken soccer reports (in Dutch) on the basis of tabular data. The system takes as input data about a soccer match that are derived from a Teletext page (see footnote 1). The output of the system is a spoken, natural language report conveying the main events of the match described on the Teletext page. GoalGetter was developed on the basis of D2S, a generic system for the creation of Data-to-Speech systems. The general architecture of D2S is represented in Figure 1. It consists of two modules, the Language Generation Module (LGM) and the Speech Generation Module (SGM). The LGM takes data as input and produces enriched text, i.e., text which is annotated with prosodic markers indicating pitch accents and intonational boundaries. This text is sent as input to the SGM, which turns it into a speech signal.
[Figure 1: Global architecture of D2S. Components shown: Data, Language Generation Module, Enriched Text, Speech Generation Module, Speech Signal.]
Language generation in the LGM is done using syntactic templates, which are syntactic tree structures containing slots for variable expressions. The selection and ordering of the templates and the filling of their slots depend mainly on conditions on the discourse context, which is represented in a Discourse Model. In order to achieve variation in the generated texts, the system generates a number of different expressions for each piece of information. If various expressions are equally suitable given the current context, one is chosen at random.
For a detailed description of the syntactic template technique we refer to [Van Deemter &amp; Odijk 97] and [Theune et al. 97].</Paragraph>
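The selection step described above can be illustrated with a minimal Python sketch. It is not the D2S implementation: real syntactic templates are tree structures, whereas the hypothetical Template class below is a flat string with named slots, and the discourse model is reduced to a set of players who have already scored. All names (Template, TEMPLATES, generate) and the English wording are assumptions made for illustration.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, Set


@dataclass
class Template:
    """Flat stand-in for a syntactic template: a pattern with named slots."""
    pattern: str
    # Condition on the data and the discourse model; True means "suitable".
    condition: Callable[[Dict[str, object], Set[str]], bool]


def always(data, scorers):
    return True


def scorer_is_new(data, scorers):
    return data["player"] not in scorers


def scorer_is_given(data, scorers):
    return data["player"] in scorers


# Hypothetical English templates; the real system generates Dutch.
TEMPLATES = [
    Template("{player} scored in minute {minute}", always),
    Template("the {team} player {player} scored in minute {minute}", scorer_is_new),
    Template("{player} scored again in minute {minute}", scorer_is_given),
]


def generate(templates, data, scorers, rng):
    """Fill one template, chosen at random among those suitable in context."""
    suitable = [t for t in templates if t.condition(data, scorers)]
    if not suitable:
        raise ValueError("no template is suitable in this context")
    sentence = rng.choice(suitable).pattern.format(**data)
    scorers.add(str(data["player"]))  # update the (toy) discourse model
    return sentence
```

Calling generate twice for the same player with a shared scorers set can yield a different wording the second time, which mirrors the variation between reports mentioned in the text.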
    <Paragraph position="1"> In the Prosody module, accentuation and phrasing are determined using information about the syntactic structure of a sentence and about its context. Phrases expressing 'new' or 'contrastive' information are accented, while those expressing 'given' information are deaccented.
Footnote 1: Teletext is a system with which information is broadcast along with the television signal and decoded in the receiver. The information is distributed over various 'pages', most of which contain textual information, but some of which contain tables.</Paragraph>
    <Paragraph position="3"> Speech generation in the SGM is done by means of either speech synthesis or an advanced form of phrase concatenation, which uses different prosodic versions of otherwise identical phrases. Which version is chosen depends on the prosodic markings provided by the LGM. For Dutch speech synthesis, we use the phonetics-to-speech system SPENGI (SPeech synthesis ENGIne) developed at IPO, which employs PSOLA-based diphones.</Paragraph>
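The accentuation rule stated above (accent 'new' or 'contrastive' information, deaccent 'given' information) can be sketched as follows. This is a simplified illustration only: it ignores the syntactic structure that the Prosody module also uses, it works on whole phrases, and the accent marker, the function name mark_accents and the tuple layout are assumptions rather than the system's actual format.

```python
ACCENT = '"'  # assumed marker placed in front of an accented phrase


def mark_accents(phrases, given):
    """Return enriched text for a list of (text, referent, contrastive) tuples.

    A phrase is accented when its referent has not been mentioned before
    (new information) or when it is flagged as contrastive; otherwise it
    is left unaccented (given information).
    """
    seen = set(given)  # copy, so the caller's set is not modified
    enriched = []
    for text, referent, contrastive in phrases:
        is_new = referent not in seen
        if is_new or contrastive:
            enriched.append(ACCENT + text)
        else:
            enriched.append(text)
        seen.add(referent)  # the referent counts as given from now on
    return " ".join(enriched)
```

For example, with 'ajax' already given, the phrases ("Ajax", "ajax", False) and ("won", "win", False) come out as Ajax "won, because only the second phrase carries new information.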
  </Section>
  <Section position="2" start_page="292" end_page="294" type="metho">
    <SectionTitle>
2. Outline of the demonstration
</SectionTitle>
    <Paragraph position="0"> The demonstration of the GoalGetter system runs via the web and consists of three parts. First, we present an example report that has been generated off-line by GoalGetter. This allows us to prepare an (annotated) English translation of the Dutch text in advance. (In the other parts of the demonstration, preparing a translation is not possible, since reports are generated on-line and the variation mentioned above makes it impossible to predict their exact contents. In those cases we provide a rough translation on the spot.) Second, we generate a report on-line, using an existing input table. Finally, we fill an input table with data selected by those attending the demonstration, and generate a report expressing these data. All modules depicted in Figure 1 are included in the demonstration. For each 'generation round', first a plain text version of the generated report is shown, and then the enriched text with the prosodic markers. Finally, the system's</Paragraph>
    <Paragraph position="2"> [Figure 2: Screenshot showing Teletext page and generated report. The text shown under 'Output from language generation' reads:
Vitesse ging op bezoek bij Ajax en won met een - twee
vijfentwintig duizend toeschouwers kwamen naar De Meer
de Vitesse speler Gorter benutte in de twaalfde minuut een penalty
in de drieentwintigste minuut bracht de Ajax speler Litmanen de teams op gelijke hoogte
de verdediger Atteveld maakte na vijfenveertig minuten het winnende doelpunt voor Vitesse en bepaalde daarmee de eindstand op een - twee
Grim van Ajax ontving een gele kaart van scheidsrechter Temmink
er vielen geen rode kaarten]
Example report - In order to give a first impression of the GoalGetter system, we start the demonstration by showing an example 'Teletext page' and a report that was generated on the basis of this page, in both plain and enriched text format. A written English translation will be provided. We play a spoken version of the report using both speech synthesis and phrase concatenation.
Footnote 2: Because a few phrases are missing from the phrase database, phrase concatenation is only available in the first part of the demonstration. The output from diphone synthesis is available in all parts.</Paragraph>
    <Paragraph position="3"> The screenshot in Figure 2 shows a Teletext page and the plain text version of a generated report. Since the screenshot was taken from the English language version of the web demonstration, to the right of the Teletext page we can see a translation table of the words and abbreviations occurring on the page. Below this table, there is a 'Create report' button, which can be used to generate a new report on the basis of the same Teletext page.</Paragraph>
    <Paragraph position="4"> Below the report, there are three more buttons. Clicking the first replaces the plain text version of the report with the enriched text version, shown in Figure 3. The other two buttons can be clicked to play the audio files that were created when the report was generated, using phrase concatenation and diphone synthesis, respectively. The sound files can be in either AIFF or WAV format.</Paragraph>
    <Paragraph position="5"> [Figure 3: Screenshot of the enriched text version of the report from Figure 2, annotated with prosodic markers.]
Generate a report on-line - The next step is to generate a report on-line. We use the same Teletext page as before, which gives us the opportunity to point out the variation in the generated reports. Although the written report (including prosodic markers) is generated almost instantaneously, the fact that the sound files have to be created at the same time causes a small delay of about five seconds. When the report has been generated and one of the speech buttons is clicked, the corresponding sound file has already been created and only has to be read.</Paragraph>
    <Paragraph position="6"> Creating a match - Finally, the web demonstration allows one to define the input for GoalGetter by filling an empty input table according to one's own preferences. We use this feature to 'create a match' using data selected by those present at the demonstration. On the basis of these data, a 'Teletext page' is created and reports can be generated in the manner explained above.</Paragraph>
    <Paragraph position="7"> In Figure 4 we can see how an input table is filled. First, general information about the match is specified, such as the teams involved, the referee and the number of spectators. The teams and the referee are chosen from a predefined list, via a pull-down menu. After the general information has been determined, the main events of the match can be specified in the 'Goals and cards' section. This is done by selecting a player from one of the teams, selecting the event this player was involved in (goal, own goal, penalty, yellow or red card), specifying the minute in which the event occurred (for cards this is optional), and then clicking the 'Add' button to add the event to the input table. This action can be repeated until an interesting match has been created. Clicking the 'Create the Teletext page' button will then produce a page similar to the one shown in Figure 2, and a report can be generated.</Paragraph>
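The input table described above could be modelled roughly as in the Python sketch below. The class and field names (Event, MatchTable, home_team, visiting_team and so on) are hypothetical and do not come from the GoalGetter code. The sketch only encodes what the text states: a match has general information plus a list of events, each event has a player, an event type and a minute, and the minute is optional for cards only.

```python
from dataclasses import dataclass, field
from typing import List, Optional

EVENT_TYPES = ("goal", "own goal", "penalty", "yellow card", "red card")
CARD_TYPES = ("yellow card", "red card")


@dataclass
class Event:
    """One row of the 'Goals and cards' section."""
    player: str
    team: str
    kind: str
    minute: Optional[int] = None  # may be omitted for cards only

    def __post_init__(self):
        if self.kind not in EVENT_TYPES:
            raise ValueError("unknown event type: " + self.kind)
        if self.minute is None and self.kind not in CARD_TYPES:
            raise ValueError("a minute is required for goals and penalties")


@dataclass
class MatchTable:
    """General information about the match plus its main events."""
    home_team: str
    visiting_team: str
    referee: str
    spectators: int
    events: List[Event] = field(default_factory=list)

    def add(self, event):
        """Counterpart of the 'Add' button: append one event to the table."""
        if event.team not in (self.home_team, self.visiting_team):
            raise ValueError("player's team is not taking part in this match")
        self.events.append(event)
```

A table for the example match could be built with MatchTable("Ajax", "Vitesse", "Temmink", 25000), followed by calls such as add(Event("Gorter", "Vitesse", "penalty", 12)) and add(Event("Grim", "Ajax", "yellow card")).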
  </Section>
</Paper>