<?xml version="1.0" standalone="yes"?>
<Paper uid="W00-1435">
  <Title>Demonstration of ILEX 3.0</Title>
  <Section position="3" start_page="0" end_page="257" type="metho">
    <SectionTitle>
2 Generating from Bare Data
</SectionTitle>
    <Paragraph position="0"> We start initially with a relational database, as defined by a set of tab-delimited database files, plus some minimal semantics. As discussed in the paper, we assume a relational database to consist of two types of files: 1. Entity Files: each of which provides data for a particular entity type. Each row (or record) defines the attributes of a different entity. See figure 1.</Paragraph>
    <Paragraph position="1"> 2. Link Files: where a particular attribute may have multiple fillers, we use link files to define the entity-entity relations. See figure 2.</Paragraph>
    <Paragraph position="2"> To generate from these files, the domain editor needs to provide three additional resources: 1. Data-type specification: for each entity file, a specification of what data type the values in each column are, e.g., string, entity-id, domain type, etc.</Paragraph>
    <Paragraph position="3"> 2. Domain Taxonomy: detailing the taxonomic organisation of the various classes of the entities.</Paragraph>
    <Paragraph position="4"> 3. Mapping Domain Taxonomy onto Upper Model: ILEX uses an Upper Model (a domain-independent semantic taxonomy, see Bateman (1990)), which supports the grammatical expression of entities, e.g., selection of pronoun, differentiation between mass and count entities, between things and qualities, etc. We require that the basic types in the domain taxonomy are mapped onto the upper model, to allow the entities to be grammaticalised and lexicalised appropriately.</Paragraph>
    <Paragraph position="5"> [Figure 2: A Sample from a Link file (Material: silver, enamel, gold)]</Paragraph>
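    <Paragraph> As an illustration only, the resources listed above might be represented as follows. This is a minimal sketch, not the ILEX file format: the column names, the entity id king01, the type labels and the upper-model label count-thing are all assumptions made for the example.

```python
import csv
import io

# Hypothetical entity file: one tab-delimited row per entity.
ENTITY_FILE = "ID\tClass\tDesigner\tDate\nJ-997\tnecklace\tking01\t1905\n"

# Hypothetical data-type specification: one declared type per column.
COLUMN_TYPES = {
    "ID": "entity-id",
    "Class": "domain-type",
    "Designer": "entity-id",
    "Date": "string",
}

# Hypothetical mapping of a basic domain type onto an upper-model type.
UPPER_MODEL = {"necklace": "count-thing"}


def load_entities(text, column_types):
    """Read tab-delimited records and pair each value with its declared type."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    entities = {}
    for row in reader:
        entities[row["ID"]] = {
            name: (column_types[name], value) for name, value in row.items()
        }
    return entities


entities = load_entities(ENTITY_FILE, COLUMN_TYPES)
```

    Keeping the declared type next to each value lets later stages decide, for instance, whether a filler should be looked up as another entity or printed as a plain string.</Paragraph>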
    <Paragraph position="6"> With just this semantics, we can generate texts, although impoverished texts, such as: The class of J-997 is necklace. Its designer is Jessie M. King. Its date is 1905. Several tricks are needed to generate without a specified domain semantics: * Use of standard clause templates: lacking any knowledge of how different attributes are to be expressed, the system can only generate each attribute using a standard template structure, such as the X of Y is Z or Its X is Z. The attribute names, e.g., Designer, Style, etc., can be assumed to work as the lexical head of the Subject. This ploy sometimes goes wrong, but in general works. (This approach is borrowed from Dale et al. (1998).)</Paragraph>
    <Paragraph position="7">  * Referring to Entities: there are a number of strategies open for referring to entities. If the Name attribute is supplied (a defined attribute within the ILEX system), then the system can use this for referring. Lacking a name, it is possible for the system to form nominal references using the Class attribute of the entity (all entities in ILEX databases are required to have this attribute provided). We could thus generate indefinite references such as a brooch as first mentions, and on subsequent mentions, generate forms such as the brooch or the brooch whose designer is Jessie M. King. Without specification of which entities should be considered part of the general knowledge of the reader, we must assume all entities are initially unknown.</Paragraph>
    <Paragraph position="8"> * Fact Annotations: ILEX was designed to work with various extra information known about facts, such as the assumed level of interest to the current reader model, the importance of the fact to the system's educational agenda, and the assumed assimilation of the information (how well does the system believe the reader to already understand it). See the main paper for more details.</Paragraph>
    <Paragraph position="9"> Lacking this information, the system assumes an average value for interest and importance, and a 0 value for assimilation (totally unknown). With only default values, the system cannot customise the text to the particular user. It may provide information already well known by the user, and thus risk boring them. Also, there can be no selection of information to ensure that the more interesting and important information is provided on earlier pages (the reader may not bother to look at later pages).</Paragraph>
    <Paragraph position="10"> Other information (defeasible rules), which allows us to organise the material into complex rhetorical structure, is also missing.</Paragraph>
    <Paragraph position="11"> So, these tricks allow us to generate simple texts, consisting of a list of template-formatted clauses.</Paragraph>
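    <Paragraph> The template trick can be sketched in a few lines. This is an illustrative sketch rather than the ILEX implementation: the function name describe and the choice to use the first template only for the first attribute are assumptions.

```python
def describe(entity_id, attributes):
    """Render each (attribute name, value) pair with a standard clause template."""
    clauses = []
    for index, (name, value) in enumerate(attributes):
        # The attribute name serves as the lexical head of the Subject.
        head = name.lower()
        if index == 0:
            # First clause: "The X of Y is Z."
            clauses.append(f"The {head} of {entity_id} is {value}.")
        else:
            # Later clauses: "Its X is Z."
            clauses.append(f"Its {head} is {value}.")
    return " ".join(clauses)


text = describe(
    "J-997",
    [("Class", "necklace"), ("Designer", "Jessie M. King"), ("Date", "1905")],
)
```

    Because the clause order simply follows the order of the input list, the result is a flat list of template-formatted clauses with no rhetorical structure.</Paragraph>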
  </Section>
  <Section position="4" start_page="257" end_page="257" type="metho">
    <SectionTitle>
3 Adding Expression information
</SectionTitle>
    <Paragraph position="0"> In the next step, we will add in information about how the various attributes should be expressed. This includes three main resources: 1. Syntactic expression of attributes: for each attribute, we provide a specification of how the attribute should be expressed syntactically. 2. Lexicalisation of domain types: by providing a lexicon, which maps domain types to lexical items, we avoid problems of using the domain type itself as the spelling. The lexical information allows correct generation of inflectional forms (e.g., of the plural for nouns, comparative or superlative forms for adjectives).</Paragraph>
    <Paragraph position="1"> 3. Restrictive modifiers for referring expressions: In choosing restrictive modifiers for forming referring expressions, some facts work better than others. For instance, the brooch designed by King is more likely to refer adequately than the brooch which was 3 inches long. ILEX allows the user to state the preferential order for choosing restrictive modifiers.</Paragraph>
    <Paragraph position="2"> The addition of these resources will result in improved expression within the clauses, but will not affect the text structure itself, which is still a list of clauses in random order.</Paragraph>
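    <Paragraph> One way to picture how a lexicon and a preference order over restrictive modifiers could interact is sketched below. Everything here is hypothetical: the lexicon entries, the preference list, the modifier templates and the entity id J-001 are invented for illustration and are not taken from ILEX.

```python
# Hypothetical lexicon: domain type mapped to (singular, plural) spellings.
LEXICON = {"brooch": ("brooch", "brooches"), "necklace": ("necklace", "necklaces")}

# Hypothetical preference order for restrictive modifiers, most preferred first.
MODIFIER_PREFERENCE = ["Designer", "Material", "Length"]

# Hypothetical syntactic expression of each attribute as a postmodifier.
MODIFIER_TEMPLATES = {
    "Designer": "designed by {}",
    "Material": "made of {}",
    "Length": "which is {} long",
}


def refer(entity, mentioned):
    """Return an indefinite form on first mention, a restricted definite form later."""
    noun = LEXICON[entity["Class"]][0]
    if entity["ID"] not in mentioned:
        mentioned.add(entity["ID"])
        article = "an" if noun[0] in "aeiou" else "a"
        return f"{article} {noun}"
    # Pick the most preferred attribute that this entity actually has.
    for attribute in MODIFIER_PREFERENCE:
        if attribute in entity:
            modifier = MODIFIER_TEMPLATES[attribute].format(entity[attribute])
            return f"the {noun} {modifier}"
    return f"the {noun}"


brooch = {"ID": "J-001", "Class": "brooch", "Designer": "King", "Length": "3 inches"}
seen = set()
first_mention = refer(brooch, seen)
second_mention = refer(brooch, seen)
```

    With this ordering the designer is preferred over the length, mirroring the observation that the brooch designed by King identifies its referent better than a description of its size.</Paragraph>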
  </Section>
  <Section position="5" start_page="257" end_page="257" type="metho">
    <SectionTitle>
4 Adding User Annotations
</SectionTitle>
    <Paragraph position="0"> In the next step, we add in the user model, which provides, for each attribute type, predicted user interest, importance for the system, and expected user assimilation.</Paragraph>
    <Paragraph position="1"> Using these values, ILEX can start to organise the text, placing important/interesting information on earlier pages, and avoiding information already known by the user.</Paragraph>
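    <Paragraph> The effect of the user annotations can be sketched as a filter followed by a sort. This is only a sketch under stated assumptions: the 0 to 1 scale, the default of 0.5 as the average value, the threshold of 0.8 for treating a fact as already known, and the use of a simple sum to combine interest and importance are all invented for the example; the text does not say how ILEX combines these values.

```python
from dataclasses import dataclass


@dataclass
class Fact:
    text: str
    interest: float = 0.5      # assumed average default
    importance: float = 0.5    # assumed average default
    assimilation: float = 0.0  # 0 means totally unknown to the reader


def select_facts(facts, known_threshold=0.8):
    """Drop facts the reader is assumed to know, then put the best ones first."""
    fresh = [fact for fact in facts if known_threshold > fact.assimilation]
    return sorted(
        fresh,
        key=lambda fact: fact.interest + fact.importance,
        reverse=True,
    )


facts = [
    Fact("date", interest=0.2, importance=0.3),
    Fact("designer", interest=0.9, importance=0.8),
    Fact("class", assimilation=1.0),
]
ordered = [fact.text for fact in select_facts(facts)]
```

    When every fact carries only the default values, the sort leaves the input order unchanged, which corresponds to the uncustomised text described for the bare-data case.</Paragraph>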
  </Section>
  <Section position="6" start_page="257" end_page="258" type="metho">
    <SectionTitle>
5 Adding Defeasible Rules, Stories
</SectionTitle>
    <Paragraph position="0"> As a final step, we add in various resources which improve the texture of the text.</Paragraph>
    <Paragraph position="1"> * Defeasible Rules: ILEX allows the assertion of generalisations like most Art Deco jewels use enamel. These rules allow the generation of complex rhetorical structures which include Generalisation, Exemplification and Concession. The use of these relations improves the quality of the text generated.</Paragraph>
    <Paragraph position="2"> * Stories: much of the information obtainable about the domain is in natural language. Often, the information is specific to a particular entity, and as such, it would be a waste of time to reduce the information into ILEX's Pred-Arg knowledge structure, just to regenerate the text. Because of this, ILEX allows the association of canned text with a database entity (e.g., J999), or type of entity (e.g., jewels designed for Liberty). The text can then be included in the text when the entity or type of entity is mentioned. The intermixing of generated and canned text improves the quality of generated texts by providing more variety of structures, and allowing anecdotes, which would be difficult to model in terms of the knowledge representation system.</Paragraph>
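    <Paragraph> The association of canned text with entities and entity types might be pictured as two lookup tables consulted whenever an entity is mentioned. The sketch below is hypothetical: the table names, the function with_stories and the bracketed placeholder strings stand in for real stories and do not reflect how ILEX stores them.

```python
# Hypothetical canned texts keyed by entity id and by entity type.
ENTITY_STORIES = {"J999": "[canned story about J999]"}
TYPE_STORIES = {
    "jewels designed for Liberty": "[canned story about jewels designed for Liberty]",
}


def with_stories(entity_id, entity_types, generated):
    """Append any canned text attached to the entity or to one of its types."""
    parts = [generated]
    if entity_id in ENTITY_STORIES:
        parts.append(ENTITY_STORIES[entity_id])
    for entity_type in entity_types:
        if entity_type in TYPE_STORIES:
            parts.append(TYPE_STORIES[entity_type])
    return " ".join(parts)


page = with_stories(
    "J999",
    ["jewels designed for Liberty"],
    "[generated description]",
)
```

    An entity with no associated story simply yields the generated description on its own, so canned text remains an optional enrichment.</Paragraph>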
  </Section>
</Paper>