<?xml version="1.0" standalone="yes"?>
<Paper uid="H94-1032">
  <Title>PRINCIPLES OF TEMPLATE DESIGN</Title>
  <Section position="4" start_page="0" end_page="177" type="metho">
    <SectionTitle>
2. Basic Ontology
</SectionTitle>
    <Paragraph position="0"> In constructing a representation for a domain or task, the first questions to ask are:  1. What are the basic entities? What properties of these objects and what relations among them are we interested in? 2. What kinds of changes in such properties and relations are we interested in? Answers to any one of these questions depend on answers to the others. Answers to the first provide the basic ontology of the representation.</Paragraph>
    <Paragraph position="1"> Basic Entities The basic entities should be things that endure throughout the temporal focus of the task. 1 They enter into the relations and are characterized by the properties of primary interest and are the participants in events that may change those properties and relations. In the joint ventures domain, companies are the primary candidates for basic entities. In the long run, they get formed, split, merge, and go out of business, but for many analytical purposes, and in particular for the purposes implicit in the MUC-5 task, we can think of them as permanent. It is companies that enter into joint venture relationships and through such relationships bring about the one crucial exception to the rough-and-ready rule just mentioned: the creation of new, joint venture companies. In the same domain, facilities and people are also good candidates for basic entities.</Paragraph>
    <Paragraph position="2"> The basic entities may be represented by structured objects with a number of slots, as follows:  or by an atomic element such as an identifier, a set fill, a number, or a string:</Paragraph>
    <Paragraph position="4"> COMPANY: "General Motors" The difference in outcome between these two cases is that in the former you have to look elsewhere for the information about the entity, whereas in the latter you don't. In general, it's better not to have to, so unless there is a good deal of information that needs to be recorded about the type of entity in question, it is better to use an atomic element to represent such entities. Again, within the joint venture domain, companies are good candidates for representation as structured objects, since we need to know their aliases, location, nationality, officers, etc. On the other hand, within that same domain, it may be that the only information we need to record about a person, aside from his relation to a company, is his name, so in that case it is better to represent the person (atomically) by his name.</Paragraph>
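    <Paragraph> The contrast between the two representations can be sketched in Python as follows. This is a minimal illustration and not part of the Tipster templates: the class name Company, its field names, and the officer name are assumptions introduced here. The company is a structured object because several facts must be recorded about it, whereas each officer is represented atomically by a name string.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Company:
    """Structured object: several facts must be recorded about a company."""
    name: str
    aliases: List[str] = field(default_factory=list)
    location: str = ""
    nationality: str = ""
    # People are represented atomically, by name only, because in this
    # sketch nothing else about a person needs to be recorded.
    officers: List[str] = field(default_factory=list)


gm = Company(
    name="General Motors",
    aliases=["GM"],
    officers=["J. Smith"],  # hypothetical name, for illustration only
)

# Atomic alternative: the company is just a string fill in some slot,
# so the reader does not have to follow a pointer to another object.
atomic_fill = {"COMPANY": "General Motors"}
```

With the structured form, a slot that mentions the company would hold a pointer to gm; with the atomic form, the slot holds the string itself.</Paragraph>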
    <Paragraph position="5"> 1 For more on this, see the next section.</Paragraph>
    <Paragraph position="6"> Natural Kinds It is better if the types of basic entities, especially those represented by structured objects, are 'natural kinds', that is, if they correspond to fairly natural, intuitive ways of classifying and characterizing the domain of interest. For example, companies, people, and facilities are natural kinds in this sense. Ordered pairs of Industry Types and Product/Services are not. Rather than have basic entities of unnatural kinds, one may opt for more, or more complex, slot fills in objects of more natural varieties. Still, it should be remarked that one's commonsense demarcation of a domain into basic entities is always subject to revision by the particular analytical demands of the task at hand. Thus, in the case of WBMH, while units (e.g., divisions, battalions, etc.) are a perfectly natural kind of entity, deployments, that is, relatively short-lived activities involving elements from units, may be less natural but they are at least equally central.</Paragraph>
    <Paragraph position="7"> Associating Properties with the Right Objects It is important to determine whether the property encoded in the slot of an object is really a property of that object, rather than of some other related object. For example, in the Tipster templates, Total Capitalization was viewed as a property of the Contribution object, whereas it is really a property of the Tie-Up Relationship, and thus should be associated with that object. This misplacement of properties seems especially likely when the entities in question are types of relationships or activities, as they are in this case. We return to the issue of representing relations below.</Paragraph>
  </Section>
  <Section position="5" start_page="177" end_page="178" type="metho">
    <SectionTitle>
3. Temporal Granularity
</SectionTitle>
    <Paragraph position="0"> We have noted that the issue of what kinds of changes are of interest relative to a given task is centrally important to the design of templates for the task. The resolution of this issue is a crucial determinant, in particular, of what we call the temporal granularity of the representation. Certain properties of and relations among entities are relatively permanent; others are relatively short-lived. But what counts as permanent and what as short-lived is itself dependent on our interests and purposes, both theoretical and practical. An analysis of the kinds of changes that are of interest should determine, even if only roughly, a temporal interval or length of time as its focus or window. See Fig. 1. Note that there is a mutual dependence here: Properties and relations that are apt to change within that time interval are temporary; those that are likely to hold throughout the designated interval are, with respect to this task, permanent. Thus, the fixing of a temporal granularity allows the resolution of many problems in template design by defining limits on what we have to specify.</Paragraph>
    <Paragraph position="1"> For example, in the joint ventures domain, we are interested in the formation (or dissolution) of tie-up relations among companies. Thus such relations are temporary, whereas subsidiary relations are permanent. If we were interested in buy-outs, subsidiary relations would be viewed as temporary, changes in such relationships being an important focus for the task. In the domain of troop movements or deployments, locations and associated equipment are temporary, whereas a unit's place in the command hierarchy is permanent, even though on the scale of decades (or even much less), that might change.</Paragraph>
    <Paragraph position="2">  Note that temporal granularity is task-relative rather than message-relative. The messages may have been written from very different temporal perspectives, with very different interests and purposes. We need to extract the information from them in a form that is appropriate for the task at hand.</Paragraph>
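    <Paragraph> The task-relative character of temporal granularity can be sketched as a lookup table in Python. The table layout, the task and property labels, and the function name is_temporary are assumptions introduced here for illustration; the classifications themselves are the ones given in the examples above.

```python
# Status of a property or relation, relative to the temporal focus of a task.
# The same relation may be permanent for one task and temporary for another.
TEMPORAL_STATUS = {
    "joint-ventures": {
        "tie-up": "temporary",
        "subsidiary": "permanent",
    },
    "buy-outs": {
        "subsidiary": "temporary",
    },
    "troop-movements": {
        "location": "temporary",
        "equipment": "temporary",
        "place-in-command-hierarchy": "permanent",
    },
}


def is_temporary(task, item):
    """True if the item is apt to change within the task's time window."""
    return TEMPORAL_STATUS[task][item] == "temporary"


same_relation_two_tasks = (
    is_temporary("joint-ventures", "subsidiary"),
    is_temporary("buy-outs", "subsidiary"),
)
```

The status is looked up by task, never by message, which reflects the point that granularity is fixed by the task at hand.</Paragraph>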
  </Section>
  <Section position="6" start_page="178" end_page="178" type="metho">
    <SectionTitle>
4. Representing Relations
</SectionTitle>
    <Paragraph position="0"> A relation can be represented in one of two ways: (a) as a separate object in its own right, or (b) as a property of one of its arguments. See Fig. 2. For example, the subsidiary relation could be represented by its own Entity Relationship object, or it could be represented by a Parent Company slot in the Entity object.</Paragraph>
    <Paragraph position="1"> The following criteria seem useful in deciding which of these options to adopt:  1. If the relation is of primary interest in the task, option (a) may be the best choice.</Paragraph>
    <Paragraph position="2"> 2. If a lot of other information needs to be recorded about that relation, option (a) is a good choice; if only the two arguments need to be recorded, option (b) is probably better.</Paragraph>
    <Paragraph position="3"> 3. If the relation is permanent relative to the temporal granularity of the information task, then option (b) is a good choice.</Paragraph>
    <Paragraph position="4"> 4. If some other relation, Relation2, depends on Relation1, in the sense that the former cannot exist without the latter existing, then Relation2 is a good candidate for being represented via option (b).</Paragraph>
    <Paragraph position="5"> With respect to the second criterion, if in addition to the two arguments, we want to specify the time, the location,  and various other aspects of the relation, then option (a) is indicated. With respect to the third criterion, if the relation is at least as permanent as the entities, then option (b) is a good choice. These two criteria overlap to some extent. If the relation is permanent, there is likely no need to record its time.</Paragraph>
    <Paragraph position="6"> In the specific case of the Subsidiary relation in Tipster, it is not the relation of primary interest (Tie-Ups are), there are no other properties that need to be specified for the relation other than the parent and child companies, and the relation is permanent with respect to the temporal focus of the task. Therefore, option (b) seems appropriate.</Paragraph>
    <Paragraph position="7"> The Tipster template presents an apposite example of criterion 4 as well. A Contribution, as conceptualized in the template, is a relationship, just as a Tie-Up-Relationship is, so it certainly could qualify for object status. However, it is dependent on a Tie-Up-Relationship; a Contribution relationship among companies can't exist without a Tie-Up-Relationship among them. This indicates option (b) is appropriate.</Paragraph>
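    <Paragraph> The four criteria can be sketched as a small Python helper that records which criteria point toward which option. This is only an illustration: the function name criteria_votes and its parameter names are assumptions introduced here, and the helper deliberately reports the suggestions without ranking them, since the criteria above are heuristics with no stated precedence.

```python
def criteria_votes(primary_interest, much_extra_info, permanent, depends_on_other):
    """Collect, per option, the numbers of the criteria that suggest it.

    Option "a": the relation is a separate object in its own right.
    Option "b": the relation is a slot of one of its arguments.
    """
    votes = {"a": [], "b": []}
    if primary_interest:            # criterion 1
        votes["a"].append(1)
    # Criterion 2: much extra information suggests (a); otherwise (b).
    votes["a" if much_extra_info else "b"].append(2)
    if permanent:                   # criterion 3
        votes["b"].append(3)
    if depends_on_other:            # criterion 4
        votes["b"].append(4)
    return votes


# The Subsidiary relation in Tipster, as characterized in the text:
# not of primary interest, no extra properties, permanent for the task.
subsidiary = criteria_votes(
    primary_interest=False,
    much_extra_info=False,
    permanent=True,
    depends_on_other=False,
)
```

For the Subsidiary relation, every applicable criterion falls under option (b), which agrees with the conclusion reached above.</Paragraph>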
  </Section>
  <Section position="7" start_page="178" end_page="179" type="metho">
    <SectionTitle>
5. Events
</SectionTitle>
    <Paragraph position="0"> We can classify events, and the relations among entities that they involve, in different ways for different purposes. On the basis of an examination of a variety of templates, we hypothesize that there are three central event types. First, there are those that directly relate two or more basic entities, such as a company manufacturing a product or a terrorist organization attacking a target or a vendor supplying a buyer with a part. These very same events, however--especially if, as in the examples just mentioned, they involve purposive agents--can also be classified in terms of their purpose or aim. This type of classification typically involves further reference to an activity or condition, as when a company manufactures a product in order to enter a new market or when two companies form a joint venture for the purpose of carrying out some activity. Third, there is the specially important type of event involving communicative relations among basic entities, together with a content communicated, itself comprising some further activity or event of any of the three types. Thus, a typical event structure might be represented as in Fig. 3.</Paragraph>
    <Paragraph position="1"> Of course, in many cases there would be equations identifying the various entities involved. Thus, GM might announce it is forming a joint venture with Toyota for the manufacture of cars by GM in Japan, where Source-Ent = Ent1 = Ent3 = GM. We also note that a Communication-Event can have a Basic-Event for its third argument.</Paragraph>
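    <Paragraph> The GM example can be sketched in Python as three nested event objects. The figure giving the actual structure is not reproduced here, so the class and field names below are assumptions; in particular, the second argument of the communication event is not specified in the text, and the field recipient is only a placeholder for it.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class BasicEvent:
    """Directly relates basic entities, e.g. a company making a product."""
    relation: str
    ent_a: str
    ent_b: str


@dataclass
class PurposiveEvent:
    """An event classified by its aim, which is a further activity."""
    relation: str
    ent_a: str
    ent_b: str
    purpose: BasicEvent


@dataclass
class CommunicationEvent:
    """A source communicates a content, itself an event of any type."""
    source_ent: str
    recipient: Optional[str]  # placeholder; not specified in the text
    content: Union[BasicEvent, PurposiveEvent, "CommunicationEvent"]


announcement = CommunicationEvent(
    source_ent="GM",
    recipient=None,
    content=PurposiveEvent(
        relation="form-joint-venture",
        ent_a="GM",
        ent_b="Toyota",
        purpose=BasicEvent(relation="manufacture", ent_a="GM", ent_b="cars"),
    ),
)

# The identifications Source-Ent = Ent1 = Ent3 = GM hold in this structure.
same_company = (
    announcement.source_ent
    == announcement.content.ent_a
    == announcement.content.purpose.ent_a
)
```

The nesting here is three levels deep: a communication event, the purposive event it reports, and the basic event that is its aim.</Paragraph>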
    <Paragraph position="2">  In addition to these three event types, there are relations between events that we may need to represent, such as causality or the subevent (part-whole) relation, as in Fig. 4. Thus, a shooting event could cause a dying event, and a troop movement might be part of a larger attack.</Paragraph>
    <Paragraph position="3"> In general, the template structure should be no deeper than this. It is better for the trees to be very broad (i.e., for individual objects to have lots of slots) than to be very deep.</Paragraph>
  </Section>
  <Section position="8" start_page="179" end_page="179" type="metho">
    <SectionTitle>
6. Entity Snapshots
</SectionTitle>
    <Paragraph position="0"> In many applications, there are a large number of temporary or transient properties of entities that are of primary concern. If we design the template around the enduring basic entities themselves, it might seem that these temporary properties should be demoted to mere slots rather than be represented as entities in their own right. These slots, on the other hand, would also have to allow multiple entries and each entry would have to have time stamps. A way to eliminate this complexity is to have as first-class objects, in addition to Entities, Entity Snapshots. An Entity Snapshot is an Entity at a particular point or interval in time. As such, an Entity Snapshot would have a pointer to the Entity that it is a snapshot of. It would also carry all the temporary information about the Entity. The time of the snapshot would also be one of the slots.</Paragraph>
    <Paragraph position="1"> In the WBMH domain, these Entity Snapshots, under the name Entity Information, are primary objects of interest.</Paragraph>
    <Paragraph position="2"> They represent deployments, or &quot;target opportunities&quot;. Such temporary properties of Entities as Equipment, Location, Direction, and so on, are really to be associated with deployments, that is, with Snapshots, rather than with Entities or Units.</Paragraph>
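    <Paragraph> A minimal Python sketch of Entities and Entity Snapshots as separate first-class objects follows. The class and field names, the helper snapshots_of, and all of the data are assumptions introduced here for illustration; they are not taken from the WBMH templates. Each snapshot carries a pointer to its Entity, a time slot, and the temporary properties.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entity:
    """Enduring basic entity: carries only the permanent information."""
    entity_id: str
    name: str


@dataclass
class EntitySnapshot:
    """An Entity at a particular time: carries the temporary information."""
    entity_id: str                      # pointer to the Entity
    time: str                           # ISO date, so strings sort by time
    location: Optional[str] = None
    direction: Optional[str] = None
    equipment: List[str] = field(default_factory=list)


def snapshots_of(entity, snapshots):
    """Return the snapshots that point to one entity, ordered by time."""
    own = [s for s in snapshots if s.entity_id == entity.entity_id]
    return sorted(own, key=lambda s: s.time)


# Hypothetical data, for illustration only.
unit = Entity(entity_id="U1", name="1st Battalion")
observed = [
    EntitySnapshot("U1", "1991-02-02", location="grid B", direction="north"),
    EntitySnapshot("U1", "1991-01-15", location="grid A", equipment=["tanks"]),
    EntitySnapshot("U2", "1991-01-20", location="grid C"),
]
history = snapshots_of(unit, observed)
```

Because each snapshot has a single time slot, no slot needs to hold multiple time-stamped entries.</Paragraph>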
    <Paragraph position="3"> 6.1. Entities from Entity Snapshots Often the first way one might think of an entity is in terms of its structure and properties at a particular moment in time.</Paragraph>
    <Paragraph position="4"> One later realizes that in fact the entity maintains its identity over time as its internal structure changes. In this case we should reconceptualize the entity as being a mapping from instants or temporal intervals into its structure and properties at that time.</Paragraph>
    <Paragraph position="5"> For example, one's first intuition about the nature of a department may be that it is a set of employees. Later one realizes it should have been conceptualized as a mapping from times to sets of employees. In this case, it is a good idea to have both Departments and Department Snapshots, where the set of employees is a property of the Department Snapshot. There are a number of interesting problems of analysis that revolve around the relationship between entities and entity snapshots. Sometimes one is of primary interest, sometimes the other. For example, in Desert Shield, units were of interest; in particular a major focus of concern was the calculation of unit strengths. In Desert Storm, however, deployments were of primary interest, since it was deployments that presented the immediate danger. 2 In general, we want to be able to infer the identity of different deployments across time, to infer their membership in units, to derive some of their properties from default properties of their units, and to determine properties of units, such as unit strength and readiness, from properties of deployments.</Paragraph>
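    <Paragraph> The department example, reconceptualized as a mapping from times to sets of employees, can be sketched in Python as follows. The class layout, the method names, and the data are assumptions introduced here. The sketch further assumes that a recorded set of employees holds until the next recorded time, which the text does not specify.

```python
import bisect
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass
class Department:
    """A department as a mapping from times to sets of employees."""
    name: str
    # Parallel lists: sorted snapshot times and the set recorded at each.
    times: List[str] = field(default_factory=list)
    members: List[FrozenSet[str]] = field(default_factory=list)

    def add_snapshot(self, time, employees):
        """Record the set of employees at a time (ISO date string)."""
        i = bisect.bisect_right(self.times, time)
        self.times.insert(i, time)
        self.members.insert(i, frozenset(employees))

    def employees_at(self, time):
        """Return the most recent recorded set at or before the time."""
        i = bisect.bisect_right(self.times, time)
        if i == 0:
            return frozenset()  # nothing recorded yet at this time
        return self.members[i - 1]


# Hypothetical data, for illustration only.
dept = Department("Sales")
dept.add_snapshot("1994-01-01", {"Ann", "Cy"})
dept.add_snapshot("1993-01-01", {"Ann", "Bob"})
mid_1993 = dept.employees_at("1993-06-01")
```

The department keeps its identity while its membership changes: the same object answers differently for different times.</Paragraph>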
  </Section>
  <Section position="9" start_page="179" end_page="180" type="metho">
    <SectionTitle>
7. Slot Fills
</SectionTitle>
    <Paragraph position="0"> Slot fills should be uncomplicated. They should take one of the following forms:  (a) Atomic elements, such as identifiers, numbers, strings.</Paragraph>
    <Paragraph position="1"> (b) Pointers to structured objects. (c) Tuples whose elements are of types a and b. (d) Sets whose elements are of types a, b, or c.  It is probably confusing to have tuples with more than three elements. Thus, the maximum complexity of a slot fill would be {(A1, B1, C1), (A2, B2, C2), ...}. Many set fills of type (d) whose elements are of type (c) may be thought of as functions. For example, if we had a set of pairs of companies and ownership percentages, we could think of it as representing a function from companies to ownership percentages. However, not all set fills of this type are conveniently thought of as functions. If we have an Officers slot for the Company object, whose filler is a set of tuples of the form (Position, Person), then an entry might be:  But in the presentation of the templates, it is often better from the user's point of view to represent them as tuples, rather than multiplying kinds of objects. This is an instance of the principle that the user shouldn't have to go looking too far afield for information. As you follow a complex path of pointers, it can be easy to forget what the type of an object is and where it fits into the web of relationships you're interested in.</Paragraph>
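    <Paragraph> The four permitted forms of slot fill, together with the suggested limit of three elements per tuple, can be sketched as a Python check. The class Pointer, the function names, and the company names are assumptions introduced here for illustration, and treating the three-element limit as a hard rule is a simplification of the recommendation above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pointer:
    """Form (b): a reference to a structured object by its identifier."""
    target_id: str


def is_simple(x):
    """Form (a), an atomic element, or form (b), a pointer."""
    return isinstance(x, (str, int, float, Pointer))


def is_tuple_fill(x):
    """Form (c): a tuple of one to three simple elements."""
    return (
        isinstance(x, tuple)
        and 3 >= len(x) >= 1
        and all(is_simple(e) for e in x)
    )


def is_valid_fill(x):
    """Accept forms (a) through (d); sets may hold forms (a), (b), or (c)."""
    if is_simple(x) or is_tuple_fill(x):
        return True
    if isinstance(x, (set, frozenset)):
        return all(is_simple(e) or is_tuple_fill(e) for e in x)
    return False


# A set of (company, ownership percentage) pairs, with hypothetical names.
ownership = {("Company A", 60), ("Company B", 40)}
# Read as a function from companies to ownership percentages.
ownership_fn = dict(ownership)
```

Converting the set of pairs to a dictionary works only because each company appears once; set fills that are not functions would stay as sets of tuples.</Paragraph>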
  </Section>
</Paper>