<?xml version="1.0" standalone="yes"?>
<Paper uid="X96-1004">
  <Title>Some Technology Transfer: Observations from the TIPSTER Text Program</Title>
  <Section position="3" start_page="26" end_page="29" type="intro">
    <SectionTitle>
TIPSTER Program Executive Committee.
</SectionTitle>
    <Paragraph position="0"> The third step is to integrate the technology into an experimental environment so that complete system flows can be demonstrated and important processing threads can be carried out from start to finish. This step tests the effectiveness of step two and prepares the way for step four. For TIPSTER, demonstration projects undertook this step individually in Phase II. For Phase III, the Architecture and Capabilities Platform will provide an environment to work on these issues. The Free Text Management (FTM) Project [3], for the National Drug Intelligence Center (NDIC), at the Federal Intelligent Document Understanding Laboratory (FIDUL) is an example of a Phase II project devoted entirely to the experimental integration of TIPSTER technologies into a near-user environment for the purpose of engineering and trade-off studies. TIPSTER pilots, such as CANIS [4], to some extent serve the same purpose, although a narrower range of issues can be explored under CANIS than in FTM.</Paragraph>
    <Paragraph position="1"> In step four, the deployment in the experimental environment is used to develop a plausible concept of operations in terms of work flow; a plausible user interface is developed to go with it. CANIS and FTM are both examples of this step. They employ early stage user interfaces for the purpose of investigating the way users would actually like to use the technology. Information from the test and use of these interfaces will be fed into specific requirements for further interface and work flow development if either system is actually going to be moved into full-time operational use. For this reason, at this stage, the interface should be easy to develop, easy to change, and similar in some respects to what users already know. A number of demonstration projects use Web browsers for this purpose. In step four, potential users of the technology are exposed to some of its possible uses, as much as possible seeing their own data flow through the system. Their reactions and suggestions are collected as important sources of ideas about how to actually configure the technology against their task. At this point, it is particularly important for a developer, or someone who communicates well with developers, to become intimately familiar with the tasks of a number of potential user groups. This person can then experiment in the use of the technology to do these tasks. It is important to have input from users, who may also experiment with the use of the technology if they are inclined to, but it is also vitally important to have someone familiar with the technology and its potential look at new ways of tackling the users' issues.</Paragraph>
    <Paragraph position="2"> Step five is to respond to a real application opportunity. If the preliminary work has been well done (i.e., steps 1-4) the development of an actual application should be relatively rapid, given a reasonably stable environment to integrate into.</Paragraph>
    <Paragraph position="3"> Rapidity of insertion of the technology at this point is an important objective. Fundamentally, this is because the more quickly a project can successfully be put into place, once users have signed onto it, the more likely it is that the project can avoid a number of serious pitfalls: changed target environment, changed user task, user reorganization, changed user priorities or goals, radical improvements in the base technology which have not been integrated into the design, and perception by the user of excessive amounts of time being required for the project. However, it cannot be stated strongly enough that appropriate preparation (via steps 1-4) must have occurred in order for rapid insertion to take place. Attempts at rapid insertion without the needed preparation lead to many problems, such as applications that are not robust, poorly planned user work flows, poor estimates of integration times, and badly prioritized or developed requirements. These in turn can lead to systems that are not used, to dissatisfied customers, and to heavy reworkings of the system under the guise of O&amp;M (operations and maintenance).</Paragraph>
    <Paragraph position="4"> In step five, the use of a standard form of project management cycle becomes appropriate. Requirements for the particular application must be understood, defined, and baselined, accounting for: the user's current work flow and possible changes to it; the target software/hardware environment; the data input, output, and storage formats; maintenance and support; documentation and training; and testing and evaluation plans. Rapid prototyping of interfaces, including the GUI, has been successfully used, in ADEPT [5] for example, to help define the requirements and to flush out technical problems. Prior to installation and use, testing and evaluation in an exact replica of the user's environment is usually necessary. Training for end users, managers, and support personnel should accompany installation. Further evaluation after installation can determine whether the system has in fact produced the expected improvements.</Paragraph>
    <Paragraph position="5"> Step six is to provide initial system support, rapid responses to problem reports, and as much hands-on user support as possible. No TIPSTER system with which I am familiar has actually reached this stage as of the writing of this article. However, some general comments are possible from my observations of other systems developments. Customer service, as many commercial organizations are aware, makes the difference between success and failure in many instances. The way in which a system is presented during training and supported will make a big difference in how well it is liked. At the same time, to avoid runaway cost inflation on a project, there must be clear delineation between fixes and changes at this stage. Despite all the careful testing, it is inevitable that there will be some bugs in the software and these must be fixed immediately. However, users can take this stage as an opportunity to change the actual functionality of the software, in effect to change the requirements. While perhaps some of these changes can be responded to, if the budget allows, they are best tracked separately from fixes to the software.</Paragraph>
    <Paragraph position="6"> A clear understanding between the developer and the user organization concerning how much effort is being put into fixes and how much into actual modifications to respond to a change in the requirements is very helpful. Alternatively, there can be an agreed-upon period in which changes are made up to a certain level of effort or cost. Finally, within six months to a year from the integration of the technology, the maintenance and support of the system should be transitioned to the user's organization.</Paragraph>
    <Paragraph position="7"> 5. COTS, GOTS and the alternative One of the issues the TIPSTER Program has had to struggle with is the form in which its technology can best be provided to the Government user. There is considerable hope among some in Government that many text handling needs can be met with commercial products, and that the cost of using these technologies can be controlled in this manner. If this is to be the case, then TIPSTER technologies would have to become part of standard commercial offerings. The TIPSTER Program has made a number of different efforts aimed at facilitating the use of its advances by commercial entities. All TIPSTER materials, including research papers, are published, and researchers are encouraged to present their results at other open forums. Some commercial participants have found their way to MUC and TREC, where their research can be benchmarked and shared with others. TIPSTER participants have been encouraged to commercialize their ideas, when feasible. A number of commercial spin-offs of TIPSTER technology are under way. The TIPSTER Program is keeping abreast of developments in standards such as Z39.50 and the Document</Paragraph>
    <Section position="1" start_page="28" end_page="29" type="sub_section">
      <SectionTitle>
Management Alliance.
</SectionTitle>
      <Paragraph position="0"> However, these efforts alone will not result in the rapid development of inexpensive, robust, and well-supported commercial products which also meet the Government analyst's requirements for information handling of text. Developing a product for the commercial market itself takes time, so that advances promoted by TIPSTER, left on their own, might take longer than we wish to reach the commercial market. Commercial uses of the technology will not necessarily include all the functionality that Government analysts require. In order to get those more advanced versions of the technology for its use, the Government has to help push the final steps of technology transfer of those advanced features. This push can be accomplished during the development of a specific application, but these intermediate steps (generally steps 2-4) must be incorporated into the planning and budgeting for such an application.</Paragraph>
      <Paragraph position="1"> The bottom line is that Government analysts, particularly Intelligence analysts, have text handling and information handling needs that are more difficult to satisfy than those of the general software buying and using commercial world. While the advanced software features they need may eventually become commonplace in commercial software, this is likely to take a long time to happen, since the general demand for such features appears to be moderate, at best. This is easy enough to understand if one stops to think about the kinds of applications that most potential users, of Detection for example, have. The general user (students, small businesses, people at home) certainly has very little interest in Recall or even in Precision. Such users' most pressing concern will be speed and getting one or two good answers to their questions at the top of their return document list. General business users, doing market research perhaps, or tracking competitors' activities, would likely be most interested in Precision at the top of their return document list; they are not tracking events at a level that requires a total and detailed picture of everything that has been said related to a particular topic over a long period of time. Additionally, the cost of missing something is measured in dollars, not the loss of life or the security of the country as it may be for an Intelligence Analyst. Even the applications, such as Insurance and Law, which have many similarities to the Intelligence application, rarely have the same far-reaching cost associated with failure as the Intelligence application. In addition, while these two applications require better Recall than other commercial ones, they probably do not place the same stresses on a Detection tool because the user is searching in the context of a narrower range of document types and a narrower range of types of questions which they need to answer.</Paragraph>
      <Paragraph position="2"> The need for Information Extraction technology appears even less pressing outside the Government.</Paragraph>
      <Paragraph position="3"> Besides the potential this technology may have to improve Detection capabilities, its major application to date is the filling of data bases from unformatted text to support further analytical tasks. While some non-Government applications, such as the development of formatted patient records from unformatted physician reports, have been investigated, there appears to be relatively low demand for this type of technology in the commercial world at this time.</Paragraph>
      <Paragraph position="4"> Given this state of affairs, COTS products do not yet seem to be an assured answer to the need for a well-supported suite of tools employing TIPSTER technology. Government off-the-shelf software, GOTS, may perhaps provide a better answer.</Paragraph>
      <Paragraph position="5"> Government owned software covers a broad spectrum of readiness and robustness. It offers no easy solution, because to provide software in a truly off-the-shelf condition to Government users would in fact mean setting up a small business-like unit to do software testing and upgrades, promotion of products, distribution, integration support, and maintenance, all of which cost money over and above the initial investment in the software development. Thus, even though the Government owned software can be shared reasonably freely among agencies, costs related to the distribution would need to be borne by someone, presumably the agency requiring the software. So the distribution of TIPSTER GOTS would require the establishment of a small business center to test and maintain the software. Before such a business could be established, a market survey would be required to determine if such a center would pay for itself, in addition to providing, at a comparable or lower price, better service than the already existing less formally organized system of distribution.</Paragraph>
      <Paragraph position="6"> The system of distribution that exists now for TIPSTER GOTS is informal and low cost. We have a clearinghouse for information, in the form of the TIPSTER Executive Committee and other Government participants. The TIPSTER program freely advertises its involvement in Document Detection and Information Extraction through many informal contacts and a number of formal reviews and publications. Those desiring information about this type of software contact the committee or other participants and can be directed to contractors who have the kind of software required. A Systems Engineering contractor also keeps on file records of the design of all TIPSTER systems so that Government users can get detailed information about the configuration of any of the software which they may be interested in procuring. The cost for integrating, modifying, and maintaining the software is borne by the using organization in fees to the vendor. So, TIPSTER GOTS is not free, but it is cheaper than developing the capability repeatedly in different locations.</Paragraph>
      <Paragraph position="7"> The TIPSTER software Architecture was developed to permit such sharing of software developed at different agencies. However, it is also being promoted to address a number of other problems associated with the delivery of software to the end user. The existence of an established set of interfaces for this group of technologies allows applications to be designed more quickly and with some accumulation of knowledge, across vendors, of the best ways to use them. It will allow the Government user to upgrade applications by the insertion of key new capabilities (for example, the replacement of one entity tagging module with a new one) without a complete change of application and without necessarily having to stay with the same vendor. The divorcing of the TIPSTER technologies from the human interface will make it easier to insert them seamlessly into a variety of user desktop environments. While all these improvements in the way the software is delivered and maintained will probably result in some cost savings, they should also make it easier and less disruptive for the user to upgrade systems, which is just as important.</Paragraph>
      <Paragraph position="8"> The TIPSTER community began the development of the Architecture because of the need to accelerate the deployment of the technology it had developed. As with the technology itself, there did not appear to be sufficient interest in the commercial world to produce a set of agreed upon standards for integrating Document Detection and Information Extraction as quickly as the TIPSTER Program needed them. The Program, therefore, had to initiate this development itself. The Architecture has been developed to be as useful as possible to a variety of systems, since the TIPSTER community incorporates a widening sphere of vendors and researchers and requires a software Architecture at its core which allows all of them to work together without too great a cost in adaptation of individual systems.</Paragraph>
    </Section>
  </Section>
</Paper>