<?xml version="1.0" standalone="yes"?>
<Paper uid="A94-1029">
  <Title>Might a semantic lexicon support hypertextual authoring?</Title>
  <Section position="3" start_page="177" end_page="177" type="metho">
    <SectionTitle>
3 Linguistic evidence in HERMES
</SectionTitle>
    <Paragraph position="0"> Figure 2 shows a screen dump of an excerpt of an English hypertext. Anchors (i.e. words with their role labels) are highlighted. Simple browsing</Paragraph>
    <Paragraph position="3"> Fig. 2: HERMES: hypertext over Remote Sensing documents</Paragraph>
  </Section>
  <Section position="4" start_page="177" end_page="178" type="metho">
    <SectionTitle>
4 Legend of contextual role labels in figure 4.
</SectionTitle>
    <Paragraph position="0"> operations (i.e. index, history or backtracking) are available. The current version has also been implemented under the X-Windows environment.</Paragraph>
    <Paragraph position="1"> Parallel navigation sessions on the same document base are allowed.</Paragraph>
    <Paragraph position="2"> To appreciate the advantages of automatic generation and upgrade activities, details on system performance may be of interest. Hypertext compilation, in fact, provides important information about the underlying linguistic processing. The following analysis is based on the hypertext developed for documents (abstracts and D.I.F.) on Remote Sensing (210 documents). In total, 10743 anchors (from 133 different words) and 5908 links have been derived. Fig. 3 shows the time required during the compilation phase. We expected a linear trend in time because the updating activity has complexity O(n), where n is the number of documents already in the document base. Experimental evidence confirms a linear trend (fig. 3).</Paragraph>
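    <Paragraph> As a minimal sketch of how such a linear trend could be checked, the following Python fragment fits an ordinary least-squares line to update times and reports the deviation of each measurement from the fitted line. The function names (fit_line, residuals) and the sample timings are hypothetical and are not taken from HERMES or from fig. 3.</Paragraph>

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must contain at least two distinct values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residuals(xs, ys, intercept, slope):
    """Signed distance of each observation from the fitted line."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


if __name__ == "__main__":
    # Hypothetical measurements, NOT data from the paper:
    # number of documents already in the base, and update time in seconds.
    docs_in_base = [10, 20, 30, 40, 50]
    update_seconds = [1.2, 2.1, 3.3, 3.9, 5.0]
    intercept, slope = fit_line(docs_in_base, update_seconds)
    for n, r in zip(docs_in_base,
                    residuals(docs_in_base, update_seconds, intercept, slope)):
        print(n, round(r, 3))
```

    <Paragraph> Large residuals would correspond to documents whose processing time departs from the fitted line.</Paragraph>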
    <Paragraph position="3">  Note the deviation from the linear regression (dotted line). This is a clear marker of significant differences in document information density. The difference is much more evident when comparing abstracts with DIF documents (the latter are composed of a large number of descriptors and a textual component used as a comment). We defined an information density score D as the ratio: number of anchors / document length (in bytes). DIF documents and abstracts are very different in length: the average length is 1022 bytes for abstracts and 6691 bytes for DIF documents.</Paragraph>
    <Paragraph position="4">  Despite this, we obtained an average value of D of 129 for abstracts and 341 for DIF.</Paragraph>
    <Paragraph position="5"> Experimental evidence confirms that abstracts are much more information-dense than DIF documents.</Paragraph>
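    <Paragraph> The density score can be sketched in a few lines of Python, taking the definition literally as number of anchors divided by document length in bytes. The function names and the sample documents are hypothetical. The text does not state how the reported averages are scaled, so this sketch does not attempt to reproduce them.</Paragraph>

```python
def information_density(anchor_count, length_bytes):
    """D = number of anchors / document length in bytes."""
    if not length_bytes > 0:
        raise ValueError("document length must be positive")
    return anchor_count / length_bytes


def average_density(documents):
    """Mean D over an iterable of (anchor_count, length_bytes) pairs."""
    scores = [information_density(a, n) for a, n in documents]
    if not scores:
        raise ValueError("at least one document is required")
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical documents, NOT data from the paper.
    abstracts = [(12, 1000), (8, 1100)]
    dif_documents = [(30, 6500), (25, 7000)]
    print("abstracts:", average_density(abstracts))
    print("DIF:", average_density(dif_documents))
```

    <Paragraph> Averaging per-document scores, rather than dividing total anchors by total bytes, keeps long documents from dominating the result. Which of the two the original evaluation used is not stated, so this choice is an assumption.</Paragraph>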
    <Paragraph position="6"> Figure 4 shows a plot of links versus anchors, that is, the link density of the anchors. In the prototype referred to here, running on a Remote Sensing domain, more than 300 anchors trigger only 1 link, while fewer than 100 anchors trigger 6 links. Note that more than 600 anchors have no links leaving from them; they represent concepts that are peculiar to a single document and irrelevant in the domain sublanguage (we call them inactive anchors; the later insertion of a document may activate them). A monotonically decreasing trend in the number of outgoing links is evident, with isolated peaks at higher link counts. Those peaks are related to highly domain-dependent anchors, for example the pairs &lt;method-Focus&gt;, &lt;information-Kind of processed Information&gt;, &lt;information-Focus&gt;, &lt;image-Focus&gt;, &lt;result-Focus&gt;, &lt;image-Kind of processed information&gt;. These anchors are present in most of the documents. They therefore represent common topics in the domain and may be considered extremely important concepts in the sublanguage.</Paragraph>
    <Paragraph position="7"> As descriptors of the underlying knowledge domain, the more frequently activated anchors are very informative. The plot also provides evidence of the computational validity of our approach. In fact, the quickly decreasing curve shows that a very large number of documents generates only a few links, thus avoiding an exponential explosion of physical connections. We stress this result because a completely automated production of anchors (as provided by our system) could have generated an unforeseen number of links.</Paragraph>
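    <Paragraph> The link-density distribution discussed above can be sketched as a simple histogram: for each number of outgoing links, count the anchors that have exactly that many. In this illustrative Python fragment an anchor is modelled as a (word, role) pair. The function names, the anchors "sensor" and "emissivity", and all link counts are hypothetical. The internal data structures of HERMES are not described in this section.</Paragraph>

```python
from collections import Counter


def link_density(links_per_anchor):
    """Map each outgoing-link count to the number of anchors having it."""
    return Counter(links_per_anchor.values())


def inactive_anchors(links_per_anchor):
    """Anchors with no outgoing links (the 'inactive anchors' of the text)."""
    return sorted(a for a, n in links_per_anchor.items() if n == 0)


if __name__ == "__main__":
    # Hypothetical link counts, NOT data from the paper.
    links_per_anchor = {
        ("method", "Focus"): 6,
        ("image", "Focus"): 6,
        ("sensor", "Focus"): 1,
        ("emissivity", "Focus"): 0,
    }
    histogram = link_density(links_per_anchor)
    for link_count in sorted(histogram):
        print(link_count, histogram[link_count])
    print("inactive:", inactive_anchors(links_per_anchor))
```

    <Paragraph> Because inactive anchors are kept in the mapping with a count of zero, a later document insertion only needs to raise their count to activate them.</Paragraph>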
  </Section>
  <Section position="5" start_page="178" end_page="178" type="metho">
    <SectionTitle>
4. Concluding remarks
</SectionTitle>
    <Paragraph position="0"> In this paper we have proposed a new approach to hypertexts based on an NLP methodology. Moreover, a general description of HERMES, a hypertextual system allowing automatic authoring, has been provided. The growth in use and size of hypertextual systems makes automatic authoring a must. In this activity, the efficacy of semantics, acquired by lexical acquisition tools (i.e. ARIOSTO), has been stressed. The semantic description guided by the lexicon is expressive enough for automatic authoring. As deep text understanding is not required, semantic interpretation is feasible and cost-effective. A conceptual rather than just structural representation of documents suggests semantic rules for anchor detection and link activation.</Paragraph>
    <Paragraph position="1"> The proposed method enhances the role of the user, as opposed to the author. The user becomes the main actor in hypertextual management, as he can browse the space of documents with confidence. The hypertextual space is generated according to his suggestions, as provided in the definition schema. The author suggests only how to semantically represent documents.</Paragraph>
    <Paragraph position="2"> As a concluding remark, the system evaluation has produced, as an unforeseen side effect, some important linguistic evidence about the underlying sublanguages. This relates to the capability of HERMES to derive meaningful anchors automatically.</Paragraph>
    <Paragraph position="3"> The portability and low resource requirements of HERMES are also improved by the use of lexical acquisition tools.</Paragraph>
  </Section>
</Paper>