File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/02/w02-1701_concl.xml
Size: 2,464 bytes
Last Modified: 2025-10-06 13:53:29
<?xml version="1.0" standalone="yes"?> <Paper uid="W02-1701"> <Title>RDF(S)/XML LINGUISTIC ANNOTATION OF SEMANTIC WEB PAGES</Title> <Section position="12" start_page="12" end_page="12" type="concl"> <SectionTitle> 5. CONCLUSIONS </SectionTitle> <Paragraph position="0"> We have seen that, even though AI researchers are devoting considerable effort to finding an optimal model for the semantic annotation of web pages, the decades of work on corpus annotation in Corpus Linguistics, and the results obtained there, have been somewhat neglected. This paper presents the results of our research on how linguistic annotation can help computers understand the text contained in a document (a Semantic Web page), bringing together the semantic annotation models from AI and the annotations proposed in Corpus Linguistics for every linguistic level.</Paragraph> <Paragraph position="1"> The integration of these two approaches (Corpus Linguistics and AI) brings several advantages for language engineering and AI applications. First, language resources become more reusable: many projects that use semantically annotated (web) documents must also parse the text to some extent and, before that, must determine the grammatical category associated with every word in the document. Including the annotation of these two levels in the document itself, produced with one of the tools already developed for this purpose, means that tokenisation and parsing or chunking of the document text need not be repeated each time the document is processed; the existing annotation is simply reused. Since parsing, for example, is a highly time-consuming task, this brings an additional advantage: a reduction in overall Semantic Web page processing time.
The second main advantage is that the meaning of a page with explicit semantic annotation can be reinforced by the contribution to meaning of every linguistic level; semantic analysis can also benefit from the invaluable work done so far on the development of ontologies as conceptual and consensual models.</Paragraph> <Paragraph position="2"> The main disadvantage, however, lies in the limitations imposed by current technologies: automatically obtaining compact, readable and verifiable pages is a task that is hard to specify and delimit fully, although the work under way in our laboratory aims to shed some light on it.</Paragraph> </Section> </Paper>