<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-2702">
  <Title>Annotation and Disambiguation of Semantic Types in Biomedical Text: a Cascaded Approach to Named Entity Recognition</Title>
  <Section position="7" start_page="16" end_page="17" type="concl">
    <SectionTitle>
6 Conclusions
</SectionTitle>
    <Paragraph position="0"> In this paper, we have described a pipeline of XML-based modules for the identification and disambiguation of several semantic types of biomedical named entities. The pipeline processes and semantically enriches documents by adding, changing or removing annotations. More precisely, the documents are augmented with UIDs that refer to entries in referential databases. In the course of processing, the number of annotated named entities (NEs) increases and the quality of the annotation improves. Thus, one of the main issues is to represent ambiguities that are still unresolved in a consistent way, so that subsequent modules can perform both identification and disambiguation of new semantic types. As subsequent modules try to add new semantic annotations, the prioritization of semantic types is enforced by the order of the term identification modules.</Paragraph>
    <Paragraph position="1"> We have shown that such an approach can be employed in a real-world, online information mining system, EBIMed. End users expect to view the original layout of the documents at all times, and thus the solution needs to provide an efficient multidimensional markup that preserves and combines existing markup (from publishers) with semantic NLP-derived tags. Since it is essential in the biomedical domain to provide links from term and named-entity occurrences to referential databases, EBIMed provides identification and disambiguation of such entities and integrates text with other knowledge sources.</Paragraph>
    <Paragraph position="2"> The existing solution, which annotates only the longest non-overlapping entities, is useful for real-world usage scenarios, but we also need ways to improve annotations by representing nested and overlapping terms.</Paragraph>
  </Section>
</Paper>