<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-3309">
  <Title>Generative Content Models for Structural Analysis of Medical Abstracts</Title>
  <Section position="3" start_page="65" end_page="67" type="intro">
    <SectionTitle>
2 Methods
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="65" end_page="66" type="sub_section">
      <SectionTitle>
2.1 Corpus and Data Preparation
</SectionTitle>
      <Paragraph position="0"> Our experiments involved MEDLINE, the bibliographical database of biomedical articles maintained by the U.S. National Library of Medicine (NLM).</Paragraph>
      <Paragraph position="1"> We used the subset of MEDLINE that was extracted for the TREC 2004 Genomics Track, consisting of citations from 1994 to 2003. In total, 4,591,008 records (abstract text and associated metadata) were extracted using the Date Completed (DCOM) field for all references in the range of 19940101 to 20031231.</Paragraph>
      <Paragraph position="2"> Viewing structural modeling of medical abstracts as a sentence classification task, we leveraged the existence of so-called structured abstracts (see Figure 1 for an example) in order to obtain the appropriate section label for each sentence. The use of section headings is a device recommended by the Ad Hoc Working Group for Critical Appraisal of the Medical Literature (1987) to help humans assess the reliability and content of a publication and to facilitate the indexing and retrieval processes. Although structured abstracts loosely adhere to the introduction, methods, results, and conclusions format, the exact choice of section headings varies from abstract to abstract and from journal to journal. In our test collection, we observed a total of 2688 unique section headings in structured abstracts; these were manually mapped to the four broad classes of &quot;introduction&quot;, &quot;methods&quot;, &quot;results&quot;, and &quot;conclusions&quot;. All sentences falling under a section heading were assigned the label of its appropriately mapped heading (naturally, the actual section headings were removed in our test collection). As a concrete example, in the abstract shown in Figure 1, the &quot;OBJECTIVE&quot; section would be mapped to &quot;introduction&quot; and the &quot;RESEARCH DESIGN AND METHODS&quot; section to &quot;methods&quot;. The &quot;RESULTS&quot; and &quot;CONCLUSIONS&quot; sections map directly to our own labels. In total, 308,055 structured abstracts were extracted and prepared in this manner, serving as the complete dataset. In addition, we created a reduced collection of 27,075 abstracts consisting of only Randomized Controlled Trials (RCTs), which represent definitive sources of evidence highly valued in the clinical decision-making process.</Paragraph>
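      <Paragraph> The labeling step described above can be sketched as follows. This is a minimal illustration rather than the code used in the experiments: the table holds only the four headings that appear in Figure 1 (the full manual mapping covers 2688 headings), and the names HEADING_TO_CLASS and label_sentences are hypothetical.

```python
# Hypothetical, abbreviated heading table; the manual mapping in the
# paper covers 2688 unique headings.
HEADING_TO_CLASS = {
    "OBJECTIVE": "introduction",
    "RESEARCH DESIGN AND METHODS": "methods",
    "RESULTS": "results",
    "CONCLUSIONS": "conclusions",
}


def label_sentences(sections):
    """Turn (heading, sentences) pairs into (sentence, label) pairs.

    The heading itself is dropped, so only the sentence text and its
    mapped class remain.
    """
    labeled = []
    for heading, sentences in sections:
        label = HEADING_TO_CLASS[heading.strip().upper()]
        for sentence in sentences:
            labeled.append((sentence, label))
    return labeled
```

A heading that is missing from the table raises a KeyError in this sketch, which makes unmapped headings visible instead of silently mislabeling their sentences.</Paragraph>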
      <Paragraph position="3"> Figure 1 (example structured abstract). Integrating medical management with diabetes self-management training: a randomized control trial of the Diabetes Outpatient Intensive Treatment program.</Paragraph>
      <Paragraph position="4"> OBJECTIVE- This study evaluated the Diabetes Outpatient Intensive Treatment (DOIT) program, a multiday group education and skills training experience combined with daily medical management, followed by case management over 6 months. Using a randomized control design, the study explored how DOIT affected glycemic control and self-care behaviors over a short term. The impact of two additional factors on clinical outcomes were also examined (frequency of case management contacts and whether or not insulin was started during the program). RESEARCH DESIGN AND METHODS- Patients with type 1 and type 2 diabetes in poor glycemic control (A1c ?8.5%) were randomly assigned to DOIT or a second condition, entitled EDUPOST, which was standard diabetes care with the addition of quarterly educational mailings. A total of 167 patients (78 EDUPOST, 89 DOIT) completed all baseline measures, including A1c and a questionnaire assessing diabetes-related self-care behaviors. At 6 months, 117 patients (52 EDUPOST, 65 DOIT) returned to complete a follow-up A1c and the identical self-care questionnaire. RESULTS- At follow-up, DOIT evidenced a significantly greater drop in A1c than EDUPOST. DOIT patients also reported significantly more frequent blood glucose monitoring and greater attention to carbohydrate and fat contents (ACFC) of food compared with EDUPOST patients. An increase in ACFC over the 6-month period was associated with improved glycemic control among DOIT patients. Also, the frequency of nurse case manager follow-up contacts was positively linked to better A1c outcomes. The addition of insulin did not appear to be a significant contributor to glycemic change. CONCLUSIONS- DOIT appears to be effective in promoting better diabetes care and positively influencing glycemia and diabetes-related self-care behaviors. However, it demands significant time, commitment, and careful coordination with many health care professionals. The role of the nurse case manager in providing ongoing follow-up contact seems important.</Paragraph>
      <Paragraph position="5"> Separately, we manually annotated 49 unstructured abstracts of randomized controlled trials retrieved to answer a question about the management of elevated low-density lipoprotein cholesterol (LDL-C). We submitted a PubMed query (&quot;elevated LDL-C&quot;) and restricted results to English abstracts of RCTs, gathering 49 unstructured abstracts from 26 journals. Each sentence was annotated with its section label by the third author, who is a medical doctor; this collection served as our blind held-out test set. Note that the annotation process preceded our experiments, which helped to guard against annotator-introduced bias. Of the 49 abstracts, 35 contained all four sections (which we refer to as &quot;complete&quot;), while 14 were missing one or more sections (which we refer to as &quot;partial&quot;).</Paragraph>
      <Paragraph position="6"> Two different types of experiments were conducted: the first consisted of cross-validation on the structured abstracts; the second consisted of training on the structured abstracts and testing on the unstructured abstracts. We hypothesized that structured and unstructured abstracts share the same underlying discourse patterns, and that content models trained with one can be applied to the other.</Paragraph>
    </Section>
    <Section position="2" start_page="66" end_page="67" type="sub_section">
      <SectionTitle>
2.2 Generative Models of Content
</SectionTitle>
      <Paragraph position="0"> Following Ruch et al. (2003) and Barzilay and Lee (2004), we employed Hidden Markov Models to model the discourse structure of MEDLINE abstracts. The four states in our HMMs correspond to the information that characterizes each section (&quot;introduction&quot;, &quot;methods&quot;, &quot;results&quot;, and &quot;conclusions&quot;), and state transitions capture the discourse flow from section to section.</Paragraph>
      <Paragraph position="1"> Using the SRI language modeling toolkit, we first computed bigram language models for each of the four sections using Kneser-Ney discounting and Katz backoff. All words in the training set were downcased, all numbers were converted into a generic symbol, and all singleton unigrams and bigrams were removed. Using these language models, each sentence was converted into a four-dimensional vector, where each component represents the log probability, divided by the number of words, of the sentence under each of the four language models.</Paragraph>
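      <Paragraph> The conversion of a sentence into a four-dimensional vector can be sketched as follows. This is an illustration under loud assumptions: the toy AddOneBigramLM class uses add-one smoothing and stands in for the SRI toolkit models (it implements neither Kneser-Ney discounting nor Katz backoff, and it does not prune singletons), and the names normalize, AddOneBigramLM, and feature_vector, as well as the NUM, BOS, and EOS symbols, are hypothetical.

```python
import math
import re
from collections import Counter

SECTIONS = ("introduction", "methods", "results", "conclusions")


def normalize(sentence):
    """Downcase and replace every number with a generic symbol."""
    tokens = sentence.lower().split()
    return ["NUM" if re.fullmatch(r"\d+(\.\d+)?", t) else t for t in tokens]


class AddOneBigramLM:
    """Toy bigram model with add-one smoothing (stand-in for the toolkit models)."""

    def __init__(self, sentences):
        self.bigrams = Counter()
        self.contexts = Counter()
        self.vocab = {"EOS"}
        for sentence in sentences:
            tokens = ["BOS"] + normalize(sentence) + ["EOS"]
            self.vocab.update(tokens[1:])
            for prev, word in zip(tokens, tokens[1:]):
                self.bigrams[(prev, word)] += 1
                self.contexts[prev] += 1

    def logprob(self, sentence):
        """Natural-log probability of the sentence, including the end symbol."""
        tokens = ["BOS"] + normalize(sentence) + ["EOS"]
        size = len(self.vocab) + 1  # one extra slot for unseen words
        return sum(
            math.log((self.bigrams[(p, w)] + 1) / (self.contexts[p] + size))
            for p, w in zip(tokens, tokens[1:])
        )


def feature_vector(sentence, models):
    """Per-word log probability of the sentence under each section model."""
    n_words = max(len(normalize(sentence)), 1)
    return [models[s].logprob(sentence) / n_words for s in SECTIONS]
```

Dividing by the sentence length keeps long and short sentences on a comparable scale, so the HMM sees how well each section model fits a sentence rather than how long the sentence is.</Paragraph>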
      <Paragraph position="2"> We then built a four-state Hidden Markov Model that outputs these four-dimensional vectors. The transition probability matrix of the HMM was initialized with uniform probabilities over a fully connected graph. The output probabilities were modeled as four-dimensional Gaussian mixtures with diagonal covariance matrices. Using the section labels, the HMM was trained using the HTK toolkit (Young et al., 2002), which efficiently performs the forward-backward algorithm and Baum-Welch estimation. For testing, we performed a Viterbi (maximum likelihood) estimation of the label of each test sentence/vector (also using the HTK toolkit).</Paragraph>
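      <Paragraph> The Viterbi labeling step can be sketched in plain Python as follows. This is a simplified illustration, not the HTK implementation: each state emits from a single diagonal-covariance Gaussian rather than a mixture, the parameters are assumed to have been estimated already, and the function names are hypothetical.

```python
import math


def diag_gaussian_logpdf(x, mean, var):
    """Log density of a Gaussian with diagonal covariance."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def viterbi(observations, log_start, log_trans, means, variances):
    """Most likely state sequence for a non-empty list of feature vectors."""
    states = range(len(means))
    emissions = [
        [diag_gaussian_logpdf(x, means[s], variances[s]) for s in states]
        for x in observations
    ]
    scores = [log_start[s] + emissions[0][s] for s in states]
    backpointers = []
    for emission in emissions[1:]:
        new_scores, pointers = [], []
        for s in states:
            # Best predecessor of state s at this time step.
            best = max(states, key=lambda p: scores[p] + log_trans[p][s])
            new_scores.append(scores[best] + log_trans[best][s] + emission[s])
            pointers.append(best)
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(states, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]
```

Because the transition scores enter every step, a sentence whose vector weakly favors another section can still keep the label of its neighbors when switching states twice would cost more than the emission gain.</Paragraph>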
      <Paragraph position="3"> In an attempt to further boost performance, we employed Linear Discriminant Analysis (LDA) to find a linear projection of the four-dimensional vectors that maximizes the separation of the Gaussians (corresponding to the HMM states). Venables and Ripley (1994) describe an efficient algorithm (of linear complexity in the number of training sentences) for computing the LDA transform matrix, which entails computing the within- and between-covariance matrices of the classes, and using Singular Value Decomposition (SVD) to compute the eigenvectors of the new space. Each sentence/vector is then multiplied by this matrix, and new HMM models are re-computed from the projected data.</Paragraph>
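      <Paragraph> For reference, the LDA projection can be written in its standard textbook form. This is a sketch, not a formula taken from the paper, and the notation is an assumption: x_i is the four-dimensional vector of sentence i, \mu_c and n_c are the mean and sentence count of class c, \mu is the overall mean, and the exact normalization of the two matrices used in the experiments is not stated.

```latex
S_W = \sum_{c=1}^{4} \sum_{i \in c} (x_i - \mu_c)(x_i - \mu_c)^{\top}, \qquad
S_B = \sum_{c=1}^{4} n_c \, (\mu_c - \mu)(\mu_c - \mu)^{\top}

W^{*} = \arg\max_{W} \frac{\lvert W^{\top} S_B W \rvert}{\lvert W^{\top} S_W W \rvert}, \qquad
y_i = W^{*\top} x_i
```

The columns of W^{*} are the leading eigenvectors of S_W^{-1} S_B. With four classes, S_B has rank at most three, so at most three directions carry between-class separation.</Paragraph>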
      <Paragraph position="4"> An important aspect of our work is modeling content structure using generative techniques. To assess the impact of taking discourse transitions into account, we compare our fully trained model to one that does not take advantage of the Markov assumption; that is, it assumes that the labels are independently and identically distributed.</Paragraph>
      <Paragraph position="5"> To facilitate comparison with previous work, we also experimented with binary classifiers specifically tuned to each section. This was done by creating a two-state HMM: one state corresponds to the label we want to detect, and the other state corresponds to all the other labels. We built four such classifiers, one for each section, and trained them in the same manner as above.</Paragraph>
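      <Paragraph> The relabeling behind the four binary classifiers can be sketched as follows. This is a hypothetical illustration of the data preparation only, with invented function names; training each two-state HMM would then proceed as for the four-state model.

```python
SECTIONS = ("introduction", "methods", "results", "conclusions")


def one_vs_rest_labels(labels, target):
    """Collapse four-way labels into the two states of a binary HMM."""
    return [target if label == target else "other" for label in labels]


def binary_training_sets(labels):
    """Build one relabeled sequence per section, four in total."""
    return {section: one_vs_rest_labels(labels, section) for section in SECTIONS}
```

Each relabeled sequence keeps the sentence order of the abstract, so the two-state model can still learn when the target section tends to begin and end.</Paragraph>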
    </Section>
  </Section>
</Paper>