<?xml version="1.0" standalone="yes"?> <Paper uid="E06-2017"> <Title>Computing Term Translation Probabilities with Generalized Latent Semantic Analysis</Title> <Section position="5" start_page="153" end_page="153" type="concl"> <SectionTitle> 4 Conclusion and Future Work </SectionTitle> <Paragraph position="0"> We used GLSA to compute term translation probabilities as a measure of semantic similarity between documents. We showed that the GLSA term-based document representation and GLSA-based term translation probabilities improve performance on document classification.</Paragraph> <Paragraph position="1"> The GLSA term vectors were computed for all vocabulary terms. However, different measures of similarity may be required for different groups of terms, such as content-bearing general vocabulary words on the one hand and proper names and other named entities on the other. Furthermore, nouns and verbs are best served by different measures of similarity. To extend this approach, we will use a combination of similarity measures between terms to model document similarity. We will divide the vocabulary into general vocabulary terms and named entities and compute a separate similarity score for each group of terms. The overall similarity score will be a function of these two scores. In addition, we will use the GLSA-based score together with syntactic similarity to compute the similarity between general vocabulary terms.</Paragraph> </Section> </Paper>