Unsupervised Learning of Dependency Structure for Language Modeling

2 Motivation

A trigram language model predicts the next word from only the two preceding words, blindly discarding any other relevant word that may lie three or more positions to the left. Such a model is likely to be linguistically implausible: consider the English sentence in Figure 1(a), where a trigram model would predict cried from next seat, which does not agree with our intuition. In this paper, we define the dependency structure of a sentence as a set of probabilistic dependencies that express the linguistic relations between words in a sentence by an acyclic, planar graph, in which two related words are connected by an undirected edge (i.e., we do not differentiate the modifier and the head in a dependency). The dependency structure for the sentence in Figure 1(a) is as shown; a model that uses this dependency structure would predict cried from baby, in agreement with our intuition.

[Figure 1 caption, partially recovered: "... demarcate morpheme boundaries; square brackets indicate phrases (bunsetsu)."]

A Japanese sentence is typically divided into non-overlapping phrases called bunsetsu. As shown in Figure 1(b), each bunsetsu consists of one content word, referred to here as the headword H, and several function words F. Words (more precisely, morphemes) within a bunsetsu are tightly bound to each other, a regularity that a word trigram model captures adequately. However, headwords across bunsetsu boundaries also have dependency relations with each other, as the diagrams in Figure 1 show. Such long-distance dependency relations are expected to provide useful information, complementary to the word trigram model, for the task of next-word prediction.

In constructing language models for realistic applications such as speech recognition and Asian-language text input, we face two constraints that we would like to satisfy. First, the model must operate in a left-to-right manner, because (1) the search procedures that map an input acoustic signal or phonetic string to words work left to right, and (2) such a model can easily be combined with a word trigram model in decoding. Second, the model should be computationally feasible in both training and decoding. In the next section, we present a dependency language model (DLM) that satisfies both of these constraints.
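To make the contrast between the two kinds of context concrete, the following minimal sketch (not from the paper; the example sentence and the dependency edges are hypothetical, chosen only to match the discussion of Figure 1(a)) compares what a trigram model and a dependency-structure model each see when predicting cried:

    # Hypothetical illustration of the contexts available for predicting "cried".
    sentence = ["a", "baby", "in", "the", "next", "seat", "cried"]

    # Trigram model: P(w_i | w_{i-2}, w_{i-1}) -- only the two immediately
    # preceding words are visible, however irrelevant they may be.
    i = sentence.index("cried")
    trigram_context = tuple(sentence[i - 2:i])            # ('next', 'seat')

    # Dependency structure: a set of undirected edges between related words
    # (modifier and head are not distinguished). The edges below are assumed
    # for illustration; "cried" is linked to its subject "baby", which lies
    # five positions to the left.
    dependencies = {("a", "baby"), ("baby", "cried"), ("in", "baby"),
                    ("seat", "in"), ("the", "seat"), ("next", "seat")}
    dependency_context = {w for edge in dependencies if "cried" in edge
                          for w in edge if w != "cried"}  # {'baby'}

The trigram context next seat is a misleading predictor for cried, whereas the dependency context baby is the linguistically relevant one; this is exactly the complementary information that the dependency model is meant to supply.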
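As a point of reference for the first constraint, one standard way (an assumption here, not necessarily the combination used later in the paper) to combine a left-to-right long-distance predictor with a trigram model in decoding is linear interpolation, with a hypothetical weight \lambda:

    P(w_i \mid w_1^{i-1}) = \lambda \, P_{\mathrm{tri}}(w_i \mid w_{i-2}, w_{i-1}) + (1 - \lambda) \, P_{\mathrm{dep}}(w_i \mid D(w_1^{i-1}))

where D(w_1^{i-1}) denotes the dependency context extracted from the words seen so far. Because both terms condition only on the left context, the combined model remains usable in left-to-right search.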