<?xml version="1.0" standalone="yes"?>
<Paper uid="P89-1011">
  <Title>Table I</Title>
  <Section position="3" start_page="0" end_page="84" type="intro">
    <SectionTitle>
THEORETICAL BACKGROUND
</SectionTitle>
    <Paragraph position="0"> In most recent work on the process of word recognition during comprehension of connected speech (either by human or machine) a distinction is made between lexical access and word recognition (e.g.</Paragraph>
    <Paragraph position="1"> Marslen-Wilson &amp; Welsh, 1978; Klatt, 1979). Lexical access is the process by which contact is made with the lexicon on the basis of an initial acoustic-phonetic or phonological representation of some portion of the speech input. The result of lexical access is a cohort of potential word candidates which are compatible with this initial analysis. (The term cohort is used descriptively in this paper and does not represent any commitment to the particular account of lexical access and word recognition provided by any version of the cohort theory (e.g.</Paragraph>
    <Paragraph position="2"> Marslen-Wilson, 1987).) Most theories assume that the candidates in this cohort are successively whittled down both on the basis of further acoustic-phonetic or phonological information, as more of the speech input becomes available, and on the basis of the candidates' compatibility with the linguistic and extralinguistic context of utterance. When only one candidate remains, word recognition is said to have taken place.</Paragraph>
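The whittling-down of a cohort described above can be illustrated with a minimal sketch. It assumes a hypothetical toy lexicon in which each word is a tuple of phoneme symbols (the symbols and entries are illustrative, not taken from the paper), and it models only the phonological side of the process: candidates are dropped as soon as they stop matching the input, and contextual constraints are ignored.

```python
def narrow_cohort(lexicon, phonemes):
    """Yield (phoneme, surviving cohort) after each incoming phoneme."""
    cohort = list(lexicon)
    for i, phoneme in enumerate(phonemes):
        # Keep only candidates whose i-th segment matches the input so far.
        cohort = [w for w in cohort if len(w) > i and w[i] == phoneme]
        yield phoneme, cohort


# Hypothetical toy lexicon: words as tuples of phoneme symbols.
TOY_LEXICON = [
    ("h", "I", "t"),  # hit
    ("h", "I", "d"),  # hid
    ("h", "I", "l"),  # hill
    ("h", "a", "t"),  # hot
    ("I", "t"),       # it
]

steps = list(narrow_cohort(TOY_LEXICON, ("h", "I", "t")))
for phoneme, cohort in steps:
    print(phoneme, len(cohort))
```

In this toy run the cohort shrinks from four candidates to three and then to one, the point at which recognition would be said to have taken place. With a realistic lexicon and an errorful input the cohorts would be far larger, which is the concern raised in the text.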
    <Paragraph position="3"> Most psycholinguistic work in this area has focussed on the process of word recognition after a cohort of candidates has been selected, emphasising the role of further lexical or 'higher-level' linguistic constraints such as word frequency, lexical semantic relations, or syntactic and semantic congruity of candidates with the linguistic context (e.g. Bradley &amp; Forster, 1987; Marslen-Wilson &amp; Welsh, 1978). The few explicit and well-developed models of lexical access and word recognition in continuous speech (e.g. TRACE, McClelland &amp; Elman, 1986) have small and unrealistic lexicons of, at most, a few hundred words and ignore phonological processes which occur in fluent speech. Therefore, they tend to overestimate the amount and reliability of acoustic information which can be directly extracted from the speech signal (either by human or machine) and make unrealistic and overly-optimistic assumptions concerning the size and diversity of candidates in a typical cohort. This, in turn, casts doubt on the real efficacy of the putative mechanisms which are intended to select the correct word from the cohort.</Paragraph>
    <Paragraph position="4"> The bulk of engineering systems for speech recognition have finessed the issues of lexical access and word recognition by attempting to map directly from the acoustic signal to candidate words by pairing words with acoustic representations of the canonical pronunciation of the word in the lexicon and employing pattern-matching, best-fit techniques to select the most likely candidate (e.g. Sakoe &amp; Chiba, 1971). However, these techniques have only proved effective for isolated word recognition of small vocabularies with the system trained to an individual speaker, as, for example, Zue &amp; Huttenlocher (1983) argue. Furthermore, any direct access model of this type which does not incorporate a pre-lexical symbolic representation of the input will have difficulty capturing many rule-governed phonological processes which affect the pronunciation of words in fluent speech,</Paragraph>
    <Paragraph position="5"> since these processes can only be characterised adequately in terms of operations on a symbolic, phonological representation of the speech input (e.g.</Paragraph>
    <Paragraph position="6"> Church, 1987; Frazier, 1987; Wiese, 1986).</Paragraph>
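As a rough illustration of the direct, template-based approach criticised above, the following sketch uses dynamic time warping, the best-fit alignment technique usually associated with the cited Sakoe and Chiba work. The one-dimensional feature sequences and the two word templates are hypothetical stand-ins for real acoustic representations, and nothing here reproduces any particular system.

```python
def dtw_distance(a, b):
    """Dynamic-programming alignment cost between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] holds the best cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # stretch the template
                cost[i][j - 1],      # stretch the input
                cost[i - 1][j - 1],  # advance both
            )
    return cost[n][m]


def best_match(signal, templates):
    """Return the word whose stored template aligns most cheaply with the signal."""
    return min(templates, key=lambda word: dtw_distance(signal, templates[word]))


# Hypothetical one-dimensional "acoustic" templates for two isolated words.
TEMPLATES = {
    "yes": [1.0, 3.0, 4.0, 2.0],
    "no": [5.0, 5.0, 1.0, 1.0],
}
utterance = [1.0, 1.0, 3.0, 4.0, 4.0, 2.0]  # a time-stretched "yes"
print(best_match(utterance, TEMPLATES))
```

The matcher absorbs differences in timing, but it compares whole-word templates against the signal and has no symbolic level at which a phonological rule could apply, which is the limitation the text points to.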
    <Paragraph position="7"> The research reported here forms part of an ongoing programme to develop a computationally explicit account of lexical access and word recognition in connected speech, which is at least informed by experimental results concerning the psychological processes and mechanisms which underlie this task. To guide research,</Paragraph>
    <Paragraph position="8"> we make use of a substantial lexical database of English derived from machine-readable versions of the Longman Dictionary of Contemporary English (see Boguraev et al., 1987; Boguraev &amp; Briscoe, 1989) and of the Medical Research Council's psycholinguistic database (Wilson, 1988), which incorporates word frequency information.</Paragraph>
    <Paragraph position="9"> This specialised database system provides flexible and powerful querying facilities into a database of approximately 30,000 English word forms (with 60,000 separate entries). The querying facilities can be used to explore the lexical structure of English and simulate different approaches to lexical access and word recognition. Previous work in this area has often relied on small illustrative lexicons, which tends to lead to overestimation of the effectiveness of various approaches.</Paragraph>
    <Paragraph position="10"> There are two broad questions to ask concerning the process of lexical access. Firstly, what is the nature of the initial representation which makes contact with the lexicon? Secondly, at what points during the (continuous) analysis of the speech signal is lexical look-up triggered? We can illustrate the import of these questions by considering an example like (1) (modified from Klatt via Church, 1987).</Paragraph>
    <Paragraph position="12"> (Where 'I' represents a high, front vowel, 'E' schwa, 'd' a flapped or neutralised stop, and '?' a glottal stop.) The phonetic transcription of one possible utterance of (1a) in (1b) demonstrates some of the problems involved in any 'direct' mapping from the speech input to lexical entries not mediated by the application of phonological rules.</Paragraph>
    <Paragraph position="13"> For example, the palatalisation of final /d/ before /y/ in /did/ means that any attempt to relate that portion of the speech input to the lexical entry for did is likely to fail. Similar points can be made about the flapping and glottalisation of the /t/ phonemes in /hit/ and /It/, and the vowel reductions to schwa. In addition, (1) illustrates the well-known point that there are no 100% reliable phonetic or phonological cues to word boundaries in connected speech. Without further phonological and lexical analysis there is no indication in a transcription like (1b) of where words begin or end; for example, how does the lexical access system distinguish word-initial /I/ in /I?/ from word-internal /I/ in /hid/? In this paper, I shall argue for a model which splits the lexical access process into a pre-lexical phonological parsing stage and then a lexical entry retrieval stage. The model is similar to that of Church (1987); however, I argue, firstly, that the initial phonological representation recovered from the speech input is more variable and often less detailed than that assumed by Church and, secondly, that the lexical entry retrieval stage is more directed and [illegible], in order to reduce the number of spurious lexical entries accessed and to compensate for likely indeterminacies in the initial representation.</Paragraph>
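The contrast between a direct mapping and a two-stage process can be sketched with a deliberately small example. Everything in it is an assumption made for illustration: the toy lexicon, the use of the symbol "j" for a palatalised /d/ plus /y/ sequence, the surface string "dIju", and the single inverse rule. It is not the model argued for in this paper, only a picture of why undoing a phonological process before look-up can help.

```python
# Hypothetical toy lexicon keyed by phonemic form (symbols are illustrative).
TOY_LEXICON = {"dId": "did", "yu": "you", "hIt": "hit", "It": "it"}

# Hypothetical inverse rule: a surface segment and its possible underlying sources.
INVERSE_RULES = {"j": ["dy"]}  # palatalised /d/ + /y/ surfacing as one segment


def direct_lookup(surface, lexicon):
    """Return words whose phonemic form is a literal prefix of the input."""
    return [word for form, word in lexicon.items() if surface.startswith(form)]


def underlying_forms(surface, rules):
    """Stage 1: propose underlying strings by undoing each rule."""
    forms = {surface}
    for segment, sources in rules.items():
        for source in sources:
            forms.add(surface.replace(segment, source))
    return forms


def two_stage_lookup(surface, lexicon, rules):
    """Stage 2: look up every proposed underlying form in the lexicon."""
    found = set()
    for form in underlying_forms(surface, rules):
        found.update(direct_lookup(form, lexicon))
    return sorted(found)


surface = "dIju"  # hypothetical palatalised rendering of "did you"
print(direct_lookup(surface, TOY_LEXICON))
print(two_stage_lookup(surface, TOY_LEXICON, INVERSE_RULES))
```

In this toy case the direct look-up finds nothing because the final /d/ of the first word is no longer present on the surface, whereas the two-stage version recovers "did". A realistic system would also have to cope with the extra spurious candidates that such rule inversion generates, which is the issue the text raises about directing the retrieval stage.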
  </Section>
</Paper>