<?xml version="1.0" standalone="yes"?>
<Paper uid="C69-6214">
  <Title>AN APPLICATION OF COMPUTER TECHNIQUES TO ANALYSIS OF THE VERB PHRASE IN HINDI AND ENGLISH: A Preliminary Report</Title>
  <Section position="1" start_page="0" end_page="19" type="abstr">
    <SectionTitle>
Abstract
AN APPLICATION OF COMPUTER TECHNIQUES TO ANALYSIS
OF THE VERB PHRASE IN HINDI AND ENGLISH:
A Preliminary Report
</SectionTitle>
    <Paragraph position="0"> Dr. L.M. Khubchandani and W.W. Glover. The authors worked on the Project at Poona, India, with the facilities of the CDC 3600-160A computer installed at the Tata Institute for Fundamental Research, Bombay.</Paragraph>
    <Paragraph position="1"> The Project uses two sets of data: a corpus of verbal phrases drawn from a modern Hindi play and a complete paradigm of English sentences generated from the kernel &quot;he eats it&quot;. The computer was programmed to group into classes the words occurring in identical contexts, and to substitute in the data corpus for these words a reference to the class in which they have been put. The classification and substitution thus produced suggested phrase patterns, with the filler class for each tagmeme defined as the class represented in the particular slot of the pattern.</Paragraph>
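    The grouping and substitution procedure described above can be sketched roughly as follows. This is only an illustrative reconstruction in Python, not the authors' program: the function names, the boundary marker, the class labels and the four toy sentences are all hypothetical. The sketch also assumes that words sharing any one identical context are merged into a single class, a detail the abstract does not specify. The depth parameter is the number of neighbouring words on each side that must match.

```python
from collections import defaultdict

BOUNDARY = "#"  # hypothetical marker for the edge of a phrase


def contexts(phrase, depth):
    """Yield (word, context) pairs; a context is the `depth` words on each side."""
    padded = [BOUNDARY] * depth + list(phrase) + [BOUNDARY] * depth
    for i in range(depth, depth + len(phrase)):
        left = tuple(padded[i - depth:i])
        right = tuple(padded[i + 1:i + 1 + depth])
        yield padded[i], (left, right)


def classify(phrases, depth=1):
    """Group words that occur in an identical context; return {word: class label}."""
    by_context = defaultdict(set)
    for phrase in phrases:
        for word, ctx in contexts(phrase, depth):
            by_context[ctx].add(word)

    # Assumption: words sharing any one context are merged into the same class
    # (a simple union-find); the original criterion for merging is not stated.
    parent = {}

    def find(word):
        parent.setdefault(word, word)
        while parent[word] != word:
            parent[word] = parent[parent[word]]
            word = parent[word]
        return word

    for words in by_context.values():
        if len(words) >= 2:  # a context with a single filler forms no class
            first, *rest = sorted(words)
            for other in rest:
                parent[find(other)] = find(first)

    groups = defaultdict(list)
    for word in list(parent):
        groups[find(word)].append(word)

    labels = {}
    for n, root in enumerate(sorted(groups), start=1):
        for word in groups[root]:
            labels[word] = f"C{n}"  # hypothetical class reference
    return labels


def substitute(phrases, labels):
    """Replace each classified word by its class reference, leaving others as-is."""
    return [tuple(labels.get(w, w) for w in phrase) for phrase in phrases]


if __name__ == "__main__":
    # Invented toy sentences, not the paradigm used in the Project.
    toy = [("he", "eats", "it"), ("she", "eats", "it"),
           ("he", "ate", "it"), ("he", "eats", "them")]
    labels = classify(toy, depth=1)
    print(labels)
    print(sorted(set(substitute(toy, labels))))  # the distinct phrase patterns
```

    In this sketch, the distinct tuples returned by substitute play the role of the phrase patterns, and the class label found in each slot stands for the filler class of that slot.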
    <Paragraph position="2"> The results obtained with a criterion for classification of &quot;identical context one-deep on both sides&quot; were quite satisfactory. In Hindi, 25 classes were formed from the corpus of 65 phrases. At least one word was classified in 37 (62%) of the phrases, and all words were classified in 3 phrases (15%). With an increased sample of similar data these percentages would be expected to increase.</Paragraph>
    <Paragraph position="3"> In English, 24 patterns were obtained and 15 classes were formed from the full paradigm of 112 sentences.</Paragraph>
    <Paragraph position="4"> However, as some of the classes contained grammatically dissimilar members, the criterion was altered to &quot;identical context two-deep on both sides&quot;. The results with this criterion appear less promising in Hindi. The data sample was extended to 248 phrases of three words or more. The machine discovered 223 patterns and 13 classes, and in only 29 patterns (13%) was one word replaced by a class reference. This criterion, however, enjoyed some success in analysing the English paradigm which is, of course, highly restricted data. With the full paradigm, the machine discovered 30 patterns and 19 classes. Of these classes, 18 are quite homogeneous in membership, and the sentences generated by the patterns using these classes are all legitimate.</Paragraph>
    <Paragraph position="5"> It is felt that a useful basis for further investigation would be to refine the broad classes formed with a &quot;one-deep&quot; criterion in a subsequent run through the same or new data.</Paragraph>
  </Section>
</Paper>