<?xml version="1.0" standalone="yes"?>
<Paper uid="C94-1042">
  <Title>Comlex Syntax: Building a Computational Lexicon</Title>
  <Section position="3" start_page="0" end_page="268" type="intro">
    <SectionTitle>
2 Structure
</SectionTitle>
    <Paragraph position="0"> The word list was derived from the file prepared by Prof. Roger Mitton from the Oxford Advanced Learner's Dictionary, and contains about 38,000 head forms, although some purely British terms have been omitted. Each entry is organized as a nested set of typed feature-value lists. We currently use a Lisp-like parenthesized list notation, although the lexicon could be readily mapped into other forms, such as SGML-marked text, if desired.</Paragraph>
    <Paragraph position="1"> Footnote 1: To facilitate the transition to COMLEX by current users of these dictionaries, we have prepared mappings from COMLEX classes to those of several other dictionaries.</Paragraph>
    <Paragraph position="2"> Some sample dictionary entries are shown in Figure 1. The first symbol gives the part of speech; a word with several parts of speech will have several dictionary entries, one for each part of speech. Each entry has an :orth feature, giving the base form of the word. Nouns, verbs, and adjectives with irregular morphology will have features for the irregular forms :plural, :past, :pastpart, etc. Words which take complements will have a subcategorization (:subc) feature. For example, the verb &quot;abandon&quot; can occur with a noun phrase followed by a prepositional phrase with the preposition &quot;to&quot; (e.g., &quot;I abandoned him to the linguists.&quot;) or with just a noun phrase complement (&quot;I abandoned the ship.&quot;). Other syntactic features are recorded under :features.</Paragraph>
    <Paragraph position="3"> For example, the noun &quot;abandon&quot; is marked as (countable :pval (&quot;with&quot;)), indicating that it must appear in the singular with a determiner unless it is preceded by the preposition &quot;with&quot;.</Paragraph>
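    <Paragraph> As an illustrative sketch only, the nested feature-value organization described above can be modelled in Python and printed in a Lisp-like parenthesized form. The two entries are reconstructed from the prose rather than copied from Figure 1. The helper names Sym and render, and the complement-type names np-pp and np, are assumptions made for this illustration and are not taken from the lexicon itself.

```python
class Sym(str):
    """A bare symbol such as noun or countable, printed without quotes."""


def render(value):
    """Render nested Python data as a Lisp-like parenthesized list."""
    if isinstance(value, Sym):          # check Sym first: it subclasses str
        return str(value)
    if isinstance(value, str):          # ordinary strings are quoted
        return '"' + value + '"'
    if isinstance(value, dict):         # feature-value pairs: :key value ...
        return " ".join(":" + k + " " + render(v) for k, v in value.items())
    if isinstance(value, (list, tuple)):
        return "(" + " ".join(render(v) for v in value) + ")"
    raise TypeError("unsupported value: " + repr(value))


# Noun entry: the :features value follows the marking quoted in the text.
noun_abandon = [
    Sym("noun"),
    {"orth": "abandon",
     "features": [[Sym("countable"), {"pval": ["with"]}]]},
]

# Verb entry: complement names "np-pp" and "np" are hypothetical placeholders.
verb_abandon = [
    Sym("verb"),
    {"orth": "abandon",
     "subc": [[Sym("np-pp"), {"pval": ["to"]}], [Sym("np")]]},
]

print(render(noun_abandon))
# (noun :orth "abandon" :features ((countable :pval ("with"))))
print(render(verb_abandon))
# (verb :orth "abandon" :subc ((np-pp :pval ("to")) (np)))
```

Keeping symbols and strings as distinct Python types is one possible design choice; it lets the same structure be written out in another format later without guessing which values were quoted.</Paragraph>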
    <Section position="1" start_page="0" end_page="268" type="sub_section">
      <SectionTitle>
2.1 Subcategorization
</SectionTitle>
      <Paragraph position="0"> We have paid particular attention to providing detailed subcategorization information (information about complement structure), both for verbs and for those nouns and adjectives which do take complements.</Paragraph>
      <Paragraph position="1"> In order to ensure the completeness of our codes, we studied the coding employed by several other major lexicons, including the Brandeis Verb Lexicon (footnote 2), the ACQUILEX Project [10], the NYU Linguistic String Project [9], the OALD, and LDOCE, and, whenever feasible, have sought to incorporate distinctions made in any of these dictionaries. Our resulting feature system includes 92 subcategorization features for verbs, 14 for adjectives, and 9 for nouns. These features record differences in grammatical functional structure as well as constituent structure. In particular, they capture four different types of control: subject control, object control, variable control, and arbitrary control. Furthermore, the notation allows us to indicate that a verb may have different control features for different complement structures, or even for different prepositions within the complement. We record, for example, that &quot;blame ... on&quot; involves arbitrary control (&quot;He blamed the country's health problems on eating too much chocolate.&quot;), whereas &quot;blame ... for&quot; involves object control (&quot;He blamed John for going too fast.&quot;).</Paragraph>
      <Paragraph position="2"> Footnote 2: Developed by J. Grimshaw and R. Jackendoff.</Paragraph>
      <Paragraph position="3"> The names for the different complement types are based on the conventions used in the Brandeis verb lexicon, where each complement is designated by the names of its constituents, together with a few tags to indicate things such as control phenomena. Each complement type is formally defined by a frame (see Figure 2). The frame includes the constituent structure, :cs, the grammatical structure, :gs, one or more :features, and one or more examples, :ex. The constituent structure lists the constituents in sequence; the grammatical structure indicates the functional role played by each constituent. The elements of the constituent structure are indexed, and these indices are referenced in the grammatical structure field (in vp-frames, the index &quot;1&quot; in the grammatical structures always refers to the surface subject of the verb).</Paragraph>
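      <Paragraph> The frame organization described above (a constituent structure whose indexed elements are referenced from the grammatical structure) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the lexicon's own format: the class names, the resolve_roles helper, the role name "obj", and the example frame itself are hypothetical. Only the four fields :cs, :gs, :features and :ex, and the convention that index 1 denotes the surface subject, come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# In verb frames, index 1 in :gs always refers to the surface subject.
SURFACE_SUBJECT = 1


@dataclass
class Constituent:
    category: str  # constituent type, e.g. "np" or "pp" (names assumed here)
    index: int     # index that the :gs field can refer to


@dataclass
class Frame:
    name: str
    cs: List[Constituent]           # :cs, constituents in surface order
    gs: Dict[str, Optional[int]]    # :gs, functional role mapped to an index
    features: Dict[str, str] = field(default_factory=dict)  # :features
    ex: List[str] = field(default_factory=list)             # :ex


def resolve_roles(frame: Frame) -> Dict[str, Optional[str]]:
    """Map each functional role in :gs to the constituent it points at."""
    by_index = {c.index: c.category for c in frame.cs}
    resolved: Dict[str, Optional[str]] = {}
    for role, idx in frame.gs.items():
        if idx is None:
            resolved[role] = None                  # role left unfilled
        elif idx == SURFACE_SUBJECT:
            resolved[role] = "surface subject"
        elif idx in by_index:
            resolved[role] = by_index[idx]
        else:
            raise ValueError(f"{frame.name}: :gs refers to unknown index {idx}")
    return resolved


# Hypothetical frame: a noun phrase followed by a prepositional phrase.
np_pp = Frame(
    name="np-pp",
    cs=[Constituent("np", 2), Constituent("pp", 3)],
    gs={"subject": 1, "obj": 2, "comp": 3},
    ex=["I abandoned him to the linguists."],
)

print(resolve_roles(np_pp))
```

Raising an error on a dangling index is a design choice of this sketch; it makes an inconsistent frame definition fail early instead of silently dropping a role.</Paragraph>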
      <Paragraph position="4"> Three verb frames are shown in Figure 2. The first, s, is for full sentential complements with an optional &quot;that&quot; complementizer. The second and third frames both represent infinitival complements, and differ only in their functional structure. The to-inf-sc frame is for subject-control verbs, verbs for which the surface subject is the functional subject of both the matrix and embedded clauses. The notation :subject 1 in the :cs field indicates that the surface subject is the subject of the embedded clause, while the :subject 1 in the :gs field indicates that it is the subject of the matrix clause. The indication :features (:control subject) provides this information redundantly; we include both indications in case one is more convenient for particular dictionary users. The to-inf-rs frame is for raising-to-subject verbs, verbs for which the surface subject is the functional subject only of the embedded clause.</Paragraph>
      <Paragraph position="5"> The functional subject position in the matrix clause is unfilled, as indicated by the notation :gs (:subject () :comp 2).</Paragraph>
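      <Paragraph> The contrast between the two infinitival frames can be made concrete with a small Python sketch. It is reconstructed from the prose, not copied from Figure 2. The dictionary layout, the keys "subject-control" and "raising-to-subject", the category name "vp", and the function names are assumptions for illustration; the :subject and :comp values and the :control feature follow the description above. The sketch handles only the two cases discussed here.

```python
# Two infinitival frames, reconstructed from the prose description.
# None stands for the empty value written () in the Lisp-like notation.
FRAMES = {
    "subject-control": {
        "cs": [{"cat": "vp", "index": 2, "subject": 1}],
        "gs": {"subject": 1, "comp": 2},
        "features": {"control": "subject"},
    },
    "raising-to-subject": {
        "cs": [{"cat": "vp", "index": 2, "subject": 1}],
        "gs": {"subject": None, "comp": 2},  # :gs (:subject () :comp 2)
        "features": {},
    },
}


def functional_subjects(frame):
    """Return (matrix_subject, embedded_subject); None means unfilled."""
    matrix = frame["gs"].get("subject")
    embedded = frame["cs"][0].get("subject")
    return matrix, embedded


def classify(frame):
    """Derive the frame type from :cs and :gs alone."""
    matrix, embedded = functional_subjects(frame)
    if embedded is None:
        return "unknown"
    if matrix is None:
        return "raising"            # surface subject belongs only to the embedded clause
    if matrix == embedded:
        return "subject control"    # same index is subject of both clauses
    return "unknown"


def control_feature_agrees(frame):
    """Check that a redundant :control feature, if present, matches :cs and :gs.

    Only the subject-control case is handled in this sketch.
    """
    declared = frame["features"].get("control")
    if declared is None:
        return True
    return declared == "subject" and classify(frame) == "subject control"


for name, frame in FRAMES.items():
    print(name, classify(frame), control_feature_agrees(frame))
```

Because the control type can be derived from :cs and :gs, the explicit :control feature is redundant, as the text notes; a check such as control_feature_agrees is one way a dictionary user might confirm that the two indications stay in step.</Paragraph>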
    </Section>
  </Section>
</Paper>