<?xml version="1.0" standalone="yes"?> <Paper uid="P98-1015"> <Title>Semi-Automatic Recognition of Noun Modifier Relationships</Title> <Section position="4" start_page="96" end_page="97" type="metho"> <SectionTitle> 3 Noun Modifier Relationship Labels </SectionTitle> <Paragraph position="0"> Table 1 lists the noun modifier relationships (NMRs) used by our analyzer. The list is based on similar lists found in the literature on the semantics of noun compounds. It may evolve as experimental evidence suggests changes.</Paragraph> <Paragraph position="1"> For each NMR, we give a paraphrase and example modifier-noun compounds. Following the tradition in the study of noun compound semantics, the paraphrases act as definitions and can be used to check the acceptability of different interpretations of a compound. They also help with interpretation during user interactions (as illustrated in section 6). In the analyzer, awkward paraphrases with adjectives could be improved by replacing adjectives with their WordNet pertainyms (Miller, 1990), giving, for example, &quot;charity benefits from charitable donation&quot; instead of &quot;charitable benefits from charitable donation&quot;.</Paragraph> <Paragraph position="2">
Agent: compound is performed by modifier (e.g. student protest, band concert, military assault)
Beneficiary: modifier benefits from compound (e.g. student price, charitable donation)
Cause: modifier causes compound (e.g. exam anxiety, overdue fine)
Container: modifier contains compound (e.g. printer tray, flood water, film music, story idea)
Content: modifier is contained in compound (e.g. paper tray, eviction notice, oil pan)
Destination: modifier is destination of compound (e.g. game bus, exit route, entrance stairs)
Equative: modifier is also head (e.g. composer arranger, player coach)
Instrument: modifier is used in compound (e.g. electron microscope, diesel engine, laser printer)
Located: modifier is located at compound (e.g. building site, home town, solar system)
Location: modifier is the location of compound (e.g. lab printer, internal combustion, desert storm)
Material: compound is made of modifier (e.g. carbon deposit, gingerbread man, water vapour)
Object: modifier is acted on by compound (e.g. engine repair, horse doctor)
Possessor: modifier has compound (e.g. national debt, student loan, company car)
Product: modifier is a product of compound (e.g. automobile factory, light bulb, colour printer)
Property: compound is modifier (e.g. blue car, big house, fast computer)
Purpose: compound is meant for modifier (e.g. concert hall, soup pot, grinding abrasive)
Result: modifier is a result of compound (e.g. storm cloud, cold virus, death penalty)
Source: modifier is the source of compound (e.g. foreign capital, chest pain, north wind)
Time: modifier is the time of compound (e.g. winter semester, late supper, morning class)
Topic: compound is concerned with modifier (e.g. computer expert, safety standard, horror novel)
</Paragraph> </Section> <Section position="5" start_page="97" end_page="97" type="metho"> <SectionTitle> 4 Noun Modifier Bracketing </SectionTitle> <Paragraph position="0"> Before assigning NMRs, the system must bracket the head noun and the premodifier sequence into modifier-head pairs. Example (2) shows the bracketing for noun phrase (1).</Paragraph> <Paragraph position="1">
(1) dynamic high impedance microphone
(2) (dynamic ((high impedance) microphone))
The bracketing problem for noun-noun-noun compounds has been investigated by Liberman &amp; Sproat (1992), Pustejovsky et al. (1993), Resnik (1993) and Lauer (1995), among others. Since the NMR analyzer must handle premodifier sequences of any length with both nouns and adjectives, it requires more general techniques. Our semi-automatic bracketer (Barker, 1998) allows for any number of adjective or noun premodifiers. After bracketing, each non-atomic element of a bracketed pair is considered a subphrase of the original phrase. 
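As a rough illustration of this step, the sketch below walks a bracketing and lists its subphrases. The nested-tuple representation and the names flatten and subphrases are assumptions made for this example; they are not taken from the bracketer of Barker (1998).

```python
from typing import Union

# Assumed representation: a bracketing is either a single word (str)
# or a (modifier, head) pair whose members are themselves bracketings.
Tree = Union[str, tuple]

def flatten(tree: Tree) -> str:
    """Join the words of a possibly compound element with underscores."""
    if isinstance(tree, str):
        return tree
    modifier, head = tree
    return flatten(modifier) + "_" + flatten(head)

def subphrases(tree: Tree) -> list:
    """Return one (modifier, head) string pair per bracketed pair, innermost first."""
    if isinstance(tree, str):
        return []  # an atomic element contributes no subphrase
    modifier, head = tree
    own = (flatten(modifier), flatten(head))
    return subphrases(modifier) + subphrases(head) + [own]

# Bracketing (2) for "dynamic high impedance microphone"
phrase = ("dynamic", (("high", "impedance"), "microphone"))
for modifier, head in subphrases(phrase):
    print(modifier, head)
```

Run on bracketing (2), the sketch yields one modifier-head pair for each bracketed pair, with compound elements written as underscore-joined units.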
The subphrases for the bracketing in (2) appear in (3), (4) and (5).</Paragraph> <Paragraph position="2">
(3) high impedance
(4) high_impedance microphone
(5) dynamic high_impedance_microphone
Each subphrase consists of a modifier (possibly compound, as in (4)) and a head (possibly compound, as in (5)). The NMR analyzer assigns an NMR to the modifier-head pair that makes up each subphrase.</Paragraph> <Paragraph position="3"> Once an NMR has been assigned, the system must store the assignment to help automate future processing. Instead of memorizing complete noun phrases (or even complete subphrases) and analyses, the system reduces compound modifiers and compound heads to their own local heads and stores these reduced pairs with their assigned NMR. This allows it to analyze different noun phrases that have only reduced pairs in common with previous phrases. For example, (6) and (7) have the reduced pair (8) in common. If (6) has already been analyzed, its analysis can be used to assist in the analysis of (7) (see section 5.1).
(6) (dynamic ((high impedance) microphone))
(7) (dynamic (cardioid (vocal microphone)))
(8) (dynamic microphone)</Paragraph> </Section> <Section position="6" start_page="97" end_page="99" type="metho"> <SectionTitle> 5 Assigning NMRs </SectionTitle> <Paragraph position="0"> Three kinds of construction require NMR assignments: the modifier-head pairs from the bracketed premodifier sequence; postmodifying prepositional phrases; and appositives.</Paragraph> <Paragraph position="1"> These three kinds of input can be generalized to a single form--a triple consisting of modifier, head and marker (M, H, Mk). For premodifiers, Mk is the symbol nil, since no lexical item links the premodifier to the head. For postmodifying prepositional phrases, Mk is the preposition. For appositives, Mk is the symbol appos. 
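The triple form, together with the reduction to local heads from section 4, can be sketched as follows. This is a minimal illustration under stated assumptions: bracketings are nested (modifier, head) tuples, the local head of a compound is taken to be its rightmost word, and the names Triple, local_head and reduced_triples are hypothetical.

```python
from typing import NamedTuple, Union

Tree = Union[str, tuple]  # assumed: a word, or a (modifier, head) pair

class Triple(NamedTuple):
    modifier: str
    head: str
    marker: str  # "nil" for premodifiers, "appos" for appositives, else a preposition

def local_head(tree: Tree) -> str:
    """Reduce a possibly compound element to its local head (assumed rightmost)."""
    if isinstance(tree, str):
        return tree
    return local_head(tree[1])

def reduced_triples(tree: Tree) -> list:
    """One reduced (M, H, nil) triple per bracketed premodifier pair."""
    if isinstance(tree, str):
        return []
    modifier, head = tree
    own = Triple(local_head(modifier), local_head(head), "nil")
    return reduced_triples(modifier) + reduced_triples(head) + [own]

# Bracketings (6) and (7) from section 4
p6 = ("dynamic", (("high", "impedance"), "microphone"))
p7 = ("dynamic", ("cardioid", ("vocal", "microphone")))

shared = set(reduced_triples(p6)).intersection(reduced_triples(p7))
print(shared)
```

Under these assumptions the only reduced pair that (6) and (7) share is (dynamic, microphone), which corresponds to (8).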
The (M, H, Mk) triples for examples (9), (10) and (11) appear in Table 2.</Paragraph> <Paragraph position="2">
(9) monitor cable plug
(10) large piece of chocolate cake
(11) my brother, a friend to all young people
To assign an NMR to a triple (M, H, Mk), the system looks for previous triples whose distance to the current triple is minimal. The NMRs assigned to these similar previous triples make up the lists of candidate NMRs. The analyzer then finds what it considers the best NMR from these lists of candidates to present to the user for approval. Appositives are automatically assigned Equative.</Paragraph> <Section position="1" start_page="98" end_page="98" type="sub_section"> <SectionTitle> 5.1 Distance Between Triples </SectionTitle> <Paragraph position="0"> The distance between two triples is a measure of the degree to which their modifiers, heads and markers match. Table 3 gives the eight different values for distance used by NMR analysis.</Paragraph> <Paragraph position="1"> The analyzer looks for previous triples at the lower distances before attempting to find triples at higher distances. For example, it will try to find identical triples before trying to find triples whose markers do not match.</Paragraph> <Paragraph position="2"> Several things about the distance measures require explanation. First, a preposition is more similar to a nil marker than to a different preposition. Unlike a different preposition, the nil marker is not known to be different from the marker in an overtly marked pair.</Paragraph> <Paragraph position="3"> Next, no evidence suggests that triples with matching M are more similar or less similar than triples with matching H (distances 3 and 6).</Paragraph> <Paragraph position="4"> Triples with a matching prepositional marker (distance 4) are considered more similar than triples with matching M or H only. 
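The ordering described so far can be sketched as a comparison function. This is a hedged reconstruction, not the analyzer's actual code: the text fixes distance 0 (identical triples), distances 3 and 6 (matching M or H), distance 4 (matching preposition) and distance 5 (dictionary lookup); the values 1, 2 and 7 used here are assumptions inferred from the example phrases of Table 3, and the names Triple and distance are hypothetical.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    modifier: str
    head: str
    marker: str  # "nil", "appos", or a preposition

def distance(current: Triple, previous: Triple) -> int:
    """Assumed distance between the current triple and a previous one."""
    same_m = current.modifier == previous.modifier
    same_h = current.head == previous.head
    same_mk = current.marker == previous.marker
    is_prep = current.marker not in ("nil", "appos")
    if same_m and same_h:
        if same_mk:
            return 0  # identical triples
        if is_prep and previous.marker == "nil":
            return 1  # assumed: preposition against nil marker
        return 2      # assumed: M and H match, markers differ
    if (same_m or same_h) and same_mk:
        return 3      # M or H matches, marker matches
    if same_mk and is_prep:
        return 4      # only the preposition matches
    # Distance 5 is not a comparison with a previous triple: it is the
    # lookup of an unseen preposition in the NMR marker dictionary.
    if same_m or same_h:
        return 6      # only M or only H matches
    return 7          # assumed: nothing matches

wall = Triple("garden", "wall", "beside")  # wall beside a garden
pile = Triple("garbage", "pile", "of")     # pile of garbage
print(distance(wall, Triple("garden", "wall", "nil")))   # garden wall
print(distance(pile, Triple("bricks", "house", "of")))   # house of bricks
```

Encoding "wall beside a garden" with garden as M and wall as H is itself an assumption about how postmodifying prepositional phrases map to triples.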
A preposition is an overt indicator of the relationship between M and H (see Quirk, 1985: chapter 9), so a correlation is more likely between the preposition and the NMR than between a given M or H and the NMR.</Paragraph> <Paragraph position="5"> If the current triple has a prepositional marker not seen in any previous triple (distance 5), the system finds candidate NMRs in its NMR marker dictionary. This dictionary was constructed from a list of about 50 common atomic and phrasal prepositions. The various meanings of each preposition were mapped to NMRs by hand. Since the list of prepositions is small, dictionary construction was not a difficult knowledge engineering task (requiring just twenty hours of work by a secondary school student).</Paragraph> </Section> <Section position="2" start_page="98" end_page="99" type="sub_section"> <SectionTitle> 5.2 The Best NMRs </SectionTitle> <Paragraph position="0"> The lists of candidate NMRs consist of all those NMRs previously assigned to (M, H, Mk) triples at a minimum distance from the triple under analysis. If the minimum distance was 3 or 6, there may be two candidate lists: LM contains the NMRs previously assigned to triples with matching M, LH those assigned to triples with matching H. The analyzer attempts to choose a set R of candidates to suggest to the user as the best NMRs for the current triple. If there is one list L of candidate NMRs, R contains the NMR (or NMRs) that occur most frequently in L. For two lists LM and LH, R could be found in several ways. We could take R to contain the most frequent NMRs in LM ∪ LH. This absolute frequency approach has a bias towards NMRs in the larger of the two lists.</Paragraph> <Paragraph position="1"> Alternatively, the system could prefer NMRs with the highest relative frequency in their lists. If there is less variety in the NMRs in LM than in LH, M might be a more consistent indicator of NMR than H. 
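The two selection strategies described so far can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the candidate lists are made-up data, and taking the larger value when an NMR occurs in both lists is an assumption about how relative frequencies would be compared.

```python
from collections import Counter

def most_frequent(candidates: list) -> set:
    """R for a single list L: the NMR(s) occurring most often in L."""
    counts = Counter(candidates)
    if not counts:
        return set()
    top = max(counts.values())
    return {nmr for nmr, n in counts.items() if n == top}

def best_absolute(lm: list, lh: list) -> set:
    """Absolute frequency: pool both lists and count occurrences."""
    return most_frequent(lm + lh)

def best_relative(lm: list, lh: list) -> set:
    """Relative frequency: score each NMR by its share of its own list."""
    scores = {}
    for candidates in (lm, lh):
        for nmr, n in Counter(candidates).items():
            share = n / len(candidates)
            # Assumption: an NMR present in both lists keeps its larger share.
            scores[nmr] = max(scores.get(nmr, 0.0), share)
    if not scores:
        return set()
    top = max(scores.values())
    return {nmr for nmr, s in scores.items() if s == top}

# Hypothetical candidate lists: a short, uniform LM and a longer, varied LH.
lm = ["Location"] * 3
lh = ["Topic"] * 5 + ["Product"] * 2 + ["Agent"] * 2

print(best_absolute(lm, lh))
print(best_relative(lm, lh))
```

With these made-up lists the pooled count favours the most common NMR of the longer list, while the per-list share favours the NMR that fills the shorter list, which is the contrast the two approaches are meant to show.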
Consider example (12).</Paragraph> <Paragraph position="2">
(12) front line
Compounds with the modifier front may always have been assigned Location. Compounds with the head line may have been assigned many different NMRs. If line has been seen as a head more often than front as a modifier, one of the NMRs assigned to line may have the highest absolute frequency in LM ∪ LH. But if Location has the highest relative frequency, this method correctly assigns Location to (12). There is a potential bias, however, for smaller lists (a single NMR in a list always has the highest relative frequency).</Paragraph> <Paragraph position="4">
Table 3 examples (dist: current triple / previous triple)
0: wall beside a garden / wall beside a garden
1: wall beside a garden / garden wall
2: wall beside a garden / wall around a garden
3: pile of garbage / pile of sweaters
4: pile of garbage / house of bricks
5: ice in the cup / nmrm(in, [ctn, inst, loc, src, time])
6: wall beside a garden / garden fence
7: wall beside a garden / pile of garbage
</Paragraph> <Paragraph position="5"> To avoid these biases, we could combine absolute and relative frequencies. Each NMR i is assigned a score s_i calculated as:</Paragraph> <Paragraph position="7"> This combined formula was used in the experiment described in section 7.</Paragraph> </Section> <Section position="3" start_page="99" end_page="99" type="sub_section"> <SectionTitle> 5.3 Premodifiers as Classifiers </SectionTitle> <Paragraph position="0"> Since NMR analysis deals with endocentric compounds, we can recover a taxonomic relationship from triples with a nil marker. Consider example (13) and its reduced pairs in (14):
(13) ((laser printer) stand)
The NMR analyzer is intended to start processing from scratch. A session begins with no previous triples to match against the triple at hand. 
To compensate for the lack of previous analyses, the system relies on the help of a user, who supplies the correct NMR when the system cannot determine it automatically.</Paragraph> <Paragraph position="1"> In order to supply the correct NMR, or even to determine if the suggested NMR is correct, the user must be familiar with the NMR definitions. To minimize the burden of this requirement, all interactions use the modifier and head of the current phrase in the paraphrases from section 3. Furthermore, if the appropriate NMR is not among those suggested by the system, the user can request the complete list of paraphrases with the current modifier and head.</Paragraph> </Section> <Section position="4" start_page="99" end_page="99" type="sub_section"> <SectionTitle> 6.1 An Example </SectionTitle> <Paragraph position="0"> Figure 1 shows the interaction for phrases (15)-(18). The system starts with no previously analyzed phrases. The NMR marker dictionary maps the preposition of to twelve NMRs: Agent, Cause, Content, Equative, Located, Material, Object, Possessor, Property, Result, Source, Topic.</Paragraph> <Paragraph position="1">
(15) small gasoline engine
(16) the repair of diesel engines
(17) diesel engine repair shop
(18) an auto repair center
User input is shown in bold and underlined. At any prompt the user may type 'list' to view the complete list of NMR paraphrases for the current modifier and head.</Paragraph> </Section> </Section> </Paper>