MCDONNELL DOUGLAS ELECTRONIC SYSTEMS COMPANY :
Description of the INLET System Used for MUC- 3
David de Hilster and Amnon Meyers
Advanced Computing Technologies La b
McDonnell Douglas Electronics Systems Compan y
1801 East Saint Andrew Place
Santa Ana, California 92705-652 0
e-mail : vox@young .mdc.com
phone : (714)566-5956
During the past nine months, our language processing effort has focused on implementin g
a new NLP system called INLET. INLET relies heavily on the methodology developed for th e
preceding VOX system . The goal of this work is to produce an NLP shell that can be customized to
a variety of tasks and domains .
INLET provides user-friendly graphics-oriented tools for knowledge addition an d
perusal, to support customization to a variety of domains and tasks . INLET is implemented in C
on Sun SPARCstations, in order to support faster analysis than Lisp-based systems and t o
provide a more widely usable end-product .
At the present time, the basic knowledge addition system is in place, includin g
vocabulary addition interfaces (e .g ., Figure 1), a dictionary book tool (Figure 2), a hierarch y
editor (Figure 3), and a grammar rule editor (Figure 4) . A new Conceptual Grammar has been
implemented for INLET, supporting concept hierarchies, a 'conceptual' dictionary, and othe r
knowledge types (such as lexical interrelationships) .
A conceptual analyzer, called the skimmer (Figure 5), has been implemented to provid e
robust top-down language analysis capabilities. The skimmer will augment a bottom-u p
analyzer currently under construction . A skimmer trace is shown in Figure 6 .
DESCRIPTION OF THE SKIMME R
The skimmer, implemented in 2 months time, is the only operational component of the
INLET analyzer (called LASS, for Language Analysis Support System) . Six major processes
comprise the skimmer, and are described in turn . The first process performs a pass throug h
the input message to replace all fixed phrases with their atomic representation . Fo r
example, the literal phrase "farabundo marti national liberation front" is replaced by a concep t
that represents that phrase . The second process, called the pre-processor, locates and parse s
specialized constructs in the text, such as proper names, number words, temporal and locative
phrases, as well as some domain-specific phrases dealing with perpetrators, physical, and
human targets . Phrases like "thirty-three", "Ricardo Alfonso Castellar", "three months ago" ,
"San Miguel department", and "member of the FMLN", are all detected and parsed by the pre -
processor. Next, a pass through the text detects key words and phrases . Words like
"attack", "murder", and "bomb", as well as phrases like "set fire to object", "bomb damage d
object" are located in the text . Incidents found in this pass are merged or segmented by th e
grouping process. Groups of sentences associated with a single incident are then examined by
the actor/object separation process to determine perpetrators and targets . The subsequent
178
slot filling process further examines groups of sentences to fill remaining slots of the MUC 3
template. In order to assure self-consistency in a single template, a semantic trimming and
rejection process trims the templates prior to output . This last process is ad hoc, bu t
substitutes for our current lack of script-based processing .
SAMPLE TEMPLATE
0. MESSAGE ID
	
TST2-MUC3-0048
1. TEMPLATE ID
	
1
2. DATE OF INCIDENT
	
- 19 APR 89
3. TYPE OF INCIDENT
	
MURDER
4. CATEGORY OF INCIDENT
	
TERRORIST ACT
5. PERPETRATOR : ID OF INDIV(S)
	
"GUERRILLAS"
6. PERPETRATOR : ID OF ORG(S)
	
-
7. PERPETRATOR: CONFIDENCE
	
-
8. PHYSICAL TARGET: ID(S)
9. PHYSICAL TARGET: TOTAL NUM
	
'
10. PHYSICAL TARGET: TYPE(S)
	
"
11. HUMAN TARGET: ID(S)
	
"ROBERTO GARCIA ALVARADO" ("ATTORNEY GENERAL" )
12. HUMAN TARGET : TOTAL NUM
	
1
13. HUMAN TARGET : TYPE(S)
	
CIVILIAN: "ROBERTO GARCIA ALVARADO"
14. TARGET : FOREIGN NATION(S)
15. INSTRUMENT: TYPE(S)
16. LOCATION OF INCIDENT
	
EL SALVADOR: SAN SALVADOR (CITY)
17. EFFECT ON PHYSICAL TARGET(S)
18. EFFECT ON HUMAN TARGET(S) "
A sample of the system's output for message 48 from TST2 is shown above . The system
did fairly well, successfully applying an apposition rule in slot 11, for example . However,
even though it found the appositive, it incorrectly assigned civilian type to the attorney general ,
due to an undiagnosed bug . The system often sloughs adjectives, e .g ., condensing "urban
guerrillas" to "guerrillas" in the perpetrator id slot . Even though the first sentence of messag e
48 says ". . .accused the farabundo marti national liberation front (fmln) of the crime", th e
skimmer failed to fill the organization slot. In this case, failure was due to lack of patterns fo r
accusations. Failure to find the instrument in ".. .was killed when a bomb placed by urba n
guerrillas on his vehicle exploded" is due to the absence of a pattern like "bomb placed by
actor". Rather, the system knew the pattern "bomb BE placed by actor" where the verb 'to be' i s
not optional.
179
DONAIII : (none )SER : (none)
(Qo i t. )
Ixurratml a
ANALYZ 10 : .^. off
	
Children : O off
	
(Add Vocabulary)
	
recalculate Graph )
VOCABULARY ADDITION
	
(Next Word) (Cancel )
concept Word
Root
	
: guerrill a
Added :
CONJUGATION S
Presen t
3rd person :
Progressive :
Pas t
Participle :
Singular
	
: guerrill a
Plural
	
: guerrillas
EXTEND
	
SYNTACTIC FEATURE S
O adj
	
0 aux
	
O det
	
noun
	
0 pro
	
0 verb
O adv
	
O c :onj
	
O modal O prep
	
0 quan
	
O wh
SEMANTIC FEATURES
Feature : terrorist/person/actor/object
Figure 1 : Vocabulary Addition Too l
SER : (none)
	
DONAIII : (none )
(KBEditor)
	
(Add Vocabulary)
	
(Dictionary)
	
(tent	 ool
	
(Options(
	
(Sun(I' :)
	
(Quit )
10
(Entire Graph)
ANALYZ ID: eoff
	
Children : Oaf
	
(AddVocabulary)
	
SFr.' ..T
abstract
—action
organization
—cici l ion
—.name
~ actor nationality
military
—assassin
—person
	
..coemando
- extremist—kil
le a
.—eurderr
...narcotrrr1at
tarrorisf	 rebel
...robber
• subversive
.. guerrilla
	
attr{
	
terrorist
nouns— guerrilla
~ terrorist...thief
...attribute
location
~ misc
--scalar
— Spanish
- thing
Figure 3 : Hierarchy Edito r
INLET:
	
Interactive Natural Language Engineering Too l
USER :
	
(none)
	
DOIIA111:
	
(none )
KB Ed Knowledge Base Graph Tool ® O O O (] ® Find : ob j
c(} .'
	
,] (Prune)
	
(Prune All) (Grow)
	
(Grow Al)) (Options)
	
(Done)(Entire Graph)
	
(Redisplay)
("off (Recalculate Graph)ANALYZ
	
ID:
	
C off
	
Children: (Add Vocabulary) L. .R»:cl
last-unit
last-weekday
RULE EDITOR: professional (1 of 1)
(Redisplay) (Insert) (Delete) (Delete Rule) (Salient) O 0 (Cancel) on
Node(s):
Rule modes:
	
.^. atom
Suggested concept:
	
professional
profession general —stem-*civilian
	
of
	
det
	
quan adJ
	
country-adj
	
adj
	
abstractnumber
~iif3~
_a—(
anything
n reader
actor
HOUSE BU11011S
-.symbol
Figure 4 :
	
Rule Edito r
Editor )- (L : Open Knowledge use
object
EXTEND
18 1
INLET: Interactive Natural Language Engineering Tool
SER : (none)
(KB editor) (Dictionary)
	
(Text loot)
	
(Options) (Qui( )(So:Os)
DONA111: (none)
(Add Vocabulary
ANALYZER>
Text file: noscl
Stat File: reject.tt
User File: output.tt
(Save) oa (Done)
Sort : 0
	
Off
Trace: 00ff
(Frequency) (Context) (Next Unknown) MAIM (Skim Step) (Skim) (Print)
WORD:
Msg : DEV-MUC3-0001 (NOSC)
Left: 3
	
Right: 3
	
Within : 0
	
O
Processed :
	
Matches:
EXTENDER>
MEANWHILE, THE 3D INFANTRY BRIGADE REPORTED THAT PONCE BATTALION UNITS
FOUND THE DECOMPOSED BODY OF A SUBVERSIVE IN LA FINCA HILL, SAN MIGUEL .
	
AN
M-16 RIFLE, FIVE GRENADES, AID MATERIAL FOR THE PRODUCTION OF EXPLOSIVES WERE
FOUND IN THE SAME PLACE.
	
THE BRIGADE, WHICH IS HEADQUARTERED IN SAN MIGUEL,
ADDED THAT THE SEIZURE VAS MADE YESTERDAY MORNING .
NATIONAL GUARD UNITS GUARDING THE LAS CANAS BRIDGE, WHICH IS ON THE
NORTHERN TRUNK HIGHWAY IN APOPA, THIS MORNING REPELLED A TERRORIST ATTACK
THAT RESULTED IN NO CASUALTIES.
	
THE ARMED CLASH INVOLVED MORTAR AND RIFLE
FIRE AND LASTED 30 MINUTES.
	
MEMBERS OF THAT SECURITY GROUP ARE CUMBING THE
0. MESSAGE ID DEV-MUC3-0001 (
	
SC)
1. TEMPLATE ID 1
2. DATE OF INCIDENT 30 DEC 89
3. TYPE OF INCIDENT KIDNAPPING
4. CATEGORY OF INCIDENT TERRORIST ACT
5. PERPETRATOR: ID OF INDIV(S) "TERRORISTS'
6. PERPETRATOR: ID OF ORG(::) "FARABUNDO WARTI NATIONAL LIBERATION FRONT"
7.
OUT"
PERPETRATOR : CONFIDENCE REPORTED AS FACT: "FARABUNDO MARTI NATIONAL LIBERATION FR
8. PHYSICAL TARGET : ID(S)
9. PHYSICAL TARGET : TOTAL NUM
10.PHYSICAL TARGET : TYPE(S)
11. HUMAN TARGET: ID(S)
	
`PEASANTS'
12. HUMAN TARGET: TOTAL NUM
	
PLURAL
13. HUMAN TARGET: TYPE(S)
	
CIVILIAN: "PEASANTS"
14. TARGET: FOREIGN NATION(S)
	
-
15. INSTRUMENT: TYPE(S)
16. LOCATION OF INCIDENT
	
EL SALVADOR: SAN MIGUEL (DEPARTMENT)
17. EFFECT ON PHYSICAL TARGET(S)
18. EFFECT ON HUMAN TARGET(S)
	
-Figure 5: Skimmer --	 -
,
1: object 1-13 _about_nunber_peaeants_of_verious_ages_
2: modal
3: have 14 have
4:
	
be 15-I6 been
5: adv
6: ven 17-18 _kidnapped
7: clause
8: by 19-20 _by
9: actor 21-38 _per~petrator_of_the_perpetrator_[perpetrator]_irLlocatlon ._
Phrase length = 3.
Non•Mlds = 2.
Pattern
P:
= nunber
1-3 _about_
1: nunber 4 nunber
2: adj
3 country-adj Figure 6: Skimmer Trace ,
INLET: Interactive Natural Language Engineering Tool
(Frequency) (Context) (Next Unknown =LIM (Ski. Step) (Skim) (Print)
WORD:
Msg: DEV-MUC3-0001 (NOSC)
Left: 3 Right : 3 Within : 0 O Oy
Processed:
	
Matches:
OEV-MUC3-0001 (NOSC )
SAN SALVADOR, 3 JAN 90 -- [REPORT] [ARMED FORCES PRESS CO44ITTEE,
COPREFA] [TEXT] THE ARCE BATTALION COMIANO HAS REPORTED THAT ABOUT 50
PEASANTS OF VARIOUS AGES HAVE BEEN KIDNAPPED BY TERRORISTS OF THE
FARABUNDO MARTI NATIONAL LIBERATION FRONT [FMLN) IN SAN MIGUEL
DEPARTMENT . ACCORDING TO THAT GARRISON, THE MASS KIDNAPPING TOOK PLACE ON
30 DECEMBER IN SAN LUIS DE LA REINA. THE SOURCE ADDED THAT THE TERRORISTS
FORCED THE INDIVIDUALS, VHO 'ERE TAKEN TO AN UNKNOWN LOCATION, OUT OF
THEIR RESIDENCES, PRESUMABLY TO INCORPORATE THEM AGAINST THEIR WILL INTO
CLANDESTINE GROUPS.
Phrase length .-36f.
Nonoilds
	
= 4.
Pattern
	
= passive .
EXTENDER>
SER: (none)
(KB Editor)
DOIIAIII: (none)
(Add Vocabular )
	
(Dictionary)
	
(Text	 fool)
	
(Options)
	
(SuoOSJ (Quit)
e:.:	 idol
Text file : noscl
Stat File : reject.tt
User File : output.tt
ANALYZER)
(Save)
Sort : 0
	
Off
Trace: ^v Ora
(Done)(Load)
18 2
