Language Model and Sentence Structure Manipulations for 
Natural Language Application Systems 
Zenshiro Kawasaki, Keiji Takida, and Masato Tajima 
Department of Intellectual Information Systems 
Toyama University 
3190 Gofuku, Toyama 930-0887, Japan 
{kawasaki, takida, tajima}@ecs.toyama-u.ac.jp
Abstract 
This paper presents a language model and its
application to sentence structure manipulations
for various natural language applications,
including human-computer communication. Building
a working natural language dialog system requires
the integration of solutions to many of the
important subproblems of natural language
processing. In solving any of these subproblems,
the handling of natural language expressions
plays a central role; natural language
manipulation facilities are indispensable for any
natural language dialog system. The Concept
Compound Manipulation Language (CCML) proposed in
this paper is intended to provide a practical
means of manipulating sentences through formal,
uniform operations.
1 Introduction 
Sentence structure manipulation facilities such as 
transformation, substitution, translation, etc., are
indispensable for developing and maintaining natu- 
ral language application systems in which language 
structure operation plays an essential role. For this 
reason structural manipulability is one of the most 
important factors to be considered for designing a 
sentence structure representation scheme, i.e., a lan- 
guage model. The situation can be compared to 
database management systems; each system is based 
on a specific data model, and a data manipulation 
sublanguage designed for the data model is provided 
to handle the data structure (Date, 1990). 
In the Concept Coupling Model (CCM) proposed in
this paper, the primitive building block is a
Concept Frame (CF), which is defined for each phrasal
or sentential conceptual unit. Sentence analysis
is carried out as a CF instantiation process, in which
several CFs are combined to form a Concept
Compound (CC), a nested relational structure in which
the syntactic and semantic properties of the sen- 
tence are encoded. The simplicity and uniformity of 
the CC representation format lead to a correspond- 
ing simplicity and uniformity in the CC structure 
operation scheme, i.e., CC Manipulation Language 
(CCML). 
Another advantage of the CCM formalism is that 
it allows inferential facilities to provide flexible hu- 
man computer interactions for various natural lan- 
guage applications. For this purpose conceptual re- 
lationships including synonymous and implicational 
relations established among CFs are employed. Such 
knowledge-based operations are under development
and will not be discussed in this paper. 
In Section 2 we present the basic components of 
CCM, i.e., the concept frame and the concept
compound. Section 3 introduces the CC manipulation
language; the major features of each manipulation 
statement are explained with illustrative examples. 
Concluding observations are drawn in Section 4.
2 Concept Coupling Model 
2.1 Concept Compound and Concept 
Frame 
It is assumed that each linguistic expression such 
as a sentence or a phrase is mapped onto an ab- 
stract data structure called a concept compound 
(CC) which encodes the syntactic and semantic in- 
formation corresponding to the linguistic expression 
in question. The CC is realized as an instance of a 
data structure called the concept frame (CF) which 
is defined for each conceptual unit, such as an entity, 
a property, a relation, or a proposition, and serves 
as a template for permissible CC structures. CFs 
are distinguished from one another by the syntactic
and semantic properties of the concepts they represent,
and are assigned unique identifiers. CFs are
classified according to their syntactic categories as 
sentential, nominal, adjectival, and adverbial. The 
CCM lexicon is a set of CFs; each entry of the lex- 
icon defines a CF. It should be noted that in this 
paper inflectional information attached to each CF 
definition is left out for simplicity. 
Kawasaki, Takida and Tajima 281 Language Model and Sentence Structure Manipulations 
Zenshiro Kawasaki, Keiji Takida and Masato Tajima (1998) Language Model and Sentence Structure Manipulations for 
Natural Language Applications Systems. In D.M.W. Powers (ed.) NeMLaP3/CoNLL98 Workshop on Human Computer
Conversation, ACL, pp 281-286. 
2.2 Syntax 
In this section we define the syntax of the formal 
description scheme for the CF and the CC, and ex- 
plain how it is interpreted. A CF is comprised of 
four types of tokens. The first is the concept identi- 
fier which is used to indicate the relation name of the 
CF structure. The second token is the key-phrase, 
which establishes the links between the CF and the 
actual linguistic expressions. The third is a list of 
attribute values which characterize the syntactic and 
semantic properties of the CF. Control codes for the 
CCM processing system may also be included in the 
list. The last token is the concept pattern which is 
a syntactic template to be matched to linguistic ex- 
pressions. The overall structure of the CF is defined 
as follows: 
(1) C(K, A, P),
where C and K are the concept identifier and the 
key-phrase respectively, A represents a list of at- 
tribute values of the concept, and P is the concept 
pattern which is a sequence of several terms: vari- 
ables, constants, and the symbol * which represents 
the key-phrase itself or one of its derivative expres- 
sions. The constant term is a word string. The vari- 
able term is accompanied by a set of codes which 
represent the syntactic and semantic properties im- 
posed on a CF to be substituted for it. These codes, 
each of which is preceded by the symbol +, are classi- 
fied into three categories: (a) constraints, (b) roles, 
and (c) instruction codes to be used by the CCM 
processing system. No reference is made to the se- 
quencing of these codes, i.e., the code names are 
uniquely defined in the whole CCM code system. 
The CF associated with the word break in the 
sense meant by John broke the box yesterday is 
shown in (2): 
(2) break010('break', [sent, dyn, base],
'$1(+nomphrs +hum +subj +agent) *
$2(+nomphrs +cncrt +obj +patnt)').
In this example the identifier and the key-phrase are
break010 and break respectively. The attribute list
indicates that the syntactic category of this CF is
sent(ential) and the semantic feature is dyn(amic).
The attribute base is a control code for the CCM
processing system, which will not be discussed
further in this paper. The concept pattern of this CF
corresponds to a subcategorization frame of the verb
break. Besides the symbol * which represents the
key-phrase break or one of its derivatives, the
pattern includes two variable terms ($1 and $2), which
are called the immediate constituents of the
concept break010. The attributes appended to these
variables impose conditions on the CFs substituted
for them. For example, the first variable should be
matched to a CF which is a nom(inal-)phr(a)s(e)
with the semantic feature hum(an), and the syntactic
role subj(ect) and the semantic role agent are to
be assigned to the instance of this variable.
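The CF format C(K, A, P) can be illustrated with a small data structure. The following is a minimal Python sketch, not the authors' Prolog implementation; the class names ConceptFrame and PatternTerm and the function parse_pattern are hypothetical and introduced only for illustration. It splits a concept pattern into variable terms with their codes, the symbol *, and constant words.

```python
import re
from dataclasses import dataclass, field

@dataclass
class PatternTerm:
    kind: str                                   # "variable", "key" (the * symbol) or "constant"
    text: str                                   # "$1", "*", or the constant word
    codes: list = field(default_factory=list)   # codes attached to a variable term

@dataclass
class ConceptFrame:
    identifier: str    # C
    key_phrase: str    # K
    attributes: list   # A
    pattern: list      # P, as a list of PatternTerm

# Alternatives: a variable such as $1(+nomphrs +hum), the symbol *, or a constant word.
_TERM = re.compile(r"\$(\d+)\(([^)]*)\)|(\*)|([^\s$*()]+)")

def parse_pattern(pattern):
    """Split a concept pattern string into its terms (hypothetical helper)."""
    terms = []
    for m in _TERM.finditer(pattern):
        if m.group(1):
            codes = [c.strip() for c in m.group(2).split("+") if c.strip()]
            terms.append(PatternTerm("variable", "$" + m.group(1), codes))
        elif m.group(3):
            terms.append(PatternTerm("key", "*"))
        else:
            terms.append(PatternTerm("constant", m.group(4)))
    return terms

# The CF of example (2).
break010 = ConceptFrame(
    "break010", "break", ["sent", "dyn", "base"],
    parse_pattern("$1(+nomphrs +hum +subj +agent) * $2(+nomphrs +cncrt +obj +patnt)"),
)
```

Because the code names are unique across the CCM code system, the sketch keeps all codes of a variable in one flat list and does not separate constraints, roles, and instruction codes.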
The CC is an instantiated CF and is defined as shown in (3):
(3) C(H, R, A, [X1, X2, ..., Xn, M]),
where the concept identifier C indicates the root
node of the CC and represents the whole CC
structure (3), and H, R, and A are the head, role,
and attribute slots respectively. The head slot H is
occupied by the identifier of C's head, i.e., C
itself or the identifier of the component of C which
determines the essential properties of C. The
role slot R, which is absent in the corresponding CF
definition, is filled in by a list of syntactic and
semantic role names which are assigned to C
in the concept coupling process described in
Section 2.3. The last slot represents C's structure,
an instance of the concept pattern P of the
corresponding CF, and is occupied by the constituent list.
The members of the list, X1, X2, ..., Xn, are the
CCs corresponding to the immediate constituents
of C. The tail element M of the constituent list,
which is absent in non-sentential CCs, has the form
md_(H, R, A, [M1, ..., Mm]), where M1, ..., Mm
represent CCs which are associated with optional
adverbial modifiers.
By way of example, the concept structure
corresponding to the sentence in (4a) is shown in (4b),
which is an instance of the CF in (2).
(4a) John broke the box yesterday.
(4b) break010(break010, [],
  [sent, dyn, fntcls, past, agr3s],
  [john0060(john0060, [subj, agent],
     [nomphrs, prop, hum, agr3s, mascln], []),
   box00010(box00010, [obj, patnt],
     [the_, cncrt, nomphrs, agr3s], []),
   md_([yeste010], [modyr], [advphrs, mo...],
     [yeste010(yeste010, [],
        [advphrs, timeAdv], [])])]).
In (4b) three additional attributes, i.e.,
f(i)n(i)t(e-)cl(au)s(e), past, and
agr(eement-)3(rd-person-)s(ingular), which are absent
in the CF definition, enter the attribute list of
break010. Also note that the constituent list of
break010 contains an optional modifier component with
the identifier md_, which does not have its
counterpart in the corresponding CF definition given
in (2).
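The nested structure of (4b) can be mirrored by a recursive record. The sketch below is a Python illustration only, assuming a hypothetical class CC and helper identifiers; role and attribute names are copied as printed in (4b), and the second attribute of md_, which is illegible in the source, is left out.

```python
from dataclasses import dataclass, field

@dataclass
class CC:
    """One concept compound C(H, R, A, [X1, ..., Xn, M])."""
    ident: str       # C: identifier of the root node
    head: object     # H: identifier of the head (a list for md_, as printed)
    roles: list      # R: syntactic and semantic roles
    attrs: list      # A: attribute values
    structure: list = field(default_factory=list)   # constituent CCs

def identifiers(cc):
    """Yield the identifier of cc and of every nested CC, depth first."""
    yield cc.ident
    for child in cc.structure:
        yield from identifiers(child)

yesterday = CC("yeste010", "yeste010", [], ["advphrs", "timeAdv"])
cc_4b = CC(
    "break010", "break010", [], ["sent", "dyn", "fntcls", "past", "agr3s"],
    [
        CC("john0060", "john0060", ["subj", "agent"],
           ["nomphrs", "prop", "hum", "agr3s", "mascln"]),
        CC("box00010", "box00010", ["obj", "patnt"],
           ["the_", "cncrt", "nomphrs", "agr3s"]),
        # Optional modifier component; "modyr" is the role name as printed.
        CC("md_", ["yeste010"], ["modyr"], ["advphrs"], [yesterday]),
    ],
)

print(list(identifiers(cc_4b)))
# ['break010', 'john0060', 'box00010', 'md_', 'yeste010']
```

Every node has the same four slots, which is the uniformity that the manipulation operations of Section 3 rely on.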
2.3 Concept Coupling 
As sketched in the last section, each linguistic ex- 
pression such as a sentence or a phrase is mapped 
onto an instance of a CF. For example, the sentence 
(4a) is mapped onto the CC given in (4b) which is 
an instance of the sentential CF defined in (2). In 
this instantiation process, three other CFs given in 
(5) are identified and coupled with the CF in (2) to 
generate the compound given in (4b). 
(5a) john0060('john',
[nomphrs, prop, hum, agr3s, mascln], '*').
(5b) box00010('box', [nomphrs, cncrt, base_n], '*').
(5c) yeste010('yesterday', [advphrs, timeAdv], '*').
All three CFs in (5) are primitive CFs, i.e., their
concept patterns do not contain variable terms and
their instances constitute ultimate constituents in
the CC structure. For example, (5b) defines a CF
corresponding to the word box. The identifier and
the key-phrase are box00010 and box respectively.
The attribute list indicates that the syntactic
category, the semantic feature, and the control attribute
are nom(inal-)phr(a)s(e), c(o)ncr(e)t(e), and
base(-)n(oun) respectively. The concept pattern consists
of the symbol *, for which box or boxes is to be
substituted.
In the current implementation of concept cou- 
pling, a Definite Clause Grammar (DCG) (Pereira 
and Warren, 1980) rule generator has been devel- 
oped. The generator converts the run-time dictio- 
nary entries, which are retrieved from the base dic- 
tionary as the relevant CFs for the input sentence 
analysis, to the corresponding DCG rules. We shall 
not, however, go into details here about the algo- 
rithm of this rule generation process. The input 
sentence is then analyzed using the generated DCG 
rules, and finally the source sentence structure is ob- 
tained as a CC, i.e., an instantiated CF. In this way 
the sentence analysis can be regarded as a process of 
identifying and combining the CFs which frame the 
source sentence. 
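The core check performed when one CF is substituted for a variable of another can be sketched as follows. This is a simplified Python illustration, not the DCG-based procedure described above: the function couple is hypothetical, and the set ROLE_CODES is an assumption made here to separate role codes from constraint codes (the source says only that code names are unique across the CCM code system).

```python
# Assumption: these codes are roles; every other code is treated as a constraint.
ROLE_CODES = {"subj", "obj", "agent", "patnt"}

def couple(variable_codes, filler_attributes):
    """Return the roles assigned to the filler, or None if a constraint fails."""
    constraints = [c for c in variable_codes if c not in ROLE_CODES]
    if not all(c in filler_attributes for c in constraints):
        return None
    return [c for c in variable_codes if c in ROLE_CODES]

# Codes of $1 and $2 in the CF (2), and attribute lists from (5a) and (5b).
var1 = ["nomphrs", "hum", "subj", "agent"]
var2 = ["nomphrs", "cncrt", "obj", "patnt"]
john_attrs = ["nomphrs", "prop", "hum", "agr3s", "mascln"]
box_attrs = ["nomphrs", "cncrt", "base_n"]

print(couple(var1, john_attrs))   # ['subj', 'agent']
print(couple(var1, box_attrs))    # None, because box lacks the feature hum
print(couple(var2, box_attrs))    # ['obj', 'patnt']
```

The returned role list corresponds to what fills the role slot R of the resulting constituent CC.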
3 Concept Compound 
Manipulations 
The significance of the CC representation format is
its simplicity and uniformity; the relational
structure has a fixed argument configuration, and
every constituent of the structure has the same data
structure. Sentence-to-CC conversion corresponds
to sentence analysis, and the obtained CC encodes
the syntactic and semantic information of the
sentence; the CC representation can thus be used as a
model for sentence analysis. Since a CC, together with
the relevant CFs, contains sufficient information to
generate a sentence that is syntactically and
semantically equivalent to the original, the CC
representation can also be employed as a model for
sentence generation. In this way, the CC
representation can be used as a language model for both
sentence analysis and generation.
Another important feature of the CC representa- 
tion is that structural transformation relations can 
easily be established between CCs with different 
syntactic and semantic properties in tense, voice, 
modality, and so forth. Accordingly, if a conve- 
nient CC structure manipulation tool is available, 
sentence-to-sentence transformations can be realized 
through CC-to-CC transformations. The simplicity 
and uniformity of the CC data structure allow us to
devise such a tool. We call the tool Concept Com- 
pound Manipulation Language (CCML). 
Suppose a set of sentences are collected for a spe- 
cific natural language application such as second lan- 
guage learning or human computer communication. 
The sentences are first transformed into the corre- 
sponding CCs and stored in a CC-base, a file of 
stored CCs. The CC-base is then made available
for retrieval and update operations.
The CCML operations are classified into three
categories: (a) sentence-CC conversion operations, (b)
CC internal structure operations, and (c) CC-base
operations. The sentence-CC conversion operations
consist of two operators: the sentence-to-CC
conversion, which invokes the sentence analysis program
and parses the input sentence to obtain the
corresponding CC as the output, and the CC-to-sentence
conversion, which generates a sentence corresponding
to the indicated CC. The CC internal structure
operations modify values in a specific slot of a CC
and transform a CC into its derivative CC structures.
The CC-base operations include creating and
destroying CC-bases, and retrieving and updating
entries in a CC-base. These facilities are currently
implemented in a Prolog environment, in which the
operations are provided as Prolog predicates.
In the following sections, the operations men- 
tioned above are explained in terms of their effects 
on CCs and CC-bases, and are illustrated by means 
of a series of examples. All examples will be based 
on a small collection of sample sentences shown in 
(7), which are assumed to be stored in a file named 
sophie.text. 
(7a) Sophie opened the big envelope apprehensively. 
(7b) Hilde began to describe her plan.
(7c) Sophie saw that the philosopher was right. 
(7d) A thought had suddenly struck her. 
3.1 Sentence-CC Conversions 
Two operations, $get_cc and $get_sent, are provided 
to inter-convert sentences and CCs. 
$get_cc, $get_sent
The conversion of a sentence to its CC can be re- 
alized by the operation $get_cc as a process of con- 
cept coupling described in Section 2.3. The reverse 
process, i.e., CC-to-sentence conversion, is carried 
out by the operation $get_sent, which invokes the 
sentence generator to transform the CC to a corre- 
sponding linguistic expression. The formats of these 
operations are: 
(8) $get_cc(Sent, CC).
(9) $get_sent(CC, Sent).
The arguments Sent and CC represent a sentence 
or a list of sentences, and a CC or a list of CCs, 
respectively. For the $get_cc operation, the input 
sentence (list) occupies the Sent position and CC 
is a variable in which the resultant CC (list) is ob- 
tained. For the $get_sent operation, the roles of the 
arguments are reversed, i.e., CC is taken up by an 
input CC (list) and Sent an output sentence (list). 
Example: 
(10a) Get CC for the sentence Sophie opened the big 
envelope apprehensively. 
The query (10a) is translated into a CCML state- 
ment as: 
(10b) $get_cc('Sophie opened the big envelope 
apprehensively', CC). 
Note that the input sentence must be enclosed in
single quotes. The CC of the input sentence is obtained
in the second argument CC, as shown in (10c):
(10c) CC = open0010(open0010, [],
  [sent, fntcls, past, agr3s],
  [sophi010(sophi010, [subj],
     [nomphrs, prop, hum, agr3s, femnn], []),
   big00020(envel001, [obj],
     [the_, det_modfd, adj_mod, cncrt,
      nomphrs, agr3s],
     [envel001(envel001, [],
        [cncrt, nomphrs, agr3s, f_n], [])]),
   md_([appre010], [modyr], [advphrs, ...],
     [appre010(appre010, [], [advphrs], [])])]).
3.2 CC Internal Structure Operations 
Since the CC is designed to represent an abstract 
sentence structure in a uniform format, well-defined 
attributive and structural correspondences can be 
established between CCs of syntactically and seman- 
tically related sentences. Transformations between 
these derivative expressions can therefore be realized 
by modifying relevant portions of the CC in ques- 
tion. 
For manipulating the CC's internal structure,
CCML provides four basic operations ($add, $delete,
$substitute, $restructure) and one comprehensive
operation ($transform).
$add 
This operation is used to add values to a slot. The
format is:
(11) $add(CC, Slot, ValueList, CCNew).
For the CC given in the first argument CC, the
elements in ValueList are appended to the slot
indicated by the second argument Slot to get the
modified CC in the last argument CCNew.
Example:
(12a) For the CC given in (10c), add the value
perf(e)ct to the slot attribute.
(12b) $add(CC, attribute, [perfct], CCNew).
In (12b) the first argument CC is occupied by the
CC shown in (10c). The last argument CCNew is
then instantiated as the CC corresponding to the
sentence Sophie had opened the envelope
apprehensively. Note that imperf(e)ct is a default attribute
value assumed in (10c).
$delete 
In contrast to $add, this operation removes the
indicated values from the specified slot. The format
is:
(13) $delete(CC, Slot, ValueList, CCNew).
$substitute 
This operation is used to replace a value in a slot 
with another value. The format is: 
(14) $substitute(CC, Slot, OldValue, NewValue,
CCNew).
Example:
(15a) For the CC in (10c), replace the attribute value
past by pres(e)nt.
(15b) $substitute(CC, attribute, past,
presnt, CCNew).
By this operation CCNew is instantiated as a CC 
corresponding to the sentence Sophie opens the en- 
velope apprehensively. 
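The three slot-level operations $add, $delete, and $substitute can be sketched in a few lines. The following Python illustration assumes a hypothetical immutable record CC whose field names follow the slot names used in the text; like the Prolog predicates, each function leaves its input unchanged and returns a new CC.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class CC:
    identifier: str
    head: str
    role: tuple = ()
    attribute: tuple = ()
    structure: tuple = ()

def add(cc, slot, values):
    """Counterpart of $add: append values to the named slot."""
    return replace(cc, **{slot: getattr(cc, slot) + tuple(values)})

def delete(cc, slot, values):
    """Counterpart of $delete: remove the listed values from the slot."""
    kept = tuple(v for v in getattr(cc, slot) if v not in values)
    return replace(cc, **{slot: kept})

def substitute(cc, slot, old, new):
    """Counterpart of $substitute: replace old by new in the slot."""
    swapped = tuple(new if v == old else v for v in getattr(cc, slot))
    return replace(cc, **{slot: swapped})

# Root node of (10c); its constituents are omitted for brevity.
cc = CC("open0010", "open0010", (), ("sent", "fntcls", "past", "agr3s"))

print(add(cc, "attribute", ["perfct"]).attribute)
# ('sent', 'fntcls', 'past', 'agr3s', 'perfct')
print(substitute(cc, "attribute", "past", "presnt").attribute)
# ('sent', 'fntcls', 'presnt', 'agr3s')
```

These functions change only the slot contents; producing the surface sentence from the modified CC remains the job of the sentence generator.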
$restructure 
This operation changes the listing order of imme- 
diate constituents, i.e., the component CCs in the 
structure slot of the specified CC. The format is: 
(16) $restructure(CC, Order, CCNew),
where the first argument CC represents the CC to
be restructured and the second argument Order
defines the new ordering of the constituents. The
general format for this entry is:
(17) [p1, p2, p3, ..., pn],
where each integer pi (i = 1, 2, ..., n) is the
old position of a constituent of the CC in
question, and the position of pi in the list
(17) indicates the new position of that constituent.
For example, [2,1] means that the constituent CCs
in the first and second positions in the old structure
are to be swapped. The remaining constituents are
not affected by this rearrangement.
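The reordering rule can be made concrete with a short sketch. The Python function below is an illustration only; it assumes that Order is a permutation of the positions 1..n of the first n constituents, which is one reading of the description above, and the function name restructure is hypothetical.

```python
def restructure(constituents, order):
    """Reorder the first len(order) constituents; the rest keep their place.

    order[i] is the old (1-based) position of the constituent that moves
    to new position i + 1.
    """
    n = len(order)
    if n > len(constituents) or sorted(order) != list(range(1, n + 1)):
        raise ValueError("order must be a permutation of 1..n")
    reordered = [constituents[p - 1] for p in order]
    return reordered + list(constituents[n:])

# Constituent identifiers of (10c), used here in place of full CCs.
print(restructure(["sophi010", "big00020", "md_"], [2, 1]))
# ['big00020', 'sophi010', 'md_']
```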
$transform 
The above mentioned basic operations yield CCs 
which do not necessarily correspond to actual (gram- 
matical) linguistic expressions. The higher level 
structural transformation operation, $transform, is 
a comprehensive operation to change a CC into one 
of its derivative CCs which directly correspond to 
actual linguistic expressions. Tense, aspect, voice, 
sentence types (statement, question, etc), and nega- 
tion are among the currently implemented transfor- 
mation types. The format is: 
(18) $transform(CC, TransTypeList, CCNew).
The second argument TransTypeList defines the 
target transformation types which can be selected 
from the following codes: 
Voice: act(i)v(e), pas(si)v(e).
Negation: affirm(a)t(i)v(e), neg(a)t(i)v(e).
Tense: pres(e)nt, past, fut(u)r(e).
Perfective: perf(e)ct, imperf(e)ct.
Progressive: cont(inuous), n(on-)cont(inuous).
Sentence Type: stat(e)m(e)nt, quest(io)n,
dir(e)ct(i)v(e), excl(a)m(a)t(io)n.
Note that the $transform operation does not re- 
quire explicit indication of the attribute type for 
each transformation code in the above list. This 
is possible because the code names are uniquely de- 
fined in the whole CCM code system. 
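Because each code belongs to exactly one transformation type, the type can be looked up from the code itself. The Python sketch below illustrates only this attribute bookkeeping; the table CODE_TYPES and the function set_codes are hypothetical, the code spellings are derived from the abbreviations listed above, and the structural changes that the actual $transform performs (for instance reordering constituents for the passive) are not modelled.

```python
# Each code maps to the single transformation type it belongs to.
CODE_TYPES = {
    "actv": "voice", "pasv": "voice",
    "affirmtv": "negation", "negtv": "negation",
    "presnt": "tense", "past": "tense", "futr": "tense",
    "perfct": "perfective", "imperfct": "perfective",
    "cont": "progressive", "ncont": "progressive",
    "statmnt": "sentence_type", "questn": "sentence_type",
    "dirctv": "sentence_type", "exclmtn": "sentence_type",
}

def set_codes(attributes, codes):
    """Replace any attribute of the same type as each code, then add the code."""
    result = list(attributes)
    for code in codes:
        kind = CODE_TYPES[code]
        result = [a for a in result if CODE_TYPES.get(a) != kind]
        result.append(code)
    return result

# Attribute list of the root node in (10c).
print(set_codes(["sent", "fntcls", "past", "agr3s"], ["presnt", "perfct", "pasv"]))
# ['sent', 'fntcls', 'agr3s', 'presnt', 'perfct', 'pasv']
```

Attributes that are not transformation codes, such as sent or agr3s, are left untouched because they have no entry in the table.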
Examples: 
(19a) Get the interrogative form of the sentence 
Hilde began to describe her plan. 
The above query is expressed as a series of CCML 
statements: 
(19b) $get_cc('Hilde began to describe her plan',
CC),
$transform(CC, [questn], CCNew),
$get_sent(CCNew, SentNew).
The result is obtained in SentNew as: 
(19c) SentNew='Did Hilde begin to describe her 
plan?' 
Note that the same values are substituted for like- 
named variables appearing in the same query in all 
of their occurrences, e.g., CC and CCNew in (19b). 
Another example of the use of $transform is given
in (20): 
(20a) Get the present perfect passive form of the sen- 
tence Sophie opened the big envelope apprehensively. 
(20b) $get_cc('Sophie opened the big envelope
apprehensively', CC),
$transform(CC, [presnt, perfct, pasv],
CCNew),
$get_sent(CCNew, SentNew).
(20c) SentNew='The big envelope has been opened 
by Sophie apprehensively.' 
3.3 CC-base Operations 
3.3.1 Storage operations 
The CC-base storage operations are: $create_ccb,
$activate_ccb, $destroy_ccb, $save_ccb, and
$restore_ccb.
$create_ccb
The general format of $create_ccb is:
(21) $create_ccb(SentenceFileName,
CCBaseFileName).
The file indicated by the first argument Sentence- 
FileName contains a set of sentences (one sentence 
per line) to be converted to their CC structures. The 
$create_ccb operation invokes the sentence analyzer 
to transform each sentence in the file into the cor- 
responding CC and store the results in the CC-base 
indicated by CCBaseFileName (one CC per line). 
Example: 
(22a) Convert all sentences in the text file 
sophie.text shown in (7) into their CCs and store 
the result in the CC-base named sophie.ccb. 
(22b) $create_ccb('sophie.text', 'sophie.ccb').
The first line of the file sophie.ccb is taken by the CC 
given in (10c) which corresponds to the first sentence 
in the file sophie.text shown in (7). 
$activate_ccb 
The format of $activate_ccb is: 
(23) $activate_ccb(CCBaseFileName).
This operation copies the CC-base indicated by 
CCBaseFileName to the "current" CC-base which 
can be accessed by retrieval and update operations 
explained in the following sections. If another CC- 
base file is subsequently "activated", CCs from this 
new CC-base file are simply appended to the current 
CC-base. 
Example: 
(24a) Activate sophie.ccb. 
(24b) $activate_ccb('sophie.ccb').
$destroy_ccb 
There are two formats for $destroy_ccb:
(25a) $destroy_ccb(CCBaseFileName). 
(25b) $destroy_ccb. 
CCBaseFileName is taken up by the name of a CC- 
base file to be removed. If current is substituted for 
CCBaseFileName in (25a) or the operation is used 
in the (25b) format, the current CC-base is removed. 
$save_ccb
The formats are:
(26a) $save_ccb(CCBaseFileName).
(26b) $save_ccb. 
The $save_ccb operation is used to store the cur- 
rent CC-base into the file indicated by CCBaseFile- 
Name. The current CC-base is destroyed by this 
operation. If CCBaseFileName is current in (26a) 
or the operation is used in the (26b) format, the cur- 
rent CC-base is stored temporarily in the system's 
work space. Note that the $save_ccb operation
discards any CC-base already existing in the work
space when it is executed.
$restore_ccb 
This operation takes no arguments. The existing 
current CC-base is destroyed and the saved CC-base 
in the work space is activated. The format is: 
(27) $restore_ccb. 
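The interplay of the storage operations can be summarized with a small in-memory model. The Python sketch below is an illustration under stated assumptions: the class CCBaseStore is hypothetical, CC-base files are stood in for by a dictionary, CCs are represented by plain strings, and saving to the work space is assumed to clear the current CC-base in the same way as saving to a file (the text does not say so explicitly).

```python
class CCBaseStore:
    def __init__(self):
        self.files = {}        # file name -> list of CCs (stand-in for files on disk)
        self.current = []      # the "current" CC-base
        self.workspace = None  # temporary copy made by save("current")

    def activate(self, name):
        """$activate_ccb: append the CCs of the named file to the current CC-base."""
        self.current.extend(self.files[name])

    def destroy(self, name="current"):
        """$destroy_ccb: remove a CC-base file, or the current CC-base."""
        if name == "current":
            self.current = []
        else:
            del self.files[name]

    def save(self, name="current"):
        """$save_ccb: store the current CC-base, then clear it."""
        if name == "current":
            self.workspace = list(self.current)   # replaces any earlier saved copy
        else:
            self.files[name] = list(self.current)
        self.current = []

    def restore(self):
        """$restore_ccb: replace the current CC-base by the saved copy."""
        self.current = list(self.workspace or [])

store = CCBaseStore()
store.files["sophie.ccb"] = ["cc_7a", "cc_7b", "cc_7c", "cc_7d"]
store.files["extra.ccb"] = ["cc_x"]          # hypothetical second CC-base
store.activate("sophie.ccb")
store.activate("extra.ccb")                  # appended, not replaced
store.save()                                 # current CC-base moved to the work space
store.restore()
print(store.current)
# ['cc_7a', 'cc_7b', 'cc_7c', 'cc_7d', 'cc_x']
```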
3.3.2 Retrieval operations 
Retrieval operations are applied to the current 
CC-base. Relevant CC-bases should therefore be ac- 
tivated prior to issuing retrieval operations. 
$retrieve_cc
The $retrieve_cc operation searches the current 
CC-base for CCs which satisfy the specified condi- 
tions. Note that if a CC contains component CCs 
which satisfy the imposed conditions, they are also 
fetched by this operation; all CCs within a CC are
searched. The general format of $retrieve_cc is
as follows:
(28) $retrieve_cc(SelectionalConditionList,
RetrievedCCList),
where SelectionalConditionList is a list of
constraints imposed on the CCs to be retrieved. Each
element of the list takes one of the following
forms:
(29a) SlotName = ValueList.
(29b) RoleName : ConditionList.
SlotName is occupied by a slot name of the CC 
structure, i.e., identifier, head, role, attribute, or 
structure. ValueList is a list of values correspond- 
ing to the value category indicated by SlotName. 
The (29b) format is used when conditions are to 
be imposed on the immediate constituent CC with 
the role value indicated by RoleName. The con- 
ditions are entered in ConditionList, which is a 
list of terms of the format (29a). Each element of 
SelectionalConditionList represents an obligatory 
condition, i.e., all the conditions in the list should be 
satisfied simultaneously. More general logical con- 
nectives such as negation and disjunction are not 
available in the current implementation. The re- 
trieval result is obtained in the second argument 
RetrievedCCList as a list. 
Examples: 
(30a) Get all finite subordinate clauses in the file 
sophie.text. 
(30b) $destroy_ccb,
$create_ccb('sophie.text', 'sophie.ccb'),
$activate_ccb('sophie.ccb'),
$retrieve_cc([attribute = fntcls,
attribute = sub_cls], CCL),
$get_sent(CCL, SL).
(30c) SL = ['That the philosopher was right.'].
(31a) Assuming that the current CC-base is the one 
activated in (30), get sentences/clauses whose sub- 
jects have the semantic feature hum(an). 
(31b) $retrieve_cc([subject : [attribute = hum]],
CCL),
$get_sent(CCL, SentL).
(31c) SentL =
['Sophie opened the big envelope
apprehensively.',
'Hilde began to describe her plan.',
'To describe her plan.',
'Sophie saw that the philosopher was right.',
'That the philosopher was right.'].
Note that all the embedded sentences are included 
in the retrieved CC list. Since the non-overt subject 
of describe in the sentence (7b) is analyzed as Hilde 
in the CC generation process, the infinitival clause 
To describe her plan is also retrieved. 
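The retrieval behaviour, including the search of embedded CCs, can be sketched as a recursive match. The Python code below is an illustration only: the class CC, the functions matches and retrieve, the tuple encoding of conditions, and the identifiers and attribute lists of the sample CC are all hypothetical, and the role name subj is used as it appears in the CCs rather than subject as written in the query (31b).

```python
from dataclasses import dataclass, field

@dataclass
class CC:
    identifier: str
    head: str
    role: list
    attribute: list
    structure: list = field(default_factory=list)

def matches(cc, conditions):
    """True if cc satisfies every condition in the list (conjunction only)."""
    for cond in conditions:
        if cond[0] == "=":                      # form (29a): slot = value
            _, slot, value = cond
            slot_value = getattr(cc, slot)
            ok = value in slot_value if isinstance(slot_value, list) else value == slot_value
        else:                                   # form (29b): role : condition list
            _, role, sub = cond
            ok = any(role in child.role and matches(child, sub)
                     for child in cc.structure)
        if not ok:
            return False
    return True

def retrieve(ccs, conditions):
    """Collect matching CCs, searching component CCs as well."""
    found = []
    for cc in ccs:
        if matches(cc, conditions):
            found.append(cc)
        found.extend(retrieve(cc.structure, conditions))
    return found

# Simplified, hypothetical CC for sentence (7c).
clause = CC("be000010", "be000010", ["obj"], ["sent", "fntcls", "sub_cls", "past"],
            [CC("philo010", "philo010", ["subj"], ["nomphrs", "hum"])])
main = CC("see00010", "see00010", [], ["sent", "fntcls", "past"],
          [CC("sophi010", "sophi010", ["subj"], ["nomphrs", "prop", "hum"]), clause])

finite_sub = retrieve([main], [("=", "attribute", "fntcls"), ("=", "attribute", "sub_cls")])
human_subj = retrieve([main], [(":", "subj", [("=", "attribute", "hum")])])
print([c.identifier for c in finite_sub])   # ['be000010']
print([c.identifier for c in human_subj])   # ['see00010', 'be000010']
```

As in (31c), the second query returns both the main clause and the embedded clause, since each has an immediate constituent with the subject role and the feature hum.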
3.3.3 Update operations 
CCML provides two update operations, i.e.,
$append_cc and $delete_cc. These operations are
used to add the specified CCs to, or delete them
from, the indicated CC-base.
$append_cc
The formats are:
(32a) $append_cc(CC, CCBFile).
(32b) $append_cc(CC).
The first argument CC indicates a CC or a list of
CCs to be appended to the CC-base specified in
the second argument CCBFile. If the named CC-base
is current or the operation is used in the format
(32b), the appended CC(s) become directly
accessible by retrieval operations.
Example: 
(33a) Append the CC for the sentence Sophie saw 
that the philosopher was right to the current CC- 
base. 
(33b) $get_cc('Sophie saw that the philosopher
was right', CC),
$append_cc(CC).
$delete_cc
(34a) $delete_cc(CC, CCBFile).
(34b) $delete_cc(CC).
Removal of the indicated CC(s) from the current 
CC-base is carried out by this operation. The in- 
terpretation of the arguments and their uses are the 
same as those of $append_cc. 
4 Conclusions 
A sentence structure manipulation language CCML 
based on the language model CCM was proposed. 
In CCM each sentence is transformed into a CC, 
a nested relational structure in which the syntac- 
tic and semantic properties of the sentence are en- 
coded in a uniform data structure. This uniformity 
in the CC data structure leads to a corresponding
uniformity in the CCML operations. The CCML
operations implemented so far cover a wide range of
areas in sentence structure manipulations including 
sentence-CC inter-conversion operations, CC inter- 
nal structure operations, and CC-base operations. 
The manipulation language CCML proposed in this 
paper is expected to be used in various natural lan- 
guage application systems such as second-language 
learning systems and human computer communica- 
tion systems, in which sentence structure manipula- 
tion plays an essential role. 

References 
C. J. Date. 1990. An Introduction to Database 
Systems, Volume 1, Fifth Edition. Addison- 
Wesley Publishing Company, Inc., Reading, Mas- 
sachusetts. 
F. Pereira and D. H. D. Warren. 1980. Definite 
clause grammars for language analysis - A sur- 
vey of the formalism and a comparison with aug- 
mented transition networks. Artificial Intelligence 
13:231-278. 
