EACL'99 
CoNLL-99 
Computational Natural Language 
Learning 
Proceedings of a Workshop Sponsored by 
The Association for Computational Linguistics 
Editors: Miles Osborne & Erik Tjong Kim Sang 
12 June 1999 
University of Bergen
Bergen, Norway 
Published by the Association for Computational Linguistics 
CoNLL99 
Computational Natural Language 
Learning 
Bergen, Norway 12th June, 1999 
Editors: Miles Osborne & Erik Tjong Kim Sang 
In conjunction with the Special Interest Group in Natural Language Learning 
and the TMR Project Learning Computational Grammars
Collection copyright ACL. Authors retain individual copyright. Order
additional copies from:
ACL 
P. O. Box 6090 
Somerset, NJ 08875 USA
+1-908-873-3898
acl@bellcore.com 
Preface 
Welcome to CoNLL99 (http://lcg-www.uia.ac.be/conll99), the third meeting
of the Special Interest Group in Natural Language Learning. Regular
papers accepted this year deal with orthography, morphology, syntax and 
grammatical relations, using technologies based around information theory, 
transformation-based learning, instance-based learning and semi-automated 
knowledge acquisition. In addition to these regular papers, CoNLL99 has 
as a special theme the task of recovering Noun Phrases from free text. On 
this topic, we are pleased to have Lance Ramshaw as our guest speaker.
In addition to the papers in the proceedings dealing with Noun Phrase 
identification, the meeting will include short reports by other groups. The 
CoNLL99 web site will contain updated information. 
We would like to thank all the people who helped contribute to this 
workshop: our guest speaker, the programme committee, SIGNLL, LCG
(funded by the TMR programme of the European Union), our EACL hosts 
at Bergen, and finally, most importantly, all the authors for their
contributions.
Miles Osborne and Erik Tjong Kim Sang. 
Chairs
Miles Osborne (U. Groningen)
Erik Tjong Kim Sang (U. Antwerp)
Programme Committee
Antal van den Bosch (U. Tilburg)
Ted Briscoe (U. Cambridge)
Walter Daelemans (U. Antwerp)
John Nerbonne (U. Groningen)
David Powers (U. Flinders)
Christer Samuelsson (Xerox)
Jeffrey Mark Siskind (NEC)
Invited Speaker
Lance Ramshaw (BBN)
Table of Contents 
Chunyu Kit and Yorick Wilks 
Unsupervised Learning of Word Boundary with 
Description Length Gain 1-6 
André Kempe
Experiments in Unsupervised Entropy-Based 
Corpus Segmentation 7-13 
Kemal Oflazer and Sergei Nirenburg 
Practical Bootstrapping of Morphological Analyzers 14-23
Stephan Raaijmakers 
Finding Representations for Memory-Based Language Learning 24-32 
Torbjörn Lager
The μ-TBL System: Logic Programming Tools for
Transformation-Based Learning 33-42 
Lisa Ferro, Marc Vilain and Alexander Yeh
Learning Transformation Rules to Find Grammatical Relations 43-52 
Walter Daelemans, Sabine Buchholz and Jorn Veenstra 
Memory-Based Shallow Parsing 53-60 
Miles Osborne 
MDL-based DCG Induction for NP Identification 61-68 
Timetable 
09.00 - 09.30  Unsupervised Learning of Word Boundary
               with Description Length Gain
               Chunyu Kit and Yorick Wilks
09.30 - 10.00  Experiments in Unsupervised Entropy-Based
               Corpus Segmentation
               André Kempe
10.00 - 10.30  Practical Bootstrapping of Morphological Analyzers
               Kemal Oflazer and Sergei Nirenburg
10.30 - 11.00  break
11.00 - 12.00  Invited talk (title to be announced)
               Lance Ramshaw
12.00 - 12.30  Finding Representations for Memory-Based
               Language Learning
               Stephan Raaijmakers
12.30 - 13.30  Lunch
13.30 - 14.00  The μ-TBL System: Logic Programming Tools for
               Transformation-Based Learning
               Torbjörn Lager
14.00 - 14.30  Learning Transformation Rules to Find
               Grammatical Relations
               Lisa Ferro, Marc Vilain and Alexander Yeh
14.30 - 15.00  Memory-Based Shallow Parsing
               Walter Daelemans, Sabine Buchholz
               and Jorn Veenstra
15.00 - 15.30  break
15.30 - 16.00  MDL-based DCG Induction for NP Identification
               Miles Osborne
16.00 - 16.30  Results CoNLL task
               (speakers to be announced)
16.30 - 17.00  Closing
