A quantitative evaluation of naturalistic models of language acquisition; the efficiency of the Triggering Learning Algorithm compared to a Categorial Grammar Learner
Paula Buttery
Natural Language and Information Processing Group,
Computer Laboratory, Cambridge University,
15 JJ Thomson Avenue, Cambridge, CB3 0FD, UK
paula.buttery@cl.cam.ac.uk
Abstract
Naturalistic theories of language acquisition assume learners to be endowed with some innate language knowledge. The purpose of this innate knowledge is to facilitate language acquisition by constraining a learner's hypothesis space. This paper discusses a naturalistic learning system (a Categorial Grammar Learner (CGL)) that differs from previous learners (such as the Triggering Learning Algorithm (TLA) (Gibson and Wexler, 1994)) by employing a dynamic definition of the hypothesis-space which is driven by the Bayesian Incremental Parameter Setting algorithm (Briscoe, 1999). We compare the efficiency of the TLA with the CGL when acquiring an independently and identically distributed English-like language in noiseless conditions. We show that when convergence to the target grammar occurs (which is not guaranteed), the expected number of steps to convergence for the TLA is shorter than that for the CGL initialized with uniform priors. However, the CGL converges more reliably than the TLA. We discuss the trade-off of efficiency against more reliable convergence to the target grammar.
1 Introduction
A normal child acquires the language of her environment without any specific training. Chomsky (1965) claims that, given the relatively slight exposure to examples and remarkable complexity of language, it would be an extraordinary intellectual achievement for a child to acquire a language if not specifically designed to do so. His Argument from the Poverty of the Stimulus suggests that if we know X, and X is undetermined by learning experience, then X must be innate. For an example consider structure dependency in language syntax:
A question in English can be formed by inverting the auxiliary verb and subject noun-phrase: (1a) "Dinah was drinking a saucer of milk"; (1b) "was Dinah drinking a saucer of milk?"
Upon exposure to this example, a child could hypothesize infinitely many question-formation rules, such as: (i) swap the first and second words in the sentence; (ii) front the first auxiliary verb; (iii) front words beginning with w.
The first two of these rules are refuted if the child encounters the following: (2a) "the cat who was grinning at Alice was disappearing"; (2b) "was the cat who was grinning at Alice disappearing?"
If a child is to converge upon the correct hypothesis unaided she must be exposed to sufficient examples so that all false hypotheses are refuted. Unfortunately such examples are not readily available in child-directed speech; even the constructions in examples (2a) and (2b) are rare (Legate, 1999). To compensate for this lack of data Chomsky suggests that some principles of language are already available in the child's mind. For example, if the child had innately known that all grammar rules are structurally-dependent upon syntax she would never have hypothesized rules (i) and (iii). Thus, Chomsky theorizes that a human mind contains a Universal Grammar which defines a hypothesis-space of "legal" grammars.1 This hypothesis-space must be both large enough to contain grammars for all of the world's languages and small enough to ensure successful acquisition given the sparsity of data. Language acquisition is the process of searching the hypothesis-space for the grammar that most closely describes the language of the environment. With estimates of the number of living languages being around 6800 (Ethnologue, 2004) it is not sensible to model the hypothesis-space of grammars explicitly; rather it must be modeled parametrically. Language acquisition is then the process of setting these parameters. Chomsky (1981) suggested that parameters should represent points of variation between languages; however the only requirement for parameters is that they define the current hypothesis-space.
1 Discussion of structural dependence as evidence of the Argument from the Poverty of Stimulus is illustrative, the significance being that innate knowledge in any form will place constraints on the hypothesis-space.
The properties of the parameters used by this learner (the CGL) are as follows: (1) Parameters are lexical; (2) Parameters are inheritance based; (3) Parameter setting is statistical.
1 - Lexical Parameters
The CGL employs parameter setting as a means to acquire a lexicon; differing from other parametric learners (such as the Triggering Learning Algorithm (TLA) (Gibson and Wexler, 1994) and the Structural Triggers Learner (STL) (Fodor, 1998b), (Sakas and Fodor, 2001)) which acquire general syntactic information rather than the syntactic properties associated with individual words.2
In particular, a categorial grammar is acquired. The syntactic properties of a word are contained in its lexical entry in the form of a syntactic category. A word that may be used in multiple syntactic situations (or sub-categorization frames) will have multiple entries in the lexicon.
Syntactic categories are constructed from a finite set of primitive categories combined with two operators (/ and \) and are defined by their members' ability to combine with other constituents; thus constituents may be thought of as either functions or arguments.
The arguments of a functional constituent are shown to the right of the operators and the result to the left. The forward slash operator (/) indicates that the argument must appear to the right of the function and a backward slash (\) indicates that it must appear on the left. Consider the following CFG structure which describes the properties of a transitive verb:
s → np vp
vp → tv np
tv → gets, finds, ...
Assume that there is a set of primitive categories {s, np}. A vp must be in the category of functional constituents that takes an np from the left and returns an s. This can be written s\np. Likewise a tv takes an np from the right and returns a vp (whose type we already know). A tv may be written (s\np)/np.
Rules may be used to combine categories. We assume that our learner is innately endowed with the rules of function application, function composition and generalized weak permutation (Briscoe, 1999) (see Figures 1 and 2).
- Forward Application (>):  X/Y Y → X
- Backward Application (<):  Y X\Y → X
- Forward Composition (> B):  X/Y Y/Z → X/Z
- Backward Composition (< B):  Y\Z X\Y → X\Z
- Generalized Weak Permutation (P):  ((X | Y1) ... | Yn) → ((X | Yn) ... | Y1), where | is a variable over \ and /.

2 The concept of lexical parameters and the lexical-linking of parameters is to be attributed to Borer (1984).
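These rule schemata are straightforward to encode. The sketch below is illustrative rather than the paper's implementation: the tuple representation and the function names are assumptions, with a category written either as a primitive string or as a (result, slash, argument) triple.

```python
# Illustrative encoding of the innate rule schemata (representation and names
# are assumptions, not from the paper). A category is a primitive string such
# as "s" or "np", or a triple (result, slash, argument): (s\np)/np becomes
# (("s", "\\", "np"), "/", "np").

def forward_application(left, right):
    """Forward application (>): X/Y  Y  ->  X."""
    if isinstance(left, tuple) and left[1] == "/" and left[2] == right:
        return left[0]
    return None

def backward_application(left, right):
    """Backward application (<): Y  X\\Y  ->  X."""
    if isinstance(right, tuple) and right[1] == "\\" and right[2] == left:
        return right[0]
    return None

def forward_composition(left, right):
    """Forward composition (> B): X/Y  Y/Z  ->  X/Z."""
    if (isinstance(left, tuple) and left[1] == "/"
            and isinstance(right, tuple) and right[1] == "/"
            and left[2] == right[0]):
        return (left[0], "/", right[2])
    return None

def permute(cat):
    """Generalized weak permutation (P) for a two-argument category:
    ((X|Y1)|Y2) -> ((X|Y2)|Y1)."""
    (x, slash1, y1), slash2, y2 = cat
    return ((x, slash2, y2), slash1, y1)

# The fragment of Figure 1: "may" (s\np)/(s\np) composes with "eat" (s\np)/np,
# the result applies forward to "the cake" (np), then backward to "Alice" (np).
iv = ("s", "\\", "np")                       # s\np
may, eat = (iv, "/", iv), (iv, "/", "np")
may_eat = forward_composition(may, eat)      # (s\np)/np
clause = backward_application("np", forward_application(may_eat, "np"))
```

Applying `permute` to the transitive-verb category reproduces the (s/np)\np category used for "saw" in Figure 2.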
Alice := np;  may := (s\np)/(s\np);  eat := (s\np)/np;  the cake := np
may eat ⇒ (s\np)/np (> B);  may eat the cake ⇒ s\np (>);  Alice may eat the cake ⇒ s (<)
Figure 1: Illustration of forward/backward application (>, <) and forward composition (> B)
the := np/n;  rabbit := n;  that := (n\n)/(s/np);  she := np;  saw := (s\np)/np
saw ⇒ (s/np)\np (P);  she saw ⇒ s/np (<);  that she saw ⇒ n\n (>);  rabbit that she saw ⇒ n (<);  the rabbit that she saw ⇒ np (>)
Figure 2: Illustration of generalized weak permutation (P)
The lexicon for a language will contain a finite subset of all possible syntactic categories, the size of which depends on the language. Steedman (2000) suggests that for English the lexical functional categories never need more than five arguments and that these are needed only in a limited number of cases such as for the verb bet in the sentence "I bet you five pounds for England to win".
The categorial grammar parameters of the CGL are concerned with defining the set of syntactic categories present in the language of the environment. Converging on the correct set aids acquisition by constraining the learner's hypothesized syntactic categories for an unknown word. A parameter (with
value of either ACTIVE or INACTIVE) is associated with every possible syntactic category to indicate whether the learner considers the category to be part of the target grammar.
Some previous parametric learners (TLA and STL) have been primarily concerned with overall syntactic phenomena rather than the syntactic properties of individual words. Movement parameters (such as the V2 parameter of the TLA) may be captured by the CGL using innate rules or multiple lexical entries. For instance, Dutch and German word order is captured by assuming that verbs in these languages systematically have two categories, one determining main clause order and the other subordinate clause orders.
2 - Inheritance Based Parameters
The complex syntactic categories of a categorial grammar are a sub-categorization of simpler categories; consequently categories may be arranged in a hierarchy with more complex categories inheriting from simpler ones. Figure 3 shows a fragment of a possible hierarchy. This hierarchical organization of parameters provides the learner with several benefits: (1) The hierarchy can enforce an order on learning; constraints may be imposed such that a parent parameter must be acquired before a child parameter (for example, in Figure 3, the learner must acquire intransitive verbs before transitive verbs may be hypothesized). (2) Parameter values may be inherited as a method of acquisition. (3) The parameters are stored efficiently.
s (ACTIVE)
    s/s
    s\np (ACTIVE)
        [s\np]/np (ACTIVE)
        [s\np]/[s\np]
Figure 3: Partial hierarchy of syntactic categories. Each category is associated with a parameter indicating either ACTIVE or INACTIVE status.
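Benefit (1), the ordering constraint, can be sketched as follows; the class and method names are illustrative rather than the CGL's own. A categorial parameter may only be set ACTIVE once the parameter of its parent category is ACTIVE.

```python
# Minimal sketch of the ordering constraint on the category hierarchy (names
# are illustrative, not from the paper). A parameter can only become ACTIVE
# once its parent's parameter is ACTIVE, so e.g. [s\np]/np cannot be acquired
# before s\np.

class CategoryHierarchy:
    def __init__(self):
        self.parent = {}
        self.active = set()

    def add(self, category, parent=None):
        self.parent[category] = parent
        if parent is None:               # primitives start out ACTIVE
            self.active.add(category)

    def try_activate(self, category):
        """Set a categorial parameter to ACTIVE, respecting inheritance."""
        p = self.parent[category]
        if p is None or p in self.active:
            self.active.add(category)
            return True
        return False                     # parent not yet acquired

h = CategoryHierarchy()
h.add("s")
h.add(r"s\np", parent="s")
h.add(r"[s\np]/np", parent=r"s\np")

h.try_activate(r"[s\np]/np")   # fails: intransitive verbs not yet acquired
h.try_activate(r"s\np")        # succeeds
h.try_activate(r"[s\np]/np")   # now succeeds
```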
3 - Statistical Parameter Setting
The learner uses a statistical method to track relative frequencies of parameter-setting-utterances in the input.3 We use the Bayesian Incremental Parameter Setting (BIPS) algorithm (Briscoe, 1999) to set the categorial parameters. Such an approach sets the parameters to the values that are most likely given all the accumulated evidence. This represents a compromise between two extremes: implementations of the TLA are memoryless, allowing parameter values to oscillate; some implementations of the STL set a parameter once, for all time.

3 Other statistical parameter setting models include Yang's Variational model (2002) and the Guessing STL (Fodor, 1998a)
Using the BIPS algorithm, evidence from an input utterance will either strengthen the current parameter settings or weaken them. Either way, there is re-estimation of the probabilities associated with possible parameter values. Values are only assigned when sufficient evidence has been accumulated, i.e. once the associated probability reaches a threshold value. By employing this method, it becomes unlikely for parameters to switch between settings as the consequence of an erroneous utterance.
Another advantage of using a Bayesian approach is that we may set default parameter values by assigning Bayesian priors; if a parameter's default value is strongly biased against the accumulated evidence then it will be difficult to switch. Also, we no longer need to worry about ambiguity in parameter-setting-utterances (Clark, 1992; Fodor, 1998b): the Bayesian approach allows us to solve this problem "for free" since indeterminacy just becomes another case of error due to misclassification of input data (Buttery and Briscoe, 2004).
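The accumulate-then-threshold behaviour just described can be sketched as follows. This is a simplified stand-in for BIPS rather than Briscoe's actual formulation: pseudo-counts model prior plus accumulated evidence, and a value is assigned only once its estimated probability crosses the threshold.

```python
# Simplified stand-in for BIPS-style updating (not Briscoe's actual
# formulation): pseudo-counts act as prior plus accumulated evidence, and a
# value is only assigned once its estimated probability crosses a threshold.

class Parameter:
    def __init__(self, values=("ACTIVE", "INACTIVE"), prior=1.0, threshold=0.7):
        self.counts = dict.fromkeys(values, prior)   # uniform prior
        self.threshold = threshold
        self.value = None          # unset until the evidence is sufficient

    def observe(self, value):
        """Re-estimate after one parameter-setting utterance."""
        self.counts[value] += 1
        total = sum(self.counts.values())
        best = max(self.counts, key=self.counts.get)
        if self.counts[best] / total >= self.threshold:
            self.value = best

p = Parameter()
p.observe("ACTIVE")    # probability 2/3: below threshold, value stays unset
p.observe("ACTIVE")    # probability 3/4: threshold crossed, value is assigned
```

A single erroneous utterance cannot flip an established value, since it moves the estimate only slightly, which is the robustness property described above.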
2 Overview of the Categorial Grammar Learner
The learning system is composed of three modules: a semantics learning module, a syntax learning module and a memory module. For each utterance heard the learner receives an input stream of word tokens paired with possible semantic hypotheses. For example, on hearing the utterance "Dinah drinks milk" the learner may receive the pairing: ({dinah, drinks, milk}, drinks(dinah, milk)).
2.1 The Semantic Module
The semantic module attempts to learn the mapping between word tokens and semantic symbols, building a lexicon containing the meaning associated with each word sense. This is achieved by analyzing each input utterance and its associated semantic hypotheses using cross-situational techniques (following Siskind (1996)).
For a trivial example consider the utterances "Alice laughs" and "Alice eats cookies"; they might have word tokens paired with semantic expressions as follows: ({alice, laughs}, laugh(alice)), ({alice, eats, cookies}, eat(alice, cookies)).
From these two utterances it is possible to ascertain that the meaning associated with the word token alice must be alice since it is the only semantic element that is common to both utterances.
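The intersection step in this example can be sketched directly; this is a toy version of the cross-situational idea, far simpler than Siskind's full technique.

```python
# Toy sketch of the cross-situational intersection used in the example above
# (much simpler than Siskind's full technique).

utterances = [
    ({"alice", "laughs"}, {"alice", "laugh"}),                    # laugh(alice)
    ({"alice", "eats", "cookies"}, {"alice", "eat", "cookies"}),  # eat(alice, cookies)
]

def candidate_meanings(word, data):
    """Intersect the semantic symbols of every utterance containing `word`."""
    sets = [symbols for tokens, symbols in data if word in tokens]
    return set.intersection(*sets)

# 'alice' occurs in both utterances, and `alice` is their only shared symbol
print(candidate_meanings("alice", utterances))
```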
2.2 The Syntactic Module
The learning system links the semantic module and syntactic module by using a typing assumption: the semantic arity of a word is usually the same as its number of syntactic arguments. For example, if it is known that likes maps to like(x, y), then the typing assumption suggests that its syntactic category will be in one of the following forms: a\b\c, a/b\c, a\b/c, a/b/c or more concisely a|b|c (where a, b and c may be basic or complex syntactic categories themselves).
By employing the typing assumption the number of arguments in a word's syntactic category can be hypothesized. Thus, the objective of the syntactic module is to discover the arguments' category types and locations.
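Under the typing assumption a word of semantic arity n therefore yields 2^n candidate category skeletons, one per assignment of / or \ to each argument slot; the enumeration can be sketched as follows (the helper name is illustrative).

```python
# Sketch of the typing assumption's candidate enumeration (helper name is
# illustrative): a word of semantic arity n, e.g. like(x, y) with n = 2,
# gets one argument slot per semantic argument, and each slot's slash can
# independently be / or \, giving 2**n skeleton categories.
from itertools import product

def candidate_forms(result, args):
    """Enumerate every slash assignment over the argument slots."""
    forms = []
    for slashes in product("\\/", repeat=len(args)):
        cat = result
        for slash, arg in zip(slashes, args):
            cat += slash + arg
        forms.append(cat)
    return forms

# The four forms quoted in the text for like(x, y):
print(candidate_forms("a", ["b", "c"]))   # a\b\c, a\b/c, a/b\c, a/b/c
```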
The module attempts to create valid parse trees starting from the syntactic information already assumed by the typing assumption (following Buttery (2003)). A valid parse is one that adheres to the rules of the categorial grammar as well as the constraints imposed by the current settings of the parameters. If a valid parse cannot be found the learner assumes the typing assumption to have failed and backtracks to allow type raising.
2.3 Memory Module
The memory module records the current state of the hypothesis-space. The syntactic module refers to this information to place constraints upon which syntactic categories may be hypothesized. The module consists of two hierarchies of parameters which may be set using the BIPS algorithm:
Categorial Parameters determine whether a category is in use within the learner's current model of the input language. An inheritance hierarchy of all possible syntactic categories (for up to five arguments) is defined and a parameter associated with each one (Villavicencio, 2002). Every parameter (except those associated with primitive categories such as S) is originally set to INACTIVE, i.e. no categories (except primitives) are known upon the commencement of learning. A categorial parameter may only be set to ACTIVE if its parent category is already active and there has been satisfactory evidence that the associated category is present in the language of the environment.
Word Order Parameters determine the underlying order in which constituents occur. They may be set to either FORWARD or BACKWARD depending on whether the constituents involved are generally located to the right or left. An example is the parameter that specifies the direction of the subject of a verb: if the language of the environment is English this parameter would be set to BACKWARD since subjects generally appear to the left of the verb. Evidence for the setting of word order parameters is collected from word order statistics of the input language.
3 The acquisition of an English-type language
The English-like language of the three-parameter system studied by Gibson and Wexler has the parameter settings and associated unembedded surface-strings as shown in Figure 4. For this task we assume that the surface-strings of the English-like language are independent and identically distributed in the input to the learner.
Specifier: 0 (Left)    Complement: 1 (Right)    V2: 0 (off)
1. Subj Verb
2. Subj Verb Obj
3. Subj Verb Obj Obj
4. Subj Aux Verb
5. Subj Aux Verb Obj
6. Subj Aux Verb Obj Obj
7. Adv Subj Verb
8. Adv Subj Verb Obj
9. Adv Subj Verb Obj Obj
10. Adv Subj Aux Verb
11. Adv Subj Aux Verb Obj
12. Adv Subj Aux Verb Obj Obj
Figure 4: Parameter settings and surface-strings of Gibson and Wexler's English-like Language.
3.1 Efficiency of Trigger Learning Algorithm
For the TLA to be successful it must converge to the correct parameter settings of the English-like language. Berwick and Niyogi (1996) modeled the TLA as a Markov process (see Figure 5).
Using this model it is possible to calculate the probability of converging to the target from each starting grammar and the expected number of steps before convergence.
Probability of Convergence:
Consider starting from Grammar 3: after the process finishes looping it has a 3/5 probability of moving to Grammar 4 (from which it will never converge) and a 2/5 probability of moving to Grammar 7 (from which it will definitely converge); therefore there is a 40% probability of converging to the target grammar when starting at Grammar 3.
Expected number of Steps to Convergence:
Let Sn be the expected number of steps from state n to the target state. For starting grammars 6, 7 and 8, which definitely converge, we know:
S6 = 1 + (5/6)S6   (1)
S7 = 1 + (2/3)S7 + (1/18)S8   (2)
S8 = 1 + (1/12)S6 + (1/36)S7 + (8/9)S8   (3)
and for the times when we do converge from grammars 3 and 1 we can expect:
S1 = 1 + (3/5)S1   (4)
S3 = 1 + (31/33)S3   (5)
Figure 6 shows the probability of convergence and expected number of steps to convergence for each of the starting grammars. The expected number of steps to convergence ranges from infinity (for starting grammars 2 and 4) down to 2.5 for Grammar 1. If the distribution over the starting grammars is uniform then the overall probability of converging is the sum of the probabilities of converging from each state divided by the total number of states:
(1.00 + 1.00 + 1.00 + 1.00 + 0.40 + 0.66) / 8 = 0.63   (6)
and the expected number of steps given that you converge is the weighted average of the number of steps from each possibly converging state:
(5.47 + 14.87 + 6 + 21.98 × 0.4 + 2.5 × 0.66) / (1.00 + 1.00 + 1.00 + 1.00 + 0.40 + 0.66) = 7.26   (7)
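Equations (1)-(5) can be solved mechanically. The sketch below uses exact fractions, so its final figure (about 7.27) differs from the 7.26 of equation (7) only through the rounding of the intermediate values quoted there; the 21.98 for Grammar 3 is reconstructed as the looping expectation of equation (5) plus S7, which matches the quoted figure.

```python
# Solving expectation equations (1)-(5) with exact fractions. The convergence
# probabilities (1.00 four times, 0.40, 0.66) are the figures given in eq (6).
from fractions import Fraction as F

S6 = 1 / (1 - F(5, 6))                      # eq (1): S6 = 6
# eq (2) rearranged gives S7 = 3 + S8/6; substituting into eq (3) rearranged
# (S8 = 9 + (3/4)S6 + (1/4)S7) yields S8 * (1 - 1/24) = 9 + (3/4)S6 + 3/4:
S8 = (9 + F(3, 4) * S6 + F(3, 4)) / (1 - F(1, 24))   # ~14.87
S7 = 3 + S8 / 6                             # ~5.48
S1 = 1 / (1 - F(3, 5))                      # eq (4): S1 = 2.5
S3 = 1 / (1 - F(31, 33)) + S7               # eq (5) looping phase, then onward
                                            # via Grammar 7 (the 21.98 of eq (7))

probs = [1.00, 1.00, 1.00, 1.00, 0.40, 0.66]          # converging states, eq (6)
overall = sum(probs) / 8                              # ~0.63
steps = [float(S7), float(S8), float(S6), 0.0,        # 0 steps from the target
         float(S3) * 0.40, float(S1) * 0.66]
expected = sum(steps) / sum(probs)                    # ~7.27 (7.26 with the
                                                      # paper's rounded figures)
```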
3.2 Efficiency of Categorial Grammar Learner
The input data to the CGL would usually be an utterance annotated with a logical form; the only data available here, however, is surface-strings consisting of word types. Hence, for the purpose of comparison with the TLA the semantic module of our learner is by-passed; we assume that mappings to semantic forms have previously been acquired and that the subject and objects of surface-strings are known. For example, given surface-string 1 (Subj Verb) we assume the mapping Verb ↦ verb(x), which provides Verb with a syntactic category of the form a|b by the typing assumption (where a, b are unknown syntactic categories and | is an operator over \ and /); we also assume Subj to map to a primitive syntactic category SB, since it is the subject of Verb.
The criteria for success for the CGL when acquiring Gibson and Wexler's English-like language is a lexicon containing the following:4
Adv := S/S
Aux := [S\SB]/[S\SB]
Obj := OB
Subj := SB
Verb := S\SB, [S\SB]/OB, [[S\SB]/OB]/OB
where S (sentence), SB (subject) and OB (object) are primitive categories which are innate to the learner with SB and OB assumed to be derivable from the semantic module.
During the learning process the CGL will have constructed a category hierarchy by setting appropriate categorial parameters to true (see Figure 7). The learner will have also constructed a word-order hierarchy (Figure 8), setting parameters to FORWARD or BACKWARD. These hierarchies are used during the learning process to constrain hypothesized syntactic categories. For this task the setting of the word-order parameters becomes trivial and their role in constraining hypotheses negligible; consequently, the rest of our argument will relate to categorial parameters only.
gendir = /
    subjdir = \    vargdir = /
Figure 8: Word-order parameter settings required to parse Gibson and Wexler's English-like language.
For the purpose of this analysis parameters are initialized with uniform priors and are originally set INACTIVE. Since the input is noiseless, the switching threshold is set such that parameters may be set ACTIVE upon the evidence from one surface-string.
It is a requirement of the parameter setting device that the parent-types of hypothesized syntax categories are ACTIVE before new parameters are set. Thus, the learner is not allowed to hypothesize the syntactic category for a transitive verb [[S\SB]/OB] before it has learnt the category for an intransitive verb [S\SB]; this behaviour constrains over-generation. Additionally, it is usually not possible to derive a word's full syntactic category (i.e. without any remaining unknowns) unless it is the only new word in the clause.
As a consequence of these issues, the order in which the surface-strings appear to the learner affects the speed of acquisition. For instance, the learner prefers to see the surface-string Subj Verb before Subj Verb Obj so that it can acquire the maximum information without wasting any strings. For the English-type language described by Gibson and Wexler the learner can optimally acquire the whole lexicon after seeing only 5 surface-strings (one string needed for each new complex syntactic category to be learnt). However, the strings appear to the learner in a random order so it is necessary to calculate the expected number of strings (or steps) before convergence.

4 Note that the lexicon would usually contain orthographic entries for the words in the language rather than word type entries.
The learner must necessarily see the string Subj Verb before it can learn any other information. With 12 surface-strings the probability of seeing Subj Verb is 1/12 and the expected number of strings before it is seen is 12. The learner can now learn from 3 surface-strings: Subj Verb Obj, Subj Aux Verb and Adv Subj Verb. Figure 9 shows a Markov structure of the process. From the model we can calculate the expected number of steps to converge to be 24.53.
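Figure 9's transition structure is not reproduced in the text, so the sketch below encodes one plausible reading of the constraints just described: a surface-string teaches a category only when that category is its single unknown and the category's parent is already ACTIVE (the labels IV, TV, DV for the three verb categories are mine). Under these assumptions the expectation comes out at about 23.9, in the region of, though not identical to, the 24.53 computed from the paper's own model.

```python
# One plausible reconstruction of the CGL's learning chain (assumptions, not
# Figure 9 itself): a string teaches a category only if that category is the
# string's single unknown and its parent in the hierarchy is already ACTIVE.
from fractions import Fraction as F
from functools import lru_cache

# categories required by each of Gibson and Wexler's 12 surface-strings
STRINGS = [
    {"IV"}, {"TV"}, {"DV"},
    {"Aux", "IV"}, {"Aux", "TV"}, {"Aux", "DV"},
    {"Adv", "IV"}, {"Adv", "TV"}, {"Adv", "DV"},
    {"Adv", "Aux", "IV"}, {"Adv", "Aux", "TV"}, {"Adv", "Aux", "DV"},
]
PARENT = {"IV": None, "TV": "IV", "DV": "TV", "Aux": "IV", "Adv": None}
ALL = frozenset(PARENT)

def teaches(known, string):
    """The category a string can teach in this state, if any."""
    unknown = string - known
    if len(unknown) != 1:
        return None
    cat, = unknown
    parent = PARENT[cat]
    return cat if parent is None or parent in known else None

@lru_cache(maxsize=None)
def expected_steps(known):
    if known == ALL:
        return F(0)
    counts = {}                       # strings teaching each learnable category
    for s in STRINGS:
        cat = teaches(known, s)
        if cat:
            counts[cat] = counts.get(cat, 0) + 1
    p = F(sum(counts.values()), 12)   # chance the next string teaches anything
    onward = sum(F(n, 12) * expected_steps(known | {cat})
                 for cat, n in counts.items())
    return 1 / p + onward / p

print(float(expected_steps(frozenset())))   # ~23.87 under these assumptions
```

The chain reproduces the probabilities quoted in the text: only string 1 teaches from the empty state (1/12), three strings teach once the intransitive category is acquired, and six (1/2) teach when only the adverb category remains.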
4 Conclusions
The TLA and CGL were compared for efficiency (expected number of steps to convergence) when acquiring the English-type grammar of the three-parameter system studied by Gibson and Wexler. The expected number of steps for the TLA was found to be 7.26 but the algorithm only converged 63% of the time. The expected number of steps for the CGL is 24.53 but the learner converges more reliably; a trade-off between efficiency and success. With noiseless input the CGL can only fail if there are insufficient input strings or if Bayesian priors are heavily biased against the target. Furthermore, the CGL can be made robust to noise by increasing the probability threshold at which a parameter may be set ACTIVE; the TLA has no mechanism for coping with noisy data.
The CGL learns incrementally; the hypothesis-space from which it can select possible syntactic categories expands dynamically and, as a consequence of the hierarchical structure of parameters, the speed of acquisition increases over time. For instance, in the starting state there is only a 1/12 probability of learning from surface-strings whereas in state k (when all but one category has been acquired) there is a 1/2 probability. It is likely that with a more complex learning task the benefits of this incremental approach will outweigh the slow starting costs. Related work on the effects of incremental learning on STL performance (Sakas, 2000) draws similar conclusions. Future work hopes to compare the CGL with other parametric learners (such as the STL) in larger domains.
References
R Berwick and P Niyogi. 1996. Learning from triggers. Linguistic Inquiry, 27(4):605-622.
H Borer. 1984. Parametric Syntax: Case Studies in Semitic and Romance Languages. Foris, Dordrecht.
E Briscoe. 1999. The acquisition of grammar in an evolving population of language agents. Machine Intelligence, 16.
P Buttery and T Briscoe. 2004. The significance of errors to parametric models of language acquisition. Technical Report SS-04-05, American Association of Artificial Intelligence, March.
P Buttery. 2003. A computational model for first language acquisition. In CLUK-6, Edinburgh.
N Chomsky. 1965. Aspects of the Theory of Syntax. MIT Press.
N Chomsky. 1981. Lectures on Government and Binding. Foris Publications.
R Clark. 1992. The selection of syntactic knowledge. Language Acquisition, 2(2):83-149.
Ethnologue. 2004. Languages of the World, 14th edition. SIL International. http://www.ethnologue.com/.
J Fodor. 1998a. Parsing to learn. Journal of Psycholinguistic Research, 27(3):339-374.
J Fodor. 1998b. Unambiguous triggers. Linguistic Inquiry, 29(1):1-36.
E Gibson and K Wexler. 1994. Triggers. Linguistic Inquiry, 25(3):407-454.
J Legate. 1999. Was the argument that was made empirical? Ms, Massachusetts Institute of Technology.
W Sakas and J Fodor. 2001. The structural triggers learner. In S Bertolo, editor, Language Acquisition and Learnability, chapter 5. Cambridge University Press, Cambridge, UK.
W Sakas. 2000. Ambiguity and the Computational Feasibility of Syntax Acquisition. Ph.D. thesis, City University of New York.
J Siskind. 1996. A computational study of cross-situational techniques for learning word-to-meaning mappings. Cognition, 61(1-2):39-91, Nov/Oct.
M Steedman. 2000. The Syntactic Process. MIT Press/Bradford Books.
A Villavicencio. 2002. The acquisition of a unification-based generalised categorial grammar. Ph.D. thesis, University of Cambridge.
C Yang. 2002. Knowledge and Learning in Natural Language. Oxford University Press.