HAL Id: hal-00356671
https://hal.archives-ouvertes.fr/hal-00356671
Preprint submitted on 28 Jan 2009
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
The Role of Internal Oscillators for the One-Shot Learning of Complex Temporal Sequences
Matthieu Lagarde, Pierre Andry, Philippe Gaussier
To cite this version:
Matthieu Lagarde, Pierre Andry, Philippe Gaussier. The Role of Internal Oscillators for the One-Shot Learning of Complex Temporal Sequences. 2007. ⟨hal-00356671⟩
ETIS, Neurocybernetic Team, UMR CNRS 8051
2, avenue Adolphe-Chauvin, University of Cergy-Pontoise, France
{lagarde,andry,gaussier}@ensea.fr
Abstract. We present an artificial neural network used to teach a robot complex
temporal sequences of gestures online. The system is based on a simple temporal
sequence learning architecture, a neurobiologically inspired model using some of
the properties of the cerebellum and the hippocampus, plus a diversity generator
composed of CTRNN oscillators. The oscillators remove the ambiguity of complex
sequences: associating inputs with oscillators builds an internal state that
disambiguates the observable state. To understand the effect of this learning
mechanism, we compare the performance of (i) our model with (ii) a simple
sequence learning model and (iii) the simple sequence learning model plus a
competitive mechanism between inputs and oscillators. Finally, we present an
experiment in which an AIBO robot learns and reproduces a sequence of gestures.
1 INTRODUCTION
Our long-term goal is to build an autonomous robot able to learn sensorimotor
tasks. Such a system should be (i) able to acquire new behaviors: gestures and
object manipulation as sequences combining multimodal elements of different
levels. To do this, an autonomous system must (ii) take advantage of information
from the associations between vision and motor capabilities. This paper focuses
essentially on the first point: learning, predicting and reproducing complex
sensorimotor sequences.
In this scope, approaches based on neural networks are an interesting solution.
Neural networks are able to learn sequences using associative mechanisms. Moreover,
these networks offer a level of coding (the neuron) that takes into account
information about the lower sensorimotor system; such systems avoid the use of symbols
or information that could separate the sequence learning component from the
building of associations between sensation and action. Networks are well suited
to online learning, favoring easier interactions with humans and other robots.
Among these models, chaotic neural networks are based on recurrent networks
(RN). In [1], a fully connected RN learns a sequence thanks to a single layer of
neurons. The dynamics generated by the network help to learn a short sequence.
After a few iterations, the learned sequence vanishes progressively. In [2], a ran-
neurons. The first layer generates an internal dynamic by means of a RRN.
The second layer generates a resonance phenomenon. The network learns short
sequences of 7 or 8 states. But this model is highly sensitive to noise or
stimulus variations and does not learn long-period sequences. A similar model
is the Echo State Network (ESN), based on RRN for short-term memory (STM) [3].
Under certain conditions (detailed in [4]), the activation of each neuron
in the hidden layer is a function of the input history presented to the network;
this is the echo function. Once again, the idea is to use a reservoir of dynamics
from which the desired output is learned in conjunction with the effect of the
input activity.
In the context of robotics, many models concern gesture learning. By means of
nonlinear dynamical systems, [5] develops control policies to approximate the
recorded movements and to learn them by fitting a mixture model using
a recursive least squares regression technique. In [6], the trajectories of gestures
are acquired by the construction of motor skills with a probabilistic representation
of the movement. Trajectories can be learnt through via-points [7] with parallel
vector-integration-to-endpoint models [8]. In our work, we wish to be able to
reuse and detect subsequences and, possibly, combine them. Thus, we need to learn
some of the important components of the sequence and not only to approximate
the trajectory of the gesture.
In this paper, we present a biologically inspired neural network model for
learning complex temporal sequences. A first approach described in [9] proposes
a neural network for the online learning of the timing between events for simple
sequences (with non-ambiguous states like A B C). We propose a model for
complex sequences (with ambiguous states like A and B in A B A C B). In order
to remove the ambiguous states or transitions, we use batteries of oscillators
as a reservoir of diversity that separates the inputs appearing repeatedly
in the sequence. In section 3, we show results from simulations comparing the
performance of 3 different systems involved in the learning and reproduction
of the same set of complex sequences: (i) the system described in [9], (ii) this
system plus a simple competitive mechanism between the oscillators and the
input (showing the effect of adding internal dynamics in order to separate
ambiguous states) and (iii) a system optimizing the use of the oscillators by using
an associative learning rule in order to recruit new internal states when needed
(repetition of the same input state). Section 4 details the application of our
model on a real robot for the learning of a complex gesture. Finally, we conclude
and point out some open problems.
2 A MODEL FOR TIMING AND SEQUENCE LEARNING
The architecture (Fig. 1) is based on a neurobiological model [10] inspired from
some of the properties of the cerebellum and the hippocampus. This model
uses associative learning rules between past inputs memorized as a STM and
here to the sequences in which the same state appears only once. The main
advantage of this model is that the associative mechanism also learns the timing
of the sequence, which allows accurate predictions of the transitions that compose
the sequence. In order to learn complex sequences in which the same state is
repeated several times, we have added to our model a mechanism that generates
internal dynamics and that can be associated with the repeated inputs of the
sequence. The association between the repeated inputs and different activities
of the oscillators codes hidden states with different and unambiguous
patterns of activity. As a result, our architecture manages to learn, predict and
reproduce complex temporal sequences.
Fig. 1. Complex sequences learning model. Barred links are modifiable connections. The
others are unmodifiable connections. The left part is detailed in figures 3.A
and 3.B. The right part is detailed in figure 4.
2.1 Generating internal diversity
Oscillators are widely used in robotic applications such as locomotion using
central pattern generators (CPG) [11]. An oscillator is a continuous time recurrent
neural network (CTRNN) composed of two neurons (Fig. 2.A). A study of
CTRNNs can be found in [12]. This kind of oscillator is known for its stability and
resistance to noise. CTRNNs are also easy to implement. A CTRNN coupling
two neurons produces an oscillator (Fig. 2.B):
\[ \tau_e \frac{dx}{dt} = -x + S\big((w_{ii}\, x) - (w_{ji}\, y) + w^{e}_{const}\big) \qquad (1) \]
\[ \tau_i \frac{dy}{dt} = -y + S\big((w_{jj}\, y) + (w_{ij}\, x) + w^{i}_{const}\big) \qquad (2) \]
with $\tau_e$ a time constant for the excitatory neuron and $\tau_i$ for the inhibitory neuron. $x$
and $y$ are the activities of the excitatory and the inhibitory neuron respectively. $w_{ii}$
is the weight of the recurrent link of the excitatory neuron, $w_{jj}$
the weight of the recurrent link of the inhibitory neuron.
$w_{ij}$ is the weight of the link from the excitatory neuron to the inhibitory neuron.
$w_{ji}$ is the weight of the link from the inhibitory neuron to the excitatory neuron.
$w^{e}_{const}$ and $w^{i}_{const}$ are the weights of the links from the constant inputs. And
$S$ is the transfer function of each neuron. In our model, we use three oscillators with
$\tau_e = \tau_i$.
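Equations (1) and (2) can be integrated numerically with a simple forward-Euler scheme. The sketch below is only an illustration under stated assumptions: the text does not specify the transfer function S, so a linear function clipped to [0, 1] is assumed; the step size `dt` and the function names are hypothetical; and the inhibitory weight is passed as a magnitude so that the minus sign of Eq. (1) carries the inhibition. The weights, initial activities and time constants are the values quoted for Fig. 2.

```python
def clipped_linear(u):
    """Assumed transfer function S: identity clipped to [0, 1]."""
    return min(1.0, max(0.0, u))


def simulate_oscillator(tau, steps, dt=1.0, w_ii=1.0, w_jj=1.0, w_ij=1.0,
                        w_ji=1.0, w_e_const=0.45, w_i_const=0.0,
                        x0=0.0, y0=0.0, S=clipped_linear):
    """Forward-Euler integration of Eqs. (1)-(2) with tau_e = tau_i = tau.

    w_ji is a magnitude; the minus sign of Eq. (1) makes the link inhibitory.
    Returns the lists of x (excitatory) and y (inhibitory) activities.
    """
    x, y = x0, y0
    xs, ys = [], []
    for _ in range(steps):
        dx = (-x + S(w_ii * x - w_ji * y + w_e_const)) / tau
        dy = (-y + S(w_jj * y + w_ij * x + w_i_const)) / tau
        x, y = x + dt * dx, y + dt * dy
        xs.append(x)
        ys.append(y)
    return xs, ys


# Three oscillators with the time constants quoted for Fig. 2.B.
traces = {tau: simulate_oscillator(tau, steps=300) for tau in (20, 30, 50)}
```

The shape of the resulting trajectories depends on the assumed S; a different transfer function would need to be substituted to match the curves of the original figure.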
Fig. 2. A. Oscillator model. The left neuron is excitatory and the right neuron is inhibitory.
Excitatory links are: $W_{ii} = 1$, $W_{jj} = 1$, $W_{ij} = 1$. The inhibitory link is: $W_{ji} = -1$.
The constant input value is equal to 1 with constant links $W^{e}_{const} = 0.45$ and $W^{i}_{const} = 0$.
Initial activities of the neurons are $X(0) = 0$, $Y(0) = 0$.
B. Instantaneous mean frequency activity, plotted against time, of 3 oscillator systems with
$\tau_1 = 20$ (plain line), $\tau_2 = 30$ (long dashed line), $\tau_3 = 50$ (short dashed line).
2.2 Learning of internal states
In order to use the same input repeatedly in a given sequence, different
configurations of oscillators can be associated with the same input. To understand the
generation of diversity and its implication in our learning algorithm, we have
tested two mechanisms: a simple competition coupling input states with
oscillators (Fig. 3.A) and an associative mechanism based on a learning rule (Fig.
3.B) that recruits neurons according to the activities of the oscillators and the
repeated inputs.
Competitive mechanism. The competition is computed as follows: each neuron
$ij$ of the Competition group acts as a neuron performing the logical operator AND between the neurons of the Inputs group and of the Oscillators group:
\[ Pot_{ij} = (w_{input_i}\, x_{input_i} + w_{osci_j}\, x_{osci_j}) - threshold_{ij} \qquad (3) \]
with $w_{input_i} = 1$, $w_{osci_j} = 1$, $threshold_{ij} = 1.2$, $x_{input_i}$ the activity of the input at index
$i$ and $x_{osci_j}$ the activity of the oscillator at index $j$. In a second step, a competition between all neurons
$ij$ of the Competition group is applied:
\[ Winner_{ij} = \begin{cases} 1 & \text{if } ij = \operatorname{Argmax}_{ij}(Pot_{ij}) \\ 0 & \text{otherwise} \end{cases} \qquad (4) \]
The winner neuron becomes the input of the temporal sequence learning network
(subsection 2.3). In this way, a reservoir of oscillator neurons can be used as a
way to associate the same input with different internal patterns. Intuitively, the
simple competition (no learning is required here) directly selects different
internal states corresponding to the same input repeated many times in the
sequence. For example in Fig. 3.A, each input (A, B, C, D) can appear up to 3
times (corresponding to the number of oscillators) in the same sequence. More-
of the sequence. Obviously, if the competition between oscillators is an avenue
worth exploring, it is still possible to have ambiguity. An input can be associated
with the same winner oscillator two or more times. Consequently, there are still
potential ambiguities in the internal states of our model, and some sequences could
not be reproduced correctly. A precise measure of this problem corresponds to
the probability that the same state can be associated with the same oscillator
several times; the internal state therefore partially depends on the shape,
phase and number of oscillators. Typically, the problem happens when a given
state comes back with the same frequency as the selected oscillator. The curves
C2 and C3 in figure 5 show the performance of the competitive mechanism. To
solve this problem, an associative mechanism that recruits neurons coding
internal states has been added.
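The competitive mechanism of Eqs. (3) and (4) can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation: the function name and the example activities are hypothetical, while the weights (1, 1) and the threshold (1.2) are the values given in the text. The last call illustrates the ambiguous case described above, where the same oscillator wins again for a repeated input.

```python
def competition_winner(inputs, oscillators, w_input=1.0, w_osc=1.0,
                       threshold=1.2):
    """Return ((i, j), potential) of the Competition neuron with the largest Pot_ij."""
    best_index, best_pot = None, float("-inf")
    for i, x_in in enumerate(inputs):
        for j, x_osc in enumerate(oscillators):
            pot = w_input * x_in + w_osc * x_osc - threshold  # Eq. (3)
            if pot > best_pot:                                 # Eq. (4): argmax
                best_index, best_pot = (i, j), pot
    return best_index, best_pot


# Input B (index 1) is active; oscillator 1 has the highest activity.
winner, pot = competition_winner([0, 1, 0, 0], [0.2, 0.7, 0.5])

# The same input presented at another oscillator phase selects a different
# internal state ...
later, _ = competition_winner([0, 1, 0, 0], [0.6, 0.1, 0.3])

# ... unless the same oscillator dominates again: the ambiguous case.
again, _ = competition_winner([0, 1, 0, 0], [0.1, 0.9, 0.4])
```

Here `winner` and `again` designate the same Competition neuron, so the two occurrences of B would be indistinguishable for the sequence learning network.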
Fig. 3. A. Model of the neural network coupling an input state with an oscillator. All
links are fixed connections. B. Model of the neural network used to associate an input
state with a configuration of oscillators. Only a few links are represented for legibility.
Dashed links are modifiable connections. Solid links are fixed connections.
Associative mechanism. The learning process of an association between an
input state and a configuration of oscillators is:
\[ US = w_i\, x_i \qquad (5) \]
with $w_i$ the weight of the link from input state $i$, and $x_i$ the activity of the input state
$i$. If $US > threshold$, we compute the potential and the activity of the neuron as follows:
\[ Pot_j = \sum_{j=0}^{M_{osci}} |w_j - e_j|, \qquad Act_j = \frac{1}{1 + Pot_j} \qquad (6) \]
with $M_{osci}$ the number of oscillators, $w_j$ the weight of the link from oscillator $j$, and
$e_j$ the activity of the oscillator $j$. The neuron that has the minimum activity is recruited:
$Win = \operatorname{Argmin}_j(Act_j)$. Initial weights of connections have high values. The oscillator configuration is learnt according to the distance error
\[ \Delta w_j = \varepsilon (e_j - w_j) \]
with $\varepsilon$ a learning rate, $w_j$ the weight of the link from oscillator $j$, and
$e_j$ the activity of oscillator $j$. The Associations group becomes the new input of the temporal sequence learning network (subsection 2.3). As shown in
figure 3.B, an input can recruit 3 different neurons coding internal states.
They correspond to the connectivity of the unconditional links chosen between
the Inputs group and the Associations group. The associative mechanism ensures
that a new internal state is recruited for each input (A, B, C or D) from the sequence.
The connectivity of links between the Inputs group and the Associations group
has been chosen to have an equal number of hidden states for each input. This
allows the comparison between the different models in our simulations. But it
would be possible to change the connectivity of the links to allow the recruitment
of more hidden states for each repeated input in the sequence. We have tested
this mechanism in our architecture in simulation and in a robotic application.
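The recruitment rule of Eq. (6) and the weight update can be illustrated as follows. This is a minimal sketch under several assumptions: each candidate neuron is given its own weight vector over the oscillators, the gating of Eq. (5) is assumed to have already passed (US > threshold), the initial weight value 10 and the learning rate 1 (one-shot) are hypothetical, and ties are broken by taking the first candidate.

```python
def recruit(weights, osc_activity, epsilon=1.0):
    """Recruit one association neuron for the current oscillator activities.

    weights: one weight vector per candidate neuron (modified in place).
    Returns the index of the recruited neuron.
    """
    potentials = [sum(abs(w - e) for w, e in zip(row, osc_activity))
                  for row in weights]                       # Eq. (6), Pot
    activities = [1.0 / (1.0 + p) for p in potentials]      # Eq. (6), Act
    win = min(range(len(weights)), key=lambda k: activities[k])  # Argmin
    weights[win] = [w + epsilon * (e - w)                   # delta w = eps (e - w)
                    for w, e in zip(weights[win], osc_activity)]
    return win


# Three candidate neurons for one input, initial weights set to a high value.
weights = [[10.0, 10.0, 10.0] for _ in range(3)]
first = recruit(weights, [0.2, 0.7, 0.5])   # first occurrence of the input
second = recruit(weights, [0.6, 0.1, 0.3])  # second occurrence, other phase
third = recruit(weights, [0.2, 0.7, 0.5])   # same phase as the first occurrence
```

Because an already recruited neuron lies close to some oscillator configuration, its activity is high and the minimum-activity rule favors a neuron that still has its high initial weights. In this sketch the third occurrence therefore gets a new neuron even though the oscillators are in the same configuration as for the first one.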
2.3 Temporal sequences learning
Fig. 4. Representation of the hippocampus. The Entorhinal Cortex (EC) receives inputs and
transmits them to the Dentate Gyrus (DG) and CA3 pyramidal cells. The DG
group and the CA3 group are fully connected with modifiable connections.
Between the EC group and the CA3 group, and between the EC group and the DG group, there are fixed
one-to-neighborhood connections.
This part of the model is based on a schematic representation of the hippocampus [10]
(Fig. 4). DG represents the past state (STM), and develops a temporal activity
spectrum. CA3 links allow pattern completion and recognition between the incoming
state from EC and the previous state maintained in DG. We suppose the DG activity
can be modelled as follows:
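\[ Act^{DG}_{j,l}(t) = \frac{1}{m_j} \cdot \exp\left(-\frac{(t - m_j)^2}{2 \cdot \sigma_j}\right) \qquad (7) \]
with $Act^{DG}_{j,l}$ the activity of the cell at index $l$ on the line $j$, $t$ the time, $m_j$ a time constant and
$\sigma_j$ the standard deviation. Neurons on one line share their activity in time and represent a temporal trace of EC. Learning of an association is
on the weights of links between CA3 and DG. The normalization of the activity
coming from DG neurons is performed due to the normalization of the DG-CA3
weights.
\[ W^{CA3(i,j)}_{DG(j,l)} = \begin{cases} \dfrac{Act^{DG}_{j,l}}{\sum_{j,l} \left(Act^{DG}_{j,l}\right)^2} & \text{if } Act^{DG}_{j} \neq 0 \\ \text{unchanged} & \text{otherwise} \end{cases} \qquad (8) \]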
Interestingly, this model has the property to work when a same input comes
is stored during the total time of its presence. Consequently, two successive states
are not ambiguous for the system (AA = A).
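The DG activity of Eq. (7) and the normalized weight learning of Eq. (8) can be sketched as below. The (m, sigma) pairs, the sampling time and the function names are hypothetical values chosen for illustration; only the two formulas come from the text. The last line illustrates one consequence of the normalization: replaying the learned trace through the learned weights gives a response of 1.

```python
import math


def dg_activity(t, m, sigma):
    """Eq. (7): activity of a DG cell with time constant m and spread sigma."""
    return (1.0 / m) * math.exp(-((t - m) ** 2) / (2.0 * sigma))


def learn_weights(old_weights, activities):
    """Eq. (8): normalized DG-CA3 weights; weights of silent cells are unchanged."""
    norm = sum(a * a for a in activities)
    return [a / norm if a != 0 else w for w, a in zip(old_weights, activities)]


# Hypothetical battery of DG cells on one line, sampled 12 time steps after
# the EC input was presented.
cells = [(5.0, 4.0), (10.0, 16.0), (20.0, 64.0)]   # (m, sigma) pairs
trace = [dg_activity(12.0, m, s) for m, s in cells]
weights = learn_weights([0.0, 0.0, 0.0], trace)

# Replaying the same trace through the learned weights gives a unit response.
response = sum(w * a for w, a in zip(weights, trace))
```

Each cell peaks when t equals its time constant m, so the pattern of activities across the battery encodes the time elapsed since the input, which is what lets the association store the timing of a transition.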
3 SIMULATION RESULTS
A temporal sequence of states is rarely replayed twice with exactly the same
rhythm. Time can vary between two states, especially when demonstrating a
sequence to a robot. In our simulations we apply a time variation between states
and observe the consequences on three architectures. The first architecture is
the model of simple sequences learning presented in subsection 2.3. The second
is the same model plus the competitive mechanism presented in subsection 2.2.
The third architecture is the same as the first one plus the associative
mechanism (Fig. 1) seen in subsection 2.2. Reference sequences are generated to be
successfully reproduced by the second architecture with a timing variation of
0%. All architectures are trained with the same sequences and the same
maximum timing variation (0%, 5% or 10%), but with a time variation randomly
chosen between 0 and the maximum variation. In our experiments,
to bootstrap a sequence, we provide the first state. Consequently, this state will
not be ambiguous in the sequences. For example, a complex sequence can be
DBCBACAB: D is the starting state and it will not be repeated afterwards.
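The protocol above can be illustrated with a small sketch. It is not the authors' simulation code: the nominal interval of 100 time steps, the uniform distribution, the random sign of the variation and the function names are all assumptions made for the example. Only the sequence DBCBACAB and the 10% maximum variation come from the text.

```python
import random


def is_complex(sequence):
    """A sequence is complex when at least one state is repeated."""
    return len(set(sequence)) < len(sequence)


def jitter_intervals(intervals, max_variation, rng):
    """Scale each inter-state interval by a random factor.

    The variation is drawn uniformly in [0, max_variation] and applied with a
    random sign; both choices are assumptions of this sketch.
    """
    jittered = []
    for interval in intervals:
        variation = rng.uniform(0.0, max_variation)
        sign = rng.choice((-1.0, 1.0))
        jittered.append(interval * (1.0 + sign * variation))
    return jittered


sequence = "DBCBACAB"
intervals = [100.0] * (len(sequence) - 1)   # hypothetical nominal timing
rng = random.Random(0)                        # fixed seed for repeatability
replay = jitter_intervals(intervals, 0.10, rng)
```

With a maximum variation of 0 the intervals are returned unchanged, which corresponds to the 0% condition used to generate the reference sequences.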
Fig. 5 shows the performances of each architecture. We can see that the first
architecture (subsection 2.3) has very good performances with sequences of 3 and
4 states, because those sequences have no repeated states (simple sequences).
With sequences having more than 4 states, the performances fall drastically, and
time variation has no effect on the performances. The architecture cannot
reproduce them, because the CA3 group learns two transitions and thus predicts
two states for each repeated input. The second architecture, using the competitive
mechanism, has better performances but, as we have seen previously in
subsection 2.2, ambiguous internal states can appear and reduce this gain of sequences
correctly reproduced. Consequently, like the first architecture, the CA3 group
learns two internal states and thus predicts two states from one repeated
input. We can see that the performances change according to the timing variation
between states in the sequences: a same input from a given sequence can be
associated with two different oscillators and, consequently, a different internal
state wins. Thanks to the recruitment mechanism, the third architecture has the
best performances: 100% with all tested sequences. There are no ambiguous
states or internal states. The time variation has no effect on the performances
of the model.
Fig. 5. Percentage of sequences correctly reproduced (0 to 100) as a function of sequence
length (2 to 8 states). C1: first architecture: simple sequences learning. The results are the same with
time variations of 0%, 5% and 10%. C2: second architecture: complex sequences
learning with a competitive mechanism and a time variation of 5%. C3: second architecture:
complex sequences learning with a competitive mechanism and a time variation of 10%.
C4: current architecture: complex sequences learning with an associative mechanism.
The results are the same with timing variations of 0%, 5% and 10%.
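The failure of the first architecture on repeated states can be illustrated at a purely symbolic level. The sketch below is an abstraction, not the neural model: it replaces the CA3 associations with a table of observed transitions, and it stands in for the recruited internal states by tagging each input with its occurrence number (B1, B2, B3). Both choices and all names are assumptions made for illustration.

```python
from collections import defaultdict


def learn_transitions(states):
    """Map each state to the set of states observed right after it."""
    table = defaultdict(set)
    for current, following in zip(states, states[1:]):
        table[current].add(following)
    return dict(table)


def with_internal_states(sequence):
    """Tag each input with its occurrence count, e.g. B -> B1, B2, B3."""
    seen = defaultdict(int)
    tagged = []
    for state in sequence:
        seen[state] += 1
        tagged.append(f"{state}{seen[state]}")
    return tagged


sequence = list("DBCBACAB")

# Observable states only: a repeated state predicts several successors.
observable = learn_transitions(sequence)
ambiguous = {s for s, nxt in observable.items() if len(nxt) > 1}

# One internal state per occurrence: every transition becomes unique.
internal = learn_transitions(with_internal_states(sequence))
still_ambiguous = {s for s, nxt in internal.items() if len(nxt) > 1}
```

For DBCBACAB, the states A, B and C each predict two successors when only the observable state is used, while no ambiguity remains once each occurrence has its own internal state.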
4 ROBOTIC APPLICATION
Fig. 6. A. Representation of the desired sequence. It begins from the start point. B. We
manipulate Aibo passively. It learns the succession of orientations of the movement
from the motor information of its front left leg.
The robot used in our experiments is an Aibo ERS7 (Sony). In our
application, we use only the front left leg, in a passive movement mode, to learn a
sequence of gestures. The sequence to be learned and reproduced is shown in Fig.
6.A. In this application, we test the third architecture previously described.
During learning, we manipulate the front left leg of the robot passively (Fig.
6.B). During the execution of the movement, the neural network learns online
and in one shot the succession of the joint orientations thanks to the motor
feedback information of its leg (proprioceptive signal). Hence, the inputs of our
model are the orientations/angles of the leg. The motor information recorded
while learning is shown in Fig. 7, X-learning (horizontal movements) and Fig.
7, Y-learning (vertical movements).
To initiate the reproduction of the sequence by the robot, we give the first state
of the sequence (down). As Aibo cannot be manipulated when motors are
activated, we send the command directly to the robot. Next, Aibo plays the
sequence autonomously (Fig. 7, top). With the starting state, our model predicts
[Figure 7 panels: X-learning, Y-learning, X-reproduction, Y-reproduction (angle versus time); Learnt gesture; Reproduced gesture]
Fig. 7. Top: Aibo reproduces the learnt sequence. Middle: X-learning and Y-learning
are respectively the horizontal and the vertical motor information while the robot learns
the sequence. X-reproduction and Y-reproduction are respectively the horizontal and
the vertical motor information during the reproduction of the sequence. In the
Y-reproduction plot, the first movement is not reproduced (not predicted) but given by the
user in order to trigger the recall: it is our bootstrap state to start the sequence. X-axes
are the time and Y-axes are the angles of the motors.
5 CONCLUSIONS AND DISCUSSIONS
We have proposed a neural network model for the learning of complex temporal
sequences. This model introduces an associative mechanism taking advantage
of a diversity generator composed of oscillators. These oscillators are based on
coupled CTRNNs. This model is efficient in the frame of autonomous robotics
and succeeds in learning in one shot the timing of sequences of gestures.
During the robotics application, we noticed that the robot reproduces the
sequence with a different amplitude of movement. This effect comes from
the speed of the displacement of the leg of Aibo. In our application, the speed of
the reproduction is a predefined constant different from the user's dynamics during
learning. The rhythm of the sequence is respected thanks to a temporal group of
neurons. A possible improvement would be to add a model like CPG [5] for each of the
movements (up, down, left and right) composing sequences with variable
speeds. In our model, the number of neurons coding the associations between
the inputs and the oscillators represents the size of the short-term memory. In
our simulations and application, the sequences learnt do not saturate the memory.
It would be interesting to analyze the behavior of the neural network with
longer sequences, and to test the limitations of the system when the neural limit
has been reached by the recruitment mechanism. In the present system, it would
mean that the already recruited neurons could be erased in order to encode new