Journal of Neuroscience Methods
journal homepage: www.elsevier.com/locate/jneumeth
Decoding intracranial EEG data with multiple kernel learning method
Jessica Schrouff a,b, Janaina Mourão-Miranda b, Christophe Phillips c, Josef Parvizi a,∗
a Laboratory of Behavioral & Cognitive Neuroscience, Stanford Human Intracranial Cognitive Electrophysiology Program (SHICEP), Department of Neurology & Neurological Sciences, Stanford University, Stanford, CA, USA
b Department of Computer Science, University College London, United Kingdom
c Cyclotron Research Centre, University of Liège, Belgium
Highlights
• Multiple Kernel Learning (MKL) can explore multiple dimensions simultaneously.
• MKL method is sparse and performs feature selection during modeling of the data.
• We propose to use the MKL method to analyze electrophysiology data (EEG or MEG).
• We provide a prototype example of how MKL method can apply to ECoG data.
• We show multiple dimensions of ECoG signal can contribute to numerical processing.
Article info
Article history:
Received 15 August 2015
Received in revised form 23 October 2015
Accepted 30 November 2015
Available online 12 December 2015
Keywords: Multiple kernel learning; Intracranial EEG; Machine learning; Feature selection
Abstract
Background: Machine learning models have been successfully applied to neuroimaging data to make predictions about behavioral and cognitive states of interest. While these multivariate methods have greatly advanced the field of neuroimaging, their application to electrophysiological data has been less common, especially in the analysis of human intracranial electroencephalography (iEEG, also known as electrocorticography or ECoG) data, which contains a rich spectrum of signals recorded from a relatively high number of recording sites.
New method: In the present work, we introduce a novel approach to determine the contribution of different bandwidths of EEG signal in different recording sites across different experimental conditions using the Multiple Kernel Learning (MKL) method.
Comparison with existing method: To validate and compare the usefulness of our approach, we applied this method to an ECoG dataset that was previously analysed and published with univariate methods.
Results: Our findings proved the usefulness of the MKL method in detecting changes in the power of various frequency bands during a given task and selecting automatically the most contributory signal in the most contributory site(s) of recording.
Conclusions: With a single computation, the contribution of each frequency band in each recording site in the estimated multivariate model can be highlighted, which then allows formulation of hypotheses that can be tested a posteriori with univariate methods if needed.
© 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
1. Introduction
Intracranial EEG (a.k.a. electrocorticography, ECoG) is the recording of the brain's electrical activity with intracranial sensors. Such signals contain information distributed over multiple dimensions, i.e., the spatial (recording sites), the temporal (sampling rate on the order of 1000 samples/s) and frequency domains (0.1–300 Hz). In the current practice, ECoG signals are primarily analyzed using univariate methods, e.g., to detect changes in the power of the electrophysiological activity during a given task, in a specific recording site of interest and bandwidth, which is often the high-frequency broadband (HFB, 50–200 Hz). A primary motivation for the univariate analysis of HFB in a specific brain region is the extant evidence that HFB provides a precise measure of local cortical engagement during a given task (Chang et al., 2010; Flinker et al., 2011; Manning et al., 2009; Mukamel, 2005; Nir et al., 2007; Pasley et al., 2012; Ray et al., 2008; Ray and Maunsell, 2011). Although these univariate studies have a crucial role in testing hypotheses about the local engagement of a specific cortical region in a given task, a more sensitive, data-driven multivariate approach is needed
∗ Corresponding author.
E-mail addresses: jparvizi@stanford.edu, jschrouf@stanford.edu (J. Parvizi).
http://dx.doi.org/10.1016/j.jneumeth.2015.11.028
univariate analyses. They have already proven useful when investigating local field potentials and spikes in both monkeys and humans (e.g., Meyers et al., 2008, 2012; Quian Quiroga and Panzeri, 2009; Zhang et al., 2011b). However, model interpretation might be complex as the obtained model parameters cannot be thresholded based on their amplitude/variance ratio.
In our current work, we present a Multiple Kernel Learning (MKL) approach to facilitate machine learning modeling of ECoG data. The MKL method simultaneously learns and combines different models, represented by different kernels (see Gönen and Alpaydin, 2011 for a review). MKL approaches have been previously applied to neuroimaging data, especially for combination of signals from different modalities (e.g., Dai et al., 2012; Filipovych et al., 2011; Filippone et al., 2012; Hinrichs et al., 2011; Zhang et al., 2011a). Furthermore, the contribution of each kernel to the final model can be constrained to be sparse (Rakotomamonjy et al., 2008). In the present work, each kernel is built from different features of the ECoG data, i.e., the power of activity in specific bandwidths of ECoG signal at each recording site. MKL is therefore used here as a tool for feature selection.
As a proof of concept study, we examined the feasibility of the MKL approach by applying this method to previously published human electrophysiological data (Dastjerdi et al., 2013; Foster et al., 2012). Here we show that with a single computation, the MKL model is able to highlight the contribution of each frequency band at each recording site, which then can be used to formulate hypotheses to be tested a posteriori.
2. Materials and methods
The current work aims at demonstrating the feasibility of MKL techniques for the analysis of ECoG data. To this end, we considered data that has been previously analyzed with univariate methods (Dastjerdi et al., 2013; Foster et al., 2012). The results of these works showed (1) increased HFB power in the retrosplenial cortex (RSC) during episodic memory processing and inactivation (i.e., no changes in HFB power) of the same recording sites during numerical processing, and (2) increased HFB power in the intra-parietal sulcus (IPS) during numerical processing and deactivation (i.e., decreased HFB power) in the same recording site during episodic memory processing. The present study probes the same dataset in a more exploratory way by considering all recording sites and frequency bands during numerical and episodic memory processing in the same model.
2.1. Demographics and brain coverage
Three subjects were implanted with intracranial electrodes for presurgical epilepsy monitoring. All three patients suffered from medication resistant epilepsy for which they were implanted with intracranial electrodes for invasive video EEG monitoring. The procedure was approved by the Stanford Institutional Review Board (IRB) and the subjects provided written informed consent to participate in the study. The location of the grids was determined by clinical needs, and the seizure foci were found to be in the right
team – were also discarded from further analyses. This last step was performed to ensure that non-pathological neuronal activity was used to investigate our cognitive neuroscience question.
2.2. Anatomical localization of electrodes
Post-implant CT images were aligned to the pre-op MRI anatomical brain volume (Hermes et al., 2010). Electrodes were visualized on the subject's own brain volume and reconstructed 3D cortical surface, allowing for accurate anatomical localization of electrodes.
2.3. ECoG experiment
Data were recorded while the patients performed simple true/false judgments of memory sentences or mathematical equations (Fig. 1A). The memory sentences comprised self-episodic (e.g., “I ate pizza this week”), self-semantic (e.g., “I eat pizza often”) and self-judgment (e.g., “I am a curious person”) statements. Basic mathematical additions were presented along with a result (e.g., “4 + 49 = 53”, further referred to as ‘Math’ condition). Interleaved across trials were 5 s cued-rest periods, during which a centered cross sign was displayed on the screen and patients were instructed to fixate and rest. The experiment comprised 96 randomized trials of each condition (except for rest, ∼66 trials) and was divided into two sessions administered the same day (P1) or on successive days (P2, P3).
2.4. Preprocessing
All preprocessing steps were performed using Matlab (www.mathworks.com) and SPM (www.fil.ion.ucl.ac.uk/spm). The continuous time series of the two experimental sessions were first re-referenced to the average of the signals over all selected electrodes (considering each session separately). The data was then filtered for power-line noise (60 Hz and its harmonics), and down-sampled to 436 Hz.
2.5. Definition of events
Due to the self-paced nature of the experiment, and thus high variability in the reaction times (RT) across events and conditions, we decided to extract the signal as contiguous epochs of 1 s, considering each event onset as an integer number of seconds. As illustrated in Fig. 1B, each 1 s window was associated with a binary label (‘Math’ or ‘Non-math’) if a stimulus was the only stimulus present during the selected 1 s window. For this, we rounded each event onset to the closest integer (in s), and labeled the 1 s epochs corresponding to the duration of the event (as defined by the RT, rounded toward negative infinity) with the same category as the event. Each 1 s epoch was then considered as an individual trial. The four conditions of self-episodic, self-judgment, self-semantic and rest were pooled together as “non-math” (as in Dastjerdi et al., 2013) and an equal number of “math” and “non-math” events was chosen in order to obtain balanced training
Fig. 1. Experimental design and feature extraction. (A) Examples of stimuli of each category. The patient has to provide ‘true’ or ‘false’ judgments on the presented stimulus, except for the ‘Rest’ condition (fixation cross, presented for 5 s). The number of trials per condition is 48 in each run (except for ‘Rest’, 33 trials). In the present work, ‘Math’ is the condition of interest. The four other conditions (namely self-episodic, self-semantic, self-judgment and rest) are pooled together to form the ‘Non-math’ category. (B) Definition of “Events”: The event onset (displayed by vertical arrows) represents the beginning of visual stimulus presentation. Each stimulus is displayed on the computer screen until the participant presses a button (1 for ‘true’ or 2 for ‘false’), in a self-paced manner with varying lengths of reaction time (RT). The next stimulus is then displayed, after an inter-stimulus interval (ISI) of 200 ms. ‘Rest’ condition had fixed intervals and the next stimulus appeared without the participant pressing any buttons. In this work, the continuous time series was divided into 1 s windows. Each window was associated with a label (‘Math’ or ‘Non-math’) if a stimulus was the only stimulus present during the selected 1 s window (bottom line, labels). To avoid including non-task related signal, the event onset was rounded toward the nearest integer and the event duration (here corresponding to the RT) was rounded toward negative infinity (i.e., Matlab ‘floor’).
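The labeling rule of Section 2.5 and Fig. 1B can be sketched in a few lines of Python. This is an illustration only, not the authors' Matlab code: the function name `label_epochs` and the (onset, RT, label) tuple layout are assumptions, and windows touched by more than one stimulus are left unlabeled as one reading of the "only stimulus present" rule.

```python
import math

def label_epochs(events, n_seconds):
    """Assign a label (or None) to each 1 s window of a recording.

    events: list of (onset_s, rt_s, label) tuples (hypothetical layout).
    """
    labels = [None] * n_seconds
    claimed = [0] * n_seconds  # number of events covering each window
    for onset, rt, label in events:
        start = math.floor(onset + 0.5)  # onset rounded to the nearest integer second
        n_epochs = math.floor(rt)        # duration rounded toward negative infinity
        for t in range(start, min(start + n_epochs, n_seconds)):
            claimed[t] += 1
            labels[t] = label
    # Keep a label only where exactly one stimulus covers the window.
    return [lab if c == 1 else None for lab, c in zip(labels, claimed)]

# Toy example: a 2.7 s 'Math' event followed by a 1.4 s 'Non-math' event.
epochs = label_epochs([(0.3, 2.7, "Math"), (3.2, 1.4, "Non-math")], n_seconds=6)
```

In this toy example the first event yields two 'Math' epochs and the second a single 'Non-math' epoch; the remaining windows stay unlabeled.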
2.6. Feature extraction
Each event was extracted (i.e., epoched) in the −200 ms to 1200 ms time window around its onset (in integer seconds, as defined above). No baseline correction or artifact rejection was performed such that the temporal correlation between epochs from the same event was maintained. A time-frequency decomposition was then computed (Morlet wavelets, 7 wavelets). The frequencies of interest were log-spaced between 1 and 110 Hz (29 values in total). The resulting decomposition was rescaled (point-wise) by the logarithm of its value. The instantaneous power of the signal in the 0–1000 ms time-window was then computed in each of the following frequency bands: δ (1–4 Hz), θ (4–8 Hz), α (8–12 Hz), β (15–25 Hz), low-γ (30–55 Hz) and a narrow band of HFB, high-γ (70–110 Hz, to avoid residual line noise), by averaging the frequency bins within those bands. Fig. 2 (top) displays such features averaged across categories (“math” in dashed green and “non-math” in grey) in the high-γ band for electrodes 1, 18, 27 and 40 in P1.
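The band-averaging step can be illustrated with a short Python sketch. The band limits and the 29 log-spaced frequencies come from the text; the function names, the dictionary keys and the inclusive treatment of band edges are assumptions made for the example, and the wavelet decomposition itself (performed in SPM in the original work) is not reproduced.

```python
# Frequency bands (Hz) as listed in the text; the key names are illustrative.
BANDS = {
    "delta": (1, 4), "theta": (4, 8), "alpha": (8, 12),
    "beta": (15, 25), "low_gamma": (30, 55), "high_gamma": (70, 110),
}

def log_spaced(f_min, f_max, n):
    """Return n log-spaced frequencies from f_min to f_max (both included)."""
    return [f_min * (f_max / f_min) ** (k / (n - 1)) for k in range(n)]

def band_average(log_power, freqs, band):
    """Average the rows of a frequency-by-time array whose frequency lies in the band.

    Band edges are treated as inclusive here, which is an assumption.
    """
    lo, hi = band
    rows = [row for f, row in zip(freqs, log_power) if lo <= f <= hi]
    if not rows:
        raise ValueError("no frequency bin falls inside the band")
    return [sum(col) / len(rows) for col in zip(*rows)]

freqs = log_spaced(1.0, 110.0, 29)
# Toy log-power array: row i is constant and equal to i (3 time points).
toy = [[float(i)] * 3 for i in range(len(freqs))]
alpha_course = band_average(toy, freqs, BANDS["alpha"])
```

With real data, each row of `log_power` would be the log-rescaled wavelet power at one frequency, and the output would be the time course of one band on one electrode.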
2.7. Kernels
The considered algorithm uses kernels as inputs. Kernels are matrices of size n × n (n being the number of samples/trials, here epochs), representing the pair-wise similarity between samples/trials. In the present work, linear kernels (i.e., dot product) were built for each electrode and each frequency band, as illustrated in Fig. 2. Thus m kernels were computed based on the averaged power of the signal within a specific frequency band on each channel, with m being the number of electrodes. The m × 6 kernels were then concatenated, to estimate what is further referred to as the ‘full’ model. To ensure that the scale of each kernel does not play a role in the modeling step, all kernels were mean-centered and normalized before entering the classification algorithm (by taking into account the train/test separation, see Section 2.9).
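As a rough illustration of the kernel construction, the Python sketch below builds a linear kernel from a trials-by-features array, centers it in feature space and rescales it. The text does not specify the normalization, so dividing by the mean diagonal entry is an assumption, as are all function names; the sketch also handles a training kernel only and ignores the train/test separation mentioned above.

```python
def linear_kernel(X):
    """Pair-wise dot products between the rows (trials) of X."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]

def center_kernel(K):
    """Center a kernel in feature space (same as removing the mean feature vector)."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]
    total_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - row_mean[j] + total_mean for j in range(n)]
            for i in range(n)]

def normalize_kernel(K):
    """Scale K so that its mean diagonal entry equals 1 (assumed normalization)."""
    n = len(K)
    scale = sum(K[i][i] for i in range(n)) / n
    return [[v / scale for v in row] for row in K]

# Toy example: 3 trials, 2 features each.
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
K = normalize_kernel(center_kernel(linear_kernel(X)))
```

After these two steps every kernel has the same overall scale, so a kernel cannot dominate the model merely because its features have larger amplitudes.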
2.8. Machine learning modeling
All machine learning modeling steps were performed based on our software PRoNTo (Schrouff et al., 2013, www.mlnl.cs.ucl.ac.uk/pronto), which has been adapted to analyze electrophysiological data in the SPM MEEG format. The built kernels were considered for MKL modeling using the “simpleMKL” version of Rakotomamonjy et al. (2008) (illustrated in Fig. 2). This algorithm uses a Support Vector Machine (SVM; Cortes and Vapnik, 1995) to define a decision boundary (f_m), discriminating between “math” and “non-math” per kernel. To determine each decision boundary, model parameters (w_m) are optimized. Those different decision boundaries (one per kernel) are then weighted (by a parameter d_m) to define a global decision boundary f. These two steps are implemented into a recursive optimization procedure, and hence both the weights w_m and the kernel contributions d_m depend on all the features. The regularization constraints in the considered algorithm lead to a sparse selection of non-null contributions (d_m) to the final decision function, i.e., only some kernels will have a non-null contribution to the decision function. The present model can be seen as being ‘hierarchical’, i.e., for each kernel it is possible to compute the weight, w_tm, of each feature t in f_m (here 436 features, as for ‘classical’ machine learning techniques), and each kernel is weighted by a contribution d_m. Information on the contribution of each feature is hence available. However, interpreting the contribution of the kernels might be easier as each kernel is built on specific dimensions, i.e., MKL ‘summarizes’ the contributions over the chosen dimension by automatically selecting all the features from this dimension with the same contribution d_m (but not with the same weight w_tm). It is important to note that if those contributions d_m were equal for all kernels, the MKL technique would correspond to an SVM model (with model parameters w) considering all the features as concatenated.
In the present case, each kernel represents the power of the signal in one electrode, averaged within a frequency band (δ, θ, α, β, low-γ or high-γ). For each patient, the full model was estimated (m × 6 kernels). In addition, for comparisons with previous univariate results (Dastjerdi et al., 2013; Foster et al., 2012), an MKL model was estimated for each frequency band (i.e., with m kernels).
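The remark that equal contributions d_m reduce MKL to an SVM on the concatenated features can be checked numerically on the kernels themselves. The Python sketch below assumes the usual simpleMKL form of the combined kernel, K = Σ_m d_m K_m with d_m ≥ 0 and Σ_m d_m = 1; the function names and toy data are illustrative, and no SVM is trained.

```python
def linear_kernel(X):
    """Pair-wise dot products between the rows (trials) of X."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]

def combine_kernels(kernels, d):
    """Weighted sum of kernels with non-negative weights summing to 1."""
    if any(w < 0 for w in d) or abs(sum(d) - 1.0) > 1e-9:
        raise ValueError("contributions must be non-negative and sum to 1")
    n = len(kernels[0])
    return [[sum(w * K[i][j] for w, K in zip(d, kernels)) for j in range(n)]
            for i in range(n)]

# Two toy feature blocks (e.g., two electrodes) for the same two trials.
X1 = [[1.0, 0.0], [0.0, 1.0]]
X2 = [[2.0], [3.0]]
K_equal = combine_kernels([linear_kernel(X1), linear_kernel(X2)], [0.5, 0.5])

# Linear kernel on the concatenated features, rescaled by 1/M (M = 2 kernels).
X_cat = [a + b for a, b in zip(X1, X2)]
K_cat = [[v / 2 for v in row] for row in linear_kernel(X_cat)]
```

Here `K_equal` and `K_cat` coincide, so with equal weights the combined kernel is, up to a constant factor, the linear kernel of the concatenated feature vector. Sparse, unequal d_m are what make the MKL model act as a feature selector.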
2.9. Cross-validation and accuracy
We assessed the performance of the model on the experimental condition by using a 10-fold cross-validation scheme: the model
Fig. 2. Illustration of the multiple kernel learning (MKL) approach considered. For each electrode, the time course of the power in a specific frequency band (here high-γ) is extracted for each trial in the [0 1000] ms window around onset. The top row of the figure displays such features averaged across math (dashed green) and non-math (gray) trials (patient P1, session 1), with normalized standard error (shaded areas). From those features, a linear kernel is built for each electrode, as illustrated in the middle row of the figure (trials sorted by category, with Math trials in the top left corner). This matrix is symmetric and displays whether trials from one category are more similar to one another than to trials from another category (as is the case for kernels K18 and K27 but not for kernels K1 and K40). For each kernel, the model estimates a decision function f_m (m = 1...M, M being the number of electrodes) which is then weighted according to a positive or null contribution d_m. The final decision function is the linear combination of the different decision functions estimated on each electrode. The contributions of all electrodes must satisfy the constraints of summing to 1, and being positive or null, which leads to a sparse selection of electrodes contributing to the final model. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article).
and used to predict the labels of the 10% events left out. The predictions were then compared with the labels of those left out events to compute the accuracy for each category (“math” accuracy and “non-math” accuracy, a.k.a. the sensitivity), the balanced accuracy (average of the class accuracies) and the positive predictive values for each category (representing the specificity). This process was repeated 10 times, with a different split of 10% of the data left out each time. Each split of the data is further referred to as a ‘fold’. In this work, the accuracy presented was obtained by averaging the values over all folds.
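The performance measures named above follow directly from their definitions, as in this minimal Python sketch. The function name, the label strings and the returned dictionary layout are assumptions for illustration; in the study these values are computed per fold and then averaged.

```python
def classification_metrics(y_true, y_pred, classes=("math", "non-math")):
    """Class accuracy, positive predictive value and balanced accuracy."""
    out = {}
    for c in classes:
        n_true = sum(1 for t in y_true if t == c)
        n_pred = sum(1 for p in y_pred if p == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        out[c] = {
            # Fraction of events of class c that were predicted as c.
            "accuracy": hits / n_true,
            # Fraction of events predicted as c that truly belong to c.
            "ppv": hits / n_pred if n_pred else float("nan"),
        }
    out["balanced_accuracy"] = sum(out[c]["accuracy"] for c in classes) / len(classes)
    return out

# Toy fold: 4 'math' and 4 'non-math' events, one error in each class.
y_true = ["math"] * 4 + ["non-math"] * 4
y_pred = ["math", "math", "math", "non-math", "math", "non-math", "non-math", "non-math"]
metrics = classification_metrics(y_true, y_pred)
```

With balanced classes, as in this toy fold, the balanced accuracy equals the overall proportion of correct predictions.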
The considered algorithm is based on SVM models, which include a soft-margin parameter C. This hyperparameter penalizes more (large values of C) or less (small values of C) mis-classifications during training and affects the resulting decision boundary f. This is particularly the case in the present work since each kernel is based on 436 features (i.e., 1 electrode × 1 s time-window sampled at 436 Hz × averaged frequency band), and the number of events available for training is in the same order (368 for P1, 507 for P2 and 369 for P3). We therefore cannot expect the categories to be linearly separable as they would in a high-dimensional space. Values of C ranging from 0.01 to 1000 (C = 10^i, i = −2, −1, ..., 3) were hence considered. A nested cross-validation was performed when modeling the experimental conditions: the inner cross-validation selected the value of C leading to the highest model performance and the outer cross-validation estimated the performance on a test set using the selected value of C.
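The nested scheme can be outlined as follows. This Python skeleton is a sketch under several assumptions: the callback `fit_score(train_idx, test_idx, C)` is hypothetical and stands for training the model and returning its performance, the folds are interleaved, and the inner loop also uses 10 folds; the text specifies none of these details.

```python
def c_grid():
    """Candidate soft-margin values C = 10^i for i = -2, ..., 3."""
    return [10.0 ** i for i in range(-2, 4)]

def k_folds(indices, k):
    """Split a list of indices into k interleaved folds (an assumed scheme)."""
    return [indices[i::k] for i in range(k)]

def nested_cv(n_samples, fit_score, k_outer=10, k_inner=10):
    """Mean outer-fold score, with C chosen by an inner cross-validation."""
    idx = list(range(n_samples))
    outer_scores = []
    for test in k_folds(idx, k_outer):
        held_out = set(test)
        train = [i for i in idx if i not in held_out]

        def inner_score(C):
            scores = []
            for val in k_folds(train, k_inner):
                val_set = set(val)
                fit = [i for i in train if i not in val_set]
                scores.append(fit_score(fit, val, C))
            return sum(scores) / len(scores)

        best_C = max(c_grid(), key=inner_score)  # selected on training data only
        outer_scores.append(fit_score(train, test, best_C))
    return sum(outer_scores) / len(outer_scores)

# Dummy scorer used only to exercise the skeleton: it prefers C = 1.
def dummy_fit_score(train_idx, test_idx, C):
    return 1.0 if C == 1.0 else 0.5

mean_score = nested_cv(20, dummy_fit_score, k_outer=5, k_inner=2)
```

The point of the nesting is that the test events of each outer fold never take part in the choice of C, so the reported performance is not biased by hyperparameter selection.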
2.10. Statistical significance
The independence between the train and test sets is compromised due to temporal autocorrelation in the considered trials. In addition, different folds of the cross-validation are estimated based on overlapping training data (here as much as 80% of the data is shared between two folds). This hence precludes the use of any parametric tests such as binomial tests (Noirhomme et al., 2014; Pereira et al., 2009) to assess the statistical significance of our results. The significance of the performance of each model was assessed using 1000 permutations of the training labels. Results associated with a p-value smaller than 0.05 were reported as significant. Balanced accuracy values were considered as significant when both the classification accuracy for the “math” and “non-math” conditions were significant.
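A permutation test of this kind can be sketched in Python as below. The text does not give the exact p-value formula, so the estimator used here, which counts the observed score among the permutations (adding 1 to numerator and denominator), is an assumption, as are the function names; retraining the model on each permuted label set is left out.

```python
import random

def permuted(labels, rng):
    """Return a shuffled copy of the training labels."""
    shuffled = list(labels)
    rng.shuffle(shuffled)
    return shuffled

def permutation_p_value(observed, null_scores):
    """Share of permutation scores at least as high as the observed score.

    The +1 terms count the observed score itself (an assumed convention).
    """
    n_as_good = sum(1 for s in null_scores if s >= observed)
    return (n_as_good + 1) / (len(null_scores) + 1)

rng = random.Random(0)
labels = ["math"] * 5 + ["non-math"] * 5
shuffled_labels = permuted(labels, rng)

# Toy null distribution: 1000 permutation accuracies, one of them above the observed 0.90.
null = [0.5] * 999 + [0.95]
p = permutation_p_value(0.90, null)
```

Each entry of `null` would in practice be the cross-validated accuracy of a model trained on one permuted label set; the result is then compared with the 0.05 threshold.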
2.11. Localization of the kernel contributions
Fig. 3 illustrates the outputs of the MKL model in the high-γ band for patient P1. Fig. 3A displays the kernel (here channel) contributions d_m, while Fig. 3B displays the weights w_m for one channel in this frequency band.
2.11.1. Sparseness of the results
As previously mentioned, the considered algorithm constrained
Fig. 3. Illustration of the outputs of the MKL model. (A) Parietal view of the surface of the cortex of patient P1. Electrodes are displayed as circles on the cortex (not to scale) with a fill color corresponding to their contribution to the final model (d_m, in %). (B) Weights w_m for the channel circled in black in panel A. These weights display the contribution of each time point from that channel to the MKL model. As the model is not sparse in terms of the w_m, all the time points considered for modeling contribute to the final decision function.
each fold, a certain number of kernels – here corresponding to a combination of frequency and electrode (i.e., recording site) – had a non-null contribution to the final decision boundary. Because of the cross-validation, the contributions of each kernel were averaged across all folds (here 10). We hence investigated the stability of d_m across folds by counting how many kernels were consistently selected in all folds or in 80% of the folds (i.e., 8 folds out of 10 in the considered cross-validation scheme).
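The stability counts can be obtained from a folds-by-kernels table of contributions, as in this Python sketch. The function name, the dictionary keys and the toy values are illustrative; a kernel is treated as selected in a fold when its d_m is strictly positive.

```python
def selection_stability(d_by_fold):
    """Count kernels selected in at least one fold, at least 80% of folds, and all folds.

    d_by_fold: list of folds, each a list of kernel contributions d_m.
    """
    n_folds = len(d_by_fold)
    n_kernels = len(d_by_fold[0])
    counts = [sum(1 for fold in d_by_fold if fold[m] > 0) for m in range(n_kernels)]
    mean_d = [sum(fold[m] for fold in d_by_fold) / n_folds for m in range(n_kernels)]
    return {
        "at_least_one_fold": sum(1 for c in counts if c >= 1),
        # Integer arithmetic avoids floating-point issues at the 80% boundary.
        "at_least_80_percent": sum(1 for c in counts if 10 * c >= 8 * n_folds),
        "all_folds": sum(1 for c in counts if c == n_folds),
        "mean_d": mean_d,
    }

# Toy example: 5 folds, 3 kernels; the contributions sum to 1 within each fold.
folds = [
    [0.7, 0.3, 0.0],
    [0.6, 0.4, 0.0],
    [0.5, 0.5, 0.0],
    [0.8, 0.2, 0.0],
    [0.9, 0.0, 0.1],
]
stability = selection_stability(folds)
```

In the toy table the first kernel is selected in every fold, the second in 4 of 5 folds and the third in a single fold, which mirrors the three rows reported for each patient.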
2.11.2. Anatomical localization
In each frequency band, we looked at the contribution of each electrode (d_m) to the model. This allowed localizing the information leading to the discrimination of “math” versus “non-math” in terms of recording site, i.e., anatomical position. For illustration, the contribution of electrodes (averaged across folds) was projected on the patient's 3D brain surface by color-coding the circle representing the anatomical position of each electrode. Fig. 3A displays the parietal view of the cortex of Patient P1, with channels with a white fill having a perfectly null contribution to the model and channels with a pink to purple fill contributing to the MKL model.
2.11.3. Frequency localization
The full model allowed combining the information from different electrodes and different frequency bands, which helped us investigate whether the discriminating signal is localized in terms of frequency or, on the contrary, distributed across frequency bands. To this end, the d_m values corresponding to each frequency band were summed across electrodes, leading to a weight value per frequency band. These are represented on bar graphs for each patient.
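Summing contributions per band is a simple grouping operation, sketched below in Python on the ten highest contributions reported for patient P1 in Table 3. The dictionary layout and function name are assumptions, the band of electrode 20 is read as θ (its symbol is lost in the extracted table), and because only ten kernels are included the totals are partial sums rather than the full per-band contributions.

```python
from collections import defaultdict

def band_totals(contributions):
    """Sum kernel contributions d_m (in %) over electrodes, per frequency band."""
    totals = defaultdict(float)
    for (band, _electrode), d in contributions.items():
        totals[band] += d
    return dict(totals)

# Top-10 (band, electrode) -> d_m (%) for patient P1, taken from Table 3.
p1_top10 = {
    ("high-gamma", 27): 17.14, ("low-gamma", 27): 3.81,
    ("high-gamma", 18): 3.47, ("high-gamma", 32): 2.54,
    ("beta", 14): 2.38, ("alpha", 17): 2.04,
    ("low-gamma", 13): 1.98, ("high-gamma", 4): 1.92,
    ("low-gamma", 15): 1.68, ("theta", 20): 1.47,
}
totals = band_totals(p1_top10)
```

The four high-γ kernels in this list already account for roughly 25% of the model, which is compatible with the 39.7% reported for the whole high-γ band in P1 once the remaining kernels are added.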
2.11.4. Temporal localization
As mentioned in the previous section, it would be possible to also investigate the weights w_tm for each time point of the time course of the power in each channel and frequency band pair. Those weights are displayed in Fig. 3B for one channel/frequency band combination (channel circled in black in Fig. 3A). As interpreting such weight values is complex and has recently raised various issues (see e.g., Haynes, 2015 for a discussion on the topic), we have chosen to focus our interpretation of the results on the kernel contributions d_m.
3. Results
3.1. Features and kernels
Each participating patient was implanted with a different number of electrodes. After discarding electrodes containing noisy or pathological signals, different numbers of channels were selected for P1 (n = 40), P2 (n = 102) and P3 (n = 97). Hence, for each of the six frequency bands, 40, 102 and 97 kernels were computed for P1, P2 and P3, respectively. Therefore, the full model comprised 40 × 6 = 240 kernels for P1, 612 for P2 and 582 for P3.
3.2. Extraction of events
Regarding the two sessions of experimental condition, 184 math events were extracted for P1, 255 for P2 and 185 for P3. Numbers differ across participants because different numbers of events were rejected due to the presence of artifacts or epileptic activity in the signal. They were balanced with an equal number of epochs from each of the four other non-math conditions (self-episodic, self-judgment, self-semantic and rest conditions, random selection).
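The event counts above can be reconciled with the totals given in Section 2.9 (368, 507 and 369 events) by a short calculation. The Python sketch assumes that the same whole number of epochs, floor(n_math / 4), is drawn from each of the four non-math conditions; this rule is an inference that reproduces the reported totals, not a statement from the text.

```python
def balanced_counts(n_math, n_other_conditions=4):
    """Return (math, non-math, total) event counts under equal sampling per condition."""
    per_condition = n_math // n_other_conditions  # whole epochs per non-math condition
    n_non_math = per_condition * n_other_conditions
    return n_math, n_non_math, n_math + n_non_math

counts = {patient: balanced_counts(n) for patient, n in
          {"P1": 184, "P2": 255, "P3": 185}.items()}
```

For P2, for example, 63 epochs per non-math condition give 252 non-math events, hence 507 events in total, so the two classes are balanced only approximately when the number of math events is not a multiple of four.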
3.3. Model performance
Balanced and class accuracies estimated from the full model are displayed for each subject in Table 1. The model performance for each frequency band is also displayed for comparison with the univariate results (Dastjerdi et al., 2013).
For all subjects, the 6 frequency bands led to significant classification of math versus non-math. Classification accuracy was highest in the high-γ band for patients P1 (88.92%) and P2 (88.37%) while it was highest in the α band for patient P3 (75.03%). In addition, the full model showed similar accuracy to the accuracy obtained from the best band for each patient: P1 = 89.51%, P2 = 87.18% and P3 = 76.80%.
3.4. Localization of discriminating signal
3.4.1. Sparseness of the results
The number of kernels with non-null contributions across all folds is presented in Table 2. The numbers of kernels consistently selected in 80% or in 100% of the folds are also displayed. Fig. 4A shows a histogram of the kernel contributions (averaged across folds), sorted in descending order, for each patient.
Table 2 shows that the results are sparse, but highly dependent on the training data, which is assumed to slightly vary from one fold of the cross-validation to another.
3.4.2. Anatomical localization
Figs. 5, 6 and 7 display the contribution of each kernel (i.e., each electrode in each of the considered frequency bands) for each patient, respectively. The electrodes are represented by circles (not to scale) on the patient's cortex, color-coded based on their contribution to the full model (average across folds). Table 3 displays the 10 electrode–frequency band combinations with the highest contribution to the full model for each patient.
For patient P1 (Fig. 5), electrode 27 had the largest contribution
Table 2
Number of kernels with a non-null contribution to the full model, for each patient. The percentage of channels selected in the model compared to the total number of channels for each patient is displayed in brackets.

Number of kernels selected in              Patient P1   Patient P2   Patient P3
At least one fold (average across folds)   180 (75%)    309 (50%)    344 (59%)
At least 80% of the folds                  59 (25%)     69 (11%)     25 (04%)
All folds                                  11 (05%)     36 (06%)     12 (02%)

Table 3
Top 10 ranking electrode-frequency band combinations for each patient, according to d_m averaged across folds.

Patient P1                  Patient P2                  Patient P3
Band     Elec   d_m (%)     Band     Elec   d_m (%)     Band     Elec   d_m (%)
high-γ   27     17.14       high-γ   97     03.40       high-γ   63     06.43
low-γ    27     03.81       β        74     03.07       α        23     05.88
high-γ   18     03.47       high-γ   83     02.89       θ        23     05.71
high-γ   32     02.54       high-γ   67     02.78       δ        44     04.00
β        14     02.38       high-γ   60     02.56       θ        65     03.29
α        17     02.04       high-γ   62     02.13       β        44     03.17
low-γ    13     01.98       α        63     02.09       θ        16     02.91
high-γ   04     01.92       high-γ   84     01.93       low-γ    63     02.27
low-γ    15     01.68       β        63     01.85       high-γ   69     01.71
θ        20     01.47       high-γ   11     01.80       δ        66     01.67
This channel is located in the IPS. The electrode with the third highest contribution is channel 18 (3.47%), which is located in the RSC. The IPS electrode site is precisely the one reported in our previous publication to be highly activated during the math condition (Dastjerdi et al., 2013) and the RSC is the same site as the one reported in our previous work to be highly deactivated during the math condition (Foster et al., 2012). For patient P2 (Fig. 6), the distribution of weights is less sparse than for patient P1. The electrode with the largest contribution was electrode 97 in the high-γ band (3.40%), located in the RSC (i.e., the same site as the one reported in our previous work to be highly deactivated during the math condition, Foster et al., 2012). Among the other top 10 ranking sites (using averaged d_m for ranking), the temporal cortex (electrodes 83, 67, 84) as well as occipital cortex (electrodes 60, 62 and 63) and IPS (electrode 11) sites were marked as contributory. For P3 (Fig. 7), several electrodes (according to averaged d_m) showed non-null contribution: 4 electrodes were in the lateral parietal cortex (highest: electrode 23, 5.88%, selected in the θ and α bands), as well as one electrode in the RSC (electrode 69, high-γ band). Non-null contributions were also found in the pre-supplementary motor region (electrode 63 in high-γ, 6.43%) and in the lateral parietal cortex (electrode 44 in δ and β).
3.4.3. Frequency localization
When computing the total contribution of each frequency band to the full model, we observed that all frequency bands contributed to the model (Fig. 4B). For patients P1 and P2, the high-γ band had the largest contribution (39.7% for P1 and 35.0% for P2). For patient P3, the high-γ band accounted for 22% of the weights, the θ band for 21% and the α band for 17%.
4. Discussion
In this work, we present a model based on multiple kernel learning to automatically select frequency and anatomical features discriminating between different experimental conditions. The proposed approach was data-driven and used the signal in three dimensions (i.e., evolution of the power in time and in each frequency band and in each electrode) to automatically determine which electrodes and/or frequency bands contained discriminating information about numerical processing. The MKL technique can hence be seen as an exploratory technique to investigate electrophysiological signal on multiple dimensions simultaneously.
4.1. Comparison of univariate and multivariate results
The performed multivariate analyses led to significant classification of Math versus Non-math trials, using all electrodes and frequency bands as inputs. The highest accuracy values ranged between 77% and 89%. These values can hardly be directly compared to the sensitivity and specificity obtained using univariate techniques (Dastjerdi et al., 2013) since we obtained one value per subject and not one value per electrode. This can be seen as an advantage (i.e., ‘summarizing’ the results for each subject) or as a disadvantage (i.e., lack of localized information). More generally, univariate methods are thought to be more specific (i.e., localized detection of (de)activations) but less sensitive (more difficult to detect significant changes) than multivariate techniques. This last observation is supported in the present study as significant discrimination between Math and Non-math trials was observed in all frequency bands, which was not detected in the work of
Fig. 4. Sparseness and frequency localization of the kernel contributions, for each patient. (A) Sorted histogram of kernel contributions (averaged across folds). The displayed values sum to 100%, as constrained by the considered machine learning algorithm. (B) Summed contribution of each frequency band to the full model (averaged across folds), in percent.
analyses investigate different aspects of the question of interest and should be used in conjunction rather than compared.
4.2. Information is distributed over frequency bands
When looking at the contribution of each frequency band to the full model, the results suggest that discriminating information is distributed across frequency bands. This is further supported by the fact that the full model may display even higher accuracy than a single frequency band even though the size of the input-space was multiplied by 6. In particular, the high-gamma, theta and alpha bands seemed to contain information about our variable of interest. This result is in line with previous modeling of ECoG data (van Gerven et al., 2013), which showed that ECoG signal had predictive power in a range of frequencies, especially in the 4–16 Hz and 65–128 Hz frequency bands.
In most of today's ECoG literature, there is a particular focus on the HFB. The present work shows that other frequency bands could also contain information about the variable of interest, either on their own or combined with other frequency bands (i.e., full
Fig. 5. Channel contribution in each frequency band for patient P1. Projection of kernel contribution onto the electrode anatomical position on the patient's cortex (3D mesh), for each frequency band (l-γ stands for low-γ and h-γ for high-γ). Each circle on the cortex represents the position of an electrode (not to scale). The contribution of the corresponding kernel to the full model is then color-coded (see color bars): a white fill represents an electrode with a 0% contribution and the highest contributions are displayed in purple. Electrodes displayed with a grey fill were not considered for modeling (noisy/pathological electrodes). The electrode number of some electrodes – with a contribution ranked in the top 10 – is displayed for cross-reference with the text. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article).
Fig. 6. Channel contribution in each frequency band for patient P2.
model). This might bring further insights on how the information is coded in the human brain and might have important implications on upcoming studies based on ECoG recordings.
4.3. Automatic selection of features
During its estimation, the model automatically selected which electrodes and/or frequency bands should have a non-null contribution toward better discrimination between conditions. This data-driven analysis allowed us to identify the sites within the covered cortical regions that had been identified previously with univariate results. For instance, our sites with the largest contributions had been shown previously as significantly deactivated or activated during the math or non-math conditions (i.e., the IPS activation during math trials and RSC activations and deactivations during non-math and math trials, respectively; Dastjerdi et al., 2013; Foster et al., 2012). In line with the current ECoG literature, the high-γ band led to the highest modeling accuracy and had the largest weight in the full model for patients P1 and P2, which is also in agreement with our own previous univariate results (Dastjerdi et al., 2013). Univariate analyses however neglected to identify the significant contribution of α and θ bands as containing relevant information for patient P3, once more confirming the superior sensitivity of the MKL machine learning model in identifying distributed brain activity during a given cognitive condition.
4.4. Prospective uses
In the present work, the kernels were computed from the temporal evolution of the averaged power in a specific frequency band in each electrode. However, any feature could be considered for modeling (e.g., the raw signal on each channel, coherence values or phase values), depending on the clinical or cognitive neuroscience question to investigate. Moreover, the MKL approach as defined in this work can be transposed and applied to any combination of dimensions, which allows investigating multiple aspects of such a question. For example, the signal in an epoch could be divided into sub-time windows of 10 ms, for each channel. The MKL model would hence provide information on the timing of best discrimination between two classes for each channel. This would allow investigating the temporal propagation of the signal of interest over the channels. In the same way, we used pre-defined frequency bands to illustrate our MKL approach. However, it might be of interest to consider subject-specific frequency bands (e.g., for θ activity). To this end, a kernel could be built for each frequency bin in a frequency band of interest. The resulting MKL model would then automatically select which frequency bins contain information about the variable of interest for this specific data set/subject. Finally, different types of features could be combined (e.g., power, phase and raw signal). The resulting MKL model would provide the contributions of each feature type to the final model. This would allow investigating whether different features bring complementary information regarding the variable of interest or not.
Furthermore, we chose to illustrate our method with ECoG data since the interpretation of the selection of electrodes in terms of anatomical location is straightforward. Practically however, the same technique can be applied to scalp EEG or MEG data.
Another prospective application of the MKL approach is that
it allows investigating potential linear combinations between the
signals on different electrodes, in the same or different frequency
bands. This field of study has recently received significant attention,
especially when studying resting-state networks and how
different conditions can affect the relationship between different areas
within or between networks. In particular, phase-amplitude
couplings could easily be investigated by computing one kernel per
electrode using the power in a specific frequency band (e.g., HFB)
and one kernel per electrode using the phase in the same or another
frequency band (e.g., ).
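As a rough sketch of the power and phase kernels mentioned above, the example below computes, for each electrode, one kernel from band-limited power and one from band-limited phase. Everything specific here is an assumption made for illustration: the band limits, the 500 Hz sampling rate, the FFT-masking estimate of the analytic signal, and the cosine/sine embedding used so that a linear kernel respects the circular nature of phase. None of these choices is taken from the present work.

```python
import numpy as np


def band_analytic(x, fs, f_lo, f_hi):
    """Band-limited analytic signal via FFT masking (time on last axis).

    Keeps only positive frequencies within [f_lo, f_hi] and doubles
    them, which yields the analytic signal of the band-passed data
    (assumes 0 < f_lo and f_hi below the Nyquist frequency).
    """
    n = x.shape[-1]
    freqs = np.fft.fftfreq(n, d=1.0 / fs)
    spec = np.fft.fft(x, axis=-1)
    mask = (freqs >= f_lo) & (freqs <= f_hi)
    return np.fft.ifft(2.0 * spec * mask, axis=-1)


def power_phase_kernels(epochs, fs, power_band, phase_band):
    """One power kernel and one phase kernel per electrode."""
    kernels = {}
    for ch in range(epochs.shape[1]):
        signal = epochs[:, ch, :]
        power = np.abs(band_analytic(signal, fs, *power_band)) ** 2
        phase = np.angle(band_analytic(signal, fs, *phase_band))
        # Embed the circular phase as (cos, sin) before the dot product.
        phase_feats = np.hstack([np.cos(phase), np.sin(phase)])
        kernels[(ch, "power")] = power @ power.T
        kernels[(ch, "phase")] = phase_feats @ phase_feats.T
    return kernels


# Synthetic data only: 20 trials, 3 electrodes, 1 s at an assumed 500 Hz.
rng = np.random.default_rng(0)
epochs = rng.standard_normal((20, 3, 500))
pac_kernels = power_phase_kernels(
    epochs, fs=500.0,
    power_band=(70.0, 150.0),  # placeholder high-frequency band
    phase_band=(4.0, 8.0),     # placeholder low-frequency band
)
```

With such a kernel set, a model that assigns non-zero weights to both a power kernel and a phase kernel would suggest that the two feature types contribute jointly to the discrimination, which could then be examined further.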
4.5. Limitations
It should be noted that the algorithm used in our present work
(Rakotomamonjy et al., 2008) is sparse in terms of the
contributions of the kernels to the final decision boundary. It is possible
that the information of interest is more widely distributed across
electrodes/frequency bands, and the considered algorithm may not
select kernels containing correlated signals. As a result,
neighboring electrodes are not often selected simultaneously in the same
model estimation (only a subset is selected). Unfortunately, the
considered formulation does not allow controlling the level of
sparsity. If a non-sparse analysis is desired, one can use another
type of regularization, such as the elastic net (a combination of L1
and L2 regularization; Zou and Hastie, 2005), as discussed in van
Gerven et al. (2013). Investigating MKL approaches with other types
of regularization is a topic for future work.
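The behaviour with correlated kernels can be illustrated with a small numerical sketch. It assumes the usual MKL setting in which the combined kernel is a weighted sum of the individual kernels, with non-negative weights that sum to one. The function name combine_kernels and the synthetic data are hypothetical. In the extreme case of two identical kernels (standing in for two perfectly correlated neighboring electrodes), any split of the weight between them gives the same combined kernel, so the model fit cannot distinguish a sparse solution from a shared one.

```python
import numpy as np


def combine_kernels(kernels, weights):
    """Weighted sum of kernels with weights on the simplex."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to one")
    return sum(w * K for w, K in zip(weights, kernels))


# Synthetic data only: 15 trials, 30 features per "electrode".
rng = np.random.default_rng(1)
X_a = rng.standard_normal((15, 30))
X_b = rng.standard_normal((15, 30))
K_a = X_a @ X_a.T
K_a_copy = K_a.copy()  # stands in for a perfectly correlated neighbor
K_b = X_b @ X_b.T

# All weight on one of the two identical kernels (sparse solution) ...
K_sparse = combine_kernels([K_a, K_a_copy, K_b], [0.6, 0.0, 0.4])
# ... versus the same total weight shared between them.
K_shared = combine_kernels([K_a, K_a_copy, K_b], [0.3, 0.3, 0.4])

same = np.allclose(K_sparse, K_shared)
```

Because both weightings yield the same combined kernel, a zero weight on one of two correlated electrodes should not be read as evidence that this electrode carries no information.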
4.6. Availability of the method
The PRoNTo toolbox (Schrouff et al., 2013) has been adapted
to perform the machine learning modeling of electrophysiological
data (in the SPM MEEG format). All the functionalities presented in
this work, as well as in the prospective uses, have been implemented
in version 3.0 of our open-source software. PRoNTo v3.0 will be
released as soon as possible (www.mlnl.cs.ucl.ac.uk/pronto).
4.7. Conclusion
In this work, we propose a sparse MKL approach to
auto-matically select relevant sites of recording and/or frequency
bands during cognitive processing in the human brain. We
show that the MKL method is more sensitive than univariate
methods to decode a variable of interest based on
electrophysiological recordings. We also demonstrate that the MKL method
provides an easy way to locate the information of interest
in terms of anatomy, time window or frequency domain.
Conflict of interest
None.
Acknowledgments
We thank Müge Ozker, Brett Foster, Mohammad Dastjerdi,
Vinitha Rangarajan, and Stanford Epilepsy Monitoring Unit staff
for help with electrophysiological recordings. This research was
supported by US National Institute of Neurological Disorders
and Stroke (R01NS078396), and US National Science
Foundation (BCS1358907) to J.P. and the Belgian American Educational
Foundation (BAEF), Léon Frédéricq Funds-Rotary Medical
Foundation of Liège and Marie Skłodowska Curie Actions (DecoMPECoG,
grant 654038) to J.S. C.P. was funded by the Belgium National
Research Funds (F.N.R.S.) and J.M.M. by the Wellcome Trust
(WT086565/Z/08/Z and WT102845/Z/13/Z).
References
Chang EF, Rieger JW, Johnson K, Berger MS, Barbaro NM, Knight RT. Categorical speech representation in human superior temporal gyrus. Nat Neurosci 2010;13:1428–32.
Cortes C, Vapnik V. Support Vector Networks. Machine Learning, 1995.
Dai D, Wang J, Hua J, He H. Classification of ADHD children through multimodal magnetic resonance imaging. Front Syst Neurosci 2012;6:63.
Dastjerdi M, Ozker M, Foster BL, Rangarajan V, Parvizi J. Numerical processing in the human parietal cortex during experimental and natural conditions. Nat Commun 2013;4:2528.
Filipovych R, Resnick SM, Davatzikos C. Multi-kernel classification for integration of clinical and imaging data: application to prediction of cognitive decline in older adults. Med Image Comput Comput Assist Interv 2011;7009:26–34.
Filippone M, Marquand AF, Blain CRV, Williams SCR, Mourao-Miranda J, Girolami M. Probabilistic prediction of neurological disorders with a statistical assessment of neuroimaging data modalities. Ann Appl Stat 2012;6:1883–905.
Flinker A, Chang EF, Barbaro NM, Berger MS, Knight RT. Sub-centimeter language organization in the human temporal lobe. Brain Lang 2011;117:103–9.
Foster BL, Dastjerdi M, Parvizi J. Neural populations in human posteromedial cortex display opposing responses during memory and numerical processing. Proc Natl Acad Sci USA 2012;109:15514–9.
Gönen M, Alpaydin E. Multiple kernel learning algorithms. J Mach Learn Res 2011;12:2211–68.
Haynes JD. A primer on pattern-based approaches to fMRI: principles, pitfalls, and perspectives. Neuron 2015;87:257–70.
Hermes D, Miller KJ, Noordmans HJ, Vansteensel MJ, Ramsey NF. Automated electrocorticographic electrode localization on individually rendered brain surfaces. J Neurosci Methods 2010;185:293–8.
auditory cortex. Science 2005;309:951–4.
Nir Y, Fisch L, Mukamel R, Gelbard-Sagiv H, Arieli A, Fried I, et al. Coupling between neuronal firing rate, gamma LFP, and BOLD fMRI is related to interneuronal correlations. Curr Biol 2007;17:1275–85.
Noirhomme Q, Lesenfants D, Gomez F, Soddu A, Schrouff J, Garraux G, et al. Biased binomial assessment of cross-validated estimation of classification accuracies illustrated in diagnosis predictions. Neuroimage: Clin 2014;4:687–94.
Pasley BN, David SV, Mesgarani N, Flinker A, Shamma SA, Crone NE, et al. Reconstructing speech from human auditory cortex. PLoS Biol 2012;10., http://dx.doi.org/10.1371/journal.pbio.1001251.
Neuroinformatics 2013;11:319–37.
van Gerven M, Maris E, Sperling M. Decoding the memorization of individual stimuli with direct human brain recordings. Neuroimage 2013;70:223–32.
Zhang D, Wang Y, Zhou L, Yuan H, Shen D. Multimodal classification of Alzheimer’s disease and mild cognitive impairment. Neuroimage 2011a;55:856–67.
Zhang Y, Meyers EM, Bichot NP, Serre T, Poggio TA, Desimone R. Object decoding with attention in inferior temporal cortex. Proc Natl Acad Sci USA 2011b;108:8850–5.
Zou H, Hastie T. Regularization and variable selection via the elastic net. J R Stat Soc 2005;67:301–20.