HAL Id: hal-01132571
https://hal.archives-ouvertes.fr/hal-01132571
Submitted on 10 Jan 2019
HAL is a multi-disciplinary open access
archive for the deposit and dissemination of
scientific research documents, whether they are
published or not. The documents may come from
teaching and research institutions in France or
abroad, or from public or private research centers.
Spectral similarity metrics for sound source formation based on the common variation cue
Mathieu Lagrange, Martin Raspaud
To cite this version:
Mathieu Lagrange, Martin Raspaud. Spectral similarity metrics for sound source formation based on
the common variation cue. Multimedia Tools and Applications, Springer Verlag, 2010, pp.185-205.
⟨hal-01132571⟩
Spectral Similarity Metrics For Sound Source Formation
Based on the Common Variation Cue
Mathieu Lagrange · Martin Raspaud
Received: date / Accepted: date
Abstract Scene analysis is a relevant way of gathering information about the structure of an audio stream. For content extraction purposes, it also provides prior knowledge that can be taken into account in order to provide more robust results for standard classification approaches.
In order to perform such scene analysis, we believe that the notion of temporality is important. Consequently, we study in this paper a new way of modeling the evolution over time of the frequency and amplitude parameters of spectral components. We evaluate its benefits by considering its ability to automatically gather the components of the same sound source. The evaluation of the proposed metric shows that it achieves good performance and takes better account of micro-modulations.
Keywords auditory scene analysis, mid-level representation, clustering, common variation cue
M. Lagrange
Telecom ParisTech, 46 rue Barrault, 75634 PARIS Cedex 13, FRANCE
Tel.: +33 (0)1 45 81 73 24, Fax: +33 (0)1 45 81 71 44
E-mail: lagrange@telecom-paristech.fr
M. Raspaud
Linköping University, Bredgatan 33, SE-601 74 Norrköping, SWEDEN

1 Introduction
Extracting content from polyphonic audio such as musical streams appears to be bounded to moderate performance if the stream is considered 'blindly', i.e. processed without any prior knowledge of the structure of the stream [2]. As scene analysis is a relevant way of gathering information about the structure of an audio stream, performing such an operation prior to extracting content is a way to address this issue.
On the high end, one can consider a mid-level representation of the polyphony [13, 5] describing polyphonic sounds as a set of coherent spectral regions, where each set can be considered as monophonic. In this case, one can focus the content extraction process on a given element of the scene [28]. On a lower end, one can consider some time segmentation of the audio stream where sections that have similar properties are identified and/or clustered. Based on this representation, the temporal priors are considered to integrate the indexing decision made at each analysis frame to obtain more robust classification results [21].
In order to extract such a representation or segmentation, many cues can be considered [6]. Timbre is one of them. The description of the timbre of monophonic sounds has been widely studied [31] and many descriptors have been proposed [18]. These descriptors or features are mainly based on the temporal or spectral observations of the sounds since "Timbre depends primarily upon the spectrum of the stimulus, but it also depends on the waveform, the sound pressure, the frequency location of the spectrum, and the temporal characteristics of the stimulus", as stated in the ANSI definition of timbre [19]. Unfortunately, most of these descriptors cannot be directly extracted from polyphonic recordings.
If the sounds produced by the instruments can be considered as pseudo-periodic, a monophonic or polyphonic signal may be decomposed into sinusoidal components with parameters that evolve slowly with time, the partials. This restriction is not too strong since most classical instruments fit in this category, from strings to brass instruments. In this case, several criteria or psychoacoustical 'cues' proposed in the Auditory Scene Analysis (ASA) literature [6] may then be considered for an automatic evaluation of the timbre of each sound source [14]. In particular, it is shown in the work of McAdams [32] that the correlated evolution of the parameters of the partials of a given musical or vocal tone is an important cue for the perception of timbre.
Consequently, in order to ensure the relevance of the approach proposed in this paper, the analysed signals have to be pseudo-periodic so that they suit the sinusoidal model that is the front-end of our method. The signals can be inharmonic. In fact, that is the main motivation for using the common variation cue to complement the harmonicity one. The signals should preferably be monophonic, but in the case of weak polyphonies, i.e. no unison, some partials do not overlap and can be assigned to only one of the two different sources active at the same time.
The common variation cue has been used for source separation [9,12,46], i.e. to determine which partials have been produced simultaneously by the same Producing Sound System (PSS) and therefore automatically extract a high-level description of polyphonic sound. This cue is also a musical parameter that describes timbre and therefore also has potential for Musical Information Retrieval (MIR) applications such as musical instrument identification, instrument class identification, and instrumentalist or speaker recognition.
These applications both rely on the definition of a metric to evaluate how dissimilar two partials are, according to the common variation of their parameters. We will show in this paper that considering the spectrum of these variations allows us to propose a robust dissimilarity metric. The paper is organized as follows: after a presentation of the sinusoidal model in Section 2, existing metrics proposed in the literature are reviewed in Section 3 and the requisites of a relevant metric are also detailed.
The proposed metric is next introduced in Section 4. Motivated by the properties of the evolutions of the frequencies of the partials, a first metric is proposed. We next show that this metric can also be successfully used while considering the evolutions of [...] the evaluation methodology presented in Section 5, where the database and the criteria that evaluate the ability of the tested metric to discriminate partials produced from different PSS. The results of this evaluation are presented in Section 6.
The timbral discrimination capabilities of the proposed metric, i.e. its ability to differentiate partials produced by not only different PSS but also different instruments or different classes of instruments, are studied in Section 7 and some potential applications are described in Section 8.
2 High-Level Representation of Polyphonic Sounds
Most of the descriptors used in MIR applications consider temporal features such as mean zero-crossing rate or spectral ones such as Mel-Frequency Cepstrum Coefficients (MFCC), see the work of P. Herrera et al. [18] for a deeper review. These descriptors are generally extracted on a frame basis and the frames are usually considered independently, losing most of the temporal information.
For various applications, one needs a representation of polyphonic sounds where the timbral information as well as its evolution with respect to time can be considered for each sound source. In this section, we discuss the fact that the well-known sinusoidal model can be a basis for such a representation.
2.1 Sinusoidal Model
The sinusoidal model represents pseudo-periodic sounds as sums of sinusoids, so-called partials, controlled by parameters that evolve slowly with time [33,43]. More formally put, the audio signal $s$ can be calculated from the controlling parameters using Equations 1 and 2, where $N$ is the number of partials and the functions $f_p$, $a_p$, and $\phi_p$ are the instantaneous frequency, amplitude, and phase of the $p$-th partial, respectively. The $N$ pairs $(f_p, a_p)$ are the parameters of the additive model and represent points in the frequency-amplitude plane at time $t$.

$$ s(t) = \sum_{p=1}^{N} a_p(t) \cos(\phi_p(t)) \tag{1} $$

$$ \phi_p(t) = \phi_p(0) + 2\pi \int_0^t f_p(u)\, du \tag{2} $$

This can also be written from the set point of view:

$$ P_k(m) = \{F_k(m), A_k(m), \Phi_k(m)\} \tag{3} $$

where $F_k(m)$, $A_k(m)$, and $\Phi_k(m)$ are respectively the frequency, amplitude, and phase of the partial $P_k$ at time index $m$. These parameters are valid for all $m \in [b_k, \cdots, b_k + l_k - 1]$, where $b_k$ and $l_k$ are respectively the starting index and the length of the partial. On a frame basis, the instantaneous frequency, amplitude, and phase of each [...] go beyond the resolution limitation of the Fourier transform, one can also consider parametric methods like the ESPRIT algorithm [29,4] or maximum likelihood ones, like the matching pursuit [8,10]. Those estimates can be complemented with the estimation of the slope of the frequency and amplitude [1,42], which could be considered at the tracking phase to obtain a more precise modeling of the long-term evolution of the frequency and amplitude parameters through time.
The partials can be extracted from the parameters estimated on a frame basis using partial tracking algorithms [33,43,44,27,40,35]. Polyphonic sounds can be considered with dedicated tracking algorithms [11,26]. However, in order to avoid problems due to strong polyphony [13], we only consider in this paper mixtures of entities extracted from monophonic signals.
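As an illustration of Equations 1 and 2, the following minimal sketch resynthesizes a signal from per-sample frequency and amplitude tracks. The function name `synthesize`, the rectangle-rule integration of the phase and the test tone are assumptions made for this example, not part of the analysis chain used in the paper.

```python
import math

def synthesize(freqs, amps, sample_rate, initial_phases=None):
    """Discrete-time sketch of Eqs. (1)-(2).

    freqs[p][n] and amps[p][n] hold the instantaneous frequency (Hz) and
    amplitude of partial p at sample n (hypothetical data layout).
    """
    n_partials = len(freqs)
    n_samples = len(freqs[0])
    if initial_phases is None:
        initial_phases = [0.0] * n_partials
    signal = [0.0] * n_samples
    for p in range(n_partials):
        phase = initial_phases[p]
        for n in range(n_samples):
            signal[n] += amps[p][n] * math.cos(phase)
            # Rectangle-rule approximation of the integral in Eq. (2).
            phase += 2.0 * math.pi * freqs[p][n] / sample_rate
    return signal

# Two harmonics sharing a 5 Hz vibrato whose depth scales with harmonic rank.
sr = 8000
n = 800
f0 = [220.0 + 3.0 * math.sin(2.0 * math.pi * 5.0 * i / sr) for i in range(n)]
freqs = [f0, [2.0 * f for f in f0]]
amps = [[0.5] * n, [0.25] * n]
s = synthesize(freqs, amps, sr)
```

The second partial is built as twice the first, so both partials share the same modulation rate and phase, which is the situation the common variation cue relies on.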
2.2 Acoustical Entities
These sinusoidal components are called partials because they are only a part of a more perceptively coherent entity that may be called an acoustical entity.
This can be written as:

$$ S = \bigcup_{n=1}^{N} E_n \tag{4} $$

with $S$ being the mid-level representation of the sound, $E_n$ being an acoustical entity and $N$ the total number of entities in the sound. Hence each entity is made of a group of partials:

$$ E_n = \bigcup_{k=1}^{M_n} P_k^n \tag{5} $$

where $M_n$ is the total number of partials $P_k^n$ in the entity.
To extract these entities from a sinusoidal representation of a sound, similarities between partials should be considered in order to gather the ones belonging to the same acoustical entity. From the perceptual point of view, some partials belong to the same entity if they are perceived by the human auditory system as a unique sound. There are several cues that lead to this perceptual fusion: the common onset, the harmonic relation of the frequencies, the correlated evolutions of the parameters and the spatial location [6].
The earliest attempts at acoustical entity identification and separation consider harmonicity as the sole cue for group formation. Some rely on a prior detection of the fundamental frequency [17,15] and others consider only the harmonic relation of the frequencies of the partials [23,46,41]. Yet, many musical instruments are not perfectly harmonic.
In contrast, the cue that considers the correlated evolutions of the parameters of the partials is generic. Also, numerous psychoacoustical studies showed that the variations or the micro-modulations are important for perception. Bregman writes: Small fluctuations in frequency occur naturally in the human voice and in musical instruments. The fluctuations are not often very large, ranging from less than 1 percent for a clarinet tone to about 1 percent for a voice trying to hold a steady pitch, with larger excursions of as much as 20 percent for the vibrato of the singer. Even the smaller [...]
Fig. 1 Representation of two fictive sounds in the time-frequency domain (axes: time, frequency). Partials A, B, and C (clearly correlated in modulation and in starting and ending times, that is, common variation) represent the sinusoidal components of the first sound, while D and E represent the sinusoidal components of the second sound.
partials is perceived as a unique acoustical entity only if these variations are correlated. Therefore, the correlated evolution of the parameters of the partials is a generic cue since it can be observed with any vibrating instrument. As an example, see Figure 1.
In order to define a dissimilarity metric that considers the common variation cue, we will study in the next section the physical properties of the evolutions of the frequency and amplitude parameters of the partials.
3 The Common Variation Cue
In order to define a dissimilarity metric that considers the common variation cue, we have to study the physical properties of the evolutions of the frequency and amplitude parameters of the partials.
Let us consider a harmonic tone modulated by a vibrato of given depth and rate. All the harmonics are modulated at the same rate and phase but their respective depth is scaled by a factor equal to their harmonic rank (see Figure 2(a)). It is then important to consider a metric which is scale-invariant.
Cooke uses a distance [9] equivalent to the cosine dissimilarity $d_c$, also known as intercorrelation:

$$ d_c(X_1, X_2) = 1 - \frac{c(X_1, X_2)}{\sqrt{c(X_1, X_1)}\,\sqrt{c(X_2, X_2)}} \tag{6} $$

$$ c(X_1, X_2) = \sum_{i=1}^{N} X_1(i)\, X_2(i) \tag{7} $$

where $X_1$ and $X_2$ are real vectors of size $N$. This dissimilarity is scale-invariant.
T. Virtanen et al. proposed in [46] to use the mean-squared error between the vectors first normalized by their average values:

$$ d_v(X_1, X_2) = \frac{1}{N} \sum_{i=1}^{N} \left( \frac{X_1(i)}{\bar{X}_1} - \frac{X_2(i)}{\bar{X}_2} \right)^2 \tag{8} $$

where $X_1$ and $X_2$ are vectors of size $N$ and $\bar{X}_l$ is the mean of $X_l$. This normalization is motivated by the fact that the ratio between the mean frequency of a given harmonic and the one of the fundamental is equal to its harmonic rank.
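The scale invariance of these two dissimilarities can be checked numerically. The sketch below implements Equations 6 to 8 directly; the function names, the synthetic vibrato track and the choice to mean-center the tracks before applying $d_c$ are assumptions of this example.

```python
import math

def cosine_dissimilarity(x1, x2):
    """d_c of Eqs. (6)-(7)."""
    c12 = sum(a * b for a, b in zip(x1, x2))
    c11 = sum(a * a for a in x1)
    c22 = sum(b * b for b in x2)
    return 1.0 - c12 / (math.sqrt(c11) * math.sqrt(c22))

def mean_normalized_mse(x1, x2):
    """d_v of Eq. (8): mean-squared error after division by the mean."""
    m1 = sum(x1) / len(x1)
    m2 = sum(x2) / len(x2)
    return sum((a / m1 - b / m2) ** 2 for a, b in zip(x1, x2)) / len(x1)

def centered(x):
    """Subtract the mean (assumed preprocessing for d_c in this example)."""
    m = sum(x) / len(x)
    return [v - m for v in x]

# Fundamental with vibrato, and its third harmonic (depth scaled by rank 3).
n_frames = 200
fundamental = [440.0 + 4.0 * math.sin(2.0 * math.pi * 5.0 * m / 100.0)
               for m in range(n_frames)]
third_harmonic = [3.0 * f for f in fundamental]

d_c = cosine_dissimilarity(centered(fundamental), centered(third_harmonic))
d_v = mean_normalized_mse(fundamental, third_harmonic)
```

Both values are numerically zero here because the two tracks differ only by the harmonic rank factor.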
It is proposed in [24] to consider the Auto-Regressive (AR) model as a scale-invariant metric that considers only the predictable part of the evolutions of the parameters:

$$ X_l(n) \approx \sum_{i=1}^{n} k_l(i)\, X_l(n - i) \tag{9} $$

where the $k_l(i)$ are the AR coefficients. Since the direct comparison of the AR coefficients computed from the two vectors $X_1$ and $X_2$ is not relevant, the spectrum of these coefficients is compared as proposed by Itakura [20]:

$$ d_{AR}(X_1, X_2) = \log \int_{-\pi}^{\pi} \frac{|K_1(\omega)|}{|K_2(\omega)|} \frac{d\omega}{2\pi} \tag{10} $$

where

$$ K_l(\omega) = 1 + \sum_{i=1}^{n} k_l(i)\, e^{-ji\omega} \tag{11} $$

When considering the amplitudes of the partials, a scale-invariant metric is also important. In this context, the normalization proposed by T. Virtanen is no longer motivated since the relative amplitudes of the harmonics depend on the envelope of the sound. For example, in Figure 2(b), the topmost curve (with small modulations) represents the amplitudes of the fundamental partial, while the second curve from the top, with broad oscillations, represents the first harmonic.
Moreover, the envelope is globally decreasing as the frequency grows, but the envelope may also be locally ascending due to the specific shape of the envelope around formants. Therefore, when the frequency of a partial is modulated, the amplitude may be modulated with a phase shift, see the bottom curve of Figure 2(b). Therefore, a metric that is phase-invariant should be considered.
The amplitude evolution of a partial is composed of a temporal envelope and some periodic modulations. Since the envelope of the amplitude can differ greatly between partials of the same entity, it may be useful to consider only the periodic modulations while computing their similarities. The metric introduced in the next section will cope with these issues.
4 Proposed Metric
We propose to go beyond the temporal domain by taking the parameters to the spectral domain. There was already an attempt at this, using AR models (see Equation 10). Since the Fourier transform assumes that the input signal is periodic, the spectrum of the evolution of the partials may reveal their common periodicities. This is useful for the modulations of the partials created by vibrato and tremolo, since these modulations can be approximated by sinusoidal ones over a short period of time (see [30]). It can also be interesting for micro-modulations such as the ones produced by vibrating strings such as the strings of a piano (see Figure 3). Hence, [...]

Fig. 2 Mean-centered frequencies (a) and amplitudes (b) of some partials of a saxophone tone with vibrato. Axes: time (frames) versus centered frequency (Hz) in (a) and versus amplitude in (b).
4.1 Using the Frequencies of the Partials
The first step in the calculation of our new metric is to correlate the evolutions of the frequencies of the partials. As we said before, a good description of these evolutions is given by their spectra.
To compute the spectrum of the frequency evolution of a partial, we subtract the mean value of this frequency and then compute the Fourier transform of the resulting signal. Indeed, in order to have a clean spectrum relevant to the evolutions, it is necessary to have the evolutions centered around zero.
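The mean removal followed by a Fourier transform can be sketched as follows. A naive discrete Fourier transform is used to keep the example dependency-free, where an FFT would be used in practice; the function name and the synthetic vibrato track are assumptions of this example.

```python
import cmath
import math

def centered_magnitude_spectrum(track):
    """Remove the mean of a parameter track, then return |DFT| of the result."""
    n = len(track)
    mean = sum(track) / n
    centered = [x - mean for x in track]
    spectrum = []
    # Non-negative frequency bins are sufficient for a real-valued input.
    for k in range(n // 2 + 1):
        acc = sum(centered[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc))
    return spectrum

# Hypothetical frequency track: 440 Hz partial whose vibrato completes
# exactly 4 cycles over the 64 analysis frames.
n_frames = 64
track = [440.0 + 5.0 * math.sin(2.0 * math.pi * 4 * m / n_frames)
         for m in range(n_frames)]
spec = centered_magnitude_spectrum(track)
peak_bin = max(range(len(spec)), key=lambda k: spec[k])
```

Without the mean removal, bin 0 would carry the 440 Hz offset and dominate any comparison between spectra; after centering, the energy concentrates in the vibrato bin.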
Then, we apply this process to the frequencies of all the partials whose evolution correlation we want to measure. Once these frequencies are expressed in terms of spectra, the distance between two partials is computed by intercorrelating their spectra (see Equation 6). This gives the distance $d_s$:

$$ d_s(f_1, f_2) = d_c(|F_1|, |F_2|) \tag{12} $$

where $f_1$ and $f_2$ are the frequency vectors of two partials $P_1$ and $P_2$ and $F_k$ is the Fourier spectrum of $f_k$. Thanks to the absolute value applied to the spectra, this distance is phase-invariant.

Fig. 3 Centered frequencies (top, harmonic index versus time in frames) of a piano note and their corresponding spectra (bottom, harmonic index versus frequency in Hz). Each curve is shifted for the sake of clarity.

Fig. 4 Amplitudes of a partial of a Bb clarinet and its polynomial envelope estimation (amplitude versus time in frames; curves: partial, polynomial).

4.2 Using the Amplitudes of the Partials
In the case of the amplitudes of the partials, the problem is slightly more complicated. Indeed, in order to center the oscillating part of the signal around zero, subtracting the [...] behind this polynomial subtraction is that the envelope of a sound (seen as attack, decay, sustain and release) can be roughly approximated by a 9th degree polynomial. An example of such a subtraction is shown in Figure 5.

Fig. 5 Amplitudes of three partials of a Bb clarinet when the polynomial envelope is removed (a, modulations: amplitude versus time in frames), and their corresponding spectra (b, amplitude versus frequency). The curves have been shifted for the sake of clarity.
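The polynomial envelope subtraction can be sketched with a least-squares fit. The version below builds an orthonormal polynomial basis on the frame grid by Gram-Schmidt and subtracts the projection of the amplitude track onto it, which is one possible way to obtain a least-squares polynomial; the function name, the time normalization to $[-1, 1]$ and the synthetic amplitude track are assumptions of this example.

```python
import math

def remove_polynomial_envelope(a, degree=9):
    """Subtract the least-squares polynomial of the given degree from a."""
    n = len(a)  # requires n > degree
    t = [2.0 * i / (n - 1) - 1.0 for i in range(n)]  # time mapped to [-1, 1]
    basis = []
    for d in range(degree + 1):
        v = [ti ** d for ti in t]
        # Modified Gram-Schmidt against the vectors already in the basis.
        for q in basis:
            proj = sum(vi * qi for vi, qi in zip(v, q))
            v = [vi - proj * qi for vi, qi in zip(v, q)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        basis.append([vi / norm for vi in v])
    envelope = [0.0] * n
    for q in basis:
        coeff = sum(ai * qi for ai, qi in zip(a, q))
        envelope = [e + coeff * qi for e, qi in zip(envelope, q)]
    return [ai - ei for ai, ei in zip(a, envelope)]

# Hypothetical amplitude track: smooth cubic envelope plus a tremolo of
# 12 cycles over 200 frames.
n_frames = 200
env = [1.0 - 0.5 * x + 0.2 * x ** 3 for x in (i / n_frames for i in range(n_frames))]
tremolo = [0.1 * math.sin(2.0 * math.pi * 12 * i / n_frames) for i in range(n_frames)]
residual_env_only = remove_polynomial_envelope(env)
residual = remove_polynomial_envelope([e + m for e, m in zip(env, tremolo)])
```

The smooth envelope alone is removed almost exactly, while most of the tremolo survives the subtraction because a 9th degree polynomial cannot follow that many oscillations.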
The amplitude-based distance $d_{sp}$ is then:

$$ d_{sp}(a_1, a_2) = d_c(|\tilde{A}_1|, |\tilde{A}_2|) \tag{13} $$

where $\tilde{A}_k$ is the Fourier spectrum of $\tilde{a}_k$ with $\tilde{a}_k = a_k - \Pi(a_k)$, where $a_1$ and $a_2$ are the amplitudes of two partials and $\Pi(x)$ is the envelope polynomial computed from signal $x$, using a simple least-squares method [34].

4.3 Metric Combination
In order to exploit both the frequency and amplitude parameters, we need a way to combine the measures of amplitude and frequency distances.
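Before turning to the combinations, the behavior of the frequency-based spectral distance of Section 4.1 can be illustrated end to end. The sketch below compares a time-domain cosine dissimilarity with its spectral counterpart on two synthetic tracks that share a modulation rate but differ in depth and phase; all names and test signals are assumptions of this example, and a naive DFT stands in for an FFT.

```python
import cmath
import math

def centered(x):
    m = sum(x) / len(x)
    return [v - m for v in x]

def magnitude_spectrum(x):
    """|DFT| of the mean-centered track (non-negative bins only)."""
    n = len(x)
    c = centered(x)
    return [abs(sum(c[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]

def cosine_dissimilarity(x1, x2):
    c12 = sum(a * b for a, b in zip(x1, x2))
    c11 = sum(a * a for a in x1)
    c22 = sum(b * b for b in x2)
    return 1.0 - c12 / (math.sqrt(c11) * math.sqrt(c22))

def spectral_distance(f1, f2):
    """Cosine dissimilarity between magnitude spectra of centered tracks."""
    return cosine_dissimilarity(magnitude_spectrum(f1), magnitude_spectrum(f2))

n = 64
f1 = [440.0 + 5.0 * math.sin(2.0 * math.pi * 4 * m / n) for m in range(n)]
# Same modulation rate, twice the depth, shifted by a quarter period.
f2 = [880.0 + 10.0 * math.sin(2.0 * math.pi * 4 * m / n + math.pi / 2.0)
      for m in range(n)]

d_time = cosine_dissimilarity(centered(f1), centered(f2))
d_spec = spectral_distance(f1, f2)
```

In this constructed case the time-domain dissimilarity is close to 1 because the two modulations are in quadrature, whereas the spectral distance is close to 0, which is the phase invariance that motivates working on magnitude spectra.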
T. Virtanen et al. proposed to combine the frequency and amplitude distances by adding the two distance measures while considering a harmonicity factor. In their work [46], each distance is weighted before performing the addition. For comparison purposes, we consider the following distance:

$$ d_{v+v}(P_1, P_2) = \frac{d_v(f_1, f_2) + d_v(a_1, a_2)}{2} \tag{14} $$

where $f_k$ and $a_k$ are respectively the frequencies and amplitudes of partial $P_k$. Since the weights are not supplied and no harmonicity information is available, it is only an approximation of the combination scheme proposed by T. Virtanen.
Since our proposed distances $d_s$ and $d_{sp}$ are normalized, if we want to give the same weight to the two distances, we can combine the frequency and amplitude distances by taking a simple mean. This yields:

$$ d_+(P_1, P_2) = \frac{d_s(f_1, f_2) + d_{sp}(a_1, a_2)}{2} \tag{15} $$

In order to retain the better result of the two measures, one method is to take the minimum of the two distances:

$$ d_m(P_1, P_2) = \min(d_s(f_1, f_2), d_{sp}(a_1, a_2)) \tag{16} $$

As will be presented in Section 6, better results are achieved when we multiply the amplitude and frequency distances. This combination, although less robust to errors, seems to take better account of the performance of each distance measure independently. In order to keep the metrics on the same scale, a square root is applied to the combination:

$$ d_\times(P_1, P_2) = \sqrt{d_s(f_1, f_2)\, d_{sp}(a_1, a_2)} \tag{17} $$
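The three combinations of Equations 15 to 17 reduce to simple operations on a pair of distance values. In the sketch below the function names and the two input values are hypothetical; they stand for a frequency distance $d_s$ and an amplitude distance $d_{sp}$ computed beforehand.

```python
import math

def combine_mean(d_freq, d_amp):
    """d_+ of Eq. (15): arithmetic mean of the two distances."""
    return (d_freq + d_amp) / 2.0

def combine_min(d_freq, d_amp):
    """d_m of Eq. (16): keep the smaller of the two distances."""
    return min(d_freq, d_amp)

def combine_product(d_freq, d_amp):
    """d_x of Eq. (17): geometric mean, same scale as the inputs."""
    return math.sqrt(d_freq * d_amp)

# Hypothetical distances for one pair of partials.
d_freq, d_amp = 0.04, 0.36
d_plus = combine_mean(d_freq, d_amp)
d_min = combine_min(d_freq, d_amp)
d_times = combine_product(d_freq, d_amp)
```

With these inputs the geometric mean (0.12) lies between the minimum (0.04) and the arithmetic mean (0.2), and it goes to zero as soon as either distance does.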
5 Evaluation
In this section, we present the methodology used for evaluating the performance of the different metrics reviewed in Section 3 and proposed in Section 4. The evaluation database is first described. Next, several criteria are presented, each one evaluating a specific property of the evaluated metric.
The objective of the evaluation presented in the remainder of the paper is to study whether the proposed similarity metrics are good candidates for implementing a clustering of the partials of the same acoustical entity. In Section 7, we extend this study by considering the statistical properties of one of the proposed metrics while considering [...]
In this study, we focus on a subset of musical instruments that produce pseudo-periodic sounds and model them as a sum of partials (see Section 2). The instruments of the IOWA database [16], whose instrument hierarchy is plotted in Figure 8, globally fit this condition even though some samples have to be removed. The pizzicato tones, i.e. plucked-string tones with a strong attack and a weak resonating phase, as well as the pianissimo tones, i.e. tones with very low amplitude, are discarded.
In order to extract the partials for each tone, each file of the IOWA database is split into a series of audio files, each containing only one tone. The spectral parameters at each frame are estimated using the phase derivative method studied in [25] with the following parameters: the window size is 2048 samples and the hop size is 512 samples, at a sampling rate of 44100 Hz. An implementation of the algorithm proposed by McAulay and Quatieri in [33] is used with a frequency tolerance of 50 Hz. Since we consider only the prominent partials of a given tone, only the extracted partials lasting for at least 2 seconds are retained. For each entity, only the 20 partials with the highest amplitude are retained.
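To make the selection rule concrete, the sketch below converts the 2 second threshold into a number of analysis frames and keeps the most prominent partials. The constants come from the text; rounding the frame count up and ranking partials by their mean amplitude are assumptions of this example, since the text does not say how "highest amplitude" is measured.

```python
import math

SAMPLE_RATE = 44100   # Hz, from the text
HOP_SIZE = 512        # samples, from the text
MIN_DURATION = 2.0    # seconds, from the text
MAX_PARTIALS = 20     # partials kept per entity, from the text

def min_frames(sample_rate=SAMPLE_RATE, hop=HOP_SIZE, duration=MIN_DURATION):
    """Number of frames needed to span `duration` seconds (rounded up)."""
    return math.ceil(duration * sample_rate / hop)

def select_partials(amplitude_tracks):
    """Keep long-enough partials, then the MAX_PARTIALS loudest ones.

    amplitude_tracks: list of per-frame amplitude lists, one per partial
    (hypothetical data layout).
    """
    long_enough = [p for p in amplitude_tracks if len(p) >= min_frames()]
    ranked = sorted(long_enough, key=lambda p: sum(p) / len(p), reverse=True)
    return ranked[:MAX_PARTIALS]

# 25 long partials with increasing amplitude plus 3 partials that are too short.
tracks = [[0.01 * (i + 1)] * 200 for i in range(25)] + [[1.0] * 50 for _ in range(3)]
kept = select_partials(tracks)
```

With a hop of 512 samples at 44100 Hz there are about 86 frames per second, so the 2 second threshold corresponds to 173 frames under the rounding assumed here.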
5.2 Methodology
To compare the metrics proposed in Section 4 and those reviewed in Section 3, we use the following methodology to compute the three evaluation criteria. For the two entities of the considered couple, the median values of the starting and ending time indexes of the partials, $t_s$ and $t_e$, are computed. Only the partials existing before $t_s + \epsilon_s$ and after $t_e - \epsilon_e$ are kept (see Figure 6). The values $\epsilon_s$ and $\epsilon_e$ are arbitrarily small constants.
Then, the partials of the two entities are gathered. Only the common part, defined as the time interval where all the partials are active, is considered to evaluate the tested metric. For example, the common part of the partials represented in Figure 6 is between $c_s$ and $c_e$.

Fig. 6 Selection of the common parts of the partials of the two acoustical entities (axes: time, frequency; time marks $t_s$, $c_s$, $c_e$, $t_e$). A partial start is represented with a black filled dot and its end with a white filled dot. Only the partials existing before and after $t_s$ and $t_e$ are kept, represented with solid lines. The indexes $c_s$ and [...]

5.3 Performance Criteria
Once the evaluation database and the evaluation methodology are defined, some criteria have to be defined that reflect whether, according to the evaluated metric, two partials are close if they actually belong to the same acoustical entity and far otherwise.
5.3.1 Fisher criterion
A relevant dissimilarity metric between two partials is a metric which is low for partials of the same entity (the class, from the statistical point of view) and high for partials that do not belong to the same entity. The intra-class dissimilarity should then be minimal and the inter-class dissimilarity as high as possible. Let $U$ be the set of elements of cardinal $\#U$ and $C_i$ the entity of index $i$ among $N_c$ different entities. An estimation of the relevance of a given dissimilarity $d(x, y)$ for a given acoustical entity is:

$$ \mathrm{intra}(C_i) = \sum_{j=1}^{n_i} \sum_{k=1}^{n_i} d(C_i(j), C_i(k)) \tag{18} $$

$$ \mathrm{inter}(C_i) = \sum_{j=1}^{n_i} \sum_{l=1}^{\#U - n_i} d(C_i(j), \bar{C}_i(l)) \tag{19} $$

$$ F(C_i) = \frac{\mathrm{inter}(C_i)}{\mathrm{intra}(C_i)} \tag{20} $$

where $n_i$ is the number of partials in $C_i$ and $\bar{C}_i = U \setminus C_i$. The overall quality $F(U)$ is then defined as:

$$ F(U) = \frac{\sum_{i=1}^{N_c} \mathrm{inter}(C_i)}{\sum_{i=1}^{N_c} \mathrm{intra}(C_i)} \tag{21} $$

This last criterion $F(U)$ is loosely based on the Fisher discriminant commonly used in statistical analysis. It provides a first evaluation of the discrimination quality of a given metric. It can however be noticed that this criterion depends on the scale of the studied dissimilarity metric.
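Equation 21 can be evaluated directly from a precomputed dissimilarity matrix. The sketch below assumes a symmetric matrix with a zero diagonal and one class label per partial; the function name and the toy data are hypothetical.

```python
def fisher_criterion(dist, labels):
    """F(U) of Eq. (21) from a dissimilarity matrix and class labels."""
    intra = 0.0
    inter = 0.0
    n = len(labels)
    for j in range(n):
        for k in range(n):
            if labels[j] == labels[k]:
                intra += dist[j][k]   # terms of Eq. (18), over all entities
            else:
                inter += dist[j][k]   # terms of Eq. (19), over all entities
    return inter / intra

# Toy example: two entities of two partials each, with a within-entity
# dissimilarity of 0.1 and a between-entity dissimilarity of 0.5.
labels = [0, 0, 1, 1]
dist = [
    [0.0, 0.1, 0.5, 0.5],
    [0.1, 0.0, 0.5, 0.5],
    [0.5, 0.5, 0.0, 0.1],
    [0.5, 0.5, 0.1, 0.0],
]
f_value = fisher_criterion(dist, labels)
```

Here the inter sum is 8 × 0.5 = 4.0 and the intra sum is 4 × 0.1 = 0.4, giving F(U) = 10. Raising every dissimilarity to a power changes this value, which illustrates the scale dependence noted above.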
5.3.2 Density criterion
Dissimilarity-vector based classification involves calculating a dissimilarity metric between pair-wise combinations of elements and grouping together those for which the dissimilarity metric is small according to a given classification algorithm.
The density criterion $D$ intends to evaluate a property that the tested metric should fulfill in order to be relevantly used in combination with common classification algorithms such as hierarchical clustering or K-means. Indeed, many classification algorithms iteratively cluster the partials whose relative distance is the smallest. The density criterion verifies that these two partials actually belong to the same acoustical entity.
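The density criterion can be sketched as the fraction of elements whose nearest neighbor carries the same class label. In this sketch an element is not allowed to be its own nearest neighbor and ties are broken by taking the first minimum; both choices, as well as the names and the toy data, are assumptions of this example.

```python
def density_criterion(dist, labels):
    """Fraction of elements whose nearest neighbor is in the same class."""
    n = len(labels)
    hits = 0
    for a in range(n):
        others = [c for c in range(n) if c != a]   # exclude the element itself
        nearest = min(others, key=lambda c: dist[a][c])
        if labels[nearest] == labels[a]:
            hits += 1
    return hits / n

# Toy example: element 3 is closer to element 0 (other class) than to
# element 2 (same class), so one element out of four fails the test.
labels = [0, 0, 1, 1]
dist = [
    [0.0, 0.1, 0.5, 0.2],
    [0.1, 0.0, 0.5, 0.5],
    [0.5, 0.5, 0.0, 0.3],
    [0.2, 0.5, 0.3, 0.0],
]
density = density_criterion(dist, labels)
```

A clustering algorithm that merges closest pairs first would make its earliest mistakes exactly on the elements that fail this test.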
More formally, given a set of elements $X$, $D(X)$ is defined as the ratio of couples [...]

$$ cl: X \rightarrow \mathbb{N}, \quad a \mapsto i $$

where $i$ is the index of the class of $a$. We get:

$$ D(X) = \frac{\#\{(a, b) \mid d(a, b) = \min_{c \in X} d(a, c) \wedge cl(a) = cl(b)\}}{\#X} \tag{22} $$

where $X$ can be either an acoustical entity $C_i$ or the universe $U$ and $\#x$ denotes the cardinal of $x$.

5.3.3 Classification criterion
For this criterion, the quality of the tested metric is evaluated by considering the quality of a classification done using the tested metric and a classification algorithm.
We consider an agglomerative hierarchical clustering (AHC) procedure [22]. This algorithm produces a series of partitions of the partials: $(P_n, P_{n-1}, \ldots, P_1)$. The first partition $P_n$ consists of $n$ singletons and the last partition $P_1$ consists of a single class containing all the partials. At each stage, the method joins together the two clusters of partials which are most similar according to the chosen dissimilarity metric. At the first stage, of course, this amounts to joining together the two partials that are closest together, since at the initial stage each cluster has only one partial. At each stage, the dissimilarity between the new cluster and the other ones is computed using the method proposed by Ward [47].
Hierarchical clustering may be represented by a two-dimensional diagram known as a dendrogram, which illustrates the fusions made at each successive stage of clustering, see Figure 7, where the length of the vertical bar that links two classes is calculated according to the distance between the two joined clusters.
The acoustical entities can then be found by cutting the dendrogram at relevant levels. Here, for the classification criterion, the acoustical entities are identified by simply cutting the dendrogram at the highest levels to achieve the desired number of entities. If the desired number of entities is 2, only the highest level is cut (see Figure 7).
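The clustering procedure can be sketched as follows. The Ward update is written in its Lance-Williams form on squared dissimilarities, which treats the input matrix as if it held Euclidean distances; this treatment, the function name and the toy data are assumptions of this example, and the implementation used for the experiments may differ.

```python
def ward_clustering(dist, n_clusters):
    """Agglomerative clustering with the Ward update (Lance-Williams form).

    dist: symmetric dissimilarity matrix (list of lists).
    Returns the clusters as sorted lists of element indices.
    """
    n = len(dist)
    clusters = {i: [i] for i in range(n)}
    # Squared dissimilarities between current clusters, keyed by (low, high).
    d2 = {(i, j): dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)}

    def get(a, b):
        return d2[(a, b)] if a < b else d2[(b, a)]

    next_id = n
    while len(clusters) > n_clusters:
        i, j = min(d2, key=d2.get)            # closest pair of clusters
        ni, nj = len(clusters[i]), len(clusters[j])
        updated = {}
        for k in clusters:
            if k in (i, j):
                continue
            nk = len(clusters[k])
            updated[k] = ((ni + nk) * get(i, k) + (nj + nk) * get(j, k)
                          - nk * get(i, j)) / (ni + nj + nk)
        merged = clusters.pop(i) + clusters.pop(j)
        d2 = {pair: v for pair, v in d2.items()
              if i not in pair and j not in pair}
        clusters[next_id] = merged
        for k, v in updated.items():
            d2[(k, next_id)] = v              # k < next_id always holds
        next_id += 1
    return sorted(sorted(c) for c in clusters.values())

# Six hypothetical partials placed on a line so that indices 0, 1 and 5
# are close to each other, as are indices 2, 3 and 4.
positions = [0.0, 0.2, 1.0, 1.1, 1.3, 0.1]
dist = [[abs(p - q) for q in positions] for p in positions]
entities = ward_clustering(dist, 2)
```

Cutting at two clusters mirrors the cut at the highest level of the dendrogram; with these positions the two recovered groups have the same composition as the two entities of Figure 7 (using zero-based indices).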
The classification criterion $H$ is then defined as the number of partials correctly classified versus the number of partials classified:

$$ H(X) = \frac{\#\{a \mid a \in \hat{C}_i \wedge cl(a) = i\}}{\#X} \tag{23} $$

where $\hat{C}_i$ is an acoustical entity extracted from the hierarchy.

6 Results
The metrics reviewed in Section 3 and proposed in Section 4 are now compared using the evaluation methodology described in the previous section. The correlation metric $d_c$ of Equation 6 and the metric $d_v$ proposed by T. Virtanen (see Equation 8) require no parameterization. The metric $d_{ar}$ considers AR vectors of 4 coefficients computed with the Burg [...] computation of the metric $d_{sp}$ (see Equation 13) is similar except that a 9th order polynomial is first estimated and removed before the FFT computation. The results are presented as mean values for each criterion, and the bracketed values are the standard deviations (not shown for $F$ since the value is already normalized).

Fig. 7 Dendrogram representing the hierarchy obtained using the AHC algorithm with 6 partials ($p_1$ to $p_6$). The cut at the highest level of the hierarchy, represented by a dot, identifies two acoustical entities $C_1 = \{p_1, p_6, p_2\}$ and $C_2 = \{p_3, p_4, p_5\}$.

Metric  F      D              H
d_c     2.909  0.938 (0.216)  0.929 (0.137)
d_v     1.763  0.929 (0.230)  0.881 (0.172)
d_ar    1.863  0.712 (0.326)  0.757 (0.166)
d_s     3.488  0.944 (0.210)  0.940 (0.130)
d_sp    2.909  0.936 (0.219)  0.931 (0.133)
Table 1 Three criteria (Fisher, density, hierarchical classification) results for the five metrics presented in this paper, applied on the frequencies of the partials. The density and hierarchical criteria (two last columns) are presented as scores between 0 and 1. For every criterion, a higher value means better performance.

6.1 Frequency Parameter
The results of the metrics between partials based on the frequency parameter are shown in Table 1. The $d_s$ metric we proposed gives the best results for the three criteria. It should be noted that the correlation metric ($d_c$) also gives good results for the two last criteria. We can also see that removing the polynomial from the frequencies of the partials does not contribute to the quality of the metric since the frequencies of the partials of the sounds in the IOWA database are quasi-stationary. The performance is even worse because of the modulations that the polynomial might take away from the frequency evolutions.
6.2 Amplitude Parameter
As presented in Table 2, the performance of the metrics for the amplitude parameter is globally worse than that obtained for the frequency parameter, lowering from [...]

Metric  F      D              H
d_c     [...]
d_v     1.298  0.784 (0.316)  0.773 (0.159)
d_ar    1.938  0.664 (0.331)  0.733 (0.156)
d_s     1.452  0.778 (0.301)  0.781 (0.163)
d_sp    1.366  0.796 (0.297)  0.803 (0.171)
Table 2 Three criteria (Fisher, density, hierarchical classification) results for the five metrics presented in this paper, applied on the amplitudes of the partials. The density and hierarchical criteria (two last columns) are presented as scores between 0 and 1. For every criterion, a higher value means better performance.

Metric  F      D              H
d_v+v   1.298  0.784 (0.316)  0.773 (0.159)
d_+     2.040  0.923 (0.230)  0.928 (0.137)
d_m     3.303  0.934 (0.216)  0.943 (0.122)
d_×     2.702  0.937 (0.217)  0.951 (0.116)
Table 3 Three criteria (Fisher, density, hierarchical classification) results for the four combined metrics we defined. The density and hierarchical criteria (two last columns) are presented as scores between 0 and 1. For every criterion, a higher value means better performance.
The metric $d_c$ performs best for the density criterion since it is generally very low for very similar partials. The metric $d_{ar}$ gives a good result for the Fisher criterion while it performs badly for the two other criteria. This metric was tested in another work [24], but only on a very limited database. On a larger database such as the IOWA one, we can see that this metric does not seem very stable across the three criteria. In this matter, the spectral metrics $d_s$ and $d_{sp}$ perform best.
6.3 Combination
In order to jointly take into account the common variation cue of the frequency and amplitude parameters, we considered all possible combinations of the preceding metrics ($d_c$, $d_v$, $d_{ar}$, $d_s$, $d_{sp}$) for each spectral parameter with the three operators we proposed ($+$, $\times$, $\min$). Only the most relevant ones are presented in Table 3 for the sake of clarity. The metric $d_m$ is best for the Fisher criterion while the metric $d_\times$ shows the best results for both the density and hierarchical classification criteria (the classification performance is enhanced by 1% over the results obtained with the frequency cue only). Hence the metric $d_\times$ will be kept for the timbral discrimination presented in the next section.
7 Instruments Class Discrimination
In the previous section, we used the evaluation database globally in order to compare the different metrics. In this section, we study in detail the behavior of the proposed metric by considering several levels in the instrument hierarchy of the [...] this section. Each group corresponds to a node at a given level of the hierarchy shown in Figure 8.

Instruments   intra(a)             intra(b)             inter(a,b)
a    b        mean   σ      max    mean   σ      max    mean   σ      min
Ob   Ob       0.018  0.020  0.099  0.018  0.020  0.099  0.101  0.087  0.004
Ob   Sx       0.018  0.021  0.092  0.062  0.072  0.652  0.314  0.225  0.007
Tu   To       0.021  0.033  0.334  0.012  0.015  0.131  0.277  0.152  0.011
BW   WW       0.015  0.022  0.295  0.083  0.102  0.667  0.315  0.184  0.016
BS   SS       0.127  0.119  0.905  0.479  0.3    1.157  0.5    0.265  0.012
S    W        0.237  0.216  0.946  0.059  0.11   0.928  0.373  0.204  0.024
Table 4 Evaluation of the discrimination capabilities of the proposed metric for different instruments such as Oboe (Ob), Saxophone (Sx), Trumpet (Tu) and Trombone (To) as well as sets of instruments of the IOWA database such as Brass Winds (BW), Wood Winds (WW), Bowed Strings (BS), and Struck Strings (SS). The values in the table are respectively the mean, standard deviation and maximal values of the $d_\times$ metric.

Fig. 8 The IOWA database hierarchy: Strings (Bowed: Bass, Cello, Violin; Struck: Piano) and Wind (Brass: Trombone, Trumpet; Wood: Flute, Saxophone, Clarinet, Flute, Bassoon, Oboe).
The methodology used for these experiments is the one des ribed in Se tion 5.
For ea h experiment, we randomly sele t 100 entities of ea h onsidered groupand
theintraandinterare omputedforea h oupleofentities,ea hentitybelongingto
one group. Only ouples with dierent entities are onsidered. In order to improve
the larityof theresults,theintraand intervaluesarenotaveraged overall ouples.
Instead,the mean andthe standarddeviationis omputed,as well asthe maximum
valuerespe tivelyfortheintraandtheinter.
In the rst experiment, whi h results are reported in the rst line of Table 4,
we onsider a ousti al entities produ ed by the Oboe only. Sin ethe same groupis
onsidered onbothsides, the intravalues areequal. However, the interis not equal
totheintrasin ethe omputationoftheintrainvolvesonlythepartialsofoneentity,
whilethe omputationoftheinteralwaysinvolvespartialsofdierententities.
Inordertoseparateperfe tlytwoentitiesoftheOboe,wewouldneedtohavethe
minimumvalue oftheintergreaterthanthemaximumvalueoftheintra.Itis learly
theSaxophone andtwoinstrumentsoftheBrass Windfamily,theTrumpetandthe
Trombone.Sin ethesetofentitiesisdierentfromthepreviousexperimentwithOboe
only,theintraisslightlydierent.By onsideringtwodierentinstruments,theinteris
in reasedtoavaluethatremainsalmoststableinthehigherlevelsofthehierar hy.It
showsthatthedieren ebetweeninstrumentsisthemostsalientlevelofthehierar hy,
asfarastheproposedmetri is onsidered.
Next, the Brass Wind and the Wood Wind families achieve very low intra, meaning
that partials of the same entity of these two families are dense according to the
proposed metric. The fifth line of Table 4 presents the results obtained when
considering the Bowed Strings and Struck Strings families, which appear to be very
dissimilar. The high inter value may be explained by the different types of excitation,
which lead to very different timbres.
The partials of the acoustical entities produced by the Piano (the unique instrument of
the struck string family in the database) are spread over the feature space. Even though
the new metric considers spectral information, which does improve the performance over
the temporal information in case of micro-modulations (see Figure 3), it appears that
the micro-modulations are not as salient as larger modulations such as vibrato or
tremolo.
8 Applications
In this section, we describe some applications where such a description of the
spectro-temporal content of audio streams can be helpful.
8.1 Binaural Scene Analysis
The current paper deals with the common variation of partials. However, two more
cues are important for the perceptual gathering of partials: the common direction of
arrival, and the harmonicity among partials [6].
The common direction of arrival can be determined in the case of multichannel
audio. In the case of binaural sounds (stereo sounds recorded at the entrance of the
auditory channels), it is possible to obtain an overall good estimation of the direction
of arrival of sound sources. It is shown in [37] that the direction of arrival of
partials, although not a perfect criterion, can be used as a partial clustering cue.
The harmonicity cue has been used for the gathering of partials too, such as in [46].
By determining the harmonic relationship between partials, it is possible to gather
the partials by source on the one hand, and to point out the overlapping partials on
the other hand.
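To make the harmonicity cue concrete, the following sketch gathers partials by source
and flags overlapping ones. It is not the method of [46]: it assumes that candidate
fundamental frequencies are already known, and the relative tolerance of 3% is an
arbitrary value chosen for illustration.

```python
def harmonic_rank(frequency, f0, tolerance=0.03):
    """Return the harmonic number of `frequency` relative to `f0`, or None.

    `tolerance` is an assumed relative deviation allowed around each multiple.
    """
    rank = round(frequency / f0)
    if rank < 1:
        return None
    deviation = abs(frequency - rank * f0) / (rank * f0)
    return rank if deviation <= tolerance else None


def gather_by_source(partial_frequencies, f0_candidates, tolerance=0.03):
    """Assign each partial to the sources it is harmonic with.

    Returns (groups, overlapping): `groups` maps each f0 to its partials,
    and `overlapping` lists partials compatible with more than one source.
    """
    groups = {f0: [] for f0 in f0_candidates}
    overlapping = []
    for frequency in partial_frequencies:
        owners = [f0 for f0 in f0_candidates
                  if harmonic_rank(frequency, f0, tolerance) is not None]
        for f0 in owners:
            groups[f0].append(frequency)
        if len(owners) > 1:
            overlapping.append(frequency)
    return groups, overlapping
```

With sources at 200 Hz and 300 Hz, a partial at 600 Hz is the third harmonic of the
former and the second harmonic of the latter, so it is reported as overlapping.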
These three cues work very differently from each other. Hence, by combining them,
we think that we may be able to enhance the robustness and precision of the
partial gathering process, as the diversity added by the different cues shows
interesting perspectives.
8.2 Acoustical Entities Similarity
We are interested in this type of application since there is an increasing interest
in recommendation systems that are not based on an ontology such as genre [45]
or instrument type [21]. Alternatively, one can consider a recommendation system that
states "show me tunes that are similar to the ones I like". In this case, one needs to
define the similarity between musical audio signals, and the timbre is an interesting
dimension to consider.
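A similarity-driven recommendation of this kind can be sketched as a ranking over
descriptor vectors. The sketch below is only an assumption about how such a system
might be organised: the cosine similarity, the data layout and the function names are
ours, and the descriptor vectors are taken as given.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two descriptor vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0.0 else 0.0


def recommend(liked, catalogue, top_k=3):
    """Rank catalogue tunes by their mean similarity to the liked tunes.

    `liked` is a list of descriptor vectors; `catalogue` maps a tune name to
    its descriptor vector. Both layouts are assumptions for this sketch.
    """
    scores = {
        name: sum(cosine_similarity(vector, ref) for ref in liked) / len(liked)
        for name, vector in catalogue.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```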
We are currently investigating a generalized version of the descriptors described in
this paper for such a purpose. Preliminary evaluations show that, on continuous musical
solos, the use of those descriptors combined with standard segmental descriptors like
the MFCCs significantly improves the performance.
8.3 Singing Voice Detection
As the proposed descriptors capture the modulations over time of the spectral
parameters, they efficiently model the modulations of the singing voice, such as vibrato
or tremolo. Assuming that the singing voice is almost always modulated [39], the
proposed descriptors can be used to estimate whether a singing voice is active or not.
Preliminary experiments show competitive performance compared to state-of-the-art
statistical approaches using standard descriptors like the MFCCs [36]. As the proposed
descriptors and the MFCCs model different aspects of the audio stream, it is expected
that a combination of both approaches will provide a significant improvement.
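The idea of detecting a modulated voice from the evolution of a partial can be
illustrated as follows. This is a minimal sketch and not the descriptor evaluated in
our preliminary experiments: it assumes a frequency trajectory sampled at a known frame
rate, uses a naive DFT, and takes 4 to 8 Hz as an assumed vibrato band.

```python
import cmath
import math


def modulation_spectrum(trajectory, frame_rate):
    """Magnitude spectrum of a mean-removed parameter trajectory (naive DFT).

    Returns a list of (modulation frequency in Hz, magnitude) up to Nyquist.
    """
    n = len(trajectory)
    mean = sum(trajectory) / n
    centered = [x - mean for x in trajectory]
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        spectrum.append((k * frame_rate / n, abs(coeff)))
    return spectrum


def vibrato_ratio(trajectory, frame_rate, band=(4.0, 8.0)):
    """Share of modulation energy that falls inside `band` (assumed 4-8 Hz)."""
    spectrum = modulation_spectrum(trajectory, frame_rate)
    total = sum(m * m for _, m in spectrum)
    in_band = sum(m * m for f, m in spectrum if band[0] <= f <= band[1])
    return in_band / total if total > 0.0 else 0.0
```

A ratio close to one indicates that the partial is modulated at a vibrato-like rate,
while a steady or slowly drifting partial yields a ratio close to zero. Any decision
threshold on this ratio would have to be tuned on data.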
9 Conclusion
In this article, we have proposed a new metric that discriminates partials of
different acoustical entities by considering the evolutions of their frequency and
amplitude parameters.
Considering the correlation of the spectrum of these evolutions leads to more stable
results than the ones obtained with the AR modeling approach proposed in previous
work [24]. According to the experiments, the modulations of the frequency appear
to be the most relevant cue; however, a slight improvement can be gained concerning
the amplitude if the envelope is removed. We also demonstrated that considering the
combination of the metrics of the frequencies and the amplitudes enhances the
classification results as far as the density and hierarchical criteria are concerned.
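The principle of correlating the spectra of the parameter evolutions can be sketched
as follows. This is a simplified illustration and not the exact definition of the
proposed metric: it assumes trajectories of equal length, uses a naive DFT, and takes
one minus the Pearson correlation of the magnitude spectra as the dissimilarity.

```python
import cmath
import math


def magnitude_spectrum(trajectory):
    """Magnitude of the DFT of a mean-removed trajectory, bins 1..N/2."""
    n = len(trajectory)
    mean = sum(trajectory) / n
    centered = [x - mean for x in trajectory]
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered)))
            for k in range(1, n // 2 + 1)]


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    if var_u == 0.0 or var_v == 0.0:
        return 0.0
    return cov / math.sqrt(var_u * var_v)


def spectral_dissimilarity(trajectory_1, trajectory_2):
    """1 - correlation of the magnitude spectra; near 0 for similar shapes."""
    return 1.0 - pearson(magnitude_spectrum(trajectory_1),
                         magnitude_spectrum(trajectory_2))
```

Because only magnitudes are compared, two partials modulated at the same rate remain
close even when their modulations are out of phase or differ in depth.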
This new metric may be used for the classification of partials into acoustical entities.
It has to be noted that the hierarchical classification used as a quality criterion in
our study, even though very naive, yields very good results, with about 95 percent
of correct classifications. Even better performance could certainly be obtained using
more sophisticated classification methods, which could be of interest for many MIR
applications.
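For illustration, a naive hierarchical classification of partials from a matrix of
pairwise dissimilarities can be sketched as follows. The average linkage and the stopping
rule on the number of clusters are assumptions made for this sketch; they are not
necessarily the settings used in our study.

```python
def agglomerate(dissimilarity, n_clusters):
    """Average-linkage agglomerative clustering on a square matrix.

    `dissimilarity[i][j]` is the dissimilarity between partials i and j.
    Returns a list of clusters, each a sorted list of partial indices.
    """
    clusters = [[i] for i in range(len(dissimilarity))]

    def linkage(c1, c2):
        # mean dissimilarity over all cross-cluster couples of partials
        total = sum(dissimilarity[i][j] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # find the closest pair of clusters and merge it
        _, first, second = min(
            (linkage(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters)))
        clusters[first] = sorted(clusters[first] + clusters[second])
        del clusters[second]
    return clusters
```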
Acknowledgements This work was initiated when the authors were at the LaBRI
(UMR-CNRS 5800, University of Bordeaux 1) and has been partly funded by the OSEO project
References
1. M. Abe and J. O. Smith III. AM/FM rate estimation for time-varying sinusoidal modeling. In Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'05), volume 3, pages iii/201-iii/204, 18-23 March 2005.
2. J.-J. Aucouturier and F. Pachet. The influence of polyphony on the dynamical modelling of musical timbre. Pattern Recognition Letters, 28(5):654-661, 2007.
3. F. Auger and P. Flandrin. Improving the Readability of Time-Frequency and Time-Scale Representations by the Reassignment Method. IEEE Transactions on Signal Processing, 43:1068-1089, May 1995.
4. R. Badeau, G. Richard, and B. David. Performance of ESPRIT for estimating mixtures of complex exponentials modulated by polynomials. IEEE Transactions on Signal Processing, 56:492-504, 2008.
5. J. P. Bello and J. Pickens. A Robust Mid-level Representation for Harmonic Content in Music Signals. In ISMIR, October 2005.
6. A. S. Bregman. Auditory Scene Analysis: The Perceptual Organization of Sound. The MIT Press, 1990.
7. J. P. Burg. Maximum Entropy Spectral Analysis. PhD thesis, Stanford University, 1975.
8. M. G. Christensen and S. H. Jensen. On perceptual distortion minimization and nonlinear least-squares frequency estimation. IEEE Transactions on Audio, Speech, and Language Processing, 14(1):99-109, Jan. 2006.
9. M. Cooke. Modelling Auditory Processing and Organization. Cambridge University Press, New York, 1993.
10. L. Daudet. Sparse and structured decompositions of signals with the molecular matching pursuit. IEEE Transactions on Audio, Speech, and Language Processing, 14(5):1808-1816, Sept. 2006.
11. P. Depalle, G. Garcia, and X. Rodet. Tracking of Partials for Additive Sound Synthesis Using Hidden Markov Models. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), volume 1, pages 225-228, April 1993.
12. D. Ellis. Prediction-driven computational auditory scene analysis. PhD thesis, Department of Electrical Engineering & Computer Science, M.I.T., 1996.
13. D. Ellis and D. Rosenthal. Mid-level representations for Computational Auditory Scene Analysis. In International Joint Conference on Artificial Intelligence (IJCAI) - Workshop on Computational Auditory Scene Analysis, August 1995.
14. D. Ellis and B. Vercoe. A perceptual representation of sound for auditory signal separation. In 123rd meeting of the Acoustical Society of America, May 1992.
15. P. Fernandez and J. Casajus-Quiros. Multi-Pitch Estimation for Polyphonic Musical Signals. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 3565-3568, April 1998.
16. L. Fritts. The IOWA Music Instrument Samples. Online. URL: http://theremin.music.uiowa.edu, 1997.
17. S. Grossberg. Pitch Based Streaming in Auditory Perception. Cambridge MA, MIT Press, 1996.
18. P. Herrera, G. Peeters, and S. Dubnov. Automatic Classification of Musical Sounds. Journal of New Music Research, 32(1):3-21, 2003.
19. American National Standards Institute. USA Standard Acoustical Terminology, 1960.
20. F. Itakura. Minimum Prediction Residual Principle Applied to Speech Recognition. IEEE Transactions on Acoustics, Speech and Signal Processing, 23(1):67-72, 1975.
21. C. Joder, S. Essid, and G. Richard. Temporal Integration for Audio Classification with Application to Musical Instrument Classification. IEEE Transactions on Audio, Speech and Language Processing, 17(1):174-186, 2009.
22. S. C. Johnson. Hierarchical Clustering Schemes. Psychometrika, 2(2):241-254, 1967.
23. A. Klapuri. Separation of Harmonic Sounds Using Linear Models for the Overtone Series. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2002.
24. M. Lagrange. A New Dissimilarity Metric For The Clustering Of Partials Using The Common Variation Cue. In Proceedings of the International Computer Music Conference
25. M. Lagrange and S. Marchand. Estimating the instantaneous frequency of sinusoidal components using phase-based methods. Journal of the Audio Engineering Society, 2007.
26. M. Lagrange, S. Marchand, and J. Rault. Enhancing the tracking of partials for the sinusoidal modeling of polyphonic sounds. IEEE Transactions on Audio, Speech and Language Processing, 28:357-366, Aug. 2007.
27. M. Lagrange, S. Marchand, and J.-B. Rault. Using Linear Prediction to Enhance the Tracking of Partials. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), volume 4, pages 241-244, May 2004.
28. M. Lagrange, L. G. Martins, J. Murdoch, and G. Tzanetakis. Normalized Cuts for Predominant Melodic Source Separation. IEEE Transactions on Audio, Speech and Language Processing, 16(2):278-290, 2008.
29. J. Laroche. The use of the matrix pencil method for the spectrum analysis of musical signals. The Journal of the Acoustical Society of America, 94(4):1958-1965, 1993.
30. S. Marchand and M. Raspaud. Enhanced Time-Stretching Using Order-2 Sinusoidal Modeling. In Proc. DAFx, pages 76-82. Federico II University of Naples, Italy, October 2004.
31. K. D. Martin and Y. E. Kim. Musical Instrument Recognition: a pattern-recognition approach. In 136th meeting of the Acoustical Society of America, October 1998.
32. S. McAdams. Segregation of Concurrent Sounds: Effects of Frequency Modulation Coherence. Journal of the Audio Engineering Society, 86(6):2148-2159, 1989.
33. R. J. McAulay and T. F. Quatieri. Speech Analysis/Synthesis Based on a Sinusoidal Representation. IEEE Transactions on Acoustics, Speech and Signal Processing, 34(4):744-754, 1986.
34. A. Nealen. An as-short-as-possible introduction to the least squares, weighted least squares and moving least squares methods for scattered data approximation and interpolation. URL: http://www.nealen.com/projects/, May 2004.
35. L. Nunes, R. Merched, and L. Biscainho. Recursive least-squares estimation of the evolution of partials in sinusoidal analysis. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2007.
36. M. Ramona and G. Richard. Vocal detection in music with support vector machines. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2008.
37. M. Raspaud and G. Evangelista. Binaural partial tracking. In Proc. DAFx, pages 123-128, Espoo, Finland, September 2008.
38. M. Raspaud, S. Marchand, and L. Girin. A Generalized Polynomial and Sinusoidal Model for Partial Tracking and Time Stretching. In Proc. DAFx, pages 24-29. Universidad Politécnica de Madrid, September 2005. ISBN: 84-7402-318-1.
39. L. Régnier and G. Peeters. Singing voice detection in music tracks using direct voice vibrato detection. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2009.
40. A. Röbel. Adaptive additive modeling with continuous parameter trajectories. IEEE Transactions on Acoustics, Speech and Signal Processing, 14(4):1440-1453, 2006.
41. J. Rosier and Y. Grenier. Unsupervised Classification Techniques for Multipitch Estimation. In 116th Convention of the Audio Engineering Society. Audio Engineering Society (AES), May 2004.
42. A. Röbel. Frequency-slope estimation and its application to parameter estimation for non-stationary sinusoids. Computer Music Journal, 32:68-79, 2008.
43. X. Serra. Musical Signal Processing with Sinusoids plus Noise, chapter 3, pages 91-122. Studies on New Music Research. Swets & Zeitlinger, Lisse, the Netherlands, 1997.
44. A. Sterian and G. H. Wakefield. A Model-Based Approach to Partial Tracking for Musical Transcription. SPIE annual meeting, San Diego, California, 1998.
45. G. Tzanetakis and P. Cook. Musical genre classification of audio signals. IEEE Transactions on Audio, Speech and Language Processing, 10(5):293-302, 2002.
46. T. Virtanen and A. Klapuri. Separation of Harmonic Sound Sources Using Sinusoidal Modeling. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), volume 2, pages 765-768, April 2000.
47. J. H. Ward. Hierarchical Grouping to Optimize an Objective Function. Journal of the