HAL Id: hal-02287339
https://hal.telecom-paris.fr/hal-02287339
Submitted on 4 Feb 2021
Phase reconstruction of spectrograms with linear
unwrapping : application to audio signal restoration
Paul Magron, Roland Badeau, Bertrand David
To cite this version:
Paul Magron, Roland Badeau, Bertrand David. Phase reconstruction of spectrograms with linear
unwrapping : application to audio signal restoration. [Research Report] 2015D002, Télécom ParisTech.
2015. ⟨hal-02287339⟩
April 2015
Signal and Image Processing Department (Département Traitement du Signal et des Images)
AAO Group: Audio, Acoustics and Waves
Institut Mines-Télécom, Télécom ParisTech, CNRS LTCI, Paris, France
<firstname>.<lastname>@telecom-paristech.fr
Abstract
This paper introduces a novel technique for reconstructing the phase of modified spectrograms of audio signals. From the analysis of mixtures of sinusoids we obtain relationships between phases of successive time frames in the Time-Frequency (TF) domain. To obtain similar relationships over frequencies, in particular within onset frames, we study an impulse model. Instantaneous frequencies and attack times are estimated locally to encompass the class of non-stationary signals such as vibratos. These techniques ensure both the vertical coherence of partials (over frequencies) and the horizontal coherence (over time). The method is tested on a variety of data and demonstrates better performance than traditional consistency-based approaches. We also introduce an audio restoration framework and observe that our technique outperforms traditional methods.
Key words
Phase reconstruction, sinusoidal modeling, linear unwrapping, phase consistency, audio restoration.
∗ This work is partly supported by the French National Research Agency (ANR) as a part of the EDISON 3D project
1 Introduction
A variety of music signal processing techniques act in the TF domain, exploiting the particular structure of music signals. For instance, the family of techniques based on Nonnegative Matrix Factorization (NMF) is often applied to spectrogram-like representations, and has proved to provide a successful and promising framework for source separation [1]. Magnitude-recovery techniques are also useful for restoring missing data in corrupted signals [2].
However, when it comes to resynthesizing time signals, the phase recovery of the corresponding Short-Time Fourier Transform (STFT) is necessary. In the source separation framework, a common practice consists in applying Wiener-like filtering (soft masking of the complex-valued STFT of the original mixture). When there is no prior on the phase of a component (e.g. in the context of audio restoration), a consistency-based approach is often used for phase recovery [3]. That is, a complex-valued matrix is iteratively computed to be close to the STFT of a time signal. A recent benchmark has been conducted to assess the potential of source separation methods with phase recovery in NMF [4]. It points out that consistency-based approaches provide poor results in terms of audio quality. Besides, Wiener filtering fails to provide good results when sources overlap in the TF domain. Thus, phase recovery of modified audio spectrograms is still an open issue. The High Resolution NMF (HRNMF) model [5] has been shown to be a promising approach, since it models a TF mixture as a sum of autoregressive (AR) components in the TF domain, thus dealing explicitly with a phase model.
Another approach to reconstruct the phase of a spectrogram is to use a phase model based on the observation of fundamental signals that are mixtures of sinusoids. Contrary to consistency-based approaches using the redundancy of the STFT, this model exploits the natural relationship between adjacent TF bins due to the model. This approach is used in the phase vocoder algorithm [6], although it is mainly dedicated to time stretching and pitch modification of signals, and it requires the phase of the original STFT. More recently, [7] proposed a complex NMF framework with phase constraints based on sinusoidal modeling. Although promising, this approach is limited to harmonic and stationary signals, and requires prior knowledge on fundamental frequencies and numbers of partials.
In this paper, we propose a generalization of this approach that consists in estimating the phase field of mixtures of sinusoids from its explicit calculation. We then obtain an algorithm which unwraps the phases horizontally (over time frames) to ensure the temporal coherence of the signal, and vertically (over frequency channels) to enforce spectral coherence between partials, which are naturally observed in musical acoustics. Our technique is suitable for a variety of pitched music signals, such as piano or guitar sounds. A dynamic estimation (at each time frame) of instantaneous frequencies extends the validity of this technique to non-stationary signals such as cellos and speech. This technique is tested on a variety of signals and integrated in an audio restoration framework.
The paper is organized as follows. Section 2 presents the horizontal phase unwrapping model. Section 3 is dedicated to phase reconstruction on onset frames. Section 4 presents a performance evaluation of this technique through various experiments. Section 5 introduces an audio restoration framework using this phase recovery method. Finally, section 6 draws some concluding remarks.
2 Horizontal phase reconstruction
2.1 Sinusoidal modeling
Let us consider a sinusoid of normalized frequency $f_0 \in [-\frac{1}{2}; \frac{1}{2}]$, origin phase $\phi_0 \in [-\pi; \pi]$ and amplitude $A > 0$:

$$\forall n \in \mathbb{Z}, \quad x(n) = A e^{2i\pi f_0 n + i\phi_0}. \tag{1}$$

The expression of the STFT is, for each frequency channel $k \in \left[\!\left[ -\frac{F-1}{2}; \frac{F-1}{2} \right]\!\right]$ (with $F$ the odd-valued Fourier transform length) and time frame $t \in \mathbb{Z}$:

$$X(k,t) = \sum_{n=0}^{N-1} x(n + tS)\, w(n)\, e^{-2i\pi \frac{k}{F} n} \tag{2}$$

where $w$ is an $N$ sample-long analysis window and $S$ is the time shift (in samples) between successive frames. Let $W(f) = \sum_{n=0}^{N-1} w(n) e^{-2i\pi f n}$ be the discrete time Fourier transform of the analysis window for each normalized frequency $f \in [-\frac{1}{2}; \frac{1}{2}]$. Then the STFT of the sinusoid (1) is:

$$X(k,t) = A e^{2i\pi f_0 S t + i\phi_0}\, W\!\left(\frac{k}{F} - f_0\right). \tag{3}$$

The unwrapped phase of the STFT is then:

$$\phi(k,t) = \phi_0 + 2\pi S f_0 t + \angle W\!\left(\frac{k}{F} - f_0\right) \tag{4}$$

where $\angle z$ denotes the argument of the complex number $z$. This leads to a relationship between two successive time frames:

$$\phi(k,t) = \phi(k,t-1) + 2\pi S f_0. \tag{5}$$

More generally, we can compute the phase of the STFT of a frequency-modulated sinusoid. If the frequency variation is low between two successive time frames, we can generalize the previous equation:

$$\phi(k,t) = \phi(k,t-1) + 2\pi S f_0(t). \tag{6}$$

Instantaneous frequency must then be estimated at each time frame to encompass variable frequency signals such as vibratos, which commonly occur in music signals (singing voice or cello signals for instance).
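The recursion of equations (5) and (6) can be checked numerically on the sinusoid of equation (1). The sketch below is only an illustration under assumed settings: the window (a periodic Hann window), the values of N, F, S, A, f0 and phi0, and all function names are hypothetical choices and do not come from the report. It evaluates X(k, t) directly from equation (2), predicts each frame's phase from the previous one, and measures the largest deviation modulo 2π.

```python
import cmath
import math

def hann(N):
    """Periodic Hann window of length N (an illustrative choice of w)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]

def sinusoid(n, A, f0, phi0):
    """Complex sinusoid of equation (1)."""
    return A * cmath.exp(1j * (2 * math.pi * f0 * n + phi0))

def stft_bin(signal, w, k, t, S, F):
    """X(k, t) of equation (2) for a signal given as a function of n."""
    return sum(signal(n + t * S) * w[n] * cmath.exp(-2j * math.pi * k * n / F)
               for n in range(len(w)))

def unwrap_horizontally(phi_prev, f_inst, S):
    """Equation (6): phase at frame t from the phase at frame t - 1."""
    return phi_prev + 2 * math.pi * S * f_inst

def wrapped_distance(a, b):
    """Absolute difference between two angles, modulo 2*pi."""
    return abs(cmath.phase(cmath.exp(1j * (a - b))))

# Hypothetical parameters, chosen only for this example.
N, F, S = 64, 65, 16                  # window length, odd FFT length, hop size
A, f0, phi0 = 1.0, 0.1234, 0.7
k = round(f0 * F)                     # channel closest to the sinusoid
w = hann(N)

phases = [cmath.phase(stft_bin(lambda n: sinusoid(n, A, f0, phi0), w, k, t, S, F))
          for t in range(5)]
max_error = max(wrapped_distance(unwrap_horizontally(phases[t - 1], f0, S), phases[t])
                for t in range(1, 5))
print(f"largest prediction error: {max_error:.2e} rad")
```

For a stationary complex sinusoid the prediction should agree with the measured phase up to rounding error, since the window term of equation (4) does not depend on t.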
2.2 Instantaneous frequency estimation
Quadratic interpolation FFT (QIFFT) is a powerful tool for estimating the instantaneous frequency near a magnitude peak in the spectrum [8]. It consists in approximating the shape of a spectrum near a magnitude peak by a parabola. This parabolic approximation is justified theoretically for Gaussian analysis windows, and used in practical applications for any window type. The computation of the maximum of the parabola leads to the instantaneous frequency estimate. Note that this technique is suitable for signals where only one sinusoid is active per frequency channel.
The frequency bias of this method can be reduced by increasing the zero-padding factor [9]. For a Hann window without zero-padding, the frequency estimation error is less than 1 %, which is hardly perceptible in most music applications according to the authors.
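A minimal sketch of the quadratic interpolation step is given below. It assumes that the parabola is fitted to the log-magnitudes of the peak channel and its two neighbors (the form for which the fit is exact with a Gaussian window); the test signal, the sizes N and F, and the function names are hypothetical and only serve the illustration.

```python
import cmath
import math

def dft_magnitudes(frame, F):
    """Magnitudes of the F-point DFT of a frame, for channels 0 .. F // 2."""
    return [abs(sum(v * cmath.exp(-2j * math.pi * k * n / F)
                    for n, v in enumerate(frame)))
            for k in range(F // 2 + 1)]

def qifft_frequency(mags, k, F):
    """Normalized frequency at the vertex of a parabola through 3 log-magnitudes."""
    a, b, c = (math.log(mags[k + d]) for d in (-1, 0, 1))
    offset = 0.5 * (a - c) / (a - 2 * b + c)   # vertex position relative to k, in bins
    return (k + offset) / F

# Hypothetical test signal: one complex sinusoid under a Hann window, no zero-padding.
N = F = 128
f_true = 0.1037
frame = [(0.5 - 0.5 * math.cos(2 * math.pi * n / N))
         * cmath.exp(2j * math.pi * f_true * n) for n in range(N)]

mags = dft_magnitudes(frame, F)
k_peak = max(range(1, len(mags) - 1), key=lambda k: mags[k])
f_est = qifft_frequency(mags, k_peak, F)
rel_error = abs(f_est - f_true) / f_true
print(f"estimate {f_est:.5f}, relative error {100 * rel_error:.3f} %")
```

On this example the estimate lands within a small fraction of a bin of the true frequency, which is consistent with the sub-1 % error quoted above for a Hann window without zero-padding.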
2.3 Regions of influence
When the mixture is composed of one sinusoid, the phase must be unwrapped in all frequency channels according to (5) using the instantaneous frequency $f_0$. When there is more than one sinusoid, frequency estimation is performed near each magnitude peak. Then, the whole frequency range must be decomposed in several regions (regions of influence [6]) to ensure that the phase in a given frequency channel is unwrapped with the appropriate instantaneous frequency.
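The decomposition into regions can be sketched as follows, using the amplitude-weighted boundaries of equation (7). Several points are assumptions made for the example and are not stated in the report: channels are indexed from 0, the regions of the first and last peaks extend to the edges of the frequency range, a channel lying exactly on a boundary is given to the lower peak, and all function names are hypothetical.

```python
import math

def region_bounds(peak_channels, peak_magnitudes):
    """Boundaries between consecutive regions of influence, following equation (7)."""
    bounds = []
    for p in range(len(peak_channels) - 1):
        k_p, k_next = peak_channels[p], peak_channels[p + 1]
        a_p, a_next = peak_magnitudes[p], peak_magnitudes[p + 1]
        # Upper bound of I_p, which is also the lower bound of I_{p+1}.
        bounds.append((a_p * k_next + a_next * k_p) / (a_p + a_next))
    return bounds

def assign_channels(num_channels, peak_channels, peak_magnitudes):
    """For each channel 0 .. num_channels - 1, index of the peak that governs it."""
    bounds = region_bounds(peak_channels, peak_magnitudes)
    owners, p = [], 0
    for k in range(num_channels):
        while p < len(bounds) and k > bounds[p]:
            p += 1
        owners.append(p)
    return owners

def unwrap_frame(phi_prev, owners, peak_freqs, S):
    """Equation (6) applied channel by channel with the governing peak frequency."""
    return [phi + 2 * math.pi * S * peak_freqs[p] for phi, p in zip(phi_prev, owners)]

# Hypothetical frame: two peaks, the first three times stronger than the second.
owners = assign_channels(32, peak_channels=[10, 20], peak_magnitudes=[3.0, 1.0])
boundary = region_bounds([10, 20], [3.0, 1.0])[0]
print(boundary, owners.count(0), owners.count(1))
```

Because the boundary is a weighted mean of the two peak channels with weights swapped, it moves away from the stronger peak, so the stronger peak governs more channels.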
At time frame $t$, we consider a magnitude peak $A_p$ in channel $k_p$. The magnitudes (resp. the frequency channels) of neighboring peaks are denoted $A_{p-1}$ and $A_{p+1}$ (resp. $k_{p-1}$ and $k_{p+1}$). We define the region of influence $I_p$ of the $p$-th peak as follows:

$$I_p = \left[ \frac{A_p k_{p-1} + A_{p-1} k_p}{A_p + A_{p-1}} \; ; \; \frac{A_p k_{p+1} + A_{p+1} k_p}{A_p + A_{p+1}} \right]. \tag{7}$$

The greater $A_p$ is relative to $A_{p-1}$ and $A_{p+1}$, the wider $I_p$ is. Note that other definitions of regions of [...]

3.1 Impulse model
Impulse signals are useful to obtain a relationship between phases over frequencies (vertical unwrapping) [10]. Although they do not accurately model attack sounds, they provide simple equations that can be further improved for more complex signals. The model is:

$$\forall n \in \mathbb{Z}, \quad x(n) = A\, \delta_{n - n_0} \tag{8}$$

where $\delta$ is equal to one if $n = n_0$ (the so-called attack time) and zero elsewhere and $A > 0$ is the amplitude. Its STFT is equal to zero except within attack frames:

$$X(k,t) = A\, w(n_0 - St)\, e^{-2i\pi \frac{k}{F}(n_0 - St)}. \tag{9}$$

We can then obtain a relationship between the phases of two successive frequency channels within an onset frame, assuming that $w \geq 0$:

$$\phi(k,t) = \phi(k-1,t) - \frac{2\pi}{F}(n_0 - St). \tag{10}$$

The similarity between (10) and (5) was expected because the impulse is the dual of the sinusoid in the TF domain. This comparison naturally leads to estimating parameter $n_0$ (the "instantaneous" attack time) in each frequency channel as we previously estimated $f_0$ (the instantaneous frequency) in each time frame (cf. equation (6)). This leads to the following vertical unwrapping equation:

$$\phi(k,t) = \phi(k-1,t) - \frac{2\pi}{F}(n_0(k) - St). \tag{11}$$

3.2 Attack time estimation
In order to estimate $n_0(k)$, we look at the magnitude of the STFT of the impulse in a frequency channel $k$:

$$|X(k,t)| = A\, w(n_0(k) - St). \tag{12}$$

We then choose $n_0$ such that the STFT magnitude of the impulse over onset frames has a shape similar to that of the analysis window. For instance, a least-squares estimation method can be used. Synthetic mixtures of impulses are perfectly reconstructed with this technique. Alternatively, we can also estimate $n_0(k)$ with a temporal QIFFT and update the phase with (11).
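The least-squares variant and the vertical unwrapping of equation (11) can be sketched on a synthetic impulse. This is one possible reading of the text, not the authors' implementation: the search over integer candidates for n0, the joint closed-form estimate of the amplitude A, the channel indexing from 0, the parameter values and the function names are all assumptions introduced for the example.

```python
import cmath
import math

def impulse_stft(A, n0, w, k, t, S, F):
    """Equation (9): STFT of A * delta(n - n0), zero outside attack frames."""
    m = n0 - S * t
    if not 0 <= m < len(w):
        return 0j
    return A * w[m] * cmath.exp(-2j * math.pi * k * m / F)

def estimate_attack_time(mags, frames, w, S, candidates):
    """Least-squares fit of A * w(n0 - S*t), equation (12), to one channel."""
    best_n0, best_cost = None, math.inf
    for n0 in candidates:
        shape = [w[n0 - S * t] if 0 <= n0 - S * t < len(w) else 0.0 for t in frames]
        energy = sum(v * v for v in shape)
        if energy == 0.0:
            continue
        amp = sum(m * v for m, v in zip(mags, shape)) / energy  # optimal A for this n0
        cost = sum((m - amp * v) ** 2 for m, v in zip(mags, shape))
        if cost < best_cost:
            best_n0, best_cost = n0, cost
    return best_n0

def unwrap_vertically(phi_first, n0_per_channel, t, S, F):
    """Equation (11): propagate the phase from channel k - 1 to channel k in frame t."""
    phases = [phi_first]
    for n0 in n0_per_channel[1:]:
        phases.append(phases[-1] - 2 * math.pi * (n0 - S * t) / F)
    return phases

# Hypothetical setup: periodic Hann window and one impulse at n0 = 40.
N, F, S = 64, 65, 16
w = [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]
A, n0_true, frames, channels = 2.0, 40, range(4), range(8)

n0_hat = [estimate_attack_time(
              [abs(impulse_stft(A, n0_true, w, k, t, S, F)) for t in frames],
              frames, w, S, range(N + S * 3))
          for k in channels]

t = 1
phi = unwrap_vertically(0.0, n0_hat, t, S, F)
max_error = max(abs(cmath.phase(cmath.exp(1j * phi[k])
                                / impulse_stft(1.0, n0_true, w, k, t, S, F)))
                for k in channels)
print(n0_hat, f"{max_error:.2e}")
```

On real attacks the magnitudes only roughly follow the window shape, so the fitted n0(k) would differ across channels; the synthetic case is used here because the reference phase is known exactly.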
4 Experimental evaluation
4.1 Protocol and datasets
The MATLAB Tempogram Toolbox [11] provides a fast and reliable onset frames detection from spectrograms. We use several datasets in our experiments:
A: 30 mixtures of piano notes from the Midi Aligned Piano Sounds (MAPS) database [12],
B: 30 piano pieces from the MAPS database,
C: 12 string quartets from the SCore Informed Source Separation DataBase (SCISSDB) [13],
D: 40 speech excerpts from the Computational Hearing in Multisource Environments (CHiME) database [14].
The data is sampled at $F_s = 11025$ Hz and the STFT is computed with a 512 sample-long Hann window and 75 % overlap. The Signal to Distortion Ratio (SDR) is used for performance measurement. It is computed with the BSS Eval toolbox [15] and expressed in dB. The popular consistency-based Griffin and Lim (GL) algorithm [3] is also used as a reference. We run 200 iterations of this algorithm (performance is not further improved beyond). It is initialized with known phase values if any or random values if not, and results are [...]
[Figure 4.1: Spectrogram of a mixture with vibrato (left) and instantaneous frequencies in the 2800 Hz channel (right). Axes: time (s), frequency (Hz). Curves in the right panel: phase vocoder, QIFFT.]

Dataset   Error (%)   GL (dB)   PU (dB)
A         0.38        −6.9      2.5
B         0.36        −12.6     1.7
C         0.41        −9.7      5.3
D         0.52        −0.4      0.5
Table 1: Frequency estimation error (%) and reconstruction performance (SDR in dB) for various audio datasets
4.2 Horizontal phase reconstruction
A first experiment consists in estimating instantaneous frequencies on synthetic mixtures of damped sinusoids, whose parameters (in particular the frequencies) are user-defined. Frequency estimation error with QIFFT is below the threshold of 0.2 %, commonly referred to as the maximal human auditory resolution.
Figure 4.1 illustrates the instantaneous frequencies estimated with the phase vocoder technique [6], used as a reference, and with our algorithm on a vibrato. Identical results are obtained. Our method is thus suitable for estimating variable instantaneous frequency signals as well as stationary components. We computed the average frequency error between phase vocoder and QIFFT estimates for the datasets presented in section 4.1. The results presented in the first column of Table 1 confirm that QIFFT provides an accurate frequency estimation.
Table 1 also presents reconstruction performance (assuming onset phases are known) for both Griffin and Lim (GL) and our Phase Unwrapping (PU) algorithms. Our approach significantly outperforms the traditional GL method: both stationary and variable frequency signals are reconstructed accurately. In addition, our algorithm is faster than the GL technique: on a 3 min 48 s piano piece, the reconstruction is performed in 18 s with our approach and in 623 s with the GL algorithm.

4.3 Onset phase reconstruction
Onset phases can be reconstructed with $n_0$-estimation using the impulse magnitude (PU-Impulse) or with QIFFT (PU-QIFFT). We also test random phase values (PU-Rand, no vertical coherence), zero phases (PU-0, partials in phase) and alternating partial phases between $0$ and $\pi$ (PU-Alt, phase-opposed partials). These choices are justified by the observation of the relationship between partials in musical acoustics [16]. The phase of the partials is then fully recovered with horizontal unwrapping. We test these methods on dataset A.
Results presented in Table 2 show that all our approaches provide better results than the GL algorithm on this class of signals. Onset phase unwrapping with $n_0$-estimation based on QIFFT provides the best result, ensuring some form of vertical coherence. In particular, we perceptually observe that this approach provides a neat percussive [...]
Method       SDR (dB)
GL           −7.9
PU-Impulse   −4.0
PU-QIFFT     −2.6
PU-Rand      −4.3
PU-0         −4.7
PU-Alt       −3.5
Table 2: Signal reconstruction performance of different methods on dataset A
[Figure 4.2: Reconstruction performance of different methods and percentages of corruption on dataset A. Axes: percentage of corrupted bins, SDR (dB). Curves: Phase unwrapping, Griffin-Lim, Corrupted.]
4.4 Complete phase reconstruction
We consider unaltered magnitude spectrograms from dataset A. A variable percentage of the STFT phases is randomly corrupted. We evaluate the performance of our algorithm to restore the phase both on onset and non-onset frames.
Figure 4.2 confirms the potential of this technique. Our method produced an average increase in SDR of 6 dB over the corrupted data. It also performs better than the GL algorithm when a high percentage of the STFT phases must be recovered.
However, note that this experiment consists in phase reconstruction of consistent spectrograms (i.e. positive matrices that are the magnitude of the STFT of a time signal): the GL algorithm is then naturally advantaged in this case. Realistic applications (cf. next section) involve the restoration of both phase and magnitude, which leads to inconsistent spectrograms.
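The corruption step of this experiment can be illustrated with a short sketch. The report only states that a percentage of the STFT phases is randomly corrupted; drawing the affected bins uniformly without replacement and replacing their phase by a value uniform on [−π, π] are assumptions made here, as are the function name and the toy data.

```python
import cmath
import math
import random

def corrupt_phases(stft, fraction, rng):
    """Give a uniformly random phase to a random subset of TF bins, keeping magnitudes."""
    bins = [(k, t) for k in range(len(stft)) for t in range(len(stft[0]))]
    chosen = rng.sample(bins, round(fraction * len(bins)))
    corrupted = [list(row) for row in stft]
    for k, t in chosen:
        theta = rng.uniform(-math.pi, math.pi)
        corrupted[k][t] = abs(stft[k][t]) * cmath.exp(1j * theta)
    return corrupted, set(chosen)

# Hypothetical 4 x 5 STFT with unit magnitudes and zero phases.
rng = random.Random(0)
clean = [[1 + 0j for _ in range(5)] for _ in range(4)]
noisy, corrupted_bins = corrupt_phases(clean, 0.3, rng)
print(len(corrupted_bins), "bins corrupted out of", 4 * 5)
```

The set of corrupted bins is returned so that a reconstruction method can be told which phases are missing, while the magnitude spectrogram stays unaltered and therefore consistent.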
5 Application of phase reconstruction to audio restoration
The goal of audio inpainting is to restore corrupted or missing values of a signal. Since corruption can be done in the temporal domain or in the TF domain, we will study those two approaches.
First, we will partially recover phase from a consistent spectrogram (the STFT magnitude is not modified). Secondly, we will reconstruct whole parts of the STFT (both magnitude and phase). Finally, a time signal is corrupted with clicks and we compare restoration with a temporal method, our approach, and the HRNMF algorithm [5].
A common alteration of music signals is the presence of noise on short time periods (a few samples) called clicks. We corrupt time signals with clicks that represent less than 1 % of the total duration. Clicks are obtained by differentiating a 10 sample-long Hann window.
Magnitude restoration of missing bins is performed by linear interpolation of the log-magnitudes in each [...]
[Figure 5.1: Restoration of a spectrogram by linear interpolation of the log-magnitudes on piano notes. Three panels; axes: time (s), frequency (Hz).]
Dataset   AR     HRNMF   GL    PU
A         11.4   16.9    8.6   11.7
B         4.3    10.9    5.9   7.1
C         8.2    10.6    6.6   7.1
D         8.3    10.9    8.9   9.4
Table 3: Signal restoration performance (SDR in dB) for various methods and datasets
[...] (PU) or alternatively with the GL algorithm. We compare those results to the traditional restoration method based on autoregressive (AR) modeling of the time signal [18], and with HRNMF [5].
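The two ingredients described for this experiment, the click shape and the log-magnitude interpolation, can be sketched as follows. Several choices are assumptions, since the surviving text does not settle them: the Hann window is taken symmetric, the interpolation runs along time within one frequency channel, gaps touching an edge repeat the nearest observed value, observed magnitudes are strictly positive, and the function names are hypothetical.

```python
import math

def click(length=10):
    """First difference of a symmetric Hann window of the given length."""
    w = [0.5 - 0.5 * math.cos(2 * math.pi * n / (length - 1)) for n in range(length)]
    return [w[n + 1] - w[n] for n in range(length - 1)]

def interpolate_log_magnitudes(mags, missing):
    """Fill the missing frames of one channel by linear interpolation in the log domain."""
    known = [t for t in range(len(mags)) if t not in missing]
    if not known:
        raise ValueError("at least one frame must be observed")
    restored = list(mags)
    for t in sorted(missing):
        left = max((u for u in known if u < t), default=None)
        right = min((u for u in known if u > t), default=None)
        if left is None or right is None:
            # Gap touching an edge: repeat the nearest observed value (assumption).
            restored[t] = mags[right if left is None else left]
            continue
        alpha = (t - left) / (right - left)
        log_m = (1 - alpha) * math.log(mags[left]) + alpha * math.log(mags[right])
        restored[t] = math.exp(log_m)
    return restored

pulse = click()
restored = interpolate_log_magnitudes([1.0, 0.0, 0.0, 8.0], missing={1, 2})
print(restored)
```

Interpolating in the log domain yields a geometric progression between the two observed magnitudes, which suits the exponential decay of notes better than a straight line on linear magnitudes.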
Table 3 presents results of restoration. HRNMF provides the best results in terms of SDR. However, our approach outperforms the traditional method and the GL algorithm. Besides, we underline that the HRNMF model uses the phase of the non-corrupted bins, while our algorithm is blind. Lastly, our technique remains faster than HRNMF: for a 3 min 55 s piano piece, restoration is performed in 99 s with our algorithm and in 222 s with HRNMF.
6 Conclusion
The new phase reconstruction technique introduced in this work appears to be an efficient and promising method. The analysis of mixtures of sinusoids leads to relationships between the phases of successive TF bins. Physical parameters such as instantaneous frequencies and attack times are estimated dynamically, encompassing a variety of signals such as piano and cello sounds. The phase is then unwrapped in all frequency channels for onset frames and over time for partials. Experiments have demonstrated the accuracy of the unwrapping method, and we integrated it in an audio restoration framework. Better results than with traditional methods have been reached.
The reconstruction of onset frames still needs to be improved as suggested by the variety of data. Further work will focus on exploiting known phase data for reconstruction: missing bins can be inferred from observed phase values. Alternatively, time-invariant parameters such as phase offsets between partials [19] can be used. Such developments will be introduced in an audio source separation framework, where the phase of the mixture [...]
References
[1] Paris Smaragdis and Judith C. Brown, "Non-negative matrix factorization for polyphonic music transcription," in Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, October 2003.
[2] Derry Fitzgerald and Dan Barry, "On inpainting the adress algorithm," in Proc. IET Irish Signals and Systems Conference (ISSC), Maynooth, Ireland, June 2012, pp. 1-6.
[3] Daniel Griffin and Jae Lim, "Signal estimation from modified short-time Fourier transform," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 32, no. 2, pp. 236-243, April 1984.
[4] Paul Magron, Roland Badeau, and Bertrand David, "Phase reconstruction in NMF for audio source separation: An insightful benchmark," in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brisbane, Australia, April 2015.
[5] Roland Badeau and Mark D. Plumbley, "Multichannel high resolution NMF for modelling convolutive mixtures of non-stationary signals in the time-frequency domain," IEEE Transactions on Audio Speech and Language Processing, vol. 22, no. 11, pp. 1670-1680, November 2014.
[6] Jean Laroche and Mark Dolson, "Improved phase vocoder time-scale modification of audio," IEEE Transactions on Speech and Audio Processing, vol. 7, no. 3, pp. 323-332, May 1999.
[7] James Bronson and Philippe Depalle, "Phase constrained complex NMF: Separating overlapping partials in mixtures of harmonic musical sources," in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy, May 2014.
[8] Mototsugu Abe and Julius O. Smith, "Design criteria for simple sinusoidal parameter estimation based on quadratic interpolation of FFT magnitude peaks," in Audio Engineering Society Convention 117, Berlin, Germany, May 2004, Audio Engineering Society.
[9] Mototsugu Abe and Julius O. Smith, "Design criteria for the quadratically interpolated FFT method (i): Bias due to interpolation," Tech. Rep. STAN-M-117, Stanford University, Department of Music, 2004.
[10] Akihiko Sugiyama and Ryoji Miyahara, "Tapping-noise suppression with magnitude-weighted phase-based detection," in Proc. of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, October 2013, pp. 1-4.
[11] Peter Grosche and Meinard Müller, "Tempogram Toolbox: MATLAB tempo and pulse analysis of music recordings," in Proc. International Society for Music Information Retrieval (ISMIR) Conference, Miami, USA, October 2011.
[12] Valentin Emiya, Nancy Bertin, Bertrand David, and Roland Badeau, "MAPS - A piano database for multipitch estimation and automatic transcription of music," Tech. Rep. 2010D017, Télécom ParisTech, Paris, France, July 2010.
[13] Romain Hennequin, Roland Badeau, and Bertrand David, "Score informed audio source separation using a parametric model of non-negative spectrogram," in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, May 2011, pp. 45-48.
[14] Jon Barker, Emmanuel Vincent, Ning Ma, Heidi Christensen, and Phil Green, "The PASCAL CHiME Speech Separation and Recognition Challenge," Computer Speech and Language, vol. 27, no. 3, pp. 621-633, Feb. 2013.
[15] Emmanuel Vincent, Rémi Gribonval, and Cédric Févotte, "Performance measurement in blind audio source separation," IEEE Transactions on Speech and Audio Processing, vol. 14, no. 4, pp. 1462-1469, July 2006.
[...] factorization, Image and Vision Computing, 2007.
[18] Simon J. Godsill and Peter J. W. Rayner, Digital Audio Restoration - A Statistical Model-Based Approach, Springer-Verlag, 1998.
[19] Holger Kirchhoff, Roland Badeau, and Simon Dixon, "Towards complex matrix decomposition of spectrogram based on the relative phase offsets of harmonic sounds," in Proc. IEEE International Conference on