Université Pierre et Marie Curie – Paris VI
Master 2 ATIAM internship thesis
Perceptual tempo estimation and
reduction of tempo octave errors
Joachim FLOCON-CHOLET
Supervised by Geoffroy Peeters
1 March 2012 – 31 July 2012
Institut de Recherche et Coordination Acoustique/Musique, 4, place Igor Stravinsky, 75004 Paris
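Notation: $x$, $y$, $T_a$, $\alpha$, $X$, $p(x)$, $E$, $\mathcal{N}(z \mid \mu, \Sigma)$, $\mu$, $\Sigma$, $T_e$, $T_a$. Abbreviations: BPM, MFCC, GMM.

With $T_i$, $i = 1, \dots, 5$, and $M$:

$$\hat{T} = \arg\max_{T_i} \; p(T_i \mid M)$$

$TF(y)$, $FM\text{-}ACF(y)$, $TF(y) \times FM\text{-}ACF(y)$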
[Figure: block diagram. Stages: Audio, Onset Detection Function, Periodicity Estimation, Tempo/Meter Estimation, Meter/Beat Subdivision Template. Methods: Reassigned Spectral Energy Flux, Combined DFT / FM-ACF, Viterbi decoding of MBST over time.]
[Figure: musical genres represented in the database (axis: number of occurrences, 0 to 150). Genres: rock, country, pop, soul, classic rock, rnb, christian, oldies, alternative, disco, vocalists, hip-hop, christmas, dance, reggae, latin, jazz, hard rock, rap, salsa, funk, folk.]
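$$|\bar{x}_r - \bar{x}_s| < 6\% \, \max\{\bar{x}_r; \bar{x}_s\}$$

with $\bar{x}_r$ and $\bar{x}_s$ defined for $r$ and $s$ by

$$\bar{x}_r = \frac{1}{n_r} \sum_{i=1}^{n_r} x_{ri}$$

6%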
$T_a$, 4%; $T_e$, $T_a$

$$T_e = k T_a, \qquad k \in \{1/3; 1/2; 2/3; 2; 3\}$$

Ratios (%): $T_e = 2T_a$, $T_e = T_a/2$, $T_e = \tfrac{2}{3}T_a$, $T_e = \tfrac{3}{2}T_a$, $T_e = 3T_a$, $T_e = T_a/3$

$\pm 6\%$
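As a minimal sketch of how the relation $T_e = kT_a$ could be checked in practice, the function below tests whether an estimated tempo falls within a relative tolerance of $k$ times the annotated tempo. The function name `classify_tempo_ratio`, the inclusion of $k = 1$ (correct estimate), and the choice of measuring the tolerance relative to $kT_a$ are assumptions made for illustration; the default tolerance of 6% is taken from the value quoted above.

```python
def classify_tempo_ratio(t_est, t_ann, tol=0.06):
    """Return the ratio k such that t_est is within tol of k * t_ann, else None.

    Hypothetical helper: the ratio list is the set quoted in the text,
    plus k = 1 (correct tempo) and k = 3/2, which also appears in the list
    of ratios.
    """
    candidates = [1.0, 1 / 3, 1 / 2, 2 / 3, 3 / 2, 2.0, 3.0]
    for k in candidates:
        target = k * t_ann
        # Relative tolerance measured against the target tempo (assumption).
        if abs(t_est - target) <= tol * target:
            return k
    return None


if __name__ == "__main__":
    print(classify_tempo_ratio(120.0, 60.0))  # double-tempo case
    print(classify_tempo_ratio(100.0, 60.0))  # no listed ratio matches
```

With a 6% tolerance the acceptance intervals around the listed ratios do not overlap, so the order of the candidate list does not affect the result.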
Columns (in %): $T_e = 2T_a$, $T_e = T_a/2$, $T_e = 3T_a$, $T_e = T_a/3$
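$S(f_k, t)$ at frequency $f_k$ and time $t_i$, with $l \in [0, 11]$:

$$n(f_k, t_i) = 12 \log_2\left(\frac{f_k}{f_{ref}}\right) \bmod 12$$

where $f_{ref}$ is the reference frequency. For each $t_i$:

$$c(l, t_i) = \sum_{f_k \,:\, n(f_k, t_i) = l} |S(f_k, t_i)|^2, \qquad l = 0, 1, \dots, 11$$

$C(l, t)$ over $t$; around $t_i$, two windows $L$ and $R$.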
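The mapping from frequency bins to the twelve classes $l$ and the accumulation of $|S(f_k, t_i)|^2$ can be sketched for a single frame as follows. This is only an illustration under stated assumptions: the rounding of $12\log_2(f_k / f_{ref})$ to the nearest integer and the default `f_ref = 440.0` are choices made here, not values given in the text, and `chroma_frame` is a hypothetical name.

```python
import math


def chroma_frame(freqs, spectrum, f_ref=440.0):
    """Accumulate |S(f_k, t_i)|^2 into 12 classes for one time frame.

    freqs    -- bin frequencies f_k in Hz
    spectrum -- (possibly complex) values S(f_k, t_i) for the same bins
    f_ref    -- reference frequency (assumed value, for illustration only)
    """
    chroma = [0.0] * 12
    for f, s in zip(freqs, spectrum):
        if f <= 0:
            continue  # log2 is undefined for the DC bin
        # Rounding to the nearest semitone is an assumption of this sketch.
        l = round(12 * math.log2(f / f_ref)) % 12
        chroma[l] += abs(s) ** 2
    return chroma


if __name__ == "__main__":
    print(chroma_frame([220.0, 440.0, 660.0, 880.0], [1.0, 1.0, 3.0, 2.0]))
```

Bins one or more octaves apart (220, 440, 880 Hz) land in the same class, which is the purpose of the modulo 12.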
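For a tempo hypothesis $T_h \in [30, 200]$:

$$L = [t_i - \alpha T_h, \; t_i], \qquad R = [t_i, \; t_i + \alpha T_h], \qquad \alpha = 4$$

At $t_i$, $\mu_L$ is obtained from $C(l, L)$ and $\mu_R$ from $C(l, R)$. Between $\mu_L$ and $\mu_R$:

$$d(L, R) = 1 - \frac{\mu_L \cdot \mu_R}{\|\mu_L\| \cdot \|\mu_R\|}$$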
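A small sketch of the distance $d(L, R)$ between the two windows is given below. It assumes that $\mu_L$ and $\mu_R$ are the per-class means of the frames inside each window, and that the window length $\alpha T_h$ has already been converted to a number of frames (`width_frames`, a hypothetical parameter); neither detail is confirmed by the text.

```python
import math


def mean_vector(frames):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]


def cosine_distance(u, v):
    """d = 1 - (u . v) / (||u|| ||v||), as in the formula for d(L, R)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0.0:
        return 1.0  # convention chosen for this sketch only
    return 1.0 - dot / norm


def chroma_change(chroma, i, width_frames):
    """Distance between the windows ending at and starting at frame i."""
    left = chroma[max(0, i - width_frames): i + 1]   # L = [t_i - a*T_h, t_i]
    right = chroma[i: i + width_frames + 1]          # R = [t_i, t_i + a*T_h]
    return cosine_distance(mean_vector(left), mean_vector(right))


if __name__ == "__main__":
    frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
    print(chroma_change(frames, 1, 1))
```

The distance is 0 when both windows have the same mean profile and grows toward 1 as the profiles become orthogonal.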
$C_i(\omega, T_h)$, $C_i$, $\omega$, $T_h$

[Figure: two panels. First panel axes: time [s] vs. tempo assumption; second panel axes: frequency [bpm] vs. tempo assumption.]
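For $T_h \in [30; 200]$, at $t_i$, with $L$ and $k_{max}$:

$$r(t_i) = \frac{\displaystyle\sum_{t = t_i - L/2}^{t_i + L/2} \; \sum_{k = k_{max}}^{N/2} |S(\omega_k, t)|^2}{\displaystyle\sum_{t = t_i - L/2}^{t_i + L/2} \; \sum_{k = 1}^{k_{max}} |S(\omega_k, t)|^2}$$

$N$, $L$, $T_h/2$, $k_{max}$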
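The ratio $r(t_i)$ compares the energy above and below bin $k_{max}$ over a window centred on $t_i$. The sketch below mirrors the summation limits of the formula, including the fact that bin $k_{max}$ appears in both sums. The data layout (a list of frames, each holding bins $k = 1, \dots, N/2$) and the names `spectral_balance` and `half_len` are assumptions made for illustration.

```python
def spectral_balance(spec, i, half_len, k_max):
    """Ratio of high-band to low-band energy around frame i.

    spec     -- list of frames; spec[t][k - 1] holds S(w_k, t), k = 1..N/2
    half_len -- half window length L/2, in frames (hypothetical parameter)
    k_max    -- 1-based index of the bin separating the two bands
    """
    lo = max(0, i - half_len)
    hi = min(len(spec), i + half_len + 1)
    high = low = 0.0
    for frame in spec[lo:hi]:
        energy = [abs(s) ** 2 for s in frame]
        low += sum(energy[:k_max])        # k = 1 .. k_max
        high += sum(energy[k_max - 1:])   # k = k_max .. N/2
    return high / low if low > 0 else float("inf")


if __name__ == "__main__":
    spec = [[1.0, 1.0, 2.0, 2.0]] * 3
    print(spectral_balance(spec, 1, 1, 2))
```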
At $t_i$, for $r$ and $T_h$:

$$r'_j(t_i) = r(t_i - (j - 1) T_h), \qquad j \in \{1; 2; 3; 4\}$$

The $r'_j(t_i)$ are such that $\sum_j r'_j(t_i) = 1$, and

$$r''_j(t_i) = 1 - r'_j(t_i)$$

$$r_{tot}(t_i) = r''_{j=1;3}(t_i) - r''_{j=2;4}(t_i)$$
For $T_h$ and $t_i$: $r_{tot}(t_i)$, $r_{tot}(t_i + T_h)$, $B(t_i, T_h)$; from $r_{tot}$ and $T_h$: $B(\omega, T_h)$, $B_i(\omega, T_h)$, $B_i$, $\omega$, $T_h$

[Figure: two panels. First panel axes: time [s] vs. tempo hypothesis ($T_h$); second panel axes: frequency [bpm] vs. tempo hypothesis ($T_h$).]
$r$, $T_h$, $t_i$

For times $t_i$ and $t_j$, with $v_{t_i}$ at $t_i$:

$$S(t_i, t_j) = d(v_{t_i}, v_{t_j})$$

From $S(t_i, t_j)$, the lag matrix $L(t_i, l_j)$ with

$$l_j = t_j - t_i$$

[Figure: self-similarity matrix (axes: time [s], time [s]).]

[Figure: lag matrix (axes: time [s], lag [s]).]

[Figure: repetition frequency [BPM], from 0 to 200.]
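The change of variable $l_j = t_j - t_i$ that turns the self-similarity matrix into a lag matrix can be sketched as below. Two assumptions are made here for illustration: only non-negative lags are kept, and entries for which $t_i + l_j$ falls outside the signal are filled with zero. The name `lag_matrix` is hypothetical.

```python
def lag_matrix(S):
    """Re-index a square matrix S[i][j] as L[i][lag] with lag = j - i >= 0."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for lag in range(n - i):
            # Same value as S(t_i, t_j), stored at column l_j = t_j - t_i.
            L[i][lag] = S[i][i + lag]
    return L


if __name__ == "__main__":
    S = [[0.0, 1.0, 2.0],
         [1.0, 0.0, 3.0],
         [2.0, 3.0, 0.0]]
    for row in lag_matrix(S):
        print(row)
```

In this representation, a repetition at a constant lag appears as a vertical line rather than a diagonal one.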
$$E(i) = \sum_f |X(f, t_i)| - |X(f, t_{i-1})|$$

with $X(f, t_i)$ for the signal $x$ at time $t_i$, and $E(i)$ the resulting energy function.

[Figure: energy function (axes: time [s], amplitude).]

[Figure: spectrum of the energy function (axes: frequency [BPM], amplitude).]
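The energy function $E(i)$ can be illustrated with a few lines of code. The sketch assumes that the sum over $f$ covers the whole difference $|X(f, t_i)| - |X(f, t_{i-1})|$ and that no rectification is applied, since none is visible in the formula; `energy_flux` and the input layout are hypothetical.

```python
def energy_flux(mags):
    """E(i) for i >= 1, where mags[i][f] holds |X(f, t_i)|."""
    return [
        sum(cur_f - prev_f for prev_f, cur_f in zip(prev, cur))
        for prev, cur in zip(mags, mags[1:])
    ]


if __name__ == "__main__":
    mags = [[1.0, 1.0], [2.0, 3.0], [1.0, 1.0]]
    print(energy_flux(mags))
```

The output has one value fewer than the number of frames because $E(i)$ needs the previous frame $t_{i-1}$.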
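For a vector $z$ of dimension $D$ and $K$ components:

$$p(z) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(z \mid \mu_k, \Sigma_k)$$

where each $\mathcal{N}(z \mid \mu_k, \Sigma_k)$ is given by

$$\mathcal{N}(z \mid \mu_k, \Sigma_k) = \frac{1}{(2\pi)^{D/2} \, |\Sigma_k|^{1/2}} \exp\left\{ -\frac{1}{2} (z - \mu_k)^T \Sigma_k^{-1} (z - \mu_k) \right\}$$

with $\mu_k$ of dimension $D$, $\Sigma_k$ of size $D \times D$, and $|\Sigma_k|$ the determinant of $\Sigma_k$. The $\pi_k$ are the weights, $\pi_k$ being associated with the $k$-th component for $z$:

$$\sum_{k=1}^{K} \pi_k = 1, \qquad \pi_k \geq 0$$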
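To make the mixture density concrete, here is a minimal sketch that evaluates $p(z)$ in the scalar case $D = 1$, where $\Sigma_k$ reduces to a variance. Restricting to one dimension is a simplification chosen for this example; the function names are hypothetical.

```python
import math


def gaussian_pdf(z, mu, var):
    """Univariate normal density N(z | mu, var)."""
    return math.exp(-0.5 * (z - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)


def gmm_pdf(z, weights, means, variances):
    """p(z) = sum_k pi_k N(z | mu_k, var_k) for scalar z (D = 1)."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(
        w * gaussian_pdf(z, m, v)
        for w, m, v in zip(weights, means, variances)
    )


if __name__ == "__main__":
    print(gmm_pdf(0.0, [0.5, 0.5], [-2.0, 2.0], [1.0, 1.0]))
```

The weight check enforces the two constraints on $\pi_k$ stated above.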
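$$x = [x_1 \; x_2 \; \dots \; x_N], \qquad Y = [y_1 \; y_2 \; \dots \; y_N]$$

The vector $z$ is built from $x_i$ and $y_i$:

$$z = \begin{pmatrix} y_i \\ x_i \end{pmatrix}$$

The parameters $\pi_k$, $\mu_k$, $\Sigma_k$ of $p(z)$, that is of $p(x_i, y_i)$, are estimated with the EM algorithm, for $x$ and $y$.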
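$$F(y) = E[x \mid y] = \sum_{k=1}^{K} h_k(y) \left[ \mu_{x,k} + \Sigma_{xy,k} \, (\Sigma_{yy,k})^{-1} (y - \mu_{y,k}) \right]$$

$$h_k(y) = \frac{\pi_k \, \mathcal{N}(y \mid \mu_{y,k}, \Sigma_{yy,k})}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}(y \mid \mu_{y,j}, \Sigma_{yy,j})}$$

where, for $y$ and the $k$-th component:

$$\Sigma_k = \begin{pmatrix} \Sigma_{yy,k} & \Sigma_{yx,k} \\ \Sigma_{xy,k} & \Sigma_{xx,k} \end{pmatrix}, \qquad \mu_k = \begin{pmatrix} \mu_{y,k} \\ \mu_{x,k} \end{pmatrix}$$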
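The regression $F(y) = E[x \mid y]$ can be sketched for scalar $x$ and $y$, in which case $\Sigma_{xy,k}(\Sigma_{yy,k})^{-1}$ is a simple quotient. The dictionary keys (`pi`, `mu_y`, `mu_x`, `s_yy`, `s_xy`) and the function names are hypothetical, and the parameter values in the usage example are made up for illustration; they are not results from this work.

```python
import math


def gaussian_pdf(y, mu, var):
    """Univariate normal density N(y | mu, var)."""
    return math.exp(-0.5 * (y - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)


def gmr_predict(y, components):
    """F(y) = sum_k h_k(y) [mu_x,k + s_xy,k / s_yy,k * (y - mu_y,k)].

    components -- list of dicts with keys pi, mu_y, mu_x, s_yy, s_xy
                  (scalar case; key names are assumptions of this sketch).
    """
    # Unnormalised responsibilities pi_k * N(y | mu_y,k, s_yy,k).
    resp = [c["pi"] * gaussian_pdf(y, c["mu_y"], c["s_yy"]) for c in components]
    total = sum(resp)
    return sum(
        (r / total) * (c["mu_x"] + c["s_xy"] / c["s_yy"] * (y - c["mu_y"]))
        for r, c in zip(resp, components)
    )


if __name__ == "__main__":
    comps = [
        {"pi": 0.5, "mu_y": -1.0, "mu_x": 60.0, "s_yy": 1.0, "s_xy": 0.0},
        {"pi": 0.5, "mu_y": 1.0, "mu_x": 120.0, "s_yy": 1.0, "s_xy": 0.0},
    ]
    print(gmr_predict(0.0, comps))
```

Each component contributes its own linear regression of $x$ on $y$, weighted by how likely that component is to have generated the observed $y$.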
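$x$, $y$, $x$

$$m_i = \begin{pmatrix} T_{a,i} \\ T_{e,i} \end{pmatrix}, \qquad z = [y \; x]^T$$

$x$, $y$; $y_i$: $C_i$, $B_i$, $s_i$, $e_i$; $C_i$, $B_i$, $s_i$, $e_i$, $T_{e,i}$; $C_i$, $B_i$, $s_i$, $e_i$

$$k = \left\{ \tfrac{1}{4}; \tfrac{1}{3}; \tfrac{1}{2}; \tfrac{2}{3}; \tfrac{3}{4}; 1; 1.25; 1.33; \dots; 2 \right\}$$

$C_i$, $B_i$, $T_h$, $k$; $D = 12$; $D = 48$

$T$, $T_e$, $T_k$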
$T_e$, $T_e = 106.3$
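$c'_i$, $b'_i$, $s'_i$, $e'_i$

$$y_i = \begin{pmatrix} c'_i \\ b'_i \\ s'_i \\ e'_i \end{pmatrix}, \qquad z_i = \begin{pmatrix} y_i \\ x_i \end{pmatrix} = \begin{pmatrix} c'_i \\ b'_i \\ s'_i \\ e'_i \\ T_{a,i} \end{pmatrix}$$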
T e
T T e
= 1
T e
$T_e = T_a$, $T_e = 2T_a$
EM; EM; with $\text{loglikelihood}_N$ at iteration $N$:

$$\frac{\text{loglikelihood}_N}{\text{loglikelihood}_{N-1}} - 1 < 10^{-10}$$

$T_a$, $T_e$; $T_a$, $T_a$, $T_e$
$T_a$, $T_e$, Obs, $T_a$, $K$; 6%; $T_a$; $K$; $K = 20$

K = 4:   σ = 1.45, −4.1
K = 8:   σ = 2.96, +0.9
K = 12:  σ = 3.50, +0.4
K = 16:  σ = 2.31, −1.13
K = 20:  σ = 1.91, −0.36
K = 24:  σ = 2.33, −2.05

$K$; $K = 20$
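$T_a$; 1%

$\alpha$: $\alpha T_e = T_a$; $T'_e = \hat{\alpha} T_e$, with $\hat{\alpha}$ for $\alpha$:

$$\alpha = \frac{T_a}{T_e}$$

For $T_e = 2T_a$ and $T_e = \frac{T_a}{2}$: $\alpha = \frac{1}{2}$ and $\alpha = 2$; otherwise $\alpha = 1$, so that $\alpha \in \left\{ \frac{1}{2}; 1; 2 \right\}$.

$\alpha$, $\alpha \in \{1/2; 1; 2\}$, $\alpha \in \{1/2; 1; 2\}$; $\pm 0.2$; $\alpha = 0.65$; $\alpha = 0.5$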
$T_a$, $T_a \in [30, 200]$
K
K = 2 K = 4 K = 8
% σ % σ % σ
1.79 2.36 2.22
3.27 3.39 2.38
1.79 3.69 1.94
2.24 3.78 3.26
2.16 2.29 3.00
2.34 2.03 2.65
2.33 3.53 2.95
3.11 1.72 3.16
2.52 2.51 2.63
2.31 1.99 2.82
1.43 2.01 2.01
2.59 2.86 2.84
$\alpha$: $\alpha = 1$, $\alpha = 1/2$, $\alpha = 2$

$\alpha = 1$; $T_e = 2T_a$: $\alpha = 1/2$; $T_e = T_a/2$: $\alpha = 2$
$$p(y) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(y \mid \mu_k, \Sigma_k)$$

with $y$, $K$, $\mu_k$, $\Sigma_k$.

K = 2         K = 4
%             %
σ = 3.22 σ = 3.08
σ = 4.03 σ = 2.71
σ = 2.53 σ = 2.29
σ = 2.97 σ = 3.03
σ = 3.70 σ = 3.90
σ = 3.08 σ = 2.23
σ = 2.83 σ = 2.46
σ = 3.04 σ = 2.39
σ = 2.72 σ = 1.63
σ = 3.56 σ = 2.30
σ = 2.28 σ = 2.90
σ = 3.82 σ = 2.76
$T_a$
$T_e = 2T_a$, $T_e = T_a/2$
$\alpha$, $\alpha T_e = T_a$
10%
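$$\mathcal{N}(z \mid \mu_k, \Sigma_k) = \frac{1}{(2\pi)^{D/2} \, |\Sigma_k|^{1/2}} \exp\left\{ -\frac{1}{2} (z - \mu_k)^T \Sigma_k^{-1} (z - \mu_k) \right\}$$

with $\mu_k$ of dimension $D$, $\Sigma_k$ of size $D \times D$, and $|\Sigma_k|$ the determinant of $\Sigma_k$.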
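Let $x$ be a vector of dimension $D$ with distribution $\mathcal{N}(x \mid \mu, \Sigma)$. Partition $x$ into $x_a$ and $x_b$, where $x_a$ holds the first $M$ components of $x$ and $x_b$ the remaining $D - M$:

$$x = \begin{pmatrix} x_a \\ x_b \end{pmatrix}$$

The mean $\mu$ and the covariance $\Sigma$ are partitioned in the same way:

$$\mu = \begin{pmatrix} \mu_a \\ \mu_b \end{pmatrix}, \qquad \Sigma = \begin{pmatrix} \Sigma_{aa} & \Sigma_{ab} \\ \Sigma_{ba} & \Sigma_{bb} \end{pmatrix}$$

Since $\Sigma^T = \Sigma$, the blocks $\Sigma_{aa}$ and $\Sigma_{bb}$ are symmetric and $\Sigma_{ba} = \Sigma_{ab}^T$. Define the precision matrix $\Lambda$:

$$\Lambda \equiv \Sigma^{-1}, \qquad \Lambda = \begin{pmatrix} \Lambda_{aa} & \Lambda_{ab} \\ \Lambda_{ba} & \Lambda_{bb} \end{pmatrix}$$

Although $\Lambda$ is the inverse of $\Sigma$, in general $\Lambda_{aa}$ is not the inverse of $\Sigma_{aa}$. The quadratic form expands as

$$-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu) = -\frac{1}{2}(x_a - \mu_a)^T \Lambda_{aa} (x_a - \mu_a) - \frac{1}{2}(x_a - \mu_a)^T \Lambda_{ab} (x_b - \mu_b) - \frac{1}{2}(x_b - \mu_b)^T \Lambda_{ba} (x_a - \mu_a) - \frac{1}{2}(x_b - \mu_b)^T \Lambda_{bb} (x_b - \mu_b)$$

For a Gaussian $\mathcal{N}(x \mid \mu, \Sigma)$: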
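$$-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu) = -\frac{1}{2} x^T \Sigma^{-1} x + x^T \Sigma^{-1} \mu + \text{const}$$

where const collects the terms independent of $x$ (the $\mu^T \Sigma^{-1} \mu$ term). The matrix of the second-order term in $x$ is $\Sigma^{-1}$, and the coefficient of the linear term in $x$ is $\Sigma^{-1} \mu$.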
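Consider $p(x_a \mid x_b)$, with mean $\mu_{a|b}$ and covariance $\Sigma_{a|b}$, as a function of $x_a$ for fixed $x_b$. The terms of second order in $x_a$ are

$$-\frac{1}{2} x_a^T \Lambda_{aa} x_a$$

so the covariance of $p(x_a \mid x_b)$ is

$$\Sigma_{a|b} = \Lambda_{aa}^{-1}$$

The terms linear in $x_a$ are

$$x_a^T \left\{ \Lambda_{aa} \mu_a - \Lambda_{ab} (x_b - \mu_b) \right\}$$

where $\Lambda_{ba}^T = \Lambda_{ab}$ has been used. The coefficient of $x_a$ must equal $\Sigma_{a|b}^{-1} \mu_{a|b}$, hence

$$\mu_{a|b} = \Sigma_{a|b} \left\{ \Lambda_{aa} \mu_a - \Lambda_{ab} (x_b - \mu_b) \right\} = \mu_a - \Lambda_{aa}^{-1} \Lambda_{ab} (x_b - \mu_b)$$

The quantities $\mu_{a|b}$ and $\Sigma_{a|b}$ are so far expressed in terms of $\Lambda$.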
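$$\begin{pmatrix} A & B \\ C & D \end{pmatrix}^{-1} = \begin{pmatrix} M & -M B D^{-1} \\ -D^{-1} C M & D^{-1} + D^{-1} C M B D^{-1} \end{pmatrix}$$

$$M = (A - B D^{-1} C)^{-1}.$$

$M^{-1}$ is the Schur complement of the matrix with respect to the block $D$. Applying this to

$$\begin{pmatrix} \Sigma_{aa} & \Sigma_{ab} \\ \Sigma_{ba} & \Sigma_{bb} \end{pmatrix}^{-1} = \begin{pmatrix} \Lambda_{aa} & \Lambda_{ab} \\ \Lambda_{ba} & \Lambda_{bb} \end{pmatrix}$$

gives

$$\Lambda_{aa} = (\Sigma_{aa} - \Sigma_{ab} \Sigma_{bb}^{-1} \Sigma_{ba})^{-1}$$

$$\Lambda_{ab} = -(\Sigma_{aa} - \Sigma_{ab} \Sigma_{bb}^{-1} \Sigma_{ba})^{-1} \Sigma_{ab} \Sigma_{bb}^{-1}$$

For $p(x_a \mid x_b)$ this yields

$$\mu_{a|b} = \mu_a + \Sigma_{ab} \Sigma_{bb}^{-1} (x_b - \mu_b)$$

$$\Sigma_{a|b} = \Sigma_{aa} - \Sigma_{ab} \Sigma_{bb}^{-1} \Sigma_{ba}$$

With this expression of $\mu_{a|b}$:

$$F(y) = E[x \mid y] = \sum_{k=1}^{K} h_k(y) \left[ \mu_{x,k} + \Sigma_{xy,k} \, (\Sigma_{yy,k})^{-1} (y - \mu_{y,k}) \right]$$

$$h_k(y) = \frac{\pi_k \, \mathcal{N}(y \mid \mu_{y,k}, \Sigma_{yy,k})}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}(y \mid \mu_{y,j}, \Sigma_{yy,j})}$$

$F(y) = E[x \mid y] = \mu_{x|y}$, with $h_k(y)$, $\mu_{x|y}$, $x_i$, $C_i$.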
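The two expressions of the conditional mean and covariance (one in terms of the precision blocks $\Lambda$, one in terms of the covariance blocks $\Sigma$) can be checked numerically in the simplest case where $x_a$ and $x_b$ are scalars, so that every block is a number. This is only a sanity check under that simplifying assumption; the function names and the numerical values are chosen for illustration.

```python
def conditional_from_covariance(mu_a, mu_b, s_aa, s_ab, s_bb, x_b):
    """mu_a|b and Sigma_a|b written with the covariance blocks (scalar case)."""
    mean = mu_a + s_ab / s_bb * (x_b - mu_b)
    var = s_aa - s_ab * s_ab / s_bb
    return mean, var


def precision_blocks(s_aa, s_ab, s_bb):
    """Blocks of Lambda = Sigma^-1 for a symmetric 2 x 2 covariance."""
    det = s_aa * s_bb - s_ab * s_ab
    return s_bb / det, -s_ab / det, s_aa / det  # l_aa, l_ab, l_bb


def conditional_from_precision(mu_a, mu_b, l_aa, l_ab, x_b):
    """mu_a|b and Sigma_a|b written with the precision blocks (scalar case)."""
    mean = mu_a - l_ab / l_aa * (x_b - mu_b)
    var = 1.0 / l_aa
    return mean, var


if __name__ == "__main__":
    l_aa, l_ab, _ = precision_blocks(2.0, 1.0, 4.0)
    print(conditional_from_covariance(1.0, 0.0, 2.0, 1.0, 4.0, 2.0))
    print(conditional_from_precision(1.0, 0.0, l_aa, l_ab, 2.0))
```

Both calls should print the same mean and variance up to rounding, which is what the block-inverse identity guarantees.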