Quantum tomography based on quantum trajectories Quantum Control Theory:
Mathematical Aspects and Physical Applications TUM-IAS, Garching, April 3-5, 2017
Pierre Rouchon
Centre Automatique et Systèmes, Mines ParisTech, PSL Research University Quantic Research Team, Inria
1 / 27
Outline
Quantum tomography versus quantum filtering Likelihood function calculations via adjoint states
Discrete time case Continuous time case
MaxLike estimations with experimental data
Process tomography for QND measurement of photons State tomography of a quantum Maxwell demon
Fisher information and low-rank MaxLike estimates
Appendix: asymptotics for multi-dimension Laplace integrals and boundary corrections
2 / 27
Quantum state tomography based on POVM, P
j
π
j= I
I Tomography ofρviaNindependent measurementsYassociated to POVM: probability Tr(ρπj)of each measurement outcomejgiven byπj; forNj the number ofjoutcomes,Y ≡(Nj)withP
jNj=N, the number of measurements.
I Several estimation methods:
MaxEnt: ρMEmaximizes−Tr(ρlog(ρ))under the constraints
|Tr(ρπj)−Nj/N| ≤(Bužek et al, Ann. Phys. 1996).
Compress Sensing: ρCSminimizes Tr(ρ)under the constraints
|Tr(ρπj)−Nj/N| ≤(Gross et al PRL2010) MaxLike: ρMLmaximizes the likelihood function,
ρ7→P(Y|ρ) =Q
j Tr(ρπj)Nj
(see, e.g., Lvovsky/Raymer RMP 2009)
Bayesian Mean: ρBM ∝R
ρP(Y|ρ)P0(ρ)dρwhereP0is some prior distributionP0(ρ)dρ(see, e.g., Blume-Kohout NJP2010).
Low rank, high dimensional systems: see, e.g, PhD thesis "Efficient and Robust Methods for Quantum Tomography" of Charles Heber Baldwin, University of New Mexico, December 2016.
3 / 27
Quantum filtering / tomography with quantum trajectoriesY = yt(n) Filtering: estimation of the quantum stateρt at timet>0 from the
measurement trajectory[0,t[3τ7→yτ and the initial stateρ0; seeBelavkinsemilar contributions (links with Monte-Carlo quantum-trajectories).
State tomography: estimation of the initial stateρ0=
ρ
from a collection ofN measurement trajectories:Y =yt(n)
withn∈ {1, . . . ,N}
andt∈[0,T].
Process tomography: estimation of a parameterpfrom a known initial state
ρ
and a collection ofNmeasurement trajectoriesY.This talk: MaxLike estimation with decoherence and measurement imperfections(PhD thesis of Pierre Six, November 2016):
1. How to compute the likelihood functionP Y/
ρ
,pand its gradient from the stochastic master equation governing filtering (P. Six et al. PRA 2016).
2. For state estimation: variance computation based on asymptotic expansions of Laplace integrals for low rank MaxLike estimates (P. Six /PR, chapter in Lecture Notes in Control and Information Sciences no 473, April 2017).
4 / 27
Regular MaxLike estimation of a parameter p
Log-likelihood functionf(p) =log P(Y |p)admits a unique maximum atpML(∇f(pML) =0) with a negative definite Hessian (∇2f(pML)<0).
f coming fromNindependent realisations: f(p)≡N¯f(p)with asymptotics forN7→+∞of the Laplace integrals connecting
I Bayesian MeanpBMand MaxLike estimationpML:
pBM=
Rp eN¯f(p)P0(p)dp
R eN¯f(p)P0(p)dp =pML+O(1/N).
with any smooth prior distributionP0(p)dp
I Bayesian variance and Fisher informationFML=−∇2f(pML):
R kp−pMLk2eN¯f(p)P0(p)dp ReN¯f(p)P0(p)dp = Tr
FML−1
/(2N) +O(1/N2).
Confidence intervals based on−∇2f(pML).
5 / 27
Outline
Quantum tomography versus quantum filtering Likelihood function calculations via adjoint states
Discrete time case Continuous time case
MaxLike estimations with experimental data
Process tomography for QND measurement of photons State tomography of a quantum Maxwell demon
Fisher information and low-rank MaxLike estimates
Appendix: asymptotics for multi-dimension Laplace integrals and boundary corrections
6 / 27
Discrete-time models of open quantum systems
Four features1:1. Bayes law: P(µ0/µ) =P(µ/µ0)P(µ0)/ P
ν0P(µ/ν0)P(ν0) , 2. Schrödinger equationsdefining unitary transformations.
3. Randomness, irreversibility and dissipation induced by the measurementof observables withdegenerate spectra.
4. Entanglement and tensor product for composite systems.
VDiscrete-time models
Take a set of operatorsMµsatisfyingP
µM†µMµ=I and a left stochastic matrices(ηyt,µ). Consider the followingMarkov processof stateρ(density op.) and measured outputy:
ρt+1 = Kyt(ρt)
Tr(Kyt(ρt)), with proba.Pyt(ρt) = Tr(Kyt(ρt)) withKy(ρ) =Pm
µ=1ηy,µMµρM†µ. It is associated to theKraus map (ensemble average, quantum channel)
E
(ρt+1|ρt) =K(ρt) =Xy
Ky(ρt) =X
µ
MµρtM†µ.
1See the book of S. Haroche and J.M. Raimond.
7 / 27
Computation of the likelihood function via the adjoint state (1)
I Denote byPn(
ρ
,p)the probability of getting measurement trajectoryn,(yt(n))t=0,...,T, knowing the initial stateρ(n)0 =ρ
and parameterp.I Sinceρ(n)t+1=
Kp
y(n) t
ρ(n)t
Tr Kp
y(n) t
ρ(n)t
!with Tr
Kp
yt(n)
ρ(n)t
the
probability of having detectedyt(n) knowingρ(n)t andp, a direct use of Bayes law yields
Pn(
ρ
,p) =T
Y
t=0
Tr
Kp
yt(n)
ρ(n)t
= Tr
Kp
yT(n)◦. . .◦Kp
y0(n)(
ρ
).
8 / 27
Computation of the likelihood function via the adjoint state (2)
I Withadjoint mapKp∗y (∀A,B, Tr Kpy(A)B
≡ Tr AKp∗y (B) ):
Pn(
ρ
,p) = TrKp
yT(n)◦. . .◦Kp
y0(n)(
ρ
) I= Tr
ρ
Kp∗y0(n)◦. . .◦Kp∗
yT(n)(I)
.
I Normalizedadjoint quantum filter2Et(n)=
Kp∗
y(n) t
Et+1(n)
Tr Kp∗
y(n) t
Et+1(n)
!with
ET(n)+1=I/Tr(I), we get
Pn(
ρ
,p) =0
Y
t=T
Tr
Kp∗
yt(n)
Et+1(n) Tr
ρ
E0(n),gn(Y,p)Tr
ρ
E0(n) .I A simple expression of the gradients:
∇
ρ
logPn= E0(n) Trρ
E0(n), ∇plogPn·δp=T
X
t=0
Tr
Et+1(n) ∇pKp
yt(n)
ρ(n)t
·δp
Tr
Et+1(n) Kp
yt(n)
ρ(n)t ,
2M. Tsang. Time-symmetric quantum theory of smoothing. PRL 2009.
S. Gammelmark, B. Julsgaard, and K. Mølmer. Past quantum states of a monitored system. PRL 2013.
9 / 27
MaxLike tomography based on N trajectories data Y =
y
t(n) FromPn(ρ
,p) =gn(Y,p)Trρ
E0(n)we have
P(
ρ
,p),N
Y
n=1
Pn(
ρ
,p) =N
Y
n=1
gn(Y,p)
! N Y
n=1
Tr
ρ
E0(n)! .
I MaxLikestate tomography:pis known and
ρ
MLmaximizesρ
7→N
X
n=1
log Tr
ρ
E0(n)a concave function on the convex set of density operatorsρ:
a well structured convex optimization problem.
I MaxLikeprocess tomography:
ρ
is known andpMLmaximizes p7→f(p) =logP(ρ
,p)those gradient is given by∇pf(p)·δp=PN n=1
PT t=0
Tr Et+1(n) ∇pKp
y(n) t
ρ(n)t
·δp
!
Tr Et+1(n) Kp
y(n) t
ρ(n)t
! ,
The Hessian∇2pf can be computed similarly (Fisher information).
10 / 27
Continuous/discrete-time Stochastic Master Equation (SME)
Discrete-time models:Markov chainsρt+1= Kyt(ρt)Tr(Kyt(ρt)), with Kyt(ρt) =Pm
µ=1ηyt,µMµρtM†µ, and proba.Pyt(ρt) = Tr(Kyt(ρt)).
Ensemble averages correspond toKraus linear maps
E
(ρt+1|ρt) =K(ρt) =Xy
Ky(ρt) =X
µ
MµρtM†µ with X
µ
M†µMµ=I
Continuous-time models:stochastic differential systems(see, e.g., Barchielli/Gregoratti, 2009)
dρt = −i
~[H, ρt] +X
ν
LνρtL†ν−1
2(L†νLνρt+ρtL†νLν)
! dt
+X
ν
√ην
Lνρt+ρtL†ν− Tr
(Lν+L†ν)ρt ρt
dWν,t
driven byWiener processesdWν,t, with measurementsdyν,t, dyν,t =√
ην Tr
(Lν+L†ν)ρt
dt +dWν,t, detection efficiencies ην ∈[0,1]andLindblad-Kossakowskimaster equations (ην ≡0):
d
dtρ=−i
~[H, ρ] +X
ν
LνρL†ν−1
2(L†νLνρ+ρL†νLν)
11 / 27
Continuous/discrete-time diffusive SME
TheBelavkinquantum filterdρt = −i
~[H, ρt] +X
ν
LνρtL†ν−1
2(L†νLνρt+ρtL†νLν)
! dt
+X
ν
√ην
Lνρt+ρtL†ν− Tr
(Lν+L†ν)ρt ρt
dWν,t withdWν,t =dyν,t −√
ην Tr
(Lν+L†ν)ρt
dtgiven by the measurement signaldyν,t, is always a stable filtering process.3 UsingIto rules, it can be written as a "discrete-time" Markov model¯ 4
ρt+dt=Kdyt(ρt)/Tr(Kdyt(ρt)) with partial Kraus mapsKdyt(ρt) =MdytρtM†dy
t +P
ν(1−ην)LνρtL†νdt Mdyt =I+
−i
~H−12 P
νL†νLν
dt+P
ν
√ηνdyν,tL
where the probability of outcomedyt = (dyν,t)reads:
P
dyt ∈Q
ν[ξν, ξν+dξν]. ρt
= Tr(Kξ(ρt)) Q
νe−ξ2ν/2dt√dξν
2πdt 3H. Amini et al., Russian J. of Math. Physics, 2014, 21, 297-315.
4PR, J. Ralph PRA2015; see also PhD thesis of Ph. Campagne-Ibracq (2015) and of P. Six (2016).
12 / 27
Outline
Quantum tomography versus quantum filtering Likelihood function calculations via adjoint states
Discrete time case Continuous time case
MaxLike estimations with experimental data
Process tomography for QND measurement of photons State tomography of a quantum Maxwell demon
Fisher information and low-rank MaxLike estimates
Appendix: asymptotics for multi-dimension Laplace integrals and boundary corrections
13 / 27
QND measurement of photons
C
B
D R1
R2
I The probabilityP(y |φR,n)to gety ∈ {g,e}knowing the Ramsey angleφRand the number of photon(s)n∈ {0,1,2, . . .}:
P(y|φR,n) =1+y A+Bc(n)cosφR+Bs(n)sinφR
withe/g =±1.
depends on the parametersp= (Bc(n),Bs(n))n∈{0,1,...,}.
I The Kraus mapsKpy based on known cavity decay and thermal photons.
14 / 27
A priori calibration
5(black dots) versus MaxLike (blue dots)
0 5 10 15
0 0.2 0.4 0.6 0.8 1
Photon number, n P(e 0 , n)
MaxLike estimation of 32 parameters p based on N = 8000 trajectories of T = 6000 outcome measurements.
5T. Rybarczyk, B. Peaudecerf, M. Penasa, S. Gerlich, B. Julsgaard, K.
Mølmer, S. Gleyzes, M. Brune, J. M. Raimond, S. Haroche, and I. Dotsenko.
Forward-backward analysis of the photon-number evolution in a cavity. PRA 2015.
15 / 27
A quantum Maxwell demon experiment arXiv:1702.01917v1
(a) After preparation in a thermal or quantum state the system S (superconducting qubit) state is recorded into the demon’s quantum memory D (microwave cavity) via a pulse that populates the cavity mode only if the qubit is in the ground state. This information is used to extract work which charges a battery with one extra photon:
system S emits this photon only when the demon’s cavity is empty. The memory reset is performed by cavity relaxation.
(b) When the system starts in a quantum superposition of the demon and system are
entangled after the record step. 16 / 27
Tomography of the demon after the work extraction step
rank 2
rank 6 rank 4
rank 4
Computations are based on a truncation to 20 photons How to define the confidence intervals for low rankρML?
17 / 27
Outline
Quantum tomography versus quantum filtering Likelihood function calculations via adjoint states
Discrete time case Continuous time case
MaxLike estimations with experimental data
Process tomography for QND measurement of photons State tomography of a quantum Maxwell demon
Fisher information and low-rank MaxLike estimates
Appendix: asymptotics for multi-dimension Laplace integrals and boundary corrections
18 / 27
Asymptotic when ρ
MLis on the boundary
I To bypass boundary problem, consider Bayesian estimate instead of MaxLike ones
ρBM =
RρP(Y|ρ)P0(ρ)dρ RP(Y |ρ)P0(ρ)dρ with some prior distributionP0(ρ)dρ.
I When the likelihood exp(f(ρ))≡P(Y|ρ)is concentrated (f =N¯fwith N1) around its maximumρMLthat lies on the boundary (ρMLnot full rank), how to compute the first terms of an asymptotic expansion versus Nof
Z
Tr(ρA)rexp(N¯f(ρ))P0(ρ)dρ
for any operatorAand exponentrand for some prior distribution P0(ρ)dρ(e.g., Gausssian unitary ensemble).
I Since all functions are analytic such an asymptotic expansion versusN always exists: Integration by parts, Watson’s lemma, Laplace’s method, stationary phase, steepest descents, Hironaka’s resolution of
singularities6, "singular learning"7
6An important reference: V.I. Arnold, S.M. Gusein-Zade, and A.N. Varchenko.
Singularities of Differentiable Maps, Vol. II. Birkhäuser, Boston, 1985
7S. Watanabe: Algebraic Geometry and Statistical Learning Theory, Cambridge University Press, 2009.
19 / 27
Geometric optimality condition for the log-likelihood function
Assume thatρMLis an argument of the maximum off :D 3ρ7→ X
µ∈M
log Tr
ρE(µ)
∈[−∞,0]
overD(the set of density operators,E(µ) ∈ D.). Then necessarily, ρMLsatisfies the following conditions:
I Tr ρMLE(µ)
>0 for eachµ∈ M;
I h
ρML, ∇f|ρ
ML
i=0, where ∇f|ρ
ML =P
µ∈M E(µ)
Tr(ρMLE(µ)) is the gradient off atρMLfor the Frobenius scalar product;
I there existsλML>0 such thatλMLPML=PML∇f|ρ
ML and
∇f|ρ
ML ≤λMLI, wherePMLis the orthogonal projector on the range ofρMLandI is the identity operator.
These conditions are also sufficient and characterize the unique maximum when, additionally, the vector space spanned by theE(µ)’s coincides with the set of Hermitian matrices.
20 / 27
Geometric asymptotic expansions of Bayesian mean and variance
For any Hermitian operatorA, its Bayesian mean and variance read:
IA(N) = R
D Tr(ρA)eNf(ρ)P0(ρ)dρ R
DeNf(ρ)P0(ρ)dρ , VA(N) = R
D
Tr(ρA)−IA(N)2
eNf(ρ)P0(ρ)dρ R
DeNf(ρ)P0(ρ)dρ . Denote byρMLthe unique maximum offonDand byPMLthe orthogonal
projector on its range. In addition to the necessary and sufficient geometric conditions above, assume that ker
λMLI− ∇f|ρ
ML
=ker(I−PML).
IA(N) = Tr(AρML)+O(1/N), VA(N) = Tr
Ak (FML)−1(Ak)
/N+O(1/N2) whereBkis an orthogonal projection
Bk=B− Tr(BPML)
Tr(PML) PML−(I−PML)B(I−PML);
and whereFMLis a linear super-operator, corresponds to the Hessian atρML
of some restriction off andgeneralizes the Fisher information matrix:
FML(X) =X
µ
Tr XEk(µ)
Tr2(ρMLE(µ))Ek(µ)+
λMLI− ∇f|ρ
ML
Xρ+ML+ρ+MLX
λMLI− ∇f|ρ
ML
withρ+MLthe Moore-Penrose pseudo-inverse ofρML.
21 / 27
Concluding remarks
I Low-rank approximationsand efficient numerical schemes for computations ofρML, the adjoint statesE(n), . . .
I Asymptoticswhen the log-likelihood function is not strongly concave, when ker
λMLI− ∇f|ρ
ML
6=ker(I−PML). . .
I Process tomography: log-likelihood function not concave . . .
I Parameter estimation along quantum trajectories (in real-time) . . .
I Thematic quarter at Institut Henri Poincaré in Paris next Spring 2018 gathering experimental physicists and theoreticians.
22 / 27
23 / 27
Asymptotics for multi-dimension Laplace integrals (1)
Theorem (interior max) Ig(N) =Rz∈(−1,1)ng(z)exp(Nf(z)) dz withf andganalytic functions ofz on a compact neighbourhood of D, the closure ofD.Assumethatf admits a unique maximum onDat z=0 with ∂z∂2f2
0negative definite.
Ifg(0)6=0, we have the following dominant term in the asymptotic expansion ofIg(N)for largeN:
Ig(N) =
g(0) (2π)n/2eNf(0)N−n/2 r
det
∂2f
∂z2
0
+O
eNf(0)N−n/2−1 .
Ifg(0) =0, with ∂g∂z 0
=0, then we have:
Ig(N) =
Tr
− ∂2g
∂z2
0
∂2f
∂z2
0
−1
(2π)n/2 2
r det
∂2f
∂z2
0
eNf(0)N−n/2−1
+O
eNf(0)N−n/2−2
. 24 / 27
Asymptotics for Bayesian integrals (1’)
Corollary (interior max):Assume thatf admits a unique maximum onDatz =0 with ∂z∂2f2
0negative definite. Then we have the following asymptotic for any analytic functiong(z):
Mg(N), R
z∈(−1,1)ng(z)exp(Nf(z))dz R
z∈(−1,1)nexp(Nf(z))dz =g(0) +O(N−1) We have also:
Vg(N), R
z∈(−1,1)n
g(z)− Mg(N)2
exp(Nf(z)) dz R
z∈(−1,1)nexp(Nf(z))dz
= Tr
− ∂∂z2g2 0
∂2f
∂z2
0
−1
2N +O N−2
.
25 / 27
Asymptotics for multi-dimension Laplace integrals (2)
Theorem (boundary max):
Ig(N) =R
x∈(0,1)
R
z∈(−1,1)nxmg(x,z)exp(Nf(x,z))dxdzwithfandg analytic functions of(x,z)on a compact neighbourhood ofD, the closure of D.Assumethatf admits a unique maximum onDat(x,z) = (0,0), with
∂2f
∂z2
(0,0)
negative definite and ∂x∂f
(0,0)<0. Ifg(0,0)6=0, we have the following dominant term in the asymptotic expansion ofIg(N)for largeN:
Ig(N) =
g(0,0)m! (2π)n/2eNf(0,0)N−m−n/2−1 s
det
∂2f
∂z2
(0,0)
−∂x∂f (0,0)
m+1
+O
eNf(0,0)N−m−n/2−2 .
Ifg(0,0) =0, with ∂g∂x
(0,0)=0 and ∂g∂z
(0,0)=0, then we have:
Ig(N) =
Tr − ∂∂z2g2 (0,0)
∂2f
∂z2
(0,0)
−1!
m! (2π)n/2
2 s
det
∂2f
∂z2
(0,0)
− ∂x∂f (0,0)
m+1
eNf(0,0))N−m−n/2−2
+O
eNf(0,0))N−m−n/2−3
. 26 / 27
Asymptotics for Bayesian integrals (2’)
Corollary (boundary max):Assume thatfadmits a unique maximum onD at(x,z) = (0,0), with ∂z∂2f2
(0,0)negative definite and ∂x∂f
(0,0)<0. Then, we have the following asymptotic for any analytic functiong(x,z):
Mg(N), R
x∈(0,1)
R
z∈(−1,1)nxmg(x,z)exp(Nf(x,z))dxdz R
x∈(0,1)
R
z∈(−1,1)nxmexp(Nf(x,z))dxdz =g(0,0)+O(N−1) We have also:
Vg(N), R
x∈(0,1)
R
z∈(−1,1)nxm
g(x,z)− Mg(N)2
exp(Nf(x,z))dxdz R
x∈(0,1)
R
z∈(−1,1)nxmexp(Nf(x,z))dxdz
=
Tr − ∂∂z2g2
(0,0)
∂2f
∂z2
(0,0)
−1!
2N +O N−2
.
27 / 27