Quantum tomography based on quantum trajectories Quantum Control Theory: Mathematical Aspects and Physical Applications TUM-IAS, Garching, April 3-5, 2017

(1)

Quantum tomography based on quantum trajectories Quantum Control Theory:

Mathematical Aspects and Physical Applications TUM-IAS, Garching, April 3-5, 2017

Pierre Rouchon

Centre Automatique et Systèmes, Mines ParisTech, PSL Research University Quantic Research Team, Inria

1 / 27

(2)

Outline

Quantum tomography versus quantum filtering Likelihood function calculations via adjoint states

Discrete time case Continuous time case

MaxLike estimations with experimental data

Process tomography for QND measurement of photons State tomography of a quantum Maxwell demon

Fisher information and low-rank MaxLike estimates

Appendix: asymptotics for multi-dimension Laplace integrals and boundary corrections

2 / 27

(3)

Quantum state tomography based on POVM, P

j

π

j

= I

I Tomography ofρviaNindependent measurementsYassociated to POVM: probability Tr(ρπj)of each measurement outcomejgiven byπj; forNj the number ofjoutcomes,Y ≡(Nj)withP

jNj=N, the number of measurements.

I Several estimation methods:

MaxEnt: ρMEmaximizes−Tr(ρlog(ρ))under the constraints

|Tr(ρπj)−Nj/N| ≤(Bužek et al, Ann. Phys. 1996).

Compress Sensing: ρCSminimizes Tr(ρ)under the constraints

|Tr(ρπj)−Nj/N| ≤(Gross et al PRL2010) MaxLike: ρMLmaximizes the likelihood function,

ρ7→P(Y|ρ) =Q

j Tr(ρπj)N_j

(see, e.g., Lvovsky/Raymer RMP 2009)

Bayesian Mean: ρBM ∝R

ρP(Y|ρ)P0(ρ)dρwhereP0is some prior distributionP⁰(ρ)dρ(see, e.g., Blume-Kohout NJP2010).

Low rank, high dimensional systems: see, e.g, PhD thesis "Efficient and Robust Methods for Quantum Tomography" of Charles Heber Baldwin, University of New Mexico, December 2016.

3 / 27

(4)

Quantum filtering / tomography with quantum trajectoriesY = y_t⁽ⁿ⁾ Filtering: estimation of the quantum stateρt at timet>0 from the

measurement trajectory[0,t[3τ7→yτ and the initial stateρ0; seeBelavkinsemilar contributions (links with Monte-Carlo quantum-trajectories).

State tomography: estimation of the initial stateρ0=

ρ

from a collection ofN measurement trajectories:Y =

y_t⁽ⁿ⁾

withn∈ {1, . . . ,N}

andt∈[0,T].

Process tomography: estimation of a parameterpfrom a known initial state

ρ

and a collection ofNmeasurement trajectoriesY.

This talk: MaxLike estimation with decoherence and measurement imperfections(PhD thesis of Pierre Six, November 2016):

1. How to compute the likelihood functionP Y/

ρ

,p

and its gradient from the stochastic master equation governing filtering (P. Six et al. PRA 2016).

2. For state estimation: variance computation based on asymptotic expansions of Laplace integrals for low rank MaxLike estimates (P. Six /PR, chapter in Lecture Notes in Control and Information Sciences no 473, April 2017).

4 / 27

(5)

Regular MaxLike estimation of a parameter p

Log-likelihood functionf(p) =log P(Y |p)

admits a unique maximum atp_ML(∇f(p_ML) =0) with a negative definite Hessian (∇²f(p_ML)<0).

f coming fromNindependent realisations: f(p)≡N¯f(p)with asymptotics forN7→+∞of the Laplace integrals connecting

I Bayesian Meanp_BMand MaxLike estimationp_ML:

pBM=

Rp e^N^¯^f(p)P0(p)dp

R e^N^¯^f(p)P0(p)dp =pML+O(1/N).

with any smooth prior distributionP0(p)dp

I Bayesian variance and Fisher informationFML=−∇²f(pML):

R kp−pMLk²e^N^¯^f(p)P0(p)dp Re^N^¯^f(p)P0(p)dp = Tr

F_ML−1

/(2N) +O(1/N²).

Confidence intervals based on−∇²f(pML).

5 / 27

(6)

Outline

Quantum tomography versus quantum filtering Likelihood function calculations via adjoint states

Discrete time case Continuous time case

MaxLike estimations with experimental data

Process tomography for QND measurement of photons State tomography of a quantum Maxwell demon

Fisher information and low-rank MaxLike estimates

6 / 27

(7)

Discrete-time models of open quantum systems

Four features¹:

1. Bayes law: P(µ⁰/µ) =P(µ/µ⁰)P(µ⁰)/ P

ν⁰P(µ/ν⁰)P(ν⁰) , 2. Schrödinger equationsdefining unitary transformations.

3. Randomness, irreversibility and dissipation induced by the measurementof observables withdegenerate spectra.

4. Entanglement and tensor product for composite systems.

VDiscrete-time models

Take a set of operatorsMµsatisfyingP

µM^†_µMµ=I and a left stochastic matrices(η_y_t,µ). Consider the followingMarkov processof stateρ(density op.) and measured outputy:

ρ_t+1 = ^K^yt^(ρ^t⁾

Tr(K_yt(ρt)), with proba.Pyt(ρ_t) = Tr(K_y_t(ρ_t)) withKy(ρ) =Pm

µ=1η_y,µMµρM^†_µ. It is associated to theKraus map (ensemble average, quantum channel)

E

^(ρ^t+1^|ρ^t^{) =}^K^(ρ^t^{) =}^X

y

Ky(ρ_t) =X

µ

Mµρ_tM^†_µ.

1See the book of S. Haroche and J.M. Raimond.

7 / 27

(8)

Computation of the likelihood function via the adjoint state (1)

I Denote byPn(

ρ

,p)the probability of getting measurement trajectoryn,(y_t⁽ⁿ⁾)_t=0,...,T, knowing the initial stateρ⁽ⁿ⁾₀ =

ρ

and parameterp.

I Sinceρ⁽ⁿ⁾_t+1=

K^p

y(n) t

ρ⁽ⁿ⁾_t

Tr K^p

y(n) t

ρ⁽ⁿ⁾_t

!with Tr

K^p

y_t⁽ⁿ⁾

ρ⁽ⁿ⁾_t

the

probability of having detectedy_t⁽ⁿ⁾ knowingρ⁽ⁿ⁾_t andp, a direct use of Bayes law yields

Pn(

ρ

,p) =

T

Y

t=0

Tr

K^p

y_t⁽ⁿ⁾

ρ⁽ⁿ⁾_t

= Tr

K^p

y_T⁽ⁿ⁾◦. . .◦K^p

y₀⁽ⁿ⁾(

ρ

)

.

8 / 27

(9)

Computation of the likelihood function via the adjoint state (2)

I Withadjoint mapK^p∗_y (∀A,B, Tr K^p_y(A)B

≡ Tr AK^p∗_y (B) ):

Pn(

ρ

,p) = Tr

K^p

y_T⁽ⁿ⁾◦. . .◦K^p

y₀⁽ⁿ⁾(

ρ

) I

= Tr

ρ

K^p∗

y₀⁽ⁿ⁾◦. . .◦K^p∗

y_T⁽ⁿ⁾(I)

.

I Normalizedadjoint quantum filter²E_t⁽ⁿ⁾=

K^p∗

y(n) t

E_t+1⁽ⁿ⁾

Tr K^p∗

y(n) t

E_t+1⁽ⁿ⁾

!with

E_T⁽ⁿ⁾₊₁=I/Tr(I), we get

Pn(

ρ

,p) =

0

Y

t=T

Tr

K^p∗

y_t⁽ⁿ⁾

E_t+1⁽ⁿ⁾ Tr

ρ

E₀⁽ⁿ⁾

,g_n(Y,p)Tr

ρ

E₀⁽ⁿ⁾ .

I A simple expression of the gradients:

∇

ρ

^logPn= E₀⁽ⁿ⁾ Tr

ρ

E₀⁽ⁿ⁾, ∇_plogPn·δp=

T

X

t=0

Tr

E_t+1⁽ⁿ⁾ ∇_pK^p

y_t⁽ⁿ⁾

ρ⁽ⁿ⁾_t

·δp

Tr

E_t+1⁽ⁿ⁾ K^p

y_t⁽ⁿ⁾

ρ⁽ⁿ⁾_t ,

2M. Tsang. Time-symmetric quantum theory of smoothing. PRL 2009.

S. Gammelmark, B. Julsgaard, and K. Mølmer. Past quantum states of a monitored system. PRL 2013.

9 / 27

(10)

MaxLike tomography based on N trajectories data Y =

y

_t⁽ⁿ⁾

FromPn(

ρ

,p) =g_n(Y,p)Tr

ρ

E₀⁽ⁿ⁾

we have

P(

ρ

,p),

N

Y

n=1

Pn(

ρ

,p) =

N

Y

n=1

g_n(Y,p)

! _N Y

n=1

Tr

ρ

E₀⁽ⁿ⁾

! .

I MaxLikestate tomography:pis known and

ρ

_MLmaximizes

ρ

7→

N

X

n=1

log Tr

ρ

E₀⁽ⁿ⁾

a concave function on the convex set of density operatorsρ:

a well structured convex optimization problem.

I MaxLikeprocess tomography:

ρ

is known andp_MLmaximizes p7→f(p) =logP(

ρ

,p)those gradient is given by

∇_pf(p)·δp=PN n=1

PT t=0

Tr E_t+1⁽ⁿ⁾ ∇pK^p

y(n) t

ρ⁽ⁿ⁾_t

·δp

!

Tr E_t+1⁽ⁿ⁾ K^p

y(n) t

ρ⁽ⁿ⁾_t

! ,

The Hessian∇²_pf can be computed similarly (Fisher information).

10 / 27

(11)

Continuous/discrete-time Stochastic Master Equation (SME)

Discrete-time models:Markov chainsρ_t+1= ^K^yt^(ρ^t⁾

Tr(K_yt(ρ_t)), with K_y_t(ρ_t) =Pm

µ=1η_y_t,µMµρ_tM^†_µ, and proba.Pyt(ρ_t) = Tr(K_y_t(ρ_t)).

Ensemble averages correspond toKraus linear maps

E

^(ρ^t+1^|ρ^t^{) =}^K^(ρ^t^{) =}^X

y

K_y(ρ_t) =X

µ

Mµρ_tM^†_µ with X

µ

M^†_µMµ=I

Continuous-time models:stochastic differential systems(see, e.g., Barchielli/Gregoratti, 2009)

dρ_t = −ⁱ

~[H, ρ_t] +X

ν

Lνρ_tL^†_ν−¹

2(L^†_νLνρ_t+ρ_tL^†_νLν)

! dt

+X

ν

√η_ν

Lνρ_t+ρ_tL^†_ν− Tr

(Lν+L^†_ν)ρ_t ρ_t

dW_ν,t

driven byWiener processesdWν,t, with measurementsdy_ν,t, dy_ν,t =√

ην Tr

(Lν+L^†_ν)ρ_t

dt +dW_ν,t, detection efficiencies η_ν ∈[0,1]andLindblad-Kossakowskimaster equations (ην ≡0):

d

dtρ=−ⁱ

~[H, ρ] +X

ν

LνρL^†_ν−¹

2(L^†_νLνρ+ρL^†_νLν)

11 / 27

(12)

Continuous/discrete-time diffusive SME

TheBelavkinquantum filter

dρ_t = −ⁱ

~[H, ρ_t] +X

ν

Lνρ_tL^†_ν−¹

2(L^†_νLνρ_t+ρ_tL^†_νLν)

! dt

+X

ν

√ην

Lνρ_t+ρ_tL^†_ν− Tr

(Lν+L^†_ν)ρ_t ρ_t

dW_ν,t withdW_ν,t =dy_ν,t −√

η_ν Tr

(Lν+L^†_ν)ρ_t

dtgiven by the measurement signaldy_ν,t, is always a stable filtering process.³ UsingIto rules, it can be written as a "discrete-time" Markov model¯ ⁴

ρ_t+dt=Kdyt(ρ_t)/Tr(Kdyt(ρ_t)) with partial Kraus mapsK_dy_t(ρ_t) =M_dy_tρ_tM^†_dy

t +P

ν(1−ην)Lνρ_tL^†_νdt Mdyt =I+

−ⁱ

~H−¹₂ P

νL^†_νLν

dt+P

ν

√η_νdyν,tL

where the probability of outcomedy_t = (dyν,t)reads:

P

dy_t ∈Q

ν[ξν, ξν+dξν]. ρ_t

= Tr(Kξ(ρ_t)) Q

νe^−ξ²^ν^/2dt^√^dξ^ν

2πdt 3H. Amini et al., Russian J. of Math. Physics, 2014, 21, 297-315.

4PR, J. Ralph PRA2015; see also PhD thesis of Ph. Campagne-Ibracq (2015) and of P. Six (2016).

12 / 27

(13)

Outline

Quantum tomography versus quantum filtering Likelihood function calculations via adjoint states

Discrete time case Continuous time case

MaxLike estimations with experimental data

Process tomography for QND measurement of photons State tomography of a quantum Maxwell demon

Fisher information and low-rank MaxLike estimates

13 / 27

(14)

QND measurement of photons

C

B

D R1

R2

I The probabilityP(y |φ_R,n)to gety ∈ {g,e}knowing the Ramsey angleφ_Rand the number of photon(s)n∈ {0,1,2, . . .}:

P(y|φ_R,n) =1+y A+Bc(n)cosφ_R+Bs(n)sinφ_R

with_e/g =±1.

depends on the parametersp= (Bc(n),Bs(n))n∈{0,1,...,}.

I The Kraus mapsK^p_y based on known cavity decay and thermal photons.

14 / 27

(15)

A priori calibration

⁵

(black dots) versus MaxLike (blue dots)

0 5 10 15

0 0.2 0.4 0.6 0.8 1

Photon number, n P(e 0 , n)

MaxLike estimation of 32 parameters p based on N = 8000 trajectories of T = 6000 outcome measurements.

5T. Rybarczyk, B. Peaudecerf, M. Penasa, S. Gerlich, B. Julsgaard, K.

Mølmer, S. Gleyzes, M. Brune, J. M. Raimond, S. Haroche, and I. Dotsenko.

Forward-backward analysis of the photon-number evolution in a cavity. PRA 2015.

15 / 27

(16)

A quantum Maxwell demon experiment arXiv:1702.01917v1

(a) After preparation in a thermal or quantum state the system S (superconducting qubit) state is recorded into the demon’s quantum memory D (microwave cavity) via a pulse that populates the cavity mode only if the qubit is in the ground state. This information is used to extract work which charges a battery with one extra photon:

system S emits this photon only when the demon’s cavity is empty. The memory reset is performed by cavity relaxation.

(b) When the system starts in a quantum superposition of the demon and system are

entangled after the record step. _{16 / 27}

(17)

Tomography of the demon after the work extraction step

rank 2

rank 6 rank 4

rank 4

Computations are based on a truncation to 20 photons How to define the confidence intervals for low rankρML?

17 / 27

(18)

Outline

Quantum tomography versus quantum filtering Likelihood function calculations via adjoint states

Discrete time case Continuous time case

MaxLike estimations with experimental data

Process tomography for QND measurement of photons State tomography of a quantum Maxwell demon

Fisher information and low-rank MaxLike estimates

18 / 27

(19)

Asymptotic when ρ

ML

is on the boundary

I To bypass boundary problem, consider Bayesian estimate instead of MaxLike ones

ρBM =

RρP(Y|ρ)P0(ρ)dρ RP(Y |ρ)P0(ρ)dρ with some prior distributionP0(ρ)dρ.

I When the likelihood exp(f(ρ))≡P(Y|ρ)is concentrated (f =N¯fwith N1) around its maximumρMLthat lies on the boundary (ρMLnot full rank), how to compute the first terms of an asymptotic expansion versus Nof

Z

Tr(ρA)^rexp(N¯f(ρ))P⁰(ρ)dρ

for any operatorAand exponentrand for some prior distribution P0(ρ)dρ(e.g., Gausssian unitary ensemble).

I Since all functions are analytic such an asymptotic expansion versusN always exists: Integration by parts, Watson’s lemma, Laplace’s method, stationary phase, steepest descents, Hironaka’s resolution of

singularities⁶, "singular learning"⁷

6An important reference: V.I. Arnold, S.M. Gusein-Zade, and A.N. Varchenko.

Singularities of Differentiable Maps, Vol. II. Birkhäuser, Boston, 1985

7S. Watanabe: Algebraic Geometry and Statistical Learning Theory, Cambridge University Press, 2009.

19 / 27

(20)

Geometric optimality condition for the log-likelihood function

Assume thatρ_MLis an argument of the maximum of

f :D 3ρ7→ X

µ∈M

log Tr

ρE^(µ)

∈[−∞,0]

overD(the set of density operators,E^(µ) ∈ D.). Then necessarily, ρ_MLsatisfies the following conditions:

I Tr ρ_MLE^(µ)

>0 for eachµ∈ M;

I h

ρ_ML, ∇f|_ρ

ML

i=0, where ∇f|_ρ

ML =P

µ∈M E^(µ)

Tr(^ρMLE^(µ)) is the gradient off atρ_MLfor the Frobenius scalar product;

I there existsλ_ML>0 such thatλ_MLPML=PML∇f|_ρ

ML and

∇f|_ρ

ML ≤λ_MLI, wherePMLis the orthogonal projector on the range ofρ_MLandI is the identity operator.

These conditions are also sufficient and characterize the unique maximum when, additionally, the vector space spanned by theE^(µ)’s coincides with the set of Hermitian matrices.

20 / 27

(21)

Geometric asymptotic expansions of Bayesian mean and variance

For any Hermitian operatorA, its Bayesian mean and variance read:

IA(N) = R

D Tr(ρA)e^Nf(ρ)P⁰(ρ)dρ R

De^Nf(ρ)P⁰(ρ)dρ , VA(N) = R

D

Tr(ρA)−IA(N)2

e^Nf(ρ)P0(ρ)dρ R

De^Nf(ρ)P⁰(ρ)dρ . Denote byρMLthe unique maximum offonDand byPMLthe orthogonal

projector on its range. In addition to the necessary and sufficient geometric conditions above, assume that ker

λMLI− ∇f|_ρ

ML

=ker(I−PML).

IA(N) = Tr(AρML)+O(1/N), VA(N) = Tr

Ak (FML)⁻¹(Ak)

/N+O(1/N²) whereBkis an orthogonal projection

Bk=B− Tr(BPML)

Tr(PML) PML−(I−PML)B(I−PML);

and whereFMLis a linear super-operator, corresponds to the Hessian atρML

of some restriction off andgeneralizes the Fisher information matrix:

FML(X) =X

µ

Tr XE_k^(µ)

Tr²(ρMLE^(µ))E_k^(µ)+

λMLI− ∇f|_ρ

ML

Xρ⁺_ML+ρ⁺_MLX

λMLI− ∇f|_ρ

ML

withρ⁺_MLthe Moore-Penrose pseudo-inverse ofρML.

21 / 27

(22)

Concluding remarks

I Low-rank approximationsand efficient numerical schemes for computations ofρ_ML, the adjoint statesE⁽ⁿ⁾, . . .

I Asymptoticswhen the log-likelihood function is not strongly concave, when ker

λ_MLI− ∇f|_ρ

ML

6=ker(I−PML). . .

I Process tomography: log-likelihood function not concave . . .

I Parameter estimation along quantum trajectories (in real-time) . . .

I Thematic quarter at Institut Henri Poincaré in Paris next Spring 2018 gathering experimental physicists and theoreticians.

22 / 27

(23)

23 / 27

(24)

Asymptotics for multi-dimension Laplace integrals (1)

Theorem (interior max) I_g(N) =R

z∈(−1,1)ⁿg(z)exp(Nf(z)) dz withf andganalytic functions ofz on a compact neighbourhood of D, the closure ofD.Assumethatf admits a unique maximum onDat z=0 with _∂z^∂²^f2

0negative definite.

Ifg(0)6=0, we have the following dominant term in the asymptotic expansion ofI_g(N)for largeN:

I_g(N) =







g(0) (2π)^n/2e^Nf⁽⁰⁾N^−n/2 r

det

∂²f

∂z²

0





 +O

e^Nf⁽⁰⁾N^−n/2−1 .

Ifg(0) =0, with ^∂g_∂z 0

=0, then we have:

I_g(N) =





 Tr

− ^∂²^g

∂z²

0

∂²f

∂z²

0

−1

(2π)^n/2 2

r det

∂²f

∂z²

₀







e^Nf⁽⁰⁾N^−n/2−1

+O

e^Nf⁽⁰⁾N^−n/2−2

. 24 / 27

(25)

Asymptotics for Bayesian integrals (1’)

Corollary (interior max):Assume thatf admits a unique maximum onDatz =0 with _∂z^∂²^f2

0negative definite. Then we have the following asymptotic for any analytic functiong(z):

M_g(N), R

z∈(−1,1)ⁿg(z)exp(Nf(z))dz R

z∈(−1,1)ⁿexp(Nf(z))dz =g(0) +O(N⁻¹) We have also:

V_g(N), R

z∈(−1,1)ⁿ

g(z)− M_g(N)2

exp(Nf(z)) dz R

z∈(−1,1)ⁿexp(Nf(z))dz

= Tr

− ^∂_∂z²^g₂ ₀

∂²f

∂z²

₀

−1

2N +O N⁻²

.

25 / 27

(26)

Asymptotics for multi-dimension Laplace integrals (2)

Theorem (boundary max):

Ig(N) =R

x∈(0,1)

R

z∈(−1,1)ⁿx^mg(x,z)exp(Nf(x,z))dxdzwithfandg analytic functions of(x,z)on a compact neighbourhood ofD, the closure of D.Assumethatf admits a unique maximum onDat(x,z) = (0,0), with

∂²f

∂z²

(0,0)

negative definite and _∂x^∂f

_(0,0)<0. Ifg(0,0)6=0, we have the following dominant term in the asymptotic expansion ofIg(N)for largeN:

Ig(N) =







g(0,0)m! (2π)^n/2e^Nf^(0,0)N^{−m−n/2−1} s

det

∂²f

∂z²

(0,0)

−_∂x^∂f (0,0)

m+1





 +O

e^Nf(0,0)N^{−m−n/2−2} .

Ifg(0,0) =0, with ^∂g_∂x

(0,0)=0 and ^∂g_∂z

(0,0)=0, then we have:

Ig(N) =







Tr − ^∂_∂z²^g₂ (0,0)

∂²f

∂z²

(0,0)

−1!

m! (2π)^n/2

2 s

det

∂²f

∂z²

(0,0)

− _∂x^∂f (0,0)

m+1







e^Nf^(0,0))N^{−m−n/2−2}

+O

e^Nf(0,0))N^{−m−n/2−3}

. _{26 / 27}

(27)

Asymptotics for Bayesian integrals (2’)

Corollary (boundary max):Assume thatfadmits a unique maximum onD at(x,z) = (0,0), with _∂z^∂²^f₂

(0,0)negative definite and _∂x^∂f

(0,0)<0. Then, we have the following asymptotic for any analytic functiong(x,z):

Mg(N), R

x∈(0,1)

R

z∈(−1,1)ⁿx^mg(x,z)exp(Nf(x,z))dxdz R

x∈(0,1)

R

z∈(−1,1)ⁿx^mexp(Nf(x,z))dxdz =g(0,0)+O(N⁻¹) We have also:

Vg(N), R

x∈(0,1)

R

z∈(−1,1)ⁿx^m

g(x,z)− Mg(N)2

exp(Nf(x,z))dxdz R

x∈(0,1)

R

z∈(−1,1)ⁿx^mexp(Nf(x,z))dxdz

=

Tr − ^∂_∂z²^g2

(0,0)

∂²f

∂z²

(0,0)

−1!

2N +O N⁻²

.

27 / 27