Maximum-Likelihood Quantum State Tomography by Cover's Method with Non-Asymptotic Analysis

by Chien-Ming Lin, et al.

We propose an iterative algorithm that computes the maximum-likelihood estimate in quantum state tomography. The optimization error of the algorithm converges to zero at an $O((1/k) \log D)$ rate, where $k$ denotes the number of iterations and $D$ denotes the dimension of the quantum state. The per-iteration computational complexity of the algorithm is $O(D^3 + N D^2)$, where $N$ denotes the number of measurement outcomes. The algorithm can be considered as a parameter-free correction of the $R\rho R$ method [A. I. Lvovsky. Iterative maximum-likelihood reconstruction in quantum homodyne tomography. J. Opt. B: Quantum Semiclass. Opt. 2004] [G. Molina-Terriza et al. Triggered qutrits for quantum communication protocols. Phys. Rev. Lett. 2004].




1 Introduction

Quantum state tomography aims to estimate the quantum state of a physical system, given measurement outcomes (see, e.g., [Paris2004] for a complete survey). There are various approaches to quantum state tomography, such as trace regression [Flammia2012, Gross2010, Opatrny1997, Yang2020, Youssry2019], maximum-likelihood estimation [Hradil1997, Hradil2004], Bayesian estimation [Blume-Kohout2010, Blume-Kohout2010a], and recently proposed deep learning-based methods [Ahmed2020, Quek2021] (see Footnote 1). Among existing approaches, the maximum-likelihood estimation approach has been standard in practice and enjoys favorable asymptotic statistical guarantees (see, e.g., [Hradil2004, Scholten2018]). The maximum-likelihood estimator is given by the optimization problem:

$$\hat{\rho} \in \operatorname*{arg\,min}_{\rho \in \mathcal{D}} f(\rho), \quad f(\rho) := -\frac{1}{N} \sum_{n=1}^{N} \log \operatorname{tr}(A_n \rho), \tag{1}$$

for some Hermitian positive semi-definite matrices $A_1, \ldots, A_N \in \mathbb{C}^{D \times D}$, where $\mathcal{D}$ denotes the set of quantum density matrices, i.e.,

$$\mathcal{D} := \{ \rho \in \mathbb{C}^{D \times D} : \rho = \rho^*, \ \rho \geq 0, \ \operatorname{tr}(\rho) = 1 \},$$

and $N$ denotes the number of measurement outcomes. We write $\rho^*$ for the conjugate transpose of $\rho$.

Footnote 1: A confusion the authors frequently encounter is that many people confuse state tomography with the notion of shadow tomography introduced by Aaronson [Aaronson2018, Aaronson2020]. State tomography aims at estimating the quantum state, whereas shadow tomography aims at estimating the probability distribution of measurement outcomes. Indeed, one interesting conclusion by Aaronson is that shadow tomography requires much less data than state tomography.
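As a numerical illustration of the objective in (1), the following Python sketch evaluates the negative log-likelihood and its gradient with NumPy. It assumes the measurement matrices are stacked in an array `A` of shape `(N, D, D)` and that the gradient is taken with respect to the Hilbert-Schmidt inner product. The function names and the random test data are hypothetical and do not come from the paper.

```python
import numpy as np

def neg_log_likelihood(rho, A):
    """f(rho) = -(1/N) * sum_n log tr(A_n rho), for A of shape (N, D, D)."""
    # tr(A_n rho) = sum_{i,j} A_n[i, j] * rho[j, i]
    probs = np.einsum("nij,ji->n", A, rho).real
    return -np.mean(np.log(probs))

def gradient(rho, A):
    """Gradient of f at rho: -(1/N) * sum_n A_n / tr(A_n rho)."""
    probs = np.einsum("nij,ji->n", A, rho).real
    return -np.einsum("n,nij->ij", 1.0 / probs, A) / A.shape[0]

# Hypothetical test data: N random positive semi-definite matrices.
rng = np.random.default_rng(0)
D, N = 4, 20
G = rng.normal(size=(N, D, D)) + 1j * rng.normal(size=(N, D, D))
A = G @ G.conj().transpose(0, 2, 1)

rho = np.eye(D) / D  # maximally mixed state
value = neg_log_likelihood(rho, A)
identity_check = np.trace(rho @ -gradient(rho, A)).real
```

A useful sanity check is that `identity_check` equals one up to rounding: each of the N terms contributes tr(A_n rho) / tr(A_n rho) = 1 before averaging.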

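$R\rho R$ is a numerical method developed to solve (1) [Lvovsky2004, MolinaTerriza2004]. Given a positive definite $\rho_1 \in \mathcal{D}$, $R\rho R$ iterates as

$$\rho_{k+1} = \mathcal{N}\big( \nabla f(\rho_k) \, \rho_k \, \nabla f(\rho_k) \big),$$

where the mapping $\mathcal{N}$ scales its input such that $\operatorname{tr}(\rho_{k+1}) = 1$, and $\nabla f$ denotes the gradient mapping of $f$. $R\rho R$ is parameter-free (i.e., it does not require parameter tuning) and typically converges fast in practice. Unfortunately, one can construct a synthetic dataset on which $R\rho R$ does not converge [Rehacek2007].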
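A single iteration of this kind can be sketched in Python as follows. This is a minimal illustration, assuming the update multiplies the current iterate by the gradient on both sides and then rescales to unit trace; the helper names and the random data are hypothetical.

```python
import numpy as np

def gradient(rho, A):
    """Gradient of f at rho: -(1/N) * sum_n A_n / tr(A_n rho)."""
    probs = np.einsum("nij,ji->n", A, rho).real
    return -np.einsum("n,nij->ij", 1.0 / probs, A) / A.shape[0]

def r_rho_r_step(rho, A):
    """One step: sandwich rho between two copies of the gradient, then rescale."""
    R = gradient(rho, A)  # the overall sign cancels in the product R rho R
    unnormalized = R @ rho @ R
    return unnormalized / np.trace(unnormalized).real

# Hypothetical test data: N random positive semi-definite matrices.
rng = np.random.default_rng(1)
D, N = 3, 15
G = rng.normal(size=(N, D, D)) + 1j * rng.normal(size=(N, D, D))
A = G @ G.conj().transpose(0, 2, 1)

rho_next = r_rho_r_step(np.eye(D) / D, A)
```

Because the unnormalized update has the form R rho R with R Hermitian, it stays Hermitian and positive semi-definite, so rescaling by the trace returns a density matrix.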

According to [Lvovsky2004], $R\rho R$ is inspired by Cover's method (see Footnote 2) for solving the optimization problem [Cover1984]:

$$\hat{x} \in \operatorname*{arg\,min}_{x \in \Delta} g(x), \quad g(x) := -\frac{1}{N} \sum_{n=1}^{N} \log \langle a_n, x \rangle, \tag{2}$$

for some entry-wise non-negative vectors $a_1, \ldots, a_N \in \mathbb{R}^D$, where $\Delta$ denotes the probability simplex in $\mathbb{R}^D$, i.e.,

$$\Delta := \Big\{ x = (x[1], \ldots, x[D]) \in \mathbb{R}^D : x[d] \geq 0 \ \text{for all } d, \ \sum_{d=1}^{D} x[d] = 1 \Big\},$$

and the inner product is the one associated with the Euclidean norm. The optimization problem appears when one wants to compute the growth-optimal portfolio for long-term investment (see, e.g., [MacLean2012]). Given an entry-wise strictly positive initial iterate $x_1 \in \Delta$, Cover's method iterates as

$$x_{k+1} = x_k \odot \big( -\nabla g(x_k) \big),$$

where $\odot$ denotes the entry-wise product, also known as the Schur product. Cover's method is guaranteed to converge to the optimum [Cover1984]. Indeed, if the matrices $A_n$ in (1) share the same eigenbasis, then it is easily checked that (1) is equivalent to (2) but $R\rho R$ is not equivalent to Cover's method (see Footnote 3). This explains why $R\rho R$ does not inherit the convergence guarantee of Cover's method.

Footnote 2: Indeed, Cover's method coincides with the expectation maximization method for solving (2) and hence is typically called expectation maximization in the literature. Nevertheless, Cover's and our derivations and convergence analyses do not need and are not covered by existing results on expectation maximization, so we do not call the method expectation maximization to avoid possible confusion.

Footnote 3: Cover's method does not need the scaling mapping $\mathcal{N}$. One can check that its iterates are already in $\Delta$.
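Cover's multiplicative update is easy to try numerically. The Python sketch below is an illustration under assumptions: the vectors a_n are stored as rows of a hypothetical array `a` of shape `(N, D)`, drawn at random, and the iteration count is arbitrary.

```python
import numpy as np

def objective(x, a):
    """g(x) = -(1/N) * sum_n log <a_n, x>, for a of shape (N, D)."""
    return -np.mean(np.log(a @ x))

def cover_step(x, a):
    """One update: multiply x entry-wise by the negative gradient of g at x."""
    neg_grad = np.mean(a / (a @ x)[:, None], axis=0)
    return x * neg_grad

# Hypothetical test data: strictly positive vectors a_n.
rng = np.random.default_rng(0)
N, D = 50, 5
a = rng.uniform(0.1, 1.0, size=(N, D))

x = np.full(D, 1.0 / D)  # entry-wise strictly positive initial iterate
values = [objective(x, a)]
for _ in range(100):
    x = cover_step(x, a)
    values.append(objective(x, a))
```

No renormalization appears in `cover_step`: the entries of the new iterate sum to the average of <a_n, x> / <a_n, x>, which is one, so the iterates remain in the simplex.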

Rehacek et al. proposed a diluted correction of $R\rho R$ [Rehacek2007]. Given a positive definite initial iterate $\rho_1 \in \mathcal{D}$, diluted $R\rho R$ iterates as

$$\rho_{k+1} = \mathcal{N}\big( (I - \alpha_k \nabla f(\rho_k)) \, \rho_k \, (I - \alpha_k \nabla f(\rho_k)) \big),$$

where $I$ is the identity matrix and the parameter $\alpha_k > 0$ is chosen by exact line search. Later, Goncalves et al. proposed a variant of diluted $R\rho R$ by adopting Armijo line search [Goncalves2014]. Both versions of diluted $R\rho R$ are guaranteed to converge to the optimum. Unfortunately, the convergence guarantees of both versions are asymptotic and do not allow us to characterize the iteration complexity, i.e., the number of iterations required to obtain an approximate solution of (1). In particular, the dimension $D$ grows exponentially with the number of qubits and can be huge, but the dependence of the iteration complexities of the diluted $R\rho R$ methods on $D$ is unclear.
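For illustration, a diluted step can be sketched in Python as below. This is a simplification under assumptions: the update is taken in the form (I + alpha R) rho (I + alpha R) with R the negative gradient, and a fixed, hypothetical step size `alpha` replaces the exact or Armijo line search used in the cited works.

```python
import numpy as np

def gradient(rho, A):
    """Gradient of f at rho: -(1/N) * sum_n A_n / tr(A_n rho)."""
    probs = np.einsum("nij,ji->n", A, rho).real
    return -np.einsum("n,nij->ij", 1.0 / probs, A) / A.shape[0]

def diluted_step(rho, A, alpha):
    """One diluted step with a fixed step size alpha (no line search)."""
    R = -gradient(rho, A)
    M = np.eye(rho.shape[0]) + alpha * R
    unnormalized = M @ rho @ M
    return unnormalized / np.trace(unnormalized).real

# Hypothetical test data: random measurement matrices and a random density matrix.
rng = np.random.default_rng(2)
D, N = 3, 15
G = rng.normal(size=(N, D, D)) + 1j * rng.normal(size=(N, D, D))
A = G @ G.conj().transpose(0, 2, 1)
H = rng.normal(size=(D, D)) + 1j * rng.normal(size=(D, D))
rho = H @ H.conj().T
rho = rho / np.trace(rho).real

small = diluted_step(rho, A, alpha=0.1)
large = diluted_step(rho, A, alpha=1e8)

# Undiluted limit: R rho R rescaled to unit trace.
R = -gradient(rho, A)
limit = R @ rho @ R
limit = limit / np.trace(limit).real
```

With a very large step size the identity term becomes negligible and the step approaches the undiluted update, while a small step size stays close to the current iterate.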

We propose the following algorithm.

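  • Set $\rho_1 = I / D$, where $I$ denotes the identity matrix.

  • For each $k \in \mathbb{N}$, compute

    $$\rho_{k+1} = \frac{\tilde{\rho}_{k+1}}{\operatorname{tr}(\tilde{\rho}_{k+1})}, \quad \tilde{\rho}_{k+1} := \exp\big( \log(\rho_k) + \log(-\nabla f(\rho_k)) \big),$$

    where $\exp$ and $\log$ denote matrix exponential and logarithm, respectively.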
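The two steps above can be sketched in Python as follows. This is a minimal illustration, assuming the update exponentiates the sum of the two matrix logarithms and then rescales to unit trace. Matrix functions are computed by Hermitian eigendecomposition, and the helper names and random data are hypothetical.

```python
import numpy as np

def gradient(rho, A):
    """Gradient of f at rho: -(1/N) * sum_n A_n / tr(A_n rho)."""
    probs = np.einsum("nij,ji->n", A, rho).real
    return -np.einsum("n,nij->ij", 1.0 / probs, A) / A.shape[0]

def apply_to_spectrum(M, fun):
    """Apply a scalar function to the eigenvalues of a Hermitian matrix M."""
    w, V = np.linalg.eigh(M)
    return (V * fun(w)) @ V.conj().T

def unnormalized_update(rho, A):
    """exp(log(rho) + log(-grad f(rho))); both arguments must be positive definite."""
    S = apply_to_spectrum(rho, np.log) + apply_to_spectrum(-gradient(rho, A), np.log)
    S = (S + S.conj().T) / 2  # remove rounding-level asymmetry
    return apply_to_spectrum(S, np.exp)

def proposed_step(rho, A):
    """One iteration: unnormalized update followed by rescaling to unit trace."""
    rho_tilde = unnormalized_update(rho, A)
    return rho_tilde / np.trace(rho_tilde).real

# Hypothetical test data: N random full-rank positive definite matrices.
rng = np.random.default_rng(0)
D, N = 4, 30
G = rng.normal(size=(N, D, D)) + 1j * rng.normal(size=(N, D, D))
A = G @ G.conj().transpose(0, 2, 1)

rho = np.eye(D) / D  # initialization rho_1 = I / D
traces = []          # traces of the unnormalized updates
for _ in range(30):
    traces.append(np.trace(unnormalized_update(rho, A)).real)
    rho = proposed_step(rho, A)
```

Each eigendecomposition costs on the order of D^3 operations and the gradient costs on the order of N D^2, in line with the per-iteration cost stated in the abstract. The recorded traces should not exceed one (up to rounding), which is what the Golden-Thompson inequality predicts for this update.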

Notice that the objective function implicitly requires that $\operatorname{tr}(A_n \rho)$ are strictly positive; otherwise, $f(\rho)$ does not exist as $\log 0$ is not well-defined. Our initialization and iteration rule guarantee that $\rho_k$ are full-rank and $\operatorname{tr}(A_n \rho_k)$ are strictly positive.

Let us discuss the computational complexity of the proposed algorithm. The computational complexity of computing $\nabla f(\rho_k)$ is $O(N D^2)$. The computational complexities of computing matrix logarithm and exponential are $O(D^3)$. The per-iteration computational complexity is hence $O(D^3 + N D^2)$.

We can observe that the proposed algorithm recovers Cover's method when $\rho_k$ and $\nabla f(\rho_k)$ share the same eigenbasis. We show that the proposed algorithm indeed converges and its iteration complexity is logarithmic in the dimension $D$.
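The reduction to Cover's method in the commuting case can be spelled out in a few lines. The derivation below is a sketch under assumptions: the update is taken to be $\exp(\log \rho_k + \log(-\nabla f(\rho_k)))$ followed by rescaling to unit trace, and $\rho_k$ and all $A_n$ are assumed diagonal in a common orthonormal basis given by the columns of a unitary $U$ (a symbol introduced here for illustration).

```latex
\begin{align*}
% Common eigenbasis: the problem data reduce to vectors x_k and a_n.
\rho_k &= U \operatorname{diag}(x_k) U^*, \qquad A_n = U \operatorname{diag}(a_n) U^*,
  \qquad \operatorname{tr}(A_n \rho_k) = \langle a_n, x_k \rangle, \\
% The gradient is diagonal in the same basis.
-\nabla f(\rho_k) &= U \operatorname{diag}\Big( \frac{1}{N} \sum_{n=1}^{N}
  \frac{a_n}{\langle a_n, x_k \rangle} \Big) U^*
  = U \operatorname{diag}\big( -\nabla g(x_k) \big) U^*, \\
% Logarithms of commuting matrices add entry-wise on the diagonal.
\exp\big( \log \rho_k + \log(-\nabla f(\rho_k)) \big)
  &= U \operatorname{diag}\big( x_k \odot (-\nabla g(x_k)) \big) U^*, \\
% The trace is already one, so the rescaling step does nothing.
\sum_{d=1}^{D} x_k[d] \, \big( -\nabla g(x_k) \big)[d]
  &= \frac{1}{N} \sum_{n=1}^{N} \frac{\langle a_n, x_k \rangle}{\langle a_n, x_k \rangle} = 1 .
\end{align*}
```

Hence the next iterate is $U \operatorname{diag}(x_k \odot (-\nabla g(x_k))) U^*$, whose diagonal is exactly Cover's update applied to $x_k$.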

Theorem 1

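Assume that $\bigcap_{n=1}^{N} \ker(A_n) = \{0\}$. Let $(\rho_k)_{k \in \mathbb{N}}$ be the sequence of iterates generated by the proposed method. Define $\bar{\rho}_k := (\rho_1 + \cdots + \rho_k) / k$. Then, for every $\varepsilon > 0$, we have $f(\bar{\rho}_k) - f(\hat{\rho}) \leq \varepsilon$ if $k \geq (\log D) / \varepsilon$.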
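To get a feel for the logarithmic dependence on the dimension, the Python sketch below counts the iterations needed for a target accuracy. It assumes the error bound has the explicit form (log D) / k, which is consistent with the O((1/k) log D) rate stated in the abstract, but the constant is an assumption. The function name is hypothetical.

```python
import math

def iterations_needed(num_qubits, eps):
    """Smallest integer k with (log D) / k <= eps, where D = 2 ** num_qubits."""
    log_dim = num_qubits * math.log(2.0)  # log D grows linearly in the qubit count
    return math.ceil(log_dim / eps)

k_10_qubits = iterations_needed(10, 0.01)  # D = 1024
k_20_qubits = iterations_needed(20, 0.01)  # D = 1048576
```

Under this assumption, doubling the number of qubits squares the dimension but only about doubles the iteration count (694 versus 1387 here), although each iteration still costs on the order of D^3 + N D^2 operations.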

Remark 1

Suppose $\bigcap_{n=1}^{N} \ker(A_n) = S$ for some subspace $S \neq \{0\}$. Let $U$ be a matrix whose columns form an orthonormal basis of $S^\perp$. Then, it suffices to solve (1) on a lower-dimensional space by replacing $A_n$ with $U^* A_n U$ in the objective function.

Recall that (2) and Cover's method are special cases of (1) and the proposed algorithm, respectively. Moreover, Cover's method is equivalent to the expectation maximization method for Poisson inverse problems [Vardi1993]. Theorem 1 is hence of independent interest even for computing the growth-optimal portfolio by Cover's method and solving Poisson inverse problems by expectation maximization, showing that the iteration complexities of both are also $O(\varepsilon^{-1} \log D)$. This supplements the asymptotic convergence results in [Cover1984] and [Vardi1985]. Whereas the same iteration complexity bound for Cover's method in growth-optimal portfolio is immediate from a lemma due to Iusem [Iusem1992, Lemma 2.2], it is currently unclear to us how to extend Iusem's analysis to the quantum setup.

2 Proof of Theorem 1

For convenience, let $A$ be a random matrix following the empirical probability distribution of $A_1, \ldots, A_N$. (Footnote 4: Notice that the derivation here is not restricted to the empirical probability distribution.) Then, we have $f(\rho) = \mathbb{E}[-\log \operatorname{tr}(A \rho)]$, where $\mathbb{E}$ denotes the mathematical expectation. Define

where the inner product, as well as all inner products in the rest of this section, is the Hilbert-Schmidt inner product. We start with an error upper bound.

Lemma 1

For any density matrix such that exists,

Remark 2

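Notice that $-\nabla f(\rho)$ is always positive definite and $\log(-\nabla f(\rho))$ is always well-defined. Otherwise, suppose there exists some non-zero vector $u$ such that $\langle u, -\nabla f(\rho) u \rangle = 0$. Then, as $A_n$ are positive semi-definite, we have $A_n u = 0$ for all $n$, violating the assumption that $\bigcap_{n=1}^{N} \ker(A_n) = \{0\}$.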

Proof (Lemma 1)

By Jensen’s inequality, we write

Deriving the following lemma is the major technical challenge in our convergence analysis. The lemma shows that the mapping is operator convex.

Lemma 2

For any density matrix , the function is convex.


Equivalently, we want to show that for all and Hermitian . Define . By [Hiai2014, Example 3.22 and Exercise 3.24], we have


Recall the chain rule for the second-order Fréchet derivative (see, e.g., [Bhatia1997, p. 316]):

Then, we write


Since is obviously positive semi-definite, it suffices to show that is positive semi-definite for all . We write

which is positive semi-definite by an extension of the Cauchy-Schwarz inequality due to Lavergne [Lavergne2008] (see Footnote 5).

Footnote 5: Whereas Lavergne considers the real matrix case, we notice the proof directly extends to the Hermitian matrix case.

Now, we are ready to prove Theorem 1.

Proof (Theorem 1)

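By the Golden-Thompson inequality, we write

$$\operatorname{tr} \exp\big( \log(\rho_k) + \log(-\nabla f(\rho_k)) \big) \leq \operatorname{tr}\big( \rho_k (-\nabla f(\rho_k)) \big) = \mathbb{E}\left[ \frac{\operatorname{tr}(A \rho_k)}{\operatorname{tr}(A \rho_k)} \right] = 1.$$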

Therefore, by the iteration rule of the proposed algorithm and operator monotonicity of the matrix logarithm, we have

$$\log(\rho_{k+1}) \geq \log(\rho_k) + \log(-\nabla f(\rho_k)). \tag{3}$$


Then, for any , we write

where the first inequality follows from Lemma 1, the second follows from Lemma 2 and Jensen’s inequality, the third follows from (3), and the last follows from the fact that .


C.-M. Lin and Y.-H. Li are supported by the Young Scholar Fellowship (Einstein Program) of the Ministry of Science and Technology of Taiwan under grant numbers MOST 109-2636-E-002-025 and MOST 110-2636-E-002-012. H.-C. Cheng is supported by the Young Scholar Fellowship (Einstein Program) of the Ministry of Science and Technology in Taiwan (R.O.C.) under grant numbers MOST 109-2636-E-002-001 and MOST 110-2636-E-002-009, and by the Yushan Young Scholar Program of the Ministry of Education in Taiwan (R.O.C.) under grant numbers NTU-109V0904 and NTU-110V0904.