Quantum state tomography aims to estimate the quantum state of a physical system, given measurement outcomes (see, e.g., [Paris2004] for a complete survey). There are various approaches to quantum state tomography, such as trace regression [Flammia2012, Gross2010, Opatrny1997, Yang2020, Youssry2019], maximum-likelihood estimation [Hradil1997, Hradil2004], Bayesian estimation [Blume-Kohout2010, Blume-Kohout2010a]
, and recently proposed deep learning-based methods [Ahmed2020, Quek2021]. (A confusion the authors frequently encounter is that many people conflate state tomography with the notion of shadow tomography introduced by Aaronson [Aaronson2018, Aaronson2020]. State tomography aims at estimating the quantum state, whereas shadow tomography aims at estimating the probability distribution of measurement outcomes. Indeed, one interesting conclusion by Aaronson is that shadow tomography requires much less data than state tomography.) Among existing approaches, the maximum-likelihood estimation approach has been standard in practice and enjoys favorable asymptotic statistical guarantees (see, e.g., [Hradil2004, Scholten2018]). The maximum-likelihood estimator is given by the optimization problem:
$$\hat{\rho} \in \operatorname*{arg\,min}_{\rho \in \mathcal{D}} f(\rho), \qquad f(\rho) := -\frac{1}{N} \sum_{n=1}^{N} \log \operatorname{tr}(A_n \rho), \tag{1}$$
for some Hermitian positive semi-definite matrices $A_1, \ldots, A_N$, where $\mathcal{D}$ denotes the set of quantum density matrices, i.e.,
$$\mathcal{D} := \{ \rho \in \mathbb{C}^{d \times d} : \rho = \rho^*,\ \rho \succeq 0,\ \operatorname{tr} \rho = 1 \},$$
and $N$ denotes the number of measurement outcomes. We write $A^*$ for the conjugate transpose of $A$.
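To make the objective concrete, here is a minimal numerical sketch of the negative log-likelihood, assuming the standard maximum-likelihood form $f(\rho) = -\frac{1}{N}\sum_{n}\log\operatorname{tr}(A_n\rho)$; the function name and the toy measurement operators below are illustrative, not from the paper.

```python
import numpy as np

def neg_log_likelihood(rho, A_list):
    """f(rho) = -(1/N) * sum_n log tr(A_n rho), the assumed MLE objective."""
    probs = [np.trace(A @ rho).real for A in A_list]
    return -float(np.mean(np.log(probs)))

# Toy example: projective measurement in the computational basis of one qubit.
A_list = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
rho = np.eye(2) / 2  # maximally mixed state, a valid density matrix
print(neg_log_likelihood(rho, A_list))  # log 2 ≈ 0.6931
```

Each term $\operatorname{tr}(A_n\rho)$ is the Born-rule probability of the $n$-th recorded outcome, so the objective is finite only when all these probabilities are strictly positive.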
RρR is a numerical method developed to solve (1) [Lvovsky2004, MolinaTerriza2004]. Given a positive definite initial iterate $\rho_1$, RρR iterates as
$$\rho_{k+1} = \mathcal{N}\big( \nabla f(\rho_k)\, \rho_k\, \nabla f(\rho_k) \big),$$
where the mapping $\mathcal{N}$ scales its input $X$ such that $\operatorname{tr}(\mathcal{N}(X)) = 1$, and $\nabla f$ denotes the gradient mapping of $f$. RρR is parameter-free (i.e., it does not require parameter tuning) and typically converges fast in practice. Unfortunately, one can construct a synthetic dataset on which RρR does not converge [Rehacek2007].
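For illustration, the fixed-point iteration described above can be sketched as follows. This is a hedged reading in which one step is $\rho \leftarrow \mathcal{N}(R\rho R)$ with $R = -\nabla f(\rho) = \frac{1}{N}\sum_n A_n / \operatorname{tr}(A_n\rho)$ (the sign flips cancel in the quadratic update); all helper names are ours.

```python
import numpy as np

def normalize(X):
    # The map N: scale the input so that its trace equals one.
    return X / np.trace(X).real

def R(rho, A_list):
    # R(rho) = -grad f(rho) = (1/N) * sum_n A_n / tr(A_n rho)
    return sum(A / np.trace(A @ rho).real for A in A_list) / len(A_list)

def rrr_step(rho, A_list):
    # One fixed-point step: rho <- N( R(rho) rho R(rho) ).
    Rm = R(rho, A_list)
    return normalize(Rm @ rho @ Rm)

A_list = [np.diag([0.9, 0.1]), np.diag([0.2, 0.8])]
rho = np.eye(2) / 2
for _ in range(50):
    rho = rrr_step(rho, A_list)
# The iterate remains a density matrix: Hermitian, positive semi-definite, unit trace.
```

Note that each step preserves positive semi-definiteness (the update conjugates $\rho_k$ by a Hermitian matrix) and the trace is restored by $\mathcal{N}$.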
According to [Lvovsky2004], RρR is inspired by Cover's method for solving the following optimization problem [Cover1984]. (Indeed, Cover's method coincides with the expectation maximization method for solving (2) and hence is typically called expectation maximization in the literature. Nevertheless, Cover's and our derivations and convergence analyses neither rely on nor are covered by existing results on expectation maximization, so we do not call the method expectation maximization, to avoid possible confusion.)
$$\hat{x} \in \operatorname*{arg\,min}_{x \in \Delta} g(x), \qquad g(x) := -\frac{1}{N} \sum_{n=1}^{N} \log \langle a_n, x \rangle, \tag{2}$$
for some entry-wise non-negative vectors $a_1, \ldots, a_N$, where $\Delta$ denotes the probability simplex in $\mathbb{R}^d$, i.e.,
$$\Delta := \Big\{ x \in \mathbb{R}^d : x \geq 0,\ \textstyle\sum_{i=1}^{d} x_i = 1 \Big\},$$
and the inner product is the one associated with the Euclidean norm. The optimization problem (2) appears when one wants to compute the growth-optimal portfolio for long-term investment (see, e.g., [MacLean2012]). Given an entry-wise strictly positive initial iterate $x_1 \in \Delta$, Cover's method iterates as
$$x_{k+1} = x_k \circ \big( -\nabla g(x_k) \big),$$
where $\circ$ denotes the entry-wise product, also known as the Schur product. Cover's method is guaranteed to converge to the optimum [Cover1984]. Indeed, if the matrices $A_1, \ldots, A_N$ in (1) share the same eigenbasis, then it is easily checked that (1) is equivalent to (2), but RρR is not equivalent to Cover's method (Cover's method does not need the scaling mapping $\mathcal{N}$; one can check that its iterates are already in $\Delta$). This explains why RρR does not inherit the convergence guarantee of Cover's method.
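Cover's method is simple to state in code. Below is a minimal sketch assuming the objective $g(x) = -\frac{1}{N}\sum_n \log\langle a_n, x\rangle$; the key identity is $\langle x, -\nabla g(x)\rangle = \frac{1}{N}\sum_n \langle a_n, x\rangle / \langle a_n, x\rangle = 1$, which is why the iterates stay on the simplex without any scaling step.

```python
import numpy as np

def neg_grad(x, a_list):
    # -grad g(x) = (1/N) * sum_n a_n / <a_n, x>
    return sum(a / np.dot(a, x) for a in a_list) / len(a_list)

def cover_step(x, a_list):
    # x_{k+1} = x_k ∘ (-grad g(x_k)), the entry-wise (Schur) product.
    return x * neg_grad(x, a_list)

a_list = [np.array([1.0, 2.0]), np.array([3.0, 1.0])]
x = np.array([0.5, 0.5])       # entry-wise strictly positive initial iterate
x_next = cover_step(x, a_list)
print(x_next.sum())            # sums to 1 up to rounding: stays on the simplex
```

In the portfolio interpretation, $a_n$ is the price-relative vector of period $n$ and $x$ the portfolio weights; each step re-weights the current portfolio by its average relative performance.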
Rehacek et al. proposed a diluted correction of RρR [Rehacek2007]. Given a positive definite initial iterate $\rho_1$, diluted RρR iterates as
$$\rho_{k+1} = \mathcal{N}\big( (I + \epsilon_k R_k)\, \rho_k\, (I + \epsilon_k R_k) \big), \qquad R_k := -\nabla f(\rho_k),$$
where the parameter $\epsilon_k > 0$ is chosen by exact line search. Later, Goncalves et al. proposed a variant of diluted RρR adopting the Armijo line search [Goncalves2014]. Both versions of diluted RρR are guaranteed to converge to the optimum. Unfortunately, both convergence guarantees are asymptotic and do not allow us to characterize the iteration complexity, i.e., the number of iterations required to obtain an approximate solution of (1). In particular, the dimension $d$ grows exponentially with the number of qubits and can be huge, but the dependence of the iteration complexities of the diluted RρR methods on $d$ is unclear.
We propose the following algorithm.

1. Set $\rho_1 = I/d$, where $I$ denotes the $d \times d$ identity matrix.
2. For each $k \in \mathbb{N}$, compute
$$\rho_{k+1} = \mathcal{N}\Big( \exp\big( \log \rho_k + \log(-\nabla f(\rho_k)) \big) \Big),$$
where $\exp$ and $\log$ denote the matrix exponential and logarithm, respectively.
Notice that the objective function implicitly requires that the probabilities $\operatorname{tr}(A_n \rho)$ be strictly positive; otherwise, $f(\rho)$ does not exist as $\log 0$ is not well-defined. Our initialization and iteration rule guarantee that the iterates $\rho_k$ are full-rank, and hence that the values $\operatorname{tr}(A_n \rho_k)$ are strictly positive.
Let us discuss the computational complexity of the proposed algorithm. The computational complexity of computing $\nabla f(\rho_k)$ is $O(N d^2)$. The computational complexities of computing the matrix logarithm and exponential are $O(d^3)$. The per-iteration computational complexity is hence $O(N d^2 + d^3)$.
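As a sanity check on the description above, here is a sketch of one iteration under our reading of the update rule, $\rho_{k+1} = \mathcal{N}(\exp(\log\rho_k + \log(-\nabla f(\rho_k))))$. The eigendecomposition-based helper is ours; it exploits the fact that both arguments of the matrix logarithm are Hermitian and positive definite.

```python
import numpy as np

def funm_herm(X, f):
    # Apply a scalar function to a Hermitian matrix via eigendecomposition.
    w, V = np.linalg.eigh(X)
    return (V * f(w)) @ V.conj().T

def proposed_step(rho, A_list):
    # -grad f(rho) = (1/N) * sum_n A_n / tr(A_n rho)
    G = sum(A / np.trace(A @ rho).real for A in A_list) / len(A_list)
    # Assumed update: rho <- N( exp( log rho + log(-grad f(rho)) ) ).
    X = funm_herm(funm_herm(rho, np.log) + funm_herm(G, np.log), np.exp)
    return X / np.trace(X).real  # the scaling map N

# One qubit, computational-basis measurement; initialization rho_1 = I/d.
A_list = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
rho = np.eye(2) / 2
rho = proposed_step(rho, A_list)
# Here -grad f(rho) = I, so the maximally mixed state is a fixed point.
```

Since $\exp$ of a Hermitian matrix is positive definite, every iterate is full-rank, matching the remark above; the eigendecompositions make the $O(d^3)$ per-iteration cost of the matrix functions explicit.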
We can observe that the proposed algorithm recovers Cover's method when the matrices $A_1, \ldots, A_N$ share the same eigenbasis. We show that the proposed algorithm indeed converges and that its iteration complexity is logarithmic in the dimension $d$.
Theorem 1. Assume that $\sum_{n=1}^{N} A_n$ is positive definite. Let $(\rho_k)_{k \in \mathbb{N}}$ be the sequence of iterates generated by the proposed method. Define $\Delta_k := f(\rho_k) - \min_{\rho \in \mathcal{D}} f(\rho)$. Then, for every $\varepsilon > 0$, we have $\Delta_k \leq \varepsilon$ if $k \geq (\log d) / \varepsilon$.
Suppose that $\sum_{n=1}^{N} A_n$ is not positive definite. Let $V$ be a matrix whose columns form an orthonormal basis of the range of $\sum_{n=1}^{N} A_n$. Then, it suffices to solve (1) on a lower-dimensional space by replacing $A_n$ with $V^* A_n V$ in the objective function.
Recall that (2) and Cover's method are special cases of (1) and the proposed algorithm, respectively. Moreover, Cover's method is equivalent to the expectation maximization method for Poisson inverse problems [Vardi1993]. Theorem 1 is hence of independent interest even for computing the growth-optimal portfolio by Cover's method and for solving Poisson inverse problems by expectation maximization, showing that the iteration complexities of both are also $O((\log d)/\varepsilon)$. This supplements the asymptotic convergence results in [Cover1984] and [Vardi1985]. Whereas the same iteration complexity bound for Cover's method in the growth-optimal portfolio problem is immediate from a lemma due to Iusem [Iusem1992, Lemma 2.2], it is currently unclear to us how to extend Iusem's analysis to the quantum setup.
2 Proof of Theorem 1
For convenience, let $A$ be a random matrix following the empirical probability distribution of $A_1, \ldots, A_N$ (the derivation here is not restricted to the empirical probability distribution). Then, we have $f(\rho) = -\mathbb{E} \log \langle A, \rho \rangle$ and $-\nabla f(\rho) = \mathbb{E} [ A / \langle A, \rho \rangle ]$, where $\mathbb{E}$ denotes the mathematical expectation and the inner product, as well as all inner products in the rest of this section, is the Hilbert-Schmidt inner product. We start with an error upper bound.
Lemma 1. For any density matrices $\rho$ and $\sigma$ such that $f(\rho)$ exists,
$$f(\rho) - f(\sigma) \leq \log \langle -\nabla f(\rho), \sigma \rangle.$$

Notice that $-\nabla f(\rho)$ is always positive definite, so the right-hand side is always well-defined. Otherwise, suppose there exists some non-zero vector $v$ such that $\langle v, (-\nabla f(\rho)) v \rangle = 0$. Then, as the matrices $A_n$ are positive semi-definite, we have $\langle v, A_n v \rangle = 0$ for all $n$, violating the assumption that $\sum_{n=1}^{N} A_n$ is positive definite.
Proof (Lemma 1)
By Jensen's inequality, we write
$$f(\rho) - f(\sigma) = \mathbb{E} \log \frac{\langle A, \sigma \rangle}{\langle A, \rho \rangle} \leq \log \mathbb{E} \frac{\langle A, \sigma \rangle}{\langle A, \rho \rangle} = \log \Big\langle \mathbb{E} \frac{A}{\langle A, \rho \rangle}, \sigma \Big\rangle = \log \langle -\nabla f(\rho), \sigma \rangle.$$
Deriving the following lemma is the major technical challenge in our convergence analysis. The lemma shows that the mapping $\rho \mapsto -\log(-\nabla f(\rho))$ is operator convex.

Lemma 2. For any density matrix $\sigma$, the function $\rho \mapsto \langle \sigma, -\log(-\nabla f(\rho)) \rangle$ is convex.
Equivalently, defining $g(\rho) := -\log(-\nabla f(\rho))$, we want to show that the second-order Fréchet derivative $\mathrm{D}^2 g(\rho)[H, H]$ is positive semi-definite for all density matrices $\rho$ and Hermitian matrices $H$. By [Hiai2014, Example 3.22 and Exercise 3.24], the first- and second-order Fréchet derivatives of the matrix logarithm can be computed explicitly. Recall the chain rule for the second-order Fréchet derivative (see, e.g., [Bhatia1997, p. 316]):
Then, we write
Since the first term is obviously positive semi-definite, it suffices to show that the second term is positive semi-definite. We write
which is positive semi-definite by an extension of the Cauchy-Schwarz inequality due to Lavergne [Lavergne2008] (whereas Lavergne considers the real matrix case, we notice that the proof directly extends to the Hermitian matrix case).
Now, we are ready to prove Theorem 1.
Proof (Theorem 1)
By the Golden-Thompson inequality, we write
$$\operatorname{tr} \exp\big( \log \rho_k + \log(-\nabla f(\rho_k)) \big) \leq \operatorname{tr}\big( \rho_k \, (-\nabla f(\rho_k)) \big) = \mathbb{E} \frac{\langle A, \rho_k \rangle}{\langle A, \rho_k \rangle} = 1.$$
Therefore, by the iteration rule of the proposed algorithm and the operator monotonicity of the matrix logarithm, we have
$$\log \rho_{k+1} \succeq \log \rho_k + \log(-\nabla f(\rho_k)).$$
Then, for any , we write
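As an aside, the Golden-Thompson inequality invoked in this proof, $\operatorname{tr} e^{A+B} \leq \operatorname{tr}(e^A e^B)$ for Hermitian $A$ and $B$, is easy to check numerically; the test matrices below are arbitrary and the helper is ours.

```python
import numpy as np

def expm_herm(X):
    # Matrix exponential of a Hermitian matrix via eigendecomposition.
    w, V = np.linalg.eigh(X)
    return (V * np.exp(w)) @ V.conj().T

A = np.array([[1.0, 0.5], [0.5, -1.0]])  # arbitrary real symmetric matrices
B = np.array([[0.0, 1.0], [1.0, 2.0]])
lhs = np.trace(expm_herm(A + B)).real
rhs = np.trace(expm_herm(A) @ expm_herm(B)).real
print(lhs <= rhs)  # True: Golden-Thompson holds
```

Equality holds when $A$ and $B$ commute, which is consistent with the observation that the proposed algorithm reduces to Cover's method in the commuting case.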
C.-M. Lin and Y.-H. Li are supported by the Young Scholar Fellowship (Einstein Program) of the Ministry of Science and Technology of Taiwan under grant numbers MOST 109-2636-E-002-025 and MOST 110-2636-E-002-012. H.-C. Cheng is supported by the Young Scholar Fellowship (Einstein Program) of the Ministry of Science and Technology in Taiwan (R.O.C.) under grant numbers MOST 109-2636-E-002-001 and MOST 110-2636-E-002-009, and by the Yushan Young Scholar Program of the Ministry of Education in Taiwan (R.O.C.) under grant numbers NTU-109V0904 and NTU-110V0904.