## I Introduction

Recent advances in quantum information processing often require characterizing quantum states prepared during various stages of a procedure. As a result, the problem of characterizing a quantum state, more specifically a density matrix, from measurements on an ensemble of identical states, known as quantum state tomography (QST), has seen a surge of interest [Torlai_2018, Carrasquilla_2019, Huang_2020]. One of the key challenges is that, for $n$-qubit quantum systems, the density matrix is of size $2^n \times 2^n$. As the number of qubits becomes large, inferring the density matrix from a limited number of measurements becomes difficult.

Can we get away without fully characterizing the quantum state, by instead constructing an approximate classical description that predicts many different functions of the state accurately? Shadow Tomography [aaronson2018shadow] aims to do precisely this, namely, to predict a number of observables that grows as a power law in the number of qubits, $n$, from copies of the density matrix $\rho$. This idea was taken further by Huang et al. [Huang_2020], who constructed such a description with low sample complexity via classical shadows ($\hat{\rho}$), related to states without any entanglement in the appropriate basis, corresponding to each copy of $\rho$.

Quantum measurement requires specifying a set of Positive Operator Valued Measures (POVMs) [Nielsen] which is a generalization of a complete set of projection operators. The work by Huang et al. [Huang_2020] involves measurements via projection operators. Since projection operators are not informationally complete (see Sec. II.1), Huang et al. employ a set of random unitary transformations before taking measurements. In the work that follows, we directly employ a complete or overcomplete POVM system and perform shadow tomography. This simplification also gives us insight into the optimality of prediction over the choice of POVMs.

## II Generalized Measurements

A projective measurement is described by an observable, $O$, a Hermitian operator on the state space of the system being observed. The observable has a spectral decomposition $O = \sum_{\lambda} \lambda P_{\lambda}$, where $P_{\lambda}$ is the projector onto the eigenspace of $O$ with eigenvalue $\lambda$. The possible outcomes of the measurement correspond to the eigenvalues, $\lambda$, of the observable, and the outcome probability is $p(\lambda) = \mathrm{Tr}(P_{\lambda}\rho)$.

Projection Valued Measures (PVMs) are a special case of general measurements, where the measurement operators are Hermitian and orthogonal projectors. A set of Positive Operator Valued Measures (POVMs) [Nielsen] forms a generalization of PVMs. The index $a$ in the POVM element $M_a$ refers to the measurement outcomes that may occur in the experiment. The probability of the measurement outcome $a$ is given by $p(a) = \mathrm{Tr}(M_a\rho)$ and the post-measurement density matrix can be written as $\rho_a = A_a \rho A_a^{\dagger} / \mathrm{Tr}(M_a\rho)$, where $A_a$ are the Kraus operators [Nielsen] corresponding to the POVM, with $M_a = A_a^{\dagger}A_a$. The operators $M_a$ form a complete set of Hermitian non-negative operators. Namely, they satisfy $\langle\psi|M_a|\psi\rangle \geq 0$ for any vector $|\psi\rangle$, and $\sum_a M_a = I$. Such a POVM could be thought of as a partition of unity by non-negative operators.

### II.1 Informational completeness

The density matrix ($\rho$) is a Hermitian, unit-trace operator. If we have a $d$-dimensional system, $\rho$ will be a $d \times d$ complex square matrix represented by $d^2 - 1$ real parameters. The operator space for this $d$-dimensional operator will, however, be spanned by $d^2$ linearly independent basis operators. Note that PVMs only have projection operators. They are capable of providing only the diagonal elements of $\rho$ in a particular orthonormal basis, leaving out potential entanglement-related information from the off-diagonal elements. Thus, PVMs are examples of POVMs that are informationally undercomplete.

If the number of outcomes $m$ satisfies $m \geq d^2$, and we can form exactly $d^2$ linearly independent operators by linearly combining the POVM elements, such a POVM will be called informationally complete. However, in the most common terminology, informationally complete actually refers to the minimally complete POVM ($m = d^2$). If we proceed to reconstruct the density matrix for an informationally complete POVM, we can expand $\rho$ as

(1) |

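If the number of POVM outcomes equals $d^2$, the number of linearly independent basis operators, we have an informationally complete (minimally complete) basis set. However, if the number of outcomes exceeds $d^2$, the POVM forms an informationally overcomplete set [Renes_2004].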

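We start out by giving the example of a rather simple overcomplete set in the single-qubit Hilbert space, $d = 2$. The Pauli-6 POVM has 6 outcomes, $\{M_0 = \frac{1}{3}|0\rangle\langle 0|,\ M_1 = \frac{1}{3}|1\rangle\langle 1|,\ M_+ = \frac{1}{3}|+\rangle\langle +|,\ M_- = \frac{1}{3}|-\rangle\langle -|,\ M_r = \frac{1}{3}|r\rangle\langle r|,\ M_l = \frac{1}{3}|l\rangle\langle l|\}$, where $\{|0\rangle, |1\rangle\}$, $\{|+\rangle, |-\rangle\}$, and $\{|r\rangle, |l\rangle\}$ stand for the eigenbases of the Pauli operators $\sigma^z$, $\sigma^x$, and $\sigma^y$, respectively. Experimentally, it can be implemented directly by first randomly choosing $x$, $y$, or $z$, and then measuring the respective Pauli operator, which justifies the factor $\frac{1}{3}$. However, other probabilities will also be valid for this example of an overcomplete POVM.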
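As a quick numerical check of the statements above, the following sketch builds the six Pauli-6 elements and verifies the completeness relation and the fact that they span the full four-dimensional operator space. The helper names (`ket_projector`, `pauli6_povm`, `operator_rank`) are illustrative and not taken from the original work.

```python
import numpy as np

def ket_projector(vec):
    """Return the rank-1 projector |v><v| for a (possibly unnormalized) vector."""
    v = np.asarray(vec, dtype=complex)
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

# Eigenvectors of sigma^z, sigma^x and sigma^y, in that order.
PAULI6_KETS = [
    [1, 0], [0, 1],      # |0>, |1>
    [1, 1], [1, -1],     # |+>, |->
    [1, 1j], [1, -1j],   # |r>, |l>
]

def pauli6_povm():
    """Six sub-normalized projectors M_a = (1/3)|v_a><v_a|."""
    return [ket_projector(k) / 3.0 for k in PAULI6_KETS]

def operator_rank(povm):
    """Number of linearly independent operators spanned by the POVM elements."""
    flat = np.array([m.reshape(-1) for m in povm])
    return np.linalg.matrix_rank(flat)

povm6 = pauli6_povm()
# Distance of sum_a M_a from the identity (completeness relation).
completeness_error = np.linalg.norm(sum(povm6) - np.eye(2))
```

With six elements but only four linearly independent operators, the set is overcomplete in the sense defined above.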

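Now, let us give an example of a minimally complete POVM, the Pauli-4 POVM: $\{M_0 = \frac{1}{3}|0\rangle\langle 0|,\ M_+ = \frac{1}{3}|+\rangle\langle +|,\ M_r = \frac{1}{3}|r\rangle\langle r|,\ M_4 = I - M_0 - M_+ - M_r\}$, with $|+\rangle$ and $|r\rangle$ the $+1$ eigenstates of $\sigma^x$ and $\sigma^y$. As a sanity check for the completeness relation, one can see that $M_0 + M_+ + M_r + M_4 = I$. The experimental procedure will be similar to that of the Pauli-6 POVM, with an additional step where three different outcomes of Pauli-6 are identified as the single element of Pauli-4, $M_4 = \frac{1}{3}\left(|1\rangle\langle 1| + |-\rangle\langle -| + |l\rangle\langle l|\right)$. Thus, this set contains an element which is not a rank-1 projector.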
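The structure of the Pauli-4 set can be checked numerically. The sketch below assumes the fourth element is the identity minus the three rank-1 elements, confirms that it equals the three lumped Pauli-6 outcomes, and computes its two eigenvalues. Function and variable names are illustrative.

```python
import numpy as np

def projector(vec):
    """Rank-1 projector |v><v| for a (possibly unnormalized) vector."""
    v = np.asarray(vec, dtype=complex)
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

def pauli4_povm():
    m0 = projector([1, 0]) / 3        # (1/3)|0><0|
    mp = projector([1, 1]) / 3        # (1/3)|+><+|
    mr = projector([1, 1j]) / 3       # (1/3)|r><r|
    rest = np.eye(2) - m0 - mp - mr   # remaining weight, a rank-2 element
    return [m0, mp, mr, rest]

povm4 = pauli4_povm()

# The three Pauli-6 outcomes that are identified with the fourth element.
lumped = (projector([0, 1]) + projector([1, -1]) + projector([1, -1j])) / 3

rest_eigenvalues = np.linalg.eigvalsh(povm4[3])   # ascending order
span_rank = np.linalg.matrix_rank(np.array([m.reshape(-1) for m in povm4]))
```

Four elements spanning a four-dimensional operator space is exactly the minimally complete case; the fourth element has eigenvalues $(3 \pm \sqrt{3})/6$, so it is not proportional to a projector.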

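The third one is the tetrahedral POVM $M_{\mathrm{tetra}} = \{M_a = \frac{1}{4}(I + \mathbf{s}_a \cdot \boldsymbol{\sigma})\}_{a=0}^{3}$, whose outcomes correspond to sub-normalized rank-1 projectors along the directions $\mathbf{s}_0 = (0, 0, 1)$, $\mathbf{s}_1 = \left(\frac{2\sqrt{2}}{3}, 0, -\frac{1}{3}\right)$, $\mathbf{s}_2 = \left(-\frac{\sqrt{2}}{3}, \sqrt{\frac{2}{3}}, -\frac{1}{3}\right)$, and $\mathbf{s}_3 = \left(-\frac{\sqrt{2}}{3}, -\sqrt{\frac{2}{3}}, -\frac{1}{3}\right)$ in the Bloch sphere. Since the tetrahedron formed is regular, it forms an example of a symmetric informationally complete (SIC) POVM. The experimental implementation of $M_{\mathrm{tetra}}$ relies on Neumark’s dilation theorem. The theorem implies that $M_{\mathrm{tetra}}$ can be physically realized by coupling the system qubit to an ancillary qubit and performing a von Neumann measurement on the two qubits (see Refs. [Carrasquilla_2019, PhysRevA.86.062107] for explicit constructions).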
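The symmetry of the tetrahedral POVM can be made concrete with a short sketch. It assumes one common choice of tetrahedron directions and checks completeness and the equal pairwise overlaps $\mathrm{Tr}(M_a M_b)$ that characterize a SIC-POVM. The names `TETRA_DIRS`, `tetrahedral_povm` and `gram_matrix` are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SIGMA = np.array([
    [[0, 1], [1, 0]],       # sigma^x
    [[0, -1j], [1j, 0]],    # sigma^y
    [[1, 0], [0, -1]],      # sigma^z
], dtype=complex)

# Bloch directions of a regular tetrahedron (an assumed, commonly used choice).
TETRA_DIRS = np.array([
    [0.0, 0.0, 1.0],
    [2 * np.sqrt(2) / 3, 0.0, -1 / 3],
    [-np.sqrt(2) / 3, np.sqrt(2 / 3), -1 / 3],
    [-np.sqrt(2) / 3, -np.sqrt(2 / 3), -1 / 3],
])

def tetrahedral_povm():
    """M_a = (1/4)(I + s_a . sigma) for the four tetrahedron directions."""
    return [0.25 * (I2 + np.einsum("i,ijk->jk", s, SIGMA)) for s in TETRA_DIRS]

def gram_matrix(povm):
    """Overlap matrix T_ab = Tr(M_a M_b)."""
    return np.array([[np.trace(a @ b).real for b in povm] for a in povm])

tetra = tetrahedral_povm()
T = gram_matrix(tetra)
```

The overlap matrix has $1/4$ on the diagonal and $1/12$ everywhere else, and it is full rank, so the four elements are linearly independent.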

## III Classical Shadows with POVMs

Aaronson introduced the idea of “pretty good tomography” [Aaronson_2007], with the focus on predicting many observations accurately, based on copies of the density matrix. This idea parallels the “learnability” of quantum states in a Probably Approximately Correct (PAC) sense [PAC]. Proceeding along this line, he later introduced the concept of Shadow Tomography [aaronson2018shadow], where, from $N$ copies of the density matrix $\rho$, we want to predict $L$ different linear target functions up to an additive error less than $\epsilon$.

Huang et al. [Huang_2020] build their methods on the idea of Shadow Tomography [aaronson2018shadow]. They repeatedly perform a measurement procedure, i.e. apply a random unitary $U$ to rotate the state ($\rho \mapsto U\rho U^{\dagger}$) and perform a computational-basis measurement. Then, after the measurement, they apply the inverse of $U$ to the resulting computational basis state $|\hat{b}\rangle$. This procedure collapses $\rho$ to a snapshot $U^{\dagger}|\hat{b}\rangle\langle\hat{b}|U$, producing a quantum channel $\mathcal{M}$, which depends on the ensemble of (random) unitary transformations.

If the collection of unitaries $\mathcal{U}$ is tomographically complete, namely, if for each $\sigma \neq \rho$ there exist $U \in \mathcal{U}$ and $b$ such that $\langle b|U\sigma U^{\dagger}|b\rangle \neq \langle b|U\rho U^{\dagger}|b\rangle$, then $\mathcal{M}$, viewed as a linear map, has a unique inverse $\mathcal{M}^{-1}$. Huang et al. [Huang_2020] set

$$\hat{\rho} = \mathcal{M}^{-1}\left(U^{\dagger}|\hat{b}\rangle\langle\hat{b}|U\right). \tag{2}$$

Although the inverted channel $\mathcal{M}^{-1}$ is not physical (it is not completely positive), one can still apply $\mathcal{M}^{-1}$ to the (classically stored) measurement outcome $U^{\dagger}|\hat{b}\rangle\langle\hat{b}|U$ in a completely classical post-processing step. Even if an individual sample of $\hat{\rho}$ is not a density matrix, the expectation of the $\hat{\rho}$’s is the original density matrix $\rho$. One can use this property to get a good prediction of measurements performed on $\rho$.

If, instead of working with the computational basis measurements, we decide to use an informationally complete set of POVMs (Sec. II.1), we can avoid dealing with particular random unitary ensembles. The only thing we need to make sure is that the resulting channel is invertible.

### III.1 POVMs for the $n$-qubit system

From single-qubit POVMs $\{M_a\}$ with $m$ outcomes each, we introduce $m^n$ operators by taking tensor products and form POVMs for the $n$-qubit system: $\{M_{a_1} \otimes M_{a_2} \otimes \cdots \otimes M_{a_n}\}$. The outcomes of the measurements in this system are of the form $(a_1, a_2, \ldots, a_n)$. Now, we discuss how to form shadows from such an observation.

### III.2 A synthetic measurement channel

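Let the POVM elements be diagonalized as follows: $M_a = \sum_k \lambda_{a,k}|v_{a,k}\rangle\langle v_{a,k}|$, with $\lambda_{a,k} \geq 0$ since $M_a \geq 0$. Let $f$ be a strictly monotonic function which will be applied to the eigenvalues of the POVM elements. The function only needs to be defined for non-negative arguments, since the eigenvalues are non-negative. The probability of outcome ‘$a$’ is given as

$$p(a) = \mathrm{Tr}(M_a\rho). \tag{3}$$

Each time we perform a measurement and get an outcome ‘$a$’, we construct a pure output state $|v_{a,k}\rangle\langle v_{a,k}|$ with probability $f(\lambda_{a,k}) / \sum_{k'} f(\lambda_{a,k'})$. We assume each $M_a$ to be non-zero, guaranteeing that the denominator $\sum_{k'} f(\lambda_{a,k'})$ does not vanish. Although this is a synthetic channel, we will refer to it as the measurement channel, in analogy with the case where the $M_a$ are projections.

The measurement channel, for a single qubit, can be defined as

$$\mathcal{M}(\rho) = \sum_a \mathrm{Tr}(M_a\rho) \sum_k \frac{f(\lambda_{a,k})}{\sum_{k'} f(\lambda_{a,k'})}\,|v_{a,k}\rangle\langle v_{a,k}|. \tag{4}$$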

For simplicity, in the following discussion, we consider the case where the highest eigenvalue of each $M_a$ is non-degenerate. The modifications needed for the general case are obvious. If a particular POVM element $M_a$ is not a rank-one projector and the function $f$ is very steeply increasing, then the overwhelmingly likely output is $|v_a\rangle\langle v_a|$, where $|v_a\rangle$ is the eigenvector corresponding to the highest eigenvalue of $M_a$. An example of such a function is a power law $f(\lambda) = \lambda^{\beta}$ in the large $\beta$ limit. In the large $\beta$ limit, as we perform a measurement, the output is $|v_a\rangle\langle v_a|$ (snapshots) with probability $\mathrm{Tr}(M_a\rho)$. The measurement channel can be defined using a convex combination of the snapshots as

$$\mathcal{M}(\rho) = \sum_a \mathrm{Tr}(M_a\rho)\,|v_a\rangle\langle v_a|. \tag{5}$$

In a more general scheme, like the one mentioned in the beginning of the subsection, $|v_a\rangle$ is a random vector chosen according to a probability distribution. For example, in the current scheme, if the largest eigenvalue of $M_a$ is degenerate, we choose any one of the corresponding eigenvectors with equal probability.

In the formalism developed in [Huang_2020], the channel $\mathcal{M}$ and its inversion were related to the ensemble of (random) unitary transformations (e.g. the Clifford unitary ensemble). The condition of tomographic completeness depended on the existence of a unitary transformation in the chosen ensemble to distinguish different density matrices [Huang_2020]. However, with our reformulation of the measurement channel, we need to use an informationally complete set of POVMs (e.g. Pauli-6, see Sec. II).

In the example of a single qubit measured using the 6 projectors coming from the 3 Pauli matrices, i.e. the Pauli-6 POVM, the channel and its inverse can be explicitly computed. Similar to the classical shadows built out of random Pauli measurements [Huang_2020], we get a depolarizing channel, i.e. a channel that contracts a pure state (lying on the surface of the Bloch sphere) towards the ‘center’ of the Bloch sphere, namely, the maximally mixed state $I/2$. The inverse (a non-physical map) can be computed, which can map a point inside the Bloch ball to the outside.

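Multi-qubit system: For local measurements (not necessarily the depolarizing channel), the inverse channel for the $n$-qubit system can be written as

$$\mathcal{M}_n^{-1} = \bigotimes_{i=1}^{n} \mathcal{M}^{-1}. \tag{6}$$

We can now reformulate the shadows with our overcomplete POVM set and its corresponding channel. For instance, when we work with the Pauli-6 POVM, we will get

$$\hat{\rho} = \bigotimes_{i=1}^{n} \mathcal{M}^{-1}\left(|v_{a_i}\rangle\langle v_{a_i}|\right) = \bigotimes_{i=1}^{n} \left(3\,|v_{a_i}\rangle\langle v_{a_i}| - I\right), \tag{7}$$

where $\mathcal{M}^{-1}(X) = 3X - \mathrm{Tr}(X)\,I$ (see Sec. A.2). Note that the $2^n \times 2^n$ matrix $\hat{\rho}$ need not be constructed explicitly. We just need to store the outcome $a_i$ for each qubit $i$.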
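The following sketch illustrates the Pauli-6 shadow construction for a single qubit and for a product of stored labels. It assumes the inverse channel $X \mapsto 3X - \mathrm{Tr}(X) I$ and checks that averaging the snapshots over the outcome distribution returns the input state exactly. All function names are illustrative.

```python
import numpy as np

KETS = np.array([[1, 0], [0, 1], [1, 1], [1, -1], [1, 1j], [1, -1j]], dtype=complex)
KETS /= np.linalg.norm(KETS, axis=1, keepdims=True)
PROJ = np.einsum("ai,aj->aij", KETS, KETS.conj())   # projectors |v_a><v_a|
POVM = PROJ / 3.0                                    # Pauli-6 elements M_a
I2 = np.eye(2)

def inverse_channel(x):
    """Assumed single-qubit inverse of the Pauli-6 measurement channel."""
    return 3.0 * x - np.trace(x) * I2

def outcome_probabilities(rho):
    """p(a) = Tr(M_a rho) for the six outcomes."""
    return np.einsum("aij,ji->a", POVM, rho).real

def mean_snapshot(rho):
    """Exact expectation of the single-qubit shadow over measurement outcomes."""
    p = outcome_probabilities(rho)
    return sum(pa * inverse_channel(v) for pa, v in zip(p, PROJ))

def shadow_from_labels(labels):
    """Dense n-qubit shadow built from one stored outcome label per qubit."""
    out = np.array([[1.0 + 0j]])
    for a in labels:
        out = np.kron(out, inverse_channel(PROJ[a]))
    return out
```

A single snapshot such as `inverse_channel(PROJ[0])` has eigenvalues $2$ and $-1$, so it is not a physical state, while the mean over outcomes is. The dense `shadow_from_labels` is only for illustration; in practice storing the labels is enough.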

Since the inverted channel is not physical (it is not completely positive), the $\hat{\rho}$ in Eq. (7) need not be physical. In other words, there is no guarantee that the output of the inverse channel is positive semidefinite. See Fig. 1 for a schematic description. We recover the true density matrix only in expectation. However, if the shadow matrix is forced to be positive semidefinite, we can see how observations such as fidelity change (see Sec. IV.1).

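### III.3 Noisy shadow

Earlier, we defined our measurement channel, Eq. (5). However, we can also let each of our qubits pass through a previously characterized noise channel $\mathcal{E}$ and then take the measurements [koh2020classical]. The combined channel is given by

$$\mathcal{M}_{\mathcal{E}} = \mathcal{M} \circ \mathcal{E}. \tag{8}$$

We used an informationally complete set of POVMs to ensure that the measurement channel was invertible. As long as the action of the noise channel itself is invertible, $\mathcal{M}_{\mathcal{E}}$ is also invertible. We will work with an $n$-qubit noise channel of the form $\mathcal{E}^{\otimes n}$. Thus, we can still write the inverse of the new noisy measurement channel for the $n$-qubit system in terms of the single-qubit inverse shadow channel $\mathcal{M}^{-1}$:

$$\mathcal{M}_{\mathcal{E},n}^{-1} = \bigotimes_{i=1}^{n} \mathcal{E}^{-1} \circ \mathcal{M}^{-1}. \tag{9}$$

If we choose an amplitude damping channel with damping parameter $\gamma$, one of the Kraus operator representations can be given as

$$\mathcal{E}(\rho) = E_0 \rho E_0^{\dagger} + E_1 \rho E_1^{\dagger}, \tag{10}$$

where $E_0 = \begin{pmatrix} 1 & 0 \\ 0 & \sqrt{1-\gamma} \end{pmatrix}$, $E_1 = \begin{pmatrix} 0 & \sqrt{\gamma} \\ 0 & 0 \end{pmatrix}$.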
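To illustrate why a known, invertible noise channel poses no difficulty, the sketch below builds the standard amplitude-damping Kraus operators, represents the channel as a $4 \times 4$ matrix acting on the row-major vectorized state, and inverts it numerically. The superoperator construction is a generic linear-algebra device and is an assumption here, not necessarily the route taken in the original work.

```python
import numpy as np

def amplitude_damping_kraus(gamma):
    """Standard Kraus operators of the amplitude damping channel."""
    e0 = np.array([[1, 0], [0, np.sqrt(1 - gamma)]], dtype=complex)
    e1 = np.array([[0, np.sqrt(gamma)], [0, 0]], dtype=complex)
    return [e0, e1]

def apply_channel(kraus, rho):
    """E(rho) = sum_k E_k rho E_k^dagger."""
    return sum(k @ rho @ k.conj().T for k in kraus)

def superoperator(kraus):
    """Matrix S with vec(E(rho)) = S vec(rho), using row-major vectorization."""
    return sum(np.kron(k, k.conj()) for k in kraus)

def apply_inverse(kraus, x):
    """Apply the (generally non-physical) inverse channel to a 2x2 matrix x."""
    s_inv = np.linalg.inv(superoperator(kraus))
    return (s_inv @ x.reshape(-1)).reshape(x.shape)
```

For $\gamma < 1$ the matrix returned by `superoperator` is invertible, so `apply_inverse` can be composed with the inverse measurement channel in classical post-processing.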

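### III.4 Predicting linear functions with classical shadows

Using the statistical properties of a single shadow, we can predict linear functions in the unknown state $\rho$ as

$$o = \mathrm{Tr}(O\rho) = \mathbb{E}\left[\mathrm{Tr}(O\hat{\rho})\right]. \tag{12}$$

In practice, using an array of shadows (i.e. $N$ snapshots), we can estimate the expectation $\mathbb{E}[\mathrm{Tr}(O\hat{\rho})]$. Given an array of $N$ independent classical snapshots (each defined as in Eq. (7)):

$$S(\rho; N) = \{\hat{\rho}_1, \hat{\rho}_2, \ldots, \hat{\rho}_N\}. \tag{13}$$

The sample mean is $\hat{o} = \frac{1}{N}\sum_{i=1}^{N} \mathrm{Tr}(O\hat{\rho}_i)$. This sample mean will fluctuate around the true prediction, with $\mathbb{E}[\hat{o}] = \mathrm{Tr}(O\rho)$.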
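The sample-mean estimator can be sketched for a two-qubit example. The code below simulates Pauli-6 outcomes on a Bell state, stores only the outcome labels, and estimates $\langle \sigma^z \otimes \sigma^z \rangle$ from per-qubit factors $\mathrm{Tr}[\sigma^z(3|v_a\rangle\langle v_a| - I)]$ without ever forming the $4 \times 4$ shadow. The Bell state, the sample size and the function names are choices made for illustration.

```python
import numpy as np

KETS = np.array([[1, 0], [0, 1], [1, 1], [1, -1], [1, 1j], [1, -1j]], dtype=complex)
KETS /= np.linalg.norm(KETS, axis=1, keepdims=True)
PROJ = np.einsum("ai,aj->aij", KETS, KETS.conj())
Z = np.diag([1.0, -1.0])

# Tr(Z (3|v_a><v_a| - I)) = 3 <v_a|Z|v_a> for each single-qubit outcome a.
Z_FACTORS = np.array([3 * np.trace(Z @ p).real for p in PROJ])

def sample_outcomes(rho, n_samples, rng):
    """Draw two-qubit Pauli-6 outcome labels (a_1, a_2) from a 4x4 state rho."""
    probs = np.array([[np.trace(np.kron(PROJ[a], PROJ[b]) @ rho).real / 9.0
                       for b in range(6)] for a in range(6)]).reshape(-1)
    probs = np.clip(probs, 0.0, None)   # guard against tiny negative round-off
    probs = probs / probs.sum()
    flat = rng.choice(36, size=n_samples, p=probs)
    return np.stack([flat // 6, flat % 6], axis=1)

def estimate_zz(labels):
    """Sample mean of Tr[(Z x Z) rho_hat] computed from stored labels only."""
    return float(np.mean(Z_FACTORS[labels[:, 0]] * Z_FACTORS[labels[:, 1]]))

bell = np.zeros((4, 4))
bell[0, 0] = bell[0, 3] = bell[3, 0] = bell[3, 3] = 0.5   # (|00> + |11>)/sqrt(2)

rng = np.random.default_rng(0)
labels = sample_outcomes(bell, 20000, rng)
zz_estimate = estimate_zz(labels)
```

Each single-shot value is $0$ or $\pm 9$, so individual estimates are far from the true value $1$; only the mean over many snapshots converges to it.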

### III.5 The algorithm and the guarantee of performance

We want to predict the expected value of multiple $k$-local observables based on $N$ shadows using the two algorithms below.

The existence of the bound is guaranteed by the following theorem.

###### Theorem 1.

With $N = \mathcal{O}\!\left(c^2 \log(L/\delta)/\epsilon^2\right)$ samples of $\hat{\rho}$, we can predict $L$ different linear target functions up to additive error $\epsilon$ with maximum failure probability $\delta$.

The constant bound $c$ will depend on the measurement channel (which depends on the choice of POVM) and on the operator set $\{O_j\}$. The important thing is that $c$ is bounded for so-called $k$-local operators, as defined in [Huang_2020].

## IV Numerical Results

For many quantum systems in condensed matter physics, one of the objects of interest is the two-point correlation function. Two-point correlators can be efficiently estimated using classical shadows based on the Pauli-6 POVM. The predictions of two-point functions for GHZ states with varying degrees of noise are shown in Fig. 2.

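We can write the action of the single-qubit depolarizing noise [Nielsen] on an arbitrary $\rho = \frac{1}{2}(I + \mathbf{r}\cdot\boldsymbol{\sigma})$ written in the Bloch sphere representation:

$$\mathcal{E}_p(\rho) = (1-p)\,\rho + p\,\frac{I}{2} = \frac{1}{2}\left[I + (1-p)\,\mathbf{r}\cdot\boldsymbol{\sigma}\right]. \tag{14}$$

Applying this channel to every qubit, we generate a noisy GHZ state [greenberger2007going] from a pure one.
The expected two-point correlations vary as $(1-p)^2$ with the noise parameter $p$.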
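The dependence of the two-point correlation on the noise strength can be checked directly on a small system. The sketch below assumes the depolarizing channel in the form $\rho \mapsto (1-p)\rho + p\,I/2$ on each qubit, implemented through a Pauli twirl, and evaluates $\langle \sigma^z_i \sigma^z_j \rangle$ on the resulting noisy GHZ state. The parametrization and the helper names are assumptions made for illustration.

```python
import numpy as np
from functools import reduce

PAULIS = [np.eye(2, dtype=complex),
          np.array([[0, 1], [1, 0]], dtype=complex),
          np.array([[0, -1j], [1j, 0]], dtype=complex),
          np.array([[1, 0], [0, -1]], dtype=complex)]

def embed(op, site, n):
    """Operator acting as `op` on qubit `site` and as the identity elsewhere."""
    ops = [PAULIS[0]] * n
    ops = ops[:site] + [op] + ops[site + 1:]
    return reduce(np.kron, ops)

def depolarize_qubit(rho, site, n, p):
    """(1-p) rho + p (I/2 on `site`) x Tr_site(rho), written as a Pauli twirl."""
    twirl = sum(embed(s, site, n) @ rho @ embed(s, site, n) for s in PAULIS) / 4.0
    return (1 - p) * rho + p * twirl

def noisy_ghz(n, p):
    """GHZ state on n qubits with depolarizing noise applied to every qubit."""
    psi = np.zeros(2 ** n, dtype=complex)
    psi[0] = psi[-1] = 1 / np.sqrt(2)
    rho = np.outer(psi, psi.conj())
    for site in range(n):
        rho = depolarize_qubit(rho, site, n, p)
    return rho

def zz_correlation(rho, i, j, n):
    """Two-point function <Z_i Z_j> in the state rho."""
    return np.trace(embed(PAULIS[3], i, n) @ embed(PAULIS[3], j, n) @ rho).real
```

Under this parametrization each $\sigma^z$ is damped by a factor $(1-p)$, so the correlator of the noisy GHZ state equals $(1-p)^2$ for any pair of sites.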

While predicting multiple two-point or $k$-point correlations, we monitor the maximum error among all the observables. This measure of error is expected to go down with an increasing number of samples. This scaling, as seen in Fig. 3, gives us some idea of the appropriateness of a POVM set for a particular task.

#### IV.0.1 1D Transverse Field Ising Model

We take antiferromagnetic ( in Eq. (15) ) transverse field Ising model in 1D:

(15) |

The quantum critical point will be exhibited by the power-law decay of the correlations. See Fig. 4 for results in the three regimes: critical, ordered and paramagnetic. The exact numerical correlations are plotted using the matrix product representations of the ground states [Orus_2014]. In [Carrasquilla_2019] and [Luchnikov_2019], POVM-based measurements, followed by a neural-network-centric approach for constructing the ground state and computing the resulting two-point correlations, were presented for the same system.

#### IV.0.2 1D Disordered Heisenberg Model

The Hamiltonian for the 1D disordered Heisenberg model is given by,

(16) |

The properties of spin-$\frac{1}{2}$ antiferromagnetic chains with various types of random exchange coupling have been studied using decimation renormalization-group (strong-disorder) schemes, some of which involve generalizations or modifications of the scheme introduced by Dasgupta and Ma [1980PhRvB..22.1305D]. The numerical studies done by R.N. Bhatt and P.A. Lee [Bhatt_Lee] indicate that the system could be in a random-singlet phase. In such a phase, each spin is paired with another spin that may be far away on the lattice. We perform exact diagonalization, obtain the ground state and then compute two-point quantum correlations. The 2D plot of the correlation matrix also informs us about the locations of the singlet formations in the chain. With a sufficient number of shadows, we can also reconstruct this behavior of a ground state corresponding to one particular disorder realization of the XXZ-Heisenberg model, Eq. (16). See Fig. 5, where the singlet formations are indicated by the schematics drawn on the axes of the matrix visualization plots and the results from the two methods are compared.

### IV.1 Exploring quantum fidelity

In our approach to construct shadows using local POVMs, we ensure prediction of local observables. However, we can also explore non-local observables such as fidelity. Using the sample mean as an estimator, we can construct a hypothesis state ($\rho_H$):

$$\rho_H = \frac{1}{N}\sum_{i=1}^{N} \hat{\rho}_i. \tag{17}$$

When our target state is pure, we can rewrite quantum fidelity as a linear prediction with our target observable given as $O = |\psi\rangle\langle\psi|$. Starting from the definition of quantum fidelity, i.e. $F(\rho_H, \sigma) = \left(\mathrm{Tr}\sqrt{\sqrt{\sigma}\,\rho_H\sqrt{\sigma}}\right)^2$, using $\sigma = |\psi\rangle\langle\psi|$ we get $F = \langle\psi|\rho_H|\psi\rangle$. Further simplification of the fidelity gives us

$$\hat{F} = \frac{1}{N}\sum_{i=1}^{N} \langle\psi|\hat{\rho}_i|\psi\rangle.$$

The measure $\hat{F}$ is equivalent to fidelity only when the latter is defined, i.e. when $\rho_H \geq 0$. That property is likely to hold only when the number of samples is large. We expect $\hat{F}$ to fluctuate around its mean value 1, as seen in Fig. 6, even when the typical $\rho_H$ is not a physical state, meaning it is not positive semidefinite. Also, the fluctuation around this mean keeps growing exponentially with the number of qubits (see Fig. 6). This growth cannot be dealt with even by the median of means (MoM) procedure [Huang_2020] within the shadow formalism. Numerical computations using MoM also show no advantage over sample means here.

Hence, we need a procedure to find the ‘closest’ physical state to $\rho_H$. The trace condition ensures that once $\rho_H$ has negative eigenvalues, the positive eigenvalues must sum to more than 1 to compensate. Thus, we cannot just throw away the negative eigenvalues, as would be done for projecting a Hermitian matrix to the space of positive semidefinite matrices.

We define the convex set of physical states to be $\mathcal{P} = \{\sigma : \sigma \geq 0,\ \mathrm{Tr}\,\sigma = 1\}$. Our nonlinear projection to $\mathcal{P}$ is

$$\rho_P = \underset{\sigma \in \mathcal{P}}{\arg\min}\ \|\sigma - \rho_H\|_F. \tag{18}$$

We achieve this by diagonalizing $\rho_H$ and projecting the eigenvalues of $\rho_H$ onto a canonical simplex $\Delta^{D} = \{x \in \mathbb{R}^{D} : x_i \geq 0,\ \sum_i x_i = 1\}$, using the recipe from Ref. [proj_wang], while leaving the eigenvectors untouched. Here, $D = 2^n$, where $n$ is the total number of qubits. The projected state $\rho_P$ is a biased estimator. We can hope that the price paid by accepting some bias comes with the benefit of reduced variance. This expectation seems to be borne out in Fig. 7. However, as the number of qubits increases, the bias itself reduces fidelity. To compensate for this effect, we need larger sample sizes ($N$). Fig. 7 shows all these trends.
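The eigenvalue projection described above can be sketched as follows. The routine `project_to_simplex` follows the widely used sort-and-threshold algorithm for Euclidean projection onto the probability simplex, which is assumed here to match the cited recipe, and `project_to_physical_state` applies it to the spectrum while keeping the eigenvectors. Both names are illustrative.

```python
import numpy as np

def project_to_simplex(y):
    """Euclidean projection of a real vector onto {x : x_i >= 0, sum_i x_i = 1}."""
    y = np.asarray(y, dtype=float)
    u = np.sort(y)[::-1]                      # entries in decreasing order
    css = np.cumsum(u)
    j = np.arange(1, len(u) + 1)
    cond = u + (1.0 - css) / j > 0
    k = j[cond][-1]                           # largest index satisfying the condition
    shift = (1.0 - css[k - 1]) / k
    return np.maximum(y + shift, 0.0)

def project_to_physical_state(rho_hat):
    """Map a unit-trace Hermitian matrix to a density matrix with the same eigenvectors."""
    herm = 0.5 * (rho_hat + rho_hat.conj().T)   # symmetrize against round-off
    evals, evecs = np.linalg.eigh(herm)
    new_evals = project_to_simplex(evals)
    return (evecs * new_evals) @ evecs.conj().T
```

For example, the single-qubit snapshot with eigenvalues $(2, -1)$ is mapped to the pure state with eigenvalues $(1, 0)$: the excess above 1 is removed together with the negative eigenvalue, rather than simply discarding the negative part.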

## V Discussions

We provide an approach to predict expectations of local observables without having to apply random unitary transformations, which sometimes require complex circuits of their own and can become a practical bottleneck. We show that this can instead be done using an informationally complete set of POVMs. As illustrations, we show faithful reconstruction of properties of low-energy states coming from different many-body Hamiltonians relevant to near-term applications of quantum devices. When we have additional information about the possible noise channels, we also adapt the shadow channel as a composition of the noise channel and the measurement channel. The invertibility becomes straightforward in the proposed framework. We also comment on why the mean as an estimator is sufficient throughout our discussion. As long as we are dealing with local observables, we can provide efficient sample complexity using Hoeffding’s inequality directly.

We provided instances where the choice of POVM impacts the sample complexity for predicting 2-point correlators in certain quantum states for fixed maximum error. We noted that the different POVMs work better for different states. It is an exciting endeavour to understand which sets of POVM would be ideal for different classes of quantum states and observables.

As an exploration, we attempt to reconstruct fidelity using the locally built shadows and show that we cannot benefit from median of means as an estimator, since the variance of the fidelity estimate becomes exponential in the number of qubits. Additionally, when presented with few samples, we raise the issue of an unphysical (i.e. not positive semidefinite) hypothesis state and then provide a projection technique, similar to [Struchalin_2021], to estimate fidelity. Unfortunately, the estimator no longer remains unbiased. Addressing this issue would require methods to deal with non-local observables.

We did not provide an effective analog of the global Clifford unitary transformation-based method in [Huang_2020]. There has been work which provides description of global alternatives using stabilizer states [Struchalin_2021]. Whether there can be a scheme based on such states that is competitive with the classical shadows method [Huang_2020] remains to be seen.

The use of generalized measurement to unambiguously discriminate non-orthogonal states with lower failure probability is well known [Barnett:09, Nielsen, Chefles_2000]. Efficient prediction of expectations of local observables combined with the generalized measurement scheme to obtain the shadows can be used as an optimal framework in the discrimination of non-orthogonal states. In the future, it is a promising direction of exploration.

## Acknowledgement

We would like to thank Shagesh Sridharan, James Stokes, Miles Stoudenmire for insightful discussions.

## Appendix A Appendix

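### A.1 The Measurement Channel for Pauli-6

We can take the simple rank-1 Pauli-6 POVMs to see the action of a measurement channel:

$$\mathcal{M}(\rho) = \sum_a \mathrm{Tr}(M_a\rho)\,|v_a\rangle\langle v_a|, \tag{19}$$

where we use the Bloch representation $\rho = \frac{1}{2}(I + \mathbf{r}\cdot\boldsymbol{\sigma})$.

The first two POVM elements of Pauli-6 only get contributions from $I$ and $\sigma^z$, generating

$$\frac{1}{3}\left[\frac{1 + r_z}{2}\,|0\rangle\langle 0| + \frac{1 - r_z}{2}\,|1\rangle\langle 1|\right].$$

Using $|0\rangle\langle 0| = \frac{1}{2}(I + \sigma^z)$ and $|1\rangle\langle 1| = \frac{1}{2}(I - \sigma^z)$, this expression becomes $\frac{1}{6}(I + r_z\sigma^z)$. Following similar steps for the pairs $(|+\rangle, |-\rangle)$ and $(|r\rangle, |l\rangle)$, we get

$$\mathcal{M}(\rho) = \frac{1}{2}I + \frac{1}{6}\,\mathbf{r}\cdot\boldsymbol{\sigma} = \frac{1}{3}\left(\rho + I\right),$$

making $\mathcal{M}$ a depolarizing channel.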

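### A.2 Inverse of the measurement channel

Given any single-qubit channel, the inverse can be easily computed using the Bloch-sphere representation. We can write any two-dimensional (single-qubit) state acted on by a quantum operation ($\mathcal{E}$) as $\rho = \frac{1}{2}(I + \mathbf{r}\cdot\boldsymbol{\sigma})$. Any arbitrary trace-preserving quantum operation is given as $\mathcal{E}(\rho) = \sum_k E_k \rho E_k^{\dagger}$ with $\sum_k E_k^{\dagger}E_k = I$. The map is equivalent to

$$\mathbf{r} \mapsto \mathbf{r}' = T\,\mathbf{r} + \mathbf{t}. \tag{20}$$

The components of the displacement ($\mathbf{t}$) are given as $t_i = \frac{1}{2}\mathrm{Tr}[\sigma_i\,\mathcal{E}(I)]$, and the matrix elements of $T$ as $T_{ij} = \frac{1}{2}\mathrm{Tr}[\sigma_i\,\mathcal{E}(\sigma_j)]$. The affine map between the Bloch sphere and itself is given by $\mathbf{r} \mapsto T\mathbf{r} + \mathbf{t}$, and its meaning is understood better by doing a singular value decomposition, i.e. $T = O_1 D\, O_2^{T}$, where $O_1$ and $O_2$ are orthogonal matrices. The singular values capture the deformation of the Bloch sphere about its principal axes. A superoperator acting on $(1, \mathbf{r})^{T}$ can be defined as

$$\begin{pmatrix} 1 \\ \mathbf{r}' \end{pmatrix} = \begin{pmatrix} 1 & \mathbf{0}^{T} \\ \mathbf{t} & T \end{pmatrix} \begin{pmatrix} 1 \\ \mathbf{r} \end{pmatrix}. \tag{21}$$

Computing the inverse of the channel is equivalent to writing $\mathbf{r}$ from $\mathbf{r}'$, i.e. computing $\mathbf{r} = T^{-1}(\mathbf{r}' - \mathbf{t})$.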
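The Bloch-sphere inversion can be sketched generically: given any linear single-qubit map as a Python callable, extract the affine data $(T, \mathbf{t})$ by probing it with the identity and the Pauli matrices, then solve the linear system to undo it. The sketch uses the Pauli-6 measurement channel as a test case. The names `bloch_affine_map`, `invert_on_bloch_vector` and `pauli6_channel` are illustrative.

```python
import numpy as np

SIGMA = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
I2 = np.eye(2, dtype=complex)

def bloch_affine_map(channel):
    """Return (T, t) such that the channel maps the Bloch vector r to T r + t."""
    t = np.array([np.trace(s @ channel(I2 / 2)).real for s in SIGMA])
    T = np.zeros((3, 3))
    for j, sj in enumerate(SIGMA):
        out = channel(sj / 2)   # image of the traceless direction r = e_j
        T[:, j] = [np.trace(si @ out).real for si in SIGMA]
    return T, t

def invert_on_bloch_vector(T, t, r_out):
    """Recover r from r' = T r + t."""
    return np.linalg.solve(T, np.asarray(r_out, dtype=float) - t)

def pauli6_channel(rho):
    """Measurement channel sum_a Tr(M_a rho)|v_a><v_a| for the Pauli-6 POVM."""
    kets = np.array([[1, 0], [0, 1], [1, 1], [1, -1], [1, 1j], [1, -1j]], dtype=complex)
    out = np.zeros((2, 2), dtype=complex)
    for v in kets:
        v = v / np.linalg.norm(v)
        proj = np.outer(v, v.conj())
        out += np.trace(proj @ rho) / 3.0 * proj
    return out
```

For the Pauli-6 channel this gives $T = I/3$ and $\mathbf{t} = 0$, i.e. a uniform contraction of the Bloch ball by a factor of three, so the inverse stretches Bloch vectors by three and can push them outside the ball.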

In the main text, the Pauli measurement channel ($\mathcal{M}$) turns out to be a depolarizing channel, and its inverse that acts on the local qubit is given as

$$\mathcal{M}^{-1}(X) = 3X - \mathrm{Tr}(X)\,I.$$

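We take a more general example following our definition of a measurement channel:

$$\mathcal{M}(\rho) = \sum_a \mathrm{Tr}(M_a\rho)\,|v_a\rangle\langle v_a|.$$

If a particular POVM element $M_a$ is not rank one, $|v_a\rangle$ can be taken as the eigenvector corresponding to the highest eigenvalue of $M_a$. For Pauli-4, except for the fourth element $M_4 = I - \frac{1}{3}\left(|0\rangle\langle 0| + |+\rangle\langle +| + |r\rangle\langle r|\right)$, all other elements are rank-1. Since $M_4$ is rank-2, when the outcome is $4$, we take the eigenvector corresponding to eigenvalue $(3+\sqrt{3})/6$ instead of the other corresponding to $(3-\sqrt{3})/6$. Rewriting the channel definition above, we get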

(22) |

The inverse of the channel can be written as

(23) |

When we are working with a known noise channel $\mathcal{E}$, the inverse is given as $(\mathcal{M}\circ\mathcal{E})^{-1} = \mathcal{E}^{-1}\circ\mathcal{M}^{-1}$. If we choose an amplitude damping channel with a damping parameter $\gamma$, the inverse can be given as

(24) |

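### A.3 Sample complexity

#### A.3.1 Variance of the Estimate for a Single Observable

Given an array of $N$ independent, classical snapshots (each defined as in Eq. (7)):

$$S(\rho; N) = \{\hat{\rho}_1, \hat{\rho}_2, \ldots, \hat{\rho}_N\}. \tag{25}$$

The sample mean is $\hat{o} = \frac{1}{N}\sum_{i=1}^{N}\mathrm{Tr}(O\hat{\rho}_i)$. The bound on the probability of deviation of the sample mean is given by Chebyshev’s inequality:

$$\Pr\left[\left|\hat{o} - \mathrm{Tr}(O\rho)\right| \geq \epsilon\right] \leq \frac{\mathrm{Var}[\hat{o}]}{\epsilon^2} = \frac{\mathrm{Var}[\mathrm{Tr}(O\hat{\rho})]}{N\epsilon^2}, \tag{26}$$

where $\rho$ is the true density matrix. Fluctuations of $\hat{o}$ around this desired expectation are controlled by the variance, $\mathrm{Var}[\hat{o}] = \mathbb{E}\left[\left(\hat{o} - \mathrm{Tr}(O\rho)\right)^2\right]$. However, since the classical shadows are unit trace by construction, the variance depends only on the traceless part of the observable, i.e. $O_0 = O - \frac{\mathrm{Tr}(O)}{2^n}I$. The minimum number of samples needed to assure a maximum failure probability ($\delta$) using Eq. (26) is

$$N \geq \frac{\mathrm{Var}[\mathrm{Tr}(O_0\hat{\rho})]}{\delta\,\epsilon^2}. \tag{27}$$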

#### A.3.2 Dependence on POVM

Given a measurement channel and an observable, we can bound the variance of its estimator, using familiar maneuvers with superoperators [Huang_2020],

We broadly define a $k$-local Pauli observable as an operator which acts nontrivially on only $k$ qubits. Traceless local operators can be expressed as linear combinations of tensor products of identity matrices and $k$ or fewer Pauli matrices. Hence, we need to focus only on a special class of $k$-local operators. Denoting by $P_i$ one of the Pauli matrices acting on the $i$th qubit, we focus on tensor products like $P_1 \otimes P_2 \otimes \cdots \otimes P_k \otimes I^{\otimes (n-k)}$, where, without loss of generality, we assume that the operator acts non-trivially on only the first $k$ qubits.

For Pauli-6 POVM, the inverse of the measurement channel is a self-adjoint map, and thus one can verify its action as:

where denotes a Pauli matrix and . Given a -local observable, we can further compute the bound on variance:

Now, we take up Pauli-4 POVM. One can verify the action of as:

where denotes a Pauli matrix. Using the fact that is a trace preserving map, one can say its adjoint has to be unital . Given a -local Pauli-observable, one can again compute the bound on variance:

Clearly, the above bound on variance is dependent on the state , unlike the bound we obtained using Pauli-6 POVM. Since is a density matrix and the operator is a PSD operator, one gets the minimum value for the bound when is of the form:

(28) |

where is the projector into the eigenvector corresponding to the lowest eigenvalue of the operator , and is a valid density matrix in the Hilbert space of qubits on which the -local Pauli-observable acts trivially. For the above , it is simple to verify that the value of the variance bound is 1 (independent of ). Thus, for example, if the unknown state is the all spin down state, then Pauli-4 POVM works better than Pauli-6 POVM in predicting two-point correlators , since the variance is higher in the latter.

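#### A.3.3 Improved Bound Using Hoeffding’s Inequality

Furthermore, we can use Hoeffding’s inequality to provide theoretical bounds when we are dealing with a $k$-local Pauli observable, since we are working with bounded random variables. If $|X_i| \leq c$ for all $i$, where $X_i = \mathrm{Tr}(O\hat{\rho}_i)$, we can write

$$\Pr\left[\left|\hat{o} - \mathrm{Tr}(O\rho)\right| \geq \epsilon\right] \leq 2\exp\left(-\frac{N\epsilon^2}{2c^2}\right). \tag{29}$$

The minimum number of samples needed to assure a maximum failure probability ($\delta$) using Eq. (29) is

$$N \geq \frac{2c^2}{\epsilon^2}\log\frac{2}{\delta}. \tag{30}$$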

The bound $c$ on the magnitude of the single-shot estimates depends on the locality of the observable and the maximum eigenvalue of the inverse channel acting on the observable. The bound on the random variable can be found as the range of the Rayleigh quotient of the inverse of the measurement channel acting on the observable, over all possible states. For instance, if we choose Pauli-6, the single-shot estimates of a $k$-local Pauli observable can be shown to lie within $[-3^k, 3^k]$, in which case $c = 3^k$. Using the action of $\mathcal{M}^{-1}$, one can verify that the value of the random variable $\mathrm{Tr}(P\hat{\rho})$ belongs to the set $\{-1, 5\}$ for any Pauli matrix $P$ when $\hat{\rho}$ is the inferred single-qubit state for Pauli-4. Thus, the random variable is contained in the range $[-5^k, 5^k]$, which is exponential in the locality rather than in the number of qubits.

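#### A.3.4 The Guarantee of Performance for Multiple Observables

Suppose we have $L$ different $k$-local Pauli observables, with the sample mean corresponding to the observable $O_j$ defined as $\hat{o}_j = \frac{1}{N}\sum_{i=1}^{N}\mathrm{Tr}(O_j\hat{\rho}_i)$. If $|\mathrm{Tr}(O_j\hat{\rho}_i)| \leq c$ for all $i$ and $j$, where $i = 1, \ldots, N$ and $j = 1, \ldots, L$, we can combine the union bound with Hoeffding’s inequality to write

$$\Pr\left[\max_{j}\left|\hat{o}_j - \mathrm{Tr}(O_j\rho)\right| \geq \epsilon\right] \leq 2L\exp\left(-\frac{N\epsilon^2}{2c^2}\right). \tag{31}$$

The minimum number of samples needed to assure a maximum failure probability ($\delta$) among all the observables using Eq. (31) is

$$N \geq \frac{2c^2}{\epsilon^2}\log\frac{2L}{\delta}. \tag{32}$$

The scaling is logarithmic in the number of observables $L$, instead of the linear behavior we get using Chebyshev’s inequality. We do not need to use the MoM procedure [Huang_2020], which would have been necessary if we were dealing with estimate distributions with long tails (unlike the bounded estimates for $k$-local Pauli observables).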
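The difference between the two scalings can be made concrete with a small calculator. The sketch assumes single-shot estimates bounded in magnitude by `bound` for the Hoeffding case, and a union bound with failure probability $\delta/L$ per observable for the Chebyshev case. The function names and the constants in the formulas are those of the textbook inequalities under these assumptions, not necessarily the conventions of the original derivation.

```python
import math

def chebyshev_samples(variance, epsilon, delta, n_observables=1):
    """Chebyshev plus union bound: N >= L * Var / (delta * eps^2)."""
    return math.ceil(n_observables * variance / (delta * epsilon ** 2))

def hoeffding_samples(bound, epsilon, delta, n_observables=1):
    """Hoeffding plus union bound for |X| <= bound: N >= (2 bound^2 / eps^2) ln(2L / delta)."""
    return math.ceil(2 * bound ** 2 / epsilon ** 2
                     * math.log(2 * n_observables / delta))

# Example: 2-local Pauli observables with Pauli-6 (|X| <= 9, variance at most 9).
n_hoeffding = hoeffding_samples(9, 0.1, 0.05, n_observables=100)
n_chebyshev = chebyshev_samples(9, 0.1, 0.05, n_observables=100)
```

Doubling the number of observables adds only a fixed increment of $(2c^2/\epsilon^2)\ln 2$ samples to the Hoeffding count, whereas the Chebyshev count doubles.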
