Quantum theory (QT) is one of the most fundamental, and accurate, mathematical descriptions of our physical world. It dates back to the 1920s, and despite the nearly one century that has passed since its inception, we still do not have a clear understanding of the theory. In particular, we cannot fully grasp its meaning: why it is the way it is. As a consequence, we cannot come to terms with the many paradoxes it appears to lead to — its so-called “quantum weirdness”.
This paper aims at finally explaining QT while giving a unified reason for its many paradoxes. We pursue this goal by having QT follow from two main postulates:
The theory should be logically consistent.
Inferences in the theory should be computable in polynomial time.
The first postulate is what we essentially require of each well-founded mathematical theory, be it physical or not: it has to be based on a few axioms and rules from which we can unambiguously derive its mathematical truths. The second postulate will turn out to be central. It requires that there be an efficient way to execute the theory on a computer.
QT is an abstract theory that can be studied detached from its physical applications. For this reason, people often wonder which part of QT actually pertains to physics. In our representation, the answer to this question shows itself naturally: the computation postulate defines the physical component of the theory. But it is actually stronger than that: it states that computation is more primitive than physics.
Let us recall that QT is widely regarded as a “generalised” theory of probability. In this paper we make the adjective “generalised” precise. In fact, our coherence postulate leads to a theory of probability, in the sense that it disallows “Dutch books”: this means, in gambling terms, that a bettor on a quantum experiment cannot be made a sure loser by exploiting inconsistencies in their probabilistic assessments. But probabilistic inference is in general NP-hard. By imposing the additional postulate of computation, the theory becomes one of “computational rationality”: one that is consistent (or coherent), up to the degree that polynomial computation allows. This weaker, and hence more general, theory of probability is QT.
As a result, for a subject living inside QT, all is coherent. For us, living in the classical, and somewhat idealised, probabilistic world (not restricted by the computation postulate), QT displays some inconsistencies: precisely those that cannot be fixed in polynomial time. All quantum paradoxes, and entanglement in particular, arise from the clash of these two world views: i.e., from trying to reconcile an unrestricted theory (i.e., classical physics) with a theory of computational rationality (quantum theory). Or, in other words, from regarding physics as fundamental rather than computation.
But there is more to it. We show that the theory is “generalised” also in another direction, as QT turns out to be a theory of “imprecise” probability: in fact, requiring the computation postulate is similar to defining a probabilistic model using only a finite number of moments, and therefore, implicitly, to defining the model as the set of all probabilities compatible with the given moments. In QT, some of these compatible probabilities can actually be signed, that is, they allow for “negative probabilities”. In our setting, these have no meaning per se; they are just a mathematical consequence of polynomially bounded coherence (or rationality).
De Finetti’s subjective foundation of probability [finetti1937] is based on the notion of rationality (consistency or coherence). This approach was then further developed in [williams1975, walley1991], giving rise to the so-called theory of desirable gambles (TDG). This is an equivalent reformulation of the well-known Bayesian decision theory (à la Anscombe–Aumann [anscombe1963]) once it is extended to deal with incomplete preferences [zaffalon2017a, zaffalon2018a]. In this setting probability is a derived notion, in the sense that it can be inferred via mathematical duality from a set of logical axioms that one can interpret as rationality requirements on the way a subject, let us call her Alice, accepts gambles on the results of an uncertain experiment. It goes as follows.
Let denote the possibility space of an experiment (e.g., or in QT). A gamble on is a bounded real-valued function of , interpreted as an uncertain reward. It plays the traditional role of variables or, in physical parlance, of observables. In the context we are considering, the acceptance of a gamble by an agent is regarded as a commitment to receive, or pay (depending on the sign), utiles (abstract units of utility; we can approximately identify them with money provided we deal with small amounts of it [finetti1974, Sec. 3.2.5]) whenever occurs. If by we denote the set of all the gambles on , the subset of all non-negative gambles, that is, of gambles for which Alice is never expected to lose utiles, is given by . Analogously, the set of negative gambles, those for which Alice will certainly lose some utiles, even just an epsilon, is defined as . In what follows, with we denote a finite set of gambles that Alice finds desirable (we will comment on the case when may not be finite): these are the gambles that she is willing to accept and thus commits herself to the corresponding transactions.
The crucial question is now to provide a criterion for a set of gambles representing assessments of desirability to be called rational. Intuitively, Alice is rational if she avoids sure losses: that is, if, by considering the implications of what she finds desirable, she is not forced to find desirable a negative gamble. This postulate of rationality is called “no arbitrage” in economics and “no Dutch book” in the subjective foundation of probability. In TDG we formulate it through the notion of logical consistency which, despite the informal interpretation given above, is a purely syntactical (structural) notion. To show this, we need to define an appropriate logical calculus (characterising the set of gambles that Alice must find desirable as a consequence of having desired in the first place) and, based on it, to characterise the family of consistent sets of assessments.
For the former, since non-negative gambles may increase Alice’s utility without ever decreasing it, we first have that:
should always be desirable.
This defines the tautologies of the calculus. Moreover, whenever are desirable for Alice, then any positive linear combination of them should also be desirable (this amounts to assuming that Alice has a linear utility scale, which is a standard assumption in probability). Hence the corresponding deductive closure of a set is given by:
Here “” denotes the conic hull operator. When is not finite, 1 requires in addition that is closed.
In the betting interpretation given above, a sure loss for an agent is represented by a negative gamble. We therefore say that:
Definition 1 (Coherence postulate).
A set of desirable gambles is coherent if and only if
Note that is coherent if and only if ; therefore can be regarded as playing the role of the Falsum and A can be reformulated as .
Based on it we derive the axioms of classical probability theory. Assume is coherent. We give it a probabilistic interpretation by observing that the mathematical dual of is a closed convex set of probabilities:
where is the set of all probabilities in , and the set of all charges (a charge is a finitely additive signed measure [aliprantisborder, Ch.11]) on . We have derived the axioms of probability—a non-negative function that integrates to one—from the coherence postulate A. Hence, whenever an agent is coherent, Equation (1) states that desirability corresponds to non-negative expectation (for all probabilities in ). When is incoherent, turns out to be empty—there is no probability compatible with the assessments in . As simple as it looks, expression A alone therefore captures the coherence postulate as formulated in the Introduction in the case of classical probability theory.
The problem of checking whether or not is coherent can be formulated as the following decision problem:
If the answer is “yes”, then the gamble belongs to , proving ’s incoherence. Actually any inference task can ultimately be reduced to a problem of the form (2), as discussed in the Supplementary 2.2. Hence, the above decision problem unveils a crucial fact: the hardness of inference in classical probability corresponds to the hardness of evaluating the non-negativity of a function in the considered space (let us call this the “non-negativity decision problem”).
When is infinite (in this paper we consider the case ) and for generic functions, the non-negativity decision problem is undecidable. To avoid such an issue, we may impose restrictions on the class of allowed gambles and thus define on an appropriate subspace of (see Appendix B in Supplementary). For instance, instead of , we may consider : the class of multivariate polynomials of degree at most (we denote by the subset of non-negative polynomials and by the negative ones). In doing so, by Tarski–Seidenberg quantifier elimination theory [tarski1951decision, seidenberg1954new], the decision problem becomes decidable, but it remains intractable, being NP-hard in general. If we accept the so-called “Exponential Time Hypothesis” (that P≠NP) and we require that inference should be tractable (in P), we are stuck. What to do? A solution is to change the meaning of “being non-negative” for a function by considering a subset for which the membership problem in (2) is in P.
In other words, a computationally efficient TDG, which we denote by , should be based on a logical redefinition of the tautologies, i.e., by stating that
should always be desirable,
in the place of A. The rest of the theory can develop following in the footsteps of . In particular, the deductive closure for is defined by:
And the coherence postulate, which now naturally encompasses the computation postulate, states that:
Definition 2 (P-coherence).
A set of desirable gambles is P-coherent if and only if
P-coherence owes its name to the fact that, whenever contains all positive constant gambles, a can be checked in polynomial time by solving:
where denotes the constant function such that for all .
Hence, and have the same deductive apparatus; they just differ in the considered set of tautologies, and thus in their (in)consistencies.
Since is a topological vector space, we can consider its dual space of all bounded linear functionals . Hence, with the additional condition that linear functionals preserve the unitary gamble, the dual cone of a P-coherent is given by
To we can then associate its extension in , that is, the set of all charges on extending an element in . In general, however, this set does not yield a classical probabilistic interpretation of . This is because, whenever , there are negative gambles that cannot be proved to be negative in polynomial time:
Theorem 1.
Assume that includes all positive constant gambles and that it is closed (in ). Let be a P-coherent set of desirable gambles. The following statements are equivalent:
includes a negative gamble that is not in .
is incoherent, and thus is empty.
is not (the restriction to of) a closed convex set of mixtures of classical evaluation functionals.
The extension of in the space of all charges in includes only charges that are not probabilities (they have some negative value).
Theorem 1 is the central result of this paper. It states that whenever
includes a negative gamble (item 1), there is no classical probabilistic interpretation for it (item 2). The other points suggest alternative solutions to overcome this deadlock: either to change the notion of evaluation functional (item 3) or to use quasi-probabilities (probability distributions that admit negative values) as a means for interpreting it (item 4).
In what follows, we are going to show that QT can be deduced from a particular instance of the theory . As a consequence, we get that the computation postulate, and in particular a, is not only the unique non-classical postulate of QT, regarded as a theory of probability, but also the unique reason for all its paradoxes, which all boil down to a rephrasing of the various statements of Theorem 1 in the considered quantum context.
QT as computational rationality
Consider first a single-particle system with -degrees of freedom and
We can interpret an element as “input data” for some classical preparation procedure. For instance, in the case of the spin- particle (), if is the direction of a filter in the Stern-Gerlach experiment, then is its one-to-one mapping into (apart from a phase term). For spin greater than , the variable associated with the preparation procedure cannot be directly interpreted in terms of “filter direction” alone. Nevertheless, at least at the formal level, plays the role of a “hidden variable” in our model and of the possibility space . This hidden-variable model for QT is also discussed in [holevo2011probabilistic, Sec. 1.7], where the author explains why this model does not contradict the existing “no-go” theorems for hidden variables; see also Supplementary 7.3 and Supplementary 7.4.
In QT any real-valued observable is described by a Hermitian operator. This naturally imposes restrictions on the type of functions in (2):
where and , with being the set of Hermitian matrices of dimension . Since is Hermitian and is bounded (), is a real-valued bounded function ( in bra-ket notation).
More generally speaking, we can consider composite systems of particles, each one with degrees of freedom. The possibility space is the Cartesian product and the functions are -quadratic forms:
with , and where
denotes the tensor product between vectors regarded as column matrices. Notice that in our setting the tensor product is ultimately a derived notion, not a primitive one (see Supplementary 7.4), as it follows from the properties of -quadratic forms.
For (a single particle), evaluating the non-negativity of the quadratic form boils down to checking whether the matrix is Positive Semi-Definite (PSD) and therefore can be performed in polynomial time. This is no longer true for : indeed, in this case there exist polynomials of type (5) that are non-negative, but whose matrix
is indefinite (it has at least one negative eigenvalue). Moreover, it turns out that problem (2) is not tractable:
Proposition 1 ([gurvits2003classical]).
The problem of checking the non-negativity of functions of type (5) is NP-hard for .
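The contrast with the single-particle case can be sketched numerically: for one particle, deciding non-negativity of the quadratic form amounts to a polynomial-time PSD check on the Hermitian matrix (an eigendecomposition suffices). A minimal sketch, with illustrative matrices not taken from the paper:

```python
import numpy as np

def is_psd(G, tol=1e-12):
    """Check positive semi-definiteness of a Hermitian matrix
    via its (real) eigenvalues -- a polynomial-time operation."""
    return bool(np.all(np.linalg.eigvalsh(G) >= -tol))

# Illustrative Hermitian matrices (not from the paper):
G_pos = np.array([[2.0, 1j], [-1j, 2.0]])    # eigenvalues 1 and 3 -> PSD
G_ind = np.array([[1.0, 0.0], [0.0, -1.0]])  # indefinite

assert is_psd(G_pos)      # x† G x >= 0 for every vector x
assert not is_psd(G_ind)  # the quadratic form takes negative values
```

For two or more particles this simple check no longer settles non-negativity of functions of type (5), which is exactly the NP-hardness stated in Proposition 1.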
What to do? As discussed previously, a solution is to change the meaning of “being non-negative” by considering a subset for which the membership problem, and thus (2), is in P. For functions of type (5), we can extend the notion of non-negativity that holds for a single particle to particles:
That is, the function is “non-negative” whenever is PSD (note that is the so-called cone of Hermitian sum-of-squares polynomials, see Supplementary 7.2, and that, in the non-negative constant functions have the form with ). Now, consider any set of desirable gambles satisfying a–a with the given definition of a: Eureka! We have just derived the first postulate of QT (see Postulate 1 in [nielsen2010quantum, p. 110]). Indeed, let be a finite set of assessments, and the deductive closure as defined by 1; it is not difficult to prove that the dual of is
where is the set of all density matrices. As before, whenever the set representing Alice’s beliefs about the experiment is coherent, Equation (6) means that desirability implies non-negative “expected value” for all models in . Note that in QT the expectation of is . This follows from Born’s rule, a law giving the probability that a measurement on a quantum system will yield a given result. Agreement with Born’s rule is an important constraint on any alternative axiomatisation of QT. Our theory agrees with it, although it is a derived notion in our setting. In fact, viewing a density matrix as a dual operator, is formally equal to
with defined in (3). Hence, when a projection-valued measurement characterised by the projectors is considered, then
Since and the polynomials for form a partition of unity, i.e.:
we have that
For this reason, is usually interpreted as a probability. But the projectors ’s are not indicator functions, whence, strictly speaking, the traditional interpretation is incorrect. This can be seen clearly in the special case where postulates A and a coincide, as for a single particle, that is, where the theory can be given a classical probabilistic interpretation, see Supplementary 7.3. In such a case, the corresponding is just a (truncated) moment matrix, i.e., one for which there is at least one probability such that . In summary, our standpoint here is that should rather be interpreted as the expectation of the -quadratic form . This makes quite a difference from the traditional interpretation, since in our case there can be (and usually there will be) more than one charge compatible with such an expectation, as we will point out more precisely later on.
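The agreement with Born’s rule described above can be checked on a toy instance: for a pure state, the dual-operator expectation Tr(ρΠ) coincides with the direct evaluation of the quadratic form, and the projector “probabilities” sum to one because the projectors form a partition of unity. The state and projectors below are illustrative choices, not the paper’s:

```python
import numpy as np

psi = np.array([1.0, 1.0]) / np.sqrt(2)   # the |+> state
rho = np.outer(psi, psi.conj())           # density matrix |psi><psi|

P0 = np.array([[1.0, 0.0], [0.0, 0.0]])   # projector onto |0>
P1 = np.array([[0.0, 0.0], [0.0, 1.0]])   # projector onto |1>

p0 = np.trace(rho @ P0).real              # Born rule: Tr(rho Pi)
p1 = np.trace(rho @ P1).real

assert np.isclose(p0, 0.5) and np.isclose(p1, 0.5)
assert np.isclose(p0 + p1, 1.0)           # partition of unity
# Tr(rho Pi) equals the direct evaluation of the quadratic form psi† Pi psi
assert np.isclose(p0, (psi.conj() @ P0 @ psi).real)
```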
Hence again, since both 1,a and 1,A are the same logical postulates parametrised by the appropriate meaning of “being negative/non-negative”, the only axiom truly separating classical probability theory from the quantum one is a (with the specific form of a), thus implementing the requirement of computational efficiency.
Entanglement is usually presented as a characteristic of QT. In this section we are going to show that it is actually an immediate consequence of computational tractability, meaning that entanglement phenomena are not confined to QT but can be observed in other contexts too. An example of a non-QT entanglement is provided in Supplementary 6.
To illustrate the emergence of entanglement from P-coherence, we verify that the set of desirable gambles whose dual is an entangled density matrix includes a negative gamble that is not in , and thus, although being logically coherent, it cannot be given a classical probabilistic interpretation.
In what follows we focus only on bipartite systems , with . The results are nevertheless general.
Let , where and . We aim at showing that there exists a gamble satisfying:
The first inequality says that is desirable in . That is, is a gamble desirable to Alice whose beliefs are represented by . The second inequality says that is negative and, therefore, leads to a sure loss in . By a–a, the inequalities in (8) imply that must be an indefinite Hermitian matrix.
Assume that and consider the entangled density matrix:
and the Hermitian matrix:
This matrix is indefinite (its eigenvalues are ) and is such that . Since , the gamble
is desirable for Alice in .
Let and with , for , denote the real and imaginary components of . Then
This is the essence of the quantum puzzle: is P-coherent but (Theorem 1) there is no associated to it and therefore, from the point of view of a classical probabilistic interpretation, it is not coherent (in any classical description of the composite quantum system, the variables appear to be entangled in a way unusual for classical subsystems).
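Since the paper’s specific matrices are not reproduced above, a standard stand-in exhibits the same phenomenon: take the Bell state as the entangled density matrix and a witness-derived gamble matrix. The sketch below (assumed example, not the paper’s) checks that G is indefinite, that the gamble is desirable under ρ, and that it is nevertheless non-positive on every product vector:

```python
import numpy as np

rng = np.random.default_rng(0)

# |Phi+> = (|00> + |11>)/sqrt(2): an entangled Bell state
phi = np.zeros(4); phi[0] = phi[3] = 1 / np.sqrt(2)
rho = np.outer(phi, phi)               # entangled density matrix
G = np.outer(phi, phi) - np.eye(4) / 2 # indefinite gamble matrix

assert np.trace(G @ rho).real > 0      # the gamble is desirable given rho
assert np.linalg.eigvalsh(G).min() < 0 # G is indefinite

# On product vectors x (tensor) y the gamble never pays out:
for _ in range(1000):
    x = rng.normal(size=2) + 1j * rng.normal(size=2); x /= np.linalg.norm(x)
    y = rng.normal(size=2) + 1j * rng.normal(size=2); y /= np.linalg.norm(y)
    v = np.kron(x, y)
    assert (v.conj() @ G @ v).real <= 1e-12
```

The design mirrors (8): the maximal squared overlap of a product state with the Bell state is 1/2, so the quadratic form of G is bounded above by zero on product vectors while its trace against the entangled ρ stays strictly positive.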
As previously mentioned, there are two possible ways out of this impasse: to claim the existence of either non-classical evaluation functionals or of negative probabilities. Let us examine them in turn.
- (1) Existence of non-classical evaluation functionals:
From an informal betting perspective, the effect of a quantum experiment on is to evaluate this polynomial to return the payoff for Alice. By Theorem 1, there is no compatible classical evaluation functional, and thus in particular no value of the variables , such that . Hence, if we adopt this point of view, we have to find another, non-classical, explanation for . The following evaluation functional, denoted as , may do the job:
Note that, and together imply that , which contradicts . Similarly, and together imply that , which contradicts . Hence, as expected, the above evaluation functional is non-classical. It amounts to assigning a value to the products but not to the single components of and separately. Quoting [holevo2011probabilistic, Supplement 3.4], “entangled states are holistic entities in which the single components only exist virtually”.
- (2) Existence of negative probabilities:
Negative probabilities are not an intrinsic characteristic of QT. They appear whenever one attempts to explain QT “classically” by looking at the space of charges on . To see this, consider , and assume that, based on (11), one calculates:
Because of Theorem 1, there is no probability charge satisfying these moment constraints, the only compatible being signed ones. Box 1 reports the nine components and corresponding weights of one of them:
Note that some of the weights are negative but , meaning that we have an affine combination of atomic charges (Dirac’s deltas).
The charge described in Box 1 is one among the many that satisfy (11) and has been derived numerically. Explicit procedures for constructing such negative-probability representations have been developed in [schack2000explicit, sperling2009necessary].
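As a toy illustration of what an affine combination of Dirac deltas with a negative weight can do (this is not the charge of Box 1, just an assumed one-dimensional analogue): the weights sum to one like a probability, yet they reproduce a “moment” that no genuine probability on the same support could produce.

```python
# Two atoms (Dirac deltas) at 0 and 1 with weights that sum to 1,
# one of which is negative -- a signed charge, not a probability.
points  = [0.0, 1.0]
weights = [-0.5, 1.5]

assert abs(sum(weights) - 1.0) < 1e-12  # affine combination
mean = sum(w * p for w, p in zip(weights, points))
assert mean == 1.5  # outside [0, 1]: unreachable by any probability on {0, 1}
```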
Again, we want to stress that the two above paradoxical interpretations are a consequence of Theorem 1, and therefore can emerge when considering any instance of a theory of P-coherence in which the hypotheses of this result hold.
The issue of local realism in QT arises when one performs measurements on a pair of separated but entangled particles. This again shows the impossibility of a peaceful agreement between the internal logical consistency of a P-coherent theory and the attempt to provide an external coherent (classical) interpretation. Let us discuss it from the latter perspective. Firstly, notice that, since , the linear operator
satisfies the properties:
Hence, by summing up some of the components of the matrix (13), we can recover the marginal linear operator
where the last equality holds when , i.e., is the reduced density matrix of on system . The operation we have just described, when applied to a density matrix, is known in QT as partial trace. Given the interpretation of as a dual operator, the operation of partial trace simply follows by Equation (14).
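The partial-trace operation just described can be sketched as follows; the implementation and the Bell-state example are illustrative, not taken from the paper. Tracing out one subsystem of the entangled Bell state yields the maximally mixed marginal I/2, which on its own is perfectly compatible with a classical probability.

```python
import numpy as np

def partial_trace_B(rho, dA, dB):
    """Trace out the second subsystem of a (dA*dB) x (dA*dB) density matrix."""
    r = rho.reshape(dA, dB, dA, dB)
    return np.einsum('ijkj->ik', r)  # sum over the B indices

# Bell state |Phi+> = (|00> + |11>)/sqrt(2)
phi = np.zeros(4); phi[0] = phi[3] = 1 / np.sqrt(2)
rho = np.outer(phi, phi)

rho_A = partial_trace_B(rho, 2, 2)
assert np.allclose(rho_A, np.eye(2) / 2)  # maximally mixed marginal
```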
Similarly, we can obtain
Matrix (analogously to ) is compatible with probability: there are marginal probabilities whose is the moment matrix, an example being
In other words, we are brought to believe that, marginally, the physical properties of the two particles have a meaning, i.e., they can be explained through probabilistic mixtures of classical evaluation functionals. We can now ask Nature, by means of a real experiment, to decide between our common-sense notions of how the world works and Alice’s. Experimental verification of this phenomenon can be obtained by a CHSH-like experiment, which aims at experimentally reproducing a situation where (8) holds, as explained in Box 2. In this interpretation, the CHSH experiment is an entanglement witness; we discuss the connection between (8) and the entanglement witness theorem in Section 2.1.
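A numerical sketch of the CHSH setting just mentioned, with the textbook measurement choices (assumed here, not taken from Box 2): for the Bell state, the CHSH combination reaches 2√2, beyond the classical (local-realist) bound of 2.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)   # Pauli X
Z = np.array([[1, 0], [0, -1]], dtype=complex)  # Pauli Z

A0, A1 = Z, X                   # Alice's two measurement settings
B0 = (Z + X) / np.sqrt(2)       # Bob's two measurement settings
B1 = (Z - X) / np.sqrt(2)

phi = np.zeros(4, dtype=complex); phi[0] = phi[3] = 1 / np.sqrt(2)
rho = np.outer(phi, phi.conj())  # Bell state |Phi+><Phi+|

def corr(A, B):
    """Quantum correlation <A (tensor) B> under rho."""
    return np.trace(rho @ np.kron(A, B)).real

S = corr(A0, B0) + corr(A0, B1) + corr(A1, B0) - corr(A1, B1)
assert S > 2                              # violates the classical bound
assert np.isclose(S, 2 * np.sqrt(2))      # Tsirelson bound
```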
The situation we have just described is the playground of Bell’s theorem, stating the impossibility of Einstein’s postulate of local realism.
The argument goes as follows. If we assume that the physical properties of the two particles (the polarisations of the photons) have definite values that exist independently of observation (realism), then the measurement on the first qubit must influence the result of the measurement on the second qubit. Vice versa, if we assume locality, then cannot exist independently of the observations. To sum up, a local hidden-variable theory that is compatible with QT results cannot exist [bell1964einstein].
But is there really anything contradictory here? The message we want to convey is that this is not the case. Indeed, since Theorem 1 applies, there is no probability compatible with the moment matrix . Ergo, although they may seem to be compatible with probabilities, the marginal matrices are not moment matrices of any probability.
The conceptual mistake in the situation we are considering is to forget that come from . A joint linear operator uniquely defines its marginals but not the other way round. There are infinitely many joint probability charges whose are the marginals, e.g.,
but none of them satisfy Equation (11). In fact, such an intrinsic non-uniqueness of the compatible joints is another amazing characteristic of QT: from this perspective, it is not only a theory of probability, but a theory of imprecise probability, see Supplementary 4.3 and Supplementary 7.1.
The take-away message of this subsection is that we should only interpret as marginal operators and keep in mind that QT is a logical theory of P-coherence. We see paradoxes when we try to force a physical interpretation upon QT, whose nature is instead computational. In other words, if we accept that computation is more primitive than our classical interpretation of physics, all paradoxes disappear.
2.1 Entanglement witness theorem
In the previous subsections, we have seen that all paradoxes of QT emerge from the disagreement between its internal coherence and the attempt to force a classical coherent interpretation upon it.
Do quantum and classical probability sometimes agree? Yes they do, but only when the density matrices at play are such that Equation (8) does not hold, and thus in particular for separable density matrices. We make this claim precise by providing a link between Equation (8) and the entanglement witness theorem [horodecki1999reduction, horodecki2009quantum].
We first report the definition of entanglement witness [heinosaari2011mathematical, Sec. 6.3.1]:
Definition 3 (Entanglement witness).
A Hermitian operator is an entanglement witness if and only if is not a positive operator but for all vectors . (In [heinosaari2011mathematical, Sec. 6.3.1], the last part of this definition says “for all factorized vectors ”. This is equivalent to considering the pair .)
The next well-known result (see, e.g., [heinosaari2011mathematical, Theorem 6.39, Corollary 6.40]) provides a characterisation of entanglement and separable states in terms of entanglement witness.
A state is entangled if and only if there exists an entanglement witness such that . A state is separable if and only if for all entanglement witnesses .
The first inequality states that the gamble is strictly desirable for Alice (in theory ) given her belief . Since the set of desirable gambles (1) associated to is closed, there exists such that is still desirable, i.e., and
where we have exploited that . Therefore, (15) is equivalent to
which is the same as (8).
Hence, by Theorem 1, we can equivalently formulate the entanglement witness theorem as an arbitrage/Dutch book:
Let be the set of desirable gambles corresponding to some density matrix . The following claims are equivalent:
is not coherent in .
This result provides another view of the entanglement witness theorem in light of P-coherence. In particular, it tells us that the existence of a witness satisfying Equation (15) boils down to the disagreement between the classical probabilistic interpretation and the theory on the rationality (coherence) of Alice, and therefore that whenever they agree on her rationality it means that is separable.
This connection explains why the problem of characterising entanglement is hard in QT: it amounts to proving the negativity of a function, which is NP-hard.
Since its foundation, there have been two main ways to explain the differences between QT and classical probability. The first one, which goes back to Birkhoff and von Neumann [birkhoff1936logic], explains these differences with the premise that, in QT, the Boolean algebra of events is taken over by the “quantum logic” of projection operators on a Hilbert space. The second one is based on the view that the quantum-classical clash is due to the appearance of negative probabilities [dirac1942bakerian, feynman1987negative].
Recently, there has been a research effort, the so-called “quantum reconstruction”, which amounts to trying to rebuild the theory from more primitive postulates. The search for alternative axiomatisations of QT has been approached following different avenues: extending Boolean logic [birkhoff1936logic, mackey2013mathematical, jauch1963can], using operational primitives [hardy2011foliable, hardy2001quantum, barrett2007information, chiribella2010probabilistic], using information-theoretic postulates [barrett2007information, barnum2011information, van2005implausible, pawlowski2009information, dakic2009quantum, fuchs2002quantum, brassard2005information, mueller2016information], building upon the subjective foundation of probability [Caves02, Appleby05a, Appleby05b, Timpson08, longPaper, Fuchs&SchackII, mermin2014physics, pitowsky2003betting, Pitowsky2006, benavoli2016quantum, benavoli2017gleason] and starting from the phenomenon of quantum nonlocality [barrett2007information, van2005implausible, pawlowski2009information, popescu1998causality, navascues2010glance].
A common trait of all these approaches is that of regarding QT as a generalised theory of probability. But why is probability generalised in such a way, and what does it mean? We have shown that the answer to this question rests in the computational intractability of classical probability theory contrasted to the polynomial-time complexity of QT.
Note that there have been previous investigations into the computational nature of QT but they have mostly focused on topics of undecidability (these results are usually obtained via a limiting argument, as the number of particles goes to infinity, see, e.g., [cubitt2015undecidability]; this does not apply to our setting as we rather take the stance that the Universe is a finite physical system) and of potential computational advantages of non-standard theories involving modifications of quantum theory [bacon2004quantum, aaronson2004quantum, aaronson2005quantum, chiribella2013quantum].
The key postulate that separates classical probability and QT is a: the computation postulate. Because of a, Theorem 1 applies and thus the “weirdness” of QT follows: negative probabilities, the existence of non-classical evaluation functionals and, therefore, irreconcilability with the classical probabilistic view. The formulation of Theorem 1 points to three possible ways to provide a theoretical foundation of QT: (1) redefining the notion of evaluation functionals (the algebra of events), which is the approach adopted within Quantum Logic [mackey2013mathematical, Axiom VII]; (2) keeping the algebra of events classical but replacing probabilities with quasi-probabilities (allowing negative values), see for instance [schack2000explicit, ferrie2011quasi]; (3) regarding the quantum-classical contrast as having a purely computational character. The last approach starts by accepting P≠NP to justify the separation between the microscopic quantum system and the macroscopic world. We quote Aaronson [aaronson2005guest]:
… while experiment will always be the last appeal, the presumed intractability of NP-complete problems might be taken as a useful constraint in the search for new physical theories.
The postulate of computational efficiency embodied by a (through a) may indeed be the fundamental law in QT, similar to the second law of thermodynamics, or the impossibility of superluminal signalling.
Appendix A Supplementary
The Supplementary Material of this manuscript can be found at https://arxiv.org/abs/1902.03513