# Witnessing Bell violations through probabilistic negativity

Bell's theorem shows that no hidden-variable model can explain the measurement statistics of a quantum system shared between two parties, thus ruling out a classical (local) understanding of nature. In this work we demonstrate that by relaxing the positivity restriction in the hidden-variable probability distribution it is possible to derive quasiprobabilistic Bell inequalities whose sharp upper bound is written in terms of a negativity witness of said distribution. This provides an analytic solution for the amount of negativity necessary to violate the CHSH inequality by an arbitrary amount, therefore revealing the amount of negativity required to emulate the quantum statistics in a Bell test.


## 1 Introduction

It has now been 60 years since John Stewart Bell wrote his famous paper on the Einstein-Podolsky-Rosen (EPR) paradox bell1964einstein, and 50 years since the first experimental Bell test freedman1972experimental. The majority of physicists are perfectly happy to concede that in the lab we see experimental results consistent with the postulates of quantum mechanics. However, the implications of these mathematical postulates for the ‘reality’ of the wavefunction are still very much up for debate caves2002quantum ; harrigan2010einstein ; pusey2012reality ; colbeck2012system ; mermin2014physics ; ringbauer2015measurements.

These Bell experiments remain some of the most important demonstrations of the reality of the quantum state and of the death of a ‘local realism’ picture of nature. In such an experiment a physical system is distributed between spatially separated observers, and we allow these observers to perform measurements on their local system. The emerging statistics prove that physical systems are not bound to behave locally (in accordance with local hidden-variable models). Rather, the statistics are consistent with the postulates governing quantum mechanics.

In this work we remove the postulates of quantum mechanics and instead allow a physical system to be distributed according to a quasiprobability (hidden-variable) distribution that is allowed to take negative values. Although we are perfectly content with real negative numbers in physics, negative quasiprobabilities, despite receiving support from individuals such as Dirac dirac1942bakerian and Feynman feynman1987negative and having a solid mathematical foundation ruzsa1988algebraic ; khrennikov2007generalized, have been a long-debated issue in theoretical physics muckenheim1986review. See, for example, the extensive discussion surrounding the interpretation of negative values in the Wigner distribution ferrie2011quasi ; veitch2012negative. In the majority of considerations, quasiprobability distributions are used to describe states that are not directly observed; that is, all observable measurement statistics must be governed by ordinary probability distributions. As an example, a Wigner function may assign a negative quasiprobability to a particle having a particular position/momentum combination, but any physical measurement, constrained by Heisenberg uncertainty, will have an all-positive outcome distribution. This feature ensures that no outcome is ever predicted to occur a negative number of times feynman1987negative, and similarly protects the quasiprobability physicist from falling victim to ‘Dutch book’ arguments (de2017theory, Ch. 3).

An important motivator for this work is the result of Al-Safi and Short al2013simulating (expanded upon by the authors of oas2014exploring), which showed that it is possible to simulate all non-signalling correlations (those which adhere to the principles of special relativity) popescu1994quantum ; peres2004quantum if one allows negative values in a probability distribution. However, physical reality does not explore this full set of correlations; rather, it is restricted to those achievable within quantum theory. Therefore the question that we pose in this work is:

What are the restrictions on the negativity in a hidden-variable probability distribution such that it can emulate the statistics seen in a physical Bell experiment?

In order to answer this question we construct CHSH inequalities for two parties clauser1969proposed whose degree of violation is witnessed by the amount of negativity present in the hidden-variable probability distribution. Our witness yields a value of 0 for a quasiprobability distribution which is entirely positive, such as that which would describe an ordinary classical system.

We start by describing the setup necessary for the construction of these nonlocal experiments, introducing the probability distributions admitted by classical, quantum, and non-signalling theories, and giving the famous Bell scores that these distributions respectively allow one to reach in such nonlocal experiments. We then give the definition of a quasiprobability distribution and motivate negativity witnesses as a quantitative method of detecting negativity in said distributions. Our main result is that the violation of the CHSH inequality (and its $n$-measurement generalisations) can be exactly characterised by a negativity witness of the hidden-variable distribution defined over the local states, and that there exist quasiprobability distributions which can saturate (up to the no-signalling limit) any such violation, whilst still having well-defined local statistics. This shows that it is possible to recapture the nonlocal features of Bell experiments by allowing a finite amount of negativity in a hidden-variable distribution over scenarios which are, in themselves, entirely local and classical.

## 2 Setup

Let us consider the following experimental setup. A source distributes a system between $N$ observers. The $k$th observer can choose some measurement $x_k$ and record some outcome $y_k$ drawn from a set of possible values. For two observers, $A$ and $B$, a specific experimental setup is characterised by the conditional probability,

$$P(y_A, y_B | x_A, x_B). \tag{1}$$

The physical theory governing the behaviour of the system and experiment determines the achievability of certain conditional probability distributions resulting from these experiments. We are interested in the following three physical theories:

Classical theory admits probability distributions of the following form,

$$P(y_A, y_B | x_A, x_B) = \sum_{\lambda_A, \lambda_B} P_A(y_A | x_A, \lambda_A)\, P_B(y_B | x_B, \lambda_B)\, P_\Lambda(\lambda_A, \lambda_B), \tag{2}$$

where $P_\Lambda(\lambda_A, \lambda_B)$ is a joint probability distribution defined over local hidden variables $\lambda_A$ and $\lambda_B$. With each choice of hidden variables we associate a local scenario governed by ordinary local probability distributions $P_k(y_k | x_k, \lambda_k)$ for the observables $y_k$. The hidden-variable probability distribution $P_\Lambda$ determines how such local scenarios are mixed, and the probability distributions $P_k(y_k | x_k, \lambda_k)$ are called “$\lambda$-local” because they belong to the scenario associated with a particular value of $\lambda_k$. They are not to be confused with the (observable) marginal probability distributions that are obtained by marginalising the total probability distribution, equation (2).

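Quantum theory endows us with a Hilbert space structure for our quantum states that admits probability distributions of the following form,

$$P(y_A, y_B | x_A, x_B) = \mathrm{Tr}\left[\left(M^{(A)}_{y_A | x_A} \otimes M^{(B)}_{y_B | x_B}\right) \rho\right], \tag{3}$$

where $\rho$ is the shared quantum state, and $M^{(A)}_{y_A | x_A}$ and $M^{(B)}_{y_B | x_B}$ are the elements of Positive Operator-Valued Measures (POVMs) 2000quantum for each measurement choice $x_A$ and $x_B$.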

The no-signalling principle, our third physical theory, prohibits the sending of information faster than the speed of light popescu1994quantum ; peres2004quantum. Such a theory imposes the condition on its conditional probability distribution that, for any observer $k \in \{A, B\}$,

$$\sum_{y_k} P(y_A, y_B | x_A, x_B) \tag{4}$$

is independent of $x_k$. These three physical theories range from the most restrictive (classical) to the least restrictive (no-signalling), with quantum theory existing somewhere between the two popescu1994quantum.

Representing the full set of correlations that the quantum conditional probability distribution in equation (3) allows one to reach is a notoriously hard problem, and the set has recently been shown not to be closed slofstra2019set. Therefore we instead restrict ourselves to studying the achievable bounds that these conditional probabilities allow one to reach in nonlocal experiments, the original and most famous of which is the Bell inequality bell1964einstein.

###### Definition 1 (Bell inequality).

Given observers $A$ and $B$, each with measurement choice $x_k \in \{0, 1\}$ with outcomes $y_k \in \{-1, +1\}$, experiments performed on the systems adhere to the bound,

$$\left| E(0_A, 0_B) - E(0_A, 1_B) + E(1_A, 0_B) + E(1_A, 1_B) \right| \leq X, \tag{5}$$

where both the correlation measure $E(x_A, x_B)$ and the bound $X$ are theory dependent. The left-hand side of this inequality is often called the score of the experiment.

Each physical theory admits a different conditional probability distribution, and hence a different achievable bound $X$. Classical theory has the CHSH bound $X = 2$ clauser1969proposed, quantum theory has the Tsirelson bound of $X = 2\sqrt{2}$ cirel1980quantum, and non-signalling distributions reach $X = 4$ popescu1994quantum. We are interested in the achievable bounds of a classical system’s probability distribution when the hidden-variable distribution in said probability distribution can be negative.
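The three bounds quoted above can be checked numerically. The following is a minimal sketch, not taken from the paper: the choice of the Bell state $(|00\rangle + |11\rangle)/\sqrt{2}$, the measurement angles, and the PR-box correlation table are illustrative assumptions chosen to match the sign pattern of inequality (5).

```python
import itertools
import numpy as np

def chsh_score(E):
    """Left-hand side of inequality (5) for a 2x2 array E[x_A, x_B]."""
    return abs(E[0, 0] - E[0, 1] + E[1, 0] + E[1, 1])

# Classical: brute force over deterministic local strategies a[x_A], b[x_B] in {-1, +1}.
# Positive mixtures of these cannot score higher, since the score is convex in the mixture.
classical_max = max(
    chsh_score(np.outer(a, b))
    for a in itertools.product([-1, 1], repeat=2)
    for b in itertools.product([-1, 1], repeat=2)
)

# Quantum (assumed example): Bell state (|00> + |11>)/sqrt(2) with observables
# cos(theta) Z + sin(theta) X, for which E = cos(theta_A - theta_B).
Z = np.array([[1.0, 0.0], [0.0, -1.0]])
X = np.array([[0.0, 1.0], [1.0, 0.0]])

def observable(theta):
    return np.cos(theta) * Z + np.sin(theta) * X

phi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
rho = np.outer(phi, phi)
alice_angles = [np.pi / 4, -np.pi / 4]   # assumed settings for x_A = 0, 1
bob_angles = [0.0, -np.pi / 2]           # assumed settings for x_B = 0, 1
E_quantum = np.array([
    [np.trace(np.kron(observable(ta), observable(tb)) @ rho) for tb in bob_angles]
    for ta in alice_angles
])

# No-signalling (assumed example): PR-box-like correlations, sign flipped at (0_A, 1_B).
E_pr = np.array([[1.0, -1.0], [1.0, 1.0]])

print(classical_max, chsh_score(E_quantum), chsh_score(E_pr))
```

Under these assumptions the three scores come out at the classical, Tsirelson and no-signalling values of $2$, $2\sqrt{2}$ and $4$ respectively, up to floating-point error.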

## 3 Results

We now define an important object for this work, the quasiprobability distribution.

###### Definition 2 (Quasiprobability distribution).

We define a quasiprobability distribution as a real-valued function $\tilde{P}_\Lambda(\lambda_1, \ldots, \lambda_N)$ of the hidden variables that is properly normalised, such that,

$$\sum_{\lambda_1, \ldots, \lambda_N} \tilde{P}_\Lambda(\lambda_1, \ldots, \lambda_N) = 1. \tag{6}$$

It can be seen that the collection of functions adhering to the above definition forms a convex set, which we will denote $\tilde{\mathcal{P}}$, a superset of the convex set of positive probability distributions $\mathcal{P}$. We must now determine how to quantify the presence of negativity in our quasiprobability distributions. To this end we will use a well-known method for quantitatively detecting properties of a quantum state: witnesses terhal2000bell ; lewenstein2000optimization ; eisert2007quantitative ; chabaud2021witnessing. Let us therefore proceed by defining a negativity witness:

###### Definition 3 (Negativity witness).

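Given some properly normalised distribution $P \in \tilde{\mathcal{P}}$, a well-defined negativity witness $\mathcal{N}$ is one for which,

$$\mathcal{N}(P) = 0 \quad \forall P \in \mathcal{P}. \tag{7}$$

We may additionally require such a witness to ‘faithfully’ detect negativity,

$$\mathcal{N}(P) > 0 \quad \forall P \in \tilde{\mathcal{P}} \setminus \mathcal{P}. \tag{8}$$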
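To make conditions (7) and (8) concrete, here is a minimal sketch of one possible witness. The function `total_negativity` is a hypothetical illustration, not the witness used in the theorems of this paper: it sums $|P| - P$ over all entries, which is twice the total negative mass, so it vanishes on ordinary probability distributions and is strictly positive as soon as any entry is negative.

```python
import numpy as np

def is_normalised(p, tol=1e-12):
    """Check the normalisation condition of equation (6)."""
    return abs(float(np.sum(p)) - 1.0) < tol

def total_negativity(p):
    """Hypothetical witness: sum over entries of |p| - p (zero iff p has no negative entry)."""
    p = np.asarray(p, dtype=float)
    return float(np.sum(np.abs(p) - p))

ordinary = np.array([0.25, 0.25, 0.25, 0.25])   # an element of the positive set
quasi = np.array([0.5, 0.5, 0.5, -0.5])         # normalised, but with a negative entry

print(is_normalised(ordinary), total_negativity(ordinary))
print(is_normalised(quasi), total_negativity(quasi))
```

This simple choice satisfies both (7) and (8) by construction, so it is a ‘faithful’ witness in the sense defined above.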

In the following, we consider classical local hidden-variable models as defined in equation (2), but we replace the hidden-variable probability distribution $P_\Lambda$ with a quasiprobability distribution $\tilde{P}_\Lambda$,

$$P(y_A, y_B | x_A, x_B) = \sum_{\lambda_A, \lambda_B} P_A(y_A | x_A, \lambda_A)\, P_B(y_B | x_B, \lambda_B)\, \tilde{P}_\Lambda(\lambda_A, \lambda_B). \tag{9}$$

This corresponds to a scenario where different local statistics of observations, governed by $\lambda$-local (ordinary) probability distributions $P_k(y_k | x_k, \lambda_k)$, are mixed according to a quasiprobability distribution $\tilde{P}_\Lambda$. However, when $\tilde{P}_\Lambda$ takes negative values, we should no longer think of the model as an ignorance mixture of valid local scenarios but rather as a nonlocal model al2013simulating. Furthermore, in contrast to ordinary hidden-variable models, not all combinations of hidden-variable and $\lambda$-local probability distributions are valid; only those combinations which lead to a well-defined $P(y_A, y_B | x_A, x_B)$ are valid, i.e., one comprised of values between 0 and 1 (the normalisation condition is always fulfilled).

In addition to the correlation function between two measurements $x_A$ and $x_B$, $E(x_A, x_B) := \sum_{y_A, y_B} y_A y_B P(y_A, y_B | x_A, x_B)$, it will also be useful to define $\lambda$-local expectation values corresponding to an imagined scenario where observer $k$ is able to perform measurement $x_k$ in the local scenario corresponding to $\lambda_k$,

$$\langle k \rangle^{x_k}_{\lambda_k} := \sum_{y_k} y_k P_k(y_k | x_k, \lambda_k). \tag{10}$$

This $\lambda$-local expectation value will be useful to formulate our results, but does not correspond to the actual observations, which are themselves governed by equation (9).
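Equations (9) and (10) translate directly into array contractions. The sketch below is an illustration under assumed conventions: the array layouts `PA[y, x, lam]`, `PB[y, x, lam]` and `w[lamA, lamB]`, the outcome ordering $(-1, +1)$, and the helper names are all hypothetical choices, not notation from the paper.

```python
import numpy as np

def total_distribution(PA, PB, w):
    """Equation (9): P[yA, yB, xA, xB] from PA[y, x, lamA], PB[y, x, lamB], w[lamA, lamB]."""
    return np.einsum("axl,bzm,lm->abxz", PA, PB, w)

def is_valid(P, tol=1e-12):
    """A total distribution is well-defined only if every entry lies in [0, 1]."""
    return bool(np.all(P >= -tol) and np.all(P <= 1 + tol))

def local_expectation(Pk, outcomes=(-1.0, 1.0)):
    """Equation (10): <k>[x, lam] = sum_y y * Pk[y, x, lam]."""
    return np.einsum("y,yxl->xl", np.asarray(outcomes), Pk)

# Two deterministic local scenarios: lam = 0 always yields +1, lam = 1 always yields -1.
PA = np.zeros((2, 2, 2))
PA[1, :, 0] = 1.0   # outcome index 1 means y = +1
PA[0, :, 1] = 1.0   # outcome index 0 means y = -1
PB = PA.copy()

w_ordinary = np.diag([0.5, 0.5])    # an ordinary mixture
w_quasi = np.diag([1.5, -0.5])      # normalised, but produces entries outside [0, 1]

print(is_valid(total_distribution(PA, PB, w_ordinary)))
print(is_valid(total_distribution(PA, PB, w_quasi)))
print(local_expectation(PA))
```

The second weighting shows why validity has to be checked separately: the quasiprobability weights are normalised, yet the resulting table contains an entry of $1.5$ and one of $-0.5$.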

We are now in a position to state the main result of this work, the quasiprobabilistic Bell inequality.

###### Theorem 1 (Quasiprobabilistic Bell inequality).

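Given observers $A$ and $B$, each with measurement choice $x_k \in \{0, 1\}$ with outcomes $y_k \in \{-1, +1\}$, whose systems are distributed according to some quasiprobability distribution $\tilde{P}_\Lambda(\lambda_A, \lambda_B)$, then the quasiprobabilistic Bell inequality holds:

$$\left| E(0_A, 0_B) - E(0_A, 1_B) + E(1_A, 0_B) + E(1_A, 1_B) \right| \leq 2 + \mathcal{N}(\tilde{P}_\Lambda), \tag{11}$$

where

$$\mathcal{N}(\tilde{P}_\Lambda) := \begin{cases} \mathcal{N}_+(\tilde{P}_\Lambda) & \text{if } E(1_A, 0_B) + E(1_A, 1_B) < 0, \\ \mathcal{N}_-(\tilde{P}_\Lambda) & \text{else,} \end{cases} \tag{12}$$

is a negativity witness, and

$$\mathcal{N}_\pm(\tilde{P}_\Lambda) := \sum_{\lambda_A, \lambda_B} \left[ 2 \pm \left( \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{1_B}_{\lambda_B} + \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{0_B}_{\lambda_B} \right) \right] \left( \left| \tilde{P}_\Lambda(\lambda_A, \lambda_B) \right| - \tilde{P}_\Lambda(\lambda_A, \lambda_B) \right).$$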

The proof of this theorem begins analogously to Bell’s proof of the CHSH bound bell2004speakable, but diverges at the point where Bell’s proof assumes $P_\Lambda(\lambda_A, \lambda_B) \geq 0$. The above result shows that if an arbitrary amount of negativity is allowed in the hidden-variable probability distribution then the upper bound of equation (11) can be arbitrarily large. However, it should be noted that a natural limit of $\mathcal{N} \leq 2$ in the relevant Bell tests (i.e., 4 for the upper bound in the quasiprobabilistic Bell inequality) is imposed by the requirement that $P(y_A, y_B | x_A, x_B)$ is a well-defined, valid probability distribution. (This can easily be seen: for well-defined experimental setups, i.e. for valid $P(y_A, y_B | x_A, x_B)$, each term on the left-hand side of equation (11) can be at most $+1$ and at least $-1$.)
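Inequality (11) can be sanity-checked numerically. The sketch below evaluates the witness of equation (12) and the score of equation (11) directly from $\lambda$-local expectation values; the array layouts `a[x, lamA]`, `b[x, lamB]`, `w[lamA, lamB]` and the function names are assumptions made for illustration. The random sampling is not restricted to weightings that yield a valid total distribution, because the derivation of the bound only uses normalisation and $|\langle k \rangle| \leq 1$.

```python
import numpy as np

def correlations(a, b, w):
    """E[xA, xB] = sum over (lamA, lamB) of a[xA, lamA] * b[xB, lamB] * w[lamA, lamB]."""
    return np.einsum("xl,zm,lm->xz", a, b, w)

def chsh_score(a, b, w):
    """Left-hand side of inequality (11)."""
    E = correlations(a, b, w)
    return abs(E[0, 0] - E[0, 1] + E[1, 0] + E[1, 1])

def witness(a, b, w):
    """Negativity witness of equation (12), built from N_plus or N_minus."""
    E = correlations(a, b, w)
    sign = 1.0 if E[1, 0] + E[1, 1] < 0 else -1.0
    bracket = 2.0 + sign * (np.outer(a[1], b[1]) + np.outer(a[1], b[0]))
    return float(np.sum(bracket * (np.abs(w) - w)))

rng = np.random.default_rng(seed=1)
a = rng.uniform(-1, 1, size=(2, 3))     # lambda-local expectation values for Alice
b = rng.uniform(-1, 1, size=(2, 3))     # lambda-local expectation values for Bob
w = rng.normal(size=(3, 3))
w += (1.0 - w.sum()) / w.size           # shift so that the weights sum to one

print(chsh_score(a, b, w) <= 2.0 + witness(a, b, w))
```

For an entirely positive weighting the factor $|\tilde{P}_\Lambda| - \tilde{P}_\Lambda$ vanishes term by term, so the witness returns zero and the check reduces to the ordinary CHSH bound.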

The previous result of Al-Safi and Short al2013simulating showed that it was possible to violate said inequality up to this no-signalling bound of 4. Therefore, in order to emulate the physical results seen in Bell tests (the Tsirelson bound of $2\sqrt{2}$) one needs a negative probability distribution whose witness equals $2\sqrt{2} - 2 \approx 0.83$. In section 4 we show that for any $0 \leq \mathcal{N} \leq 2$, there exist quasiprobabilistic hidden-variable models with valid local measurement statistics that saturate inequality (11). We would hope that, if a physical mechanism were discovered that allowed a joint hidden-variable probability distribution to have the appearance of negativity, said mechanism would be limited in such a way that it resulted in the Tsirelson bound and, more generally, was able to reconstruct the limits on quantum correlations.

It is also important to note that although said witness is a valid witness according to definition 3 it is not necessarily a ‘faithful’ one. However this can be rectified, at the cost of loosening the bound, by redefining said witness. For example the function , is defined to be both a valid and ‘faithful’ witness.

There are numerous generalisations of the famous CHSH inequalities, such as multiple parties perez2008unbounded , arbitrary outcomes cope2019bell , etc. These would no doubt be interesting to study but we leave it to future work to explore these other generalisations and instead focus on the scenario in which Alice and Bob have access to an arbitrary number of measurement settings wehner2006tsirelson .

###### Theorem 2.

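Given observers $A$ and $B$, each with $n$ measurements $x_k \in \{0, \ldots, n-1\}$ with outcomes $y_k \in \{-1, +1\}$, whose systems are distributed according to some quasiprobability distribution $\tilde{P}_\Lambda(\lambda_A, \lambda_B)$,

$$\left| \sum_{i=0}^{n-1} E(i_A, i_B) + \sum_{i=1}^{n-1} E(i_A, (i-1)_B) - E(0_A, (n-1)_B) \right| \leq 2n - 2 + \mathcal{N}_n(\tilde{P}_\Lambda), \tag{13}$$

where $\mathcal{N}_n(\tilde{P}_\Lambda)$ is a negativity witness with

$$\mathcal{N}^{(x)}(\tilde{P}_\Lambda) := \begin{cases} \mathcal{N}^{(x)}_+(\tilde{P}_\Lambda) & \text{if } E(0_A, x_B) + E(0_A, (x-1)_B) < 0, \\ \mathcal{N}^{(x)}_-(\tilde{P}_\Lambda) & \text{else,} \end{cases} \tag{14}$$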

where .

The proof of the above theorem can be found in appendix B, it utilises proof by induction by chaining together the inequalities from theorem 1. In section 4 we show that the bound in theorem 2 can be saturated. Namely, for any , there exist well-defined , characterised by a quasiprobability hidden-variable distribution , that saturate inequality (13). In addition, analogously to the two measurement result, at the cost of loosening the bound we can ensure that the above witness is also ‘faithful’ by choosing for all , .
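The classical part of the bound in inequality (13), $2n - 2$, can be confirmed by brute force for small $n$. The sketch below is illustrative only: it enumerates deterministic local strategies (for which the witness is zero) and evaluates the left-hand side of (13); the function names are hypothetical. It does not compute $\mathcal{N}_n$ itself.

```python
import itertools

def chained_score(a, b):
    """Left-hand side of inequality (13) for deterministic outcomes a[i], b[i] in {-1, +1}."""
    n = len(a)
    total = sum(a[i] * b[i] for i in range(n))
    total += sum(a[i] * b[i - 1] for i in range(1, n))
    total -= a[0] * b[n - 1]
    return abs(total)

def classical_bound(n):
    """Maximum score over all deterministic local strategies with n settings per party."""
    outcomes = list(itertools.product([-1, 1], repeat=n))
    return max(chained_score(a, b) for a in outcomes for b in outcomes)

for n in range(2, 6):
    print(n, classical_bound(n), 2 * n - 2)
```

Deterministic strategies suffice here because the score is convex in a positive mixture, so no ordinary hidden-variable distribution can exceed the best deterministic strategy. For $n = 2$ the expression reduces to the CHSH combination of inequality (11).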

## 4 Example

In order to understand how to saturate the Bell inequality from Theorem 1, we rewrite the left-hand side of equation (11) as,

$$\left| \sum_{\lambda} M(\lambda)\, \tilde{P}_\Lambda(\lambda) \right|. \tag{15}$$

We replaced the hidden variables $\lambda_A$ and $\lambda_B$ with a single hidden variable $\lambda$ because our example only uses a single hidden variable. Further, $M(\lambda)$ are the scores of each of the $\lambda$-local distributions $P_A(y_A | x_A, \lambda) P_B(y_B | x_B, \lambda)$; that is, $|M(\lambda)| \leq 2$ holds.

For a given value of the negativity witness, we exceed the local bound maximally by the simple strategy of weighting classical distributions with $M(\lambda) = 2$ with positive quasiprobability, while simultaneously taking a classical distribution with $M(\lambda) = -2$ with negative weight. To ensure that the total probability distribution is well-defined we make a choice of three deterministic classical distributions with positive weight and a fourth with negative weight. Our four deterministic classical distributions can be denoted $(--,-+)$, $(+-,--)$, $(++,++)$ and $(+-,-+)$. Here, our notation means that the distributions can be produced by assigning the first pair of symbols to Alice and the second to Bob. Each party chooses to read either the first or second of the symbols given to them (this choice reflects their measurement setting $x_k$) while the outcome of their measurement is determined by the symbol itself; that is, $y_k = +1$ ($-1$) for a plus (minus) sign. This experimental description of distributing classical information makes clear that these distributions are local, with our hidden variable $\lambda$ indicating which of these sets the source actually produces.

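The source produces each of the distributions according to the following quasiprobability distribution,

$$\tilde{P}_\Lambda(\lambda) = \begin{cases} \dfrac{4 + \mathcal{N}}{12} & \text{for } \lambda = 1, 2, 3, \\[2mm] -\dfrac{\mathcal{N}}{4} & \text{for } \lambda = 4, \end{cases} \tag{16}$$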

where if and if . We can use tables to represent -local probability distributions, and the total probability distribution is then given as the weighted sum of such tables:

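$$
\begin{aligned}
&\frac{4 + \mathcal{N}}{12} \left(
\left[\begin{array}{c|cccc}
x_A x_B \backslash y_A y_B & -- & -+ & +- & ++ \\ \hline
00 & 1 & 0 & 0 & 0 \\
01 & 0 & 1 & 0 & 0 \\
10 & 1 & 0 & 0 & 0 \\
11 & 0 & 1 & 0 & 0
\end{array}\right]
+
\left[\begin{array}{c|cccc}
x_A x_B \backslash y_A y_B & -- & -+ & +- & ++ \\ \hline
00 & 0 & 0 & 1 & 0 \\
01 & 0 & 0 & 1 & 0 \\
10 & 1 & 0 & 0 & 0 \\
11 & 1 & 0 & 0 & 0
\end{array}\right]
+
\left[\begin{array}{c|cccc}
x_A x_B \backslash y_A y_B & -- & -+ & +- & ++ \\ \hline
00 & 0 & 0 & 0 & 1 \\
01 & 0 & 0 & 0 & 1 \\
10 & 0 & 0 & 0 & 1 \\
11 & 0 & 0 & 0 & 1
\end{array}\right]
\right) \\[2mm]
&\quad - \frac{\mathcal{N}}{4}
\left[\begin{array}{c|cccc}
x_A x_B \backslash y_A y_B & -- & -+ & +- & ++ \\ \hline
00 & 0 & 0 & 1 & 0 \\
01 & 0 & 0 & 0 & 1 \\
10 & 1 & 0 & 0 & 0 \\
11 & 0 & 1 & 0 & 0
\end{array}\right]
= \frac{1}{12}
\left[\begin{array}{c|cccc}
x_A x_B \backslash y_A y_B & -- & -+ & +- & ++ \\ \hline
00 & 4 + \mathcal{N} & 0 & 4 - 2\mathcal{N} & 4 + \mathcal{N} \\
01 & 0 & 4 + \mathcal{N} & 4 + \mathcal{N} & 4 - 2\mathcal{N} \\
10 & 8 - \mathcal{N} & 0 & 0 & 4 + \mathcal{N} \\
11 & 4 + \mathcal{N} & 4 - 2\mathcal{N} & 0 & 4 + \mathcal{N}
\end{array}\right].
\end{aligned}
\tag{17}
$$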

The requirement that the resulting total probability distribution must be valid implies $\mathcal{N} \leq 2$ (so that the entries $4 - 2\mathcal{N}$ remain non-negative), which corresponds to the no-signalling limit. Furthermore, it is easy to check that said distribution indeed gives a value of $\mathcal{N}$ for the negativity witness.

The quasiprobabilistic Bell inequality score for this experiment is $2 + \mathcal{N}$, which, upon substituting equation (16) into the negativity witness, can be seen to saturate the bound. In appendix C we discuss how one can generalise the above to the $n$-measurement scenario.
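The example can be reproduced numerically. The sketch below mixes four deterministic strategies, read off row by row from the tables in equation (17), with the weights of equation (16), then checks validity and the score. The array layout `P[xA, xB, iA, iB]` (outcome index 0 for $-1$, index 1 for $+1$) and the function names are illustrative assumptions.

```python
import numpy as np

# Deterministic strategies for lambda = 1..4: (Alice's outcomes for x_A = 0, 1),
# (Bob's outcomes for x_B = 0, 1), as read off the tables in equation (17).
STRATEGIES = [
    ((-1, -1), (-1, +1)),
    ((+1, -1), (-1, -1)),
    ((+1, +1), (+1, +1)),
    ((+1, -1), (-1, +1)),
]

def weights(N):
    """Quasiprobability weights of equation (16)."""
    return np.array([(4 + N) / 12] * 3 + [-N / 4])

def total_distribution(N):
    """Weighted sum of the deterministic tables: P[xA, xB, iA, iB]."""
    P = np.zeros((2, 2, 2, 2))
    for (a, b), w in zip(STRATEGIES, weights(N)):
        for xA in range(2):
            for xB in range(2):
                P[xA, xB, (a[xA] + 1) // 2, (b[xB] + 1) // 2] += w
    return P

def is_valid(P, tol=1e-12):
    return bool(np.all(P >= -tol) and np.all(P <= 1 + tol))

def chsh_score(P):
    """Left-hand side of inequality (11) computed from the total distribution."""
    y = np.array([-1.0, 1.0])
    E = np.einsum("xzab,a,b->xz", P, y, y)
    return abs(E[0, 0] - E[0, 1] + E[1, 0] + E[1, 1])

for N in (0.0, 2 * np.sqrt(2) - 2, 2.0, 2.5):
    P = total_distribution(N)
    print(N, is_valid(P), chsh_score(P))
```

Choosing $\mathcal{N} = 2\sqrt{2} - 2$ reproduces the Tsirelson score, $\mathcal{N} = 2$ reaches the no-signalling value of 4, and any larger value makes the entries $4 - 2\mathcal{N}$ negative, so the total distribution is no longer valid.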

## 5 Conclusion

We have shown that there exists a relationship between the amount of negativity allowed in a joint hidden-variable distribution and the degree to which said distribution can demonstrate nonlocality in a Bell experiment. In particular, theorem 2 introduces a quasiprobabilistic Bell inequality, which gives us a sharp bound in the scenario of two parties with $n$ inputs (corresponding to a choice between $n$ measurements) and can be used straightforwardly to reconstruct quantum statistics using nothing more than local, separable classical probability distributions and a quasiprobability distribution over them, granted an appropriately well-spent budget of negativity.

Our work sits within the long-established tradition of trying to understand quantum theory through interpretative lenses which remove some particular aspect from a classical worldview. Such approaches are wide and varied, including superdeterminism hossenfelder2020rethinking ; adlam2018quantum ; retro-causality price2012does ; invoking an irreducible role for subjectivity in physics mermin2014physics ; fuchs2014introduction ; mueller2020law ; taking physical reality to consist of interacting, separate realms goldstein2001bohmian ; esfeld2014ontology ; allowing the relativity of pre and post-selection bacciagaluppi2020reverse ; taking Hilbert space to be literal carroll2021reality , and so on. Here we add to this list, in that we present an additional way to re-capture the nonlocal features of quantum theory: through having a finite amount of negativity allowed in a hidden-variable distribution over scenarios which are, in themselves, entirely local and classical. We are not claiming that such quasi distributions are ‘real’ – only, more modestly, that such a perspective could not be ruled out at this stage.

Pursuing this line of reasoning, we would hope that our results may help to determine the fundamental restrictions on a system’s quasiprobability hidden-variable distribution such that it captures the full character of physical correlations. Put another way, we know that zero negativity can capture the set of classical correlations, whilst unbounded negativity can capture the non-signalling set. Given that the set of quantum correlations lies between these two, what are the restrictions on the quasiprobability hidden-variable distribution which would suffice to identify the full set of quantum correlations? We hope to explore this question in further work.

###### Acknowledgements.
We are grateful to Benjamin Yadin, Richard Moles and Thomas Veness for helpful discussions and the Physical Institute for Theoretical Hierarchy (PITH) for encouraging an investigation into this topic. BM acknowledges financial support from the Engineering and Physical Sciences Research Council (EPSRC) under the Doctoral Prize Grant (Grant No. EP/T517902/1). LF acknowledges financial support from the Austrian Science Fund (FWF) through SFB BeyondC (Grant No. F7102). BL is supported by Leverhulme Trust Research Project Grant (RPG-2018-213). D.G. acknowledges support from FQXI (grant no. RFP-IPW-1907)

## Appendix A Proof of theorem 1

###### Theorem.

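Given observers $A$ and $B$, each with measurement choice $x_k \in \{0, 1\}$ with outcomes $y_k \in \{-1, +1\}$, whose systems are distributed according to some quasiprobability distribution $\tilde{P}_\Lambda(\lambda_A, \lambda_B)$, then the quasiprobabilistic Bell inequality holds:

$$\left| E(0_A, 0_B) - E(0_A, 1_B) + E(1_A, 0_B) + E(1_A, 1_B) \right| \leq 2 + \mathcal{N}(\tilde{P}_\Lambda), \tag{18}$$

where

$$\mathcal{N}(\tilde{P}_\Lambda) := \begin{cases} \mathcal{N}_+(\tilde{P}_\Lambda) & \text{if } E(1_A, 0_B) + E(1_A, 1_B) < 0, \\ \mathcal{N}_-(\tilde{P}_\Lambda) & \text{else,} \end{cases} \tag{19}$$

is a negativity witness, and

$$\mathcal{N}_\pm(\tilde{P}_\Lambda) := \sum_{\lambda_A, \lambda_B} \left[ 2 \pm \left( \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{1_B}_{\lambda_B} + \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{0_B}_{\lambda_B} \right) \right] \left( \left| \tilde{P}_\Lambda(\lambda_A, \lambda_B) \right| - \tilde{P}_\Lambda(\lambda_A, \lambda_B) \right).$$
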
###### Proof.

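The first part of the proof follows Bell’s 1971 derivation of the CHSH inequality bell2004speakable. For brevity in the proof we will just write $\tilde{P}_\Lambda(\lambda_A, \lambda_B)$ as $P(\lambda_A, \lambda_B)$.

We start by rewriting the correlation function,

$$
\begin{aligned}
E(x_A, x_B) &:= \sum_{y_A, y_B} y_A y_B \sum_{\lambda_A, \lambda_B} P_A(y_A | x_A, \lambda_A)\, P_B(y_B | x_B, \lambda_B)\, P(\lambda_A, \lambda_B) \\
&= \sum_{\lambda_A, \lambda_B} \langle A \rangle^{x_A}_{\lambda_A} \langle B \rangle^{x_B}_{\lambda_B} P(\lambda_A, \lambda_B),
\end{aligned}
\tag{20}
$$

where $\langle k \rangle^{x_k}_{\lambda_k}$, defined in equation (10), is the $\lambda$-local expectation value for observer $k$ performing measurement $x_k$. Starting with the following difference between correlation functions,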

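$$
\begin{aligned}
E(0_A, 0_B) - E(0_A, 1_B) &= \sum_{\lambda_A, \lambda_B} \left( \langle A \rangle^{0_A}_{\lambda_A} \langle B \rangle^{0_B}_{\lambda_B} - \langle A \rangle^{0_A}_{\lambda_A} \langle B \rangle^{1_B}_{\lambda_B} \right) P(\lambda_A, \lambda_B) \\
&= \sum_{\lambda_A, \lambda_B} \left( \langle A \rangle^{0_A}_{\lambda_A} \langle B \rangle^{0_B}_{\lambda_B} - \langle A \rangle^{0_A}_{\lambda_A} \langle B \rangle^{1_B}_{\lambda_B} \pm \langle A \rangle^{0_A}_{\lambda_A} \langle B \rangle^{0_B}_{\lambda_B} \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{1_B}_{\lambda_B} \mp \langle A \rangle^{0_A}_{\lambda_A} \langle B \rangle^{0_B}_{\lambda_B} \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{1_B}_{\lambda_B} \right) P(\lambda_A, \lambda_B) \\
&= \sum_{\lambda_A, \lambda_B} \langle A \rangle^{0_A}_{\lambda_A} \langle B \rangle^{0_B}_{\lambda_B} \left( 1 \pm \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{1_B}_{\lambda_B} \right) P(\lambda_A, \lambda_B) - \sum_{\lambda_A, \lambda_B} \langle A \rangle^{0_A}_{\lambda_A} \langle B \rangle^{1_B}_{\lambda_B} \left( 1 \pm \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{0_B}_{\lambda_B} \right) P(\lambda_A, \lambda_B),
\end{aligned}
\tag{21}
$$

where the “$\pm$” in equation (21) is to be understood as either “$+$” in all terms or “$-$” in all terms (with “$\mp$” taking the opposite sign). Taking the absolute value of both sides and using the triangle inequality,

$$
\left| E(0_A, 0_B) - E(0_A, 1_B) \right| \leq \left| \sum_{\lambda_A, \lambda_B} \langle A \rangle^{0_A}_{\lambda_A} \langle B \rangle^{0_B}_{\lambda_B} \left( 1 \pm \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{1_B}_{\lambda_B} \right) P(\lambda_A, \lambda_B) \right| + \left| \sum_{\lambda_A, \lambda_B} \langle A \rangle^{0_A}_{\lambda_A} \langle B \rangle^{1_B}_{\lambda_B} \left( 1 \pm \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{0_B}_{\lambda_B} \right) P(\lambda_A, \lambda_B) \right|.
\tag{22}
$$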

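Starting with the first term on the right-hand side of inequality (22), we again apply the triangle inequality,

$$
\begin{aligned}
\left| \sum_{\lambda_A, \lambda_B} \langle A \rangle^{0_A}_{\lambda_A} \langle B \rangle^{0_B}_{\lambda_B} \left( 1 \pm \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{1_B}_{\lambda_B} \right) P(\lambda_A, \lambda_B) \right| &\leq \sum_{\lambda_A, \lambda_B} \left| \langle A \rangle^{0_A}_{\lambda_A} \langle B \rangle^{0_B}_{\lambda_B} \left( 1 \pm \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{1_B}_{\lambda_B} \right) P(\lambda_A, \lambda_B) \right| \\
&= \sum_{\lambda_A, \lambda_B} \left| \langle A \rangle^{0_A}_{\lambda_A} \langle B \rangle^{0_B}_{\lambda_B} \right| \left| \left( 1 \pm \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{1_B}_{\lambda_B} \right) P(\lambda_A, \lambda_B) \right|.
\end{aligned}
\tag{23}
$$

As $\left| \langle A \rangle^{0_A}_{\lambda_A} \langle B \rangle^{0_B}_{\lambda_B} \right| \leq 1$, we can write,

$$
\begin{aligned}
\left| \sum_{\lambda_A, \lambda_B} \langle A \rangle^{0_A}_{\lambda_A} \langle B \rangle^{0_B}_{\lambda_B} \left( 1 \pm \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{1_B}_{\lambda_B} \right) P(\lambda_A, \lambda_B) \right| &\leq \sum_{\lambda_A, \lambda_B} \left| \left( 1 \pm \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{1_B}_{\lambda_B} \right) P(\lambda_A, \lambda_B) \right| \\
&= \sum_{\lambda_A, \lambda_B} \left( 1 \pm \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{1_B}_{\lambda_B} \right) \left| P(\lambda_A, \lambda_B) \right|,
\end{aligned}
\tag{24}
$$

where we have used the fact that $1 \pm \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{1_B}_{\lambda_B}$ is necessarily non-negative because of the choice of eigenvalues, $y_k \in \{-1, +1\}$.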

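Similarly, we find for the second term on the right-hand side of inequality (22)

$$
\left| \sum_{\lambda_A, \lambda_B} \langle A \rangle^{0_A}_{\lambda_A} \langle B \rangle^{1_B}_{\lambda_B} \left( 1 \pm \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{0_B}_{\lambda_B} \right) P(\lambda_A, \lambda_B) \right| \leq \sum_{\lambda_A, \lambda_B} \left( 1 \pm \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{0_B}_{\lambda_B} \right) \left| P(\lambda_A, \lambda_B) \right|.
\tag{25}
$$

By adding inequalities (24) and (25) we find the following upper bound for the left-hand side of inequality (22),

$$
\left| E(0_A, 0_B) - E(0_A, 1_B) \right| \leq \sum_{\lambda_A, \lambda_B} \left[ 2 \pm \left( \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{1_B}_{\lambda_B} + \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{0_B}_{\lambda_B} \right) \right] \left| P(\lambda_A, \lambda_B) \right|.
\tag{26}
$$

So far the proof has followed Bell’s 1971 derivation bell2004speakable of the CHSH inequality. In Bell’s derivation, one assumes that the joint probability distribution is positive, $P(\lambda_A, \lambda_B) \geq 0$, which, using the definition of the correlation function and the triangle inequality, leads to the well-known CHSH inequality, $\left| E(0_A, 0_B) - E(0_A, 1_B) + E(1_A, 0_B) + E(1_A, 1_B) \right| \leq 2$.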

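We have to take another approach because here $P(\lambda_A, \lambda_B)$ can be a quasiprobability distribution and thus take negative values. For each of the two inequalities (26) (corresponding to the choice for “$\pm$”), we define a negativity witness $\mathcal{N}_\pm(P)$ for some normalised distribution $P$ as the difference obtained by replacing $\left| P(\lambda_A, \lambda_B) \right|$ with $P(\lambda_A, \lambda_B)$ in the right-hand side of inequality (26),

$$
\mathcal{N}_\pm(P) := \sum_{\lambda_A, \lambda_B} \left[ 2 \pm \left( \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{1_B}_{\lambda_B} + \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{0_B}_{\lambda_B} \right) \right] \left[ \left| P(\lambda_A, \lambda_B) \right| - P(\lambda_A, \lambda_B) \right].
\tag{27}
$$

Note that although this negativity witness is perfectly valid according to definition 3, it is not faithful because $\mathcal{N}_\pm(P)$ may be zero for $P \in \tilde{\mathcal{P}} \setminus \mathcal{P}$, i.e., it may be zero for a quasiprobability distribution that takes negative values. Nevertheless we can now write inequality (26) as,

$$
\left| E(0_A, 0_B) - E(0_A, 1_B) \right| \leq \sum_{\lambda_A, \lambda_B} \left[ 2 \pm \left( \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{1_B}_{\lambda_B} + \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{0_B}_{\lambda_B} \right) \right] P(\lambda_A, \lambda_B) + \mathcal{N}_\pm(P).
\tag{28}
$$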

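The first term on the right-hand side of inequality (28) can then be simplified using the definition of the correlation function (20) and the fact that $P(\lambda_A, \lambda_B)$ is normalised,

$$
\begin{aligned}
\sum_{\lambda_A, \lambda_B} \left[ 2 \pm \left( \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{1_B}_{\lambda_B} + \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{0_B}_{\lambda_B} \right) \right] P(\lambda_A, \lambda_B) &= \sum_{\lambda_A, \lambda_B} 2 P(\lambda_A, \lambda_B) \pm \sum_{\lambda_A, \lambda_B} \left( \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{1_B}_{\lambda_B} + \langle A \rangle^{1_A}_{\lambda_A} \langle B \rangle^{0_B}_{\lambda_B} \right) P(\lambda_A, \lambda_B) \\
&= 2 \pm \left[ E(1_A, 0_B) + E(1_A, 1_B) \right].
\end{aligned}
\tag{29}
$$

Thus, inequality (28) becomes

$$
\left| E(0_A, 0_B) - E(0_A, 1_B) \right| \leq 2 \pm \left[ E(1_A, 0_B) + E(1_A, 1_B) \right] + \mathcal{N}_\pm(P).
\tag{30}
$$

Now, we choose the inequality corresponding to “$+$” if $E(1_A, 0_B) + E(1_A, 1_B)$ is negative, and the inequality corresponding to “$-$” otherwise. This allows us to write

$$
\left| E(0_A, 0_B) - E(0_A, 1_B) \right| \leq 2 - \left| E(1_A, 0_B) + E(1_A, 1_B) \right| + \mathcal{N}(P),
\tag{31}
$$

where we defined

$$
\mathcal{N}(P) := \begin{cases} \mathcal{N}_+(P) & \text{if } E(1_A, 0_B) + E(1_A, 1_B) < 0, \\ \mathcal{N}_-(P) & \text{else.} \end{cases}
\tag{32}
$$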

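From inequality (31), we obtain

$$
\left| E(0_A, 0_B) - E(0_A, 1_B) \right| + \left| E(1_A, 0_B) + E(1_A, 1_B) \right| \leq 2 + \mathcal{N}(P),
\tag{33}
$$

and with one final use of the triangle inequality we find a CHSH-type inequality for arbitrary $P(\lambda_A, \lambda_B)$,

$$
\left| E(0_A, 0_B) - E(0_A, 1_B) + E(1_A, 0_B) + E(1_A, 1_B) \right| \leq 2 + \mathcal{N}(P),
\tag{34}
$$

completing the proof.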

## Appendix B Proof of theorem 2

###### Theorem.

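Given observers $A$ and $B$, each with $n$ measurements $x_k \in \{0, \ldots, n-1\}$ with outcomes $y_k \in \{-1, +1\}$, whose systems are distributed according to some quasiprobability distribution $\tilde{P}_\Lambda(\lambda_A, \lambda_B)$,

$$\left| \sum_{i=0}^{n-1} E(i_A, i_B) + \sum_{i=1}^{n-1} E(i_A, (i-1)_B) - E(0_A, (n-1)_B) \right| \leq 2n - 2 + \mathcal{N}_n(\tilde{P}_\Lambda), \tag{35}$$

where $\mathcal{N}_n(\tilde{P}_\Lambda)$ is a negativity witness with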

 N(x)(~PΛ)\coloneq{N(x)+(~PΛ) if E