I Introduction
The problem of allocating indivisible commodities, that is, resources that cannot be divided into multiple parts, such as houses and people, has been studied for a long time [25, 29, 2, 7]. One of the best-known studies is the Top Trading Cycle (TTC) proposed by Shapley and Scarf [25]. TTC is a deterministic allocation method that deals with situations in which players have deterministic preference rankings over their options. From a game-theoretic perspective, TTC achieves what is called the core, namely, a situation in which exchanging options among arbitrary players does not lead to a more preference-satisfying allocation.
In the previous study, we extended the preference from a deterministic to a probabilistic one and mathematically discussed how a joint selection that satisfies players’ probabilistic preferences should be made [27]. In uncertain situations, people in the real world or agents in reinforcement learning are often torn between the desire to choose the best current option and the desire to explore other options [21]. In such situations, they will not be satisfied with obtaining only the top preference option all the time. Instead, they will be satisfied if the proportion of options obtained through multiple allocations matches their probabilistic preferences. Specifically, let $p_{i,j}$ be the joint probability of assigning the $i$th choice to the first player A and the $j$th choice to the second player B, and let the matrix of these joint probabilities be called the joint selection probability matrix. By definition, the probability $\pi_A(i)$ of the first player choosing option $i$ as a result of the joint selection probability matrix can be calculated by
$$\pi_A(i) = \sum_{j=1}^{N} p_{i,j}, \tag{1}$$
where $N$ is the number of options. For the allocation to satisfy player A’s preference, the list of $\pi_A(i)$ should coincide with his/her probabilistic preference.
In Ref. [27], a joint selection probability matrix that maximizes the satisfaction of the probabilistic preferences of two players is mathematically derived. However, as we will confirm later, two concerns exist. First, the computational cost of obtaining the optimal joint selection probability matrix is in the worst case if we follow the algorithm presented in the paper, which implies that computing the matrix becomes more difficult as the number of options becomes huge. The second problem is that of confidentiality. Since the construction of the optimal joint selection probability matrix requires information on the preferences of both players, each player must disclose their preference to the other or third parties.
Now, when we encounter situations of collective decisions without choice conflict, it is not always necessary to explicitly calculate the values of the joint selection probability matrix. Instead, employing a sampling method that converges to those values over many repetitions is often sufficient. Although an efficient sampling method that is based on the optimal joint selection probability matrix has not been established yet, this paper proposes several sampling methods each of which converges to a heuristic joint selection probability matrix over many repetitions. In particular, we demonstrate that the application of quantum systems can significantly reduce the two problems mentioned above, that is, the explosion of computational cost and the lack of confidentiality.
Optical computing, which flourished around the 1980s [32], lost momentum for a time because of rapid advances in electronic technology, but it is now drawing attention again as AI and other applications increase the demand for computational resources [16, 26, 12]. It is now being considered for a wide range of applications, including deep learning and computational science, taking advantage of not only the high-speed and broadband nature of light but also its quantum nature [18, 30, 24]. For example, many combinatorial optimization problems are NP-hard, and thus it becomes difficult for a digital computer to solve them as the size of the problem increases. However, for some types of problems, even huge combinatorial optimization problems can be solved within a short time by mapping them to corresponding Ising models [5, 19] and solving the models using Ising machines, for example, with networks of optical parametric oscillators [22, 8]. Optically implemented Ising machines have advantages over other types of Ising machines, including their ability to work at room temperature [20] and their high efficiency due to light’s broad bandwidth [14]. While Ising models are often used to solve combinatorial problems such as the Max-Cut problem and the traveling salesman problem, other photonic implementations can be considered for different types of problems. Among them, recent attempts have been made to utilize the quantum nature of light to solve a decision-making problem called the multi-armed bandit problem [28].
The multi-armed bandit problem is one of the simplest reinforcement learning problems, and it asks how decisions should be made in uncertain situations. Specifically, given multiple slot machines, each with its own probability of generating a reward, the question is how to maximize the cumulative reward by drawing one of these machines at each time step. Since the player does not know the hit probabilities a priori, one efficient approach is to make decisions based on a probabilistic preference so that he/she can both exploit the current best option and explore other options.
The problem is further complicated when multiple players participate in the bandit problem [6, 17, 15]. In the competitive bandit problem, when multiple players draw the same machine and that machine generates a reward, the reward is split and distributed among them. In such a situation, if we consider the expected value of the total reward, we can see that it is always better if the players’ choices do not overlap.
What typically happens in the competitive bandit problem is that if each player draws a machine according to only his/her own selection probability, selection conflicts will occur frequently and the final cumulative rewards will be reduced. However, the quantum nature of light can be used to link individual reward maximization with total reward maximization. Chauvet et al. devised a system that uses entanglement of polarization to prevent selection conflicts without direct communication between the players, and experimentally demonstrated the effectiveness of this system in tackling the competitive multi-armed bandit problem [11, 10].
The advantage of utilizing the quantum nature of light here is twofold. The first is that it enables the players to conduct probabilistic decision-making through the observation of polarized light. Specifically, the stochasticity associated with the observation of polarization can be linked to probabilistic decision-making by mapping the choice of the machine to the polarization observed in a way that if the photon is detected by an avalanche photodiode corresponding to the horizontally polarized light, the player will select the first machine and vice versa [23]. The second advantage of using the quantum nature is that entanglement guarantees that the players’ choices never overlap. The team reward will not be diminished thanks to the non-conflicting decisions by the players [11]. Furthermore, Amakasu et al. theoretically showed that conflict-free collective decision-making is possible over an arbitrary number of choices by employing the orbital angular momentum of light to overcome the limit on the number of choices in polarization-based approaches [4]. Orbital angular momentum is another degree of freedom associated with photons. It carries a theoretically infinite number of states and is widely utilized in applications such as optical communications [3, 33].
Those previous studies aimed to maximize total cumulative rewards by making conflict-free decisions. In contrast, the present research, as well as the related former work [27], excludes external factors such as rewards. Instead, the focus is on how to maximize preference satisfaction; that is, how well each player’s preference is reflected in the joint decision.
In this study, we demonstrate that quantum systems can be utilized in the preference satisfaction problem. In Sec. II.1, we first review the problem settings of probabilistic preference satisfaction. In the subsequent Sec. II.2, we review theorems about the optimal joint selection matrix and clarify the issues related to the construction of the optimal joint selection matrix. After that, we propose and demonstrate sampling methods that converge to heuristic joint selection probability matrices. In particular, Sec. III covers two sampling methods through quantum interference, each of which is analyzed in detail in terms of implementation, computational cost, confidentiality, and the joint selection probability matrix it converges to. In Sec. IV, we compare the losses (defined in Sec. II) of the joint selection probability matrices to which the sampling methods converge through numerical calculations. Finally, Sec. V provides a summary of this research and future perspectives.
II Preference satisfaction by conflict-free joint decisions
II.1 Problem settings
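In this section, we review the problem settings of the conflict-free probabilistic preference satisfaction proposed in Ref. [27]. Suppose that two players, player A and player B, have probabilistic preferences over $N$ options. Let $A_i$ be player A’s preference for option $i$ and $B_j$ be player B’s preference for option $j$. Since $A_i$ and $B_j$ are probabilities, the following constraints are satisfied:
$$\sum_{i=1}^{N} A_i = 1, \quad A_i \geq 0, \tag{2}$$
$$\sum_{j=1}^{N} B_j = 1, \quad B_j \geq 0. \tag{3}$$
Let $p_{i,j}$ be the probability of player A choosing option $i$ and player B choosing option $j$ as a result of collective decision-making. $p_{i,j}$ must satisfy the following conditions:
$$\sum_{i=1}^{N} \sum_{j=1}^{N} p_{i,j} = 1, \quad p_{i,j} \geq 0. \tag{4}$$
Then, we define the joint selection probability matrix $P$ such that the $(i, j)$ element of $P$ is $p_{i,j}$:
$$P = \begin{pmatrix} 0 & p_{1,2} & \cdots & p_{1,N} \\ p_{2,1} & 0 & \cdots & p_{2,N} \\ \vdots & \vdots & \ddots & \vdots \\ p_{N,1} & p_{N,2} & \cdots & 0 \end{pmatrix}. \tag{5}$$
The diagonal elements are all zero because we deal with collective decision-making without choice conflict.
The property we demand for $P$ is the following. The probability that player A can choose option $i$ as a result of $P$ is obtained by summing $p_{i,j}$ over columns $j$:
$$\pi_A(i) = \sum_{j=1}^{N} p_{i,j}. \tag{6}$$
We call $\pi_A(i)$ the satisfied preference, and if this value is consistent with player A’s original preference $A_i$, it means that the preference is satisfied for option $i$. Similarly, we can obtain the satisfied preference for player B by summing $p_{i,j}$ over rows $i$:
$$\pi_B(j) = \sum_{i=1}^{N} p_{i,j}. \tag{7}$$
Our goal is to determine the $p_{i,j}$ that make the satisfied preferences as close as possible to the original preferences of both players for all options. In other words, our objective is to find $p_{i,j}$ so that they realize
$$\pi_A(i) \approx A_i, \quad \pi_B(j) \approx B_j \tag{8}$$
for all $i$ and $j$.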
Figure 1 schematically illustrates the problem settings. Players A and B have probabilistic preferences over options when , where in this case,
(9)  
(10) 
Then, an algorithm calculates a joint selection probability matrix . A decent algorithm should output a matrix in such a way that the satisfied preferences match the players’ preferences. For example, if we take the sum of the second row of the joint selection probability matrix in the red shaded area, we should get a value close to player A’s preference towards the second option. Similarly, if we take the sum of the third column in the green shaded area, we should get a value close to player B’s preference towards the third option. An example of a joint selection probability matrix that satisfies the players’ preferences perfectly is
(11) 
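Finally, to quantify the degree of preference satisfaction, we define the degree of deviation between the satisfied preferences and the original preferences as the loss. Writing $\pi_A(i)$ and $\pi_B(j)$ for the satisfied preferences and $A_i$ and $B_j$ for the original preferences of players A and B, the loss $L$ is defined in a manner analogous to the $L_2$ norm as follows:
$$L = \sum_{i=1}^{N} \left( \pi_A(i) - A_i \right)^2 + \sum_{j=1}^{N} \left( \pi_B(j) - B_j \right)^2, \tag{12}$$
which is composed of the sum of squares of the gaps between the satisfied preferences and the original preferences. The smaller the loss is, the more successfully the preferences are satisfied, and when the loss is zero, the players’ preferences are perfectly satisfied.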
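As a minimal numerical sketch of these definitions (not part of the original formulation), the following Python snippet computes the satisfied preferences as row and column sums of a joint selection probability matrix and the loss as the sum of squared gaps. The function names, the use of NumPy, and the three-option preferences are illustrative assumptions; in particular, adding the squared gaps of both players into a single value is one reading of the definition of the loss.

```python
import numpy as np


def satisfied_preferences(P):
    """Return (player A's, player B's) satisfied preferences.

    Player A's satisfied preference for option i is the sum of row i,
    player B's satisfied preference for option j is the sum of column j.
    """
    P = np.asarray(P, dtype=float)
    return P.sum(axis=1), P.sum(axis=0)


def loss(P, pref_a, pref_b):
    """Sum of squared gaps between satisfied and original preferences.

    Assumption: the gaps of both players are added into one number.
    """
    pi_a, pi_b = satisfied_preferences(P)
    gap_a = pi_a - np.asarray(pref_a, dtype=float)
    gap_b = pi_b - np.asarray(pref_b, dtype=float)
    return float(np.sum(gap_a ** 2) + np.sum(gap_b ** 2))


# Hypothetical three-option example (values chosen for illustration only).
pref_a = np.array([0.5, 0.3, 0.2])
pref_b = np.array([0.2, 0.3, 0.5])
P = np.array([
    [0.0, 0.2, 0.3],
    [0.1, 0.0, 0.2],
    [0.1, 0.1, 0.0],
])  # zero diagonal: the two players never pick the same option

pi_a, pi_b = satisfied_preferences(P)
example_loss = loss(P, pref_a, pref_b)
```

In this hypothetical example the row sums reproduce `pref_a` and the column sums reproduce `pref_b`, so the loss vanishes up to floating-point error.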
II.2 Optimal joint selection probability matrix
In this section, we review principal theorems on the optimal joint selection probability matrix from [27] and clarify two problems associated with them. We define a score called the popularity, which represents how much each option is favored by the players.
Definition II.1
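The popularity $S_i$ is defined as the sum of the preferences of player A and player B for option $i$, denoted $A_i$ and $B_i$, respectively:
$$S_i = A_i + B_i. \tag{13}$$
Since the preferences $A_i$ and $B_i$ are probabilities, it holds that
$$\sum_{i=1}^{N} S_i = 2. \tag{14}$$
Two theorems have been found regarding the popularity $S_i$.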
Theorem II.1
Assume that all the popularities are smaller than or equal to 1. Then, it is possible to construct a joint selection probability matrix that makes the loss equal to zero.
(15) 
Theorem II.2
If any of the popularities is greater than 1, it is not possible to make the loss equal to zero.
In a case when the th option is the most popular, that is, , the minimum loss is
(16) 
The following joint selection probability matrix is one of the matrices that minimize the loss.
(17)  
(18) 
The existence of the optimal joint selection probability matrices and their specific construction methods were presented for both cases where the maximum popularity is less than or equal to one, and where it is greater than one.
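The condition that separates the two theorems can be checked directly from the preferences. The following is a minimal sketch, assuming the preferences are given as sequences of probabilities; the function names and the example values are hypothetical. The snippet only tests the condition on the popularities and does not construct the optimal matrix itself.

```python
import numpy as np


def popularity(pref_a, pref_b):
    """Popularity of each option: sum of the two players' preferences."""
    return np.asarray(pref_a, dtype=float) + np.asarray(pref_b, dtype=float)


def zero_loss_achievable(pref_a, pref_b, tol=1e-12):
    """True if every popularity is at most 1 (the case of Theorem II.1)."""
    return bool(np.all(popularity(pref_a, pref_b) <= 1.0 + tol))


# Hypothetical preferences, chosen only to illustrate the two cases.
similar_case = zero_loss_achievable([0.5, 0.3, 0.2], [0.2, 0.3, 0.5])
crowded_case = zero_loss_achievable([0.7, 0.2, 0.1], [0.6, 0.3, 0.1])
total_popularity = popularity([0.7, 0.2, 0.1], [0.6, 0.3, 0.1]).sum()
```

In the second hypothetical case the first option has popularity 1.3, so by Theorem II.2 a zero loss is out of reach no matter how the matrix is built.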
However, there are two concerns that need to be resolved. The first is the computational cost of constructing the optimal joint selection probability matrix in Theorem II.1. To fill in one row and column of the matrix, we need to:

Determine the maximum and minimum values of the popularities.

Fill in at most elements.
Determining the maximum and minimum values requires a computational cost of each, and filling in at most elements requires a computational cost of because every time each element is filled we need one subtraction. Therefore, it requires to fill all the rows and columns in the joint selection probability matrix. Thus, when becomes huge, it is difficult to compute the optimal joint selection probability matrix.
The second concern, common to the construction of the optimal joint selection probability matrix in both Theorems II.1 and II.2, is the lack of confidentiality. In both cases, constructing the optimal joint selection probability matrix requires both players’ preferences. This means that the players must disclose their preferences to each other or to a third party. In the real world, the necessity of preference disclosure is undesirable if the players do not know each other or have no means of communication. For example, few people are willing to reveal their preferences on sensitive matters to someone they do not know. Even if they are, they may not have a way to connect.
Here, we consider how to deal with these two problems. Practically, if we consider conflict-free collective decision-making, we do not necessarily need to calculate the values of the joint selection probability matrix explicitly. Instead, it will be enough if there is a sampling method that converges to that matrix over repeated draws. In this study, we demonstrate two quantum sampling methods that converge to heuristic joint selection probability matrices with relatively small losses and analyze how each of them deals with the above problems. In particular, we show how quantum interference effects can realize conflict-free joint sampling while highly satisfying individual preference profiles and resolving the confidentiality issue. It should be emphasized that physical processes, not computers, play the role of establishing conflict-free joint sampling while taking account of individual preferences. The computing cost is replaced by the physical nature of light. We have not yet devised a sampling algorithm that always converges to the optimal joint selection probability matrix. However, as demonstrated later in Sec. IV, one of the proposed quantum sampling methods achieves performance almost comparable to the optimal case under certain conditions.
III Joint sampling methods through quantum interference
In this section, we propose the following two sampling methods, each of which converges to a joint selection probability matrix with relatively small loss:

1. Pure Hong-Ou-Mandel (Pure HOM)

2. Orbital Angular Momentum Attenuation (OAM Attenuation)
Both of them employ the orbital angular momentum (OAM) of light, which is a degree of freedom that consists of a theoretically infinite number of states, and they are relatively easy to implement using basic equipment such as spatial light modulators and beam splitters [4].
As introduced in Sec. II, there were two problems in constructing the optimal joint selection probability matrix: high computational cost and low confidentiality. For each of the above sampling methods, we analyze the following four features:

Implementation

Computational cost

Confidentiality

Joint selection probability matrix that it converges to
III.1 Pure Hong-Ou-Mandel
III.1.1 Implementation
The method we call “Pure Hong-Ou-Mandel (Pure HOM)” employs a system based on the Hong-Ou-Mandel effect involving quantum interference of the orbital angular momentum of photons [13, 9]. Figure 2 schematically illustrates the quantum system we use to make conflict-free joint decisions.
First, a photon pair is created by a two-photon generator and split into two paths by a beam splitter. At this point, neither photon carries orbital angular momentum. Orbital angular momentum states can be induced by a spatial light modulator (SLM), which displays computer-generated holograms on its surface [34, 31].
In the proposed system, player A controls SLM1 to encode his/her probabilistic preference onto a photon. Since SLMs can manipulate both the amplitude and phase terms of OAM states, the resulting photon can be described by
(19) 
Similarly, player B controls SLM2 to encode his/her preference to the other photon:
(20) 
Then, the photon pair is simultaneously injected into a beam splitter, where the Hong-Ou-Mandel effect occurs. The output OAM states are the tensor product of the OAM states of the two injected photons:
(21) 
As shown in Fig. 2, photodetectors are placed immediately after the beam splitter. The probability of detecting on side A and on side B is
(22) 
(23) 
Substituting verifies that the same absolute values of the OAM number will never be observed on sides A and B, whatever parameters the players use for the input states.
Now, in order to realize joint sampling, the observed OAM number is mapped to the index of choice. For example, if the OAM of is observed on side A, player A will select the first option, and if is observed on side B, player B will select the second option. Selection conflicts between the two players will never happen under this rule, because the absolute values of the observed OAM numbers on sides A and B always differ thanks to the Hong-Ou-Mandel effect. Therefore, with Pure HOM realized by the system in Fig. 2, collective decision-making without selection conflicts can be achieved.
Note that there are cases where two photons come out on the same side. For example, according to Eq. (21), the probability of both photons, whose OAM states are respectively and , coming out on side A is
(24) 
In such cases, we discard the photon pair and regenerate a new one.
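The qualitative behavior described above (no coincidence on the same mode, and renormalization after discarding pairs that leave on the same side) can be illustrated with a generic toy model of two-photon interference at a balanced beam splitter. The sketch below is an assumption-laden illustration, not the expression in Eqs. (22) and (23): it uses the textbook antisymmetric coincidence amplitude $(a_i b_j - a_j b_i)/2$ for single-photon inputs $\sum_i a_i |i\rangle$ and $\sum_j b_j |j\rangle$, indexes modes directly by option, and ignores the sign change of the OAM number upon reflection. The function name and the input amplitudes are hypothetical.

```python
import numpy as np


def hom_coincidence_matrix(amp_a, amp_b):
    """Toy model of conflict-free sampling by two-photon interference.

    amp_a, amp_b: complex mode amplitudes of the two input photons.
    Returns (P, coincidence), where P[i, j] is the probability of
    detecting mode i on side A and mode j on side B, conditioned on the
    photons leaving on different sides, and coincidence is the
    probability of that event.
    """
    a = np.asarray(amp_a, dtype=complex)
    b = np.asarray(amp_b, dtype=complex)
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    # Antisymmetric amplitude: vanishes for i == j (no selection conflict).
    amplitude = (np.outer(a, b) - np.outer(b, a)) / 2.0
    raw = np.abs(amplitude) ** 2
    coincidence = raw.sum()
    if np.isclose(coincidence, 0.0):
        raise ValueError("identical input states: the photons always bunch")
    # Discarding same-side events corresponds to renormalizing.
    return raw / coincidence, coincidence


# Hypothetical inputs: equal moduli, different phases (illustration only).
phases = np.exp(1j * np.array([0.0, 2.0 * np.pi / 3.0, 4.0 * np.pi / 3.0]))
state_a = np.sqrt([0.5, 0.3, 0.2])
state_b = np.sqrt([0.5, 0.3, 0.2]) * phases
P_hom, coincidence_rate = hom_coincidence_matrix(state_a, state_b)
```

Within this toy model the resulting matrix has a zero diagonal and is symmetric for any choice of amplitudes and phases, and the fraction of retained pairs equals $(1 - |\langle a | b \rangle|^2)/2$.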
III.1.2 Computational cost
III.1.3 Confidentiality
In this study, we assume a situation where each player adjusts the SLM using only his/her own preference; that is, player A cannot take player B’s preference into account when determining his/her own input parameters, and vice versa. Under this assumption, neither player is required to disclose their probabilistic preference to the other or to a third party. Even though player A does not know player B’s preference, he/she must make an assumption about it to determine his/her input parameters. For now, we let player A assume that player B has the same preference and amplitude terms. This is a reasonable assumption when preferences are similar, but does not hold in general. Specifically, given his/her amplitude terms, player A can compute the joint selection probability matrix from which the output OAM states are sampled, so he/she can optimize the amplitude terms numerically by viewing the loss of the joint selection probability matrix as a function of them. The optimization of the amplitude terms is conducted numerically using the SLSQP optimizer.
Moreover, the Hong-Ou-Mandel effect allows the players to avoid conflicts without having to inform the other party of which option they have selected. Therefore, with Pure HOM, the players’ preferences and their choices are highly secure. Also, there is no need to trust a third party since the whole procedure can be carried out between the two players. As we will see later, this property is unique to Pure HOM and cannot be achieved by the other quantum sampling method we propose in this paper.
III.1.4 Joint selection probability matrix
By using Eq. (22), we can calculate the probability of player A selecting option and player B selecting option as
(25)  
(26) 
This joint probability depends on the input parameters, and this section analyzes the characteristics of the joint selection probability matrix that consists of and discusses what parameters the players should use. Note that we assume that each player controls the SLM using only his/her own preference.
We can confirm from Eqs. (25), (26) that the joint selection probability matrix is symmetric for any input parameters , and . This means that the satisfied preferences of players A and B on option , that is, and , are always equal. Thus, when players A and B have similar preferences, Pure HOM is likely to result in a low loss, but it is expected to work very poorly when they have reversed preferences.
Regarding situations where the players have the same preferences, the following theorem holds.
Theorem III.1
When there are three options, if the players have the same preferences and all the popularities are less than 1, by setting the amplitude terms as follows:
(27) 
the resulting joint selection probability matrix achieves the theoretical minimum loss.
Proof III.1
When , Eq. (26) can be rewritten as
(28) 
Now, let be the “unnormalized satisfied preference,” which is defined by
(29) 
Also, the unnormalized satisfied preference for player B is defined by
(30) 
With the relation Eq. (25), it follows that
(31) 
Using the amplitude terms described in Eq. (27), we get
(32)  
(33)  
(34)  
(35) 
Here,
(36) 
Similarly, it follows that
(37) 
Normalizing leads to
(38) 
Therefore, the loss for player A is zero, and the same argument can apply to player B, which results in
(39) 
This property cannot be realized by OAM Attenuation, which will be explained in the subsequent Sec. III.2, or by the other simple sampling methods shown in Sec. IV, highlighting the importance of Pure HOM. Furthermore, in Sec. IV, we present the results of numerical simulations that show the near-optimality of Pure HOM under less restricted conditions, including when the number of options is more than three and when one of the popularities is greater than 1. There, Pure HOM is found to be quite effective as long as the players have the same preferences.
III.2 Orbital Angular Momentum Attenuation
III.2.1 Implementation
This method, which we call “Orbital Angular Momentum Attenuation (OAM Attenuation),” also utilizes quantum interference of orbital angular momentum. The quantum system considered here was proposed by Amakasu et al. [4], although for a different purpose. As in Pure HOM, we perform probabilistic decision-making by mapping the observed OAM number to the index of choice. We use the system shown in Fig. 3, and the whole system works in the following way.
After a photon pair is generated by a two-photon generator, it is split into two paths by a beam splitter, and the OAM states of the photons are adjusted by SLMs as follows:
(40) 
In [4], only phase modulations are applied, and we follow this setup in OAM Attenuation. This corresponds with the situation where we set
(41) 
in Eqs. (19) and (20). As a result, the probability of detecting on side A and on side B is
(42)  
(43) 
Here again, we can confirm that the observation probability is always zero when . Also, there are cases where two photons come to the same side. Then, we discard the photon pair and regenerate a new one.
After the Hong-Ou-Mandel effect, each photon is sent to an attenuation system owned by each player, as shown in Fig. 3. There, the photon is divided into paths by beam splitters, and in each path, a phase factor of is added to the state , which changes the state to , by a hologram. Then, the probability amplitude is reduced by an attenuator, and only an photon is filtered through by a single-mode optical fiber. If the photon is detected by a photodetector placed in the same line as the hologram with the phase factor , the OAM of the incoming photon is revealed to be .
If there are options, holograms whose phase factors are respectively , are used. For example, when the number of options is three, three holograms that transform to , and , respectively, are placed.
Now, in the th path, an attenuator with the attenuation rate is placed after the hologram and before the single-mode optical fiber. In the end, if the input photon of the attenuation system carries the same probability amplitude for each OAM, the detection rate of each OAM is denoted by
(44) 
The sum over all is not equal to unity because some photons are lost by the attenuators and the fibers.
The detected OAM number is mapped to the index of the option. As a result, the selection probability of the th machine is as a result of the attenuation system.
In this study, the part of the system that generates photon pairs using the Hong-Ou-Mandel effect is considered as the “source,” and we assume that this is controlled by a neutral third party with no knowledge of the players’ preferences. If only one of the players had the authority to manipulate the source, that player would be able to adjust its input parameters so that some selection pairs occur more often at the source than other pairs. The generated photons are then injected into the attenuation systems shown in Fig. 3, which are controlled by players A and B, respectively.
Each player embeds information about his/her own probabilistic preference in the attenuators in his/her system. Specifically, player A adjusts the attenuation rates of the attenuators in his/her system to
(45) 
Similarly, player B sets
(46) 
Each player chooses the option whose index is equal to the OAM number they have observed.
III.2.2 Computational cost
One remarkable aspect of this system is that it requires zero computational cost since each player only needs to set the attenuation rates. As we will see later, the computational cost of independently computing the effective joint selection probability matrix that this sampling method naturally converges to would be . However, no computation is needed if we only sample from it with the quantum system. Note that, even though the players do not need to calculate anything, they sometimes have to repeat the observation process because photon pairs are lost probabilistically for two reasons. The first is the loss at the source level: we ignore cases where two photons come out on the same side of the beam splitter at the source. The second is the absorption by the attenuators and the filtering through the optical fiber: the players can detect only photons that pass through the attenuators and the single-mode optical fiber.
III.2.3 Confidentiality
Since each player sets only his/her own attenuators, there is no need to disclose their preferences. In addition, thanks to the HOM effect, selection conflicts do not occur in principle, and thus conflict avoidance is achieved without each party having to communicate their own selection to the other. However, they need to have a way to make sure that both of them detected a photon because each of them alone cannot discriminate the following two situations.

Both players detected a photon, which means that the joint selection is valid.

One of the players detected a photon, but the other photon is absorbed by the attenuator, which means that the joint selection is invalid.
If they have a direct connection, they can just ignore the cases where one of them does not detect a photon. If they do not have a direct connection, a third person or system that executes the joint decision only when both players send their choices is needed. Another pitfall is that the joint selection probability matrix to which this sampling method converges depends not only on the players’ preferences, but also on the phase settings of the source part, as we will examine in the next section. Thus, the players have to trust that the third party managing the source determines the phase from a fair distribution.
III.2.4 Joint selection probability matrix
When there are options and the input states of the source are set as in Eq. (40), the general formula of the elements in the joint selection probability matrix is denoted by
(47)  
(48)  
(49) 
Although the values of the element in the joint selection probability matrix depend on the source’s phase settings as described above, this sampling method converges to the following joint selection probabilities over many repetitions, assuming that the source determines the value of uniformly at random:
(50)  
(51) 
This is actually the same joint probability matrix that is introduced as “Simultaneous Renormalization” in [27]. If this joint selection probability matrix were to be computed on a computer, it would require a computational cost of . However, if the sampling is carried out by the quantum system described in Fig. 3, as explained in Sec. III.2.2, the cost is 0. Although the quantum system guarantees low computational cost and high confidentiality, OAM Attenuation generally cannot achieve the optimal loss, as will be discussed in the following section.
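To make the classical cost concrete, the following sketch builds a limiting matrix on a computer under one explicit assumption: that the matrix is proportional to the product of the two players’ preferences off the diagonal and zero on it. This is what one would expect if the source emits every conflict-free pair equally often and each attenuation system passes option $i$ with a probability proportional to its owner’s preference, but it is an assumption of this illustration and should be checked against Eqs. (50) and (51). The function name and the example preferences are hypothetical.

```python
import numpy as np


def product_renormalized_matrix(pref_a, pref_b):
    """Assumed limiting matrix: p[i, j] proportional to A_i * B_j for i != j.

    Building the full matrix touches every (i, j) pair, so the classical
    cost grows quadratically with the number of options.
    """
    a = np.asarray(pref_a, dtype=float)
    b = np.asarray(pref_b, dtype=float)
    P = np.outer(a, b)          # independent choices of the two players
    np.fill_diagonal(P, 0.0)    # remove selection conflicts
    total = P.sum()
    if total <= 0.0:
        raise ValueError("no conflict-free pair has positive probability")
    return P / total            # renormalize both players simultaneously


# Hypothetical example with identical preferences.
pref = np.array([0.5, 0.3, 0.2])
P_attn = product_renormalized_matrix(pref, pref)
satisfied_a = P_attn.sum(axis=1)  # player A's satisfied preferences
```

For these hypothetical preferences the satisfied preferences differ from `pref` (the most preferred option is undersampled), which is consistent with the statement that this method generally does not reach the optimal loss.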
IV Performance comparison
IV.1 Objectives
In this section, we compare the losses under various preference settings to clarify the extent to which the joint selection probability matrices of the proposed methods approximate the optimal joint selection probability matrix. A total of five models were compared. The first two models were introduced in Sec. III, that is, Pure HOM and OAM Attenuation. The other models are Random Order, Uniform Random and the optimal joint selection probability matrix. Details of Random Order are explained in the subsequent Sec. IV.2. Uniform Random is a method that samples cases with no selection conflicts with equal probabilities. As Eqs. (50), (51) imply, the joint selection probability matrix that OAM Attenuation converges to is the same as the one introduced as “Simultaneous Renormalization” in Ref. [27], albeit without considering the properties of a physical implementation, such as the absorption of photons. Since we have already compared Uniform Random, Random Order, Simultaneous Renormalization, and the optimal joint selection probability matrix in the previous study, this section focuses primarily on the performance of Pure HOM. To be clear about the contribution of this paper to OAM Attenuation, it is the analysis of the implementation method, computational cost, confidentiality, and properties of the joint selection probability matrix it converges to, as presented in Sec. III.2, and this section uses it to compare with Pure HOM.
IV.2 Random Order
One straightforward and classical method to realize conflict-free decision-making is what we call “Random Order.” It is similar to the random priority mechanism proposed by Abdulkadiroğlu et al. [1], except that it takes into account probabilistic preferences. A major advantage is its simplicity. First, the players decide uniformly at random the order in which they choose options. Then, the first player makes a probabilistic decision according to his/her preference. To avoid a choice conflict, the first player notifies the second player which option has already been chosen. The second player then sets the preference for the already-selected option to zero and normalizes his/her remaining preferences so that they sum to 1. After that, the second player makes a probabilistic choice based on the renormalized preference. Note that this method can fail when there is a zero preference. For example, consider an extreme case where the number of options is two and the players’ preferences are , and . If player A is chosen to go first and chooses option 1, player B will have no option with a positive preference to select, and the algorithm stops.
This simple algorithm also mitigates the two problems of the optimal matrix. Regarding the first problem, the computational cost, the first player does not need to perform any calculation. The second player, however, needs to renormalize his/her preference after setting the preference of the already-selected option to 0, which costs $O(N)$ for $N$ options.
As for the second problem, confidentiality, neither player is required to disclose his/her probabilistic preference directly, but the first player has to tell the second player which option he/she has chosen to avoid a decision conflict. Over many trials, these announcements indirectly expose the players' preference profiles. This requirement is also undesirable when the two players do not trust each other or have limited means of communication.
Finally, iterating this sampling method leads to convergence to a certain joint selection probability matrix. Writing $A_i$ and $B_j$ for the preferences of players A and B for options $i$ and $j$, and $p_{i,j}$ for the probability that player A selects option $i$ and player B selects option $j$, its elements for $i \neq j$ can be expressed as follows:
$$p_{i,j} = \frac{1}{2}\, A_i\, \frac{B_j}{1 - B_i} + \frac{1}{2}\, B_j\, \frac{A_i}{1 - A_j}, \tag{52}$$
with $p_{i,i} = 0$. The first term corresponds to the probability of player A selecting option $i$ and player B selecting option $j$ under the condition that player A draws first, and the second term corresponds to the same probability under the condition that player B draws first. We can easily confirm that these joint probabilities give the optimal loss when the number of options is two. However, in more general cases where $N \geq 3$, the joint selection probability matrix of Random Order cannot achieve the optimal loss, except in special circumstances, such as when all the preferences are equal.
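The two contributions described above (player A drawing first and player B drawing first, each with probability 1/2) can be tabulated directly. The sketch below is only an illustration: the names `random_order_matrix` and `marginals` are hypothetical, preferences are assumed to be plain Python lists that sum to 1, and a second player whose entire preference sits on the already-taken option is assumed to contribute zero probability.

```python
def random_order_matrix(pref_a, pref_b):
    """Limiting joint selection probability matrix of Random Order."""
    n = len(pref_a)

    def second_pick(pref, taken, wanted):
        # Probability that the second player picks `wanted` after the
        # option `taken` is zeroed out and the rest is renormalized.
        rest = 1.0 - pref[taken]
        return pref[wanted] / rest if rest > 0.0 else 0.0

    p = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                a_first = pref_a[i] * second_pick(pref_b, i, j)
                b_first = pref_b[j] * second_pick(pref_a, j, i)
                p[i][j] = 0.5 * (a_first + b_first)
    return p


def marginals(p):
    """Selection probabilities of each option for player A and player B."""
    n = len(p)
    pi_a = [sum(p[i]) for i in range(n)]
    pi_b = [sum(p[i][j] for i in range(n)) for j in range(n)]
    return pi_a, pi_b
```

For two options, each off-diagonal entry reduces to the average of the two relevant preferences; for example, preferences `[0.7, 0.3]` and `[0.4, 0.6]` give 0.65 for (option 1, option 2) and 0.35 for (option 2, option 1). Comparing the output of `marginals` with the original preferences for three or more options shows the deviation that the loss measures.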
IV.3 Comparison of the heuristics
The preference settings used in this section are the same as those used in Ref. [27]. Considering two-player, $N$-option situations, we compare the losses of Uniform Random, OAM Attenuation, Random Order, and Pure HOM with that of the optimal joint selection probability matrix under the following four preference settings.

(i) Arithmetic progression and same preference.
(53) 
(ii) Modified geometric progression with common ratio 2 and same preference.
(54) 
(iii) Modified geometric progression with common ratio 2 and reversed preference.
(55) 
(iv) Geometric progression with common ratio 3 and same preference.
(56)
Note that in cases (i)–(iii), the optimal satisfaction matrix achieves zero loss since the maximum popularity does not exceed 1, whereas in case (iv), zero loss cannot be achieved since the maximum popularity exceeds 1.
Figure 4 shows how the loss for each of the five models changes as the number of options increases. First, as expected, Uniform Random (blue line in Fig. 4) performs poorly under all of the conditions, as it does not take preferences into account.
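To make concrete why Uniform Random ignores preferences, the sketch below builds its joint selection probability matrix from the definition given at the start of this section (all conflict-free cases sampled with equal probability). The function name `uniform_random_matrix` is a hypothetical label introduced for this illustration.

```python
def uniform_random_matrix(n):
    """Joint selection probability matrix of Uniform Random for n options.

    There are n * (n - 1) conflict-free pairs (i, j) with i != j, and each
    of them receives the same probability.
    """
    weight = 1.0 / (n * (n - 1))
    return [[0.0 if i == j else weight for j in range(n)] for i in range(n)]


# Each player ends up selecting every option with probability 1 / n,
# whatever the players' preferences are.
matrix = uniform_random_matrix(4)
row_marginals = [sum(row) for row in matrix]
```

Because the marginals are flat, the deviation from any non-uniform preference cannot be reduced by this method.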
Next, as mentioned in the previous study, OAM Attenuation performs slightly worse than Random Order in all of the cases (orange and green lines in Fig. 4). The two methods therefore present a trade-off between preference satisfaction and confidentiality: OAM Attenuation incurs larger losses, whereas Random Order requires the first player to inform the other player of his/her choice.
Next, it is remarkable that Pure HOM has a very small loss compared with the other heuristics when both players have the same preferences (red line in Fig. 4, cases (i), (ii), and (iv)). In particular, for $N = 3$, Theorem III.1 proves that the loss is exactly zero when the maximum popularity is less than 1, yet the numerical calculations show some residual loss. This residual is due to numerical errors in the optimization of the amplitudes.
Overall, both quantum-based sampling methods show promising performance. They require neither the direct calculation of a joint selection probability matrix nor the disclosure of the players' preferences, yet they can achieve very small losses.
Remarkably, although case (iv) violates the maximum-popularity condition of Theorem III.1, the loss of Pure HOM is very close to the optimal loss. This suggests that Pure HOM can work well under much less restricted conditions than those assumed in Theorem III.1, as long as the players have the same preferences. In the subsequent section, we examine the optimality of Pure HOM in more general preference settings.
On the other hand, in case (iii), where the players have reversed preferences, Pure HOM performs worse than Uniform Random. The reason is twofold. First, in Pure HOM, we let player A assume that player B has the same preference and amplitude terms, but this assumption is strongly broken in case (iii). Second, even if player A could take player B’s preference into account, the joint selection probability matrix would always be symmetric in Pure HOM, so trying to satisfy the preference of one player will always lead to deviation from the preference of the other player.
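The second reason can be made explicit with a one-line calculation. The notation here is an assumption introduced for illustration: $p_{i,j}$ denotes the joint probability that player A obtains option $i$ and player B obtains option $j$, and $\pi_A(i)$ and $\pi_B(i)$ denote the resulting probabilities that players A and B obtain option $i$. If the matrix is symmetric, $p_{i,j} = p_{j,i}$, then

```latex
\pi_A(i) = \sum_{j=1}^{N} p_{i,j} = \sum_{j=1}^{N} p_{j,i} = \pi_B(i)
\qquad \text{for all } i .
```

Both players therefore receive identical selection probabilities, which cannot match two different preference lists at the same time. Under reversed preferences, any improvement for one player is paid for by the other.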
IV.4 Optimality of Pure HOM
In Theorem III.1, we proved that when the players have the same preferences over three options and the popularities are all less than 1, Pure HOM can make the loss zero. Moreover, case (iv) of Sec. IV.3 suggests that Pure HOM may be close to optimal in more general settings. This section further examines the loss of Pure HOM under less restricted settings.
We are also interested in the efficiency of the physical sampling process. Therefore, we define the usage rate
(57) 
to measure the rate of photon pairs successfully used to make conflict-free decisions. In other words, the remaining photon pairs are discarded in Pure HOM because collective decision-making via Pure HOM works only when one photon is observed on side A and the other on side B in Fig. 2. Thus, a larger usage rate means that photon pairs are utilized more efficiently.
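As a rough illustration of how a usage rate relates to the resulting joint selection probabilities, the sketch below assumes a hypothetical table `coincidence[i][j]` holding the probability that a single photon pair is detected as option i on side A and option j on side B. This is a generic post-selection reading of the description above and not the specific expression in Eq. (57); the function name `postselect` and the table are assumptions made here.

```python
def postselect(coincidence):
    """Return (usage rate, joint selection probability matrix).

    coincidence[i][j] is the assumed probability that one photon pair is
    registered as option i on side A and option j on side B. All other
    outcomes are discarded.
    """
    usage = sum(sum(row) for row in coincidence)
    if usage <= 0.0:
        raise ValueError("no photon pair is ever usable")
    # Conditioning on a usable detection renormalizes the table.
    joint = [[q / usage for q in row] for row in coincidence]
    return usage, joint


# Hypothetical numbers: half of the photon pairs are usable.
usage, joint = postselect([
    [0.00, 0.10, 0.10],
    [0.10, 0.00, 0.05],
    [0.10, 0.05, 0.00],
])
```

In this reading, a usage rate of 0.5 means that, on average, two photon pairs must be generated to obtain one conflict-free decision.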
Going beyond the cases studied in Sec. IV.3, we consider more general cases where the players have the same preferences and all the popularities are less than 1, with the number of options $N$ ranging from 3 to 50. For each $N$, 1000 preferences are randomly chosen so that the maximum popularity is less than 1. Then, for each preference setting, the loss and the usage rate resulting from Pure HOM are calculated. Finally, the loss and the usage rate are averaged over the 1000 results. Figure 5 shows how the average loss (loss panel) and the average usage rate (usage-rate panel) change as the number of options increases.
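The sampling step of this experiment can be sketched as follows. The distribution from which the preferences are drawn is not specified in the text, so normalized uniform random numbers with rejection are used here purely as an assumption. With identical preferences, a maximum popularity below 1 is taken to mean that the largest preference is below 0.5. The names `sample_shared_preference` and `average_over_preferences` are hypothetical, and `metric` stands in for the Pure HOM loss or usage rate, which is not implemented in this sketch.

```python
import random


def sample_shared_preference(n, rng, max_tries=10_000):
    """Sample one preference shared by both players with max entry < 0.5."""
    for _ in range(max_tries):
        raw = [rng.random() for _ in range(n)]
        total = sum(raw)
        if total == 0.0:
            continue  # degenerate draw, try again
        pref = [x / total for x in raw]
        if max(pref) < 0.5:  # maximum popularity below 1
            return pref
    raise RuntimeError("no admissible preference found")


def average_over_preferences(n, metric, trials=1000, seed=0):
    """Average a caller-supplied metric over randomly drawn preferences."""
    rng = random.Random(seed)
    values = [metric(sample_shared_preference(n, rng)) for _ in range(trials)]
    return sum(values) / trials
```

For example, `average_over_preferences(5, max)` reports the average largest preference for five options; replacing `max` with a function that evaluates Pure HOM would reproduce the structure of the experiment.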
The average loss stays small for all numbers of options. The losses for smaller numbers of options are relatively larger for two reasons. First, each preference is larger in scale than when $N$ is large, and so is the loss. Second, the largest preference is more likely to be close to 0.5, which destabilizes the numerical optimization.
The usage rate remains above about 0.35 and approaches 0.5 for larger numbers of options, meaning that about half of the generated photon pairs can be utilized.
Finally, we examine cases where the players have the same preferences but the maximum popularity is greater than 1, meaning that even the optimal satisfaction matrix cannot achieve zero loss. Figure 6 shows the average loss (loss panel) and the average usage rate (usage-rate panel).
Again, the average loss stays close to the minimum loss, although the gap becomes wider when the number of options is large. This is because the optimization becomes more difficult as the number of parameters grows. Together with the result shown in the loss panel of Fig. 5, this indicates that Pure HOM can achieve losses quite close to the theoretical minimum whether the maximum popularity is greater than or less than 1.
However, when the maximum popularity is greater than 1, the usage rates are significantly low, on the order of , as demonstrated in the usage-rate panel of Fig. 6, meaning that a large number of photon pairs have to be discarded.
V Conclusion
In this paper, we dealt with a situation in which multiple players have probabilistic preferences and considered the problem of satisfying those preferences. The previous study explicitly computed the joint selection probability matrix that maximizes the players' satisfaction. However, that approach raised two concerns: high computational cost and low confidentiality. This paper proposed two sampling methods with quantum implementations, each of which converges to a particular joint selection probability matrix with a relatively low loss. We examined their implementation methods, computational costs, confidentiality, and the joint selection probability matrices to which they converge. Specifically, OAM Attenuation allows sampling with zero computational cost and also guarantees a high degree of confidentiality, as players do not need to disclose their preferences or choices. We also showed that Pure HOM eliminates the need to trust a third party while reducing losses to near-optimal values in situations where players have the same preferences. The property of favoring similar preferences is useful in many real situations. For example, in the competitive multi-armed bandit problem, since the machines have reward probabilities that are fixed over time, the players' preferences are expected to converge to similar values for each machine.
Future studies include a mathematical understanding of why Pure HOM achieves a loss quite close to the theoretical minimum when players have the same preferences. Moreover, examining the possibility of realizing an efficient sampling method that yields the optimal joint selection probability matrix is an interesting future topic. In addition, this paper considered the average joint selection probability matrix of each sampling method assuming an infinite number of repetitions. However, the joint selection probability matrix for Random Order, for example, varies greatly depending on the order of the players, especially over a small finite number of repetitions. In the case of OAM Attenuation, it also varies depending on the settings at the source point. The evaluation of such sample-wise variance is also a future topic.
Acknowledgements
This work was supported in part by the CREST project (JPMJCR17N2) funded by the Japan Science and Technology Agency and Grants-in-Aid for Scientific Research (JP20H00233) funded by the Japan Society for the Promotion of Science.
References
[1] (1998) Random serial dictatorship and the core from random endowments in house allocation problems. Econometrica 66 (3), pp. 689–701. Cited by: §IV.2.
[2] (1991) Fair allocation of indivisible goods and criteria of justice. Econometrica: Journal of the Econometric Society, pp. 1023–1039. Cited by: §I.
[3] (2016) Optical angular momentum. CRC Press. Cited by: §I.
[4] (2021) Conflict-free collective stochastic decision making by orbital angular momentum of photons through quantum interference. Scientific Reports 11 (1), pp. 21117. Cited by: §I, §III.2.1, §III.2.1, §III.
[5] (1982) On the computational complexity of Ising spin glass models. Journal of Physics A: Mathematical and General 15 (10), pp. 3241. Cited by: §I.
[6] (2018) Multi-player bandits revisited. In Algorithmic Learning Theory, pp. 56–92. Cited by: §I.
[7] (2001) A new solution to the random assignment problem. Journal of Economic Theory 100 (2), pp. 295–328. Cited by: §I.
[8] (2019) A poor man’s coherent Ising machine based on opto-electronic feedback systems for solving optimization problems. Nature Communications 10 (1), pp. 1–9. Cited by: §I.
[9] (2020) Two-photon interference: the Hong–Ou–Mandel effect. Reports on Progress in Physics 84 (1), pp. 012402. Cited by: §III.1.1.
[10] (2020) Entangled N-photon states for fair and optimal social decision making. Scientific Reports 10 (1), pp. 20420. Cited by: §I.
[11] (2019) Entangled-photon decision maker. Scientific Reports 9 (1), pp. 4832. Cited by: §I, §I.
[12] (2021) Parallel convolutional processing using an integrated photonic tensor core. Nature 589 (7840), pp. 52–58. Cited by: §I.
[13] (1987) Measurement of subpicosecond time intervals between two photons by interference. Physical Review Letters 59 (18), pp. 2044. Cited by: §III.1.1.
[14] (2016) A coherent Ising machine for 2000-node optimization problems. Science 354 (6312), pp. 603–606. Cited by: §I.
[15] (2016) Harnessing the computational power of fluids for optimization of collective decision making. Philosophies 1 (3), pp. 245–260. Cited by: §I.
[16] (2019) Novel frontier of photonics for data processing—photonic accelerator. APL Photonics 4 (9), pp. 090901. Cited by: §I.
[17] (2010) Cognitive medium access: exploration, exploitation, and competition. IEEE Transactions on Mobile Computing 10 (2), pp. 239–253. Cited by: §I.
[18] (2018) All-optical machine learning using diffractive deep neural networks. Science 361 (6406), pp. 1004–1008. Cited by: §I.
[19] (2014) Ising formulations of many NP problems. Frontiers in Physics, pp. 5. Cited by: §I.
[20] (2014) Network of time-multiplexed optical parametric oscillators as a coherent Ising machine. Nature Photonics 8 (12), pp. 937–942. Cited by: §I.
[21] (1991) Exploration and exploitation in organizational learning. Organization Science 2 (1), pp. 71–87. Cited by: §I.
[22] (2016) A fully programmable 100-spin coherent Ising machine with all-to-all connections. Science 354 (6312), pp. 614–617. Cited by: §I.
[23] (2015) Single-photon decision maker. Scientific Reports 5 (1), pp. 1–9. Cited by: §I.
[24] (2007) Optical quantum computing. Science 318 (5856), pp. 1567–1570. Cited by: §I.
[25] (1974) On cores and indivisibility. Journal of Mathematical Economics 1 (1), pp. 23–37. Cited by: §I.
[26] (2021) Photonics for artificial intelligence and neuromorphic computing. Nature Photonics 15 (2), pp. 102–114. Cited by: §I.
[27] (2022) Optimal preference satisfaction for conflict-free joint decisions. arXiv preprint arXiv:2205.00799. Cited by: §I, §I, §I, §II.1, §II.2, §III.2.4, §IV.1, §IV.3.
[28] (2018) Reinforcement learning: an introduction. MIT Press. Cited by: §I.
[29] (1999) Strategy-proof allocation of indivisible goods. Social Choice and Welfare 16 (4), pp. 557–567. Cited by: §I.
[30] (2017) Advances in photonic reservoir computing. Nanophotonics 6 (3), pp. 561–576. Cited by: §I.
[31] (2012) Terabit free-space data transmission employing orbital angular momentum multiplexing. Nature Photonics 6 (7), pp. 488–496. Cited by: §III.1.1.
[32] (2020) Inference in artificial intelligence with deep optics and photonics. Nature 588 (7836), pp. 39–47. Cited by: §I.
[33] (2015) Optical communications using orbital angular momentum beams. Advances in Optics and Photonics 7 (1), pp. 66–106. Cited by: §I.
[34] (2011) Orbital angular momentum: origins, behavior and applications. Advances in Optics and Photonics 3 (2), pp. 161–204. Cited by: §III.1.1.