1 Introduction
1.1 Motivation
Efficient simulation of randomness is a task with countless applications, ranging from cryptography to derandomization. In the setting of classical probabilistic computation, such simulation is straightforward in many settings. For example, a random function which will only be queried an a priori bounded number of times $t$ can be perfectly simulated using a $t$-wise independent function [31]. In the case of unbounded queries, one can use pseudorandom functions (PRFs), provided the queries are made by a polynomial-time algorithm [16]. These are examples of stateless simulation methods, in the sense that the internal memory of the simulator is initialized once (e.g., with the PRF key) and then remains fixed regardless of how the simulator is queried. Against arbitrary adversaries, one must typically pass to stateful simulation. For example, the straightforward and well-known technique of lazy sampling suffices to perfectly simulate a random function against arbitrary adversaries; however, the simulator must maintain a list of responses to all previous queries.
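As a point of reference for the classical technique, the following is a minimal Python sketch of lazy sampling. The class name `LazyRandomFunction` and its `out_bits` parameter are illustrative assumptions, not notation from this paper; the sketch only shows that responses are sampled on first use and that the simulator's memory grows with the list of past queries.

```python
import secrets


class LazyRandomFunction:
    """Lazily sampled random function into out_bits-bit integers.

    Hypothetical helper for illustration only.
    """

    def __init__(self, out_bits):
        self.out_bits = out_bits
        self.table = {}  # responses to all previous queries (the simulator's state)

    def query(self, x):
        # A fresh uniform value is sampled only the first time x is queried;
        # repeated queries are answered consistently from the table.
        if x not in self.table:
            self.table[x] = secrets.randbits(self.out_bits)
        return self.table[x]
```

Calling `query` twice on the same input returns the same value, while the table holds exactly one entry per distinct query made so far.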
Each of these techniques for simulating random classical primitives has a plethora of applications in theoretical cryptography, both as a proof tool and for cryptographic constructions. These range from constructing secure cryptosystems for encryption and authentication, to proving security reductions in a wide range of settings, to establishing security in idealized models such as the Random Oracle Model [6].
1.1.1 Quantum randomness.
As is well-known, quantum sources of randomness exhibit dramatically different properties from their classical counterparts [23, 7]. Compare, for example, uniformly random $n$-bit classical states (i.e., $n$-bit strings) and uniformly random $n$-qubit (pure) quantum states. A random string $x$ is obviously trivial to sample perfectly given probabilistic classical (or quantum) computation, and can be copied and distributed arbitrarily. However, it is also (just as obviously) deterministic to all parties who have examined it before. By contrast, a random state $|\varphi\rangle$ would take an unbounded amount of information to describe perfectly. Even if one manages to procure such a state, it is then impossible to copy due to the no-cloning theorem. On the other hand, parties who have examined $|\varphi\rangle$ many times before can still extract almost exactly $n$ bits of randomness from any fresh copy of $|\varphi\rangle$ they receive, even if they use the exact same measurement procedure each time.
The differences between random classical and random quantum maps are even more stark. The outputs of a classical random function are of course classical random strings, with all of the aforementioned properties. Outputs which have already been examined become effectively deterministic, while the rest remain uniformly random and independent. This is precisely what makes efficient simulation possible via lazy sampling. A Haar-random unitary $U$ queried on two inputs $|\psi\rangle$ and $|\phi\rangle$ also produces (almost) independent and uniformly random states when queried, but only if the queries are orthogonal, i.e., $\langle\psi|\phi\rangle = 0$. Unitarity implies that overlapping queries must be answered consistently, i.e., if $\langle\psi|\phi\rangle = \delta$ then $\langle\psi|U^\dagger U|\phi\rangle = \delta$. This possibility of querying with a distinct pure state which is not linearly independent from previous queries simply doesn’t exist for classical functions.
We emphasize that the above differences should not be interpreted as quantum random objects simply being “stronger” than their classical counterparts. In the case of classical states, i.e. strings, the ability to copy is quite useful, e.g., in setting down basic security definitions [8, 3, 2] or when rewinding an algorithm [29, 30, 14]. In the case of maps, determinism is also quite useful, e.g., for verification in message authentication.
1.2 The problem: efficient simulation
Given the dramatic differences between classical and quantum randomness, and the usefulness of both, it is reasonable to ask if there exist quantum analogues of the aforementioned efficient simulators of classical random functions. In fact, given the discussion above, it is clear that we should begin by asking if there even exist efficient simulators of random quantum states.
1.2.1 Simulating random states.
The first problem of interest is thus to efficiently simulate the following ideal object: an oracle $\mathsf{IS}(n)$ which contains a description of a perfectly Haar-random $n$-qubit pure state $|\varphi\rangle$, and which outputs a copy of $|\varphi\rangle$ whenever it is invoked. We first make an obvious observation: the classical analogue, which is simply to generate a random bitstring $x$ and then produce a copy whenever asked, is completely trivial. In the quantum case, efficient simulation is only known against limited query algorithms (henceforth, adversaries).
If the adversary has an a priori bound $t$ on the number of queries, then state $t$-designs suffice. These are indexed families of pure states which perfectly emulate the standard uniform “Haar” measure on pure states, up to the first $t$ moments. State $t$-designs can be sampled efficiently, and thus yield a stateless simulator for this case [4]. A recent work of Ji, Liu and Song considered the case of polynomial-time adversaries [18]. They defined a notion of pseudorandom states (PRS), which appear Haar-random to polynomial-time adversaries who are allowed as many copies of the state as they wish. They also showed how to construct PRS efficiently, thus yielding a stateless simulator for this class of constrained adversaries [18]; see also [9].
The case of arbitrary adversaries is, to our knowledge, completely unexplored. In particular, before this work it was not known whether simulating $\mathsf{IS}(n)$ against adversaries with no a priori bound on query or time complexity is possible, even if given polynomial space (in $n$ and the number of queries) and unlimited time. Note that, while the state family constructions from [18, 9] could be lifted to the unconditional security setting by instantiating them with random instead of pseudorandom functions, this would require space exponential in $n$ regardless of the number of queries.
1.2.2 Simulating random unitaries.
In the case of simulating random unitaries, the ideal object is an oracle $\mathsf{IU}(n)$ which contains a description of a perfectly Haar-random $n$-qubit unitary operator $U$, and applies $U$ to its input whenever it is invoked. The classical analogue is the well-known Random Oracle, and can be simulated perfectly using the aforementioned technique of lazy sampling. In the quantum case, the situation is even less well-understood than in the case of states.
For the case of query-limited adversaries, we can again rely on design techniques: (approximate) unitary $t$-designs can be sampled efficiently, and suffice for the task [10, 21]. Against polynomial-time adversaries, Ji, Liu and Song defined the natural notion of a pseudorandom unitary (or PRU) and described candidate constructions [18]. Unfortunately, at this time there are no provably secure constructions of PRUs. As in the case of states, the case of arbitrary adversaries is completely unexplored. Moreover, one could a priori plausibly conjecture that simulating $\mathsf{IU}(n)$ might even be impossible. The no-cloning property seems to rule out examining input states, which in turn seems to make it quite difficult for a simulator to correctly identify the overlap between multiple queries, and then answer correspondingly.
1.2.3 Extensions.
While the above problems already appear quite challenging, we mention several natural extensions that one might consider. First, for the case of repeatedly sampling a random state $|\varphi\rangle$, one would ideally want some additional features, such as the ability to apply the two-outcome measurement $\{|\varphi\rangle\langle\varphi|, \mathbb{1} - |\varphi\rangle\langle\varphi|\}$ (verification) or the reflection $\mathbb{1} - 2|\varphi\rangle\langle\varphi|$. In the case of pseudorandom simulation, these additional features can be used to create a (computationally secure) quantum money scheme [18]. For the case of simulating random unitaries, we might naturally ask that the simulator for a unitary $U$ also has the ability to respond to queries to $U^{-1} = U^\dagger$.
1.3 This work
In this work, we make significant progress on the above problems, by giving the first simulators for both random states and random unitaries, which are convincing to arbitrary adversaries. We also give an application of our sampling ideas: the construction of a new quantum money scheme, which provides information-theoretic security guarantees against both forging and tracing.
We begin by remarking that our desired simulators must necessarily be stateful, for both states and unitaries. Indeed, since approximate designs have elements (see, e.g., [26] which provides a more finegrained lower bound), a stateless approach would require superpolynomial space simply to store an index from a set of size for all polynomials .
In the following, we give a high-level overview of our approach for each of the two simulation problems of interest.
1.3.1 Simulating random states.
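As discussed above, we wish to construct an efficient simulator for the ideal oracle $\mathsf{IS}(n)$. For now we focus on simulating the procedure which generates copies of the fixed Haar-random state; we call this $\mathsf{IS}(n).\mathsf{Gen}$. We first note that the mixed state observed by the adversary after $t$ queries to $\mathsf{IS}(n).\mathsf{Gen}$ is the expectation of the projector onto $t$ copies of $|\varphi\rangle$. Equivalently, it is the (normalized) projector onto the symmetric subspace $\mathrm{Sym}^t\mathbb{C}^{2^n}$ of $(\mathbb{C}^{2^n})^{\otimes t}$:
(1) $\tau_t := \mathbb{E}_{\varphi\sim\mathrm{Haar}}\, |\varphi\rangle\langle\varphi|^{\otimes t} = \frac{\Pi_{\mathrm{Sym}^t\mathbb{C}^{2^n}}}{\dim \mathrm{Sym}^t\mathbb{C}^{2^n}}.$
Recall that $\mathrm{Sym}^t\mathbb{C}^{2^n}$ is the subspace of $(\mathbb{C}^{2^n})^{\otimes t}$ of vectors which are invariant under permutations of the $t$ tensor factors. Our goal will be to maintain an entangled state between the adversary $\mathcal{A}$ and our oracle simulator $\mathsf{ES}$ such that the reduced state on the side of $\mathcal{A}$ is $\tau_t$ after $t$ queries. Specifically, the joint state will be the maximally entangled state between the $\mathrm{Sym}^t\mathbb{C}^{2^n}$ subspace of the $t$ query output registers received by $\mathcal{A}$, and the $\mathrm{Sym}^t\mathbb{C}^{2^n}$ subspace of $t$ registers held by $\mathsf{ES}$. If we can maintain this for the first $t$ queries, then it’s not hard to see that there exists an isometry $V^{t\to t+1}$ which, by acting only on the state of $\mathsf{ES}$, implements the extension from the $t$-fold to the $(t+1)$-fold joint state.
The main technical obstacle, which we resolve, is showing that $V^{t\to t+1}$ can be performed efficiently. To achieve this, we develop some new algorithmic tools for working with symmetric subspaces, including an algorithm for coherent preparation of its basis states. We let $A$ denote an $n$-qubit register, $A_j$ its indexed copies, and $A^t$ $t$-many indexed copies (and likewise for $B$). We also let $\{|\mathrm{Sym}(\alpha)\rangle : \alpha \in S^{\uparrow}_{n,t}\}$ denote a particular orthonormal basis set for $\mathrm{Sym}^t\mathbb{C}^{2^n}$, indexed by some set $S^{\uparrow}_{n,t}$ (see Section 3 for definitions of these objects).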
Theorem 1.1
For each and , there exists a polynomialtime quantum algorithm which implements an isometry from to such that, up to negligible trace distance,
Above, is an operator defined to apply to a specific subset of registers of a state. When no confusion can arise, in such settings we will abbreviate —the application of this operator on the entire state—as simply .
It will be helpful to view as first preparing and then applying a unitary on . Theorem 1.1 then gives us a way to answer queries efficiently, as follows. For the first query, we prepare a maximally entangled state across two qubit registers and , and reply with register . Note that . For the second query, we prepare two fresh registers and , both in the state, apply on , return , and keep . For the th query, we proceed similarly, preparing fresh blank registers , applying , and then outputting the register .
With this approach, as it turns out, there is also a natural way to respond to verification queries and reflection queries . The ideal functionality . is to apply the twooutcome measurement corresponding to the Haarrandom state . To simulate this after producing samples, we apply the inverse of , apply the measurement to , reapply , and then return together with the measurement outcome (i.e., yes/no). For ., the ideal functionality is to apply the reflection through the state. To simulate this, we perform a sequence of operations analogous to , but apply a phase of on the state of instead of measuring.
Our main result on simulating random states is to establish that this collection of algorithms correctly simulates the ideal object , in the following sense.
Theorem 1.2
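There exists a stateful quantum algorithm $\mathsf{ES}(n,\varepsilon)$ which runs in time polynomial in $n$, $\log(1/\varepsilon)$, and the number of queries $q$ submitted to it, and satisfies the following. For all oracle algorithms $\mathcal{A}$,
$\left|\Pr\left[\mathcal{A}^{\mathsf{IS}(n)} = 1\right] - \Pr\left[\mathcal{A}^{\mathsf{ES}(n,\varepsilon)} = 1\right]\right| \le \varepsilon.$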
A complete description of our construction, together with the proofs of Theorem 1.1 and Theorem 1.2, is given in Section 3.
1.3.2 Application: untraceable quantum money.
To see that the efficient state sampler leads to a powerful quantum money scheme, consider building a scheme where the bank holds the ideal object $\mathsf{IS}(n)$. The bank can mint bills by $\mathsf{IS}(n).\mathsf{Gen}$, and verify them using $\mathsf{IS}(n).\mathsf{Ver}$. As each bill is guaranteed to be an identical and Haar-random state, it is clear that this scheme should satisfy perfect unforgeability and untraceability, under quite strong notions of security.
By Theorem 3.2, the same properties should carry over for a money scheme built on $\mathsf{ES}(n,\varepsilon)$, provided $\varepsilon$ is sufficiently small. We call the resulting scheme Haar money. Haar money is an information-theoretically secure analogue of the scheme of [18], which is based on pseudorandom states. We remark that our scheme requires the bank to have quantum memory and to perform quantum communication with the customers. However, given that quantum money already requires customers to have large-scale, high-fidelity quantum storage, these additional requirements seem reasonable.
The notions of correctness and unforgeability (often called completeness and soundness) for quantum money are well-known (see, e.g., [1]). Correctness asks that honestly generated bills should verify, i.e., $\mathsf{Ver}(\mathsf{Mint})$ should always accept. Unforgeability states that an adversary with $k$ bills and oracle access to $\mathsf{Ver}$ should not be able to produce a state on which $\mathsf{Ver}^{\otimes (k+1)}$ accepts. In this work, we consider untraceable quantum money (also called “quantum coins” [24]). We give a formal security definition for untraceability, which states that an adversary $\mathcal{A}$ with oracle access to $\mathsf{Ver}$ and $\mathsf{Mint}$ cannot do better than random guessing in the following experiment:

$\mathcal{A}$ outputs some candidate bill registers and a permutation $\pi$;

a bit $b \in \{0,1\}$ is sampled uniformly at random, and if $b = 1$ the registers are permuted by $\pi$; each candidate bill is verified and the failed ones are discarded;

$\mathcal{A}$ receives the rest of the bills and the entire internal state of the bank, and outputs a guess $b'$ for $b$.
Theorem 1.3
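The Haar money scheme, defined by setting $\mathsf{Mint} = \mathsf{ES}(n,\varepsilon).\mathsf{Gen}$ and $\mathsf{Ver} = \mathsf{ES}(n,\varepsilon).\mathsf{Ver}$, is a correct quantum money scheme which satisfies information-theoretic unforgeability and untraceability.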
One might reasonably ask if there are even stronger definitions of security for quantum money. Given its relationship to the ideal state sampler, we believe that Haar money should satisfy almost any notion of unforgeability and untraceability, including composable notions. We also remark that, based on the structure of the state simulator, which maintains an overall pure state supported on two copies of the symmetric subspace of banknote registers, it is straightforward to see that the scheme is also secure against an “honest but curious” or “specious” [27, 15] bank. We leave the formalization of these added security guarantees to future work.
1.3.3 Sampling Haar-random unitaries.
Next, we turn to the problem of simulating Haar-random unitary operators. In this case, the ideal object $\mathsf{IU}(n)$ initially samples a description of a perfectly Haar-random $n$-qubit unitary $U$, and then responds to two types of queries: $\mathsf{IU}.\mathsf{Eval}$, which applies $U$, and $\mathsf{IU}.\mathsf{Invert}$, which applies $U^\dagger$. In this case, we are able to construct a stateful simulator $\mathsf{EU}$ that runs in space polynomial in $n$ and the number of queries $q$, and is exactly indistinguishable from $\mathsf{IU}(n)$ to arbitrary adversaries. Our result can be viewed as a polynomial-space quantum analogue of the classical technique of lazy sampling for random oracles.
Our high-level approach is as follows. For now, suppose the adversary $\mathcal{A}$ only makes parallel queries to $\mathsf{Eval}$. If the query count $t$ of $\mathcal{A}$ is a priori bounded, we can simply sample an element of a unitary $t$-design. We can also do this coherently: prepare a quantum register $I$ in uniform superposition over the index set of the $t$-design, and then apply the $t$-design controlled on $I$. Call this efficient simulator $\mathsf{EU}_t$. Observe that the effect of $t$ parallel queries is just the application of the $t$-twirling channel $\mathcal{T}^{(t)}$ to the $t$ input registers [10], and that $\mathsf{EU}_t$ simulates $\mathcal{T}^{(t)}$ faithfully. What is more, it applies a Stinespring dilation [28] of $\mathcal{T}^{(t)}$ with dilating register $I$. (Footnote 1: The Stinespring dilation of a quantum channel is an isometry with the property that the quantum channel can be implemented by applying the isometry and subsequently discarding an auxiliary register.)
Now suppose $\mathcal{A}$ makes an “extra” query, i.e., query number $t+1$. Consider an alternative Stinespring dilation of $\mathcal{T}^{(t)}$, namely the one implemented by $\mathsf{EU}_{t+1}$ when queried $t$ times. Recall that all Stinespring dilations of a quantum channel are equivalent, up to a partial isometry on the dilating register. It follows that there is a partial isometry, acting on the private space of $\mathsf{EU}_t$, that transforms the dilation of $\mathcal{T}^{(t)}$ implemented by $\mathsf{EU}_t$ into the dilation of $\mathcal{T}^{(t)}$ implemented by $\mathsf{EU}_{t+1}$. If we implement this transformation, and then respond to $\mathcal{A}$ as prescribed by $\mathsf{EU}_{t+1}$, we have achieved perfect indistinguishability against the additional query. By iterating this process, we see that the a priori bound on the number of queries is no longer needed. We let $\mathsf{EU}$ denote the resulting simulator. The complete construction is described in Construction 4 below.
Our high-level discussion above did not take approximation into account. All currently known efficient constructions of $t$-designs are approximate. Here, we take a different approach: we will implement our construction using exact $t$-designs. This addresses the issue of adaptive queries: if there exists an adaptive-query distinguisher with nonzero distinguishing probability, then by post-selection there also exists a parallel-query one via probabilistic teleportation. This yields that the ideal and efficient unitary samplers are perfectly indistinguishable to arbitrary adversaries.
Theorem 1.4
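For all oracle algorithms $\mathcal{A}$,
$\Pr\left[\mathcal{A}^{\mathsf{IU}(n)} = 1\right] = \Pr\left[\mathcal{A}^{\mathsf{EU}(n)} = 1\right].$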
The existence of exact unitary $t$-designs for all $t$ is a fairly recent result. It follows as a special case of a result of Kane [19], who shows that designs exist for all finite-dimensional vector spaces of well-behaved functions on path-connected topological spaces. He also gives a simpler result for homogeneous spaces when the vector space of functions is invariant under the symmetry group action. Here, the number of elements of the smallest design is bounded just in terms of the dimension of the space of functions. The unitary group is an example of such a space, and the dimension of the space of homogeneous polynomials of degree $t$ in both $U$ and $\bar{U}$ can be explicitly derived, see e.g. [26]. This yields the following.
Corollary 1
The space complexity of for queries is bounded from above by .
1.3.4 An alternative approach.
We now sketch another potential approach to lazy sampling of unitaries. Very briefly, this approach takes a representation-theoretic perspective and suggests that the Schur transform [5] could lead to a polynomial-time algorithm for lazy sampling Haar-random unitaries. The discussion below uses tools and language from quantum information theory and the representation theory of the unitary and symmetric groups to a much larger extent than the rest of the article, and is not required for understanding our main results.
We remark that the analogous problem of lazy sampling a quantum oracle for a random classical function was recently solved by Zhandry [32]. One of the advantages of Zhandry’s technique is that it partly recovers the ability to inspect previously made queries, an important feature of classical lazy sampling. The key insight is that the simulator can implement the Stinespring dilation of the oracle channel, and thus record the output of the complementary channel. (Footnote 2: The complementary channel of a quantum channel maps the input to the auxiliary output of the Stinespring dilation isometry.) As the classical function is computed via XOR, changing to the Fourier basis makes the recording property explicit. It also allows for an efficient implementation.
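In the case of Haar-random unitary oracles, we can make an analogous observation. Consider an algorithm that makes $t$ parallel queries to $U$. The relevant Fourier transform is now over the unitary group, and is given by the Schur transform [5]. By Schur-Weyl duality (see e.g. [12]), the decomposition of $(\mathbb{C}^{2^n})^{\otimes t}$ into irreducible representations is given by
(2) $(\mathbb{C}^{2^n})^{\otimes t} \cong \bigoplus_{\lambda \vdash_{2^n} t} [\lambda] \otimes V_{\lambda,2^n}.$
Here $\lambda \vdash_{2^n} t$ means $\lambda$ is any partition of $t$ into at most $2^n$ parts, $[\lambda]$ is the Specht module of $S_t$, and $V_{\lambda,2^n}$ is the Weyl module of $U(2^n)$, corresponding to the partition $\lambda$, respectively. By Schur’s lemma, the $t$-twirling channel acts as
(3) $\mathcal{T}^{(t)}_{\mathrm{Haar}} = \bigoplus_{\lambda \vdash_{2^n} t} \mathrm{id}_{[\lambda]} \otimes \Lambda_{V_{\lambda,2^n}},$
where $\mathrm{id}$ is the identity channel, and $\Lambda = \tau\,\mathrm{tr}(\cdot)$ with the maximally mixed state $\tau$ is the depolarizing channel. We therefore obtain a Stinespring dilation of the $t$-twirling channel as follows. Let be registers with Hilbert spaces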
where is the identity channel, and with the maximally mixed state is the depolarizing channel. We therefore obtain a Stinespring dilation of the twirling channel as follows. Let be registers with Hilbert spaces
(4) 
and denote the subregisters by and , respectively. Let further be the standard maximally entangled state on these registers, and let be a register whose dimension is the number of partitions of (into at most parts). Define the isometry
(5) 
In the above equation and are understood to be subspaces of , the identity operators on , are omitted and is the swap operator. By (3), a Stinespring dilation of the twirling channel is then given by
(6) 
By the equivalence of all Stinespring dilations, there exists an isometry that transforms the state register of after parallel queries so that the global state is the same as if the Stinespring dilation above had been applied to the input registers. But now the quantum information that was contained in the subspace of the algorithm’s query registers can be found in register .
1.4 Organization
The remainder of the paper is organized as follows. In Section 2, we recall some basic notation and facts, and some lemmas concerning coherent preparation of certain generic families of quantum states. The proofs for these lemmas are given in Appendix 0.A. We also describe stateful machines, which will be our model for thinking about the aforementioned ideal objects and their efficient simulators. In Section 3 we describe our efficient simulator for Haar-random states, and in Section 4 we describe our polynomial-space simulator for Haar-random unitaries. We end by describing the Haar money scheme and establishing its security in Section 5.
1.5 Acknowledgments
The authors thank Yi-Kai Liu, Carl Miller, and Fang Song for helpful comments on an earlier draft. CM thanks Michael Walter for discussions about designs. CM was funded by an NWO VIDI grant (Project No. 639.022.519) and an NWO VENI grant (Project No. VI.Veni.192.159). GA acknowledges support from NSF grant CCF-1763736.
2 Preliminaries
2.1 Some basics
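Given a fixed-size (e.g., $n$-qubit) register $A$, we will use $A_1, A_2, \dots$ to denote indexed copies of $A$. We will use $A^t$ to denote a register consisting of $t$ indexed copies of $A$, i.e., $A^t = A_1 A_2 \cdots A_t$. Unless stated otherwise, distances of quantum states are measured in the trace distance, i.e.,
$d(\rho,\sigma) = \frac{1}{2}\|\rho - \sigma\|_1 = \frac{1}{2}\mathrm{tr}\,|\rho - \sigma|.$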
Distances of unitary operators are measured in the operator norm.
We will frequently apply operators to some subset of a larger collection of registers. In that context, we will use register indexing to indicate which registers are being acted upon, and suppress identities to simplify notation. The register indexing will also be suppressed when it is clear from context. For example, given an operator $X$ and some state $|\psi\rangle$ on registers $A$ and $B$, we will write $X_A|\psi\rangle$ in place of $(X \otimes \mathbb{1}_B)|\psi\rangle$ to denote the state on $AB$ resulting from applying $X$ to the $A$ register of $|\psi\rangle$.
We let $|\phi^+\rangle_{AB}$ denote the maximally entangled state on registers $A$ and $B$. For a linear operator $X$ and some basis choice, we denote its transpose by $X^T$.
Lemma 1 (Mirror lemma; see, e.g., [22])
For $X$ a linear operator,
$X_A|\phi^+\rangle_{AB} = X^T_B|\phi^+\rangle_{AB}.$
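The mirror lemma is usually stated as the identity $(X \otimes \mathbb{1})|\phi^+\rangle = (\mathbb{1} \otimes X^T)|\phi^+\rangle$ for the maximally entangled state $|\phi^+\rangle = d^{-1/2}\sum_i |i\rangle|i\rangle$. The following pure-Python sketch checks this numerically for a small square matrix; the helper names (`kron`, `mirror_gap`, and so on) are illustrative assumptions and not part of the paper.

```python
import math


def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matvec(m, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def max_entangled(d):
    """Amplitudes of sum_i |i>|i> / sqrt(d), indexed by i * d + j."""
    return [1 / math.sqrt(d) if i == j else 0.0 for i in range(d) for j in range(d)]


def mirror_gap(x):
    """Largest entrywise gap between (X ⊗ 1)|phi+> and (1 ⊗ X^T)|phi+>."""
    d = len(x)
    identity = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    phi = max_entangled(d)
    left = matvec(kron(x, identity), phi)
    right = matvec(kron(identity, transpose(x)), phi)
    return max(abs(l - r) for l, r in zip(left, right))
```

For any square matrix, `mirror_gap` should vanish up to floating-point error; both sides have amplitude $X_{ij}/\sqrt{d}$ on $|i\rangle|j\rangle$.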
2.2 Unitary designs
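Let $\mu_n$ be the Haar measure on the unitary group $U(2^n)$. We define the Haar $t$-twirling channel $\mathcal{T}^{(t)}_{\mathrm{Haar}}$ by
(7) $\mathcal{T}^{(t)}_{\mathrm{Haar}}(X) = \int_{U(2^n)} U^{\otimes t} X (U^\dagger)^{\otimes t}\, \mathrm{d}\mu_n(U).$
For a finite subset $D \subset U(2^n)$, we define the $t$-twirling map with respect to $D$ as
(8) $\mathcal{T}^{(t)}_{D}(X) = \frac{1}{|D|}\sum_{U \in D} U^{\otimes t} X (U^\dagger)^{\otimes t}.$
An $n$-qubit unitary $t$-design is a finite set $D \subset U(2^n)$ such that
(9) $\mathcal{T}^{(t)}_{D} = \mathcal{T}^{(t)}_{\mathrm{Haar}}.$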
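As a small sanity check of the design condition, recall the standard fact that the single-qubit Pauli matrices form a unitary 1-design: averaging $P X P^\dagger$ over the four Paulis gives $\mathrm{tr}(X)\,\mathbb{1}/2$, which is also the Haar twirl for $t = 1$. The pure-Python sketch below verifies this for one matrix. The helper names `matmul`, `dagger` and `twirl` are assumptions made for illustration, and the example covers only $n = 1$, $t = 1$, not the exact designs of higher order used later.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def dagger(a):
    """Conjugate transpose of a nested-list matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


# Single-qubit Pauli matrices: identity, X, Y, Z.
PAULIS = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def twirl(design, x):
    """Average of U x U^dagger over the finite set `design` (the t = 1 twirl)."""
    d = len(x)
    out = [[0j] * d for _ in range(d)]
    for u in design:
        term = matmul(matmul(u, x), dagger(u))
        for i in range(d):
            for j in range(d):
                out[i][j] += term[i][j] / len(design)
    return out
```

Applying `twirl(PAULIS, x)` to any 2-by-2 matrix `x` should return a multiple of the identity with diagonal entries equal to half the trace of `x`.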
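Another twirling channel is the mixed twirling channel with $\ell$ applications of the unitary and $t-\ell$ applications of its inverse,
(10) $\mathcal{T}^{(\ell,t-\ell)}_{\mathrm{Haar}}(X) = \int_{U(2^n)} \left(U^{\otimes \ell} \otimes (U^\dagger)^{\otimes (t-\ell)}\right) X \left((U^\dagger)^{\otimes \ell} \otimes U^{\otimes (t-\ell)}\right) \mathrm{d}\mu_n(U).$
The mixed twirling channel for a finite set $D$ is also defined analogously to Equation (8). As our definition of unitary $t$-designs is equivalent to one based on the expectation values of polynomials (see, e.g., [21]), we easily obtain the following.
Proposition 1
Let $D$ be an $n$-qubit unitary $t$-design and $0 \le \ell \le t$. Then
(11) $\mathcal{T}^{(\ell,t-\ell)}_{D} = \mathcal{T}^{(\ell,t-\ell)}_{\mathrm{Haar}}.$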
Finite exact unitary $t$-designs exist. In particular, one can apply the following theorem to obtain an upper bound on their minimal size. Here, a design for a function space $W$ on a topological space $X$ with measure $\mu$ is a finite set $D \subset X$ such that the expectation of a function $f \in W$ is the same whether it is taken over $X$ according to $\mu$ or over the uniform distribution on $D$.
Theorem 2.1 ([19], Theorem 10)
Let be a homogeneous space, an invariant measure on and a dimensional vector subspace of the space of real functions on that is invariant under the symmetry group of , where . Then for any , there exists a design for of size . Furthermore, there exists a design for of size at most .
The case of unitary $t$-designs is the one where $X = U(2^n)$ is acting on itself (e.g., on the left), $\mu$ is the Haar measure, and $W$ is the vector space of homogeneous polynomials of degree $t$ in both $U$ and $\bar{U}$. (Footnote 3: The output of the twirling channel (7) is a matrix of such polynomials.) The dimension of this space is
(12) 
see e.g. [26]. We therefore get
Corollary 2
For all , there exists an exact qubit unitary design with a number of elements which is at most
2.3 Real and ideal stateful machines
We will frequently use stateful algorithms with multiple “interfaces” which allow a user to interact with the algorithm. We will refer to such objects as stateful machines. We will use stateful machines to describe functionalities (and implementations) of collections of oracles which relate to each other in some way. For example, one oracle might output a fixed state, while another oracle reflects about that state.
Definition 1 (Stateful machine)
A stateful machine consists of:

A finite set , whose elements are called interfaces. Each interface has two fixed parameters (input size) and (output size), and a variable initialized to (query counter.)

For each interface , a sequence of quantum algorithms . Each has an input register of qubits, an output register of qubits, and is allowed to act on an additional shared work register (including the ability to add/remove qubits in .) In addition, each increments the corresponding query counter by one.
The typical usage of a stateful machine is as follows. First, the work register is initialized to be empty, i.e., no qubits. After that, whenever a user invokes an interface and supplies qubits in an input register , the algorithm is invoked on registers and . The contents of the output register are returned to the user, and the new, updated work register remains for the next invocation. We emphasize that the work register is shared between all interfaces.
We remark that we will also sometimes define ideal machines, which behave outwardly like a stateful machine but are not constrained to apply only maps which are implementable in finite space or time. For example, an ideal machine can have an interface that implements a perfectly Haarrandom unitary , and another interface which implements .
2.4 Some state preparation tools
We now describe some algorithms for efficient coherent preparation of certain quantum state families. The proofs for the following lemmas can be found in Appendix 0.A. We begin with state families with polynomial support.
Lemma 2
Let be a family of quantum states whose amplitudes have an efficient classical description , and such that . Then there exists a quantum algorithm which runs in time polynomial in and and satisfies
Given a set , we let
denote the states supported only on and its set complement , respectively. Provided that has polynomial size, we can perform coherent preparation of both state families efficiently: the former by Lemma 2 and the latter via the below.
Lemma 3
Let be a family of sets of size with efficient description , and let . There exists a quantum algorithm which runs in time polynomial in and and satisfies
Finally, we show that if two orthogonal quantum states can be prepared, then so can an arbitrary superposition of the two.
Lemma 4
Let be two families of qubit quantum states such that for all , and such that there exists a quantum algorithm which runs in time polynomial in and and satisfies for .
For such that , let denote a classical description of to precision at least . There exists a quantum algorithm which runs in time polynomial in and and satisfies
(13) 
3 Simulating a Haar-random state oracle
3.1 The problem, and our approach
We begin by defining the ideal object we’d like to emulate. Here we deviate slightly from the discussion above, in that we ask for the reflection oracle to also accept a (quantum) control bit.
Construction 1 (Ideal state sampler)
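The ideal $n$-qubit state sampler is an ideal machine $\mathsf{IS}(n)$ with interfaces $(\mathsf{Init}, \mathsf{Gen}, \mathsf{Ver}, \mathsf{CReflect})$, defined as follows.
$\mathsf{IS}(n).\mathsf{Init}$: takes no input; samples a description $\tilde{\varphi}$ of an $n$-qubit state $|\varphi\rangle$ from the Haar measure.
$\mathsf{IS}(n).\mathsf{Gen}$: takes no input; uses $\tilde{\varphi}$ to prepare a copy of $|\varphi\rangle$ and outputs it.
$\mathsf{IS}(n).\mathsf{Ver}$: receives $n$-qubit input; uses $\tilde{\varphi}$ to apply the measurement $\{|\varphi\rangle\langle\varphi|, \mathbb{1} - |\varphi\rangle\langle\varphi|\}$; returns the post-measurement state and outputs $\mathsf{acc}$ in the first case and $\mathsf{rej}$ in the second.
$\mathsf{IS}(n).\mathsf{CReflect}$: receives $(n+1)$-qubit input; uses $\tilde{\varphi}$ to implement the controlled reflection $R_\varphi := |0\rangle\langle 0| \otimes \mathbb{1} + |1\rangle\langle 1| \otimes (\mathbb{1} - 2|\varphi\rangle\langle\varphi|)$ about $|\varphi\rangle$.
We assume that $\mathsf{Init}$ is called first, and only once; the remaining oracles can then be called indefinitely many times, and in any order. If this is inconvenient for some application, one can easily adjust the remaining interfaces to invoke $\mathsf{Init}$ if that has not been done yet. We remark that $\mathsf{Ver}$ can be implemented with a single query to $\mathsf{CReflect}$.
Lemma 5
$\mathsf{Ver}$ can be simulated with one application of $\mathsf{CReflect}$.
Proof
Prepare an ancillary qubit in the state $|+\rangle$ and apply the reflection on the input controlled on the ancillary qubit. Then apply the Hadamard gate $H$ to the ancilla qubit and measure it. Output all the qubits, with the ancilla outcome $1$ interpreted as $\mathsf{acc}$ and $0$ as $\mathsf{rej}$. ∎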
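The proof of Lemma 5 can be checked numerically on small examples. The sketch below assumes the sign convention $R = \mathbb{1} - 2|\varphi\rangle\langle\varphi|$ for the reflection, under which ancilla outcome 1 corresponds to projecting onto $|\varphi\rangle$; with the opposite sign the roles of the two outcomes swap. It works with real amplitudes only, and the function name is hypothetical.

```python
import math


def verify_via_controlled_reflection(phi, psi):
    """Simulate the measurement {|phi><phi|, 1 - |phi><phi|} on |psi>.

    Uses one controlled reflection R = 1 - 2|phi><phi| (assumed sign convention).
    phi and psi are lists of real amplitudes, with phi normalized.
    Returns (p_acc, acc_branch, rej_branch); branches are unnormalized states.
    """
    overlap = sum(a * b for a, b in zip(phi, psi))
    reflected = [b - 2 * overlap * a for a, b in zip(phi, psi)]  # R|psi>

    # Ancilla in |+>, then controlled-R: (|0>|psi> + |1>R|psi>) / sqrt(2).
    branch0 = [b / math.sqrt(2) for b in psi]
    branch1 = [r / math.sqrt(2) for r in reflected]

    # Hadamard on the ancilla mixes the two branches.
    anc0 = [(x + y) / math.sqrt(2) for x, y in zip(branch0, branch1)]
    anc1 = [(x - y) / math.sqrt(2) for x, y in zip(branch0, branch1)]

    p_acc = sum(x * x for x in anc1)  # probability of ancilla outcome 1
    return p_acc, anc1, anc0
```

The outcome-1 branch equals $\langle\varphi|\psi\rangle\,|\varphi\rangle$, so the acceptance probability is $|\langle\varphi|\psi\rangle|^2$, and the outcome-0 branch is orthogonal to $|\varphi\rangle$, as the ideal verification measurement requires.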
Our goal is to devise a stateful simulator for Construction 1 which is efficient. Efficient here means that, after $t$ total queries to all interfaces (i.e., $\mathsf{Init}$, $\mathsf{Gen}$, $\mathsf{Ver}$, and $\mathsf{CReflect}$), the simulator has expended time polynomial in $n$, $t$, and $\log(1/\varepsilon)$.
As described in Section 1.3.1, our approach will be to ensure that, for every $t$, the state shared between the adversary $\mathcal{A}$ and our stateful oracle simulator $\mathsf{ES}$ will be maximally entangled between two copies of the $t$-fold symmetric subspace $\mathrm{Sym}^t\mathbb{C}^{2^n}$: one held by $\mathcal{A}$, and the other by $\mathsf{ES}$. The extension from the $t$-fold to the $(t+1)$-fold joint state will be performed by an isometry $V^{t\to t+1}$ which acts only on the state of $\mathsf{ES}$ and two fresh $n$-qubit registers $A_{t+1}$ and $B_{t+1}$ initialized by $\mathsf{ES}$. After $V^{t\to t+1}$ is applied, $A_{t+1}$ will be given to $\mathcal{A}$. As we will show, $V^{t\to t+1}$ can be performed efficiently using some algorithmic tools for working with symmetric subspaces, which we will develop in the next section. This will yield an efficient way of simulating $\mathsf{Gen}$. Simulation of $\mathsf{Ver}$ and $\mathsf{CReflect}$ will follow without much difficulty, as outlined in Section 1.3.1.
3.2 Some tools for symmetric subspaces
3.2.1 A basis for the symmetric subspace.
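We recall an explicit orthonormal basis of the symmetric subspace (see, e.g., [18] or [17]). Let
(14) $S^{\uparrow}_{n,t} = \left\{\alpha = (y_1, y_2, \dots, y_t) \in (\{0,1\}^n)^t : y_1 \le y_2 \le \dots \le y_t\right\}$
be the set of lexicographically-ordered $t$-tuples of $n$-bit strings. For each $\alpha \in S^{\uparrow}_{n,t}$, define the unit vector
(15) $|\mathrm{Sym}(\alpha)\rangle = \left(t! \prod_{x \in \{0,1\}^n} f_x(\alpha)!\right)^{-1/2} \sum_{\sigma \in S_t} |y_{\sigma(1)}\rangle |y_{\sigma(2)}\rangle \cdots |y_{\sigma(t)}\rangle.$
Here, $f_x(\alpha)$ is the number of times the string $x$ appears in the tuple $\alpha$. The set $\{|\mathrm{Sym}(\alpha)\rangle : \alpha \in S^{\uparrow}_{n,t}\}$ is an orthonormal basis for $\mathrm{Sym}^t\mathbb{C}^{2^n}$. We remark that the Schmidt decomposition of $|\mathrm{Sym}(\alpha)\rangle$ with respect to the bipartition formed by the $t$-th register vs. the rest is given by
(16) $|\mathrm{Sym}(\alpha)\rangle = \sum_{x \in \alpha} \sqrt{\frac{f_x(\alpha)}{t}}\; |\mathrm{Sym}(\alpha^{-x})\rangle |x\rangle,$
where the sum ranges over the distinct strings appearing in $\alpha$, and $\alpha^{-x} \in S^{\uparrow}_{n,t-1}$ is the tuple $\alpha$ with one copy of $x$ removed.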
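For small parameters, the basis described above can be enumerated explicitly. The following pure-Python sketch encodes $n$-bit strings as integers in $\{0, \dots, 2^n - 1\}$, builds each basis vector as the uniform superposition over the distinct orderings of a sorted tuple, and lets one check orthonormality and the dimension count $\binom{2^n + t - 1}{t}$. The function names are illustrative, and the dictionary-based sparse vectors are a convenience of this sketch rather than anything used in the paper.

```python
from itertools import combinations_with_replacement, permutations
from math import comb, sqrt


def sorted_tuples(dim, t):
    """All nondecreasing t-tuples over {0, ..., dim - 1}."""
    return list(combinations_with_replacement(range(dim), t))


def sym_state(alpha):
    """Sparse amplitudes of the symmetric basis vector labelled by alpha.

    The vector is the uniform superposition over the distinct orderings of alpha.
    """
    orderings = set(permutations(alpha))
    amp = 1 / sqrt(len(orderings))
    return {o: amp for o in orderings}


def inner(u, v):
    """Inner product of two sparse real vectors stored as dicts."""
    return sum(amp * v.get(key, 0.0) for key, amp in u.items())


def sym_dim(dim, t):
    """Dimension of the t-fold symmetric subspace of a dim-dimensional space."""
    return comb(dim + t - 1, t)
```

Summing over distinct orderings with amplitude $1/\sqrt{N}$, where $N = t!/\prod_x f_x!$ counts the orderings, gives the same unit vector as summing over all $t!$ permutations with a multiplicity-dependent normalization.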
3.2.2 Some useful algorithms.
We now describe some algorithms for working in the above basis. Let $A$ and $B$ denote $n$-qubit registers. Recall that $A_j$ denotes indexed copies of $A$ and that $A^t$ denotes $A_1 A_2 \cdots A_t$, and likewise for $B$. In our setting, the various copies of $A$ will be prepared by the oracle simulator and then handed to the query algorithm at query time. The copies of $B$ will be prepared by, and always remain with, the oracle simulator.
Proposition 2
For each , and , there exists an efficiently implementable unitary on such that for all , up to trace distance .
Proof
Clearly, the operation
(17) 
is efficiently implementable exactly, by XORing the classical sort function of the first register into the second register.
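The reversibility claim can be checked classically on basis states. The sketch below is only an illustration; it assumes tuples of bit strings encoded as Python integers and uses the hypothetical helper name `xor_sort`.

```python
def xor_sort(x, y):
    """Send (x, y) to (x, y XOR sort(x)), entrywise on tuples of ints."""
    s = sorted(x)
    return tuple(x), tuple(yi ^ si for yi, si in zip(y, s))


x = (3, 0, 2)
zero = (0, 0, 0)

# With a zeroed second register, the map writes sort(x) into it.
assert xor_sort(x, zero) == (x, (0, 2, 3))

# Applying the map twice returns the input, so it permutes basis states
# and therefore extends linearly to a unitary.
y = (1, 1, 2)
assert xor_sort(*xor_sort(x, y)) == (x, y)
```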
Let us now show that the operation is also efficiently implementable (up to the desired error) by exhibiting an explicit algorithm. We define it recursively in , as follows. For , for all , so this case is simply the map . Suppose now that the operation can be implemented for any . The th level algorithm will begin by applying
Since is nonzero for only many , this can be implemented efficiently by Lemma 2. Next, we perform . Using the algorithm for , we then apply , and uncompute . By (16), we have in total applied so far. To finish the th level algorithm for approximating , we simply apply (17) to uncompute from the first register. ∎
Theorem 3.1 (Restatement of Theorem 1.1)
For each , and , there exists an efficiently implementable isometry from to such that, up to trace distance ,
Proof
We describe the algorithm assuming all steps can be implemented perfectly. It is straightforward to check that each step we use can in reality be performed to a sufficient accuracy that the accuracy of the entire algorithm is at least .
We will need a couple of simple subroutines. First, given and , we define to be the element of produced by inserting at the first position such that the result is still lexicographically ordered. One can perform this reversibly via .
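On classical basis states, the insertion subroutine can be sketched as follows. The sketch assumes that the inserted string is kept in its own register, so that the map is injective and can be undone; the helper names `insert_sorted` and `remove_one` are hypothetical, and strings are encoded as integers.

```python
import bisect


def insert_sorted(alpha, x):
    """Insert x into the sorted tuple alpha at the first position that
    keeps the result sorted."""
    pos = bisect.bisect_left(alpha, x)
    return alpha[:pos] + (x,) + alpha[pos:]


def remove_one(beta, x):
    """Inverse direction: remove one copy of x from the sorted tuple beta."""
    pos = bisect.bisect_left(beta, x)
    assert pos < len(beta) and beta[pos] == x, "x must occur in beta"
    return beta[:pos] + beta[pos + 1:]


alpha, x = (0, 2, 2, 5), 2
beta = insert_sorted(alpha, x)
assert beta == (0, 2, 2, 2, 5)

# Because x is still available, the original tuple can be recovered,
# which is what makes a reversible implementation possible.
assert remove_one(beta, x) == alpha
```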
Second, we will need to do coherent preparation of the state
(18) 
For any given , the state can be prepared by using the preparation circuit for the two orthogonal components of the state whose supports are and . These two components can also be prepared coherently using Lemma 2 and Lemma 3, respectively. Their superposition can be prepared with Lemma 4. Putting it all together, we get an algorithm for .
The complete algorithm is a composition of several efficient routines. We describe this below, explicitly calculating the result for the input states of interest. For readability, we omit overall normalization factors.
add working registers  
apply to  
insert into  
apply to 
To see that the last line above is the desired result, we observe that we can index the sum in the last line above in a more symmetric fashion: the sum is just taken over all pairs such that the latter can be obtained from the former by adding one entry (i.e., the string ). But that is the same as summing over all pairs , such that the former can be obtained from the latter by removing one entry.
Here, the last equality is (16), and the prefactor is the square root of the quotient of the dimensions of the  and copy symmetric subspaces, as required for a correct normalization of the final maximally entangled state.∎
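The quotient of dimensions mentioned above has a simple closed form, which can be checked directly from the standard dimension formula for the symmetric subspace. In the sketch below, `n` (qubits per register) and `t` (number of copies) are names assumed for illustration, as are the helpers `sym_dim` and `dim_ratio`.

```python
from fractions import Fraction
from math import comb


def sym_dim(n, t):
    """Dimension of the t-copy symmetric subspace of n-qubit registers."""
    d = 2 ** n
    return comb(d + t - 1, t)


def dim_ratio(n, t):
    """Exact ratio of the (t+1)-copy to the t-copy dimension."""
    return Fraction(sym_dim(n, t + 1), sym_dim(n, t))


# The ratio simplifies to (2^n + t) / (t + 1) for every n and t tested.
for n in range(1, 5):
    for t in range(0, 6):
        assert dim_ratio(n, t) == Fraction(2 ** n + t, t + 1)
```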
3.3 State sampler construction and proof
Construction 2 (Efficient state sampler)
Let be a positive integer and a negligible function of . The efficient qubit state sampler with precision is a stateful machine with interfaces , defined below. For convenience, we denote the query counters by and in the following.

prepares the standard maximally entangled state on qubit registers and , and stores both and .

On the first query, outputs register . On query , takes as input registers and produces registers by applying the isometry from Theorem 3.1 with accuracy ; then it outputs and stores .

On query with input registers , do the following controlled on the qubit register : apply , a unitary implementation of , with accuracy , in the sense that , with playing the role of . Subsequently, apply a phase on the all-zero state of the ancilla registers and , and reapply , this time with accuracy .
We omitted defining since it is trivial to build from , as described in Lemma 5. By Theorem 3.1, the runtime of is polynomial in , and the total number of queries that are made to its various interfaces.
We want to show that the above sampler is indistinguishable from the ideal sampler to any oracle algorithm, in the following sense. Given a stateful machine and a (not necessarily efficient) oracle algorithm , we define the process as follows:

is called;

receives oracle access to and ;

outputs a bit .
Theorem 3.2
For all oracle algorithms and all that can depend on in an arbitrary way,
(19) 
Proof
During the execution of , the th call of (for any ) incurs a trace distance error of at most . The trace distance between the outputs of and is therefore bounded by . It is thus sufficient to establish the theorem for .
For any fixed , there exists a stateful machine which is perfectly indistinguishable from to all adversaries who make a maximum total number of queries. The procedure of samples a random element from an exact unitary design . Queries to are answered with a copy of , and is implemented by applying . It will be helpful to express in an equivalent isometric form. In this form, the initial oracle state is
(20) 
queries are answered using the controlled isometry
(21) 
queries are answered by
(22)  
(23) 
Now suppose is an arbitrary (i.e., not bounded-query) algorithm making only queries. We will show that after queries, the oracles and are equivalent, and that this holds for all . We emphasize that does not depend on ; as a result, we can apply the equivalence for the appropriate total query count after has produced its final state, even if is determined only at runtime. It will thus follow that is equivalent to .
To show the equivalence between and , we will demonstrate a partial isometry that transforms registers of (after queries and no queries) into the register of , in such a way that the corresponding global states on and are mapped to each other. The isometry is partial because its domain is the symmetric subspace of . It is defined as follows:
(24) 
To verify that this is indeed the desired isometry, we calculate:
(25)  
(26)  
(27)  
(28) 
Here we have used the fact that is in the symmetric subspace in the second equality, and the third and fourth equalities are applications of the Mirror Lemma (Lemma 1) with , and