I Introduction
Memcomputing stands for “computing in and with memory” [1]. It is a novel computing paradigm whereby memory (time non-locality) is employed to perform both tasks, the storing and the processing of information, in the same physical location. This paradigm is substantially different from the one implemented in our modern-day computers [2]. In these machines there is a separation of tasks between a memory (storage) unit and a unit that performs the processing of information. Modern-day computers are the closest physical approximation possible to the well-known (mathematical) Turing paradigm of computation, which maps a finite string of symbols into a finite string of symbols in discrete time [3].
The memcomputing paradigm has been formalized by Traversa and Di Ventra in Ref. [4]. In that paper it was shown that universal memcomputing machines (UMMs) can be defined as digital (so-called digital memcomputing machines (DMMs) [5]) or analog [6], with the digital ones offering an easy path to machines scalable in size.
UMMs have several features that make them a powerful computing model, most notably intrinsic parallelism, information overhead, and functional polymorphism [4]. The first feature means that they can operate collectively
on all (or portions of) the data at once, in a cooperative fashion. This is reminiscent of neural networks, and indeed neural networks can be viewed as special cases of UMMs. However, neural networks do not have “information overhead”. This feature is related to the
topology (or architecture) of the network of memory units (memprocessors). It means the machine has access, at any given time, to more information (precisely, information originating from the topology of the network) than what would be available were the memprocessors not connected to each other. Of course, this information overhead is not necessarily stored by the machine (the stored information is the Shannon one) [4]. Nevertheless, with appropriate topologies, hence with appropriate information overhead, UMMs can solve complex problems very efficiently [4]. Finally, functional polymorphism means that the machine is able to compute different functions without modifying the topology of the machine network, by simply applying the appropriate input signals [4]. In Ref. [4] it was shown that UMMs are Turing-complete, meaning that UMMs can simulate any Turing machine. Note that Turing-completeness means only that all problems a Turing machine can solve can also be solved by a UMM. It does not imply anything about the resources (in terms of time, space, and energy) required to solve those problems.
The reverse statement, namely that all problems solved by UMMs can also be solved by Turing machines (i.e., the equivalence between the two models of computation), has not been proved yet. Nevertheless, a practical realization of digital memcomputing machines, using electronic circuits with memory [5], has been shown to be efficiently simulated with our modern-day computers (see, e.g., [7]). In other words, the ordinary differential equations representing DMMs and the problems they are meant to solve can be efficiently simulated on a classical computer.
In recent years, other computational models, each addressing different types of problems, have attracted considerable interest in the scientific community. On the one hand, quantum computing, namely computing using features, like tunneling or entanglement, that pertain only to quantum phenomena, has become a leading candidate to solve specific problems, such as prime factorization [8], annealing [9], or even simulations of quantum Hamiltonians [10]. This paradigm has matured from a simple proposal [11] to full-fledged devices for possible commercial use [12]. However, its scalability faces considerable practical challenges, and its range of applicability is very limited compared to even classical Turing machines.
Another type of computing model pertains to a seemingly different domain, the one of spiking (recurrent) neural networks, in which spatiotemporal patterns of the network can be used to, e.g., learn features after some training
[13]. Although somewhat different realizations of this type of network have been suggested, for the purposes of this paper we will focus only on the general concept of “reservoir computing” [14], and in particular on its “liquid-state machine” (LSM) realization [15], rather than the “echo-state network” one [16]. The results presented in this paper carry over to this other type of realization as well. Therefore, we will use the term “reservoir computing” as analogous to “liquid-state machine” and will not differentiate among the different realizations of the former.
Our goal in this paper is to study the relation between these seemingly different computational models. In particular, we will show that UMMs encompass both quantum machines and liquid-state machines, in the sense that, on a theoretical level, they can simulate any quantum computer or any liquid-state machine. Again, this does not say anything about the resources needed to simulate such machines, only that such a mapping exists. In other words, we prove here that UMMs are not only Turing-complete, but also “liquid-complete” (or “reservoir-complete”) and “quantum-complete”.
In order to prove these statements, we will use set theory and cardinality arguments to show that the LSM and quantum computers can be mapped to subsets of the UMM. This methodology is general. Therefore, we expect it to be applicable to any other type of computational model, other than the ones we consider in this work.
Our paper is organized as follows. In Sec. II we briefly review the mathematical definition of the machines we consider in this work, starting with UMMs. In Sec. III we introduce the basic ingredients of set theory that are needed to prove the completeness statements that we will need later. In Sec. IV we show how to define a general computing model in terms of set theory and cardinality arguments. We will use these results in Sec. V to rewrite UMMs, quantum computers and liquid-state machines in this set-theory language. This allows us to show explicitly in Sec. VI that UMMs are not only Turing-complete, but also quantum-complete and liquid-state complete. In Sec. VII we offer our conclusions and thoughts for future work.
II Review of Machine Definitions
We first provide a brief review of the definitions of the three machines we will be discussing in this paper, so the reader will have a basis of reference.
II-A Universal Memcomputing Machines
The UMM is an ideal machine formed by interconnected memory cells (“memcells” for short, or memprocessors). It is set up in such a way that it has the properties we have anticipated: intrinsic parallelism, functional polymorphism, and information overhead [4].
We can define a UMM as the eight-tuple [4]:

$\textrm{UMM} = (M, \Delta, \mathcal{P}, S, \Sigma, p_0, s_0, F),$   (1)
where $M$ is the set of possible states of a single memory cell, and $\Delta$ is the set of all transition functions:

$\delta_\alpha : M^{m_\alpha} \setminus F \times \mathcal{P} \to M^{m'_\alpha} \times \mathcal{P}^2 \times S,$   (2)
where $m_\alpha < \infty$ is the number of cells used as input (being read), $F$ is the set of final states, $\mathcal{P}$ is the set of input pointers, $m'_\alpha < \infty$ is the number of cells used as output (being written), $\mathcal{P}^2$ is the Cartesian product of the set of output pointers and the set of input pointers for the next step, and $S$ is the set of indices $\alpha$.
Informally, the machine does the following: every transition function $\delta_\alpha$ has some label $\alpha$, and the chosen transition function directs the machine to which cells to read as inputs. Depending on what is being read, the transition function then writes the output on a new set of cells (not necessarily distinct from the original ones). The transition function also contains information on which cells to read next, and which transition function(s) to use for the next step.
Using this definition, we can better explain what the two properties of intrinsic parallelism and functional polymorphism mean. Unlike a Turing machine, the UMM does not operate on each input cell individually. Instead, it reads the input cells as a whole, and writes to the output cells as a whole. Formally, what this means is that the transition function cannot be written as a Cartesian product of transition functions acting on individual cells, namely $\delta \neq \delta_1 \times \delta_2 \times \cdots \times \delta_n$. This is the property of intrinsic parallelism. We will later show that the set of transition functions without intrinsic parallelism is, in fact, a small subset of the set of transition functions with intrinsic parallelism.
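The claim that cell-by-cell (factorizable) transition functions form a small subset can be checked by brute force for a toy machine. The following sketch is our own illustration, not part of the formalism of Ref. [4]: it counts all transition functions on the joint state space of two binary cells, and compares that with the number of maps that decompose as a Cartesian product of per-cell maps.

```python
from itertools import product

# Two cells, each with states {0, 1}; a full machine state is a pair.
states = list(product([0, 1], repeat=2))          # 4 joint states

# All transition functions on the joint state space, encoded as the
# ordered tuple of images of the 4 joint states: 4**4 = 256 maps.
all_maps = list(product(states, repeat=len(states)))

# "Factorizable" maps act on each cell independently: delta = d1 x d2,
# where each di is one of the 2**2 = 4 self-maps of {0, 1},
# encoded as the pair (di(0), di(1)).
cell_maps = list(product([0, 1], repeat=2))
factorizable = set()
for d1 in cell_maps:
    for d2 in cell_maps:
        image = tuple((d1[a], d2[b]) for (a, b) in states)
        factorizable.add(image)

counts = (len(all_maps), len(factorizable))       # (256, 16)
```

Even for this two-cell toy machine, only 16 of the 256 transition functions factorize; the fraction shrinks rapidly as the number of cells grows.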
Furthermore, the transition function of the UMM is dynamic, meaning that it is possible for the transition function to change after each time step. This is encoded in the set $S$ at the output of the transition function, whose elements indicate which transition functions to use at the next time step. This is the property of functional polymorphism, meaning that the machine can admit multiple transition functions. Finally, the topology of the network is encoded in the definition of the transition functions, which map a given state of the machine into another. A more in-depth discussion of these properties can be found in the original paper [4].
II-B Liquid-State Machines
Informally, we can think of the LSM as a reservoir of water [16]. The process of sending the input signals into the machine is analogous to us dropping stones into the water, and the evolution of the internal states of the machine is the propagation of the water waves. Different waveforms will be excited depending on what stones were being dropped, how and when they were dropped, and the properties of the waveforms will encompass the information of the stones being dropped. Therefore, we can train a function that maps the waveforms to the corresponding output states that we desire.
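The water-reservoir analogy can be made concrete with a toy recurrent network. The sketch below is purely illustrative (the network size, weight scaling, and input sequences are arbitrary choices of ours, not taken from the LSM literature): a fixed random recurrent layer is driven by an input sequence, and different “stones” (input sequences) excite different “waveforms” (internal states).

```python
import math
import random

def reservoir_state(inputs, n=20, seed=0):
    """Drive a tiny random recurrent "reservoir" with a scalar input sequence
    and return its final internal state (a list of n unit activations)."""
    rng = random.Random(seed)
    # Fixed random recurrent weights, scaled down so the dynamics are stable,
    # plus fixed random input weights. These play the role of the "liquid".
    W = [[0.9 * rng.uniform(-1, 1) / n for _ in range(n)] for _ in range(n)]
    w_in = [rng.uniform(-1, 1) for _ in range(n)]
    x = [0.0] * n
    for u in inputs:  # each input sample is a "stone" dropped into the liquid
        x = [math.tanh(sum(W[i][j] * x[j] for j in range(n)) + w_in[i] * u)
             for i in range(n)]
    return x

# Two different input sequences ("stones") excite different "waveforms".
s1 = reservoir_state([1.0, 0.0, 0.0, 0.0])
s2 = reservoir_state([0.0, 0.0, 1.0, 0.0])
separated = max(abs(a - b) for a, b in zip(s1, s2)) > 1e-9
```

A complete LSM would add a trained readout function mapping these internal states to outputs; here we only illustrate the state-separation idea formalized in Definition 1 below.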
Formally, we can define the machine using a set of filters and a trained function [13]. A series of filters defines the evolution of the internal states, and the trained function maps the internal states to some output.
The set of filters must satisfy the pointwise separation property. This is defined as follows:
Definition 1.
A class $\mathcal{B}$ of basis filters has the pointwise separation property if for any two input functions $u(\cdot)$ and $v(\cdot)$, with $u(s) \neq v(s)$ for some $s \leq t$, there is a basis filter $B \in \mathcal{B}$ such that $(Bu)(t) \neq (Bv)(t)$.
This means that we can choose a series of filters such that the evolution of the internal states will be unique to any given signal, at any given time. In other words, this property ensures that different “stones” excite different “waveforms”.
The trained output function must satisfy the “fading memory” property. This is defined as follows:
Definition 2.
The output function $F$ has fading memory if for every internal state $x(\cdot)$ and every $\varepsilon > 0$, there exist $\delta > 0$ and $T > 0$ so that $|F(x)(t) - F(\tilde{x})(t)| < \varepsilon$ for all $\tilde{x}(\cdot)$ with $|x(s) - \tilde{x}(s)| < \delta$ for all $s \in [t - T, t]$.
Intuitively, this means that we do not need to know the evolution of the internal states from the infinite past to determine a unique output. Instead, we only need to know the evolution starting from a finite time $t - T$.
II-C Quantum Computers
There are many ways in which one can define a quantum computer. For simplicity's sake, we consider the most general model of quantum computing and ignore its specific operations.
Consider a quantum computer with $N$ identical qubits, where each qubit can be expressed as a linear combination of $b$ basis states. The choice of the basis states can be arbitrary; however, they have to span the entire Hilbert space of the system. If we look at one single qubit, then every state that it admits can be expressed as a linear combination of the basis states. (Typically $b$ is chosen equal to 2, but we do not restrict the quantum machine to this basis number here.) In other words, $|\psi\rangle = \sum_{j=1}^{b} c_j |j\rangle$, where $j$ simply labels the basis state, and $c_j$ is some complex number. Note that we have to impose the normalization condition $\sum_j |c_j|^2 = 1$.

Now, let us consider the whole system. In general, the total state can be expressed as $|\Psi\rangle = \sum_{j_1, \ldots, j_N} c_{j_1 \cdots j_N} |j_1\rangle \otimes \cdots \otimes |j_N\rangle$. Here, $|j_i\rangle$ denotes that the $i$-th qubit is in the $j_i$-th basis state. A basis state of the total wavefunction can be expressed as the tensor product of the basis states of the individual qubits. It is then not hard to see that the total state will have a total number of $b^N$ basis states. Each basis state is associated with some complex factor $c_{j_1 \cdots j_N}$, where, again, the normalization condition has to be imposed.

Any quantum algorithm can be expressed as a series of unitary operations [17]. Since the product of multiple unitary operations is still a unitary operation, we can express the total operation after time $t$ with a single operator $U(t)$, so that $|\Psi(t)\rangle = U(t)|\Psi(0)\rangle$. In other words, the state of the quantum system at any point in time can be expressed as some unitary operation on the initial state. Note that $U(t)$ can be either continuous or discrete in time. Either way, $U(t)$ can be considered as the “transition function” of the quantum computer.
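As a minimal numerical illustration of the unitary “transition function” acting on an initial state (with $b = 2$ and $N = 2$, hence $2^2 = 4$ basis states), the sketch below applies a Hadamard gate on the first qubit of the state $|00\rangle$ and checks that normalization is preserved. The helper functions are our own, not a standard library API.

```python
import math

def kron(A, B):
    """Kronecker (tensor) product of two matrices given as lists of lists."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def matvec(M, v):
    """Matrix-vector product."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

h = 1 / math.sqrt(2)
H = [[h, h], [h, -h]]        # Hadamard gate (a unitary operation)
I2 = [[1, 0], [0, 1]]        # single-qubit identity

# N = 2 qubits with b = 2 basis states each: 2**2 = 4 total basis states.
psi0 = [1, 0, 0, 0]          # initial state |00>
U = kron(H, I2)              # Hadamard on the first qubit, identity on the second
psi = matvec(U, psi0)        # |psi(t)> = U(t) |psi(0)>

norm = sum(abs(c) ** 2 for c in psi)   # unitarity preserves normalization
```

Composing several such gates amounts to multiplying their unitaries, which is why the whole algorithm collapses into a single operator $U(t)$.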
Finally, we have to make measurements on the system in order to convert the quantum state into some meaningful output for the observable we are interested in. The process of measurement can be considered as finding the expectation value of some observable $O$, so the output function of a quantum computer can be written as $\langle O \rangle = \langle \Psi | O | \Psi \rangle$. Of course, to obtain an accurate result for the expectation value, many measurements have to be made. In fact, the initial state has to be prepared multiple times, evolved multiple times, and the corresponding expectation value at a given time must be measured multiple times. A quantum computer is thus an intrinsically probabilistic type of machine.
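The probabilistic nature of the readout can be illustrated by estimating an expectation value from repeated simulated measurements. This sketch assumes a diagonal observable (Pauli $Z$ on the first qubit) and a hypothetical two-qubit state; the shot count is an arbitrary choice.

```python
import math
import random

# Hypothetical state (|00> + |10>)/sqrt(2) in the basis {|00>, |01>, |10>, |11>}.
h = 1 / math.sqrt(2)
psi = [h, 0, h, 0]

# Observable: Pauli Z on the first qubit, diagonal in this basis.
z_first = [+1, +1, -1, -1]

# Exact expectation value <psi| O |psi> for a diagonal O.
exact = sum(abs(c) ** 2 * z for c, z in zip(psi, z_first))

# A physical quantum computer must instead *sample*:
# prepare, evolve, measure, and repeat many times ("shots").
rng = random.Random(1)
probs = [abs(c) ** 2 for c in psi]
shots = 10000
estimate = sum(rng.choices(z_first, weights=probs)[0]
               for _ in range(shots)) / shots
```

The sampled `estimate` converges to `exact` only as the number of shots grows, which is the point made above: the output of a quantum computer is intrinsically statistical.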
III Mathematical Tools
After the description of the three machines we consider in this work, we now introduce the necessary mathematical tools that will allow us to construct a general model of a computing machine using set theory and cardinality arguments. Most of the definitions and theorems in this section, together with their detailed proofs, can be found in the literature on the subject (see, e.g., the textbook [18]).
We denote the cardinality of the set of all natural numbers as $\aleph_0$ (aleph-zero), and the cardinality of the set of all real numbers as $c$ (the cardinality of the continuum). Then, one can prove the following theorems [19]:
Theorem III.1.
The power set of the natural numbers has cardinality $c$, namely $|2^{\mathbb{N}}| = c$.
Theorem III.2.
Any open interval of real numbers has cardinality $c$.
We can generalize the concept of infinity by introducing Beth numbers [20], defined as follows:
Definition 3.
Let $\beth_0 = \aleph_0$, and $\beth_{n+1} = 2^{\beth_n}$ for all $n \geq 0$.
By this definition, we see that $\beth_0 = \aleph_0$ and $\beth_1 = c$. Each Beth number is strictly greater than the one preceding it. The following theorem [18] allows us to perform arithmetic on infinite cardinal numbers, and to derive relationships between Beth numbers.
Theorem III.3.
Given any two cardinal numbers, $\kappa$ and $\lambda$, if at least one of them is infinite, then $\kappa + \lambda = \kappa \cdot \lambda = \max(\kappa, \lambda)$.
Using this theorem, we can prove the following properties of Beth numbers:
Corollary III.3.1.
$\beth_m \cdot \beth_n = \beth_n$ for all $m \leq n$, where $m, n \geq 0$.
Corollary III.3.2.
$\kappa^{\beth_n} = \beth_{n+1}$ if $2 \leq \kappa \leq \beth_{n+1}$. $\beth_n^{\,k} = \beth_n$ if $k$ is finite and $k \geq 1$.
Proof.
Corollary III.3.1 follows trivially from Theorem III.3 if we note that each Beth number is strictly greater than its predecessor. ∎
Proof.
For the first statement, note that $\beth_{n+1} = 2^{\beth_n} \leq \kappa^{\beth_n} \leq (\beth_{n+1})^{\beth_n} = 2^{\beth_n \cdot \beth_n} = 2^{\beth_n} = \beth_{n+1}$, where we used Theorem III.3. The second statement follows by repeated application of Theorem III.3. ∎
In the following section, we will define computing models using Cartesian products and mapping functions. The following two theorems will be helpful [18]:
Theorem III.4.
Let the set $C$ be the Cartesian product of the sets $A$ and $B$, or $C = A \times B$. Then $|C| = |A| \cdot |B|$.
Theorem III.5.
Let $f$ be a function that maps set $A$ to set $B$, and let $F$ be the set of all possible functions $f : A \to B$. Then $|F| = |B|^{|A|}$.
Finally, we introduce the following theorem, which can be derived directly from the definition of cardinality [18]:
Theorem III.6.
Two sets have the same cardinality if and only if there is a bijection between them.
This theorem implies that there is a bijection between any two real coordinate spaces regardless of their dimensions:
Corollary III.6.1.
There is a bijection between $\mathbb{R}^n$ and $\mathbb{R}^m$, for any $n, m \geq 1$.
Proof.
From Corollary III.3.2, we see that $|\mathbb{R}^n| = c^n = c$. Similarly, $|\mathbb{R}^m| = c^m = c$. Therefore, the two sets have the same cardinality, so there is a bijection between the two. ∎
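The bijection guaranteed by this cardinality argument is non-constructive, but a countable analogue can be made fully explicit: the classical Cantor pairing function is a concrete bijection between $\mathbb{N} \times \mathbb{N}$ and $\mathbb{N}$, playing the role for countable sets that the bijection between $\mathbb{R}^n$ and $\mathbb{R}^m$ plays above. A sketch:

```python
def cantor_pair(x, y):
    """Cantor pairing function: a bijection from N x N to N."""
    return (x + y) * (x + y + 1) // 2 + y

def cantor_unpair(z):
    """Inverse of the Cantor pairing function."""
    w = int(((8 * z + 1) ** 0.5 - 1) // 2)   # index of the diagonal containing z
    t = w * (w + 1) // 2                      # first code on that diagonal
    y = z - t
    x = w - y
    return x, y

# Round-trip check over a small grid: every pair maps to a distinct natural
# number and maps back, exhibiting |N x N| = |N| explicitly.
pairs = [(x, y) for x in range(30) for y in range(30)]
codes = [cantor_pair(x, y) for x, y in pairs]
ok = (len(set(codes)) == len(pairs)
      and all(cantor_unpair(z) == p for z, p in zip(codes, pairs)))
```

For the uncountable case of the corollary, an analogous (but non-computable) interleaving-of-digits construction exists; only the existence of the bijection matters for the arguments that follow.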
Note that for complex coordinate spaces, $|\mathbb{C}^n| = c^{2n} = c$. Therefore, there is a bijection between any two complex coordinate spaces as well. Furthermore, there is a bijection between any complex coordinate space and any real coordinate space.
IV General Computing Model
We have now introduced all the mathematical tools necessary to define a computing machine using set theory. We begin by describing the Cartesian product of two sets: the set of all internal states and the set of all transition functions.
IV-A Internal States
Consider a general computing machine. We let the state variable $x$ describe the full internal state of the machine, which belongs to a set $X$ of states. The internal state should encompass all the necessary information, such that given this state variable and any transition function (which will be defined shortly), the machine can fully determine the next internal state. The important thing to note is that not all the elements of $X$ are necessarily possible internal states of the machine. In other words, the set of all possible internal states is only a subset of $X$, and the set $X$ and this subset are not necessarily equivalent. The reason for this will be explained in greater detail later in Sec. IV-C.
As an example, consider the full internal state of the Turing machine. The internal state should include the Cartesian product of three states: the register state of the control ($x_1$), the tape symbols written on the cells ($x_2$), and the current address of the head ($x_3$) [21]. Given these three states and some transition function (which depends on how the machine is coded), the processor will know what to write (thereby changing $x_2$), in what direction to move (thereby changing $x_3$), and what new state to be in (thus changing $x_1$).
We see that the set $X$ is expressible as a Cartesian product of three sets, $X = X_1 \times X_2 \times X_3$; then $|X| = |X_1| \cdot |X_2| \cdot |X_3|$. Consider a Turing machine with $m$ tape symbols, $n$ tape cells, and $k$ register states. It is easy to see that $|X_1| = k$, $|X_2| = m^n$, and $|X_3| = n$. Therefore, we can calculate $|X| = k \cdot m^n \cdot n < \aleph_0$. The strict inequality comes from the fact that $m$, $n$, and $k$ are all finite numbers. In other words, this is a finite digital/discrete machine, and in general, the inequality $|X| < \aleph_0$ is true for a finite digital machine.
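For concreteness, this counting of internal states can be evaluated for a small hypothetical machine; the numbers below are arbitrary choices for illustration only.

```python
# Hypothetical finite Turing machine: m tape symbols, n tape cells,
# k register states (arbitrary small numbers, for illustration only).
m, n, k = 4, 10, 3

size_register = k        # |X1|: register states of the control
size_tape = m ** n       # |X2|: one of m symbols on each of n cells
size_head = n            # |X3|: possible head addresses

total = size_register * size_tape * size_head   # |X| = k * m**n * n
```

Even for these tiny parameters the tape factor $m^n$ dominates: the machine has a little over $3 \times 10^7$ internal states, yet the total remains finite.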
On the other hand, if we consider the theoretical model of a Turing machine with infinitely many tape cells ($n = \aleph_0$), then the calculation changes significantly. First, note that $|X_3| = \aleph_0$, because we are working with infinitely many discrete cells and we can map each cell to an integer in $\mathbb{N}$. Then, we have $|X_1| = k$, $|X_2| = m^{\aleph_0} = c$ (from Corollary III.3.2), and $|X_3| = \aleph_0$. Therefore, we see that $|X| = k \cdot c \cdot \aleph_0 = c$ (from Theorem III.3 and Corollary III.3.1). For the purpose of proving Turing-completeness, we use this model of infinite tape.
IV-B Transition Function
IV-B1 General Definition of the Transition Function
The operation of any computing machine is defined using a transition function [21] (or a set of transition functions, as in a general UMM [4]). However, the transition functions we consider are “deterministic”, in contrast to the “non-deterministic” transition functions used, for example, to define the non-deterministic Turing machine [3]. Here, we give a much more general definition of the (deterministic) transition function. Essentially, we are throwing away constraints such as initial states, accepting states, tape symbols, and so forth, and simply define the transition function as a mapping from some state to some other (or the same) state. We can then formally define the transition function as follows:
Definition 4.
Let $X$ be any set; then $\delta$ is said to be a transition function on $X$ if, for every $x \in X$, there is a unique $\delta(x) \in X$. In other words, $\delta$ is a function that maps $X$ to itself, or $\delta : X \to X$. We denote the set of all transition functions on $X$ as $T_X$.
From Theorem III.5, it is easy to see that the cardinality of $T_X$ is simply $|X|^{|X|}$. Again, it is important to note that the set of the actual transition functions of a given machine is a subset of the set of all possible transition functions $T_X$. The two are generally not equivalent. This is because we are not defining the transition function based on the operation of the machine. Instead, we are defining the transition function as a mapping of a set to itself (see Fig. 1 for a schematic representation), so there are transitions that are impossible for the machine to support. The difference between the “possible” and “actual” transition functions will be formalized in Section IV-C.
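For a finite set, $T_X$ and its cardinality $|X|^{|X|}$ can be enumerated directly. The sketch below (an illustration of ours, with an arbitrary three-element set) lists all self-maps and, as an aside, singles out the few that are bijections:

```python
from itertools import product

X = [0, 1, 2]

# A transition function on X assigns to each x in X a unique image in X,
# so T_X corresponds to all |X|-tuples of images: |T_X| = |X| ** |X|.
T_X = [dict(zip(X, images)) for images in product(X, repeat=len(X))]

count = len(T_X)                                          # 3**3 = 27

# Most of these maps are not invertible; only |X|! of them are bijections.
bijections = [d for d in T_X if set(d.values()) == set(X)]
```

Any particular machine on these states would only realize some subset of the 27 maps, which is exactly the distinction between “possible” and “actual” transition functions drawn above.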
IV-B2 Turing Machine as an Example
For a Turing machine, after the machine is coded, the transition function remains stationary and cannot be changed during the execution of an algorithm (unlike a UMM, where the transition function is dynamic). In other words, the machine will take some initial internal state $x_0$ and apply some transition function $\delta$ to it recursively until the final state is reached and the machine halts. We can express this as $x_f = \delta^n(x_0)$, where $n$ is some integer. Furthermore, when the final state is reached, we should have $\delta(x_f) = x_f$, meaning that the transition function should not alter the final state; this represents the termination of the algorithm.
The entire machine process can be fully described given some initial state $x_0$ and some transition function $\delta$. To show this, we only have to consider a single-step process, i.e., show that there is an appropriate choice of $\delta$ such that, given some state $x$, $\delta(x)$ will always give us the expected next state for any algorithm. The following argument demonstrates why this is true.
From Section IV-A, we know that given a state $x$, we can divide the state into three components, $x = (x_1, x_2, x_3)$. Recall that the three components give the register state, the tape symbols on the tape cells, and the address of the head, respectively. First, the machine reads the tape symbol under the head. This is equivalent to the transition function taking the input $x_3$ (locating the head) and $x_2$ (reading the tape symbol). Then, according to what is being read and the register state of the control, the head writes a new symbol on the tape cell. This corresponds to the transition function taking the input $x_1$ (reading the register state) and outputting $x_2'$ (updating the tape symbol). Finally, the machine moves the head and changes the register state. This is equivalent to the transition function outputting $x_1'$ (updating the register state) and $x_3'$ (updating the address). Therefore, we see that a single-step machine process is fully encompassed in the transition $x \to x' = (x_1', x_2', x_3')$. For every $x$, there exists a unique $x'$ associated with it. We can then choose $\delta$ such that $\delta(x) = x'$ is satisfied for every $x$. Therefore, $\delta$ fully describes the single-step machine process, and this implies that recursions of $\delta$ can describe any Turing machine algorithm.
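The single-step argument can be mimicked in code: a transition function acting on the triple of register state, tape contents, and head address, iterated until it reaches a fixed point. The rule set below (flip the symbol under the head and move right, halting past the last cell) is a hypothetical toy machine of our own, chosen only to make the fixed-point behavior visible.

```python
def step(x):
    """One Turing-machine step as a transition function on x = (x1, x2, x3):
    x1 = register state, x2 = tape contents (tuple), x3 = head address.

    Hypothetical rule set: in state 'flip', invert the binary symbol under
    the head and move right; past the last cell, enter 'halt', which is a
    fixed point (delta(x_f) = x_f)."""
    x1, x2, x3 = x
    if x1 == 'halt' or x3 >= len(x2):
        return ('halt', x2, x3)                        # final state: unchanged
    symbol = x2[x3]                                    # read the cell under the head
    tape = x2[:x3] + (1 - symbol,) + x2[x3 + 1:]       # write the new symbol
    return ('flip', tape, x3 + 1)                      # update register and head

# Recursing the single-step transition runs the whole "algorithm".
x = ('flip', (0, 1, 0), 0)
while True:
    nxt = step(x)
    if nxt == x:       # fixed point reached: the machine has halted
        break
    x = nxt
```

Here the whole computation is nothing but repeated application of one deterministic map on the full internal state, exactly as in the argument above.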
Even though one can find a $\delta$ for every Turing machine algorithm, the reverse is not true. In fact, there are transition functions in $T_X$ that the Turing machine cannot support. For example, consider the operation of simultaneously writing two tape symbols on two cells. This cannot be done by the Turing machine since, by definition, a Turing machine can only write one tape symbol on one tape cell at a time. However, this transition function is not excluded a priori from $T_X$, because we can always find a $\delta \in T_X$ such that $x$ and $\delta(x)$ differ by two tape symbols for some $x$.
IV-B3 Proper Bijection
Let us first consider the following important theorem:
Theorem IV.1.
If $|X| = |Y|$, then $|T_X| = |T_Y|$.
The theorem can be easily proven using cardinality arguments (from Theorem III.6). In other words, if there exists a bijection between two sets, then there must also be a bijection between the sets of their respective transition functions.
Note that at this point, we have only proved that there is a bijection between the two transition function sets. However, in our case, we are really only interested in one particular bijection. For simplicity, let us define this bijection for the case where $X$ and $Y$ are sets of real numbers; the definition easily generalizes to other cases.
If $|X| = |Y|$, then there is a bijection from $X$ to $Y$. Denote this bijective function as $g$. Let $\delta_X$ be some transition function on $X$, and for $x_a, x_b \in X$, let $\delta_X(x_a) = x_b$, where $a$ and $b$ are just arbitrary real-number labels. Then we can define a bijection from $T_X$ to $T_Y$ that pairs $\delta_X$ with $\delta_Y = g \circ \delta_X \circ g^{-1}$. Note that if we let $y_a = g(x_a)$ and $y_b = g(x_b)$, then $\delta_Y(y_a) = y_b$: the two transition functions perform the “same” operation on their respective sets. We then define $\delta_Y$ as the proper bijection partner of $\delta_X$, and we do this for every single element of the transition function set.
To put this informally, we first define some bijection $g$ from set $X$ to set $Y$; then, if $\delta_X$ maps $x_a$ to $x_b$, its partner $\delta_Y$ maps $g(x_a)$ to $g(x_b)$. The proper bijection essentially maps all machine operations from one machine to another machine. See Fig. 2 for a schematic representation of a pairing of transition functions under the proper bijection.
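The conjugation construction $\delta_Y = g \circ \delta_X \circ g^{-1}$ is easy to verify on finite sets. In this sketch (with arbitrary example sets and a hypothetical cyclic-shift transition of our own choosing), the paired transition functions run in lockstep under the bijection:

```python
# A bijection g between two equal-sized state sets X and Y.
X = [0, 1, 2, 3]
Y = ['a', 'b', 'c', 'd']
g = dict(zip(X, Y))                     # g: X -> Y
g_inv = {v: k for k, v in g.items()}    # g^{-1}: Y -> X

# A transition function on X (a hypothetical cyclic shift).
delta_X = {x: (x + 1) % 4 for x in X}

# Its proper-bijection partner on Y: delta_Y = g o delta_X o g^{-1}.
delta_Y = {y: g[delta_X[g_inv[y]]] for y in Y}

# The two machines run "in lockstep": g(delta_X(x)) == delta_Y(g(x)).
lockstep = all(g[delta_X[x]] == delta_Y[g[x]] for x in X)
```

Because `g` translates every operation of the first machine into an operation of the second (and back), a machine admitting `delta_Y` can simulate any run of the machine admitting `delta_X`.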
The following theorem is a restatement of the above discussion:
Theorem IV.2.
If $|X| = |Y|$, then there is a proper bijection from $T_X$ to $T_Y$.
It is easy to see that a proper bijection implies that any algorithm that machine $A$ can run can also be run by machine $B$. This essentially means that $A$ and $B$ are equivalent [22]. However, this is under the assumption that there are absolutely no constraints whatsoever on the transition functions and/or states of the machines. As we will see shortly, this is generally not the case.
IV-C Constraints
The above naive argument essentially states that if the internal states of two different machines have the same cardinality, then the two machines are equivalent. This is obviously not the case. Any computational model always has some constraints, either on the internal states or on the transition functions (or both) of the machine at hand.
In other words, the actual set of internal states is a subset of $X$, and the actual set of transition functions is a subset of $T_X$. In some sense, the computational structure of a machine is defined by its constraints, so we should carefully discuss what constraints actually are.
IV-C1 Constraints on Internal States
We make the following distinction between the full set of internal states and the actual set of internal states.
Definition 5.
Let $X_a$ be the set of all possible internal states that a machine can support. We call this the actual set of internal states. Let $X$ be some arbitrary superset of $X_a$. We call $X$ the full set of internal states.
As an example, consider a very simple computer that only consists of two bits, with each bit supporting two states, 0 and 1. However, the two bits are connected in such a way that they must have the same state at any given time. In other words, the actual set of internal states is $X_a = \{(0,0), (1,1)\}$, where $(i, j)$ denotes the first bit being in state $i$ and the second bit being in state $j$.
A natural choice for the full set is instead $X = \{(0,0), (0,1), (1,0), (1,1)\}$, namely the set of internal states if the two bits were not connected (constrained). However, we may as well have chosen $X$ to be, for example, $\{(0,0), (1,1), (0,1)\}$ or $\{(0,0), (1,1), (2,2)\}$. (Note that the last element of the second set, $(2,2)$, represents a state that is not supported by the machine, even if the constraint were to be removed.) However, it is usually convenient for us to choose the full set such that it includes, and only includes, all the possible states when the constraints on the machine are removed.
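The two-bit example can be enumerated directly. The sketch below builds the full (unconstrained) state set and the actual (constrained) one, and also counts the corresponding full sets of transition functions on each, using the rule $|T| = |\text{set}|^{|\text{set}|}$ from Theorem III.5:

```python
from itertools import product

# Two bits, each supporting states {0, 1}; a machine state is a pair of bits.
full = list(product([0, 1], repeat=2))        # full set X (bits unconstrained)
actual = [s for s in full if s[0] == s[1]]    # actual set X_a (bits must agree)

# Cardinalities of the state sets and of the full sets of transition
# functions on each state set.
state_sizes = (len(full), len(actual))                  # (4, 2)
transition_sizes = (len(full) ** len(full),             # 4**4 = 256
                    len(actual) ** len(actual))         # 2**2 = 4
```

The constraint cuts the state set in half but shrinks the space of available transition functions from 256 to 4, which is the sense in which "the computational structure of a machine is defined by its constraints".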
IV-C2 Constraints on Transition Functions
Following a similar reasoning, we distinguish between the full set of transition functions and the actual set of transition functions. The full set of transition functions on $X$ is already defined in Def. 4, and the definition of the actual set is as follows:
Definition 6.
Let $T_a$ be the set of all transition functions of a machine when the constraints on the machine are included. We call this the actual set of transition functions.
Following the above example, $X_a = \{(0,0), (1,1)\}$ and $X = \{(0,0), (0,1), (1,0), (1,1)\}$. For convenience, let us express the set elements in decimal representation. Then $X_a = \{0, 3\}$ and $X = \{0, 1, 2, 3\}$.
We can then choose $T_a$ so that it only contains 4 transition functions, and we let them be $\delta(x) = x$, $\delta(x) = 3 - x$, $\delta(x) = 0$, and $\delta(x) = 3$. Note that this is obviously not the full set of transition functions because, for example, there is no transition function in $T_a$ such that $\delta((x_1, x_2)) = (x_1 \wedge x_2,\, x_1 \wedge x_2)$ (where $\wedge$ is the logical AND) is satisfied. In fact, the full set of transition functions has cardinality $|X|^{|X|} = 4^4 = 256$, and we merely considered 4 of them.
In addition, the transition function sets $T_a$ and $T_{X_a}$ are not the same: $T_{X_a}$ is the full set of transition functions on the actual set of internal states $X_a$.
To illustrate this difference, consider the previous example of $X_a = \{0, 3\}$. Let us try to find the full set of transition functions on this set. There should be $|X_a|^{|X_a|} = 2^2 = 4$ possible transition functions. They are $\delta(x) = x$, $\delta(x) = 3 - x$, $\delta(x) = 0$, and $\delta(x) = 3$. Without any formal proof, we can already see that $T_{X_a}$ is very different from $T_X$, and there is no proper bijection between $T_a$ and $T_{X_a}$ even though they have the same cardinality of 4. This is because the sets that the transition functions act on are different (they have different cardinality), so it makes no sense to pair them together.
An important aside is that a transition function is defined by its operation on every element of a particular set, and not by its analytic representation. For example, $\delta(x) = x$ on $X_a = \{0, 3\}$ is different from $\delta(x) = x$ on $X = \{0, 1, 2, 3\}$, even though they have the same analytic expression. (For example, $\delta(1)$ is not defined in the former case, while $\delta(1) = 1$ in the latter.) On the other hand, $\delta(x) = x$ and $\delta(x) = x^2/3$ are the same on the set $\{0, 3\}$, even though their analytic expressions are different. (They are, however, not the same on the set $\{0, 1, 2, 3\}$.)
IV-D Equivalence and Completeness
At this point we can recall the traditional definitions of “equivalence” and “completeness” among machines. The equivalence relationship is defined as follows [22]:
Definition 7.
Two machines, $A$ and $B$, are said to be equivalent if $A$ can simulate $B$, and $B$ can simulate $A$.
Then the following theorem is clearly true:
Theorem IV.3.
Two machines, $A$ and $B$, are equivalent if $|X_A| = |X_B|$ and there is a proper bijection between $T_A$ and $T_B$.
This theorem is fairly obvious. If there is a proper bijection between the two sets of transition functions, then we can map every machine operation from the first machine to the second machine, and vice versa. This implies that the two machines can simulate each other. Note that from Section IV-B3, it is clear that there is always a proper bijection between the two full sets of transition functions $T_{X_A}$ and $T_{X_B}$ if $|X_A| = |X_B|$, so if two equal-sized machines admit the full sets of transition functions, then the two machines are equivalent.
Now, the natural next step is to define the concept of completeness, but before we do so, let us first introduce the concept of a reduced machine.
Definition 8.
Consider a machine $A$. If $|X_B| \leq |X_A|$, then we call $B$ the reduced machine of $A$.
It is easy to obtain mathematically a reduced machine from a full machine. First, we let $|X_A| = |X_B| \cdot |X_C|$ for some set $X_C$; then we can express the full set as $X_A = X_B \times X_C$ (note that this is only true if such a factorization of the cardinality exists). It is not hard to show that $T_{X_B} \times T_{X_C} \subset T_{X_A}$, and from this subset we can simply “ignore” the $X_C$ factor. Then it is clear that $A$ simulates $B$, but not the other way around. Informally, we are using a machine with greater “resources” to simulate another machine with lesser “resources”.
Let us then recall the conventional definition of completeness [22]:
Definition 9.
Machine $A$ is said to be $B$-complete if the former can simulate the latter.
From this definition, it is clear that, for the example above, $A$ is $B$-complete. Note also that equivalence implies completeness, but the reverse is not true.
It is also obvious that a machine admitting the full set of transition functions $T_X$ can simulate the corresponding constrained machine admitting only $T_a$, since $T_a$ is just a subset of the full set of machine processes (transition functions). Then we can also say that the unconstrained machine is complete with respect to the constrained one. Informally, we are using an unconstrained machine to simulate a constrained machine.
Let us then recall a few obvious lemmas:
Lemma IV.4.
If machine $A$ is equivalent to machine $B$, and machine $B$ is equivalent to machine $C$, then machine $A$ is equivalent to machine $C$.
Lemma IV.5.
If machine $A$ is $B$-complete, and machine $B$ is $C$-complete, then machine $A$ is $C$-complete.
With these lemmas and preliminaries, we can then prove this important theorem:
Theorem IV.6.
Machine $A$ is $B$-complete if $|X_A| \geq |X_B|$, and if we can find a reduced machine $R$ of $A$ such that $|X_R| = |X_B|$ and there is a proper bijection between $T_B$ and some subset of $T_R$.
Proof.
The proof of this is quite simple. We know that $A$ is $R$-complete, since the latter is just the reduced machine of the former. Furthermore, we also know that $R$ is $B$-complete. This is because the image of $T_B$ is just a subset of $T_R$, or there is a “proper injection” from $T_B$ to $T_R$, implying that any machine process of $B$ can be simulated by $R$. Therefore, from Lemma IV.5, we see that $A$ is $B$-complete. ∎
From this theorem, we can immediately derive a corollary that will be useful for showing the universality of memcomputing machines in section VI:
Corollary IV.6.1.
Machine $A$ is $B$-complete if $|X_A| \geq |X_B|$, where $T_A$ is the full set of transition functions and $T_B$ can be either the actual set or the full set of transition functions.
Proof.
First, we find a reduced machine $R$ of $A$ such that $|X_R| = |X_B|$. Note that if a machine admits the full set of transition functions, then its reduced machine also admits the full set. Since $|X_R| = |X_B|$, we can find a proper bijection from $T_{X_R}$ to $T_{X_B}$ (since both sets of transition functions are full). Under the same proper bijective mapping, we can map $T_B$ (a subset of $T_{X_B}$) to a subset of $T_{X_R}$. In other words, there is a proper bijection from $T_B$ to some subset of $T_R$, and from Theorem IV.6, machine $A$ is $B$-complete. ∎
V Revised Machine Definitions within Set Theory
Up to this point, we have avoided the discussion of the concept of “output states” of a machine. However, under this new mathematical framework, this concept is easily expressible.
In general, the internal state of the machine must be “decoded” into an output state to be read by the user. We can denote the set of all possible output states as $U_{\rm out}$. Then it is obvious that $\mathrm{card}(U_{\rm out}) \leq \mathrm{card}(U)$, otherwise we would not be able to find a function that maps $U$ onto $U_{\rm out}$. In other words, every internal state must correspond to a unique output state, and it is easy to show that the output function is expressible as a transition function.
To show this, we first choose a subset $U' \subseteq U$ such that $\mathrm{card}(U') = \mathrm{card}(U_{\rm out})$; then there is a bijection between $U'$ and $U_{\rm out}$. Therefore, we can simply describe the output mapping function as a transition function that maps $U$ to $U'$. Then it is clear that the set of all output mapping functions is a subset of the full set $F$, so there is no need to define a new set to include the output functions.
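A minimal finite sketch of this construction (the state sets and the readout map below are illustrative choices): an output map into U_out is re-expressed as a transition function into a subset of the internal states.

```python
# Finite sketch: expressing an output map as a transition function.
# U: internal states; U_out: output states with card(U_out) <= card(U).
U = [0, 1, 2, 3]
U_out = ["low", "high"]

# Choose a subset U_sub of U with card(U_sub) == card(U_out), plus a bijection.
U_sub = [0, 1]
bij = {0: "low", 1: "high"}

# The "output function" the user cares about: U -> U_out.
def readout(u):
    return "low" if u < 2 else "high"

# The same map expressed as a transition function U -> U_sub, i.e. an element
# of the full set F of functions U -> U, composed with the bijection.
inv = {v: k for k, v in bij.items()}
def as_transition(u):
    return inv[readout(u)]

assert all(bij[as_transition(u)] == readout(u) for u in U)
print("output map realized as a transition function into", U_sub)
```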
The mathematical framework has now been fully established, and we are ready to redefine the three machines we are considering in this work within this framework.
V.A Universal Memcomputing Machines
In the original definition of the UMM, there is the complication of input and output pointers (see Eq. (1)). Within the new set-theory framework, we can avoid the concept of pointers, since the transition function always reads the full internal state (all the cells) and writes the full internal state. In other words, we consider the combination of all cell states as a whole, and make no attempt to describe which cell admits which state. In this case, intrinsic parallelism is obviously implied.
Let us then discuss the cardinality of the set of internal states. For a UMM, there is a finite number of cells, $n$, and each cell may admit a continuous state (with cardinality $2^{\aleph_0}$, that of the continuum). In this case, it is easy to show that $\mathrm{card}(U_{\rm UMM}) = (2^{\aleph_0})^n = 2^{\aleph_0}$.
Furthermore, in the original definition of the UMM [Eq. (1)], there are no constraints on the transition function. This means that we can use the full set of transition functions, $F_{\rm UMM}$, to describe the machine. In this case, functional polymorphism is obviously implied. Therefore, we can define the UMM as $M_{\rm UMM} = (U_{\rm UMM}, F_{\rm UMM})$, with $\mathrm{card}(U_{\rm UMM}) = 2^{\aleph_0}$.
V.B Liquid-State Machine
It is not hard to see that the internal state structures for the LSM and the UMM are similar. Instead of memory cells, the LSM has neurons. But if we make the conservative assumption that there are no constraints on the internal states of the LSM, then the cardinality of the set of internal states for this machine is the same as that of a UMM, $\mathrm{card}(U_{\rm LSM}) = \mathrm{card}(U_{\rm UMM}) = 2^{\aleph_0}$. However, what really distinguishes the LSM from the UMM is that the set of transition functions for the LSM is not full. Recall that the LSM consists of a series of filters and an output function (see Sec. II). The set of filters satisfies the pointwise separation property, and the output function satisfies the fading-memory property. There is no need to express the two properties in the language of our new framework. Instead, it is enough to note that the pointwise separation property is a property of a set, while the fading-memory property is a property of an element of the set.
Therefore, we can find a subset $A_{\rm filter} \subseteq F_{\rm LSM}$ such that its elements represent the filters, with the subset itself satisfying the pointwise separation property. As discussed earlier, we can express the output function as a transition function, so we can find a subset $A_{\rm out} \subseteq F_{\rm LSM}$ such that its elements represent the output functions, and they satisfy the fading-memory property. Then, we can take the union of the two subsets to get the actual set of transition functions on $U_{\rm LSM}$: $A_{\rm LSM} = A_{\rm filter} \cup A_{\rm out}$.
Therefore, we can describe the LSM as $M_{\rm LSM} = (U_{\rm LSM}, A_{\rm LSM})$, with $A_{\rm LSM} \subset F_{\rm LSM}$. The specific structure of the machine can be defined by expressing the two properties as constraints on $A_{\rm LSM}$. This is a slightly tedious process, so we will not present it here, since it is irrelevant for our conclusions.
V.C Quantum Computers
Again, consider a quantum computer with $n$ identical qubits, each having $m$ basis states. In subsection II.C, we have shown that the total number of basis states for the entire system is $m^n$. Each basis state is associated with some complex factor $c_i$, and these factors are constrained by the normalization condition $\sum_i |c_i|^2 = 1$. Furthermore, in practice we usually ignore an overall phase factor, since it does not affect the expectation value of an observable.
At this point, it is clear that we can fully describe a quantum state as the Cartesian product of the complex factors for all basis states. In other words, we can represent an internal state as $u = (c_1, c_2, \ldots, c_{m^n})$, where there are $m^n$ factors.
Given this information, we can calculate the cardinality of the full set of internal states to be $\mathrm{card}(U_{\rm QC}) = (2^{\aleph_0})^{2m^n - 2} = 2^{\aleph_0}$. (The unimportant $-2$ is from the normalization condition and factoring out the overall phase factor.) In addition, $\mathrm{card}(U_{\rm UMM}) = 2^{\aleph_0}$. Therefore, we have $\mathrm{card}(U_{\rm QC}) = \mathrm{card}(U_{\rm UMM})$.
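The parameter counting can be illustrated numerically for small, illustrative values of $m$ and $n$ (the sketch below is not part of the formal argument): a register of $n$ qubits with $m$ levels each has $m^n$ complex factors, i.e. $2m^n$ real numbers, of which normalization and the global phase remove two.

```python
import cmath, math, random

random.seed(1)
m, n = 2, 3                # m basis states per qubit, n qubits (toy values)
dim = m ** n               # number of basis states of the whole register

# A random state: one complex factor per basis state ...
amps = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(dim)]

# ... normalized, and with the overall phase factored out (first amplitude
# rotated to be real), as described in the text.
norm = math.sqrt(sum(abs(c) ** 2 for c in amps))
amps = [c / norm for c in amps]
phase = cmath.phase(amps[0])
amps = [c * cmath.exp(-1j * phase) for c in amps]

real_params = 2 * dim - 2  # independent real parameters left
assert abs(sum(abs(c) ** 2 for c in amps) - 1) < 1e-12
assert abs(amps[0].imag) < 1e-12
print(dim, "basis states ->", real_params, "independent real parameters")
# prints: 8 basis states -> 14 independent real parameters
```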
The full set of internal states contains quantum states with varying degrees of entanglement. It is worth stressing though that, in practice, it is extremely hard to construct a quantum computer that can support the full set of quantum states. For example, it is very challenging to prepare 100 qubits that are fully entangled. (The current record is on the order of tens of fully-entangled qubits [23, 24].) Therefore, the set of states actually supported in practice is only a small subset of the full set $U_{\rm QC}$, unless $n$ and $m$ are both very small. However, since we are making here only theoretical arguments, let us just assume that the full set of all possible entangled states can be supported.
As discussed previously, the transition functions of a quantum computer can be expressed as unitary operations on some initial state. The set of all unitary operations is obviously a strict subset of the full set of transition functions $F_{\rm QC}$. For example, one cannot find a unitary operation that collapses every state to $|0\cdots0\rangle$ (setting $c_{0\cdots0} = 1$, and setting all the other factors to 0), though such a map is included in $F_{\rm QC}$. Therefore, we can describe the quantum computer as $M_{\rm QC} = (U_{\rm QC}, A_{\rm QC})$, with $A_{\rm QC} \subset F_{\rm QC}$.
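A quick sketch of why such a collapse map, while a perfectly good element of the full set of transition functions, cannot be unitary (the two-dimensional example is an illustrative choice):

```python
# Sketch: the "collapse everything to |0...0>" map is a valid function on
# states (an element of the full set F) but cannot be unitary, since it is
# not injective and does not preserve inner products.
def collapse(state):
    # send any amplitude vector to (1, 0, ..., 0)
    return [1.0] + [0.0] * (len(state) - 1)

s1 = [1.0, 0.0]                    # |0>
s2 = [0.0, 1.0]                    # |1>, orthogonal to |0>

c1, c2 = collapse(s1), collapse(s2)
inner_before = sum(a * b for a, b in zip(s1, s2))   # 0.0
inner_after = sum(a * b for a, b in zip(c1, c2))    # 1.0

assert inner_before == 0.0 and inner_after == 1.0   # inner product not preserved
assert c1 == c2                                     # not injective either
print("the collapse map is in F but is not unitary")
```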
A few words on the “output function” of a quantum computer are also in order. The output function essentially represents the operation of taking the expectation value of some observable $\hat{O}$ on the internal state, or $\langle \hat{O} \rangle = \langle \psi | \hat{O} | \psi \rangle$. This maps the set of internal states to the set of output states (expectation values have to be real), so we obviously have $\mathrm{card}(U_{\rm out}) \leq \mathrm{card}(U_{\rm QC})$.
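As a sketch of this output map (the observable and state below are illustrative choices), the expectation value of a Hermitian observable is always real, so the output states live in a subset of the reals:

```python
# Sketch of the output map <O> = <psi|O|psi>: for a Hermitian observable the
# result is always real.
def expectation(psi, O):
    # <psi|O|psi> for a state vector psi and a matrix O (lists of lists)
    Opsi = [sum(O[i][j] * psi[j] for j in range(len(psi)))
            for i in range(len(psi))]
    return sum(psi[i].conjugate() * Opsi[i] for i in range(len(psi)))

# A Hermitian observable (the Pauli-Y matrix) and a normalized state.
O = [[0, -1j],
     [1j, 0]]
psi = [1 / 2 ** 0.5, 1j / 2 ** 0.5]

val = expectation(psi, O)
assert abs(val.imag) < 1e-12       # the expectation value is real
print("<O> =", val.real)
```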
VI Universality of Memcomputing Machines
From the above discussions, we can summarize all the results we have obtained so far, and express all the machines considered here in their most general form:

- Turing Machine: $M_{\rm TM} = (U_{\rm TM}, A_{\rm TM})$, $\mathrm{card}(U_{\rm TM}) = \aleph_0$, $A_{\rm TM} \subset F_{\rm TM}$;

- Liquid-State Machine: $M_{\rm LSM} = (U_{\rm LSM}, A_{\rm LSM})$, $\mathrm{card}(U_{\rm LSM}) = 2^{\aleph_0}$, $A_{\rm LSM} \subset F_{\rm LSM}$;

- Quantum Computer: $M_{\rm QC} = (U_{\rm QC}, A_{\rm QC})$, $\mathrm{card}(U_{\rm QC}) = 2^{\aleph_0}$, $A_{\rm QC} \subset F_{\rm QC}$;

- Universal Memcomputing Machine: $M_{\rm UMM} = (U_{\rm UMM}, F_{\rm UMM})$, $\mathrm{card}(U_{\rm UMM}) = 2^{\aleph_0}$.
Therefore, by applying Corollary IV.6.1, we see that a UMM can simulate any Turing machine, any liquid-state machine, and any quantum computer.
Let us expand on this for each pair of machines separately. In particular, let us briefly discuss how a mapping between a UMM and the three other machines can be realized in theory. Of course, this mapping does not tell us anything about the resources required for a UMM to simulate the other machines. Hence, this is by no means a discussion on how to realize an efficient or practical mapping.
VI.1 UMM vs. Turing Machines
Let us look at the mapping between a Turing machine and a UMM. First, we map each tape cell to a memory cell (memcell). We can denote these memcells collectively as a “memtape”. The tape symbols can be mapped to the internal states of each memcell of the memtape.
Then, we can map the state register to another memcell which we will denote as “memregister”. The state of the Turing machine is then stored as the internal state of the memregister. Finally, we can store the current address of the head as an internal state of yet another memcell which we denote as “memaddress”.
We can then wire the memcells together into a circuit such that it simulates the operation of the Turing machine. Note that as a result of functional polymorphism, we do not have to rewire the circuit each time we choose to run a different algorithm. The circuit first reads the memregister and the memaddress, so that it knows which memcell of the memtape to modify and how to modify it. After that memcell is modified, the memregister and memaddress then update themselves to prepare for the next cycle. In short, we are replacing the tape, head, and control with memprocessors.
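The cycle just described can be sketched as follows (the particular machine, a unary incrementer, and its transition table are illustrative choices, not from the paper):

```python
# Minimal sketch of the memtape / memregister / memaddress cycle described
# above, simulating a tiny Turing machine that appends a '1' to a unary string.

# delta: (control state, read symbol) -> (new state, write symbol, head move)
delta = {
    ("scan", "1"): ("scan", "1", +1),   # walk right over the 1s
    ("scan", "_"): ("halt", "1", 0),    # write one more 1, then halt
}

memtape = {i: "1" for i in range(3)}    # memcells holding the tape symbols
memregister = "scan"                    # memcell holding the control state
memaddress = 0                          # memcell holding the head position

while memregister != "halt":
    # 1) read memregister and memaddress to locate the relevant memtape cell
    symbol = memtape.get(memaddress, "_")
    # 2) modify that memcell ...
    memregister, memtape[memaddress], move = delta[(memregister, symbol)]
    # 3) ... and update memregister and memaddress for the next cycle
    memaddress += move

result = "".join(memtape[i] for i in sorted(memtape))
assert result == "1111"
print("tape after halting:", result)
```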
VI.2 UMM vs. LSM
The mapping between an LSM and a UMM is fairly obvious. We simply have to map each “reservoir cell” to a memcell, and wire the circuit such that the pointwise separation and fading-memory properties are satisfied. The explicit construction of the circuit to realize such properties will not be explored here.
Although, in theory, it is possible to simulate an LSM with a UMM, it is not always efficient or necessary to do so in practice. The circuit topologies of the two machines are very different, and they are designed to perform different tasks.
For the LSM model, the connections between the reservoir cells are typically random, and the reservoir as a whole is not trained. The expectation of getting the correct output relies entirely on training the output function correctly. In the end, the operation of the machine relies on statistical methods, and is inevitably prone to making errors. In some sense, the machine as a whole is analogous to a “learning algorithm” [25].
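This division of labor can be sketched with a tiny echo-state-style example (sizes, weights, and the one-step-recall task are all illustrative choices): the reservoir is random and fixed, and only the linear readout is fitted by least squares.

```python
import math, random

random.seed(0)

# Random, fixed reservoir (never trained); only the readout is fitted.
N = 2
W_in = [random.uniform(-0.5, 0.5) for _ in range(N)]
W = [[random.uniform(-0.5, 0.5) for _ in range(N)] for _ in range(N)]

def run(inputs):
    """Drive the reservoir and collect its states (fading memory via tanh)."""
    x = [0.0] * N
    states = []
    for u in inputs:
        x = [math.tanh(sum(W[i][j] * x[j] for j in range(N)) + W_in[i] * u)
             for i in range(N)]
        states.append(list(x))
    return states

inputs = [random.choice([-1.0, 1.0]) for _ in range(200)]
targets = [0.0] + inputs[:-1]          # task: recall the previous input
states = run(inputs)

# Train ONLY the 2-weight readout by least squares (normal equations solved
# with Cramer's rule); the reservoir itself stays untouched.
S = [[sum(s[i] * s[j] for s in states) for j in range(N)] for i in range(N)]
r = [sum(s[i] * t for s, t in zip(states, targets)) for i in range(N)]
det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
w = [(r[0] * S[1][1] - r[1] * S[0][1]) / det,
     (S[0][0] * r[1] - S[1][0] * r[0]) / det]

preds = [sum(wi * si for wi, si in zip(w, s)) for s in states]
err = sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)
print("readout mean-squared error on 1-step recall:", round(err, 3))
```

The statistical character noted in the text shows up directly: the trained readout beats the trivial zero predictor, but the recall is approximate, not exact.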
On the other hand, for the UMM, we can connect the memcells into a circuit specific to the tasks at hand. One realization of this connection employs selforganizing logic gates [5] to control the evolution of the machine such that it will always evolve towards an equilibrium state representing the solution to a particular problem (the machine is deterministic).
In the general case, the UMM is an entirely different computing paradigm than the LSM, and has already been shown to provide exponential speedup for certain hard problems [5, 7]. In other words, while possible, utilizing a UMM to simulate an LSM would not exploit the properties of the UMM to their fullest. In practical applications, it would then be more advantageous to use a UMM (and its digital version, a DMM) to tackle directly the same problems investigated by LSMs.
VI.3 UMM vs. Quantum Computers
Simulating a quantum computer with a UMM requires “compressing” the internal state of the quantum computer. Recall that a quantum computer has $m^n$ basis states, and each basis state is associated with some complex factor. All these complex factors would then need to be represented with only a finite number of interacting memcells. In this work we have shown that this is in fact doable on a theoretical level.
However, as already mentioned, this result does not provide any information on the resources required for a UMM to simulate a quantum computer. Nevertheless, one of the features that makes UMMs a practical and powerful model of computation is precisely their “information overhead”.
Information overhead and quantum entanglement share some similarities: in some sense, both of them allow the machine to access a set of results of mathematical operations (without actually storing them) that is larger than that provided by simply the union of noninteracting processing units (cf. Refs. [4] and [8]). We could then argue that we may exploit the information overhead property of a UMM to precisely represent efficiently the entanglement of a quantum system. At this point, however, this question is still open.
VII Conclusions
In conclusion, we have employed set theory and cardinality arguments to describe the relation between universal memcomputing machines and other types of computing models, in particular liquid-state machines and quantum computers. Using this mathematical framework, we have confirmed that UMMs are Turing-complete, a result already obtained in Ref. [4] using a different approach.
In addition, we have also shown that UMMs are liquid-complete (or reservoir-complete) and quantum-complete, namely they can simulate any liquid-state (or reservoir-computing) machine and any quantum computer. Of course, the results discussed here do not provide an answer to the question of what resources would be needed for a UMM to efficiently simulate such machines, only that such a mapping exists. Along these lines, it would be interesting to study the relation between information overhead and quantum entanglement. If such a relation exists and can be exploited at a practical level, it may suggest how to utilize UMMs to efficiently simulate quantum problems that are currently believed to be only within reach of quantum computers (such as the efficient simulation of quantum Hamiltonians). Further work is however needed to address this practical question.
Acknowledgments – MD acknowledges partial support from the Center for Memory and Recording Research at UCSD.
References
 [1] M. Di Ventra and Y. V. Pershin, “The parallel approach,” Nature Physics, vol. 9, pp. 200–202, 2013.
 [2] J. von Neumann, “First draft of a report on the EDVAC,” tech. rep., 1945.
 [3] S. Arora and B. Barak, Computational Complexity: A Modern Approach. New York, NY, USA: Cambridge University Press, 1st ed., 2009.
 [4] F. L. Traversa and M. Di Ventra, “Universal memcomputing machines,” IEEE Trans. Neural Netw. Learn. Syst., vol. 26, no. 11, p. 2702, 2015.
 [5] F. L. Traversa and M. Di Ventra, “Polynomial-time solution of prime factorization and NP-complete problems with digital memcomputing machines,” Chaos: An Interdisciplinary Journal of Nonlinear Science, vol. 27, p. 023107, 2017.
 [6] F. L. Traversa, C. Ramella, F. Bonani, and M. Di Ventra, “Memcomputing NPcomplete problems in polynomial time using polynomial resources and collective states,” Science Advances, vol. 1, no. 6, p. e1500031, 2015.
 [7] F. Traversa, P. Cicotti, F. Sheldon, and M. Di Ventra, “Evidence of an exponential speedup in the solution of hard optimization problems,” arXiv:1710.09278, 2017.
 [8] P. W. Shor, “Polynomialtime algorithms for prime factorization and discrete logarithms on a quantum computer,” SIAM J. Comput., vol. 26, pp. 1484–1509, Oct. 1997.
 [9] A. Finnila, M. Gomez, C. Sebenik, C. Stenson, and J. Doll, “Quantum annealing: A new method for minimizing multidimensional functions,” Chemical Physics Letters, vol. 219, pp. 343–348, Mar. 1994.
 [10] G. H. Low and I. L. Chuang, “Optimal hamiltonian simulation by quantum signal processing,” Physical Review Letters, vol. 118, Jan. 2017.
 [11] P. Benioff, “The computer as a physical system: A microscopic quantum mechanical hamiltonian model of computers as represented by turing machines,” Journal of Statistical Physics, vol. 22, pp. 563–591, May 1980.
 [12] D. Korenkevych, Y. Xue, Z. Bian, F. Chudak, W. G. Macready, J. Rolfe, and E. Andriyash, “Benchmarking quantum hardware for training of fully visible Boltzmann machines,” 2016.
 [13] W. Maass, T. Natschläger, and H. Markram, “Real-time computing without stable states: A new framework for neural computation based on perturbations,” Neural Computation, vol. 14, no. 11, pp. 2531–2560, 2002.
 [14] M. Lukoševičius and H. Jaeger, “Reservoir computing approaches to recurrent neural network training,” Computer Science Review, vol. 3, no. 3, pp. 127–149, 2009.
 [15] D. Verstraeten, B. Schrauwen, D. Stroobandt, and J. Van Campenhout, “Isolated word recognition with the liquid state machine: a case study,” Information Processing Letters, vol. 95, no. 6, pp. 521–528, 2005.
 [16] H. Jaeger, “The echo state approach to analysing and training recurrent neural networks, with an erratum note,” Bonn, Germany: German National Research Center for Information Technology GMD Technical Report, vol. 148, no. 34, p. 13, 2001.
 [17] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information. Cambridge Series on Information and the Natural Sciences, Cambridge University Press, 10th Anniversary ed., 2010.
 [18] H. B. Enderton, Elements of set theory. Academic Press, 1977.
 [19] G. Cantor, “Ueber eine eigenschaft des inbegriffs aller reellen algebraischen zahlen.,” Journal für die reine und angewandte Mathematik, vol. 77, pp. 258–262, 1874.
 [20] T. E. Forster, Set Theory with a Universal Set: Exploring an Untyped Universe, 1994.
 [21] J. E. Hopcroft, R. Motwani, and J. D. Ullman, “Introduction to automata theory, languages, and computation,” ACM SIGACT News, vol. 32, no. 1, pp. 60–65, 2001.
 [22] H. R. Lewis and C. H. Papadimitriou, Elements of the Theory of Computation. Prentice Hall PTR, 1997.
 [23] T. Monz, P. Schindler, J. T. Barreiro, M. Chwalla, D. Nigg, W. A. Coish, M. Harlander, W. Hänsel, M. Hennrich, and R. Blatt, “14-qubit entanglement: Creation and coherence,” Physical Review Letters, vol. 106, no. 13, p. 130506, 2011.
 [24] C. Song, K. Xu, W. Liu, C.-P. Yang, S.-B. Zheng, H. Deng, Q. Xie, K. Huang, Q. Guo, L. Zhang, et al., “10-qubit entanglement and parallel logic operations with a superconducting circuit,” Physical Review Letters, vol. 119, no. 18, p. 180511, 2017.
 [25] Y. Zhang, P. Li, Y. Jin, and Y. Choe, “A digital liquid state machine with biologically inspired learning and its application to speech recognition,” IEEE Transactions on Neural Networks and Learning Systems, vol. 26, no. 11, pp. 2635–2649, 2015.