The increasing demand for computational power and efficiency is driving the scientific community and industry to explore new and unconventional ways to compute. In this respect, new ideas and radically different paradigms may be the key to solve or mitigate the computational bottlenecks that affect present computing technology.
A new computing paradigm has been recently proposed called memcomputing (which stands for computing in and with memory) [1, 2], with the potential to increase the computational efficiency in the solution of hard combinatorial/optimization problems. The formal description of memcomputing rests on the concept of Universal Memcomputing Machines (UMMs) 
. UMMs are a collection of interconnected memprocessors (processors with memory) able to process and store information on the same physical location. UMMs are a class of non-Turing machines with specific properties as described in[2, 3]. In particular, they support intrinsic parallelism, i.e., interconnected (mem-)processors act collectively on data using their collective state to perform computation[2, 4, 5]. Moreover, memprocessors are able to exploit the information available through the topology of their connections. Indeed, specific network topologies may be designed to embed problems in a one-to-one mapping. This last property has been named information overhead .
We have recently shown that the digital (hence scalable) subclass of UMMs we call digital memcomputing machines (DMMs) can deliver substantial benefits in the solution of hard combinatorial/optimization problems compared to traditional algorithmic approaches. For instance, in Ref.  we have shown that the simulation of DMMs on a classical computer solves the search version of the subset-sum problem in polynomial time for the worst cases. In Ref. 
we have shown how to accelerate the pre-training of restricted Boltzmann machines using simulations of DMMs, and demonstrated their advantage over quantum-based machines (implemented in hardware), such as D-wave machines
, as well as state-of-the-art supervised learning[9, 10]. Finally, in Ref. , we have employed simulations of DMMs and shown substantial advantages over traditional algorithms for a wide variety of optimization problems such as the Random 2 Max-SAT, the Max-Cut, the Forced Random Binary problem, and the Max-Clique . In some cases, the memcomputing approach obtains the solution to problems on which the winners of the 2016 Max-SAT competition failed .
We have also performed scalability tests on finding approximate solutions to hard constructed instances of the Max-SAT problem . Max-SAT possesses an inapproximability gap , meaning that even calculating an approximate solution beyond a certain fraction of the optimum will require resources which grow exponentially with the input size in the worst case. We observed this exponential growth on the tested instances with some of the best solvers from the 2016 Max-SAT competition. Instead, our simulations done for these hard cases, and up to a certain problem size succeed in overcoming that gap in linear time employing memory (of the single processor used for the simulations) that also scales linearly with input size .
Our goal for the present paper is to understand how far we can push these simulations with the available resources at our disposal. We focus again on hard combinatorial optimization problems since these arise in nearly every industrial and scientific discipline. These problems involve the minimization or maximization of a function of many independent variables, often called the cost function, whose value represents the quality of a given solution [14, 15]. In industrial settings this may be the wiring cost of a computer chip layout, or the length of a delivery route . In scientific applications it may be searching for the ground state of a spin system or proteins .
We will show here that the simulations of DMMs on hard optimization problems can be pushed to tens of millions of variables, corresponding to about one billion literals, using a commercial and sequential MATLAB code (the Falcon simulator provided by MemComputing, Inc.) running on a single thread of an Intel Xeon E5-2680 v3 with 128 Gb DRAM shared on 24 threads. These results show once more the power of physics-inspired approaches to computation over traditional algorithmic methods.
Ii The problem
In this work, we focus on cost functions over a set of Boolean variables , into which a large number of problems may be cast.
A common formulation of these problems is given by a set of Boolean constraints where the cost function counts the number of constraints satisfied by an assignment. This is often expressed in conjunctive normal form (CNF) where each constraint (also called a clause) is expressed as the disjunction (logical OR denoted by ) of a set of literals (a variable or its negation ) which are then conjoined (logical AND denoted by ) together, e.g., in expressions of the type:
The CNF representation may be considered general in the sense that any Boolean formula may be expressed in this form .
The problem of determining the assignment satisfying the maximum number of clauses is known as Max-SAT and is NP-hard, i.e., any problem in the class non-deterministic polynomial (NP) may be reduced to it in polynomial time . As a result, algorithms for obtaining the solution will, in the worst cases, require a time that scales exponentially with the size of the instance, thus creating severe bottlenecks in the solution of complex industrial and scientific problems. The ubiquity and importance of this problem is exemplified by the yearly MAX-SAT competitions, where state-of-the-art solvers are tested on a variety of benchmarks to stimulate research and innovation .
In applications where the optimal solution is required exact algorithms must be used . These complete solvers typically proceed by first obtaining bounds on the quality of the solution and then using these bounds to prune the search tree in a backtracking search. This systematic approach guarantees that the resulting solution will be the optimum, but typically scales poorly and is impractical for large instances.
In these cases, heuristic or incomplete algorithms must be used [19, 20, 21]. Rather than systematically searching the solution space, these solvers generate an initial assignment and then iteratively improve upon it, using a variety of strategies to boost the efficiency of their local search. After a specified number of steps, the algorithm returns its best assignment. As randomness is often used to drive the search, this procedure is referred to as stochastic local search. While they can no longer guarantee optimatility, these solvers have proven very effective at approximating, and sometimes solving difficult Max-SAT instances. For instance, in the 2016 Max-SAT competition , incomplete solvers performed 2 orders of magnitude faster than complete solvers on random and crafted benchmarks.
We might hope that if we seek an approximation, rather than a solution of a Max-SAT instance, we could avoid the exponential scaling of the run-time. Unfortunately, it turns out that even approximating the solution of many difficult optimization problems is NP-hard. More precisely, for a maximization problem with optimum defined as the sum of the weights of all satisfied clauses, obtaining an assignment better than for the fraction greater than some critical fraction, , is an NP-hard problem [22, 13]. For example, obtaining an assignment for Max-E3SAT (a version of Max-SAT in which every clause has 3 literals) better than of the optimum is NP-hard, meaning that we cannot expect a polynomial algorithm to obtain the approximation for any instance of Max-E3SAT unless NP=P. Any known algorithm thus must show exponential scaling for a threshold past in the worst case. In  we showed a region in which some of the best algorithms based on stochastic local search show an exponential growth with input size, but where the memcomputing approach based on deterministic dynamical systems requires only linearly growing time to achieve the same threshold.
Iii Digital Memcomputing Machines
As mentioned in the Introduction, we present here a radically novel approach to these problems based on the simulation of DMMs on a standard classical computer [4, 2]. DMMs can be implemented in practice as specially-designed dynamical systems whose equilibrium (fixed) points represent the approximations to the computational problem at hand, and which can be realized with standard electrical components and those with memory.
The mathematical definition of DMMs is the eight-tuple 
where (although not strictly necessary) we consider the range of . Generalization to any finite number of states is trivial. is a set of functions
where is the number of memprocessors used as input of (read by) the function , and is the number of memprocessors used as output (written by) the function ; is the set of the arrays of pointers that select the memprocessors called by and is the set of indexes ; is the set of the initial states written by the input device on the computational memory; is the initial array of pointers; is the initial index , and for some is the set of final states.
The Boolean circuit representing the problem is constructed.
Traditional logic gates are replaced with self-organizing logic gates (SOLGs)  which are circuit elements designed to interact and collectively organize into a logically-consistent state.
The resulting self-organizing logic circuit (SOLC) is fed voltages along its boundary, and allowed to evolve until it reaches equilibrium where the results of a computation may be read out.
A detailed account of SOLGs may be found in  (see also Ref. ). In essence, they may be understood as dynamical components whose equilibrium points encode the truth table of a logic gate, and can self-organize into a consistent logical relation of such truth table irrespective of the terminal to which a given truth value (a literal) is assigned. Each terminal is equipped with a dynamic error correcting module which reads the states of its neighbors, and sets the local current and voltage to enforce logical constraints. These elements are specially designed so that the resulting dynamical system is point dissipative  (in the context of functional analysis), avoids periodic orbits  and the chaotic behavior  that is typical of other non-linear dynamical systems, and utilizes components with memory to allow gates to efficiently correlate and collectively transition between states  (see also below).
The dynamics of the system are deterministic and, since they allow for collective transitions of large numbers of variables, they are also non-local in the same sense that complete combinatorial solvers are. Therefore, our approach contrasts sharply with those based on a stochastic local search such as simulated annealing , and many incomplete solvers [19, 20]. We will comment further on the reasons for these differences and the role of long-range order later.
The procedure above may be followed to construct a hardware solver for a wide variety of combinatorial and optimization problems. However, since memcomputing machines are made of non-quantum elements, the resulting behavior is also captured by a set of nonlinear ordinary differential equations describing the circuit (see Ref. for an example of DMMs equations of motion). These equations can be efficiently simulated, constituting an algorithm for the same problem. While this approach may seem indirect, as already noted, surprisingly the simulation of these circuits using standard numerical packages  is sufficient to outperform the state-of-the-art combinatorial approaches on many benchmarks [11, 7, 6].
In subsequent sections we demonstrate this and stress test these simulations by showing the results of a direct comparison on the same hardware between the Falcon solver of MemComputing, Inc. and the solver DeciLS, an improved version of one of the best solvers of the 2016 Max-SAT competition . In all cases our simulations considerably outperform the other method tested, by orders of magnitude. Our solver indeed scales linearly in time and memory compared to the expected exponential scaling of the other solver.
As outlined above, we formulate our instances as Max-E3SATs, in which each clause contains exactly 3 literals and which has an inapproximability gap. A particular instance of Max-E3SAT may be characterized by the number of variables, , and the number of clauses, , or alternatively the clause density, . As increases, the relationships between the variables become increasingly constrained, and it becomes less likely that an assignment satisfying all clauses will exists, i.e., the problem is more likely UNSAT.
In the case of Random-3SAT, in which clauses are uniformly drawn from the set of possible clauses amongst variables, instances undergo a SAT/UNSAT transition at , below which an instance is almost certainly satisfiable, and above which it is almost certainly unsatisfiable (the transition being sharply defined as ). Different methods of generating instances however, will lead to different transitions: Random-3XORSAT, in which clauses of 3 literals are formed using the exclusive OR, , undergoes a SAT/UNSAT transition at .
To stress test the capabilities of simulating DMMs we utilized a set of Max-E3SAT instances based on a variant of Random-3XORSAT in which each variable was constrained to occur the same number of times (or as nearly as possible while satisfying the specified and ) . This particular set of instances were chosen for their low inter-instance variability in difficulty due to the constraint placed on variable occurrences, and the relevance of MAX-XORSAT to important problems in decoding [29, 30, 31] The balanced structure of the resulting SAT makes them difficult for local solvers since, whenever a variable assignment is changed, it flips the state of all XORSAT clauses in which it was included. Thus, in order to find transitions which satisfy a larger number of clauses, the system is forced to flip increasingly large numbers of literals concurrently.
To generate these hard instances, we first generated a random 3XORSAT instance with in which each variable was allowed to occur 3 or 4 times. These were then converted to a SAT instance by replacing each XORSAT clause with 4 CNF clauses which reproduce the truth table of the XORSAT clause. The resulting Max-E3SAT instances have a clause density of . We call this problem delta-Max-E3SAT .
To clearly show the superior performance of our approach compared to standard algorithms, we have set a threshold of of unsatisfiable clauses and tested how long DeciLS and Falcon take to overcome this limit with increasing number of variables. DeciLS is a compiled code that combines a unit propagation based decimation (Deci) and local search (LS) with restarts . Past useful assignments are used to break conflicts in the unit propagation and may be considered “messages” from the past.
The solver Falcon is a sequential MATLAB code that integrates forward the DMMs equations of motion using an explicit Euler method. All calculations have been performed on a single core of an Intel Xeon E5-2680 v3 with 128 GB DRAM. As expected the local solver requires an exponentially increasing time to reach that limit already at a few thousand variables (see squares symbols in Fig. 1). Instead, the simulations of DMMs scale linearly in time up to variables, which at a density of 5, corresponds to clauses, or about 1 billion literals. The largest case required a little over than seconds to complete. In fact the time was not a limitation, rather the memory of the processor.
In Fig. 2 we present the memory used in the computation by Falcon as a function of variables. It is clear that also the memory scales linearly up to variables. For comparison, we also display the memory size of the input file, showing that the MATLAB implementation of DMMs equations of motion has about an order of magnitude overhead in memory. Since the memory of the Intel Xeon used for these simulations was only 128 GB DRAM, that limit has set a hard stop to the actual size we could fit on that processor. Of course, different implementations (e.g., using a compiled language rather than an interpreted one), different hardware, etc. may permit execution of even larger instances.
V Collective dynamics of DMMs
The simulations surveyed in this paper and those already performed in relation to other problems  indicate that collective dynamics play an important role in the ability of DMMs to find good approximations to the optimum in non-convex landscapes. This contrasts sharply with the dynamics of stochastic-local-search solvers, and an analogy may be drawn to the co-tunneling events observed in quantum-annealing devices such as those manufactured by D-Wave , although DMMs are non-quantum machines.
Of particular interest is the fact that the transient dynamics of a DMM proceeds via an instantonic phase where the machine rids itself of logical defects [26, 33]. Instantons are families of trajectories of the non-linear equations of motion of DMMs, and connect different critical points in the phase space that have different indexes, namely different number of unstable directions.
In mathematical terms, if
is the equation of motion describing a DMM, with the set of elements (e.g., voltages, currents and internal state variables), and
the flow vector field, then instantons are deterministic trajectories
that connect two arbitrary critical points of , say and . The parameters are the so-called modulii of instantons, and encode their non-locality.
Indeed, the presence of instantons creates long-range order in the system, both in time and in space . Spatial long-range order means that logic gates can correlate at arbitrary distances from each other. Temporal long-range order implies that the system can follow different paths to obtain the same solution.
It is the presence of these instantons that renders these machines efficient in the solution search of complex problems. In fact, by transforming a combinatorial or optimization problem (a Boolean problem) into a physical (dynamical) one as described above, its simulation is also efficient. The reason is that the original Boolean problem is no longer solved combinatorially, as it is usually done. A combinatorial approach requires a search in a space that grows exponentially for a complex problem. Rather, the solution search is accomplished, via instantons, by a physical, dynamical system and the latter can be simulated efficiently by solving its corresponding differential equations. In addition, since instantons are topological objects, the solution search is robust against noise and structural disorder, a fact that was also shown explicitly in Ref. .
The performance of digital memcomputing machines on the benchmarks presented in this paper demonstrates the substantial advantages of our approach, based on the simulation of non-linear dynamical systems, compared to traditional combinatorial ones. While we have focused on the maximum-satisfiability problem, the methods we have illustrated readily generalize to a wide variety of combinatorial optimization problems. It would then seem that physics-based approaches offer a lot to the world of computing, and we believe these ideas may form the basis for the next generation of computational devices.
We sincerely thank Dr. Shaowei Cai for providing us with the binary compiled code DeciLS. We also thank Haik Manukian and Robert Sinkovits for helpful discussions. M.D. and F.L.T. acknowledge partial support from the Center for Memory Recording Research at UCSD. M.D. and F.S. acknowledge partial support from MemComputing, Inc. All calculations reported here have been performed by one of us (P.C.) on a single processor of the Comet cluster of the San Diego Supercomputer Center, which is an NSF resource funded under award #1341698. The authors would be delighted to provide, upon request, all instances of the constrained delta-Max-E3SAT used to generate Figs. 1 and 2.
-  M. Di Ventra and Y. V. Pershin, “The parallel approach,” Nature Physics, vol. 9, pp. 200–202, 2013.
-  F. L. Traversa and M. Di Ventra, “Universal memcomputing machines,” IEEE Trans. Neural Netw. Learn. Syst., vol. 26, no. 11, p. 2702, 2015.
-  Y. Pei, F. L. Traversa, and M. Di Ventra, “On the universality of memcomputing machines,” arXiv:1712.08702, 2017.
-  F. L. Traversa and M. Di Ventra, “Polynomial-time solution of prime factorization and np-complete problems with digital memcomputing machines,” Chaos: An Interdisciplinary Journal of Nonlinear Science, vol. 27, p. 023107, 2017.
-  F. L. Traversa, C. Ramella, F. Bonani, and M. Di Ventra, “Memcomputing NP-complete problems in polynomial time using polynomial resources and collective states,” Science Advances, vol. 1, no. 6, p. e1500031, 2015.
-  M. D. Ventra and F. L. Traversa, “Memcomputing: Leveraging memory and physics to compute efficiently,” J. Appl. Phys., vol. 123, p. 180901, 2018.
-  H. Manukian, F. L. Traversa, and M. Di Ventra, “Accelerating deep learning with memcomputing,” arXiv:1801.00512, 2018.
S. H. Adachi and M. P. Henderson, “Application of quantum annealing to training of deep neural networks.”
X. Glorot, A. Bordes, and Y. Bengio, “Deep sparse rectifier neural networks,”
Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011, pp. 315–323.
S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” inProceedings of the 32nd International Conference on Machine Learning
, ser. Proceedings of Machine Learning Research, F. Bach and D. Blei, Eds., vol. 37. Lille, France: PMLR, 07–09 Jul 2015, pp. 448–456.
-  F. L. Traversa, P. Cicotti, F. Sheldon, and M. Di Ventra, “Evidence of exponential speed-up in the solution of hard optimization problems,” Complexity (in press); arXiv:1710.09278.
-  M. R. Garey and D. S. Johnson, Computers and Intractability; A Guide to the Theory of NP-Completeness. New York, NY, USA: W. H. Freeman & Co., 1990.
-  J. Håstad, “Some optimal inapproximability results,” Journal of the ACM, vol. 48, no. 4, pp. 798–859, jul 2001.
-  K. S. Christos H. Papadimitriou, Combinatorial Optimization. Dover Publications Inc., 1998.
-  S. H. Z. Edwin K. P. Chong, An Introduction to Optimization. JOHN WILEY & SONS INC, 2013.
-  S. Kirkpatrick, C. D. Gelatt, M. P. Vecchi et al., “Optimization by simulated annealing,” Science, vol. 220, no. 4598, p. 671, 1983.
-  S. Kobe and J. Krawczyk, “Ground states, energy landscape, and low-temperature dynamics of spin glasses,” in Computational complexity and statistical physics, A. Percus, G. Istrate, and C. Moore, Eds. OUP USA, 2006.
-  [Online]. Available: http://www.maxsat.udl.cat/16/index.html
-  C. P. Gomes, H. Kautz, A. Sabharwal, and B. Selman, “Satisfiability solvers,” in Handbook of knowledge representation, F. Van Harmelen, V. Lifschitz, and B. Porter, Eds. Elsevier, 2008, vol. 1, ch. 2, pp. 89–134.
-  H. A. Kautz, A. Sabharwal, and B. Selman, “Incomplete algorithms,” in Handbook of satisfiability, A. Biere, M. Heule, and H. van Maaren, Eds. IOS press, 2009, vol. 185, pp. 185–203.
-  J. Hromkovic, Algorithmics for Hard Problems: introduction to combinatorial optimization, randomization, approximation, and heuristics. Springer, 2010.
-  U. Feige, “A threshold of ln n for approximating set cover,” Journal of the ACM, vol. 45, no. 4, pp. 634–652, jul 1998.
-  J. Hale, Asymptotic Behavior of Dissipative Systems, 2nd ed., ser. Mathematical Surveys and Monographs. Providence, Rhode Island: American Mathematical Society, 2010, vol. 25.
-  M. Di Ventra and F. L. Traversa, “Absence of periodic orbits in digital memcomputing machines with solutions,” Chaos: An Interdisciplinary Journal of Nonlinear Science, vol. 27, p. 101101, 2017.
-  M. Di Ventra and F. L. Traversa, “Absence of chaos in digital memcomputing machines with solutions,” Phys. Lett. A, vol. 381, p. 3255, 2017.
-  M. Di Ventra, F. L. Traversa, and I. V. Ovchinnikov, “Topological field theory and computing with instantons,” Ann. Phys. (Berlin), vol. 529, p. 1700123, 2017.
-  [Online]. Available: https://www.mathworks.com/
-  S. Cai, C. Luo, and H. Zhang, “From decimation to local search and back: A new approach to maxsat,” in to appear in proceedings of International Joint Conference on Artificial Intelligence, 2017.
-  F. Ricci-Tersenghi, M. Weigt, and R. Zecchina, “Simplest random k-satisfiability problem,” Physical Review E, vol. 63, no. 2, p. 026702, 2001.
-  H. Jia, C. Moore, and B. Selman, “From spin glasses to hard satisfiable formulas,” in International Conference on Theory and Applications of Satisfiability Testing. Springer, 2004, pp. 199–210.
-  S. Cocco, R. Monasson, A. Montanari, and G. Semerjian, “Analyzing search algorithms with physical methods,” 2006.
-  V. S. Denchev, S. Boixo, S. V. Isakov, N. Ding, R. Babbush, V. Smelyanskiy, J. Martinis, and H. Neven, “What is the computational value of finite-range tunneling?” Phys. Rev. X, vol. 6, p. 031015, Aug 2016. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevX.6.031015
-  S. R. B. Bearden, H. Manukian, F. L. Traversa, and M. Di Ventra, “Instantons in self-organizing logic gates,” Physical Review Applied, vol. 9, p. 034029, 2018.