Spiking Neural Streaming Binary Arithmetic

03/23/2022
by James B. Aimone, et al.
Sandia National Laboratories

Boolean functions and binary arithmetic operations are central to standard computing paradigms. Accordingly, many advances in computing have focused upon how to make these operations more efficient as well as on exploring what they can compute. To best leverage the advantages of novel computing paradigms, it is important to consider what unique computing approaches they offer. However, for any special-purpose co-processor, Boolean functions and binary arithmetic operations are useful for, among other things, avoiding unnecessary I/O on and off the co-processor by pre- and post-processing data on-device. This is especially true for spiking neuromorphic architectures, where these basic operations are not fundamental low-level operations but instead require specific implementation. Here we discuss the implications of an advantageous streaming binary encoding method as well as a handful of circuits designed to exactly compute elementary Boolean and binary operations.


I Introduction and Background

Fundamental to many paradigms of computing are Boolean functions and arithmetic operations. These core concepts can be composed to build arbitrarily complex computations, and they set a foundation for comparing implementations and understanding computability. In pursuing an understanding of what computations neural circuits can perform, prior work has explored universal function approximation as well as Turing completeness [1, 2]. With that foundation in hand, it follows that spiking neurons can be used to compute arithmetic functions. Here, however, we not only provide several fundamental arithmetic computations as spiking neural streaming circuits, but also use them as a means of understanding and enabling neuromorphic computing (NMC). Specifically, we present a set of streaming neural binary circuits implemented in Fugu, a neural algorithm composition framework, showing how more complex functions can be built from operations such as addition and subtraction, leading to multiplication.

Classic computational paradigms are incredibly efficient at performing these basic building blocks of numerical computation, having been optimized for decades to minimize the computational kernel and maximize scalability [3, 4]. Computing these operations exactly in neurons shows how future device breakthroughs in neuromorphic hardware could enable classic numerical computations; this work was inspired in part by previous approaches for implementing logic and arithmetic in spiking networks, such as [5, 6, 7, 8]. Furthermore, this exploration examines how the computational flexibility of spiking neural networks can be leveraged to enable compositionality for more complex computations. Alternatively, if the highly optimized canonical approaches were used to compute fundamental arithmetic operations that integrate neural sub-functions, there is a cost to convert in and out of neural circuitry, analogous to paying for analog-to-digital conversions.

II Fugu

As a means of showing compositionality and scalability of spiking algorithms, we use the Fugu framework to represent the neural circuits presented here [9]. While implementation details vary based on NMC hardware, Fugu is a high-level framework specifically designed for developing spiking circuits in terms of computation graphs. With a base leaky integrate-and-fire (LIF) neuron model at its core, neural circuits are built as ‘bricks’. These foundational computations are then combined and composed as ‘scaffolds’ to construct larger computations. This allows us to describe the streaming binary arithmetic circuits in terms of neural features common to most NMC architectures rather than platform-specific designs.

In addition to architectural abstraction, the compositionality concept of Fugu not only facilitates a hierarchical approach to functionality development but also enables adding pre- and post-processing operations to overarching neural circuits. Such properties position Fugu to help explore under what parameterization or scale a neural approach may offer an advantage. For example, prior work has analyzed neural algorithms for computational kernels like sorting, optimization, and graph analytics, identifying different regimes in which a neural advantage exists once neural circuit setup, timing, and other factors are accounted for [10, 11, 12, 13].

III Spiking Binary Arithmetic

An open research question in neuroscience and neuromorphic computing is how to encode information [14]. The transmission of spikes can convey information in their timing, enabling complex spatiotemporal representations. Here, however, we do not exploit any novel spike encoding but rather use a binary representation of numbers streamed from the least significant bit first to the most significant bit last. Neurons can be used to represent many different coding schemes, but the main advantages of this “little-endian” temporal binary representation (illustrated in the sketch following this list) are that:

  • One neuron is required per variable represented, with k timesteps required for a k-bit number.

  • Overflow can be handled by simply increasing the length of the vector (adding one additional timestep).

  • We can construct streaming addition, subtraction, and multiplication based on standard binary arithmetic operations.
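As a concrete illustration of this encoding, the following minimal sketch (our own helper functions, not part of Fugu; the names to_spikes and from_spikes are hypothetical) converts an unsigned integer to and from a little-endian spike train:

```python
def to_spikes(value: int, k: int) -> list:
    """Encode an unsigned integer as a little-endian spike train.

    Timestep t carries bit t of the value: a 1 denotes a spike.
    Overflow is handled by simply choosing a larger k.
    """
    assert 0 <= value < 2 ** k, "increase k to represent this value"
    return [(value >> t) & 1 for t in range(k)]


def from_spikes(spikes) -> int:
    """Decode a little-endian spike train back into an integer."""
    return sum(bit << t for t, bit in enumerate(spikes))


# 6 = 0b110 streams LSB-first as [0, 1, 1]
assert to_spikes(6, 4) == [0, 1, 1, 0]
assert from_spikes(to_spikes(6, 4)) == 6
```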

The circuits described here are intended to be compatible with any large-scale NMC platform, which means that we mostly avoid using decays, as they can be additive or multiplicative depending on the architecture. We note exceptions to this in our discussion, such as the inequality check in Section III-C.

III-A Streaming Adder

One fundamental operation that we can easily implement is an adder, which we have implemented in Fugu, on IBM’s TrueNorth architecture [15], and on the Intel Loihi platform [16]. An adder is an instrumental function in digital electronics: besides its direct use in arithmetic logic units, the adder operation is involved in many control tasks of a processor as well. At the bit level, this operation simply produces the outcome of combining bits for all possible combinations. This simple concept becomes slightly more complex when handling overflow or carry. But foundationally, the concept can readily be implemented as a series of Boolean gates which account for the number of bits received. The connectivity, thresholds, and decay dynamics of the neural circuit are constructed such that the neurons implement logic functions. For example, with a threshold of one, a neuron spikes upon receiving any input (analogous to a logical OR). Decay dynamics impose structure upon when inputs must arrive, and the internal state of the neurons allows the circuit to store a carry value. Altogether, these operations enable the neural circuit to route the flow of spikes, much like a canonical binary full adder does using standard logic gates. The adder that we constructed takes two inputs (denoted A and B in Figure 1) and uses three hidden neurons with different thresholds.
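The behavior of this circuit can be sketched in a few lines of Python (a functional simulation under our own assumed thresholds of 1, 2, and 3 for the hidden neurons, with hardware propagation delays abstracted away; not the literal Fugu brick):

```python
def streaming_add(a_bits, b_bits, carry_in=0):
    """Simulate a streaming spiking adder on little-endian bit streams.

    Three hidden threshold units fire when at least 1, 2, or 3 of
    {a_t, b_t, carry} are active. An output unit (threshold 1) combines
    them with weights +1, -1, +1, which yields the XOR (sum) bit, while
    the >=2 unit doubles as the stored carry for the next timestep.
    """
    carry, out = carry_in, []
    # One extra timestep so a final carry is not lost (overflow handling).
    for a, b in zip(list(a_bits) + [0], list(b_bits) + [0]):
        count = a + b + carry
        t1, t2, t3 = int(count >= 1), int(count >= 2), int(count >= 3)
        out.append(int(t1 - t2 + t3 >= 1))  # sum bit (parity of count)
        carry = t2                          # majority = carry bit
    return out


# 6 + 7 = 13: [0,1,1] + [1,1,1] -> [1,0,1,1] (LSB first)
assert streaming_add([0, 1, 1], [1, 1, 1]) == [1, 0, 1, 1]
```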

Fig. 1: Schematic of a binary adder circuit. This circuit is implemented as a ‘brick’ in Fugu, allowing it to be re-used in more complex circuits.

III-B Inversion

To support a subtraction operation we need to invert a binary stream, which is equivalent to a NOT gate. Our implementation of a NOT gate on TrueNorth requires one neuron and is depicted in Figure 3a. Our binary number representation in the spike domain is that a spike at a time step represents a 1 and no spike represents a 0. The NOT gate neuron inhibits any incoming spikes, but if no spike is received the neuron will spike. The implementation details will vary based on the NMC hardware being used. For TrueNorth, we configure a neuron to have a positive leak value of +1 and a threshold of 1. The input synapse weight to the neuron is set at −1. In this configuration, if no spike is received, the positive leak will cause the neuron to cross threshold and spike at every time step, resetting its potential to 0. However, if a spike is received, the synapse weight of −1 will be added to the leak value of +1 and the neuron potential will be unchanged, causing no spike to occur. (This effect is a result of the specific nature of the TrueNorth neural dynamic, in which the leak is applied to the stored potential before the threshold check; other NMC platforms will require a different neuron configuration.) When this type of neuron dynamic receives a stream of spikes that represents a binary number, the output of the neuron will be the bitwise negation of the input stream. On platforms with multiplicative leak, such as Loihi, a comparable circuit can be generated that requires an extra neuron.
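A minimal simulation of this additive-leak dynamic, using the parameterization above (leak +1, threshold 1, input weight −1, leak applied before the threshold check):

```python
def not_gate(spikes):
    """Additive-leak inverting neuron: leak +1, input weight -1, threshold 1.

    The leak is added before the threshold check (as on TrueNorth), so
    with no input the neuron fires every timestep and resets, while an
    incoming spike cancels the leak and suppresses the output.
    """
    v, out = 0, []
    for s in spikes:
        v += 1 - s        # leak (+1) plus weighted input (-1 per spike)
        if v >= 1:        # threshold check
            out.append(1)
            v = 0         # reset potential
        else:
            out.append(0)
    return out


assert not_gate([0, 1, 1, 0]) == [1, 0, 0, 1]  # bitwise negation
```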

Fig. 2: Inequality Check Circuit leveraging the binary spike adder with an inversion neuron and carry bit check circuit. Each box consists of a separate brick within Fugu.

III-C Inequality

The streaming adder can serve as the basis for building more complex arithmetic with the addition of small support circuits that implement additional base-logic functions. One such support circuit, described above, is the inverting neuron, or NOT gate. With this inverting neuron we can perform an inequality check (Figure 2) that answers the simple logical question of whether A is greater than B, for two unsigned integers A and B. We do this by leveraging the binary ones’ complement subtraction method. In ones’ complement representation, a negative integer is obtained by inverting each bit of its positive value representation. To solve the inequality check of determining if A > B, we perform the operation A + (−B) in ones’ complement representation and use the “end-around carry” bit as a signal to determine the result. It can be shown that for unsigned integers A and B, the ones’ complement subtraction method, A + NOT(B), will produce a carry bit value of 1 if A > B and 0 if A ≤ B. We restrict our inputs to unsigned integers to avoid the complexities of overflow, since the subtraction of two unsigned k-bit integers cannot produce a result outside the range of the k-bit representation.
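The end-around-carry identity is easy to verify in ordinary integer arithmetic before committing it to neurons (a plain-Python check, not a spiking circuit):

```python
def a_greater_b(a: int, b: int, k: int) -> bool:
    """Ones' complement inequality: the carry out of a + NOT(b) is (a > b).

    For k-bit unsigned values, a + ~b = a - b + 2**k - 1, which overflows
    the k-bit range exactly when a - b >= 1, i.e. when a > b.
    """
    mask = (1 << k) - 1                 # k-bit mask
    carry = (a + (~b & mask)) >> k      # end-around carry bit
    return bool(carry)


# Exhaustive check for 4-bit unsigned inputs
for a in range(16):
    for b in range(16):
        assert a_greater_b(a, b, 4) == (a > b)
```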

Fig. 3: Supporting neural logic circuits for platforms with additive leak such as TrueNorth. (a) inverting neuron, (b) check carry bit circuit.

For our inequality check circuit (Figure 2), we compute the expression A + NOT(B) and check the final carry bit to determine whether A > B is true or false. After leveraging the adder and NOT gate to perform the computation of A + NOT(B), we produce the result by inspecting the carry bit. This inspection requires an additional support circuit of two neurons (Figure 3b). The main neuron of this circuit, the check neuron, receives the result of the adder and will spike if the final carry bit of the adder result is 1, and will not spike for any other input. To do this we create an additional input to the check neuron from a self-spiking neuron that provides a spike to the check neuron at the exact time the final carry bit of the adder result is entering the check neuron. This precise timing is maintained by a spike back to itself whose weight resets its potential back to its starting value.

Depending on the manner in which the neuromorphic hardware implements leak, there is an alternative approach to inequality that uses only one neuron if the leak is multiplicative, as on Loihi. The check neuron receives a sub-threshold positive input from A and a negative input from B, and applies a multiplicative decay at each timestep. After both inputs are complete, the check neuron can be interrogated (by an input at exactly threshold) to determine if the cumulative voltage is greater than zero, thus A > B, or less than zero, thus A < B (the circuit can be modified to check for equality as well). This approach is considerably more efficient, but it is not universal across NMC platforms. It also has the drawback that, unlike the arithmetic circuit above, its intermediate computation is not as useful for other operations, and it relies on sufficient numerical precision within the neuron’s voltage representation.
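A sketch of this single-neuron variant, assuming a decay factor of 1/2 per timestep (our assumption; the exact Loihi parameters are not specified here). Halving the potential each step discounts earlier, less significant bits by exactly the right amount, so the final potential is proportional to A − B:

```python
def compare_multiplicative(a_bits, b_bits):
    """Single-neuron comparison with an assumed multiplicative decay of 1/2.

    With little-endian streams, bit i survives k-1-i halvings, so the
    final potential equals (A - B) / 2**(k-1): positive means A > B,
    negative means A < B, zero means equality. Relies on sufficient
    numerical precision in the voltage representation.
    """
    v = 0.0
    for a, b in zip(a_bits, b_bits):
        v = v * 0.5 + (a - b)   # decay, then integrate +A and -B inputs
    return v                    # interrogate the sign after the streams end


assert compare_multiplicative([0, 1, 1], [1, 1, 1]) < 0   # 6 < 7
assert compare_multiplicative([1, 1, 1], [0, 1, 1]) > 0   # 7 > 6
```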

Fig. 4: Spiking 2-to-1 Multiplexer: The A input neuron is configured to pass spikes forward if there is no input from the select neuron, while the B input neuron is configured to inhibit any incoming spikes if there is no input from the select neuron. This functionality is reversed if the select neuron begins spiking.
Fig. 5: The cascading of multiple adders combined with delays can compose a fixed multiplication circuit. The multiplier can be natively implemented in Fugu by combining pre-defined bricks for the adder and temporal delay circuits.

III-D Min/Max

A min or max operation can be performed by utilizing the inequality check circuit. The result of the inequality check provides a signal identifying which input is larger or smaller, which is the same computation a min or max function needs to perform. What is missing is the ability to select the full input of A or B and pass it downstream for later use. For this we developed a streaming spike-based 2-to-1 multiplexer (mux). Here the inputs A and B are split first, with one copy entering the inequality circuit and the other copy being delayed and fed into the 2-to-1 mux. The result of the inequality check circuit is fed into the select input of the mux. That is, if the inequality check circuit spikes, it indicates that A > B, signaling the mux to pass the stream of spikes on the A input to perform a max operation or to pass the stream of spikes on the B input to perform a min operation. The 2-to-1 mux is a carefully crafted neural circuit whose implementation details depend on the underlying NMC hardware. We have implemented this circuit on TrueNorth, but we will only describe its function conceptually here.

The spike-based 2-to-1 mux requires four neurons, as depicted in Figure 4: one neuron for each input A and B, one select neuron, and one output neuron. When the select neuron provides no input, one input neuron is inhibited and the other passes its input forward. The select neuron is crafted to receive the output of the inequality circuit and will either not spike or will spike sequentially for each of k timesteps, where k is the bit length, and then stop until another input is received. Because the select neuron is connected to each input neuron with either a positive or a negative weight, when the select neuron is excited it reverses the function of the input neurons. In Figure 4, an excited select neuron causes the A input neuron to inhibit, and the B input neuron to mirror the input B through to the output neuron. The output neuron passes along any spike it receives.
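One consistent parameterization of this mux can be simulated directly (a sketch with weights and thresholds of our own choosing, not the exact TrueNorth configuration):

```python
def mux_2to1(a_bits, b_bits, select_bits):
    """Spiking 2-to-1 multiplexer sketch with assumed weights/thresholds.

    The A gate (threshold 1) receives A at +1 and select at -1, so it
    passes A only while the select neuron is silent. The B gate
    (threshold 2) receives B at +1 and select at +1, so it passes B only
    while the select neuron spikes. The output neuron ORs the two gates.
    """
    out = []
    for a, b, sel in zip(a_bits, b_bits, select_bits):
        gate_a = int(a - sel >= 1)   # inhibited when select spikes
        gate_b = int(b + sel >= 2)   # enabled when select spikes
        out.append(int(gate_a + gate_b >= 1))
    return out


# Select silent -> stream A passes; select spiking -> stream B passes
assert mux_2to1([1, 0, 1], [0, 1, 1], [0, 0, 0]) == [1, 0, 1]
assert mux_2to1([1, 0, 1], [0, 1, 1], [1, 1, 1]) == [0, 1, 1]
```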

III-E Subtraction

The functional blocks described thus far can be used to implement full binary subtraction. This is done by recognizing that binary subtraction can use the addition form, that is, A − B = A + (−B). This requires a signed binary representation; the standard signed binary representation is two’s complement. It is possible to implement a full subtraction in ones’ complement form, but it will not be detailed here (the basis of a ones’ complement subtraction was provided when describing the inequality check circuit). Given a binary number B, the negation of that value in two’s complement representation is performed by inverting all the bits and adding one; that is, −B = NOT(B) + 1. Therefore, A − B = A + NOT(B) + 1 in two’s complement representation. By using our inverting neuron and adder circuit we can perform the computation A + NOT(B) and feed this result into another adder that receives a 1 at the appropriate time, thus performing A − B in two’s complement representation.
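Composing the earlier sketches gives a functional subtractor: invert B with not_gate, then add with an initial carry of 1 standing in for the adder input that receives a 1 at the appropriate time (a sketch reusing the streaming_add and not_gate functions defined above):

```python
def streaming_sub(a_bits, b_bits):
    """A - B in two's complement: compute A + NOT(B) + 1 on k-bit streams.

    The +1 of the two's complement negation is injected as the adder's
    initial carry. The result is truncated back to k bits; the discarded
    final carry is the usual two's complement carry-out.
    """
    k = len(a_bits)
    total = streaming_add(a_bits, not_gate(b_bits), carry_in=1)
    return total[:k]


# 6 - 3 = 3: [0,1,1] - [1,1,0] -> [1,1,0] (LSB first)
assert streaming_sub([0, 1, 1], [1, 1, 0]) == [1, 1, 0]
```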

III-F Scalar Multiplication

Lastly, we describe a streaming scalar multiplier to perform the function y = c · x, where x is a variable in our temporal binary format and c is a pre-programmed scalar of some precision. The multiplication of two numbers in binary is equivalent to the process of long multiplication in decimal. However, since in binary each element equals either zero or one, multiplication becomes a series of additions and multiplications by powers of 2. What is convenient about the streaming little-endian coding scheme used here, coupled with neuromorphic hardware, is that we can multiply by powers of 2 with a simple bitshift of the variable x: adding a one-timestep delay multiplies the variable by 2. So scalar multiplication of c times x amounts to cumulatively adding bit-shifted versions of x for each element of c that equals 1. Viewed as a combination of additions and temporal shifts, scalar multiplication becomes a natural fit for Fugu. Taking two bricks (the above adder brick and a simple temporal shift brick), we can instantiate a straightforward scalar multiplication (Figure 5). Though it adds complexity, using an inhibitory signal to gate certain adders allows this scalar multiplication to be extended from a fixed c times x to two variables x times y.
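The shift-and-add decomposition can likewise be sketched by cascading the streaming adder over delayed copies of the input stream (reusing streaming_add and from_spikes from above; scalar_multiply and delay are our own illustrative names, not Fugu’s brick API):

```python
def delay(spikes, d, total_len):
    """Temporal-shift brick: delaying a little-endian stream by d
    timesteps multiplies the represented value by 2**d."""
    return ([0] * d + list(spikes) + [0] * total_len)[:total_len]


def scalar_multiply(x_bits, c: int):
    """Multiply a streamed variable x by a fixed scalar c by adding a
    delayed (bit-shifted) copy of x for each set bit of c."""
    k = len(x_bits) + c.bit_length()      # enough timesteps for the product
    acc = [0] * k
    for i in range(c.bit_length()):
        if (c >> i) & 1:                  # bit i of c set: accumulate x << i
            acc = streaming_add(acc, delay(x_bits, i, k))[:k]
    return acc


# 5 * 6 = 30, streamed LSB first
assert from_spikes(scalar_multiply([1, 0, 1], 6)) == 30
```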

IV Conclusion

While much research in neural computation has focused on data science tasks, there is also value in exploring numerical computing and compound functions. Certainly this approach is roundabout and will not replace all forms of computing. However, we see value in specific uses applicable both to the existing computing hardware of today and to emerging paradigms. This perspective of function composition and streaming binary representation effectively increases the scope and applicability of tasks appropriate for neuromorphic processors. Furthermore, these seemingly straightforward approaches follow from our choice of spiking representation and illustrate the benefits an advantageous spike encoding can confer.

Given that today’s large-scale neuromorphic hardware leverages digital implementations of neurons that typically have arithmetic units inside, why would one use neurons to perform these basic computations? One immediate reason is the cost of I/O on large-scale neuromorphic platforms. Because data movement on and off a chip is orders of magnitude more expensive than on-chip communication (and in many cases not possible given the designs), it is critical to develop basic tools to analyze data in situ. For instance, in machine learning algorithms, identifying the neuron that is most active over a data set may have value, so using simple arithmetic to perform that cumulative count and comparison on chip is advantageous, despite costing a few neurons.

Longer term, there is value in these arithmetic circuits when considering the trajectory of more powerful devices. As the microelectronics community moves away from a primary focus on transistor-like behavior, one possible outcome is devices that naturally perform more threshold-gate-like computation (e.g., a tunable threshold, multiple inputs). In that case, as device-level logic increases in complexity, even modestly, the circuits described here may have tremendous value.

Acknowledgment

Sandia National Laboratories is a multimission laboratory managed and operated by National Technology & Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525.

This paper describes objective technical results and analysis. Any subjective views or opinions that might be expressed in the paper do not necessarily represent the views of the U.S. Department of Energy or the United States Government.

SAND2021-13472 C

References

  • [1] G. Cybenko, “Approximation by superpositions of a sigmoidal function,” Mathematics of Control, Signals and Systems, vol. 2, no. 4, pp. 303–314, 1989.
  • [2] C. Schuman, B. Kay, T. Potok et al., “Neuromorphic computing is turing-complete,” arXiv preprint arXiv:2104.13983, 2021.
  • [3] J. Hennessy and D. Patterson, “Computer organization and design risc-v edition: The hardware software interface,” 2017.
  • [4] J. L. Hennessy and D. A. Patterson, Computer architecture: a quantitative approach.   Elsevier, 2011.
  • [5] X. Lagorce and R. Benosman, “Stick: spike time interval computational kernel, a framework for general purpose computation using neurons, precise timing, delays, and synchrony,” Neural computation, vol. 27, no. 11, pp. 2261–2317, 2015.
  • [6] M. Reljan-Delaney and J. Wall, “Solving the linearly inseparable xor problem with spiking neural networks,” in 2017 Computing Conference.   IEEE, 2017, pp. 701–705.
  • [7] L. Sahni, D. Chakraborty, and A. Ghosh, “Implementation of boolean and and or logic gates with biologically reasonable time constants in spiking neural networks,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, no. 01, 2019, pp. 10021–10022.
  • [8] W. Maass, “Networks of spiking neurons: the third generation of neural network models,” Neural networks, vol. 10, no. 9, pp. 1659–1671, 1997.
  • [9] J. B. Aimone, W. Severa, and C. M. Vineyard, “Composing neural algorithms with fugu,” in Proceedings of the International Conference on Neuromorphic Systems, 2019, pp. 1–8.
  • [10] S. J. Verzi, F. Rothganger, O. D. Parekh, T.-T. Quach, N. E. Miner, C. M. Vineyard, C. D. James, and J. B. Aimone, “Computing with spikes: The advantage of fine-grained timing,” Neural computation, vol. 30, no. 10, pp. 2660–2690, 2018.
  • [11] S. J. Verzi, C. M. Vineyard, E. D. Vugrin, M. Galiardi, C. D. James, and J. B. Aimone, “Optimization-based computation with spiking neurons,” in 2017 International Joint Conference on Neural Networks (IJCNN).   IEEE, 2017, pp. 2015–2022.
  • [12] O. Parekh, C. A. Phillips, C. D. James, and J. B. Aimone, “Constant-depth and subcubic-size threshold circuits for matrix multiplication,” in Proceedings of the 30th on Symposium on Parallelism in Algorithms and Architectures.   ACM, 2018, pp. 67–76.
  • [13] K. E. Hamilton, C. D. Schuman, S. R. Young, N. Imam, and T. S. Humble, “Neural networks and graph algorithms with next-generation processors,” in 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2018, pp. 1194–1203.
  • [14] C. D. Schuman, J. S. Plank, G. Bruer, and J. Anantharaj, “Non-traditional input encoding schemes for spiking neuromorphic systems,” in 2019 International Joint Conference on Neural Networks (IJCNN).   IEEE, 2019, pp. 1–10.
  • [15] P. A. Merolla, J. V. Arthur, R. Alvarez-Icaza, A. S. Cassidy, J. Sawada, F. Akopyan, B. L. Jackson, N. Imam, C. Guo, Y. Nakamura et al., “A million spiking-neuron integrated circuit with a scalable communication network and interface,” Science, vol. 345, no. 6197, pp. 668–673, 2014.
  • [16] M. Davies, N. Srinivasa, T.-H. Lin, G. Chinya, Y. Cao, S. H. Choday, G. Dimou, P. Joshi, N. Imam, S. Jain et al., “Loihi: A neuromorphic manycore processor with on-chip learning,” IEEE Micro, vol. 38, no. 1, pp. 82–99, 2018.