1 Introduction
Spiking neural networks (SNNs), as the third generation of neural networks, are attracting increasing attention due to their higher biological plausibility, hardware friendliness, lower energy demand, and temporal nature [1, 2, 3, 4]. Although SNNs have not yet reached the performance of state-of-the-art artificial neural networks (ANNs) with deep architectures, recent efforts on adapting the gradient descent and backpropagation algorithms to SNNs have led to great achievements [5].
Contrary to artificial neurons with floating-point outputs, spiking neurons communicate via sparse, asynchronous, stereotyped spikes, which makes them suitable for event-based computations [1, 2]. That is why neuromorphic implementations of SNNs can be far less energy-hungry than ANN implementations [6], which makes them appealing for real-time embedded AI systems and edge computing solutions. However, as SNNs become larger, they require more storage and computational power. Binarizing the synaptic weights, as in binarized artificial neural networks (BANNs) [7], could be a good solution to reduce the memory and computational requirements of SNNs.
Although the use of binary (+1 and −1) weights in ANNs is not a recent idea [8, 9, 10], early studies could not adapt backpropagation to BANNs: since binary weights cannot be updated in small increments, backpropagation and stochastic gradient descent cannot be directly applied to them. With BinaryConnect [11, 12], Courbariaux et al. were the first to successfully train deep BANNs using the backpropagation algorithm. They kept real-valued weights that are binarized before being used in the forward pass; during backpropagation, using the straight-through estimator (STE), the gradients of the binary weights are simply passed through and applied to the real-valued weights. Soon after, Rastegari et al. [13] proposed XNOR-Net, which is very similar to BinaryConnect but multiplies the binary weights by a per-layer scaling factor (the L1-norm of the real-valued weights) to better approximate the real-valued weights. To speed up the learning phase of BANNs, Tang et al. [14] controlled the rate of oscillation of binary weights between −1 and +1 by optimizing the learning rates; they also proposed learned scaling factors instead of the L1-norm used in XNOR-Net. In DoReFa-Net [15], Zhou et al. proposed a model with variable bit-width (down to binary) weights, activations, and even gradients during backpropagation. A more detailed survey on BANNs is provided in [7].

A few recent studies have tried to convert supervised BANNs into equivalent binary SNNs (BSNNs); however, to the best of our knowledge, no other study has aimed at directly training multilayer supervised SNNs with binary weights. Esser et al. [16] trained ANNs with constrained weights and activations and deployed them as SNNs with binary weights on TrueNorth. Later, in [17], they mapped convolutional ANNs with ternary weights and binary activations to SNNs on TrueNorth. Rueckauer et al. [18] converted BinaryConnect [11] with binary and full-precision activations into equivalent rate-coded BSNNs; although the converted BSNNs had binary weights, the full-precision parameters of the batch-normalization layers were not binarized. In [19], Wang et al. converted BinaryConnect networks to rate-coded BSNNs using a weights-thresholds balance conversion method which scales the high-precision batch-normalization parameters of BinaryConnect to +1 or −1. In another study, Lu et al. [20] converted a modified version of XNOR-Net, without batch normalization and bias inputs, into equivalent rate-coded BSNNs.

In this work, we propose a direct supervised learning algorithm to train multilayer SNNs with binary synaptic weights. The input layer uses temporal time-to-first-spike coding [21, 22, 23] to convert the input image into a spike train with one spike per neuron. The non-leaky integrate-and-fire (IF) neurons in the subsequent hidden and output layers integrate incoming spikes through binary (+1 or −1) synapses and emit a single spike right after the first crossing of their threshold. Inspired by BANNs, we also use a set of real-valued proxy weights, such that the binary weights are the sign of the real-valued weights. Hence, in the backward pass, we update the real-valued weights based on the errors made by the binary weights. Specifically, after completing the forward pass with the binary weights, the output layer computes the errors by comparing its actual and target firing times, and then the real-valued synaptic weights are updated using temporal error backpropagation. We evaluated the proposed network on the MNIST [24] and Fashion-MNIST [25] datasets, reaching 97.0% and 87.3% categorization accuracy, respectively.

SNNs can vary in terms of neuronal model, neural connectivity, information coding, and learning strategy, which deeply affect their accuracy, memory, and energy efficiency. The advantages of the proposed BSNN are: 1) the use of non-leaky IF neurons with very simple neuronal dynamics; 2) binarized connectivity with low memory and computational cost; 3) a sparse temporal code with at most one spike per neuron; and 4) a direct supervised temporal learning rule which forces the network to make decisions as accurately and early as possible.
2 Methods
The input layer of the proposed binarized single-spike supervised spiking neural network (BS4NN) converts the input image into a spike train based on a time-to-first-spike coding. These spikes are then propagated through the network, where the binary IF neurons in the hidden and output layers are not allowed to fire more than once per image. Each output neuron is dedicated to a different category, and the first output neuron to fire determines the decision of the network.
The error of each output neuron is computed by comparing its actual firing time with a target firing time. Then, a modified version of the temporal backpropagation algorithm in S4NN [26] is used to update the synaptic weights. During the learning phase, we have two sets of weights: the real-valued weights, $W$, and the corresponding binary weights, $B$, where $B = \mathrm{sign}(W)$. The forward propagation is done with the binary weights, while the error backpropagation and weight updates are done with the real-valued weights. Finally, we put the real-valued weights aside and use the binary weights for inference on test images. Note that some of the following equations are adopted from S4NN [26] and are reproduced here for the reader's convenience.
2.1 Forward pass
The input layer converts the input image into a volley of spikes using a single-spike temporal coding scheme known as intensity-to-latency conversion. For images with pixel intensities in the range $[0, I_{max}]$, the firing time $t_i$ of the $i$th input neuron, corresponding to the $i$th pixel intensity $I_i$, is computed as
(1) $t_i = \left\lfloor t_{max} \, \frac{I_{max} - I_i}{I_{max}} \right\rfloor$
where $t_{max}$ is the maximum firing time. In this way, input neurons with higher pixel intensities have shorter spike latencies. Here, we used discrete time. Therefore, the spike train of the $i$th input neuron is defined as
(2) $S_i(t) = \begin{cases} 1 & \text{if } t = t_i, \\ 0 & \text{otherwise.} \end{cases}$
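To make the coding concrete, here is a minimal sketch (not the authors' code) of the intensity-to-latency conversion and the resulting one-spike-per-neuron trains; the linear mapping, `t_max = 256`, and the 8-bit pixel range are assumptions consistent with the description above.

```python
import numpy as np

def ttfs_encode(pixels, t_max=256, i_max=255):
    """Intensity-to-latency conversion: brighter pixels fire earlier.

    Returns one integer firing time per input neuron, in [0, t_max - 1].
    """
    pixels = np.asarray(pixels, dtype=float)
    return np.floor((i_max - pixels) / i_max * (t_max - 1)).astype(int)

def to_spike_train(times, t_max=256):
    """Discrete spike trains: S[i, t] = 1 iff neuron i fires at time t."""
    s = np.zeros((times.size, t_max), dtype=np.uint8)
    s[np.arange(times.size), times] = 1
    return s
```

With this mapping, a maximal-intensity pixel spikes at time step 0 and a zero-intensity pixel spikes at the last time step, and each input neuron emits exactly one spike.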
Subsequent hidden and output layers are comprised of non-leaky IF neurons. The $j$th IF neuron of the $l$th layer receives incoming spikes through binary synaptic weights of −1 or +1 and updates its membrane potential, $V_j^l$, as
(3) $V_j^l(t) = V_j^l(t-1) + \mu^l \sum_i B_{ji}^l S_i^{l-1}(t)$
where $S_i^{l-1}$ and $B_{ji}^l$ are, respectively, the input spike train and the binary synaptic weight connecting the $i$th presynaptic neuron to neuron $j$. Note that $\mu^l$ is a scaling factor shared between all the neurons of the $l$th layer. The IF neuron fires only once, the first time its membrane potential crosses the threshold $\theta$,
(4) $S_j^l(t) = \begin{cases} 1 & \text{if } V_j^l(t) \geq \theta \text{ and } \sum_{\tau<t} S_j^l(\tau) = 0, \\ 0 & \text{otherwise,} \end{cases}$
where the condition $\sum_{\tau<t} S_j^l(\tau) = 0$ checks that the neuron has not fired at any previous time step. Equivalently, one can move the scaling factor from Eq. 3 to Eq. 4 by replacing $\theta$ with $\theta / \mu^l$.
For each input image, we first reset all the membrane potentials to zero and then run the simulation for at most $t_{max}$ time steps. Each output neuron is assigned to a different category, and the output neuron that fires earlier than the others determines the category of the input image. Hence, in the test phase, we do not need to continue the simulation after the first spike in the output layer. If none of the output neurons fires before $t_{max}$, the output neuron with the maximum membrane potential at $t_{max}$ makes the decision. However, during the learning phase, to compute the temporal error and the gradients, we need all the neurons in the network to fire at some point; hence, we continue the simulation until $t_{max}$, and if a neuron never fires, we force it to emit a fake spike at time $t_{max}$.
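The forward pass described above can be sketched as follows; this is an illustrative implementation, not the authors' code, and the names (`forward_layer`, `mu`, `theta`) are ours.

```python
import numpy as np

def forward_layer(spikes_in, B, mu, theta, t_max):
    """One layer of non-leaky IF neurons with binary {-1, +1} weights.

    spikes_in: (n_in, t_max) 0/1 spike trains; B: (n_out, n_in) binary weights;
    mu: shared per-layer scaling factor; theta: firing threshold.
    Each neuron fires at most once; silent neurons get a fake spike at t_max - 1.
    """
    n_out = B.shape[0]
    v = np.zeros(n_out)
    t_fire = np.full(n_out, t_max - 1, dtype=int)
    fired = np.zeros(n_out, dtype=bool)
    for t in range(t_max):
        v += mu * (B @ spikes_in[:, t])      # integrate this step's input spikes
        crossing = (~fired) & (v >= theta)   # first threshold crossing only
        t_fire[crossing] = t
        fired |= crossing
    return t_fire, fired, v

def decide(t_fire, fired, v):
    """First output spike wins; if all outputs stayed silent, fall back to
    the neuron with the maximum membrane potential at t_max."""
    if not fired.any():
        return int(np.argmax(v))
    return int(np.argmin(t_fire))
```

In a real run, the layer would be stopped at the first output spike during testing; simulating all `t_max` steps, as here, matches the learning-phase behavior.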
2.2 Backward pass
For a categorization task with $C$ categories, we define the temporal error as a function of the actual and target firing times,
(5) $L = \frac{1}{2} \sum_{j=1}^{C} (T_j - t_j)^2$
where $t_j$ and $T_j$ are the actual and the target firing times of the $j$th output neuron, respectively. Let's define $\tau$ as the minimum firing time in the output layer (i.e., $\tau = \min_j t_j$). For an input image belonging to the $c$th category, we have
(6) $T_j = \begin{cases} \tau & \text{if } j = c, \\ \max(t_j, \tau + \gamma) & \text{otherwise,} \end{cases}$
where $\gamma$ is a positive constant. In this way, the correct neuron is encouraged to fire first and the others are penalized if they fire earlier than $\tau + \gamma$. In the special case where all the output neurons remain silent during the forward pass (i.e., they all emit fake spikes at $t_{max}$), we set $\tau = t_{max} - \gamma$ to force the correct neuron to fire.
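A plausible implementation of this relative target scheme (our reading of the text; the paper's exact formula may differ) is:

```python
import numpy as np

def target_times(t_out, correct, gamma, t_max):
    """Relative target firing times: the correct neuron is pulled toward the
    current minimum firing time tau; every other neuron that fired earlier
    than tau + gamma is pushed back to tau + gamma (a positive margin)."""
    t_out = np.asarray(t_out)
    tau = int(t_out.min())
    if tau == t_max - 1:          # all outputs silent (fake spikes only):
        tau = t_max - 1 - gamma   # force the correct neuron to fire earlier
    targets = np.maximum(t_out, tau + gamma)  # late neurons get no penalty
    targets[correct] = tau
    return targets
```

Neurons that already fire later than `tau + gamma` keep their own firing time as target, so they contribute no error.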
To apply the gradient descent algorithm, we should compute $\partial L / \partial B_{ji}^l$, the gradient of the loss function with respect to the binary weights. However, gradient descent makes small changes to the weights, which cannot be done with binary values. To solve this problem, during the learning phase, we use a set of real-valued weights, $W$, as a proxy, such that
(8) $B_{ji}^l = \mathrm{sign}(W_{ji}^l)$
and, as the gradient of the sign function is 0 or undefined everywhere, using the straight-through estimator (STE) we approximate $\partial B_{ji}^l / \partial W_{ji}^l \approx 1$; therefore, we have
(9) $\frac{\partial L}{\partial W_{ji}^l} = \frac{\partial L}{\partial B_{ji}^l}$
Now, we can update the real-valued weights as
(10) $W_{ji}^l \leftarrow W_{ji}^l - \eta \, \frac{\partial L}{\partial B_{ji}^l}$
where $\eta$ is the learning rate parameter.
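The proxy-weight mechanism can be condensed into a few lines; this is a sketch (ours, not the authors' code), assuming the gradient with respect to the binary weights has already been computed:

```python
import numpy as np

def ste_update(w_real, grad_binary, lr):
    """Proxy-weight update with the straight-through estimator (STE):
    the gradient computed w.r.t. the binary weights is applied, unchanged,
    to the real-valued proxies, and the binary weights are re-derived."""
    w_real = w_real - lr * grad_binary     # dL/dW is taken to equal dL/dB
    b = np.where(w_real >= 0, 1, -1)       # binary weights as the sign of W
    return w_real, b
```

Note that a small gradient step can flip a binary weight only when the real-valued proxy is near zero, which is what makes the scheme stable.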
Table 1: Parameter settings of BS4NN for each dataset.

Dataset | Hidden size | Output size | $t_{max}$ | $\theta$ | Initial real-valued weights | Initial parameters
MNIST | 600 | 10 | 256 | 100 | 5, 5 | 0.1, 0.01, 1
Fashion-MNIST | 1000 | 10 | 256 | 700 | 5, 10 | 0.1, 0.01, 1
Table 2: Categorization accuracies of BS4NN and other fully connected SNNs with direct spike-time-based supervised learning on MNIST.

Model | Coding | Neuron / Synapse / PSP | Learning | Hidden (#) | Acc. (%)
Mostafa (2017) [28] | Temporal | IF / Real-valued / Exponential | Temporal backpropagation | 800 | 97.2
Tavanaei et al. (2019) [27] | Rate | IF / Real-valued / Instantaneous | STDP-based backpropagation | 1000 | 96.6
Comsa et al. (2019) [29] | Temporal | SRM / Real-valued / Exponential | Temporal backpropagation | 340 | 97.9
Zhang et al. (2020) [30] | Temporal | IF / Real-valued / Linear | Temporal backpropagation | 400 | 98.1
Zhang et al. (2020) [30] | Temporal | IF / Real-valued / Linear | Temporal backpropagation | 800 | 98.4
Sakemi et al. (2020) [31] | Temporal | IF / Real-valued / Linear | Temporal backpropagation | 500 | 97.8
Sakemi et al. (2020) [31] | Temporal | IF / Real-valued / Linear | Temporal backpropagation | 800 | 98.0
S4NN [26] | Temporal | IF / Real-valued / Instantaneous | Temporal backpropagation | 400 | 97.4
BNN | Binary (0 & 1) | Binary sigmoid / Binary / – | Backpropagation with ADAM | 600 | 96.8
BS4NN (this paper) | Temporal | IF / Binary / Instantaneous | Temporal backpropagation | 600 | 97.0
Let’s define
(11) $\delta_j^l = \frac{\partial L}{\partial t_j^l}$
where $t_j^l$ is the firing time of the $j$th neuron of the $l$th layer. Also, according to [26], we approximate $\partial t_j^l / \partial V_j^l(t)$ to be $-1$ if $t = t_j^l$ and 0 otherwise. Therefore, we have
(12) $\frac{\partial L}{\partial B_{ji}^l} = \delta_j^l \, \frac{\partial t_j^l}{\partial V_j^l} \frac{\partial V_j^l}{\partial B_{ji}^l} = -\delta_j^l \, \mu^l \sum_{t \leq t_j^l} S_i^{l-1}(t)$
where for the output layer (i.e., $l = o$) we have
(13) $\delta_j^o = \frac{\partial L}{\partial t_j^o} = t_j^o - T_j$
and for the hidden layers (i.e., $l < o$), according to the backpropagation algorithm, we compute the weighted sum of the delta values of the neurons in the following layer,
(14) $\delta_j^l = \sum_k \delta_k^{l+1} \, \frac{\partial t_k^{l+1}}{\partial V_k^{l+1}} \frac{\partial V_k^{l+1}}{\partial t_j^l}$
where $k$ iterates over the neurons in layer $l+1$. Similar to [26], we approximate $\partial t_k^{l+1} / \partial V_k^{l+1} \approx -1$ and $\partial V_k^{l+1} / \partial t_j^l \approx -W_{kj}^{l+1}$ if and only if $t_j^l \leq t_k^{l+1}$ (and 0 otherwise). To have smooth gradients, we use the real-valued weights, $W_{kj}^{l+1}$, instead of the scaled binary weights, $\mu^{l+1} B_{kj}^{l+1}$.
We also update the scaling factor $\mu^l$ as
(15) $\mu^l \leftarrow \mu^l - \eta_\mu \, \frac{\partial L}{\partial \mu^l}$
where $\eta_\mu$ is its learning rate parameter. Therefore, we compute
(16) $\frac{\partial L}{\partial \mu^l} = \sum_j \delta_j^l \, \frac{\partial t_j^l}{\partial V_j^l} \frac{\partial V_j^l}{\partial \mu^l} = -\sum_j \delta_j^l \sum_i B_{ji}^l \sum_{t \leq t_j^l} S_i^{l-1}(t)$
where $j$ and $i$ iterate over the neurons in layers $l$ and $l-1$, respectively. Here again, similar to [26], we approximate $\partial t_j^l / \partial V_j^l \approx -1$, and according to Eq. 3, we compute $\partial V_j^l / \partial \mu^l = \sum_i B_{ji}^l \sum_{t \leq t_j^l} S_i^{l-1}(t)$.
Note that before updating the weights, we normalize the gradients of each layer by their Euclidean norm to avoid exploding and vanishing gradients. Also, we added an L2-norm weight-regularization term to the loss function to avoid overfitting; the parameter $\lambda$ is the regularization parameter accounting for the degree of weight penalization.
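Both tricks are one-liners; a sketch (ours, with `lam` standing for the regularization parameter), assuming per-layer gradient arrays:

```python
import numpy as np

def normalized_grad(grad, eps=1e-12):
    """Scale a layer's gradient to unit Euclidean norm, bounding the update
    magnitude and mitigating exploding/vanishing gradients."""
    return grad / (np.linalg.norm(grad) + eps)

def l2_regularized_grad(grad, w_real, lam):
    """Add the gradient of the L2 penalty (lam/2) * ||W||^2 to the loss
    gradient; the penalty acts on the real-valued proxy weights."""
    return grad + lam * w_real
```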
3 Results
3.1 MNIST dataset
In this section, we evaluate BS4NN on the MNIST dataset, which is the most popular benchmark for spiking neural networks [1]. The MNIST train set contains 60,000 handwritten digits (0 to 9) in images of size 28 × 28 pixels. The test set contains 10,000 digit images, roughly 1,000 images per digit. Here, we train a fully connected BS4NN with one hidden layer containing 600 IF neurons. The parameter settings are provided in Table 1. The initial input-hidden and hidden-output synaptic weights are drawn from uniform distributions whose ranges are given in Table 1. The trainable parameters, including the synaptic weights and the scaling factors of the hidden and output layers, are tuned through the learning phase. The learning rates $\eta$ and $\eta_\mu$ are discounted by 30% every ten epochs. All other parameters remain fixed in both the learning and testing phases.
Table 2 presents the categorization accuracy of the proposed BS4NN along with other SNNs that use spike-time-based direct supervised learning and fully connected architectures. BS4NN is the only network in this table that uses binary weights, and it reaches 97.0% accuracy on MNIST. As mentioned in the Methods section, BS4NN uses a modified version of the temporal backpropagation algorithm in S4NN (Kheradpisheh et al. (2020) [26]) to accommodate binary weights. Compared to S4NN, the categorization accuracy of BS4NN dropped by only 0.4%. Although BS4NN is outperformed by the other SNNs by at most 1.4%, its advantages are the use of binary weights instead of full-precision real-valued weights and of an instantaneous postsynaptic potential (PSP) function. As seen, BS4NN outperforms Tavanaei et al. (2019) [27], which uses real-valued weights and instantaneous PSPs. The other SNNs use exponential or linear PSP functions, which complicate the neural processing and learning procedure of the network and consequently increase their computational and energy costs.
We also compared BS4NN to a BNN with a similar architecture. To make the comparison fair, inspired by [12], we implemented a BNN with binary weights (−1 and +1) and binary sigmoid activations (0 and 1). The network has a single hidden layer of size 600 and is trained with the ADAM optimizer and the squared hinge loss for 500 epochs. The learning rate decays exponentially through the learning epochs. Following [32], the initial real-valued weights of each layer are randomly drawn from a uniform distribution whose range depends on the number of synaptic weights of that layer. As reported in Table 2, the BNN reaches a best accuracy of 96.8% on MNIST, a 0.2% drop with respect to BS4NN (we comment on these results in the Discussion).
The firing times of the ten output neurons over all test images are shown in Figure 1(a). Images are ordered by digit category, from '0' to '9'. For each test image, the firing time of each neuron is shown by a color-coded dot. As seen, for each category, the corresponding output neuron tends to fire earlier than the others. This is even more evident in Figure 1(b), which shows the mean firing time of each output neuron for each digit category. Each output neuron has, by far, the shortest mean firing time for images of its corresponding digit. Interestingly, BS4NN needs a much longer time to detect digit '1' (188 time steps), which could be due to the use of binary weights. Other digits cover more pixels of the image and therefore produce more early spikes than digit '1'. Since the weights are binary, the few early spikes of digit '1' cannot activate the hidden IF neurons, and hence BS4NN has to wait for later surrounding spikes to distinguish digit '1' from the other digits.
We further counted the mean number of spikes BS4NN requires to categorize images of each digit category. To this end, we counted the number of spikes in all the layers until the emission of the first spike in the output layer (i.e., when the network makes its decision). The mean required spikes of the input and hidden layers are depicted in Figure 2. On average, all digit categories but '1' require about 100 spikes in the input layer and 200 spikes in the hidden layer. Digit '1' requires about 300 input spikes, while its hidden layer needs about 100 spikes. As explained above, digit '1' covers fewer pixels than the other digits, and its shape overlaps with constituent parts of some of them; hence, due to the binary weights, the network has to wait for later input spikes to distinguish digit '1' from the other digits.
Figure 3 shows the time course of the membrane potentials of the output neurons for a sample '9' test image. The membrane potential of the 9th output neuron overtakes the others at the 15th time step and quickly increases until it crosses the threshold at the 58th time step. The accumulated input spikes up to the 15th, 58th, 100th, 190th, and 250th time steps are also depicted in this figure. As seen, up to the 15th time step only a few input spikes have been propagated, and at the 58th time step, with the propagation of a few more input spikes, the 9th output neuron reaches its threshold and determines the category of the input image. Later input, hidden, and output spikes are no longer required by the network.
To evaluate the robustness of the trained BS4NN to input noise, during the test phase we added random jitter noise, drawn from a uniform distribution whose range is set by the noise level, to the pixels of the input images. The noise level varies from 5% to 100% of the maximum pixel intensity, $I_{max}$. Figure 4(a) shows a sample image contaminated with different levels of jitter noise. The recognition accuracy of the trained model on noisy test images under different noise levels is plotted in Figure 4(b). As shown, the recognition accuracy remains above 95% up to high noise levels and drops to 79% at the 100% noise level. At higher noise levels, the order of the input spikes can change dramatically, and because BS4NN assigns only +1 and −1 synaptic weights, even to insignificant parts of the input images, this alters the behavior of the IF neurons and consequently increases the categorization error rate.
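A sketch of this test-time noise protocol (our implementation; the clipping back into the valid pixel range is an assumption):

```python
import numpy as np

def add_jitter(image, noise_level, i_max=255, rng=None):
    """Additive uniform jitter: each pixel is perturbed by a value drawn
    from U(-noise_level * i_max, +noise_level * i_max), then clipped back
    into the valid intensity range before the latency encoding."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.uniform(-noise_level * i_max, noise_level * i_max,
                        size=np.shape(image))
    return np.clip(np.asarray(image, dtype=float) + noise, 0.0, i_max)
```

Because firing latency is a monotonic function of intensity, strong jitter reorders the input spikes, which is what degrades the accuracy.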
Table 3: Characteristics and recognition accuracies of recent SNNs and BS4NN on Fashion-MNIST.

Model | Architecture | Neuron | Coding | Synapses | Learning | Acc. (%)
Zhang et al. (2019) [33] | Recurrent SNN | Leaky IF | Rate | Real-valued | Spike-train backpropagation | 90.1
Ranjan et al. (2019) [34] | Convolutional SNN | Leaky IF | Rate | Real-valued | Spike-rate backpropagation | 89.0
Wu et al. (2020) [35] | Convolutional SNN | Leaky IF | Rate | Real-valued | Global-local hybrid learning rule | 93.3
Zhang et al. (2020) [36] | Fully connected SNN | Leaky IF | Rate | Real-valued | Spike-sequence backpropagation | 89.5
Zhang et al. (2020) [36] | Fully connected SNN | IOW (input-output-weighted) leaky IF [36] | Rate | Real-valued | Spike-sequence backpropagation | 90.2
Hao et al. (2020) [37] | Fully connected SNN | Leaky IF | Rate | Real-valued | Dopamine-modulated STDP | 85.3
S4NN [26] | Fully connected SNN | IF | Temporal | Real-valued | Temporal backpropagation | 88.0
BNN | Fully connected | Binary sigmoid | Binary (0 & 1) | Binary | Backpropagation with ADAM | 86.4
BS4NN (this paper) | Fully connected SNN | IF | Temporal | Binary | Temporal backpropagation | 87.3
In a further experiment, we replaced the binary weights of the trained BS4NN with their corresponding real-valued weights and applied them to the test images; in other words, we replaced the term $\mu^l B_{ji}^l$ in Eq. 3 with $W_{ji}^l$. The network reached 89.1% accuracy on the test images, far below the 97.0% accuracy obtained with the binary weights. This shows that, although we update the real-valued proxy weights during the learning phase, we are actually tuning the binary weights, because the loss and gradients are computed with the binary weights. Figure 5 shows the pairs of real-valued and binary weights for 16 randomly selected hidden neurons. Dark pixels correspond to negative weights and bright pixels to positive weights. The hidden neurons appear to detect different variants of the digits and of their constituent parts.
To assess the speed-accuracy trade-off in BS4NN, we first trained the network with a threshold of 100 for all neurons, then varied the threshold from 0 to 200 and evaluated the network on the test images. As shown in Figure 6, the accuracy peaks around the threshold of 100 and drops as we move to higher or lower threshold values, while the response time (the time of the first spike in the output layer) increases with the threshold. Given this trade-off, one can obtain faster but less accurate responses by reducing the threshold of the pretrained BS4NN. For instance, setting the threshold to 80 shortens the response time from 112.9 to 44.9 time steps (about 2.5 times faster), while the accuracy drops from 97.0% to 91.0%.
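The speed side of this trade-off can be probed offline from stored potential traces; a small sketch (ours, not the authors' code):

```python
import numpy as np

def first_crossing(v_trace, theta):
    """Index of the first time step where a neuron's membrane-potential
    trace reaches theta; falls back to the last step (the fake-spike time)
    if the threshold is never reached."""
    above = np.flatnonzero(np.asarray(v_trace) >= theta)
    return int(above[0]) if above.size else len(v_trace) - 1
```

For a given trace, sweeping `theta` shows how response time grows with the threshold; note that with mixed-sign weights the potential is not monotonic, so a full accuracy sweep still requires re-simulating the network.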
The scaling factors are the only full-precision floating-point parameters in our neuronal layers; they allow the binary weights to better approximate the real-valued weights. In the pretrained network, these factors can be rounded to two decimal places without any change in the categorization accuracy.
3.2 FashionMNIST dataset
Fashion-MNIST [25] is a fashion product image dataset with 10 classes (see Figure 7). Images were gathered from the thumbnails of clothing products on an online shopping website. Fashion-MNIST has the same image size and training/testing splits as MNIST but is a more challenging classification task. Here, we used a BS4NN with a single hidden layer of 1000 IF neurons. The parameter values are presented in Table 1. The initial weights of all layers are randomly drawn from a uniform distribution in the range [0, 1]. The learning rates $\eta$ and $\eta_\mu$ are discounted by 30% every ten epochs, and the scaling factors of the hidden and output layers are trained during the learning phase.
Figure 8: (a) The mean firing times of the output neurons over the Fashion-MNIST categories. (b) The confusion matrix of BS4NN on Fashion-MNIST. (c) The mean required number of spikes per category and layer.
Table 3 summarizes the characteristics and recognition accuracies of recent SNNs on the Fashion-MNIST dataset. BS4NN reaches 87.3% accuracy (a 0.7% drop with respect to S4NN). Apart from BS4NN, all the models use real-valued synaptic weights, spike-rate-based neural coding, and leaky neurons with exponential decay. The mean firing times of the output neurons of BS4NN for each of the ten Fashion-MNIST categories are illustrated in Figure 8(a). As seen, the correct output neuron has the minimum mean firing time for its corresponding category. However, compared to MNIST, the difference between the mean firing times of the correct neuron and some of the others is small, which could be due to similarities between instances of different categories. For instance, as shown in Figure 8(b), BS4NN confuses ankle boots, sandals, and sneakers. The situation is similar for shirts and T-shirts, and also for pullovers and coats: their firing times are close together, and consequently BS4NN sometimes confuses them with each other. The mean required number of spikes in each layer and in the whole network is provided in Figure 8(c). The classes that are most often confused with each other (i.e., shirts, T-shirts, coats, and pullovers) require more spikes in both the input and hidden layers. One reason could be the larger size of these objects in the input image, leading to more early input spikes; another, especially for the hidden layer, could be the need for more discriminative features between these confusable categories.
We also compared BS4NN to a BNN with binary weights (−1 and +1), binary activations (0 and 1), and the same architecture as BS4NN on Fashion-MNIST. The BNN was trained with the ADAM optimizer and the squared hinge loss. The learning rate decays exponentially through the learning epochs. As before, the initial real-valued weights of each layer are randomly drawn from a uniform distribution whose range depends on the number of synaptic weights of that layer [32]. Interestingly, BS4NN outperforms this BNN by 0.9% accuracy.
4 Discussions
In this paper, we proposed a binarized spiking neural network, called BS4NN, with a direct supervised temporal learning algorithm. To this end, we used an approach that is common in the BANN literature [7]: during the learning phase, we keep two sets of weights, real-valued and binary, such that the binary weights are the sign of the real-valued ones. The binary weights are used for inference and gradient backpropagation, while in the backward pass the weight updates are applied to the real-valued weights. The proposed BS4NN uses time-to-first-spike coding [38, 39, 40, 41, 42] to convert image pixels into spike trains in which input neurons with higher pixel intensities emit spikes with shorter latencies. The subsequent hidden and output layers are comprised of non-leaky IF neurons with binary (+1 or −1) weights that fire once, when they reach their threshold for the first time. The decision is simply made by the first spike in the output layer. The temporal error is then computed by comparing the actual and target firing times; gradients backpropagate through the network and are applied to the real-valued weights. Target firing times are computed relative to the actual firing times of the output neurons, to push the correct output neuron to fire earlier than the others. This forces BS4NN to make quick and accurate decisions with the smallest possible number of spikes (high sparsity).
In our experiments, BS4NN reached 97.0% and 87.3% accuracy on the MNIST and Fashion-MNIST datasets, respectively. Although BS4NN does not beat the real-valued SNNs in terms of accuracy, it has several computational, memory, and energy advantages that make it suitable for hardware and neuromorphic implementations. Interestingly, BS4NN also outperformed BNNs with the same architectures on MNIST and Fashion-MNIST, by 0.2% and 0.9% accuracy, respectively. This improvement could be due to the use of time in our time-to-first-spike coding and temporal backpropagation: both networks have binary activations and binary weights, but BS4NN additionally exploits the temporal information encoded in spike times.
Instead of real-valued weights, BS4NN uses binary synapses with only one full-precision scaling factor per layer. This can be very important for memory optimization in hardware implementations, where every synaptic weight requires a separate memory space. If the binary synapses are implemented with a single bit of memory each, the network size shrinks by about 32x compared to a network with 32-bit floating-point weights [13, 43]. Binary weights also simplify the implementation of synapses: multiplications are replaced by one-unit increment and decrement operations, which reduces the computational and energy-consumption costs [13, 43].
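The 32x figure can be checked directly by bit-packing a {−1, +1} weight matrix; the 784 -> 600 layer size below is an assumption matching the MNIST architecture used in this paper.

```python
import numpy as np

# A 784 -> 600 fully connected layer with weights in {-1, +1}, stored as
# 1 bit per synapse versus 32-bit floats.
rng = np.random.default_rng(0)
w_bin = rng.choice([-1, 1], size=(600, 784)).astype(np.int8)

packed = np.packbits((w_bin > 0).astype(np.uint8))   # 1 bit per weight
dense = w_bin.astype(np.float32)                     # 32 bits per weight
ratio = dense.nbytes / packed.nbytes                 # memory-saving factor

# The packing is lossless: the original {-1, +1} matrix is fully recovered.
unpacked = np.unpackbits(packed)[: w_bin.size].reshape(w_bin.shape)
restored = np.where(unpacked == 1, 1, -1).astype(np.int8)
```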
The use of non-leaky IF neurons instead of more complicated neuron models such as SRM [29] and LIF [36, 44] makes BS4NN computationally efficient and hardware friendly. Leakage might be implemented efficiently in analog hardware, given the physical characteristics of transistors and capacitors [6], but it is always costly in digital hardware. There, one must either periodically (e.g., every millisecond) decrease the membrane potential of all neurons (clock-driven) [45], or do so whenever an input spike is received by a neuron (event-based) [46, 47]. The first option costs energy, and the second requires extra memory to store the last firing times.
The instantaneous synapses used in BS4NN are far simpler to implement than the exponential [28], alpha [29], and linear [30, 31, 48] synaptic currents, and cost much less energy and computation. With instantaneous synapses, each input spike causes a single sudden increment or decrement of the potential, whereas with current-based synapses each input spike updates the potential over several consecutive time steps (which requires an extra state variable).
As mentioned above, BS4NN uses single-spike neural coding throughout the network. The input layer employs a time-to-first-spike code by which each input neuron fires only once (with shorter latencies for stronger inputs). Neurons in the subsequent layers also fire at most once, and only when they reach their threshold for the first time. In addition, the proposed temporal learning algorithm forces BS4NN to rely on the earliest spikes and respond as quickly as possible. This combination has been shown to take much less energy and time on neuromorphic devices than rate-coded SNNs [49, 50], with up to 15 times lower energy consumption and 5 times faster decisions [51].
Recently, efforts have been made to convert pretrained BANNs into equivalent BSNNs with spike-rate-based neural coding [18, 19, 20]. However, such networks do not exploit the temporal advantages of SNNs that can be obtained with a direct learning algorithm. Due to the non-differentiability of the thresholding activation function of spiking neurons, it is not straightforward to apply backpropagation and gradient descent to SNNs. Various solutions have been proposed to tackle this problem, including computing gradients with respect to spike rates instead of single spikes [52, 53, 54, 55], using differentiable smoothed spike functions [56], using surrogate gradients for the threshold function in the backward pass [57, 58, 59, 60, 61, 62], and transfer learning by sharing weights between the SNN and an ANN [49, 63]. In another approach, known as latency learning, the neuron's activity is defined by the firing time of its first spike, so the gradient of the thresholding function is not needed; in return, the firing time must be defined as a function of the membrane potential [26, 29, 30, 31, 36, 64, 65], or directly as a function of the firing times of the presynaptic neurons [28]. In any case, all the aforementioned learning strategies work with full-precision real-valued weights, and future studies could assess their suitability for BSNNs.

References

[1]
A. Tavanaei, M. Ghodrati, S. R. Kheradpisheh, T. Masquelier and A. Maida, Deep learning in spiking neural networks,
Neural Networks 111 (2019) 47–63.  [2] M. Pfeiffer and T. Pfeil, Deep learning with spiking neurons: opportunities and challenges, Frontiers in Neuroscience 12 (2018) p. 774.
 [3] A. Taherkhani, A. Belatreche, Y. Li, G. Cosma, L. P. Maguire and T. M. McGinnity, A review of learning in biologically plausible spiking neural networks, Neural Networks 122 (2020) 253–272.
 [4] B. Illing, W. Gerstner and J. Brea, Biologically plausible deep learning–but how far can we go with shallow networks?, Neural Networks (2019).
 [5] X. Wang, X. Lin and X. Dang, Supervised learning in spiking neural networks: A review of algorithms and evaluations, Neural Networks (2020).
 [6] K. Roy, A. Jaiswal and P. Panda, Towards spikebased machine intelligence with neuromorphic computing, Nature 575 (nov 2019) 607–617.
 [7] T. Simons and D.J. Lee, A review of binarized neural networks, Electronics 8(6) (2019) p. 661.
 [8] D. Saad and E. Marom, Training feed forward nets with binary weights via a modified chir algorithm, Complex Systems 4(5) (1990).
 [9] S. S. Venkatesh, Directed drift: A new linear threshold algorithm for learning binary weights online, Journal of Computer and System Sciences 46(2) (1993) 198–217.
 [10] C. Baldassi, A. Braunstein, N. Brunel and R. Zecchina, Efficient supervised learning in networks with binary synapses, Proceedings of the National Academy of Sciences 104(26) (2007) 11079–11084.
 [11] M. Courbariaux, Y. Bengio and J.P. David, Binaryconnect: Training deep neural networks with binary weights during propagations, Advances in neural information processing systems, 2015, pp. 3123–3131.
 [12] M. Courbariaux, I. Hubara, D. Soudry, R. ElYaniv and Y. Bengio, Binarized neural networks: Training deep neural networks with weights and activations constrained to +1 or 1, arXiv preprint arXiv:1602.02830 (2016).

[13] M. Rastegari, V. Ordonez, J. Redmon and A. Farhadi, XNOR-Net: ImageNet classification using binary convolutional neural networks, European Conference on Computer Vision (Springer, 2016), pp. 525–542.
[14] W. Tang, G. Hua and L. Wang, How to train a compact binary neural network with high accuracy?, Thirty-First AAAI Conference on Artificial Intelligence, 2017.
[15] S. Zhou, Y. Wu, Z. Ni, X. Zhou, H. Wen and Y. Zou, DoReFa-Net: Training low bit-width convolutional neural networks with low bit-width gradients, arXiv preprint arXiv:1606.06160 (2016).
[16] S. K. Esser, R. Appuswamy, P. Merolla, J. V. Arthur and D. S. Modha, Backpropagation for energy-efficient neuromorphic computing, Advances in Neural Information Processing Systems 28, eds. C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama and R. Garnett (Curran Associates, Inc., 2015), pp. 1117–1125.
[17] S. K. Esser, P. A. Merolla, J. V. Arthur, A. S. Cassidy, R. Appuswamy, A. Andreopoulos, D. J. Berg, J. L. McKinstry, T. Melano, D. R. Barch, C. di Nolfo, P. Datta, A. Amir, B. Taba, M. D. Flickner and D. S. Modha, Convolutional networks for fast, energy-efficient neuromorphic computing, Proceedings of the National Academy of Sciences 113(41) (2016) 11441–11446.
[18] B. Rueckauer, I.-A. Lungu, Y. Hu, M. Pfeiffer and S.-C. Liu, Conversion of continuous-valued deep networks to efficient event-driven networks for image classification, Frontiers in Neuroscience 11 (2017) p. 682.
[19] Y. Wang, Y. Xu, R. Yan and H. Tang, Deep spiking neural networks with binary weights for object recognition, IEEE Transactions on Cognitive and Developmental Systems (2020).
[20] S. Lu and A. Sengupta, Exploring the connection between binary and spiking neural networks, arXiv preprint arXiv:2002.10064 (2020).
[21] S. R. Kheradpisheh, M. Ganjtabesh, S. J. Thorpe and T. Masquelier, STDP-based spiking deep convolutional neural networks for object recognition, Neural Networks 99 (2018) 56–67.
[22] M. Mozafari, S. R. Kheradpisheh, T. Masquelier, A. Nowzari-Dalini and M. Ganjtabesh, First-spike-based visual categorization using reward-modulated STDP, IEEE Transactions on Neural Networks and Learning Systems 29(12) (2018) 6178–6190.

[23] S. R. Kheradpisheh, M. Ganjtabesh and T. Masquelier, Bio-inspired unsupervised learning of visual features leads to robust invariant object recognition, Neurocomputing 205 (sep 2016) 382–392.
[24] Y. LeCun, L. Bottou, Y. Bengio, P. Haffner et al., Gradient-based learning applied to document recognition, Proceedings of the IEEE 86(11) (1998) 2278–2324.
[25] H. Xiao, K. Rasul and R. Vollgraf, Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms, arXiv preprint arXiv:1708.07747 (2017).
[26] S. R. Kheradpisheh and T. Masquelier, Temporal backpropagation for spiking neural networks with one spike per neuron, International Journal of Neural Systems 30(06) (2020) p. 2050027, PMID: 32466691.
[27] A. Tavanaei and A. Maida, BP-STDP: Approximating backpropagation using spike timing dependent plasticity, Neurocomputing 330 (2019) 39–47.
[28] H. Mostafa, Supervised learning based on temporal coding in spiking neural networks, IEEE Transactions on Neural Networks and Learning Systems 29(7) (2017) 3227–3235.
[29] I. M. Comsa, K. Potempa, L. Versari, T. Fischbacher, A. Gesmundo and J. Alakuijala, Temporal coding in spiking neural networks with alpha synaptic function, arXiv preprint arXiv:1907.13223 (2019).
[30] M. Zhang, J. Wang, Z. Zhang, A. Belatreche, J. Wu, Y. Chua, H. Qu and H. Li, Spike-timing-dependent back propagation in deep spiking neural networks, arXiv preprint arXiv:2003.11837 (2020).
[31] Y. Sakemi, K. Morino, T. Morie and K. Aihara, A supervised learning algorithm for multilayer spiking neural networks based on temporal coding toward energy-efficient VLSI processor design, arXiv preprint arXiv:2001.05348 (2020).
[32] X. Glorot and Y. Bengio, Understanding the difficulty of training deep feedforward neural networks, Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010, pp. 249–256.
[33] W. Zhang and P. Li, Spike-train level backpropagation for training deep recurrent spiking neural networks, Advances in Neural Information Processing Systems, 2019, pp. 7802–7813.

[34] J. A. K. Ranjan, T. Sigamani and J. Barnabas, A novel and efficient classifier using spiking neural network, The Journal of Supercomputing (2019) 1–16.
[35] Y. Wu, R. Zhao, J. Zhu, F. Chen, M. Xu, G. Li, S. Song, L. Deng, G. Wang, H. Zheng et al., Brain-inspired global-local hybrid learning towards human-like intelligence, arXiv preprint arXiv:2006.03226 (2020).
[36] W. Zhang and P. Li, Temporal spike sequence learning via backpropagation for deep spiking neural networks, arXiv preprint arXiv:2002.10085 (2020).
[37] Y. Hao, X. Huang, M. Dong and B. Xu, A biologically plausible supervised learning method for spiking neural networks using the symmetric STDP rule, Neural Networks 121 (2020) 387–395.
[38] M. Mozafari, M. Ganjtabesh, A. Nowzari-Dalini, S. J. Thorpe and T. Masquelier, Bio-inspired digit recognition using reward-modulated spike-timing-dependent plasticity in deep convolutional networks, Pattern Recognition 94 (2019) 87–95.
[39] M. Mozafari, M. Ganjtabesh, A. Nowzari-Dalini and T. Masquelier, SpykeTorch: Efficient simulation of convolutional spiking neural networks with at most one spike per neuron, Frontiers in Neuroscience 13 (jul 2019) 1–12.

[40] R. Vaila, J. Chiasson and V. Saxena, Feature extraction using spiking convolutional neural networks, Proceedings of the International Conference on Neuromorphic Systems - ICONS '19 (ACM Press, New York, 2019), pp. 1–8.
[41] R. Vaila, J. Chiasson and V. Saxena, Deep convolutional spiking neural networks for image classification, arXiv preprint arXiv:1903.12272 (2019).
[42] P. Kirkland, G. Di Caterina, J. Soraghan and G. Matich, SpikeSEG: Spiking segmentation via STDP saliency mapping, International Joint Conference on Neural Networks, 2020.
[43] B. McDanel, S. Teerapittayanon and H. Kung, Embedded binarized neural networks, arXiv preprint arXiv:1709.02260 (2017).
[44] T. Masquelier and S. R. Kheradpisheh, Optimal localist and distributed coding of spatiotemporal spike patterns through STDP and coincidence detection, Frontiers in Computational Neuroscience 12 (2018) p. 74.
[45] A. Yousefzadeh, T. Masquelier, T. Serrano-Gotarredona and B. Linares-Barranco, Hardware implementation of convolutional STDP for on-line visual feature learning, 2017 IEEE International Symposium on Circuits and Systems (ISCAS) (may 2017) 1–4.
[46] G. Orchard, C. Meyer, R. Etienne-Cummings, C. Posch, N. Thakor and R. Benosman, HFirst: A temporal approach to object recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence (2015).
[47] A. Yousefzadeh, T. Serrano-Gotarredona and B. Linares-Barranco, Fast pipeline 128×128 pixel spiking convolution core for event-driven vision processing in FPGAs, 2015 International Conference on Event-based Control, Communication, and Signal Processing (EBCCSP) (IEEE, jun 2015), pp. 1–8.
[48] B. Rueckauer and S.-C. Liu, Conversion of analog to spiking neural networks using sparse temporal coding, 2018 IEEE International Symposium on Circuits and Systems (ISCAS) (IEEE, 2018), pp. 1–5.
[49] S. P, K. T. N. Chu, Y. Tavva, J. Wu, M. Zhang, H. Li, T. E. Carlson et al., You only spike once: Improving energy-efficient neuromorphic inference to ANN-level accuracy, arXiv preprint arXiv:2006.09982 (2020).
[50] J. Göltz, A. Baumbach, S. Billaudelle, A. Kungl, O. Breitwieser, K. Meier, J. Schemmel, L. Kriener and M. Petrovici, Fast and deep neuromorphic learning with first-spike coding, Proceedings of the Neuro-Inspired Computational Elements Workshop, 2020, pp. 1–3.
[51] S. Oh, D. Kwon, G. Yeom, W.-M. Kang, S. Lee, S. Y. Woo, J. S. Kim, M. K. Park and J.-H. Lee, Hardware implementation of spiking neural networks using time-to-first-spike encoding, arXiv preprint arXiv:2006.05033 (2020).
[52] E. Hunsberger and C. Eliasmith, Spiking deep networks with LIF neurons, arXiv preprint arXiv:1510.08829 (2015).
[53] J. H. Lee, T. Delbruck and M. Pfeiffer, Training deep spiking neural networks using backpropagation, Frontiers in Neuroscience 10 (2016) p. 508.
[54] E. O. Neftci, C. Augustine, S. Paul and G. Detorakis, Event-driven random backpropagation: Enabling neuromorphic deep learning machines, Frontiers in Neuroscience 11 (2017) p. 324.
[55] F. Zenke and S. Ganguli, SuperSpike: Supervised learning in multilayer spiking neural networks, Neural Computation 30(6) (2018) 1514–1541.
[56] D. Huh and T. J. Sejnowski, Gradient descent for spiking neural networks, Advances in Neural Information Processing Systems, 2018, pp. 1433–1443.
[57] E. O. Neftci, H. Mostafa and F. Zenke, Surrogate gradient learning in spiking neural networks, arXiv preprint arXiv:1901.09948 (2019).
[58] S. M. Bohte, Error-backpropagation in networks of fractionally predictive spiking neurons, International Conference on Artificial Neural Networks (Springer, 2011), pp. 60–68.
[59] S. K. Esser, P. A. Merolla, J. V. Arthur, A. S. Cassidy, R. Appuswamy, A. Andreopoulos, D. J. Berg, J. L. McKinstry, T. Melano, D. R. Barch, C. di Nolfo, P. Datta, A. Amir, B. Taba, M. D. Flickner and D. S. Modha, Convolutional networks for fast, energy-efficient neuromorphic computing, Proceedings of the National Academy of Sciences of the USA 113(41) (2016) 11441–11446.
[60] S. B. Shrestha and G. Orchard, SLAYER: Spike layer error reassignment in time, Advances in Neural Information Processing Systems, 2018, pp. 1412–1421.

[61] G. Bellec, D. Salaj, A. Subramoney, R. Legenstein and W. Maass, Long short-term memory and learning-to-learn in networks of spiking neurons, Advances in Neural Information Processing Systems, 2018, pp. 787–797.
[62] R. Zimmer, T. Pellegrini, S. F. Singh and T. Masquelier, Technical report: supervised training of convolutional spiking neural networks with PyTorch, arXiv preprint arXiv:1911.10124 (2019).
[63] J. Wu, Y. Chua, M. Zhang, G. Li, H. Li and K. C. Tan, A tandem learning rule for efficient and rapid inference on deep spiking neural networks, arXiv (2019) arXiv–1907.
[64] S. M. Bohte, H. La Poutré and J. N. Kok, Error-backpropagation in temporally encoded networks of spiking neurons, Neurocomputing 48 (2000) 17–37.
[65] S. Zhou, Y. Chen, Q. Ye and J. Li, Direct training based spiking convolutional neural networks for object recognition, arXiv preprint arXiv:1909.10837 (2019).