I. Introduction
Deep neural networks (DNNs) have achieved outstanding performance in diverse areas [1, 2, 3, 4, 5], while the brain appears to use a different network architecture, spiking neural networks (SNNs), to realize various complicated cognitive functions [6, 7, 8]. Compared with existing DNNs, SNNs mainly have two advantages: 1) the spike pattern flowing through an SNN fundamentally codes more spatiotemporal information, while most DNNs, especially the widely used feedforward DNNs, lack timing dynamics; and 2) the event-driven paradigm of SNNs makes them more hardware friendly, and it has been adopted by many neuromorphic platforms [9, 10, 11, 12, 13, 14].
However, training SNNs remains challenging because of their complicated dynamics and the non-differentiable nature of the spike activity. In summary, there exist three kinds of training methods for SNNs: 1) unsupervised learning; 2) indirect supervised learning; 3) direct supervised learning. The first originates from biological synaptic plasticity for weight modification, such as spike timing dependent plasticity (STDP) [15, 16, 17]. Because it only considers local neuronal activities, it is difficult to achieve high performance. The second first trains an ANN, and then transforms it into an SNN version with the same network structure, where the spiking rate of the SNN neurons acts as the analog activity of the ANN neurons
[18, 19, 20, 21]. This is not a bio-plausible way to explore the learning nature of SNNs. The most promising route to high-performance training is the recent direct supervised learning based on gradient descent with error backpropagation. However, such methods only consider the layer-by-layer spatial domain (SD) and ignore the dynamics in the temporal domain (TD) [22, 23]. Therefore, many complicated training skills are required to improve performance [19, 24, 23], such as fixed-amount-proportional reset, lateral inhibition, error normalization, weight/threshold regularization, etc. Thus, a more general dynamic model and learning framework for SNNs are highly required.

In this paper, we propose a direct supervised learning framework for SNNs which combines both the SD and the TD in the training phase. Firstly, we build an iterative LIF model that retains the SNN dynamics but is friendly for gradient descent training. Then we consider both the spatial direction and the temporal direction during the error backpropagation procedure, i.e., spatio-temporal backpropagation (STBP), which significantly improves the network accuracy. Furthermore, we introduce an approximated derivative to address the non-differentiable issue of the spike activity. We test our SNN framework by using fully connected and convolutional architectures on the static MNIST dataset and a custom object detection dataset, as well as the dynamic N-MNIST dataset. Many complicated training skills that are generally required by existing schemes can be avoided because our method makes full use of the spatio-temporal domain (STD) information that captures the nature of SNNs. Experimental results show that our proposed method achieves the best accuracy on both static and dynamic datasets, compared with existing state-of-the-art algorithms. The influence of the TD dynamics and of the different methods for the derivative approximation are systematically analyzed.
This work shall open a way to explore high-performance SNNs for future brain-like computing paradigms with rich STD dynamics.
II. Method and Material
II-A Iterative Leaky Integrate-and-Fire Model in Spiking Neural Networks
Compared with existing deep neural networks, spiking neural networks fundamentally code more spatiotemporal information due to two facts: i) SNNs can have deep architectures like DNNs, and ii) each neuron has its own neuronal dynamics. The former grants SNNs rich spatial domain information, while the latter offers SNNs the power of encoding temporal domain information. However, there is currently no unified framework that supports effective training of SNNs by considering the spatiotemporal dynamics, in the way that backpropagation (BP) does for DNNs. This has hindered the extensive use of SNNs in various applications. In this work, we present a framework based on an iterative leaky integrate-and-fire (LIF) model that enables us to apply spatio-temporal backpropagation for training spiking neural networks.
It is known that LIF is the most widely applied model to describe the neuronal dynamics in SNNs, and it can be simply governed by

\tau \frac{du(t)}{dt} = -u(t) + I(t) \qquad (1)

where u(t) is the neuronal membrane potential at time t, \tau is a time constant, and I(t) denotes the presynaptic input, which is determined by the pre-neuronal activities or external injections and the synaptic weights. When the membrane potential u exceeds a given threshold V_{th}, the neuron fires a spike and resets its potential to u_{reset}. As shown in Figure 1, the forward dataflow of the SNN propagates in the layer-by-layer SD like DNNs, and the self-feedback injection at each neuron node generates non-volatile integration in the TD. In this way, the whole SNN runs with complex STD dynamics and codes spatiotemporal information into the spike pattern. Existing training algorithms only consider either the SD, such as the supervised ones via backpropagation, or the TD, such as the unsupervised ones via timing-based plasticity, which causes a performance bottleneck. Therefore, building a learning framework that makes full use of the STD is fundamental for high-performance SNNs, and this forms the main motivation of this work.
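As a minimal illustration of Eq. (1), the LIF dynamics can be integrated with a forward-Euler step plus the fire-and-reset rule; the function name and all parameter values below are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def simulate_lif(I, tau=10.0, v_th=1.0, u_reset=0.0, dt=1.0):
    """Euler-integrate one LIF neuron (Eq. 1) over an input sequence I."""
    u, potentials, spikes = u_reset, [], []
    for i_t in I:
        u = u + (dt / tau) * (-u + i_t)   # leaky integration step
        if u >= v_th:                     # fire-and-reset rule
            spikes.append(1)
            u = u_reset
        else:
            spikes.append(0)
        potentials.append(u)
    return np.array(potentials), np.array(spikes)
```

With a sufficiently strong constant input the potential repeatedly climbs to the threshold and resets, producing a regular spike train; with zero input it stays at rest.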
However, directly obtaining the analytic solution of the LIF model in (1) makes it inconvenient to train SNNs based on backpropagation, because the whole network presents complex dynamics in both the SD and the TD. To address this issue, the following event-driven iterative update rule

u(t) = u(t_{i-1}) \, e^{\frac{t_{i-1} - t}{\tau}} + I(t) \qquad (2)

can be used to approximate the neuronal potential in (1) based on the last spiking moment t_{i-1} and the presynaptic input I(t). The membrane potential exponentially decays until the neuron receives presynaptic inputs, and a new update round starts once the neuron fires a spike. That is to say, the neuronal states are co-determined by the spatial accumulations of I(t) and the leaky temporal memory of u(t_{i-1}).

As we know, the efficiency of error backpropagation for training DNNs greatly benefits from the iterative representation of gradient descent, which yields the chain rule for layer-by-layer error propagation in the SD backward pass. This motivates us to propose an iterative LIF-based SNN in which the iterations occur in both the SD and the TD as follows:
x_i^{t+1,n} = \sum_{j=1}^{l(n-1)} w_{ij}^{n} \, o_j^{t+1,n-1} \qquad (3)

u_i^{t+1,n} = u_i^{t,n} f(o_i^{t,n}) + x_i^{t+1,n} + b_i^{n} \qquad (4)

o_i^{t+1,n} = g(u_i^{t+1,n}) \qquad (5)

where

f(x) = \tau e^{-\frac{x}{\tau}} \qquad (6)

g(x) = \begin{cases} 1, & x \ge V_{th} \\ 0, & x < V_{th} \end{cases} \qquad (7)

In the above formulas, the upper index t denotes the moment at time t, and n and l(n) denote the n-th layer and the number of neurons in that layer, respectively. w_{ij}^{n} is the synaptic weight from the j-th neuron in the presynaptic layer n-1 to the i-th neuron in the postsynaptic layer n, and o_j^{t,n} \in \{0, 1\} is the neuronal output of the j-th neuron, where o = 1 denotes a spike activity and o = 0 denotes that nothing occurs. x_i^{t+1,n} is a simplified representation of the presynaptic inputs of the i-th neuron, similar to the I(t) in the original LIF model. u_i^{t+1,n} is the neuronal membrane potential of the i-th neuron, and b_i^{n} is a bias parameter related to the threshold V_{th}.
Actually, formulas (4)-(5) are also inspired by the LSTM model [25, 26, 27], using a forget gate f to control the TD memory and an output gate g to fire a spike. The forget gate f controls the leaky extent of the potential memory in the TD, and the output gate g generates a spike activity when it is activated. Specifically, for a small positive time constant \tau, f(x) can be approximated as

f(x) \approx \tau (1 - x) \qquad (8)

since e^{-1/\tau} \to 0 as \tau \to 0^{+} and x \in \{0, 1\}. In this way, the original LIF model is transformed into an iterative version in which the recursive relationship in both the SD and the TD is clearly described, which is friendly for the following gradient descent training in the STD.
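A forward pass of the iterative model in Eqs. (3)-(5) for one fully connected layer can be sketched as follows; the layer sizes, tau and V_th are illustrative assumptions:

```python
import numpy as np

def iterative_lif_layer(o_prev, W, b, tau=0.25, v_th=0.5):
    """o_prev: (T, n_in) binary spikes from layer n-1; returns (T, n_out) spikes."""
    T, n_out = o_prev.shape[0], W.shape[0]
    u = np.zeros(n_out)
    o = np.zeros(n_out)
    out = np.zeros((T, n_out))
    for t in range(T):
        x = W @ o_prev[t]                       # Eq. (3): presynaptic input
        u = u * tau * np.exp(-o / tau) + x + b  # Eq. (4): forget-gated memory f(o)
        o = (u >= v_th).astype(float)           # Eq. (5): output gate g
        out[t] = o
    return out
```

Note how the forget gate behaves: with o = 0 the memory decays by the factor f(0) = tau, while after a spike (o = 1) the factor f(1) = tau·e^{-1/tau} is nearly zero for small tau, which emulates the reset.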
II-B Spatio-Temporal Backpropagation Training
In order to present the STBP training methodology, we define the following loss function L, in which the mean square error over all samples within a given time window T is minimized:

L = \frac{1}{2S} \sum_{s=1}^{S} \left\| \boldsymbol{y}_s - \frac{1}{T} \sum_{t=1}^{T} \boldsymbol{o}_s^{t,N} \right\|_2^{2} \qquad (9)

where \boldsymbol{y}_s and \boldsymbol{o}_s^{t,N} denote the label vector of the s-th training sample and the neuronal output vector of the last layer N at time t, respectively.

By combining equations (3)-(9), it can be seen that L is a function of the weights W and biases b. Thus, obtaining the derivatives of L with respect to W and b is required for the STBP algorithm based on gradient descent. Assume that we have obtained the derivatives of L with respect to o and u at each layer at time t+1, which is an essential step towards the final \partial L / \partial W and \partial L / \partial b. Figure 2 describes the error propagation (on which the derivation depends) in both the SD and the TD at the single-neuron level (Figure 2.a) and the network level (Figure 2.b). At the single-neuron level, the propagation is decomposed into a vertical path in the SD and a horizontal path in the TD. The dataflow of error propagation in the SD is similar to the typical BP for DNNs, i.e., each neuron accumulates the weighted error signals from the upper layer and iteratively updates the parameters in different layers, while the dataflow in the TD shares the same neuronal states, which makes it quite complicated to obtain the analytical solution directly. To solve this problem, we use the proposed iterative LIF model to unfold the state space in both the SD and the TD, so that the states in the TD at different time steps can be distinguished, which enables the chain rule for iterative propagation. A similar idea can be found in the BPTT algorithm for training RNNs [28].
Now, we discuss how to obtain the complete gradient descent based on the following four cases. Firstly, we denote

\delta_i^{t,n} = \frac{\partial L}{\partial o_i^{t,n}} \qquad (10)
Case 1: t = T at the output layer n = N.
In this case, the derivative can be directly obtained since it depends on the loss function in Eq. (9) at the output layer. We have

\delta_i^{T,N} = \frac{\partial L}{\partial o_i^{T,N}} = -\frac{1}{TS}\left(y_i - \frac{1}{T}\sum_{k=1}^{T} o_i^{k,N}\right) \qquad (11)

The derivative with respect to u_i^{T,N} is generated based on

\frac{\partial L}{\partial u_i^{T,N}} = \delta_i^{T,N} \, \frac{\partial o_i^{T,N}}{\partial u_i^{T,N}} \qquad (12)
Case 2: t = T at the layers n < N.
In this case, the derivative iteratively depends on the error propagation in the SD at time T, as in the typical BP algorithm. We have

\delta_i^{T,n} = \frac{\partial L}{\partial o_i^{T,n}} = \sum_{j=1}^{l(n+1)} \frac{\partial L}{\partial u_j^{T,n+1}} \, w_{ji}^{n+1} \qquad (13)

Similarly, the derivative with respect to u_i^{T,n} yields

\frac{\partial L}{\partial u_i^{T,n}} = \delta_i^{T,n} \, \frac{\partial o_i^{T,n}}{\partial u_i^{T,n}} \qquad (14)
Case 3: t < T at the output layer n = N.
In this case, the derivative depends on the error propagation in the TD direction. With the help of the proposed iterative LIF model in Eqs. (3)-(5), by unfolding the state space in the TD we acquire the required derivative based on the chain rule in the TD as follows:

\delta_i^{t,N} = \frac{\partial L}{\partial o_i^{t,N}} = \hat{\delta}_i^{N} + \frac{\partial L}{\partial u_i^{t+1,N}} \, \frac{\partial u_i^{t+1,N}}{\partial o_i^{t,N}} \qquad (15)

\frac{\partial u_i^{t+1,N}}{\partial o_i^{t,N}} = u_i^{t,N} f'(o_i^{t,N}) \qquad (16)

\frac{\partial L}{\partial u_i^{t,N}} = \delta_i^{t,N} \, \frac{\partial o_i^{t,N}}{\partial u_i^{t,N}} + \frac{\partial L}{\partial u_i^{t+1,N}} f(o_i^{t,N}) \qquad (17)

where \hat{\delta}_i^{N} denotes the direct derivative of the loss with respect to o_i^{t,N}, which takes the same form as in Eq. (11).
Case 4: t < T at the layers n < N.
In this case, the derivative depends on the error propagation in both the SD and the TD. On one side, each neuron accumulates the weighted error signals from the upper layer in the SD, as in Case 2; on the other side, each neuron also receives the propagated error from the self-feedback dynamics in the TD by iteratively unfolding the state space based on the chain rule, as in Case 3. So we have

\delta_i^{t,n} = \frac{\partial L}{\partial o_i^{t,n}} = \delta_{SD,i}^{t,n} + \delta_{TD,i}^{t,n} \qquad (18)

\delta_{SD,i}^{t,n} = \sum_{j=1}^{l(n+1)} \frac{\partial L}{\partial u_j^{t,n+1}} \, w_{ji}^{n+1} \qquad (19)

\delta_{TD,i}^{t,n} = \frac{\partial L}{\partial u_i^{t+1,n}} \, u_i^{t,n} f'(o_i^{t,n}) \qquad (20)

\frac{\partial L}{\partial u_i^{t,n}} = \delta_i^{t,n} \, \frac{\partial o_i^{t,n}}{\partial u_i^{t,n}} + \frac{\partial L}{\partial u_i^{t+1,n}} f(o_i^{t,n}) \qquad (21)
Based on the four cases, the error propagation procedure (depending on the above derivatives) is shown in Figure 2. At the single-neuron level (Figure 2.a), the propagation is decomposed into the vertical path of the SD and the horizontal path of the TD. At the network level (Figure 2.b), the dataflow of error propagation in the SD is similar to the typical BP for DNNs, i.e., each neuron accumulates the weighted error signals from the upper layer and iteratively updates the parameters in different layers; in the TD, the neuronal states are unfolded iteratively in the timing direction, which enables chain-rule propagation. Finally, we obtain the derivatives with respect to W and b as follows:

\frac{\partial L}{\partial w_{ij}^{n}} = \sum_{t=1}^{T} \frac{\partial L}{\partial u_i^{t,n}} \, o_j^{t,n-1} \qquad (22)

\frac{\partial L}{\partial b_i^{n}} = \sum_{t=1}^{T} \frac{\partial L}{\partial u_i^{t,n}} \qquad (23)
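To make the temporal recursion concrete, the following sketch implements the backward pass through time for the output layer (Cases 1 and 3) in NumPy, using a rectangular surrogate in place of the non-differentiable ∂o/∂u (see Section II-C). The function names, shapes, and parameter values are our own illustrative assumptions, not a reference implementation:

```python
import numpy as np

def rect_surrogate(u, v_th=0.5, a=1.0):
    """Rectangular approximation of dg/du (cf. Eq. 24)."""
    return (np.abs(u - v_th) <= a / 2) / a

def backward_output_layer(us, os, y, tau=0.25, v_th=0.5):
    """Backward recursion in the TD for the output layer (single sample, S = 1).

    us, os: (T, n) membrane potentials and spikes stored in the forward pass;
    y: (n,) label vector. Returns dL/du for every time step.
    """
    T, n = us.shape
    d_loss = -(y - os.mean(axis=0)) / T            # loss term, same for all t
    dL_du = np.zeros((T, n))
    # Case 1: t = T (last step), only the loss path exists.
    dL_du[T - 1] = d_loss * rect_surrogate(us[T - 1], v_th)
    # Case 3: t < T, add the temporal path through the forget gate f(x) = tau*exp(-x/tau).
    for t in range(T - 2, -1, -1):
        f = tau * np.exp(-os[t] / tau)             # forget-gate value f(o^t)
        df = -np.exp(-os[t] / tau)                 # f'(o^t)
        delta = d_loss + dL_du[t + 1] * us[t] * df # delta via the chain rule in the TD
        dL_du[t] = delta * rect_surrogate(us[t], v_th) + dL_du[t + 1] * f
    return dL_du
```

A quick consistency check: if the network's firing rate already matches the target exactly, the loss term vanishes and the whole recursion yields zero gradients.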
II-C Derivative Approximation of the Non-differentiable Spike Activity
In the previous sections, we have presented how to obtain the gradient information based on STBP, but the issue of non-differentiable points at each spiking time is yet to be addressed. Actually, the derivative of the output gate, \partial o / \partial u = g'(u), is required for the STBP training of Eqs. (11)-(22). Theoretically, g'(u) is a non-differentiable Dirac function of u, which greatly challenges the effective learning of SNNs [23]: it has zero value everywhere except an infinite value at u = V_{th}, which causes gradient vanishing or exploding and disables the error propagation. One existing method viewed the discontinuous points of the potential at spiking times as noise and claimed this to be beneficial for model robustness [29, 23], but it did not directly address the non-differentiability of the spike activity. To this end, we introduce four curves to approximate the derivative of the spike activity, denoted by h_1, h_2, h_3 and h_4 in Figure 3.b:
h_1(u) = \frac{1}{a_1} \, \mathrm{sign}\!\left(|u - V_{th}| < \frac{a_1}{2}\right) \qquad (24)

h_2(u) = \max\!\left(0, \; \frac{\sqrt{a_2}}{2} - \frac{a_2}{4}\,|u - V_{th}|\right) \qquad (25)

h_3(u) = \frac{1}{a_3} \, \frac{e^{\frac{V_{th} - u}{a_3}}}{\left(1 + e^{\frac{V_{th} - u}{a_3}}\right)^{2}} \qquad (26)

h_4(u) = \frac{1}{\sqrt{2\pi a_4}} \, e^{-\frac{(u - V_{th})^{2}}{2 a_4}} \qquad (27)

where a_i (i = 1, 2, 3, 4) determines the curve shape and steepness. In fact, h_1, h_2, h_3 and h_4 are the derivatives of the rectangular function, the polynomial function, the sigmoid function and the Gaussian cumulative distribution function, respectively. To be consistent with the Dirac function, we introduce the coefficients to ensure that the integral of each function is 1. Obviously, it can be proven that all the above candidates satisfy

\lim_{a_i \to 0^{+}} h_i(u) = \frac{dg}{du} = \delta(u - V_{th}) \qquad (28)
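The four approximation curves of Eqs. (24)-(27) can be written down and checked numerically for unit area; the width parameters a_i = 1 and the threshold V_th = 1 below are illustrative, and h_2 is written with max(0, ·) to make its triangular support explicit:

```python
import numpy as np

def h1(u, v_th=1.0, a=1.0):
    """Rectangular window of width a and height 1/a."""
    return (np.abs(u - v_th) <= a / 2) / a

def h2(u, v_th=1.0, a=1.0):
    """Triangular (polynomial) curve: peak sqrt(a)/2, zero at |u - v_th| = 2/sqrt(a)."""
    return np.maximum(0.0, np.sqrt(a) / 2 - a / 4 * np.abs(u - v_th))

def h3(u, v_th=1.0, a=1.0):
    """Derivative of the sigmoid with temperature a."""
    s = 1.0 / (1.0 + np.exp(-(u - v_th) / a))
    return s * (1 - s) / a

def h4(u, v_th=1.0, a=1.0):
    """Gaussian density with variance a (derivative of the Gaussian CDF)."""
    return np.exp(-(u - v_th) ** 2 / (2 * a)) / np.sqrt(2 * np.pi * a)
```

Each curve integrates to one, so as a_i shrinks it concentrates around V_th and approaches the Dirac delta of Eq. (28).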
III. Results
III-A Parameter Initialization
The initialization of parameters, such as the weights and thresholds, is crucial for stabilizing the firing activities of the whole network. We should ensure a timely response to presynaptic stimuli while avoiding too many spikes, which would reduce the neuronal selectivity. The multiply-accumulate operations over the pre-spikes and weights and the threshold comparison are the two key steps of the computation in the forward pass, which indicates that the relative magnitude of the weights and thresholds determines the effectiveness of the parameter initialization. In this paper, we fix the threshold to a constant in each neuron for simplicity, and only adjust the weights to control the activity balance. Firstly, we initialize all the weight parameters by sampling from the uniform distribution

w_{ij}^{n} \sim U(-1, 1) \qquad (30)

Then, we normalize these parameters by

w_{ij}^{n} \leftarrow \frac{w_{ij}^{n}}{\sqrt{l(n-1)}} \qquad (31)
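A sketch of this two-step initialization follows. Note that the normalization used here (uniform sampling followed by a 1/sqrt(fan-in) rescaling) is our own assumption of a scaling that keeps the presynaptic sum comparable to the fixed threshold, intended only as an illustration:

```python
import numpy as np

def init_weights(n_in, n_out, rng):
    """Sample weights uniformly, then rescale by the fan-in (assumed scheme)."""
    W = rng.uniform(-1.0, 1.0, size=(n_out, n_in))  # uniform sampling step
    return W / np.sqrt(n_in)                        # fan-in rescaling (assumption)
```

With n_in presynaptic neurons firing binary spikes, this keeps the standard deviation of the weighted input sum roughly independent of the layer width.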
The settings of the other parameters are presented in Table I. Furthermore, throughout all the simulations in our work, complex skills as in [19, 23] are no longer required, such as the fixed-amount-proportional reset, error normalization, weight/threshold regularization, etc.
Network parameter | Description | Value
T | Time window | 30 ms
V_th | Threshold (MNIST / object detection dataset / N-MNIST) | 1.5 / 2.0 / 0.2
τ | Decay factor (MNIST / object detection dataset / N-MNIST) | 0.1 ms / 0.15 ms / 0.2 ms
a_1, a_2, a_3, a_4 | Derivative approximation parameters (Figure 3) | 1.0
dt | Simulation time step | 1 ms
r | Learning rate (SGD) | 0.5
β_1, β_2, ε | Adam parameters | 0.9, 0.999, 1
III-B Dataset Experiments
We test our SNN model and the STBP training method on various datasets, including the static MNIST dataset and a custom object detection dataset, as well as the dynamic N-MNIST dataset. The input of the first layer should be a spike train, which requires us to convert samples from the static datasets into spike events. To this end, Bernoulli sampling from the original pixel intensity to the spike rate is used in this paper.
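The Bernoulli conversion from pixel intensities to input spike trains can be sketched as follows; the function name and the assumption that intensities are pre-normalized to [0, 1] are ours:

```python
import numpy as np

def bernoulli_encode(image, T, rng):
    """image: float array in [0, 1]; returns (T, *image.shape) binary spike trains.

    At each time step a pixel emits a spike with probability equal to its
    normalized intensity, so the firing rate approximates the pixel value.
    """
    p = np.broadcast_to(image, (T, *image.shape))
    return (rng.random(p.shape) < p).astype(np.float32)
```

Over a long window, the empirical firing rate of each input neuron converges to its pixel intensity: an all-white pixel fires on every step, a black one never fires.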
III-B1 Spatio-temporal fully connected neural network
Static Dataset. The MNIST dataset of handwritten digits [30] (Figure 4.b) and a custom dataset for object detection [14] (Figure 4.a) are chosen to test our method.
MNIST comprises a training set of 60,000 labelled handwritten digits and a testing set of 10,000 labelled digits, generated from the postal codes 0-9. Each digit sample is a 28×28 grayscale image. The object detection dataset is a two-category image dataset created by our lab for pedestrian detection. It includes 1,509 training samples and 631 testing samples of 28×28 grayscale images. Depending on whether there is a pedestrian, an image sample is labelled 0 or 1, as illustrated in Figure 4.a. The upper and lower subfigures in Figure 4.c show the spike patterns of 25 input neurons converted from the center patch of 5×5 pixels of a sample from the object detection dataset and from MNIST, respectively. Figure 4.d illustrates the spike pattern of the output layer within 15 ms before and after STBP training over the stimulus of digit 9. At the beginning, the neurons in the output layer fire randomly, while after training the 10th neuron, which codes digit 9, fires most intensively, indicating that correct inference is achieved.
Model | Network structure | Training skills | Accuracy
Spiking RBM (STDP) [31] | 784-500-40 | None | 93.16%
Spiking RBM (pre-training*) [20] | 784-500-500-10 | None | 97.48%
Spiking MLP (pre-training*) [19] | 784-1200-1200-10 | Weight normalization | 98.64%
Spiking MLP (BP) [22] | 784-200-200-10 | None | 97.66%
Spiking MLP (STDP) [15] | 784-6400 | None | 95.00%
Spiking MLP (BP) [23] | 784-800-10 | — | 98.71%
Spiking MLP (STBP) | 784-800-10 | None | 98.89%

We mainly compare with methods that have a similar network architecture; * means that the model is based on a pre-trained ANN model.
Table II compares our method with several other advanced results that use a similar MLP architecture on MNIST. Although we do not use any complex skill, the proposed STBP training method outperforms all the reported results, achieving the best testing accuracy of 98.89%. Table III compares our model with a typical MLP on the object detection dataset. The contrast model is a typical artificial neural network (ANN), i.e., not an SNN, and in the following we use 'non-spiking network' to distinguish them. It can be seen that our model achieves better performance than the non-spiking MLP. Note that the overall firing rate of the input spike trains from the object detection dataset is higher than that from the MNIST dataset, so we increase its threshold to 2.0 in the simulation experiments.
Model | Network structure | Accuracy (mean) | Accuracy (interval)
Non-spiking MLP (BP) | 784-400-10 | 98.31% | [97.62%, 98.57%]
Spiking MLP (STBP) | 784-400-10 | 98.34% | [97.94%, 98.57%]

* results with epochs [201, 210].
Dynamic Dataset. Compared with static datasets, a dynamic dataset such as N-MNIST [32] contains richer temporal features and is therefore more suitable for exploiting an SNN's potential. We use the N-MNIST dataset as an example to evaluate the capability of our STBP method on dynamic data. N-MNIST converts the static MNIST dataset into a dynamic version of spike trains by using a dynamic vision sensor (DVS) [33]. For each original sample from MNIST, the work [32] moves the DVS along the three sides of an isosceles triangle in turn (Figure 5.b) and collects the generated spike train, which is triggered by the intensity change at each pixel. Figure 5.a records the saccade results on digit 0, where each subgraph records the spike train within 10 ms and each 100 ms represents one saccade period. Due to the two possible change directions of each pixel's intensity (brighter or darker), the DVS captures two kinds of spike events, denoted on-events and off-events, respectively (Figure 5.c). Since N-MNIST allows relative shifts of the image during the saccade process, it produces a 34×34 pixel range. From the spatio-temporal representation in Figure 5.c, we can see that the on-events and off-events are so different that we use two channels to distinguish them. Therefore, the network structure is 34×34×2-400-400-10.

Model | Network structure | Training skills | Accuracy
Non-spiking CNN (BP) [24] | — | None | 95.30%
Non-spiking CNN (BP) [34] | — | None | 98.30%
Non-spiking MLP (BP) [23] | 800-10 | None | 97.80%
LSTM (BPTT) [24] | — | Batch normalization | 97.05%
Phased-LSTM (BPTT) [24] | — | None | 97.38%
Spiking CNN (pre-training*) [34] | — | None | 95.72%
Spiking MLP (BP) [23] | 800-10 | — | 98.74%
Spiking MLP (BP) [35] | 10000-10 | None | 92.87%
Spiking MLP (STBP) | 800-10 | None | 98.78%

We only show the network structure of the MLP-based models; for the other network structures refer to the above references. * means that the model is based on a pre-trained ANN model.
Table IV compares our STBP method with some state-of-the-art results on the N-MNIST dataset. The upper five results are based on ANNs, and the lower four results, including our method, use SNNs. The ANN methods usually adopt a frame-based approach, which collects the spike events in a time interval to form a frame image, and then uses conventional image classification algorithms to train the networks. Since the transformed images are often blurred, the frame-based preprocessing harms model performance and abandons the hardware-friendly event-driven paradigm. As can be seen from Table IV, the ANN models generally perform worse than the SNN models. In contrast, SNNs can naturally handle event-stream patterns, and by making better use of the spatio-temporal features of event streams, our proposed STBP method achieves the best accuracy of 98.78% compared with all the reported ANN and SNN methods. The greatest advantage of our method is that we did not use any complex training skills, which is beneficial for future hardware implementation.
III-B2 Spatio-temporal convolution neural network
Extending our framework to the convolutional neural network structure allows the network to go deeper and grants it more powerful SD information. Here we use our framework to establish a spatio-temporal convolutional neural network. Compared with our spatio-temporal fully connected network, the main difference is the processing of the input image, where we use convolution in place of the weighted summation. Specifically, in the convolution layer, each convolutional neuron receives the convolved input and updates its state according to the LIF model. In the pooling layer, because the binary coding of SNNs is inappropriate for the standard max pooling, we use average pooling instead.
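The average pooling used in place of max pooling can be sketched for a single 2D spike map; the 2×2 window and stride are illustrative assumptions:

```python
import numpy as np

def avg_pool2x2(spikes):
    """spikes: (H, W) binary spike map with even H, W; returns (H/2, W/2) means.

    Each output is the mean spike activity of a non-overlapping 2x2 window,
    so pooled values lie in {0, 0.25, 0.5, 0.75, 1} for binary inputs.
    """
    H, W = spikes.shape
    return spikes.reshape(H // 2, 2, W // 2, 2).mean(axis=(1, 3))
```

Unlike max pooling, which would saturate to 1 whenever any neuron in the window spikes, the averaged value preserves graded activity information for the next layer.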
Model | Network structure | Accuracy
Spiking CNN (pre-training*) [13] | 28×28×1-12C5-P2-64C5-P2-10 | 99.12%
Spiking CNN (BP) [23] | 28×28×1-20C5-P2-50C5-P2-200-10 | 99.31%
Spiking CNN (STBP) | 28×28×1-15C5-P2-40C5-P2-300-10 | 99.42%

We mainly compare with methods that have a similar network architecture; * means that the model is based on a pre-trained ANN model.
Model | Network structure | Accuracy (mean) | Accuracy (interval)
Non-spiking CNN (BP) | 6C3-300-10 | 98.57% | [98.57%, 98.57%]
Spiking CNN (STBP) | 6C3-300-10 | 98.59% | [98.26%, 98.89%]

* results with epochs [201, 210].
Our spiking CNN model is also tested on the MNIST dataset as well as the object detection dataset. For MNIST, our network contains two convolution layers with 5×5 kernels and two average pooling layers arranged alternately, followed by one fully connected hidden layer. Like traditional CNNs, we use elastic distortion [36] to preprocess the dataset. Table V records the state-of-the-art performance of spiking convolutional neural networks on the MNIST dataset. Our proposed spiking CNN model obtains 99.42% accuracy, which outperforms the other reported spiking networks with a slightly lighter structure. Furthermore, we evaluate the proposed model on the custom object detection dataset with the network structure shown in Table VI. The testing accuracy is reported after training 200 epochs. Table VI indicates that our spiking CNN model achieves competitive performance with the non-spiking CNN.
III-C Performance Analysis
III-C1 The Impact of Derivative Approximation Curves
In Section II-C, we introduced different curves to approximate the ideal derivative of the spike activity. Here we analyze the influence of the different approximation curves on the testing accuracy. The experiments are conducted on the MNIST dataset with the network structure 784-400-10, and the testing accuracy is reported after training 200 epochs. Firstly, we compare the impact of different curve shapes on model performance. In our simulation we use the aforementioned h_1, h_2, h_3 and h_4 shown in Figure 3.b. Figure 6.a illustrates the results of the approximations with different shapes. We observe that the different nonlinear curves h_1-h_4 only present small variations in performance.
Furthermore, we use the rectangular approximation h_1 as an example to explore the impact of its width on the results. We vary the parameter a_1, and the corresponding results are plotted in Figure 6.b, where different colors denote different a_1 values. Both too large and too small values cause worse performance, and in our simulation an intermediate value achieves the highest testing accuracy, which implies that the width and steepness of the rectangle influence the model performance. Combining Figures 6.a and 6.b indicates that the key point for approximating the derivative of the spike activity is to capture its nonlinear nature, while the specific shape is less critical.
III-C2 The Impact of the Temporal Domain
A major contribution of this work is introducing the temporal domain into the existing spatial-domain-based BP training method, which makes full use of the spatio-temporal dynamics of SNNs and enables high-performance training. We now quantitatively analyze the impact of the TD term. The experiment configuration is the same as in the previous section (network structure 784-400-10), and we report the testing results after training 200 epochs. Here, the existing BP restricted to the SD is termed SDBP.
Model | Dataset | Network structure | Training skills | Accuracy (mean) | Accuracy (interval)
Spiking MLP (SDBP) | Object detection | 784-400-10 | None | 97.11% | [96.04%, 97.78%]
Spiking MLP (SDBP) | MNIST | 784-400-10 | None | 98.29% | [98.23%, 98.39%]
Spiking MLP (STBP) | Object detection | 784-400-10 | None | 98.32% | [97.94%, 98.57%]
Spiking MLP (STBP) | MNIST | 784-400-10 | None | 98.48% | [98.42%, 98.51%]

* results with epochs [201, 210].
Table VII records the simulation results. The testing accuracy of SDBP is lower than that of STBP on the different datasets, which shows that the timing information is beneficial for model performance. Specifically, compared with STBP, SDBP loses 1.21% accuracy on the object detection dataset, about five times larger than the loss on MNIST. The results also imply that the performance of SDBP is not stable enough. Besides the interference of the dataset itself, the reason for this variation may be the instability of SNN training. Actually, the training of SNNs relies heavily on the parameter initialization, which remains a great challenge for SNN applications. In many reported works, researchers leverage special skills or mechanisms to improve training performance, such as lateral inhibition, regularization, normalization, etc. In contrast, by using our STBP training method, much higher performance can be achieved on the same network: the testing accuracy of STBP reaches 98.48% on MNIST and 98.32% on the object detection dataset, without any complex training skills. This stability and robustness indicate that the dynamics in the TD fundamentally contain great potential for SNN computing, and this work indeed provides a new idea.
IV. Conclusion
In this work, a unified framework that allows supervised training of spiking neural networks, just like implementing backpropagation in deep neural networks (DNNs), has been built by exploiting the spatio-temporal information in the networks. Our major contributions are summarized as follows:

We have presented a framework based on an iterative leaky integrate-and-fire model, which enables us to implement spatio-temporal backpropagation on SNNs. Unlike previous methods that primarily focus on spatial domain features, our framework combines and exploits the features of SNNs in both the spatial and temporal domains;

We have designed the STBP training algorithm and implemented it on both MLP and CNN architectures. STBP has been verified on both static and dynamic datasets. Results show that our model is superior to state-of-the-art SNNs on relatively small-scale spiking MLP and CNN networks, and outperforms DNNs of the same network size on the dynamic N-MNIST dataset. An attractive advantage of our algorithm is that it does not need the extra training techniques generally required by existing schemes, and is easier to implement in large-scale networks. The results also reveal that using spatio-temporal complexity to solve problems can better fulfill the potential of SNNs;

We have introduced an approximated derivative to address the non-differentiable issue of the spike activity. Controlled experiments indicate that the steepness and width of the approximation curve affect the model's performance, and that the key point for the approximation is to capture the nonlinear nature of the derivative, while its specific shape is less critical.
Because the brain combines complexity in the temporal and spatial domains to handle input information, we would also like to claim that implementing STBP on SNNs is more bio-plausible than applying BP on DNNs. The fact that STBP does not rely on many training skills makes it more hardware friendly and useful for the design of neuromorphic chips with online learning ability. Regarding future research topics, we believe two issues are quite necessary and important. One is to apply our framework to more problems with timing characteristics, such as dynamic data processing, video stream identification and speech recognition. The other is how to accelerate the supervised training of large-scale SNNs on GPUs/CPUs or neuromorphic chips. The former aims to further exploit the rich spatio-temporal features of SNNs to deal with dynamic problems, while the latter may greatly promote the application of large-scale SNNs in real-life scenarios.
References

[1]
P. Chaudhari and H. Agarwal,
Progressive Review Towards Deep Learning Techniques
. Springer Singapore, 2017.  [2] L. Deng and D. Yu, “Deep learning: Methods and applications,” Foundations and Trends in Signal Processing, vol. 7, no. 3, pp. 197–387, 2014.

[3]
Jia, Yangqing, Shelhamer, Evan, Donahue, Jeff, Karayev, Sergey, Long, and Jonathan, “Caffe: Convolutional architecture for fast feature embedding,”
Eprint Arxiv, pp. 675–678, 2014.  [4] G. Hinton, L. Deng, D. Yu, and G. E. Dahl, “Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups,” IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 82–97, 2012.
 [5] K. He, X. Zhang, S. Ren, and J. Sun, Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition. Springer International Publishing, 2014.
 [6] X. Zhang, Z. Xu, C. Henriquez, and S. Ferrari, “Spikebased indirect training of a spiking neural networkcontrolled virtual insect,” in Decision and Control (CDC), 2013 IEEE 52nd Annual Conference on. IEEE, 2013, pp. 6798–6805.
 [7] J. N. Allen, H. S. AbdelAtyZohdy, and R. L. Ewing, “Cognitive processing using spiking neural networks,” in IEEE 2009 National Aerospace and Electronics Conference, 2009, pp. 56–64.
 [8] N. Kasabov and E. Capecci, “Spiking neural network methodology for modelling, classification and understanding of eeg spatiotemporal data measuring cognitive processes,” Information Sciences, vol. 294, no. C, pp. 565–575, 2015.
 [9] B. V. Benjamin, P. Gao, E. Mcquinn, S. Choudhary, A. R. Chandrasekaran, J. M. Bussat, R. AlvarezIcaza, J. V. Arthur, P. A. Merolla, and K. Boahen, “Neurogrid: A mixedanalogdigital multichip system for largescale neural simulations,” Proceedings of the IEEE, vol. 102, no. 5, pp. 699–716, 2014.
 [10] P. A. Merolla, J. V. Arthur, R. Alvarezicaza, A. S. Cassidy, J. Sawada, F. Akopyan, B. L. Jackson, N. Imam, C. Guo, and Y. Nakamura, “Artificial brains. a million spikingneuron integrated circuit with a scalable communication network and interface.” Science, vol. 345, no. 6197, pp. 668–73, 2014.
 [11] S. B. Furber, F. Galluppi, S. Temple, and L. A. Plana, “The spinnaker project,” Proceedings of the IEEE, vol. 102, no. 5, pp. 652–665, 2014.
 [12] T. Hwu, J. Isbell, N. Oros, and J. Krichmar, “A selfdriving robot using deep convolutional neural networks on neuromorphic hardware,” arXiv.org, 2016.
 [13] S. K. Esser, P. A. Merolla, J. V. Arthur, A. S. Cassidy, R. Appuswamy, A. Andreopoulos, D. J. Berg, J. L. Mckinstry, T. Melano, and D. R. Barch, “Convolutional networks for fast, energyefficient neuromorphic computing,” Proceedings of the National Academy of Sciences of the United States of America, vol. 113, no. 41, p. 11441, 2016.
 [14] S. S. Zhang, L.P. Shi, “Creating more intelligent robots through braininspired computing,” Science(suppl), vol. 354, 2016.
 [15] P. U. Diehl and M. Cook, “Unsupervised learning of digit recognition using spike-timing-dependent plasticity,” Frontiers in Computational Neuroscience, vol. 9, p. 99, 2015.
 [16] D. Querlioz, O. Bichler, P. Dollfus, and C. Gamrat, “Immunity to device variations in a spiking neural network with memristive nanodevices,” IEEE Transactions on Nanotechnology, vol. 12, no. 3, pp. 288–295, 2013.
 [17] S. R. Kheradpisheh, M. Ganjtabesh, and T. Masquelier, “Bio-inspired unsupervised learning of visual features leads to robust invariant object recognition,” Neurocomputing, vol. 205, no. C, pp. 382–392, 2016.
 [18] J. A. Pérez-Carrasco, B. Zhao, C. Serrano, B. Acha, T. Serrano-Gotarredona, S. Chen, and B. Linares-Barranco, “Mapping from frame-driven to frame-free event-driven vision systems by low-rate rate coding and coincidence processing. Application to feedforward ConvNets,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 11, pp. 2706–2719, 2013.
 [19] P. U. Diehl, D. Neil, J. Binas, and M. Cook, “Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing,” in International Joint Conference on Neural Networks (IJCNN), 2015, pp. 1–8.
 [20] P. O’Connor, D. Neil, S.-C. Liu, T. Delbruck, and M. Pfeiffer, “Real-time classification and sensor fusion with a spiking deep belief network,” Frontiers in Neuroscience, vol. 7, p. 178, 2013.
 [21] E. Hunsberger and C. Eliasmith, “Spiking deep networks with LIF neurons,” arXiv preprint, 2015.
 [22] P. O’Connor and M. Welling, “Deep spiking networks,” arXiv preprint, 2016.
 [23] J. H. Lee, T. Delbruck, and M. Pfeiffer, “Training deep spiking neural networks using backpropagation,” Frontiers in Neuroscience, vol. 10, 2016.
 [24] D. Neil, M. Pfeiffer, and S. C. Liu, “Phased LSTM: Accelerating recurrent network training for long or event-based sequences,” arXiv preprint, 2016.
 [25] F. A. Gers, J. Schmidhuber, and F. Cummins, “Learning to forget: continual prediction with LSTM,” Neural Computation, vol. 12, no. 10, pp. 2451–2471, 1999.
 [26] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.
 [27] J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, “Gated feedback recurrent neural networks,” in International Conference on Machine Learning, 2015, pp. 2067–2075.
 [28] P. J. Werbos, “Backpropagation through time: what it does and how to do it,” Proceedings of the IEEE, vol. 78, no. 10, pp. 1550–1560, 1990.
 [29] Y. Bengio, T. Mesnard, A. Fischer, S. Zhang, and Y. Wu, “An objective function for STDP,” arXiv preprint, 2015.
 [30] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
 [31] E. Neftci, S. Das, B. Pedroni, K. Kreutz-Delgado, and G. Cauwenberghs, “Event-driven contrastive divergence for spiking neuromorphic systems,” Frontiers in Neuroscience, vol. 7, p. 272, 2013.
 [32] G. Orchard, A. Jayawant, G. K. Cohen, and N. Thakor, “Converting static image datasets to spiking neuromorphic datasets using saccades,” Frontiers in Neuroscience, vol. 9, 2015.
 [33] P. Lichtsteiner, C. Posch, and T. Delbruck, “A 128×128 120 dB 15 μs latency asynchronous temporal contrast vision sensor,” IEEE Journal of Solid-State Circuits, vol. 43, no. 2, pp. 566–576, 2007.
 [34] D. Neil and S. C. Liu, “Effective sensor fusion with event-based sensors and deep network architectures,” in IEEE International Symposium on Circuits and Systems (ISCAS), 2016.
 [35] G. K. Cohen, G. Orchard, S. H. Leng, J. Tapson, R. B. Benosman, and A. van Schaik, “Skimming digits: Neuromorphic classification of spike-encoded images,” Frontiers in Neuroscience, vol. 10, no. 184, 2016.
 [36] P. Y. Simard, D. Steinkraus, and J. C. Platt, “Best practices for convolutional neural networks applied to visual document analysis,” in International Conference on Document Analysis and Recognition, 2003, p. 958.