Multivariate Time Series Classification Using Spiking Neural Networks

07/07/2020
by Haowen Fang, et al.
Syracuse University

There is an increasing demand to process streams of temporal data in energy-limited scenarios such as embedded devices, driven by the advancement and expansion of the Internet of Things (IoT) and Cyber-Physical Systems (CPS). Spiking neural networks (SNNs) have drawn attention because they enable low power consumption by encoding and processing information as sparse spike events, which can be exploited for event-driven computation. Recent works also show SNNs' capability to process spatial temporal information. Such advantages can be exploited by power-limited devices to process real-time sensor data. However, most existing SNN training algorithms focus on vision tasks, and temporal credit assignment is not addressed. Furthermore, the widely adopted rate encoding ignores temporal information, hence it is not suitable for representing time series. In this work, we present an encoding scheme to convert time series into sparse spatial temporal spike patterns. A training algorithm to classify spatial temporal patterns is also proposed. The proposed approach is evaluated on multiple time series datasets in the UCR repository and achieves performance comparable to deep neural networks.


I Introduction

The function and behavior of Spiking Neural Networks (SNNs) are derived from their inspiration, the brain, which is capable of performing numerous cognitive tasks with minimal energy requirements. The potential of SNNs has drawn research interest in emerging devices, algorithms and applications [30, 23, 17, 29]. In general, the brain's capability comes from the complex dynamics of networks of spiking neurons and the plastic synapses connecting them. These dynamics can capture complex spatial temporal features of input encoded as sparse temporal spiking activity. Despite the biological inspiration, the majority of existing SNN models are unable to replicate such dynamics to encode, learn and decode temporal information.

The limitations of existing SNNs are multifold. Firstly, most SNN models and training algorithms consider only the statistics of spike activities. A numerical value is represented by the spike count in a time window [9]. Though this type of SNN has demonstrated state-of-the-art performance in various tasks [18], it suffers from high spike activity [16]. Thus, it cannot fully benefit from event-driven computation. Secondly, directly adapting backpropagation is not feasible because a spiking neuron's output is a Dirac delta function. One approach to address this problem is to train an artificial neural network (ANN) such as a multi-layer perceptron (MLP) and map the weights to an SNN. However, this suffers from accuracy degradation, and additional fine-tuning of weights and thresholds is required to minimize the performance penalty [4]. Recently, gradient surrogates have been proposed to approximate the gradient of the spiking function, enabling backpropagation [15, 6, 24, 26, 25, 8]. [6] derived a cumulative error function as a gradient surrogate. [26] derived a simplified model from the Leaky Integrate and Fire (LIF) neuron and proposed four functions as gradient surrogates. Other approaches replace the hard threshold function by a differentiable soft spike [20, 14]. However, this compromises the SNN's most distinct feature, the binary spike.

While most SNN models and training algorithms use spike counts to represent numerical values, it is observed that in biological neural networks the temporal structure of a spike train also conveys information [3]. Two spike trains with the same spike rate can have distinct temporal patterns, hence the represented information is different. Such temporal encoding can efficiently encode information using extremely sparse spike events [16]. There are some existing works that train neurons to detect temporal spike patterns. Tempotron [12] trains an individual neuron to perform binary classification of spatial temporal input spike patterns: the neuron generates at least one spike for a positive pattern and remains inactive for other patterns. Based on Tempotron, [13] proposed an algorithm to adjust synaptic weights such that a neuron generates a desired number of spikes for a specific input pattern. SPAN [19] trains an individual neuron to associate a spatial temporal input pattern with a specific output spike pattern. However, these works aim at training individual neurons and cannot be extended to multiple layers, therefore their performance is limited. There are also recent works that utilize backpropagation through time (BPTT) to address the temporal dependency problem. [25] proposed a training rule to reassign errors in time. [11] proposed a novel loss function and derived an iterative model from Tempotron; based on the iterative model, the network can be unrolled, hence BPTT is possible. [31] captures the temporal dependency of membrane potentials and uses the membrane potential as the objective function to learn temporal patterns.

Existing works have achieved performance comparable to DNNs in vision tasks such as static image or event-based data classification; however, few SNN models address time series classification tasks. The first challenge is the limitation of rate coding: since it treats spikes statistically, it cannot represent temporal information. Though it is possible to flatten a time series into a 1-D array and then represent it by rate coding, this increases the input size. In addition, for real-time applications, flattening the input requires buffering, which increases computation latency. Secondly, unlike images, such as MNIST images, where the range, precision and scale of all values are identical, multivariate time series may be collected from different sensors, so their precision and range may vary. Rate coding has to guarantee the precision of the highest-resolution input. Such a "design for the worst case" coding scheme lacks flexibility and hence is not efficient. New coding methods to represent time series are required to exploit the potential of SNNs.

In this work, our contributions are summarized as follows:

  • We present a coding method that can convert time series into sparse spatial temporal spike patterns.

  • We derive an iterative SNN model from the Spike Response Model, such that Backpropagation Through Time (BPTT) is possible. An event-driven updating algorithm is also proposed to reduce the computation overhead of inference.

  • We formulate a backpropagation rule for the iterative SNN model and propose a training algorithm to train the model on spatial temporal patterns.

  • We evaluate the proposed method on multiple datasets and achieve accuracy comparable to DNNs. To the best of our knowledge, this is the first work applying SNNs to multivariate time series classification.

II SNN Model

Without loss of generality, we adopt a widely used Leaky Integrate and Fire (LIF) neuron defined by the Spike Response Model (SRM) [10, 12]. Each input spike induces a change in the neuron's membrane potential, which is called a postsynaptic potential (PSP):

(1)  $\mathrm{PSP}(t) = K(t - t_i)$

where $t_i$ denotes the arrival time of an input spike and $K(\cdot)$ is the synapse kernel. The neuron accumulates all input PSPs, such that the membrane potential is defined as:

(2)  $v(t) = \sum_{i=1}^{N} w_i \sum_{t_i} K(t - t_i) - \vartheta \sum_{t_s} e^{-(t - t_s)/\tau_m}$

Fig. 1: Spiking neuron model

where $N$ is the number of input synapses, $t_i$ is the arrival time of a spike at input synapse $i$, $\tau_m$ is the membrane time constant of the neuron, and $w_i$ is the weight associated with input synapse $i$. $t_s$ is the time when the neuron generates an output spike, and the rightmost term can be interpreted as a negative voltage applied to the neuron itself such that the membrane potential is decreased by a factor of the threshold voltage $\vartheta$. This serves as the reset mechanism at the time of a spike. Thus, the neuron's potential is the summation of all weighted input PSPs plus the negative voltage given by the rightmost term [10]. The neuron model with $N$ inputs is illustrated in figure 1. The PSP kernel $K$ is defined as:

(3)  $K(t - t_i) = V_0 \left( e^{-(t - t_i)/\tau_m} - e^{-(t - t_i)/\tau_s} \right)$

where $\tau_m$ and $\tau_s$ are two time constants, and $V_0$ is a normalization factor which scales the maximum value of $K$ to 1 [13]. The shape of the PSP kernel is shown in figure 2.

Fig. 2: PSP kernel
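
For illustration, the kernel and its normalization factor can be computed as in the following minimal NumPy sketch; the time constants chosen here are illustrative, not the values used in our experiments:

    import numpy as np

    def psp_kernel(t, tau_m=20.0, tau_s=5.0):
        # Double-exponential PSP kernel, normalized so that its peak value is 1.
        # tau_m and tau_s are illustrative time constants in the same units as t.
        t_peak = (tau_m * tau_s / (tau_m - tau_s)) * np.log(tau_m / tau_s)
        v0 = 1.0 / (np.exp(-t_peak / tau_m) - np.exp(-t_peak / tau_s))
        k = v0 * (np.exp(-t / tau_m) - np.exp(-t / tau_s))
        return np.where(t >= 0, k, 0.0)  # causal: no response before the spike arrives

    # Kernel response to a spike arriving at t = 0; the maximum is ~1 by construction.
    ts = np.arange(0.0, 100.0, 1.0)
    print(psp_kernel(ts).max())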

At time $t$, the PSP and membrane potential are determined by all previous inputs. However, it is not feasible to directly implement the SNN model defined by Equation 2: at any given time $t$, $v(t)$ has to be computed by recursively convolving the input spike trains with $K$, thus incurring significant computation overhead. To address this issue, an incremental way to update the PSP can be derived from the SRM model in the discrete time domain.

More formally, the input spike train at synapse $i$ can be defined as a sequence of time-shifted Dirac delta functions:

(4)  $x_i(t) = \sum_{f} \delta(t - t_i^f)$

where $t_i^f$ denotes the arrival time of the $f$-th spike at synapse $i$. Similarly, the output spike train can be defined as:

(5)  $o(t) = \sum_{f} \delta(t - t_s^f)$

where each output spike time $t_s^f$ satisfies the threshold condition $v(t_s^f) \geq \vartheta$. To derive the incremental model, we define $m(t) = e^{-t/\tau_m}$ and $s(t) = e^{-t/\tau_s}$, such that $K$ can be expressed as the combination of $m$ and $s$ [28]:

(6)  $K(t - t_i) = V_0 \left( m(t - t_i) - s(t - t_i) \right)$

$m$ and $s$ can be computed incrementally in discrete time with step $\Delta t$:

(7)  $m(t + \Delta t) = e^{-\Delta t/\tau_m}\, m(t), \qquad s(t + \Delta t) = e^{-\Delta t/\tau_s}\, s(t)$

Similarly, we can compute the reset voltage $r$:

(8)  $r(t + \Delta t) = e^{-\Delta t/\tau_m}\, r(t)$

Such that the SNN defined in Equation 2 can be equivalently expressed as:

(9a)  $c_{j}^{l}[t] = \sum_{i=1}^{N^{l-1}} w_{ij}^{l}\, o_{i}^{l-1}[t]$
(9b)  $m_{j}^{l}[t] = \alpha\, m_{j}^{l}[t-1] + c_{j}^{l}[t]$
(9c)  $s_{j}^{l}[t] = \beta\, s_{j}^{l}[t-1] + c_{j}^{l}[t]$
(9d)  $r_{j}^{l}[t] = \gamma\, r_{j}^{l}[t-1] + \vartheta\, o_{j}^{l}[t-1]$
(9e)  $v_{j}^{l}[t] = V_{0}\left(m_{j}^{l}[t] - s_{j}^{l}[t]\right) - r_{j}^{l}[t]$
(9f)  $o_{j}^{l}[t] = H\!\left(v_{j}^{l}[t] - \vartheta\right)$

where the indices $l$, $j$, $i$ denote the layer index, neuron index and input index respectively, and $N^{l}$ denotes the number of neurons in layer $l$. $c_{j}^{l}$ is the input current, $r_{j}^{l}$ is the reset voltage, and $o_{j}^{l}$ is the neuron output. $\alpha$, $\beta$, $\gamma$ are three decay factors. More specifically, $l = 0$ denotes the encoding layer, which will be discussed in section III, $L$ is the number of layers in the network, and $l = L$ denotes the output layer. $H(\cdot)$ is the Heaviside step function:

(10)  $H(x) = \begin{cases} 1, & x \geq 0 \\ 0, & x < 0 \end{cases}$

In the above model, the temporal dependency can be clearly seen in Equations 9a - 9f. At each time step $t$, the PSP, membrane potential and output can be computed from the states at time $t-1$; hence, by unfolding the network, Backpropagation Through Time (BPTT) can be used to train it. Note that the gradient of $H$ is a Dirac delta function, therefore backpropagation cannot be directly applied. Its approximation will be discussed in section IV.
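
A minimal PyTorch-style sketch of one simulation step of such an iterative layer, assuming the dual-trace plus reset-trace formulation above (variable names, decay factors and weight initialization are illustrative, not a reference implementation):

    import torch

    class IterativeLIFLayer:
        # One spiking layer unrolled in discrete time (illustrative sketch).
        def __init__(self, n_in, n_out, alpha=0.95, beta=0.8, gamma=0.95,
                     v0=2.0, theta=1.0):
            self.w = torch.randn(n_in, n_out) * 0.1                  # synaptic weights
            self.alpha, self.beta, self.gamma = alpha, beta, gamma   # decay factors
            self.v0, self.theta = v0, theta                          # kernel gain, threshold
            self.reset_state(batch=1)

        def reset_state(self, batch):
            z = torch.zeros(batch, self.w.shape[1])
            self.m, self.s, self.r, self.o = z.clone(), z.clone(), z.clone(), z.clone()

        def step(self, spikes_in):
            # Advance the layer by one time step given binary input spikes (batch, n_in).
            c = spikes_in @ self.w                                    # weighted input current
            self.m = self.alpha * self.m + c                          # slowly decaying PSP trace
            self.s = self.beta * self.s + c                           # quickly decaying PSP trace
            self.r = self.gamma * self.r + self.theta * self.o        # reset trace from past spikes
            v = self.v0 * (self.m - self.s) - self.r                  # membrane potential
            self.o = (v >= self.theta).float()                        # Heaviside spike generation
            return self.o

    # Usage example with illustrative sizes and a random sparse spike input.
    layer = IterativeLIFLayer(n_in=45, n_out=300)
    layer.reset_state(batch=4)
    out = layer.step(torch.bernoulli(torch.full((4, 45), 0.05)))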

II-A Event-driven Inference

The above model provides an explicit approach to update the SNN's states based on step-wise computation, and is suitable for training. In inference, the model can be simulated in an event-driven manner, i.e. computation is only necessary when there is a spike event, which significantly reduces the computation overhead.

Suppose at time $t$ the values of $m$ and $s$ are known. Without any input spike, after an elapsed time $\Delta t$, i.e. at time $t + \Delta t$, $m$ and $s$ can be computed as:

(11)  $m(t + \Delta t) = e^{-\Delta t/\tau_m}\, m(t), \qquad s(t + \Delta t) = e^{-\Delta t/\tau_s}\, s(t)$

When there is an input spike at time $t$, there is an instantaneous unit charge on $m$ and $s$:

(12)  $m(t) \leftarrow m(t) + 1, \qquad s(t) \leftarrow s(t) + 1$

A similar update rule applies to the reset voltage $r$, which decays between events and is incremented by $\vartheta$ whenever the neuron emits an output spike:

(13)  $r(t + \Delta t) = e^{-\Delta t/\tau_m}\, r(t)$
Input spike buffer: $B$
Elapsed time since last input spike: $\Delta t_{in}$
Elapsed time since last output spike: $\Delta t_{out}$
Time: $t$
for each time step $t$ do
       if there is an input spike event at $t$ then
             foreach synapse $i$ do
                    if synapse $i$ has a spike in $B$ then
                          decay $m_i$ and $s_i$ by the elapsed time $\Delta t_{in}$ (Eq. 11)
                          add a unit charge to $m_i$ and $s_i$ (Eq. 12)
                    else
                          decay $m_i$ and $s_i$ by the elapsed time $\Delta t_{in}$ (Eq. 11)
             reset $\Delta t_{in} \leftarrow 0$
             decay the reset voltage $r$ by $\Delta t_{out}$ (Eq. 13) and compute the membrane potential $v$
             if $v \geq \vartheta$ then
                    emit an output spike
                    add $\vartheta$ to $r$
                    reset $\Delta t_{out} \leftarrow 0$
             else
                    increase $\Delta t_{out}$
       else
             increase $\Delta t_{in}$
             increase $\Delta t_{out}$
Algorithm 1 Event-driven inference

The event-driven computation is shown in Algorithm 1. By tracking the elapsed time since the last spike, computation is only performed when there is an input or output spike. In addition, the kernel decays over time and becomes effectively 0 after a certain period. Therefore the decay factors for different elapsed times can be pre-computed and stored in a look-up table, so that the expensive exponential function is avoided.
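
A small Python sketch of this idea for a single neuron, assuming the dual-trace state used above and integer time steps; the look-up table resolution and cut-off are illustrative choices:

    import numpy as np

    TAU_M, TAU_S, THETA = 20.0, 5.0, 1.0
    MAX_DT = 200  # beyond this elapsed time the kernel is effectively zero

    # Pre-computed decay factors indexed by elapsed time (avoids exp() at run time).
    DECAY_M = np.exp(-np.arange(MAX_DT + 1) / TAU_M)
    DECAY_S = np.exp(-np.arange(MAX_DT + 1) / TAU_S)

    class EventDrivenNeuron:
        def __init__(self, weights, v0=2.0):
            self.w = np.asarray(weights, dtype=float)
            self.v0 = v0
            self.m = np.zeros_like(self.w)  # per-synapse slowly decaying trace
            self.s = np.zeros_like(self.w)  # per-synapse quickly decaying trace
            self.r = 0.0                    # reset trace
            self.last_t = 0                 # time of the last processed event (integer steps)

        def on_input_spikes(self, t, active):
            # Update state only when spikes arrive; `active` lists the spiking synapse indices.
            dt = min(t - self.last_t, MAX_DT)
            self.m *= DECAY_M[dt]           # decay all traces by the elapsed time
            self.s *= DECAY_S[dt]
            self.r *= DECAY_M[dt]
            self.m[active] += 1.0           # unit charge from each arriving spike
            self.s[active] += 1.0
            self.last_t = t
            v = self.v0 * np.dot(self.w, self.m - self.s) - self.r
            if v >= THETA:                  # output spike: apply the reset
                self.r += THETA
                return 1
            return 0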

III Spatial Temporal Population Encoding

Rate coding represents a numerical value by the activity of an individual neuron that fires at a particular rate. For example, in vision tasks, to encode one pixel, a spike train's spike count in a given time window is proportional to the pixel value. There are several drawbacks of rate coding: 1) the precision is limited because the represented value is quantized by the bin size; though higher precision can be obtained by enlarging the time window, the computational latency increases as well; 2) it is unable to represent temporal information as it treats spike activity statistically, so time series have to be flattened and then converted to spike trains, and in real-time scenarios this requires the data stream to be buffered, which causes additional latency; 3) an individual neuron is too noisy due to its stochastic nature, thus it introduces additional noise; 4) it causes high spiking activity, as a larger value has to be represented by more spikes, which deprives the SNN of its energy efficiency; 5) it is incapable of representing negative values, which are common in sensor inputs.

To address the above issues, we employ a coding method suitable for encoding time series by combining population coding and temporal coding [7]. In population coding, information is represented by the activity of a group of neurons. Inside a population, each neuron has its favorable input, i.e. each neuron responds to a particular input and remains relatively inactive for other inputs. In temporal coding, the spike train patterns also convey information.

We utilize a population of Current-Based Integrate and Fire (CUBA) neurons as the encoder. A CUBA neuron is defined as a hybrid system [2]:

(14)  $\tau_{m} \frac{dv(t)}{dt} = -v(t) + J \cdot I(t); \qquad v(t) \leftarrow 0 \ \text{when} \ v(t) \geq \vartheta$

where $I(t)$ is the external time-varying input current, $\tau_m$ is the membrane time constant, which determines the decay speed of the membrane potential, $v(t)$ is the neuron membrane potential, and $J$ is the gain. The neuron accumulates the input current and updates the membrane potential continuously. When $v(t)$ exceeds the threshold $\vartheta$, a reset is triggered and the membrane potential is forced to 0.

In practice, the CUBA neuron model is simulated in discrete time: $v$ is evaluated on a time grid whose interval is $\Delta t$ [21], and $I(t)$ is also sampled on the same grid, such that the discrete version of the model in Equation 14 is:

(15)  $v[t + \Delta t] = e^{-\Delta t/\tau_{m}}\, v[t] + J \cdot I[t]$
Fig. 3: Coding method

With this coding scheme, a univariate time series, for example sensor data, is treated as an input current and connected to a population of encoder neurons. Each neuron may have a different time constant $\tau_m$ and gain $J$, so that each encoder responds to the input differently. In addition, by setting the gain to a negative value, a neuron can also respond to negative input, which overcomes a drawback of rate coding. We utilize the Neural Engineering Framework (NEF) [5] to pre-train the encoder. Using this approach, a time-varying signal is converted into a time-varying spike pattern. For a multivariate time series with multiple channels, each channel can be encoded using the above approach. Unlike vision tasks in which all input dimensions have identical resolution and range, multivariate time series may be collected from different sensors, therefore the resolution, precision and range may vary. With this coding method, the population size and tuning curves can be adjusted to provide just enough precision for each channel [27, 7].
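
A minimal NumPy sketch of such an encoder population, assuming hand-picked time constants and gains rather than NEF-optimized tuning curves:

    import numpy as np

    class CUBAPopulationEncoder:
        # Encodes one channel of a time series into spikes from a small population.
        def __init__(self, taus, gains, theta=1.0, dt=1.0):
            self.taus = np.asarray(taus, dtype=float)    # per-neuron membrane time constants
            self.gains = np.asarray(gains, dtype=float)  # per-neuron gains (negative allowed)
            self.decay = np.exp(-dt / self.taus)
            self.theta = theta
            self.v = np.zeros_like(self.taus)

        def encode(self, series):
            # Return a (T, n_neurons) binary spike pattern for a 1-D input series.
            spikes = np.zeros((len(series), len(self.taus)))
            for t, x in enumerate(series):
                self.v = self.decay * self.v + self.gains * x  # leaky integration of input current
                fired = self.v >= self.theta
                spikes[t, fired] = 1.0
                self.v[fired] = 0.0                            # reset fired neurons
            return spikes

    # Example: 5 encoder neurons per channel, including negatively tuned ones.
    enc = CUBAPopulationEncoder(taus=[5, 10, 20, 10, 20], gains=[0.5, 0.3, 0.2, -0.3, -0.2])
    pattern = enc.encode(np.sin(np.linspace(0, 6.28, 144)))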

IV Training Algorithm

For the classification task, the neuron in the output layer that fires most frequently represents the result. We use a cross-entropy loss defined as:

(16)  $\mathcal{L} = -\sum_{i=1}^{N^{L}} y_{i} \log p_{i}$

where $p_i$ is the probability of each class calculated by softmax over the output spike counts, given by:

(17)  $p_{i} = \frac{\exp\left(\sum_{t} o_{i}^{L}[t]\right)}{\sum_{j=1}^{N^{L}} \exp\left(\sum_{t} o_{j}^{L}[t]\right)}$

where $y_i$ is the label, $L$ is the number of layers, $o^{L}$ denotes the output of the last layer, and $N^{L}$ is the number of neurons in the last layer.
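
In PyTorch, this corresponds to a softmax cross-entropy over the accumulated output spike counts, for example (the tensor shapes below are assumptions):

    import torch
    import torch.nn.functional as F

    # out_spikes: (batch, T, n_classes) binary output spikes; labels: (batch,) class indices.
    def spike_count_loss(out_spikes, labels):
        counts = out_spikes.sum(dim=1)          # spike count per output neuron
        return F.cross_entropy(counts, labels)  # softmax over counts + cross-entropy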

Equations 9a - 9f provide an explicit way to update the SNN states and outputs. By unfolding the network, BPTT can be used for training. We first define the error terms that are propagated backward through the layers and time steps, given by:

(18)

At the last layer $L$, the error is computed from the loss as:

(19)

and the corresponding derivative is computed as:

(20)

For a hidden layer $l$, the error can be computed recursively from the output layer and time $T$ back to the input layer and time 0:

(21)

The Heaviside step function is non-differentiable. We employ a gradient surrogate [20] to address this issue. In the forward path, the spike generation mechanism remains unchanged, while in the backward path the derivative of $H$ is replaced by the derivative of a smooth function. We use the sigmoid function proposed by [26] as the gradient surrogate in the backward path, such that the gradient of $H$ is approximated as:

(22)  $\frac{\partial o}{\partial v} \approx \frac{1}{a}\,\sigma\!\left(\frac{v - \vartheta}{a}\right)\left(1 - \sigma\!\left(\frac{v - \vartheta}{a}\right)\right), \qquad \sigma(x) = \frac{1}{1 + e^{-x}}$

where $a$ controls the smoothness of the surrogate.
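
A sketch of how such a surrogate can be attached to the spike function in PyTorch autograd; the width parameter and the exact sigmoid form are illustrative rather than the constants used in [26]:

    import torch

    class SurrogateSpike(torch.autograd.Function):
        # Heaviside spike in the forward pass, sigmoid-derivative gradient in the backward pass.
        @staticmethod
        def forward(ctx, v, theta=1.0, width=0.5):
            ctx.save_for_backward(v)
            ctx.theta, ctx.width = theta, width
            return (v >= theta).float()               # binary spike, as in Eq. 9f

        @staticmethod
        def backward(ctx, grad_output):
            v, = ctx.saved_tensors
            sig = torch.sigmoid((v - ctx.theta) / ctx.width)
            surrogate = sig * (1.0 - sig) / ctx.width  # derivative of a sigmoid centered at theta
            return grad_output * surrogate, None, None

    spike_fn = SurrogateSpike.apply  # used wherever the spike generation appears during training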

Based on the above equations, the gradient of each weight can be computed as:

(23)
Input: Time-varying input $I(t)$
Output: Optimized weights $w$
// Forward
for $t = 1$ to $T$ do
       // Encoding: integrate the input current (Eq. 15)
       if the encoder membrane potential exceeds the threshold then
             // Reset the membrane potential
             // Generate an encoder spike
       else
             no spike is generated
       for $l = 1$ to $L$ do
             // Update states, Eq. 9a - 9e
             // Generate output spikes, Eq. 9f
// Calculate loss, Eq. 16 - 17
// Backward
for $t = T$ to $1$ do
       // Error at the output layer, Eq. 19 - 20
       for $l = L$ to $1$ do
             // Propagate the error through layers and time, Eq. 18, 21 - 23
Algorithm 2 Training process of one iteration
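
A condensed PyTorch-style sketch of one training iteration, assuming an encoder and layers like those sketched in Sections II and III, adapted so that the layer step uses the surrogate spike function and its weights are registered with the optimizer:

    import torch

    def train_one_iteration(encoder, layers, series_batch, labels, optimizer):
        # One BPTT training iteration (illustrative; relies on the earlier sketches).
        optimizer.zero_grad()
        spike_in = encoder(series_batch)              # (batch, T, n_encoder_neurons)
        batch, T, _ = spike_in.shape
        for layer in layers:
            layer.reset_state(batch)
        out_counts = 0.0
        for t in range(T):                            # forward pass unrolled in time
            x = spike_in[:, t, :]
            for layer in layers:
                x = layer.step(x)                     # Eq. 9a - 9f with surrogate spikes
            out_counts = out_counts + x               # accumulate output spikes
        loss = torch.nn.functional.cross_entropy(out_counts, labels)  # Eq. 16 - 17
        loss.backward()                               # BPTT through the unrolled graph
        optimizer.step()
        return loss.item()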

V Experiments

The proposed network model and algorithm are implemented in PyTorch. We demonstrate the effectiveness of our work in two experiments. In the first experiment, we compare the coding efficiency of rate coding and temporal population coding in terms of spike rate and input size. In the second experiment, the proposed network and algorithm are evaluated on various multivariate time series classification tasks.

V-A Coding Efficiency

Fig. 4: Articulary Word Recognition dataset sample

First, we study the efficiency of the proposed coding method using the Articulary Word Recognition dataset from the UEA & UCR Time Series repository [1] as an example. This dataset consists of multivariate sensor data recorded by an Electromagnetic Articulograph (EMA), a device that tracks the motion of a speaker's tongue and lips. Each sample contains 9 variates of length 144. An example of this dataset is shown in figure 4; each line represents a time-varying input. We use both rate coding and temporal population coding to convert the time series to spikes. The spike patterns are shown in figure 5. Each dot in figure 5(a) represents a spike, and in figure 5(b) a spike is represented by a vertical line. We use an input window of 300. Due to the inability of rate coding to represent temporal information, the time series have to be flattened, resulting in 1296 spike trains; for clarity, only the first 100 spike trains are shown in the figure. In temporal population coding, we use 5 neurons to encode each variate. It is clearly seen that temporal population coding is sparser. In addition, it encodes the input with 45 spike trains, which is significantly fewer than the number of spike trains required by rate coding. This is beneficial for reducing the SNN model size.

We compared our coding method with rate coding on four multivariate datasets; details including average spike count, spike rate and input size are shown in Table I ("Temporal" in Table I refers to temporal population coding). The spike rates are significantly lower than those of rate-based coding. As can be seen in the last column, the input size of our coding method is also significantly smaller, particularly for long time series such as the Atrial Fibrillation dataset, which consists of two variates of length 640, so the flattening operation results in a large number of inputs. Buffering such a long time series also causes significant latency in real-time applications, while temporal population coding can convert the input on the fly: not only is the input size reduced, buffering is also no longer required.

Fig. 5: Coding comparison. (a) Rate coding. (b) Temporal population coding.
Dataset                       Coding Method   Spike Count   Spike Rate   Input Size
Articulary Word Recognition   Rate            65653.8       16.90        1296
Articulary Word Recognition   Temporal        518.7         3.84         45
Basic Motions                 Rate            15061.7       8.36         600
Basic Motions                 Temporal        122.4         1.13         30
Finger Movements              Rate            1069.0        25.45        1400
Finger Movements              Temporal        471.3         1.12         140
Atrial Fibrillation           Rate            23041.4       10.77        1280
Atrial Fibrillation           Temporal        164.5         4.76         10
TABLE I: Coding Efficiency

V-B Computation Overhead

To evaluate the computation overhead of the proposed coding method and event-driven inference algorithm, we build an SNN to classify the Articulary Word Recognition dataset. A vanilla 2-layer stacked LSTM and an RNN with 300 units per layer are also built as references. The network structures, numbers of parameters, and accuracies are shown in Table II. Our network achieves comparable accuracy with only 11% of the number of parameters of the LSTM. In addition, since the length of this time series is 144, the LSTM and RNN have to perform computation step by step, which introduces a significant number of operations. Our model benefits from its event-driven nature: computation is only necessary when there are spike events. Given that the average number of input spikes is 518.7, the computation overhead is minimal compared with the LSTM/RNN.

Model   Network structure   Parameter number   Accuracy (%)
LSTM    9-300-300-25        1,103,125          98.31
RNN     9-300-300-25        281,425            98.20
SNN     45-300-300-25       125,880            98.27
TABLE II: Model comparison

V-C Time Series Classification

Dataset                       ED [1]   DTW [1]   TapNet [32]   WEASEL [22]   This work
Articulary Word Recognition   0.97     0.98      0.987         -             0.98
FaceDetection                 0.519    0.513     0.556         -             0.57
BasicMotions                  0.675    1         1             -             1
Heartbeat                     0.62     0.659     0.751         -             0.72
Spoken Arabic Digits          0.967    0.96      0.983         0.992         0.98
JapaneseVowels                0.924    0.959     0.965         0.976         0.97
RacketSports                  0.868    0.842     0.868         0.934         0.87
TABLE III: Accuracy

Our algorithm is evaluated on 7 multivariate time series datasets. We build a network of size 500-500-500-X, where X indicates the size of the last layer, which varies according to the number of classes in each dataset. The Adam optimizer is used with a learning rate of 0.0001. The results are shown in Table III. ED and DTW refer to 1-Nearest Neighbor with Euclidean Distance and Dynamic Time Warping respectively [1]. TapNet is a DNN-based approach for time series classification [32]. No SNN accuracy is listed because, to the best of our knowledge, there is no previous work comprehensively focusing on time series classification with SNNs.

On all 7 datasets, our method outperforms the 1-Nearest Neighbor classifier, which is a standard baseline for time series classification. On the Spoken Arabic Digits and RacketSports datasets, our method achieves higher accuracy than the DNN-based approach. On the Articulary Word Recognition and Heartbeat datasets, TapNet achieves better accuracy, but the advantage is insignificant: 0.987 vs. 0.98 and 0.751 vs. 0.72.

VI Conclusion

In this work, we presented an iterative SNN model and a training algorithm for spatial temporal spike pattern classification. A coding method to convert continuous time series into discrete spikes is also proposed. Our coding method is able to represent information with sparse spike patterns, such that the computation overhead can be significantly reduced. We evaluated our algorithm and coding method on various multivariate time series datasets, outperforming the standard 1-Nearest Neighbor classifier and showing competitive performance with DNN-based approaches.

References

  • [1] A. Bagnall, H. A. Dau, J. Lines, M. Flynn, J. Large, A. Bostrom, P. Southam, and E. Keogh (2018) The uea multivariate time series classification archive, 2018. arXiv preprint arXiv:1811.00075. Cited by: §V-A, §V-C, TABLE III.
  • [2] R. Brette, M. Rudolph, T. Carnevale, M. Hines, D. Beeman, J. M. Bower, M. Diesmann, A. Morrison, P. H. Goodman, F. C. Harris, et al. (2007) Simulation of networks of spiking neurons: a review of tools and strategies. Journal of computational neuroscience 23 (3), pp. 349–398. Cited by: §III.
  • [3] D. A. Butts, C. Weng, J. Jin, C. Yeh, N. A. Lesica, J. Alonso, and G. B. Stanley (2007) Temporal precision in the neural code and the timescales of natural vision. Nature 449 (7158), pp. 92. Cited by: §I.
  • [4] P. U. Diehl, D. Neil, J. Binas, M. Cook, S. Liu, and M. Pfeiffer (2015) Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In 2015 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. Cited by: §I.
  • [5] C. Eliasmith and C. H. Anderson (2004) Neural engineering: computation, representation, and dynamics in neurobiological systems. MIT press. Cited by: §III.
  • [6] S. K. Esser, R. Appuswamy, P. Merolla, J. V. Arthur, and D. S. Modha (2015) Backpropagation for energy-efficient neuromorphic computing. In Advances in Neural Information Processing Systems, pp. 1117–1125. Cited by: §I.
  • [7] H. Fang, A. Shrestha, Z. Zhao, Y. Li, and Q. Qiu (2019) An event-driven neuromorphic system with biologically plausible temporal dynamics. In 38th IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2019, pp. 8942083. Cited by: §III, §III.
  • [8] H. Fang, A. Shrestha, Z. Zhao, and Q. Qiu (2020) Exploiting neuron and synapse filter dynamics in spatial temporal learning of deep spiking neural network. arXiv preprint arXiv:2003.02944. Cited by: §I.
  • [9] H. Fang, A. Shrestha, Z. Zhao, Y. Wang, and Q. Qiu (2019) A general framework to map neural networks onto neuromorphic processor. In 20th International Symposium on Quality Electronic Design (ISQED), pp. 20–25. Cited by: §I.
  • [10] W. Gerstner and W. M. Kistler (2002) Spiking neuron models: single neurons, populations, plasticity. Cambridge university press. Cited by: §II, §II.
  • [11] P. Gu, R. Xiao, G. Pan, and H. Tang (2019) STCA: spatio-temporal credit assignment with delayed feedback in deep spiking neural networks. In Proceedings of the 28th International Joint Conference on Artificial Intelligence, pp. 1366–1372. Cited by: §I.
  • [12] R. Gütig and H. Sompolinsky (2006) The tempotron: a neuron that learns spike timing–based decisions. Nature neuroscience 9 (3), pp. 420. Cited by: §I, §II.
  • [13] R. Gütig (2016) Spiking neurons can discover predictive features by aggregate-label learning. Science 351 (6277), pp. aab4113. Cited by: §I, §II.
  • [14] D. Huh and T. J. Sejnowski (2018) Gradient descent for spiking neural networks. In Advances in Neural Information Processing Systems, pp. 1433–1443. Cited by: §I.
  • [15] J. H. Lee, T. Delbruck, and M. Pfeiffer (2016) Training deep spiking neural networks using backpropagation. Frontiers in neuroscience 10, pp. 508. Cited by: §I.
  • [16] T. Liu, Z. Liu, F. Lin, Y. Jin, G. Quan, and W. Wen (2017) Mt-spike: a multilayer time-based spiking neuromorphic architecture with temporal error backpropagation. In Proceedings of the 36th International Conference on Computer-Aided Design, pp. 450–457. Cited by: §I, §I.
  • [17] X. Ma, G. Yuan, S. Lin, C. Ding, F. Yu, T. Liu, W. Wen, X. Chen, and Y. Wang (2020) Tiny but accurate: a pruned, quantized and optimized memristor crossbar framework for ultra efficient dnn implementation. In 2020 25th Asia and South Pacific Design Automation Conference (ASP-DAC), Vol. , pp. 301–306. Cited by: §I.
  • [18] P. A. Merolla, J. V. Arthur, R. Alvarez-Icaza, A. S. Cassidy, J. Sawada, F. Akopyan, B. L. Jackson, N. Imam, C. Guo, Y. Nakamura, et al. (2014) A million spiking-neuron integrated circuit with a scalable communication network and interface. Science 345 (6197), pp. 668–673. Cited by: §I.
  • [19] A. Mohemmed, S. Schliebs, S. Matsuda, and N. Kasabov (2012) Span: spike pattern association neuron for learning spatio-temporal spike patterns. International journal of neural systems 22 (04), pp. 1250012. Cited by: §I.
  • [20] E. O. Neftci, H. Mostafa, and F. Zenke (2019) Surrogate gradient learning in spiking neural networks. arXiv preprint arXiv:1901.09948. Cited by: §I, §IV.
  • [21] S. Rotter and M. Diesmann (1999) Exact digital simulation of time-invariant linear systems with applications to neuronal modeling. Biological cybernetics 81 (5-6), pp. 381–402. Cited by: §III.
  • [22] P. Schäfer and U. Leser (2017) Multivariate time series classification with weasel+ muse. arXiv preprint arXiv:1711.11343. Cited by: TABLE III.
  • [23] A. Shrestha, K. Ahmed, Y. Wang, D. P. Widemann, A. T. Moody, B. C. Van Essen, and Q. Qiu (2017) A spike-based long short-term memory on a neurosynaptic processor. In 2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), pp. 631–637. Cited by: §I.
  • [24] A. Shrestha, H. Fang, Q. Wu, and Q. Qiu (2019) Approximating back-propagation for a biologically plausible local learning rule in spiking neural networks. In Proceedings of the International Conference on Neuromorphic Systems, pp. 10. Cited by: §I.
  • [25] S. B. Shrestha and G. Orchard (2018) SLAYER: spike layer error reassignment in time. In Advances in Neural Information Processing Systems, pp. 1412–1421. Cited by: §I, §I.
  • [26] Y. Wu, L. Deng, G. Li, J. Zhu, and L. Shi (2018) Spatio-temporal backpropagation for training high-performance spiking neural networks. Frontiers in neuroscience 12. Cited by: §I, §IV.
  • [27] S. Yarrow and P. Seriès (2015) The influence of population size, noise strength and behavioral task on best-encoded stimulus for neurons with unimodal or monotonic tuning curves. Frontiers in computational neuroscience 9, pp. 18. Cited by: §III.
  • [28] Q. Yu, H. Li, and K. C. Tan (2018) Spike timing or rate? neurons learn to make decisions for both through threshold-driven plasticity. IEEE transactions on cybernetics 49 (6), pp. 2178–2189. Cited by: §II.
  • [29] G. Yuan, X. Ma, C. Ding, S. Lin, T. Zhang, Z. S. Jalali, Y. Zhao, L. Jiang, S. Soundarajan, and Y. Wang (2019) An ultra-efficient memristor-based dnn framework with structured weight pruning and quantization using admm. In 2019 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED), Vol. , pp. 1–6. Cited by: §I.
  • [30] G. Yuan, C. Ding, R. Cai, X. Ma, Z. Zhao, A. Ren, B. Yuan, and Y. Wang (2017) Memristor crossbar-based ultra-efficient next-generation baseband processors. In 2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS), pp. 1121–1124. Cited by: §I.
  • [31] F. Zenke and S. Ganguli (2018) Superspike: supervised learning in multilayer spiking neural networks. Neural computation 30 (6), pp. 1514–1541. Cited by: §I.
  • [32] X. Zhang, Y. Gao, J. Lin, and C. Lu (2020) TapNet: multivariate time series classification with attentional prototypical network. Cited by: §V-C, TABLE III.