I Introduction
The human brain, consisting of massively interconnected neurons, is remarkable for cognitive capabilities such as learning, memory and decision making, while retaining a far higher power efficiency than modern-day supercomputers [1]. Leveraging intuitions about the operations that neurons perform in the brain, abstract, simple and yet powerful neuron models, such as the perceptron [2] (see Fig. 1A for an illustration), were proposed with the aim of approaching the intelligence of the brain. Incorporating further inspiration from biological nervous systems and structures, artificial neural networks (ANNs) were proposed as simple abstractions of their biological counterparts to produce artificial intelligence to a certain extent. After decades of development, ANNs, following the emergence of a family of machine learning techniques called deep learning
[3], have achieved ubiquitous success in fields such as face and speech recognition, which are now part of our daily lives [4]. These achievements can be attributed at least in part to the massive data resources collected from the Internet and to powerful computing platforms equipped with graphics processing units (GPUs): big data helps ANNs generalize better, while GPUs accelerate both their training and inference. Despite these advances, the dependency of ANNs on huge amounts of training data and on modern high-performance computing facilities to adapt massive numbers of parameters for a given task makes them extremely expensive in time, energy and storage capacity [5]. In some applications, such as on-board processing in smart platforms like mobile phones and autonomous drones, the demand for fast and efficient processing may still prohibit the deployment of large ANNs [6]. By contrast, the human brain achieves impressive efficiency with a power budget of only about 20 W [7], in addition to its powerful capabilities for cognitive computation. Neuromorphic computing efforts have therefore emerged to potentially overcome the current limitations of ANNs by carrying out brain-like processing [6, 8, 9].
There is a fundamental difference between ANNs and the nervous systems of the brain regarding the information-processing and computing paradigm. Neurons in the brain communicate and learn with discrete electrical pulses, also called spikes [1], while ANNs use continuous values as reflections of neurons' activations or analogs of their firing rates [2, 10]. The discrete nature of spikes is believed to play an essential role in the efficient and remarkable cognitive computation of the brain, and has thus led to the development of a new generation of neural networks, i.e. spiking neural networks (SNNs) [11, 9]. Different from traditional neurons in ANNs, whose computations are normally confined to the spatial dimension only, spiking neurons extend their computational capability with an additional time domain (see Fig. 1B for a demonstration).
A typical spiking neuron receives all-or-nothing (binary) spikes from upstream neurons and generates them to downstream ones. Although it remains unclear exactly how information is coded with spikes, two longstanding coding schemes are widely studied: the rate code and the temporal code [1, 13, 14, 15]. The rate code ignores the temporal structure of the spike train, making it highly robust to inter-spike-interval noise [16, 17], while the temporal code has a high information-carrying capacity as a result of making full use of the temporal structure [10, 18, 19]. Although an increasing number of experiments in various nervous systems [17, 1, 12, 20] support different codes, it remains arguable which one dominates information coding in the brain [21, 22, 23]. Could a compound code be devised that takes advantage of different codes?
It is quite common for neurons to elicit spikes in the form of a burst [24, 25]: a short period of time with a substantially high firing rate (see Fig. 1C for a demonstration). This bursting phenomenon has been shown to play an important role in both reliable coding and learning [26, 27, 28]. A recent experimental finding in the retina [12] suggests that latency and spike count may encode complementary stimulus features, which could support rapid and reliable analysis. A conceivable neural code can thus be introduced by combining latency [10] with the bursting phenomenon; we name it the latency-burst code (Fig. 1D). In this code, the stronger the stimulus intensity, the earlier the leading spike and the more spikes in the following burst. The temporal component can thus support rapid processing, while the burst provides complementary information for subsequent refinement.
A more radical approach assembles the spikes of a latency-burst into a single spike event, named an augmented spike in this paper, whose appearance time carries the latency information while a multi-level spike strength [29, 30], rather than the normal binary one, reflects the additional information. The concept of weighted spikes [29, 30] has shown advantages in reducing classification latency and spike counts owing to the spike strength, but the temporal benefit is rarely explored. Moreover, how neurons should adapt their synaptic efficacies to extract the information carried by such augmented spikes remains very much unclear.
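As an illustrative sketch (a toy encoder of our own, not taken from any referenced work), the latency-burst idea above can be condensed into a single augmented spike: a stimulus intensity maps to a (latency, coefficient) pair, where stronger intensities yield earlier spikes and larger coefficients. The window length and coefficient range below are arbitrary assumptions.

```python
def encode_intensity(intensity, t_window=0.1, max_coeff=3.0):
    """Toy mapping from a stimulus intensity in [0, 1] to one augmented
    spike (latency in seconds, spike coefficient). Hypothetical parameters:
    t_window and max_coeff are illustrative, not from the paper."""
    assert 0.0 <= intensity <= 1.0
    latency = t_window * (1.0 - intensity)             # stronger -> earlier
    coefficient = 1.0 + (max_coeff - 1.0) * intensity  # stronger -> larger
    return latency, coefficient

# a strong stimulus: early spike with a large coefficient
t, c = encode_intensity(0.8)
```

In this condensed form, the downstream neuron recovers the timing benefit from the latency and the burst-size information from the coefficient.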
Learning is vital to almost every intelligent system: it determines how neurons adapt their characteristics in response to external stimuli so that they can fit the environment and solve certain cognitive tasks. Spike-timing-dependent plasticity (STDP), a timing-dependent specialization of Hebbian learning [31, 32], is one of the common characteristics of nervous systems and is widely studied in computational neuroscience. A typical STDP rule strengthens the synaptic connection between two neurons if their spikes occur in causal order, and weakens it otherwise. STDP enables neurons to process information in an unsupervised way [33], but requires temporal contiguity [34]. Many other synaptic learning rules have thus been developed with different focuses [34, 35, 36, 37, 38]. Rules like the tempotron [34] train neurons to produce a binary response, firing or not, and can thus be used for efficient learning and classification tasks [39]. Other rules [35, 36, 40, 41, 42] train neurons to fire at desired timings so that the temporal structure of the output can be utilized for further processing [43, 44]. Yet other developments train neurons to emit a desired number of spikes in response to given input stimuli; such rules are capable of extracting embedded features and discriminating patterns encoded with different coding schemes [38, 23]. Despite this variety of synaptic learning rules, rarely does one consider augmented spikes.
Although spike-based computation and representation are promising for improving efficiency and computational capability [9], their classification accuracy remains relatively poor compared to ANNs [3, 37, 39, 44]. Recent mapping techniques can successfully convert a trained ANN into an SNN with almost lossless accuracy [45, 46], but the resulting spikes operate in a rate regime, leaving the efficient temporal benefit unexploited. The gap between ANNs and SNNs motivates us to combine the advantages of the fundamental elements of both: the spike-based characteristic of SNNs and the accurate representation of ANNs. We therefore heuristically introduce a new representation scheme with augmented spikes, which is also inspired and, to a certain extent, supported by biological phenomena [25, 12, 10, 26]. However, how neurons could process and learn from these augmented spikes remains unclear, and it is also intriguing to ask what computational benefits and application merits one could expect from learning them. In this work, we propose a new spiking neuron model capable of processing augmented spikes, and then develop several new synaptic learning rules that enable neurons to learn from such spikes. Our major contributions can be summarized as follows.

A new spiking neuron model is proposed to process augmented spikes, where additional information can be coded with spike strength in addition to latency. This neuron model extends the computation with an additional dimension, which could be of great significance for both processing and learning.

We propose several synaptic learning rules that are capable of learning augmented spikes, and provide a systematic characterization of their computational properties. These could serve as a guideline for the development of spike-based frameworks.

We are the first, to the best of our knowledge, to bring the concept of augmented spikes to synaptic learning. This protocol can easily be generalized to other spike-based learning and processing systems, highlighting the versatility and potential of our methods.

We demonstrate the applicability of our methods with two practical tasks: acoustic and visual recognition. The better performance of our methods compared to the baselines suggests the practical merit of our contributions.
II Methods
In this section, we first describe the neuron model proposed for processing augmented spikes; synaptic learning rules are then developed based on this model. Spike learning rules can be categorized according to the neuron's output response: for example, a neuron can be trained to give a binary spike response, or to emit multiple output spikes whose precise timings or total count constitutes the desired response. In this paper, we select the tempotron [34], PSD [41] and TDP [23] rules as representatives of these different types, and develop new learning rules based on them to train neurons on augmented spikes. Other specific methods used in the simulations are described in the next section and in the appendix.
II-A Augmented Spiking Neuron Model
There are various popular spiking neuron models, such as the Hodgkin-Huxley model [47], the Izhikevich model [48] and the leaky integrate-and-fire (LIF) model [49, 50], each resembling the behavior of biological neurons at a certain level of detail. In this paper, we use the current-based LIF neuron model due to its simplicity and analytical tractability [38, 23]. Notably, our scheme with augmented spikes can easily be applied to other spiking neuron models.
Different from a typical binary spike, an augmented one utilizes its spike strength to carry additional information; we name this quantity the spike coefficient in the following. Spike coefficients can be any analog or discrete values carrying both polarity and magnitude information (Fig. 2A). In order to process augmented spikes, neurons need to incorporate the spike coefficients into their dynamics. Like a standard neuron, ours continuously integrates afferent spikes into its membrane potential and elicits an output spike whenever the firing condition is met. In the normal case with binary spikes, each afferent spike results in a postsynaptic potential (PSP) whose peak is controlled solely by the synaptic weight; in our augmented spiking neuron model, the spike coefficient additionally modulates the PSP.
The dynamics of our neuron model with N synaptic afferents is described as

V(t) = \sum_{i=1}^{N} w_i \sum_{j} c_i^j K(t - t_i^j) - \vartheta \sum_{t_s < t} \exp\left(-\frac{t - t_s}{\tau_m}\right)    (1)

where \vartheta denotes the firing threshold and t_s represents the time of the s-th output spike. w_i is the synaptic weight, t_i^j represents the j-th input spike time from the i-th afferent, with c_i^j denoting the corresponding spike coefficient, and \tau_m is the time constant of the neuron model. The kernel K determines the shape of the PSPs, and is defined as

K(t - t_i^j) = V_0 \left[\exp\left(-\frac{t - t_i^j}{\tau_m}\right) - \exp\left(-\frac{t - t_i^j}{\tau_s}\right)\right]    (2)

where V_0 is a constant that normalizes the peak of the kernel to unity, and \tau_s is the time constant of the synaptic currents.
As can be seen from Fig. 2, each augmented spike results in a PSP whose onset is determined by the spike time and whose peak is controlled by both the synaptic weight and the spike coefficient. Whenever the neuron's membrane potential crosses the firing threshold, an output spike is elicited, followed by the reset dynamics described in Eq. (1).
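The dynamics of Eqs. (1)-(2) can be sketched with a direct time-stepped simulation. The parameter values below (time constants, threshold, step size) are illustrative assumptions, not the paper's settings; the code only demonstrates the mechanics of coefficient-modulated PSPs and threshold-triggered resets.

```python
import numpy as np

# Illustrative constants (assumptions): membrane/synaptic time constants and
# a unit firing threshold.
TAU_M, TAU_S, THETA = 20e-3, 5e-3, 1.0
# V0 normalizes the kernel peak to unity (peak time solved analytically).
_t_peak = (TAU_M * TAU_S / (TAU_M - TAU_S)) * np.log(TAU_M / TAU_S)
V0 = 1.0 / (np.exp(-_t_peak / TAU_M) - np.exp(-_t_peak / TAU_S))

def kernel(t):
    """PSP kernel K(t) of Eq. (2); zero for t < 0."""
    return np.where(t >= 0, V0 * (np.exp(-t / TAU_M) - np.exp(-t / TAU_S)), 0.0)

def run(weights, spikes, dt=1e-4, T=0.5):
    """Simulate Eq. (1). spikes[i] is a list of (time, coefficient) pairs for
    afferent i; returns the list of output spike times."""
    out = []
    for t in np.arange(0.0, T, dt):
        v = 0.0
        for w, aff in zip(weights, spikes):
            for (tj, cj) in aff:
                if tj <= t:
                    v += w * cj * kernel(t - tj)  # coefficient scales the PSP
        # reset dynamics: each past output spike pulls the potential down
        v -= THETA * sum(np.exp(-(t - ts) / TAU_M) for ts in out)
        if v >= THETA:
            out.append(t)
    return out
```

With a single afferent, a spike with a large coefficient (e.g. 3.0) drives the potential over threshold, while the same spike with coefficient 0.5 cannot, illustrating how the coefficient carries information beyond timing.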
II-B Augmented Tempotron Rule
The tempotron (abbreviated 'Tmp' in this paper), proposed in [34], is an efficient rule that trains neurons to make decisions by firing or not. Following the principles of the original tempotron rule, we propose an extension, the augmented tempotron ('AugTmp') rule, which empowers neurons to learn from augmented spikes.
In the AugTmp rule, the neuron can fire at most a single spike due to a shunting mechanism [34], whose dynamics differ slightly from those described in Eq. (1). The neuron is required to elicit a single spike in response to a target pattern (P+) and to keep silent for a null one (P-). If the neuron fails to produce the desired response to a given pattern, learning is carried out to adapt the synaptic weights. A gradient-descent method applied to the cost defined by the distance between the neuron's maximal potential and its firing threshold leads to the following AugTmp rule:

\Delta w_i = \pm \eta \sum_{t_i^j < t_{\max}} c_i^j K(t_{\max} - t_i^j)    (3)

where t_{\max} represents the time of the maximal potential and \eta is the learning rate controlling the size of each weight modification; the sign is positive when the neuron misses a spike on P+ (potentiation) and negative when it erroneously fires on P- (depression).
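The error-driven update above can be sketched as follows. This is a minimal illustration assuming the kernel and time constants introduced earlier; the learning rate is an arbitrary placeholder, and locating t_max (the time of maximal potential) is assumed to be done elsewhere.

```python
import numpy as np

# Assumed constants matching the neuron model (illustrative values).
TAU_M, TAU_S = 20e-3, 5e-3
_t_peak = (TAU_M * TAU_S / (TAU_M - TAU_S)) * np.log(TAU_M / TAU_S)
V0 = 1.0 / (np.exp(-_t_peak / TAU_M) - np.exp(-_t_peak / TAU_S))

def kernel(t):
    """PSP kernel, zero for negative arguments."""
    return V0 * (np.exp(-t / TAU_M) - np.exp(-t / TAU_S)) if t >= 0 else 0.0

def aug_tmp_update(weights, pattern, t_max, is_target, fired, lr=1e-3):
    """One AugTmp step. pattern[i] lists (t_i^j, c_i^j) pairs for afferent i;
    t_max is the time of the maximal potential. LTP on a missed target,
    LTD on a false alarm, no change on a correct response."""
    if fired == is_target:
        return weights
    sign = 1.0 if is_target else -1.0
    new_w = list(weights)
    for i, aff in enumerate(pattern):
        # each afferent's gradient is its coefficient-weighted PSP at t_max
        grad = sum(c * kernel(t_max - t) for (t, c) in aff if t <= t_max)
        new_w[i] += sign * lr * grad
    return new_w
```

Note that the spike coefficient c enters the gradient directly, so afferents carrying stronger augmented spikes receive proportionally larger weight changes.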
II-C Augmented PSD Rule
There is a family of learning rules [35, 40, 43, 52] that train neurons not only to fire, but to fire at exact desired timings. Due to the efficiency and effectiveness of the PSD rule [41, 43], we select it as a representative of this family and develop our augmented version ('AugPSD') based on it.
The AugPSD rule trains neurons to fire at desired spike times in response to a target spike pattern. In this way, the temporal domain of the output can be utilized for information transmission as well as multi-category classification [41, 43]. The rule is implemented to minimize the difference between the desired (t_d^g) and actual (t_a^o) output spike times. Following a derivation similar to [41], our AugPSD rule is given as:

\Delta w_i = \eta \left[ \sum_{g} \sum_{j} c_i^j K(t_d^g - t_i^j) H(t_d^g - t_i^j) - \sum_{o} \sum_{j} c_i^j K(t_a^o - t_i^j) H(t_a^o - t_i^j) \right]    (4)

where H(\cdot) denotes the Heaviside function.
In our AugPSD rule, long-term depression decreases the weights when the neuron erroneously elicits an output spike, while long-term potentiation increases them when the neuron fails to fire at a desired time. Different distance metrics can be used to measure the similarity between the desired and actual output spike trains, for use in both training and evaluation. For example, as in [41], a distance metric can be calculated as

Dist = \frac{1}{\tau} \int_0^{T} \left[ f_d(t) - f_a(t) \right]^2 dt    (5)

where f_d(t) and f_a(t) are filtered signals of the two spike trains, and \tau is a time constant.
Although this metric gives a precise measurement, choosing a critical value for terminating training can be difficult, and the metric's calculation would complicate the computation in a spike-based framework.
In this paper, we instead introduce a much simpler and more efficient approach, the coincidence metric, to measure the distance. A margin parameter \xi controls the precision of the coincidence detection: an output spike time is regarded as correct if it falls into the region [t_d - \xi, t_d + \xi] around a desired time t_d. This margin-based coincidence detection facilitates the learning.
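A margin-based coincidence readout of this kind can be sketched as below. The matching policy (each desired time consumes at most one actual spike, and no spurious spikes may remain) and the default margin value are our own assumptions for illustration.

```python
def coincidence_correct(desired, actual, margin=5e-3):
    """Coincidence-metric readout sketch: every desired spike time must be
    matched by a distinct actual spike within +/- margin (seconds), and no
    unmatched actual spikes may remain. margin is an illustrative default."""
    remaining = list(actual)
    for td in desired:
        # greedily take the first actual spike falling inside the margin
        hit = next((ta for ta in remaining if abs(ta - td) <= margin), None)
        if hit is None:
            return False          # a desired time was missed
        remaining.remove(hit)
    return not remaining          # spurious spikes also count as errors
```

Compared with an integral distance such as Eq. (5), this check needs no filtering of the spike trains and yields a direct pass/fail signal for terminating training.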
II-D Augmented TDP Rule
Recently, a new family of learning rules has been developed to train neurons to fire a certain number of spikes, rather than requiring spikes at precise times [38, 23]. Owing to their multi-spike characteristic, these rules have shown remarkable performance, compared to others, in making decisions and exploring temporal features of signals.
We adopt the TDP rule [23] in this paper, due to its efficiency and simplicity, to develop our augmented TDP rule ('AugTDP'). The AugTDP rule builds on the spiking characteristic of multi-spike neurons, namely the spike-threshold-surface (STS), which describes the relation between the neuron's output spike number and its firing threshold [38]; a higher threshold value normally results in fewer output spikes. The STS is characterized by a set of critical threshold values, \vartheta_k^*, each denoting the position of the neuron's firing threshold at which its output spike number jumps from k to k-1. A neuron's actual output spike number can thus be read off directly from the STS, and modifying the critical threshold values can therefore produce a desired spike number. Following the steps in [38, 23], the critical threshold value can be determined as

\vartheta_k^* = \frac{\sum_{i=1}^{N} w_i \sum_{t_i^j < t^*} c_i^j K(t^* - t_i^j)}{1 + \sum_{s=1}^{m} \exp\left(-\frac{t^* - t_s}{\tau_m}\right)}    (6)

Here, t^* represents the time point of \vartheta_k^*, and m is the total number of output spikes that occur before t^*. Following a similar routine to the TDP rule [23], the derivative of \vartheta_k^* with respect to the i-th synaptic efficacy can be given as

\frac{\partial \vartheta_k^*}{\partial w_i} = \frac{\sum_{t_i^j < t^*} c_i^j K(t^* - t_i^j)}{1 + \sum_{s=1}^{m} \exp\left(-\frac{t^* - t_s}{\tau_m}\right)}    (7)
Eq. (7) can be used to evaluate the derivatives \partial \vartheta_k^* / \partial w_i, and our AugTDP rule can thus be given as

\Delta w_i = \begin{cases} \eta \, \partial \vartheta_{o+1}^* / \partial w_i, & \text{if } o < d \\ -\eta \, \partial \vartheta_{o}^* / \partial w_i, & \text{if } o > d \end{cases}    (8)

where d and o represent the desired and actual output spike numbers, respectively. The principle of this rule is to decrease (increase), via an LTD (LTP) process, the corresponding critical threshold \vartheta^* that is greater (smaller) than the firing threshold \vartheta whenever the neuron fails to produce the desired response.
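The LTP/LTD selection described above can be sketched as a small dispatch over precomputed STS derivatives. Here sts_grads is assumed to hold, for each critical threshold, its gradient with respect to every weight (e.g. evaluated from the derivative formula); the learning rate is an illustrative placeholder.

```python
def aug_tdp_update(weights, sts_grads, actual, desired, lr=1e-3):
    """One AugTDP weight step (sketch). sts_grads[k-1][i] approximates the
    derivative of the k-th critical threshold w.r.t. weight w_i. If the
    neuron fires too few spikes, raise the next critical threshold above
    the firing threshold (LTP); if too many, push the current one below
    it (LTD)."""
    if actual == desired:
        return weights                      # correct count: no update
    if actual < desired:
        grad = sts_grads[actual]            # 0-based index -> theta*_{o+1}
        sign = 1.0                          # LTP
    else:
        grad = sts_grads[actual - 1]        # theta*_{o}
        sign = -1.0                         # LTD
    return [w + sign * lr * g for w, g in zip(weights, grad)]
```

A training loop would alternate simulating the neuron to obtain the actual spike count, recomputing the STS derivatives, and applying this step until the desired count is reached.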
III Simulation Results
In this section, we provide systematic evaluations of our proposed learning rules, covering classification of spatiotemporal spike patterns, learning capacity, construction of causal connections, feature detection and discrimination, robustness, and applicability to practical tasks. Some experimental setups are described within each task; others are given in the Appendix. In all experiments, the number of afferents is set to a fixed default, except in the acoustic and visual recognition tasks where it is determined by the encoding layer.
III-A Augmented Tempotron
In this experiment, we study the capability of our AugTmp rule in classifying spatiotemporal spike patterns in which augmented spikes are used for information representation. We construct random patterns with each afferent firing at a Poisson rate of 2 Hz over a fixed time window. In addition to the randomness of each spike's timing, the spike coefficient of each spike event is also randomly selected, in this experiment from the set {0.5, 1, 1.5} with equal probability. We generate three patterns, P1-P3, for this task. P1 and P3 are independently constructed, with both their spike timings and coefficients randomly generated; P2 shares the same spike timings as P1 but has different random spike coefficients. We use both the Tmp and AugTmp rules to train a neuron to elicit an output spike in response to P1 while remaining silent for the other two.
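The pattern construction described above can be sketched as follows. The afferent count and window length are illustrative assumptions (the paper's exact values may differ); the Poisson rate and coefficient set follow the text.

```python
import random

def make_pattern(n_afferents=500, rate=2.0, T=1.0,
                 coeffs=(0.5, 1.0, 1.5), seed=None):
    """Generate one spatiotemporal augmented-spike pattern: each afferent
    fires Poisson spikes at `rate` Hz over a window of T seconds, and each
    spike draws a random coefficient from `coeffs`. n_afferents and T are
    illustrative assumptions."""
    rng = random.Random(seed)
    pattern = []
    for _ in range(n_afferents):
        spikes, t = [], 0.0
        while True:
            t += rng.expovariate(rate)   # Poisson process: exponential gaps
            if t >= T:
                break
            spikes.append((t, rng.choice(coeffs)))
        pattern.append(spikes)
    return pattern
```

A pattern like P2 would be derived from P1 by keeping every spike time and redrawing only the coefficients, so the two differ solely in the augmented dimension.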
As can be seen from Fig. 3, the Tmp rule cannot separate all the patterns, while our AugTmp rule successfully classifies them as expected. This is because P1 and P2 are indistinguishable to the Tmp rule, which cannot process the augmented part, i.e. the spike coefficients, resulting in an error rate of around one third. By contrast, our AugTmp rule utilizes both the spike timings and the coefficients to recognize patterns, demonstrating the effectiveness of our augmented scheme.
III-B Learning Capacity
How many patterns can a neuron discriminate? A useful measurement is the load, defined as the number of patterns relative to the number of weights, denoted \alpha. An important characteristic of a neuron's capacity is the maximal load it can learn [34], which we denote the critical capacity \alpha_c. In this experiment, we study how many augmented-spike patterns a neuron can learn with our learning rule.
Similar to the setups in [34], we assess the performance of our AugTmp rule in classifying populations of single-spike latencies. In each pattern, each afferent contains a single spike at a time randomly chosen from a uniform distribution over the time window. In addition to the random latencies, we assign random discrete values to the spike coefficients: for simplicity, we generate n possible values evenly spaced over the range between 0.5 and 1.5, and each spike coefficient is randomly selected from these values with equal probability. Notably, n = 1 results in a single coefficient of 1, which reduces to normal binary spike patterns. The task consists of p patterns, each randomly assigned to the target or null class with equal probability; the neuron is required to fire in response to the target class and to remain silent otherwise. We study two scenarios for the capacity task: random and fixed latencies (see Fig. 4). In the random-latency case, both the spike timing and the coefficient of each spike in each pattern are randomly generated, while in the fixed-latency case patterns share the same latencies but have different random spike coefficients. The neuron is trained until it successfully classifies all patterns, and the number of cycles at convergence is recorded as the learning time, where one cycle denotes a single presentation of all patterns during training.
For a reliable measurement of the learning time, we use the median rather than the mean of 100 single evaluations, due to a heavy-tail effect in which a small portion of extremely large values would otherwise skew the result (data not shown). We obtain the critical capacity \alpha_c from the capacity curve (Fig. 5A) at the point where it crosses a learning time of 1000 cycles; normally, the more patterns to learn, the longer the convergence time, and we chose this criterion with computational cost in mind. Fig. 5B shows the capacity performance. For both random and fixed latencies, the capacity is almost invariant to the level of discretization, n, of the spike coefficient. This is because the spiking neuron can be approximated by a two-layer perceptron whose capacity is invariant to the input statistics [34]. In the random-latency case, the capacity is around 2.9, consistent with the tempotron [34]; this indicates that neurons cannot store more patterns with augmented spikes. If we remove the randomness in the latency, the capacity drops to around 0.3, indicating the importance of the temporal domain for information processing in SNNs. Notably, the capacity for normal binary spikes (n = 1) in the fixed-latency case is 0 (data not shown), demonstrating the ability of our learning rule to extract information from spike coefficients.
III-C Augmented PSD
In this experiment, we examine the ability of our AugPSD rule to train neurons to fire at desired times in response to a given single-spike latency pattern. In the spike pattern, each afferent fires a single spike at a time randomly chosen within a 0.3-s time window. We impose spike coefficients of 2.0, 1.0 and 0.5 on spikes falling in the regions [0, 0.1), [0.1, 0.2) and [0.2, 0.3] s, respectively. The neuron is trained to fire at 0.1 and 0.2 s, with a fixed margin precision, in response to the pattern.
As can be seen in Fig. 6, our AugPSD rule successfully trains the neuron to fire at the desired positions within around 10 training epochs, demonstrating the effectiveness and efficiency of our rule.
Moreover, our AugPSD rule is capable of constructing causal connections based on augmented spikes. Fig. 7 shows that the PSD rule is incapable of processing spike coefficients, and thus results in similar weight distributions around the two desired times (Fig. 7B), because every spike is treated identically by the PSD rule. By contrast, our AugPSD rule is sensitive to spike coefficients and builds causal connections based on them: the resulting peaks of causal connections before the desired times are inversely proportional to the spike coefficients of their causal inputs (Fig. 7C). This reflects the capability of our AugPSD rule to extract information embedded in spike coefficients.
III-D Augmented TDP
In this experiment, we evaluate the ability of our AugTDP rule to detect and discriminate features embedded in a background.
Similar to the experimental setups in [38, 23], we construct four feature patterns with a Poisson firing rate of 4 Hz over a time window of 0.1 s. These features are randomly embedded, with a mean appearance count of 3 (Poisson distributed), in a background with the same firing statistics as the features. A global noise with a Poisson rate of 1 Hz is then imposed on the pattern. In addition, each spike carries a coefficient randomly selected from the set {0.5, 1.0, 1.5} with equal probability. To make the task more challenging, we force all feature patterns to share the same spike timings, so that they differ only in their spike coefficients. The neuron is required to fire 2 and 1 spikes in response to two randomly selected features (the targets), and to remain silent for the others (distractors and background).
Fig. 8 shows the learning performance for this task. The TDP rule fails to detect and discriminate the target features from the others (Fig. 8B): since the features differ only in their coefficients, and TDP cannot process spike coefficients, it is incapable of discriminating them. Thanks to its capability for processing augmented spikes, our AugTDP successfully learns the desired behavior within tens of training cycles (Fig. 8C and D).
III-E Robustness
In this task, we study the robustness of our augmented learning rules in multi-category classification. We randomly generate three spike patterns whose spike timings follow a Poisson rate of 2 Hz over a fixed time window and whose spike coefficients are randomly selected from the set {0.5, 1.0, 1.5}. These three patterns are then fixed as the templates of the categories. We train three neurons, each with one category as its target. For AugTmp, the neurons are required to fire at least one spike in response to their target categories and to remain silent otherwise. For both AugPSD and AugTDP, neurons are trained to fire 10 spikes in response to their targets and to remain silent otherwise; the desired timings for AugPSD are evenly spaced over the time window with a fixed margin precision.
Neurons are trained and evaluated under two noise conditions: spike jitter noise and spike deletion noise. In the first, the timing of each spike is jittered by a Gaussian distribution with mean 0 and standard deviation \sigma; in the second, each spike is randomly deleted with probability p_{del}. During training, fixed values of \sigma and p_{del} are used for the two cases, respectively. In the evaluation phase, we use a strict readout scheme for the different rules to provide clear insight into their robustness. For AugTmp, a pattern is regarded as correctly classified only if its corresponding neuron fires. For the multi-spike rules, AugPSD and AugTDP, we use half of the desired spike number from training as the readout indicator: a pattern is treated as correctly recognized only if the corresponding neuron elicits more than 5 spikes. Fig. 9 shows the performance of our learning rules. AugTDP is the most robust under both noise conditions: it can tolerate up to 80 ms of jitter noise and 40% deletion while maintaining an accuracy above 80%, which demonstrates the strength of multi-spike learning and readout. For AugPSD, neurons are constrained to fire at desired times, which limits their ability to fully explore features over the whole time window. The bottleneck for AugTmp is its binary response of firing or not, which restricts neurons to decisions based on a single temporal feature. Notably, advanced techniques [43, 51, 53] could improve the performance of AugTmp and AugPSD, but they would complicate the decision procedure in a spike-based framework.
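The two noise models described above can be sketched directly on a pattern of (time, coefficient) pairs. The parameter names below are our own; only the spike times are perturbed, since the noise models in this task leave the coefficients untouched.

```python
import random

def jitter_and_delete(pattern, sigma=0.0, p_del=0.0, seed=None):
    """Apply the two noise models of the robustness task: Gaussian jitter
    (standard deviation `sigma`, in seconds) on each spike time, and random
    deletion of each spike with probability `p_del`. pattern[i] is a list
    of (time, coefficient) pairs for afferent i."""
    rng = random.Random(seed)
    noisy = []
    for aff in pattern:
        kept = []
        for (t, c) in aff:
            if rng.random() < p_del:
                continue                                 # deletion noise
            kept.append((t + rng.gauss(0.0, sigma), c))  # jitter noise
        noisy.append(kept)
    return noisy
```

Evaluating a trained neuron on many noisy copies of each template, over a grid of sigma and p_del values, reproduces the kind of robustness curves reported in Fig. 9.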
III-F Acoustic Pattern Recognition
Recent spike-based frameworks for acoustic processing have emerged with remarkable performance [51, 53, 54, 55]. In this experiment, we study the performance of our augmented rules on a more practical task, the sound recognition task used in [53]. We use the same experimental setups (details can be found in [53]) for a fair benchmark of the benefits brought by our augmented schemes.
In contrast to [53], we extend its encoding scheme with the ability to carry intensity via our augmented spikes (Fig. 10): in addition to the timing of each spike, its coefficient carries the intensity information, which is expected to facilitate learning.
TABLE I: Accuracies (%) on the sound recognition task under different noise levels.

Methods        | Clean | 20 dB | 10 dB | 0 dB  | Avg
MFCC-HMM [51]  | 99.0  | 62.1  | 34.4  | 21.8  | 54.33
SPEC-MLP [53]  | 100   | 94.38 | 71.8  | 42.68 | 77.22
SPEC-CNN [53]  | 99.83 | 99.88 | 98.93 | 83.65 | 95.57
SOM-SNN [54]   | 99.6  | 79.15 | 36.25 | 26.5  | 60.38
LSF-SNN [51]   | 98.5  | 98.0  | 95.3  | 90.2  | 95.5
LTF-SNN [55]   | 100   | 99.6  | 99.3  | 96.2  | 98.77
KP-Tmp [53]    | 99.35 | 96.58 | 94.0  | 90.35 | 95.07
KP-AugTmp      | 99.10 | 99.20 | 98.43 | 94.63 | 97.84
KP-TDP [53]    | 100   | 99.5  | 98.68 | 98.1  | 99.07
KP-AugTDP      | 100   | 99.95 | 99.55 | 98.03 | 99.38
The performance is evaluated under a mismatched condition: training uses only clean samples, while testing uses different noise levels. Table I shows the performance on the sound recognition task for both spike-based and conventional approaches. The traditional approaches include MFCC-HMM, SPEC-MLP and SPEC-CNN. As a typical framework widely used in acoustic processing, MFCC-HMM performs well in the clean condition but degrades rapidly with increasing noise. The deep learning techniques, SPEC-MLP and SPEC-CNN, improve on MFCC-HMM, but their dependency on GPUs makes them resource-consuming and computationally inefficient; spike-based approaches thus provide an alternative. Our framework with AugTDP achieves the best performance among all the baselines. Comparisons between closely related approaches with and without the augmented scheme clearly highlight the improvements of our methods over the baseline ones, indicating that the additional information carried by spike coefficients, together with our augmented learning, can effectively improve performance.
In addition, we evaluate the performance of our AugTDP with different desired spike numbers, d, used for training. Fig. 11 shows the effect of d on performance. For small values of d, the larger the desired spike number, the better the recognition accuracy, because more spikes help extract important temporal features for a decision. However, further increases of d degrade performance, as temporal interference [43] becomes severe.
TABLE II: Test accuracies (%) of different SNN models on the MNIST task.

Methods                 | Coding Scheme | Neurons (Structure) | #Spikes | Accuracy
S1C1-SNN [39]           | Temporal      | 200+10              | 200     | 78.00
S1C1+AugTmp             | Temporal      | 200+10              | 83      | 86.10
S1C1+AugTDP             | Temporal      | 200+10              | 83      | 86.60
CSNN [56]               | Temporal      | 800+400             | 800     | 88.00
CNN+AugTmp              | Temporal      | 800+10              | 263     | 97.00
CNN+AugTDP              | Temporal      | 800+10              | 263     | 97.80
HMAX+AugTmp             | Temporal      | 800+10              | 399     | 96.90
HMAX+AugTDP             | Temporal      | 800+10              | 399     | 97.90
Dendritic Neurons [57]  | Rate          | 5000+10             | -       | 90.26
Spike-DBN [58]          | Rate          | 784+500+500+10      | -       | 94.10
Spiking RBM [59]        | Rate          | 6470+1010           | -       | 94.09
Unsupervised STDP [60]  | Rate          | 784+6400            | -       | 95.00
SNN-WS [29]             | Rate (Weight) | 784+1200+1200+10    | -       | 98.60
Weight norm [46]        | Rate          | 784+1200+1200+10    | -       | 98.60
III-G Visual Pattern Recognition
In this section, we explore the ability of our augmented rules on a visual recognition task, the MNIST dataset. Three encoding methods, namely S1C1 [61], HMAX [62] and CNN [56], are adopted to convert images into spike patterns for testing our augmented schemes. The experimental setups of S1C1 and CNN are the same as in [61] and [56], respectively; for HMAX, we adopt the first two layers of [33].
In the above encoding methods, neurons integrate information from their receptive fields and elicit spikes whose timings depend linearly on their activation values, with stronger activations resulting in earlier spikes. In this paper, by contrast, we introduce a simpler and effective approach to generating afferent latencies: if the activation value of an encoding neuron crosses its firing threshold, it fires an augmented spike at a fixed time, randomly chosen once from a uniform distribution over the time window, with its activation value assigned to the spike coefficient. Fig. 12 demonstrates an encoding example of our method, where each dot denotes an augmented spike and its size represents the activation value of the encoding neuron.
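The encoding step described above can be sketched as below. The threshold and window values are illustrative assumptions; the per-neuron latencies are drawn once (here via a fixed seed) so that they stay the same across images, as described in the text.

```python
import random

def encode_activations(activations, threshold=0.1, T=0.1, seed=0):
    """Encoding sketch: each encoding neuron whose activation crosses
    `threshold` emits one augmented spike at a fixed random latency in
    [0, T], with its activation value as the spike coefficient.
    threshold and T are illustrative assumptions."""
    rng = random.Random(seed)
    # per-neuron latencies, fixed across images via the fixed seed
    latencies = [rng.uniform(0.0, T) for _ in activations]
    return [(lat, a) for lat, a in zip(latencies, activations)
            if a >= threshold]
```

Because sub-threshold neurons stay silent, weakly activated parts of an image produce no spikes at all, which contributes to the low spike counts reported in Table II.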
Table II shows the test accuracies of our augmented approaches, together with other SNN models, on the MNIST task. First, we make a close comparison with S1C1-SNN [61] and CSNN [56], which use the same encoding schemes as ours but with normal binary spikes. As can be seen in Table II, our augmented frameworks outperform these baselines with higher accuracies, lighter network structures and fewer spikes. This highlights the advantage of our augmented rules for improving current spike-based frameworks. We also compare our approaches with rate-based ones that use firing rates to represent information. Our accuracy still exceeds most of them, though it is slightly below SNN-WS [29] and Weight-Norm [46], which are representative approaches that map the weights of a trained ANN to an SNN. However, the high accuracy of these mapping approaches [29, 46] comes at the cost of a massive number of spikes and a complex network structure. In contrast, our temporal-based frameworks provide efficient and effective alternatives with fewer spikes and fewer neurons.
IV Discussion
Traditional ANNs distribute computations over a spatial dimension, while SNNs extend them with an additional time domain. This spatiotemporal processing capability makes spiking neurons a new generation of neuronal models that are computationally more powerful [11, 10, 18, 34]. One of the main differences between traditional and spiking models is the way information is represented: traditional models use analog values, whereas spiking models use all-or-nothing pulses. Spike-based computation and representation are promising for efficiency and computational capability [9], while non-spike-based models are widely applied to different recognition tasks with relatively high accuracy [3]. In this paper, we introduce the concept of augmented spikes to bridge the gap between spiking and non-spiking agents. In each augmented spike, both the timing and the spike coefficient can be used for computation and information transmission, thus combining the precise analog values of ANNs with the all-or-nothing spikes of SNNs. Our augmented spikes resemble the phenomenon of bursting, which is widely observed in nervous systems [24, 25, 26, 27, 28].
A new spiking neuron model is proposed to process augmented spikes. Notably, our scheme is not limited to a specific type of neuron model, and thus it can easily be applied to other spiking neuron models, such as the Hodgkin-Huxley model [47], the Izhikevich model [48] and hardware-friendly ones [6]. Our results show that the augmented spiking neuron cannot memorize more patterns than a normal one (Fig. 5), but it can exploit the additional information carried by spike coefficients to facilitate learning and processing.
In addition to the neuron model, several new synaptic learning rules are developed to learn from augmented spikes. Our learning rules are effective in training neurons to produce a binary response (firing or not), to fire at desired times, and to elicit a target number of output spikes. Again, the idea of our augmented learning can easily be applied to other spike-based plasticity rules.
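As one possible instantiation, a tempotron-like [34] binary decision rule can be extended to augmented spikes by letting the spike coefficient scale the weight gradient. Everything below (the simple exponential PSP, parameter values, function names) is an illustrative assumption, not the paper's exact rule.

```python
import numpy as np

def _psp(s, tau=20.0):
    """Simple causal exponential PSP (a stand-in kernel)."""
    return float(np.exp(-s / tau)) if s >= 0 else 0.0

def potential(t, spikes, weights):
    """Potential from augmented spikes (afferent, time, coefficient)."""
    return sum(weights[i] * c * _psp(t - ti) for (i, ti, c) in spikes)

def binary_update(spikes, weights, should_fire, threshold=1.0, lr=0.1):
    """One error-driven step: on a miss (or false alarm), change each
    weight by the coefficient-scaled PSP it contributed at the time of
    maximal potential, which is dV(t_max)/dw_i."""
    t_grid = np.arange(0.0, 100.0, 1.0)
    v = np.array([potential(t, spikes, weights) for t in t_grid])
    t_max = float(t_grid[int(np.argmax(v))])
    fired = bool(v.max() >= threshold)
    if fired == should_fire:
        return weights                      # correct trial: no change
    sign = lr if should_fire else -lr
    w = list(weights)
    for (i, ti, c) in spikes:
        w[i] += sign * c * _psp(t_max - ti)
    return w
```

The analogous desired-time and desired-count rules differ mainly in which error signal triggers the update, while the coefficient scaling is shared.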
Remarkably, we successfully apply our augmented framework to two representative practical tasks: acoustic and visual pattern recognition. Spike coefficients are utilized to carry additional information that can be further used to improve performance. These two practical tasks demonstrate the effectiveness of our augmented encoding and learning, and they also highlight the potential merits of our approaches. Beyond these two examples, our augmented spikes could be applied in a wide range of scenarios. In one conceivable example, a person speaks the same sentence with different emotions; our augmented spikes could then represent the information with spike timings carrying the sentence and coefficients reflecting the emotion.
Our augmented spike framework is versatile, and could be generalized to many spike-based computations and plasticity schemes [63, 64, 65, 66, 67]. For example, in a neuromorphic sensory system [64], the spike coefficients of our augmented spikes could easily be packed into the address-event representation (AER) and transmitted to downstream processing units.
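As a sketch of that idea, one augmented event can be serialized next to its address and timestamp, with the coefficient quantized to 8 bits. The 7-byte record layout below is purely hypothetical and not a standard AER format.

```python
import struct

def pack_aer_event(address, timestamp_us, coefficient):
    """Pack (address, timestamp, coefficient) into a hypothetical
    7-byte big-endian record: u16 address, u32 timestamp in microseconds,
    and the coefficient in [0, 1] quantized to 8 bits."""
    q = max(0, min(255, int(round(coefficient * 255))))
    return struct.pack(">HIB", address, timestamp_us, q)

def unpack_aer_event(record):
    """Inverse of pack_aer_event; the coefficient is recovered with
    at most 1/255 quantization error."""
    address, timestamp_us, q = struct.unpack(">HIB", record)
    return address, timestamp_us, q / 255.0
```

Compared with plain address-plus-timestamp events, each record costs one extra byte, which buys the analog payload without changing the event-driven transport.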
V Conclusion
In this paper, we introduced the concept of augmented spikes, which carry complementary information with spike coefficients in addition to spike latencies. We developed a new augmented spiking neuron model to process these advanced spikes, and proposed new synaptic learning rules to train neurons to learn from them. We thus presented a complete framework for augmented spikes, covering coding, processing and learning, and provided systematic insight into the properties and characteristics of the proposed methods. Our developments on practical recognition tasks have shown the potential merits of our methods. Notably, our augmented spike framework is versatile and can easily be generalized to other spike-based systems, thus suggesting a new direction for neuromorphic computing.
Appendix
V-A Momentum
A momentum scheme can accelerate learning [34], and was therefore applied in our study. The actual synaptic update performed is composed of two parts: the current modification, $\Delta w_i$, determined by the corresponding learning rule, and a fraction of the previously applied update, $\Delta w_i^{\mathrm{prev}}$. Therefore, in each error trial, the resulting synaptic update is

$$\Delta w_i \leftarrow \Delta w_i + \mu\,\Delta w_i^{\mathrm{prev}} \tag{12}$$

where $\mu$ is the momentum parameter determining the fraction of the previous update.
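In code, this momentum scheme amounts to keeping the previously applied update between error trials; the names below are our own.

```python
def momentum_update(current, previous, mu=0.9):
    """Combine the rule-driven modification with a fraction mu of the
    previously applied update."""
    return [dw + mu * p for dw, p in zip(current, previous)]

# Usage: carry the applied update across error trials.
applied = [0.0, 0.0]
for rule_step in ([0.1, -0.2], [0.1, -0.2]):
    applied = momentum_update(rule_step, applied)
# second trial: 0.1 + 0.9*0.1 = 0.19 and -0.2 + 0.9*(-0.2) = -0.38
```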
V-B Neuronal Parameters
The neuronal parameters used in this paper are summarized in Table III.
| Index | Parameter values |
| --- | --- |
| Fig. 3 | 20, 5, 0.9 |
| Fig. 5 | 10, 5, 0.99 |
| Fig. 6, 7 | 10, 5, 0.01, 0 |
| Fig. 8 | 20, 5, 0.9 |
| Fig. 9 | 20, 5, 0.9 |
| – | 40, 10, 0 |
| – | 40, 10, 0.7 |
| Table II | 40, 10, 0.9 |
References
 [1] E. R. Kandel, J. H. Schwartz, T. M. Jessell, S. A. Siegelbaum, and A. J. Hudspeth, Principles of Neural Science. McGraw-Hill, New York, NY, USA, 2000, vol. 4.
 [2] F. Rosenblatt, “The perceptron: A probabilistic model for information storage and organization in the brain.” Psychological Review, vol. 65, no. 6, p. 386, 1958.
 [3] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, 2015.
 [4] N. D. Lane, S. Bhattacharya, A. Mathur, P. Georgiev, C. Forlivesi, and F. Kawsar, “Squeezing deep learning into mobile and embedded devices,” IEEE Pervasive Computing, vol. 16, no. 3, pp. 82–88, 2017.
 [5] W. Severa, C. M. Vineyard, R. Dellana, S. J. Verzi, and J. B. Aimone, “Training deep neural networks for binary communication with the whetstone method,” Nature Machine Intelligence, vol. 1, no. 2, p. 86, 2019.
 [6] R. A. Nawrocki, R. M. Voyles, and S. E. Shaheen, “A mini review of neuromorphic architectures and implementations,” IEEE Transactions on Electron Devices, vol. 63, no. 10, pp. 3819–3829, 2016.

 [7] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” in Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
 [8] J. Feldmann, N. Youngblood, C. Wright, H. Bhaskaran, and W. Pernice, “All-optical spiking neurosynaptic networks with self-learning capabilities,” Nature, vol. 569, no. 7755, p. 208, 2019.
 [9] K. Roy, A. Jaiswal, and P. Panda, “Towards spike-based machine intelligence with neuromorphic computing,” Nature, vol. 575, no. 7784, pp. 607–617, 2019.
 [10] J. J. Hopfield, “Pattern recognition computation using action potential timing for stimulus representation,” Nature, vol. 376, no. 6535, pp. 33–36, 1995.
 [11] W. Maass, “Networks of spiking neurons: the third generation of neural network models,” Neural Networks, vol. 10, no. 9, pp. 1659–1671, 1997.
 [12] T. Gollisch and M. Meister, “Rapid neural coding in the retina with relative spike latencies,” Science, vol. 319, no. 5866, pp. 1108–1111, 2008.
 [13] P. Dayan and L. F. Abbott, Theoretical Neuroscience. Cambridge, MA, USA: MIT Press, 2001, vol. 806.
 [14] Q. Yu, H. Tang, J. Hu, and K. C. Tan, Neuromorphic Cognitive Systems: A Learning and Memory Centered Approach, 1st ed. Cham, Switzerland: Springer Int., 2017.
 [15] S. Panzeri, N. Brunel, N. K. Logothetis, and C. Kayser, “Sensory neural codes using multiplexed temporal scales,” Trends in Neurosciences, vol. 33, no. 3, pp. 111–120, 2010.
 [16] W. Gerstner, A. K. Kreiter, H. Markram, and A. V. Herz, “Neural codes: Firing rates and beyond,” Proceedings of the National Academy of Sciences, vol. 94, no. 24, pp. 12 740–12 741, 1997.
 [17] M. London, A. Roth, L. Beeren, M. Häusser, and P. E. Latham, “Sensitivity to perturbations in vivo implies high noise and suggests rate coding in cortex,” Nature, vol. 466, no. 7302, p. 123, 2010.
 [18] R. Kempter, W. Gerstner, and J. L. van Hemmen, “Spike-Based Compared to Rate-Based Hebbian Learning,” in NIPS’98, 1998, pp. 125–131.
 [19] A. Borst and F. E. Theunissen, “Information theory and neural coding.” Nature Neuroscience, vol. 2, no. 11, pp. 947–957, Nov. 1999.
 [20] D. A. Butts, C. Weng, J. Jin, C.-I. Yeh, N. A. Lesica, J.-M. Alonso, and G. B. Stanley, “Temporal precision in the neural code and the timescales of natural vision,” Nature, vol. 449, no. 7158, p. 92, 2007.
 [21] N. Masuda and K. Aihara, “Bridging rate coding and temporal spike coding by effect of noise,” Physical Review Letters, vol. 88, no. 24, p. 248101, 2002.
 [22] R. Gütig, “To spike, or when to spike?” Current Opinion in Neurobiology, vol. 25, pp. 134–139, 2014.
 [23] Q. Yu, H. Li, and K. C. Tan, “Spike timing or rate? neurons learn to make decisions for both through threshold-driven plasticity,” IEEE Transactions on Cybernetics, vol. 49, no. 6, pp. 2178–2189, 2018.
 [24] C. Beurrier, P. Congar, B. Bioulac, and C. Hammond, “Subthalamic nucleus neurons switch from single-spike activity to burst-firing mode,” Journal of Neuroscience, vol. 19, no. 2, pp. 599–609, 1999.
 [25] F. Zeldenrust, P. Chameau, and W. J. Wadman, “Spike and burst coding in thalamocortical relay cells,” PLoS computational biology, vol. 14, no. 2, p. e1005960, 2018.
 [26] R. Naud and H. Sprekeler, “Sparse bursts optimize information transmission in a multiplexed neural code,” Proceedings of the National Academy of Sciences, vol. 115, no. 27, pp. E6329–E6338, 2018.
 [27] J. Simonnet and M. Brecht, “Burst firing and spatial coding in subicular principal cells,” Journal of Neuroscience, vol. 39, no. 19, pp. 3651–3662, 2019.
 [28] S. S. Divakaruni, A. M. Van Dyke, R. Chandra, T. A. LeGates, M. Contreras, P. A. Dharmasri, H. N. Higgs, M. K. Lobo, S. M. Thompson, and T. A. Blanpied, “Long-term potentiation requires a rapid burst of dendritic mitochondrial fission during induction,” Neuron, vol. 100, no. 4, pp. 860–875, 2018.
 [29] J. Kim, H. Kim, S. Huh, J. Lee, and K. Choi, “Deep neural networks with weighted spikes,” Neurocomputing, vol. 311, pp. 373–386, 2018.
 [30] R. Chen, H. Ma, S. Xie, P. Guo, P. Li, and D. Wang, “Fast and efficient deep sparse multi-strength spiking neural networks with dynamic pruning,” in 2018 International Joint Conference on Neural Networks (IJCNN). IEEE, 2018, pp. 1–8.
 [31] G.-q. Bi and M.-m. Poo, “Synaptic modifications in cultured hippocampal neurons: dependence on spike timing, synaptic strength, and postsynaptic cell type,” Journal of Neuroscience, vol. 18, no. 24, pp. 10 464–10 472, 1998.
 [32] N. Caporale and Y. Dan, “Spike timing–dependent plasticity: a hebbian learning rule,” Annu. Rev. Neurosci., vol. 31, pp. 25–46, 2008.

 [33] T. Masquelier and S. J. Thorpe, “Unsupervised learning of visual features through spike timing dependent plasticity,” PLoS Computational Biology, vol. 3, no. 2, p. e31, 2007.
 [34] R. Gütig and H. Sompolinsky, “The tempotron: a neuron that learns spike timing-based decisions,” Nature Neuroscience, vol. 9, no. 3, pp. 420–428, Feb. 2006.

 [35] S. M. Bohte, J. N. Kok, and J. A. L. Poutré, “Error-backpropagation in temporally encoded networks of spiking neurons,” Neurocomputing, vol. 48, no. 1–4, pp. 17–37, 2002.
 [36] R.-M. Memmesheimer, R. Rubin, B. P. Ölveczky, and H. Sompolinsky, “Learning precisely timed spikes,” Neuron, vol. 82, no. 4, pp. 925–938, 2014.
 [37] J. J. Wade, L. J. McDaid, J. A. Santos, and H. M. Sayers, “Swat: a spiking neural network training algorithm for classification problems,” IEEE Transactions on Neural Networks, vol. 21, no. 11, pp. 1817–1830, 2010.
 [38] R. Gütig, “Spiking neurons can discover predictive features by aggregatelabel learning,” Science, vol. 351, no. 6277, p. aab4113, 2016.
 [39] Q. Yu, H. Tang, K. C. Tan, and H. Li, “Rapid feedforward computation by temporal encoding and learning with spiking neurons,” IEEE Transactions on Neural Networks and Learning Systems, vol. 24, no. 10, pp. 1539–1552, 2013.

 [40] F. Ponulak and A. J. Kasinski, “Supervised Learning in Spiking Neural Networks with ReSuMe: Sequence Learning, Classification, and Spike Shifting,” Neural Computation, vol. 22, no. 2, pp. 467–510, 2010.
 [41] Q. Yu, H. Tang, K. C. Tan, and H. Li, “Precise-spike-driven synaptic plasticity: Learning hetero-association of spatiotemporal spike patterns,” PLoS One, vol. 8, no. 11, p. e78318, 2013.
 [42] A. Taherkhani, A. Belatreche, Y. Li, and L. P. Maguire, “A supervised learning algorithm for learning precise timing of multiple spikes in multilayer spiking neural networks,” IEEE transactions on neural networks and learning systems, vol. 29, no. 11, pp. 5394–5407, 2018.
 [43] Q. Yu, R. Yan, H. Tang, K. C. Tan, and H. Li, “A spiking neural network system for robust sequence recognition,” IEEE Transactions on Neural Networks and Learning Systems, vol. 27, no. 3, pp. 621–635, 2016.
 [44] Y. Zheng, S. Li, R. Yan, H. Tang, and K. C. Tan, “Sparse temporal encoding of visual features for robust object recognition by spiking neurons,” IEEE transactions on neural networks and learning systems, vol. 29, no. 12, pp. 5823–5833, 2018.

 [45] Y. Cao, Y. Chen, and D. Khosla, “Spiking deep convolutional neural networks for energy-efficient object recognition,” International Journal of Computer Vision, vol. 113, no. 1, pp. 54–66, 2015.
 [46] P. U. Diehl, D. Neil, J. Binas, M. Cook, S.-C. Liu, and M. Pfeiffer, “Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing,” in 2015 International Joint Conference on Neural Networks (IJCNN). IEEE, 2015, pp. 1–8.
 [47] A. L. Hodgkin and A. F. Huxley, “A quantitative description of membrane current and its application to conduction and excitation in nerve,” The Journal of physiology, vol. 117, no. 4, pp. 500–544, 1952.
 [48] E. M. Izhikevich, “Simple model of spiking neurons,” IEEE Transactions on neural networks, vol. 14, no. 6, pp. 1569–1572, 2003.
 [49] W. Gerstner and W. M. Kistler, Spiking neuron models: Single neurons, populations, plasticity. Cambridge university press, 2002.
 [50] A. N. Burkitt, “A review of the integrateandfire neuron model: I. homogeneous synaptic input,” Biological cybernetics, vol. 95, no. 1, pp. 1–19, 2006.
 [51] J. Dennis, Q. Yu, H. Tang, H. D. Tran, and H. Li, “Temporal coding of local spectrogram features for robust sound recognition,” in Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IEEE, 2013, pp. 803–807.
 [52] R. V. Florian, “The Chronotron: A Neuron that Learns to Fire Temporally Precise Spike Patterns,” PLoS One, vol. 7, no. 8, p. e40233, 2012.
 [53] Q. Yu, Y. Yao, L. Wang, H. Tang, J. Dang, and K. C. Tan, “Robust environmental sound recognition with sparse keypoint encoding and efficient multispike learning,” arXiv preprint arXiv:1902.01094, 2019.
 [54] J. Wu, Y. Chua, M. Zhang, H. Li, and K. C. Tan, “A spiking neural network framework for robust sound classification,” Frontiers in neuroscience, vol. 12, 2018.
 [55] R. Xiao, H. Tang, P. Gu, and X. Xu, “Spike-based encoding and learning of spectrum features for robust sound recognition,” Neurocomputing, vol. 313, pp. 65–73, 2018.
 [56] Q. Xu, Y. Qi, H. Yu, J. Shen, H. Tang, and G. Pan, “CSNN: An augmented spiking based framework with perceptron-inception,” in IJCAI, 2018, pp. 1646–1652.
 [57] S. Hussain, S.-C. Liu, and A. Basu, “Improved margin multi-class classification using dendritic neurons with morphological learning,” in 2014 IEEE International Symposium on Circuits and Systems (ISCAS). IEEE, 2014, pp. 2640–2643.

 [58] P. O’Connor, D. Neil, S.-C. Liu, T. Delbruck, and M. Pfeiffer, “Real-time classification and sensor fusion with a spiking deep belief network,” Frontiers in Neuroscience, vol. 7, p. 178, 2013.
 [59] P. Merolla, J. Arthur, F. Akopyan, N. Imam, R. Manohar, and D. S. Modha, “A digital neurosynaptic core using embedded crossbar memory with 45pJ per spike in 45nm,” in 2011 IEEE Custom Integrated Circuits Conference (CICC). IEEE, 2011, pp. 1–4.
 [60] P. U. Diehl and M. Cook, “Unsupervised learning of digit recognition using spike-timing-dependent plasticity,” Frontiers in Computational Neuroscience, vol. 9, p. 99, 2015.
 [61] Q. Yu, H. Tang, K. C. Tan, and H. Li, “Rapid feedforward computation by temporal encoding and learning with spiking neurons,” IEEE transactions on neural networks and learning systems, vol. 24, no. 10, pp. 1539–1552, 2013.
 [62] M. Riesenhuber and T. Poggio, “Hierarchical models of object recognition in cortex,” Nature neuroscience, vol. 2, no. 11, p. 1019, 1999.
 [63] T. Wu, A. Păun, Z. Zhang, and L. Pan, “Spiking neural p systems with polarizations,” IEEE transactions on neural networks and learning systems, vol. 29, no. 8, pp. 3349–3360, 2017.
 [64] S.-C. Liu and T. Delbruck, “Neuromorphic sensory systems,” Current Opinion in Neurobiology, vol. 20, no. 3, pp. 288–295, 2010.
 [65] B. V. Benjamin, P. Gao, E. McQuinn, S. Choudhary, A. R. Chandrasekaran, J.-M. Bussat, R. Alvarez-Icaza, J. V. Arthur, P. A. Merolla, and K. Boahen, “Neurogrid: A mixed-analog-digital multichip system for large-scale neural simulations,” Proceedings of the IEEE, vol. 102, no. 5, pp. 699–716, 2014.
 [66] P. A. Merolla, J. V. Arthur, R. Alvarez-Icaza, A. S. Cassidy, J. Sawada, F. Akopyan, B. L. Jackson, N. Imam, C. Guo, Y. Nakamura et al., “A million spiking-neuron integrated circuit with a scalable communication network and interface,” Science, vol. 345, no. 6197, pp. 668–673, 2014.
 [67] Y. Dan and M.-m. Poo, “Spike timing-dependent plasticity of neural circuits,” Neuron, vol. 44, no. 1, pp. 23–30, 2004.