Understanding the microscopic brain activity (mechanisms) in action and in context is crucial for learning and control of the neuron behavior. Toward this end, the main purpose of studying neuron activities is to identify neuron history process and their inter-dependence in the ensemble neural system. The technique of multiple electrodes makes it possible to record and study the spiking activity of each neuron in a large ensemble system simultaneously [1, 2, 3]. At first, the study was mainly focusing on single neuron behavior and the bivariate relationship of neuron pairs or triplets, while ignoring the possible ensemble neuron effect [4, 5, 6]. Later, the multivariate auto-regressive framework was introduced with more complex neuronal connections. Multiple covariates affect the spiking activity of each neuron in the system . The most common covariates are the intrinsic, extrinsic and other parameters related to inputs and the environmental stimuli. Previous attempts has been made to analyze the impact of spiking history and the interaction with other neurons [8, 6, 7].
From the mathematical perspective, the neuron activity and their spike trains are stochastic in nature and their statistics time-varying or non-stationary. Traditionally, the neuron activity is modeled by a discrete time series event of point processes [9, 2, 5]. The likelihood method for multivariate point process is a central tool of statistical learning to model neuron spiking behaviors. The likelihood function is a variate of the parameters for the point process. These parameters are estimated from the experimental data with statistical tools 
. Numerical methods of linear or non-linear regression including gradient ascending, iterative reweighed least square and fixed point iteration are adopted in previous likelihood estimation study[10, 6, 8, 11].
The inclusion of unknowns in the context of time-varying complex networks have shown promising results in the fractional dynamical models  to represent spatio-temporal physiological signals, and making predictions for motor-imagery tasks 
. The adoption of unknown sources is also applicable to the neural spiking system. Prior work on the neural activity modeling assume that all neurons in the system can be monitored and their activities are available for mathematical modeling. The excitatory and inhibitory behavior are among the detected neurons in the system. However, in reality, the sensors and detectors can only access the neurons at the surface of monitored region. Neurons underneath the external layer and the environment stimuli, which are referred as the unknown inputs to the brain system, can also influence the neural system and contribute to the brain activity in action and in context. Related to unknown artifacts of the neuron system, a neuron model with common input, modeling the spiking trains as a point process with hidden term of a Gaussian-Markov dynamics and implementing Kalman filters has been described in. In order to model the neural activity and reconstruct the latent dynamics of the network of neurons,  describes a data-driven linear Gaussian dynamics modeling framework that account for the hidden inputs of the neural system. Although the above-mentioned prior work considered the importance of hidden inputs onto the neural system, they rely on specific assumptions the mathematical features of the unknowns. In this paper, we purpose a neural system model with more general unknown inputs, and at the same time show that the proposed methods are computationally efficient and suitable for big size of datasets.
Figure 1 illustrates a cyber-physical systems approach to sensing, modeling, analysis and control of neural activity [3, 5, 16, 17]. Multiple electrodes record the neuron system activities from some area of the brain, e.g., motor cortex, somatosensory cortex etc. The process of spike sorting assigns spikes to specific neuron in the system. Using generalized linear model framework with likelihood analysis, we analyze neuron spike trains after the spike sorting, and model the neuron system with unknown unknowns. In this paper, we propose a neuron system model with the inclusion of general unknown artifacts. The neuron spiking process has simultaneous effects from: (i) its own spiking history, (ii) the activities of other ensemble neurons, and (iii) the unknown sources. We develop computationally efficient fixed-point iteration method for the multivariate model to estimate the parameters and the unknowns which is suitable for big data size. The real-world datasets usually has enough redundancies which could be exploited to fill the gaps using unknowns. We refer to this phenomena as the extra degrees-of-freedom offered by the data. At first thought, the ideology of incorporating unknowns, intuitively, seems always necessary to every dataset. However, sometimes the data does not have enough degrees-of-freedom. In such cases, a good modeling technique should be able to differentiate the need for unknowns. We have explored this phenomena in the context of real-world dataset as explained in Section V-B.
The paper is organized as follows. In section II, we introduce the neuron spiking model with the unknown artifacts assumption, and the problem definition considered in this work. Next, in Section III we provide a maximal likelihood estimation algorithm for estimating the systems models parameters. In Section IV, we generate artificial neuron spike trains and apply the proposed methods to analyze the simulated data. In Section V, we implement the algorithm on variety of real-world neuron spiking data. The associated challenges and interesting observations are discussed. We conclude in Section VI and the proofs are presented in the Appendix.
Ii Problem Formulation
In this section, we first describe our point process model of neuron system in discrete time with inclusion of unknown artifacts. The monitored neuron behavior is modeled with having influence of (i) its own spiking history, (ii) other neurons activities, (iii) and the unknown unknowns. We formally describe the problem statement addressed in this work in the following subsections.
Ii-a Neuron Point Process
We consider a multi-neuron network with a total number of neurons, and assuming that their spiking activities are monitored simultaneously during an observation interval . The spiking interval length is usually small (in the orders of milliseconds) and is the total number of observations. The neuron spiking activities are modeled as point process. Let denote the spike counting process of the -th neuron, , during the time interval . Also, a more useful quantity called incremental spiking count is the number of spike count fired by the neuron in time interval , and is sample path of the entire monitored multi-neuron activity.
In the similar fashion, let represent the unknown artifact activity in time interval , where is the index of the unknown, , and is the total number of unknowns. In what follows, we always assume that . Moreover, we can also define as the activity path of the unknowns during the observed time horizon .
The probability density function (PDF) of any-th with sample path can be expressed with respect to the conditional intensity function (CIF) , where is the spiking history till time-interval and are the associated parameters. The CIF fully characterizes the spike trains for some neuron [9, 10, 2]. The CIF can be mathematically defined as the spiking probability on time interval
where all other conditional covariates contribute to the neuron activity. In our model, the spiking history with , and is the parameter tuple to measure this process.
The joint conditional probability density function for the entire multi-neuron point process model can now be expressed with the CIF . We assume that with the given spiking history , the activities of all neurons at -th time-interval are independent, or they are conditionally independent with respect to the past. In other words, the correlation between neuronal activities appears only through the history. The joint probability of the spike train can be written as
We use generalized linear model (GLM) framework to describe the CIF along with exponential rectification. In this model, the CIF of any neuron at time is linear function by four covariates namely: the spontaneous, the intrinsic, the extrinsic, and the unknown covariate. The CIF is formally written as
where is the spontaneously spiking rate of the neuron , are the intrinsic parameters corresponding to neuron’s own spiking history with a memory length of . The extrinsic parameters relates the effect of neurons in system towards neuron with a memory length . The unknown parameters describe the influence of unknown artifacts with the memory length of . The complete parameter tuple can now be formally expressed as with appropriate indices of and .
Ii-B Model Estimation with Unknown Unknowns
The statistical modeling of neuron spikes is restricted by the assumption that the neuron network is completely known. But even on adding edges between neurons in the assumed network can improve the likelihood only up to some extent. At this stage, it has to be realized that closed network assumption may not be valid always and inclusion of effects of the unknown sources is necessary. The unknown source is a generic term and can include the effects of un-probed neuronal nodes, and experimental bias (e.g. instrument error, filtering process bias, physical pressure and temperature etc.), i.e. why, they are in the most general form referred as unknown unknowns.
To overcome the above-mentioned challenges, we propose a data-driven framework which starts from the available restricted data (i.e., few time series consisting of spiking trains of neurons) and constructs a time-varying complex network model of the neural activity that is subject to the unknown stimuli. The generic use of the term unknown unknowns is used to target all contributions that makes the data deviate from the undertaken model property.
The unknown sources are assumed to be independent across time and space, and we also assume that neurons have retainment property i.e. the influence of unknown sources are not instantaneous but have some memory for each neuron. In what follows, we assume that unknown parameters are known for each neuron and are non-negative, and the problem statement considered in this work is formally stated as.
Problem: Given Spiking train data and intrinsic, extrinsic memory length and , and unknown parameters , Estimate the system models parameters and the unknown activities .
The unknown parameters are constrained to be non-negative, and we will see in Section III that this does not restrict our modeling abilities of inhibition and excitation. The incorporation of unknowns in the model is also contingent on the assumption that the data has sufficient degrees-of-freedom to support unknowns, in other words there are gaps that can be filled by the unknowns. In most of the real-world cases this is true, however, interestingly we have observed that in certain cases this may not happen, as illustrated in Section V-B.
Iii Statistical Spiking Model Estimation
With the spiking data and some initial knowledge of the unknown parameters , the goal of the estimation procedure as described in this section is to perform two tasks, first (i) estimate the system models parameters , and simultaneously (ii) estimate the unknown activities
. To perform these two tasks, we propose to use the Expectation-Maximization (EM) formulation[18, 19]. The proposed algorithm like EM is split into two parts. First, it estimates the unknown activities having some previous knowledge of the system models parameters. In the next step, the algorithm uses this estimated unknown activity values to update the system models parameters . These steps are repeated until convergence. The goal of the algorithm is to maximize the likelihood, and the proposed procedure being iterative will provide a maximal likelihood solution. The log likelihood associated with the current objective can be written as
where . It can be readily realized that the first part is a function of unknown activities , while the second part is function of system models parameters . Hence, the ‘E’ and ‘M’ like step will alternatively update these two parts of CIF, respectively, to maximize the log likelihood in equation (6) iteratively. The CIF partition will be used for the rest of the paper in its most useful form as follows.
where . The update of unknown activities, or also , is performed using the following result.
The reason for restricting the values of unknown parameters to be non-negative in Section II-B can now be realized more concretely from equation (4) and (5). It can be seen that the denominators of terms in both (4) and (5) can go negative (depending on the data) and hence the fixed-point iterations would possibly become intractable. However, as already mentioned, this does not restrict our ability to model the inhibition and excitation effects, because now it can decided through the sign of cumulative estimated unknown activities.
For the next step, we wish to update the other part of CIF written in (8) as . Again, we express the
in its most useful form for the rest of the paper by defining the following vectors.
where and are vectors, . The can now be written as
Theorem 2 ().
The exponential of system models parameters as defined in (10) are updated using fixed-point iterations at iteration index as follows.
where is the maximum likelihood estimate of .
The denominator of is a variant of maximum likelihood (ML) estimate of which is problematic as it is not available at the time of iterations. However, an approximation with counting argument is provided in [10, 20] which works well for the estimation problems.
It should be noted that in each fixed-point iterations, both in Theorem 1 and Theorem 2, the value of exponent and is constant with respect to iteration index . Similarly, the numerators of the fixed-point functions are constant. Hence, they need to be computed only once per EM update and lot of computations can be saved. It is also important to note that another big advantage offered by proposed fixed-point iterations is that they are independent across time index and unknown index , therefore they can be implemented in parallel using current multi-threading/multi-processing architectures. This make the computations very fast especially when we have large size of the data. On the other hand, the existing Kalman smoothing techniques  have dependence across time and has to be computed serially. The fixed-point iterations and hence EM iterations are in the closed-form, and they are computationally efficient. The convergence is fast, as we will see in Section IV and Section V. The choice of scalar in equation (5) can play an important role in convergence rate and hence can be taken as an input parameter as well.
Iv Simulation Experiment
We now demonstrate the applicability of the proposed neuron spiking model estimation technique with unknown unknowns. The estimation process, as detailed in the Algorithm 1, is applied to an artificially generated spiking data which is explained in the following parts.
Iv-a Artificial Neuron Network
An artificial neuron network in Figure 2 is designed with a total of six neurons. Each neuron is assumed to be influenced by its intrinsic activity, extrinsic effects via other neurons in the network, and unknown artifacts. The contribution of unknown sources is quantized by having two unknown nodes and . The extrinsic effect is modeled by having directed edged among neurons as shown in Figure 2. For example, neuron excites , and the excitation effect is indicated using , also, neuron inhibits , and the inhibition effect is marked through sign. The contribution of unknown nodes is indicated by the corresponding directed edges from to in the network.
The parameters required for defining the intensity function of the spiking point-process as in equation (3) are selected as follows: The values of spontaneous firing rate for six neurons are selected as, . The intrinsic effect memory length is selected as for all neurons. The previous spiking history of each neuron can have excitatory as well as inhibitory effects on the future spiking activity. Also, with the increasing length of the history, the effects get mitigated. The initial values of the intrinsic parameters are negative to model the refractory period of the neurons after firing a spike [8, 21]. To collectively model these effects, we model the intrinsic parameters using sinc function. The intrinsic parameter values are selected as shown in Figure 2(a)-2(c). Next, the extrinsic effect as indicated via directed edges in Figure 2 is quantized though parameters for each . The extrinsic memory length is fixed as for each directed edge. We have used exponential decay functions to model the parameters and the values are shown in Figure 2(d)-2(h). Next, the unknown sources contribution memory length is fixed as for both sources and . The parameters associated with unknown excitations are always taken to be positive, as explained in Section II-B, and the undertaken values are shown in Figure 4. Finally, with these parameter values we can now proceed to the spike generation process.
Iv-B Spike Generation with Unknown contributions
The multi-neuron spike generation can be performed by recursively compute the conditional intensity parameter, or firing rate, and adding the contributions of the unknown sources. As mentioned in Appendix, we have used log-Gamma distribution for, and independent and identically distributed (iid) samples of size are generated using the PDF in equation (17). The parameter values are taken to be and , and the samples are then mean centered to zero. The spike train generation procedure is similar to  and for the sake of completeness, we briefly write the steps as follows:
Generate a uniform random number and if , then there is spike for -th neuron at time interval
Repeat with recursively computing conditional intensity function from equation (3), until desired length of spike train.
where the value of is taken to be , and for the -th step, the time-interval in consideration is . For simulations, we have generated a total of multi-neuron spikes. The spike train obtained using above procedure for all six neurons in the first time-intervals is as shown in Figure 5. We now apply the proposed technique of model estimation with unknown unknowns in the next section.
Iv-C Model estimation
The artificially generated spike train in Figure 5 with the parameters for unknown sources as shown in Figure 4 is used as an input for the Algorithm 1. The algorithm recursively computes the unknown contribution and then update the system models parameters. The iterations are fixed-point based and the convergence is fast. The log likelihood plot (with constants ignored) is shown in Figure 6. As expected, there is sharp increase in the likelihood in the first few iterations, due to incorporation of unknowns, and it is observed that few EM iterations are sufficient for convergence.
The proposed approach shows good results on the simulated data, but there are other challenges that need to be addressed especially when the data is coming from real-world experiments. We will address some of them by considering a variety of real-world datasets in the Section V.
V Real-World Data: Case Study
In this section, we explore various neuron spiking data recorded through real-world experiment. Each dataset poses some challenges as compared to the simulated dataset considered in the Section IV. We describe the datasets and discuss the results in the following sections.
V-a Mouse Somatosensory Spiking Data
The mouse somatosensory spiking dataset is recorded using a 512-channel multi-electrode array system in an organotypic cultures of dorsal hippocampus and somatosensory cortex sections [22, 23, 24, 25, 26, 27]. The cultures were not stimulated and the dataset represent spontaneous activity. The spikes were sorted using Principle Component Analysis (PCA). The dataset is downloaded from , where a total of datasets are available, and hundreds of neurons were probed to have the spiking data. In this work, we have taken a total of neurons to study the inter-neuronal interactions with unknown unknowns. The spiking activity of the considered neurons is as shown in Figure 7.
The inter-spiking interval is having a large value for spikes at some times, and it is very small for others. Therefore, while modeling such datasets it is possible that the assumed model for CIF in (3) may not apply in its original form. The proposed technique like any other data-driven estimators can work only if there are samples corresponding to the associated parameters. For example, when the data is too sparse in some time-intervals then the denominator term in equation (15)
will go to zero. It is an indication that there is no data corresponding to the - term in (10) and hence it cannot be recovered. A simple modification would be to increase the spiking interval length and take as accumulated counts in that enlarged interval. The proposed Algorithm 1 is then applied to the spike train and the log likelihood (with constants ignored) is shown in Figure 8. We observe that the convergence is fast and the likelihood increase sharply in the beginning.
V-B Mouse Retina Spiking Data
The mouse Retina dataset contains neuronal responses of retinal ganglion cells to various visual stimuli recorded in the isolated retina from lab mice (Mus Musculus) using a 61-electrode array. The visual stimuli were displayed on a gamma-corrected cathode-ray tube monitor and projected on the photoreceptor layer of the retina (magnification /pixel; luminance range 0.5-3.8 mW/m) from above through a custom-made lens system . Extracellular action potentials were recorded (sampling rate kHz) and single units were identified by a semi-automated spike-sorting algorithm. The dataset is downloaded from  which comprises retinal preparations. The spike trains for a total of seven neurons is shown in Figure 9.
The ideology of having unknown sources is better in the sense that it provides flexibility to fill the gaps in the real-world data and the assumed model by exploiting the extra degrees-of-freedom that the data might have. However, we have observed in this dataset that this may not be true always, and sometimes unknown contributions are not necessary. In particular to our model, this can be realized when the denominator of equation (5)
goes to zero. This indicates that there is not enough degrees-of-freedom in the data (and given values of unknown parameters ) to estimate the . In such cases, we can set the unknown activities to zero for the corresponding indices, or in other words assume that there is no requirement of unknowns at -th time step for -th unknown source. The log likelihood (with constants ignored) after this adjustment on applying Algorithm 1 is shown in Figure 10. We observe that the likelihood increases quickly and convergence is fast.
V-C Cat Retina Spiking Data
The Cat Retina dataset records spontaneous activities of the mesencephalic reticular formation (MRF) neurons of head-restrained cat to investigate their dynamic properties during sleep and waking [31, 32]
. The behavioral states were classified into one of the three following states which continued for a relatively long period (several hundred seconds):slowwave sleep (SWS); paradoxical sleep (PS); and highly attentive states (BW) with sustained attention to a pair of small birds in a cage. We have taken the spike train data of PS in this work. The spike train for a total of five probed neurons is shown in Figure 11.
The Cat Retina dataset also has long and short inter-spiking intervals, and hence similar to the analysis of Mouse somatosensory dataset in Section V-A, we enlarge the spiking window to care of this effect. The proposed methods are applied to estimate the model parameters and the estimation procedure of Algorithm 1 converges fast. The output log likelihood (with constants ignored) is shown in Figure 12 where we observed that the initial jump of likelihood is sharp and it gets saturate very soon.
We have presented a multi-variate neuron system framework with the inclusion of unknown inputs. The single neuron activity is affected by its own spiking history, the influence from other neurons, and the unknown stimuli. The statistical framework is built on the non-stationary likelihood of the multi-channel point process, which is characterized by the conditional intensity function with a generalized linear combination of the covariates. The proposed method aims at increasing the likelihood of the data, and the inclusion of unknown unknowns to fill the discrepancies between the data and the model. The developed algorithm is based on fixed-point iterations and converges quickly. The proposed fixed-point method offers advantages of independence across time and hence can be computed in parallel using modern multi-core processors. Experiments on simulated spike trains and on the real-world datasets of mouse somatosensory, retinal and cat retina show promising results in terms of likelihood maximization. We have observed interesting insights into degrees-of-freedom offered by the data which sometimes suggest not to use unknown unknowns.
The proposed mathematical framework is general for non-stationary discrete time-series in the form of spike trains with missing data (or gaps) and the contribution of unknown sources. The developed techniques are computationally efficient especially when the data is recorded over long time-horizon. The future research will focus on exploring applications in the domain of pattern identification in social networks of opinion dynamics, smart cities monitoring for chemical and threat detection etc. A critical challenge with such datasets is the huge data size, and the computational advantages offered by the proposed techniques can make the statistical inference tractable.
The authors are thankful to Prof. Satish Iyengar, University of Pittsburgh, and Prof. Mitsuaki Yamamoto, Tohoku Social Welfare University, for providing the Cat Retina neuron spiking dataset.
-  M. A. Wilson and B. L. McNaughton, “Dynamics of the hippocampal ensemble code for space,” Science, vol. 261, no. 5124, pp. 1055–1058, 1993.
-  E. N. Brown, “Theory of point processes for neural systems,” in Les Houches. Elsevier, 2005, vol. 80, pp. 691–727.
-  M. S. Lewicki, “A review of methods for spike sorting: the detection and classification of neural action potentials,” Network: Computation in Neural Systems, vol. 9, no. 4, pp. R53–R78, 1998.
M. Krumin and S. Shoham, “Multivariate autoregressive modeling and granger causality analysis of multiple spike trains,”Computational intelligence and neuroscience, vol. 2010, p. 10, 2010.
-  E. N. Brown, R. E. Kass, and P. P. Mitra, “Multiple neural spike train data analysis: state-of-the-art and future challenges,” Nature neuroscience, vol. 7, no. 5, p. 456, 2004.
-  M. Okatan, M. A. Wilson, and E. N. Brown, “Analyzing functional connectivity using a network likelihood model of ensemble neural spiking activity,” Neural computation, vol. 17, no. 9, pp. 1927–1961, 2005.
-  S. Kim, D. Putrino, S. Ghosh, and E. N. Brown, “A granger causality measure for point process models of ensemble neural spiking activity,” PLoS computational biology, vol. 7, no. 3, p. e1001110, 2011.
-  W. Truccolo, U. T. Eden et al., “A point process framework for relating neural spiking activity to spiking history, neural ensemble, and extrinsic covariate effects,” Journal of neurophysiology, vol. 93, no. 2, pp. 1074–1089, 2005.
-  A. Karr, Point processes and their statistical inference. Routledge, 2017.
-  E. Chornoboy, L. Schramm, and A. Karr, “Maximum likelihood identification of neural point process systems,” Biological cybernetics, vol. 59, no. 4-5, pp. 265–275, 1988.
-  G. Borisyuk, R. Borisyuk et al., “A new statistical method for identifying interconnections between neuronal network elements,” Biological Cybernetics, vol. 52, no. 5, pp. 301–306, 1985.
-  G. Gupta, S. Pequito, and P. Bogdan, “Dealing with unknown unknowns: Identification and selection of minimal sensing for fractional dynamics with unknown inputs,” in American Control Conference, 2018, arXiv:1803.04866.
-  ——, “Re-thinking EEG-based non-invasive brain interfaces: Modeling and analysis,” in Proceedings of the 9th ACM/IEEE International Conference on Cyber-Physical Systems, ser. ICCPS ’18. IEEE Press, 2018, pp. 275–286.
-  J. E. Kulkarni and L. Paninski, “Common-input models for multiple neural spike-train data,” Network: Computation in Neural Systems, vol. 18, no. 4, pp. 375–407, 2007.
-  J. H. Macke, L. Buesing et al., “Empirical models of spiking in neural populations,” in Advances in neural information processing systems, 2011, pp. 1350–1358.
-  G. Buzsáki, “Large-scale recording of neuronal ensembles,” Nature neuroscience, vol. 7, no. 5, p. 446, 2004.
-  R. Q. Quiroga, Z. Nadasdy, and Y. Ben-Shaul, “Unsupervised spike detection and sorting with wavelets and superparamagnetic clustering,” Neural computation, vol. 16, no. 8, pp. 1661–1687, 2004.
-  G. McLachlan and T. Krishnan, The EM Algorithm and Extensions. John Wiley & Sons, New York, 1996.
-  A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum likelihood from incomplete data via the em algorithm,” Journal of the royal statistical society. Series B (methodological), pp. 1–38, 1977.
-  E. Chornoboy, “Maximum likelihood techniques for the identification of neural point processes,” 1986, Ph.D. Thesis. The Johns Hopkins School of Medicine, Baltimore, Md.
-  M. J. Berry II and M. Meister, “Refractoriness and neural precision,” in Advances in Neural Information Processing Systems, 1998, pp. 110–116.
-  S. Ito, F.-C. Yeh et al., “Large-scale, high-resolution multielectrode-array recording depicts functional network differences of cortical and hippocampal cultures,” PLOS ONE, vol. 9, no. 8, pp. 1–16, 08 2014.
-  A. M. Litke, N. Bezayiff et al., “What does the eye tell the brain?: Development of a system for the large-scale recording of retinal output activity,” IEEE Transactions on Nuclear Science, vol. 51, no. 4, pp. 1434–1440, Aug 2004.
-  N. Timme, S. Ito et al., “Multiplex networks of cortical and hippocampal neurons revealed at different timescales,” PLOS ONE, vol. 9, no. 12, pp. 1–43, 12 2014.
-  S. Nigam, M. Shimono et al., “Rich-club organization in effective connectivity among cortical neurons,” Journal of Neuroscience, vol. 36, no. 3, pp. 670–684, 2016.
-  M. Shimono and J. M. Beggs, “Functional clusters, hubs, and communities in the cortical microconnectome,” Cerebral Cortex, vol. 25, no. 10, pp. 3743–3757, 2015.
-  N. M. Timme, S. Ito et al., “High-degree neurons feed cortical computations,” PLOS Computational Biology, vol. 12, no. 5, pp. 1–31, 05 2016.
-  S. Ito, F.-C. Yeh et al., “Spontaneous spiking activity of hundreds of neurons in mouse somatosensory cortex slice cultures recorded using a dense 512 electrode array,” 2016, CRCNS.org. [Online]. Available: http://dx.doi.org/10.6080/K07D2S2F
-  J. L. Lefebvre, Y. Zhang et al., “Gamma protocadherins regulate neuronal survival but are dispensable for circuit formation in retina,” Development, vol. 135, no. 24, pp. 4141–4151, 2008.
-  Y.-F. Zhang, H. Asari, and M. Meister, “Multi-electrode recordings from retinal ganglion cells,” 2014, CRCNS.org. [Online]. Available: http://dx.doi.org/10.6080/K0RF5RZT
-  J. Inoue, S. Sato, and L. M. Ricciardi, “On the parameter estimation for diffusion models of single neuron’s activities,” Biological Cybernetics, vol. 73, no. 3, pp. 209–221, Aug 1995.
-  M. Yamamoto, H. Nakahama et al., “Markov-dependency and spectral analyses on spike-counts in mesencephalic reticular neurons during sleep and attentive states,” Brain Research, vol. 366, no. 1, pp. 279 – 289, 1986.
-  K. P. Murphy, Machine Learning: A Probabilistic Perspective. The MIT Press, 2012.
-  H. Raiffa, “Applied statistical decision theory,” 1974.
-  H. Demirhan and C. Hamurkaroglu, “On a multivariate log-gamma distribution and the use of the distribution in the bayesian analysis,” Journal of Statistical Planning and Inference, vol. 141, no. 3, pp. 1141–1152, 2011.
-  C. Peters and W. A. Coberly, “The numerical evaluation of the maximum-likelihood estimate of mixture proportions,” Communications in Statistics-Theory and Methods, vol. 5, no. 12, pp. 1127–1135, 1976.
-  D. R. H. P. Pattern, Classification and Scene Analysis. John Wiley and Sons, New York, 1973.
-  A. Householder, “The theory of matrices in numerical analysis, blaisdell publ,” Co., New York, 1964.
Appendix A Proof of Theorem 1
The Expectation step of the algorithm is concerned with computation of which is not computationally tractable is various cases. In this work, we will approximate it as (this is sometimes referred as Hard EM ), where
The equation (16) can be expanded using Bayesian formulation, and we will assume the log-Gamma prior for
. While any other prior could have been selected as this is a modeling assumption, but in this work we are motivated by using conjugate priors to have computable expressions. The log-Gamma distribution  can be mathematically written as
where is the shape parameter and is the scale parameter. We also assume that the unknown artifacts behavior are independent and identical distributed across time and for each unknown sources. Therefore, the equation (16) can be expanded as
where in (a) we have used the definition of and from equation (7) and (8), respectively. Now for maximization, the equation (18) is an unconstrained optimization problem and we solve it by setting the partial derivatives with respect to to be zero for each , and . Therefore, we obtain
The maximum likelihood (ML) estimate of is computed with a fixed-point iterative method. We rearrange the terms in equation 19 with additional exponent and build the fixed-point function as following.
where the fixed-point iterations would be at iteration . The exponent need to be chosen carefully such that the fixed-point iterations would converge to the ML solution. We follow the procedure as mentioned in  to prove that is a local contraction and hence find the value of such that the fixed-point iterations are convergent.
Let us denote the ML estimate as the solution of fixed point equation . Now is a local contraction if , where is a matrix such that
where the matrix
is non-negative and it has a positive eigenvector
with positive eigenvalue. Now, using the Perron-Frobenius theorem , given the condition that is the positive eigenvalue of with positive eigenvector, the absolute value of all other eigenvalues are less or equal than . Thus, the spectral radius . Hence, the fixed-point equations are convergent. It should be noted that the denominator expression is depending on the ML estimate which is not available during iterations. We approximate the using the similar counting arguments of  to write the final expression as in equation (5). ∎