I. Introduction
Gut motility describes the coordinated pattern of contractions and relaxations of the longitudinal and circular smooth muscle layers of the gut wall. This motility plays a central role in normal digestive health, mixing and propelling gut content in a controlled fashion. Motility disorders in children and adults cause significant personal, societal and financial burdens, costing healthcare systems many billions of dollars per year [1]. In order to improve treatment options, we need to gain insight into the changes in gut dysmotility associated with a disorder. For example, in the human colon, large propulsive propagating contractions have been shown to be absent or diminished in patients with severe constipation [2]. In all regions of the gut, muscle contractions can be detected using intraluminal manometry. This procedure involves placement, through the nose or anus, of a flexible, small-diameter catheter containing multiple sensors. In recent years the number of sensors on these catheters has increased dramatically. For colonic manometry recordings, these high-resolution manometry catheters have allowed far greater insight into the nature of normal [3] and abnormal [4] contractile motor patterns. Analysis of these data is difficult. Manometric recordings consist of peaks of pressure which reflect contractile activity of the intestinal muscle. While visual analysis can easily identify patterns of motor activity, the process is very time consuming, unstandardised, and subject to personal bias, and in some instances the sheer volume of pressure peaks, occurring at multiple frequencies across many sensors, can be overwhelming.
An automated approach to overcome these issues is needed. In this paper, a two-stage process to achieve this is presented. The first stage involves computing a frequency spectrum of the spatiotemporal pressure recordings with the wavelet transform. The second stage compares the spectra via a functional mixed-effects model to reveal where differences over frequencies may lie in response to various known predictors (e.g. response to a meal, age, colonic region). Although we focus on colonic manometry, the two stages are applicable to other kinds of recorded phenomena of a similar quasi-periodic nature. The second stage is also widely applicable to data where each observation can be represented as a function over a shared gridded domain.
The continuous wavelet transform [5] converts a signal from the time domain to the time-frequency domain (strictly, the time-scale domain, from which an appropriate conversion from scale to frequency can be chosen), such that for any point in time we get a distribution of frequencies (wavelet coefficients) occurring in the signal at that time (Fig 1b).
The wavelet transform has been applied to acquired data in many disciplines, from pressure signals recorded in the human rectum [6] to black-hole gravitational waves [7]. In particular, many generally applicable expositions of the wavelet transform can be found in the geophysical sciences [8, 9, 10, 11, 12]. In this paper, we focus on the time-averaged wavelet coefficients, including 2D histograms over frequencies and phase-differences between pairs of signals, to obtain an estimate of propagation of quasi-periodic pressure signals. This is similar to work done on the solar atmosphere using the FFT [13] and wavelets [14].

For statistical analysis, we use function-on-scalar rather than the more conventional scalar-on-scalar regression. Scalar-on-scalar regression covers commonly encountered analyses such as the two-sample t-test, ANOVA, and the generalised linear model. Observations in this framework consist of a scalar (single number) response and a set of scalar predictors encoding the experimental conditions (e.g. response to stimulus) and individual characteristics of each subject being recorded (e.g. age or sex). Function-on-scalar regression extends this so that responses are not scalars but functions, such as frequency spectra. See [15] for a review of functional regression.

There are many previous publications on functional data analysis, with one of the first textbooks on the topic published in 1997 [16]. Functions in mixed-effects models are often represented by sums of basis functions, such as splines [17, 18, 19], sines and cosines [20, 21], or discrete wavelet bases [22, 23, 24, 25, 26]. Nonparametric forms such as Gaussian processes (GPs) are more expressive but computationally expensive. GPs have seen slower adoption in functional mixed-effects models, appearing in simpler designs such as a single function [27], fixed-effects models [28], or simple hierarchical models [29].
When considering a single-function model, authors have explored more complex GP structures such as heteroscedasticity and other non-stationarities. Goldberg et al. [27] and Titsias and Lázaro-Gredilla [30] presented a GP model with a GP-based input-dependent noise variance. Tolvanen et al. [31] presented a GP with a GP-based input-dependent signal and noise variance. They noted that input-dependent length-scale and signal variance are under-identifiable, and opted for an input-dependent signal variance. Despite this under-identifiability, Heinonen et al. [32] successfully fit a GP with a squared-exponential kernel function with GP-based input-dependent signal variance, noise variance, and length-scale.

Novelty
Here we present a heteroscedastic Gaussian process mixed-effects model with Gaussian process residuals over a common grid. A substantial performance improvement is achieved on a 2D grid by exploiting separable kernels via Kronecker factorisation, inspired by [33]. The model is conveniently written in the Stan [34] probabilistic programming language.
We also present a novel amplitude-weighted coherence summary of the cross-wavelet transform, useful for focusing on evident activity by effectively down-weighting periods of quiescence, when little to no contractile activity is present.
Layout
This paper is organised as follows. The wavelet transform is summarised in section II, including the calculation of the global wavelet power spectrum used for inference. Section III describes how the wavelet spectra from two signals can be combined to calculate the wavelet cross-spectrum and the associated coherence, also outlining the calculation of histograms over phase-differences from the power cross-spectra and coherence. Section IV presents a method for statistically modelling the wavelet results as a heteroscedastic functional mixed-effects model on the basis of Gaussian processes, with inference conducted by Hamiltonian Monte Carlo (HMC) using the Stan software ecosystem [34]. We apply our method to signals recorded with colonic manometry in 11 healthy volunteers, presenting the results in section V, and conclude with a discussion in section VI.
II. Wavelet Transform
The continuous wavelet transform [5, 10] is a useful tool for analysing non-stationary quasi-periodic signals. It decomposes a time-domain signal $x$ into the time-scale domain with
$W(s, t) = \frac{1}{\sqrt{s}} \int_{-\infty}^{\infty} x(\tau)\, \psi^{*}\!\left(\frac{\tau - t}{s}\right) d\tau$   (1)
where $\psi$ is an admissible wavelet function, and the $*$ superscript represents the complex conjugate. An admissible wavelet function is one which has zero mean and whose Fourier transform is continuously differentiable [35], with the extra desirable property that it be localised in both time and frequency. Intuitively, $W(s, t)$ measures the variation of $x$ within a neighbourhood of $t$ of size proportional to the scale $s$. In practice, we choose $s$ from a finite set of logarithmically-spaced scales
, specify the wavelet basis function in the frequency domain, and perform the convolution in (1) via the FFT, utilising the convolution theorem, with

$W(s, \cdot) = \mathcal{F}^{-1}\!\left[ \sqrt{s}\, \hat{x}(\omega)\, \hat{\psi}^{*}(s\omega) \right]$   (2)
where $\hat{\psi}$ is the frequency-domain wavelet function, $\hat{x} = \mathcal{F}[x]$ is the frequency-domain signal, $\mathcal{F}$ and $\mathcal{F}^{-1}$ are the Fourier and inverse Fourier transforms, and $\omega$ represents the frequency-domain locations in radians per second.
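As a concrete illustration of (1)-(2), the FFT-based transform can be sketched in a few lines of numpy. This is a minimal sketch, not the implementation used in this paper: the analytic Morlet wavelet, its normalisation, and the function name `morlet_cwt` are illustrative assumptions.

```python
import numpy as np

def morlet_cwt(x, scales, dt, w0=6.0):
    """Continuous wavelet transform via the convolution theorem:
    multiply the signal's FFT by the (conjugated) frequency-domain
    wavelet at each scale, then inverse-transform."""
    n = len(x)
    omega = 2.0 * np.pi * np.fft.fftfreq(n, d=dt)  # rad/s
    x_hat = np.fft.fft(x)
    W = np.empty((len(scales), n), dtype=complex)
    for i, s in enumerate(scales):
        # Analytic Morlet wavelet: supported on positive frequencies only
        psi_hat = np.pi**-0.25 * np.exp(-0.5 * (s * omega - w0)**2) * (omega > 0)
        W[i] = np.fft.ifft(x_hat * np.sqrt(s) * np.conj(psi_hat))
    return W
```

For a pure sinusoid of frequency $f$, the coefficient magnitude peaks near the scale $s \approx w_0 / (2\pi f)$, which is the basis of the scale-to-frequency conversion discussed next.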
To map from scales (seconds) to frequencies (Hz) we use "synchrosqueezing" [36]. Synchrosqueezing redistributes the wavelet coefficients based on the first time-derivative of the phase (also known as the "instantaneous frequency"). For a given set of equally-and-logarithmically spaced positive frequency bins with ordered edges $f_0 < f_1 < \dots < f_K$, where $B = \{[f_{k-1}, f_k)\}_{k=1}^{K}$ is the set of half-open intervals characterising each bin's domain, synchrosqueezing can be described as
$\phi(s, t) = \operatorname{unwrap}_t\big(\arg W(s, t)\big)$   (3)

$\nu(s, t) = \frac{1}{2\pi} \frac{\partial \phi(s, t)}{\partial t}$   (4)

$T(f, t) = \sum_{s} W(s, t)\, \mathbb{1}\big[\nu(s, t) \in \operatorname{bin}(f)\big]$   (5)
where $\phi$ represents the time-differentiable "unwrapped-in-time" phase in radians, with the complex argument (or angle) denoted by the parenthesis-less function $\arg$ with domain-codomain $\mathbb{C} \to (-\pi, \pi]$. The function $\operatorname{bin}(f)$ returns the interval from $B$ that contains $f$, and $\mathbb{1}[\cdot]$ is the indicator function. The equally-and-logarithmically spaced condition on the bins can be satisfied with a constant ratio $r = f_k / f_{k-1}$ for all $k$. An example is shown in Fig. 1b.
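The redistribution step described above can be sketched as follows. This is a hypothetical minimal implementation: the phase unwrapping, gradient estimator, and handling of out-of-range instantaneous frequencies are illustrative choices, not necessarily the paper's exact method.

```python
import numpy as np

def synchrosqueeze(W, edges, dt):
    """Reassign wavelet coefficients to frequency bins according to the
    time-derivative of their unwrapped phase (instantaneous frequency)."""
    n_scales, n_time = W.shape
    phase = np.unwrap(np.angle(W), axis=-1)                   # unwrapped-in-time phase
    inst_f = np.gradient(phase, dt, axis=-1) / (2.0 * np.pi)  # Hz
    T = np.zeros((len(edges) - 1, n_time), dtype=complex)
    for i in range(n_scales):
        idx = np.searchsorted(edges, inst_f[i], side='right') - 1
        ok = (idx >= 0) & (idx < len(edges) - 1)  # drop estimates outside the bins
        np.add.at(T, (idx[ok], np.nonzero(ok)[0]), W[i, ok])
    return T
```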
Let $f$ be an element of bin $[f_{k-1}, f_k)$. Since $T(f, t)$ is constant for all $f$ within the same bin, we only need to consider a single arbitrary $f$ to represent bin $k$. We will use the logarithmic centre of each bin, and define the set of bin centres as
$\tilde{F} = \{\tilde{f}_k : k = 1, \dots, K\}$   (6)

$\tilde{f}_k = \sqrt{f_{k-1} f_k}$   (7)
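Assuming geometric spacing, the bin edges and logarithmic (geometric-mean) centres can be generated in one step; `log_bins` is a hypothetical helper name.

```python
import numpy as np

def log_bins(f_min, f_max, n_bins):
    """Equally-and-logarithmically spaced bin edges (constant ratio
    between consecutive edges) and geometric-mean bin centres."""
    edges = np.geomspace(f_min, f_max, n_bins + 1)
    centres = np.sqrt(edges[:-1] * edges[1:])
    return edges, centres
```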
Switching to a discrete-time representation with $N$ samples recorded at times $t_1, \dots, t_N$, we can view the wavelet spectrum as $T(\tilde{f}_k, t_n)$. The time-average of the squared amplitudes produces the global wavelet power spectrum
$A^2(\tilde{f}_k) = \frac{1}{N} \sum_{n=1}^{N} \big| T(\tilde{f}_k, t_n) \big|^2$   (8)
An example is shown in Fig. 1c.
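The time-average above is a one-liner over the rows of a spectrum array; a sketch assuming rows index frequency bins and columns index time samples:

```python
import numpy as np

def global_power(W):
    """Global wavelet power spectrum: time-average of squared
    coefficient amplitudes, one value per frequency row."""
    return np.mean(np.abs(W)**2, axis=-1)
```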
III. Cross-Wavelet Transform
The cross-wavelet transform combines two wavelet spectra with the complex-conjugated product
$T_{xy}(f, t) = T_x(f, t)\, T_y^{*}(f, t)$   (9)
where $T_x$ and $T_y$ are the synchrosqueezed wavelet transforms of the two signals labelled $x$ and $y$. The combined subscript $xy$ denotes the cross-wavelet transform between the two signals.
A global wavelet power cross-spectrum could be computed for $T_{xy}$ in the same way as shown for $T$ in equation (8). However, this discards the useful phase information contained in $T_{xy}$. The effect of the complex-conjugated product is that the resulting phase represents the difference in phase between the two signals. For each frequency, computing a squared-amplitude-weighted histogram of the phase-differences yields a 2D histogram in the frequency-phase domain, analogous to the global wavelet power spectrum but stratified by phase-differences.
Given a set of equally-spaced bins with ordered edges $\theta_0 < \theta_1 < \dots < \theta_M$ representing phase-differences, where $\Theta = \{[\theta_{m-1}, \theta_m)\}_{m=1}^{M}$ is the set of half-open intervals characterising each bin's domain, we define the 2D histogram of frequencies and phase-differences by
$H(\tilde{f}_k, \theta) = \frac{1}{N} \sum_{n \in S} \big| T_{xy}(\tilde{f}_k, t_n) \big|$   (10)

$S = \big\{\, n : \arg T_{xy}(\tilde{f}_k, t_n) \in \operatorname{bin}(\theta) \,\big\}$   (11)
where $\operatorname{bin}(\theta)$ returns the interval from $\Theta$ that contains $\theta$, and $S$ is the set of all the time samples $n$ such that $\arg T_{xy}(\tilde{f}_k, t_n)$ is in the bin containing $\theta$. To ensure all phases are covered, $\theta_0 = -\pi$ and $\theta_M = \pi$. Let $\theta$ be an element of bin $[\theta_{m-1}, \theta_m)$. Since $H(\tilde{f}_k, \theta)$ is constant for all $\theta$ within the same bin, we only need to consider a single arbitrary $\theta$ to represent bin $m$. We will use the centre of each bin, defining the set of bin centres as
$\tilde{\Theta} = \{\tilde{\theta}_m : m = 1, \dots, M\}$   (12)

$\tilde{\theta}_m = \tfrac{1}{2}(\theta_{m-1} + \theta_m)$   (13)
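A sketch of the amplitude-weighted phase-difference histogram, assuming `Wxy` holds cross-wavelet coefficients with rows indexing frequency and using the cross-spectrum amplitude as the histogram weight:

```python
import numpy as np

def phase_histogram(Wxy, n_phase_bins=16):
    """Per-frequency histogram of phase-differences, weighted by the
    cross-spectrum amplitude |Wxy|. Edges span [-pi, pi] so that all
    phases are covered."""
    edges = np.linspace(-np.pi, np.pi, n_phase_bins + 1)
    n_freq, n_time = Wxy.shape
    H = np.zeros((n_freq, n_phase_bins))
    for k in range(n_freq):
        H[k], _ = np.histogram(np.angle(Wxy[k]), bins=edges,
                               weights=np.abs(Wxy[k]))
        H[k] /= n_time
    centres = 0.5 * (edges[:-1] + edges[1:])
    return H, centres
```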
If pairs of sensors are spaced sufficiently close together in the environment being recorded, then the cross-wavelet transform between sensors in such a pair allows us to measure propagating quasi-periodic activity. The sign of the phase-difference determines the direction of propagation. An example is shown in Fig. 1d, where the dark feature in the upper half of the panel, slightly to the left of the dotted centre line, indicates a strong tendency of the pressure waves in Fig. 1a to move bottom-to-top, i.e. in an oral direction. The value of the phase-difference $\Delta\theta$ (rad) at the frequency of interest $f$ (Hz) and the separation $d$ between the pair of sensors (cm) can be used to determine the apparent velocity of propagation $v$ (cm/s) with the simple formula
$v = \frac{2\pi f d}{\Delta\theta}$   (14)
For quasi-periodic pressure signals in these data, a more appropriate measure of propagation may be "pace", the inverse velocity (s/cm): synchronous events (or phase-locking) between the two signals have a more robust-for-modelling pace of 0, rather than a velocity at infinity.
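The velocity and pace conversions follow directly from the relation above: the time lag between the signals is the phase-difference divided by the angular frequency, and velocity is separation over lag. Function names are illustrative.

```python
import numpy as np

def apparent_velocity(dphase, freq_hz, sep_cm):
    """Apparent propagation velocity (cm/s): the time lag between the
    two signals is dphase / (2*pi*f), and velocity = separation / lag."""
    return 2.0 * np.pi * freq_hz * sep_cm / dphase

def pace(dphase, freq_hz, sep_cm):
    """Inverse velocity (s/cm): synchronous (phase-locked) activity has
    a pace of exactly 0 rather than an infinite velocity."""
    return dphase / (2.0 * np.pi * freq_hz * sep_cm)
```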
Coherence
If the distance between pairs of sensors is too large in relation to the propagation speed and frequency of the phenomena being recorded, then aliasing occurs whenever events are out of step with each other by more than half a cycle. This causes the phase-difference to misrepresent the true direction of propagation and its apparent velocity. In this case coherence can be used as a measure of coordination between pairs of sensors, even if we are forfeiting a measure of velocity. Coherence is a value between 0 and 1 that measures the coefficient of determination between two time series as a function of time and frequency [8].
To calculate the coherence, the method in [11] uses the power cross-spectrum divided by the product of the two individual power spectra. However, this ratio is always identical to 1, and so [11] suggest smoothing the cross-spectrum and the individual power spectra in time and scale before calculating the coherence. Since we are using the synchrosqueezed coefficients $T_x$ and $T_y$, we smooth only in time and not scale, because the frequency-based redistribution of coefficients makes smoothing in scale no longer relevant. Here, we measure coherence as
$R^2(f, t) = \frac{\big| \langle T_{xy}(f, t) \rangle \big|^2}{\big\langle |T_x(f, t)|^2 \big\rangle \big\langle |T_y(f, t)|^2 \big\rangle}$   (15)
where the angle brackets denote smoothing-in-time, allowing for an $f$-dependent smoothing width. The smoothing width is somewhat arbitrary [11], but one can still tailor it to the wavelet function and perhaps the phenomena being recorded. For the examples presented in this paper we use Gaussian smoothing with a frequency-dependent width, since we are using a Morlet wavelet function with an intrinsic frequency of 6 radians, which corresponds to approximately 4 oscillations occurring within about 95% of the wavelet's Gaussian envelope.
It is clear that (15) is no greater than 1 due to Jensen's inequality, and no less than 0 due to the squared modulus. A shorter smoothing width will tend to produce higher coherence, since the region over which the signal influences the coherence measurement is shorter. In the limit of no smoothing, substituting (9) into (15) yields 1. In the limit of total smoothing, averaging over the entire time range, and assuming that $x$ and $y$ are zero-mean random processes, (15) will tend to 0.
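A sketch of the smoothed coherence, assuming Gaussian smoothing in time via scipy; complex arrays are smoothed by filtering the real and imaginary parts separately.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def coherence(Wx, Wy, sigma_t):
    """Coherence as the squared modulus of the time-smoothed
    cross-spectrum over the product of the smoothed power spectra."""
    def smooth(a):
        if np.iscomplexobj(a):
            return smooth(a.real) + 1j * smooth(a.imag)
        return gaussian_filter1d(a, sigma_t, axis=-1)
    num = np.abs(smooth(Wx * np.conj(Wy)))**2
    den = smooth(np.abs(Wx)**2) * smooth(np.abs(Wy)**2)
    return num / den
```

Identical inputs give coherence 1 everywhere, and the Cauchy-Schwarz inequality keeps the result within [0, 1] for any positive smoothing weights.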
To obtain a global measure of coherence, we use the logit-transformed squared-amplitude-weighted average over time
$\hat{R}(f) = \operatorname{avg}\!\big[\, R^2(f, \cdot)\, ;\ |T_{xy}(f, \cdot)|^2 \,\big]$   (16)

$\operatorname{avg}[g; w] = \operatorname{logit}^{-1}\!\left( \frac{\sum_{n} w(t_n)\, \operatorname{logit} g(t_n)}{\sum_{n} w(t_n)} \right)$   (17)
where $\operatorname{avg}[g; w]$ calculates the logit-transformed average of $g$ weighted by $w$, over the time domain. A centre dot in a function argument denotes the variable of a curried function; that is, for some function $g$ we have $g(\cdot)(t) = g(t)$. Utilising $\operatorname{avg}$, equation (10) could be written in the same form, with the identity function taking the place of the logit transformation.
Similarly to $H$ in equation (10), we can calculate the histogram of coherence over phases, after a logit transformation and weighting by the power cross-spectrum, with
$\hat{R}(\tilde{f}_k, \theta) = \operatorname{avg}\!\big[\, R^2(\tilde{f}_k, \cdot)\, ;\ |T_{xy}(\tilde{f}_k, \cdot)|^2\, \mathbb{1}\big[\arg T_{xy}(\tilde{f}_k, \cdot) \in \operatorname{bin}(\theta)\big] \,\big]$   (18)
which is only meaningful in the absence of aliasing where sensors are sufficiently close together, such that the phase domain correctly reflects the direction of propagation.
When numerically calculating the coherence, we find that some values may be equal to 1 or 0, which go to $\pm\infty$ under the logit transformation, contaminating the sum in (17) for all other locations in time. To circumvent this problem we clip the coherence to a range whose maximum is practically equivalent to 1 and whose minimum is practically equivalent to 0.
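The clip-then-average step can be sketched as follows; the `eps` value here is an illustrative choice, not the clipping range used in the paper.

```python
import numpy as np

def logit_avg(coh, weights, eps=1e-6):
    """Weighted average of coherence values on the logit scale, with
    clipping so exact 0s and 1s cannot map to +/-infinity and swamp
    the weighted sum."""
    c = np.clip(coh, eps, 1.0 - eps)
    z = np.log(c) - np.log1p(-c)                   # logit
    zbar = np.sum(weights * z) / np.sum(weights)   # weighted mean
    return 1.0 / (1.0 + np.exp(-zbar))             # inverse logit
```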
IV. Statistics
For each unit of statistical data, we obtain from the wavelet analysis a 1D curve over frequencies or a 2D surface over frequencies and phase-differences, of either amplitude or coherence. Such a curve or surface is considered to be a response under the influence of a set of predictors, which can be any number of categorical or numerical variables specified by the experimental design. We want to measure and compare the effects of the given predictors.
An independent regression model could be fit for each location in either the frequency or frequency-phase domains. However, performing an independent fit at each location would require a multiple-comparison adjustment, and would fail to account for correlations between locations, effectively weakening the power of the analysis.
Instead, we capture correlations between locations by treating the response curves and surfaces as individual functions rather than simply collections of independent points. We model these functions as samples from Gaussian processes, which allow us to specify a formula for correlation between locations, without needing to specify a formula for the shape of the functions themselves.
A Gaussian process (GP) is a probability distribution over an infinite number of random variables, such that any finite set of them forms a multivariate Gaussian distribution. This is achieved by specifying a covariance kernel function $k(x, x')$, which when given a finite set of $n$ locations allows us to build an $n \times n$ covariance matrix with elements $K_{ij} = k(x_i, x_j)$. We have only finite data, and so the kernel function is evaluated only at the available data locations when fitting the GP. However, we can inspect the GP at any number of arbitrary locations in the kernel's domain, hence the infinite nature of the model as a step beyond a multivariate Gaussian. An analogy is fitting a simple regression line: the line is fit only to a finite set of data, but once we have an intercept and slope we can define a function whose y-locations can be calculated for any arbitrary set of x-locations, not just those for which we have data. See [37] for a textbook introduction to GPs.

IV-A Model
The latent GP function-on-scalar mixed-effects model we use can be written in the form
$y_i(x) = \mu_i(x) + \sigma_i(x)\,\big(f_i(x) + \epsilon_i(x)\big)$   (19)

$f_i \sim \mathcal{GP}(0,\, k_\epsilon), \qquad \epsilon_i(x) \sim \mathcal{N}(0,\, \sigma_\epsilon^2)$   (20)

$\mu_i(x) = o_i(x) + \mathbf{x}_i \boldsymbol{\beta}(x) + \mathbf{z}_i \mathbf{b}(x)$   (21)

$\log \sigma_i(x) = \tilde{\mathbf{x}}_i \boldsymbol{\gamma}(x) + \tilde{\mathbf{z}}_i \mathbf{c}(x)$   (22)
where $\mathcal{GP}$ represents the Gaussian process distribution, $k_\epsilon$ is a kernel function describing the structured standardized noise covariance, $\sigma_\epsilon^2$ represents the unstructured standardized noise variance, and $y_i$ is the response function for observation $i$. The responses are based on the transformed amplitude or coherence. The intuition behind the standardized noise co/variance can be seen by rearranging the terms in (19) and (20) to
$y_i \sim \mathcal{GP}\Big(\mu_i,\ \sigma_i(x)\,\big(k_\epsilon(x, x') + \sigma_\epsilon^2\, \delta_{x x'}\big)\,\sigma_i(x')\Big)$   (23)
which facilitates efficient inference by not requiring the structured residuals on the right-hand side of (23) to be sampled, nor requiring a matrix inversion per observation. When evaluating the likelihood specified by (23), a simple Cholesky decomposition is sufficient for the 1D case, but for the 2D case an eigendecomposition is needed to separate the kernel functions from the unstructured noise (see https://github.com/lwiklendt/gp_kron_stan for a Stan model source code example).
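The 2D eigendecomposition trick can be sketched by evaluating the log-density of a matrix of residuals under a Kronecker-plus-diagonal covariance without ever forming the full matrix. This is a simplified sketch of the idea (zero mean, unit standardisation), not the paper's Stan code.

```python
import numpy as np

def kron_mvn_logpdf(R, Kf, Kt, sigma2):
    """Log-density of vec(R) ~ N(0, Kf kron Kt + sigma2*I).
    The eigenvalues of a Kronecker product are all pairwise products of
    the factor eigenvalues, so only the two small factors are decomposed."""
    wf, Qf = np.linalg.eigh(Kf)
    wt, Qt = np.linalg.eigh(Kt)
    w = np.outer(wf, wt) + sigma2     # eigenvalues of the full covariance
    S = Qf.T @ R @ Qt                 # residuals rotated into the eigenbasis
    n = R.size
    return -0.5 * (np.sum(S**2 / w) + np.sum(np.log(w)) + n * np.log(2.0 * np.pi))
```

The cost is dominated by decomposing the two small factor matrices, rather than factorising the full Kronecker covariance for every value of the noise variance.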
In the mean specified by (21), $X$ is a design matrix of population-level predictors (a.k.a. fixed effects), with $\mathbf{x}_i$ representing the row vector of predictors pertaining to observation $i$. $\boldsymbol{\beta}$ is a vector of iid latent GPs representing the population-level effects. $Z$ is a design matrix of group-level predictors (a.k.a. random effects), and $\mathbf{b}$ is a vector of potentially correlated latent GPs representing the group-level effects. Depending on the experimental design, an optional offset term $o_i$ is included in (21), which may be either set to the mean of all responses as a way of centring the data, inferred to include a measure of variability in the centring, or given a different value per observation if some measure of exposure needs to be incorporated that would not otherwise fit as its own predictor in $X$ or $Z$.

Analogous to the predictors $X$ and $Z$ for the mean, the matrices $\tilde{X}$ and $\tilde{Z}$ are respectively the population-level and group-level predictors for the log standard deviation (22), with corresponding effects $\boldsymbol{\gamma}$ and $\mathbf{c}$. An explicit offset term is missing here since such an offset is implicitly handled by the scale of the responses.

Each GP function in each vector of population-level effects is given an iid prior
$\beta_j \sim \mathcal{GP}\big(0,\ \tau_{\beta_j}^2\, k\big)$   (24)

$\gamma_j \sim \mathcal{GP}\big(0,\ \tau_{\gamma_j}^2\, k\big)$   (25)
However, for the vectors of group-effects functions we include correlations between functions via multivariate or multi-output GPs
$\mathbf{b} \sim \mathcal{GP}\big(\mathbf{0},\ \Sigma_b \otimes k\big)$   (26)

$\mathbf{c} \sim \mathcal{GP}\big(\mathbf{0},\ \Sigma_c \otimes k\big)$   (27)
where $\Sigma_b$ and $\Sigma_c$ are covariance matrices dependent on the structure of the $Z$ and $\tilde{Z}$ design matrices. These matrices will generally be block-sparse, facilitating efficient computation.
The kernel functions and their parameters, also known as hyperparameters of the GPs, will be covered in the next subsection IV-B.
The response functions $y_i$ and the design matrices $X$, $Z$, $\tilde{X}$, and $\tilde{Z}$ are the supplied "input" data. The vectors of functions $\boldsymbol{\beta}$, $\mathbf{b}$, $\boldsymbol{\gamma}$, $\mathbf{c}$, and the hyperparameters, are to be estimated and correspond to "outputs" of the inference. The structure of the design matrices depends on the experimental design, and we find it easiest to derive the design matrices (also known as "model matrices") from formula notation as specified in section 2 of [38]. We provide an example application in section V using the formulae (37) and (38).
We are interested in modelling amplitudes or coherences that were calculated using the wavelet transform as described in sections II and III. To fit amplitudes or coherences over frequencies, the location $x$ is a scalar that represents frequency. To fit over frequencies and phase-differences, $x$ is a 2D point that represents frequency in one dimension and phase-difference in the other.
IV-B Kernel Functions
The form and parameters of the kernel functions depend on whether the response functions are 1D or 2D. There are many potential kernels to choose from, and they can even be built up from smaller kernels [39], but for the sake of brevity we will limit our exposition to one concrete kernel function for each type of domain. For the case of 1D curves over frequencies we use a log-space squared-exponential kernel
$k(f, f') = \exp\!\left( -\frac{(\log f - \log f')^2}{2 \ell_f^2} \right)$   (28)
with $\ell_f$ specifying the length-scale of the correlation based on the distance between the log-frequencies $\log f$ and $\log f'$. At a distance of 0 we have equal frequencies $f = f'$, where the correlation is 1 and the covariance equals the GP's variance parameter. As the distance approaches infinity the correlation and covariance approach 0. For the case of 2D surfaces over frequencies and phase-differences we use a product of the log-space squared-exponential kernel and a periodic kernel
$k\big((f, \theta), (f', \theta')\big) = \exp\!\left( -\frac{(\log f - \log f')^2}{2 \ell_f^2} - \frac{2 \sin^2\!\big(\frac{\theta - \theta'}{2}\big)}{\ell_\theta^2} \right)$   (29)
where $\ell_f$ and $\ell_\theta$ specify the log-frequency and phase-difference length-scales. When the difference between the phase-differences $\theta$ and $\theta'$ is 0 or $2\pi$, or any integer multiple of $2\pi$, the phase-difference component of the kernel will be 1, identifying the locations $\theta$ and $\theta \pm 2\pi$.
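The two kernel components can be sketched as follows; the names `k_logfreq` and `k_phase` are illustrative, and both are written as correlation (unit-variance) kernels.

```python
import numpy as np

def k_logfreq(f, f2, ell):
    """Squared-exponential correlation over log-frequency distance."""
    d = np.subtract.outer(np.log(f), np.log(f2))
    return np.exp(-0.5 * (d / ell)**2)

def k_phase(th, th2, ell):
    """Periodic correlation over phase-differences: separations of any
    integer multiple of 2*pi give correlation 1."""
    d = np.subtract.outer(th, th2)
    return np.exp(-2.0 * np.sin(0.5 * d)**2 / ell**2)
```

Note that equal frequency ratios give equal correlations under `k_logfreq`, which matches the logarithmic spacing of the frequency bins.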
For equations (24) and (25), $k$ represents the kernel function used in constructing a covariance matrix, but for equations (26) and (27) $k$ is used in constructing a correlation matrix by setting its variance to 1, since including a free variance parameter in $k$ would make the model non-identifiable due to the variance parameters already defined in $\Sigma_b$ and $\Sigma_c$.
The kernel in (29) is separable, such that we can write it as
$k\big((f, \theta), (f', \theta')\big) = k_f(f, f')\, k_\theta(\theta, \theta')$   (30)

$k_\theta(\theta, \theta') = \exp\!\left( -\frac{2 \sin^2\!\big(\frac{\theta - \theta'}{2}\big)}{\ell_\theta^2} \right)$   (31)
where with abuse of notation we are identifying kernel functions based on their argument symbols, such that $k_f$ and $k_\theta$ are different functions, with $k_f$ defined in (28) and $k_\theta$ defined in (31). Since we can factorise the 2D kernel (30), we can create a covariance matrix using the Kronecker product of the individual covariance matrices built from kernels (28) and (31)
$K = K_f \otimes K_\theta$   (32)

$(K_f)_{ij} = k_f(f_i, f_j)$   (33)

$(K_\theta)_{ij} = k_\theta(\theta_i, \theta_j)$   (34)
The Kronecker factorisation of the kernel matrices also allows for a substantial speed up in the numerical calculation of the Cholesky and eigen decompositions of the covariance matrices [40] used in inference.
Prior distributions for hyperparameters such as the length-scales $\ell$ and standard deviations $\tau$ are experiment dependent, and will in general depend on the scale of the data. For the application presented in section V, each hyperparameter (subscripts omitted for brevity) is treated independently, unless otherwise specified. We used weakly-informative priors for the length-scales, constrained to the span of the grid; a correlation-matrix prior for the correlations within $\Sigma_b$ and $\Sigma_c$; and for the standard deviations a half-Student-t distribution with 3 degrees of freedom.
IV-C Implementation
A coarse grid was chosen for the functional domain so that posterior sampling could complete within a reasonable time. The grid can be refined relatively quickly after the expensive sampling step. Rather than the naïve linear or cubic interpolation, we can use GP prediction such that the covariance between locations is faithfully preserved in the refinement.
Given a vector of grid coordinates $\mathbf{x}$, a vector of refined coordinates $\mathbf{x}^{*}$, a vector of function values $\mathbf{f}$ corresponding to $\mathbf{x}$, and the kernel function $k$, we can produce a vector of refined function values $\mathbf{f}^{*}$ at $\mathbf{x}^{*}$ with
$\mathbf{f}^{*} = K_{* x} K_{x x}^{-1}\, \mathbf{f}$   (35)
where $K_{xx}$ is the covariance matrix obtained by applying $k$ to the coordinates in $\mathbf{x}$, and $K_{*x}$ is the matrix of covariances obtained by applying $k$ to $\mathbf{x}^{*}$ and $\mathbf{x}$.
Note that for the 2D case we can take advantage of the Kronecker identity

$(K_f \otimes K_\theta)^{-1} = K_f^{-1} \otimes K_\theta^{-1}$   (36)
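The refinement in (35) amounts to a GP posterior-mean computation; a sketch, where the jitter term is an illustrative numerical-stability choice:

```python
import numpy as np

def gp_refine(x, xs, fvals, kern, jitter=1e-8):
    """Posterior-mean GP interpolation of coarse-grid values fvals at
    refined coordinates xs; kern(a, b) returns a cross-covariance
    matrix, and the jitter stabilises the linear solve."""
    Kxx = kern(x, x) + jitter * np.eye(len(x))
    Ksx = kern(xs, x)
    return Ksx @ np.linalg.solve(Kxx, fvals)
```

At the original grid points the refinement reproduces the supplied values (up to the jitter), while between grid points the covariance between locations is faithfully preserved, unlike linear or cubic interpolation.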
We use the Hamiltonian Monte Carlo sampler from the Stan package to obtain a posterior distribution of GPs which can be inspected to detect where and how locations may differ between various categorical predictors such as anatomical regions, treatment periods, patient types, and between various numerical predictors such as age, weight, caloric content of a meal, etc.
V. Application
We applied the method to colonic manometry data recorded from the descending and sigmoid colon of 11 healthy volunteers during 1hr preprandial and postprandial periods. Data from these studies have been published previously [3]. Using formula notation
y ~ region * meal + (region * meal | subject)   (37)

sigma ~ region * meal + chans   (38)
the design matrices $X$ and $Z$ are constructed from formula (37), and $\tilde{X}$ and $\tilde{Z}$ from (38), according to the construction process described in [38], where the chans term in (38) is a transformation of the raw sensor count. The region predictor is a categorical variable with two levels, one for each of the recorded regions of the colon: descending and sigmoid. The meal predictor is a binary variable indicating whether a recording was obtained during the pre-prandial ("0") or post-prandial ("1") state, corresponding to a meal effect. The categorical variable subject identifies the individual subject of a recording. The chans predictor in (38) is a real-valued standardised count of the number of sensors (or channels) in the recording, which varies per subject per region. When computing weighted averages over time as specified in sections II and III, we average not only over time but over both time and channels, by effectively flattening the wavelet results into a single channel of length $CN$, where $C$ is the number of channels and $N$ is the number of time samples. Fewer channels are expected to result in greater variation in the global averages, which is why we included the channel count as a confounding factor of the signal variance. We omitted a group-level term from formula (38) since we don't have repeated measurements, and so a within-subject variation is poorly identified.
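As an illustration of the kind of matrix the formula notation expands to, a population-level design matrix for a region-by-meal interaction might be built as follows. This is a hypothetical encoding for exposition; the actual matrices follow the construction in [38].

```python
import numpy as np

def design_region_meal(region, meal):
    """Population-level design matrix for a region * meal interaction:
    intercept, region indicator, meal indicator, and their product."""
    region = np.asarray(region, dtype=float)
    meal = np.asarray(meal, dtype=float)
    return np.column_stack([np.ones_like(region), region, meal, region * meal])
```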
Four types of responses were analysed, given by the 1D and 2D amplitudes from equations (8) and (10), and the 1D and 2D coherences from equations (16) and (18). The amplitudes were log-transformed and the coherences logit-transformed to obtain the responses in model (19)-(22). For the 1D responses 29 frequency bins were used, and for the 2D responses 15 frequency bins and 16 phase bins were used. After sampling from the posterior, 200 frequency bins and 201 phase bins were used in refinement via GP interpolation (35), with results plotted in Fig. 2 for 1D responses and Fig. 3 for 2D responses.
For each response type, the Hamiltonian Monte Carlo run consisted of 200 warm-up iterations and 500 sampling iterations over 4 randomly-initialised chains, resulting in 2000 samples from the posterior distribution. We used an adapt-delta of 0.95. Diagnostics showed no divergent transitions and a top tree depth of 9, and visual inspection of trace plots showed good convergence, validated by the $\hat{R}$ diagnostic^{2}. Data appeared consistent with the posterior predictive distribution. On a 2013 MacBook Pro with a 2.7 GHz Core i7 running macOS Mojave with 16 GB 1600 MHz RAM, using PyStan^{3} with 4 parallel CPU cores (1 per chain), each of the 1D response types completed sampling in minutes, and each of the 2D response types completed in hours. The computation from raw pressure recording to time-averaged wavelet spectra (Fig. 1) took seconds for each of the 42 individual observations.

Due to the log and logit transforms of the responses, the effects are logarithmically additive, and so are represented as ratios. We plot all 2000 posterior samples as overlaid thin white lines, with 95% credible intervals outlined by white dotted lines. A substantial meal effect is evident in the amplitude responses (i), where the posterior curves in cells (c) and (f) are clearly above a ratio of 1, with greatest effect around the 3-4 cycles-per-minute (cpm) frequency. In (ii), although there is no such clear meal effect, we can see that in the sigmoid region the meal effect (f) around 3-4 cpm is increasing compared to the decreasing effect adjacent at around 1 cpm and above 8 cpm. We also notice the difference in meal effect between the descending and sigmoid regions (j) as an increase in coherence between 3-10 cpm. An uptick near 16 cpm in (a, b, d, e) is likely due to respiration artefact.

^{2} Here $\hat{R}$ refers to the convergence diagnostic in Stan, and not our coherence measure from equations (16) and (18).
^{3} Stan Development Team. 2018. PyStan: the Python interface to Stan, Version 2.18. http://mc-stan.org
VI. Discussion
We have presented a method for analysing spatiotemporal manometry data by computing various timeaveraged spectra and using them as responses in a functional mixedeffects model, inferred via HMC. This has allowed us to identify and contrast features in the spectra between various predictors of each spectrum.
Although the statistical model presented was developed to analyse quasi-periodic spatiotemporal manometry data, the model can be applied (with an appropriate choice of kernel function) to practically any kind of functional data that can be represented on a shared grid.
For the statistical analysis, we are not limited to testing for differences across predictors within the same locations. An example of this is already shown in Fig. 3, where the black-and-white hatched lines encircle regions where differences are inspected not between predictors but between phase locations within the same set of predictors. Since HMC produces samples from a distribution over all parameters given the data, we are free to interrogate this distribution by applying any well-posed^{4} question of interest to the samples.

^{4} Well-posed questions are in the form of an expectation of a function applied to the random variable of parameters, achieved by averaging that function over the samples from the posterior distribution. This is not as strong a limitation as it may first appear. For example, the function could include a maximum over a subset of parameters, allowing us to obtain the peak amplitude or coherence as a distribution over frequencies, and to potentially test how this peak frequency depends on predictors.
The method presented in [41] was a first step in automated analysis of colonic propagation, but reduced the analysis to a single indicator that could discern neither speed of propagation nor frequency of activity. The results herein are confirmed by our previous manual analysis in [3]. In that publication we demonstrated a significant increase in the postprandial colonic cyclic motor pattern. That motor pattern consisted primarily of propagating pressure waves at a 3-4 cpm frequency that travelled mostly in a retrograde direction. In our previous publication we also showed that the majority (76%) of the cyclic motor pattern was identified in the sigmoid colon. With the novel technique presented in this paper, these physiological features are all recovered. The manual analysis in our previous paper [3] took several weeks to perform; obtaining similar results with the analysis developed in this paper took under 7 hours. Thus the analysis provided by our technique goes orders of magnitude beyond currently available methods in detail, speed of analysis, and manual labour saved. A postprandial increase in colonic activity was also shown in [4] using the FFT and in [42] using wavelets, but without the rigorous statistical analysis the current paper provides.
Caveats and Limitations
When applying synchrosqueezing, we sometimes notice frequency-edge artefacts for activity close to the highest or lowest frequencies analysed. This may be due to an insufficient source of instantaneous frequency to be redistributed near the edge frequencies via equation (3). We suggest choosing sufficiently high and low scales to capture activity at least one octave above and below the highest and lowest frequencies to be examined, as long as the data are of sufficient sampling frequency and duration to permit such a margin.
In the application presented here, our data were recorded from a catheter with equally-spaced sensors. For unequally-spaced sensors, sensor pairs at different separations showing the same phase-difference will not represent the same propagation speed. Either a per-sensor-pair adjustment to the phase-difference should be made, or the inverse-velocity (pace) should be considered as a substitute for the phase-difference. This will require choosing a more appropriate phase-domain kernel, since the phase-dependent result will no longer neatly fall into a periodic kernel with a constant period over all frequencies.
There is a clear use for amplitude-weighted coherence, equations (16) and (17), in cases where sensors are too far apart to obtain a meaningful measure of phase-difference. However, amplitude-weighted coherence is also useful even when sensors are sufficiently close, if the experiment requires periods of quiescence to be ignored. We may want to focus only on periods of higher activity (higher amplitudes) if quiescent periods (lower amplitudes) occur at random intervals in the signal and/or have coherences that are unrepresentative or independent of any predictors.
If the amplitude-weighted coherence, (16) and (18), is applied to a weak signal where a momentary but large amplitude is present, then that momentary event will dominate the rest of the recording. To account for the fact that substantially fewer time samples influence the calculation of the coherence in such cases, a predictor for the signal variance (38) can be added, which quantifies the proportion of the signal from which coherence is effectively retained due to amplitude-weighting.
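As a generic illustration of the weighting idea (this is not a reproduction of equations (16)–(18)), complex coefficient series at one frequency can be combined with weights proportional to the product of their amplitudes, so that quiescent periods contribute little to the coherence estimate:

```python
import numpy as np

# Generic amplitude-weighted coherence between two complex (e.g.
# wavelet) coefficient series at one frequency. Illustrative only.
def amplitude_weighted_coherence(x, y):
    w = np.abs(x) * np.abs(y)            # down-weight quiescent samples
    cross = np.sum(w * x * np.conj(y))
    norm = np.sqrt(np.sum(w * np.abs(x) ** 2) *
                   np.sum(w * np.abs(y) ** 2))
    return np.abs(cross) / norm          # in [0, 1] by Cauchy-Schwarz

rng = np.random.default_rng(0)
x = rng.standard_normal(512) + 1j * rng.standard_normal(512)
y = rng.standard_normal(512) + 1j * rng.standard_normal(512)
c_same = amplitude_weighted_coherence(x, x)   # 1.0 up to rounding
c_indep = amplitude_weighted_coherence(x, y)  # well below 1
```

The caveat in the text is visible here: if a few samples of `w` dwarf the rest, those samples alone determine the estimate, which motivates the additional variance predictor.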
Formula (23) allows for GP residuals. It is unclear how one could extend the responses to other distributions, such as Poisson or Bernoulli, while retaining residual correlation. Perhaps one would need to either sample latent structured residuals, or abandon the residual correlation altogether.
Data where some grid points have missing values is not handled. Imputing the missing values could be a solution, achieved by sampling from the residual GP at the missing locations. This would require computing the mean and covariance of the residual GP at the missing locations per observation (Algorithm 2.1 in [37]). Had we not used structured residuals, such imputation would be trivial to implement and computationally inexpensive.

References
 [1] A. F. Peery, S. D. Crockett, C. C. Murphy, J. L. Lund, E. S. Dellon, J. L. Williams, E. T. Jensen, N. J. Shaheen, A. S. Barritt, S. R. Lieber et al., “Burden and cost of gastrointestinal, liver, and pancreatic diseases in the United States: update 2018,” Gastroenterology, 2018.
 [2] G. Bassotti, M. Gaburri, B. Imbimbo, L. Rossi, F. Farroni, M. Pelli, and A. Morelli, “Colonic mass movements in idiopathic chronic constipation.” Gut, vol. 29, no. 9, pp. 1173–1179, 1988.
 [3] P. G. Dinning, L. Wiklendt, L. Maslen, I. Gibbins, V. Patton, J. W. Arkwright, D. Z. Lubowski, G. O’Grady, P. A. Bampton, S. J. Brookes, and M. Costa, “Quantification of in vivo colonic motor patterns in healthy humans before and after a meal revealed by high-resolution fiber-optic manometry,” Neurogastroenterology & Motility, pp. n/a–n/a, 2014. [Online]. Available: http://dx.doi.org/10.1111/nmo.12408
 [4] P. Dinning, L. Wiklendt, L. Maslen, V. Patton, H. Lewis, J. Arkwright, D. Wattchow, D. Lubowski, M. Costa, and P. Bampton, “Colonic motor abnormalities in slow transit constipation defined by high resolution, fibre-optic manometry,” Neurogastroenterology & Motility, vol. 27, no. 3, pp. 379–388, 2015.
 [5] S. Mallat, A wavelet tour of signal processing: the sparse way. Academic press, 2008.
 [6] E. Jiang, P. Zan, S. Zhang, X. Zhu, and Y. Shao, “Optimal wavelet packet decomposition for rectal pressure signal feature extraction,” in Advanced Computational Intelligence (ICACI), 2012 IEEE Fifth International Conference on. IEEE, 2012, pp. 1204–1208.
 [7] B. P. Abbott, R. Abbott, T. Abbott, M. Abernathy, F. Acernese, K. Ackley, C. Adams, T. Adams, P. Addesso, R. Adhikari et al., “Observation of gravitational waves from a binary black hole merger,” Physical review letters, vol. 116, no. 6, p. 061102, 2016.
 [8] P. C. Liu, “Wavelet spectrum analysis and ocean wind waves,” in Wavelet Analysis and Its Applications. Elsevier, 1994, vol. 4, pp. 151–166.
 [9] K.-M. Lau and H. Weng, “Climate signal detection using wavelet transform: How to make a time series sing,” Bulletin of the American meteorological society, vol. 76, no. 12, pp. 2391–2402, 1995.
 [10] C. Torrence and G. P. Compo, “A practical guide to wavelet analysis,” Bulletin of the American Meteorological society, vol. 79, no. 1, pp. 61–78, 1998.
 [11] C. Torrence and P. J. Webster, “Interdecadal changes in the ENSO-monsoon system,” Journal of Climate, vol. 12, no. 8, pp. 2679–2690, 1999.
 [12] A. Grinsted, J. C. Moore, and S. Jevrejeva, “Application of the cross wavelet transform and wavelet coherence to geophysical time series,” Nonlinear processes in geophysics, vol. 11, no. 5/6, pp. 561–566, 2004.
 [13] B. Lites and E. Chipman, “The vertical propagation of waves in the solar atmosphere. I. Observations of phase delays,” The Astrophysical Journal, vol. 231, pp. 570–588, 1979.
 [14] A. Andić, “Propagation of high frequency waves in the quiet solar atmosphere,” Serbian Astronomical Journal, no. 177, pp. 1–14, 2008.
 [15] J. S. Morris, “Functional regression,” Annual Review of Statistics and Its Application, vol. 2, pp. 321–359, 2015.
 [16] J. O. Ramsay and B. W. Silverman, Functional data analysis. Springer New York, 1997.
 [17] W. Guo, “Functional mixed effects models,” Biometrics, vol. 58, no. 1, pp. 121–128, 2002.
 [18] J. Shi, B. Wang, E. Will, and R. West, “Mixed-effects Gaussian process functional regression models with application to dose–response curve prediction,” Statistics in medicine, vol. 31, no. 26, pp. 3165–3177, 2012.
 [19] J. Yang, D. D. Cox, J. S. Lee, P. Ren, and T. Choi, “Efficient Bayesian hierarchical functional data analysis with basis function approximations using Gaussian–Wishart processes,” Biometrics, vol. 73, no. 4, pp. 1082–1091, 2017.
 [20] P. J. Diggle and I. Al Wasel, “Spectral analysis of replicated biomedical time series,” Journal of the Royal Statistical Society: Series C (Applied Statistics), vol. 46, no. 1, pp. 31–71, 1997.
 [21] M. Bose, J. S. Hodges, and S. Banerjee, “Toward a diagnostic toolkit for linear models with Gaussianprocess distributed random effects,” Biometrics, 2018.
 [22] J. S. Morris, M. Vannucci, P. J. Brown, and R. J. Carroll, “Wavelet-based nonparametric modeling of hierarchical functions in colon carcinogenesis,” Journal of the American Statistical Association, vol. 98, no. 463, pp. 573–583, 2003.
 [23] J. S. Morris and R. J. Carroll, “Wavelet-based functional mixed models,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 68, no. 2, pp. 179–199, 2006.
 [24] J. S. Morris, P. J. Brown, R. C. Herrick, K. A. Baggerly, and K. R. Coombes, “Bayesian Analysis of Mass Spectrometry Proteomic Data Using Wavelet-Based Functional Mixed Models,” Biometrics, vol. 64, no. 2, pp. 479–489, 2008.
 [25] J. S. Morris, V. Baladandayuthapani, R. C. Herrick, P. Sanna, and H. Gutstein, “Automated analysis of quantitative image data using isomorphic functional mixed models, with application to proteomics data,” The annals of applied statistics, vol. 5, no. 2A, p. 894, 2011.
 [26] J. G. Martinez, K. M. Bohn, R. J. Carroll, and J. S. Morris, “A study of Mexican free-tailed bat chirp syllables: Bayesian functional mixed models for nonstationary acoustic time series,” Journal of the American Statistical Association, vol. 108, no. 502, pp. 514–526, 2013.
 [27] P. W. Goldberg, C. K. Williams, and C. M. Bishop, “Regression with input-dependent noise: A Gaussian process treatment,” Advances in neural information processing systems, pp. 493–499, 1998.
 [28] J. Q. Shi, R. Murray-Smith, and D. Titterington, “Hierarchical Gaussian process mixtures for regression,” Statistics and Computing, vol. 15, no. 1, pp. 31–41, 2005.
 [29] J. Yang, H. Zhu, T. Choi, D. D. Cox et al., “Smoothing and Mean–Covariance Estimation of Functional Data with a Bayesian Hierarchical Model,” Bayesian Analysis, vol. 11, no. 3, pp. 649–670, 2016.
 [30] M. K. Titsias and M. Lázaro-Gredilla, “Variational heteroscedastic Gaussian process regression,” in Proceedings of the 28th International Conference on Machine Learning (ICML-11), 2011, pp. 841–848.
 [31] V. Tolvanen, P. Jylänki, and A. Vehtari, “Expectation propagation for nonstationary heteroscedastic Gaussian process regression,” in Machine Learning for Signal Processing (MLSP), 2014 IEEE International Workshop on. IEEE, 2014, pp. 1–6.
 [32] M. Heinonen, H. Mannerström, J. Rousu, S. Kaski, and H. Lähdesmäki, “Non-Stationary Gaussian Process Regression with Hamiltonian Monte Carlo,” in Artificial Intelligence and Statistics, 2016, pp. 732–740.
 [33] S. R. Flaxman, A. Gelman, D. B. Neill, A. J. Smola, A. Vehtari, and A. G. Wilson, “Fast hierarchical Gaussian processes,” 2015. [Online]. Available: http://sethrf.com/files/fasthierarchicalGPs.pdf
 [34] B. Carpenter, A. Gelman, M. Hoffman, D. Lee, B. Goodrich, M. Betancourt, M. Brubaker, J. Guo, P. Li, and A. Riddell, “Stan: A Probabilistic Programming Language,” Journal of Statistical Software, Articles, vol. 76, no. 1, pp. 1–32, 2017. [Online]. Available: https://www.jstatsoft.org/v076/i01
 [35] M. Farge, “Wavelet transforms and their applications to turbulence,” Annual review of fluid mechanics, vol. 24, no. 1, pp. 395–458, 1992.
 [36] I. Daubechies, J. Lu, and H.-T. Wu, “Synchrosqueezed wavelet transforms: an empirical mode decomposition-like tool,” Applied and computational harmonic analysis, vol. 30, no. 2, pp. 243–261, 2011.
 [37] C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine Learning. The MIT Press, 2006.
 [38] D. Bates, M. Mächler, B. Bolker, and S. Walker, “Fitting Linear Mixed-Effects Models Using lme4,” Journal of Statistical Software, vol. 67, no. i01, 2015.
 [39] D. Duvenaud, “Automatic model construction with Gaussian processes,” Ph.D. dissertation, University of Cambridge, 2014.
 [40] Y. Saatçi, “Scalable inference for structured Gaussian process models,” Ph.D. dissertation, University of Cambridge, 2012.
 [41] L. Wiklendt, S. D. Mohammed, S. M. Scott, and P. G. Dinning, “Classification of normal and abnormal colonic motility based on cross-correlations of pancolonic manometry data,” Neurogastroenterology & Motility, vol. 25, no. 3, pp. e215–e223, March 2013.
 [42] P. Dinning, T. Sia, R. Kumar, R. Mohd Rosli, M. Kyloh, D. Wattchow, L. Wiklendt, S. Brookes, M. Costa, and N. Spencer, “High-resolution colonic motility recordings in vivo compared with ex vivo recordings after colectomy, in patients with slow transit constipation,” Neurogastroenterology & Motility, vol. 28, no. 12, pp. 1824–1835, 2016.