Abstract
Stateoftheart robotic hand prosthetics generate finger and wrist movement through pattern recognition (PR) algorithms using features of forearm electromyogram (EMG) signals, but requires extensive training and is prone to poor predictions for conditions outside the training data (Scheme et al., 2010; Peerdeman et al., 2011). We propose a novel approach to develop a dynamic robotic limb by utilizing the recent history of EMG signals in a model that accounts for physiological features of hand movement which are ignored by PR algorithms. We do this by viewing EMG signals as functional covariates and develop a functional linear model that quantifies the effect of the EMG signals on finger/wrist velocity through a bivariate coefficient function that is allowed to vary with current finger/wrist position. The model is made parsimonious and interpretable through a twostep variable selection procedure, called Sequential Adaptive Functional Empirical group LASSO (SAFEgLASSO). Numerical studies show excellent selection and prediction properties of SAFEgLASSO compared to popular alternatives. For our motivating dataset, the method correctly identifies the few EMG signals that are known to be important for an ablebodied subject with negligible false positives and the model can be directly implemented in a robotic prosthetic.
Keywords: Electromyography signal; Varying functional regression; Functional variable selection; Group LASSO; Postselection predictive inference
1 Introduction
More than 160,000 Americans are transradial (i.e. belowelbow) amputees, henceforth TRAs, and must relearn how to perform tasks that typically require an intact hand (ZieglerGraham et al., 2008). Passive hand prostheses and related devices are useful, but cannot emulate the full functionality of an intact limb. Multifunctional robotic prosthetics, such as the FDAapproved DEKA arm system (Linda Resnik et al., 2011; Resnik et al., 2011), have become very popular with recent advancements in their mechanical systems. The available software that provides user control of the hardware, however, is often nonintuitive to operate. For example, one approach is to have the user control the prosthetic with their foot. Forearm muscle contractions are known to cause hand movement for an ablebodied subject, henceforth AB subject. A TRA’s residual forearm, which once caused hand movement, will still contract and these contractions exhibit measurable EMG signals that are somewhat consistent with hand movement even though they no longer use that hand. A reliable mapping of a TRA’s forearm electromyogram (EMG) signals to hand movement would then provide a more intuitive approach for prosthesis control.
Figure (1) shows the process of movement intention to movement production for both an AB subject and TRA. For an AB subject, tendons are attached to the forearm muscles, travel through the wrist and connect to bones in the hand. The muscle contractions transmit force through tendons to the bones, generating finger or wrist movement. For a TRA, this physical connection among muscles, tendons, and bones no longer exists, but they may still sense movement in their phantom limb, accompanied by observable muscle contractions in their residual limb. Their generated EMG signal data feed into a prosthesis controller that processes the data and predicts movement, which is then produced by the robotic limb.
Direct myoelectric control, a traditional approach for prosthesis control, uses EMG signals from two antagonistic muscles and assigns to each a specific type of hand movement, such as finger flexion or extension. The EMG signal can only control that class of movement and, if activated for any reason, will produce that movement with speed proportional to the magnitude of the signal. Direct control is limited in its ability to direct multiple degrees of freedom of hand movement. The user must switch manually between functions (e.g. from wrist pronation/supination to hand open/close), but then the user’s attempted movement may not match with the prosthesis action. Since it is cumbersome for controlling multiple limb functions, direct control prostheses have shown a 75% rejection rate by users
(Biddiss and Chau, 2007; Peerdeman et al., 2011). Stateoftheart prosthetics overcome the restrictions of direct myoelectric control using datadriven pattern recognition (PR) algorithms capable of synthesizing a large number of EMG signals (potentially over 100) into classes of intended movement (Englehart and Hudgins, 2003; Zhou et al., 2007; Huang et al., 2008).PR is more flexible than direct myoelectric control, but has its own set of limitations that prohibit effective, intuitive user control. PR does not incorporate knowledge of human neurophysiology and biomechanics and relies heavily on the representativeness of the data used to train the algorithm in identifying EMG patterns associated with classes of hand movements. This leads to poor prediction performance for conditions not directly observed in the training data. It assumes that a constant, repeatable pattern of muscle contractions leads to a specific movement class and users may find it difficult to perfectly repeat this pattern. PR also significantly reduces the EMG information, using summarizing features of the EMG signals such as the mean absolute value and number of slope sign changes instead of the original signals. The loss of important information due to data reduction as well as redundancy of predictor information leads to overfitting to the training data, as indicated by Scheme et al. (2010).
Recently, Crouch and Huang (2016) and Crouch and Huang (2017) proposed and implemented an EMGbased controller using a planar linksegment dynamic model that directly incorporates features of the neurophysiological and biomechanical system. Using EMG data from only four forearm muscles, they were able to accurately predict wrist and finger movement. The model they use, however, is nonlinear and it is not clear how well it will translate to a TRA, since their model requires the subset of muscles to be prespecified. Their results do show that incorporating the neurophysiological and biomechanical properties can yield movement predictions that are more typical of biological movement than movements predicted by PR. This suggests that incorporating such properties in a datadriven approach may help improve movement predictions.
In this paper, we propose to model two degrees of freedom for hand movement, finger and wrist flexion/extension, using recent past behavior of the EMG signals. Our statistical model uses either finger or wrist velocity as its response and allows the effect of each EMG signal on the response to vary with the current finger or wrist position. This model retains much of the EMG information and respects known biomechanical constraints, thus allowing it to explain a broader range of hand movements using fewer EMG signals. We fit the model using a novel algorithm for functional regression which we call Sequential Adaptive Functional Empirical group LASSO (SAFEgLASSO) that selects only a subset of the EMG to be included in the model while simultaneously requiring that the estimated functional coefficients are smooth and interpretable. SAFEgLASSO pairs a generalization of the adaptive sparsesmoothness penalty from
Gertheiss et al. (2013) with an efficient, sequential variable selection and fitting approach inspired by the relaxed LASSO (Meinshausen, 2007). We also extend the ideas of Lei et al. (2017) to generate postselection prediction confidence bands.The paper is organized as follows. Section 2 reviews the neurophysical and biomechanical relationships involved in hand movement that we address in our proposed statistical model. Section 3 describes the collection process for collecting EMG and hand movement data. Section 4 details the analysis methodology starting with the proposed functional linear model and explains how it accounts for known underlying biomechanical relationships. We then describe the variable selection and postselection fit procedures. Section 5 presents the analysis results from data collected from an AB subject, where the underlying truth is known. An extensive simulation study is performed in sections 6.1 and 6.2, showing the robustness of our method to varying assumptions. Section 7 concludes the paper with a discussion of limitations and possible avenues of extension.
2 Biomechanics of hand movement
This section describes the underlying biomechanical process that generates intentional hand movement for an AB subject and highlights the unique challenges that will need to be addressed by the proposed model. Incorporating features of this process in the statistical analysis and the robotic prosthetic software should significantly aid user control of robotic prosthetics by TRAs. In particular, this section motivates the use of recent past behavior of the EMG signals as functional predictors and justifies the need for positiondependent effects to map the relationship between these EMG signals and velocity. Velocity was chosen as the desired response instead of position because almost all the upper limb prostheses on the market have direction and velocity control as their outputs (Scheme and Englehart, 2011).
The biological process of hand movement begins with initiation of action potentials that are first conducted along motor neurons from the central nervous system. At the neuromuscular junction, the action potential causes the release of neurotransmitters from the nerve that initiates an action potential on the muscle fiber membrane. That motor unit action potential (MUAP) conducts along the surface of the muscle fiber membrane that extends over and within the muscle fiber. This stimulates the release of calcium ions within myofibrils and initiates crossbridge cycling of overlapping protein chains that cause the entire muscle fiber to shorten (contract), generating mechanical energy. The output force of an entire muscle varies depending on the proportion of fibers in the muscle that are contracted and the frequency at which they are stimulated. The EMG signal measured from surface electrodes represents the sum of individual MUAPs and thus conveys information about the magnitude and duration of muscle contraction.
The mechanical aspect of hand movement explains how the mechanical energy generated by the muscle contractions lead to specific hand movements. For AB subjects, tendons connect forearm muscles to bones in the hand and it is through this connection that movements are generated. That is, a muscle contraction will pull its tendon which in turn moves the hand depending on the path of the tendon relative to the joint(s) that it crosses. As mentioned earlier, AB subjects have an internal representation that maps muscle contractions to hand movements, and this mapping is maintained for TRAs following surgery, although it may become distorted. Our goal is to build and implement a statistical model that decodes this internal biomechanical representation, specifically for relating finger and wrist movement to EMG signal data.
The top panel of Figure 2 demonstrates what is meant by finger and wrist movement. For example, finger flexion and extension refers to the simultaneous opening and closing of all finger digits, respectively, except for the thumb. The bottom panel of Figure 2 shows a range of arm postures considered in this paper. A PRbased controller trained on one posture setting would likely perform poorly in one of the other postures because EMG patterns change with limb posture due to, for example, a shift in electrode locations relative to the underlying muscles (Scheme et al., 2010). A biomechanically motivated model should behave consistently across postures because the biomechanical process does not change significantly with posture.
The known biomechanical process of AB subjects suggests that these two movement degrees of freedom can be explained by relatively few forearm muscles. The nonlinear model of Crouch and Huang (2016) is capable of accurate predictions with few EMG signals but does not clearly admit a variable selection procedure to decide which muscles to use. This presents a problem for its use by a TRA for whom the intended biomechanical actions of the EMG are not directly observable and may be altered following amputation. Thus we require a statistical model that is capable of accurate predictions using few EMG signals and the model must lend itself to a variable selection procedure.
There are many challenges and constraints that the proposed statistical model must accommodate to successfully approximate this biomechanical process. First, the fingers and the wrist are limited in how far they can be extended/flexed. Second, muscles generate passive movement forces when stretched, which may produce movement in the absence of EMG information. For instance, if one relaxes their forearm muscles following a contraction, the tendons will return back to their resting length, generating a passive force, and the hand will return to a neutral configuration. Therefore, we may observe movement without observing concurrent EMG activation. While this tendon connection no longer exists for a TRA, they may still anticipate these passive forces.
Figure 3 illustrates some finger position data (in radians, shown in black) taken from an AB subject for a short timewindow and overlays two concurrent EMG signals, labeled EMG 7 (green) and EMG 12 (red), known to contribute to finger movement. Figure 3(a) is an event window with an active EMG 7 signal but no movement; the result of a physical constraint. Figure 3(b) demonstrates finger movement in the absence of active, concurrent EMG signals, due to passive forces. This figure demonstrates the opportunity for past EMG behavior to predict passive movement, as seen by the declining green line preceding Figure 3(b).
3 Data collection and processing
This section and the analysis in section 5 focuses on the collection and analysis of data from an AB subject. The process of calibrating a robotic arm is subjectspecific and we are not interested in estimating a population model. The data collection procedure described here and the analysis methodology described in Section 4 could be modified to perform inference across multiple subjects. A data collection strategy is given at the end of Section 3.1 for TRAs consistent with our analysis approach, requiring measured hand movements.
Surface EMG electrodes (Biometrics, Newport, UK) were placed at different locations of the forearm. The subject was then asked to perform specific types of hand movement and the EMG signals were recorded continuously at 960 Hz. Electrodes were placed over 16 specific forearm muscles considered to be potentially important muscles for generating hand movements of interest. Raw EMG data were highpass filtered at 40 Hz, rectified, and lowpass filtered at 6 Hz using a 4th order Butterworth zerophase filter (Crouch and Huang, 2016). Prior to data collection, the subject was asked to perform movements exhibiting maximal muscle contraction, allowing normalization of the subject’s EMG signals to be between 0 (no contraction) and 1 (maximal contraction).
Movement data were collected by reflective markers placed on 9 anatomical locations on the forearm, wrist, and hand. Threedimensional marker positions were recorded at 120 Hz using an infrared motion capture system (Vicon Motion Systems Ltd., UK), and processed (filtered) at 6 Hz using a 4th order Butterworth filter (Butterworth, 1930). The joint angles (in radians), which we call positions, were then calculated from filtered marker data through a musculoskeletal model (Holzbaur et al., 2005) in OpenSim (Delp et al., 2007). EMG and joint angle data were collected synchronously for a given period of time during which a subject was asked to perform a set of finger and/or wrist movement. Figure 3 provides a snapshot of this data while a subject is performing finger flexion and extension with consistent movements in a fixed posture.
3.1 Data collection procedure
The subject was asked to perform basic finger and wrist movements in each arm posture as shown in the bottom panel of Figure 2. For a given posture, the subject was asked to perform single degreeoffreedom movements for their fingers/wrist following either a consistent or random pattern. For example, the subject performed consistent finger movement that continuously alternated between extending (opening) and flexing (closing) their fingers, while keeping their wrist in a neutral position. Such a protocol has been followed by others; see for example Kawashima and Mita (2016). The subject was also asked to perform a series of random movements, which was chosen solely by the subject, providing a challenging benchmark to determine how robust our fitted models are to different movements.
For a TRA, there would be no movement data for the corresponding hand. Instead, movement data can be collected from the other, intact limb while the TRA performs (attempts) mirrored movements with both limbs; see Scheme and Englehart (2011).
3.2 Postcollection data processing for analysis
To accommodate the biomechanical constraints described in section 2, we utilize recent past behavior of EMG signals, as well as current finger/wrist position, to predict finger/wrist velocity. For example, a concurrent EMG value of 0.50 leads to different movements depending on the historical EMG trend. We briefly describe here the data processing that was done that allowed us to fit the model described in the next section.
The velocity values were estimated from a penalized smoother of the entire series of recorded position data across time using the R package fda. In particular, we used a secondorder smooth regularization penalty to control the goodness of fit and smoothness of the fitted curve, where the smoothing parameter is selected by crossvalidation. We then generated a velocity estimate for each observed position data point. For each velocity data point, we extracted the previous EMG observations that ended with the concurrent EMG value to the velocity value. In this application, we chose a past time window of roughly 1/3 seconds. The value was chosen based on observed passive force movement (see Figure 3). This was done for all EMG signals, so each velocity estimate had associated with it a position value and equallyspaced past measurements across 1/3 seconds for all EMG signals collected. A visualization of this data restructuring may be found in the Supplementary Materials, Section LABEL:AppendixChp3. We propose to model the current velocity of the finger/wrist movement as a function of the recent history of the EMG signals which is to be discussed next.
4 Variable selection and inferential framework
Denote the observed data by where indexes the instance at which data are collected, is the scalar response, is the continuous scalar covariate, indexes the functional predictors, and is the th functional predictor observed at point such that . It is assumed that and both and are closed compact sets. In our application, is the current velocity, is the current position, and is the recent history of the th EMG signal and is a window of time that depicts the “recent” past.
We consider a functional linear model with varying smooth effects:
(1) 
where is an intercept and is an unknown bivariate function defined on that quantifies the effect of the th functional predictor on the mean response of conditional on . Model (1) is a direct extension of the functional linear model (FLM) described in Ramsay and Silverman (2005); Cardot et al. (2003); James et al. (2009); Goldsmith et al. (2011); Ferraty et al. (2012); McLean et al. (2014) who assume nonvarying functional coefficients. In our application accounting for positions in the statistical model is crucial for modeling of passive forces since the effect of the EMG signals on finger/wrist movement heavily depends on the current state of the hand as argued in section 2. Varying coefficients for scalar covariates characterize the effect of a covariate on response through another covariate and have been studied extensively in nonparametric and semiparametric literature; see, for example, Fan and Zhang (2008); Maity and Huang (2012); Fan et al. (2014); Davenport et al. (2015); Bandyopadhyay and Maity (2017). Recently, the varyingcoefficient model has been extended to analyze functional data in Cardot and Sarda (2008); Wu et al. (2010); Davenport (2013).
Our primary objective is to select the functional predictors whose effects on the mean response as described by (1) is nonzero. For our application this would allow us to select the most important forearm muscles in explaining finger and wrist movements. Let denote the true index set where and let denote the set of true effects. Established fitting approaches focus on the estimation of the smooth coefficients rather than variable selection. Variable selection in scalaronfunction regression with invariant smooth coefficients, , has been discussed in Matsui and Konishi (2011); Gertheiss et al. (2013); Fan et al. (2015); Pannu and Billor (2017). In this paper, we extend these approaches to the varying functional linear model (1) that is motivated by our data application.
4.1 Approximation to linear model
Following the procedure described in Wood (2006) and Eilers and Marx (2003), we approximate
using a tensor product of two univariate basis functions. Let
and be two truncated univariate bases defined in and respectively; and let , where ’s are the unknown basis coefficients associated with the th functional predictor. For simplicity of exposition, we use the same bases for all Generalization to different bases for different predictors is straightforward.It is more convenient to rewrite the tensor product in matrix form where , , and is the matrix of basis coefficients, which is our target for inference. The generic summand in (1) can be approximated by Riemann sum approximation as
(2) 
where is a
row vector and
. By an abuse of notation, define the vector where indicates the Kronecker product. For notational simplicity, we omit indices and in the subscript of and denote it by . Vectorize by appropriately stacking its columns vertically and denote the vector of basis coefficients ’s by . This yields an approximating linear model to (1) with unknown regression coefficients ’s(3) 
4.2 Penalized criterion for variable selection
We allow and to be sufficiently large to capture the complexity of the regression surfaces and penalize the degree of smoothness. We adopt the penalized least squares approach for estimation that simultaneously induces group sparsity and controls smoothness of the corresponding regression surfaces.
Let be the norm of , and let and be the second partial derivatives of with respect to and respectively. Prior to analysis, we center the response and functional predictors ’s, and remove from the model. By an abuse of notation, we use the same notation for the centered model. The estimates of the ’s are then selected to be the minimizers of the following penalized criterion
(4) 
where controls the model sparsity, and control the smoothness of in the and dimension, respectively. Let and The proposed penalty function was first introduced by Meier et al. (2009) for variable selection in high dimensional additive model. As increases, more penalty weight is placed on both the magnitude and smoothness of the surface which results in , excluding the corresponding functional predictor from the model. Although is assumed to be a smooth function, the fitted regression surfaces may be rough if the penalty on the second order derivatives and are not included. The extent of smoothing of the nonzero regression surfaces is controlled by the vector of smoothing parameters . For example, as increases, the model more heavily penalizes the departure from linearity, producing linear effects in both directions. The proposed penalty ensures smoothness and sparseness in two dimensions; it can be easily extended for multiple dimensions without loss of generality.
We now turn our attention to efficient calculation of for a given function . For simplicity, we further assume that the bases and are orthogonal Bsplines. Using the orthogonal property of the bases, define by the identity matrix with the th element as if and 0 otherwise, and define similarly in terms of the ’s. General bases will not have and equal to the identity. Let and be the second derivatives of and respectively; let and be the vectors of these and functions. Define and
It follows that
where the last equality follows from the orthogonality of the bases. Similarly, we obtain and The penalty may now be expressed as a sparsesmooth regularization (Meier et al., 2008; Meier, 2009)
where is a symmetric positivedefinite matrix. Cholesky decomposition on gives where is a lower triangular nonsingular matrix. Using (3) and reparametrize the model coefficients as and transform each to It follows that (4) can be written equivalently as
(5) 
which is similar to the group LASSO penalized criterion described in Yuan and Lin (2006); Yang and Zou (2013, 2015).
We use the groupwisemajorizationdescent (GMD) algorithm described by Yang and Zou (2013) to solve (5), which is implemented by the R package gglasso (Yang and Zou, 2013). A benefit of the GMD algorithm is that it does not require groupwise orthonormality and is applicable for a general design matrix. In addition, for given values of the tuning parameters and the minimizer of (5) with respect to has a closeform solution; denote the solution by by omitting its dependence on and .
It is straightforward to restructure the estimated basis coefficients in to give , and we have
Let be the estimated index set after solving under and and denote the set of their corresponding estimates by . In reality, the sparseness and smoothing parameters, and are unknown and need to be chosen empirically; for instance by crossvalidation. We discuss this aspect in more detail in Section 4.4.
The penalized criterion (4
) does not allow for different shrinkage and smoothness for different functional predictors, and assumes equal weight. This may inflate the number of false positives in variable selection and to mitigate this problem, adaptive estimation is recommended for variable selection in highdimensional data
(Meier, 2009; Tutz and Gertheiss, 2010; Gertheiss et al., 2013). We too adopt this approach and discuss it next.4.3 Adaptive penalized criterion
We generalize the criterion (4) by adaptive penalized criterion. Specifically, we use initial weights to introduce prior information on the relative importance and smoothness of functional predictors, where is related to the sparse penalty factor and and are related to the smooth penalty factors in the and dimension, respectively. Define the adaptive penalty function by
(6) 
where the weights are strictly positive and calculated based on the parameter estimates associated with a functional additive model (McLean et al., 2014; Scheipl et al., 2015) fit without a sparseness penalty. The subscript 0 is used to distinguish the initial estimates from the model estimates of (5). Define the weights by and for details we refer to Zou (2006); Gertheiss et al. (2013); Guo et al. (2015); Ciuperca (2016); Ivanoff et al. (2016). Alternatively, one can adopt a semiadaptive approach using only while keeping equal weights for smooth penalty factors such as and for all Like Yang and Zou (2015), the weights based on the number of parameters within each group can also be adopted alternatively. Here, as decreases to , increases to which yields sparser solution for for a given Similarly, as and increase to , the regression surface yields linear pattern in both directions for a given We denote the minimizer of the criteria with penalty (6) by associated with for a given and
4.4 Selection of the tuning parameters, and
A widely used method to select tuning parameters is fold crossvalidation (CV). We use 5fold block CV which has been found appealing for data with temporal correlations (see Roberts et al. (2017)).
Specifically, we partition the data into 5 equallysized sections; let index the folds and denote by be the th ordered response of the th fold corresponding to the instance , and be the total number of observations in the th subsample. We aim for the to be as equal as possible across . The test set is formed by one of the five folds; the remaining four folds form the training set.
For fixed values of and the model parameters are estimated based on the training set (e.g., all data with the th fold removed), and the prediction accuracy is evaluated based on the performance of the test set (e.g., the th fold); i.e., where is the predicted value for
For each value of the tuning parameters, we compute the average
’s and select the values corresponding to the minimum prediction error. Furthermore, CV with the one standard error rule can be also adopted; the main idea is to select the simplest model whose numerical performance is comparable with the optimal model that is chosen by minimum prediction error
(Friedman et al., 2001; Krstajic et al., 2014; Yang and Zou, 2015).4.5 Sequential Adaptive Functional Empirical (SAFE) Selection
Most variable selection algorithms can not guarantee the exclusion of noise variables (Meinshausen et al., 2009). In particular, the group LASSO approach has been shown to select a larger number of groups than necessary (Meier et al., 2008; Gertheiss et al., 2013); this may happen due to the uncertainty in selecting the optimum combination of tuning parameters. As a remedy, we propose to perform the selection step twice where the second stage considers only the covariates ; where the superscript (1) denotes the first stage, and and are the optimal tuning parameters determined by CV in the first stage. The adaptive weights for the second stage are then calculated based on
The proposed idea is motivated by the relaxed LASSO (Meinshausen, 2007) approach, however, the fitting procedure is different in terms of calculating the adaptive weights for the second stage. In contrast to Meinshausen (2007), our solution path of the second stage is driven by both and the selection of secondstage tuning parameters and In Meinshausen (2007), the estimates of the second stage are not driven by weights but rather based on the exhaustive search of the sparse tuning parameter. This idea was also promoted by Wei and Huang (2010) and Guo et al. (2015) who adopted a twostage variable selection strategy for the scalaronscalar regression.
With weights recalculated, we minimize the objective function (6) using only those with respect to and Let be the estimated index set resulting from the second stage of the regularization fit under the optimal combination of tuning parameters and found by CV.
4.6 Postselection inference and prediction
Shrinkage penalties cause the estimates of the nonzero coefficients to be biased towards zero (Zhao et al., 2017; Friedman et al., 2001). In the similar spirit to Leeb et al. (2015); Zhao et al. (2017), once is determined, we refit the model using the second order smooth regularization to reduce the prediction bias. Specifically, we solve the following penalized criterion
(7) 
where are the smoothing parameters that control the curvature of the fit and is the smoothing penalty function as described in Wood (2006) and Eilers and Marx (2003). We choose the optimal tuning parameters using CV as discussed in Section 4.4.
The model (7) is conditional on the event that the same data is being used twice: once in the variable selection stage and again in the postselection fit on the selected subset. Wu et al. (2009); Zhao et al. (2017)
argued that the use of standard prediction (confidence) intervals or
pvalues without adjustment are invalid as they neglect the complex selection procedure to define the reduced model in the first place. Thus an adjustment is required to construct valid prediction intervals or obtain valid pvalues (Tibshirani and Johnstone, 2014; Fithian et al., 2014; Lee et al., 2016). To the best of our knowledge, postselection inference for the group LASSO in the context of functional data is still an open problem. Alternatively, inference after selection based on data splitting is commonly adopted for highdimensional data; see Wasserman and Roeder (2009); Meinshausen et al. (2009). We extend these ideas to the case of functional covariate and focus on postselection predictive inference based on data splitting and construct split conformal prediction bands following Lei et al. (2017); for details of the algorithm we refer to the Supplementary Material, Section E.5 Data Application: EMG selection for finger/wrist movement
In this section, we present the variable selection and prediction results for our data collected from an AB subject across multiple postures and movement patterns as described in Section 3. Recall there are 16 EMG signals: 14 coming from different forearm muscles that could potentially contribute to finger or wrist movements and 2 that are randomly generated noise. The ideal selection scheme would pick one EMG signal for extension and another EMG for flexion for both finger and wrist movements, but we do not enforce this restriction in the estimation. This selection should be consistent across postures and movement patterns. In addition to identifying these EMG signals, we are also interested in a model with clear interpretabililty of the regression surfaces and high predictive ability.
Here we consider fitting the procedure described in Section 4 to our data application. Let denote the current time point and be the velocity at time point . Let be the finger or wrist position at time point . Define for some integer as the time window for the recent past EMG signal relative to the current time so the ’s make up the sequence . For , let be the realizations of the th EMG signal at the original time points Recall from Section 3.2 that we proposed and justified a recent past time window that spanned roughly 1/3 second, making since data were collected at 120Hz. The following results were found to be relatively insensitive to other within a reasonable neighborhood of 40.
The underlying musclemovement mechanism for an AB person is known to clinicians; this prior information about the underlying true signals allows us to define the correct model size in the data application. To evaluate the model selection performance, we partition according to important movement classes based on expert knowledge. Let and denote the partitioning subsets contributing to flexion and extension movement, respectively, which depend on finger or wrist movements. Figure 4 shows in circled labels the clinically relevant muscles for different movement classes for an AB person. Specifically for finger movements, and these muscles are known as flexor digitorum and extensor digitorum, respectively. Similarly for wrist movements, which are known as flexor carpi ulnaris, flexor digitorum superficialis, flexor carpi ulnaris, and flexor carpi radials, respectively, and which are coined as extensor carpi radialis longus, extensor digitorum, extensor carpi radialis brevis, and extensor carpi ulnaris, respectively. All muscles associated with a specific movement class produce highly correlated signals. The expert consensus is that one muscle in and another in are sufficient to define the relationship between the movements and EMG signals. The ideal index set identifies one muscle in each movement class or equivalently and , where denotes the cardinality of a set.
We use the method described in Section 4 to study which EMG signals ’s are related to the velocity and determine representative members of the sets and . We also consider three alternative approaches; the first competitor is the method proposed by Gertheiss et al. (2013) in which the authors propose the functional variable selection technique using adaptive group LASSO (agLASSO) type penalized criterion inducing both sparsity and smoothness. The second competitor is the method proposed by Pannu and Billor (2017) extending the idea of Gertheiss et al. (2013)
; here the authors use the objective function based on the least absolute deviation and select the functional variables using group lasso penalty. This approach is explicitly designed to account for potential outliers in the functional predictors, reducing overfitting of the regression surfaces. We refer this approach to LADgLASSO. The third competitor
(Fan et al., 2015) uses functional additive regression (FAR) with a groupwise smoothly clipped absolute deviation (gSCAD) penalty, denoted by FARgSCAD. Unlike other approaches, FARgSCAD imposes penalty on the integral assuming the integral is welldefined. Note that all these methods are designed to assess the relationship between the scalar responses and functional predictors. In stark contrast to SAFEgLASSO, all three competitors assume nonvarying smooth coefficients, ’s defined on , which accounts for the passive forces but assumes the relationship is not position dependent.The performance of all the methods is evaluated in terms of how close they yield an ideal selection of the EMG signals as well as their prediction ability. Specifically if denotes an estimator produced by one of the methods, we evaluate its performance by

Size ideally size = 2 with and

Sparsity (SP) In our application, is 16 and the optimal sparsity is 0.88 which is associated with the ideal model size 2. Define the relative sparsity (RSP) with respect to the optimal sparsity by RSP = SP/0.88;

False positive rate (FPR) where we define the number of false positives (FP) or falsely identified EMG signals in by , and is the complement of

True positive rate (TPR) where the number of true positives (TP) is defined by and focuses only on whether we capture the two index sets and

Mean squared error (MSE)
Note that in general one can not calculate FPRs and TPRs in a data application as the underlying truth is unknown. Our definitions for the above matrices are somewhat unconventional focusing on group identification of and rather than identification of all
5.1 Computational Details
We briefly describe the computational details of our implementation as performed in R, starting with the necessary steps after data processing including construction of the recent past EMG curves using . We approximate the using a tensor product of two orthogonal Bsplines with basis functions in the direction and basis functions in the direction. Knots are placed uniformly throughout and . The tuning parameters to produce the final estimates at each stage are chosen using 5fold CV following the procedure described in Roberts et al. (2017).
For the competitors, the smooth coefficient functions are modeled using Bsplines with 12 basis functions and the sparsitysmoothness tuning parameters are estimated by 5fold CV (Roberts et al., 2017) for all the competing methods. We used the R package grplasso (Meier, 2009) to fit the agLASSO (Gertheiss et al., 2013) and rqPen (Sherwood et al., 2017) to fit the robust LADgLASSO (Pannu and Billor, 2017). We used the code provided by the corresponding author Fan et al. (2015) for FARgSCAD.
5.2 Variable selection, prediction, and implication
Table 1
shows the results of the model selection performance for the finger movements. In most cases all the competing methods select one EMG signal that contributes for finger extension and another for finger flexion. All methods perform fairly well in terms of TPR. In particular, the proposed SAFEgLASSO attains desirable values of TPRs, FPRs, and RSPs across all settings. We found that the first stage of SAFEgLASSO tended to select more EMG signals than needed but the second stage corrected the surplus. In contrast, agLASSO exhibits good numerical performance reflected by the optimal levels of FPRs and RSPs but did have poor TPR in the last two settings. LADgLASSO exhibited large FPRs and RSPs, indicating a high Type I error rate. FARgSCAD demonstrates good model selection performance but selected more EMG signals than necessary, as indicated by its RSP values. The numerical performances are similar for the wrist movements and due to the interest of the space, we include the results in Table
4 of the Supplementary Materials, Section B.3.Figure 5 illustrates graphically the MSEs of the competitive methods. For both finger and wrist movements, SAFEgLASSO outperforms the competitors across all patterns and postures. The proposed approach was able to consistently fit the data better with a smaller model size, specifically relative to FARgSCAD and LADgLASSO.
Figure 6 shows a segment of the fitted velocities based on SAFEgLASSO and agLASSO where the shaded region corresponds to the split conformal prediction bands following Lei et al. (2017). SAFEgLASSO yields superior prediction trajectory than agLASSO in the events with a larger absolute velocity. In the highlighted event of Figure 6, fingers return to the neutral position around 1.45 to 1.88 seconds due to passive forces and while both methods predict movement, agLASSO greatly underestimates the observed velocity. The improvement is most directly attributable to the inclusion of position information. The agLASSO relies on the assumption that the effect of the EMG signals on finger/wrist movements is invariant across which appears to limit the fit across a variety of movement events.
Since our methodology offers lowdimensional modeling with negligible false positives, biomedical engineers can use our method as a screening approach and focus primarily on the selected subsets of forearm muscles in collecting data from TRAs. This approach significantly reduces the burden of data collection with respect to time and cost. Furthermore, since our model and that by Crouch and Huang (2016) account for similar biomechanical phenomenon, our variable selection results may be directly applied to their method.
5.3 Estimation of smooth coefficients
Figure 7 illustrates the estimated coefficient functions for the finger (top) and wrist (bottom) movements. For the purpose of illustration, we focus on interpreting the second posture shown in Table 1. In particular the top and bottom panels of Figure 7 correspond to consistent finger and wrist movement, respectively, for the two identified EMG signals from SAFEgLASSO. For finger movement, there is a clear distinction in the movement contributions: concurrent contraction of extensor digitorum is responsible for finger extension (positive velocity) while flexor digitorum leads to flexion (negative velocity). This matches with the intuition of the biomechanical system. As expected, these associations also depend on the positions. Indeed, the impact of the left surface on the velocity is most important for angles between and radians which correspond to finger flexion. Similarly, the impact of the right surface is most relevant between angles and radians which are primarily associated with finger extension. The flexor digitorum only leads to finger flexion when the hand is in a neutral position (0 to 15 radians).
Finger flexion occurs at other positions occur due to past behavior of the extensor digitorum. The observed concurrent relationship described above has the opposite historical relationship for the positions when the signals are presently active. This implies that past activation of one of these EMG signals can lead to the opposite type of concurrent movement they produce. In particular, this corresponds to two difficult cases where the model tries to establish the systematic relationship between: (1) passive force movements in the absence of muscle contraction, and (2) lack of movements in the presence of consistent muscle contraction due to physical constraints. Our model borrows information from the past, where the muscles were active, to predict such passive movements. Failing to appropriately account for passive forces results in poor prediction performance in those cases as shown in Figure 6.
The bottom panels of Figure 7 plot the estimated regression coefficients for the selected EMG channels for wrist movements. The interpretation of the regression surfaces for the wrist flexion/extension follows the same intuition as of hand movements. Concurrently, flexor carpi ulnaris leads to wrist flexion around 0 to 50 radians while extensor carpi ulnaris yields wrist extension around 60 to 15 radians. When the wrist is in a neutral position, flexor carpi ulnaris has to be flexed to keep the wrist upright around 5 to 15 radians. As before, the observed concurrent relationship exhibits the opposite historical association for wrist flexion and extension.
The regression surfaces for other patterns of finger/wrist movements can be interpreted in the similar manner; we report the regression surfaces corresponding to other patterns in Section B.5 of the Supplementary Material.
6 Simulation study
6.1 Simulation mimicking the data application
In this section, we consider a simulation study that mimicks the EMG data from Section 5. Specifically, we focus on finger movements with consistent movement. Let the observed data be where is now the simulated velocity at time and all other data match the descriptions from section 5. In particular, consider the generating model where for all except and and and are the estimated effects from one of the scenarios. In addition, is a zeromean error process with isotropic covariance function described by
Here is related to the dominant sources of dependence; means that the responses are uncorrelated while large reflects higher degree of autocorrelation between the responses. In addition, controls the strength of correlation between any two measurements. The choice of is driven by the parameter , the correlation between any two consecutive measurements, using the relationship .
The simulation study varies over three factors. The first factor we examine is the dominant sources of dependence determined by

Equal process.

Dominant dependent process.
The second factor is the strength of the correlation between successive measurements.

Low correlation.

High correlation.
The third factor is the magnitude of noise variance

Small noise. corresponds to the situation of having smaller variance than that of the original data.

Large noise. corresponds to the case of equal variance.
Calculate signaltonoise ratio (SNR) as SNR = where In particular, we consider SNR = {875, 445, 80, 8, 4, 0.8}. The results are based on 100 independent samples for each combination of the simulation settings.
Table 2 illustrates the numerical performance corresponding to the simulated kinematic data for different SNRs. The results are consistent to the findings of finger movements in Section 5. As before, we observe that the mean model size of the second stage of the proposed is nearer to the truth (i.e. 2) than that of the first stage; see the “SP” column under SAFEgLASSO in Table 2. This is expected due to the fact that by adopting the twostage variable selection scheme, we shrink the surplus variables in the second stage of the procedure. The proposed approach exhibits lower FPRs than that of the agLASSO, LADgLASSO, and FARgSCAD across all simulation settings. Similar to the other approaches, SAFEgLASSO also attains high TPRs.
SAFEgLASSO exhibited superior prediction performance relative to the competing methods across all simulation settings; see Figure 8. These findings are in agreement with the results of Section 5. As expected the prediction accuracy improves with the SNR; compare the boxplots corresponding to the SAFEgLASSO for different SNRs in Figure 8. Unlike the proposed approach, the numerical performance of the alternative methods suffers due to not considering positingvarying smooth coefficients as the methods fail to quantify the systematic relationship between the velocity and muscle activities of a biomechanical model. We report the additional simulation results in the Supplementary Materials, Section C.
6.2 Numerical experiment
Next we consider another simulation study where data are generated from a purely mathematical perspective. The simulated data is where is a scalar response at an instance is a scalar covariate observed at , ’s are the realizations of the th functional predictor such that . The 10 functional predictors are generated similar to Gertheiss et al. (2013) and Pannu and Billor (2017). The nonzero functional coefficients ’s are varying over and defined as and . Note that controls the strength of functional coefficients over The error variance is calculated based on different SNRs; i.e., SNR = More details about the functional predictors, functional coefficients, and simulation results are summarized in the Supplementary Materials, Section D.
As expected, when meaning the functional coefficient should have only one dimension, , all competitors showed comparable performance across all metrics, having high TPR and similar MSE. All methods except LADgLASSO had low FPR, which consistently selected 5 of the 10 possible variables. As the difficulty of the model increases, say for the competitors perform much worse than SAFEgLASSO. FARgSCAD and agLASSO have low FPR, but tend to select only 1 of the 2 important variables, while LADgLASSO has high TPR but also high FPR, similar to the scenario. As expected, the MSE performance also deteriorates for the competitors as increases.
We also investigated the coverage probabilities and length of the prediction intervals for the SAFEgLASSO models. Denote the lower and upper bound of the insample prediction interval by
, which is constructed following rankoneout (ROO) split conformal prediction inference (Lei et al., 2017). Define the pointwise average coverage probability by where is the total number of insample observations. Calculate the expected length of the interval by Let index the instance at which new information of the predictor is recorded; where and M is the total number of future observations. Define the pointwise average coverage probability for the future observations by where is the prediction intervals computed using the split conformal prediction inference (Lei et al., 2017). In our simulation, we considerTable 3 demonstrates the predictive inference of the proposed approach in terms of the actual coverage and expected length of the interval for both the insample () and future () observations. In general, the average coverage stays around the nominal levels irrespective of the complexity of the model as defined by see the results for and As expected, the width of the interval increases as departs from 0 but reduces for larger SNR; see the average length of the interval for at miscoverage level for We also observe the similar phenomenon with the prediction intervals for the future observations.
7 Discussion
In this paper, we proposed a covariatedependent, scalaronfunction regression model that appropriately accounts for the biomechanical processes involved in hand movement, such as passive forces and physical constraints. The functional predictors were the recent past behavior of EMG signals measured across multiple muscles in the subject’s forearm and the responses were finger and wrist velocity. The functional coefficients for each EMG signal were allowed to vary based on the current finger or wrist position. The bivariate coefficients were then approximated using a tensor product of rich basis expansions that were then estimated with a combined multidimensional smoothing and sparseness penalty, which is an extension of Gertheiss et al. (2013). We developed a twostep variable selection procedure, called Sequential Adaptive Functional Empirical group LASSO (SAFEgLASSO), that was shown through numerical investigations to have superior performance over standard selection approaches (Gertheiss et al., 2013; Pannu and Billor, 2017; Fan et al., 2015) by reducing the number of false positives irrespective of model complexity.
The results of the data application showed SAFEgLASSO was able to identify the important EMG signals for finger and wrist movement for an AB subject. Furthermore, the estimated varying functional coefficients were relatively sparse, easy to interpret, and had exceptional predictive performance compared to standard selection approaches. Our model and fitting algorithm have great potential to outperform current stateoftheart data driven methods for prosthesis control such as pattern recognition because they ignore biomechanical constraints, do not perform variable selection, and are prone to overfitting. The variable selection results from SAFEgLASSO could also be incorporated in the method by Crouch and Huang (2016), which currently does not perform variable selection and uses a planar linksegment dynamic model. Although our model mathematically differs from theirs, both models account for the biomechanical system with enough similarity that our variable selection results still apply. Our model does have the advantage in that, after training our model, there is minimal data processing required to produce predictions in the prosthetic limb, reducing the burden of realtime data collection.
There are many opportunities for extensions and new applications of the approaches taken in this paper. The twostep fitting approach used in SAFEgLASSO easily applies to the more common univariate functional coefficients. We also discussed an approach to assess the predictive ability based on data splitting, which as far as the authors are aware has not been applied to functional regression. Additionally, our model, which uses a tensor product basis, can also be applied in a more general setting with many functional predictors having functional coefficients varying over multiple covariates. This would be of interest for this data application if the coefficients were considered to vary significantly across postures.
In our current developments, the functional covariates were assumed to be observed without error on a fine grid of points. Extensions to a situation where the functional measurements are perturbed by error or when the grid points ’s are sparse would require a preliminary smoothing of the functional covariates using existing approaches Yao et al. (2005); Xiao et al. (2016) and then SAFEgLASSO may be employed on the smoothed functional covariates. In a preliminary investigation, not reported here, we observed that the variable selection performance of SAFEgLASSO is unaffected due to the use of estimated smooth profiles in the place of true functional predictors. However prediction error of the responses is affected by large noise variance of the functional predictors, as would any method.
As in nonparametric regressions, our method relies on the tensor products of basis functions; as a result, the number of parameters to estimate can explode very quickly. Even though our method can tackle the dimensionality issue, it becomes computationally intensive with large number of basis functions. Note that we adopt CV based approach in selecting the tuning parameters for the data application. We also acknowledge that there are other methods to select optimum tuning parameters and CV based approach may not be theoretically the best approach (Gertheiss et al., 2013). While the method works well for our data application, future research is needed to investigate the optimality in selecting the tuning parameters. In addition, we focused on the development of a subjectspecific modeling procedure and were not concerned with estimating subjectspecific variability. By accounting for this source of variability, we can develop highly functional, userspecific robotic prosthetic limb that perform well across multiple subjects through estimation of population parameters.
Acknowledgments
The project described was supported by Division of Information and Intelligent System, National Science Foundation through award number 1527202. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NSF. A.M. Staicu’s research was supported by NSF Grant number DMS1454942.
References
 Bandyopadhyay and Maity (2017) Bandyopadhyay, S. and Maity, A. (2017). Asymptotic theory for varying coefficient regression models with dependent data. Annals of the Institute of Statistical Mathematics, pages 1–15.
 Biddiss and Chau (2007) Biddiss, E. and Chau, T. (2007). Upperlimb prosthetics: critical factors in device abandonment. American Journal of Physical Medicine and Rehabilitation, 86:977–987.
 Butterworth (1930) Butterworth, S. (1930). On the theory of filter amplifiers. Wireless Engineer, 7:536–541.
 Cardot et al. (2003) Cardot, H., Ferraty, F., and Sarda, P. (2003). Spline estimators for the functional linear model. Statistica Sinica, 13:571–591.

Cardot and Sarda (2008)
Cardot, H. and Sarda, P. (2008).
Varyingcoefficient functional linear regression models.
Communications in Statistics—Theory and Methods, 37:3186–3203. 
Ciuperca (2016)
Ciuperca, G. (2016).
Adaptive group lasso selection in quantile models.
Statistical Papers, pages https://doi.org/10.1007/s00362–016–0832–1.  Crouch and Huang (2016) Crouch, D. L. and Huang, H. (2016). Lumpedparameter electromyogramdriven musculoskeletal hand model: A potential platform for realtime prosthesis control. Journal of Biomechanics, 49:3901–3907.
 Crouch and Huang (2017) Crouch, D. L. and Huang, H. (2017). Musculoskeletal modelbased control interface mimics physiologic hand dynamics during path tracing task. Journal of Neural Engineering, 14(3):036008.
 Davenport (2013) Davenport, C. A. (2013). Semiparametric Regression Models for Interacting Covariates. PhD thesis, North Carolina State University, Raleigh, NC.
 Davenport et al. (2015) Davenport, C. A., Maity, A., and Wu, Y. (2015). Parametrically guided estimation in nonparametric varying coefficient models with quasilikelihood. Journal of Nonparametric Statistics, 27:195–213.
 Delp et al. (2007) Delp, S. L., Anderson, F. C., Arnold, A. S., Loan, P., Habib, A., John, C. T., Guendelman, E., and Thelen, D. G. (2007). Opensim: opensource software to create and analyze dynamic simulations of movement. IEEE transactions on biomedical engineering, 54:1940–1950.
 Eilers and Marx (2003) Eilers, P. H. and Marx, B. D. (2003). Multivariate calibration with temperature interaction using twodimensional penalized signal regression. Chemometrics and Intelligent Laboratory Systems, 66:159–174.
 Englehart and Hudgins (2003) Englehart, K. and Hudgins, B. (2003). A robust, realtime control scheme for multifunction myoelectric control. IEEE Transactions on Biomedical Engineering, 50(7):848–854.
 Fan et al. (2014) Fan, J., Ma, Y., and Dai, W. (2014). Nonparametric independence screening in sparse ultrahighdimensional varying coefficient models. Journal of the American Statistical Association, 109:1270–1284.
 Fan and Zhang (2008) Fan, J. and Zhang, W. (2008). Statistical methods with varying coefficient models. Statistics and Its Interface, 1:179–195.
 Fan et al. (2015) Fan, Y., James, G. M., Radchenko, P., et al. (2015). Functional additive regression. The Annals of Statistics, 43:2296–2325.
 Ferraty et al. (2012) Ferraty, F., GonzálezManteiga, W., MartínezCalvo, A., and Vieu, P. (2012). Presmoothing in functional linear regression. Statistica Sinica, 22:69–94.
 Fithian et al. (2014) Fithian, W., Sun, D., and Taylor, J. (2014). Optimal inference after model selection. arXiv preprint arXiv:1410.2597.
 Friedman et al. (2001) Friedman, J., Hastie, T., and Tibshirani, R. (2001). The Elements of Statistical Learning. Springer, New York, NY.
 Gertheiss et al. (2013) Gertheiss, J., Maity, A., and Staicu, A.M. (2013). Variable selection in generalized functional linear models. Stat, 2:86–101.

Goldsmith et al. (2011)
Goldsmith, J., Crainiceanu, C. M., Caffo, B. S., and Reich, D. S. (2011).
Penalized functional regression analysis of whitematter tract profiles in multiple sclerosis.
NeuroImage, 57:431–439.  Guo et al. (2015) Guo, P., Zeng, F., Hu, X., Zhang, D., Zhu, S., Deng, Y., and Hao, Y. (2015). Improved variable selection algorithm using a lassotype penalty, with an application to assessing hepatitis b infection relevant factors in community residents. PLoS One, 10:e0134151.
 Holzbaur et al. (2005) Holzbaur, K. R., Murray, W. M., and Delp, S. L. (2005). A model of the upper extremity for simulating musculoskeletal surgery and analyzing neuromuscular control. Annals of Biomedical Engineering, 33:829–840.
 Huang et al. (2008) Huang, H., Zhou, P., Li, G., and Kuiken, T. A. (2008). An analysis of emg electrode configuration for targeted muscle reinnervation based neural machine interface. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 16(1):37–45.

Ivanoff et al. (2016)
Ivanoff, S., Picard, F., and Rivoirard, V. (2016).
Adaptive lasso and grouplasso for functional poisson regression.
Journal of Machine Learning Research
, 17:1–46.  James et al. (2009) James, G. M., Wang, J., and Zhu, J. (2009). Functional linear regression that’s interpretable. The Annals of Statistics, 37(5A):2083–2108.
 Kawashima and Mita (2016) Kawashima, N. and Mita, T. (2016). Psychophysical evaluation of the capability for phantom limb movement in forearm amputees. PLoS One, 11:e0156349.
 Krstajic et al. (2014) Krstajic, D., Buturovic, L. J., Leahy, D. E., and Thomas, S. (2014). Crossvalidation pitfalls when selecting and assessing regression and classification models. Journal of Cheminformatics, 6(10):https://doi.org/10.1186/1758–2946–6–10.
 Lee et al. (2016) Lee, J. D., Sun, D. L., Sun, Y., Taylor, J. E., et al. (2016). Exact postselection inference, with application to the lasso. The Annals of Statistics, 44:907–927.
 Leeb et al. (2015) Leeb, H., Pötscher, B. M., Ewald, K., et al. (2015). On various confidence intervals postmodelselection. Statistical Science, 30:216–227.
 Lei et al. (2017) Lei, J., G’Sell, M., Rinaldo, A., Tibshirani, R. J., and Wasserman, L. (2017). Distributionfree predictive inference for regression. Journal of the American Statistical Association, page 10.1080/01621459.2017.1307116.

Lei et al. (2015)
Lei, J., Rinaldo, A., and Wasserman, L. (2015).
A conformal prediction approach to explore functional data.
Annals of Mathematics and Artificial Intelligence
, 74(12):29–43.  Linda Resnik et al. (2011) Linda Resnik, P. et al. (2011). Development and testing of new upperlimb prosthetic devices: research designs for usability testing. Journal of Rehabilitation Research and Development, 48(6):697–706.
 Maity and Huang (2012) Maity, A. and Huang, J. Z. (2012). Partially linear varying coefficient models stratified by a functional covariate. Statistics and Probability Letters, 82:1807–1814.
 Matsui and Konishi (2011) Matsui, H. and Konishi, S. (2011). Variable selection for functional regression models via the l1 regularization. Computational Statistics & Data Analysis, 55:3304–3310.
 McLean et al. (2014) McLean, M. W., Hooker, G., Staicu, A.M., Scheipl, F., and Ruppert, D. (2014). Functional generalized additive models. Journal of Computational and Graphical Statistics, 23:249–269.
 Meier (2009) Meier, L. (2009). grplasso: Fitting user specified models with group lasso penalty. R Foundation for Statistical Computing.

Meier et al. (2008)
Meier, L., Van De Geer, S., and Bühlmann, P. (2008).
The group lasso for logistic regression.
Journal of the Royal Statistical Society: Series B (Statistical Methodology), 70:53–71.  Meier et al. (2009) Meier, L., Van de Geer, S., Bühlmann, P., et al. (2009). Highdimensional additive modeling. The Annals of Statistics, 37:3779–3821.
 Meinshausen (2007) Meinshausen, N. (2007). Relaxed lasso. Computational Statistics & Data Analysis, 52:374–393.
 Meinshausen et al. (2009) Meinshausen, N., Meier, L., and Bühlmann, P. (2009). Pvalues for highdimensional regression. Journal of the American Statistical Association, 104:1671–1681.
 Pannu and Billor (2017) Pannu, J. and Billor, N. (2017). Robust grouplasso for functional regression model. Communications in StatisticsSimulation and Computation, 46:3356–3374.
 Peerdeman et al. (2011) Peerdeman, B., Boere, D., Witteveen, H., Hermens, H., Stramigioli, S., Rietman, J., Veltink, P., Misra, S., et al. (2011). Myoelectric forearm prostheses: State of the art from a usercentered perspective. Journal of Rehabilitation Research and Development, 48(6):719–737.
 Ramsay and Silverman (2005) Ramsay, J. and Silverman, B. (2005). Functional Data Analysis. Springer, New York, NY, 2 edition.
 Resnik et al. (2011) Resnik, L., Etter, K., Klinger, S. L., and Kambe, C. (2011). Using virtual reality environment to facilitate training with advanced upperlimb prosthesis. Journal of Rehabilitation Research and Development, 48(6):707–718.
 Roberts et al. (2017) Roberts, D. R., Bahn, V., Ciuti, S., Boyce, M. S., Elith, J., GuilleraArroita, G., Hauenstein, S., LahozMonfort, J. J., Schröder, B., Thuiller, W., et al. (2017). Crossvalidation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure. Ecography, 40(8):913–929.
 Ruppert et al. (2003) Ruppert, D., Wand, M. P., and Carroll, R. J. (2003). Semiparametric regression. Number 12. Cambridge university press.
 Scheipl et al. (2015) Scheipl, F., Staicu, A.M., and Greven, S. (2015). Functional additive mixed models. Journal of Computational and Graphical Statistics, 24:477–501.
 Scheme and Englehart (2011) Scheme, E. and Englehart, K. (2011). Electromyogram pattern recognition for control of powered upperlimb prostheses: State of the art and challenges for clinical use. Journal of Rehabilitation Research and Development, 48(6):643–659.
 Scheme et al. (2010) Scheme, E., Fougner, A., Stavdahl, O., Chan, A. C., and Englehart, K. (2010). Examining the adverse effects of limb position on pattern recognition based myoelectric control. In 2010 Annual International Conference of the IEEE Engineering in Medicine and Biology, pages 6337–6340. IEEE.
 Sherwood et al. (2017) Sherwood, B., Maidman, A., Sherwood, M. B., and ByteCompile, T. (2017). Package ‘rqpen’. R Foundation for Statistical Computing.
 Tibshirani and Johnstone (2014) Tibshirani, R. and Johnstone, I. (2014). A significance test for the lasso. Annals of Statistics, 42(2):413–468.
 Tutz and Gertheiss (2010) Tutz, G. and Gertheiss, J. (2010). Feature extraction in signal regression: a boosting technique for functional data regression. Journal of Computational and Graphical Statistics, 19:154–174.
 Wasserman and Roeder (2009) Wasserman, L. and Roeder, K. (2009). High dimensional variable selection. Annals of Statistics, 37(5A):2178–2201.
 Wei and Huang (2010) Wei, F. and Huang, J. (2010). Consistent group selection in highdimensional linear regression. Bernoulli, 16:1369.
 Wood (2006) Wood, S. N. (2006). Lowrank scaleinvariant tensor product smooths for generalized additive mixed models. Biometrics, 62:1025–1036.
 Wood (2017) Wood, S. N. (2017). Generalized additive models: an introduction with R. CRC press.
 Wu et al. (2009) Wu, T. T., Chen, Y. F., Hastie, T., Sobel, E., and Lange, K. (2009). Genomewide association analysis by lasso penalized logistic regression. Bioinformatics, 25:714–721.
 Wu et al. (2010) Wu, Y., Fan, J., Müller, H.G., et al. (2010). Varyingcoefficient functional linear regression. Bernoulli, 16:730–758.
 Xiao et al. (2016) Xiao, L., Zipunnikov, V., Ruppert, D., and Crainiceanu, C. (2016). Fast covariance estimation for highdimensional functional data. Statistics and Computing, 26:409–421.
 Yang and Zou (2013) Yang, Y. and Zou, H. (2013). gglasso: group lasso penalized learning using a unified bmd algorithm. R Foundation for Statistical Computing.
 Yang and Zou (2015) Yang, Y. and Zou, H. (2015). A fast unified algorithm for solving grouplasso penalize learning problems. Statistics and Computing, 25:1129–1141.
 Yao et al. (2005) Yao, F., Müller, H.G., and Wang, J.L. (2005). Functional data analysis for sparse longitudinal data. Journal of the American Statistical Association, 100:577–590.
 Yuan and Lin (2006) Yuan, M. and Lin, Y. (2006). Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68:49–67.
 Zhao et al. (2017) Zhao, S., Shojaie, A., and Witten, D. (2017). In defense of the indefensible: A very naive approach to highdimensional inference. arXiv preprint arXiv:1705.05543.
 Zhou et al. (2007) Zhou, P., Lowery, M. M., Englehart, K. B., Huang, H., Li, G., Hargrove, L., Dewald, J. P. A., and Kuiken, T. A. (2007). Decoding a new neuralmachine interface for control of artificial limbs. Journal of Neuralphysiology, 98:2974–2982.
 ZieglerGraham et al. (2008) ZieglerGraham, K., MacKenzie, E. J., Ephraim, P. L., Travison, T. G., and Brookmeyer, R. (2008). Estimating the prevalence of limb loss in the united states: 2005 to 2050. Archives of Physical Medicine and Rehabilitation, 89:422–429.
 Zou (2006) Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101:1418–1429.
Supplementary Material
This Supplementary Material consists of five sections. Section A provides details for the fit without sparse penalty. Section B presents additional analysis results for the kinematic data. Section C and Section D provide additional simulation results. Section E details the algorithm of the postselection predictive inference.
Appendix A Generalized additive model
Let the observed data, in general, be where all terms bear the usual meaning as before and are described in the manuscript. Write the penalized criterion for the model (4.3) with the smoothing penalty only in matrix form as
where is a vector with elements ’s, is an matrix having the th row , is a vector of parameters, and Following Ruppert et al. (2003); Wood (2017), taking the derivative with respect to yields we obtain the initial estimates of the regression coefficients as ’s for a given . The corresponding Bayesian posterior covariance matrix is where is estimated from the residual sum of squares. Predict the responses as
The and are unknown in practice; one approach to select the optimal tuning parameters is block CV. The postselection model (4.7) is fitted using the above intuition but with the reduced subset
Appendix B Additional results of EMG selection for movement data
In this section, we present more results of the analysis of the EMG and movement data.
b.1 Data restructuring of EMG signals
Our idea is to characterize the velocity at an instance as a function of the recent past EMG signals. Figure 9 displays visually the data reconstruction of functional predictors using the recent past EMG signals.
b.2 Variation in EMG curves
We use functional principal component analysis technique to examine the main sources of variability in the curves. Figure
10 illustrates the EMG curves ’s associated with a muscle (flexor digitorum) in the forearm that contributes to finger flexion. The first three functional principal components (FPCs) of the estimated marginal covariance of ’s are also plotted. We observe one key feature that solely explains the majority of variation in the curves associated with the EMG signal; see the first FPC in Figure 10.muscle and the first three eigenfunctions from left to right.
b.3 Variable selection for wrist movements
Table 4 shows the results of the model selection performance for the wrist movements with consistent and varying patterns. In most cases, the competing methods select one muscle for wrist extension and another for flexion. As described in the manuscript, there are many potential muscles that contribute to similar wrist movements. Therefore selection between the alike muscles is desired to reduce the redundancy of EMG information. Notice both agLASSO and SAFEgLASSO attain optimal RSPs. In contrast FARgSCAD shows suboptimal RSPs and LADgLASSO performs poorly in selecting desired variables; follow the column “RSP” for competing methods in Table 4. While both these methods have tendency to select noise variables in addition to the true positives, the problem in the selection by LADgLASSO is severe; follow the column “FPR” in Table 4.
Comments
There are no comments yet.