Functional Variable Selection for EMG-based Control of a Robotic Hand Prosthetic

05/08/2018 ∙ by Md Nazmul Islam, et al. ∙ NC State University The University of Tennessee, Knoxville 0

State-of-the-art robotic hand prosthetics generate finger and wrist movement through pattern recognition (PR) algorithms using features of forearm electromyogram (EMG) signals, but re- quires extensive training and is prone to poor predictions for conditions outside the training data (Peerdeman et al., 2011; Scheme et al., 2010). We propose a novel approach to develop a dynamic robotic limb by utilizing the recent history of EMG signals in a model that accounts for physiological features of hand movement which are ignored by PR algorithms. We do this by viewing EMG signals as functional covariates and develop a functional linear model that quantifies the effect of the EMG signals on finger/wrist velocity through a bivariate coefficient function that is allowed to vary with current finger/wrist position. The model is made par- simonious and interpretable through a two-step variable selection procedure, called Sequential Adaptive Functional Empirical group LASSO (SAFE-gLASSO). Numerical studies show excel- lent selection and prediction properties of SAFE-gLASSO compared to popular alternatives. For our motivating dataset, the method correctly identifies the few EMG signals that are known to be important for an able-bodied subject with negligible false positives and the model can be directly implemented in a robotic prosthetic.

READ FULL TEXT VIEW PDF
POST COMMENT

Comments

There are no comments yet.

Authors

page 5

page 19

page 33

page 34

page 35

page 36

page 37

page 38

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

Abstract

State-of-the-art robotic hand prosthetics generate finger and wrist movement through pattern recognition (PR) algorithms using features of forearm electromyogram (EMG) signals, but requires extensive training and is prone to poor predictions for conditions outside the training data (Scheme et al., 2010; Peerdeman et al., 2011). We propose a novel approach to develop a dynamic robotic limb by utilizing the recent history of EMG signals in a model that accounts for physiological features of hand movement which are ignored by PR algorithms. We do this by viewing EMG signals as functional covariates and develop a functional linear model that quantifies the effect of the EMG signals on finger/wrist velocity through a bivariate coefficient function that is allowed to vary with current finger/wrist position. The model is made parsimonious and interpretable through a two-step variable selection procedure, called Sequential Adaptive Functional Empirical group LASSO (SAFE-gLASSO). Numerical studies show excellent selection and prediction properties of SAFE-gLASSO compared to popular alternatives. For our motivating dataset, the method correctly identifies the few EMG signals that are known to be important for an able-bodied subject with negligible false positives and the model can be directly implemented in a robotic prosthetic.

Keywords: Electromyography signal; Varying functional regression; Functional variable selection; Group LASSO; Post-selection predictive inference

1 Introduction

More than 160,000 Americans are transradial (i.e. below-elbow) amputees, henceforth TRAs, and must relearn how to perform tasks that typically require an intact hand (Ziegler-Graham et al., 2008). Passive hand prostheses and related devices are useful, but cannot emulate the full functionality of an intact limb. Multifunctional robotic prosthetics, such as the FDA-approved DEKA arm system (Linda Resnik et al., 2011; Resnik et al., 2011), have become very popular with recent advancements in their mechanical systems. The available software that provides user control of the hardware, however, is often nonintuitive to operate. For example, one approach is to have the user control the prosthetic with their foot. Forearm muscle contractions are known to cause hand movement for an able-bodied subject, henceforth AB subject. A TRA’s residual forearm, which once caused hand movement, will still contract and these contractions exhibit measurable EMG signals that are somewhat consistent with hand movement even though they no longer use that hand. A reliable mapping of a TRA’s forearm electromyogram (EMG) signals to hand movement would then provide a more intuitive approach for prosthesis control.

Figure (1) shows the process of movement intention to movement production for both an AB subject and TRA. For an AB subject, tendons are attached to the forearm muscles, travel through the wrist and connect to bones in the hand. The muscle contractions transmit force through tendons to the bones, generating finger or wrist movement. For a TRA, this physical connection among muscles, tendons, and bones no longer exists, but they may still sense movement in their phantom limb, accompanied by observable muscle contractions in their residual limb. Their generated EMG signal data feed into a prosthesis controller that processes the data and predicts movement, which is then produced by the robotic limb.

Figure 1: Visualization of biomechanical system for hand movement for AB subject and TRA. (A) Internal limb representation in the motor cortex. (B) Neural signal sends motor commands to forearm muscles. (C) Forearm muscles contract to perform desired movement, which, for an AB subject, results in direct hand movement through tendon connections (red lines) to the hand. (D) For a TRA, a prosthesis controller projects real time hand movements using forearm EMG information, and (E) the robotic limb performs the intended movement.

Direct myoelectric control, a traditional approach for prosthesis control, uses EMG signals from two antagonistic muscles and assigns to each a specific type of hand movement, such as finger flexion or extension. The EMG signal can only control that class of movement and, if activated for any reason, will produce that movement with speed proportional to the magnitude of the signal. Direct control is limited in its ability to direct multiple degrees of freedom of hand movement. The user must switch manually between functions (e.g. from wrist pronation/supination to hand open/close), but then the user’s attempted movement may not match with the prosthesis action. Since it is cumbersome for controlling multiple limb functions, direct control prostheses have shown a 75% rejection rate by users

(Biddiss and Chau, 2007; Peerdeman et al., 2011). State-of-the-art prosthetics overcome the restrictions of direct myoelectric control using data-driven pattern recognition (PR) algorithms capable of synthesizing a large number of EMG signals (potentially over 100) into classes of intended movement (Englehart and Hudgins, 2003; Zhou et al., 2007; Huang et al., 2008).

PR is more flexible than direct myoelectric control, but has its own set of limitations that prohibit effective, intuitive user control. PR does not incorporate knowledge of human neurophysiology and biomechanics and relies heavily on the representativeness of the data used to train the algorithm in identifying EMG patterns associated with classes of hand movements. This leads to poor prediction performance for conditions not directly observed in the training data. It assumes that a constant, repeatable pattern of muscle contractions leads to a specific movement class and users may find it difficult to perfectly repeat this pattern. PR also significantly reduces the EMG information, using summarizing features of the EMG signals such as the mean absolute value and number of slope sign changes instead of the original signals. The loss of important information due to data reduction as well as redundancy of predictor information leads to overfitting to the training data, as indicated by Scheme et al. (2010).

Recently, Crouch and Huang (2016) and Crouch and Huang (2017) proposed and implemented an EMG-based controller using a planar link-segment dynamic model that directly incorporates features of the neurophysiological and biomechanical system. Using EMG data from only four forearm muscles, they were able to accurately predict wrist and finger movement. The model they use, however, is nonlinear and it is not clear how well it will translate to a TRA, since their model requires the subset of muscles to be pre-specified. Their results do show that incorporating the neurophysiological and biomechanical properties can yield movement predictions that are more typical of biological movement than movements predicted by PR. This suggests that incorporating such properties in a data-driven approach may help improve movement predictions.

In this paper, we propose to model two degrees of freedom for hand movement, finger and wrist flexion/extension, using recent past behavior of the EMG signals. Our statistical model uses either finger or wrist velocity as its response and allows the effect of each EMG signal on the response to vary with the current finger or wrist position. This model retains much of the EMG information and respects known biomechanical constraints, thus allowing it to explain a broader range of hand movements using fewer EMG signals. We fit the model using a novel algorithm for functional regression which we call Sequential Adaptive Functional Empirical group LASSO (SAFE-gLASSO) that selects only a subset of the EMG to be included in the model while simultaneously requiring that the estimated functional coefficients are smooth and interpretable. SAFE-gLASSO pairs a generalization of the adaptive sparse-smoothness penalty from

Gertheiss et al. (2013) with an efficient, sequential variable selection and fitting approach inspired by the relaxed LASSO (Meinshausen, 2007). We also extend the ideas of Lei et al. (2017) to generate post-selection prediction confidence bands.

The paper is organized as follows. Section 2 reviews the neurophysical and biomechanical relationships involved in hand movement that we address in our proposed statistical model. Section 3 describes the collection process for collecting EMG and hand movement data. Section 4 details the analysis methodology starting with the proposed functional linear model and explains how it accounts for known underlying biomechanical relationships. We then describe the variable selection and post-selection fit procedures. Section 5 presents the analysis results from data collected from an AB subject, where the underlying truth is known. An extensive simulation study is performed in sections 6.1 and 6.2, showing the robustness of our method to varying assumptions. Section 7 concludes the paper with a discussion of limitations and possible avenues of extension.

2 Biomechanics of hand movement

This section describes the underlying biomechanical process that generates intentional hand movement for an AB subject and highlights the unique challenges that will need to be addressed by the proposed model. Incorporating features of this process in the statistical analysis and the robotic prosthetic software should significantly aid user control of robotic prosthetics by TRAs. In particular, this section motivates the use of recent past behavior of the EMG signals as functional predictors and justifies the need for position-dependent effects to map the relationship between these EMG signals and velocity. Velocity was chosen as the desired response instead of position because almost all the upper limb prostheses on the market have direction and velocity control as their outputs (Scheme and Englehart, 2011).

The biological process of hand movement begins with initiation of action potentials that are first conducted along motor neurons from the central nervous system. At the neuromuscular junction, the action potential causes the release of neurotransmitters from the nerve that initiates an action potential on the muscle fiber membrane. That motor unit action potential (MUAP) conducts along the surface of the muscle fiber membrane that extends over and within the muscle fiber. This stimulates the release of calcium ions within myofibrils and initiates cross-bridge cycling of overlapping protein chains that cause the entire muscle fiber to shorten (contract), generating mechanical energy. The output force of an entire muscle varies depending on the proportion of fibers in the muscle that are contracted and the frequency at which they are stimulated. The EMG signal measured from surface electrodes represents the sum of individual MUAPs and thus conveys information about the magnitude and duration of muscle contraction.

The mechanical aspect of hand movement explains how the mechanical energy generated by the muscle contractions lead to specific hand movements. For AB subjects, tendons connect forearm muscles to bones in the hand and it is through this connection that movements are generated. That is, a muscle contraction will pull its tendon which in turn moves the hand depending on the path of the tendon relative to the joint(s) that it crosses. As mentioned earlier, AB subjects have an internal representation that maps muscle contractions to hand movements, and this mapping is maintained for TRAs following surgery, although it may become distorted. Our goal is to build and implement a statistical model that decodes this internal biomechanical representation, specifically for relating finger and wrist movement to EMG signal data.

The top panel of Figure 2 demonstrates what is meant by finger and wrist movement. For example, finger flexion and extension refers to the simultaneous opening and closing of all finger digits, respectively, except for the thumb. The bottom panel of Figure 2 shows a range of arm postures considered in this paper. A PR-based controller trained on one posture setting would likely perform poorly in one of the other postures because EMG patterns change with limb posture due to, for example, a shift in electrode locations relative to the underlying muscles (Scheme et al., 2010). A biomechanically motivated model should behave consistently across postures because the biomechanical process does not change significantly with posture.

Figure 2: Top panel corresponds to finger and wrist movement: finger extension, finger flexion, wrist extension, and wrist flexion (from left to right). Bottom panel illustrates the arm postures at which data is collected during the experiment.

The known biomechanical process of AB subjects suggests that these two movement degrees of freedom can be explained by relatively few forearm muscles. The nonlinear model of Crouch and Huang (2016) is capable of accurate predictions with few EMG signals but does not clearly admit a variable selection procedure to decide which muscles to use. This presents a problem for its use by a TRA for whom the intended biomechanical actions of the EMG are not directly observable and may be altered following amputation. Thus we require a statistical model that is capable of accurate predictions using few EMG signals and the model must lend itself to a variable selection procedure.

There are many challenges and constraints that the proposed statistical model must accommodate to successfully approximate this biomechanical process. First, the fingers and the wrist are limited in how far they can be extended/flexed. Second, muscles generate passive movement forces when stretched, which may produce movement in the absence of EMG information. For instance, if one relaxes their forearm muscles following a contraction, the tendons will return back to their resting length, generating a passive force, and the hand will return to a neutral configuration. Therefore, we may observe movement without observing concurrent EMG activation. While this tendon connection no longer exists for a TRA, they may still anticipate these passive forces.

Figure 3 illustrates some finger position data (in radians, shown in black) taken from an AB subject for a short time-window and overlays two concurrent EMG signals, labeled EMG 7 (green) and EMG 12 (red), known to contribute to finger movement. Figure 3(a) is an event window with an active EMG 7 signal but no movement; the result of a physical constraint. Figure 3(b) demonstrates finger movement in the absence of active, concurrent EMG signals, due to passive forces. This figure demonstrates the opportunity for past EMG behavior to predict passive movement, as seen by the declining green line preceding Figure 3(b).

Figure 3: Depicted are the joint finger angle (radians) for flexion and extension movements, in black, and two normalized EMG signals, in red and green. Left and right Y-axes correspond to finger angle and EMG signals, respectively. Event (a) demonstrates restricted movement due to a physical constraint, in this case maximal finger extension. Event (b) demonstrates movement due to passive forces, lacking active EMG signals.

3 Data collection and processing

This section and the analysis in section 5 focuses on the collection and analysis of data from an AB subject. The process of calibrating a robotic arm is subject-specific and we are not interested in estimating a population model. The data collection procedure described here and the analysis methodology described in Section 4 could be modified to perform inference across multiple subjects. A data collection strategy is given at the end of Section 3.1 for TRAs consistent with our analysis approach, requiring measured hand movements.

Surface EMG electrodes (Biometrics, Newport, UK) were placed at different locations of the forearm. The subject was then asked to perform specific types of hand movement and the EMG signals were recorded continuously at 960 Hz. Electrodes were placed over 16 specific forearm muscles considered to be potentially important muscles for generating hand movements of interest. Raw EMG data were high-pass filtered at 40 Hz, rectified, and low-pass filtered at 6 Hz using a 4th order Butterworth zero-phase filter (Crouch and Huang, 2016). Prior to data collection, the subject was asked to perform movements exhibiting maximal muscle contraction, allowing normalization of the subject’s EMG signals to be between 0 (no contraction) and 1 (maximal contraction).

Movement data were collected by reflective markers placed on 9 anatomical locations on the forearm, wrist, and hand. Three-dimensional marker positions were recorded at 120 Hz using an infrared motion capture system (Vicon Motion Systems Ltd., UK), and processed (filtered) at 6 Hz using a 4th order Butterworth filter (Butterworth, 1930). The joint angles (in radians), which we call positions, were then calculated from filtered marker data through a musculoskeletal model (Holzbaur et al., 2005) in OpenSim (Delp et al., 2007). EMG and joint angle data were collected synchronously for a given period of time during which a subject was asked to perform a set of finger and/or wrist movement. Figure 3 provides a snapshot of this data while a subject is performing finger flexion and extension with consistent movements in a fixed posture.

3.1 Data collection procedure

The subject was asked to perform basic finger and wrist movements in each arm posture as shown in the bottom panel of Figure 2. For a given posture, the subject was asked to perform single degree-of-freedom movements for their fingers/wrist following either a consistent or random pattern. For example, the subject performed consistent finger movement that continuously alternated between extending (opening) and flexing (closing) their fingers, while keeping their wrist in a neutral position. Such a protocol has been followed by others; see for example Kawashima and Mita (2016). The subject was also asked to perform a series of random movements, which was chosen solely by the subject, providing a challenging benchmark to determine how robust our fitted models are to different movements.

For a TRA, there would be no movement data for the corresponding hand. Instead, movement data can be collected from the other, intact limb while the TRA performs (attempts) mirrored movements with both limbs; see Scheme and Englehart (2011).

3.2 Post-collection data processing for analysis

To accommodate the biomechanical constraints described in section 2, we utilize recent past behavior of EMG signals, as well as current finger/wrist position, to predict finger/wrist velocity. For example, a concurrent EMG value of 0.50 leads to different movements depending on the historical EMG trend. We briefly describe here the data processing that was done that allowed us to fit the model described in the next section.

The velocity values were estimated from a penalized smoother of the entire series of recorded position data across time using the R package fda. In particular, we used a second-order smooth regularization penalty to control the goodness of fit and smoothness of the fitted curve, where the smoothing parameter is selected by cross-validation. We then generated a velocity estimate for each observed position data point. For each velocity data point, we extracted the previous EMG observations that ended with the concurrent EMG value to the velocity value. In this application, we chose a past time window of roughly 1/3 seconds. The value was chosen based on observed passive force movement (see Figure 3). This was done for all EMG signals, so each velocity estimate had associated with it a position value and equally-spaced past measurements across 1/3 seconds for all EMG signals collected. A visualization of this data restructuring may be found in the Supplementary Materials, Section LABEL:AppendixChp3. We propose to model the current velocity of the finger/wrist movement as a function of the recent history of the EMG signals which is to be discussed next.

4 Variable selection and inferential framework

Denote the observed data by where indexes the instance at which data are collected, is the scalar response, is the continuous scalar covariate, indexes the functional predictors, and is the th functional predictor observed at point such that . It is assumed that and both and are closed compact sets. In our application, is the current velocity, is the current position, and is the recent history of the th EMG signal and is a window of time that depicts the “recent” past.

We consider a functional linear model with varying smooth effects:

(1)

where is an intercept and is an unknown bivariate function defined on that quantifies the effect of the th functional predictor on the mean response of conditional on . Model (1) is a direct extension of the functional linear model (FLM) described in Ramsay and Silverman (2005); Cardot et al. (2003); James et al. (2009); Goldsmith et al. (2011); Ferraty et al. (2012); McLean et al. (2014) who assume nonvarying functional coefficients. In our application accounting for positions in the statistical model is crucial for modeling of passive forces since the effect of the EMG signals on finger/wrist movement heavily depends on the current state of the hand as argued in section 2. Varying coefficients for scalar covariates characterize the effect of a covariate on response through another covariate and have been studied extensively in nonparametric and semiparametric literature; see, for example, Fan and Zhang (2008); Maity and Huang (2012); Fan et al. (2014); Davenport et al. (2015); Bandyopadhyay and Maity (2017). Recently, the varying-coefficient model has been extended to analyze functional data in Cardot and Sarda (2008); Wu et al. (2010); Davenport (2013).

Our primary objective is to select the functional predictors whose effects on the mean response as described by (1) is non-zero. For our application this would allow us to select the most important forearm muscles in explaining finger and wrist movements. Let denote the true index set where and let denote the set of true effects. Established fitting approaches focus on the estimation of the smooth coefficients rather than variable selection. Variable selection in scalar-on-function regression with invariant smooth coefficients, , has been discussed in Matsui and Konishi (2011); Gertheiss et al. (2013); Fan et al. (2015); Pannu and Billor (2017). In this paper, we extend these approaches to the varying functional linear model (1) that is motivated by our data application.

4.1 Approximation to linear model

Following the procedure described in Wood (2006) and Eilers and Marx (2003), we approximate

using a tensor product of two univariate basis functions. Let

and be two truncated univariate bases defined in and respectively; and let , where ’s are the unknown basis coefficients associated with the th functional predictor. For simplicity of exposition, we use the same bases for all Generalization to different bases for different predictors is straightforward.

It is more convenient to rewrite the tensor product in matrix form where , , and is the matrix of basis coefficients, which is our target for inference. The generic summand in (1) can be approximated by Riemann sum approximation as

(2)

where is a

row vector and

. By an abuse of notation, define the vector where indicates the Kronecker product. For notational simplicity, we omit indices and in the subscript of and denote it by . Vectorize by appropriately stacking its columns vertically and denote the vector of basis coefficients ’s by . This yields an approximating linear model to (1) with unknown regression coefficients ’s

(3)

4.2 Penalized criterion for variable selection

We allow and to be sufficiently large to capture the complexity of the regression surfaces and penalize the degree of smoothness. We adopt the penalized least squares approach for estimation that simultaneously induces group sparsity and controls smoothness of the corresponding regression surfaces.

Let be the norm of , and let and be the second partial derivatives of with respect to and respectively. Prior to analysis, we center the response and functional predictors ’s, and remove from the model. By an abuse of notation, we use the same notation for the centered model. The estimates of the ’s are then selected to be the minimizers of the following penalized criterion

(4)

where controls the model sparsity, and control the smoothness of in the and dimension, respectively. Let and The proposed penalty function was first introduced by Meier et al. (2009) for variable selection in high dimensional additive model. As increases, more penalty weight is placed on both the magnitude and smoothness of the surface which results in , excluding the corresponding functional predictor from the model. Although is assumed to be a smooth function, the fitted regression surfaces may be rough if the penalty on the second order derivatives and are not included. The extent of smoothing of the non-zero regression surfaces is controlled by the vector of smoothing parameters . For example, as increases, the model more heavily penalizes the departure from linearity, producing linear effects in both directions. The proposed penalty ensures smoothness and sparseness in two dimensions; it can be easily extended for multiple dimensions without loss of generality.

We now turn our attention to efficient calculation of for a given function . For simplicity, we further assume that the bases and are orthogonal B-splines. Using the orthogonal property of the bases, define by the identity matrix with the th element as if and 0 otherwise, and define similarly in terms of the ’s. General bases will not have and equal to the identity. Let and be the second derivatives of and respectively; let and be the vectors of these and functions. Define and

It follows that

where the last equality follows from the orthogonality of the bases. Similarly, we obtain and The penalty may now be expressed as a sparse-smooth regularization (Meier et al., 2008; Meier, 2009)

where is a symmetric positive-definite matrix. Cholesky decomposition on gives where is a lower triangular non-singular matrix. Using (3) and reparametrize the model coefficients as and transform each to It follows that (4) can be written equivalently as

(5)

which is similar to the group LASSO penalized criterion described in Yuan and Lin (2006); Yang and Zou (2013, 2015).

We use the groupwise-majorization-descent (GMD) algorithm described by Yang and Zou (2013) to solve (5), which is implemented by the R package gglasso (Yang and Zou, 2013). A benefit of the GMD algorithm is that it does not require group-wise orthonormality and is applicable for a general design matrix. In addition, for given values of the tuning parameters and the minimizer of (5) with respect to has a close-form solution; denote the solution by by omitting its dependence on and .

It is straightforward to restructure the estimated basis coefficients in to give , and we have

Let be the estimated index set after solving under and and denote the set of their corresponding estimates by . In reality, the sparseness and smoothing parameters, and are unknown and need to be chosen empirically; for instance by cross-validation. We discuss this aspect in more detail in Section 4.4.

The penalized criterion (4

) does not allow for different shrinkage and smoothness for different functional predictors, and assumes equal weight. This may inflate the number of false positives in variable selection and to mitigate this problem, adaptive estimation is recommended for variable selection in high-dimensional data

(Meier, 2009; Tutz and Gertheiss, 2010; Gertheiss et al., 2013). We too adopt this approach and discuss it next.

4.3 Adaptive penalized criterion

We generalize the criterion (4) by adaptive penalized criterion. Specifically, we use initial weights to introduce prior information on the relative importance and smoothness of functional predictors, where is related to the sparse penalty factor and and are related to the smooth penalty factors in the and dimension, respectively. Define the adaptive penalty function by

(6)

where the weights are strictly positive and calculated based on the parameter estimates associated with a functional additive model (McLean et al., 2014; Scheipl et al., 2015) fit without a sparseness penalty. The subscript 0 is used to distinguish the initial estimates from the model estimates of (5). Define the weights by and for details we refer to Zou (2006); Gertheiss et al. (2013); Guo et al. (2015); Ciuperca (2016); Ivanoff et al. (2016). Alternatively, one can adopt a semi-adaptive approach using only while keeping equal weights for smooth penalty factors such as and for all Like Yang and Zou (2015), the weights based on the number of parameters within each group can also be adopted alternatively. Here, as decreases to , increases to which yields sparser solution for for a given Similarly, as and increase to , the regression surface yields linear pattern in both directions for a given We denote the minimizer of the criteria with penalty (6) by associated with for a given and

4.4 Selection of the tuning parameters, and

A widely used method to select tuning parameters is -fold cross-validation (CV). We use 5-fold block CV which has been found appealing for data with temporal correlations (see Roberts et al. (2017)).

Specifically, we partition the data into 5 equally-sized sections; let index the folds and denote by be the th ordered response of the th fold corresponding to the instance , and be the total number of observations in the th subsample. We aim for the to be as equal as possible across . The test set is formed by one of the five folds; the remaining four folds form the training set.

For fixed values of and the model parameters are estimated based on the training set (e.g., all data with the th fold removed), and the prediction accuracy is evaluated based on the performance of the test set (e.g., the th fold); i.e., where is the predicted value for

For each value of the tuning parameters, we compute the average

’s and select the values corresponding to the minimum prediction error. Furthermore, CV with the one standard error rule can be also adopted; the main idea is to select the simplest model whose numerical performance is comparable with the optimal model that is chosen by minimum prediction error

(Friedman et al., 2001; Krstajic et al., 2014; Yang and Zou, 2015).

4.5 Sequential Adaptive Functional Empirical (SAFE) Selection

Most variable selection algorithms can not guarantee the exclusion of noise variables (Meinshausen et al., 2009). In particular, the group LASSO approach has been shown to select a larger number of groups than necessary (Meier et al., 2008; Gertheiss et al., 2013); this may happen due to the uncertainty in selecting the optimum combination of tuning parameters. As a remedy, we propose to perform the selection step twice where the second stage considers only the covariates ; where the superscript (1) denotes the first stage, and and are the optimal tuning parameters determined by CV in the first stage. The adaptive weights for the second stage are then calculated based on

The proposed idea is motivated by the relaxed LASSO (Meinshausen, 2007) approach, however, the fitting procedure is different in terms of calculating the adaptive weights for the second stage. In contrast to Meinshausen (2007), our solution path of the second stage is driven by both and the selection of second-stage tuning parameters and In Meinshausen (2007), the estimates of the second stage are not driven by weights but rather based on the exhaustive search of the sparse tuning parameter. This idea was also promoted by Wei and Huang (2010) and Guo et al. (2015) who adopted a two-stage variable selection strategy for the scalar-on-scalar regression.

With weights re-calculated, we minimize the objective function (6) using only those with respect to and Let be the estimated index set resulting from the second stage of the regularization fit under the optimal combination of tuning parameters and found by CV.

4.6 Post-selection inference and prediction

Shrinkage penalties cause the estimates of the non-zero coefficients to be biased towards zero (Zhao et al., 2017; Friedman et al., 2001). In the similar spirit to Leeb et al. (2015); Zhao et al. (2017), once is determined, we refit the model using the second order smooth regularization to reduce the prediction bias. Specifically, we solve the following penalized criterion

(7)

where are the smoothing parameters that control the curvature of the fit and is the smoothing penalty function as described in Wood (2006) and Eilers and Marx (2003). We choose the optimal tuning parameters using CV as discussed in Section 4.4.

The model (7) is conditional on the event that the same data is being used twice: once in the variable selection stage and again in the post-selection fit on the selected subset. Wu et al. (2009); Zhao et al. (2017)

argued that the use of standard prediction (confidence) intervals or

p-values without adjustment are invalid as they neglect the complex selection procedure to define the reduced model in the first place. Thus an adjustment is required to construct valid prediction intervals or obtain valid p-values (Tibshirani and Johnstone, 2014; Fithian et al., 2014; Lee et al., 2016). To the best of our knowledge, post-selection inference for the group LASSO in the context of functional data is still an open problem. Alternatively, inference after selection based on data splitting is commonly adopted for high-dimensional data; see Wasserman and Roeder (2009); Meinshausen et al. (2009). We extend these ideas to the case of functional covariate and focus on post-selection predictive inference based on data splitting and construct split conformal prediction bands following Lei et al. (2017); for details of the algorithm we refer to the Supplementary Material, Section E.

5 Data Application: EMG selection for finger/wrist movement

In this section, we present the variable selection and prediction results for our data collected from an AB subject across multiple postures and movement patterns as described in Section 3. Recall there are 16 EMG signals: 14 coming from different forearm muscles that could potentially contribute to finger or wrist movements and 2 that are randomly generated noise. The ideal selection scheme would pick one EMG signal for extension and another EMG for flexion for both finger and wrist movements, but we do not enforce this restriction in the estimation. This selection should be consistent across postures and movement patterns. In addition to identifying these EMG signals, we are also interested in a model with clear interpretabililty of the regression surfaces and high predictive ability.

Here we consider fitting the procedure described in Section 4 to our data application. Let denote the current time point and be the velocity at time point . Let be the finger or wrist position at time point . Define for some integer as the time window for the recent past EMG signal relative to the current time so the ’s make up the sequence . For , let be the realizations of the th EMG signal at the original time points Recall from Section 3.2 that we proposed and justified a recent past time window that spanned roughly 1/3 second, making since data were collected at 120Hz. The following results were found to be relatively insensitive to other within a reasonable neighborhood of 40.

The underlying muscle-movement mechanism for an AB person is known to clinicians; this prior information about the underlying true signals allows us to define the correct model size in the data application. To evaluate the model selection performance, we partition according to important movement classes based on expert knowledge. Let and denote the partitioning subsets contributing to flexion and extension movement, respectively, which depend on finger or wrist movements. Figure 4 shows in circled labels the clinically relevant muscles for different movement classes for an AB person. Specifically for finger movements, and these muscles are known as flexor digitorum and extensor digitorum, respectively. Similarly for wrist movements, which are known as flexor carpi ulnaris, flexor digitorum superficialis, flexor carpi ulnaris, and flexor carpi radials, respectively, and which are coined as extensor carpi radialis longus, extensor digitorum, extensor carpi radialis brevis, and extensor carpi ulnaris, respectively. All muscles associated with a specific movement class produce highly correlated signals. The expert consensus is that one muscle in and another in are sufficient to define the relationship between the movements and EMG signals. The ideal index set identifies one muscle in each movement class or equivalently and , where denotes the cardinality of a set.

Figure 4: Reference forearm muscles for finger (top) and wrist (bottom) movements.

We use the method described in Section 4 to study which EMG signals ’s are related to the velocity and determine representative members of the sets and . We also consider three alternative approaches; the first competitor is the method proposed by Gertheiss et al. (2013) in which the authors propose the functional variable selection technique using adaptive group LASSO (agLASSO) type penalized criterion inducing both sparsity and smoothness. The second competitor is the method proposed by Pannu and Billor (2017) extending the idea of Gertheiss et al. (2013)

; here the authors use the objective function based on the least absolute deviation and select the functional variables using group lasso penalty. This approach is explicitly designed to account for potential outliers in the functional predictors, reducing overfitting of the regression surfaces. We refer this approach to LAD-gLASSO. The third competitor

(Fan et al., 2015) uses functional additive regression (FAR) with a groupwise smoothly clipped absolute deviation (gSCAD) penalty, denoted by FAR-gSCAD. Unlike other approaches, FAR-gSCAD imposes penalty on the integral assuming the integral is well-defined. Note that all these methods are designed to assess the relationship between the scalar responses and functional predictors. In stark contrast to SAFE-gLASSO, all three competitors assume non-varying smooth coefficients, ’s defined on , which accounts for the passive forces but assumes the relationship is not position dependent.

The performance of all the methods is evaluated in terms of how close they yield an ideal selection of the EMG signals as well as their prediction ability. Specifically if denotes an estimator produced by one of the methods, we evaluate its performance by

  • Size ideally size = 2 with and

  • Sparsity (SP) In our application, is 16 and the optimal sparsity is 0.88 which is associated with the ideal model size 2. Define the relative sparsity (RSP) with respect to the optimal sparsity by RSP = SP/0.88;

  • False positive rate (FPR) where we define the number of false positives (FP) or falsely identified EMG signals in by , and is the complement of

  • True positive rate (TPR) where the number of true positives (TP) is defined by and focuses only on whether we capture the two index sets and

  • Mean squared error (MSE)

Note that in general one can not calculate FPRs and TPRs in a data application as the underlying truth is unknown. Our definitions for the above matrices are somewhat unconventional focusing on group identification of and rather than identification of all

5.1 Computational Details

We briefly describe the computational details of our implementation as performed in R, starting with the necessary steps after data processing including construction of the recent past EMG curves using . We approximate the using a tensor product of two orthogonal B-splines with basis functions in the direction and basis functions in the direction. Knots are placed uniformly throughout and . The tuning parameters to produce the final estimates at each stage are chosen using 5-fold CV following the procedure described in Roberts et al. (2017).

For the competitors, the smooth coefficient functions are modeled using B-splines with 12 basis functions and the sparsity-smoothness tuning parameters are estimated by 5-fold CV (Roberts et al., 2017) for all the competing methods. We used the R package grplasso (Meier, 2009) to fit the agLASSO (Gertheiss et al., 2013) and rqPen (Sherwood et al., 2017) to fit the robust LAD-gLASSO (Pannu and Billor, 2017). We used the code provided by the corresponding author Fan et al. (2015) for FAR-gSCAD.

5.2 Variable selection, prediction, and implication

Table 1

shows the results of the model selection performance for the finger movements. In most cases all the competing methods select one EMG signal that contributes for finger extension and another for finger flexion. All methods perform fairly well in terms of TPR. In particular, the proposed SAFE-gLASSO attains desirable values of TPRs, FPRs, and RSPs across all settings. We found that the first stage of SAFE-gLASSO tended to select more EMG signals than needed but the second stage corrected the surplus. In contrast, agLASSO exhibits good numerical performance reflected by the optimal levels of FPRs and RSPs but did have poor TPR in the last two settings. LAD-gLASSO exhibited large FPRs and RSPs, indicating a high Type I error rate. FAR-gSCAD demonstrates good model selection performance but selected more EMG signals than necessary, as indicated by its RSP values. The numerical performances are similar for the wrist movements and due to the interest of the space, we include the results in Table

4 of the Supplementary Materials, Section B.3.

Figure 5 illustrates graphically the MSEs of the competitive methods. For both finger and wrist movements, SAFE-gLASSO outperforms the competitors across all patterns and postures. The proposed approach was able to consistently fit the data better with a smaller model size, specifically relative to FAR-gSCAD and LAD-gLASSO.

Figure 6 shows a segment of the fitted velocities based on SAFE-gLASSO and agLASSO where the shaded region corresponds to the split conformal prediction bands following Lei et al. (2017). SAFE-gLASSO yields superior prediction trajectory than agLASSO in the events with a larger absolute velocity. In the highlighted event of Figure 6, fingers return to the neutral position around 1.45 to 1.88 seconds due to passive forces and while both methods predict movement, agLASSO greatly underestimates the observed velocity. The improvement is most directly attributable to the inclusion of position information. The agLASSO relies on the assumption that the effect of the EMG signals on finger/wrist movements is invariant across which appears to limit the fit across a variety of movement events.

Since our methodology offers low-dimensional modeling with negligible false positives, biomedical engineers can use our method as a screening approach and focus primarily on the selected subsets of forearm muscles in collecting data from TRAs. This approach significantly reduces the burden of data collection with respect to time and cost. Furthermore, since our model and that by Crouch and Huang (2016) account for similar biomechanical phenomenon, our variable selection results may be directly applied to their method.

agLASSO LAD-gLASSO FAR-gSCAD SAFE-gLASSO Pattern RSP TPR FPR RSP TPR FPR RSP TPR FPR RSP TPR FPR 1.00 [2] 100 [2] 0 [0] 0.57 [8] 100 [3] 38 [5] 0.92 [3] 100 [3] 0 [0] 1.00 [2][2] 100 [2][2] 0 [0][0] 1.00 [2] 100 [2] 0 [0] 0.72 [6] 100 [3] 23 [3] 0.92 [3] 100 [3] 0 [0] 1.00 [2][2] 100 [2][2] 0 [0][0] 1.00 [2] 100 [2] 0 [0] 0.57 [8] 100 [3] 38 [5] 0.92 [3] 100 [3] 0 [0] 0.92 [3][3] 100 [3][3] 0 [0][0] 1.00 [2] 100 [2] 0 [0] 0.78 [5] 100 [2] 23 [3] 1.00 [2] 100 [2] 0 [0] 1.00 [2][3] 100 [2][2] 0 [0][1] 1.06 [1] 50 [1] 0 [0] 0.85 [4] 100 [2] 15 [2] 0.92 [3] 100 [3] 0 [0] 1.00 [2][3] 100 [2][3] 0 [0][0] 1.06 [1] 50 [1] 0 [0] 0.85 [4] 100 [3] 8 [1] 0.92 [3] 100 [3] 0 [0] 0.92 [3][3] 100 [2][2] 0 [1][1]

Table 1: Finger movements EMG signal selection for consistent (top three rows) and random (bottom three row) patterns at different postures and results with superscript correspond to the first stage of SAFE-gLASSO. RSPs, number of variables (square brackets), and the percentages of TPRs and FPRs are presented.
Figure 5: Comparison of MSEs based on alternative approaches. Results correspond to finger (left) and wrist (right) movements with different patterns and postures.
Figure 6: Predicted velocity corresponding to consistent finger movements based on SAFE-gLASSO (red dashed line) and agLASSO (blue dashed line). Vertical reference lines are drawn at 1.45 and 1.88 second at which movements due to passive forces occur in the absence of muscle contractions. Shaded regions correspond to pointwise 95% split conformal prediction bands for SAFE-gLASSO.

5.3 Estimation of smooth coefficients

Figure 7 illustrates the estimated coefficient functions for the finger (top) and wrist (bottom) movements. For the purpose of illustration, we focus on interpreting the second posture shown in Table 1. In particular the top and bottom panels of Figure 7 correspond to consistent finger and wrist movement, respectively, for the two identified EMG signals from SAFE-gLASSO. For finger movement, there is a clear distinction in the movement contributions: concurrent contraction of extensor digitorum is responsible for finger extension (positive velocity) while flexor digitorum leads to flexion (negative velocity). This matches with the intuition of the biomechanical system. As expected, these associations also depend on the positions. Indeed, the impact of the left surface on the velocity is most important for angles between and radians which correspond to finger flexion. Similarly, the impact of the right surface is most relevant between angles and radians which are primarily associated with finger extension. The flexor digitorum only leads to finger flexion when the hand is in a neutral position (0 to 15 radians).

Finger flexion occurs at other positions occur due to past behavior of the extensor digitorum. The observed concurrent relationship described above has the opposite historical relationship for the positions when the signals are presently active. This implies that past activation of one of these EMG signals can lead to the opposite type of concurrent movement they produce. In particular, this corresponds to two difficult cases where the model tries to establish the systematic relationship between: (1) passive force movements in the absence of muscle contraction, and (2) lack of movements in the presence of consistent muscle contraction due to physical constraints. Our model borrows information from the past, where the muscles were active, to predict such passive movements. Failing to appropriately account for passive forces results in poor prediction performance in those cases as shown in Figure 6.

The bottom panels of Figure 7 plot the estimated regression coefficients for the selected EMG channels for wrist movements. The interpretation of the regression surfaces for the wrist flexion/extension follows the same intuition as of hand movements. Concurrently, flexor carpi ulnaris leads to wrist flexion around 0 to 50 radians while extensor carpi ulnaris yields wrist extension around -60 to -15 radians. When the wrist is in a neutral position, flexor carpi ulnaris has to be flexed to keep the wrist upright around -5 to -15 radians. As before, the observed concurrent relationship exhibits the opposite historical association for wrist flexion and extension.

The regression surfaces for other patterns of finger/wrist movements can be interpreted in the similar manner; we report the regression surfaces corresponding to other patterns in Section B.5 of the Supplementary Material.

Figure 7: Regression surfaces for finger flexion (top-left), finger extension (top-right), wrist flexion (bottom-left), and wrist extension (bottom-right). The selected forearm muscles are pointed out by red circles.

6 Simulation study

6.1 Simulation mimicking the data application

In this section, we consider a simulation study that mimicks the EMG data from Section 5. Specifically, we focus on finger movements with consistent movement. Let the observed data be where is now the simulated velocity at time and all other data match the descriptions from section 5. In particular, consider the generating model where for all except and and and are the estimated effects from one of the scenarios. In addition, is a zero-mean error process with isotropic covariance function described by

Here is related to the dominant sources of dependence; means that the responses are uncorrelated while large reflects higher degree of autocorrelation between the responses. In addition, controls the strength of correlation between any two measurements. The choice of is driven by the parameter , the correlation between any two consecutive measurements, using the relationship .

The simulation study varies over three factors. The first factor we examine is the dominant sources of dependence determined by

  • Dominant white noise.

  • Equal process.

  • Dominant dependent process.

The second factor is the strength of the correlation between successive measurements.

  • Low correlation.

  • High correlation.

The third factor is the magnitude of noise variance

  • Small noise. corresponds to the situation of having smaller variance than that of the original data.

  • Large noise. corresponds to the case of equal variance.

Calculate signal-to-noise ratio (SNR) as SNR = where In particular, we consider SNR = {875, 445, 80, 8, 4, 0.8}. The results are based on 100 independent samples for each combination of the simulation settings.

Table 2 illustrates the numerical performance corresponding to the simulated kinematic data for different SNRs. The results are consistent to the findings of finger movements in Section 5. As before, we observe that the mean model size of the second stage of the proposed is nearer to the truth (i.e. 2) than that of the first stage; see the “SP” column under SAFE-gLASSO in Table 2. This is expected due to the fact that by adopting the two-stage variable selection scheme, we shrink the surplus variables in the second stage of the procedure. The proposed approach exhibits lower FPRs than that of the agLASSO, LAD-gLASSO, and FAR-gSCAD across all simulation settings. Similar to the other approaches, SAFE-gLASSO also attains high TPRs.

SAFE-gLASSO exhibited superior prediction performance relative to the competing methods across all simulation settings; see Figure 8. These findings are in agreement with the results of Section 5. As expected the prediction accuracy improves with the SNR; compare the box-plots corresponding to the SAFE-gLASSO for different SNRs in Figure 8. Unlike the proposed approach, the numerical performance of the alternative methods suffers due to not considering positing-varying smooth coefficients as the methods fail to quantify the systematic relationship between the velocity and muscle activities of a biomechanical model. We report the additional simulation results in the Supplementary Materials, Section C.

agLASSO LAD-gLASSO FAR-gSCAD SAFE-gLASSO Setting SNR SP TPR FPR SP TPR FPR SP TPR FPR SP TPR FPR 875 87 [2.09] 100 1 66 [5.41] 100 24 81 [3.00] 100 7 88 [2.00][2.00] 100 [100] 0 445 86 [2.18] 100 1 66 [5.42] 100 24 81 [3.00] 100 7 88 [2.00][2.00] 100 [100] 0 [0] 80 87 [2.08] 99 1 68 [5.09] 100 22 81 [2.98] 100 7 88 [2.00][2.00] 100 [100] 0 [0] 8 88 [2.00] 100 0 74 [4.22] 100 15 81 [2.97] 100 7 88 [2.00][2.00] 100 [100] 0 [0] 4 87 [2.03] 100 0 71 [4.58] 100 18 81 [2.91] 100 7 88 [2.00][2.00] 100 [100] 0 [0] 0.8 87 [2.16] 98 2 69 [5.00] 100 21 82 [2.96] 100 7 87 [2.02][2.27] 95 [99] 1 [2]

Table 2: Analysis of finger movements with fixed motion. Data is generated assuming noise variance C1 and C2 with different dominant processes (A1-A3) for high correlation coefficient (B2). Reported are the SPs (%), model size (in square brackets), TPRs (%), and FPRs (%) averaged over 100 simulations. Results with superscript correspond to the first stage of SAFE-gLASSO.
Figure 8: Comparison of MSEs based on competing approaches for high (left) and low (right) SNRs with high correlation coefficient (B2). Simulated data corresponds to the consistent finger movements. Reported are the results based on 100 simulations.

6.2 Numerical experiment

Next we consider another simulation study where data are generated from a purely mathematical perspective. The simulated data is where is a scalar response at an instance is a scalar covariate observed at , ’s are the realizations of the th functional predictor such that . The 10 functional predictors are generated similar to Gertheiss et al. (2013) and Pannu and Billor (2017). The nonzero functional coefficients ’s are varying over and defined as and . Note that controls the strength of functional coefficients over The error variance is calculated based on different SNRs; i.e., SNR = More details about the functional predictors, functional coefficients, and simulation results are summarized in the Supplementary Materials, Section D.

As expected, when meaning the functional coefficient should have only one dimension, , all competitors showed comparable performance across all metrics, having high TPR and similar MSE. All methods except LAD-gLASSO had low FPR, which consistently selected 5 of the 10 possible variables. As the difficulty of the model increases, say for the competitors perform much worse than SAFE-gLASSO. FAR-gSCAD and agLASSO have low FPR, but tend to select only 1 of the 2 important variables, while LAD-gLASSO has high TPR but also high FPR, similar to the scenario. As expected, the MSE performance also deteriorates for the competitors as increases.

We also investigated the coverage probabilities and length of the prediction intervals for the SAFE-gLASSO models. Denote the lower and upper bound of the in-sample prediction interval by

, which is constructed following rank-one-out (ROO) split conformal prediction inference (Lei et al., 2017). Define the pointwise average coverage probability by where is the total number of in-sample observations. Calculate the expected length of the interval by Let index the instance at which new information of the predictor is recorded; where and M is the total number of future observations. Define the pointwise average coverage probability for the future observations by where is the prediction intervals computed using the split conformal prediction inference (Lei et al., 2017). In our simulation, we consider

Table 3 demonstrates the predictive inference of the proposed approach in terms of the actual coverage and expected length of the interval for both the in-sample () and future () observations. In general, the average coverage stays around the nominal levels irrespective of the complexity of the model as defined by see the results for and As expected, the width of the interval increases as departs from 0 but reduces for larger SNR; see the average length of the interval for at miscoverage level for We also observe the similar phenomenon with the prediction intervals for the future observations.

SNR = 0.5 SNR = 5 C = 0 C = 5 C = 10 C = 0 C = 5 C = 10 0.01 0.990 (0.010) [1.169] 0.990 (0.010) [5.202] 0.991 (0.010) [9.572] 0.990 (0.010) [0.367] 0.990 (0.010) [2.626] 0.990 (0.010) [5.120] 0.05 0.949 (0.022) [0.875] 0.950 (0.022) [3.870] 0.950 (0.022) [7.123] 0.949 (0.022) [0.277] 0.950 (0.022) [1.848] 0.950 (0.022) [3.586] 0.10 0.900 (0.030) [0.735] 0.900 (0.030) [3.234] 0.900 (0.030) [5.971] 0.900 (0.030) [0.232] 0.900 (0.030) [1.489] 0.900 (0.030) [2.874] 0.20 0.800 (0.040) [0.572] 0.799 (0.040) [2.504] 0.800 (0.040) [4.653] 0.800 (0.040) [0.181] 0.800 (0.040) [1.115] 0.799 (0.040) [2.136] 0.01 0.992 (0.009) [1.168] 0.990 (0.010) [5.209] 0.992 (0.009) [9.657] 0.990 (0.010) [0.366] 0.989 (0.011) [2.598] 0.989 (0.011) [5.062] 0.05 0.949 (0.022) [0.871] 0.948 (0.022) [3.860] 0.951 (0.022) [7.162] 0.948 (0.022) [0.277] 0.951 (0.022) [1.841] 0.951 (0.022) [3.568] 0.10 0.898 (0.030) [0.732] 0.900 (0.030) [3.228] 0.904 (0.029) [5.992] 0.898 (0.030) [0.232] 0.902 (0.030) [1.484] 0.901 (0.030) [2.860] 0.20 0.802 (0.040) [0.571] 0.802 (0.040) [2.817] 0.806 (0.040) [4.667] 0.798 (0.040) [0.182] 0.803 (0.040) [1.115] 0.803 (0.040) [2.136]

Table 3: Average pointwise coverage of 80%, 90%, 95%, and 99% prediction bands for the in-sample () and future () observations using ROO and split conformal prediction inference, respectively; standard errors (in parenthesis) and average lengths (in square bracket) of prediction bands are reported. Results are based on 100 simulations.

7 Discussion

In this paper, we proposed a covariate-dependent, scalar-on-function regression model that appropriately accounts for the biomechanical processes involved in hand movement, such as passive forces and physical constraints. The functional predictors were the recent past behavior of EMG signals measured across multiple muscles in the subject’s forearm and the responses were finger and wrist velocity. The functional coefficients for each EMG signal were allowed to vary based on the current finger or wrist position. The bivariate coefficients were then approximated using a tensor product of rich basis expansions that were then estimated with a combined multi-dimensional smoothing and sparseness penalty, which is an extension of Gertheiss et al. (2013). We developed a two-step variable selection procedure, called Sequential Adaptive Functional Empirical group LASSO (SAFE-gLASSO), that was shown through numerical investigations to have superior performance over standard selection approaches (Gertheiss et al., 2013; Pannu and Billor, 2017; Fan et al., 2015) by reducing the number of false positives irrespective of model complexity.

The results of the data application showed SAFE-gLASSO was able to identify the important EMG signals for finger and wrist movement for an AB subject. Furthermore, the estimated varying functional coefficients were relatively sparse, easy to interpret, and had exceptional predictive performance compared to standard selection approaches. Our model and fitting algorithm have great potential to outperform current state-of-the-art data driven methods for prosthesis control such as pattern recognition because they ignore biomechanical constraints, do not perform variable selection, and are prone to overfitting. The variable selection results from SAFE-gLASSO could also be incorporated in the method by Crouch and Huang (2016), which currently does not perform variable selection and uses a planar link-segment dynamic model. Although our model mathematically differs from theirs, both models account for the biomechanical system with enough similarity that our variable selection results still apply. Our model does have the advantage in that, after training our model, there is minimal data processing required to produce predictions in the prosthetic limb, reducing the burden of real-time data collection.

There are many opportunities for extensions and new applications of the approaches taken in this paper. The two-step fitting approach used in SAFE-gLASSO easily applies to the more common univariate functional coefficients. We also discussed an approach to assess the predictive ability based on data splitting, which as far as the authors are aware has not been applied to functional regression. Additionally, our model, which uses a tensor product basis, can also be applied in a more general setting with many functional predictors having functional coefficients varying over multiple covariates. This would be of interest for this data application if the coefficients were considered to vary significantly across postures.

In our current developments, the functional covariates were assumed to be observed without error on a fine grid of points. Extensions to a situation where the functional measurements are perturbed by error or when the grid points ’s are sparse would require a preliminary smoothing of the functional covariates using existing approaches Yao et al. (2005); Xiao et al. (2016) and then SAFE-gLASSO may be employed on the smoothed functional covariates. In a preliminary investigation, not reported here, we observed that the variable selection performance of SAFE-gLASSO is unaffected due to the use of estimated smooth profiles in the place of true functional predictors. However prediction error of the responses is affected by large noise variance of the functional predictors, as would any method.

As in non-parametric regressions, our method relies on the tensor products of basis functions; as a result, the number of parameters to estimate can explode very quickly. Even though our method can tackle the dimensionality issue, it becomes computationally intensive with large number of basis functions. Note that we adopt CV based approach in selecting the tuning parameters for the data application. We also acknowledge that there are other methods to select optimum tuning parameters and CV based approach may not be theoretically the best approach (Gertheiss et al., 2013). While the method works well for our data application, future research is needed to investigate the optimality in selecting the tuning parameters. In addition, we focused on the development of a subject-specific modeling procedure and were not concerned with estimating subject-specific variability. By accounting for this source of variability, we can develop highly functional, user-specific robotic prosthetic limb that perform well across multiple subjects through estimation of population parameters.

Acknowledgments

The project described was supported by Division of Information and Intelligent System, National Science Foundation through award number 1527202. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NSF. A.-M. Staicu’s research was supported by NSF Grant number DMS1454942.

References

  • Bandyopadhyay and Maity (2017) Bandyopadhyay, S. and Maity, A. (2017). Asymptotic theory for varying coefficient regression models with dependent data. Annals of the Institute of Statistical Mathematics, pages 1–15.
  • Biddiss and Chau (2007) Biddiss, E. and Chau, T. (2007). Upper-limb prosthetics: critical factors in device abandonment. American Journal of Physical Medicine and Rehabilitation, 86:977–987.
  • Butterworth (1930) Butterworth, S. (1930). On the theory of filter amplifiers. Wireless Engineer, 7:536–541.
  • Cardot et al. (2003) Cardot, H., Ferraty, F., and Sarda, P. (2003). Spline estimators for the functional linear model. Statistica Sinica, 13:571–591.
  • Cardot and Sarda (2008) Cardot, H. and Sarda, P. (2008).

    Varying-coefficient functional linear regression models.

    Communications in Statistics—Theory and Methods, 37:3186–3203.
  • Ciuperca (2016) Ciuperca, G. (2016).

    Adaptive group lasso selection in quantile models.

    Statistical Papers, pages https://doi.org/10.1007/s00362–016–0832–1.
  • Crouch and Huang (2016) Crouch, D. L. and Huang, H. (2016). Lumped-parameter electromyogram-driven musculoskeletal hand model: A potential platform for real-time prosthesis control. Journal of Biomechanics, 49:3901–3907.
  • Crouch and Huang (2017) Crouch, D. L. and Huang, H. (2017). Musculoskeletal model-based control interface mimics physiologic hand dynamics during path tracing task. Journal of Neural Engineering, 14(3):036008.
  • Davenport (2013) Davenport, C. A. (2013). Semiparametric Regression Models for Interacting Covariates. PhD thesis, North Carolina State University, Raleigh, NC.
  • Davenport et al. (2015) Davenport, C. A., Maity, A., and Wu, Y. (2015). Parametrically guided estimation in nonparametric varying coefficient models with quasi-likelihood. Journal of Nonparametric Statistics, 27:195–213.
  • Delp et al. (2007) Delp, S. L., Anderson, F. C., Arnold, A. S., Loan, P., Habib, A., John, C. T., Guendelman, E., and Thelen, D. G. (2007). Opensim: open-source software to create and analyze dynamic simulations of movement. IEEE transactions on biomedical engineering, 54:1940–1950.
  • Eilers and Marx (2003) Eilers, P. H. and Marx, B. D. (2003). Multivariate calibration with temperature interaction using two-dimensional penalized signal regression. Chemometrics and Intelligent Laboratory Systems, 66:159–174.
  • Englehart and Hudgins (2003) Englehart, K. and Hudgins, B. (2003). A robust, real-time control scheme for multifunction myoelectric control. IEEE Transactions on Biomedical Engineering, 50(7):848–854.
  • Fan et al. (2014) Fan, J., Ma, Y., and Dai, W. (2014). Nonparametric independence screening in sparse ultra-high-dimensional varying coefficient models. Journal of the American Statistical Association, 109:1270–1284.
  • Fan and Zhang (2008) Fan, J. and Zhang, W. (2008). Statistical methods with varying coefficient models. Statistics and Its Interface, 1:179–195.
  • Fan et al. (2015) Fan, Y., James, G. M., Radchenko, P., et al. (2015). Functional additive regression. The Annals of Statistics, 43:2296–2325.
  • Ferraty et al. (2012) Ferraty, F., González-Manteiga, W., Martínez-Calvo, A., and Vieu, P. (2012). Presmoothing in functional linear regression. Statistica Sinica, 22:69–94.
  • Fithian et al. (2014) Fithian, W., Sun, D., and Taylor, J. (2014). Optimal inference after model selection. arXiv preprint arXiv:1410.2597.
  • Friedman et al. (2001) Friedman, J., Hastie, T., and Tibshirani, R. (2001). The Elements of Statistical Learning. Springer, New York, NY.
  • Gertheiss et al. (2013) Gertheiss, J., Maity, A., and Staicu, A.-M. (2013). Variable selection in generalized functional linear models. Stat, 2:86–101.
  • Goldsmith et al. (2011) Goldsmith, J., Crainiceanu, C. M., Caffo, B. S., and Reich, D. S. (2011).

    Penalized functional regression analysis of white-matter tract profiles in multiple sclerosis.

    NeuroImage, 57:431–439.
  • Guo et al. (2015) Guo, P., Zeng, F., Hu, X., Zhang, D., Zhu, S., Deng, Y., and Hao, Y. (2015). Improved variable selection algorithm using a lasso-type penalty, with an application to assessing hepatitis b infection relevant factors in community residents. PLoS One, 10:e0134151.
  • Holzbaur et al. (2005) Holzbaur, K. R., Murray, W. M., and Delp, S. L. (2005). A model of the upper extremity for simulating musculoskeletal surgery and analyzing neuromuscular control. Annals of Biomedical Engineering, 33:829–840.
  • Huang et al. (2008) Huang, H., Zhou, P., Li, G., and Kuiken, T. A. (2008). An analysis of emg electrode configuration for targeted muscle reinnervation based neural machine interface. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 16(1):37–45.
  • Ivanoff et al. (2016) Ivanoff, S., Picard, F., and Rivoirard, V. (2016). Adaptive lasso and group-lasso for functional poisson regression.

    Journal of Machine Learning Research

    , 17:1–46.
  • James et al. (2009) James, G. M., Wang, J., and Zhu, J. (2009). Functional linear regression that’s interpretable. The Annals of Statistics, 37(5A):2083–2108.
  • Kawashima and Mita (2016) Kawashima, N. and Mita, T. (2016). Psychophysical evaluation of the capability for phantom limb movement in forearm amputees. PLoS One, 11:e0156349.
  • Krstajic et al. (2014) Krstajic, D., Buturovic, L. J., Leahy, D. E., and Thomas, S. (2014). Cross-validation pitfalls when selecting and assessing regression and classification models. Journal of Cheminformatics, 6(10):https://doi.org/10.1186/1758–2946–6–10.
  • Lee et al. (2016) Lee, J. D., Sun, D. L., Sun, Y., Taylor, J. E., et al. (2016). Exact post-selection inference, with application to the lasso. The Annals of Statistics, 44:907–927.
  • Leeb et al. (2015) Leeb, H., Pötscher, B. M., Ewald, K., et al. (2015). On various confidence intervals post-model-selection. Statistical Science, 30:216–227.
  • Lei et al. (2017) Lei, J., G’Sell, M., Rinaldo, A., Tibshirani, R. J., and Wasserman, L. (2017). Distribution-free predictive inference for regression. Journal of the American Statistical Association, page 10.1080/01621459.2017.1307116.
  • Lei et al. (2015) Lei, J., Rinaldo, A., and Wasserman, L. (2015). A conformal prediction approach to explore functional data.

    Annals of Mathematics and Artificial Intelligence

    , 74(1-2):29–43.
  • Linda Resnik et al. (2011) Linda Resnik, P. et al. (2011). Development and testing of new upper-limb prosthetic devices: research designs for usability testing. Journal of Rehabilitation Research and Development, 48(6):697–706.
  • Maity and Huang (2012) Maity, A. and Huang, J. Z. (2012). Partially linear varying coefficient models stratified by a functional covariate. Statistics and Probability Letters, 82:1807–1814.
  • Matsui and Konishi (2011) Matsui, H. and Konishi, S. (2011). Variable selection for functional regression models via the l1 regularization. Computational Statistics & Data Analysis, 55:3304–3310.
  • McLean et al. (2014) McLean, M. W., Hooker, G., Staicu, A.-M., Scheipl, F., and Ruppert, D. (2014). Functional generalized additive models. Journal of Computational and Graphical Statistics, 23:249–269.
  • Meier (2009) Meier, L. (2009). grplasso: Fitting user specified models with group lasso penalty. R Foundation for Statistical Computing.
  • Meier et al. (2008) Meier, L., Van De Geer, S., and Bühlmann, P. (2008).

    The group lasso for logistic regression.

    Journal of the Royal Statistical Society: Series B (Statistical Methodology), 70:53–71.
  • Meier et al. (2009) Meier, L., Van de Geer, S., Bühlmann, P., et al. (2009). High-dimensional additive modeling. The Annals of Statistics, 37:3779–3821.
  • Meinshausen (2007) Meinshausen, N. (2007). Relaxed lasso. Computational Statistics & Data Analysis, 52:374–393.
  • Meinshausen et al. (2009) Meinshausen, N., Meier, L., and Bühlmann, P. (2009). P-values for high-dimensional regression. Journal of the American Statistical Association, 104:1671–1681.
  • Pannu and Billor (2017) Pannu, J. and Billor, N. (2017). Robust group-lasso for functional regression model. Communications in Statistics-Simulation and Computation, 46:3356–3374.
  • Peerdeman et al. (2011) Peerdeman, B., Boere, D., Witteveen, H., Hermens, H., Stramigioli, S., Rietman, J., Veltink, P., Misra, S., et al. (2011). Myoelectric forearm prostheses: State of the art from a user-centered perspective. Journal of Rehabilitation Research and Development, 48(6):719–737.
  • Ramsay and Silverman (2005) Ramsay, J. and Silverman, B. (2005). Functional Data Analysis. Springer, New York, NY, 2 edition.
  • Resnik et al. (2011) Resnik, L., Etter, K., Klinger, S. L., and Kambe, C. (2011). Using virtual reality environment to facilitate training with advanced upper-limb prosthesis. Journal of Rehabilitation Research and Development, 48(6):707–718.
  • Roberts et al. (2017) Roberts, D. R., Bahn, V., Ciuti, S., Boyce, M. S., Elith, J., Guillera-Arroita, G., Hauenstein, S., Lahoz-Monfort, J. J., Schröder, B., Thuiller, W., et al. (2017). Cross-validation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure. Ecography, 40(8):913–929.
  • Ruppert et al. (2003) Ruppert, D., Wand, M. P., and Carroll, R. J. (2003). Semiparametric regression. Number 12. Cambridge university press.
  • Scheipl et al. (2015) Scheipl, F., Staicu, A.-M., and Greven, S. (2015). Functional additive mixed models. Journal of Computational and Graphical Statistics, 24:477–501.
  • Scheme and Englehart (2011) Scheme, E. and Englehart, K. (2011). Electromyogram pattern recognition for control of powered upper-limb prostheses: State of the art and challenges for clinical use. Journal of Rehabilitation Research and Development, 48(6):643–659.
  • Scheme et al. (2010) Scheme, E., Fougner, A., Stavdahl, O., Chan, A. C., and Englehart, K. (2010). Examining the adverse effects of limb position on pattern recognition based myoelectric control. In 2010 Annual International Conference of the IEEE Engineering in Medicine and Biology, pages 6337–6340. IEEE.
  • Sherwood et al. (2017) Sherwood, B., Maidman, A., Sherwood, M. B., and ByteCompile, T. (2017). Package ‘rqpen’. R Foundation for Statistical Computing.
  • Tibshirani and Johnstone (2014) Tibshirani, R. and Johnstone, I. (2014). A significance test for the lasso. Annals of Statistics, 42(2):413–468.
  • Tutz and Gertheiss (2010) Tutz, G. and Gertheiss, J. (2010). Feature extraction in signal regression: a boosting technique for functional data regression. Journal of Computational and Graphical Statistics, 19:154–174.
  • Wasserman and Roeder (2009) Wasserman, L. and Roeder, K. (2009). High dimensional variable selection. Annals of Statistics, 37(5A):2178–2201.
  • Wei and Huang (2010) Wei, F. and Huang, J. (2010). Consistent group selection in high-dimensional linear regression. Bernoulli, 16:1369.
  • Wood (2006) Wood, S. N. (2006). Low-rank scale-invariant tensor product smooths for generalized additive mixed models. Biometrics, 62:1025–1036.
  • Wood (2017) Wood, S. N. (2017). Generalized additive models: an introduction with R. CRC press.
  • Wu et al. (2009) Wu, T. T., Chen, Y. F., Hastie, T., Sobel, E., and Lange, K. (2009). Genome-wide association analysis by lasso penalized logistic regression. Bioinformatics, 25:714–721.
  • Wu et al. (2010) Wu, Y., Fan, J., Müller, H.-G., et al. (2010). Varying-coefficient functional linear regression. Bernoulli, 16:730–758.
  • Xiao et al. (2016) Xiao, L., Zipunnikov, V., Ruppert, D., and Crainiceanu, C. (2016). Fast covariance estimation for high-dimensional functional data. Statistics and Computing, 26:409–421.
  • Yang and Zou (2013) Yang, Y. and Zou, H. (2013). gglasso: group lasso penalized learning using a unified bmd algorithm. R Foundation for Statistical Computing.
  • Yang and Zou (2015) Yang, Y. and Zou, H. (2015). A fast unified algorithm for solving group-lasso penalize learning problems. Statistics and Computing, 25:1129–1141.
  • Yao et al. (2005) Yao, F., Müller, H.-G., and Wang, J.-L. (2005). Functional data analysis for sparse longitudinal data. Journal of the American Statistical Association, 100:577–590.
  • Yuan and Lin (2006) Yuan, M. and Lin, Y. (2006). Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68:49–67.
  • Zhao et al. (2017) Zhao, S., Shojaie, A., and Witten, D. (2017). In defense of the indefensible: A very naive approach to high-dimensional inference. arXiv preprint arXiv:1705.05543.
  • Zhou et al. (2007) Zhou, P., Lowery, M. M., Englehart, K. B., Huang, H., Li, G., Hargrove, L., Dewald, J. P. A., and Kuiken, T. A. (2007). Decoding a new neural-machine interface for control of artificial limbs. Journal of Neuralphysiology, 98:2974–2982.
  • Ziegler-Graham et al. (2008) Ziegler-Graham, K., MacKenzie, E. J., Ephraim, P. L., Travison, T. G., and Brookmeyer, R. (2008). Estimating the prevalence of limb loss in the united states: 2005 to 2050. Archives of Physical Medicine and Rehabilitation, 89:422–429.
  • Zou (2006) Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101:1418–1429.

Supplementary Material

This Supplementary Material consists of five sections. Section A provides details for the fit without sparse penalty. Section B presents additional analysis results for the kinematic data. Section C and Section D provide additional simulation results. Section E details the algorithm of the post-selection predictive inference.

Appendix A Generalized additive model

Let the observed data, in general, be where all terms bear the usual meaning as before and are described in the manuscript. Write the penalized criterion for the model (4.3) with the smoothing penalty only in matrix form as

where is a vector with elements ’s, is an matrix having the th row , is a vector of parameters, and Following Ruppert et al. (2003); Wood (2017), taking the derivative with respect to yields we obtain the initial estimates of the regression coefficients as ’s for a given . The corresponding Bayesian posterior covariance matrix is where is estimated from the residual sum of squares. Predict the responses as

The and are unknown in practice; one approach to select the optimal tuning parameters is block CV. The post-selection model (4.7) is fitted using the above intuition but with the reduced subset

Appendix B Additional results of EMG selection for movement data

In this section, we present more results of the analysis of the EMG and movement data.

b.1 Data restructuring of EMG signals

Our idea is to characterize the velocity at an instance as a function of the recent past EMG signals. Figure 9 displays visually the data reconstruction of functional predictors using the recent past EMG signals.

Figure 9: Visualization of data restructuring of EMG signals and joint finger angles at 0.9 second and 2.3 second. Time at current position (blue dot on black line) used to extract concurrent and previous values of the two EMG signals, shown in red and green in (a). The time domain for each set of past EMG signal measurements is rescaled to Reconstructed EMG curves are plotted in gray lines on the top-right and bottom-right panels, and two curves corresponding to 0.9 second and 2.3 second are highlighted in (b) and (c) for EMG-7 (green solid line) and EMG-12 (red solid line) respectively.

b.2 Variation in EMG curves

We use functional principal component analysis technique to examine the main sources of variability in the curves. Figure

10 illustrates the EMG curves ’s associated with a muscle (flexor digitorum) in the forearm that contributes to finger flexion. The first three functional principal components (FPCs) of the estimated marginal covariance of ’s are also plotted. We observe one key feature that solely explains the majority of variation in the curves associated with the EMG signal; see the first FPC in Figure 10.

Figure 10: Restructured EMG curves corresponding to the extensor digitorum

muscle and the first three eigenfunctions from left to right.

b.3 Variable selection for wrist movements

Table 4 shows the results of the model selection performance for the wrist movements with consistent and varying patterns. In most cases, the competing methods select one muscle for wrist extension and another for flexion. As described in the manuscript, there are many potential muscles that contribute to similar wrist movements. Therefore selection between the alike muscles is desired to reduce the redundancy of EMG information. Notice both agLASSO and SAFE-gLASSO attain optimal RSPs. In contrast FAR-gSCAD shows suboptimal RSPs and LAD-gLASSO performs poorly in selecting desired variables; follow the column “RSP” for competing methods in Table 4. While both these methods have tendency to select noise variables in addition to the true positives, the problem in the selection by LAD-gLASSO is severe; follow the column “FPR” in Table 4.

agLASSO LAD-gLASSO FAR-gSCAD SAFE-gLASSO Pattern RSP TPR FPR RSP TPR FPR RSP TPR FPR RSP TPR FPR 1.00 [2] 100 [2] 0 [0] 0.57 [8] 100 [6]