Researchers are often interested in using longitudinal data to estimate the causal effects of hypothetical time-varying treatment interventions (equivalently, strategies or rules) on the mean or risk of a future outcome in a study population. Standard regression or conditioning methods for confounding control generally fail to recover such causal effects when time-varying confounders are themselves affected by past treatment (Robins and Hernán, 2009). For example, in studies of the effect of different time-varying antiretroviral treatment strategies on long-term mortality risk in HIV-infected patients, CD4 cell count is a time-varying confounder; i.e., CD4 cell count at follow-up time $k$ is a risk factor for both future treatment initiation and future mortality. In addition, CD4 cell count at follow-up time $k$ is, itself, affected by whether treatment has been previously initiated. As another example, in studies of the effect of different time-varying interventions on daily minutes of physical activity on long-term coronary heart disease (CHD) risk in a healthy population, body mass index (BMI) is a time-varying confounder; i.e., BMI at follow-up time $k$ is a risk factor for future inactivity and future CHD. Also, BMI at time $k$ is, itself, affected by an individual's past physical activity.
In such settings, alternative estimators derived from Robins's g-formula may recover effects of time-varying treatment interventions under untestable assumptions, including that sufficient covariates are measured to control confounding by unmeasured risk factors (Robins, 1986). The package gfoRmula implements in R one such estimator: the parametric g-formula, also known as parametric g-computation or the plug-in g-formula (Hernán and Robins, 2018). In general, the g-formula characterized by a user-specified treatment intervention is a high-dimensional sum or integral over all observed treatment and confounder histories. This sum is over a function of (i) the observed outcome mean (or, in survival settings, time-varying hazards) conditional on each observed treatment and confounder history, (ii) the observed joint distribution of the confounders at each time $k$ conditional on each observed treatment and confounder history, and (iii) an intervention density defining the time-varying user-specified treatment rule. The parametric g-formula estimates this function in realistic high-dimensional settings via a Monte Carlo simulation that relies on consistent estimates of the quantities (i) and (ii), as well as the user-specified intervention density.
The gfoRmula package accommodates general user-specified time-varying treatment interventions which may be static or dynamic (Hernán et al., 2006; Orellana et al., 2010a, b; van der Laan et al., 2005; Murphy et al., 2001; Cain et al., 2010; Young et al., 2011) and, further, deterministic or random (Díaz Muñoz and van der Laan, 2012; Haneuse and Rotnitzky, 2013; Young et al., 2014, 2018; Kennedy, 2019). The algorithm implemented by the package can additionally estimate the extended g-formula of Robins et al. (2004) which, under a stronger no unmeasured confounding assumption, may identify effects of treatment assignment rules that depend on the natural value of treatment at $k$; that is, the value of treatment that would have been observed at $k$ were the intervention discontinued right before $k$ (Richardson and Robins, 2013). In addition to the quantities (i) and (ii), the parametric extended g-formula relies on an estimate of the observed distribution of treatment at each time $k$ conditional on past treatment and confounder history.
The package also allows: 1) binary (e.g. treat versus do not treat) or continuous/multi-level (e.g. dose, daily minutes of physical activity) time-varying treatments; 2) different types of outcomes (survival or continuous/binary end of follow-up); 3) data with competing events (survival outcomes) or truncation by death (end of follow-up outcomes) and loss to follow-up and other types of censoring events; 4) different options for handling competing events in the case of survival outcomes; 5) a random measurement/visit process; 6) joint interventions on multiple treatments; and 7) general incorporation of a priori knowledge of the data structure. The gfoRmula R package adapts many of the capabilities of the GFORMULA SAS macro to implement the parametric g-formula (Logan et al., 2016). However, unlike the SAS macro, the gfoRmula R package more easily allows users to incorporate their own helper functions for estimating (i) and (ii) beyond automated options included within the package.
The structure of the article is as follows: In Section 2, we review the longitudinal data structure of interest, the g-formula and identifying assumptions that give equivalence between this function and a counterfactual mean or risk under a user-specified time-varying treatment strategy. In Section 3, we review the parametric g-formula estimation algorithm. In Section 4, we describe input data set requirements and core features of the package. In Section 5, we give various code examples. In Section 6, we describe additional, more advanced package features. Finally, in Section 7, we provide a discussion.
2.1 Observed data structure
We consider a longitudinal study where measurements of treatment(s) and covariates are regularly updated on each of $n$ subjects over a specified follow-up period. Let $k = 0, 1, \ldots, K + 1$ denote fixed follow-up intervals (e.g. months), with baseline measurements taken in interval $k = 0$ and interval $K + 1$ corresponding to the specified end of follow-up (e.g. 60 months).
In each interval $k = 0, \ldots, K$, assume the following are measured: Let $A_k$ be a treatment variable (or vector of treatment variables) measured in interval $k$ (e.g. an indicator of antiretroviral treatment initiation by interval $k$; minutes of physical activity in interval $k$) and $L_k$ a vector of time-varying covariates (e.g., CD4 cell count, BMI) assumed to precede $A_k$, with $L_0$ possibly including time-fixed baseline covariates (e.g., race, baseline age). In clinical cohorts, including studies based on electronic medical records, $L_k$ may be defined in terms of last measured values of a covariate relative to interval $k$. For example, in a clinical cohort of HIV-infected patients, CD4 cell count is not measured every month; it is only measured in a month when the patient comes for a visit. In this case, $L_k$ may be defined in terms of the last measured value of CD4 cell count relative to month $k$. Further, $L_k$ may contain an indicator of whether that last measured value is current (i.e. whether a visit at $k$ has occurred) (Hernán et al., 2009).
Let $Y$ denote the outcome of interest. $Y$ may correspond to either a fixed (continuous or binary) end of follow-up outcome in interval $K + 1$ (e.g. blood pressure in interval $K + 1$; an indicator of obesity in interval $K + 1$) or a survival outcome; i.e., an indicator of failure from an event of interest by $K + 1$ (e.g. an indicator of stroke by $K + 1$). For survival outcomes, we implicitly include in $L_k$ an indicator $Y_k$ of failure from the event of interest by the earlier time $k \leq K$. Past values of outcomes at the end of follow-up may also be components of $L_k$ (e.g. when the outcome is obesity status at $K + 1$, obesity at any $k \leq K$ may be a component of $L_k$). Note that "end of follow-up" may correspond to any user-specified follow-up of interest for which data are available and need not correspond to the administrative end of the study (e.g. the user may specify $K + 1$ as 10-month follow-up even if the administrative end of the study was 60 months).
Let $C_k$ and $D_k$ denote an indicator of censoring and an indicator of a competing event (for a survival outcome) or truncation by death (for an end of follow-up outcome) by interval $k$, respectively. When a survival outcome is not subject to competing events (e.g. all-cause mortality), we define $D_k = 0$ for all $k$. For end of follow-up outcomes, the user must define $D_k$ as an implicit component of $C_k$ (i.e. as a censoring event). For survival outcomes, the user may choose to define $D_k$ as an implicit component of $C_k$ or of $L_k$. Under the latter choice, competing events are not treated as censoring events. See Young et al. (2019), as well as below, regarding how this choice impacts the interpretation of the analysis and estimation algorithms.
We denote the history of a random variable using overbars; for example, $\bar{A}_k = (A_0, \ldots, A_k)$ is the observed treatment history through interval $k$. By notational convention, we set $\bar{A}_{-1}$ and $\bar{L}_{-1}$ to be identically $0$.
2.2 Causal effect definitions
In each interval $k$, we generally define an intervention on $A_k$ that may, at most, depend on past measured variables as a random draw from an intervention density $f^{int}(a_k \mid \bar{a}_{k-1}, \bar{l}_k)$, with $a_k$ a possible realization of $A_k$.
We classify an intervention on $A_k$ as deterministic if $f^{int}(a_k \mid \bar{a}_{k-1}, \bar{l}_k)$ may only equal zero or one for all $a_k$ and $(\bar{a}_{k-1}, \bar{l}_k)$; otherwise we classify it as random. We may further classify the intervention as static if $f^{int}(a_k \mid \bar{a}_{k-1}, \bar{l}_k)$ does not depend on $\bar{l}_k$ for any $k$; otherwise we classify it as dynamic. See Young et al. (2014) and Young et al. (2018), as well as Sections 4.9 and 6.5, for examples.
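The static/dynamic distinction can be illustrated with two hypothetical deterministic rules written as plain R functions (the function names and the CD4 threshold of 350 are illustrative only and not part of the package API):

```r
# A static deterministic rule: always treat, regardless of covariate history.
static_rule <- function(covariate_history) 1

# A dynamic deterministic rule: treat once the current CD4 count drops
# below 350 cells/mm^3 (threshold chosen purely for illustration).
dynamic_rule <- function(covariate_history) {
  current_cd4 <- covariate_history[length(covariate_history)]
  as.numeric(current_cd4 < 350)
}

static_rule(c(600, 500))   # 1 for every history
dynamic_rule(c(600, 300))  # 1: current CD4 below the threshold
dynamic_rule(c(600, 500))  # 0: do not treat yet
```

Both rules are deterministic (the intervention density places all mass on one value); only the second depends on the evolving covariate history, which is what makes it dynamic.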
Beyond interventions of the form $f^{int}(a_k \mid \bar{a}_{k-1}, \bar{l}_k)$, we will also define an intervention on $A_k$ that, in addition to past measured variables, depends on the natural value of treatment at $k$. We define such an intervention as a random draw from an alternative intervention density $f^{int}(a_k \mid \bar{a}_{k-1}, \bar{l}_k, a_k^{nat})$, where $a_k^{nat}$ is any value in the support of $A_k$ which, in the observational study, coincides with the natural value of treatment at $k$ (Richardson and Robins, 2013; Young et al., 2014).
Our goal is to estimate the causal effect on the mean of $Y$ (which corresponds to the risk of the event of interest by $K + 1$ in the case of survival outcomes) had, contrary to fact, we implemented two different hypothetical interventions on $A_k$ at all $k = 0, \ldots, K$ in the study population, where these interventions may be any user-specified choices of either $f^{int}(a_k \mid \bar{a}_{k-1}, \bar{l}_k)$ or $f^{int}(a_k \mid \bar{a}_{k-1}, \bar{l}_k, a_k^{nat})$. Implicit in the definition of all interventions is "abolish censoring throughout follow-up", or "set $C_k = 0$ at all $k$". Thus, how the user chooses to define $C_k$ will impact the interpretation of the estimand (Hernán and Robins, 2018). For survival outcomes, when $D_k$ is defined as an implicit component of $C_k$, this effect is a special case of a direct effect that does not capture any treatment effect on the competing event; otherwise it is a special case of a total effect that may capture these effects (Young et al., 2019).
2.3 Identifying assumptions and the g-formula
Let $g$ be a deterministic regime, strategy, or intervention (static or dynamic) on $\bar{A}_K$ that depends, at most, on the measured covariate and treatment history, characterized by the intervention density $f^{int}(a_k \mid \bar{a}_{k-1}, \bar{l}_k) = I\{a_k = a_k^g\}$ that also eliminates censoring, where $a_k^g$ is a value in the support of $A_k$ and is recursively defined by a function $g_k$ of the measured past, $a_k^g = g_k(\bar{a}^g_{k-1}, \bar{l}_k)$.
Define $Y^g$ and $\bar{L}^g_K$ as the outcome and covariate histories, respectively, for an individual in the study population had, possibly contrary to fact, his/her treatment been assigned according to a deterministic regime $g$. Further, let $\mathcal{G}$ be the set of all deterministic interventions on $\bar{A}_K$ (both static and dynamic). Following Robins (1986), the mean of $Y$ had all subjects been assigned treatment according to an intervention density $f^{int}$ and had censoring been eliminated is equivalent to a weighted average of the means $E[Y^g]$ over $g \in \mathcal{G}$, where each $g$-specific weight is defined in terms of the choice of $f^{int}$. See also the appendix of Young et al. (2014).
We now define three $g$-specific identifying conditions:
Exchangeability: For all $k = 0, \ldots, K$, the counterfactual outcome $Y^g$ (and counterfactual covariate future under $g$) is independent of treatment $A_k$ and censoring $C_{k+1}$ conditional on the measured treatment and covariate history through $k$ consistent with $g$ and on remaining uncensored through $k$.
The exchangeability condition (1) is expected to hold in an experiment where treatment and censoring are physically randomized at each $k$, possibly depending on the treatment and covariate history $(\bar{A}_{k-1}, \bar{L}_k)$. This condition is not, however, guaranteed to hold in an observational study, or in a randomized trial with censoring, and cannot be empirically examined. Exchangeability is sometimes referred to as the assumption of "no unmeasured confounding" given the measured confounder history through $k$.
Consistency: If $\bar{A}_k = \bar{a}^g_k$ and $\bar{C}_{k+1} = 0$, then $\bar{L}_{k+1} = \bar{L}^g_{k+1}$ and $Y_{k+1} = Y^g_{k+1}$.

Positivity: For all $k = 0, \ldots, K$, if a treatment and covariate history $(\bar{a}^g_{k-1}, \bar{l}_k)$ with $\bar{C}_k = 0$ has positive density in the observed data, then there is a positive conditional probability of receiving treatment $a^g_k$ and remaining uncensored through $k + 1$ given that history.
Given the three conditions above hold for all $g \in \mathcal{G}$ such that positivity holds when we replace the observed treatment density with the intervention density $f^{int}$, then the counterfactual mean of $Y$ under $f^{int}$ is equivalent to the g-formula characterized by $f^{int}$ (Robins, 1986). For a deterministic regime $g$, this takes the form
$$\sum_{\bar{l}_K} E[Y \mid \bar{A}_K = \bar{a}^g_K, \bar{L}_K = \bar{l}_K, \bar{C}_{K+1} = 0] \prod_{k=0}^{K} f(l_k \mid \bar{a}^g_{k-1}, \bar{l}_{k-1}, \bar{C}_k = 0), \tag{3}$$
where $f(l_k \mid \bar{a}^g_{k-1}, \bar{l}_{k-1}, \bar{C}_k = 0)$ is the observed joint density of the confounders at $k$ conditional on treatment and covariate history, and remaining uncensored, through $k$. We use a summation symbol in (3) and elsewhere for notational simplicity. However, in general, when $L_k$ contains any continuous components, sums would be replaced with integrals.
When interest is in a survival outcome, expression (3) is the g-formula for risk of the event of interest by $K + 1$ under intervention $g$ which, pulling the implicit prior survival indicator out of $L_k$, can be more explicitly written as:
$$\sum_{k=0}^{K} \sum_{\bar{l}_k} \Pr(Y_{k+1} = 1 \mid Y_k = 0, \bar{C}_{k+1} = 0, \bar{a}^g_k, \bar{l}_k) \prod_{j=0}^{k-1} \Pr(Y_{j+1} = 0 \mid Y_j = 0, \bar{C}_{j+1} = 0, \bar{a}^g_j, \bar{l}_j) \prod_{j=0}^{k} f(l_j \mid \bar{a}^g_{j-1}, \bar{l}_{j-1}, Y_j = 0, \bar{C}_j = 0). \tag{4}$$
As above, when the survival outcome is subject to competing events, we may choose to treat competing events ($D_{k+1}$) as an implicit component of $C_{k+1}$ or of $L_{k+1}$, $k = 0, \ldots, K$. Under the former, expression (4) will not depend on the distribution of competing events. Under the latter, it will depend on this distribution; in this case, we can more explicitly write expression (4) as
$$\sum_{k=0}^{K} \sum_{\bar{l}_k} \Pr(Y_{k+1} = 1 \mid D_{k+1} = Y_k = 0, \bar{C}_{k+1} = 0, \bar{a}^g_k, \bar{l}_k) \Pr(D_{k+1} = 0 \mid Y_k = 0, \bar{C}_{k+1} = 0, \bar{a}^g_k, \bar{l}_k) \prod_{j=0}^{k-1} \Pr(Y_{j+1} = 0 \mid D_{j+1} = Y_j = 0, \bar{C}_{j+1} = 0, \bar{a}^g_j, \bar{l}_j) \Pr(D_{j+1} = 0 \mid Y_j = 0, \bar{C}_{j+1} = 0, \bar{a}^g_j, \bar{l}_j) \prod_{j=0}^{k} f(l_j \mid \bar{a}^g_{j-1}, \bar{l}_{j-1}, Y_j = D_j = 0, \bar{C}_j = 0). \tag{5}$$
See Young et al. (2019) for details.
Conditions required for identification based on only observed variables of the mean/risk of $Y$ under an alternative intervention characterized by $f^{int}(a_k \mid \bar{a}_{k-1}, \bar{l}_k, a_k^{nat})$, which additionally depends on the history of the natural value of treatment, are generally stronger than those defined above. We refer the reader elsewhere for details of these conditions (Richardson and Robins, 2013; Young et al., 2014). Provided these stronger conditions hold, the mean of $Y$ under an intervention characterized by $f^{int}(a_k \mid \bar{a}_{k-1}, \bar{l}_k, a_k^{nat})$ is equivalent to the extended g-formula of Robins et al. (2004) (expression (6)), which additionally weights by $f(a_k \mid \bar{a}_{k-1}, \bar{l}_k, \bar{C}_k = 0)$, the observed treatment density conditional on the measured past, evaluated at possibly realized values $(a_k, \bar{a}_{k-1}, \bar{l}_k)$.
In motivating the general estimation algorithm described in the next section, it is useful to consider a generic version of the g-formula for the mean/risk of the outcome characterized by a generic intervention density $f^{int}_{gen}$, chosen either as some $f^{int}(a_k \mid \bar{a}_{k-1}, \bar{l}_k, a_k^{nat})$, which depends on the natural value of treatment at $k$, or as some $f^{int}(a_k \mid \bar{a}_{k-1}, \bar{l}_k)$, which depends, at most, on the measured past treatment and confounder history. This generic g-formula is algebraically equivalent to (6) when $f^{int}_{gen}$ is chosen as some $f^{int}(a_k \mid \bar{a}_{k-1}, \bar{l}_k, a_k^{nat})$, and algebraically equivalent to (3) when $f^{int}_{gen}$ is alternatively chosen as some $f^{int}(a_k \mid \bar{a}_{k-1}, \bar{l}_k)$ as, in this latter case, the expression does not depend on the observed treatment density (Young et al., 2014).
Similarly, for survival outcomes, we can explicitly write the g-formula for risk by $K + 1$ under generic treatment interventions (expression (8)) when the outcome is subject to competing events and competing events are not treated as censoring events. Following arguments above and in Young et al. (2019), replacing the conditional probability of remaining free of the competing event with 1 for all $k$ and $\bar{l}_k$ would give the g-formula for risk by $K + 1$ under an intervention when competing events are treated as censoring events and, thus, under an intervention that eliminates competing events along with other possible forms of censoring.
3 Estimation algorithm
3.1 Survival outcomes
The following describes the general computational algorithm for estimating the g-formula for risk of the outcome (the event of interest by $K + 1$) under a user-specified intervention. This corresponds to (i) expression (8) when competing events are not treated as censoring events; (ii) a restricted version of (8) that replaces the conditional probability of remaining free of the competing event with 1 for all $k$ and $\bar{l}_k$ when competing events are treated as censoring events (i.e. an implicit component of $C_{k+1}$); and (iii) a restricted version of (8) with $D_{k+1} = 0$ for all $k$ when the outcome is not subject to competing events.
Let $Z_k = (Z_{1,k}, \ldots, Z_{p,k})$ such that, for all $k$, $Z_{j,k}$ is the $j$th component of the vector $L_k$, for a, generally, arbitrary permutation of these components. Note that certain permutations will be preferred when a priori knowledge of the distributions of certain components of $L_k$ conditional on values of other components is known (see Section 6.1).
For a user-chosen value of $K + 1$, treatment rule $g$, and permutation $Z_k$ of $L_k$, we apply the following algorithm to a subject-interval input data set constructed according to the instructions of Section 4.2:
Using all subject-interval records:

If $k > 0$, estimate the conditional densities of each $Z_{j,k}$, $j = 1, \ldots, p$, under a distributional assumption on $Z_{j,k}$ given "history" $(\bar{A}_{k-1}, \bar{Z}_{k-1}, Z_{1,k}, \ldots, Z_{j-1,k})$. This distribution will be used to simulate a value of $Z_{j,k}$ in Step 2 at each $k > 0$. For example, a dichotomous $Z_{j,k}$ will, by default, be generated from a Bernoulli distribution with mean its estimated conditional mean. For $Z_{j,k}$ with many levels, the user can select among several pre-specified distributions (e.g. Normal) or can rely on a user-supplied distribution. The user has the option to estimate the conditional mean of $Z_{j,k}$ for a specified function of the "history" using pre-specified fit options (e.g. generalized linear models with specified link function) or other user-supplied functions (e.g. generalized additive models, classification and regression trees) that produce a fitted object based on one line of data. Syntax for these specifications is described in Section 4.6 via specification of the parameters covtypes and
covparams. These conditional mean models are pooled over time and thus the user must specify the function of time $k$, if any, on which they will depend (see Sections 4.6 and 4.7). When the variance of $Z_{j,k}$ is not a function of the mean under the distributional assumption (e.g. when $Z_{j,k}$ is assumed Normal), the variance is automatically estimated by the model residual mean squared error.
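As a rough sketch of step 1.a in base R (not package code; data and variable names are hypothetical), a pooled-over-time logistic model can be fit for a binary covariate, and a linear model for a covariate assumed Normal, with the residual mean squared error estimating its variance:

```r
set.seed(1)
# Toy pooled person-interval data: t0 is the time index, lag1_Z1 a lagged covariate
d <- data.frame(t0      = rep(0:4, 50),
                lag1_Z1 = rbinom(250, 1, 0.4),
                Z1      = rbinom(250, 1, 0.5),
                Z2      = rnorm(250))

# Binary covariate: one logistic regression pooled over all intervals
fit_z1 <- glm(Z1 ~ lag1_Z1 + t0, family = binomial, data = d)

# Covariate assumed Normal: linear model; variance estimated by the
# residual mean squared error, as described above
fit_z2 <- glm(Z2 ~ Z1 + lag1_Z1 + t0, family = gaussian, data = d)
sigma2 <- sum(residuals(fit_z2)^2) / fit_z2$df.residual

# Step 2 would then simulate, e.g.,
# Z1 ~ Bernoulli(predict(fit_z1, newdata, type = "response"))
# Z2 ~ Normal(predict(fit_z2, newdata), sqrt(sigma2))
```

The key point is that a single model per covariate is fit to all person-intervals at once, with time entering only through whatever function of `t0` the user places in the model statement.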
Estimate the conditional probability (discrete hazard) of the event of interest at each time $k + 1$ conditional on treatment and covariate history and surviving and remaining uncensored to the previous time. These estimates are automatically based on fitting a pooled-over-time logistic regression model with $Y_{k+1}$ the dependent variable, where the user specifies the function of "history" included as independent variables in the fit. See Section 4.8.
If the event of interest is subject to competing events and competing events are not defined as censoring events: Estimate the conditional probability (discrete hazard) of the competing event at each time $k + 1$ conditional on treatment and covariate history and surviving and remaining uncensored to the previous time. These estimates are automatically based on fitting a pooled-over-time logistic regression model with $D_{k+1}$ the dependent variable, where the user specifies the function of "history" and time included as independent variables in the fit. See Section 4.8.
Select a value $n_{sim}$, possibly such that $n_{sim} > n$. If $n_{sim} \neq n$, resample the original $n$ ids $n_{sim}$ times with replacement and create a new data set where each possibly resampled record has a unique id $v = 1, \ldots, n_{sim}$. Using the baseline covariate data from the original observations at baseline (if $n_{sim} = n$) or the resampled observations (if $n_{sim} \neq n$), for $v = 1, \ldots, n_{sim}$ and $k = 0, \ldots, K$, do the following:
If $k = 0$, set $Z_0$ to the observed values of $Z_0$ for id $v$. Otherwise, if $k > 0$, iteratively draw $Z_{1,k}, \ldots, Z_{p,k}$ from estimates of the conditional densities (based on specifications from step 1.a) evaluated at the previously drawn covariates and the previously assigned treatment through $k - 1$ under the user-chosen strategy.
Denote the observed (at $k = 0$) or drawn (at $k > 0$) value(s) of treatment(s) in the vector $Z_k$ as $A_k^{nat}$. Assign $A_k$ according to the user-specified rule, which may or may not depend on $A_k^{nat}$. For example, suppose the rule is defined as the static deterministic strategy "always set treatment to the value 1." In this case, $A_k$ is always set to 1. Another example is "if the natural value of treatment at $k$ is less than 30, set treatment to 30. Otherwise, do not intervene." In this case, the observed/drawn value of $A_k^{nat}$ is examined. If $A_k^{nat} < 30$, then $A_k$ is set to 30. Otherwise, $A_k$ is set to $A_k^{nat}$. A third example is "no intervention on $A_k$", or the so-called natural course intervention. In this case, the intervention density is defined as the observed treatment density and $A_k$ is always set to $A_k^{nat}$. See Sections 4.9 and 6.5 for required syntax for specifying the treatment rules.
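The three example rules above can be sketched as plain R functions acting on the natural value of treatment (illustrative only; Sections 4.9 and 6.5 describe the actual package syntax):

```r
# "Always treat": ignores the natural value of treatment
always_treat <- function(a_nat) 1

# Threshold rule: if the natural value is less than 30, set treatment
# to 30; otherwise do not intervene
threshold_30 <- function(a_nat) ifelse(a_nat < 30, 30, a_nat)

# Natural course: no intervention
natural_course <- function(a_nat) a_nat

threshold_30(25)  # intervened: 30
threshold_30(45)  # left at the natural value: 45
```

In the simulation, such a rule is applied at every interval $k$ after the natural value has been observed (at baseline) or drawn (at later times).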
Estimate the discrete hazard of the event of interest at $k + 1$ conditional on the simulated history, based on specifications from step 1.b. Denote this estimate $\hat{h}_{Y,v}(k)$.
If the event of interest is subject to competing events and competing events are not defined as censoring events: Estimate the discrete hazard of the competing event at $k + 1$ conditional on the simulated history, based on specifications from step 1.c. Denote this estimate $\hat{h}_{D,v}(k)$.
Compute the intervention risk estimate as
$$\frac{1}{n_{sim}} \sum_{v=1}^{n_{sim}} \sum_{k=0}^{K} \hat{h}_{Y,v}(k)\{1 - \hat{h}_{D,v}(k)\} \prod_{j=0}^{k-1} \{1 - \hat{h}_{Y,v}(j)\}\{1 - \hat{h}_{D,v}(j)\}. \tag{9}$$
In the special case where either the event of interest is not subject to competing events or competing events are treated as censoring events, the estimated competing-event hazard is set to 0 for all $v$ and $k$ in (9).
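For intuition, the inner computation in (9) for a single simulated history can be sketched in base R (function and variable names are illustrative, not package code):

```r
# hY[k]: estimated discrete hazard of the event of interest in interval k
# hD[k]: estimated discrete hazard of the competing event in interval k
#        (all zeros when there are no competing events, or when they are
#         treated as censoring events)
risk_by_Kplus1 <- function(hY, hD = rep(0, length(hY))) {
  surv <- cumprod((1 - hY) * (1 - hD))  # survival through each interval
  surv_prev <- c(1, head(surv, -1))     # survival through the previous interval
  sum((1 - hD) * hY * surv_prev)        # cumulative incidence by K + 1
}

# With constant hazard 0.1 and no competing events, the risk over
# 3 intervals equals 1 - 0.9^3 = 0.271
risk_by_Kplus1(rep(0.1, 3))
```

The package's intervention risk estimate then averages this quantity over the $n_{sim}$ simulated histories.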
Steps 2 and 3 above can be repeated for multiple user-defined treatment rules, and risk ratio/risk difference estimates constructed based on a specified reference intervention. The natural course intervention is always automatically implemented and is the default reference unless otherwise specified by the user. Ninety-five percent confidence intervals are computed based on repeating the entire algorithm (Steps 1-3) in $B$ bootstrap samples, based on percentiles of the bootstrap distribution, for user-specified $B$. The user can also opt to additionally output estimates of the risk (risk ratios/risk differences) by all times $k + 1 \leq K + 1$, in addition to the risk by $K + 1$.
In addition to the parametric g-formula estimate of the natural course risk, a nonparametric estimate will also be computed automatically and reported in the output. When censoring events are present in the data, this is a completely unadjusted estimate of the natural course risk (i.e. it relies on marginal exchangeability for censoring) (Young et al., 2011). When competing events are absent or treated as censoring events, this is computed as the complement of a product-limit survival estimator (Kaplan and Meier, 1958) in the observed data. When competing events are present and not treated as censoring events, this is computed using an Aalen-Johansen estimator (Aalen and Johansen, 1978). See Young et al. (2019) and Logan et al. (2016) for additional details.
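For intuition, the complement of the product-limit estimator used for the nonparametric natural course risk (in the absence of competing events) can be sketched in base R (illustrative; the package computes this internally):

```r
# events[k]:  number of events of interest in interval k
# at_risk[k]: number of subjects at risk (event-free and uncensored)
#             entering interval k
km_risk <- function(events, at_risk) {
  1 - prod(1 - events / at_risk)  # complement of the Kaplan-Meier survival estimate
}

# 100 subjects, 10 events in the first interval, then 9 events among the
# 90 remaining at risk: risk = 1 - (0.9 * 0.9) = 0.19
km_risk(events = c(10, 9), at_risk = c(100, 90))
```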
3.2 Fixed end of follow-up outcomes
The algorithm for estimating the g-formula for the mean of an outcome at the end of follow-up under a user-specified intervention is nearly identical to that for the risk in the case of survival outcomes described in the previous section but relies on a slightly different input data structure (see Section 4.2). In the modified algorithm, Step 1.b. is replaced by:
Modified Step 1.b.: Using only records on line $k = K$, estimate the mean of the outcome at $K + 1$ conditional on treatment and covariate history and remaining uncensored to the previous time. This estimate is automatically based on fitting a linear regression model with $Y$ the dependent variable, where the user specifies the function of "history" included as independent variables. Because this fit will only use records on line $K$ of the data, it will not be a function of time. See Section 4.8.
Step 2.c. is replaced by:
Modified Step 2.c.: For $k = K$ only, estimate the conditional mean of the outcome given the simulated history, based on specifications from step 1.b. Denote this estimate $\hat{E}_v$.
Step 3 is replaced by:
Modified Step 3: Compute the intervention mean estimate as $\frac{1}{n_{sim}} \sum_{v=1}^{n_{sim}} \hat{E}_v$.
As discussed above, in this case, if a subject dies prior to $K + 1$, this must be treated as a censoring event (death must be considered a component of $C_{k+1}$, $k = 0, \ldots, K$) and steps 1.c and 2.d are not implemented.
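The modified steps can be sketched in base R with toy data (variable names are illustrative, not package code):

```r
set.seed(2)
K <- 4
# Toy person-interval data; the outcome Y is only used on line k = K
d <- data.frame(id = rep(1:100, each = K + 1),
                t0 = rep(0:K, 100),
                A  = rbinom(500, 1, 0.5),
                L1 = rnorm(500))
d$Y <- ifelse(d$t0 == K, 2 * d$A + d$L1 + rnorm(500), NA)

# Modified step 1.b: linear regression fit to records on line K only,
# so the fit involves no function of time
fit_y <- lm(Y ~ A + L1, data = d[d$t0 == K, ])

# Modified steps 2.c and 3: predict the conditional mean for each
# simulated history and average over simulated subjects (here, the
# observed histories stand in for simulated ones)
mean_est <- mean(predict(fit_y, newdata = d[d$t0 == K, ]))
```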
4 Using the gfoRmula package
4.1 Specifying the outcome type
The gfoRmula package gives the user access to three different
gformula function variants, each intended for a different outcome type.
gformula_survival should be used for survival outcomes (e.g. an indicator of all-cause mortality by $K + 1$);
gformula_continuous_eof should be used for fixed continuous end of follow-up outcomes (e.g. blood pressure at $K + 1$); and
gformula_binary_eof should be used for fixed binary end of follow-up outcomes (e.g. an indicator of high cognitive score at $K + 1$).
4.2 Required structure of the input dataset
The input dataset to the
gformula function must be an R
data.table. For all outcome types, each row of the input dataset should contain one record for each time $k$ (which must be a numeric variable in R, and the name of which is specified by the parameter
time_name) for each subject present at baseline (specified by the parameter
id). The time index must start at 0 and increase in increments of 1. Each column of the data set will index a time-varying covariate $Z_{j,k}$, $j = 1, \ldots, p$. A subject's baseline confounders (the time-fixed components of $L_0$) should be repeated on each line for that subject in the data set. A subject who is censored in interval $k + 1$ will have only $k + 1$ records, with time index $k$ on the last line. Other requirements are dependent on the outcome type, which we describe below. Each subject can have a maximum of $K + 1$ records, with $K + 1$ the end of follow-up of interest.
4.2.1 Survival outcomes
In this case, the data set should additionally contain a column indicating, on line $k$, whether the event of interest occurred by the next interval $k + 1$. A subject who first has the event of interest in interval $k + 1$ will have only $k + 1$ records, with time index $k$ on the last line. If a subject is instead censored in interval $k + 1$, then the user has the choice to code $Y_{k+1}$ on the last line as either
NA or 0. Records coded as
NA will be excluded when fitting the discrete hazard model in step 1.b of the algorithm and in nonparametric estimation of the natural course risk (see Section 3), while records coded as 0 will be included in computing these estimates. This choice impacts how "risk sets" at $k + 1$ are defined. It will make no difference to the estimates when the intervals are made small enough such that there are no failures in intervals where there are also censoring events. Otherwise, this choice may impact effect estimates to some degree. Note that by (i) the fact that the algorithms of the previous section do not rely on an estimate of the distribution of censoring and (ii) the required structure of the input data set such that subjects have no more records after they are censored, the input data set does not actually require any variables defining censoring indicators.
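As a sketch, a minimal survival-outcome input data set satisfying these requirements (hypothetical variable names, three follow-up intervals) might look as follows; it is shown as a base-R data.frame for illustration and would be converted with data.table::as.data.table() before being passed to the gformula functions:

```r
# Subject 1: event of interest in interval 2 -> records with t0 = 0, 1
#            only, outcome 1 on the last line.
# Subject 2: censored in interval 2 -> records with t0 = 0, 1 only,
#            outcome coded NA (or 0) on the last line.
# Subject 3: followed to the end of follow-up (t0 = 0, 1, 2).
obs_data <- data.frame(id = c(1, 1, 2, 2, 3, 3, 3),
                       t0 = c(0, 1, 0, 1, 0, 1, 2),
                       L1 = c(250, 200, 400, 380, 500, 520, 510),
                       A  = c(0, 1, 0, 0, 1, 1, 1),
                       Y  = c(0, 1, 0, NA, 0, 0, 0))
```

Note that no censoring indicator column appears: censoring is encoded implicitly by a subject's records simply stopping early.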
If the event of interest is subject to a competing event for some subjects and some $k$, and the user chooses to treat competing events as censoring events, then $D_{k+1}$ should be treated like any component of $C_{k+1}$ and the coding requirements outlined above must be applied. In turn, in this case, the input data set does not require a variable for $D_{k+1}$. If, alternatively, competing events are not treated as censoring events, then the user must include in the input data set a time-varying indicator of the competing event $D_{k+1}$. If $D_{k+1} = 1$ for a given subject, then that subject will only have $k + 1$ lines in the data, with time index $k$ on the last line and, on that line, $Y_{k+1}$ must be coded
NA. See the data sets
basicdata_nocomp and basicdata included with the package for examples of input data sets for survival outcomes without and with competing events, respectively. The user can specify the parameter
time_points to be a follow-up interval $t \leq K + 1$ in order to compute the risk estimates through $t$ (with $K + 1$ the maximum number of records for an individual in the data set). The default value of
time_points is $K + 1$.
4.2.2 End of follow-up outcomes
In this case, the data set should additionally contain a column with the value of the outcome in interval $K + 1$. Only the value of this variable on line $K$ is used in the algorithm, and it does not matter what values are coded on lines $k < K$. These may be left blank or given any value. As discussed above, death or other truncation events, if present, must be treated as censoring events in this case. Subjects who are first censored in interval $K + 1$ will have $K + 1$ records in the data, and the outcome on line $K$ must be coded as
NA. For subjects censored prior to interval $K + 1$, as above, the value of the outcome will be ignored in the estimation algorithm. See the data set
continuous_eofdata included with the package for an example of an input data set for an end of follow-up outcome. Unlike for survival outcomes, the default value of
time_points ($K + 1$) cannot be changed for end of follow-up outcomes.
4.3 Specifying the input data set, individual identifier, time index, outcome and competing events
The input data set is specified by the argument
obs_data, the column of this data set specifying the individual identifier by the argument
id, the column of this data set specifying the time index by the argument
time_name, the column of this data set specifying the outcome by the argument
outcome_name and, if present and not defined as a censoring event, the column of this data set specifying a competing event by the argument
compevent_name. Sample syntax is given by:
gformula_survival(..., obs_data = obs_data, id = 'ID', time_name = 't0', outcome_name = 'Y', compevent_name = 'D')
compevent_name should only be specified for survival outcomes and when the user chooses not to treat competing events as censoring events. Note that the input data set must be a data.table (see Section 4.2).
4.4 Specifying time-varying and baseline covariates
The time-varying covariates are defined by the vector argument
covnames. Baseline confounders (the time-fixed components of ) are defined by the vector argument
basecovs. Sample syntax is given by:
gformula_survival(..., covnames = c('Z1', 'Z2', 'Z3'), basecovs = c('race', 'sex'))
When the user chooses to allow outcome and/or covariate distributions to be certain discrete functions of time, an additional argument must be added to
covnames (see Section 4.7).
4.5 Generating covariate histories
Recall that, in step 1.a. of the estimation algorithm (Section 3), the user must specify a function of "history" when estimating the conditional distribution of each $Z_{j,k}$ conditional on "history" for each $k > 0$. For survival outcomes, the user must specify a function of "history" when estimating the observed conditional hazards of the event of interest at each $k + 1$ in step 1.b. (and of the competing event in step 1.c. if competing events are not treated as censoring events) conditional on "history". For end of follow-up outcomes, the user must specify a function of "history" when estimating the mean of the outcome at $K + 1$ conditional on "history" at $k = K$ only.
histories must be used to specify any desired functions of history that will be used for estimation in step 1 (and then, in turn, in the simulation in step 2) that cannot be defined within a model statement in R. For example, consider a time-varying covariate named
Z4 in the input data set and suppose the user assumes that the distributions of "future" covariates at each time $k$ depend on the history of Z4 through the cumulative average of this variable through $k$. Then the user must use
histories to create in the data set
obs_data an additional time-varying covariate containing the cumulative average of Z4 at each $k$. Alternatively, if the user made the assumption that the distributions of these "future" covariates/outcomes only depend on Z4 through the current value at $k$ or a transformation of the current value (i.e. square root, restricted cubic spline, quadratic or cubic function) which can be made within an R model statement, then no additional covariates need to be created, as these transformations can be made within a model statement (see Section 4.6).
Desired functions of history that cannot be created within a model statement in R must be created using
histories to ensure that estimation decisions in step 1 are carried forward correctly through the simulation in step 2. Pre-coded functions of history for a covariate named
Zj in the input data set include:
lagged: this adds a variable to the specified input data set named
lagi_Zj, containing the $i$th lag of
Zj relative to the current time $k$ (i.e. its value at $k - i$) for a desired number of lags $i$.
lagi_Zj is set to 0 on lines with $k < i$.
cumavg: this adds a variable to the specified input data set named
cumavg_Zj, which contains the cumulative average of
Zj up until the current time $k$.
lagavg: this adds a variable to the specified input data set named
lag_cumavgi_Zj, which contains the $i$th lag of the cumulative average of
Zj relative to the current time $k$, for a desired number of lags $i$.
lag_cumavgi_Zj is set to 0 on lines with $k < i$.
Note the desired number of lags is specified using the argument
covmodels (see Section 4.6).
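For intuition, the three pre-coded history functions can be reproduced for a single subject's covariate series in base R (illustrative only; the package's histories machinery applies these automatically across subjects and times):

```r
Zj <- c(2, 4, 6)  # a covariate measured at t0 = 0, 1, 2

# lagged (i = 1): the value at k - 1, set to 0 on lines with k < 1
lag1_Zj <- c(0, head(Zj, -1))

# cumavg: cumulative average of Zj through the current time k
cumavg_Zj <- cumsum(Zj) / seq_along(Zj)

# lagavg (i = 1): first lag of the cumulative average, set to 0 when k < 1
lag_cumavg1_Zj <- c(0, head(cumavg_Zj, -1))

lag1_Zj         # 0 2 4
cumavg_Zj       # 2 3 4
lag_cumavg1_Zj  # 0 2 3
```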
The package will apply the function of history listed in the th element of the vector argument
histories to all variables listed in the th element of the list argument
histvars. Therefore, the lengths of
histories and histvars need to match. Sample syntax that would add lagged, cumulative average, and lagged cumulative average functions of the covariates
Z1, Z2, and Z3 as new variables to the input data set
dataname for use in estimation of conditional distributions/hazards/means (see Sections 4.6-4.8) is as follows:
gformula_survival(..., obs_data = dataname, covnames = c('Z1', 'Z2', 'Z3'), basecovs = c('race', 'sex'), histories = c(lagged, cumavg, lagavg), histvars = list(c('Z1', 'Z2', 'Z3'), c('Z1', 'Z2', 'Z3'), c('Z1', 'Z2', 'Z3')))
4.6 Specifying covariate distributions
The vector argument
covtypes is used to specify the distribution of each time-varying covariate conditional on a function of history, how parameters of those distributions will be estimated, and, in turn, how covariate values in step 2 of the algorithm will be simulated. The gfoRmula package supplies a number of pre-programmed options for input to the argument
covtypes which are described below:
'binary', 'normal', 'categorical', 'bounded normal', 'zero-inflated normal', 'truncated normal', 'absorbing', and 'categorical time'. Each of the elements of
covtypes requires its own set of specific sub-parameters, which are contained within the list argument
covparams. Each sub-parameter vector within this list must be the same length as covnames.
We now give a description of each of the pre-programmed options for input to
covtypes along with required and optional sub-parameters:
'binary': The mean of the covariate conditional on history (see Section 4.5) is estimated via the estimated coefficients of a generalized linear model (GLM) (using
glm in R), where the family of the GLM is
binomial. The covariate values are simulated at each time $k$ in step 2 of the estimation algorithm (Section 3) given the previously simulated history under intervention, by sampling from a Bernoulli distribution with parameter this estimated conditional mean. Assuming the covariate in question is the $j$th entry in
covnames, then the sub-parameter
covmodels, which is a vector of the same length as
covnames, must contain as its $j$th entry a model statement for estimating the mean of the covariate in the $j$th entry of
covnames as a function of history and possibly
time_name. Optionally, the user can also specify the $j$th entry of
covlink, which is a vector containing the link function(s) for the GLM specified in the