1 Introduction
Generalized partially linear models are a semiparametric extension of generalized linear models (McCullagh & Nelder 1989), such that the conditional mean of a response variable $Y$ is related to a linear function of some covariates $Z$ and a smooth function of other covariates $X$. Let $\{(Y_i, Z_i, X_i): i = 1, \ldots, n\}$ be independent and identically distributed observations from the joint distribution of $(Y, Z, X)$. Consider the following model
$$E(Y \mid Z, X) = \Psi\{\beta^{\mathrm T} Z + g(X)\}, \tag{1}$$
where $\Psi(\cdot)$ is an inverse link function, $\beta$ is a vector of unknown parameters, and $g(\cdot)$ is an unknown, smooth function. Estimation in such models has been studied in at least two approaches. In one approach, theory and methods have been developed in the case where $X$ is low-dimensional (for example, a scalar) and kernel or spline smoothing is used to estimate $g(\cdot)$ at suitable rates of convergence (e.g., Speckman 1988; Severini & Staniswalis 1994). In another approach with $X$ relatively high-dimensional, doubly robust methods have been proposed to obtain estimators of $\beta$ which remain consistent and asymptotically normal at rate $n^{-1/2}$ if either a parametric model for $g(\cdot)$ or another parametric model about, for example, $E(Z \mid X)$ is correctly specified (Robins & Rotnitzky 2001; Tchetgen Tchetgen et al. 2010).
In this note, we are concerned with model (1) with a binary response $Y$ (taking value 0 or 1) and a logistic link, hence a logistic partially linear model:
$$P(Y = 1 \mid Z, X) = \mathrm{expit}\{\beta^{\mathrm T} Z + g(X)\}, \tag{2}$$
where $\mathrm{expit}(c) = \{1 + \exp(-c)\}^{-1}$. We provide a new class of doubly robust estimators of $\beta$ which remain consistent and asymptotically normal at rate $n^{-1/2}$ if either a parametric model for $g(X)$ or a parametric model for $E(Z \mid Y = 0, X)$ is correctly specified, under mild regularity conditions but without additional parametric or smoothness restriction.
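As a concrete illustration of the logistic partially linear model, the sketch below assumes the form P(Y = 1 | Z, X) = expit{β'Z + g(X)} and evaluates that probability for one covariate value before drawing a binary response. The function names, the choice g(x) = sin(x), and the coefficient values are hypothetical and only serve to make the notation tangible.

```python
import math
import random

def expit(c):
    """Inverse logit, {1 + exp(-c)}^{-1}, written to avoid overflow for large |c|."""
    if c >= 0:
        return 1.0 / (1.0 + math.exp(-c))
    e = math.exp(c)
    return e / (1.0 + e)

def prob_y1(z, x, beta, g):
    """P(Y = 1 | Z = z, X = x) under the assumed logistic partially linear form."""
    linear = sum(b * zj for b, zj in zip(beta, z))
    return expit(linear + g(x))

def draw_y(z, x, beta, g, rng):
    """Draw one binary response given the covariates."""
    return 1 if rng.random() < prob_y1(z, x, beta, g) else 0

g_example = math.sin          # hypothetical smooth nuisance function g
beta_example = [0.5, -1.0]    # hypothetical coefficient vector beta
p = prob_y1([1.0, 0.5], 0.0, beta_example, g_example)
y = draw_y([1.0, 0.5], 0.0, beta_example, g_example, random.Random(0))
```

Only the linear part β'Z is parametric here; g can be any function of x, which is what makes the model semiparametric.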
Previously, doubly robust estimators of the coefficient vector $\beta$ were derived in model (1) with respect to parametric models for the nonparametric component $g(X)$ and the conditional mean $E(Z \mid X)$, in the case of an identity link, $\Psi(c) = c$, or a log link, $\Psi(c) = \exp(c)$ (Robins & Rotnitzky 2001). For the logistic link, however, no doubly robust estimator of $\beta$ can be constructed in this manner with respect to parametric models about $g(X)$ and $E(Z \mid X)$ (Tchetgen Tchetgen et al. 2010). In fact, doubly robust estimators of $\beta$ in model (2) were obtained with respect to parametric models about $g(X)$ and $p(z \mid Y = 0, X)$, the conditional density of $Z$ given $Y = 0$ and $X$ (Chen 2007; Tchetgen Tchetgen et al. 2010). Therefore, our result in general allows doubly robust estimation for $\beta$ in model (2) with respect to more flexible nuisance models about the conditional mean $E(Z \mid Y = 0, X)$ than about the conditional density $p(z \mid Y = 0, X)$. In the special case of binary $Z$, our class of doubly robust estimators of $\beta$ is equivalent to that in Tchetgen Tchetgen et al. (2010), but involves use of the parametric model for $E(Z \mid Y = 0, X)$ in a more direct manner.
We also propose two specific doubly robust estimators of $\beta$ in model (2) based on efficiency considerations. The first estimator requires numerical evaluation of expectations under a model for the conditional density $p(z \mid Y = 0, X)$ beyond the conditional mean $E(Z \mid Y = 0, X)$ unless $Z$ is binary, but can be shown to achieve the minimum asymptotic variance among our class of doubly robust estimators when both nuisance models are correctly specified. Compared with the locally efficient, doubly robust estimators in Tchetgen Tchetgen et al. (2010), this estimator remains consistent if the model for $p(z \mid Y = 0, X)$ is misspecified but the less restrictive model for $E(Z \mid Y = 0, X)$ is correctly specified. Our second estimator is numerically and statistically simpler than our first one: it does not involve numerical integration or a parametric specification of the conditional density $p(z \mid Y = 0, X)$, and can achieve an asymptotic variance similar to that of our first estimator, especially when the true value of $\beta$ is close to 0.
2 Doubly robust estimation
For a semiparametric model, doubly robust estimation can often be derived by studying the orthogonal complement of the nuisance tangent space (Robins & Rotnitzky 2001). Denote by the Hilbert space of functions , with the inner product defined as . Denote , , and by and the truth of and . For model (2), the orthogonal complement of the nuisance tangent space is known to be (Bickel et al. 1993; Robins & Rotnitzky 2001)
(3) 
Our first result is a reformulation of as follows. See the Appendix for all proofs.
Proposition 1.
Assume that almost surely. The space can be equivalently expressed as
(4)  
(5) 
where is a function and
Our reformulation (5) suggests the following set of doubly robust estimating functions. Let be a parametric model for and, independently, be a parametric model for . The two functions and are variation independent, because and are variation independent (Chen 2007). For a function , define
(6) 
by letting in (5). Then
is an unbiased estimating function for
if either model or is correctly specified.
Proposition 2.
If either for some or for some , then
provided that the above expectation exists.
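The double robustness stated in Proposition 2 can be checked numerically. The sketch below is not a transcription of (6); it assumes an estimating function of the product-of-residuals form {Y exp(-βZ - g(X)) - (1 - Y)}{Z - f(X)} with scalar Z and the weight function taken to be 1, where g is a working model for the nonparametric component and f is a working model for E(Z | Y = 0, X). The data-generating design, the function names, and all numerical values are hypothetical.

```python
import math
import random

BETA = 0.5  # true coefficient of the scalar covariate Z (hypothetical value)

def g_true(x):   # true nuisance function g*(x) in the logistic model
    return x

def f_true(x):   # true conditional mean E(Z | Y = 0, X = x)
    return x

def zero(x):     # a deliberately misspecified working model
    return 0.0

def simulate(n, rng):
    """Draw (Y, Z, X) so that logit P(Y=1|Z,X) = BETA*Z + g_true(X)
    and Z | Y=0, X ~ N(f_true(X), 1)."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        # Log-odds of Y given X alone implied by the two ingredients above.
        eta = g_true(x) + BETA * f_true(x) + 0.5 * BETA ** 2
        y = 1 if rng.random() < 1.0 / (1.0 + math.exp(-eta)) else 0
        # Exponential tilting shifts the normal mean by BETA when Y = 1.
        z = rng.gauss(f_true(x) + BETA * y, 1.0)
        data.append((y, z, x))
    return data

def tau(y, z, x, beta, g, f):
    """Product of the residual Y*exp(-beta*Z - g(X)) - (1 - Y) and Z - f(X)."""
    return (y * math.exp(-beta * z - g(x)) - (1 - y)) * (z - f(x))

def mean_tau(data, beta, g, f):
    return sum(tau(y, z, x, beta, g, f) for y, z, x in data) / len(data)

data = simulate(100_000, random.Random(2024))
m_g_ok = mean_tau(data, BETA, g_true, zero)   # g correct, f misspecified
m_f_ok = mean_tau(data, BETA, zero, f_true)   # g misspecified, f correct
m_none = mean_tau(data, BETA, zero, zero)     # both misspecified
```

In this design the first two averages should be close to zero up to Monte Carlo error, while the third should be clearly away from zero, since neither working model is correct.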
Various doubly robust estimators can be constructed through (6). In general, let be an estimator of , for example, the maximum likelihood estimator, which satisfies for some constant and influence function such that if model is correctly specified. Let be an estimator of , for example, the least-squares or related estimator, which satisfies for some constant and influence function such that if model is correctly specified. Define an estimator as a solution to
Under suitable regularity conditions (e.g., Manski 1988), it can be shown that if either model or is correctly specified, then
(7) 
where , , and . The asymptotic variance of can be estimated by using the sample variance of an estimated version of the influence function in (7).
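Solving an estimating equation of this kind for the coefficient can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure of the paper: Z is scalar, the estimating function is assumed to take the form {Y exp(-bZ - g(X)) - (1 - Y)}{Z - f(X)}, the working functions g and f are fixed rather than fitted, and the root is found by bisection on a bracket assumed to contain a sign change. Variance estimation based on (7) is omitted. All names and numerical values are hypothetical.

```python
import math
import random

def simulate(n, beta, rng):
    """Scalar Z and X with logit P(Y=1|Z,X) = beta*Z + X and E(Z | Y=0, X) = X."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        eta = x + beta * x + 0.5 * beta ** 2   # implied log-odds of Y given X
        y = 1 if rng.random() < 1.0 / (1.0 + math.exp(-eta)) else 0
        z = rng.gauss(x + beta * y, 1.0)       # Z | Y, X is N(X + beta*Y, 1)
        data.append((y, z, x))
    return data

def estimating_mean(data, b, g, f):
    """Sample average of {Y exp(-b Z - g(X)) - (1 - Y)} {Z - f(X)}."""
    total = 0.0
    for y, z, x in data:
        total += (y * math.exp(-b * z - g(x)) - (1 - y)) * (z - f(x))
    return total / len(data)

def solve_beta(data, g, f, lo=-1.0, hi=1.5, iters=40):
    """Bisection for a root in [lo, hi]; requires a sign change on the bracket."""
    s_lo = estimating_mean(data, lo, g, f)
    s_hi = estimating_mean(data, hi, g, f)
    if s_lo * s_hi > 0:
        raise ValueError("no sign change on the bracket")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        s_mid = estimating_mean(data, mid, g, f)
        if s_lo * s_mid <= 0:
            hi = mid
        else:
            lo, s_lo = mid, s_mid
    return 0.5 * (lo + hi)

data = simulate(20_000, 0.5, random.Random(7))
# Working model for g is wrong (identically 0); working model for f is correct.
beta_hat = solve_beta(data, g=lambda x: 0.0, f=lambda x: x)
```

With the working model for g deliberately wrong and the working model for f correct, the solution should still land near the true value 0.5, which is the behaviour double robustness predicts.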
We now provide several remarks. First, estimating function (6) can be expressed as
(8) 
where
, representing the conditional probability
under the conjunction of model (2) and model . Therefore, our doubly robust estimating function involves the product of two “residuals”, and . Similar products can also be found in previous doubly robust estimating functions for in model (1) with the identity or log link (Robins & Rotnitzky 2001). However, a notable feature in (8) is that the residual used from the model is , associated with the estimating equation for calibrated estimation (Tan 2017), which in the case gives
The standard residual from logistic regression is
, associated with the score equation for maximum likelihood estimation, which in the case gives
In general, the estimating function is not unbiased for if model is correctly specified but model is misspecified.
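For orientation, the contrast between the two kinds of residual can be written out in generic form. The notation is an assumption of this sketch rather than the paper's own display: $\pi_i$ denotes the fitted probability of $Y_i = 1$ under a working logistic model with linear predictor $\eta_i$, and $h(X_i)$ is a generic vector of functions of the covariates.

```latex
\begin{align*}
\text{calibrated estimation:} \quad
  & \sum_{i=1}^n \left( \frac{Y_i}{\pi_i} - 1 \right) h(X_i) = 0, \\
\text{maximum likelihood:} \quad
  & \sum_{i=1}^n \left( Y_i - \pi_i \right) h(X_i) = 0, \\
\text{with } \pi_i = \mathrm{expit}(\eta_i): \quad
  & \frac{Y_i}{\pi_i} - 1 = \frac{Y_i - \pi_i}{\pi_i}
    = Y_i \, e^{-\eta_i} - (1 - Y_i).
\end{align*}
```

The last line shows that the calibration residual is the standard residual reweighted by $1/\pi_i$, which is why the two estimating equations generally have different solutions when the working model is misspecified.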
Second, our results can also be used to shed light on the class of doubly robust estimators in Tchetgen Tchetgen et al. (2010), which are briefly reviewed as follows. For model (2), the conditional distribution of jointly given can be determined as (Chen 2007)
(9) 
where is some fixed value (assumed to be 0 hereafter), , and the conditional densities and are variation-independent nuisance parameters. Let be some prespecified conditional densities and . By using (9), the orthocomplement of the nuisance tangent space in model (2) can be characterized as (Tchetgen Tchetgen et al. 2010)
(10) 
where for , and denotes the expectation under . It can be verified by direct calculation that the two sets on the right hand sides of (3) and (10) are equivalent to each other: each element in the right hand side of (10) can be expressed in the form of elements in the right hand side of (3), and vice versa. Let or equivalently be a parametric model for or , and let be a parametric model for . For a function , the estimating function based on (10) in Tchetgen Tchetgen et al. (2010) can be equivalently defined, based on (3), as
(11) 
where and denotes the expectation under the law defined as (9), but evaluated at and . The estimating function (11) is doubly robust, i.e. unbiased for if either model or is correctly specified. Although (11) appears to be asymmetric in and , the double robustness of (11) follows from that of its equivalent version based on (10), as shown by exploiting the symmetry in and in Tchetgen Tchetgen et al. (2010). See also Tchetgen Tchetgen and Rotnitzky (2011) for an explicit demonstration of symmetry of (11) in and with in the case of a binary .
As an interesting implication of our reformulation (4) in Proposition 1, the estimating function (11) can be equivalently expressed as
(12) 
which involves the expectation under , instead of under the law (9) evaluated at and . Therefore, (12) is computationally much simpler than (11) and its equivalent version based on (10). Moreover, the double robustness of (12) with respect to and can be directly shown as in the Appendix, without invoking its equivalent version based on (10).
Third, we compare our doubly robust estimating functions with those in Tchetgen Tchetgen et al. (2010). For a function , consider the estimating function
(13) 
By our reformulation (5), the class of estimating functions over all possible choices of is equivalent to that of over all possible choices of as used in Tchetgen Tchetgen et al. (2010). A subtle point is that the mapping between and depends on , but this does not affect our subsequent discussion. Similarly to (12), the estimating function (13) can be shown to be doubly robust for with respect to models and .
By comparing (6) and (13), we see that our estimating function (6) corresponds to a particular choice of estimating function (13) with , such that (6) depends only on a parametric model for the conditional expectation , but not the conditional density . Therefore, our class of (6) is in general a strict subset of the class of (13) to achieve double robustness with respect to conditional mean models for , except when is binary and hence the classes of (6) and (13) are equivalent.
Fourth, there is a similar characterization of as in Proposition 1, involving expectations under instead of . By symmetry, it can be shown that
where is a function and . Consequently, a similar estimating function as (6) can be derived such that it is doubly robust for with respect to parametric models for and .
3 Efficiency considerations
For our class of doubly robust estimating functions (6), we study how to choose the function based on efficiency considerations. First, the following result gives the optimal choice of with correctly specified models and .
Proposition 3.
If both models and are correctly specified for and respectively, then the optimal choice of in minimizing the asymptotic variance of which admits asymptotic expansion (7) is
where for a column vector .
From this result, it is straightforward to derive a locally-efficient-like, doubly robust estimator for . Let be the maximum likelihood estimator in the model , and be the maximum likelihood estimator in a conditional density model as in (11) but compatible with model for , where and is a variance parameter. Consider the estimator with
Then it can be shown under suitable regularity conditions that is doubly robust, i.e. remains consistent for if either model or is correctly specified, and achieves the minimum asymptotic variance among all estimators when both models and including are correctly specified.
It is interesting to compare with the locally efficient, doubly robust estimator for in Tchetgen Tchetgen et al. (2010). For a function , define an estimator as a solution to , where are maximum likelihood estimators as above or, without affecting our discussion here, profile maximum likelihood estimators as in Tchetgen Tchetgen et al. (2010). Then the optimal choice of in minimizing the asymptotic variance of is . In fact, the estimator is locally efficient, i.e. achieving the semiparametric variance bound in model (2) when both models and are correctly specified. Unless is binary, this semiparametric variance bound is in general strictly smaller than the asymptotic variance achieved by when both models and are correctly specified, because the class of estimating functions (6) is strictly a subset of the class (11), (12), or (13), as discussed in Section 2. In the case of a binary and hence , the two estimators and are equivalent. On the other hand, is doubly robust only with respect to models and , whereas is doubly robust with respect to and and hence remains consistent for if model is misspecified but the less restrictive model for is correctly specified.
Evaluation of the function and hence the estimator in general requires cumbersome numerical integration with respect to the density . For computational simplicity, consider the estimator with scalar . The corresponding estimating function can be shown to become
(14) 
The particular choice can be motivated by the fact that if the true then . Then is nearly as efficient as and, by similar reasoning, also whenever is close to 0. This is analogous to how the easy-to-compute estimator is related to the locally efficient estimator in Tchetgen Tchetgen et al. (2010, Section 4). Moreover, the estimating function (14) can be equivalently expressed as
which, in the case of a binary , coincides with the estimating function underlying the closedform estimator for in Tchetgen Tchetgen (2013).
4 Conclusion
We derive simple, doubly robust estimators of coefficients for the covariates in the linear component in a logistic partially linear model. Such estimators remain consistent if either a nuisance model is correctly specified for the nonparametric component of the partially linear model, or a conditional mean model is correctly specified for the covariates of interest given other covariates and the response at a fixed value. These estimators can be useful in conventional settings with a limited number of covariates. Moreover, there have been various works exploiting doubly robust estimating functions to obtain valid inferences in high-dimensional problems (e.g., Farrell 2015; Chernozhukov et al. 2018; Tan 2018). Our estimating functions can potentially be employed to achieve similar properties in high-dimensional settings.
5 Appendix
Proof of Proposition 1. First, we show that for any ,
This follows because
by the law of iterated expectations and then the law of total probability. Then the set (3) is equivalent to (4). Next, the set (4) is equivalent to , and the set (5) is equivalent to . The two sets are equivalent to each other, by letting .
Proof of Proposition 2. By the law of iterated expectations, we have
This immediately shows that if either or , then .
Proof of double robustness of (12). By the law of iterated expectations, we have
This immediately shows that if either or , then .
Proof of Proposition 3. Suppose that both models and are correctly specified, such that and . Then by direct calculation, and hence (7) reduces to
By the proof of Proposition 2, we actually have , where
Therefore, is asymptotically equivalent to a solution to , which can be seen as an estimator for
under the conditional moment condition
. By Chamberlain (1987), the optimal choice of in minimizing the asymptotic variance of such an estimator is , which can be simplified as by direct calculation.
References

Bickel, P.J., Klaassen, C.A.J., Ritov, Y., and Wellner, J.A. (1993) Efficient and Adaptive Estimation for Semiparametric Models, The Johns Hopkins University Press, Baltimore.

Chamberlain, G. (1987) “Asymptotic efficiency in estimation with conditional moment restrictions,” Journal of Econometrics, 34, 305–334.

Chen, H.Y. (2007) “A semiparametric odds ratio model for measuring association,” Biometrics, 63, 413–421.

Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W.K., and Robins, J.M. (2018) “Double/debiased machine learning for treatment and structural parameters,” Econometrics Journal, 21, C1–C68.

Farrell, M.H. (2015) “Robust inference on average treatment effects with possibly more covariates than observations,” Journal of Econometrics, 189, 1–23.

Manski, C.F. (1988) Analog Estimation Methods in Econometrics, Chapman & Hall, New York.

McCullagh, P. and Nelder, J.A. (1989) Generalized Linear Models (2nd edition), Chapman & Hall, London.

Robins, J.M. and Rotnitzky, A. (2001) Comment on the Bickel and Kwon Article, “Inference for semiparametric models: Some questions and an answer,” Statistica Sinica, 11, 920–936.

Severini, T.A. and Staniswalis, J.G. (1994) “Quasi-likelihood estimation in semiparametric models,” Journal of the American Statistical Association, 89, 501–511.

Speckman, P. (1988) “Kernel smoothing in partial linear models,” Journal of the Royal Statistical Society, Ser. B, 50, 413–436.

Tan, Z. (2017) “Regularized calibrated estimation of propensity scores with model misspecification and high-dimensional data,” arXiv:1710.08074.

Tan, Z. (2018) “Model-assisted inference for treatment effects using regularized calibrated estimation with high-dimensional data,” arXiv:1801.09817.

Tchetgen Tchetgen, E.J. (2013) “On a closed-form doubly robust estimator of the adjusted odds ratio for a binary exposure,” American Journal of Epidemiology, 177, 1314–1316.

Tchetgen Tchetgen, E.J. and Rotnitzky, A. (2011) “Double-robust estimation of an exposure-outcome odds ratio adjusting for confounding in cohort and case-control studies,” Statistics in Medicine, 30, 335–347.

Tchetgen Tchetgen, E.J., Robins, J.M., and Rotnitzky, A. (2010) “On doubly robust estimation in a semiparametric odds ratio model,” Biometrika, 97, 171–180.