1 Introduction
The underlying question of this article can be formulated quite easily: Given some real valued random variable $Y$ and some $\mathbb{R}^{d_X}$ valued random variable $X$ fulfilling the heteroscedastic transformation model
$$h(Y)=g(X)+\sigma(X)\varepsilon \tag{1.1}$$
with some error term $\varepsilon$ independent of $X$ and fulfilling $E[\varepsilon]=0$ and $\operatorname{Var}(\varepsilon)=1$, are the model components $h$, $g$, $\sigma$ and the error distribution uniquely determined if the joint distribution of $(Y,X)$ is known? This uniqueness is called identification of a model.
Over the last years, transformation models have attracted more and more attention since they are often used to obtain desirable properties by first transforming the dependent random variable of a regression model. Applications of such transformations range from reducing skewness of the data to inducing additivity, homoscedasticity or even normality of the error terms. Already
Box and Cox (1964), Bickel and Doksum (1981) and Zellner and Revankar (1969) introduced some parametric classes of transformation functions. Horowitz (1996) proved for a linear regression function $g(x)=\beta^T x$ and homoscedastic errors that the model is identified when $h(y_0)=0$ is assumed for some $y_0\in\mathbb{R}$ and the regression parameter $\beta$ is standardized such that the first component, which is different from zero, is equal to one. Later, the ideas of Horowitz (1996) were extended by Ekeland et al. (2004) to general smooth regression functions $g$. The arguably most general identification results so far were provided by Chiappori et al. (2015) and Vanhems and Van Keilegom (2019), who considered general regression functions and homoscedastic errors as well, but allowed endogenous regressors. Linton et al. (2008) used similar ideas to obtain identifiability of a model with parametric transformation functions as a special case. As will be seen in Section 2, these approaches cannot be applied to the heteroscedastic model, so that different methods are needed. Despite their practical relevance (e.g. in duration models, see Khan et al. (2011)), results allowing heteroscedasticity are rare. Zhou et al. (2009) showed identifiability in a single-index model with a linear regression function and a known variance function. Wang and Wang (2018) applied this model to lung cancer data. Neumeyer et al. (2016) required identifiability implicitly in their assumptions.
In contrast to the approaches mentioned above, this article tries to avoid any parametric assumption on $h$, $g$ or $\sigma$, which to the author’s knowledge has not been done before. Note that the validity of the model is unaffected by linear transformations. This means that for arbitrary constants $a>0$ and $b\in\mathbb{R}$ equation (1.1) still holds when replacing $h$, $g$ and $\sigma$ by
$$\tilde h=ah+b,\qquad \tilde g=ag+b,\qquad \tilde\sigma=a\sigma.$$
Of course, one could have chosen an arbitrary $a\neq 0$ as well, but similar to existing results the transformation function will be restricted to be strictly increasing without loss of generality. Nevertheless, at least two conditions for fixing $a$ and $b$ are needed. Since these conditions determine the linear transformation, they are sometimes called location and scale constraints.
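The invariance under linear transformations can be checked numerically. The following is a minimal sketch under purely illustrative assumptions: the functions `h`, `g`, `sigma`, the Gaussian errors and the constants `a`, `b` are hypothetical choices and not taken from the article. It simulates data from model (1.1) and confirms that the error recovered from the transformed components $(ah+b,\,ag+b,\,a\sigma)$ coincides with the original error.

```python
import math
import random


def h(y):
    # Hypothetical strictly increasing transformation function.
    return math.sinh(y)


def h_inv(z):
    # Inverse of the hypothetical transformation.
    return math.asinh(z)


def g(x):
    # Hypothetical regression function.
    return 1.0 + 0.5 * x


def sigma(x):
    # Hypothetical scale function, strictly positive.
    return 1.0 + x * x


def simulate(n, seed=0):
    """Draw (Y, X, eps) with h(Y) = g(X) + sigma(X) * eps."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        eps = rng.gauss(0.0, 1.0)
        y = h_inv(g(x) + sigma(x) * eps)
        data.append((y, x, eps))
    return data


def max_residual_gap(data, a, b):
    """Largest |eps - eps_tilde|, where eps_tilde is the error implied
    by the linearly transformed components (a*h + b, a*g + b, a*sigma)."""
    gap = 0.0
    for y, x, eps in data:
        h_t = a * h(y) + b
        g_t = a * g(x) + b
        s_t = a * sigma(x)
        gap = max(gap, abs((h_t - g_t) / s_t - eps))
    return gap


data = simulate(1000)
gap = max_residual_gap(data, a=2.5, b=-3.0)
print(gap)  # numerically zero up to rounding
```

Since both sets of components reproduce the same error term, the data alone cannot distinguish them, which is why location and scale constraints are required.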
The remainder of this article is organized as follows. First, some assumptions are listed before the main identification result for heteroscedastic transformation models is motivated and stated in Section 2. Afterwards, a short conclusion in Section 3 is followed by the Appendix, which contains some results on uniqueness of solutions to differential equations and the proof of the main result.
2 The Idea and the Result
Before the identification result can be motivated, some assumptions and notations have to be introduced. First, basic assumptions concerning validity of model (1.1) and continuity of its model components are given.

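(A1) Let $Y$ and $X$ be real valued and $\mathbb{R}^{d_X}$ valued random variables, respectively, with $h(Y)=g(X)+\sigma(X)\varepsilon$ for some transformation, regression and variance functions $h$, $g$ and $\sigma$.

(A2) $\varepsilon$ is a centred random variable independent of $X$ with $E[\varepsilon]=0$ and $\operatorname{Var}(\varepsilon)=1$.

(A3) Let the density $f_\varepsilon$ of $\varepsilon$ be continuous and let $h$, $g$ and $\sigma$ from (A1) be continuously differentiable.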
Moreover, a regularity assumption for the conditional distribution function of given is needed.

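(A4) The conditional cumulative distribution function $F_{Y|X}(y|x)=P(Y\le y\mid X=x)$ is continuously differentiable with respect to $y$ and $x$. Let $v$ be a weight function with support $\operatorname{supp}(v)$ such that $\frac{\partial}{\partial y}F_{Y|X}(y|x)>0$ for all $y\in\mathbb{R}$ and $x\in\operatorname{supp}(v)$ and such that (with $g$ and $\sigma$ from (A1))
$$A=\int v(x)\,\frac{\partial\sigma(x)/\partial x_1}{\sigma(x)}\,dx,\qquad B=\int v(x)\left(\frac{\partial g(x)}{\partial x_1}-g(x)\,\frac{\partial\sigma(x)/\partial x_1}{\sigma(x)}\right)dx$$
are well defined with $A\neq 0$.
The assumption $A\neq 0$ requires heteroscedasticity of the model. Note that the homoscedastic case was already treated by Chiappori et al. (2015). Later, it will be shown in Remark 2.2 that (A1)–(A3) and the first part of (A4) exclude the case that a homoscedastic and a heteroscedastic version of model (1.1) exist at the same time. In the following, the functions $h$, $g$, $\sigma$ and $f_\varepsilon$ from (A1) and (A3) are used to show their uniqueness and consequently identification of the model.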
2.1 The Transformation Function as a Solution to an Initial Value Problem
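Many of the homoscedastic identification approaches mentioned in the introduction are based on the same idea (see Ekeland et al. (2004), Horowitz (2009) and recently Chiappori et al. (2015)). Using the example of Chiappori et al. (2015), their method can be summarized in the following way: Let $F_{Y|X}$ be the conditional cumulative distribution function of $Y$ conditioned on $X$. Take the derivatives of $F_{Y|X}$ with respect to $y$ and some component of $x$, divide the first by the latter one and obtain the transformation function by integrating this quotient. After applying some identification constraints, the transformation function is identified as it only depends on the joint distribution of $(Y,X)$. In heteroscedastic models, the reasoning has to be changed since the way in which the transformation function enters the conditional distribution function and its partial derivatives becomes more complex. The latter functions can be written as
$$\frac{\partial F_{Y|X}(y|x)}{\partial y}=\frac{\partial}{\partial y}F_\varepsilon\!\left(\frac{h(y)-g(x)}{\sigma(x)}\right)=f_\varepsilon\!\left(\frac{h(y)-g(x)}{\sigma(x)}\right)\frac{h'(y)}{\sigma(x)} \tag{1.2}$$
and
$$\frac{\partial F_{Y|X}(y|x)}{\partial x_1}=-f_\varepsilon\!\left(\frac{h(y)-g(x)}{\sigma(x)}\right)\frac{\sigma(x)\frac{\partial g(x)}{\partial x_1}+(h(y)-g(x))\frac{\partial\sigma(x)}{\partial x_1}}{\sigma(x)^2}.$$
Here, $h'$ is an abbreviation for the derivative $\frac{\partial h(y)}{\partial y}$ and $F_\varepsilon$ denotes the cumulative distribution function of $\varepsilon$ with density $f_\varepsilon$. Hence, even if (A4) is valid, the transformation function cannot be obtained by simply integrating the quotient
$$\frac{\partial F_{Y|X}(y|x)/\partial y}{\partial F_{Y|X}(y|x)/\partial x_1}=-\frac{h'(y)\,\sigma(x)}{\sigma(x)\frac{\partial g(x)}{\partial x_1}+(h(y)-g(x))\frac{\partial\sigma(x)}{\partial x_1}} \tag{1.3}$$
since the denominator now also depends on the transformation function.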
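For contrast, it may help to write out what the quotient looks like in the homoscedastic special case. The following lines are only a sketch under the assumptions that the scale function is a constant $\sigma_0>0$, that $\partial g(x)/\partial x_1\neq 0$, and that $h$, $g$, $F_\varepsilon$ and $f_\varepsilon$ denote the transformation function, the regression function and the error distribution function and density of model (1.1); the argument of $f_\varepsilon$ is abbreviated by a dot.

```latex
\begin{align*}
F_{Y|X}(y|x) &= F_\varepsilon\!\left(\frac{h(y)-g(x)}{\sigma_0}\right),\\
\frac{\partial F_{Y|X}(y|x)/\partial y}{\partial F_{Y|X}(y|x)/\partial x_1}
  &= \frac{f_\varepsilon(\cdot)\,h'(y)/\sigma_0}
          {-f_\varepsilon(\cdot)\,\frac{\partial g(x)}{\partial x_1}/\sigma_0}
   = -\frac{h'(y)}{\partial g(x)/\partial x_1},\\
\int_{y_1}^{y}\frac{\partial F_{Y|X}(u|x)/\partial u}{\partial F_{Y|X}(u|x)/\partial x_1}\,du
  &= -\frac{h(y)-h(y_1)}{\partial g(x)/\partial x_1}.
\end{align*}
```

In this special case the denominator does not involve $h$, so integrating the quotient recovers $h$ up to location and scale. With a non-constant scale function this separation is lost.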
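Instead, we consider the reciprocal value of (1.3) and name this $\lambda(y|x)$:
$$\lambda(y|x)=\frac{\partial F_{Y|X}(y|x)/\partial x_1}{\partial F_{Y|X}(y|x)/\partial y}=-\frac{\sigma(x)\frac{\partial g(x)}{\partial x_1}+(h(y)-g(x))\frac{\partial\sigma(x)}{\partial x_1}}{h'(y)\,\sigma(x)}.$$
Next, if $v$ is the weight function from (A4), $\lambda(y|x)$ can be integrated with respect to $x$ as follows to obtain
$$\lambda(y)=\int v(x)\,\lambda(y|x)\,dx=-\frac{A\,h(y)+B}{h'(y)} \tag{1.4}$$
with $A=\int v(x)\frac{\partial\sigma(x)/\partial x_1}{\sigma(x)}\,dx$ and $B=\int v(x)\big(\frac{\partial g(x)}{\partial x_1}-g(x)\frac{\partial\sigma(x)/\partial x_1}{\sigma(x)}\big)\,dx$ from (A4). Since assumption (A4) implies $h'>0$ and consequently strict monotonicity of $h$, there exists exactly one root of $\lambda$, which will be called $y_0$ in the following. Due to (1.4) it holds that $h(y_0)=-B/A$.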
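The relation between the ratio of partial derivatives and the transformation function can be checked by finite differences. The sketch below rests on several assumptions that are not part of the article: a one-dimensional regressor, standard normal errors, the hypothetical choices `h`, `g`, `sigma`, a weight function that is a point mass at `X_STAR` (so that the integral over $x$ collapses to an evaluation at that point), and the sign convention $\lambda(y)=-(A\,h(y)+B)/h'(y)$ with $A=\sigma'(x^*)/\sigma(x^*)$ and $B=g'(x^*)-g(x^*)\sigma'(x^*)/\sigma(x^*)$.

```python
import math


def Phi(t):
    # Standard normal cdf (assumed error distribution).
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def h(y):
    # Hypothetical transformation function.
    return math.sinh(y)


def h_prime(y):
    return math.cosh(y)


def g(x):
    # Hypothetical regression function with derivative 1.
    return x


def sigma(x):
    # Hypothetical scale function with derivative 2x.
    return 1.0 + x * x


def F(y, x):
    """Conditional cdf of Y given X = x under the assumed model."""
    return Phi((h(y) - g(x)) / sigma(x))


def lam_numeric(y, x, step=1e-5):
    """Ratio (dF/dx) / (dF/dy) from central finite differences."""
    dF_dx = (F(y, x + step) - F(y, x - step)) / (2.0 * step)
    dF_dy = (F(y + step, x) - F(y - step, x)) / (2.0 * step)
    return dF_dx / dF_dy


X_STAR = 0.5
A = 2.0 * X_STAR / sigma(X_STAR)   # sigma'(x*) / sigma(x*)
B = 1.0 - g(X_STAR) * A            # g'(x*) - g(x*) * sigma'(x*) / sigma(x*)


def lam_closed(y):
    """Closed form -(A h(y) + B) / h'(y)."""
    return -(A * h(y) + B) / h_prime(y)


checks = [(y, lam_numeric(y, X_STAR), lam_closed(y)) for y in (-1.0, 0.3, 1.2)]
for y, numeric, closed in checks:
    print(y, numeric, closed)
```

The left-hand quantity uses only the conditional distribution function, that is, only observable information, while the right-hand side involves the unknown $h$. This is what turns the relation into a differential equation for $h$.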
In the following, the problem of identifying model (1.1) is reduced to solving an ordinary differential equation uniquely. Afterwards, basic uniqueness theorems for initial value problems will imply the main identification result. To this end, rewrite equation (1.4) to obtain
$$h'(y)=-\frac{A\,h(y)+B}{\lambda(y)} \tag{1.5}$$
for all $y\neq y_0$. This can indeed be understood as a differential equation, but an initial condition is needed to obtain an initial value problem. Here, the initial condition
$$h(y_1)=\lambda_1 \tag{1.6}$$
for some $y_1>y_0$ and some $\lambda_1>h(y_0)$ is considered (remember that $h$ was assumed to be strictly increasing). Theorem A.2 in the appendix yields uniqueness of any solution to this initial value problem on any compact interval $[z_a,z_b]\subset(y_0,\infty)$ containing $y_1$. This identification result can be generalized to all $y\in\mathbb{R}$.
2.2 Uniqueness of the Unknown Coefficients
The reasoning above is designed for fixed $A$ and $B$, that is, it remains to prove uniqueness of these coefficients. Moreover, it would be desirable to derive an explicit formula for the transformation function instead of only proving its uniqueness. This will be done in the remainder of this section.
First, the initial value problem, which corresponds to the equations (1.5) and (1.6), is solved for $y>y_0$ by
$$h(y)=\left(\lambda_1+\frac{B}{A}\right)\exp\!\left(-A\int_{y_1}^{y}\frac{1}{\lambda(u)}\,du\right)-\frac{B}{A}. \tag{1.7}$$
By straightforward calculations, it can be verified that (1.7) is indeed a solution to the initial value problem. Second, as was already mentioned in the introduction, model (1.1) is not only fulfilled for $(h,g,\sigma)$, but also for any linear transformation of these functions. Therefore, to obtain uniqueness it is necessary to fix these linear transforms. This can be done by requiring so-called location and scale constraints and corresponds to fixing the constants $a$ and $b$ of the linear transformation. While
$$h(y_0)=0 \tag{1.8}$$
is chosen as the location constraint, the scale constraint is equal to the initial condition (1.6), that is, $\lambda_1$ is viewed as an arbitrary, but fixed positive number. Here, the location constraint was chosen such that equation (1.4) implies $B=0$. Nevertheless, other location constraints are conceivable as well, as can be seen in Remark 2.3.
Consequently, equation (1.7) reduces to
$$h(y)=\lambda_1\exp\!\left(-A\int_{y_1}^{y}\frac{1}{\lambda(u)}\,du\right). \tag{1.9}$$
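The closed-form solution can be sanity-checked numerically. The sketch below assumes the sign convention $\lambda(y)=-(A\,h(y)+B)/h'(y)$ and the resulting form $h(y)=(\lambda_1+B/A)\exp(-A\int_{y_1}^{y}\lambda(u)^{-1}du)-B/A$ of (1.7); the transformation `h_true`, the constants `A`, `B`, the initial point `Y1` and the helper `simpson` are hypothetical choices made only for illustration. It builds $\lambda$ from a known $h$ and then recovers $h$ on $(y_0,\infty)$ by quadrature.

```python
import math

A, B = 0.5, -0.3  # hypothetical constants with A != 0


def h_true(y):
    # Hypothetical transformation used to generate lambda.
    return math.sinh(y)


def lam(y):
    """lambda(y) = -(A h(y) + B) / h'(y) for the assumed h."""
    return -(A * math.sinh(y) + B) / math.cosh(y)


def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with n (even) subintervals; works for hi < lo too."""
    width = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * width)
    return total * width / 3.0


Y0 = math.asinh(-B / A)   # unique root of lambda, roughly 0.569
Y1 = 1.0                  # initial point, chosen larger than Y0
LAMBDA1 = h_true(Y1)      # scale constraint h(Y1) = LAMBDA1


def h_rec(y):
    """Evaluate the assumed closed form (1.7) for y > Y0."""
    integral = simpson(lambda u: 1.0 / lam(u), Y1, y)
    return (LAMBDA1 + B / A) * math.exp(-A * integral) - B / A


errors = [abs(h_rec(y) - h_true(y)) for y in (0.7, 1.0, 1.5, 2.5)]
print(max(errors))  # close to zero
```

Setting `B = 0.0` in the sketch corresponds to the location constraint (1.8), in which case `h_rec` evaluates the reduced form (1.9). The integrand has a singularity at `Y0`, so the evaluation points have to stay on the same side of `Y0` as `Y1`.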
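If there exist two coefficients $A$ and $\tilde A$ such that the corresponding transformation functions $h$ and $\tilde h$ from (1.9) fulfil model (1.1), it would hold that
$$\tilde h(y)=\lambda_1\exp\!\left(-\tilde A\int_{y_1}^{y}\frac{1}{\lambda(u)}\,du\right)=\lambda_1\left(\frac{h(y)}{\lambda_1}\right)^{\tilde A/A}\quad\text{for all } y>y_0.$$
Since $h$ and $\tilde h$ are both positive and strictly increasing on $(y_0,\infty)$, equation (1.5) shows that $A$ and $\tilde A$ have the same sign. Assume without loss of generality $\tilde A/A>1$. Then,
$$\tilde h'(y)=\frac{\tilde A}{A}\left(\frac{h(y)}{\lambda_1}\right)^{\tilde A/A-1}h'(y)\longrightarrow 0\quad\text{as } y\downarrow y_0,$$
because $h(y)\to h(y_0)=0$.
Therefore, continuous differentiability of $h$ and $\tilde h$ would imply $\tilde h'(y_0)=0$, which due to (1.2) would lead to a violation of (A4). Hence, $A$ is unique under (A1)–(A4), which finally leads to the main identification result. Note that the same argument is valid for transformation functions as in (1.7) since these are simply linearly transformed versions of (1.9).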
Theorem 2.1

Moreover, is uniquely determined and it holds that
The proof can be found in Section B.
Finally, two remarks are given dealing on the one hand with further generalizations and implications for future estimation and testing techniques and on the other hand with justifying alternative identification constraints.
Remark 2.2


If is not constant, there are values such that . Consequently, changes its sign for these . If is constant, that is, the error is homoscedastic, this is not the case. Hence, model (1.1) can not be fulfilled for homoscedastic and heteroscedastic errors at the same time.

The identification result can be generalized in many regards. For example, one could have used any other partial derivative in 1 as well. Moreover, can be chosen as a Dirac delta function as well and it is possible to consider error densities with bounded support. See Kloodt (2019) for a more detailed examination. Moreover, it is conjectured that the result can be generalized to conditional independence of and given endogenous regressors similarly to Chiappori et al. (2015).
Remark 2.3
One could have used other scale and location constraints than (1.6) and (1.8). For example, consider for some real numbers the conditions
(1.14) 
Assume there exist two transformation functions such that model (1.1) and the constraints (1.14) are fulfilled. Then, the functions
fulfil model (1.1) and the constraints (1.6) and (1.8). This leads to so that
for all .
A similar reasoning can be applied to show that identification constraints like
for some ensure uniqueness of the transformation function as well.
3 Conclusion and Outlook
The most general identification result so far in the theory of transformation models has been provided. While doing so, the techniques of Ekeland et al. (2004) and Chiappori et al. (2015) have been used to reduce the problem of identifiability to that of solving an ordinary differential equation. Most of the previous results are contained as special cases. The main contribution consists in allowing heteroscedastic errors, which justifies the common practice of assuming identifiability, as for example in the paper of Neumeyer et al. (2016).
Moreover, the result is constructive in the sense that it not only guarantees identification of the model, but even supplies an analytic expression of the transformation function depending on the joint cumulative distribution function of the data and some parameter. This parameter is identified, too, and can be expressed as in Kloodt (2019) under the additional assumption of a twice continuously differentiable transformation function.
Due to the explicit character of equation (1.7), future research could consist in analysing the resulting plug-in estimator. This will be the topic of a subsequent paper. Furthermore, the presented results could be successively generalized as in Remark 2.2 or by allowing vanishing derivatives of $h$. Moreover, it would be desirable to develop conditions on the joint distribution function of $(Y,X)$ under which model (1.1) is fulfilled. In contrast to the thoughts on identifiability here, such a question addresses the solvability of (1.1), that is, the issue of existence of a solution instead of uniqueness.
Appendix A Uniqueness of Solutions to Ordinary Differential Equations
In this section, two basic results about ordinary differential equations and uniqueness of possible solutions are given. Theorem A.2 is slightly modified compared to the version of Forster (1999, p. 102), so the proof is presented as well.
Lemma A.1
Theorem A.2
(see Forster (1999, p. 102) for a related version) Let and be a set such that . Moreover, let be continuous with respect to both components and continuously differentiable with respect to the second component. Then, for all any solution of the initial value problem
is unique.
Proof: Let be two solutions of the mentioned initial value problem. Since
is compact, there exists some such that for all . Consider the distance . Then for all
Gronwall’s Inequality leads to (set ).
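The structure of this uniqueness argument can be summarized in two lines. The notation below is assumed for illustration only and need not coincide with the one used in the theorem: $f$ is the right-hand side of the differential equation, $h$ and $\tilde h$ are two solutions with the same initial value at $y_1$, $d(y)=|h(y)-\tilde h(y)|$ is their distance, and $L$ is a Lipschitz constant of $f$ in its second argument, which is assumed to exist because the partial derivative of $f$ is bounded on a compact set.

```latex
\begin{align*}
d(y) &= \left|\int_{y_1}^{y}\big(f(u,h(u))-f(u,\tilde h(u))\big)\,du\right|
      \le L\left|\int_{y_1}^{y} d(u)\,du\right|,\\
d(y) &\le 0\cdot\exp\big(L\,|y-y_1|\big)=0
      \qquad\text{(Gronwall's inequality with constant term } 0\text{)}.
\end{align*}
```

Hence the two solutions coincide on the whole interval under consideration.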
Appendix B Proof of Theorem 2.1
Consider a compact interval and recall equation (1.7). Assumption 1 ensures . First, it is shown that as defined in (1.7) is the unique solution to (1.5) on
. For the moment assume
and defineWith the choices and , Theorem A.2 ensures uniqueness of the solution to
By straightforward calculations, it can be verified that (1.7) is indeed a solution to this initial value problem. Since for all , this solution holds for arbitrarily large . Hence, by letting tend to infinity uniqueness of on is obtained.
Now, consider an arbitrary value . Then, if the previous initial condition is replaced by
for some , the same reasoning as before can be used to show that the differential equation
is uniquely solved under this constraint by
for all , where the last equation follows from (1.7). To fulfil the previous scale constraint it is required that
Since this in turn results in expression (1.7) for all , is identified for all . Choosing arbitrarily close to results in
When proceeding analogously for with the initial condition
for some , one has
Recall for all and let . Due to the continuous differentiability of in , one has
On the other hand, it holds that
so that
This leads to the uniqueness of solution (1.10), since uniqueness of was already shown in Section 2.2.
Inserting yields the second part of the assertion, while identification of and
as the conditional mean and standard deviation follows from standard arguments.
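The standard arguments referred to here can be sketched as follows, assuming the notation $h(Y)=g(X)+\sigma(X)\varepsilon$ of model (1.1) with $\sigma>0$, $E[\varepsilon]=0$, $\operatorname{Var}(\varepsilon)=1$ and $\varepsilon$ independent of $X$. Once $h$ is identified, the remaining components follow from the conditional distribution of $h(Y)$ given $X$:

```latex
\begin{align*}
E[h(Y)\mid X=x] &= g(x)+\sigma(x)\,E[\varepsilon] = g(x),\\
\operatorname{Var}(h(Y)\mid X=x) &= \sigma(x)^2\,\operatorname{Var}(\varepsilon) = \sigma(x)^2,\\
F_\varepsilon(e) &= P\!\left(\frac{h(Y)-g(X)}{\sigma(X)}\le e\right).
\end{align*}
```

All three right-hand sides depend only on $h$ and the joint distribution of $(Y,X)$.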
Acknowledgements
This work was supported by the DFG (Research Unit FOR 1735 Structural Inference in Statistics: Adaptation and Efficiency).
Moreover, I would like to thank Natalie Neumeyer and Ingrid Van Keilegom for their very helpful suggestions and comments on the project.
References
 Bellman (1953) R. Bellman. Stability theory of differential equations. Dover Publications, 1953.
 Bickel and Doksum (1981) P. J. Bickel and K. A. Doksum. An analysis of transformations revisited. Journal of the American Statistical Association, 76:296–311, 1981.
 Box and Cox (1964) G. E. P. Box and D. R. Cox. An analysis of transformations. Journal of the Royal Statistical Society. Series B, 26(2):211–252, 1964.
 Chiappori et al. (2015) P.-A. Chiappori, I. Komunjer, and D. Kristensen. Nonparametric identification and estimation of transformation models. Journal of Econometrics, 188(1):22–39, 2015.
 Ekeland et al. (2004) I. Ekeland, J. J. Heckman, and L. Nesheim. Identification and estimation of hedonic models. Journal of Political Economy, 112(1):60–109, 2004.
 Forster (1999) O. Forster. Analysis 2, volume 4. Vieweg, 1999.
 Grönwall (1919) T. H. Grönwall. Note on the derivatives with respect to a parameter of the solutions of a system of differential equations. Annals of Mathematics, 20(4):292–296, 1919.
 Horowitz (1996) J. L. Horowitz. Semiparametric estimation of a regression model with an unknown transformation of the dependent variable. Econometrica, 64(1):103–137, 1996.
 Horowitz (2009) J. L. Horowitz. Semiparametric and nonparametric methods in econometrics. Springer, 2009.
 Khan et al. (2011) S. Khan, Y. Shin, and E. Tamer. Heteroscedastic transformation models with covariate dependent censoring. Journal of Business & Economic Statistics, 29(1):40–48, 2011.

 Kloodt (2019) N. Kloodt. Nonparametric Transformation Models. PhD thesis, Universität Hamburg, 2019. Available at https://ediss.sub.uni-hamburg.de/volltexte/2019/10034/pdf/Dissertation.pdf.
 Linton et al. (2008) O. Linton, S. Sperlich, and I. Van Keilegom. Estimation of a semiparametric transformation model. The Annals of Statistics, 36(2):686–718, 2008.
 Neumeyer et al. (2016) N. Neumeyer, H. Noh, and I. Van Keilegom. Heteroscedastic semiparametric transformation models: estimation and testing for validity. Statistica Sinica, 26:925–954, 2016.
 Vanhems and Van Keilegom (2019) A. Vanhems and I. Van Keilegom. Semiparametric transformation model with endogeneity: a control function approach. Econometric Theory, 2019. to appear.
 Wang and Wang (2018) Q. Wang and X. Wang. Analysis of censored data under heteroscedastic transformation regression models with unknown transformation function. The Canadian Journal of Statistics, 46(2):233–245, 2018.
 Zellner and Revankar (1969) A. Zellner and N. S. Revankar. Generalized production functions. Review of Economic Studies, 36(2):241–250, 1969.
 Zhou et al. (2009) X.H. Zhou, H. Lin, and E. Johnson. Nonparametric heteroscedastic transformation regression models for skewed data with an application to health care costs. Journal of the Royal Statistical Society B, 70:1029–1047, 2009.