ROS Regression: Integrating Regularization and Optimal Scaling Regression

11/16/2016
by Jacqueline J. Meulman, et al.

In this paper we combine two important extensions of ordinary least squares regression: regularization and optimal scaling. Optimal scaling (sometimes also called optimal scoring) was originally developed for categorical data; the process finds quantifications for the categories that are optimal for the regression model in the sense that they maximize the multiple correlation. Although the optimal scaling method was initially developed for variables with a limited number of categories, optimal transformations of continuous variables are a special case. We consider a variety of transformation types: typically, step functions for categorical variables and smooth (spline) functions for continuous variables. Both types of functions can be restricted to be monotonic, preserving the ordinal information in the data. In addition to optimal scaling, three regularization methods are considered: Ridge regression, the Lasso, and the Elastic Net. The resulting method is called ROS Regression (Regularized Optimal Scaling Regression). We show that the basic OS algorithm provides straightforward and efficient estimation of the regularized regression coefficients, automatically yields the Group Lasso and Blockwise Sparse Regression, and extends them with monotonicity properties. We also show that Optimal Scaling linearizes nonlinear relationships between predictors and outcome, and improves the condition of the predictor correlation matrix, increasing (on average) the conditional independence of the predictors. Alternative options for regularization of either the regression coefficients or the category quantifications are mentioned. Extended examples are provided.

Keywords: Categorical Data, Optimal Scaling, Conditional Independence, Step Functions, Splines, Monotonic Transformations, Regularization, Lasso, Elastic Net, Group Lasso, Blockwise Sparse Regression.
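The alternating estimation idea behind optimal scaling can be illustrated with a minimal sketch. This is not the authors' implementation: it uses nominal (unrestricted) step-function quantifications and a closed-form Ridge update instead of the full ROS machinery, and all function names (`standardize`, `ridge_coefs`, `ros_ridge`) are illustrative choices.

```python
import numpy as np

def standardize(v):
    # center and scale a quantified predictor to mean 0, std 1
    v = np.asarray(v, dtype=float)
    return (v - v.mean()) / v.std()

def ridge_coefs(Z, y, lam):
    # closed-form Ridge estimate on the quantified predictors
    p = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(p), Z.T @ y)

def ros_ridge(X_cat, y, lam=1.0, n_iter=50):
    """Alternate between (1) a Ridge update of the regression weights and
    (2) nominal step-function quantification of each categorical predictor.
    X_cat: integer category codes, shape (n, p); y: numeric outcome."""
    n, p = X_cat.shape
    y = standardize(y)
    # initialize quantifications with the standardized category codes
    Z = np.column_stack([standardize(X_cat[:, j]) for j in range(p)])
    b = np.zeros(p)
    for _ in range(n_iter):
        b = ridge_coefs(Z, y, lam)
        for j in range(p):
            if b[j] == 0.0:
                continue
            # partial residual with predictor j left out
            r = y - Z @ b + Z[:, j] * b[j]
            # optimal nominal quantification: conditional means of r / b_j
            q = np.zeros(n)
            for c in np.unique(X_cat[:, j]):
                mask = X_cat[:, j] == c
                q[mask] = r[mask].mean() / b[j]
            Z[:, j] = standardize(q)
    return Z, b
```

With additive nonlinear category effects, the step-function quantifications absorb the nonlinearity, so the linear Ridge fit on the transformed predictors recovers a high multiple correlation; monotonic or spline restrictions would replace the unrestricted conditional-mean step in the inner loop.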

