Optimal Categorical Attribute Transformation for Granularity Change in Relational Databases for Binary Decision Problems in Educational Data Mining

02/28/2017
by Paulo J. L. Adeodato, et al.

This paper presents an approach for changing data granularity in hierarchical databases for binary decision problems by applying regression to categorical attributes at the lower grain levels. Attributes of a lower-hierarchy entity in the relational database have their information content optimized through regression on the histogram of categories, trained on a small, exclusive labelled sample, instead of being summarized by the usual mode category of the distribution. The paper validates the approach on a binary decision task for assessing the quality of secondary schools, focusing on how logistic regression transforms student and teacher attributes into school attributes. Experiments were carried out on public datasets of Brazilian schools, using 10-fold cross-validation to compare the ranking scores, which were also produced by logistic regression. Measured by the maximum Kolmogorov-Smirnov distance and the area under the ROC curve at the 0.01 significance level, the proposed approach performed better than the usual distribution-mode transformation and on par with the expert weighting approach.
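To make the contrast between the two transformations concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical student-level attribute with categories `low`, `mid` and `high`, a toy labelled sample of four schools, and a hand-written gradient-descent logistic regression. All names and data here are illustrative assumptions. Each school's student records are summarised either by their mode category or by a regression score computed from the category histogram.

```python
import math
from collections import Counter

# Hypothetical categories of a student-level attribute (an assumption).
CATEGORIES = ["low", "mid", "high"]

def mode_category(values):
    """Baseline: summarise a school's records by the most frequent category."""
    return Counter(values).most_common(1)[0][0]

def category_histogram(values, categories=CATEGORIES):
    """Relative frequency of each category among one school's records."""
    counts = Counter(values)
    return [counts[c] / len(values) for c in categories]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression by plain batch gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def school_score(values, w, b):
    """School-level attribute: regression score of the category histogram."""
    hist = category_histogram(values)
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, hist)))

# Toy labelled sample of schools: (student records, binary quality label).
labelled = [
    (["high"] * 6 + ["mid"] * 3 + ["low"] * 1, 1),
    (["high"] * 5 + ["mid"] * 4 + ["low"] * 1, 1),
    (["low"] * 6 + ["mid"] * 3 + ["high"] * 1, 0),
    (["low"] * 5 + ["mid"] * 4 + ["high"] * 1, 0),
]
X = [category_histogram(values) for values, _ in labelled]
y = [label for _, label in labelled]
w, b = fit_logistic(X, y)

# Two unlabelled schools that share the same mode but differ in distribution.
school_e = ["mid"] * 5 + ["high"] * 4 + ["low"] * 1
school_f = ["mid"] * 5 + ["low"] * 4 + ["high"] * 1
score_e = school_score(school_e, w, b)
score_f = school_score(school_f, w, b)

print(mode_category(school_e), mode_category(school_f))
print(round(score_e, 3), round(score_f, 3))
```

In this toy setting both unlabelled schools collapse to the same mode category, while the histogram-based score still separates them. This is the kind of information loss that the regression-based transformation is meant to avoid.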


