Winning Models for GPA, Grit, and Layoff in the Fragile Families Challenge

05/29/2018 ∙ by Daniel E Rigobon, et al.

In this paper, we discuss and analyze our approach to the Fragile Families Challenge. The challenge involved predicting six outcomes for 4,242 children from disadvantaged families across the United States. The data consisted of over 12,000 features (covariates) about the children and their parents, schools, and overall environments from birth to age 9. Our approach relied primarily on existing data science techniques, including: (1) data preprocessing: elimination of low-variance features, imputation of missing data, and construction of composite features; (2) feature selection through univariate Mutual Information and extraction of non-zero LASSO coefficients; (3) three machine learning models: Random Forest, Elastic Net, and Gradient-Boosted Trees; and finally (4) prediction aggregation according to performance. The top-performing submissions produced winning out-of-sample predictions for three outcomes: GPA, grit, and layoff. However, predictions were at most 20% better than a baseline that predicted the mean value of the training data of each outcome.
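
As a rough illustration of steps (1) and (4) above, the following minimal sketch shows mean imputation, low-variance filtering, and performance-weighted aggregation in plain Python. It is not the authors' implementation: the function names, the variance threshold, the choice of mean imputation, and the inverse-validation-error weighting are all assumptions made for the example, since the abstract does not specify them.

```python
from statistics import mean, pvariance


def impute_mean(column):
    """Replace missing entries (None) with the mean of the observed values.

    Mean imputation is an assumption; the abstract only says "imputation".
    """
    observed = [v for v in column if v is not None]
    fill = mean(observed) if observed else 0.0
    return [fill if v is None else v for v in column]


def drop_low_variance(columns, threshold=1e-8):
    """Keep only features whose population variance exceeds `threshold`.

    `threshold` is a hypothetical cutoff chosen for illustration.
    """
    return {name: col for name, col in columns.items()
            if pvariance(col) > threshold}


def aggregate(predictions, val_errors):
    """Combine per-model predictions into one weighted average.

    Each model's weight is 1 / (its validation error), so better-performing
    models count more. This weighting rule is an assumption, one of several
    ways to aggregate "according to performance".
    """
    weights = {model: 1.0 / val_errors[model] for model in predictions}
    total = sum(weights.values())
    n = len(next(iter(predictions.values())))
    return [
        sum(weights[m] * predictions[m][i] for m in predictions) / total
        for i in range(n)
    ]
```

For example, with two models whose validation errors are 0.5 and 1.0, the first model receives twice the weight of the second in the combined prediction.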
