Using LASSO to Assist Imputation and Predict Child Well-being

Diana Stanescu, Erik Wang, Soichiro Yamauchi

    Research output: Contribution to journal (Article)

    Abstract

    This article documents an approach to predicting children’s well-being using data from the Fragile Families and Child Wellbeing Study, which are representative of births in large U.S. cities. The authors use the least absolute shrinkage and selection operator (LASSO) to preprocess the data. They then apply the Amelia algorithm to impute missing data. Finally, they use LASSO again for prediction with the imputed data. The authors report the performance of this approach for six outcome variables. The approach achieves the best performance for the variable material hardship. The out-of-sample mean squared error of the authors’ prediction is 0.019, the lowest among all submissions in the Fragile Families Challenge. The authors find that among variables with high predictive power, variables from mother surveys dominate. Furthermore, components of material hardship in the past strongly predict current material hardship.
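The three-stage pipeline described in the abstract (LASSO screening, imputation, LASSO prediction) can be illustrated with a minimal sketch. This is not the authors' code. The data are synthetic, the penalty values are arbitrary, and column-mean filling is used only as a simple stand-in for the Amelia imputation step. The way screening is done here (a provisional mean fill followed by keeping columns with non-zero coefficients) is also an assumption. The helper names `lasso_cd`, `fill`, and `take` are hypothetical.

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - b0 - X b||^2 + lam*||b||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    cols = [[X[i][j] for i in range(n)] for j in range(p)]
    norms = [sum(v * v for v in c) / n for c in cols]
    beta = [0.0] * p
    b0 = sum(y) / n
    resid = [yi - b0 for yi in y]
    for _ in range(n_iter):
        for j in range(p):
            if norms[j] == 0.0:
                continue
            c = cols[j]
            # Correlation of column j with the residual that excludes column j.
            rho = sum(c[i] * (resid[i] + c[i] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam) / norms[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= c[i] * delta
                beta[j] = new
        # Re-centre the unpenalised intercept.
        shift = sum(resid) / n
        b0 += shift
        resid = [r - shift for r in resid]
    return b0, beta

def column_means(X):
    """Mean of the observed (non-None) entries in each column."""
    means = []
    for j in range(len(X[0])):
        obs = [row[j] for row in X if row[j] is not None]
        means.append(sum(obs) / len(obs) if obs else 0.0)
    return means

def fill(X, means):
    """Replace missing entries with the supplied column means."""
    return [[means[j] if v is None else v for j, v in enumerate(row)] for row in X]

def take(X, cols):
    """Keep only the listed columns."""
    return [[row[j] for j in cols] for row in X]

def predict(X, b0, beta):
    return [b0 + sum(b * v for b, v in zip(beta, row)) for row in X]

def mse(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

# Synthetic data (assumption): 8 predictors, only the first two matter,
# and about 10% of predictor values are missing at random.
rng = random.Random(0)
n, p = 200, 8
X_full = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [2.0 * r[0] - 1.5 * r[1] + rng.gauss(0, 0.1) for r in X_full]
X_miss = [[None if rng.random() < 0.1 else v for v in r] for r in X_full]
X_train, X_test = X_miss[:150], X_miss[150:]
y_train, y_test = y[:150], y[150:]

# Stage 1: LASSO screening on provisionally filled data.
means = column_means(X_train)
_, beta_screen = lasso_cd(fill(X_train, means), y_train, lam=0.1)
selected = [j for j, b in enumerate(beta_screen) if b != 0.0]

# Stage 2: imputation on the reduced predictor set (mean fill as a stand-in).
sel_means = [means[j] for j in selected]
X_train_imp = fill(take(X_train, selected), sel_means)
X_test_imp = fill(take(X_test, selected), sel_means)

# Stage 3: LASSO prediction and out-of-sample mean squared error.
b0, beta = lasso_cd(X_train_imp, y_train, lam=0.05)
test_mse = mse(y_test, predict(X_test_imp, b0, beta))
print("selected columns:", selected)
print("out-of-sample MSE:", round(test_mse, 3))
```

In practice the imputation stage would use a proper multiple-imputation tool and the penalty would be chosen by cross-validation. The sketch only shows how screening first reduces the number of variables that the imputation step has to handle.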
    Original language: English
    Pages (from-to): 1-21
    Journal: Socius: Sociological Research for a Dynamic World
    Volume: 5
    Publication status: Published - 2019
