If the size of the tree is not limited, the algorithm will overfit the training data. [...] We first compare the performance of the OLS model with that of several ML algorithms, using the restricted set of variables that are standard in the wellbeing literature; we then carry out an analogous comparison for the extended set of individual characteristics. [...] The greater the average drop in the R-squared, the more important the variable is in predicting wellbeing. [...] The extended sets of variables we consider here include all of the variables available in the 2013 waves of the SOEP and Gallup, and Wave 3 of the UKHLS. [...] For every algorithm, the R-squared for the extended set of variables is around double that for the restricted set.
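The "average drop in the R-squared" measure can be illustrated with a minimal sketch. It assumes a permutation-style procedure: one variable's values are shuffled, the R-squared of an already-fitted model is recomputed, and the drop is averaged over repeated shuffles. The function names, the synthetic data, and the fixed linear `predict` function are hypothetical stand-ins for illustration, not the models or data used in the paper.

```python
import random


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def permutation_importance(predict, rows, y, n_repeats=20, seed=0):
    """Average drop in R-squared when each variable is shuffled in turn."""
    rng = random.Random(seed)
    baseline = r_squared(y, [predict(r) for r in rows])
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving the other variables intact.
            column = [r[j] for r in rows]
            rng.shuffle(column)
            permuted = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            permuted_r2 = r_squared(y, [predict(r) for r in permuted])
            drops.append(baseline - permuted_r2)
        importances.append(sum(drops) / n_repeats)
    return importances


# Synthetic illustration (hypothetical data): three variables, of which the
# first matters most and the third is ignored by the model entirely.
data_rng = random.Random(42)
rows = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in range(500)]


def predict(row):
    # Stand-in for a fitted model; the coefficients are assumptions.
    return 2.0 * row[0] + 0.5 * row[1]


y = [predict(r) + data_rng.gauss(0, 0.5) for r in rows]
importances = permutation_importance(predict, rows, y)
```

In this sketch the first variable shows the largest average drop, and the third shows none, because the stand-in model never uses it. In practice the drop would be computed on held-out data rather than on the sample used to fit the model.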
- Pages: 38
- Published in: United Kingdom