Robust Variable Selection in Multiple Linear Regression Via Penalized Least Trimmed Squares
Variable selection has been studied using different approaches. Its growing importance lies in its numerous applications to high-dimensional data from experiments and natural phenomena. Models must often be constructed from such data, based on significant variables, for estimation or prediction purposes. This demands not just any variable selection method, but one that is robust, computationally efficient, and endowed with other desirable statistical properties. Besides the high dimensionality of such data, the presence of outliers is common due to heterogeneous sources. Though outliers often contain useful information, they can unduly influence non-robust estimators and produce misleading results. This is the case for ordinary least squares regression, which becomes biased and inefficient when its assumptions are violated. Many robust loss functions and penalization techniques for selection have been proposed in the literature. However, the choices of loss function, penalty function, and optimal tuning parameter, together with their implementation, are paramount to the robustness and efficiency of variable selection. This work proposes a penalized robust variable selection method for multiple linear regression based on the least trimmed squares loss function. The proposed method employs a robust tuning parameter criterion constructed through BIC for model selection. It is implemented via a fast computational algorithm with a high breakdown point that does not depend on the number of predictors in the data.
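The penalized least trimmed squares idea described in the abstract can be sketched as alternating two steps: fit a penalized regression on a "clean" subset of h observations, then re-select the h observations with the smallest squared residuals (a concentration step). The sketch below is a minimal illustration under stated assumptions, not the dissertation's actual algorithm: the function names, the subset fraction h = 0.75n, the plain coordinate-descent lasso penalty, and the fixed iteration counts are all illustrative choices made here.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Minimal coordinate-descent lasso (no intercept), for illustration only.

    Minimizes (1/(2n)) * ||y - X b||^2 + lam * ||b||_1
    via soft-thresholding updates on each coordinate.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_norm2 = (X ** 2).sum(axis=0)  # per-column squared norms
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual excluding coordinate j.
            r_j = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r_j
            # Soft-threshold update for coordinate j.
            beta[j] = np.sign(rho) * max(abs(rho) - lam * n, 0.0) / col_norm2[j]
    return beta

def sparse_lts(X, y, lam, h=None, n_csteps=20, seed=0):
    """Illustrative penalized LTS: lasso fits on trimmed subsets (C-steps).

    h is the number of observations kept; 0.75n here is an assumed default.
    """
    n, _ = X.shape
    if h is None:
        h = int(0.75 * n)
    rng = np.random.default_rng(seed)
    subset = rng.choice(n, size=h, replace=False)  # random initial subset
    for _ in range(n_csteps):
        beta = lasso_cd(X[subset], y[subset], lam)
        resid2 = (y - X @ beta) ** 2
        new = np.argsort(resid2)[:h]  # keep the h best-fitting points
        if set(new.tolist()) == set(subset.tolist()):
            break  # C-steps have converged
        subset = new
    return beta, np.sort(subset)
```

On contaminated synthetic data (a few responses shifted far from the regression surface), the concentration steps tend to exclude the outlying points from the fitting subset, so the lasso coefficients are estimated from the uncontaminated bulk. A BIC-type criterion, as the abstract describes, would then be evaluated over a grid of `lam` values on the trimmed subset to pick the tuning parameter; that search is omitted here for brevity.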
Kesseku, Reagan, "Robust Variable Selection in Multiple Linear Regression Via Penalized Least Trimmed Squares" (2021). ETD Collection for University of Texas, El Paso. AAI28541313.