Comparison of Different Robust Methods in Linear Regression and Applications in Cardiovascular Data
Abstract
Advances in technology and the growing number of data sources have made high-dimensional data available in many fields, including healthcare, bioinformatics, medicine, epidemiology, economics, finance, sociology, and climatology. Such datasets often contain outliers caused by technical errors, heterogeneous sources, or the effects of confounding variables. Because outliers are difficult to detect in high-dimensional data, standard approaches may fail to model such data and can produce misleading results. In this thesis, we study the Huber and Tukey M-estimators for linear regression, which automatically down-weight outliers while still providing a good fit. We also investigate two variable selection methods: LASSO and LAD-LASSO. In addition, we perform a simulation study to compare the estimators on pure and contaminated data. Finally, we analyze cardiovascular data to model systolic and diastolic blood pressure. The results show that the Huber and Tukey M-estimators perform better on this dataset.
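To illustrate how M-estimators of this kind down-weight outliers, the following is a minimal sketch and not the implementation used in the thesis. It assumes iteratively reweighted least squares (IRLS) started from ordinary least squares, a MAD-based scale estimate, and the conventional tuning constants 1.345 (Huber) and 4.685 (Tukey bisquare). The function names and the synthetic contaminated data are hypothetical and chosen only for illustration.

```python
import numpy as np

def huber_weights(r, c=1.345):
    """Huber weights: 1 for |r| <= c, c/|r| beyond that."""
    a = np.abs(r)
    return np.where(a <= c, 1.0, c / np.maximum(a, 1e-12))

def tukey_weights(r, c=4.685):
    """Tukey bisquare weights: (1 - (r/c)^2)^2 for |r| <= c, else 0."""
    u = r / c
    return np.where(np.abs(u) <= 1.0, (1.0 - u**2) ** 2, 0.0)

def m_estimate(X, y, weight_fn, n_iter=50, tol=1e-8):
    """Fit a regression M-estimator by IRLS (assumed scheme, OLS start)."""
    X1 = np.column_stack([np.ones(len(y)), X])      # add intercept column
    beta = np.linalg.lstsq(X1, y, rcond=None)[0]    # OLS starting value
    for _ in range(n_iter):
        resid = y - X1 @ beta
        # Robust scale: normalized median absolute deviation of residuals
        scale = np.median(np.abs(resid - np.median(resid))) / 0.6745
        scale = max(scale, 1e-12)
        w = weight_fn(resid / scale)
        sw = np.sqrt(w)
        # Weighted least squares via row scaling by sqrt(weights)
        beta_new = np.linalg.lstsq(X1 * sw[:, None], y * sw, rcond=None)[0]
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Synthetic example (hypothetical data): true model y = 1 + 2x + noise,
# with every 10th response shifted upward by 30 to act as an outlier.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
y = 1.0 + 2.0 * x + rng.normal(size=200)
y[::10] += 30.0

X1 = np.column_stack([np.ones_like(x), x])
beta_ols = np.linalg.lstsq(X1, y, rcond=None)[0]
beta_huber = m_estimate(x, y, huber_weights)
beta_tukey = m_estimate(x, y, tukey_weights)
print("OLS:  ", beta_ols)
print("Huber:", beta_huber)
print("Tukey:", beta_tukey)
```

In this sketch the Huber weights shrink the influence of large residuals smoothly, while the Tukey weights drop it to zero beyond the cutoff, so the contaminated responses pull the OLS intercept away from its true value far more than they affect either robust fit.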
Subject Area
Statistics|Information Technology|Applied Mathematics
Recommended Citation
Das, Jagannath, "Comparison of Different Robust Methods in Linear Regression and Applications in Cardiovascular Data" (2023). ETD Collection for University of Texas, El Paso. AAI30521760.
https://scholarworks.utep.edu/dissertations/AAI30521760