Date of Award
Master of Science
Advances in technology and the growing range of data sources have made high-dimensional data available in many fields, including healthcare, bioinformatics, medicine, epidemiology, economics, finance, sociology, and climatology. Outliers are common in such datasets, arising from technical errors, heterogeneous sources, or the effects of confounding variables. Because outliers are often difficult to detect in high-dimensional data, standard approaches may fail to model the data and can produce misleading results. In this thesis, we studied the Huber and Tukey M-estimators for linear regression, which automatically down-weight outliers and provide a good fit. We also investigated two variable selection methods, LASSO and LAD-LASSO. In addition, we performed a simulation study to compare the different estimators on pure and contaminated data. Finally, we analyzed cardiovascular data to model systolic and diastolic blood pressure. The results show that the Huber and Tukey M-estimators perform better for this dataset.
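To illustrate how an M-estimator down-weights outliers, the following is a minimal Python sketch using iteratively reweighted least squares (IRLS). It is not taken from the thesis: the function names (`huber_weights`, `tukey_weights`, `m_estimate`), the MAD-based residual scale, and the IRLS fitting loop are assumptions made for illustration. The tuning constants 1.345 (Huber) and 4.685 (Tukey bisquare) are the conventional choices that give roughly 95% efficiency under normal errors. Because the Tukey weights can drop to exactly zero, the sketch assumes the Tukey fit is started from a robust initial estimate, such as the Huber fit.

```python
import numpy as np

def huber_weights(r, c=1.345):
    """Huber weights: 1 for small scaled residuals, c/|r| beyond the cutoff c."""
    a = np.abs(r)
    return np.where(a <= c, 1.0, c / np.maximum(a, 1e-12))

def tukey_weights(r, c=4.685):
    """Tukey bisquare weights: smooth decay to exactly 0 for |r| > c."""
    u = r / c
    return np.where(np.abs(u) <= 1.0, (1.0 - u**2) ** 2, 0.0)

def m_estimate(X, y, weight_fn, beta0=None, n_iter=50, tol=1e-8):
    """Illustrative IRLS fit of a linear model under a given weight function.

    X is the design matrix (include a column of ones for an intercept).
    beta0 is an optional starting value; ordinary least squares is used if omitted.
    """
    if beta0 is None:
        beta = np.linalg.lstsq(X, y, rcond=None)[0]
    else:
        beta = np.asarray(beta0, dtype=float)
    for _ in range(n_iter):
        resid = y - X @ beta
        # Robust scale: normalized median absolute deviation (assumed choice).
        scale = np.median(np.abs(resid - np.median(resid))) / 0.6745
        scale = max(scale, 1e-12)
        w = weight_fn(resid / scale)
        # Weighted least squares via row scaling by sqrt(w).
        sw = np.sqrt(w)
        beta_new = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta
```

A typical use would be to compute `beta_huber = m_estimate(X, y, huber_weights)` and then `m_estimate(X, y, tukey_weights, beta0=beta_huber)`, so that the bisquare fit starts from coefficients that are already resistant to gross outliers.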
Received from ProQuest
Das, Jagannath, "Comparison Of Different Robust Methods In Linear Regression And Applications In Cardiovascular Data" (2023). Open Access Theses & Dissertations. 3780.