Outlier detection under interval uncertainty: Algorithmic solvability and computational complexity
Abstract
In many application areas, it is important to detect outliers. The traditional engineering approach to outlier detection is to start with some “normal” values x1,…,xn, compute the sample average E and the sample standard deviation σ, and then mark a value x as an outlier if x is outside the k0-sigma interval [E − k0 · σ, E + k0 · σ] (for some pre-selected parameter k0). In practice, we often have only interval ranges [x̲i, x̄i] for the normal values x1,…,xn. In this case, we only have intervals of possible values for the bounds E − k0 · σ and E + k0 · σ. We can therefore identify outliers as values that are outside all k0-sigma intervals. Once we identify a value as an outlier for a fixed k0, it is also desirable to find out to what degree this value is an outlier, i.e., what is the largest value k0 for which this value is an outlier. In this thesis, we analyze the computational complexity of these outlier detection problems and provide efficient algorithms that solve some of these problems (under reasonable conditions).
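The two outlier criteria described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's algorithm. The function names are hypothetical, and σ is assumed to be the divide-by-n form of the standard deviation, which the abstract does not specify. For the interval case, the sketch uses brute force over all 2^n combinations of interval endpoints. This works because E + k0 · σ is a convex function of x1,…,xn and E − k0 · σ is a concave one (for k0 ≥ 0), so their extreme values over the box of intervals are attained at its corners. The enumeration is exponential in n and is only practical for very small samples.

```python
from itertools import product
from statistics import fmean, pstdev


def k_sigma_bounds(values, k0):
    """Return (E - k0*sigma, E + k0*sigma) for exactly known values."""
    e = fmean(values)
    s = pstdev(values)  # assumption: sigma computed with the divide-by-n form
    return e - k0 * s, e + k0 * s


def is_outlier(x, values, k0):
    """Classical check: x lies outside the k0-sigma interval."""
    lo, hi = k_sigma_bounds(values, k0)
    return x < lo or x > hi


def is_guaranteed_outlier(x, intervals, k0):
    """Interval check: x lies outside the k0-sigma interval for every
    possible choice of values from the (lower, upper) ranges.

    Brute force over all 2**n endpoint combinations (illustration only).
    """
    min_lo = float("inf")
    max_hi = float("-inf")
    for corner in product(*intervals):
        lo, hi = k_sigma_bounds(corner, k0)
        min_lo = min(min_lo, lo)  # smallest possible lower bound
        max_hi = max(max_hi, hi)  # largest possible upper bound
    return x < min_lo or x > max_hi


if __name__ == "__main__":
    exact = [1.0, 2.0, 3.0]
    ranges = [(0.9, 1.1), (1.9, 2.1), (2.9, 3.1)]
    print(is_outlier(3.7, exact, 2))              # True
    print(is_guaranteed_outlier(3.7, ranges, 2))  # False
    print(is_guaranteed_outlier(10.0, ranges, 2)) # True
```

In this example, the value 3.7 is an outlier with respect to the exact values 1, 2, 3, but once each value is only known to within ±0.1 it can no longer be flagged with certainty, because some admissible choice of values yields a k0-sigma interval that contains it.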
Subject Area
Computer science
Recommended Citation
Patangay, Praveen, "Outlier detection under interval uncertainty: Algorithmic solvability and computational complexity" (2003). ETD Collection for University of Texas, El Paso. AAIEP10373.
https://scholarworks.utep.edu/dissertations/AAIEP10373