Publication Date
7-2004
Abstract
In many application areas, it is important to detect outliers. The traditional engineering approach to outlier detection is to start with some "normal" values x1,...,xn, compute the sample average E and the sample standard deviation sigma, and then mark a value x as an outlier if x is outside the k0-sigma interval [E-k0*sigma,E+k0*sigma] (for some pre-selected parameter k0). In real life, we often know only interval ranges [xi] for the normal values x1,...,xn. In this case, we know only intervals of possible values for the bounds E-k0*sigma and E+k0*sigma. We can therefore identify outliers as values that are outside all k0-sigma intervals. In this paper, we analyze the computational complexity of these outlier detection problems, and provide efficient algorithms that solve some of these problems (under reasonable conditions).
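As an illustration of the traditional (point-valued) test described in the abstract, a minimal Python sketch might look as follows. The function name `is_outlier`, the sample data, and the choice of the n-1 form of the sample standard deviation are assumptions made for this example, not details taken from the paper. The sketch covers only the case of exactly known x1,...,xn; it does not attempt the interval-valued case that the paper analyzes.

```python
import statistics

def is_outlier(x, normal_values, k0):
    """Classical k0-sigma test for exactly known "normal" values.

    Assumption: sigma is the sample standard deviation with the
    n-1 denominator (statistics.stdev); the abstract does not say
    which form is intended.
    """
    E = statistics.mean(normal_values)       # sample average E
    sigma = statistics.stdev(normal_values)  # sample standard deviation
    lower = E - k0 * sigma
    upper = E + k0 * sigma
    # x is an outlier if it lies outside [E - k0*sigma, E + k0*sigma]
    return x < lower or x > upper

# Hypothetical measurements, for illustration only
normal_values = [9.8, 10.1, 10.0, 9.9, 10.2]
print(is_outlier(10.3, normal_values, k0=3.0))
print(is_outlier(11.0, normal_values, k0=3.0))
```

With these hypothetical values and k0 = 3, the value 10.3 falls inside the k0-sigma interval and 11.0 falls outside it.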
Original file: UTEP-CS-03-10a
tr03-10b.pdf (178 kB)
Short version: UTEP-CS-03-10b
tr03-10e.pdf (314 kB)
Updated version: UTEP-CS-03-10e
Comments
Technical Report: UTEP-CS-03-10f
Short version published in: Ivan Lirkov, Svetozar Margenov, Jerzy Wasniewski, and Plamen Yalamov (eds.), Large-Scale Scientific Computing, Proceedings of the 4th International Conference LSSC'2003, Sozopol, Bulgaria, June 4-8, 2003, Springer Lecture Notes in Computer Science, 2004, Vol. 2907, pp. 276-283; full paper published in Reliable Computing, 2005, Vol. 11, No. 1, pp. 59-76.