Publication Date



Technical Report: UTEP-CS-24-12a

To appear in Proceedings of the NAFIPS International Conference on Fuzzy Systems, Soft Computing, and Explainable AI NAFIPS'2024, South Padre Island, Texas, May 27-29, 2024


In statistics, the most widely used way to describe how much the elements of a sample differ from each other is the standard deviation. This characteristic has the useful property of being decomposable: e.g., to compute the mean and standard deviation of income over the whole US, it is sufficient to know the number of people, the mean, and the standard deviation for each state; this state-by-state information uniquely determines the overall standard deviation. However, for some purposes, e.g., for gauging income inequality, the standard deviation is not very adequate: it gives too much weight to outliers such as billionaires and thus does not provide a good understanding of how unequal the incomes of the majority of people are. For this purpose, Theil introduced decomposable modifications of the standard deviation that are now called Theil indices. Crudely speaking, these indices are based on using the logarithm instead of the square. Other researchers found other decomposable modifications that use a power law. In this paper, we provide a complete description of all decomposable versions of the Theil index. Specifically, we prove that the currently known functions are the only ones for which the corresponding versions of the Theil index are decomposable, so no other decomposable versions are possible. A similar result was previously proven under the additional assumption of linearity; our proof shows that this result also holds in the general case, without assuming linearity.
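To make the notion of decomposability concrete, the following minimal Python sketch reconstructs overall statistics from per-group summaries alone. It is an illustration under stated assumptions, not the construction used in the paper: the function names are hypothetical, the standard deviation is the population version (dividing by the group size), and the logarithm-based index is assumed to be the standard Theil T index, T = (1/N) Σ (x_i/μ) ln(x_i/μ), which the abstract itself does not spell out.

```python
import math


def summarize(xs):
    """Return (count, mean, population std, Theil T) for one group.

    Assumes all values are positive, as the logarithm requires.
    """
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    theil = sum((x / mean) * math.log(x / mean) for x in xs) / n
    return n, mean, math.sqrt(var), theil


def combine(summaries):
    """Reconstruct overall (count, mean, std, Theil T) from group summaries.

    Only the per-group tuples are used; the raw data are never touched.
    """
    total_n = sum(n for n, _, _, _ in summaries)
    total_mean = sum(n * m for n, m, _, _ in summaries) / total_n

    # Standard deviation: each group's second moment is std^2 + mean^2.
    second_moment = sum(n * (s * s + m * m) for n, m, s, _ in summaries) / total_n
    total_std = math.sqrt(max(second_moment - total_mean ** 2, 0.0))

    # Theil T: weight each group by its share of the total income,
    # then add a within-group term and a between-group term.
    total_theil = 0.0
    for n, m, _, t in summaries:
        share = (n * m) / (total_n * total_mean)
        total_theil += share * (t + math.log(m / total_mean))

    return total_n, total_mean, total_std, total_theil
```

For example, calling `combine([summarize(g) for g in groups])` on a list of per-state income lists should agree, up to rounding, with `summarize` applied to all incomes pooled together.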