Short version published in Proceedings of the IEEE International Conference on Fuzzy Systems FUZZ-IEEE'2003, St. Louis, Missouri, May 25-28, 2003, pp. 67-73. Full paper published in Reliable Computing, 2004, Vol. 10, No. 2, pp. 83-106.


To design data processing algorithms with the smallest average processing time, we need to know what this "average" stands for. At first glance, it may seem that real-life data are truly "chaotic" and that no probabilities can be assigned at all: today, we may apply our software package to elementary particles; tomorrow, to distances between the stars; etc. However, contrary to this intuition, there are stable probabilities in real-life data. This fact was first discovered in 1881 by Simon Newcomb, who noticed that the first pages of logarithm tables (which contain numbers starting with 1) are used more than the last ones (which contain numbers starting with 9). To check why, he took all physical constants from a reference book and counted how many of them start with 1. An intuitive expectation is that all 9 digits should be equally probable, so that about 11% (1/9) of the constants start with each digit. In reality, about 30% of these constants turned out to start with 1. In general, the fraction of constants that start with a digit d can be described as log10(d+1)-log10(d), i.e., (ln(d+1)-ln(d))/ln(10). We describe a new interval computations-related explanation for this empirical fact, and we explain its relationship with the lifetime of the Universe and with the general problem of determining subjective probabilities on finite and infinite intervals.
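
As a rough illustration of the first-digit distribution described above, the following Python sketch tabulates the normalized form log10(d+1)-log10(d), which equals (ln(d+1)-ln(d))/ln(10), for d = 1, ..., 9. The function names and the use of powers of 2 as a sample data set are assumptions made for this example only; they are not taken from the paper, and the sketch does not reproduce the paper's interval-based explanation.

```python
import math
from collections import Counter


def first_digit_probability(d: int) -> float:
    """Predicted fraction of values whose leading digit is d (1..9)."""
    if not 1 <= d <= 9:
        raise ValueError("leading digit must be in 1..9")
    # log10(d+1) - log10(d) is the same as (ln(d+1) - ln(d)) / ln(10).
    return math.log10(d + 1) - math.log10(d)


def leading_digit(n: int) -> int:
    """Leading decimal digit of a positive integer."""
    return int(str(n)[0])


# Predicted fractions for all nine digits.
probabilities = {d: first_digit_probability(d) for d in range(1, 10)}

# Illustrative sample (an assumption, not data from the paper):
# the leading digits of 2**1, 2**2, ..., 2**1000.
sample = [2 ** k for k in range(1, 1001)]
counts = Counter(leading_digit(n) for n in sample)
observed = {d: counts[d] / len(sample) for d in range(1, 10)}

for d in range(1, 10):
    print(f"digit {d}: predicted {probabilities[d]:.3f}, "
          f"observed {observed[d]:.3f}, uniform guess {1 / 9:.3f}")
```

The nine predicted fractions sum to 1, and the value for d = 1 is log10(2), about 0.30, compared with 1/9 (about 0.11) under the equal-probability assumption.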

tr02-23.pdf (278 kB)
Original file: UTEP-CS-02-23

tr02-23b.pdf (126 kB)
Short Version: UTEP-CS-02-23b