Technical Report: UTEP-CS-18-03


To process huge amounts of data, one possibility is to combine some data points into granules and then process the resulting granules. For each group of data points, if we try to include all data points in a granule, the resulting granule often becomes too wide and thus rather useless; on the other hand, if the granule is too narrow, it includes only a few of the corresponding points and is thus also rather useless. The need for a trade-off between coverage and specificity is formalized as the principle of justified granularity. The specific form of this principle depends on the selection of a measure of specificity. Empirical analysis has shown that exponential and power law measures of specificity are the most adequate. In this paper, we show that natural symmetries explain this empirically observed efficiency.
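
To make the coverage/specificity trade-off concrete, the following is a minimal sketch, not the construction used in the report. It assumes one-dimensional data, interval granules whose endpoints are data points, and a quality score equal to the product of coverage and specificity. The function names, the parameters `alpha` and `beta`, and the exact exponential and power-type formulas are illustrative assumptions.

```python
import math


def coverage(points, lo, hi):
    """Fraction of the data points that fall inside the interval [lo, hi]."""
    return sum(lo <= x <= hi for x in points) / len(points)


def exp_specificity(width, alpha=1.0):
    """Assumed exponential measure: 1 for a zero-width granule, decaying with width."""
    return math.exp(-alpha * width)


def power_specificity(width, full_range, beta=1.0):
    """Assumed power-type measure: 1 for zero width, 0 when the granule spans the full range."""
    return (1.0 - width / full_range) ** beta


def best_granule(points, specificity):
    """Exhaustively search intervals with data-point endpoints.

    Returns (score, lo, hi) maximizing coverage * specificity(width).
    """
    pts = sorted(points)
    best = None
    for i in range(len(pts)):
        for j in range(i, len(pts)):
            lo, hi = pts[i], pts[j]
            score = coverage(pts, lo, hi) * specificity(hi - lo)
            if best is None or score > best[0]:
                best = (score, lo, hi)
    return best


# Hypothetical data: a tight cluster near 5 plus two far-away points.
data = [0.0, 4.8, 5.0, 5.1, 5.3, 10.0]
full_range = max(data) - min(data)

exp_result = best_granule(data, exp_specificity)
pow_result = best_granule(data, lambda w: power_specificity(w, full_range))
print(exp_result, pow_result)
```

On this sample, both assumed measures select the interval around the central cluster rather than the full range (which covers everything but has almost no specificity) or a single point (which is maximally specific but covers little).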