Technical Report: UTEP-CS-14-02a

Published in: Shyi-Ming Chen and Witold Pedrycz (eds.), Information Granularity, Big Data, and Computational Intelligence, Springer Verlag, Cham, Switzerland, 2015, pp. 63-87.


One of the main objectives of science and engineering is to predict the future state of the world, and to come up with actions that will lead to the most favorable outcome. To do that, we need a quantitative model describing how the values of the desired quantities change, and for that, we need to know which factors influence this change. Usually, these factors are selected by using traditional statistical techniques, but with the current drastic increase in the amount of available data, known as the advent of big data, the traditional techniques are no longer feasible. A successful semi-heuristic method has been proposed to detect true connections in the presence of big data. However, this method has two limitations. The first is that the method is heuristic: its main justifications are common sense and the fact that it has been reasonably successful in several practical problems. The second is that the method is based on "crisp" granules (clusters), while in reality, the corresponding granules are flexible ("fuzzy"). In this paper, we explain how the known semi-heuristic method can be justified in statistical terms, and we show how the ideas behind this justification enable us to improve the method by taking granule flexibility into account.
