Publication Date



Technical Report: UTEP-CS-21-101


At present, the most efficient machine learning techniques are deep neural networks. In these networks, a signal repeatedly undergoes two types of transformations: linear combination of inputs, and a non-linear transformation of each value v -> s(v). Empirically, the function s(v) = max(v,0) -- known as the rectified linear function -- works the best. There are some partial explanations for this empirical success; however, none of these explanations is fully convincing. In this paper, we analyze this why-question from the viewpoint of uncertainty propagation. We show that reasonable uncertainty-related arguments lead to another possible explanation of why rectified linear functions are so efficient.