Publication Date
12-2017
Abstract
Traditionally, neural networks used a sigmoid activation function. Recently, it has turned out that piecewise linear activation functions are much more efficient -- especially in deep learning applications. However, so far, there has been no convincing theoretical explanation for this empirical efficiency. In this paper, we show that, by using different uncertainty techniques, we can come up with several explanations for the efficiency of piecewise linear neural networks. The existence of several different explanations makes us even more confident in our results -- and thus, in the efficiency of piecewise linear activation functions.
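As a minimal sketch of the two activation families the abstract contrasts -- the smooth sigmoid versus a piecewise linear activation, of which the rectified linear unit (ReLU) is the standard example -- here is a short NumPy illustration; the function names are ours, not the paper's:

    import numpy as np

    def sigmoid(x):
        # Classical smooth activation: sigma(x) = 1 / (1 + exp(-x)).
        return 1.0 / (1.0 + np.exp(-x))

    def relu(x):
        # Rectified linear unit: relu(x) = max(0, x).
        # Piecewise linear: linear on x < 0 (slope 0) and on x >= 0 (slope 1).
        return np.maximum(0.0, x)

    x = np.linspace(-3.0, 3.0, 7)
    print(sigmoid(x))  # smooth values in (0, 1)
    print(relu(x))     # zeros for negative inputs, identity for positive ones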
Comments
Technical Report: UTEP-CS-17-76a
To appear in: Olga Kosheleva, Sergey Shary, Gang Xiang, and Roman Zapatrin (eds.), Beyond Traditional Probabilistic Data Processing Techniques: Interval, Fuzzy, etc. Methods and Their Applications, Springer, Cham, Switzerland, 2018.