Departmental Technical Reports (CS)

Why Squashing Functions in Multi-Layer Neural Networks

Julio Urenda, The University of Texas at El Paso
Orsoly Csiszár, Óbuda University
Gábor Csiszár, University of Stuttgart
József Dombi, University of Szeged
Olga Kosheleva, The University of Texas at El Paso
Vladik Kreinovich, The University of Texas at El Paso
György Eigner, Óbuda University

Publication Date

2-2020

Comments

Technical Report: UTEP-CS-20-12

Abstract

Most multi-layer neural networks used in deep learning utilize rectified linear neurons. In our previous papers, we showed that if we want to use the same activation function for all the neurons, then the rectified linear function is indeed a reasonable choice. However, preliminary analysis shows that for some applications, it is more advantageous to use different activation functions for different neurons, i.e., to select a family of activation functions and to determine the parameters of each neuron's activation function during training. Specifically, this was shown for a special family of squashing functions that contains the rectified linear function as a particular case. In this paper, we explain the empirical success of squashing functions by showing that the formulas describing this family follow from natural symmetry requirements.

Included in

Computer Sciences Commons
