Publication Date

12-1-2024

Comments

Technical Report: UTEP-CS-24-55

Abstract

Everyone knows the success story of machine-learning AI. However, the current AI tools are not perfect. We know how to make them better: every time we increase the amount of computation by an order of magnitude, we get a drastic improvement in the performance of the resulting machine learning tools. Training a modern AI system already requires a tremendous amount of computation and takes a lot of time. So, to increase the number of computations, we need to make each computation step faster. One way to do that is to use low-precision arithmetic operations, e.g., with 1 byte per real number instead of the usual 8. It has been shown that we can speed up computations even further if we apply an appropriate nonlinear transformation to all the values. Empirically, of all the transformations that have been tried, the logarithmic (log) transformation works the best. In this paper, we prove that under some reasonable conditions, the log transformation is indeed optimal. This way, we not only provide a theoretical explanation for the above empirical fact, but also prove that the log transformation is better than all possible transformations, including those that have not been experimentally tried.
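
To illustrate the idea the abstract refers to, here is a minimal Python sketch (not from the report, which is theoretical): it quantizes positive values to 8 bits either directly (linear quantization) or after a log transformation, and compares the resulting relative errors. The function names, the synthetic data, and the error metric are all illustrative assumptions, not the report's construction.

```python
# Hypothetical sketch: 8-bit quantization with and without a log transform.
import numpy as np

def quantize(x, levels=256):
    """Uniformly quantize the values of x to the given number of levels
    (256 levels corresponds to 1 byte per number)."""
    lo, hi = x.min(), x.max()
    step = (hi - lo) / (levels - 1)
    codes = np.round((x - lo) / step)
    return codes * step + lo

rng = np.random.default_rng(0)
# Synthetic positive values spanning several orders of magnitude,
# as weights and activations in ML models often do.
x = np.exp(rng.uniform(np.log(1e-4), np.log(1e2), size=100_000))

linear = quantize(x)                      # quantize the raw values
log_based = np.exp(quantize(np.log(x)))   # quantize in the log domain

def rel_err(approx):
    return np.mean(np.abs(approx - x) / x)

print(f"linear 8-bit: mean relative error = {rel_err(linear):.3e}")
print(f"log    8-bit: mean relative error = {rel_err(log_based):.3e}")
```

On such wide-range data, the log-domain version yields a much smaller mean relative error, matching the empirical observation that the abstract says the report explains theoretically.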
