Publication Date
6-1-2021
Abstract
One of the most effective image processing techniques is the use of convolutional neural networks, which are based on convolutional layers. In each such layer, the value of the output at each point is a combination of the input values at several neighboring points. To improve the accuracy, researchers have developed a version of this technique in which only data from some of the neighboring points is processed. It turns out that the most efficient case, called dilated convolution, is when we select the neighboring points whose differences in both coordinates are divisible by some constant l (the dilation rate). In this paper, we explain this empirical efficiency by proving that, for all reasonable optimality criteria, dilated convolution is indeed better than the possible alternatives.
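As a rough illustration (not part of the report), the dilated sampling pattern described in the abstract can be sketched in NumPy: the output at each point combines only those input values whose coordinate offsets are multiples of l. The function name, kernel size, and dilation rate below are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch (assumed example, not from the paper) of dilated 2-D
# convolution: the output at (i, j) combines input values at points
# (i + l*di, j + l*dj), i.e., neighbors whose coordinate differences
# are divisible by the dilation rate l.
import numpy as np

def dilated_conv2d(image: np.ndarray, kernel: np.ndarray, l: int = 2) -> np.ndarray:
    """'Valid' 2-D convolution with a k x k kernel applied to neighbors
    spaced l pixels apart in each coordinate."""
    k = kernel.shape[0]
    reach = l * (k - 1)                      # spatial extent of the dilated kernel
    h, w = image.shape
    out = np.zeros((h - reach, w - reach))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # take every l-th pixel in each coordinate within the window
            patch = image[i:i + reach + 1:l, j:j + reach + 1:l]
            out[i, j] = np.sum(patch * kernel)
    return out

# Example: a 3 x 3 kernel with l = 2 spans a 5 x 5 neighborhood
# while reading only 9 input values.
img = np.arange(36, dtype=float).reshape(6, 6)
ker = np.ones((3, 3)) / 9.0
print(dilated_conv2d(img, ker, l=2).shape)   # -> (2, 2)
```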
Original file
tr21-38a.pdf (701 kB)
Updated version
tr21-38b.pdf (724 kB)
Comments
Technical Report: UTEP-CS-21-38c
Published in Entropy, 2021, Vol. 23, Paper 767.