Transfer learning techniques for deep neural nets
Abstract
Inductive learners seek meaningful features within raw input in order to accurately categorize, explain, or extrapolate from that input. Features relevant to one task are frequently relevant to related tasks. The reuse of previously learned features to help master new tasks is known as ‘transfer learning’. People use this technique to learn more quickly and easily; machine learning, however, tends to occur from scratch.

In this thesis, two machine learning techniques are developed that use transfer learning to achieve significant accuracy on recognition tasks with extremely small training sets and, occasionally, no task-specific training. These methods were developed for neural nets, not only because neural nets are a well-established machine learning technique, but also because their modularity makes them a promising candidate for transfer learning. Specifically, the convolutional neural net architecture is used, because its modularity arises both from its depth and from its use of feature maps within each layer of the net.

The first transfer learning method developed, structurally based transfer, relies on the architecture of a neural net to determine which nodes should or should not be transferred. This represents an improvement over existing techniques in terms of ease of use.

The second technique takes a very different approach to the concept of training. Traditionally, neural nets are trained to give specific outputs in response to specific inputs, and these outputs are arbitrarily chosen by the net’s trainers. However, even prior to training, the probability distribution of a net’s outputs in response to a specific input class is not uniform. The term ‘inherent bias’ is introduced to refer to a net’s preferred response to a given class of input, whether or not that response has been trained into the net. The main focus of this work is the use of inherent biases that have not been trained into the net. If a net has already been trained for one set of tasks, its inherent bias may already provide a surprisingly high degree of accuracy on other, similar tasks that have not yet been encountered. Psychologists refer to this as latent learning.

The accuracies obtainable in this manner are examined, as is the use of structurally based transfer in conjunction with latent learning. Together, these methods provide significant recognition rates for very small training sets.
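The dissertation record itself contains no code. As a rough, non-authoritative illustration of structurally based transfer, the sketch below copies and freezes the feature-map layers of a small convolutional net and trains only a new classifier head on the target task. It uses PyTorch, a library that postdates the 2010 work; the network shape, class counts, and names such as SmallConvNet are hypothetical stand-ins, not the dissertation's actual nets.

# Hypothetical sketch of structurally based transfer (assumes PyTorch;
# the original work predates this library, so all names and sizes here
# are illustrative only).
import torch
import torch.nn as nn

class SmallConvNet(nn.Module):
    """A LeNet-style convolutional net: feature maps, then a classifier."""
    def __init__(self, n_classes):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5), nn.Tanh(), nn.MaxPool2d(2),
            nn.Conv2d(6, 16, kernel_size=5), nn.Tanh(), nn.MaxPool2d(2),
        )
        # For 28x28 inputs the feature maps come out as 16 channels of 4x4.
        self.classifier = nn.Linear(16 * 4 * 4, n_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

source_net = SmallConvNet(n_classes=10)   # assume already trained on source tasks
target_net = SmallConvNet(n_classes=5)    # new, related recognition task

# Structurally based transfer: the net's architecture decides which nodes
# move over -- here, every node in the feature-map layers is copied and
# frozen, so only the new classifier head learns from the tiny target set.
target_net.features.load_state_dict(source_net.features.state_dict())
for p in target_net.features.parameters():
    p.requires_grad = False

optimizer = torch.optim.SGD(target_net.classifier.parameters(), lr=0.01)
# ...train target_net.classifier as usual on the (very small) target set.

The design point this is meant to convey is that the choice of what to transfer is read off the architecture (whole layers of feature maps) rather than requiring per-node relevance analysis, which is the ease-of-use gain the abstract claims.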
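The notion of inherent bias can likewise be sketched as a measurement rather than a training step: with no weight updates at all, record which output node the net most often activates for examples of each class it has never been trained on. The function below is a hypothetical sketch in the same PyTorch style and is one plausible reading of the abstract, not the dissertation's actual procedure.

# Hypothetical sketch of reading off a net's inherent bias toward
# untrained classes; no gradients or weight updates are involved.
import torch

@torch.no_grad()
def inherent_bias(net, examples_by_class):
    """Map each unseen class to the output node the net already prefers.

    examples_by_class: dict mapping a class label to a batch tensor of
    example inputs for that class.
    Returns a dict mapping each class label to the index of its most
    frequent argmax output node.
    """
    preferred = {}
    for label, batch in examples_by_class.items():
        winners = net(batch).argmax(dim=1)       # untrained responses
        preferred[label] = winners.mode().values.item()
    return preferred

# 'Latent learning' in the abstract's sense: classify new inputs by these
# pre-existing preferences, with no task-specific training. Note that two
# classes may prefer the same node; resolving such collisions is outside
# this sketch.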
Subject Area
Computer science
Recommended Citation
Gutstein, Steven Michael, "Transfer learning techniques for deep neural nets" (2010). ETD Collection for University of Texas, El Paso. AAI3409154.
https://scholarworks.utep.edu/dissertations/AAI3409154