Date of Award


Degree Name

Master of Science


Computer Engineering


Patricia A. Nava


Recently, there has been a push to perform deep learning (DL) computations on the edge rather than in the cloud due to latency, network connectivity, energy consumption, and privacy concerns. However, state-of-the-art deep neural networks (DNNs) require vast amounts of computational power, data, and energy, resources that are limited on edge devices. This limitation has motivated the design of domain-specific architectures (DSAs) that implement DL-specific hardware optimizations. Traditionally, DNNs have run on 32-bit floating-point numbers; however, a body of research has shown that DNNs are surprisingly robust and do not require all 32 bits. Instead, using quantization, networks can run at extremely low bit widths (1-8 bits) with fair accuracy, suggesting that edge devices can handle low-bit-width DNNs at some cost in accuracy while saving computation and energy. In addition to DNNs being run at low bit widths, it has also been shown that not all layers within a network require the same precision. Therefore, a further optimization is to use per-layer mixed-precision quantization rather than uniform quantization. This thesis conducts a comparative study on the effects of mixed-precision quantization using "simulated quantization" in software. Furthermore, a mixed-precision multiplier, configurable at run time, is designed to support mixed-precision quantized DNNs in hardware, and a comparative study is performed against a full-precision implementation.
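The "simulated quantization" and per-layer mixed-precision ideas mentioned in the abstract can be illustrated with a minimal sketch. Here, weights are rounded to an n-bit uniform grid but kept in floating point (so ordinary software can evaluate the quantized network), and each layer is assigned its own bit width. The function name, scaling scheme, and layer names are illustrative assumptions, not the thesis's exact method.

```python
import numpy as np

def fake_quantize(x, num_bits=8):
    # Simulated ("fake") uniform quantization: snap values to a
    # 2**num_bits-level grid, then dequantize back to float so the
    # tensor can still flow through standard floating-point kernels.
    # (Illustrative sketch; not the thesis's exact scheme.)
    qmin, qmax = 0, 2 ** num_bits - 1
    rng = float(x.max() - x.min())
    scale = rng / (qmax - qmin) if rng > 0 else 1.0
    zero_point = round(qmin - float(x.min()) / scale)
    q = np.clip(np.round(x / scale + zero_point), qmin, qmax)
    return (q - zero_point) * scale  # dequantized float tensor

# Per-layer mixed precision: each layer gets its own bit width
# rather than a single uniform precision for the whole network.
weights = {"conv1": np.random.randn(3, 3), "fc": np.random.randn(4, 4)}
bit_widths = {"conv1": 8, "fc": 4}  # hypothetical per-layer assignment
quantized = {name: fake_quantize(w, bit_widths[name])
             for name, w in weights.items()}
```

Because the quantized weights remain ordinary floats, the accuracy impact of any bit-width assignment can be measured in software before committing to hardware.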




Received from ProQuest

File Size

114 p.

File Format


Rights Holder

Andres Rios