Date of Award


Degree Name

Master of Science


Computer Science


Patricia J. Teller


In recent years the designs of High Performance Computing (HPC) clusters have become more complex. This is due to the emergence of new processing elements, in particular Graphics Processing Units (GPUs) and other many-core processors that can be combined with multi-core processors to enhance application performance. The design of a cluster includes processing elements that meet the needs of the applications that will run on the system. Unfortunately, it has become increasingly difficult to compare the performance of novel many-core processing elements due to the differences in their architectures. This work describes an attempt to develop a methodology for comparing the architectures of three many-core processing elements, called accelerators, that are used in several existing HPC systems: the Fermi, Kepler, and MIC architectures. Using the LULESH 1.0 proxy application, which has been ported to processing elements with different architectures and programming models, we compared the number of instructions executed per cycle (IPC), memory behavior, vectorization capacity, and power/energy consumption of these three architectures, as well as of the multi-core Sandy Bridge processor, the performance of which was used as a baseline for comparison. This study showed that (1) the Kepler architecture achieved the best execution-time performance while consuming the least power/energy, and (2) Kepler's superior execution-time performance is due to LULESH's vectorization usage and high IPCs.




Received from ProQuest

File Size

103 pages

File Format


Rights Holder

Esthela Gallardo