Date of Award
2013-01-01
Degree Name
Master of Science
Department
Computer Science
Advisor(s)
Patricia J. Teller
Second Advisor
Sarala Arunagiri
Abstract
Over the past decade, power/energy consumption has become a limiting factor for large-scale and embedded High Performance Computing (HPC) systems. This is especially true for systems that include accelerators, i.e., high-end computing devices such as Graphics Processing Units (GPUs), which offer terascale computing capabilities but have power draws that greatly surpass those of multi-core CPUs. Accordingly, improving the node-level power/energy efficiency of an application can have a direct and positive impact on both classes of HPC systems.
The research reported in this thesis explores the use of software techniques to improve the execution time and power consumption of applications executed on a CPU/GPGPU compute node. We conducted this exploration in the context of parallel matrix-matrix multiplication, with the goal of designing and developing a Single-precision General Matrix-matrix Multiplication (SGEMM) routine for CPU/GPGPU node execution that executes faster and/or consumes less power than competing routines. This work resulted in an efficient, scalable, parallel single-precision matrix-matrix multiplication routine for square matrices. Its performance is comparable to that of existing routines, but it can multiply larger matrices because the problem size is limited by the size of the host CPU memory rather than the size of the GPGPU device memory.
Language
en
Provenance
Received from ProQuest
Copyright Date
2013
File Size
143 pages
File Format
application/pdf
Rights Holder
Enrique Portillo
Recommended Citation
Portillo, Enrique, "Efficient, Scalable, Parallel, Matrix-Matrix Multiplication" (2013). Open Access Theses & Dissertations. 1704.
https://scholarworks.utep.edu/open_etd/1704
Included in
Computer Engineering Commons, Computer Sciences Commons, Electrical and Electronics Commons