Date of Graduation

12-2018

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Engineering (PhD)

Degree Level

Graduate

Department

Computer Science & Computer Engineering

Advisor/Mentor

Huang, Miaoqing

Committee Member

Gauch, John M.

Second Committee Member

Shi, Xuan

Third Committee Member

Beavers, M. Gordon

Keywords

Acceleration; Heterogeneous System; Manycore Coprocessor

Abstract

Emerging computer architectures and advanced computing technologies, such as Intel’s Many Integrated Core (MIC) Architecture and graphics processing units (GPU), provide a promising solution to employ parallelism for achieving high performance, scalability and low power consumption. As a result, accelerators have become a crucial part in developing supercomputers. Accelerators usually equip with different types of cores and memory. It will compel application developers to reach challenging performance goals. The added complexity has led to the development of task-based runtime systems, which allow complex computations to be expressed as task graphs, and rely on scheduling algorithms to perform load balancing between all resources of the platforms. Developing good scheduling algorithms, even on a single node, and analyzing them can thus have a very high impact on the performance of current HPC systems. Load balancing strategies, at different levels, will be critical to obtain an effective usage of the heterogeneous hardware and to reduce the impact of communication on energy and performance. Implementing efficient load balancing algorithms, able to manage heterogeneous hardware, can be a challenging task, especially when a parallel programming model for distributed memory architecture.

In this paper, we presents several novel runtime approaches to determine the optimal data and task partition on heterogeneous platforms, targeting the Intel Xeon Phi accelerated heterogeneous systems.

Share

COinS