Date of Graduation


Document Type


Degree Name

Doctor of Philosophy in Engineering (PhD)

Degree Level



Computer Science & Computer Engineering


Miaoqing Huang

Committee Member

John M. Gauch

Second Committee Member

Xuan Shi

Third Committee Member

Merwin G. Beavers


Acceleration, Heterogeneous System, Manycore Coprocessor


Emerging computer architectures and advanced computing technologies, such as Intel’s Many Integrated Core (MIC) Architecture and graphics processing units (GPU), provide a promising solution to employ parallelism for achieving high performance, scalability and low power consumption. As a result, accelerators have become a crucial part in developing supercomputers. Accelerators usually equip with different types of cores and memory. It will compel application developers to reach challenging performance goals. The added complexity has led to the development of task-based runtime systems, which allow complex computations to be expressed as task graphs, and rely on scheduling algorithms to perform load balancing between all resources of the platforms. Developing good scheduling algorithms, even on a single node, and analyzing them can thus have a very high impact on the performance of current HPC systems. Load balancing strategies, at different levels, will be critical to obtain an effective usage of the heterogeneous hardware and to reduce the impact of communication on energy and performance. Implementing efficient load balancing algorithms, able to manage heterogeneous hardware, can be a challenging task, especially when a parallel programming model for distributed memory architecture.

In this paper, we presents several novel runtime approaches to determine the optimal data and task partition on heterogeneous platforms, targeting the Intel Xeon Phi accelerated heterogeneous systems.