Date of Graduation

12-2014

Document Type

Thesis

Degree Name

Master of Science in Computer Engineering (MSCmpE)

Degree Level

Graduate

Department

Computer Science & Computer Engineering

Advisor

Miaoqing Huang

Committee Member

David Andrews

Second Committee Member

Christophe Bobda

Keywords

Dynamic Scheduling, FPGA, High-level Synthesis, MPSoC, System Design

Abstract

Hardware accelerators can deliver significant performance improvements, but designing them offers little flexibility and low productivity. Combining hardware accelerators with a multiprocessor system-on-chip (MPSoC) is an alternative way to balance flexibility, productivity, and performance. However, without an appropriate programming model, it is still a challenge to achieve parallelism on a hybrid MPSoC with both general-purpose processors and dedicated accelerators. In addition, increasing computation demands under a limited power budget require more energy-efficient designs without performance degradation in embedded systems and mobile computing platforms. Reconfigurable computing with emerging storage technologies is an alternative that enables the optimization of both performance and power consumption.

In this work, we present a hybrid OpenCL-like (HOpenCL) parallel computing framework on FPGAs. The hybrid hardware platform, as well as both the hardware and software kernels, can be generated through an automatic design flow. In addition, the OpenCL-like programming model is exploited to combine software and hardware kernels running on the unified hardware platform. By using the partial reconfiguration technique, a dynamic reconfiguration scheme is presented to optimize performance without losing programmable flexibility.

Our results show that our automatic design flow not only significantly reduces development time but also achieves a speedup of about 11 times compared with a pure software parallel implementation. When partial reconfiguration is enabled to conduct dynamic scheduling, the overall performance speedup of our mixed micro benchmarks is around 5.2 times.
