Author ORCID Identifier:

https://orcid.org/0009-0007-5329-8233

Date of Graduation

5-2026

Document Type

Thesis

Degree Name

Master of Science in Computer Engineering (MSCmpE)

Degree Level

Graduate

Department

Computer Science & Computer Engineering

Advisor/Mentor

Andrews, David

Committee Member

Nelson, Alexander

Second Committee Member

Huang, Miaoqing

Keywords

BRAM; FPGA; GEMV; Memory Wall; PIM; Processing-in-Memory

Abstract

Traditional computer systems are hitting the Memory Wall as machine learning applications are bottlenecked by the bandwidth between separate memory and compute units. Current FPGA architectures can bypass this bottleneck with block RAMs (BRAMs) that provide on-chip, in-fabric storage. However, their potential as the foundation of computing components is often overlooked; machine learning accelerators implemented on FPGAs face additional delays when transferring data between memory and compute units. These penalties arise from BRAM bandwidth limitations and the movement of data through the reconfigurable fabric. To support FPGA-based accelerators on the edge and break the Memory Wall, reconfigurable architectures must adopt a Processor-in-Memory (PIM) aligned paradigm. This work presents MC2-BRAM, a Memory-Centric Compute BRAM architecture that integrates processing and networking directly within the FPGA’s memory blocks, enabling efficient execution of machine learning workloads. MC2-BRAM includes networking that integrates zero-copy, in-operation data movement within and between BRAMs. As a result, MC2-BRAM delivers the lowest inter-PIM accumulation latency among all PIM BRAMs to date. MC2-BRAM stands as the first PIM BRAM architecture to not only maintain but exceed the BRAM’s original clocking frequency, advancing the state of BRAM-based Processor-in-Memory designs. This speed enables MC2-BRAM to deliver superior GEMV latencies at low precisions. MC2-BRAM achieves all this while using 18% less area than the previously smallest PIM design, making it the smallest and fastest FPGA PIM architecture reported to date.

Citation

Fredricks, N. J. (2026). Faster than the Speed of BRAM: In-Memory Computing for Next Generation FPGAs on the Edge. Graduate Theses and Dissertations. Retrieved from https://scholarworks.uark.edu/etd/6141

Included in

Computer Engineering Commons

Graduate Theses and Dissertations

Faster than the Speed of BRAM: In-Memory Computing for Next Generation FPGAs on the Edge
