Author ORCID Identifier:
Date of Graduation
5-2026
Document Type
Thesis
Degree Name
Master of Science in Computer Engineering (MSCmpE)
Degree Level
Graduate
Department
Computer Science & Computer Engineering
Advisor/Mentor
Andrews, David
Committee Member
Nelson, Alexander
Second Committee Member
Huang, Miaoqing
Keywords
BRAM; FPGA; GEMV; Memory Wall; PIM; Processing-in-Memory
Abstract
Traditional computer systems are hitting the Memory Wall as machine learning applications are bottlenecked by the bandwidth between separate memory and compute units. Current FPGA architectures are able to bypass this bottleneck with block RAMs (BRAMs) that provide on-chip, in-fabric storage. However, their potential as the foundation of computing components is often overlooked; machine learning accelerators implemented on FPGAs face additional delays when transferring data between memory and compute units. These penalties arise from BRAM bandwidth limitations and movement of data through the reconfigurable fabric. To support FPGA-based accelerators on the edge and break the Memory Wall, reconfigurable architectures must adopt a Processor-in-Memory-aligned paradigm. This work presents MC2-BRAM, a Memory-Centric Compute BRAM architecture that integrates processing and networking directly within the FPGA’s memory blocks, enabling efficient execution of machine learning workloads. MC2-BRAM includes networking that integrates zero-copy, in-operation data movement within and between BRAMs. As a result, MC2-BRAM delivers the lowest inter-PIM accumulation latency among all PIM BRAMs to date. MC2-BRAM stands as the first PIM BRAM architecture to not only maintain but exceed the BRAM’s original clocking frequency, advancing the state of BRAM-based Processor-in-Memory designs. This speed enables MC2-BRAM to deliver superior GEMV latencies at low precisions. MC2-BRAM achieves all this while using 18% less area than the previously smallest PIM design, making it the smallest and fastest FPGA PIM architecture reported to date.
Citation
Fredricks, N. J. (2026). Faster than the Speed of BRAM: In-Memory Computing for Next Generation FPGAs on the Edge. Graduate Theses and Dissertations. Retrieved from https://scholarworks.uark.edu/etd/6141