With the rapid evolution of CPU-FPGA heterogeneous acceleration platforms, it is critical for both platform developers and users to quantify the fundamental microarchitectural features of the platforms. We developed a set of microbenchmarks to evaluate mainstream CPU-FPGA platforms.
The first benchmark (https://github.com/peterpengwei/Microbench_AlphaData) is dedicated to the Alpha Data card, which connects a CPU with an FPGA via the PCIe interface. The benchmark follows the Xilinx SDAccel programming model, and contains a host program written in C and a kernel program written in OpenCL. With a set of timers in the host program, users can quantify the latency and bandwidth of each critical step, including device buffer allocation, pageable-to-pinned memory copy, and PCIe DMA.
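To illustrate the host-side timing idea, here is a minimal sketch in C that times a plain memory copy and converts the result into bandwidth. It is not taken from the benchmark repository: the helper names (now_sec, bandwidth_gbps, time_memcpy) are hypothetical, and an ordinary memcpy between two malloc'd buffers stands in for the pageable-to-pinned copy, since real pinned buffers would come from the platform runtime.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime on strict compilers */
#endif
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Wall-clock time in seconds, read from a monotonic clock. */
static double now_sec(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Bandwidth in GB/s (1e9 bytes per second); 0 if the time is not positive. */
static double bandwidth_gbps(size_t bytes, double seconds) {
    if (seconds <= 0.0) return 0.0;
    return (double)bytes / seconds / 1e9;
}

/*
 * Hypothetical stand-in for one timed step: copy `bytes` bytes `reps` times
 * and return the average seconds per copy, or -1.0 on invalid input or
 * allocation failure.
 */
static double time_memcpy(size_t bytes, int reps) {
    if (bytes == 0 || reps <= 0) return -1.0;
    unsigned char *src = malloc(bytes);
    unsigned char *dst = malloc(bytes);
    if (!src || !dst) { free(src); free(dst); return -1.0; }

    memset(src, 0xA5, bytes);   /* touch pages so page faults are not timed */
    memset(dst, 0, bytes);

    volatile unsigned char sink = 0;  /* keeps the copies from being elided */
    double start = now_sec();
    for (int i = 0; i < reps; i++) {
        memcpy(dst, src, bytes);
        sink ^= dst[bytes - 1];
    }
    double elapsed = now_sec() - start;
    (void)sink;

    free(src);
    free(dst);
    return elapsed / reps;
}
```

A caller could, for example, compute bandwidth_gbps(n, time_memcpy(n, 10)) for a range of payload sizes n to see how the effective bandwidth of a step changes with transfer size. The same start/stop pattern would wrap other steps, such as buffer allocation or a DMA transfer.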
The second benchmark (https://github.com/peterpengwei/Microbench_HARP) is dedicated to the Intel/Altera Heterogeneous Accelerator Research Platform (HARP). HARP connects a CPU with an FPGA via Intel's QPI processor interconnect, and implements a coherent cache interface (CCI) on the FPGA side to achieve coherence between the CPU and the FPGA. The benchmark follows the Intel AALSDK programming model, and contains a host program written in C++ and a kernel program written in Verilog HDL. Users can quantify the hit/miss latency of the coherent cache, as well as the remote memory access bandwidth between the FPGA and the main memory located on the CPU side.
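As a rough illustration of how raw measurements could be turned into the latency and bandwidth figures mentioned above, the C sketch below converts cycle counts into nanoseconds and GB/s. It rests on assumptions not stated on this page: that the measurement is available as a cycle count, that the clock frequency is known and passed in by the caller, and that each transfer moves one 64-byte cache line. The function names are hypothetical and do not come from the benchmark or from AALSDK.

```c
#include <stdint.h>

/* Assumption: one request transfers one 64-byte cache line. */
#define CACHE_LINE_BYTES 64u

/* Convert a cycle count to nanoseconds for a clock given in MHz.
 * Returns 0 if the frequency is not positive. */
static double cycles_to_ns(uint64_t cycles, double clock_mhz) {
    if (clock_mhz <= 0.0) return 0.0;
    return (double)cycles * 1000.0 / clock_mhz;
}

/* Bandwidth in GB/s for `lines` cache lines moved in `cycles` cycles.
 * Bytes per nanosecond equals GB/s (1e9 bytes per second).
 * Returns 0 if the elapsed time is not positive. */
static double lines_to_gbps(uint64_t lines, uint64_t cycles, double clock_mhz) {
    double ns = cycles_to_ns(cycles, clock_mhz);
    if (ns <= 0.0) return 0.0;
    return (double)lines * CACHE_LINE_BYTES / ns;
}
```

For instance, cycles_to_ns(200, 200.0) evaluates to 1000 ns. Comparing the per-access latency of requests that hit in the coherent cache with that of requests that miss gives the hit/miss gap, while lines_to_gbps applied to a long stream of requests gives the sustained bandwidth.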
Postdoc and students: Young-kyu Choi, Zhenman Fang, Yuchen Hao, Peng Wei
Faculty: Jason Cong, Glenn Reinman
For more details, please read our DAC 2016 paper, or contact Peng Wei.
If you use our microbenchmarks in your research, please cite our DAC 2016 paper (bibtex download):
Young-kyu Choi, Jason Cong, Zhenman Fang, Yuchen Hao, Glenn Reinman, and Peng Wei. A Quantitative Analysis on Microarchitectures of Modern CPU-FPGA Platforms. Proceedings of the 53rd Annual Design Automation Conference (DAC 2016), Austin, TX, June 5-9, 2016.