With the rapid evolution of CPU-FPGA heterogeneous acceleration platforms, it is critical for both platform developers and users to quantify the fundamental microarchitectural features of the platforms. We developed a set of microbenchmarks to evaluate mainstream CPU-FPGA platforms.
The first benchmark (https://github.com/peterpengwei/Microbench_AlphaData) targets the Alpha Data card, which connects a CPU to an FPGA via the PCIe interface. The benchmark follows the Xilinx SDAccel programming model and contains a host program written in C and a kernel program written in OpenCL. Using a set of timers in the host program, users can quantify the latency and bandwidth of each critical step, including device buffer allocation, pageable-to-pinned memory copy, and PCIe DMA transfer.
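The timer-based approach can be illustrated with a minimal host-side sketch. It is not code from the repository: the helper names `bandwidth_gbps`, `time_step`, and `time_host_copy` are hypothetical, and a plain host-to-host `memcpy` stands in for a measured step such as the pageable-to-pinned copy. It is written in C++ for brevity, although the actual host program is in C.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

// Convert a byte count and an elapsed time into GB/s (1 GB = 1e9 bytes).
// Returns 0 for a non-positive time so callers never divide by zero.
double bandwidth_gbps(std::size_t bytes, double seconds) {
    if (seconds <= 0.0) return 0.0;
    return static_cast<double>(bytes) / seconds / 1e9;
}

// Run one step between two readings of a monotonic clock.
// Returns the elapsed wall-clock time in seconds.
template <typename Step>
double time_step(Step&& step) {
    auto start = std::chrono::steady_clock::now();
    step();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Hypothetical stand-in for one measured step: a host-to-host buffer copy.
// A real measurement would wrap the platform call (allocation, DMA, ...) instead.
double time_host_copy(std::size_t bytes) {
    assert(bytes > 0);
    std::vector<char> src(bytes, 1), dst(bytes, 0);
    double seconds = time_step([&] { std::memcpy(dst.data(), src.data(), bytes); });
    volatile char sink = dst[bytes - 1];  // read the result so the copy is not discarded
    (void)sink;
    return seconds;
}
```

Calling `bandwidth_gbps(n, time_host_copy(n))` over a range of sizes `n` separates the fixed per-call latency, which dominates small transfers, from the sustained bandwidth, which dominates large ones.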
The second benchmark (https://github.com/peterpengwei/Microbench_HARP) targets the Intel/Altera Heterogeneous Accelerator Research Platform (HARP). HARP connects a CPU to an FPGA via Intel's QPI processor interconnect and implements a coherent cache interface (CCI) on the FPGA side to maintain coherence between the CPU and the FPGA. The benchmark follows the Intel AALSDK programming model and contains a host program written in C++ and a kernel program written in Verilog HDL. Users can quantify the hit and miss latency of the coherent cache, as well as the remote memory access bandwidth between the FPGA and the main memory located on the CPU side.
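The idea behind a hit/miss latency measurement can be sketched with a CPU-only analogue. This is not the HARP kernel, which is written in Verilog and runs on the FPGA, and it does not use the AALSDK; the names `make_chain` and `chase_ns_per_load` are hypothetical. The sketch uses pointer chasing, a common latency-measurement technique: each load depends on the previous one, so the average time per load approximates the access latency. A chain small enough to fit in a cache mostly hits, while one far larger than the cache mostly misses.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Build a random permutation with a single cycle (Sattolo's algorithm),
// so following next[] from any start visits every element before repeating.
std::vector<std::size_t> make_chain(std::size_t n, unsigned seed) {
    assert(n > 0);
    std::vector<std::size_t> next(n);
    std::iota(next.begin(), next.end(), std::size_t{0});
    std::mt19937 rng(seed);
    for (std::size_t i = n - 1; i > 0; --i) {
        std::uniform_int_distribution<std::size_t> pick(0, i - 1);
        std::swap(next[i], next[pick(rng)]);
    }
    return next;
}

// Follow the chain for `hops` dependent loads, starting at index 0.
// Returns the average nanoseconds per load and stores the final index in
// *end_index so the compiler cannot drop the loop as dead code.
double chase_ns_per_load(const std::vector<std::size_t>& next,
                         std::size_t hops, std::size_t* end_index) {
    assert(hops > 0 && !next.empty());
    std::size_t idx = 0;
    auto start = std::chrono::steady_clock::now();
    for (std::size_t h = 0; h < hops; ++h) idx = next[idx];
    auto stop = std::chrono::steady_clock::now();
    *end_index = idx;
    return std::chrono::duration<double, std::nano>(stop - start).count() /
           static_cast<double>(hops);
}
```

Running `chase_ns_per_load` on chains of increasing size and comparing the per-load times gives a rough picture of where accesses stop hitting in a cache, under the assumption that the random order defeats hardware prefetching.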
Young-kyu Choi, Jason Cong, Zhenman Fang, Yuchen Hao, Glenn Reinman, and Peng Wei. A Quantitative Analysis on Microarchitectures of Modern CPU-FPGA Platforms. Proceedings of the 53rd Annual Design Automation Conference (DAC 2016), Austin, TX, June 5-9, 2016.