A Scalable Communication-Aware Compilation Flow for Programmable Accelerators