Congratulations to Zifan He for receiving the 2025 AMD HACC Outstanding Researcher Award!

Congratulations to Zifan He for receiving the annual AMD HACC Outstanding Researcher Award. He is a second-year PhD student working on algorithm-hardware co-design to enable efficient, high-quality inference of large language models (LLMs). His research spans two main directions:

(1) Efficient Language Processing with Unlimited Context Length: Zifan developed the Hierarchical Memory Transformer (HMT), a framework designed to improve memory efficiency in long-context scenarios. HMT segments the input sequence, applies recurrent sequence compression, and dynamically retrieves the compressed representations during inference (see the illustrative sketch at the end of this post). This plug-and-play framework matches or exceeds the generation quality of existing long-context models while using 2-57x fewer parameters and 2.5-116x less memory during inference. These advantages make HMT especially well suited for FPGA-based acceleration, thanks to its reduced off-chip memory demands and efficient data-movement patterns.

(2) Novel Inference Accelerator Design: Recognizing that FPGAs offer more flexible and distributed on-chip memory than GPUs, Zifan developed the Inter-Task Auto-Reconfigurable (InTAR) accelerator. InTAR lets tasks repurpose hardware resources under a static schedule, optimizing the trade-off between computation and memory access. Experiments on transformer models such as GPT-2 show that InTAR achieves 3.65-39.14x speedups and 1.72-10.44x higher computational efficiency over prior FPGA accelerators, along with 1.66-7.17x better power efficiency than GPUs.

With three papers published at top-tier conferences in NLP/ML and FPGA design, Zifan has made impressive contributions that improve both computational efficiency and model performance in LLM inference.
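For readers curious about the segment-compress-retrieve idea behind HMT, the sketch below illustrates the general flow in plain Python. It is a minimal, simplified illustration under stated assumptions, not HMT's actual implementation: the segment length, memory dimension, retrieval rule, and all function names (compress_segment, retrieve_memories, backbone_step) are hypothetical stand-ins.

```python
# Minimal sketch of a segment-compress-retrieve inference loop in the spirit
# of the Hierarchical Memory Transformer described above. The compression,
# retrieval, and backbone functions here are simplifying assumptions, not
# HMT's actual components.

import numpy as np

SEGMENT_LEN = 512   # tokens processed per recurrent step (assumed)
MEM_DIM = 256       # size of each compressed memory embedding (assumed)
TOP_K = 4           # number of past memories retrieved per segment (assumed)

rng = np.random.default_rng(0)
proj = rng.standard_normal((768, MEM_DIM)) / np.sqrt(768)  # toy compression matrix


def compress_segment(hidden_states: np.ndarray) -> np.ndarray:
    """Compress a (seg_len, 768) block of hidden states into one MEM_DIM vector
    by mean-pooling and projecting (a stand-in for a learned compression)."""
    return hidden_states.mean(axis=0) @ proj


def retrieve_memories(query: np.ndarray, memory_bank: list) -> np.ndarray:
    """Return the TOP_K stored memories most similar to the current query."""
    if not memory_bank:
        return np.zeros((0, MEM_DIM))
    bank = np.stack(memory_bank)
    scores = bank @ query / (np.linalg.norm(bank, axis=1) * np.linalg.norm(query) + 1e-8)
    return bank[np.argsort(scores)[-TOP_K:]]


def backbone_step(segment: np.ndarray, memories: np.ndarray) -> np.ndarray:
    """Placeholder for the backbone LLM: returns fake hidden states here.
    In practice this would be a transformer consuming the segment tokens
    together with the retrieved memory embeddings."""
    return rng.standard_normal((segment.shape[0], 768))


def run_long_context(token_embeddings: np.ndarray) -> None:
    """Process an arbitrarily long sequence segment by segment, so peak memory
    stays bounded by SEGMENT_LEN plus a small bank of compressed memories."""
    memory_bank = []
    for start in range(0, len(token_embeddings), SEGMENT_LEN):
        segment = token_embeddings[start:start + SEGMENT_LEN]
        query = segment.mean(axis=0) @ proj               # summarize the new segment
        memories = retrieve_memories(query, memory_bank)  # recall relevant context
        hidden = backbone_step(segment, memories)         # run the backbone model
        memory_bank.append(compress_segment(hidden))      # store compressed summary


# Example: a 100k-token input handled in 512-token chunks.
run_long_context(rng.standard_normal((100_000, 768)))
```

The key point the sketch captures is that only one segment plus a bank of small compressed memories needs to be resident at any time, which is what keeps inference memory low regardless of the total context length.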