VAST lab at UCLA

The VAST lab at UCLA investigates cutting-edge research topics at the intersection of VLSI technologies, design automation, architecture, and compiler optimization at multiple scales, from micro-architecture building blocks to heterogeneous compute nodes and scalable data centers. Current focuses include architecture and design automation for emerging technologies, and customizable domain-specific computing with applications to multiple domains, such as image processing, bioinformatics, data mining, and machine learning.

Latest News

April 6, 2017

Congratulations to Bingjun Xiao (PhD’2015, advisor: Jason Cong), whose dissertation "Communication Optimization for Customizable Domain-Specific Computing" has been awarded the 2016 EDAA Outstanding PhD Dissertation Award. The award was...

March 1, 2017

To celebrate the 25th anniversary of the FPGA Symposium, which took place February 22nd through 24th in Monterey, California, ACM/SIGDA TCFPGA initiated the FPGA and Reconfigurable Computing Hall of Fame program at the symposium.


February 10, 2017

Three faculty members of the UCLA Henry Samueli School of Engineering and Applied Science – Jason Cong and George Varghese of Computer Science, and Behzad Razavi of Electrical Engineering (...

Latest Publications

Bandwidth Optimization Through On-Chip Buffer Restructuring for HLS
Conference publication
Jason Cong, Peng Wei, Cody Hao Yu, and Peipei Zhou
Automated Systolic Array Architecture Synthesis for High Throughput CNN Inference on FPGAs
Conference publication
Xuechao Wei, Cody Hao Yu, Peng Zhang, Youxiang Chen, Yuxin Wang, Han Hu, Yun Liang, and Jason Cong
Supporting Address Translation for Accelerator-Centric Architectures
Conference publication
Jason Cong, Zhenman Fang, Yuchen Hao, and Glenn Reinman
FCUDA-HB: Hierarchical and Scalable Bus Architecture Generation on FPGAs With the FCUDA Flow
Journal publication
Y. Chen, T. Nguyen, Y. Chen, S.T. Gurumani, Y. Liang, K. Rupnow, J. Cong, W.-M. Hwu, and D. Chen
Throughput Optimization for Streaming Applications on CPU-FPGA Heterogeneous Systems
Conference publication
X. Wei, Y. Liang, T. Wang, S. Lu, and J. Cong
FPGA-based Accelerator for Long Short-Term Memory Recurrent Neural Networks
Conference publication
Y. Guan, Z. Yuan, G. Sun, and J. Cong
Platform Choices and Design Demands for IoT Platforms: Cost, Power, and Performance Tradeoffs
Journal publication
D. Chen, J. Cong, S. Gurumani, W.-m. Hwu, K. Rupnow, and Z. Zhang
Re-form: FPGA-powered true codesign flow for high-performance computing in the post-Moore era
Conference publication
Franck Cappello, Kazutomo Yoshii, Hal Finkel, and Jason Cong
Caffeine: Towards Uniformed Representation and Acceleration for Deep Convolutional Neural Networks
Conference publication
Chen Zhang, Zhenman Fang, Peipei Zhou, Peichen Pan, Jason Cong
Programming and Runtime Support to Blaze FPGA Accelerator Deployment at Datacenter Scale
Conference publication
Muhuan Huang, Di Wu, Cody Hao Yu, Zhenman Fang, Matteo Interlandi, Tyson Condie, and Jason Cong

Our Projects

Many applications in precision medicine present significant computational challenges. For example, the computational demand of personalized cancer treatment is prohibitively high for general-purpose computing technologies, as tumor heterogeneity requires great sequencing depths, structural...

As system complexity increases, the need for system-level design automation becomes more and more urgent. The maturity of high-level synthesis pushes the design abstraction from the register-transfer level (RTL) to software programming languages like C/C++. However, the state-of-the-art...

To meet ever-increasing computing needs and overcome power density limitations, the computing industry has entered the era of parallelization, with tens to hundreds of computing cores integrated into a single...

Software Releases

Cloud-scale BWAMEM (CS-BWAMEM) is an ultrafast and highly scalable aligner built on top of cloud infrastructures, including Spark and the Hadoop Distributed File System (HDFS). It leverages the abundant computing resources in a public or private cloud to fully exploit the parallelism obtained from...

With the rapid evolution of CPU-FPGA heterogeneous acceleration platforms, it is critical for both platform developers and users to quantify the fundamental microarchitectural features of the platforms. We developed a set of microbenchmarks to evaluate mainstream CPU-FPGA platforms.


PARADE is a cycle-accurate full-system simulation platform that enables the design and exploration of the emerging accelerator-rich architectures (ARA). It extends the widely used gem5 simulator with high-level synthesis (HLS) support.