Home

Short Bio

I am an associate professor at the State Key Laboratory of Processors (SKLP), Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS). I received my B.Eng. and M.Eng. degrees from Harbin Institute of Technology, and my Ph.D. degree from The University of Hong Kong in 2016, advised by Prof. Hayden So. I worked as a research fellow at the National University of Singapore from 2016 to 2018, and then joined ICT as an associate professor. My current research interests focus on domain-specific architectures and systems, LLMs for chip design, and LLM-based code generation. I am a senior member of IEEE and CCF, and a member of ACM. Please check my CV for more information.

Vacancies

I am looking for self-motivated master's/intern students for LLM-based intelligent chip design. Topics span a variety of design tasks, including RISC-V SoC design and verification, RISC-V ASIP design automation, domain-specific accelerator generation, and low-power design and optimization. Students with RISC-V processor design and LLM experience are highly preferred. Fully remote work is possible. Check the topics in the following list and contact me if you are interested.

  • LLM-based SoC/DSA/ASIP design
  • LLM-based high-level synthesis
  • LLM-based ASIC design, verification, and debugging
  • LLM-based EDA parallelization

News

  • [Feb 2026] Our work on mixed-precision neural network processing on MCUs is accepted by ACM TECS’26. It substantially outperforms SOTA neural network processing frameworks and is now open-sourced on GitHub. Welcome to try it.
  • [Feb 2026] Our work on input adaptive soft error protection is accepted by TCAD’26.
  • [Jan 2026] Our work that analyzes errors in LLM-based code generation is accepted by TCAD’26. We identify many interesting causes of the errors that may help improve future LLMs for Verilog code generation.
  • [Jan 2026] Our work that investigates small language models for efficient CUDA code generation through an LLM-generated reasoning graph has been accepted by ICLR’26.
  • [Jun 2025] Our recent work on a DRAM-PIM-based ANNS engine is accepted by SC’25.
  • [Mar 2025] Our DSL for FPGA-based graph processing accelerator generation is accepted by LCTES’25.
  • [Mar 2025] Our FrontOrder approach is accepted by ICDE’25, and our ApproxPilot work is accepted by ISEDA’25. Both are open-sourced; feel free to try them.
  • [Mar 2025] We got the third place in the PPoPP’25 FastCode Programming Challenge for the BFS track and achieved the highest throughput on social network graphs.
  • [Dec 2024] Our TaijiGraph, a computational storage drive (CSD) based out-of-core graph processing system, is accepted by IPDPS’25. It leverages CSDs to address the I/O bottleneck of out-of-core graph processing systems through both high-level data layout and task scheduling and low-level I/O coalescing.
  • [Nov 2024] Two papers about DSA are accepted by HPCA’25.
  • [Oct 2024] Start to serve as an associate editor of IEEE Transactions on Emerging Topics in Computing.
  • [Oct 2024] Our work on CIM-based LLM acceleration, in collaboration with Dr. Luo, is accepted by IEEE TPAMI.
  • [Jun 2024] HLSPilot: LLM-based High-Level Synthesis is accepted by ICCAD’24. This work presents an LLM-based high-level synthesis design framework for a typical CPU-FPGA architecture. In particular, it leverages LLM and RAG techniques to generate HLS-based accelerator designs for arbitrary sequential C/C++ code, which is also a key component of our LLM-driven SoC generation framework.
  • [May 2024] Our work on computational storage won a Huawei OlympusMons Award in 2024.
  • [Mar 2024] A multi-resolution fault injection framework for deep learning is accepted by TVLSI’24. It is well documented, thoroughly validated, and provides many convenient evaluation features that are typically required in fault-tolerant deep learning studies. Check it out on GitHub.
  • [Nov 2023] A HW/SW co-design framework for mixed-precision neural network acceleration on FPGAs (DeepBurning-MixQ) is accepted by ICCAD’23. We are adding LLM support to enable more intelligent code generation. The code is open-sourced on GitHub; welcome to try it.
  • [Jun 2023] Congratulations! Haitong Huang, Erjing Luo, and Guoyu Li got the Second Place in DAC’23 SDC.
  • [Jan 2023] Our work “EnGN: A High-Throughput and Energy-Efficient Accelerator for Large Graph Neural Networks” won the Best Paper Award of TC’21.
  • [Jul 2022] Haitong Huang, Erjing Luo, and Cangyuan Li in my group got the Third Place in DAC’22 SDC.
  • [Jun 2020] Shengwen Liang and Rick Lee won the Third Prize (FPGA Track) of the 2020 IEEE Low-Power Computer Vision Challenge (LPCVC).

Funding

  • HW/SW platform for intelligent cross-layer design and optimization of processors, ¥11,000,000, xdb (2024-2028)
  • Natural weather disturbed image generation for remote sensing, ¥500,000, XXX (2023-2024)
  • AI-assisted DSA design automation, ¥300,000, SKLP (2023-2024)
  • Automatic Cross-Layer DSA Design and Optimization, ¥1,600,000, National Key Research and Development Program of China (2023-2025)
  • Intelligent In-Storage Big Data Processing System, ¥1,300,000, 1XX (2023-2024)
  • Fault-tolerant Deep Learning Toolchain for COTS Devices, ¥300,000, XXX (2023)
  • Elastic Fault-tolerant Deep Learning Processor Design, ¥570,000, NSFC (2022-2025)
  • Customized Energy-efficient Graph Processing Acceleration on FPGAs, ¥300,000, NSFC (2020-2022)
  • Fault-tolerant Deep Learning Processor Design Automation, ¥300,000, SKLCA (2021-2022)

Talks

  • Investigating Self-Test, Self-Diagnosis, and Self-Recovery (3S) Techniques for Wafer-Scale AI Chips, SEMICON China, 2025
  • Fully Automated CPU-FPGA Heterogeneous Hardware Acceleration Based on Large Language Models, Forum on High-Performance Heterogeneous Computing and AI Optimization, CCF-HPC, 2024
  • Design, Optimization, and Programming of Heterogeneous Hardware Accelerators, Forum on High-Performance Heterogeneous Computing and AI Optimization, CCF-HPC, 2023
  • Graph Processing System Design Based on Computational Storage, Forum on Novel Energy-Efficient Architectures for Complex Graph Computing Applications, CNCC, 2022
  • Microarchitecture Design of Fault-Tolerant Deep Learning Processors, Conference on Integrated Circuit Design and Automation (CCF-DAC), 2021
  • DeepBurning2.0: An Automatic End-to-end Neural Network Acceleration System on FPGAs, School of Software Engineering, South China University of Technology, 2019

Services

  • Session co-chair of DAC’25
  • Session chair of ISEDA’25
  • Member of the Review Board, IEEE Transactions on Parallel and Distributed Systems (TPDS)
  • Associate Editor, IEEE Transactions on Emerging Topics in Computing (TETC)
  • TPC for DFTS’22, FPT’22, ITC’22
  • TPC for ATS’23, FPT’23, DFTS’23, ITC’23, NeurIPS’23
  • TPC for DFTS’24, FPT’24, ICLR’24, FCCM’24, ICML’24
  • TPC for FPT’25, AAAI’25, ICLR’25, DFTS’25, DAC’25
  • Reviewer for TC, TPDS, TCAD, TVLSI, TETC, JETC, JSA, TNNLS

Graduated Students

  • Chengwei Xiong (Master student, ByteDance)
  • Qing Zhang (Master student, Tencent)
  • Wenyu Zhang (Intern from Harbin Institute of Technology, GigaDevice)
  • Zesong Jiang (Intern from University of Science and Technology, PhD candidate in Arizona State University)
  • Xinghua Xue (Ph.D, Hangzhou Institute of Advanced Study, UCAS)
  • Zheng Feng (Master student, Huawei, AI Compilation)
  • Guoyu Li (Master student, Baidu, Kunlun Chip)
  • Haitong Huang (Master student, Tencent, AI Lab)
  • Xuejian Sun (Intern from Harbin Institute of Technology, Master student in Fudan University)
  • Erjing Luo (Intern from Beijing Institute of Technology, MPhil at University of Alberta)
  • Miaoxin Wang (Intern from Harbin Institute of Technology, Master student in Nanjing University)
  • Jinming Zhao (Intern from Wuhan University, PhD candidate in the University of Hong Kong)
  • Zhiyu Zhu (Intern from Harbin Institute of Technology, CETC 47)
  • Cheng Chu (Intern from Hefei University of Technology, PhD candidate in Indiana University Bloomington)
  • Meng He (Intern from Hefei University of Technology, ByteDance)
  • Ziyang Zhu (Intern from Hefei University of Technology, UNISOC)
  • Qiang Zhang (Master student, Kuaishou)
  • Kouzi Xing (Intern from Hefei University of Technology, SenseTime)
  • Li Li (Intern from Hefei University of Technology, JD.com)
  • Kexin Chu (Intern from Hefei University of Technology, Baidu)
  • Kaijie Tu (Intern from Hefei University of Technology, ICT)
  • Chang Shi (Master student, Alibaba)
  • Peibin Wu (Master student, MSRA)