Welcome! I’m Jiang Zhennan (江震南), a first-year Ph.D. student in Computer Application Technology at the Institute of Automation, Chinese Academy of Sciences (CASIA). I am fortunate to be supervised by Prof. Zhao Dongbin (IEEE Fellow), with Dr. Li Haoran as my co-supervisor. I am also participating in a joint training program at Zhongguancun Academy, under the mentorship of Prof. Yu Chao.
My research interests currently focus on reinforcement learning and robotics. Going forward, I aim to delve into embodied intelligence and its applications.
Haoran Li, Zhennan Jiang, Yuhui Chen, and Dongbin Zhao
NeurIPS, 2024
With high-dimensional state spaces, visual reinforcement learning (RL) faces significant challenges in exploitation and exploration, resulting in low sample efficiency and unstable training. Although consistency models, as time-efficient diffusion models, have been validated in online state-based RL, whether they can be extended to visual RL remains an open question. In this paper, we investigate the impact of non-stationary distributions and the actor-critic framework on the consistency policy in online RL, and find that the consistency policy is unstable during training, especially in visual RL with high-dimensional state spaces. To this end, we suggest sample-based entropy regularization to stabilize policy training, and propose a consistency policy with prioritized proximal experience regularization (CP3ER) to improve sample efficiency. CP3ER achieves new state-of-the-art (SOTA) performance on 21 tasks across the DeepMind Control Suite and Meta-World. To our knowledge, CP3ER is the first method to apply diffusion/consistency models to visual RL, demonstrating the potential of consistency models in this setting.
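To make the idea of sample-based entropy regularization concrete, here is a minimal PyTorch sketch of the general recipe for an implicit (generator-style) policy whose log-probabilities are intractable, as with consistency policies: sample several actions per state, estimate the entropy of the action distribution from those samples with a k-nearest-neighbor estimator, and add it as a bonus to the actor objective. The interfaces (`policy(states, noise)`, `critic(states, actions)`, `action_dim`, `alpha`, `k`) are illustrative assumptions, not the CP3ER implementation.

```python
# Minimal sketch (not the paper's code): sample-based entropy regularization
# for a generative policy with intractable log-probabilities.
import torch

def knn_entropy(actions: torch.Tensor, k: int = 3) -> torch.Tensor:
    """Mean log k-NN distance over sampled actions: a standard sample-based
    proxy for differential entropy (Kozachenko-Leonenko style, up to constants).

    actions: (N, action_dim) actions sampled from the policy at one state.
    """
    dists = torch.cdist(actions, actions)                      # (N, N) pairwise distances
    knn_dist = dists.topk(k + 1, largest=False).values[:, k]   # skip the zero self-distance
    return torch.log(knn_dist + 1e-8).mean()

def actor_loss(policy, critic, state, n_samples: int = 16, alpha: float = 0.05):
    """Q-maximizing actor loss with a sample-based entropy bonus.

    `policy(states, noise) -> actions` and `critic(states, actions) -> Q`
    are assumed interfaces; `state` is a single state vector.
    """
    states = state.expand(n_samples, -1)                # repeat the state N times
    noise = torch.randn(n_samples, policy.action_dim)   # latent noise for the generator
    actions = policy(states, noise)                     # (N, action_dim) sampled actions
    q_value = critic(states, actions).mean()            # exploitation: maximize Q
    entropy = knn_entropy(actions)                      # exploration: keep samples spread out
    return -(q_value + alpha * entropy)                 # ascend on Q + alpha * H
```

Because the entropy bonus penalizes sampled actions for collapsing onto each other, a regularizer of this kind counteracts the mode collapse that can destabilize implicit policies under the non-stationary distributions of online RL.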