Welcome to Dapeng Li (李大鹏)’s Homepage!

I received my Ph.D. from the Institute of Automation, Chinese Academy of Sciences, and the School of Artificial Intelligence, University of Chinese Academy of Sciences, advised by Prof. Guoliang Fan. I am currently the Head of the Agent Reinforcement Learning Group at Li Auto Inc. (理想汽车) and a Joint Postdoctoral Researcher at Li Auto Inc. and Tsinghua University (清华大学). My research focuses on Agent Reinforcement Learning, Multi-agent Systems, and Large Language Model (LLM) Agents.

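I received my Ph.D. from the Institute of Automation, Chinese Academy of Sciences, and the School of Artificial Intelligence, University of Chinese Academy of Sciences, under the supervision of Prof. Guoliang Fan. I currently work on algorithm research and development in the Agent Reinforcement Learning Group at Li Auto, and I am also a joint in-service postdoctoral researcher at Li Auto and Tsinghua University, focusing on research and applications of agent reinforcement learning, multi-agent systems, and large language model agents.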

🔥 Fields of Interest

My research interests include reinforcement learning, multi-agent systems and data mining & analysis. Currently, I focus on the following research topics:

Multi-agent Collaboration
Multi-agent Reinforcement Learning
Quantitative Finance
Large Language Model (LLM) Agents

Other Interests:

Guitar
Skiing

📖 Education & Experience

2025.08 - Present Head of Agent RL Group at Li Auto Inc. and Joint Postdoctoral Researcher at Li Auto Inc. and Tsinghua University
2020.09 - 2025.06 Ph.D., Institute of Automation, Chinese Academy of Sciences
Supervisor: Prof. Guoliang Fan
2016.09 - 2020.06 B.E., Department of Information, Beijing University of Technology
Major: Electronic Engineering

🎖 Selected Competitions and Awards

The 1st (1/1122), 2021, DataFountain Green Future Competition, Wind Power Abnormal Data Recognition Track
The 1st (1/620), 2021, DataFountain Green Future Competition, Photovoltaic Abnormal Data Recognition Track
The 3rd (3/172), 2021, Global Open Data Application Innovation Competition, Wind Field Downscaling Track
The 2nd (2/423), 2021, Global Open Data Application Innovation Competition, Road Detection Track
The Grand Prize (1/158), 2021, Golden Wind Cup, Tsinghua
The 3rd (3/1511), 2021, DCIC Digital China Innovation Competition
The 3rd (3/739), 2021, iFLYTEK A.I. Advertising Picture Material Classification Algorithm Challenge
The 3rd, 2021, NeurIPS workshop MineRL intro
The 1st (1/4337), 2021, Tianchi Global AI Innovation Contest
The Second Prize (National), 2021, National Post-Graduate Mathematical Contest in Modeling
The 1st (1/2800), 2019, China Datathon
The First Prize, 2019, “Challenge Cup” Capital College Students Competition
The 1st (1/475), 2019, National University Student Transportation Science and Technology Competition
Silver medal, 2019, Microsoft Malware Prediction, Kaggle
The Second Prize (Global), 2019, International Competition of Autonomous Running Intelligent Robots
The 1st Prize, 2018, China Robot Competition, by Chinese Association of Automation (National)

Awards from my graduate and undergraduate studies:

Outstanding Mentor Award, China Computer Federation (CCF)
2021 Excellent Student, University of Chinese Academy of Sciences
2020 Outstanding Graduates of Beijing
2020 Beijing University of Technology Top Ten Graduates
2020 Beijing University of Technology President Scholarship (only ten recipients university-wide)
2019 Technology Innovation and Practice Scholarship

💻 Internships

2023, Huawei, China
2024, Microsoft Research Asia (MSRA), China

📝 Selected Publications

Conferences:

Beyond Local Views: Global State Inference with Diffusion Models for Cooperative Multi-Agent Reinforcement Learning
Reinforcement Learning Conference (RLC), 2026.
Zhiwei Xu, Hangyu Mao, Nianmin Zhang, Shengtao Zhang, Xin Xin, Pengjie Ren, Dapeng Li, Bin Zhang, Guoliang Fan, Zhumin Chen, Changwei Wang, and Jiangjin Yin
From Traits to Roles: Consensus-Guided Composition of Orthogonal Experts for Cooperative MARL
International Joint Conference on Artificial Intelligence (IJCAI), 2026.
Yewei Zhou, Bin Zhang, Ying Zhou, Xuri Ge, Dapeng Li, Hangyu Mao, Pengjie Ren, and Zhiwei Xu
QSIM: Mitigating Overestimation in Multi-Agent Reinforcement Learning via Action Similarity Weighted Q-Learning
International Conference on Automated Planning and Scheduling (ICAPS), 2026.
Yuanjun Li, Bin Zhang, Hao Chen, Zhouyang Jiang, Dapeng Li, and Zhiwei Xu
Peak-Return Greedy Slicing: Subtrajectory Selection for Transformer-based Offline RL
The Fourteenth International Conference on Learning Representations (ICLR), 2026.
Zhiwei Xu, Miduo Cui, Dapeng Li, Zhihao Liu, Haifeng Zhang, Hangyu Mao, Guoliang Fan, and Bin Zhang
Balancing Rewards in Text Summarization: Multi-Objective Reinforcement Learning via HyperVolume Optimization
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2026.
Junjie Song^*, Yiwen Liu^*, Dapeng Li^*, Yin Sun, Shukun Fu, Siqi Chen, and Yuji Cao
^*Equal contribution
Efficient Communication in Multi-Agent Reinforcement Learning with Implicit Consensus Generation
The 39th Annual AAAI Conference on Artificial Intelligence (AAAI), Philadelphia, Pennsylvania, USA, 2025. (Oral)
Dapeng Li, Na Lou, Zhiwei Xu, Bin Zhang, and Guoliang Fan
Sequential Asynchronous Action Coordination in Multi-Agent Systems: A Stackelberg Decision Transformer Approach
The Forty-first International Conference on Machine Learning (ICML), 2024.
Bin Zhang, Hangyu Mao, Lijuan Li, Zhiwei Xu, Dapeng Li, Rui Zhao, and Guoliang Fan
Reidentify: Context-Aware Identity Generation for Contextual Multi-Agent Reinforcement Learning
Forty-second International Conference on Machine Learning (ICML), Vancouver, Canada, 2025.
Zhiwei Xu, Kun Hu, Xin Xin, Weiliang Meng, Yiwei Shi, Hangyu Mao, Bin Zhang, Dapeng Li, and Jiangjin Yin
From Explicit Communication to Tacit Cooperation: A Novel Paradigm for Cooperative MARL
International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), Auckland, New Zealand, 2024. (Extended Abstract)
Dapeng Li, Zhiwei Xu, Bin Zhang, and Guoliang Fan
Adaptive Parameter Sharing for Multi-Agent Reinforcement Learning
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Korea, 2024.
Dapeng Li, Na Lou, Bin Zhang, Zhiwei Xu, and Guoliang Fan
SEA: A Spatially Explicit Architecture for Multi-Agent Reinforcement Learning
International Joint Conference on Neural Networks (IJCNN), Queensland, Australia, 2023.
Dapeng Li, Zhiwei Xu, Bin Zhang, and Guoliang Fan
Dual Self-Awareness Value Decomposition Framework without Individual Global Max for Cooperative MARL
Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS), New Orleans, USA, 2023.
Zhiwei Xu, Bin Zhang, Dapeng Li, Guangchong Zhou, Zeren Zhang, and Guoliang Fan
Inducing Stackelberg Equilibrium through Spatio-Temporal Sequential Decision-Making in Multi-Agent Reinforcement Learning
32nd International Joint Conference on Artificial Intelligence (IJCAI), Macao SAR, China, 2023.
Bin Zhang, Lijuan Li, Zhiwei Xu, Dapeng Li, and Guoliang Fan
Unveiling Decision Intention for Cooperative Multi-Agent Reinforcement Learning
International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), Detroit, Michigan, USA, 2025.
Zeren Zhang, Zhiwei Xu, Guangchong Zhou, Dapeng Li, Bin Zhang, and Guoliang Fan
Decentralized Extension for Centralized Multi-Agent Reinforcement Learning via Online Distillation
International Conference on Neural Information Processing (ICONIP), Auckland, New Zealand, 2024.
Zeren Zhang, Bin Zhang, Guangchong Zhou, Dapeng Li, Zhiwei Xu, and Guoliang Fan
Consensus Learning for Cooperative Multi-Agent Reinforcement Learning
Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI), Washington, DC, USA, 2023. (Oral)
Zhiwei Xu, Bin Zhang, Dapeng Li, Zeren Zhang, Guangchong Zhou, and Guoliang Fan
HAVEN: Hierarchical Cooperative Multi-Agent Reinforcement Learning with Dual Coordination Mechanism
Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI), Washington, DC, USA, 2023. (Oral)
Zhiwei Xu, Yunpeng Bai, Bin Zhang, Dapeng Li, and Guoliang Fan
Mingling Foresight with Imagination: Model-Based Cooperative Multi-Agent Reinforcement Learning
Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS), New Orleans, USA, 2022. (Spotlight)
Zhiwei Xu, Dapeng Li, Bin Zhang, Yuan Zhan, Yunpeng Bai, and Guoliang Fan
Multi-Agent Hyper-Attention Policy Optimization
International Conference on Neural Information Processing (ICONIP), New Delhi, India, 2022.
Bin Zhang, Zhiwei Xu, Yiqun Chen, Dapeng Li, Yunpeng Bai, Guoliang Fan, and Lijuan Li
Efficient Policy Generation in Multi-Agent Systems via Hypergraph Neural Network
International Conference on Neural Information Processing (ICONIP), New Delhi, India, 2022.
Bin Zhang, Yunpeng Bai, Zhiwei Xu, Dapeng Li, and Guoliang Fan
MMD-MIX: Value Function Factorisation with Maximum Mean Discrepancy for Cooperative Multi-Agent Reinforcement Learning
International Joint Conference on Neural Networks (IJCNN), Shenzhen, China, 2021. (Poster)
Zhiwei Xu, Dapeng Li, Yunpeng Bai, and Guoliang Fan
SIDE: State Inference for Partially Observable Cooperative Multi-Agent Reinforcement Learning
International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), Auckland, New Zealand, 2022. (Oral)
Zhiwei Xu, Yunpeng Bai, Dapeng Li, Bin Zhang, and Guoliang Fan
Learning to Coordinate via Multiple Graph Neural Networks
International Conference on Neural Information Processing (ICONIP), Bali, Indonesia, 2021.
Zhiwei Xu, Bin Zhang, Yunpeng Bai, Dapeng Li, and Guoliang Fan

Pre-prints:

Constructing Informative Subtask Representations for Multi-Agent Coordination
Guangchong Zhou, Zhiwei Xu, Bin Zhang, Dapeng Li, Zeren Zhang, and Guoliang Fan
Style Miner: Find Significant and Stable Explanatory Factors in Time Series with Constrained Reinforcement Learning
Dapeng Li, Feiyang Pan, Jia He, Zhiwei Xu, Dandan Tu, and Guoliang Fan

💻 Services

Program Committee Member or Reviewer:

Neural Information Processing Systems (NeurIPS)
International Conference on Learning Representations (ICLR)
International Conference on Machine Learning (ICML)
AAAI Conference on Artificial Intelligence (AAAI)