Welcome to Dapeng Li (李大鹏)’s Homepage!
I received my Ph.D. from the Institute of Automation, Chinese Academy of Sciences, and the School of Artificial Intelligence, University of Chinese Academy of Sciences, advised by Prof. Guoliang Fan. I currently lead the Agent Reinforcement Learning Group at Li Auto Inc. (理想汽车), where I work on algorithm research and development, and I am also a joint in-service postdoctoral researcher at Li Auto Inc. and Tsinghua University (清华大学). My work covers research and applications of agent reinforcement learning, multi-agent systems, and large language model (LLM) agents.
🔥 Fields of Interest
My research interests include reinforcement learning, multi-agent systems and data mining & analysis. Currently, I focus on the following research topics:
- Multi-Agent Collaboration
- Multi-Agent Reinforcement Learning
- Quantitative Finance
- Large Language Model (LLM) Agents
Other Interests:
- Guitar
- Skiing
📖 Education & Experience
- 2025.08 - Present: Head of Agent RL Group at Li Auto Inc. and Joint Postdoctoral Researcher at Li Auto Inc. and Tsinghua University
- 2020.09 - 2025.06: Ph.D., Institute of Automation, Chinese Academy of Sciences
  Supervisor: Prof. Guoliang Fan
- 2016.09 - 2020.06: B.E., Department of Information, Beijing University of Technology
  Major: Electronic Engineering
🎖 Selected Competitions and Awards
- The 1st (1/1122), 2021, DataFountain Green Future Competition, Wind Power Abnormal Data Recognition Track
- The 1st (1/620), 2021, DataFountain Green Future Competition, Photovoltaic Abnormal Data Recognition Track
- The 3rd (3/172), 2021, Global Open Data Application Innovation Competition, Wind Field Downscaling Track
- The 2nd (2/423), 2021, Global Open Data Application Innovation Competition, Road Detection Track
- The Grand Prize (1/158), 2021, Golden Wind Cup, Tsinghua
- The 3rd (3/1511), 2021, DCIC Digital China Innovation Competition
- The 3rd (3/739), 2021, iFLYTEK A.I. Advertising Picture Material Classification Algorithm Challenge
- The 3rd, 2021, NeurIPS workshop MineRL intro
- The 1st (1/4337), 2021, Tianchi Global AI Innovation Contest
- The Second Prize (National), 2021, National Post-Graduate Mathematical Contest in Modeling
- The 1st (1/2800), 2019, China Datathon
- The First Prize, 2019, The “Challenge Cup” capital college students competition
- The 1st (1/475), 2019, National University Student Transportation Science and Technology Competition
- Silver medal, 2019, Microsoft Malware Prediction, Kaggle
- The Second Prize (Global), 2019, International Competition of Autonomous Running Intelligent Robots
- The 1st Prize, 2018, China Robot Competition, by Chinese Association of Automation (National)
Awards from my graduate and undergraduate studies:
- Outstanding Mentor Award, China Computer Federation (CCF)
- 2021 Excellent Student, University of Chinese Academy of Sciences
- 2020 Outstanding Graduates of Beijing
- 2020 Beijing University of Technology Top Ten Graduates
- 2020 Beijing University of Technology President Scholarship (only ten recipients university-wide)
- 2019 Technology Innovation and Practice Scholarship
💻 Internships
- 2023, Huawei, China
- 2024, Microsoft Research Asia (MSRA), China
📝 Selected Publications
Conferences:
- Beyond Local Views: Global State Inference with Diffusion Models for Cooperative Multi-Agent Reinforcement Learning
  Reinforcement Learning Conference (RLC), 2026.
  Zhiwei Xu, Hangyu Mao, Nianmin Zhang, Shengtao Zhang, Xin Xin, Pengjie Ren, Dapeng Li, Bin Zhang, Guoliang Fan, Zhumin Chen, Changwei Wang, and Jiangjin Yin
- From Traits to Roles: Consensus-Guided Composition of Orthogonal Experts for Cooperative MARL
  International Joint Conference on Artificial Intelligence (IJCAI), 2026.
  Yewei Zhou, Bin Zhang, Ying Zhou, Xuri Ge, Dapeng Li, Hangyu Mao, Pengjie Ren, and Zhiwei Xu
- QSIM: Mitigating Overestimation in Multi-Agent Reinforcement Learning via Action Similarity Weighted Q-Learning
  International Conference on Automated Planning and Scheduling (ICAPS), 2026.
  Yuanjun Li, Bin Zhang, Hao Chen, Zhouyang Jiang, Dapeng Li, and Zhiwei Xu
- Peak-Return Greedy Slicing: Subtrajectory Selection for Transformer-based Offline RL
  The Fourteenth International Conference on Learning Representations (ICLR), 2026.
  Zhiwei Xu, Miduo Cui, Dapeng Li, Zhihao Liu, Haifeng Zhang, Hangyu Mao, Guoliang Fan, and Bin Zhang
- Balancing Rewards in Text Summarization: Multi-Objective Reinforcement Learning via HyperVolume Optimization
  IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2026.
  Junjie Song*, Yiwen Liu*, Dapeng Li*, Yin Sun, Shukun Fu, Siqi Chen, and Yuji Cao
  *Equal contribution
- Efficient Communication in Multi-Agent Reinforcement Learning with Implicit Consensus Generation
  The 39th Annual AAAI Conference on Artificial Intelligence (AAAI), Philadelphia, Pennsylvania, USA, 2025. (Oral)
  Dapeng Li, Na Lou, Zhiwei Xu, Bin Zhang, and Guoliang Fan
- Sequential Asynchronous Action Coordination in Multi-Agent Systems: A Stackelberg Decision Transformer Approach
  The Forty-first International Conference on Machine Learning (ICML), 2024.
  Bin Zhang, Hangyu Mao, Lijuan Li, Zhiwei Xu, Dapeng Li, Rui Zhao, and Guoliang Fan
- Reidentify: Context-Aware Identity Generation for Contextual Multi-Agent Reinforcement Learning
  The Forty-second International Conference on Machine Learning (ICML), Vancouver, Canada, 2025.
  Zhiwei Xu, Kun Hu, Xin Xin, Weiliang Meng, Yiwei Shi, Hangyu Mao, Bin Zhang, Dapeng Li, and Jiangjin Yin
- From Explicit Communication to Tacit Cooperation: A Novel Paradigm for Cooperative MARL
  International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), Auckland, New Zealand, 2024. (Extended Abstract)
  Dapeng Li, Zhiwei Xu, Bin Zhang, and Guoliang Fan
- Adaptive Parameter Sharing for Multi-Agent Reinforcement Learning
  IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Korea, 2024.
  Dapeng Li, Na Lou, Bin Zhang, Zhiwei Xu, and Guoliang Fan
- SEA: A Spatially Explicit Architecture for Multi-Agent Reinforcement Learning
  International Joint Conference on Neural Networks (IJCNN), Queensland, Australia, 2023.
  Dapeng Li, Zhiwei Xu, Bin Zhang, and Guoliang Fan
- Dual Self-Awareness Value Decomposition Framework without Individual Global Max for Cooperative MARL
  Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS), New Orleans, USA, 2023.
  Zhiwei Xu, Bin Zhang, Dapeng Li, Guangchong Zhou, Zeren Zhang, and Guoliang Fan
- Inducing Stackelberg Equilibrium through Spatio-Temporal Sequential Decision-Making in Multi-Agent Reinforcement Learning
  32nd International Joint Conference on Artificial Intelligence (IJCAI), Macao SAR, China, 2023.
  Bin Zhang, Lijuan Li, Zhiwei Xu, Dapeng Li, and Guoliang Fan
- Unveiling Decision Intention for Cooperative Multi-Agent Reinforcement Learning
  International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), Detroit, Michigan, USA, 2025.
  Zeren Zhang, Zhiwei Xu, Guangchong Zhou, Dapeng Li, Bin Zhang, and Guoliang Fan
- Decentralized Extension for Centralized Multi-Agent Reinforcement Learning via Online Distillation
  International Conference on Neural Information Processing (ICONIP), Auckland, New Zealand, 2024.
  Zeren Zhang, Bin Zhang, Guangchong Zhou, Dapeng Li, Zhiwei Xu, and Guoliang Fan
- Consensus Learning for Cooperative Multi-Agent Reinforcement Learning
  Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI), Washington, DC, USA, 2023. (Oral)
  Zhiwei Xu, Bin Zhang, Dapeng Li, Zeren Zhang, Guangchong Zhou, and Guoliang Fan
- HAVEN: Hierarchical Cooperative Multi-Agent Reinforcement Learning with Dual Coordination Mechanism
  Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI), Washington, DC, USA, 2023. (Oral)
  Zhiwei Xu, Yunpeng Bai, Bin Zhang, Dapeng Li, and Guoliang Fan
- Mingling Foresight with Imagination: Model-Based Cooperative Multi-Agent Reinforcement Learning
  Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS), New Orleans, USA, 2022. (Spotlight)
  Zhiwei Xu, Dapeng Li, Bin Zhang, Yuan Zhan, Yunpeng Bai, and Guoliang Fan
- Multi-Agent Hyper-Attention Policy Optimization
  International Conference on Neural Information Processing (ICONIP), New Delhi, India, 2022.
  Bin Zhang, Zhiwei Xu, Yiqun Chen, Dapeng Li, Yunpeng Bai, Guoliang Fan, and Lijuan Li
- Efficient Policy Generation in Multi-Agent Systems via Hypergraph Neural Network
  International Conference on Neural Information Processing (ICONIP), New Delhi, India, 2022.
  Bin Zhang, Yunpeng Bai, Zhiwei Xu, Dapeng Li, and Guoliang Fan
- MMD-MIX: Value Function Factorisation with Maximum Mean Discrepancy for Cooperative Multi-Agent Reinforcement Learning
  International Joint Conference on Neural Networks (IJCNN), Shenzhen, China, 2021. (Poster)
  Zhiwei Xu, Dapeng Li, Yunpeng Bai, and Guoliang Fan
- SIDE: State Inference for Partially Observable Cooperative Multi-Agent Reinforcement Learning
  International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), Auckland, New Zealand, 2022. (Oral)
  Zhiwei Xu, Yunpeng Bai, Dapeng Li, Bin Zhang, and Guoliang Fan
- Learning to Coordinate via Multiple Graph Neural Networks
  International Conference on Neural Information Processing (ICONIP), Bali, Indonesia, 2021.
  Zhiwei Xu, Bin Zhang, Yunpeng Bai, Dapeng Li, and Guoliang Fan
Pre-prints:
- Constructing Informative Subtask Representations for Multi-Agent Coordination
  Guangchong Zhou, Zhiwei Xu, Bin Zhang, Dapeng Li, Zeren Zhang, and Guoliang Fan
- Style Miner: Find Significant and Stable Explanatory Factors in Time Series with Constrained Reinforcement Learning
  Dapeng Li, Feiyang Pan, Jia He, Zhiwei Xu, Dandan Tu, and Guoliang Fan
💻 Services
Program Committee Member or Reviewer:
- Neural Information Processing Systems (NeurIPS)
- International Conference on Learning Representations (ICLR)
- International Conference on Machine Learning (ICML)
- AAAI Conference on Artificial Intelligence (AAAI)