About Me
I am a penultimate-year PhD student at the University of Cambridge, supervised by Prof. Mihaela van der Schaar. During my M.Phil. at MMLab@CUHK, I was advised by Prof. Dahua Lin and Prof. Bolei Zhou. I received a BSc in Physics from the Yuanpei Honor Program and a BSc from the Guanghua School of Management, both at Peking University. My undergraduate thesis was advised by Prof. Zhouchen Lin.
I believe Reinforcement Learning is a vital component of the solution for achieving AGI. My previous work on deep reinforcement learning is motivated by practical applications such as robotics, healthcare, finance, and large language models. My research keywords over the past 4 years include:
- RL via Supervised Learning (2020-); Goal-Conditioned RL (2020-)
- Value-Based DRL (2021-); Offline RL (2021-); Optimism in Exploration (2021-)
- Uncertainty Quantification (2022-); Data-Centric Off-Policy Evaluation (2022-)
- Interpretable RL (2023-); RL in Language Models (2023-)
I'm open to collaborations. Please drop me an email if you find my work interesting. Let us push RL closer to genuine general intelligence!
News
(2023.11) I'm thrilled to share my thoughts on RLHF with the CoAI group at THU. The slides are online.
(2023.10) Prompt-OIRL was selected for an oral presentation at the NeurIPS 2023 ENLSP workshop!
(2023.10) I wrote an article on RLHF to share my thoughts as an RL researcher in the Era of LLMs.
(2023.9) Our work Prompt-OIRL, on offline prompt evaluation and optimization using IRL, is online.
(2023.9) Two papers were accepted to NeurIPS 2023. I'm looking forward to the reunion in New Orleans!
(2023.9) I'm honored to share my experience and ideas with Kuaishou Research in a talk titled "Reinforcement Learning in the Era of LLMs". The slides are online.
(2023.2) Two papers were accepted to AISTATS 2023.
(2022.11) I'm honored to share my experience and ideas with HW Cloud Research through a talk on value-based DRL. The slides are online.
(2022.9) One paper was accepted to NeurIPS 2022, along with two papers presented at the FMDM workshop and two at the DeepRL workshop.
(2022.1) One paper was accepted to ICLR 2022.