News!

🇸🇬 (2025.04) I'll attend ICLR 2025 in person.

🇺🇸 (2025.03) Guest lecture on Inverse RL Meets LLMs at the UCLA Reinforcement Learning course.

🇺🇸 (2025.02) Attending AAAI 2025 to run the Tutorial: Inverse RL Meets LLMs. Thanks for joining us in Philadelphia! Slide.

📄 (2025.02) Our Reward Model Paper Part IV: Multi-Objective and Personalized Alignment with PCA is online.

📄 (2025.02) Our Reward Model Paper Part III: Infrastructure for Reproducible Reward Model Research is online.

📄 (2025.02) Our Reward Model Paper Part II: Active Reward Modeling is online.

📄 (2025.01) Our Reward Model Paper Part I: Foundation, Theory, and Alternatives is accepted by ICLR as an Oral 🎉. It was an amazing experience to co-lead this paper with Yunyi, advised by Jef.

🇦🇹 (2024.12) We will run the Tutorial: Inverse RL Meets LLMs at ACL 2025. See you in Vienna!

🇬🇧 (2024.10) New talk on Inverse RL Meets LLMs at the vdsLab2024 OpenHouse and UCLA Zhou Lab. Slide is online.

📄 (2024.09) Our Data-Centric Reward Modeling paper is accepted by the Journal of Data-Centric Machine Learning Research (DMLR).

🇺🇸 (2024.08) InverseRLignment is presented at the RL Beyond Reward workshop (accepted with a score of 9) at the first RL Conference; it builds reward models from SFT data.

📄 (2024.05) Our RLHF with Dense Reward paper is accepted by ICML 2024.

🇬🇧 (2024.03) Prompt-OIRL and RATP are featured at the Inspiration Exchange; the recording is online.

🇦🇹 (2024.01) 1 RL + LLM Reasoning paper is accepted by ICLR 2024! Prompt-OIRL uses Inverse RL to evaluate and optimize prompts for Math Reasoning.

🇺🇸 (2024.01) Invited talk on RLHF at the Intuit AI Research Forum. Slide.

🇨🇳 (2023.12) Invited talk on RLHF at the Likelihood Lab. Slide.

🇨🇳 (2023.11) Invited talk on RLHF at the CoAI group, THU. Slide.

📄 (2023.10) Prompt-OIRL is selected as an oral presentation 🎉 at the NeurIPS 2023 ENLSP workshop!

📄 (2023.10) I wrote an article to share my thoughts as an RL researcher in the Era of LLMs.

📄 (2023.09) 2 papers on Interpretable Offline RL and Interpretable Uncertainty Quantification are accepted by NeurIPS 2023.

🇨🇳 (2023.09) Invited talk on "Reinforcement Learning in the Era of LLMs" at Kuaishou Research. Slide is online.

📄 (2023.02) 2 papers are accepted by AISTATS 2023.

🇮🇪 (2022.11) Invited talk on value-based DRL at HW Cloud Research. Slide is online.

📄 (2022.09) 1 paper on Value-Based DeepRL is accepted by NeurIPS 2022. 2 papers are presented at the FMDM workshop, and 2 papers are presented at the DeepRL workshop.

📄 (2022.01) 1 paper on Offline GCRL is accepted by ICLR 2022.