News!

πŸ“„ (2024.10) Our Tutorial: Inverse RL Meets LLMs has been accepted for AAAI-2025! Join us in Philadelphia and let us explore the potential of Inverse RL in the era of LLMs!
πŸ’¬ (2024.10) New talk on Inverse RL Meets LLMs at the vdsLab2024 OpenHouse and the UCLA Zhou Lab. This talk summarizes our efforts in using IRL for better Prompting, Fine-Tuning, and Inference-Time Optimization. Slides are online.
πŸ“„ (2024.09) Our Data-Centric Reward Modeling paper is accepted by the Journal of Data-Centric Machine Learning Research (DMLR).
πŸ“„ (2024.08) InverseRLignment is presented at the RL Beyond Rewards workshop (accepted with a score of 9) at the first RLC.
πŸ“„ (2024.05) InverseRLignment is online; it builds reward models from SFT data.
πŸ“„ (2024.05) Our Dense Reward Model paper is accepted by ICML 2024.
πŸ“„ (2024.03) I wrote an article arguing that Supervised Fine Tuning is Inverse Reinforcement Learning!
πŸ’¬ (2024.03) Prompt-OIRL and RATP are featured at the Inspiration Exchange; recording is online.
πŸ“„ (2024.02) 2 RL+LLM papers are online! ABC uses the attention mechanism to solve the credit assignment problem in RLHF; RATP uses MCTS to enhance the reasoning ability of LLMs with external documents.
πŸ“„ (2024.01) 1 RL+LLM paper is accepted by ICLR 2024! Prompt-OIRL uses Inverse RL to Evaluate and Optimize Prompts for Reasoning.
πŸ’¬ (2024.01) Invited talk on RLHF at the Intuit AI Research Forum. slide
πŸ’¬ (2023.12) Invited talk on RLHF at the Likelihood Lab. slide
πŸ’¬ (2023.11) Invited talk on RLHF at the CoAI group, THU. slide
πŸ“„ (2023.10) Prompt-OIRL is selected as an oral presentation at the NeurIPS 2023 ENLSP workshop!
πŸ“„ (2023.10) I wrote an article on RLHF to share my thoughts as an RL researcher in the Era of LLMs.
πŸ“„ (2023.09) 2 papers on Interpretable Offline RL and Interpretable Uncertainty Quantification are accepted by NeurIPS 2023.
πŸ’¬ (2023.09) Invited talk on β€œReinforcement Learning in the Era of LLMs” at Kuaishou Research. slide is online
πŸ“„ (2023.02) 2 papers are accepted by AISTATS 2023.
πŸ’¬ (2022.11) Invited talk on value-based DRL at HW Cloud Research. slide is online
πŸ“„ (2022.09) 1 paper on Value-Based DeepRL is accepted by NeurIPS 2022. 2 papers are presented at the FMDM workshop, and 2 papers are presented at the DeepRL workshop.
πŸ“„ (2022.01) 1 paper on Offline GCRL is accepted by ICLR 2022.