Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback
Jan 1, 2024 · Yifu Yuan, HAO Jianye, Yi Ma, Zibin Dong, Hebin Liang, Jinyi Liu, Zhixin Feng, Kai Zhao, Yan Zheng
Type: Conference paper
Publication: The Twelfth International Conference on Learning Representations
Last updated on Jan 1, 2024
Tags: DRL, PbRL