Jinyi Liu (刘金毅)

Uni-RLHF

Jan 1, 2024 · 1 min read

Uni-RLHF is a universal platform and benchmark suite for reinforcement learning with diverse types of human feedback.

Last updated on Jan 1, 2024
PbRL RLHF
Authors
Jinyi Liu
Ph.D. Candidate


© 2026 Jinyi Liu. This work is licensed under CC BY NC ND 4.0
