Sep 26, 2025
Jul 1, 2024
We propose the Optimistic Value Distribution Explorer (OVD-Explorer) to achieve noise-aware optimistic exploration for continuous control.
May 1, 2024
Organizing samples by trajectory can improve the learning efficiency of offline RL algorithms.
May 1, 2024
Feb 1, 2024
Jan 1, 2024
Jan 1, 2024
Jan 1, 2024
Jan 1, 2023
Jan 1, 2023