Beyond Scalar Critics: A Distributional Perspective on Reinforcement Learning with Verifiable Rewards for LLMs
Jan 1, 2026
Jinyi Liu
Yiboyun Chen
Hongyao Tang
Yi Ma
Shuyue Hu
Yang Chen
Fei Ni
Qiaosheng Zhang
Lei Bai
Yan Zheng
Jianye Hao