FSD connects spatial visual reasoning with robotic action by generating structured intermediate representations that improve generalization on unseen manipulation tasks.
Jan 3, 2026
RoboAnnotatorX is a universal framework for annotating long-horizon robot demonstrations so that they can be understood accurately.
Jul 10, 2025
A study of how multimodal LLM feedback can improve robotic manipulation planning and execution.
Feb 1, 2024
A holistic language-and-vision planning framework that unifies perception, reasoning, imagination, and action for manipulation.
Jan 1, 2024
A unified keyframe identification and skill annotation method for long-horizon robot demonstrations.
Jan 1, 2024