Embodied AI

From Seeing to Doing: Bridging Reasoning and Decision for Robotic Manipulation

FSD (From Seeing to Doing) connects spatial visual reasoning with robotic action by generating structured intermediate representations that improve generalization to unseen manipulation tasks.

Jan 3, 2026

RoboAnnotatorX: A Comprehensive and Universal Annotation Framework for Accurate Understanding of Long-horizon Robot Demonstration

RoboAnnotatorX is a universal framework for annotating long-horizon robot demonstrations so that they can be understood accurately.

Jul 10, 2025

Enhancing robotic manipulation with AI feedback from multimodal large language models

A study of how multimodal LLM feedback can improve robotic manipulation planning and execution.

Feb 1, 2024

Peria: Perceive, reason, imagine, act via holistic language and vision planning for manipulation

A holistic language-and-vision planning framework that unifies perception, reasoning, imagination, and action for manipulation.

Jan 1, 2024

Kisa: A unified keyframe identifier and skill annotator for long-horizon robotics demonstrations

A unified method for identifying keyframes and annotating skills in long-horizon robot demonstrations.

Jan 1, 2024