Meet Hi Robot, a Hierarchical Interactive Robot that can listen and think harder to get tasks done. Researchers got robots to reason hierarchically with a two-level approach:
– A high-level VLM interprets user input, generates language commands & verbal responses
– A low-level VLA executes atomic actions (e.g., “pick up a slice of bread”)
This lets robots break down complex prompts and adapt in real time.
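To make the split concrete, here is a minimal Python sketch of one control step in a two-level loop. It is only an illustration, not the researchers' implementation: `HighLevelOutput`, `toy_high_level`, `toy_low_level`, and `step` are hypothetical names, and the two "policies" are plain functions standing in for the real VLM and VLA.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class HighLevelOutput:
    command: str           # atomic language command passed to the low level
    reply: Optional[str]   # optional verbal response to the user


# Hypothetical type aliases: each policy takes an observation plus text.
HighLevelPolicy = Callable[[str, str], HighLevelOutput]
LowLevelPolicy = Callable[[str, str], str]


def toy_high_level(observation: str, user_text: str) -> HighLevelOutput:
    """Stand-in for the high-level VLM: maps user input to a command and reply."""
    if "sandwich" in user_text.lower():
        return HighLevelOutput("pick up a slice of bread", "Sure, starting your sandwich.")
    return HighLevelOutput("wait", None)


def toy_low_level(observation: str, command: str) -> str:
    """Stand-in for the low-level VLA: turns an atomic command into an action."""
    return f"executing: {command}"


def step(
    high: HighLevelPolicy,
    low: LowLevelPolicy,
    observation: str,
    user_text: str,
) -> Tuple[HighLevelOutput, str]:
    """One control step: the high level decides, the low level acts."""
    decision = high(observation, user_text)
    action = low(observation, decision.command)
    return decision, action


if __name__ == "__main__":
    decision, action = step(toy_high_level, toy_low_level, "bread on table", "Make me a sandwich")
    print(decision.reply)
    print(action)
```

Calling `step` again whenever the user says something new is what would let the high level change the command mid-task, while the low level only ever sees short atomic instructions.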
Introducing Hi Robot – Hierarchical Interactive Robot
Our first step at @physical_int towards teaching robots to listen and think harder. A 🧵 on how we make robots more steerable 👇 pic.twitter.com/L77RMdxVOO
— Lucy Shi (@lucy_x_shi) February 26, 2025
VLMs also make it possible to relabel demonstrations with hypothetical human prompts and interjections. The video above shows the robot making sandwiches.