Using π0 VLA + Kinesthetic Teaching on OpenDroid R2D3
A weekend hackathon project that trained a dual-arm robot to pour heart patterns using vision-language-action models and flow matching.
From human demonstration to autonomous pouring:
1. Demonstrate: a human guides the robot arms through the pouring motion.
2. Record: 3 camera views plus 12D joint states, logged at 20 Hz.
3. Learn: the policy picks up the subtle wrist movements and timing.
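The recording format above can be sketched as one record per timestep. The field names follow the LeRobot convention used later in this page, but the image resolution and values here are placeholders, not the actual dataset schema:

```python
import numpy as np

CONTROL_HZ = 20   # demonstrations are logged at 20 Hz
STATE_DIM = 12    # 6 joint angles per RM65 arm, two arms
NUM_CAMERAS = 3

def make_frame(rng, t):
    """One illustrative timestep of a kinesthetic-teaching recording."""
    return {
        # three RGB views; 224x224 is a placeholder resolution
        "observation.images": [
            rng.integers(0, 256, (224, 224, 3), dtype=np.uint8)
            for _ in range(NUM_CAMERAS)
        ],
        "observation.state": rng.standard_normal(STATE_DIM).astype(np.float32),
        "action": rng.standard_normal(STATE_DIM).astype(np.float32),
        "timestamp": t / CONTROL_HZ,
    }

rng = np.random.default_rng(0)
frame = make_frame(rng, t=0)
```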
Hardware and workspace configuration
OpenDroid R2D3 with dual Realman RM65 arms and kinesthetic teaching backpack. The backpack allows demonstrators to physically guide the robot's movements naturally.
Data Collection Props: Milk pitchers, espresso cups, milk frother, and all tools needed for demonstrating professional latte art pours.
The Goal: Professional heart patterns requiring smooth, coordinated pouring
Publicly available on HuggingFace Hub
# Load the dataset
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset
dataset = LeRobotDataset("ridxm/latte-pour-demos")
frame = dataset[0]  # indexing returns a single timestep (frame), not a whole episode
print(f"State shape: {frame['observation.state'].shape}")  # [12] -- dual 6-DoF arms
print(f"Action shape: {frame['action'].shape}")  # [12]
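Before training on recordings like these, the 12D states are typically normalized per dimension. This is a generic sketch on synthetic values, not the repo's actual preprocessing code:

```python
import numpy as np

# Synthetic stand-in for one episode of 12D joint states; the real
# values would come from dataset[i]["observation.state"].
states = np.random.default_rng(1).standard_normal((200, 12)).astype(np.float32)

mean = states.mean(axis=0)
std = states.std(axis=0) + 1e-6   # epsilon guards near-constant joints
normalized = (states - mean) / std
```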
Built on π0 VLA with flow matching
Pre-trained VLA from Physical Intelligence, trained on 10k+ hours of robot data
Generates smooth, continuous action sequences for fluid pouring motions
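Flow matching produces an action chunk by integrating a learned velocity field from Gaussian noise toward the data distribution. Below is a minimal Euler-integration sketch, with a toy velocity field standing in for π0's learned network; the chunk size, step count, and target are illustrative:

```python
import numpy as np

ACTION_DIM = 12   # dual-arm joint targets
CHUNK = 50        # actions predicted per chunk (assumed)
STEPS = 10        # Euler integration steps

def velocity(x, t):
    """Placeholder for the learned velocity field v_theta(x, t).

    For illustration it points from x toward a fixed target chunk
    (all zeros), so integration lands exactly on the target.
    """
    target = np.zeros_like(x)
    return (target - x) / max(1.0 - t, 1e-3)

def sample_actions(rng, steps=STEPS):
    x = rng.standard_normal((CHUNK, ACTION_DIM))   # pure noise at t = 0
    for i in range(steps):
        t = i / steps
        x = x + velocity(x, t) / steps             # Euler step: x += v * dt
    return x

actions = sample_actions(np.random.default_rng(0))
```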
15k training steps with bfloat16 mixed precision
Overlapping action chunks with exponential weighting for smooth execution
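The overlapping-chunk smoothing can be sketched as ACT-style temporal ensembling: every prediction previously made for the current control step is averaged with exponentially decaying weights. The decay constant `m` and the newest-first ordering here are assumptions, not the repo's exact scheme:

```python
import numpy as np

def ensemble_action(pending, m=0.1):
    """Blend every action predicted for the current control step.

    pending[i] is the prediction made i steps ago; older predictions
    receive exponentially smaller weight exp(-m * i).
    """
    preds = np.stack(pending)                 # (n_overlapping, action_dim)
    w = np.exp(-m * np.arange(len(pending)))  # newest prediction first
    w /= w.sum()
    return (w[:, None] * preds).sum(axis=0)

# Two overlapping 12D chunk predictions land on the same timestep
blended = ensemble_action([np.ones(12), np.zeros(12)])
```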
Complete guide to train your own latte art robot
# Quick start training
git clone https://github.com/ridxm/latte-art-robot
cd latte-art-robot
pip install lerobot torch wandb
# Point training at your own dataset via the repo_id flag
python scripts/train.py --dataset.repo_id=YOUR_USERNAME/your-dataset