π0.5: A VLA with open-world generalization by lachyg
9 Comments
beklein
This is amazing! As someone working with industrial robots, which normally operate under strict environmental constraints and control, I find this kind of real-world robotics progress truly exciting for the future!
By the way, they’ve open-sourced their π0 model (code and model weights).
More information can be found here: https://github.com/Physical-Intelligence/openpi
gs17
Is the robot platform they're using something they've developed themselves? The paper doesn't seem to mention any details outside of sensors and actuators.
meisel
These variable-length arrays are getting quite advanced
djoldman
I'm genuinely asking (not trying to be snarky)… Why are these robots so slow?
Is it a throughput constraint given too much data from the environment sensors?
Is it processing the data?
I'm curious about where the bottleneck is.
airstrike
I'm just a layman, but I can't see this design scaling. It's way too slow and "hard" for fine motor tasks like cleaning up a kitchen or being anywhere around humans, really.
I think the future is in "softer" types of robots that can sense whether their fingers are pushing a cabinet door (or meeting resistance) and adjust accordingly. A quick Google search turns up this example (an animated render), which is closer to what I imagine the ultimate solution will look like: https://compliance-robotics.com/compliance-industry/
Human flesh is way too squishy for us to allow hard tools to interface with it, unless the human is in control. The difference between a blunt weapon and the robot from TFA is that the latter is very slow and on wheels.
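For context, the compliant behavior described above is typically implemented as force-feedback control: read a wrist force-torque sensor every control tick and yield along the contact direction instead of pushing harder. A minimal sketch of that idea, with the robot and sensor objects (`arm`, `sensor`) as hypothetical placeholders rather than anything from π0.5 or a real SDK:

```python
# Sketch of force-compliant motion: move toward a target, but back off
# when contact force exceeds a limit. All robot APIs are hypothetical.
import numpy as np

FORCE_LIMIT_N = 5.0       # assumed contact-force threshold (newtons)
STEP_M = 0.002            # per-tick Cartesian step size (metres)
COMPLIANCE_GAIN = 0.0005  # assumed: metres of retreat per newton of excess force

def compliant_move(arm, sensor, target_pos):
    """Step toward target_pos, yielding to unexpected resistance."""
    while True:
        pos = arm.end_effector_position()      # hypothetical: current (x, y, z)
        error = target_pos - pos
        dist = np.linalg.norm(error)
        if dist < 1e-3:
            return True                        # close enough: target reached
        force = np.asarray(sensor.read())      # hypothetical: contact force vector (N)
        magnitude = np.linalg.norm(force)
        if magnitude > FORCE_LIMIT_N:
            # Resistance detected (e.g. a cabinet door pushing back):
            # retreat along the contact direction instead of pushing harder.
            retreat = -force / magnitude * COMPLIANCE_GAIN * (magnitude - FORCE_LIMIT_N)
            arm.move_cartesian(pos + retreat)  # hypothetical: small Cartesian move
        else:
            arm.move_cartesian(pos + error / dist * min(STEP_M, dist))
```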
huydotnet
Amazing! On a fun note, I believe if a human kid were cleaning up the spill and threw the sponge into the sink like that, the kid would be in trouble. XD
th0ma5
Do the general laws of demos apply here? That any automation shown is the extent of the capabilities, not the start?
desertmonad
Finally, machines doing the work we don't want to do.
bytesandbits
Most of it is open source. Their VLAs are built on Gemma models plus vision encoders, with their own action experts on top. You can download, play around with, or fine-tune their π0 VLAs directly from their servers (JAX format) or from the Hugging Face LeRobot safetensors port. They also have notebooks and code in their repo to get started with fine-tuning. Inference runs on a single RTX 4090, with actions streamed over WiFi to the robot.
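To make that last point concrete: the setup described is a client-server split, where the VLA runs on the GPU workstation and the robot streams observations to it, receiving a chunk of future actions per inference call. A rough sketch of that loop follows; the server URL, endpoint, and robot API are purely illustrative assumptions, not openpi's actual client interface (the real client and serving code live in the openpi repo):

```python
# Illustrative "GPU box over WiFi" inference loop. The URL, endpoint,
# and robot methods are hypothetical placeholders, not openpi's API.
import json
import time
import urllib.request

import numpy as np

POLICY_URL = "http://192.168.1.50:8000/infer"  # hypothetical: the 4090 workstation
CONTROL_HZ = 10                                 # assumed robot control rate

def infer_remote(observation: dict) -> np.ndarray:
    """Send one observation to the policy server; get back an action chunk."""
    payload = json.dumps(observation).encode()
    req = urllib.request.Request(
        POLICY_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return np.asarray(json.loads(resp.read())["actions"])

def control_loop(robot):
    # The VLA returns a chunk of future actions per call, so the robot can
    # keep executing while the next chunk is being computed.
    while True:
        obs = {
            # In practice images would be JPEG-encoded before JSON; the raw
            # list here just keeps the sketch short.
            "image": robot.get_camera_image().tolist(),  # hypothetical robot API
            "state": list(robot.get_joint_positions()),  # hypothetical robot API
        }
        actions = infer_remote(obs)                      # shape: (horizon, action_dim)
        for a in actions:
            robot.apply_action(a.tolist())               # hypothetical robot API
            time.sleep(1.0 / CONTROL_HZ)
```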