@ Conference on Robot Learning (CoRL) 2023
1 UC Berkeley
2 Google DeepMind
3 Stanford University
4 Simon Fraser University
TLDR
We train anthropomorphic robot hands to play the piano using deep RL
and release a simulated benchmark and dataset to advance high-dimensional control.
Interactive Demo
This is a demo of our simulated piano playing agent trained with reinforcement learning. It runs
MuJoCo natively in your browser thanks to WebAssembly.
You can use your mouse to interact with it, for example by dragging down the piano keys to generate sound
or pushing the hands to perturb them. The controls section in the top right corner can be used to
change songs and the simulation section to pause or reset the agent. Make sure you click the
demo at least once to enable sound.
Overview
Simulation
We build our simulated piano-playing environment using the open-source MuJoCo
physics engine.
It consists of a full-size 88-key digital keyboard and two Shadow Dexterous
Hands,
each with 24 degrees of freedom.
Musical representation
We use the Musical Instrument Digital Interface (MIDI) standard
to represent
a musical piece as a sequence of time-stamped messages corresponding to “note-on” or “note-off” events. A message
carries
additional pieces of information such as the pitch of a note and its velocity.
We convert the MIDI file into a time-indexed note trajectory (also known as a piano roll),
where each note is represented as a one-hot vector of length 88 (the number of keys on a piano). This trajectory
is used as the goal
representation for our agent, informing it which keys to press at each time step.
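The conversion from time-stamped MIDI messages to a piano roll can be sketched as follows. This is a minimal illustration, not our actual implementation: the `NoteEvent` class, the `events_to_piano_roll` function, and the fixed time step `dt` are hypothetical names introduced here, and the sketch assumes the standard MIDI numbering in which the lowest piano key (A0) is pitch 21.

```python
from dataclasses import dataclass

NUM_KEYS = 88            # keys on a full-size piano
LOWEST_MIDI_PITCH = 21   # MIDI pitch of A0, the lowest piano key


@dataclass
class NoteEvent:
    """Hypothetical container for one time-stamped MIDI message."""
    time: float          # seconds from the start of the piece
    kind: str            # "note_on" or "note_off"
    pitch: int           # MIDI pitch number
    velocity: int = 64   # how hard the key is struck


def events_to_piano_roll(events, dt, duration):
    """Return a list of time steps, each a 0/1 list of length NUM_KEYS."""
    num_steps = int(round(duration / dt))
    roll = [[0] * NUM_KEYS for _ in range(num_steps)]
    active = {}  # key index -> time step at which the key went down

    for ev in sorted(events, key=lambda e: e.time):
        key = ev.pitch - LOWEST_MIDI_PITCH
        if not 0 <= key < NUM_KEYS:
            continue  # pitch lies outside the piano's range
        step = min(int(round(ev.time / dt)), num_steps)
        if ev.kind == "note_on" and ev.velocity > 0:
            active.setdefault(key, step)
        elif key in active:
            # "note_off" (or "note_on" with velocity 0) releases the key.
            for t in range(active.pop(key), step):
                roll[t][key] = 1

    # Keys still held when the piece ends stay on until the last step.
    for key, start in active.items():
        for t in range(start, num_steps):
            roll[t][key] = 1
    return roll


# Middle C (pitch 60) for half a second, then E4 (pitch 64) overlapping it.
events = [
    NoteEvent(0.0, "note_on", 60),
    NoteEvent(0.5, "note_off", 60, 0),
    NoteEvent(0.25, "note_on", 64),
    NoteEvent(1.0, "note_off", 64, 0),
]
roll = events_to_piano_roll(events, dt=0.25, duration=1.0)
```

Each row of `roll` is the goal for one time step: a 1 marks a key that should be held down and a 0 a key that should be left alone.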
The interactive plot below shows the song Twinkle Twinkle Little Star encoded as a piano roll. The x-axis
represents
time in seconds, and the y-axis represents musical pitch as a number between 1 and 88. You can hover over each
note to
see what additional information it carries.
A synthesizer can be used to convert MIDI files to raw
audio:
Musical evaluation
We use precision, recall and F1 scores to evaluate the proficiency of our agent. If, at a given instant in time,
there are keys that should be “on”
and keys that should be “off”, precision measures how good the agent is at not hitting any of the keys that should
be “off”, while recall measures
how good the agent is at hitting all the keys that should be “on”. The F1 score combines the precision and recall
into a single metric, and ranges
from 0 (if either precision or recall is 0) to 1 (perfect precision and recall).
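As a rough sketch of how these metrics could be computed for a single time step, assuming the goal and the agent's key presses are both given as 0/1 lists of length 88: the function name `precision_recall_f1` is hypothetical, and returning 1.0 for precision or recall when the corresponding denominator is zero is a convention chosen here for illustration, not necessarily the one used in our evaluation.

```python
def precision_recall_f1(target, pressed):
    """Score one time step; both arguments are 0/1 sequences over the keys."""
    pairs = list(zip(target, pressed))
    true_pos = sum(1 for t, p in pairs if t and p)        # should be on, is on
    false_pos = sum(1 for t, p in pairs if not t and p)   # should be off, is on
    false_neg = sum(1 for t, p in pairs if t and not p)   # should be on, is off

    # Assumed convention: with nothing pressed (or nothing to press),
    # the corresponding score defaults to 1.0.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 1.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 1.0

    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# The goal is a C major triad (key indices 39, 43, 46); the agent hits
# two of the three keys and one wrong key (index 50).
target = [0] * 88
pressed = [0] * 88
for k in (39, 43, 46):
    target[k] = 1
for k in (39, 43, 50):
    pressed[k] = 1

precision, recall, f1 = precision_recall_f1(target, pressed)
```

Here two of the three pressed keys are correct and two of the three required keys are pressed, so precision, recall and F1 all come out to 2/3. Averaging the per-step scores over the whole piece would give a single number per song.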
Piano fingering and dataset
Piano fingering refers to the assignment of fingers to notes in a piano piece (see figure below).
Sheet music will typically provide sparse fingering labels for the tricky sections of
15 Comments
fedeb95
waiting for Robot Devil.
AIFounder
[dead]
kbouck
Any way we can get the Westworld intro song into the demo list?
amelius
I mean, keys are always in the place where you expect them to be, so what is the challenge here? (Yes, too lazy to read the article).
davidanekstein
Funny enough I’m listening to Rhapsody in Blue while browsing HN. I’d like to see it do that for 17 minutes.
criddell
Is this actually buildable? I'd like to hear it on a real piano because (IMHO) the one in the videos sounds bad.
I'd like to also hear how loud the mechanical noise of the machine playing the piano would be. Does the left hand work harder with the heavier keys? What would the hands be mounted to?
rossant
Nice work! And great interactive 3d application. My 6yo had a lot of fun annoying the robot while it's playing by forcefully moving its hands around.
pama
This was a nice demonstration of letting these robotic hands learn to play a keyboard. The technique is limited due to constraints of the hands, and is closer to parts of the technique of early keyboard instruments like an organ or a harpsichord, rather than that of a modern grand piano, which requires a lot more control of the body core, shoulders, elbows, arms, and wrists with the fingers doing as minimal a motion as possible. I suppose a similar algorithm learning to play on a grand piano using a full humanoid body could learn a technique that would be exciting to analyze.
PatronBernard
The thing needs a teacher, terrible technique! /s
beardyw
All the right notes but not necessarily at the right time.
linwangg
This is awesome!!!
gtirloni
Immediately reminded of the anonymous quote: "I want AI to do my laundry and dishes so that I can do art and writing, not for AI to do my art and writing so that I can do laundry and dishes."
I understand it's a bad argument because 1) people may need assistance to do art/writing and 2) the advancements gained from teaching AI to do art can be applied to other non-artistic endeavors (e.g. piano AI could be really good to operate machines with lots of buttons and no computerized interface).
However, the cost cutting side of the argument is the one that bothers me because companies/people WILL use AI like that in place of actual humans because they're likely to be cheaper in the future. So that pianist or musician playing in a local restaurant can be sure their job will be automated away by a subpar AI and real humans will be relegated to very expensive locations (an extension of replacing humans with recorded music, in a way).
My pessimistic side thinks greed will be the downfall of humanity.
pacificmaelstrm
"Player Piano"
Tycho
Has anyone tried to get a robot to do oil paintings?
torusle
Honestly,
this is really bad. It might be a breakthrough in what you are doing, but when I listen to the output all of the timing and phrasing is awful.