Show HN: Beating Pokemon Red with RL and <10M Parameters


Hi! Since 2020, we’ve been developing a reinforcement learning (RL) agent to beat the 1996 game Pokémon Red.
As of February 2025, we are able to beat Pokémon Red with reinforcement learning using a policy with fewer than 10 million parameters (60500x smaller than DeepSeekV3) and with minimal simplifications. The output is not a policy capable of beating Pokémon, but a technique for producing solutions to Pokémon. This website describes the system’s current state. All code is open sourced and available for you, the reader, to try.

As improvements to the codebase are made, the changelog will be updated.

What is Pokémon Red?

Pokémon Red, released in 1996, is a single-player Japanese role-playing game (JRPG) that follows the journey of a new “Pokémon Trainer.” Players capture Pokémon “creatures” to battle against opposing Pokémon, explore the world, and progress through the game’s storyline. Pokémon has two goals:

Catch all possible Pokémon species.
Become the “champion.”

We focused on the second (more popular) goal, becoming the champion.

Why Pokémon Red

Why do we care about developing an agent to beat Pokémon with machine learning?
The answer is broader than Pokémon itself. We believe that solving JRPGs with reinforcement learning poses extremely difficult challenges not present in current RL environments. We hope that JRPGs will provide a great benchmark for improving AI.

JRPGs:

Can be just as complex as games like Go, StarCraft II or Minecraft.
Involve complex reasoning and decision making.
Are nonlinear.
Can be long, with more than 24 hours of average human gameplay. Pokémon Red takes 25 hours on average for a new player to complete.
Require mult


jononor

Posted March 5, 2025 at 6:26 pm

Very nice! Nice to see demonstrations of reinforcement learning being used to solve non-trivial tasks.


xinpw8

Posted March 5, 2025 at 6:28 pm

This is a first-in-world, isn't it?


worble

Posted March 5, 2025 at 6:39 pm

Heads up, clicking "Next Page" just takes you to an empty screen; you have to use the navigation links on the left if you want to read past the first screen.


bee_rider

Posted March 5, 2025 at 6:52 pm

Ah, very neat.

Maybe some day the “rival” character in Pokemon can be played by an RL system, haha. That way you can have a “real player (simulated)” for your rival.


modeless

Posted March 5, 2025 at 6:52 pm

Can't Pokemon be beaten by almost random play?


bubblyworld

Posted March 5, 2025 at 6:57 pm

What an awesome project! I'm curious – I would have thought that rewarding unique coordinates would be enough to get the agent to (eventually) explore all areas, including the key ones. What did the agents end up doing before key areas got an extra reward?

(and how on earth did you port Pokémon Red to an RL environment? O.o)
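
For readers unfamiliar with the idea raised in this comment, here is a minimal sketch of a unique-coordinate exploration reward. It is not the project's actual reward function: the class name, the (map_id, x, y) state encoding, and the bonus value are all assumptions made for illustration.

```python
from typing import Iterable, Set, Tuple

# Hypothetical state encoding: (map_id, x, y). The real project may differ.
Coord = Tuple[int, int, int]


class UniqueCoordReward:
    """Pays a fixed bonus the first time each coordinate is seen in an episode."""

    def __init__(self, bonus: float = 1.0) -> None:
        self.bonus = bonus
        self.seen: Set[Coord] = set()

    def reset(self) -> None:
        # Forget all visited coordinates at the start of a new episode.
        self.seen.clear()

    def step(self, coord: Coord) -> float:
        # Revisiting a known coordinate earns nothing.
        if coord in self.seen:
            return 0.0
        self.seen.add(coord)
        return self.bonus


def total_reward(path: Iterable[Coord], bonus: float = 1.0) -> float:
    """Sum the exploration reward collected along a sequence of coordinates."""
    tracker = UniqueCoordReward(bonus)
    return sum(tracker.step(c) for c in path)
```

Under this scheme an agent that paces back and forth over the same tiles collects no further reward, so only new ground pays out.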


rvz

Posted March 5, 2025 at 7:20 pm

Note: What makes this interesting is that this is a pre-LLM project, which shows that some projects don't need an "LLM" at all. All you need is a plain old reinforcement learning algorithm and a deep neural network, which is perfect for this.

This is what I want to see more of and goes against the hype of LLMs. What a great RL project.

Meanwhile, "Claude" is still stuck somewhere in the game. Imagine the costs of running that vs this project.


mclau156

Posted March 5, 2025 at 7:26 pm

Could you have used the decompilations of pokemon on github? https://github.com/pret/pokered


levocardia

Posted March 5, 2025 at 7:26 pm

Really cool work. It seems like some critical areas (team rocket, safari zone) rely on encoding game knowledge into the reward function somehow, which "smuggles in" external intelligence about the game. A lot of these are related to planning, which makes me wonder whether you could "bolt on" an LLM to do things like steer the RL agent, dynamically choose what to reward, or even do some of the planning itself. Do you think there's any low-hanging fruit on this front?


differintegral

Posted March 5, 2025 at 7:28 pm

This is very cool, congrats!

I wonder, does anyone have a sense of the approximate raw number of button presses required to beat the game? Mostly curious to see how that compares to the parameter count.


benopal64

Posted March 5, 2025 at 7:47 pm

Incredible work. I am just learning about PyBoy from your project, and it made me think of many fun ways to use that library to play Pokemon autonomously.


kerkeslager

Posted March 5, 2025 at 9:17 pm

Are there any uses for AI yet that aren't either:

1. Doing things humans do for fun.
2. Doing things that AI is horribly terrible at.

?


nimish

Posted March 5, 2025 at 10:25 pm

Considering how many things are less complicated than Pokemon, this is very cool


novia

Posted March 5, 2025 at 11:01 pm

Please stream the gameplay to twitch so people can compare.


endofreach

Posted March 5, 2025 at 11:04 pm

> Pokémon Red takes 25 hours on average for a new player to complete.

Seriously? I've never really played video games, but I remember spending so much time on Pokemon Red when I was young. Not sure if I ever really finished it more than once. But I'm pretty sure I must have played for more than 50h or so before I was even close to finishing. My memory might be tricking me, though.

Not sure which Pokemon version it was, but I got so hooked trying to get this "secret" Pokemon which was just a bunch of pixels. Some kind of bug (of the game, not the type of Pokemon). You had to do specific things in a park and other things and then surf up and down x times on the right shore of an island… or something like that.
I had no idea how it worked and got so hooked that I must have spent most of my playing time on things like that.

Oh boy, memories…


Show HN: Beating Pokemon Red with RL and <10M Parameters by drubs
