[Submitted on 2 Mar 2025 (v1), last revised 5 Mar 2025 (this version, v3)]
Abstract: We introduce LADDER (Learning through Autonomous Difficulty-Driven Example Recursion), a framework that enables Large Language Models to autonomously improve their problem-solving capabilities through self-guided learning, by recursively generating and solving progressively simpler variants of complex problems. Unlike prior approaches that require curated datasets or human feedback, LADDER…
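The abstract's recursion can be sketched in a few lines of Python. This is a toy illustration, not the paper's code: here a "problem" is just an integer difficulty and every helper is a hypothetical stand-in for what the paper does with an LLM and a verifier.

```python
def generate_variants(problem: int, n: int = 2) -> list[int]:
    # Hypothetical stand-in for LLM-generated simpler variants of a problem.
    return [max(0, problem - step) for step in range(1, n + 1)]

def build_curriculum(problem: int) -> list[int]:
    # Recurse until variants are trivially easy, then order easy -> hard,
    # ending at the original hard problem.
    if problem == 0:
        return [0]
    seen: list[int] = []
    for variant in generate_variants(problem):
        for p in build_curriculum(variant):
            if p not in seen:
                seen.append(p)
    seen.append(problem)
    return sorted(set(seen))

curriculum = build_curriculum(5)
print(curriculum)  # [0, 1, 2, 3, 4, 5]
```

The point of the sketch is only the shape of the method: simpler variants are generated recursively, and training proceeds from the easiest variant up to the original problem.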
10 Comments
majordroid
> We also introduce TTRL (Test-Time Reinforcement Learning), where we perform reinforcement learning on variants of test problems at inference time. TTRL enables Qwen2.5 7B DeepSeek-R1 Distilled to achieve a state-of-the-art score of 90% on the MIT Integration Bee qualifying examination, surpassing OpenAI o1's performance.
That's incredible!
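The TTRL loop quoted above can be caricatured in a few lines. Everything here is a hypothetical stand-in (the "policy" is one scalar, the "verifier" measures distance to a toy target); the real method fine-tunes an LLM with RL on generated variants of the test problem before answering it.

```python
import random

random.seed(0)  # deterministic toy run

TARGET = 7.0  # stand-in for "the test problem"

def make_variants(problem: float, n: int = 8) -> list[float]:
    # Stand-in for LLM-generated variants of the test problem.
    return [problem + random.uniform(-1.0, 1.0) for _ in range(n)]

def attempt(policy: float, problem: float) -> float:
    # Stand-in for the model's answer; this "model" just emits its parameter.
    return policy

def reward(answer: float, problem: float) -> float:
    # Verifier: reward is higher the closer the answer is to the variant.
    return -abs(answer - problem)

def ttrl_step(policy: float, problem: float, step: float = 0.3) -> float:
    # One verifier-guided update: nudge the policy up or down and keep
    # whichever nudge the verifier scores higher on each variant.
    for variant in make_variants(problem):
        up, down = policy + step, policy - step
        if reward(attempt(up, variant), variant) >= reward(attempt(down, variant), variant):
            policy = up
        else:
            policy = down
    return policy

policy = 0.0
for _ in range(50):
    policy = ttrl_step(policy, TARGET)
print(round(policy, 2))  # ends up near the test problem's target
```

The takeaway is that the adaptation happens to the *parameters* at inference time, driven by a verifier on self-generated variants, not by anything stored in the context window.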
mentalgear
It's exciting to see approaches like RL and curriculum learning, which I always felt were the way to go for real self-improvement back when I was training in robotics ~7 years ago (OpenAI Gym days), finally being applied successfully to NLP/LLMs to boost small-model performance.
(LADDER is essentially a self-curriculum RL approach.)
mentalgear
> We demonstrate LADDER's effectiveness in the subject of mathematical integration, improving Llama 3.2 3B's accuracy from 1% to 82% on undergraduate-level problems
EMIRELADERO
What the hell is going on this week?!?!? (asking positively, with a smile on my face)
I have seen at least 3 interesting/mildly promising breakthroughs in ML just these past two days! I mean, a Google research team just showed that you can combine NNs with cellular automata using digital logic gates as a medium, so you could potentially reduce many kinds of non-linear problems to a simple, efficient digital circuit! And it was on the HN front page, TODAY![1]
I keep seeing more mind-bending stuff related to neural nets and logic/intelligence in general. My mind has been running wild with speculation about the future and just how close we could (or could not) be to truly understanding how intelligence works from first principles.
[1] https://news.ycombinator.com/item?id=43286161
bloomingkales
I’m kinda getting the sense this is still just prompt engineering in a loop.
Persona-based prompting: We prompted the model to adopt different mathematical perspectives (e.g., "think like Euler focusing on series", "approach like Gauss looking for patterns").
I mean … I guess that’s scientific?
Besides that, how can the model learn at test time (at inference)? It's stateless; it doesn't incorporate the last prompt into the model.
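The persona-prompting step quoted above really is just a loop over prompt templates, which can be sketched directly. The templates below are illustrative paraphrases, not the paper's exact wording.

```python
# Illustrative persona instructions, paraphrased from the quoted examples.
PERSONAS = [
    "think like Euler focusing on series",
    "approach like Gauss looking for patterns",
]

def persona_prompts(problem: str) -> list[str]:
    # Wrap the same problem in each persona instruction.
    return [f"{persona}. Problem: {problem}" for persona in PERSONAS]

prompts = persona_prompts("integrate x * exp(x) dx")
for p in prompts:
    print(p)
```

Each prompt would then be sent to the model independently, giving several attempts at the same problem from different "perspectives".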
Davidzheng
Test-time training/RL is definitely the right approach for math AI in the future. It is probably one of only a few ways to spend an obscene amount of compute on a given problem (think 10^5 GPUs for a few days) that has hopes of making progress when test-time inference scaling may not at first (think of trying MCTS on a Go position with a bad value/policy net). AlphaProof already did this, but it's nice to see it done again. Good results!
neoneye2
Sidenote: `Tufa Labs` team includes the `MindsAI` team of ARC-AGI fame.
https://tufalabs.ai/team.html
niemandhier
Frank Herbert knew it: this is basically an implementation of the Mentats' recursive self-inspection described in Dune.
isaacfrond
Reminds me of a quote by the famous number theorist Hendrik Lenstra:
For every problem you can't solve, there's a simpler problem that you also can't solve.
pyryt
Some names are just too tempting https://arxiv.org/abs/1507.02672