This prototype synthesizes verified code with an LLM.
Using Monte Carlo Tree Search (MCTS), it explores the space of possible generations of a verified program, and at every step it checks that it is on the right track by calling the verifier.
This prototype uses Dafny.
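As a rough illustration of the step-by-step verifier check, here is a minimal sketch (not the project's actual code; the `dafny verify` invocation and all function names are assumptions) of how a candidate Dafny snippet can be scored by calling the verifier. A score of this kind is the signal that guides the tree search over generation steps.

```python
# Hypothetical sketch: score a candidate Dafny program by calling the verifier.
# Assumes a Dafny 4.x CLI is available on PATH as `dafny verify`.
import os
import subprocess
import tempfile

def verify_dafny(program_text: str, timeout_s: int = 30) -> bool:
    """Write the candidate to a temporary .dfy file and ask Dafny to verify it."""
    with tempfile.NamedTemporaryFile(mode="w", suffix=".dfy", delete=False) as f:
        f.write(program_text)
        path = f.name
    try:
        result = subprocess.run(
            ["dafny", "verify", path],
            capture_output=True, text=True, timeout=timeout_s,
        )
        return result.returncode == 0
    finally:
        os.unlink(path)

def score_candidate(partial_program: str) -> float:
    """Reward signal for the search: 1.0 if the snippet verifies, else 0.0."""
    return 1.0 if verify_dafny(partial_program) else 0.0
```

In the search itself, only generation prefixes that the verifier accepts keep getting expanded, so branches that go wrong are pruned early.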
Logs for example runs can be found in the log directory.
Scroll to the very end of a log to see a chosen solution.
Note that the linked solution is optimal for the problem.
With this technique, weaker models that might not know the target language all that well can compete with stronger models.
We can also use PPO training to positively reinforce snippets that succeed and negatively reinforce snippets that fail.
The model after PPO can solve the prompts without backtracking!
For example, see the log for solving the problem fact after PPO training on another problem, opt0.
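For reference, here is a minimal sketch of how verifier outcomes could be turned into PPO rewards. The helper names and reward values are illustrative assumptions, not the project's training code; the resulting triples would then feed a PPO trainer such as the one in Hugging Face's trl library.

```python
# Hypothetical sketch: assigning PPO rewards from verifier outcomes.
# All names and reward values here are illustrative, not the project's code.

def ppo_reward(verified: bool) -> float:
    """Positive reward for a snippet that verifies, negative for one that fails."""
    return 1.0 if verified else -1.0

def make_ppo_batch(
    prompt: str, snippets: list[tuple[str, bool]]
) -> list[tuple[str, str, float]]:
    """Turn (snippet, verifier outcome) pairs into (query, response, reward)
    triples, each of which is one PPO training example."""
    return [(prompt, snippet, ppo_reward(ok)) for snippet, ok in snippets]

# Example: one failing and one verified candidate body for a factorial method.
batch = make_ppo_batch(
    "method Fact(n: nat) returns (r: nat)",
    [("ensures false { }", False), ("{ r := 1; }", True)],
)
```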
Running
This project relies on GPU access. It has been tested on a multi-GPU machine with two NVIDIA A100s.
Setup
Using mamba (or equivalently conda):
mamba create --name llm-verified python=3.10
mamba activate llm-verified
pip install -r requirements.txt
Execution
Pick an LLM in model.py and a prompt in prompts.py, then run the main script.
To run the PPO trainer instead (slow!), use the corresponding training script.
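Purely for illustration, picking a model and a prompt might look like the following; the checkpoint name, prompt text, and variable names are assumptions, not what model.py and prompts.py actually contain.

```python
# Hypothetical illustration of selecting a Hugging Face model and a prompt.
# The checkpoint and prompt below are placeholders, not the project's defaults.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "codellama/CodeLlama-7b-Instruct-hf"  # placeholder checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
# device_map="auto" spreads the model across available GPUs (needs accelerate).
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto")

PROMPT = (
    "Write a Dafny method that computes the factorial of n, "
    "with requires/ensures clauses, and make sure it verifies."
)
```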
TODOs
- Support other verifiers in addition to Dafny:
  - Coq
- Support other LLM infrastructures in addition to Hugging Face:
- Support test cases.
- Design a steerable interaction to give human or tool feedback to the LLM.
- Design a reinforcement learning scheme whereby the LLM learns from its own trials.
- Evaluate whether the model degrades after PPO training on some tasks, even unrelated ones.
- Force the PPO solution to converge to a known optimal one, using it entirely for training rather than discovery.
- Get wandb to work.
Credits
- The montecarlo lib