
Show HN: An open-source megarepo turning hackers into frontier AI researchers by fizzbuzz07
Beyond-NanoGPT is a minimal, educational repo that aims to bridge the gap between nanoGPT and research-level deep learning. It contains annotated, from-scratch implementations of almost 100 crucial techniques in frontier deep learning, with the goal of helping newcomers learn enough to start running experiments of their own.
The repo implements everything from KV caching and speculative decoding for LLMs, to architectures like vision transformers and MLP-Mixers; from attention variants like linear or multi-latent attention, to generative models like denoising diffusion models and flow matching algorithms; from landmark RL papers like PPO, A3C, and AlphaZero, to systems fundamentals like GPU communication algorithms and data/tensor parallelism.
Because everything is implemented by hand, the code comments explain the especially subtle details that are often glossed over in both papers and production codebases.
A glimpse of some plots you can make: (left) language model speedups from attention-variants/linear_attention.ipynb; (center) samples from a small denoising diffusion model trained on MNIST in generative-models/train_ddpm.py; (right) reward over time for a small MLP policy on CartPole in rl/fundamentals/train_ppo.py.
LESSONS.md documents some of the things I've learned in the months spent writing this codebase.
- Clone the repo: git clone https://github.com/tanishqkumar/beyond-nanogpt.git
- Get minimal dependencies: pip install torch numpy torchvision wandb tqdm transformers datasets diffusers matplotlib pillow jupyter gym
- Start learning!
The code is meant for you to read carefully, hack around with, then re-implement yourself from scratch and compare against.
You can just run the .py files with vanilla Python, for example:
cd architectures/
python train_dit.py
or, for instance:
cd rl/fundamentals/
python train_reinforce.py --verbose --wandb
Everything is written to run on a single GPU. The code is self-documenting, with comments that build intuition and elaborate on subtleties I found tricky to implement. Arguments are specified at the bottom of each file (a sketch of a typical argument block follows below). Jupyter notebooks are meant to be stepped through.
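For instance, the argument block at the bottom of a training script might look roughly like this. This is a hypothetical sketch: the flag names and defaults here are illustrative, not taken from the repo (only --verbose and --wandb appear in the example command above).

```python
# Hypothetical sketch of an argument block at the bottom of a training script.
# Flag names and defaults are illustrative only, not the repo's actual flags.
import argparse

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--lr", type=float, default=3e-4, help="learning rate")
    parser.add_argument("--batch_size", type=int, default=64)
    parser.add_argument("--steps", type=int, default=10_000)
    parser.add_argument("--verbose", action="store_true", help="print extra logging")
    parser.add_argument("--wandb", action="store_true", help="log metrics to Weights & Biases")
    args = parser.parse_args()
    # train(args)  # each script defines its own train() above this block
```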
- Basic Transformer: language-models/transformer.py and train_naive.py [paper]
- Vision Transformer (ViT): architectures/train_vit.py [paper]
- Diffusion Transformer (DiT): architectures/train_dit.py [paper]
- Recurrent Neural Network (RNN): architectures/train_rnn.py [paper]
- Residual Networks (ResNet): architectures/train_resnet.py [paper]
- MLP-Mixer: architectures/train_mlp_mixer.py [paper] (see the sketch after this list)
- LSTM: architectures/train_lstm.py [paper]
- Mixture-of-Experts (MoE): architectures/train_moe.py [paper]
- Mamba: architectures/train_mamba.py [paper]
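To give a flavor of what these implementations cover, here is a minimal sketch of a single MLP-Mixer block, in the spirit of architectures/train_mlp_mixer.py. It is an illustrative re-derivation from the MLP-Mixer paper, not the repo's actual code, and the hidden sizes are arbitrary.

```python
# Minimal MLP-Mixer block sketch: one token-mixing MLP (across patches)
# followed by one channel-mixing MLP (across features), each with a residual.
import torch
import torch.nn as nn

class MixerBlock(nn.Module):
    def __init__(self, num_patches: int, dim: int, token_hidden: int = 256, channel_hidden: int = 512):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.token_mlp = nn.Sequential(      # mixes information across patches
            nn.Linear(num_patches, token_hidden), nn.GELU(), nn.Linear(token_hidden, num_patches)
        )
        self.norm2 = nn.LayerNorm(dim)
        self.channel_mlp = nn.Sequential(    # mixes information across channels
            nn.Linear(dim, channel_hidden), nn.GELU(), nn.Linear(channel_hidden, dim)
        )

    def forward(self, x):                    # x: (batch, num_patches, dim)
        x = x + self.token_mlp(self.norm1(x).transpose(1, 2)).transpose(1, 2)
        x = x + self.channel_mlp(self.norm2(x))
        return x

# Usage: MixerBlock(num_patches=64, dim=128)(torch.randn(8, 64, 128)).shape -> (8, 64, 128)
```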
- Vanilla Self-Attention: attention-variants/vanilla_attention.ipynb [paper]
- Multi-head Self-Attention: attention-variants/mhsa.ipynb [paper]
- Grouped-Query Attention: attention-variants/gqa.ipynb [paper] (see the sketch after this list)
- Linear Attention: attention-variants/linear_attention.ipynb [paper]
- Sparse Attention: attention-variants/sparse_attention.ipynb [paper]
- Cross Attention: attention-variants/cross_attention.ipynb [paper]
- Multi-Latent Attention: attention-variants/mla.ipynb [paper]
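As one example of an attention variant, here is a minimal sketch of grouped-query attention, the idea behind attention-variants/gqa.ipynb: several query heads share each key/value head, shrinking the KV tensors. The shapes and naming are my own illustration, not the notebook's code.

```python
# Grouped-query attention sketch: num_q_heads query heads share num_kv_heads
# key/value heads, so each KV head serves a group of queries.
import torch
import torch.nn.functional as F

def grouped_query_attention(q, k, v, num_kv_heads):
    # q: (batch, num_q_heads, seq, head_dim); k, v: (batch, num_kv_heads, seq, head_dim)
    batch, num_q_heads, seq, head_dim = q.shape
    group = num_q_heads // num_kv_heads
    # repeat each kv head so it is shared by `group` query heads
    k = k.repeat_interleave(group, dim=1)
    v = v.repeat_interleave(group, dim=1)
    scores = q @ k.transpose(-2, -1) / head_dim**0.5          # (batch, num_q_heads, seq, seq)
    causal = torch.tril(torch.ones(seq, seq, device=q.device)).bool()
    scores = scores.masked_fill(~causal, float("-inf"))       # causal mask for decoding
    return F.softmax(scores, dim=-1) @ v                      # (batch, num_q_heads, seq, head_dim)

# e.g. 8 query heads sharing 2 kv heads:
# out = grouped_query_attention(torch.randn(1, 8, 16, 64),
#                               torch.randn(1, 2, 16, 64),
#                               torch.randn(1, 2, 16, 64), num_kv_heads=2)
```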
- Optimized Dataloading: language-models/dataloaders [reference]
  - Producer-consumer asynchronous dataloading
  - Sequence packing
- Byte-Pair Encoding: language-models/bpe.ipynb [paper]
- KV Caching: language-models/KV_cache.ipynb [reference] (see the sketch after this list)
- Speculative Decoding: language-models/speculative_decoding.ipynb [paper]
- RoPE embeddings: language-models/rope.ipynb [paper]
- Multi-token Prediction: language-models/train_mtp.py [paper]
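As an example of the inference-side techniques, here is a minimal sketch of greedy decoding with a KV cache, the idea behind language-models/KV_cache.ipynb: keep past keys and values around so each new token is processed once instead of re-running the whole prefix. The `model(ids, cache=...)` interface below is hypothetical, not the repo's actual API.

```python
# KV-cache decoding sketch: prefill once on the prompt, then feed only the
# newest token at each step while the cache covers the prefix.
import torch

@torch.no_grad()
def greedy_decode_with_cache(model, prompt_ids, max_new_tokens):
    cache = None                                      # per-layer (keys, values), managed by the model
    ids = prompt_ids                                  # (1, prompt_len)
    logits, cache = model(ids, cache=cache)           # prefill: process the whole prompt once
    for _ in range(max_new_tokens):
        next_id = logits[:, -1].argmax(dim=-1, keepdim=True)   # greedy pick, shape (1, 1)
        ids = torch.cat([ids, next_id], dim=1)
        # decode step: only the new token is fed; cached keys/values supply the rest
        logits, cache = model(next_id, cache=cache)
    return ids
```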
- Deep RL
  - Fundamentals: rl/fundamentals (see the REINFORCE sketch after this list)
  - Actor-Critic and Key Variants: rl/actor-critic
  - Model-based RL: rl/model-based
    - Model Predictive Control (MPC): train_mpc.py [reference]
    - Expert Iteration (MCTS): train_expert_iteration.py [paper]
    - Probabilistic Ense
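To illustrate the RL fundamentals, here is a minimal REINFORCE sketch on CartPole with a small MLP policy, in the spirit of rl/fundamentals/train_reinforce.py. It is illustrative only, assumes the classic gym step/reset API (4-tuple step returns), and omits the logging and argument handling the actual scripts have.

```python
# REINFORCE sketch on CartPole: collect one episode, compute discounted
# returns, and take a policy-gradient step on -sum(log_prob * return).
import torch
import torch.nn as nn
import gym

env = gym.make("CartPole-v1")
policy = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 2))  # small MLP policy
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)

for episode in range(500):
    obs, log_probs, rewards, done = env.reset(), [], [], False
    while not done:
        dist = torch.distributions.Categorical(
            logits=policy(torch.as_tensor(obs, dtype=torch.float32)))
        action = dist.sample()
        obs, reward, done, _ = env.step(action.item())   # classic gym API assumed
        log_probs.append(dist.log_prob(action))
        rewards.append(reward)
    # discounted returns, computed backwards over the episode
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + 0.99 * g
        returns.insert(0, g)
    returns = torch.tensor(returns)
    returns = (returns - returns.mean()) / (returns.std() + 1e-8)  # normalize for stability
    loss = -(torch.stack(log_probs) * returns).sum()               # policy gradient objective
    opt.zero_grad(); loss.backward(); opt.step()
```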