Our paper Pixelated Butterfly: Simple and Efficient Sparse Training for Neural Network Models is available on arXiv, and our code is available on GitHub.

Why Sparsity?

Recent results suggest that overparameterized neural networks generalize well (Belkin et al. 2019). We’ve witnessed the rise and success of large models (e.g., AlphaFold, GPT-3, DALL-E, DLRM), but they
