[Submitted on 14 Mar 2023]
Abstract: We analyze transformers from the perspective of iterative inference, seeking
to understand how model predictions are refined layer by layer. To do so, we
train an affine probe for each block in a frozen pretrained model, making it
possible to decode every h