Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

(Related text posted to Twitter; this version is edited and has a more advanced final section.)

Imagine yourself in a box, trying to predict the next word – assign as much probability mass to the next token as possible – for all the text