Recently, I posted a prompt on X (formerly Twitter) for Large Language Models (LLMs) like Claude Sonnet, GPT-4o, DeepSeek V3, and so on. The prompt instructs these models to ‘contemplate’ for a bit before providing the final answer, and it unexpectedly went viral. This is a short blog post on my thinking behind coming up with this prompt.
Example output:
You can find the full system prompt in this GitHub gist: Contemplative LLMs full prompt
The inspiration
The next big thing to tackle in the field of language models seems to be “reasoning”. OpenAI’s latest models, o1 and o3, are a paradigm shift towards this idea. After trying out the o1 model, I was genuinely impressed by how much it ‘thought’ before responding to a user’s query.
In essence, the o1 model is trained with Reinforcement Learning (RL) on tasks that require heavy reasoning (coding, math, etc.), possibly using a ‘verifier’ model to evaluate the reasoning steps during training, and it uses test-time compute to spend more time “thinking” through the steps during inference. From their official blog post:
Our large-scale reinforcement learning algorithm teaches the model how to think productively using its chain of thought in a highly data-efficient training process. We have found that the performance of o1 consistently improves with more reinforcement learning (train-time compute) and with more time spent thinking (test-time compute).
The main motivation behind creating this prompt came from looking at the raw Chain of Thought (CoT) text of o1 in the official blog post. For example, some portions of the raw CoT text look like:
[...] Alternatively, perhaps subtract: 25 - 15 = 10. No. Alternatively, perhaps combine the numbers in some way. Alternatively, think about their positions in the alphabet. Alternatively, perhaps the letters are encrypted via a code. [...]
[...] Wait, actually, this may not help us directly without specific terms. [...]
This gave me the idea: can we prompt an LLM (which is not o1) in such a way that it mimics this thought process and also the ‘exploration’ of alternative possibilities? If so, what would the results look like?
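Applying the idea is mostly a matter of setting the contemplative text as the system message. Here is a minimal sketch, assuming the OpenAI Python SDK, where `CONTEMPLATIVE_PROMPT` is just a short stand-in for the full prompt linked in the gist above:

```python
# Minimal sketch: send a question with a contemplative system prompt.
# Assumes the OpenAI Python SDK; CONTEMPLATIVE_PROMPT is a placeholder
# paraphrase, not the full prompt from the gist.
from openai import OpenAI

client = OpenAI()

CONTEMPLATIVE_PROMPT = (
    "Before answering, contemplate the question at length: question your "
    "assumptions, explore alternative possibilities, and revise freely. "
    "Only then give the final answer."
)

def contemplative_answer(question: str, model: str = "gpt-4o") -> str:
    """Return the model's reply when guided by the contemplative system prompt."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": CONTEMPLATIVE_PROMPT},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(contemplative_answer("How many times does the letter 'r' appear in 'strawberry'?"))
```

The same pattern works with any chat-style model (Claude Sonnet, DeepSeek V3, etc.) by swapping in that provider's client, since the only ingredient is the system prompt itself.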