A project by @tercmd
Explore what happens under the hood of the ChatGPT web app. And some speculation, of course. Contribute if you have something interesting related to ChatGPT!
Table of Contents
- Fonts (fonts.txt)
- Application
- Data
- Conversation
- Errors
- Markdown rendering
- ChatGPT Plus
- Rendering Markdown inside a code block
Fonts (fonts.txt)
The fonts loaded are:
- Signifier-Regular.otf
- Sohne-Buch.otf
- Sohne-Halbfett.otf
- SohneMono-Buch.otf
- SohneMono-Halbfett.otf
- KaTeX_Caligraphic-Bold.woff (Caligraphic-Regular for Regular font)
- KaTeX_Fraktur-Bold.woff (Fraktur-Regular for Regular font)
- KaTeX_Main-Bold.woff (BoldItalic, Italic, Regular for font weights you can probably guess)
- KaTeX_Math-Bold.woff (BoldItalic, Italic, Regular for font weights you can probably guess)
- KaTeX_SansSerif-Bold.woff (Italic, Regular for font weights you can probably guess)
- KaTeX_Script-Regular.woff
- KaTeX_Size1-Regular.woff (Size1, Size2, Size3, Size4)
- KaTeX_Typewriter-Regular.woff
Application
ChatGPT is a Next.js application. Server information cannot be clearly determined because the entirety of chat.openai.com is routed through Cloudflare. Requests to Sentry are made periodically for analytics, such as the Thumbs Up/Thumbs Down feedback the user selects for a message.
Data
Session data
A request can be made to /api/auth/session (literally, in your browser; it won't work in an iframe or via fetch because the endpoint doesn't have an Access-Control-Allow-Origin header set up) to access data like the following:
user
|__ id: user-[redacted]
|__ name: [redacted]@[redacted].com (probably can be a user's name)
|__ email: [redacted]@[redacted].com
|__ image: https://s.gravatar.com/avatar/8cf[redacted in case of possible unique identifier]2c7?s=480&r=pg&d=https%3A%2F%2Fcdn.auth0.com%2Favatars%2F[first 2 letters of name].png
|__ picture: https://s.gravatar.com/avatar/8cf[redacted in case of possible unique identifier]2c7?s=480&r=pg&d=https%3A%2F%2Fcdn.auth0.com%2Favatars%2F[first 2 letters of name].png
|__ groups: []
expires: [date in the future]
accessToken: ey[redacted] (base64 "{")
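That "ey" prefix is a tell: JSON Web Tokens are base64url-encoded, and the first segment is a JSON header that always starts with '{"', which encodes to "ey". A quick sketch (the alg/typ values below are just illustrative, not taken from the real token):

```python
import base64

# JWTs are three base64url segments joined by dots; the first segment is a
# JSON header, which always starts with '{"'. Those two bytes alone encode
# to "ey", so every JWT begins with "ey".
header = '{"alg":"RS256","typ":"JWT"}'  # illustrative header values
encoded = base64.urlsafe_b64encode(header.encode()).decode().rstrip("=")
print(encoded[:2])  # ey
```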
User data
This requires an access token (which seems to be the Authorization cookie, along with other factors), so it cannot be accessed using your browser directly, but here's what we get when we make a request to /backend-api/accounts/check:
account_plan
|__ is_paid_subscription_active: false
|__ subscription_plan: chatgptfreeplan
|__ account_user_role: account-owner
|__ was_paid_customer: false
|__ has_customer_object: false
user_country: [redacted two letter country code]
features: ["system_message"]
(Note: in the actual JSON, false is an unquoted boolean, whereas the other values are quoted strings; the quotes are omitted in the schema above.)
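To illustrate that note, here's a placeholder version of the response as JSON (all values are stand-ins): booleans parse as unquoted bool, everything else as quoted strings.

```python
import json

# Placeholder sketch of the /backend-api/accounts/check response. In JSON,
# booleans (false) are unquoted while strings are quoted -- which is why
# `false` appears bare in the schema above.
sample = """{
  "account_plan": {
    "is_paid_subscription_active": false,
    "subscription_plan": "chatgptfreeplan",
    "account_user_role": "account-owner",
    "was_paid_customer": false,
    "has_customer_object": false
  },
  "user_country": "XX",
  "features": ["system_message"]
}"""
data = json.loads(sample)
print(type(data["account_plan"]["is_paid_subscription_active"]).__name__)  # bool
print(type(data["account_plan"]["subscription_plan"]).__name__)            # str
```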
User data (using chat.json)
When we make a request to /_next/data/BO[redacted in case of possible unique identifier]KT/chat.json (can be done in the browser, but cannot be done without authentication), we get a response like this:
pageProps:
|__ user (Object):
|____ id: user-[redacted]
|____ name: [redacted]@[redacted].com
|____ email: [redacted]@[redacted].com
|____ image: https://s.gravatar.com/avatar/8c[redacted in case of possible unique identifier]c7?s=480&r=pg&d=https%3A%2F%2Fcdn.auth0.com%2Favatars%2F[first two letters of email address].png
|____ picture: https://s.gravatar.com/avatar/8c[redacted in case of possible unique identifier]c7?s=480&r=pg&d=https%3A%2F%2Fcdn.auth0.com%2Favatars%2F[first two letters of email address].png
|____ groups: []
|__ serviceStatus: {}
|__ userCountry: [redacted two letter country code]
|__ geoOk: false
|__ isUserInCanPayGroup: true
|__ __N_SSP: true
This is some of the same data you get using the method in Session data (excluding accessToken and expires, both related to the access token), except you also get info about the country the user is located in and whether ChatGPT Plus is available there.
EDIT: When ChatGPT returns a message like “We're experiencing exceptionally high demand. Please hang tight as we work on scaling our systems.”, serviceStatus
looks like this:
type: warning
message: We're experiencing exceptionally high demand. Please hang tight as we work on scaling our systems.
oof: true
I didn’t make up the oof variable; that is actually part of the response.
Model data
What model does ChatGPT use? Well, just query /backend-api/models!
models
|__
|____ slug: text-davinci-002-render-sha
|____ max_tokens: 4097
|____ title: Turbo (Default for free users)
|____ description: The standard ChatGPT model
|____ tags: []
This means that ChatGPT can remember context (based on what I can understand) for about 16388 characters or 3072.75 words*.
* Approximation according to an OpenAI help article
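The arithmetic behind that estimate, using OpenAI's published rules of thumb (1 token ≈ 4 characters ≈ 0.75 English words):

```python
# max_tokens reported by /backend-api/models for text-davinci-002-render-sha
max_tokens = 4097

# OpenAI's rough conversions: 1 token ~ 4 characters ~ 0.75 English words
approx_chars = max_tokens * 4
approx_words = max_tokens * 0.75

print(approx_chars, approx_words)  # 16388 3072.75
```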
Conversation
Conversation History
Conversation history can be accessed (again, requires an access token, which seems to be the Authorization cookie, along with other factors) at /backend-api/conversations?offset=0&limit=20
(the web interface limits it to 20 chats) which returns something like this:
items: []
limit: 20
offset: 0
total: 0
It doesn’t work because ChatGPT is having some issues at the time of writing:
“Not seeing what you expected here? Don’t worry, your conversation data is preserved! Check back soon.”
But this is probably what a person new to ChatGPT sees.
EDIT: If you log out and log back in, history works just fine. So, here's what I see:
items (array)
|__ (each conversation is an object)
|____ id: [redacted conversation ID]
|____ title: [conversation title]
|____ create_time: 2023-03-09THH:MM:SS.MILLIS
|__...
total: [number of conversations] (can be greater than 20)
limit: 20
offset: 0 (can be set to a higher number and it returns the conversations after that index, starting from 0)
Once 20 conversations are listed, the ChatGPT UI shows a Show more button, which sends a request with offset=20.
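The offset/limit pagination above can be sketched like this, with a local list standing in for the server-side conversation store (the function name is mine, not part of ChatGPT's code):

```python
def fetch_page(all_items, offset=0, limit=20):
    """Mimic /backend-api/conversations?offset=..&limit=.. against a local
    list standing in for the server-side conversation store."""
    return {
        "items": all_items[offset:offset + limit],
        "total": len(all_items),
        "limit": limit,
        "offset": offset,
    }

conversations = [f"conv-{i}" for i in range(45)]
first = fetch_page(conversations)             # what the sidebar initially loads
more = fetch_page(conversations, offset=20)   # what "Show more" requests
print(first["total"], len(first["items"]), more["items"][0])  # 45 20 conv-20
```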
Getting the Conversation ID
Speaking of ChatGPT conversation history not being available, we can get the Conversation ID pretty easily (to someone who is familiar with DevTools, that is).
Why? Because ChatGPT forces you into a /chat path for a new conversation, creates a conversation, BUT DOESN'T CHANGE THE URL. This is also helpful when chat history isn't available.
- We get the Conversation ID using DevTools (this requires a message to be sent)
- Then, we visit https://chat.openai.com/chat/[conversation ID].
Loading a Past Conversation
When the user clicks on a past conversation, a request is made (requiring an access token, likely the cookie along with other factors to ensure genuine requests) to /backend-api/conversation/[conversation ID], which returns the conversation's data.
The process of asking ChatGPT a question
Let’s say I ask ChatGPT a question "What is ChatGPT?"
. First, we make a POST request to /backend-api/conversation
with a request body like this (no response):
action: next
messages (Array):
|__ (Object):
|____ author (Object):
|______ role: user
|____ content (Object):
|______ content_type: text
|______ parts (Array):
|________ What is ChatGPT?
|____ id: 0c[redacted]91
|____ role: user
model: text-davinci-002-render-sha
parent_message_id: a0[redacted]7f
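A sketch of how the web app could assemble that request body (the UUIDs below are freshly generated placeholders, not captured values; for a brand-new chat, parent_message_id appears to be just another generated UUID):

```python
import json
import uuid

# Assemble the POST body for /backend-api/conversation, mirroring the schema
# above. IDs here are placeholders generated on the spot.
body = {
    "action": "next",
    "messages": [
        {
            "id": str(uuid.uuid4()),
            "author": {"role": "user"},
            "role": "user",
            "content": {"content_type": "text", "parts": ["What is ChatGPT?"]},
        }
    ],
    "model": "text-davinci-002-render-sha",
    "parent_message_id": str(uuid.uuid4()),
}
print(sorted(body))  # ['action', 'messages', 'model', 'parent_message_id']
```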
Then we get a list of past conversations that includes one “New chat”.
Then we make a request to /backend-api/moderations
with a request body like this:
conversation_id: 05[redacted]2d
input: What is ChatGPT?
message_id: 0c[redacted]91
model: text-moderation-playground
That returns a response like this:
flagged: false
blocked: false
moderation_id: modr-6t[redacted]Bk
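Presumably the client only needs the two booleans from that response. A minimal sketch of acting on them (the function name is hypothetical, not from the app's code):

```python
def handle_moderation(resp):
    """Decide what to do with a /backend-api/moderations response, assuming
    the client only cares about the flagged/blocked booleans."""
    if resp["blocked"]:
        return "blocked"
    if resp["flagged"]:
        return "flagged"
    return "ok"

print(handle_moderation({"flagged": False, "blocked": False,
                         "moderation_id": "modr-example"}))  # ok
```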
Then we make a request to /backend-api/conversation/gen_title/[conversation ID] with a request body like this:
message_id: c8[redacted]0e
model: text-davinci-002-render-sha
That gets a response containing the generated title. We also make another request to /backend-api/moderations, this time with a request body that includes the AI's response.
Then we finally get a list of past conversations, including the proper title of the chat that appears on the sidebar.
(Soft)Deleting a conversation
When you click Delete on a conversation, a PATCH request is made to /backend-api/conversation/05[redacted]2d with the body is_visible: false, and it gets a response of success: true back. This implies that the conversation is soft-deleted, not deleted on their systems.
Then (not sure why), we visit chat.json (mentioned in User data (using chat.json)).
After that, we get the list of conversations that appear on the sidebar.
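The soft-delete body is tiny. One thing worth noting if you reproduce it: Python's False serializes to JSON's bare, unquoted false.

```python
import json

# The soft-delete PATCH body, as seen in the request above.
patch_body = json.dumps({"is_visible": False})
print(patch_body)  # {"is_visible": false}
```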
Can you revive a conversation?
I had a question after the above section: can you revive a conversation by setting the request body to is_visible: true? The answer is nope, you can't. This just returns a 404 with the response detail: Can't load conversation 94[redacted]9b. But if you don't fetch the list of conversations again, you can still access the conversation. Although, when trying to get a response from ChatGPT, you get a Conversation not found error.
Leaving Feedback on Messages
When you click the thumbs up/thumbs down button on a message, a POST request is made to /backend-api/conversation/message_feedback
with the request body like this:
conversation_id: 94[redacted]9b
message_id: 96[redacted]b7
rating: thumbsUp | thumbsDown
That receives a response like this:
message_id: 96[redacted]b7
conversation_id: 94[redacted]9b
user_id: user-[redacted]
rating: thumbsUp | thumbsDown
content: {}
Then, when you type feedback and click submit, a request is made to the same path with a request body like the one above, with only the content field different:
message_id: 96[redacted]b7
conversation_id: 94[redacted]9b
user_id: user-[redacted]
rating: thumbsUp
content: '{"text": ""}' | '{"text": "This is solely for testing purposes. You can safely ignore this feedback.", "tags": ["harmful", "false", "not-helpful"]}' (the latter is for a thumbsDown review)
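One detail worth flagging: content is a JSON *string* nested inside the JSON body, so it gets encoded twice. A sketch (IDs are placeholders):

```python
import json

# `content` is itself JSON-encoded before being placed in the outer JSON body.
inner = {"text": "This is solely for testing purposes.",
         "tags": ["harmful", "false", "not-helpful"]}
feedback = {
    "conversation_id": "94...9b",  # placeholder IDs
    "message_id": "96...b7",
    "rating": "thumbsDown",
    "content": json.dumps(inner),  # first encoding
}
wire_body = json.dumps(feedback)   # second encoding

decoded = json.loads(wire_body)
print(type(decoded["content"]).__name__)          # str, not dict
print(json.loads(decoded["content"])["tags"][0])  # harmful
```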
Note: When I have uBlock Origin on, a request is made to https://o33249.ingest.sentry.io/api/45[redacted]48/envelope/?sentry_key=33[redacted]40&sentry_version=7&sentry_client=sentry.javascript.react/7.21.1 (blocked, so the request/response body is hidden) when leaving feedback (it turns out this is just ChatGPT analytics, sent periodically). If I disable uBO on chat.openai.com, a request isn't even attempted to that URL.
Errors
“Something went wrong, please try reloading the conversation.”
That looks like a 429 Too Many Requests
error. The response looks like this:
detail: Something went wrong, please try reloading the conversation.
“The message you submitted was too long, please reload the conversation and submit something shorter.”
That looks like a 413 Request Entity Too Large
error. The response looks like this:
detail: { message: "The message you submitted was too long, please reload the conversation and submit something shorter.", code: "message_length_exceeds_limit" }
Interestingly, if you click “Regenerate response”, it responds with:
Hello! How may I assist you today?
The new line at the beginning was intentional. One could speculate that it forgot the message and started by greeting the user. Or it just read the first smiley face (I typed 2,049 smiley faces).
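The two observed error responses can be collected into a small lookup (a sketch covering only the statuses documented above):

```python
# The two error shapes documented above, keyed by HTTP status code.
KNOWN_ERRORS = {
    429: "Something went wrong, please try reloading the conversation.",
    413: "The message you submitted was too long, please reload the "
         "conversation and submit something shorter.",
}

def describe(status):
    """Return the known ChatGPT error detail for a status, if any."""
    return KNOWN_ERRORS.get(status, "unknown error")

print(describe(429))
```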