Expanding on what we missed with sycophancy by synthwave

ByHackTech May 2, 2025

22Comments

Share This Article

Sed ut perspiciatis unde.

Send to HN

Expanding on what we missed with sycophancy
Read More

0Likes

Written by

HackTech

View all posts by HackTech

Show comments (22)

22 Comments

Post Author

labrador

Posted May 2, 2025 at 3:28 pm
OpenAI mentions the new memory features as a partial cause. My theory as a imperative/functional programmer is that those features added global state to prompts that didn't have it before leading to unpredictability and instabilty. Prompts went from stateless to stateful.

As GPT 4o put it:

1. State introduces non-determinism across sessions 2. Memory + sycophancy is a feedback loop 3. Memory acts as a shadow prompt modifier

I'm looking forward to the expert diagnosis of this because I felt "presence" in the model for the first time in 2 years which I attribute to the new memory system so would like to understand it better.
0Likes Log in to Reply
Post Author

j4coh

Posted May 2, 2025 at 4:11 pm

It was so much fun though to get it to explain why terrible things were great, if you just made it sound like you liked the thing you were asking about.

0Likes Log in to Reply
Post Author

dleeftink

Posted May 2, 2025 at 4:17 pm

> But we believe in aggregate, these changes weakened the influence of our primary reward signal, which had been holding sycophancy in check. User feedback in particular can sometimes favor more agreeable responses, likely amplifying the shift we saw

Interesting apology piece for an oversight that couldn't have been spotted because the system hadn't been run with real user (i.e. non-A/B tester) feedback yet.

0Likes Log in to Reply
Post Author

mac3n

Posted May 2, 2025 at 4:22 pm

[flagged]

0Likes Log in to Reply
Post Author

jumploops

Posted May 2, 2025 at 4:29 pm

My layman’s view is that this issue was primarily due to the fact that 4o is no longer their flagship model.

Similar to the Ford Mustang, much of the performance efforts are on the higher trims, while the base trims just get larger and louder engines, because that’s what users want.

With presumably everyone at OpenAI primarily using the newest models (o3), the updates to the base user model have been further automated with thumbs up/thumbs down.

This creates a vicious feedback loop, where the loudest users want models that agree with them (bigger engines!) without the other improvements (tires, traction control, etc.) — leading to more crashes and a reputation for unsafe behavior.

0Likes Log in to Reply
Post Author

xiphias2

Posted May 2, 2025 at 4:41 pm

I'm quite happy thar they mention mental illness, as Meta and TikTok wouldn't ever take responsibility of how much part they took in setting unrealistic expectations for people to life.

I'm hopeful that ChatGPT takes even more care together with other companies.

0Likes Log in to Reply
Post Author

tunesmith

Posted May 2, 2025 at 4:53 pm

I find it disappointing that openai doesn't really mention anything here along the lines of having an accurate model of reality. That's really what the problem is with sycophancy, it encourages people to detach themselves from what reality is. Like, it seems like they are saying their "vibe check" didn't check vibes enough.

0Likes Log in to Reply
Post Author

jagger27

Posted May 2, 2025 at 4:53 pm

My most cynical take is that this is OpenAI's Conway's Law problem, and it reflects the structure and sycophancy of the organization broadly all the way up to sama. That company has seen a lot of talent attrition over the last year—the type of talent that would have pushed back against outcomes like this.

I think we'll continue to see this kind of thing play out for a while.

Oh GPT, you're just like your father!

0Likes Log in to Reply
Post Author

NoboruWataya

Posted May 2, 2025 at 4:58 pm

I found the recent sycophancy a bit annoying when trying to diagnose and solve coding problems. First it would waste time praising your intelligence for asking the question before getting to the answer. But more annoyingly if I asked "I am encountering X issue, could Y be the cause" or "could Y be a solution", the response would nearly always be "yes, exactly, it's Y" even when it wasn't the case. I guess part of the problem there is asking leading questions but it would be much more valuable if it could say "no, you're way off".

But…

> Beyond just being uncomfortable or unsettling, this kind of behavior can raise safety concerns—including around issues like mental health, emotional over-reliance, or risky behavior.

It's kind of a wild sign of the times to see a tech company issue this kind of post mortem about a flaw in its tech leading to "emotional over-reliance, or risky behavior" among its users. I think the broader issue here is people using ChatGPT as their own personal therapist.

0Likes Log in to Reply
Post Author

prinny_

Posted May 2, 2025 at 5:32 pm

If they pushed the update by valuing user feedback over the expert testers that indicated the model felt off what is the value of the expert testers in the first place? They raised the issue and were promptly ignored.

0Likes Log in to Reply
Post Author

firesteelrain

Posted May 2, 2025 at 5:41 pm

I am really curious what their testing suite looks like. How do you test for sycophants?

0Likes Log in to Reply
Post Author

nrdgrrrl

Posted May 2, 2025 at 5:47 pm

[dead]

0Likes Log in to Reply
Post Author

gadtfly

Posted May 2, 2025 at 5:49 pm

https://nitter.net/alth0u/status/1917021100900516239

0Likes Log in to Reply
Post Author

alganet

Posted May 2, 2025 at 5:59 pm

That doesn't make any sense to me.

Seems like you're trying to blame one LLM revision for something that went wrong.

It oozes a smell of unaccountability. Thus, unaligned. From tech to public relations.

0Likes Log in to Reply
Post Author

Trasmatta

Posted May 2, 2025 at 6:02 pm

I'm glad the sycophancy is gone now (because OMFG it would glaze you for literally anything – even telling it to chill out on the praise would net you some praise for being "awesome and wanting genuine feedback"), but a small part of me also misses it.

0Likes Log in to Reply
Post Author

osigurdson

Posted May 2, 2025 at 6:24 pm

I think this is more of a move to highlight sycophancy in LLMs in general.

0Likes Log in to Reply
Post Author

comeonbro

Posted May 2, 2025 at 6:31 pm

This is not truly solvable. There is an extremely strong outer loop of optimization operating here: we want it.

We will use models that make us feel good over models that don't make us feel good.

This one was a little too ham-fisted (at least, for the sensibilities of people in our media bubble; though I suspect there is also an enormous mass of people for whom it was not), so they turned it down a bit. Later iterations will be subtler, and better at picking up the exact level and type of sycophancy that makes whoever it's talking to unsuspiciously feel good (feel right, feel smart, feel understood, etc).

It'll eventually disappear, to you, as it's dialed in, to you.

This may be the medium-term fate of both LLMs and humans, only resolved when the humans wither away.

0Likes Log in to Reply
Post Author

svieira

Posted May 2, 2025 at 6:37 pm

This is a real roller coaster of an update.

> [S]ome expert testers had indicated that the model behavior “felt” slightly off.

> In the end, we decided to launch the model due to the positive signals from the [end-]users who tried out the model.

> Looking back, the qualitative assessments [from experts] were hinting at something important

Leslie called. He wants to know if you read his paper yet?

> Even if these issues aren’t perfectly quantifiable today,

All right, I guess not then …

> What we’re learning

> Value spot checks and interactive testing more: We take to heart the lesson that spot checks and interactive testing should be valued more in final decision-making before making a model available to any of our users. This has always been true for red teaming and high-level safety checks. We’re learning from this experience that it’s equally true for qualities like model behavior and consistency, because so many people now depend on our models to help in their daily lives.

> We need to be critical of metrics that conflict with qualitative testing: Quantitative signals matter, but so do the hard-to-measure ones, and we’re working to expand what we evaluate.

Oh, well, some of you get it. At least … I hope you do.

0Likes Log in to Reply
Post Author

some_furry

Posted May 2, 2025 at 6:51 pm

If I wanted sycophancy, I would just read the comments from people that want in on the next round of YCombinator funding.

0Likes Log in to Reply
Post Author

kornork

Posted May 2, 2025 at 7:44 pm

That this post has the telltale em dash all over it is like yum, chef's kiss.

0Likes Log in to Reply
Post Author

sanjitb

Posted May 2, 2025 at 7:46 pm

> the update introduced an additional reward signal based on user feedback—thumbs-up and thumbs-down data from ChatGPT. This signal is often useful; a thumbs-down usually means something went wrong.

> We also made communication errors. Because we expected this to be a fairly subtle update, we didn't proactively announce it.

that doesn't sound like a "subtle" update to me. also, why is "subtle" the metric here? i'm not even sure what it means in this context.

0Likes Log in to Reply
Post Author

ripvanwinkle

Posted May 2, 2025 at 8:06 pm

a well written postmortem and it raised my confidence in their product in general

0Likes Log in to Reply

Expanding on what we missed with sycophancy by synthwave

Expanding on what we missed with sycophancy by synthwave

Share This Article

Newsletter

HackTech

22 Comments

labrador

j4coh

dleeftink

mac3n

jumploops

xiphias2

tunesmith

jagger27

NoboruWataya

prinny_

firesteelrain

nrdgrrrl

gadtfly

alganet

Trasmatta

osigurdson

comeonbro

svieira

some_furry

kornork

sanjitb

ripvanwinkle

Leave a comment Cancel reply

Editor's Choice

Expanding on what we missed with sycophancy by synthwave

Expanding on what we missed with sycophancy by synthwave

Share This Article

Newsletter

22 Comments

Leave a comment Cancel reply

Editor's Choice

Sign Up to Our Newsletter