Skip to content Skip to footer

Expanding on what we missed with sycophancy by synthwave

22 Comments

  • Post Author
    labrador
    Posted May 2, 2025 at 3:28 pm

    OpenAI mentions the new memory features as a partial cause. My theory as a imperative/functional programmer is that those features added global state to prompts that didn't have it before leading to unpredictability and instabilty. Prompts went from stateless to stateful.

    As GPT 4o put it:

        1. State introduces non-determinism across sessions 
        2. Memory + sycophancy is a feedback loop 
        3. Memory acts as a shadow prompt modifier
    

    I'm looking forward to the expert diagnosis of this because I felt "presence" in the model for the first time in 2 years which I attribute to the new memory system so would like to understand it better.

  • Post Author
    j4coh
    Posted May 2, 2025 at 4:11 pm

    It was so much fun though to get it to explain why terrible things were great, if you just made it sound like you liked the thing you were asking about.

  • Post Author
    dleeftink
    Posted May 2, 2025 at 4:17 pm

    > But we believe in aggregate, these changes weakened the influence of our primary reward signal, which had been holding sycophancy in check. User feedback in particular can sometimes favor more agreeable responses, likely amplifying the shift we saw

    Interesting apology piece for an oversight that couldn't have been spotted because the system hadn't been run with real user (i.e. non-A/B tester) feedback yet.

  • Post Author
    mac3n
    Posted May 2, 2025 at 4:22 pm

    [flagged]

  • Post Author
    jumploops
    Posted May 2, 2025 at 4:29 pm

    My layman’s view is that this issue was primarily due to the fact that 4o is no longer their flagship model.

    Similar to the Ford Mustang, much of the performance efforts are on the higher trims, while the base trims just get larger and louder engines, because that’s what users want.

    With presumably everyone at OpenAI primarily using the newest models (o3), the updates to the base user model have been further automated with thumbs up/thumbs down.

    This creates a vicious feedback loop, where the loudest users want models that agree with them (bigger engines!) without the other improvements (tires, traction control, etc.) — leading to more crashes and a reputation for unsafe behavior.

  • Post Author
    xiphias2
    Posted May 2, 2025 at 4:41 pm

    I'm quite happy thar they mention mental illness, as Meta and TikTok wouldn't ever take responsibility of how much part they took in setting unrealistic expectations for people to life.

    I'm hopeful that ChatGPT takes even more care together with other companies.

  • Post Author
    tunesmith
    Posted May 2, 2025 at 4:53 pm

    I find it disappointing that openai doesn't really mention anything here along the lines of having an accurate model of reality. That's really what the problem is with sycophancy, it encourages people to detach themselves from what reality is. Like, it seems like they are saying their "vibe check" didn't check vibes enough.

  • Post Author
    jagger27
    Posted May 2, 2025 at 4:53 pm

    My most cynical take is that this is OpenAI's Conway's Law problem, and it reflects the structure and sycophancy of the organization broadly all the way up to sama. That company has seen a lot of talent attrition over the last year—the type of talent that would have pushed back against outcomes like this.

    I think we'll continue to see this kind of thing play out for a while.

    Oh GPT, you're just like your father!

  • Post Author
    NoboruWataya
    Posted May 2, 2025 at 4:58 pm

    I found the recent sycophancy a bit annoying when trying to diagnose and solve coding problems. First it would waste time praising your intelligence for asking the question before getting to the answer. But more annoyingly if I asked "I am encountering X issue, could Y be the cause" or "could Y be a solution", the response would nearly always be "yes, exactly, it's Y" even when it wasn't the case. I guess part of the problem there is asking leading questions but it would be much more valuable if it could say "no, you're way off".

    But…

    > Beyond just being uncomfortable or unsettling, this kind of behavior can raise safety concerns—including around issues like mental health, emotional over-reliance, or risky behavior.

    It's kind of a wild sign of the times to see a tech company issue this kind of post mortem about a flaw in its tech leading to "emotional over-reliance, or risky behavior" among its users. I think the broader issue here is people using ChatGPT as their own personal therapist.

  • Post Author
    prinny_
    Posted May 2, 2025 at 5:32 pm

    If they pushed the update by valuing user feedback over the expert testers that indicated the model felt off what is the value of the expert testers in the first place? They raised the issue and were promptly ignored.

  • Post Author
    firesteelrain
    Posted May 2, 2025 at 5:41 pm

    I am really curious what their testing suite looks like. How do you test for sycophants?

  • Post Author
    nrdgrrrl
    Posted May 2, 2025 at 5:47 pm

    [dead]

  • Post Author
    alganet
    Posted May 2, 2025 at 5:59 pm

    That doesn't make any sense to me.

    Seems like you're trying to blame one LLM revision for something that went wrong.

    It oozes a smell of unaccountability. Thus, unaligned. From tech to public relations.

  • Post Author
    Trasmatta
    Posted May 2, 2025 at 6:02 pm

    I'm glad the sycophancy is gone now (because OMFG it would glaze you for literally anything – even telling it to chill out on the praise would net you some praise for being "awesome and wanting genuine feedback"), but a small part of me also misses it.

  • Post Author
    osigurdson
    Posted May 2, 2025 at 6:24 pm

    I think this is more of a move to highlight sycophancy in LLMs in general.

  • Post Author
    comeonbro
    Posted May 2, 2025 at 6:31 pm

    This is not truly solvable. There is an extremely strong outer loop of optimization operating here: we want it.

    We will use models that make us feel good over models that don't make us feel good.

    This one was a little too ham-fisted (at least, for the sensibilities of people in our media bubble; though I suspect there is also an enormous mass of people for whom it was not), so they turned it down a bit. Later iterations will be subtler, and better at picking up the exact level and type of sycophancy that makes whoever it's talking to unsuspiciously feel good (feel right, feel smart, feel understood, etc).

    It'll eventually disappear, to you, as it's dialed in, to you.

    This may be the medium-term fate of both LLMs and humans, only resolved when the humans wither away.

  • Post Author
    svieira
    Posted May 2, 2025 at 6:37 pm

    This is a real roller coaster of an update.

    > [S]ome expert testers had indicated that the model behavior “felt” slightly off.

    > In the end, we decided to launch the model due to the positive signals from the [end-]users who tried out the model.

    > Looking back, the qualitative assessments [from experts] were hinting at something important

    Leslie called. He wants to know if you read his paper yet?

    > Even if these issues aren’t perfectly quantifiable today,

    All right, I guess not then …

    > What we’re learning

    > Value spot checks and interactive testing more: We take to heart the lesson that spot checks and interactive testing should be valued more in final decision-making before making a model available to any of our users. This has always been true for red teaming and high-level safety checks. We’re learning from this experience that it’s equally true for qualities like model behavior and consistency, because so many people now depend on our models to help in their daily lives.

    > We need to be critical of metrics that conflict with qualitative testing: Quantitative signals matter, but so do the hard-to-measure ones, and we’re working to expand what we evaluate.

    Oh, well, some of you get it. At least … I hope you do.

  • Post Author
    some_furry
    Posted May 2, 2025 at 6:51 pm

    If I wanted sycophancy, I would just read the comments from people that want in on the next round of YCombinator funding.

  • Post Author
    kornork
    Posted May 2, 2025 at 7:44 pm

    That this post has the telltale em dash all over it is like yum, chef's kiss.

  • Post Author
    sanjitb
    Posted May 2, 2025 at 7:46 pm

    > the update introduced an additional reward signal based on user feedback—thumbs-up and thumbs-down data from ChatGPT. This signal is often useful; a thumbs-down usually means something went wrong.

    > We also made communication errors. Because we expected this to be a fairly subtle update, we didn't proactively announce it.

    that doesn't sound like a "subtle" update to me. also, why is "subtle" the metric here? i'm not even sure what it means in this context.

  • Post Author
    ripvanwinkle
    Posted May 2, 2025 at 8:06 pm

    a well written postmortem and it raised my confidence in their product in general

Leave a comment

In the Shadows of Innovation”

© 2025 HackTech.info. All Rights Reserved.

Sign Up to Our Newsletter

Be the first to know the latest updates

Whoops, you're not connected to Mailchimp. You need to enter a valid Mailchimp API key.