We hacked Gemini’s Python sandbox and leaked its source code (at least some) by topsycatt

11 Comments

  • Post Author
    sneak
    Posted March 28, 2025 at 6:30 pm

    > However, the build pipeline for compiling the sandbox binary included an automated step that adds security proto files to a binary whenever it detects that the binary might need them to enforce internal rules. In this particular case, that step wasn’t necessary, resulting in the unintended inclusion of highly confidential internal protos in the wild!

    Protobufs aren't really these super secret hyper-proprietary things they seem to make them out to be in this breathless article.
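
    For context on what an embedded proto descriptor actually reveals, here's a minimal sketch (the message and field names are made up, not anything from the leak) that builds a FileDescriptorProto by hand. Recovering one of these from a binary gives you schema metadata like this, not implementation logic.

        # Hypothetical descriptor, illustrating what compiled-in proto
        # metadata looks like: file/message names, field numbers, and types.
        from google.protobuf import descriptor_pb2

        fd = descriptor_pb2.FileDescriptorProto()
        fd.name = "internal/security_policy.proto"  # made-up path
        msg = fd.message_type.add()
        msg.name = "SecurityPolicy"
        field = msg.field.add()
        field.name = "allowed_syscalls"
        field.number = 1
        field.label = descriptor_pb2.FieldDescriptorProto.LABEL_REPEATED
        field.type = descriptor_pb2.FieldDescriptorProto.TYPE_STRING

        print(fd)  # text-format dump of the schema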

  • Post Author
    topsycatt
    Posted March 28, 2025 at 6:34 pm

    That's the system I work on! Please feel free to ask any questions. All opinions are my own and do not represent those of my employer.

  • Post Author
    fpgaminer
    Posted March 28, 2025 at 6:35 pm

    Awww, I was looking forward to seeing some of the leak ;) Oh well. Nice find and breakdown!

    Somewhat relatedly, it occurred to me recently just how important issues like prompt injection, etc are for LLMs. I've always brushed them off as unimportant to _me_ since I'm most interested in local LLMs. Who cares if a local LLM is weak to prompt injection or other shenanigans? It's my AI to do with as I please. If anything I want them to be, since it makes it easier to jailbreak them.

    Then Operator and Deep Research came out and it finally made sense to me. When we finally have our own AI Agents running locally doing jobs for us, they're going to encounter random internet content. And the AI Agent obviously needs to read that content, or view the images. And if it's doing that, then it's vulnerable to prompt injection by a third party.

    Which, yeah, duh, stupid me. But … is also a really fascinating idea to consider. A future where people have personal AIs, and those AIs can get hacked by reading the wrong thing from the wrong backalley of the internet, and suddenly they are taken over by a mind virus of sorts. What a wild future.

  • Post Author
    paxys
    Posted March 28, 2025 at 6:49 pm

    Funny enough while "We hacked Google's AI" is going to get the clicks, in reality they hacked the one part of Gemini that was NOT the LLM (a sandbox environment meant to run untrusted user-provided code).

    And "leaked its source code" is straight up click bait.

  • Post Author
    ein0p
    Posted March 28, 2025 at 6:52 pm

    They hacked the sandbox, and leaked nothing. The article is entertaining though.

  • Post Author
    simonw
    Posted March 28, 2025 at 8:04 pm

    I've been using a similar trick to scrape the visible internal source code of ChatGPT Code Interpreter into a GitHub repository for a while now: https://github.com/simonw/scrape-openai-code-interpreter

    It's mostly useful for tracking what Python packages are available (and what versions): https://github.com/simonw/scrape-openai-code-interpreter/blo…
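
    For anyone curious how that kind of scrape can work, a rough sketch (not the repo's actual code) is to just ask the interpreter, from inside the sandbox, what's installed and dump it:

        # List installed distributions and versions from inside the sandbox.
        import importlib.metadata

        packages = sorted(
            (dist.metadata["Name"], dist.version)
            for dist in importlib.metadata.distributions()
        )
        for name, version in packages:
            print(f"{name}=={version}")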

  • Post Author
    theLiminator
    Posted March 28, 2025 at 8:36 pm

    It's actually pretty interesting that this shows Google is quite secure; I feel like most companies would not fare nearly as well.

  • Post Author
    jll29
    Posted March 28, 2025 at 8:38 pm

    Running the built-in "strings" command to extract a few file names from a binary is hardly hacking/cracking.

    Ironically, though, getting the source code of Gemini perhaps wouldn't be valuable at all; but if you had found/obtained access to the corpus that the model was pre-trained with, that would have been kind of interesting (many folks have many questions about that…).
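
    For reference, the strings(1) approach amounts to something like this sketch in Python: pull printable ASCII runs out of the binary and keep anything that looks like a source path.

        import re
        import sys

        with open(sys.argv[1], "rb") as f:
            data = f.read()

        # Runs of 4+ printable ASCII bytes, roughly what `strings` prints.
        for match in re.finditer(rb"[\x20-\x7e]{4,}", data):
            s = match.group().decode("ascii")
            if s.endswith(".py") or s.endswith(".proto"):
                print(s)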

  • Post Author
    tgtweak
    Posted March 28, 2025 at 8:40 pm

    The definition of hacking is getting pretty loose. This looks like the sandbox is doing exactly what it's supposed to do and nothing sensitive was exfiltrated…

  • Post Author
    jeffbee
    Posted March 28, 2025 at 9:11 pm

    I guess these guys didn't notice that all of these proto descriptors, and many others, were leaked on github 7 years ago.

    https://github.com/ezequielpereira/GAE-RCE/tree/master/proto…

  • Post Author
    bluelightning2k
    Posted March 28, 2025 at 9:23 pm

    Cool write up. Although it's not exactly a huge vulnerability. I guess it says a lot about how security conscious Google is that they consider this to be significant. (You did mention that you knew the company's specific policy considered this highly confidential, so it does count, but it feels a little more like "technically considered a vulnerability" rather than clearly one.)
