An Appeal to Apple from Anukari: one tiny macOS detail to make Anukari fast by humbledrone

8 Comments

  • humbledrone
    Posted May 6, 2025 at 3:40 am

    Some folks may have seen my Show HN post for Anukari here: https://news.ycombinator.com/item?id=43873074

    In that thread, the topic of macOS performance came up. Basically, Anukari works great for most people on Apple silicon, including base-model M1 hardware. I've done all my testing on a base M1 and it works wonderfully. The hardware is incredible.

    But to make it work, I had to implement an unholy abomination of a workaround to get macOS to increase the GPU clock rate for the audio processing to be fast enough. The normal heuristics that macOS uses for the GPU performance state don't understand the weird Anukari workload.

    Anyway, I finally had time to write down the full situation, in terrible detail, so that I could ask for help getting in touch with the right person at Apple, probably someone who works on the Metal API.

    Help! :)

  • krackers
    Posted May 6, 2025 at 8:03 am

    >The Metal profiler has an incredibly useful feature: it allows you to choose the Metal “Performance State” while profiling the application. This is not configurable outside of the profiler.

    Seems like there might be a private API for this. Maybe it's easier to go the reverse engineering route? Unless it'll end up requiring some special entitlement that you can't bypass without disabling SIP.

  • LiamPowell
    Posted May 6, 2025 at 8:44 am

    The problem with exposing an API for this is that far too many developers will force the highest performance state all the time. I don't know if there's really a good way to stop that and have the API at the same time.

  • Someone
    Posted May 6, 2025 at 9:51 am

    One thing I don’t understand: if latency is important for this use case, why isn’t the CPU busy preparing the next GPU ‘job’ while a GPU ‘job’ is running?

    Is that a limitation of the audio plug-in APIs?

  • threeseed
    Posted May 6, 2025 at 9:58 am

    Best way to do this:

    1. Go through WWDC videos and find the engineer who seems the most knowledgeable about the issue you're facing.

    2. Email them directly, using this address format: mthomson@apple.com for Michael Thomson.

  • SOLAR_FIELDS
    Posted May 6, 2025 at 11:04 am

    https://xkcd.com/1172/ feels a lot like the workaround OP describes

  • sgt
    Posted May 6, 2025 at 11:10 am

    I have zero need for this app but it's so cool. Apps like these bring the "fun" back into computing. I don't mean there's no fun at the moment, but it reminds me of the old days, when more graphical and experimental programs floated around, even the demoscene.

  • charcircuit
    Posted May 6, 2025 at 11:33 am

    >Any MTLCommandQueue managed by an Audio Workgroup thread could be treated as real-time and the GPU clock could be adjusted accordingly.

    >The Metal API could simply provide an option on MTLCommandQueue to indicate that it is real-time sensitive, and the clock for the GPU chiplet handling that queue could be adjusted accordingly.

    Realtime scheduling on a GPU and what the GPU is clocked to are separate concepts. From the article it sounds like the issue is with the clock speeds and not how the work is being scheduled. It sounds like you need a separate mechanism for hinting that a higher GPU clock is needed.

© 2025 HackTech.info. All Rights Reserved.