PRESS RELEASE
March 5, 2025
The new chip delivers up to 2.6x the performance of M1 Ultra, along with Thunderbolt 5 connectivity and support for more than half a terabyte of unified memory — the most ever in a personal computer
M3 Ultra features a 32-core CPU, an 80-core GPU, double the Neural Engine cores, Thunderbolt 5, and support for the most unified memory ever in a personal computer.
CUPERTINO, CALIFORNIA — Apple today announced M3 Ultra, the highest-performing chip it has ever created, offering the most powerful CPU and GPU in a Mac, double the Neural Engine cores, and the most unified memory ever in a personal computer. M3 Ultra also features Thunderbolt 5 with more than 2x the bandwidth per port for faster connectivity and robust expansion. M3 Ultra is built using Apple’s innovative UltraFusion packaging architecture, which links two M3 Max dies over 10,000 high-speed connections that offer low latency and high bandwidth. This allows the system to treat the combined dies as a single, unified chip for massive performance while maintaining Apple’s industry-leading power efficiency. UltraFusion brings together a total of 184 billion transistors to take the industry-leading capabilities of the new Mac Studio to new heights.
“M3 Ultra is the pinnacle of our scalable system-on-a-chip architecture, aimed specifically at users who run the most heavily threaded and bandwidth-intensive applications,” said Johny Srouji, Apple’s senior vice president of Hardware Technologies. “Thanks to its 32-core CPU, massive GPU, support for the most unified memory ever in a personal computer, Thunderbolt 5 connectivity, and industry-leading power efficiency, there’s no other chip like M3 Ultra.”
nottorp
> support for more than half a terabyte of unified memory
Soldered?
universenz
96GB on the baseline M3 Ultra model, with a max of 512GB! Looks like they're leaning hard into the AI crowd.
datadrivenangel
Unclear what devices this will be in outside of the Mac Studio. Also, most of the comparisons were with M1 and M2 chips, not M4.
mythz
Ultra disappointing: they waited 2 years just to push out a single-generation bump; even my iPad Pro from last year runs M4.
TheTxT
512GB unified memory is absolutely wild for AI stuff! Compared to how many NVIDIA GPUs you would need, the pricing looks almost reasonable.
varjag
Call me a unit fundamentalist, but calling 512GB "over half a terabyte of memory" irks me to no end.
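Whether the nitpick lands depends on which prefixes you assume; a quick Python check (a minimal sketch, using only the press release's 512GB figure):

    # SI prefixes: 1 TB = 1000 GB, so 512 GB is 0.512 TB -- "over half" holds.
    print(512 / 1000)          # 0.512

    # Binary prefixes: 1 TiB = 1024 GiB, so 512 GiB is exactly half a TiB.
    print(512 / 1024)          # 0.5

    # RAM is sized in binary units, so 512 GiB in decimal terabytes:
    print(512 * 2**30 / 1e12)  # ~0.55 TB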
okamiueru
Don't know what prior record Apple is alluding to here. But Apple marketing is what it is.
dlachausse
Interesting that they’re releasing M3 Ultra after the M4 Macs have already shipped.
I wonder if the plan is to only release Ultras for odd-numbered generations.
iambateman
People who know more than me: they’re talking a lot about RAM and not much about GPU.
Do you expect this will be able to handle AI workloads well?
All I’ve heard for the past two years is how important a beefy GPU is. Curious if that holds true here too.
chvid
Now make a data center version.
ksec
The previous M2 Ultra model had a max memory of 192GB, or 128GB for the Pro and some other M3 models, which I think is plenty for 99.9% of professional tasks.
They now bump it to 512GB, along with an insane price tag of $9,499 for the 512GB Mac Studio. I am pretty sure this is some AI gold rush.
desertmonad
Time to upgrade the M1 Ultra, I guess! The M1 Ultra has been pretty good with DeepSeek locally.
InTheArena
Whoa. M3 instead of M4. I wonder if this was basically binning, but I thought I had read somewhere that the interposer that enabled this for the M1 chips was not available.
That said, 512GB of unified RAM with access to the NPU is absolutely a game changer. My guess is that Apple developed this chip for their internal AI efforts and are now at the point where they are releasing it publicly for others to use. They really need a 2U rack form factor for this, though.
This hardware is really being held back by the operating system at this point.
behnamoh
819GB/s bandwidth…
What's the point of 512GB of RAM for LLMs on this Mac Studio if the speed is painfully slow?
It's as if Apple doesn't want to compete with Nvidia… this is really disappointing in a Mac Studio. FYI: the M2 Ultra already has 800GB/s of bandwidth.
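As a rough sanity check on the bandwidth worry: decode on a memory-bound LLM is capped by bandwidth divided by the bytes of weights streamed per token. A minimal sketch (this ignores compute, KV-cache traffic, and other overhead, so real numbers come in lower):

    # Upper bound on decode speed for a bandwidth-bound LLM: every generated
    # token has to stream the active weights from memory once.
    def max_tokens_per_s(bandwidth_gb_s, params_billion, bytes_per_param):
        weights_gb = params_billion * bytes_per_param
        return bandwidth_gb_s / weights_gb

    # A dense 70B model on the M3 Ultra's ~819 GB/s:
    print(max_tokens_per_s(819, 70, 1.0))  # 8-bit: ~11.7 tokens/s
    print(max_tokens_per_s(819, 70, 0.5))  # 4-bit: ~23.4 tokens/s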
pier25
So weird they released the Mac Studio with an M4 Max and M3 Ultra.
Why? Do they have too many M3 chips in stock?
johntitorjr
Lots of AI HW is focused on RAM (512GB!). I have a cost-sensitive application that needs speed (300+ TOPS), but only 1GB of RAM. Are there any HW companies focused on that space?
crest
Too bad it lacks even the streaming mode SVE2 found in M4 cores. If only Apple would provide a full SVE2 implementation to put pressure on ARM to make it non-optional so AArch64 isn't effectively restricted to NEON for SIMD.
lauritz
They're updating the Studio to M3 Ultra now, so the M4 Ultra can presumably go directly into the Mac Pro at WWDC? Interesting timing. Maybe they'll change the form factor of the Mac Pro, too?
Additionally, I would assume this is a very low-volume product, so it being on N3B isn't a dealbreaker. At the same time, these chips must be very expensive to make, so tying them with luxury-priced RAM makes some kind of sense.
mrtksn
Let's say you want the absolute max memory (512GB) to run AI models, and let's say you are OK with plugging in a drive to archive your model weights: then you can get this for a little shy of $10K. What a dream machine.
Compared to Nvidia's Project DIGITS, which is supposed to cost $3K and be available "soon", you can get a spec-matching 128GB & 4TB version of this Mac for about $4,700. The difference is that you can actually get it in a week, and it will run macOS (no idea how much performance difference to expect).
I can't wait to see someone test the full DeepSeek model on this. Maybe this will be the first little companion AI device that you can fully own and do whatever you like with, hassle-free.
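For anyone planning that test, a minimal sketch of how it would look with Apple's mlx-lm package (pip install mlx-lm); the model repo name is illustrative, and generate()'s keyword arguments can vary between mlx-lm versions:

    # Load a quantized model from the mlx-community collection and sample.
    from mlx_lm import load, generate

    model, tokenizer = load("mlx-community/Meta-Llama-3-70B-Instruct-4bit")
    reply = generate(
        model,
        tokenizer,
        prompt="Explain MoE inference in two sentences.",
        max_tokens=128,
    )
    print(reply)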
giancarlostoro
At 9 grand, I would certainly hope they support the device software-wise longer than they supported my 2017 MacBook Air. I see no reason to be forced to cough up 10 grand to Apple essentially every 7 years; that's ridiculous.
moondev
> support for more than half a terabyte of unified memory — the most ever in a personal computer
The AMD Ryzen Threadripper PRO 3995WX was released over four years ago and supports 2TB (64c/128t).
> Take your workstation's performance to the next level with the AMD Ryzen Threadripper PRO 3995WX 2.7 GHz 64-Core sWRX8 Processor. Built using the 7nm Zen Core architecture with the sWRX8 socket, this processor is designed to deliver exceptional performance for professionals such as artists, architects, engineers, and data scientists. Featuring 64 cores and 128 threads with a 2.7 GHz base clock frequency, a 4.2 GHz boost frequency, and 256MB of L3 cache, this processor significantly reduces rendering times for 8K videos, high-resolution photos, and 3D models. The Ryzen Threadripper PRO supports up to 128 PCI Express 4.0 lanes for high-speed throughput to compatible devices. It also supports up to 2TB of eight-channel ECC DDR4 memory at 3200 MHz to help efficiently run and multitask demanding applications.
gatienboquet
No benchmarks yet for the LLMs :(
xyst
I might like Apple again if the SoC could be sold separately and opened up. It would be interesting to see a PC with Asahi or Windows running on Apple’s chips.
c0deR3D
When will Apple silicon natively support OSes such as Linux? Apple is seemingly reluctant to release detailed technical reference manuals for the M-series SoCs, which makes running Linux natively on Apple silicon challenging.
NorwegianDude
The memory amount is fantastic, the memory bandwidth is half decent (~800 GB/s), and the compute capabilities are terrible (36 TOPS).
For comparison, a single consumer card like the RTX 5090 has only 32 GB of memory, but 1792 GB/s of memory bandwidth and 3593 TOPS of compute.
The use cases will be limited. While you can't run a 600B model directly like Apple says (because you need more memory for that), you can run a quantized version, but it will be very slow unless it's a MoE architecture.
tempodox
I could salivate over the hardware no end if only Apple's software (including the OS) weren't so shoddy.
bredren
Apart from enabling a 120Hz update to the Pro Display XDR, does TB5 offer a viable pathway for eGPUs on Apple silicon MacBooks?
This is a cool computer, but not something I'd want to lug around.
submeta
I am confused. I got an M4 with 64GB of RAM. Did I buy something from the future? :) Why M3 now, and not M4 Ultra?
ferguess_k
Ah, if only we could have this hardware with the freedom to install a good Linux distro on top of it. How is Asahi? Is it good enough? I assume that since Asahi is focused on Apple hardware, it has an easier time figuring out drivers and the like?
_alex_
Apple keeps talking about the Neural Engine. Does anything actually use it? It seems like all the current LLM and Stable Diffusion packages (including MLX) use the GPU.
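As far as I know, the ANE is only reachable through Core ML, not Metal, which is why GPU-first stacks like MLX never touch it. A sketch with coremltools, where "MyModel.mlpackage" is a hypothetical already-converted model; note Core ML's partitioner still decides per-layer whether the ANE actually runs anything:

    # Ask Core ML to schedule a converted model on the CPU + Neural Engine.
    import coremltools as ct

    model = ct.models.MLModel(
        "MyModel.mlpackage",                      # hypothetical converted model
        compute_units=ct.ComputeUnit.CPU_AND_NE,  # exclude the GPU
    )
    # Prediction inputs depend on the model's declared interface:
    # out = model.predict({"input": input_array})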
827a
Very curious: they upgraded the Mac Studio but not the Mac Pro today.
FloatArtifact
So the question is whether the M1/M2 Ultra was limited by GPU/NPU compute or by memory bandwidth at this point.
I'm curious what instruction sets may have been included with the M3 chip that the other two lack for AI.
So far the candidates seem to be NVIDIA DIGITS, the Framework Desktop, and the 64GB M1 / 128GB M2/M3 Studio/Ultra.
The GPU market isn't competitive enough for the amount of VRAM needed. I was hoping for a Battlemage GPU model with 24GB that would be reasonably priced and available.
For the Framework Desktop and similar devices, I think a second generation will be significantly better than what's on offer today. Rationale below…
For a max-spec processor with RAM at $2,000, this seems like a decent deal given today's market. However, this might age very fast, for three reasons.
Reason 1: LPDDR6 may debut in the next year or two; this could bring massive improvements to memory bandwidth and capacity for soldered-on memory.
LPDDR6 vs LPDDR5:
– Data bus width: 24 bits vs 16 bits
– Burst length: 24 vs 15
– Memory bandwidth: up to 38.4 GB/s vs up to 6.7 GB/s
– CAMM RAM may or may not maintain signal integrity as memory bandwidth increases. Until I see it implemented for an AI use case in a cost-effective manner, I am skeptical.
Reason 2: It's a laptop chip with limited PCIe lanes and a reduced power envelope. Theoretically, a desktop chip could have better performance, more lanes, and be socketable (although I don't think I've seen a socketed CPU with soldered RAM).
Reason 3: What does this hardware look like when repurposed in the future, compared to alternatives?
– Unlike desktop or server counterparts, which can have higher CPU core counts and PCIe/IO expansion, this processor and its motherboard are limited in how they can be repurposed down the line as a server to self-host software besides AI. I suppose it could be turned into an overkill NAS with ZFS and a single HBA controller card in a new case.
– Buying into the Framework Desktop is pretty limited, given the form factor. The next generation might be able to include a fully populated 16x slot and a 10G NIC. That seems about it if they're going to maintain the backward-compatibility philosophy, given the case form factor.
gpapilion
I think this will eventually morph into Apple's server fleet. This, in conjunction with the AI server factory they are opening, makes a lot of sense.
api
Half a terabyte could run 8-bit quantized versions of some of those full-size Llama and DeepSeek models. Looking forward to seeing some benchmarks on that.
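A rough weight-only sizing check (a sketch that ignores KV cache and runtime overhead):

    # Weight footprint in GB: params (in billions) * bits per param / 8.
    def weights_gb(params_billion, bits):
        return params_billion * bits / 8

    print(weights_gb(405, 8))  # Llama 3.1 405B at q8: ~405 GB -- fits in 512GB
    print(weights_gb(671, 8))  # DeepSeek 671B at q8: ~671 GB -- does not fit
    print(weights_gb(671, 4))  # DeepSeek 671B at q4: ~335 GB -- fits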
ntqvm
Disappointing announcement. M4 brings a significant uplift over M3, and the ST performance of the M3 Ultra will be significantly worse than the M4 Max.
Even for its intended AI audience, the ISA additions in M4 brought significant uplift.
Are they waiting to put M4 Ultra into the Mac Pro?
tuananh
But is it actually usable for anything if it's too slow?
Does anyone have a ballpark number for how many tokens per second we can get with this?
cxie
512GB of unified memory is truly breaking new ground. I was wondering when Apple would overcome memory constraints, and now we're seeing a half-terabyte level of unified memory. This is incredibly practical for running large AI models locally ("600 billion parameters"), and Apple's approach of integrating this much efficient memory on a single chip is fascinating compared to NVIDIA's solutions.
I'm curious how this design of "fusing" two M3 Max chips performs in terms of heat dissipation and power consumption, though.
daft_pink
Really? M4 Max or M3 Ultra instead of M4 Ultra?
cynicalpeace
Can someone explain what it would take for Apple to overtake NVIDIA as the preferred solution for AI shops?
This is my understanding (probably incorrect in some places):
1. NVIDIA's big advantage is that they design the hardware (chips) and software (CUDA). But Apple also designs the hardware (chips) and software (Metal and macOS).
2. CUDA has native support in AI libraries like PyTorch and TensorFlow, so it works extra well for training and inference. It seems Metal is well supported by PyTorch but not by TensorFlow (see the sketch after this list).
3. NVIDIA runs on Linux rather than macOS, making it generally easier to rack servers.
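On point 2, the framework-level API gap is already small; a minimal PyTorch sketch that runs on either vendor's hardware (the real difference is backend maturity and op coverage, not the calling code):

    # Pick CUDA on NVIDIA, MPS (Metal) on Apple silicon, CPU otherwise.
    import torch

    if torch.cuda.is_available():
        device = torch.device("cuda")
    elif torch.backends.mps.is_available():
        device = torch.device("mps")
    else:
        device = torch.device("cpu")

    x = torch.randn(4096, 4096, device=device)
    print(device, (x @ x).shape)  # same code path on CUDA, Metal, or CPU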
fintechie
IMO this is a bigger blow to the AI big boys than DeepSeek's release. This is massive for local inference. Exciting times ahead for open-source AI.
rjeli
Wow, incredible. I told myself I'd stop waffling and just buy the next 800GB/s Mini or Studio to come out, so I guess I'm getting this.
Not sure how much storage to get. I was floating the idea of getting less storage and hooking it up to a TB5 NAS array of 2.5" SSDs; 10-20TB for models + datasets + my media library would be nice. Any recommendations for the best enclosure for that?
Sharlin
> it can be configured up to 512GB, or over half a terabyte.
Hah, I see what they did there.
aurareturn
You can run the full DeepSeek 671B q4 model at 40 tokens/s: 37B active params at a time, because R1 is MoE.
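That figure is in the right ballpark for the usual bandwidth-bound estimate, assuming only the active experts' weights stream per token (real throughput lands somewhat lower):

    # 37B active params at q4 (~0.5 bytes each) streamed per generated token.
    active_gb = 37 * 0.5    # ~18.5 GB per token
    print(819 / active_gb)  # ~44 tokens/s theoretical ceiling at 819 GB/s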
screye
How does the 512GB of unified memory compare with 8x A100s? ($15/hr rentals)
If it is equivalent, then the $9,499 machine pays for itself in roughly 630 hours of rental. That's incredible value.
ummonk
Is the Mac Pro dead, or are they waiting for the M4 Ultra to refresh it?
ozten
We've come a long way since Beowulf clusters of smart toasters.
perfmode
32 core, 512GB RAM, 8TB SSD
please take my money now
raydev
I know it's basically nitpicking competing luxury sports cars at this point, but I am very bothered that existing benchmarks for the M3 show single-core perf that is approximately 70% of M4 single-core perf.
I feel like I should be able to spend all my money to both get the fastest single core performance AND all the cores and available memory, but Apple has decided that we need to downgrade to "go wide". Annoying.
1attice
Now with Ultra-class backdoors? https://news.ycombinator.com/item?id=43003230