A single GPU card costs $379, so the four GPUs cost $1516 in total. To fit within GPU memory, the first layer is not loaded onto the GPU on the root node; instead, it is loaded into RAM. For this, I used a new argument: --gpu-segments.

4 x RTX 3060 12 GB, Llama 3.3 70B
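
As a quick sanity check on the numbers, here is a minimal Python sketch of the arithmetic. It assumes the $379 price and the 12 GB of memory apply to each of the four cards; the constant and function names are hypothetical and used only for illustration.

```python
# Figures taken from the text: four RTX 3060 cards, $379 and 12 GB each.
GPU_COUNT = 4
PRICE_PER_GPU_USD = 379
VRAM_PER_GPU_GB = 12


def total_cost_usd(count: int, unit_price: int) -> int:
    """Total price of `count` identical cards."""
    return count * unit_price


def total_vram_gb(count: int, vram_per_card: int) -> int:
    """Combined GPU memory across `count` identical cards."""
    return count * vram_per_card


cost = total_cost_usd(GPU_COUNT, PRICE_PER_GPU_USD)
vram = total_vram_gb(GPU_COUNT, VRAM_PER_GPU_GB)

print(f"Total GPU cost: ${cost}")
print(f"Combined GPU memory: {vram} GB")
```

The combined memory figure is only an upper bound on what the model can use, which is consistent with part of the model being kept in RAM rather than on a GPU.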
