The 7950X3D was supported on Windows from day 1, while on Linux the scheduler is still unaware of its cores’ different performance characteristics.
That may be true, but given the ridiculous performance increase this CPU gets from its massive L3 cache (X3D), I don’t care. I just replaced an Intel Xeon Silver Linux compute node with a custom-built Linux node based on the 7950X3D, and I’m now benchmarking at over twice the speed (CFD-type work)! Not bad for a $650 consumer CPU. The difference between 128MB and 12MB of L3 cache is apparently pretty huge, from what I’m seeing. I think it’s important to note that L3 cache can be shared across CPU cores.
The problem is that only half of the chiplets have access to the large cache. If the scheduler isn’t aware of that and a lot of data is shared across cores (as is the case for many games), you’ll miss out on most of that performance. AMD wrote a driver for Windows that steers cache-intensive threads onto the expanded-cache chiplet, but they didn’t do it for Linux. If your workload is not very chatty between cores, and threads don’t need to synchronize at 60Hz, it won’t matter as much. But for game workloads it makes a big difference, and getting it wrong can actually result in worse performance than the homogeneous chiplet design of the mid-tier 7800X3D.
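If you want to work around this by hand on Linux, here is a rough sketch of one way to do it, not anything AMD ships. It reads the standard sysfs cache topology files (`level`, `size`, `shared_cpu_list`) to find which logical CPUs share the largest L3, then restricts a process to them with `os.sched_setaffinity`. The function names are my own, and I’m assuming the kernel reports the stacked cache as a larger L3 on one chiplet, so check the output on your own box before trusting it.

```python
import os
from pathlib import Path


def parse_cpu_list(text):
    """Parse a kernel CPU list such as '0-7,16-23' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def parse_size(text):
    """Parse a sysfs cache size such as '98304K' or '96M' into bytes."""
    text = text.strip()
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    if text and text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text)


def l3_domains(sysfs_root="/sys/devices/system/cpu"):
    """Map each L3 domain (frozenset of logical CPU ids) to its size in bytes."""
    domains = {}
    for cache_dir in Path(sysfs_root).glob("cpu[0-9]*/cache/index[0-9]*"):
        try:
            # Only look at level-3 caches; skip L1/L2 entries.
            if int((cache_dir / "level").read_text()) != 3:
                continue
            cpus = parse_cpu_list((cache_dir / "shared_cpu_list").read_text())
            size = parse_size((cache_dir / "size").read_text())
        except (OSError, ValueError):
            continue  # missing or unreadable entry
        domains[frozenset(cpus)] = size
    return domains


def largest_l3_cpus(domains):
    """Return the CPU ids of the domain with the most L3 (empty set if none)."""
    if not domains:
        return set()
    return set(max(domains, key=domains.get))


def pin_to_largest_l3(pid=0):
    """Restrict `pid` (0 = the calling process) to the largest-L3 CPUs. Linux only."""
    cpus = largest_l3_cpus(l3_domains())
    if cpus:
        os.sched_setaffinity(pid, cpus)
    return cpus
```

Calling `pin_to_largest_l3()` at the start of a job keeps all of its threads on the cores that share the big cache, at the cost of not using the other chiplet at all. If every L3 domain reports the same size (as on a chip with identical chiplets), it just picks one of them, so it isn’t useful there.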