I am purchasing a new workstation to run FDS (Fire Dynamics Simulator) simulations, a CFD code for thermally driven fluid flow.

Currently, I am using an Ubuntu Linux build with a Xeon E5-2630 v3 2.40 GHz with 8 cores and 16 GB of RAM. The total mesh is about 20,000 cells (4.5 by 4.5 by 10 cm), divided into 8 meshes with a 3.75 mm inner cell size, and it takes about 3 days 10 hours to run the current iteration. I would like to improve my computation times significantly and decrease my cell size to 1 or 2 mm, ideally (so almost a factor of 16*16 in 3 dimensions). I would also like to run a Windows 10/11 PC instead of Linux.

The software uses MPI parallel computing, and it is ideal to run 1 mesh per thread with no hyperthreading, though computation improvements are also seen running in serial (8 meshes on 16 cores, etc.). The benefits diminish at higher mesh counts, with a practical limit around 16^3 cells per mesh. RAM is recommended at 2 GB per core.

Any suggestions on speeds, architecture, number of cores? Based on PassMark benchmarks, for instance, the i9-12900K (16 cores, 3.2 GHz base) has a score of 40,258 vs. the Xeon Gold 6230R (26 cores, 2.10 GHz base) with a score of 30,391. The Xeon is $1,100 more expensive and usually sold in a workstation setup, whereas the i9 is aimed more at high-performance desktops. The scores seem consistent with other benchmarks (Geekbench, PCMark).

What is the best CPU for MPI parallel computing for CFD and/or serial computing for CFD on a $3,000-5,000 budget?

TLDR: Are professional benchmark comparisons accurate for, say, an i9-12 versus a current Xeon Gold or Platinum? If the overall speed is the same, will I gain anything from a slower CPU with more cores, or one designed for scientific/workstation use? Or does this vary case by case with the software?

Note: I have also been exploring running FDS on AWS following a tutorial from the documentation ( ), but it is tough to do all the coding. If that is a faster option (a 72-core c5n.18xlarge instance), then I might just do that.

---

Here are some generic recommendations on HPC workstation and server node procurement.

First, figure out whether you need ECC memory, since that determines whether you buy Intel Core and AMD Ryzen desktops/workstations versus Intel Xeon, AMD Threadripper/EPYC, or ARM workstations/servers. Most likely it won't matter, but if you are doing things where a single erroneous result would create problems, you ought to be using ECC.

Second, essentially all CFD workloads are bandwidth-limited, which means DRAM channels are more important than cores, so long as you have enough cores to saturate your DRAM channels. As of September 2022, Intel Xeon 14 nm servers (e.g. Cascade Lake) have 6-channel DDR4, whereas Intel Xeon 10 nm servers (e.g. Ice Lake), as well as all AMD and ARM servers I know of, have 8 DDR4 channels. Intel Core and AMD Threadripper workstations have fewer DDR channels (at most 4, from what I've seen, but I haven't looked exhaustively).

Now, if your CFD problems are small and potentially fit into L3 cache, you should consider that in more detail. AMD is offering more L3 than Intel, although Intel L3 caches are monolithic, meaning they are fully shared by all cores, whereas AMD L3 caches are per CCD/CCX (see e.g. this for details), which may be relevant depending on how the code is parallelized. OpenMP parallelism usually needs a monolithic cache, whereas MPI mostly doesn't care.

The other issue is whether the code scales ideally. Not all codes strong- and weak-scale ideally, and I recall that FDS has at least some MPI scaling problems, at least in multi-node scenarios, although I can't determine whether they are relevant to your use cases.

Anyways, here are some examples of systems that I'd consider building, assuming I was running open-source codes that don't depend on a vendor compiling binaries for me, and thus there is no problem using ARM AArch64 instead of Intel/AMD x86:

- The Puget Systems mini-ATX desktops for Intel and AMD 12-core CPUs with 4x32 GB of memory both come in around $4K.
- [vendor link missing] has 32- and 64-core AMD options with 512 GB of DRAM for $11K, and an option for a 32-core Ice Lake Xeon with 512 GB of DRAM around $11K.
- [vendor link missing] with a 64-core CPU and 256-512 GB of DRAM is around $13K.

The closest equivalent in the ARM world would be the Apple M1 Studio with 10-20 cores, which is $2-5K depending on how you equip it. For most CFD codes, I'd expect the microarchitectural differences between Intel, AMD, ARM Neoverse, and Apple Silicon to be far less important than memory configuration and core count.

AMD appears to be better than Intel right now, depending on which generations one compares, but I consider the non-uniform cache and memory issues on AMD to be a major inconvenience.
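Since CFD performance is usually limited by sustained memory bandwidth rather than core count or composite benchmark scores, it can help to measure bandwidth on a candidate machine directly. Below is a minimal STREAM-style "triad" sketch in Python/NumPy; this is my own illustration (not part of FDS, and not the official STREAM benchmark, which is written in C with OpenMP and will report higher, more accurate numbers):

```python
import time

import numpy as np


def triad_bandwidth(n: int = 20_000_000, trials: int = 5) -> float:
    """Estimate sustained memory bandwidth in GB/s with a STREAM-like
    triad kernel (a = b + s * c). With the default n, each array is
    ~160 MB of doubles, which overflows any current L3 cache, so the
    measurement reflects DRAM rather than cache."""
    b = np.random.rand(n)
    c = np.random.rand(n)
    s = 3.0
    best = float("inf")
    for _ in range(trials):
        t0 = time.perf_counter()
        a = b + s * c  # 2 array reads + 1 array write per element
        best = min(best, time.perf_counter() - t0)
    # 3 arrays touched per iteration, 8 bytes per double
    return 3 * 8 * n / best / 1e9


if __name__ == "__main__":
    print(f"sustained triad bandwidth: ~{triad_bandwidth():.1f} GB/s")
```

Note that a single NumPy thread typically cannot saturate a multi-channel memory system; running one copy per physical core (e.g. via `multiprocessing`) and summing the results gives a rough aggregate figure. For a bandwidth-bound code, that aggregate number is likely a better predictor than a PassMark score.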