TL;DR Around 100 of Iceberg’s nodes are ancient and weaker than a decent laptop. You may get better performance by switching to ShARC. You’ll get even better performance by investing in the RSE project on ShARC.
## Benchmarking different nodes on our HPC systems
I have been benchmarking various nodes on Iceberg and ShARC using matrix-matrix multiplication. This operation is highly parallel, heavily optimised in modern numerical libraries, and a vital part of many scientific workflows.
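To make the measurement concrete, here is a minimal sketch of how a matrix-matrix multiplication timing can be turned into a GigaFlops figure. It is not the benchmark code used for this post: the function names (`matmul_flops`, `gigaflops`, `naive_matmul`, `benchmark`) are made up for illustration, and the pure-Python multiply is only a stand-in so the example runs anywhere. A real benchmark would time an optimised, multi-threaded BLAS-backed routine instead.

```python
import time


def matmul_flops(n):
    """Floating-point operations in an n-by-n times n-by-n product.

    Each of the n*n output entries needs n multiplications and n - 1 additions.
    """
    return n * n * (2 * n - 1)


def gigaflops(n, seconds):
    """Convert a measured wall-clock time into GigaFlops."""
    return matmul_flops(n) / seconds / 1e9


def naive_matmul(a, b):
    """Pure-Python stand-in (hypothetical); swap in an optimised routine for real use."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def benchmark(multiply, n, repeats=3):
    """Time multiply(a, b) on n-by-n inputs and return the best GigaFlops seen."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        multiply(a, b)
        best = min(best, time.perf_counter() - start)
    return gigaflops(n, best)


if __name__ == "__main__":
    print(f"{benchmark(naive_matmul, 100):.4f} GigaFlops (naive, n=100)")
```

For a sense of scale, a 10000 by 10000 product is roughly 2 × 10¹² floating-point operations under this count, so a node sustaining 800 GigaFlops would finish it in about 2.5 seconds.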
The benchmark units are GigaFlops (billions of floating-point operations per second), and higher is better. Here are the results for maximum matrix sizes of 10000 by 10000, sorted worst to best:
According to the Iceberg cluster specs, over half of Iceberg is made up of the old ‘Westmere’ nodes. According to these benchmarks, these are almost 4 times slower than a standard node on ShARC.
We in the RSE group have co-invested with our collaborators in additional hardware on ShARC to form a ‘Premium queue’. This hardware includes large memory nodes (768 Gigabytes per node, 12 times the amount that’s normally available), advanced GPUs (a DGX-1 server) and ‘dense-core’ nodes with 32 CPU cores each.
These 32-core nodes are capable of over 800 GigaFlops and so are 6.7 times faster than the old Iceberg nodes. Furthermore, since they are only available to contributors, the queues will be shorter too!
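The ratios quoted above can be cross-checked with a little arithmetic. The sketch below uses only the figures stated in this post ("over 800", "6.7 times", "almost 4 times"). The per-node numbers it prints are rough implied estimates, not measured results, and the variable names are purely illustrative.

```python
# Figures quoted in the post
dense_core_gflops = 800.0   # "over 800 GigaFlops", so treat this as a lower bound
dense_vs_westmere = 6.7     # dense-core nodes relative to old Iceberg (Westmere) nodes
sharc_vs_westmere = 4.0     # "almost 4 times", so treat this as an upper bound

# Implied estimates (assumptions derived from the ratios, not benchmark output)
westmere_gflops = dense_core_gflops / dense_vs_westmere
sharc_standard_gflops = westmere_gflops * sharc_vs_westmere
dense_vs_standard = dense_vs_westmere / sharc_vs_westmere

print(f"Implied old Iceberg node:      ~{westmere_gflops:.0f} GigaFlops")
print(f"Implied standard ShARC node:   ~{sharc_standard_gflops:.0f} GigaFlops")
print(f"Dense-core vs standard ShARC:  ~{dense_vs_standard:.1f}x")
```

In other words, the quoted ratios suggest the dense-core nodes are somewhat under twice as fast as a standard ShARC node on this benchmark.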
Details of how to participate in the RSE-queue experiment on ShARC can be found on our website.
These benchmarks give reproducible evidence that ShARC can be significantly faster than Iceberg when well-optimised code is used. We have heard some unconfirmed reports that code run on ShARC can be slower than code run on Iceberg. If this is the case for you, please get in touch with us and give details.
For queries relating to collaborating with the RSE team on projects: email@example.com
To be notified when we advertise talks and workshops, join our mailing list by subscribing to this Google Group.