Here in the Department of Computer Science (DCS) at the University of Sheffield, the need for access to high-performance GPU hardware has increased considerably over the last couple of years. Current uses within the department include machine/deep learning and agent-based modelling (e.g. using FLAME GPU).
One way of getting access to GPUs is via the University’s ShARC HPC system, which contains several GPU-equipped nodes including an NVIDIA DGX-1 node with 8x NVIDIA P100 GPUs^. This DGX-1 is currently only accessible to DCS researchers and collaborators.
The DGX-1 has proved fairly popular, so DCS has decided to invest heavily in GPUs for the University’s next HPC system, Bessemer, which will run concurrently with ShARC for the next few years. DCS has purchased eight GPU nodes for Bessemer that again will be available for use by all DCS researchers and academics plus collaborators. The specification per node is:
That’s 32 NVIDIA V100 GPUs in total! These V100 devices have several advantages over the P100 cards available in the DGX-1. Firstly, the V100 devices have a number of tensor cores, each of which multiplies two half-precision (FP16) 4x4 matrices and then adds a half-precision or single-precision (FP32) matrix to the result, i.e. it computes D = A x B + C in one fused operation. As matrix multiply-accumulate dominates the cost of neural network training, tensor cores can speed it up considerably.
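To make the tensor core operation described above more concrete, here is a minimal pure-Python sketch that emulates one such multiply-add on 4x4 tiles. It assumes the mode in which the inputs are FP16 and the accumulator is FP32; the helper names (`to_half`, `to_single`, `tensor_core_fma`) are made up for this illustration, and real hardware will differ in accumulation order and rounding details.

```python
import struct


def to_half(x):
    """Round a Python float to the nearest IEEE 754 half-precision (FP16) value."""
    # Raises OverflowError if x is too large to represent in FP16.
    return struct.unpack("e", struct.pack("e", x))[0]


def to_single(x):
    """Round a Python float to the nearest IEEE 754 single-precision (FP32) value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def tensor_core_fma(a, b, c):
    """Emulate D = A x B + C on 4x4 tiles: FP16 inputs, FP32 accumulation.

    This is an illustrative model only, not how the hardware is programmed.
    """
    n = 4
    # The two multiplicands are stored in half precision.
    a = [[to_half(v) for v in row] for row in a]
    b = [[to_half(v) for v in row] for row in b]
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Start from the addend C, held in single precision.
            acc = to_single(c[i][j])
            for k in range(n):
                # Each FP16 x FP16 product is added into the FP32 accumulator.
                acc = to_single(acc + a[i][k] * b[k][j])
            d[i][j] = acc
    return d


# Small demonstration: A = 2 * identity, B = 0..15 laid out row by row, C = ones.
A = [[2.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
B = [[float(4 * i + j) for j in range(4)] for i in range(4)]
C = [[1.0] * 4 for _ in range(4)]
D = tensor_core_fma(A, B, C)  # every entry equals 2 * B[i][j] + 1
```

The inputs lose precision when rounded to FP16 (for example, `to_half(0.1)` is not exactly `0.1`), which is why accumulating in FP32 matters when many products are summed.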
Secondly, the V100 devices offer more performance in several ways:
| Model | V100 (NVLink) | P100 (NVLink) |
|---|---|---|
| Memory bandwidth | 900 GB/s | 720 GB/s |
| Half precision perf | 30 TFLOPS | 21.2 TFLOPS |
| Single precision perf | 15 TFLOPS | 10.6 TFLOPS |
| Double precision perf | 7.5 TFLOPS | 5.3 TFLOPS |
| Tensor perf (deep learning) | 120 TFLOPS | N/A |
| NVLink bandwidth | 300 GB/s | 160 GB/s |
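As a rough guide to what these figures mean in relative terms, the short sketch below computes the V100-to-P100 ratio for each row of the table. The numbers are copied from the table above; the variable names are just for illustration, and peak figures like these will not translate directly into application speed-ups.

```python
# (V100, P100) peak figures taken from the table above.
specs = {
    "Memory bandwidth (GB/s)": (900, 720),
    "Half precision (TFLOPS)": (30, 21.2),
    "Single precision (TFLOPS)": (15, 10.6),
    "Double precision (TFLOPS)": (7.5, 5.3),
    "NVLink bandwidth (GB/s)": (300, 160),
}

# Ratio of V100 to P100 for each metric (tensor perf is omitted: the P100 has no tensor cores).
ratios = {name: v100 / p100 for name, (v100, p100) in specs.items()}

for name, ratio in ratios.items():
    print(f"{name}: V100 is {ratio:.2f}x the P100")
```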
One major difference between Bessemer and ShARC is that Bessemer will be the first University of Sheffield HPC system to run the SLURM job/resource manager rather than Sun Grid Engine (SGE) or a variant thereof.
SLURM has native support for GPUs, which is much improved as of this year’s 19.05 release (PDF).
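To give a flavour of what requesting GPUs through SLURM can look like, here is a small Python sketch that generates a batch script using standard `sbatch` directives. Everything specific to Bessemer is an assumption: the function name `make_gpu_job_script` is hypothetical, no partition or account is set because the post does not name any, and the limit of four GPUs per node is simply 32 GPUs divided by 8 nodes. The `--gres=gpu:N` form is the long-standing generic-resource syntax; the 19.05 release also adds options such as `--gpus` and `--gpus-per-node`.

```python
def make_gpu_job_script(command, n_gpus=1, time_limit="01:00:00", job_name="gpu-job"):
    """Build the text of a single-node SLURM batch script that requests GPUs.

    Hypothetical helper for illustration; real site settings (partition,
    account, module loads) are not known here and are deliberately left out.
    """
    max_gpus_per_node = 4  # assumption: 32 V100s spread evenly over 8 nodes
    if not 1 <= n_gpus <= max_gpus_per_node:
        raise ValueError(f"n_gpus must be between 1 and {max_gpus_per_node}")
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        "#SBATCH --nodes=1",
        f"#SBATCH --gres=gpu:{n_gpus}",  # generic-resource request for GPUs
        f"#SBATCH --time={time_limit}",
        "",
        "nvidia-smi",  # list the GPUs visible to the job
        command,
    ]
    return "\n".join(lines) + "\n"


# Example: a two-GPU job running a placeholder training command.
script = make_gpu_job_script("python train.py", n_gpus=2)
print(script)
```

The generated text could be saved to a file and submitted with `sbatch`; the exact options a given cluster expects should be taken from its own documentation.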
The main benefits to users will be:
Expect to hear much more about Bessemer in the next month or two!
I should also mention that DCS has dedicated hardware in ShARC beyond the DGX-1: there are 12 CPU-only nodes with high core counts and/or large amounts of RAM (up to 768 GB).
^ Only seven are usable at present as there is a memory fault with one device.