Twin Karmakharm
Research Software Engineer
Helping to increase your research impact
through sustainable research software.
When your machine is just not enough...
"It takes me X days to run my algorithm/analysis/simulation."
"I can't load X GBs of dataset into memory."
"There's not enough room to hold my X TBs of dataset."
High Performance Computing (HPC) refers to a network (cluster) of connected computers (nodes)
Like your PC but bigger...
And there's a lot of them...
With fast connections between nodes...
Many have specialist compute hardware e.g. GPUs, FPGAs, etc.
Take advantage of the large number of CPUs and nodes:
Break down your problem into small tasks and run them concurrently:
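As a minimal single-node sketch of this idea, Python's standard `concurrent.futures` module can split a job into independent chunks and run each in its own process. The `simulate` function and the choice of four workers are hypothetical stand-ins for a real workload, not something from the slides.

```python
from concurrent.futures import ProcessPoolExecutor


def simulate(chunk):
    """Hypothetical stand-in for an expensive, independent task."""
    return sum(x * x for x in chunk)


def split(data, n_chunks):
    """Split data into at most n_chunks contiguous pieces of similar size."""
    size = max(1, -(-len(data) // n_chunks))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]


def run_parallel(data, n_workers=4):
    """Run simulate() on each chunk in a separate process, then combine."""
    chunks = split(data, n_workers)
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        partial_sums = list(pool.map(simulate, chunks))
    return sum(partial_sums)


if __name__ == "__main__":
    data = list(range(1_000_000))
    print(run_parallel(data))
```

This only works because the chunks do not depend on each other. On a cluster the same pattern is usually spread across nodes, for example with a scheduler's task arrays or MPI.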
There are different tiers of HPC, with generally more computing resource the higher you go, from the level of local institutions (Tier-3) to multi-national (Tier-0).
If additional compute resource is needed...
A few tips to improve your HPC experience
The more resources you request, the harder it is for the scheduler to find a slot for your job.
Always set your job's execution time!
Option | SGE (Iceberg, ShARC) | SLURM (Bessemer, JADE) |
---|---|---|
Execution time | -l h_rt=hh:mm:ss | -t [mins] or -t [days-hh:mm:ss] |
Memory | -l rmem=xxG | --mem=xxG |
No. CPU cores | -pe <env> <nn> | -c <nn> |
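To show how the options in the table map onto a job script, here is a small sketch that builds the directive lines for either scheduler. The helper `job_header` is hypothetical, and the default parallel environment name `smp` is an assumption; check your cluster's documentation for the right value.

```python
def job_header(scheduler, runtime, mem_gb, cores, pe_env="smp"):
    """Return batch-script directive lines for the given scheduler.

    runtime is "hh:mm:ss"; pe_env is the SGE parallel environment name
    (the default "smp" is an assumption and varies between clusters).
    """
    if scheduler == "sge":
        return [
            f"#$ -l h_rt={runtime}",
            f"#$ -l rmem={mem_gb}G",
            f"#$ -pe {pe_env} {cores}",
        ]
    if scheduler == "slurm":
        return [
            f"#SBATCH -t {runtime}",
            f"#SBATCH --mem={mem_gb}G",
            f"#SBATCH -c {cores}",
        ]
    raise ValueError(f"unknown scheduler: {scheduler!r}")


if __name__ == "__main__":
    # Print a 90-minute, 8 GB, 4-core request for each scheduler.
    for name in ("sge", "slurm"):
        print("\n".join(["#!/bin/bash"] + job_header(name, "01:30:00", 8, 4)))
        print()
```

Requesting only what the job needs (here 90 minutes, 8 GB and 4 cores) makes it easier for the scheduler to find a slot.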
... to standardise your software environment across your local machine, HPCs and cloud.
e.g. here's a dependency tree for TensorFlow 2
tensorflow-gpu==2.1.0
  - absl-py [required: >=0.7.0, installed: 0.9.0]
    - six [required: Any, installed: 1.14.0]
  - astor [required: >=0.6.0, installed: 0.8.1]
  - gast [required: ==0.2.2, installed: 0.2.2]
  - google-pasta [required: >=0.1.6, installed: 0.2.0]
    - six [required: Any, installed: 1.14.0]
  - grpcio [required: >=1.8.6, installed: 1.28.1]
    - six [required: >=1.5.2, installed: 1.14.0]
  - keras-applications [required: >=1.0.8, installed: 1.0.8]
    - h5py [required: Any, installed: 2.10.0]
      - numpy [required: >=1.7, installed: 1.18.2]
      - six [required: Any, installed: 1.14.0]
    - numpy [required: >=1.9.1, installed: 1.18.2]
  - keras-preprocessing [required: >=1.1.0, installed: 1.1.0]
    - numpy [required: >=1.9.1, installed: 1.18.2]
    - six [required: >=1.9.0, installed: 1.14.0]
  - numpy [required: >=1.16.0,<2.0, installed: 1.18.2]
  - opt-einsum [required: >=2.3.2, installed: 3.2.0]
    - numpy [required: >=1.7, installed: 1.18.2]
  - protobuf [required: >=3.8.0, installed: 3.11.3]
    - setuptools [required: Any, installed: 45.2.0.post20200210]
    - six [required: >=1.9, installed: 1.14.0]
  - scipy [required: ==1.4.1, installed: 1.4.1]
    - numpy [required: >=1.13.3, installed: 1.18.2]
  - six [required: >=1.12.0, installed: 1.14.0]
  - tensorboard [required: >=2.1.0,<2.2.0, installed: 2.1.1]
    - absl-py [required: >=0.4, installed: 0.9.0]
      - six [required: Any, installed: 1.14.0]
    - google-auth [required: >=1.6.3,<2, installed: 1.13.1]
      - cachetools [required: >=2.0.0,<5.0, installed: 4.0.0]
      - pyasn1-modules [required: >=0.2.1, installed: 0.2.8]
        - pyasn1 [required: >=0.4.6,<0.5.0, installed: 0.4.8]
      - rsa [required: >=3.1.4,<4.1, installed: 4.0]
        - pyasn1 [required: >=0.1.3, installed: 0.4.8]
      - setuptools [required: >=40.3.0, installed: 45.2.0.post20200210]
      - six [required: >=1.9.0, installed: 1.14.0]
    - google-auth-oauthlib [required: >=0.4.1,<0.5, installed: 0.4.1]
      - google-auth [required: Any, installed: 1.13.1]
        - cachetools [required: >=2.0.0,<5.0, installed: 4.0.0]
        - pyasn1-modules [required: >=0.2.1, installed: 0.2.8]
          - pyasn1 [required: >=0.4.6,<0.5.0, installed: 0.4.8]
        - rsa [required: >=3.1.4,<4.1, installed: 4.0]
          - pyasn1 [required: >=0.1.3, installed: 0.4.8]
        - setuptools [required: >=40.3.0, installed: 45.2.0.post20200210]
        - six [required: >=1.9.0, installed: 1.14.0]
      - requests-oauthlib [required: >=0.7.0, installed: 1.3.0]
        - oauthlib [required: >=3.0.0, installed: 3.1.0]
        - requests [required: >=2.0.0, installed: 2.22.0]
          - certifi [required: >=2017.4.17, installed: 2019.11.28]
          - chardet [required: >=3.0.2,<3.1.0, installed: 3.0.4]
          - idna [required: >=2.5,<2.9, installed: 2.8]
          - urllib3 [required: >=1.21.1,<1.26,!=1.25.1,!=1.25.0, installed: 1.25.7]
    - grpcio [required: >=1.24.3, installed: 1.28.1]
      - six [required: >=1.5.2, installed: 1.14.0]
    - markdown [required: >=2.6.8, installed: 3.2.1]
      - setuptools [required: >=36, installed: 45.2.0.post20200210]
    - numpy [required: >=1.12.0, installed: 1.18.2]
    - protobuf [required: >=3.6.0, installed: 3.11.3]
      - setuptools [required: Any, installed: 45.2.0.post20200210]
      - six [required: >=1.9, installed: 1.14.0]
    - requests [required: >=2.21.0,<3, installed: 2.22.0]
      - certifi [required: >=2017.4.17, installed: 2019.11.28]
      - chardet [required: >=3.0.2,<3.1.0, installed: 3.0.4]
      - idna [required: >=2.5,<2.9, installed: 2.8]
      - urllib3 [required: >=1.21.1,<1.26,!=1.25.1,!=1.25.0, installed: 1.25.7]
    - setuptools [required: >=41.0.0, installed: 45.2.0.post20200210]
    - six [required: >=1.10.0, installed: 1.14.0]
    - werkzeug [required: >=0.11.15, installed: 0.15.4]
    - wheel [required: >=0.26, installed: 0.34.2]
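With this many interdependent packages, recording the exact versions installed on one machine is a first step towards reproducing the environment elsewhere. The sketch below uses the standard `importlib.metadata` module (Python 3.8+) to list every installed distribution as a pinned `name==version` line, similar in spirit to `pip freeze`. The helper name `pinned_requirements` is hypothetical.

```python
from importlib.metadata import distributions


def pinned_requirements():
    """Return sorted 'name==version' lines for every installed distribution."""
    pins = set()
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with missing or broken metadata
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    # Redirect this output to a file to keep a record of the environment.
    print("\n".join(pinned_requirements()))
```

A pinned list like this covers Python packages only. System libraries, drivers and compilers are not captured, which is part of the motivation for standardising the whole environment.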
Help with programming problems and general advice on best practice related to research software.