RSE Sheffield Blog

Pytest Parametrisation

Neil Shephard
24 January 2024 12:00

Pytest is an excellent framework for writing tests in Python. One of its neat features is the ability to parameterise your tests, which means you can write one test and pass different sets of parameters into it to cover the range of cases that the function or method is meant to handle.
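As a minimal sketch of what this looks like in practice, the example below runs one test against three sets of parameters using `pytest.mark.parametrize`. The `add()` function and the parameter values are hypothetical, chosen purely for illustration.

```python
import pytest


def add(a: float, b: float) -> float:
    """Hypothetical function under test: return the sum of two numbers."""
    return a + b


# Each pytest.param() is one set of arguments passed to the test; the id
# is the label pytest shows for that case in its report.
@pytest.mark.parametrize(
    ("a", "b", "expected"),
    [
        pytest.param(1, 2, 3, id="positive integers"),
        pytest.param(-1, 1, 0, id="mixed signs"),
        pytest.param(0.5, 0.25, 0.75, id="floats"),
    ],
)
def test_add(a: float, b: float, expected: float) -> None:
    """Run the same assertion once for each parameter set."""
    assert add(a, b) == expected
```

Saved in a file such as `test_add.py` (an assumed name), running `pytest -v` should report each parameter set as a separate test, so a failure points directly at the case that broke.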

Repository Review

Neil Shephard
17 November 2023 12:00

I’ve written before about Python Packaging and pre-commit which I’m a big fan of. Today I discovered a really useful tool for checking your packaging configuration and pre-commit configuration from the Scientific Python Development Guide.

R Resources

Neil Shephard
11 October 2023 13:00

R is a statistical programming language and one of the most popular languages for data analysis, statistics and plotting in academia and industry. Learning a new language can be daunting, particularly if you have no experience of scripting and are used to Graphical User Interfaces (GUIs) where you point and click to perform your statistical analysis.

Fear not though, there are a lot of resources and very friendly, enthusiastic and helpful R users out there who can help you on your journey learning R. This post details some of them, and I’d welcome additions.

Python Packaging

Neil Shephard
18 September 2023 13:00

Python packaging is in a constant state of flux. The official Python Packaging User Guide from the Python Packaging Authority (PyPA) is probably the best resource to read, but things change, and often quickly. The focus here is on the PyPA Setuptools using pyproject.toml which works with Python >= 3.7, but you may wish to consider other packages such as Poetry or PDM which offer some advantages but with additional frameworks to learn.

Benchmarking FLAME GPU 2 on H100, A100 and V100 GPUs

Peter Heywood
18 August 2023 13:00

H100 GPUs now available at the University of Sheffield

Members of the University of Sheffield have access to a range of GPU resources for carrying out their research, available in local (Tier 3) and affiliated regional (Tier 2) HPC systems.

As of the 7th of August 2023, 12 new H100 PCIe GPUs (6 nodes, 2 GPUs per node) have been added to the Stanage Tier 3 HPC facility and are available for all users.

At the time of writing, the following GPUs are available for members of the university, free at the point of use:

  • Stanage (Tier 3, The University of Sheffield):
    • 60 public NVIDIA A100 SXM4 80GB GPUs
    • 12 public NVIDIA H100 PCIe 80GB GPUs
  • Bessemer (Tier 3, The University of Sheffield):
  • JADE 2 (Tier 2 - Machine Learning and Molecular Dynamics):
    • 504 NVIDIA Tesla V100 MAXQ 32GB GPUs
  • N8 CIR Bede (Tier 2 - PPC64LE CPUs & Multi-node jobs):
    • 136 NVIDIA Tesla V100 SXM2 32GB GPUs with host-device NVLink
    • 16 NVIDIA Tesla T4 16GB PCIe GPUs

The newly available H100 GPUs in Stanage are the 300W PCIe variant. This means that for some workloads the university’s 500W A100 SXM4 GPUs may offer higher performance (with higher power consumption). For example, multi-GPU workloads which perform a large volume of GPU to GPU communication may be better suited to the A100 nodes than the H100 nodes. The new H100 nodes in Stanage each contain 2 GPUs which are only connected via PCIe to the host. The existing A100 nodes each contain 4 GPUs which are directly connected to one another via NVLink and are connected to the host via PCIe. The NVLink interconnect offers higher memory bandwidth for GPU to GPU communication, which combined with twice as many GPUs per node may lead to shorter application run-times than offered by the H100 nodes. If even more GPUs are needed, moving to the Tier 2 systems may be required, with JADE 2 offering up to 8 GPUs per job, and Bede being the only current option for multi-node GPU jobs, with up to 128 GPUs per job.

Upcoming Training Courses

David Wilby
17 April 2023 13:00

Exciting news! We have an upcoming series of training courses, free to researchers (including PhD Students) at the University of Sheffield.

Follow the links below to find out more and register!

More courses will be announced in the future so keep your eyes peeled for announcements, or join the RSE community mailing list to be the first to find out!

Who's to Blame?

Neil Shephard
22 December 2022 12:00

Git blame shows who made changes to which line of code for a given point in its history. This is useful if you are struggling to understand changes to a section of code as you can potentially contact the author and gain some insight.

Upcoming : Git & GitHub through GitKraken - Zero to Hero!

Neil Shephard
20 December 2022 12:00

The RSE Team are pleased to announce three scheduled sessions of the increasingly popular course Git & GitHub through GitKraken - Zero to Hero! These courses will run over two consecutive days in morning sessions from 09:30 to 13:00 on the following days.

What are Git, GitHub and GitKraken?

Git is a system for version controlling your code. Think of it as a lab-book or doctor’s notes that are taken as you progress through your work, recording conditions, saving what has worked and correcting what hasn’t.

GitHub is a website that allows people to work collaboratively on version controlled code.

GitKraken is a client for working with Git and GitHub that includes both a GUI (Graphical User Interface) and a CLI (Command Line Interface).

Who is the course for?

Everyone who writes code! If you write scripts to analyse your data in R, Stata or Matlab you would benefit from using Git to version control your code and GitHub to share your code and make it open. If you write Python, JavaScript or C/C++ code as part of a team in your research group you would benefit from using Git and GitHub to work together.

Getting started with these tools can be overwhelming but by taking this course you will be introduced to the concepts behind them and how to use them effectively to not just version control your own work but work with others on the same code.

The course material is available online if you want to take a peek and the first half using Git and publishing web-pages can be worked through in your own time. The real benefit comes from participating in the collaborative exercises in the second half where you work together on projects making Pull Requests and resolving problems that arise.

If you’ve never used Git, GitHub or GitKraken, or have only just started, then sign up and come and learn more about these powerful tools.

Writing better and more shareable code

David Wilby
20 October 2022 13:00

If you know how to write code, it’s not actually that hard to make it better! You just need to know how.

This quick blog is intended to be fairly non-specific with regard to programming language. My experience is mainly with Python & MATLAB, but I believe the principles should extend to any language. In theory.

Firstly, what do I mean by “better” here? I mean:

  • more readable
  • less repetitive
  • easier to debug
  • less prone to errors
  • simpler for others to use

Let’s get into it.

Demonstrating Importance and Value in Research Software

David Wilby
13 October 2022 10:00

As research software engineers (RSEs) or researchers who develop software & code, at some point we will need to provide evidence that our software actually has some positive effect on the world. This might be for career progression, the REF, just for your own peace of mind or any of a whole host of reasons. Collecting usage data takes many forms but can often come as an afterthought.

This International RSE day (Thursday 13th October 2022) let’s have a look at how we can plan to collect some evidence to demonstrate the value of our work and hopefully mitigate some potential stumbling blocks in our career progression.

When it comes to ‘traditional research outputs’ (i.e. peer-reviewed publications), the dreaded journal impact factor [1] has come to be the de facto standard measure of the quality of a piece of research, however flawed we know it to be. [2] Along with numbers of citations [3], these measures are a big part of what’s used to determine how worthy the author of some research is of: promotion, funding, recognition etc. But with software, we don’t implicitly have the *ahem* “luxury” of impact factors and numbers of citations so using other ways of showing that our work has merit is something of a necessity, but how do we do that?

  1. Originally conceived as a metric to help librarians select journals to subscribe to, the impact factor is a measure of a journal’s ratio of citations to papers published. 

  2. See the Declaration on Research Assessment (DORA) for more on the movement to improve the way researchers and scholarly work are evaluated. 

  3. Flawed for many reasons as well, not least because of the massive disparity in citation rates across disciplines. 

Contact Us

For queries relating to collaborating with the RSE team on projects:

Information and access to JADE II and Bede.

Join our mailing list so as to be notified when we advertise talks and workshops by subscribing to this Google Group.

Queries regarding free research computing support/guidance should be raised via our Code clinic or directed to the University IT helpdesk.