Pytest is an excellent framework for writing tests in Python. One of the neat features it includes is the ability to parameterise your tests, which means you can write one test and pass different sets of parameters into it to cover the range of inputs that the function or method is meant to handle.
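As a minimal sketch of what parameterising a test looks like, the example below uses pytest's parametrize marker (note the spelling pytest uses). The add function and the three parameter sets are hypothetical, made up purely for illustration; they are not taken from the post itself.

```python
import pytest


def add(x, y):
    """Hypothetical function under test, used only for illustration."""
    return x + y


# One test function, three parameter sets: pytest runs test_add once per set
# and reports each run separately under the given id.
@pytest.mark.parametrize(
    ("x", "y", "expected"),
    [
        pytest.param(1, 2, 3, id="small integers"),
        pytest.param(-1, 1, 0, id="negative and positive"),
        pytest.param(0.5, 0.25, 0.75, id="floats"),
    ],
)
def test_add(x, y, expected):
    assert add(x, y) == expected
```

Saved in a file such as test_add.py (an assumed name) and run with pytest -v, each parameter set shows up as its own pass or fail, so a failure in one case does not hide the others.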
I’ve written before about Python Packaging and pre-commit which I’m a big fan of. Today I discovered a really useful tool for checking your packaging configuration and pre-commit configuration from the Scientific Python Development Guide.
R is a statistical programming language and one of the most popular languages for data analysis, statistics and plotting in academia and industry. Learning a new language can be daunting, particularly if you have no experience of scripting and are used to Graphical User Interfaces (GUIs) where you point and click to perform your statistical analysis.
Fear not though, there are a lot of resources and very friendly, enthusiastic and helpful R users out there who can help you on your journey learning R. This post details some of them, and I’d welcome additions.
Python packaging is in a constant state of flux. There is the official Python
Packaging User Guide and the Python Packaging
Authority (PyPA) which is probably the best resource to read but things change, and often
quickly. The focus here is on PyPA's Setuptools configured via
pyproject.toml, which works with Python >= 3.7, but you may wish to consider other packages such as
Poetry or PDM, which offer some advantages but come with additional frameworks to learn.
Members of the University of Sheffield have access to a range of GPU resources for carrying out their research, available in local (Tier 3) and affiliated regional (Tier 2) HPC systems.
At the time of writing, the following GPUs are available for members of the university, free at the point of use:
The newly available H100 GPUs in Stanage are the 300W PCIe variant, which means that for some workloads the university’s 500W A100 SXM4 GPUs may offer higher performance (with higher power consumption). For example, multi-GPU workloads which perform a large volume of GPU to GPU communication may be better suited to the A100 nodes than the H100 nodes. The new H100 nodes in Stanage each contain 2 GPUs, which are only connected via PCIe to the host. The existing A100 nodes each contain 4 GPUs, which are directly connected to one another via NVLink and are connected to the host via PCIe. The NVLink interconnect offers higher bandwidth for GPU to GPU communication, which, combined with twice as many GPUs per node, may lead to shorter application run-times than offered by the H100 nodes. If even more GPUs are required, moving to the Tier 2 systems may be necessary, with Jade 2 offering up to 8 GPUs per job, and Bede being the only current option for multi-node GPU jobs, with up to 128 GPUs per job.
Exciting news! We have an upcoming series of training courses, free to researchers (including PhD Students) at the University of Sheffield.
Follow the links below to find out more and register!
More courses will be announced in the future so keep your eyes peeled for announcements, or join the RSE community mailing list to be the first to find out!
Git blame shows who made changes to which line of code for a given point in its history. This is useful if you are struggling to understand changes to a section of code as you can potentially contact the author and gain some insight.
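As a rough sketch of how this information can be pulled out programmatically, the helpers below run git blame --line-porcelain and pair each line of a file with its author. The function names (parse_line_porcelain, blame_file, lines_per_author) are hypothetical and chosen for illustration; only the git blame command and its --line-porcelain option come from Git itself.

```python
import subprocess
from collections import Counter


def parse_line_porcelain(output):
    """Pair each blamed line with its author.

    In --line-porcelain output every source line is preceded by header
    lines (one of which is "author <name>") and is itself prefixed by a tab.
    """
    blamed = []
    author = None
    for line in output.split("\n"):
        if line.startswith("author "):
            author = line[len("author "):]
        elif line.startswith("\t"):
            blamed.append((author, line[1:]))
    return blamed


def blame_file(path, rev="HEAD"):
    """Blame `path` as of revision `rev`; must be run inside a Git repository."""
    result = subprocess.run(
        ["git", "blame", "--line-porcelain", rev, "--", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_line_porcelain(result.stdout)


def lines_per_author(blamed):
    """Count how many lines are attributed to each author."""
    return Counter(author for author, _ in blamed)
```

Inside a repository, lines_per_author(blame_file("README.md")) would give a per-author line count for that file (README.md is an assumed path), and passing a different rev shows the attribution at an earlier point in the history.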
The RSE Team are pleased to announce three scheduled sessions of the increasingly popular course Git & GitHub through GitKraken - Zero to Hero! Each session runs over two consecutive mornings, from 09:30 to 13:00, on the following days.
Git is a system for version controlling your code. Think of it as a lab-book or doctor’s notes that are taken as you progress through your work, recording conditions, saving what has worked and correcting what hasn’t.
GitHub is a website that allows people to work collaboratively on version controlled code.
GitKraken is a client for working with Git and GitHub that includes both a GUI (Graphical User Interface) and a CLI (Command Line Interface).
Getting started with these tools can be overwhelming, but by taking this course you will be introduced to the concepts behind them and how to use them effectively, not just to version control your own work but to work with others on the same code.
The course material is available online if you want to take a peek, and the first half, using Git and publishing web-pages, can be worked through in your own time. The real benefit comes from participating in the collaborative exercises in the second half, where you work together on projects, making Pull Requests and resolving problems that arise.
If you’ve never used Git, GitHub or GitKraken or have only just started then sign-up and come and learn more about these powerful tools.
If you know how to write code, it’s not actually that hard to make it better! You just need to know how.
This quick blog is intended to be fairly non-specific with regard to programming language. My experience is mainly with Python & MATLAB, but I believe the principles should extend to any language. In theory.
Firstly, what do I mean by “better” here? I mean:
Let’s get into it.
As research software engineers (RSEs) or researchers who develop software & code, at some point we will need to provide evidence that our software actually has some positive effect on the world. This might be for career progression, the REF, your own peace of mind or any of a whole host of other reasons. Collecting usage data takes many forms but often comes as an afterthought.
This International RSE Day (Thursday 13th October 2022), let’s have a look at how we can plan to collect some evidence to demonstrate the value of our work and hopefully mitigate some potential stumbling blocks in our career progression.
When it comes to ‘traditional research outputs’ (i.e. peer-reviewed publications), the dreaded journal impact factor [1] has come to be the de facto standard measure of the quality of a piece of research, however flawed we know it to be. [2] Along with numbers of citations [3], these measures are a big part of what’s used to determine how worthy the author of some research is of promotion, funding, recognition etc. With software, we don’t implicitly have the *ahem* “luxury” of impact factors and numbers of citations, so finding other ways of showing that our work has merit is something of a necessity. But how do we do that?
[1] Originally conceived as a metric to help librarians select journals to subscribe to, the impact factor is a measure of a journal’s ratio of citations to papers published. https://en.wikipedia.org/wiki/Impact_factor
[2] Flawed for many reasons as well, not least because of the massive disparity in citation rates across disciplines.
For queries relating to collaborating with the RSE team on projects: firstname.lastname@example.org
Join our mailing list to be notified when we advertise talks and workshops by subscribing to this Google Group.