Summary and Schedule
Welcome to Performance Profiling & Optimisation (Python) Training!
The training curriculum for this course is designed for researchers who write Python and lack formal computer science training. The curriculum covers how to assess where time is being spent during execution of a Python program. It also provides a high-level understanding of how code executes, and how this maps to the limiting factors of performance and to good practice.
If you are now comfortable using Python, this course may be of interest to supplement and advance your programming knowledge. This course is particularly relevant if you are writing research code and desire greater confidence that your code is both performant and suitable for publication.
This is an all-day course; however, it normally finishes by early afternoon.
If you would like to register to take the course, check the registration information.
Learning Objectives
After attending this training, participants will be able to:
- identify the most expensive functions and lines of code using cProfile and line_profiler.
- evaluate code to determine the limiting factors of its performance.
- recognise and implement optimisations for common limiting factors of performance.
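As a taste of the first objective, the sketch below profiles a single function call with cProfile from the standard library. It is a minimal illustration, not course material: the workload `slow_sum_of_squares` and the helper `profile_call` are hypothetical names invented for this example, and line_profiler is left out because it is a third-party package.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Hypothetical workload: sum the squares of 0..n-1 with a plain loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func(*args) under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(*args)
    profiler.disable()

    # Write the statistics into a string rather than straight to stdout,
    # sorted so the functions with the largest cumulative time come first.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


result, report = profile_call(slow_sum_of_squares, 100_000)
print(report)
```

The printed report lists each recorded function with its call count and timings, which is the kind of output used to spot the most expensive functions.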
Prerequisites
Before joining Performance Profiling & Optimisation (Python) Training, participants should be able to:
- implement basic algorithms in Python.
- follow the control flow of Python code, and dry run the execution in their head or on paper.
See the Research Computing Training Hub for other courses to help with learning these skills.
Setup Instructions | Download files required for the lesson
Duration: 00h 00m | 1. Introduction to Profiling | Why should you profile your code? How should you choose which type of profiler to use? Which test case should be profiled?
Duration: 00h 25m | 2. Function Level Profiling | When is function level profiling appropriate? How can cProfile and snakeviz be used to profile a Python program? How are the outputs from function level profiling interpreted?
Duration: 01h 05m | 3. Break
Duration: 01h 20m | 4. Line Level Profiling | When is line level profiling appropriate? What adjustments are required to Python code to profile with line_profiler? How can kernprof be used to profile a Python program?
Duration: 02h 10m | 5. Profiling Conclusion | What has been learnt about profiling?
Duration: 02h 15m | 6. Introduction to Optimisation | Why could optimisation of code be harmful?
Duration: 02h 25m | 7. Data Structures & Algorithms | What’s the most efficient way to construct a list? When should tuples be used? When are sets appropriate? What is the best way to search a list?
Duration: 03h 00m | 8. Break
Duration: 04h 00m | 9. Understanding Python (NumPy/Pandas) | Why are Python loops slow? Why is NumPy often faster than raw Python? How can processing rows of a Pandas data table be made faster?
Duration: 04h 30m | 10. Keep Python & Packages up to Date | Why would a newer version of Python or a package be faster? Are there any risks to updating Python and packages? How can reproducibility be ensured through package upgrades?
Duration: 04h 40m | 11. Understanding Memory | How does a CPU look for a variable it requires? What impact do cache lines have on memory accesses? Why is it faster to read/write a single 100 MB file than 100 1 MB files?
Duration: 05h 10m | 12. Optimisation Conclusion | What has been learnt about writing performant Python?
Duration: 05h 15m | Finish
The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.
Software Setup
This course uses Python and was developed using Python 3.11, so a Python 3.11 or newer environment is recommended.
You may want to create a new Python virtual environment for the course. This can be done with your preferred Python environment manager (e.g. conda, pipenv), and the required packages can all be installed via pip.
The non-core Python packages required by the course are pytest, snakeviz, line_profiler, numpy, pandas and matplotlib, which can be installed via pip.
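One way to confirm an environment is ready is a short check script like the sketch below. It is not part of the course materials: the helper names `find_missing` and `python_is_recent_enough` are hypothetical, and it assumes each package's import name matches the name listed above, which holds for these six packages.

```python
import importlib.util
import sys

# Package names taken from the list above; the import names are assumed to match.
REQUIRED_PACKAGES = [
    "pytest",
    "snakeviz",
    "line_profiler",
    "numpy",
    "pandas",
    "matplotlib",
]
# The course was developed using Python 3.11.
MINIMUM_PYTHON = (3, 11)


def find_missing(packages):
    """Return the names in `packages` that cannot be located for import."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


def python_is_recent_enough(minimum=MINIMUM_PYTHON):
    """Return True if the running interpreter is at least `minimum`."""
    return sys.version_info[:2] >= minimum


missing = find_missing(REQUIRED_PACKAGES)

if not python_is_recent_enough():
    print("Note: Python 3.11 or newer is recommended for this course.")

if missing:
    # Print a ready-made install command for whatever is absent.
    print("Missing packages; install with: pip install " + " ".join(missing))
else:
    print("All required packages were found.")
```

Running the script inside the activated virtual environment reports any missing packages together with a pip command to install them.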
To complete some of the exercises you will need to use a text editor or Python IDE, so make sure you have your favourite available.