Introduction to Optimisation

Last updated on 2025-03-26

Overview

Questions

  • Why could optimisation of code be harmful?

Objectives

  • Able to explain the cost-benefit analysis of performing code optimisation

Introduction


Now that you’re able to find the most expensive components of your code with profiling, we can think about ways to improve them. However, the best way to do this will depend a lot on your specific code! For example, if your code spends 60 seconds waiting to download data files and then 1 second analysing that data, then optimising your data analysis code won’t make much of a difference. We’ll talk briefly about some of these external bottlenecks at the end. For now, we’ll assume that you’re not waiting for anything else and we’ll look at the performance of your code itself.

In order to optimise code for performance, it is necessary to have an understanding of what a computer is doing to execute it.

A high-level understanding of how your code executes, such as how Python and the most common data-structures and algorithms are implemented, can help you identify suboptimal approaches when programming. If you have learned to write code informally out of necessity, to get something to work, it’s not uncommon to have collected some “unpythonic” habits along the way that may harm your code’s performance.
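As a hypothetical illustration (not from this lesson), one such habit is building up a string with repeated `+=` in a loop, where the idiomatic `str.join` expresses the same intent in a single call:

PYTHON

```python
# Hypothetical example of an "unpythonic" habit: repeated string
# concatenation creates a brand-new string object on every iteration.
words = ["profile", "first", "then", "optimise"]

# Unidiomatic: grows the string one piece at a time
sentence = ""
for word in words:
    sentence += word + " "
sentence = sentence.strip()

# Idiomatic: let str.join build the result in one pass
sentence_joined = " ".join(words)

assert sentence == sentence_joined
```

Both versions produce the same string, but the second states the intent directly and avoids creating an intermediate string on every iteration.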

These are the first steps in code optimisation, and knowledge you can put into practice by making more informed choices as you write your code and after profiling it.

Much of the remaining content is abstract knowledge that transfers to the vast majority of programming languages. This is because the hardware architecture, data structures and algorithms involved are common to many languages, and they hold some of the greatest influence over performance bottlenecks.

Performance vs Maintainability


Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. - Donald Knuth

This classic quote among computer scientists emphasises the importance of considering both performance and maintainability when optimising code and prioritising your optimisations.

While advanced optimisations may boost performance, they often come at the cost of making the code harder to understand and maintain. Even if you’re working alone on private code, your future self should be able to easily understand the implementation. Hence, when optimising, always weigh the potential impact on both performance and maintainability. While this course does not cover most advanced optimisations, you may already be familiar with and using some.

Profiling is a valuable tool for prioritising optimisations. Should effort be expended to optimise a component which occupies 1% of the runtime? Or would that time be better spent optimising the most expensive components?

This doesn’t mean you should ignore performance when initially writing code. Choosing the right algorithms and data structures, as we will discuss in this course, is good practice. However, there’s no need to obsess over micro-optimising every tiny component of your code—focus on the bigger picture.
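As a sketch of what “choosing the right data structure” can mean in practice (an illustration, not part of the lesson), consider membership tests, where a set’s hash-based lookup scales far better than scanning a list:

PYTHON

```python
# Membership tests: a list is scanned element by element (O(n) time),
# while a set uses a hash table (O(1) on average).
items_list = list(range(100_000))
items_set = set(items_list)

# Both give the same answer...
assert 99_999 in items_list
assert 99_999 in items_set
# ...but the set lookup stays fast however large the collection grows.
```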

Performance of Python


If you’ve read about different programming languages, you may have heard that there’s a difference between “interpreted” languages (like Python) and “compiled” languages (like C). You may have heard that Python is slow because it is an interpreted language. To understand where this comes from (and how to get around it), let’s talk a little bit about how Python works.

A diagram illustrating the difference between integers in C and Python. In C, the integer is a raw number in memory. In Python, it additionally contains a header with metadata.

In C, integers (and other basic types) are raw data in memory; it is up to the programmer to keep track of the data type. The compiler can then turn the source code directly into machine code, which allows it to perform low-level optimisations that exploit hardware nuances to achieve fast performance. This, however, comes at the cost of the compiled binary not being portable across platforms.

C

/* C code */
int a = 1;
int b = 2;
int c = a + b;

In Python, everything is a full object. The interpreter uses extra fields in the object’s header to keep track of data types at runtime and to take care of memory management. This adds a lot of flexibility and makes life easier for programmers, but it comes at the cost of some overhead in both time and memory usage.

PYTHON

# Python code
a = 1
b = 2
c = a + b

Callout

Objects store both their raw data (like an integer or string) and some internal information used by the interpreter. We can see this additional storage space with sys.getsizeof(), which reports how many bytes an object takes up:

PYTHON

import sys

sys.getsizeof("")  # 41
sys.getsizeof("a")  # 42
sys.getsizeof("ab")  # 43

sys.getsizeof([])  # 56
sys.getsizeof(["a"])  # 64

sys.getsizeof(1)  # 28

(Note: the exact values vary between Python versions and platforms. For container objects (like lists and dictionaries) or custom classes, the values returned by getsizeof() are implementation-dependent and do not include the memory of the contained elements, so they may not reflect the actual memory usage.)

We effectively gain programmer performance by sacrificing some code performance. Most of the time, computers are “fast enough” so this is the right trade-off, as Donald Knuth said.

However, there are a few cases where code performance really matters. To handle these, Python can integrate with code written in lower-level programming languages (like C, Fortran or Rust) under the hood. Performance-sensitive libraries therefore perform much of their work in such low-level code before returning a convenient Python object back to you. (We’ll discuss NumPy in a later section, but many parts of the Python standard library also use this pattern.)

Therefore, it is often best to tell the interpreter/library at a high level what you want, and let it figure out how to do it.

That way, the interpreter/library is free to do all its work in the low-level code, and adds overhead only once, when it creates and returns a Python object in the end. This usually makes your code more readable, too: When someone else reads your code, they can see exactly what you want to do, without getting overwhelmed by overly detailed step-by-step instructions.
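As a minimal sketch of this idea, the built-in sum (implemented in C) can replace a hand-written accumulation loop. Both compute the same result, but the built-in keeps the whole loop in low-level code and returns a single Python object at the end:

PYTHON

```python
numbers = list(range(1_000_000))

# Step-by-step version: the interpreter executes every iteration,
# creating and discarding Python integer objects as it goes.
total = 0
for n in numbers:
    total += n

# High-level version: one call states *what* we want; the loop itself
# runs inside the C implementation of the built-in.
total_builtin = sum(numbers)

assert total == total_builtin
```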

Ensuring Reproducible Results


When optimising existing code, you’re often making speculative changes, which can lead to subtle mistakes. To ensure that your optimisations aren’t also introducing errors, it’s crucial to have a strategy for checking that the results remain correct.

Testing should already be an integral part of your development process. It helps clarify expected behaviour, ensures new features are working as intended, and protects against unintended regressions in previously working functionality. Always verify your changes through testing to ensure that the optimisations don’t compromise the correctness of your code.

pytest Overview


There are many approaches to testing code. Most Python developers use the testing package pytest; it’s a great place to start if you’re new to testing code.

Here’s a quick example of how a test can be used to check your function’s output against an expected value.

Tests should be created within a project’s testing directory, in files named with the form test_*.py or *_test.py. pytest looks for file names with these patterns when running the test suite.

Within the created test file, any functions named in the form test* are considered tests that will be executed by pytest.

The assert keyword is used to test whether a condition evaluates to True.

PYTHON

# file: test_demonstration.py

# A simple function to be tested, this could instead be an imported package
def squared(x):
    return x**2

# A simple test case
def test_example():
    assert squared(5) == 24

When pytest is called inside a working directory, it recursively finds and executes all the available tests.

SH

>pytest
================================================= test session starts =================================================
platform win32 -- Python 3.10.12, pytest-7.3.1, pluggy-1.3.0
rootdir: C:\demo
plugins: anyio-4.0.0, cov-4.1.0, xdoctest-1.1.2
collected 1 item

test_demonstration.py F                                                                                          [100%]

====================================================== FAILURES =======================================================
____________________________________________________ test_example _____________________________________________________

    def test_example():
>       assert squared(5) == 24
E       assert 25 == 24
E        +  where 25 = squared(5)

test_demonstration.py:9: AssertionError
=============================================== short test summary info ===============================================
FAILED test_demonstration.py::test_example - assert 25 == 24
================================================== 1 failed in 0.07s ==================================================

Whilst pytest is not designed for benchmarking, it does report the total time the test suite took to execute. In some cases this can hint at whether your optimisations have had a significant impact on performance.
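If you want a more targeted measurement than the suite’s total time, the standard-library timeit module times a small snippet by running it many times (a quick sketch, separate from pytest):

PYTHON

```python
import timeit

# Time 1,000 executions of a small snippet and report the total
# elapsed time in seconds.
elapsed = timeit.timeit("sum(range(1000))", number=1_000)
print(f"1000 runs took {elapsed:.4f} s")
```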

This is only the simplest introduction to using pytest; it has advanced features common to other testing frameworks, such as fixtures, mocking and test skipping. pytest’s documentation covers all this and more. You may already have a different testing workflow in place for validating the correctness of your code’s outputs.
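Tying this back to verifying optimisations: one simple pattern (a hypothetical sketch, not from this lesson) is to keep the original implementation around as a reference and assert that the optimised version agrees with it across a range of inputs:

PYTHON

```python
# file: test_optimisation.py (hypothetical example)

def mean_reference(values):
    """Original, straightforward implementation."""
    total = 0
    for v in values:
        total += v
    return total / len(values)

def mean_optimised(values):
    """Candidate optimisation using the built-in sum."""
    return sum(values) / len(values)

def test_mean_matches_reference():
    # The optimised version must agree with the reference on a
    # range of representative inputs.
    for values in [[1, 2, 3], [0.5, 2.5], list(range(1, 100))]:
        assert mean_optimised(values) == mean_reference(values)
```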

Key Points

  • The knowledge necessary to perform high-level optimisations of code is largely transferable between programming languages.
  • When considering optimisation it is important to focus on the potential impact, both to the performance and maintainability of the code.
  • Many high-level optimisations should be considered good-practice.