Introduction to Profiling


  • Profiling is a relatively quick process for analysing where time is spent, and where bottlenecks occur, during a program’s execution.
  • Code should be profiled when it is ready for deployment if it will run for more than a few minutes during its lifetime.
  • There are several types of profiler, each with a slightly different purpose.
    • function-level: cProfile (visualised with snakeviz)
    • line-level: line_profiler
    • timeline: viztracer
    • hardware-metric
  • The profiled test-case should be representative: large enough to amplify any bottlenecks, yet small enough to execute to completion quickly.

Function Level Profiling


  • A Python program can be profiled at function level with cProfile via python -m cProfile -o <output file> <script name> <arguments>.
  • The output file from cProfile can be visualised with snakeviz via python -m snakeviz <output file>.
  • Function-level profiling output displays the nested call hierarchy, listing for each function both the cumulative time and the total time excluding sub-functions.
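
As a complement to the command-line route above, cProfile can also be driven from inside a script. The following is a minimal sketch rather than a recommended workflow; slow_square_sum and main are hypothetical stand-ins for real code.

```python
import cProfile
import io
import pstats


def slow_square_sum(n):
    """Hypothetical hot spot: sums squares in a pure-Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def main():
    """Hypothetical entry point standing in for a real script."""
    return [slow_square_sum(20_000) for _ in range(20)]


# Collect function-level timings for a single call to main().
profiler = cProfile.Profile()
profiler.enable()
results = main()
profiler.disable()

# Report the entries with the largest cumulative time (at most five).
# "tottime" excludes time spent in sub-functions; "cumtime" includes it.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats(pstats.SortKey.CUMULATIVE).print_stats(5)
report = stream.getvalue()
print(report)
```

If a file is wanted for visualisation, profiler.dump_stats(<output file>) writes the collected statistics in the same format that the -o option produces.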

Break


Line Level Profiling


  • Specific methods can be line-level profiled by decorating them with @profile, imported from line_profiler.
  • kernprof executes line_profiler via python -m kernprof -lvr <script name> <arguments>.
  • Code in global scope must be wrapped in a method if it is to be profiled with line_profiler.
  • The output from line_profiler lists the absolute and relative time spent per line for each targeted function.
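
The points above might look as follows in a script. This is a sketch under assumptions: count_even_squares and main are hypothetical names, and the fallback decorator is an addition for illustration so that the script still runs where line_profiler is not installed.

```python
try:
    # Provided by recent line_profiler releases; it does nothing unless
    # profiling is switched on (for example by running under kernprof).
    from line_profiler import profile
except ImportError:
    # Assumption for this sketch: fall back to a no-op decorator.
    def profile(func):
        return func


@profile
def count_even_squares(n):
    """Hypothetical target function; each line below is timed separately."""
    squares = [i * i for i in range(n)]
    evens = [s for s in squares if s % 2 == 0]
    return len(evens)


def main():
    # Code that would otherwise sit in global scope is wrapped in a
    # function so that it can be decorated and profiled if required.
    return count_even_squares(1_000)


if __name__ == "__main__":
    print(main())
```

Run normally, the script behaves as if the decorator were absent; run through kernprof as described above, the decorated function is reported line by line.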

Profiling Conclusion


What profiling is:

  • The collection and analysis of metrics relating to the performance of a program during execution.

Why programmers can benefit from profiling:

  • It narrows down the costly areas of code, allowing optimisation to be prioritised or judged unnecessary.

When to Profile:

  • Profiling should be performed on functional code, either when concerned about performance or prior to release/deployment.

What to Profile:

  • The collection of profiling metrics will often slow the execution of code; the test-case should therefore be narrow whilst remaining representative of a realistic run.

How to function-level profile:

  • Execute cProfile via python -m cProfile -o <output file> <script name> <arguments>
  • Execute snakeviz via python -m snakeviz <output file>

How to line-level profile:

  • Import profile from line_profiler
  • Decorate targeted methods with @profile
  • Execute line_profiler via python -m kernprof -lvr <script name> <arguments>

Introduction to Optimisation


  • The knowledge necessary to perform high-level optimisations of code is largely transferable between programming languages.
  • When considering optimisation it is important to focus on the potential impact, both on the performance and on the maintainability of the code.
  • Many high-level optimisations should be considered good-practice.

Data Structures & Algorithms


  • List comprehension should be preferred when constructing lists.
  • Where appropriate, tuples should be preferred over Python lists.
  • Dictionaries and sets are appropriate for storing collections of unique items, with no intrinsic order, that will be accessed at random.
  • When used appropriately, dictionaries and sets are significantly faster than lists.
  • If searching a list or array is required, it should be sorted and searched using bisect_left() (binary search).
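
The structures named above can be illustrated in a few lines. This is a minimal sketch; every name and value in it (squares, allowed, stock, contains_sorted) is hypothetical and chosen only for illustration.

```python
from bisect import bisect_left

# List comprehension: build the list in a single expression.
squares = [i * i for i in range(10)]

# Set: unique items with fast membership tests, found by hashing.
allowed = {"red", "green", "blue"}
has_green = "green" in allowed

# Dictionary: unique keys mapped to values, also looked up by hashing.
stock = {"apples": 3, "pears": 7}
pear_count = stock["pears"]


def contains_sorted(sorted_items, target):
    """Binary search a sorted list; return True if target is present."""
    index = bisect_left(sorted_items, target)
    return index < len(sorted_items) and sorted_items[index] == target


# squares is already in ascending order, so it can be searched directly.
found = contains_sorted(squares, 49)
missing = contains_sorted(squares, 50)
```

bisect_left() only returns a correct position when the input is already sorted, so an unsorted list must be sorted once before repeated searches.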

Break


Understanding Python (NumPy/Pandas)


  • Python is an interpreted language, which adds overhead at runtime to the execution of Python code. Many core Python and NumPy functions are implemented in faster C/C++, free from this overhead.
  • NumPy can take advantage of vectorisation to process arrays, which can greatly improve performance.
  • Pandas’ data tables store columns as arrays, so operations applied to columns can take advantage of NumPy’s vectorisation.
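
To make the contrast concrete, the sketch below computes the same result twice: once with an explicit Python loop, and once as a single vectorised NumPy expression. It assumes NumPy is installed, and the array contents are arbitrary illustrative values.

```python
import numpy as np

values = np.arange(1_000, dtype=np.float64)

# Pure-Python loop: every iteration is executed by the interpreter.
loop_result = np.empty_like(values)
for i in range(len(values)):
    loop_result[i] = values[i] * 2.0 + 1.0

# Vectorised: one expression, with the loop running inside NumPy's
# compiled code rather than in the interpreter.
vector_result = values * 2.0 + 1.0

# Both approaches produce identical arrays.
same = np.array_equal(loop_result, vector_result)
```

The same style of whole-column expression applies to a Pandas column. How much faster the vectorised form runs depends on the data and the machine, so it is worth confirming with a profiler.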

Keep Python & Packages up to Date


  • Where feasible, the latest version of Python and packages should be used as they can include significant free improvements to the performance of your code.
  • There is a risk that updating Python or packages will not be possible due to version incompatibilities, or will require breaking changes to your code.
  • Changes to packages may affect the results output by your code; ensure you have a method of validation ready before attempting upgrades.
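
One possible form of validation, offered only as a sketch, is to compare results saved before an upgrade against results recomputed after it, within a tolerance. The function results_match, the sample values, and the tolerance are all assumptions to be adapted to the code in question.

```python
import math


def results_match(reference, candidate, rel_tol=1e-9, abs_tol=0.0):
    """Return True if two equal-length sequences of numbers agree within tolerance."""
    if len(reference) != len(candidate):
        return False
    return all(
        math.isclose(r, c, rel_tol=rel_tol, abs_tol=abs_tol)
        for r, c in zip(reference, candidate)
    )


# Hypothetical outputs: saved before the upgrade, and recomputed after it.
before_upgrade = [0.5, 1.25, 3.0]
after_upgrade = [0.5, 1.25, 3.0 + 1e-12]

ok = results_match(before_upgrade, after_upgrade)
```

A tolerance is used rather than exact equality because a new package version may legitimately change floating-point rounding without changing the result in any meaningful way.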

Understanding Memory


  • Sequential accesses to memory (RAM or disk) will be faster than random or scattered accesses.
    • This is not always natively possible in Python without the use of packages such as NumPy and Pandas.
  • One large file is preferable to many small files.
  • Memory allocation is not free; avoiding destroying and recreating objects can improve performance.
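
A simple instance of the last point is an object that is rebuilt on every pass of a loop when it could be built once. The sketch below shows both forms; the function names and data are hypothetical.

```python
def count_hits_rebuilt(queries, allowed_items):
    """Rebuilds the lookup set on every iteration (wasteful)."""
    hits = 0
    for query in queries:
        allowed = set(allowed_items)   # allocated and discarded each pass
        if query in allowed:
            hits += 1
    return hits


def count_hits_reused(queries, allowed_items):
    """Builds the lookup set once and reuses it for every query."""
    allowed = set(allowed_items)       # allocated once
    hits = 0
    for query in queries:
        if query in allowed:
            hits += 1
    return hits


queries = list(range(0, 200, 3))        # multiples of 3 below 200
allowed_items = list(range(0, 200, 2))  # even numbers below 200
```

Both functions return the same count; the second avoids repeated allocation and construction. Whether the saving matters in a given program is best checked by profiling.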

Optimisation Conclusion


  • Data Structures & Algorithms
    • List comprehension should be preferred when constructing lists.
    • Where appropriate, tuples and generator functions should be preferred over Python lists.
    • Dictionaries and sets are appropriate for storing collections of unique items, with no intrinsic order, that will be accessed at random.
    • When used appropriately, dictionaries and sets are significantly faster than lists.
    • If searching a list or array is required, it should be sorted and searched using bisect_left() (binary search).
  • Minimise Python Written
    • Python is an interpreted language, which adds overhead at runtime to the execution of Python code. Many core Python and NumPy functions are implemented in faster C/C++, free from this overhead.
    • NumPy can take advantage of vectorisation to process arrays, which can greatly improve performance.
    • Pandas’ data tables store columns as arrays, so operations applied to columns can take advantage of NumPy’s vectorisation.
  • Newer is Often Faster
    • Where feasible, the latest version of Python and packages should be used as they can include significant free improvements to the performance of your code.
    • There is a risk that updating Python or packages will not be possible due to version incompatibilities, or will require breaking changes to your code.
    • Changes to packages may affect the results output by your code; ensure you have a method of validation ready before attempting upgrades.
  • How the Computer Hardware Affects Performance
    • Sequential accesses to memory (RAM or disk) will be faster than random or scattered accesses.
      • This is not always natively possible in Python without the use of packages such as NumPy and Pandas.
    • One large file is preferable to many small files.
    • Memory allocation is not free; avoiding destroying and recreating objects can improve performance.