Pre-commit is a powerful tool for executing a range of hooks prior to making commits to your Git history. This is useful because it means you can automatically run a range of linting tools on your code across an array of languages to ensure your code is up-to-scratch before you make the commit.
Photo by Neil Shephard.
For those unfamiliar with version control and Git in particular this will likely all sound alien. If you are new to the world of version control and Git I can highly recommend the Git & Github through GitKraken Client - From Zero to Hero! course offered by the Research Software Engineering team at the University of Sheffield and developed by alumna Anna Krystalli.
In computing a “hook” refers to something that is run prior to or in response to a requested action. In the context of the current discussion we are talking about hooks that relate to actions undertaken in Git version control and specifically actions that are run before a “commit” is made.
When you have initialised a directory to be under Git version control the settings and configuration are stored in the
.git/ sub-directory. There is the
.git/config file for the repository's configuration but also the
.git/hooks/ directory that is populated with a host of
*.sample files with various different names that give you an in-road into
what different hooks you might want to run. It's worth spending a little time reading through these if you haven’t done
so yet as they provide useful examples of how various hooks work.
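To get a quick overview of the sample hooks before reading them, you could list them with a short script. The following is a minimal sketch, assuming the repository keeps its hooks in the default .git/hooks/ location; the function name list_sample_hooks is made up for this illustration.

```python
from pathlib import Path


def list_sample_hooks(repo_root="."):
    """Return the names of the sample hooks found in .git/hooks/.

    Assumes the default hooks location; a repository that sets
    core.hooksPath to somewhere else would need a different path.
    """
    hooks_dir = Path(repo_root) / ".git" / "hooks"
    if not hooks_dir.is_dir():
        return []
    # Sample hooks are named e.g. pre-commit.sample; drop the suffix.
    return sorted(p.stem for p in hooks_dir.glob("*.sample"))


if __name__ == "__main__":
    for name in list_sample_hooks():
        print(name)
```

Git ignores the samples until the .sample suffix is removed, so listing or reading them is safe.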
Typically when writing code you should lint your code to ensure it conforms to agreed style guides and remove any “code smells” that may be lingering (code that violates design principles). It won’t guarantee that your code is perfect but it's a good starting point to improving it. People who write a lot of code have good habits of doing these checks manually prior to making commits. Experienced coders will have configured their Integrated Development Environment (IDE) to apply many such “hooks” on saving a file they have been working on.
At regular points in your workflow you save your work and check it into Git by making a commit and that is where
pre-commit comes into play because it will run all the hooks it has been configured to run against the files
you are including in your commit. If any of the hooks fail then your commit is not made. In some cases
pre-commit will automatically correct the errors (e.g. removing trailing white-space; applying
black formatting if configured) but in others you have to correct them yourself before a
commit can be successfully made.
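The pass, fail and auto-correct behaviour can be pictured with a small stand-in for a fixing hook. This is only a sketch and not the code of the real trailing-whitespace hook: it assumes files use LF line endings, and the function names are invented here. It rewrites files in place and exits non-zero when it changed anything, which is what blocks the commit.

```python
import sys


def strip_trailing_whitespace(text):
    """Remove trailing spaces and tabs from every line (assumes LF endings)."""
    return "\n".join(line.rstrip(" \t") for line in text.split("\n"))


def main(paths):
    """Fix each file in place; return 1 if anything changed, else 0."""
    exit_code = 0
    for path in paths:
        # newline="" stops Python translating line endings on read/write.
        with open(path, encoding="utf-8", newline="") as handle:
            original = handle.read()
        fixed = strip_trailing_whitespace(original)
        if fixed != original:
            with open(path, "w", encoding="utf-8", newline="") as handle:
                handle.write(fixed)
            print(f"Fixing {path}")
            exit_code = 1  # non-zero exit: the commit is blocked
    return exit_code


if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
```

Running it a second time on the same files returns 0, which mirrors the experience of a commit succeeding once the automatic fixes have been staged.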
Initially this can be jarring, but it saves you, and more importantly those who you are asking to review your code, time and effort. Your code meets the required style and is a little bit cleaner before being sent out for review. Long term, linting your code is beneficial (see Linting - What is all the fluff about?).
Pre-commit is written in Python and so you will need Python installed on your system in order to use it. Aside from that there is little else extra that is required to be manually installed as pre-commit installs virtual environments specific for each enabled hook.
Most systems provide
pre-commit in their package management system but typically you should install it
within your virtual environment or under your user account.
pip install pre-commit
conda install -c conda-forge pre-commit
If you are working on a Python project then you should include
pre-commit as a requirement, either in
requirements-dev.txt or under the
dev section of
[options.extras_require] in your
setup.cfg as shown below.
[options.extras_require]
dev =
  pre-commit
  pytest
  pytest-cov
Configuration of pre-commit is via a file in the root of your Git version controlled directory called
.pre-commit-config.yaml. This file should be included in your Git repository. You can create a blank file or
pre-commit can generate a sample configuration for you.
# Empty configuration
touch .pre-commit-config.yaml
# Auto-generate basic configuration (sample-config prints to stdout, so redirect it)
pre-commit sample-config > .pre-commit-config.yaml
git add .pre-commit-config.yaml
Each hook is associated with a repository (
repo) and a version (
rev) within it. Many are available from the pre-commit-hooks repository at
https://github.com/pre-commit/pre-commit-hooks. The default set of
pre-commit hooks might look like the following.
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.3.0 # Use the ref you want to point at
    hooks:
      - id: trailing-whitespace
        types: [file, text]
      - id: check-docstring-first
      - id: check-case-conflict
      - id: end-of-file-fixer
        types: [python]
      - id: requirements-txt-fixer
      - id: mixed-line-ending
        types: [python]
        args: [--fix=no]
      - id: debug-statements
      - id: fix-byte-order-marker
      - id: check-yaml
Some hooks are available from dedicated repositories, for example the following runs Black, Flake8 and Pylint on your code and should follow under the above (with the same level of indenting to be valid YAML).
  - repo: https://github.com/psf/black
    rev: 22.6.0
    hooks:
      - id: black
        types: [python]
  - repo: https://gitlab.com/pycqa/flake8.git
    rev: 3.9.2
    hooks:
      - id: flake8
        additional_dependencies: [flake8-print]
        types: [python]
  - repo: https://github.com/pycqa/pylint
    rev: v2.15.3
    hooks:
      - id: pylint
An extensive list of supported hooks is available. It lists the repository from which the hook is derived along with its name.
You can also define new hooks and configure them under
- repo: local.
  - repo: local
    hooks:
      - id: <id>
        name: <descriptive name>
        language: python
        entry:
        types: [python]
For some examples of locally defined hooks see the Pandas .pre-commit-config.yaml.
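To make the idea of a local hook more concrete, here is a sketch of the kind of script a local hook's entry could point at, for example python scripts/check_fixme.py, where that path is hypothetical. The rule it enforces (no FIXME markers in committed files) and all of the names in it are assumptions chosen for illustration. It relies on pre-commit passing the staged file names as command-line arguments.

```python
import sys

MARKER = "FIXME"  # hypothetical marker this project chooses to forbid


def find_markers(path, marker=MARKER):
    """Return (line_number, line) pairs for lines containing the marker."""
    hits = []
    with open(path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if marker in line:
                hits.append((number, line.rstrip("\n")))
    return hits


def main(paths):
    """Report every marker found; return 1 if there were any, else 0."""
    failed = False
    for path in paths:
        for number, line in find_markers(path):
            print(f"{path}:{number}: {line.strip()}")
            failed = True
    return 1 if failed else 0


if __name__ == "__main__":
    # pre-commit appends the staged file names to the hook's entry command.
    sys.exit(main(sys.argv[1:]))
```

Unlike a fixing hook this one only reports, so a failure has to be resolved by hand before the commit can go through.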
Before
pre-commit will run you need to install it within your repository. This puts the file
.git/hooks/pre-commit in place, which runs the hooks you have configured. To install this you should have
.pre-commit-config.yaml in place and then run the following.
pre-commit install
Once installed and configured there really isn’t much more to say about using
pre-commit: just make commits, and before
you can make a successful commit
pre-commit must run with all the hooks you have configured passing. By default
pre-commit only runs on files that are staged and ready to be committed. If you have unstaged changes these will be
stashed prior to running the
pre-commit hook and restored afterwards. Should you wish to run the hooks manually without
making a commit then, after activating a virtual environment if you are using one, simply use the pre-commit run command.
If any of the configured hooks fail then the commit will not be made. Some hooks such as black may reformat files in place; you can then stage those changes and commit again, and the hook should pass. It's important to pay close attention to the output.
If you want to run a specific hook you simply add the id of the hook after run.
pre-commit run <id>
Or if you want to force running against all files in the repository (except untracked ones) you can do so.
pre-commit run --all-files # Across all files/hooks
And these two options can be combined to run a specific hook against all files.
pre-commit run <id> --all-files
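If you want to call these commands from a script (a task runner, say), the three variants above can be assembled from two optional settings. The helper names below are invented for this sketch; only the pre-commit run command, the hook id argument and the --all-files flag come from the text above.

```python
import subprocess


def build_run_command(hook_id=None, all_files=False):
    """Assemble the argument list for a manual pre-commit run."""
    command = ["pre-commit", "run"]
    if hook_id:
        command.append(hook_id)  # restrict the run to a single hook
    if all_files:
        command.append("--all-files")  # not just the staged files
    return command


def run_hooks(hook_id=None, all_files=False):
    """Run pre-commit and return its exit code (0 means every hook passed)."""
    completed = subprocess.run(build_run_command(hook_id, all_files), check=False)
    return completed.returncode


if __name__ == "__main__":
    # Example: run only the black hook against every file.
    print(" ".join(build_run_command("black", all_files=True)))
```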
You may find that you wish to switch branches to work on another feature or fix a bug but that your current work doesn’t
pass pre-commit and you don’t wish to sort that out immediately. The solution to this is to use
git stash to
temporarily save your current uncommitted work and restore the working directory and index to their previous state. You
are then free to switch branches and work on another feature or fix a bug, commit and push those changes and then switch
back.
Imagine you are working on branch
a but are asked to fix a bug on branch
b. You go to commit your work but find that
a does not pass
pre-commit but you wish to work on
b anyway. Starting on branch
a you stash your changes, switch
branches, make and commit your changes to branch
b then switch back to
a and unstash your work there.
git stash
git checkout b
... # Work on branch b
git add <changed_files_on_branch_b>
git commit -m "Fixing bug on branch b"
git push
git checkout a
git stash apply
You can update hooks locally by running
pre-commit autoupdate. This will update your
.pre-commit-config.yaml to point at the latest version of each repository you have configured, and these will run both locally and, if you use CI/CD as described
below, there too. However this will not update any packages that are part of a
- repo: local entry that you may have implemented,
and it is your responsibility to handle these.
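To see what an update changed, and which entries it leaves alone, you could compare the pinned revisions before and after. The sketch below is a deliberately simple line-based reader rather than a real YAML parser: it assumes each repo and rev sits on its own line as in the examples earlier, and the function name is invented here. Entries without a rev, such as local ones, are reported as None.

```python
import re

REPO_RE = re.compile(r"^\s*-\s*repo:\s*(\S+)")
REV_RE = re.compile(r"^\s*rev:\s*(\S+)")


def pinned_revisions(config_text):
    """Return {repo: rev} for each repo entry; repos without a rev map to None."""
    pins = {}
    current = None
    for line in config_text.splitlines():
        repo_match = REPO_RE.match(line)
        if repo_match:
            current = repo_match.group(1)
            pins[current] = None
            continue
        rev_match = REV_RE.match(line)
        if rev_match and current is not None:
            # Strip optional quotes around the revision.
            pins[current] = rev_match.group(1).strip("'\"")
    return pins


if __name__ == "__main__":
    with open(".pre-commit-config.yaml", encoding="utf-8") as handle:
        for repo, rev in pinned_revisions(handle.read()).items():
            print(f"{repo}: {rev or 'no rev (update by hand)'}")
```

For anything beyond this simple layout a proper YAML library would be the more robust choice.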
Ideally contributors will have set up their system to work with pre-commit and be running such checks prior to making pushes. It is however useful to enable running pre-commit as part of your Continuous Integration/Continuous Delivery (CI/CD) pipeline. This can be done with both GitLab and GitHub, and similar methods are available for many other continuous integration systems.
GitHub Actions workflows reside in the
.github/workflows/ directory of your project. A simple pre-commit action is available on
the Marketplace at pre-commit/action. Copy this template to
.github/workflows/pre-commit.yml and include it in your Git repository.
git add .github/workflows/pre-commit.yml
git commit -m "Adding pre-commit GitHub Action" && git push
If you use GitLab the following article describes how to configure a CI job to run as part of your repository.