Effective commit structuring

Clean code reads like well-written prose. Clean commits read like a story.

Phil Dobson
4 min read · Jul 1, 2021

Hands up: we’ve all seen commit logs that look like this, haven’t we? ✋

[Image: a commit log of increasingly frustrated commit messages]

For many (possibly even the majority), committing this way is the norm. But the most experienced engineers commit precisely, using each commit as an opportunity to communicate a small part of a more complex story.

Commits should be as small as possible, containing the logical group of changes required to achieve one single thing. Getting this logical grouping right means the commits are ‘atomic’. The acid test for atomicity is: would I be able to deploy this commit to production as it is right now? Many of these atomic commits, each with tests passing, can then build up to form a valuable enhancement in the form of a pull request.

Just as changes committed to a database are atomic, changes committed to a codebase should be atomic too!
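One way to approximate the acid test above is to replay a branch and run the tests after every commit. The following is a minimal sketch in a throwaway repository; `run_tests.sh`, `app.txt` and the `base` tag are hypothetical stand-ins for a real test suite, codebase and target branch.

```shell
set -eu
# Throwaway repository so the example is self-contained.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Hypothetical "test suite": passes while app.txt exists and is non-empty.
printf '#!/bin/sh\ntest -s app.txt\n' > run_tests.sh
echo "v1" > app.txt
git add . && git commit -q -m "Add app skeleton and test script"
git tag base

echo "foo" >> app.txt && git commit -q -am "Add the new foo feature"
echo "bar" >> app.txt && git commit -q -am "Add the new bar feature"

# Replay every commit after `base`, running the tests after each one.
# The rebase stops at the first commit whose tests fail.
git rebase --exec "sh ./run_tests.sh" base
echo "commits checked: $(git rev-list --count base..HEAD)"
```

In a real project, the exec command would be whatever runs your tests and `base` would be the branch the pull request targets.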

Some common examples

Tackling technical debt along with adding a feature
I’ve added a new feature, and realised that I now have a constant that is used in multiple classes.

  • Commit 1: Add the new foo feature
  • Commit 2: Extract out constant X

I’ve added a new feature, and realised that the name of a class doesn’t make sense any more.

  • Commit 1: Rename class Y to Z
  • Commit 2: Add the new bar feature
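As a rough sketch of the rename example, the working tree below ends up holding both the rename and the feature at once, and the two are then committed separately. The file names (`y.txt`, `z.txt`, `app.txt`) are hypothetical stand-ins for real classes.

```shell
set -eu
# Throwaway repository so the example is self-contained.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "class Y" > y.txt
echo "main" > app.txt
git add . && git commit -q -m "Add initial code"

# The working tree now mixes two logical changes: a rename and a feature.
git mv y.txt z.txt
echo "bar feature" >> app.txt

# Commit 1: only the rename (git mv has already staged it).
git commit -q -m "Rename class Y to Z"

# Commit 2: stage and commit the feature on its own.
git add app.txt
git commit -q -m "Add the new bar feature"

git log --oneline
```

When both changes touch the same file, `git add -p` lets you stage individual hunks interactively instead of whole files.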

Structuring a large feature change to tell a story
Introducing a new abstraction, and extracting out common code

  • Commit 1: Introduce a new abstraction to enable feature qux
  • Commit 2: Refactor foo piece of code to use this common abstraction
  • Commit 3: Remove the old orphaned abstraction

Why is this important?

Structuring commits streamlines communication with other engineers and, if you’re as forgetful as me, with your future self! 🤕

It makes reviewing pull requests significantly easier
When using these small atomic commits it’s almost always easier for someone to review the change commit by commit, because all of the code related to a single change is in one place. Gone is the need to open a tab and cross-reference each of ‘Add feature X’, ‘Add tests for X’, ‘Fix feature X’ and ‘Fix tests for X’ (and then give up, and review all 80 files in one go anyway 😬)!

Being able to review a change commit by commit adds a layer of storytelling to a pull request. It shows the logical thought process taken to make the change, and how the feature was built up. This context is often important when understanding and evaluating a change, leading to fewer bugs in production.

Tip: Writing commit messages in the imperative form makes for an easier read. If a commit message is in the imperative form, it will substitute nicely into the sentence “If applied, this commit will <commit message>” (see the examples above).
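A quick way to try this substitution on existing history is to let `git log` print each subject inside that sentence. This is only a sketch: it uses a throwaway repository with empty commits, and the messages are borrowed from the examples earlier in the post.

```shell
set -eu
# Throwaway repository so the example is self-contained.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Empty commits are enough here; only the messages matter.
git commit -q --allow-empty -m "Rename class Y to Z"
git commit -q --allow-empty -m "Add the new bar feature"

# %s is the commit subject; --reverse lists the oldest commit first.
git log --reverse --format='If applied, this commit will %s'
# Prints:
#   If applied, this commit will Rename class Y to Z
#   If applied, this commit will Add the new bar feature
```

A subject such as ‘Fixed tests’ or ‘Fixing tests’ reads awkwardly in that sentence, which is the cue to reword it.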

Reviewing commit by commit also makes reviewers more efficient: they can spend more time on the gnarly commits and less on the trivial ones. This has a tangible behavioural benefit for the engineer raising the pull request, who no longer needs to avoid sensible things like renaming a class for fear of making the pull request explode. When a change like this sits in its own commit and the reviewer goes commit by commit, the problem disappears because the reviewer can simply skim past it. This ultimately leads to less code rot, more readable code and a higher team velocity. Engineers are no longer scared of doing the right thing because their ‘pull request is getting too big’.

Git history is quick and easy to understand (helping to fix master quickly)
Git history is one of the most important tools in quickly identifying the root cause of an issue in master. By structuring commits so that each is logically encapsulated, there is less need to look across multiple commits to understand which change has broken master. It reduces the MTTR (mean time to repair/resolve), and therefore the impact of bugs in production.
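One tool that benefits from this, though it is not named above, is `git bisect`, which binary-searches history for the first commit that fails a check. The sketch below uses a throwaway repository; the `BUG` marker, `app.txt` and the `known-good` tag are hypothetical stand-ins for a real regression, codebase and last known good release.

```shell
set -eu
# Throwaway repository so the example is self-contained.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "ok" > app.txt
git add . && git commit -q -m "Add app skeleton"
git tag known-good

echo "foo" >> app.txt && git commit -q -am "Add the new foo feature"
echo "BUG" >> app.txt && git commit -q -am "Add the new bar feature"
echo "qux" >> app.txt && git commit -q -am "Add the new qux feature"
echo "baz" >> app.txt && git commit -q -am "Extract out constant X"

# Search between the known good tag and the broken HEAD.
git bisect start HEAD known-good >/dev/null
# The check exits 0 (good) when the marker is absent, 1 (bad) when present.
git bisect run sh -c '! grep -q BUG app.txt' >/dev/null
culprit=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1

echo "first bad commit: $culprit"
```

Because each commit does one thing, the commit that bisect lands on already describes the faulty change without needing to read its neighbours.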

How do I get there?

It’s not easy, and there are two real steps to maturity:

  1. Totally mastering the git command line (or IDE git features!): rebasing to amend, reorder, split or squash commits. This is most important when it has been difficult to structure the change in your own head before making it, especially if lacking in domain knowledge (either in the technical or business sense).
  2. Being able to construct a change with surgical precision. Knowing up front exactly which logical commits you will be making to build the feature, and developing it this way with minimal rebasing. This is the level to aspire to.
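As one illustration of the first step, here is a minimal sketch of folding a late fix back into the commit it belongs to, using a fixup commit and an autosquash rebase. It covers only the squash case, and the repository, file names and `base` tag are hypothetical.

```shell
set -eu
# Throwaway repository so the example is self-contained.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > app.txt
git add . && git commit -q -m "Add app skeleton"
git tag base

echo "foo" > foo.txt && git add foo.txt
git commit -q -m "Add the new foo feature"
foo_commit=$(git rev-parse HEAD)

echo "X = 1" > constants.txt && git add constants.txt
git commit -q -m "Extract out constant X"

# A late fix that logically belongs to the foo commit.
echo "foo, fixed" > foo.txt
git commit -q -a --fixup="$foo_commit"

# Fold the fixup into its target. GIT_SEQUENCE_EDITOR=true accepts the
# generated todo list unchanged, so no editor opens.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash base

git log --oneline base..HEAD
```

Running `git rebase -i base` without the environment variable opens the todo list in an editor, where commits can also be reordered, reworded or marked for editing by hand.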

Some common anti-patterns

Checkpointing
Some engineers push small, non-atomic commits remotely to avoid the potential of losing their work. This can typically be mitigated by a behavioural change, either by starting to push atomic commits instead, or by scoping the size of their change to be smaller before it is merged.

Squash and commit
Some engineers rely on squashing at the time of merging their pull request, to at least maintain a clean commit log in master. Unfortunately this means they don’t reap the rewards of giving their colleagues a nice pull request to review!

Phil Dobson

Software engineer living in the sunny South of England.