Scaling Code Review
At Dimagi, we write a lot of code.
In the last month on just our CommCare HQ repository, 25 people pushed 1,617 commits to master (and 2,270 commits to all branches) affecting 1,392 files and with over 175,000 additions and deletions.
What that also means is that we review a lot of code. Every line of code that makes it into a Dimagi repository is peer-reviewed by at least one other person. Github makes this so easy that there is really no excuse for a project of a reasonable size to not be doing this for reasons that others have articulated before.
In the early days of code review, things were pretty straightforward. The team knew each other well, everyone knew the codebase well, and we all worked out of the same physical office. In those days it was easy to keep up with what everyone was working on, handle code-review on an ad-hoc basis, and sort out any hairier discussions face to face.
As our team grew, however, we started to face issues with the informal style of code review. The same people kept on doing the bulk of the code review and it started taking up a significant amount of their days, eating up the time they could be spending writing new code. Additionally, when the main reviewers got busy or went on vacation, code would pile up and sit waiting to be reviewed for days, slowly getting stale and becoming increasingly likely to have merge conflicts.
New problems also sprang up as we brought in remote developers, some of whom had never even met the rest of the team. All of the sudden large architectural discussions were happening asynchronously and it could take ages to resolve them. Communication styles started to become more important as the team grew less familiar with each other’s personalities.
If this sounds anything like your team, then it is probably time to move on to the next phase of code review, and redesign the system to match your growing team.
At Dimagi, we decided to roll out a system that achieved the following goals:
- Distribute the time spent on code review more evenly throughout the team
- Ensure that all code can be reviewed in a timely manner
- Get everyone exposed to more areas of the codebase
- Promote more interaction among different team members
- Minimize the amount of churn, disagreement, and conflict during the code review process
What we ended up doing was extremely simple, and involved the introduction of two small concepts: (1) code buddies and (2) a documented review philosophy.
Code buddies are exactly what they sound like. Everyone on the team is assigned a buddy who is their point-person for code review. By default, you ping your buddy on every pull request you make, and it is your job to make sure that you review your buddy’s code in a reasonable time frame (e.g. ~daily). Your buddy is randomly assigned, and rotates every week. One of our developers wrote a little script to automatically assign new buddies every week and email them out for us.
Buddies were a great way to achieve almost all of our goals. Now instead of the same three people reviewing everyone’s code, everyone on the team reviewed a small slice of everyone’s code. Additionally, it was now obvious whose job it was to review a given pull request and reasonable to expect that that review would happen quickly. It also achieved the secondary goals of getting people more exposure to different parts of the code base and getting everyone more direct collaboration with everyone else on the team.
However, in order for the buddy system to work, we needed to make sure everyone was on the same page about how code review should work. This is where the review philosophy came in.
The code review philosophy is the complement to the buddy system. The goal is to ensure that everyone on the team approaches code contribution and code review with the same perspective. It is not a checklist of things to look for (although it does reference one), but rather a set of guidelines and principles to follow both when contributing and reviewing code. It helps developers understand both how best to structure and submit code for review, and how best to review it — focusing on the human side of the interaction.
You can read our full review philosophy on GitHub, but the main purpose is to make sure both authors and reviewers understand how best to engage in the process. It includes tidbits like how to mark up your code and pull request to make it easier to review, what should and should not constitute a blocking issue, and how review practices should change when working across timezones. We don’t use it too formally, but it’s been a helpful way to have the team aligned on how code review should work, and to resolve any differences of opinions.
You might be thinking right now “this all sounds well and good, but there’s no way this could work for my team because…” You might be right, and for whatever reason your project might not work this way (for example, if you’re already pair-programming you probably don’t need any additional review system). However, here are some concerns that I had as we rolled this out to our team, and why they haven’t been issues:
“We don’t have the time to code review everything”
This is likely wrong, and there is plenty of other information out there touting the benefits of code review and how it can usually save you time, so I won’t get into it much here. I will just reiterate that code review is an incredibly powerful tool to proactively get ahead of quality issues before they take more time to fix, and — in my opinion — it’s indisputably the best way to educate and onboard new developers.
“I don’t want junior people reviewing code”
As just mentioned, I would argue that reviewing code is one of the best and fastest ways to ramp up new developers to the project, the team, and the conventions. However, it’s true that new team members may not have all the context of a given change to feel comfortable signing off on it. That’s why one of the important tenets of the review philosophy is that “if you don’t understand the change well enough to feel comfortable, ask for someone else to review.” In addition to this, senior team members can always jump in and review anything even if it’s not directly assigned to them.
“I want the people with the most context/knowledge of the area to review the code”
This is often important and true — especially when a team of people are working together on a “hot” area of the codebase. In this situation, we just ping our project team and our buddy. That way the project team can do the full review, but the buddy can still lurk and learn.
People often tout the importance of code review as it pertains to quality and catching bugs; however, it is also an incredibly powerful tool for disseminating knowledge throughout a team. As a team grows, establishing good code review practices and policies becomes more and more important. For us, redesigning code review became important as our team grew from five to 20+ people. It will be interesting to see how it continues to change with the team over time!
You can check out our buddy-generator here and our review philosophy here.
Another day, another Zero Day: What all Digital Development organizations should take away from recent IT security news
Even if you don’t work as a software developer, you probably heard about recent, high profile security issues that had IT Admins and developers frantically patching servers over the holidays and again more recently. Dimagi's CTO shares what these recent issues mean for Digital Development organizations.
January 28, 2022
Join the fight to support critical open source infrastructure
Open Source tools are a critical piece of global infrastructure, and need champions for long term investment
March 17, 2020
Two big lessons that Iowa and Geneva can teach us about technology in digital development
Last week brought two high profile technology failures into the global spotlight. Although these two mishaps may seem quite different at first glance, they both highlight challenges that are inherent in providing software in the public sector (regardless of locale) and illustrate cautionary lessons worth discussing for practitioners in Digital Development. The Iowa Caucus Debacle
February 7, 2020