Paul Hammant, independent consultant, joins host Giovanni Asproni to speak about trunk-based development—a version control management practice in which developers merge small, frequent updates to a core “trunk” or main branch. The episode explores the technique in some detail, including its pros and cons and some examples from real projects, and offers suggestions on how to get started. The conversation touches on a set of related topics, including code reviews, feature flags, continuous integration, and testing.
- Episode 133: Continuous Integration with Chris Read
- Episode 400: Michaela Greiler on Code Reviews
- Episode 440: Alexis Richardson on gitops
- Episode 498: James Socol on Continuous Integration and Continuous Delivery (CI/CD)
- Trunk-Based Development
- Book: Trunk-Based Development and Branch By Abstraction
- Paul Hammant’s Blog
- DevOps Trunk-Based Development
- Introducing GitFlow
Transcript brought to you by IEEE Software magazine.
This transcript was automatically generated. To suggest improvements in the text, please contact [email protected] and include the episode number and URL.
Giovanni Asproni 00:00:16 Welcome to Software Engineering Radio. I’m your host Giovanni Asproni and today we’ll be discussing trunk-based development with Paul Hammant. Paul is a DevOps and continuous delivery expert, architect and coder who has been coaching software development organizations on trunk-based development as a recommended way of working for over 20 years. He maintains trunkbaseddevelopment.com and has a book on the same topic. You can find the links to the site and also to the book in the links section of this podcast. These days Paul does spot consulting remotely for teams wishing to get from where they are to trunk-based development. Paul, welcome to Software Engineering Radio. Is there anything I missed that you’d like to add?
Paul Hammant 00:00:57 No, I think you nailed it all, and thank you for having me.
Giovanni Asproni 00:01:01 You’re welcome. Well, let’s start then. What is trunk-based development?
Paul Hammant 00:01:05 Good question. It is a branching model that we would use with version control for team software development.
Giovanni Asproni 00:01:11 Okay. Then maybe I should ask, what is version control? Can you expand on it a little bit?
Paul Hammant 00:01:17 Version control is a tool we would use to maintain and manage our source files, typically text, over time, towards completion of a project to ship to customers.
Giovanni Asproni 00:01:28 So we are talking about tools like Git or Subversion. These kinds of tools are for version control. Am I correct?
Paul Hammant 00:01:36 Yeah, for sure. Git mostly today. Perforce in some places, games companies. Subversion is still used, but most of the intelligentsia wants to use Git these days.
Giovanni Asproni 00:01:47 Can I ask you another question about version control before digging deeper into trunk-based development? We know that there are different models. Git is a decentralized model of version control, but in the past it used to be more centralized, like Subversion or, before Subversion, CVS. So, can you tell us a bit more about these two models?
Paul Hammant 00:02:07 Okay. With Subversion, Perforce and a few others, you would mostly require a connection the whole time to the server that was maintaining the canonical copy for the entire team. Now you could maybe take a flight and keep working on your stuff whilst you’re offline, but it got harder and harder as time went on to not reconnect with that server for history and things like that. Decentralized tools, Git and Mercurial in this age, allow you to originate everything offline and work for long periods of time without synchronizing with the mothership, the canonical copy. But in enterprises at least they still have that. The version up on GitHub that the entire team is pushing to and pulling from would be considered canonical, but we have the ability to maintain a longer disconnection from that canonical server.
Giovanni Asproni 00:02:55 Basically, with the decentralized model, people can also work disconnected, save the history of all their own work locally, and then push it to the central repository once connected again.
Paul Hammant 00:03:07 Exactly that, and actually they can rewrite their history before they push that back up, to simplify what to them was 30 commits into one.
Giovanni Asproni 00:03:15 What is trunk-based development in this context?
Paul Hammant 00:03:18 It’s a branching model. You’ve had many choices over the years, but in my opinion it’s the best branching model. It started its life in the nineties, evolved through the nineties into the current timeframe and it remains a persistent high-throughput branching model. Sadly, we have to keep reteaching it to the industry.
Giovanni Asproni 00:03:37 We know that many teams nowadays use a different branching model, like GitFlow or some form of feature branching and things like this. So, what kind of problems does trunk-based development solve that those other models don’t?
Paul Hammant 00:03:50 I met a guy some years ago, Frank Compagner, I’m not sure if I’ve got his name right, and he gave us a quote: branches create distance between developers and we don’t want that. I was quite jealous. It’s a very good quote and I use it a lot now. Take multiple branches for development. The longer we keep those branches alive, the more integration pain we encounter later. Integration is the correct terminology for when we bring a whole batch of changes back into the principal canonical place, and that’s a merge, really, if we talk about the actual language of the source control systems. So, the longer we leave it, a week, a month, a year, the worse it gets in that merge, and the more we have to coordinate with other people who would be waiting for that or who’ve been working elsewhere and were divergent over the same time span. The longer we wait, the worse that merge gets, and the more confusing and error-prone it could be. So, trunk-based is about trying to minimize that amount of time down to the smallest amount of time, so we could say a day.
Giovanni Asproni 00:04:55 So you just talked about one of the advantages of trunk-based development, which is about merges: avoiding situations where code changes make the code diverge in different branches for a long time, making the merge of these changes really difficult at some point in the future. But there are also some other touted advantages of trunk-based development. Some people say that feedback and release cycles are faster with trunk-based development. Can you expand on this?
Paul Hammant 00:05:25 Yeah, so feedback, obviously. If we’re merging work back a day after origination, and say that was me, and then you decided that I wasn’t doing the best piece of work, you could comment far more quickly, one day after origination rather than one month or one year. So, we could debate, and I could make a follow-up commit that would make it better still, which maybe was what you were thinking I should have done in the first place. But there are also benefits that come later, like in the release cycle. If we are releasing more often as a consequence of doing trunk-based development, we have a secondary, faster feedback cycle, this time from the community of people who would be using the application in production or in a deployed environment that isn’t production. So that’s the secondary benefit I think: feedback on not just the code but actually on the product that’s been built from the code.
Giovanni Asproni 00:06:14 Are there other advantages that you see in trunk-based development that we haven’t mentioned so far?
Paul Hammant 00:06:20 Merge difficulty I think we hinted at in the last couple of minutes, but merge mistakes are also a thing that’s, generally speaking, eliminated through trunk-based development. We’re bringing back smaller pieces of work more often. There’s a decent chance that we won’t make an error during that merge back. The divergence will be smaller, so the actual merge tools of Git and the like will be far smarter. But secondarily, anything that requires the arbitration of a human is smaller in number. If a human is error-prone, it’s after hours of work attempting one thing, like a merge that’s taking hours; if they’re doing something that is error-resistant, it’s a short and small piece of work that was easy to achieve. So many smaller commits, many smaller integrations that come back, are less error-prone than a behemoth merge that might take more than one person and more than one day to finish.
Giovanni Asproni 00:07:16 Are there any risks associated with trunk-based development?
Paul Hammant 00:07:19 There’s a few. An adept team perhaps is going to think there’s no risks, but a team that is transitioning to trunk-based development and remembering habits from their past might fall into some traps. An obvious one is we need to rely on automation, and if we classically came from a place of no test automation or little test automation, then we don’t get to rely on that in a trunk-based development setup. So, you know, the pitfalls are that we need more machines, more bots, more CI to actually look at what we’re doing and to give a thumbs up. A second one, related, still a thumbs up, is that we need perhaps more eyes on code. We have less time to certify that it’s correct, so we perhaps want to move upstream the moment where a second pair of eyes has a look at a piece of code. A team might make the mistake of not doing that, trying to do code review once a month or quarterly when they really should be doing it on every single commit, within minutes or at least an hour of origination. These are the types of mistake that a team might make.
Giovanni Asproni 00:08:29 Have you got an experience with trunk-based development that you can share, you know, to describe how you used it and how it benefited your organization?
Paul Hammant 00:08:39 Yeah, I have a few, but I will pick an airline to talk about from my past. I wasn’t the principal technician in this airline. I was one of many ThoughtWorkers, among many developers and consultants. We had a setup where we were doing a medium or major release every six weeks. However, every single team was taking longer than six weeks to make their thing from start to finish. That we’ll call concurrent development of consecutive releases. What we inherited was a series of branches: the first release team would work in one branch, and on a weekly basis you would merge their stuff into the intended next release’s branch. And my team was maybe four branches further down than that. So that was like a cascade branching model, and it wasn’t working. We were breaking Subversion, and we couldn’t get support for helping us push through that break in Subversion. So we proposed that we should move everyone to trunk-based development and a reliance on toggles and flags. Almost within a couple of days of effecting that, we were liberated from the merge hell. We were able to dial up our throughput across all teams. We had a new need to rely on toggles, maybe we’ll come back to that later, but we had this super productive period of time facilitated by trunk-based development, when an attempt at another branching model, organized around teams and intended release order, was not working.
Giovanni Asproni 00:10:13 So in this example, using trunk-based development was what saved the project, in a sense.
Paul Hammant 00:10:19 We possibly could have soldiered on with the old way, but we would’ve had to slow down our releases and make a greater interval between releases, because the dev teams literally couldn’t keep up with the upstream changes that other teams had managed to lock in, and the merges were harder and harder to do. So it did save the throughput. Maybe, if you wanted to be cynical, we could’ve stuck with the old design, but it would’ve been increasingly hard and unpopular. We could have had people quitting to get away from the hell of merging, but we moved everyone to a more productive, higher-throughput place by installing trunk-based development.
Giovanni Asproni 00:10:58 Okay. And now maybe we want to have a look at trunk-based development in a bit more detail. In your book you mention a few styles of trunk-based development. Can you tell us what they are?
Paul Hammant 00:11:10 For sure. There is a historical direct-to-trunk model. In the early days of source control in the nineties, maybe that was the only mode of operation you had. The idea that you could have a concept of a branch came later, halfway through the nineties. Direct to trunk would mean that everyone is going to do their piece of code, hopefully a small piece of code, and test it themselves. That would mean bring it up, desk check it on their own machine, run the unit tests as well as every other facet of the build, and then do a commit and a push. Historically, at the end of the nineties and into the two thousands, you might have to stick your hand in the air in the dev room and say, “I’m checking in, nobody else check in.” And that ceremony would render this safe. In later years the source control packages would arbitrate on whether two people could check in concurrently and whether it would still work. Subversion locked that in.
Paul Hammant 00:12:04 Others like Perforce had done it some years before. That model really is passé. People don’t do that so often, although it’s perfectly possible to do in a small enough team who know each other. The second and perhaps most popular model is the short-lived feature branches of GitHub flow. This was a standout feature of GitHub, which launched in 2008, and it was a pretty compelling way to work. This one also facilitates surprise pieces of work from outside your team. Many open-source teams use this model. The third way of working on the styles page of the trunk-based development site is a patch queue. This one also has some historical use. Google implemented their own patch queue, maybe not when they started, but soon after they realized they couldn’t get all their developers pushing straight to trunk. Guido van Rossum had a technology, talked about in 2006, called Mondrian. It maintained pending changes in Postgres and you could review them in a web application, and if you got enough thumbs up on a particular change, a patch, a change list, you could automate from there its inclusion, its integration, back into the trunk. So it would handle effectively everything that GitHub’s pull request model does, but without actually using a branch.
Giovanni Asproni 00:13:28 How do you choose among these three different styles?
Paul Hammant 00:13:32 That’s a very good question. If I’m setting up a dev team, I’ll probably try and get everyone to push straight back to trunk. Now I’ll get some resistance, but I might be able to carry the day, and this could easily work up to 10 developers who are concurrently committing back to trunk. You could make it higher than that, but you really probably shouldn’t. You should shift to short-lived feature branches, the GitHub model. Now that can scale up to a hundred, and that’s why I would maybe make that decision: I have a hundred developers, the pull request model’s going to work for me. But if I was trying to recreate Google and I’m going beyond a thousand developers, there’s no way on earth that I would choose the pull request model. I’d probably put in one of the patch review systems that exist in open-source land.
Giovanni Asproni 00:14:18 You described the second model as basically using feature branches to work and then a pull request or merge to master. But what is the main difference between this and a branch-based development model? Because, you know, you are still using branches. So where is the difference?
Paul Hammant 00:14:37 That’s a good question as well. Let’s say, regardless of the three models that I’m talking about, the team maintains one promise to the bosses, which is: we are always release ready. So, say you’re the CIO, and you come in and say we must release in one hour because competitor X has managed to beat us to market on the feature we’ve all been working on. In a trunk-based development team, the answer is we are release ready at all moments. Maybe not within a second, if we were thinking we had another three weeks to develop, but if we flip some feature flags or toggles, we can maybe stabilize and make, you know, a perfect feature representation of what we’re doing. But what we had already was that the build passes. We could deploy to an environment; we might already be deploying to QA or UAT with every commit. What we have is a capability that long-lived branch teams do not have. We are always release ready.
Giovanni Asproni 00:15:32 So you mentioned long-lived branches. I guess the difference here is in the length of time the feature branches are kept open, because you are saying you still have some branches where there is development, and then maybe a pull request model. But I would imagine in this situation you want a branch to live for a very short amount of time for it to be called trunk-based development. How long is the life of a short-lived feature branch?
Paul Hammant 00:16:02 If you ask me, it’s a day. If we ask the industry, “How big is your short-lived feature branch?”, the typical answers are two to three days of development and one for QA.
Giovanni Asproni 00:16:13 And for something that you want to call trunk-based development, how long should it be?
Paul Hammant 00:16:17 One day. One day max, you know. And maybe your smallest piece of work is going to be a quarter of a day: it’s a bug fix or it’s a small feature. I’m a hundred percent confident I did it. Test-driven developed, it went in and it was perfect. I’m picking up the next thing from the backlog and it’s not even lunch yet.
Giovanni Asproni 00:16:36 With trunk-based development, how do you manage conflicts when multiple developers are working on the same parts of the code base?
Paul Hammant 00:16:43 Okay, so the first observation is that it kind of disappears because these changes are small. The chance that we could actually be working on the same method or function in the same source file in an incompatible way is massively reduced. Anyway, that wasn’t answering your question, or at least not all of it. It does happen, and you’re going to be pushed into the three-way merge tools of your source control system, and you’re going to use three-way merge to arbitrate over our common ancestor, your work and my work. Because these things are smaller, that’s still easier to get through. Maybe, you know, you wanted to do a push and you found out somebody else had beaten you to it. So, at least with trunk, your obligation is to pull their changes, go through the three-way merge tool and then have a second go at pushing your thing back up to the canonical place. Now it could be that the delay on your push of your own work was only half an hour more. If we were in that hellscape of long-lived feature branches, it might be days of merging, and it might take multiple eyes and people working in rounds to get their long-lived branch back into a trunk or mainline.
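Editorial sketch: the arbitration over "common ancestor, your work and my work" described above can be illustrated with a much-simplified three-way merge. Real tools such as Git work on line hunks within files; the version below assumes each change is keyed by a hypothetical unit name (for example a function name), and the function name `three_way_merge` is ours, not from any real tool.

```python
from typing import Dict, List, Optional, Tuple


def three_way_merge(
    base: Dict[str, str], mine: Dict[str, str], theirs: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Merge two sets of changes made from a common ancestor.

    Each dict maps a hypothetical unit name (e.g. a function) to its text.
    Returns the merged units plus the names that need a human to arbitrate.
    """
    merged: Dict[str, str] = {}
    conflicts: List[str] = []
    for key in sorted(set(base) | set(mine) | set(theirs)):
        b: Optional[str] = base.get(key)
        m: Optional[str] = mine.get(key)
        t: Optional[str] = theirs.get(key)
        if m == t:        # both sides agree (unchanged, same edit, or both deleted)
            result = m
        elif m == b:      # only their side changed this unit
            result = t
        elif t == b:      # only my side changed this unit
            result = m
        else:             # both changed it differently: a real conflict
            conflicts.append(key)
            continue
        if result is not None:   # None means the unit was deleted
            merged[key] = result
    return merged, conflicts
```

With small, frequent commits the two sides rarely touch the same unit, so the conflict list tends to stay short, which is the point made in the answer above.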
Giovanni Asproni 00:17:51 And how do you ensure that trunk is always stable and ready for production release? I think we touched upon this a little bit before. You mentioned some infrastructure, testing and continuous integration. So, can we expand a bit on this?
Paul Hammant 00:18:06 Yeah. So, your CI tool, Jenkins classically, is going to pick up your commit or your pull request and is going to run it through a bunch of steps in the pipeline: compilation, unit tests, one or two or three stages of integration tests. Now in the highest-performing teams it’s going to acquire a bunch of ephemeral infrastructure, using infrastructure-as-code scripts to stand up enough of the system to amount to the fullest of tests for the entire change that had just happened.
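Editorial sketch: the pipeline described above (compilation, unit tests, then integration stages) boils down to running ordered steps and stopping at the first failure. This is not how Jenkins is configured; the `run_pipeline` function and the stage names are hypothetical, and each stage is just a callable returning True on success.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a callable that returns True when the stage passes.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order and stop at the first failure.

    Returns (passed, names of the stages that actually ran).
    """
    ran: List[str] = []
    for name, step in stages:
        ran.append(name)
        if not step():
            return False, ran   # later stages are skipped once one fails
    return True, ran


if __name__ == "__main__":
    # Hypothetical stages; real ones would shell out to a build tool.
    example = [
        ("compile", lambda: True),
        ("unit-tests", lambda: True),
        ("integration-tests", lambda: True),
    ]
    print(run_pipeline(example))
```

Failing fast keeps the cheap stages at the front, so a broken commit is reported before the expensive infrastructure is stood up.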
Giovanni Asproni 00:18:33 Some teams also use quite a bit of manual testing for their systems. When you use trunk-based development, is there any space for manual testing? And if yes, what kind of manual testing can we use?
Paul Hammant 00:18:47 That’s a good question and it depends on your intended release cadence. Release cadence is how many planned releases we want to get out in a week, a month, a year. With that said, if you are intending to do quarterly releases, you have plenty of time for manual testing of an imminent release. If you’re intending to do monthly releases, the same is still true. If you’re intending to do one release a week, then you’re maybe stretching what a manual team could do within a week. You need to be maybe code complete with one or two days to go, and then the QA team is going to have a look at the thing. If you move up past daily to 10 releases a day, you’ve lost the ability to have a manual test as an honest gate of whether we should or shouldn’t do the release. So, we’ve probably at some point in that progression deliberately lost our ability to manually veto a release. Now at a hundred releases a day, this is GitHub, Etsy and a few others, there’s no place to say no to a release. If you put a bug in production, it’ll be resolved through a technique called fix forward rather than rollback, which is maybe 30 years of well-experienced history of the IT industry.
Giovanni Asproni 00:20:08 Let’s change gear a little bit. Many teams nowadays use code reviews before pushing to trunk. What are the best practices for code reviews in trunk-based development?
Paul Hammant 00:20:21 If you have the code review tool, that would be the short-lived feature branch pull request system of GitHub, or the patch queue, which has the review system built in. The first piece of advice for a trunk team would be to do it as soon as possible after the request has gone out for a code review. Now Facebook have an interesting story from their past, from 10 years ago, where they put a service level in for the DevOps team to have all the automated data points in within 10 minutes, because 10 minutes was when the early majority of human code reviewers would turn up to review a change within their trunk-based development system. It would be a waste if those people were in the middle of reviewing something only to find out that Jenkins, 10 minutes later, was going to fail that pull request because it was wrong or broke a test or didn’t merge perfectly with something else. So, they moved it so that the 10-minute average turnaround on human reviews was supported by all the bot data points.
Giovanni Asproni 00:21:28 What about teams doing pair programming and mob programming? In those situations, what would you recommend to them?
Paul Hammant 00:21:36 So, I was in Google as a consultant for a period and, not every team but some teams, had an automatic kind of plus one for any piece of code that had been pair programmed. It’s possible that you still had to have another human weigh in afterwards. But if the two people in that pairing partnership were sufficiently senior on the actual application and had internal qualifications, maybe one was called Java readability, that might have been enough to get that thing all the way into the trunk without any other humans reviewing it. So, pair programming, for example, is definitely a second pair of eyes, and for some teams it might be enough. The extreme programming teams in 2001, before Git and GitHub, would say everything is paired, and where the developers have been rigorous enough to run all the tests before the check-in and the push, everything paired would be fine without any further review after that.
Giovanni Asproni 00:22:36 So it seems there are two dimensions here to keep in mind. One is having more than one pair of eyes look at the code, so a pair or somebody else. But also, when it is not a pair, or not a sufficiently senior pair as you were mentioning, somebody else has to look at the code to decide that it’s good enough. It seems that the key parameter there is time. You mentioned Facebook with 10 minutes. I would imagine that if it takes two days, one week or, as I’ve seen in some teams, one month to do a review, that cannot really be called trunk-based development by any stretch of the imagination.
Paul Hammant 00:23:14 Absolutely right. You’re lying if you are calling it trunk-based development. You’re probably a team with a branching model where one of the branches is called trunk. But it’s certainly not trunk-based development if you’re waiting that long for code to be consumed back into the principal, most important place.
Giovanni Asproni 00:23:32 And now a different kind of question, around compliance and regulations. There are some people that work in environments where they have to follow some rules for coding because there are some, I don’t know, compliance regulations within the company or maybe within the industry. Have you worked in any environments where there are some specific compliance rules and specific regulations to follow for the code to be accepted? And if so, how did that impact trunk-based development?
Paul Hammant 00:24:06 So, another great question. I think, you know, there’s two parts, right? Some of your audits concern releases and some concern code in the repo. And you could say that within a source control system that has integrated review, we can actually audit after the event that every single change that went through was appropriately categorized as feature or bug, was linked to a backlog item, a Jira or something else, and also had a review attached. Maybe if it was a vanilla Subversion from the past, you wouldn’t be able to do this because it wasn’t all done in one technology. But if you are using GitHub or GitLab in this age, you can see all of those things after the event. Now of course mistakes can still be made by a dev team, but as long as you’ve actually remediated those mistakes before the audit, I think you’re okay.
Paul Hammant 00:24:57 So you’d run a script that says what went into the trunk that didn’t get a code review. Okay, we will list those as errors on our part, but we will do the code review now. So, it wasn’t caught in an audit, it was disclosed to the auditors in a mea culpa moment: we did 18,000 commits, of which 100 were not reviewed at the time but were reviewed within five days of landing in the trunk. So there’s many things that a trunk-based development team can do to be audit safe and comply with regulations, and those don’t mean you have to slow down. You can keep your full speed as long as you can prove something after the event. Proof is going to be hopefully resting on Git and some Python scripts that are reproducible and would please the auditors at that later moment.
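Editorial sketch: the kind of after-the-event audit script mentioned above might look roughly like this. It is a minimal illustration, assuming the commit and review data has already been exported into in-memory records; the `LandedCommit` fields, the `audit` function and the five-day grace period are hypothetical choices, and a real script would pull this data from Git and the review tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class LandedCommit:
    """Hypothetical record of one commit that landed in trunk."""
    sha: str
    landed_at: datetime
    reviewed_at: Optional[datetime]   # None if no review is recorded
    backlog_item: Optional[str]       # e.g. an issue-tracker key, None if unlinked


def audit(
    commits: List[LandedCommit], grace: timedelta = timedelta(days=5)
) -> Dict[str, List[str]]:
    """Group commit SHAs by the audit findings they trigger."""
    report: Dict[str, List[str]] = {
        "never_reviewed": [],
        "reviewed_after_landing": [],   # remediated within the grace period
        "outside_grace": [],            # reviewed, but later than the grace period
        "no_backlog_item": [],
    }
    for c in commits:
        if c.backlog_item is None:
            report["no_backlog_item"].append(c.sha)
        if c.reviewed_at is None:
            report["never_reviewed"].append(c.sha)
        elif c.reviewed_at > c.landed_at:
            late_by = c.reviewed_at - c.landed_at
            key = "reviewed_after_landing" if late_by <= grace else "outside_grace"
            report[key].append(c.sha)
    return report
```

Because the function is pure and its input is an export, the same report can be regenerated for the auditors later, which is the reproducibility the answer above relies on.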
Giovanni Asproni 00:25:50 From what you say, it sounds like trunk-based development can be used in environments where there are some strict compliance requirements or strict regulations.
Paul Hammant 00:26:00 For sure, as long as we’re vigilant that mistakes might happen.
Giovanni Asproni 00:26:03 Let’s change direction a little bit. How do you manage deployment and release cycles in trunk-based development?
Paul Hammant 00:26:10 That’s a good question. You know, it’s often the first question I ask companies I’m consulting with: what is your planned release cycle? Somebody might say quarterly or monthly or weekly. And then I might ask, what is your observed number of unplanned releases per planned release? Somebody might say that every three releases we do an unplanned release in the days that follow. And that really drives a lot of questions about what flavor of trunk-based development you’re going to do. So not just direct to trunk or the pull request model or the patch queue model, but what are we going to do around CI? What environments do we need? What environments are we going to eliminate? What other parallel techniques might we use? Service virtualization, component testing, contract testing, all sorts of sophisticated things. We might want to work into the build the definition of done and the roles within the team: devs, QAs, devs and QAs maybe pairing together, the emphasis on or attendance of a backlog, a tracker, and the dual attendance of a second system, a trouble ticket system, which could be ServiceNow. It’s possible teams can look at two queues: planned work and then the surprise work from production. All of these things drive the general capability and style and experience of releasing and supporting releases for a dev team.
Giovanni Asproni 00:27:33 So what kind of models have you seen? Even with trunk-based development, would you branch off for a particular release, or maybe just put a tag on a release and branch off if you need to do some fixes? How would you proceed with that?
Paul Hammant 00:27:47 So if you’re monthly on your planned release cadence, you’re probably going to cut a release on a release branch on a just-in-time basis. Harden that in the hour that follows, lock it, and only cherry-pick changes to it that are bug fixes, using a responsible member of the team. You’re not going to allow stuff to come up from trunk into the release branch that is more features. You’re going to say wait for your next release, which is in one month’s time. And if that’s not acceptable to the team that’s being told no, they can’t merge, then maybe you should have a faster release cadence than monthly. These are the gates that might drive the team’s larger experience. You know, make a release branch. The other one you talked about was make a tag. If you’ve moved from monthly to weekly and then weekly to daily, you might not be making a branch anymore for releases.
Paul Hammant 00:28:40 You might just be making a tag to say we’re going to release that. And a bot picks that up and says that’s going all the way into production. If you’ve moved to 10 releases a day or a hundred, you’re probably not even tagging anymore. The system is making a decision about which commit, or commits plural, will be batched on a normal basis. It doesn’t really matter anymore. The system is making the decision as to which one’s going out and it’s doing the release, and even if it makes a tag, which it may not, the tag is an after-the-event thing, meaning the release went out and I’m going to tag it to say that was the 18th release on this date. But at some point they become meaningless too: if your system is so good that every commit reaches production, there’s no point in tagging anymore.
Giovanni Asproni 00:29:23 Also, when releasing, many teams use code freezes. Can you comment on this? Is there an implication for trunk-based development?
Paul Hammant 00:29:33 There is. So, Laura Wingerd and Christopher Seiwald did the high-level source control best practices document in 1998, and it talked about branches and said each branch should have a different policy, with code freeze as a policy. So, we could apply a freeze to this branch but not to that one. Now the reality for trunk-based development teams is they never freeze the trunk, ever. It doesn’t matter. You are going at high speed, and if it was you that was making the release branch, you might tell the team, “I’m making the release branch in one hour,” but nobody changes any habits in that time scale. The release branch just gets made from head, and then that one is frozen. Frozen to the developers. You, the person that made the release branch, might be the responsible adult that’s going to look after it for the next three days before it goes live.
Paul Hammant 00:30:26 But you’re going to freeze it and say devs can’t commit to it. Devs may request that a commit or commits get cherry-picked to it, which is a specialized form of merge that Git has, Subversion has and Mercurial has. But the freeze of the release branch is because of a policy change, which is: devs shall not develop on the release branch. So, we said already we don’t want additional features rammed in after the branch cut moment, but we are open to bug fixes going in because somebody in the proving cycle for the release that’s about to go out, the one coupled with that release branch, has found a defect. We have fixed that in the trunk. That was unplanned work. We proved that it was fixed. We maybe added a test as well, because we’re really good test-driven developers, and that individual atomic commit was cherry-picked to the release branch. None of the intermediate commits that happened after branch cut and before the commit that was eligible for cherry-pick came with it to the release branch. So, we are saying: a frozen branch, with a responsible adult bringing to it only stuff that matches the defects. We will have to disclose this in an audit cycle maybe. It should have multiple eyes on it, including a manager saying yes, that should be merged, cherry-picked to the release branch. But that’s the only place that trunk-based development allows freezing to happen.
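Editorial sketch: the policy described above (cut the release branch from head, freeze it, and let only individual bug-fix commits be cherry-picked across) can be modelled in a few lines. This is a toy model, not Git itself; the `Commit` and `ReleaseBranch` classes, the `cut_release` function and the "feature"/"bugfix" labels are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Commit:
    sha: str
    kind: str   # hypothetical label: "feature" or "bugfix"


@dataclass
class ReleaseBranch:
    commits: List[Commit]

    def cherry_pick(self, commit: Commit) -> None:
        """Policy of the frozen branch: only bug fixes may be brought across."""
        if commit.kind != "bugfix":
            raise ValueError(f"{commit.sha} is not a bug fix; wait for the next release")
        self.commits.append(commit)


def cut_release(trunk: List[Commit]) -> ReleaseBranch:
    """Cut the release branch from the current head of trunk (a copy of its history)."""
    return ReleaseBranch(commits=list(trunk))


if __name__ == "__main__":
    trunk = [Commit("a1", "feature"), Commit("b2", "feature")]
    release = cut_release(trunk)
    # Trunk is never frozen: work carries on after the branch cut.
    trunk.append(Commit("c3", "feature"))
    fix = Commit("d4", "bugfix")
    trunk.append(fix)          # the defect is fixed on trunk first
    release.cherry_pick(fix)   # only that atomic commit goes to the release branch
    print([c.sha for c in release.commits])
```

Note that the intermediate feature commit made on trunk after the cut does not travel with the cherry-picked fix, which mirrors the description above.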
Giovani Asproni 00:31:47 Now another common issue that I see in teams is long-running feature development. So sometimes teams have in their own requirements to develop something that requires time. I’ve seen that often branches are used for this purpose, like basically feature branches. Yeah. So, create a branch for this long-running feature and we’ll work on this for a few weeks, maybe a month, sometimes a few months, and then merge back when we are ready. So, if we want to use trunk-based development for these long-running features, what should we do to be able to do that?
Paul Hammant 00:32:25 We should do branch by abstraction. This is my contribution to the field of computer science in the respect of trunk-based development. I’m just the guy that documents everything. Now one thing you didn’t say was that this particular long-running piece of work, a long-to-achieve piece of work, was probably straddling a number of releases, not just taking a long time in itself. So, the answer is that we had something to do in the code base that was large and would take many, many, many commits. What we should probably do is introduce an abstraction for ourselves first that isn’t actually anything towards goals, but it facilitates breaking the work up into multiple further commits that don’t inconvenience anyone who’s working in the same trunk. Everyone else should be able to work at full speed delivering their features. You are doing your feature or your refactoring or your tech debt item, whatever it is that’s going to take ages to do.
Paul Hammant 00:33:22 But you’re doing it in such a way that no one else in the team notices you’re doing it, but you’re doing it methodically through a hundred commits that might go live in sets of 10 or 15 or 20. But they’re all benign in production because they’re kind of turned off. Now the off-and-on technology is flags and toggles, of which there are many, many ways of achieving that within a code base, using a system to maintain toggles or just having a YAML file in the source tree. But for you as the person who’s doing the long-to-achieve change, you might have the toggle set to new, whatever the toggle name was. Everyone else has it set to old. In fact, that’s how it was committed. But you are working towards goals, you are doing commits, they are going live but nobody else is noticing them. They might notice the abstraction has been introduced and they have to work with that, but they don’t see any of the downside of you taking ages to finish your change. Now your last piece is you finished the work, there’s nothing more to do, you’ve gone live, so you should probably remove the abstraction. You might even remove the toggle at the exact same moment, or they’re done kind of in following commits.
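A small sketch of branch by abstraction as described above, under assumed names: `TaxCalculator`, the two implementations, the `tax_calculator` toggle, and the tax rules themselves are all invented for illustration. The abstraction goes in first and changes no behavior, the new implementation is built behind it over many commits, and the committed toggle stays on "old" so nobody else notices.

```python
from abc import ABC, abstractmethod


class TaxCalculator(ABC):
    """The abstraction introduced first; by itself it delivers no feature."""

    @abstractmethod
    def tax(self, net_cents: int) -> int:
        ...


class OldTaxCalculator(TaxCalculator):
    """Existing behavior, moved behind the abstraction unchanged."""

    def tax(self, net_cents: int) -> int:
        return net_cents * 20 // 100


class NewTaxCalculator(TaxCalculator):
    """The long-to-achieve change, built up over many small commits."""

    def tax(self, net_cents: int) -> int:
        if net_cents < 1000:  # hypothetical new rule: small orders are tax free
            return 0
        return net_cents * 20 // 100


# What is committed to trunk: everyone, including production, runs "old".
COMMITTED_TOGGLES = {"tax_calculator": "old"}


def make_calculator(toggles: dict) -> TaxCalculator:
    if toggles.get("tax_calculator", "old") == "new":
        return NewTaxCalculator()
    return OldTaxCalculator()


def invoice_total(net_cents: int, toggles: dict = COMMITTED_TOGGLES) -> int:
    return net_cents + make_calculator(toggles).tax(net_cents)


# The developer doing the long change overrides the toggle locally only.
my_toggles = {**COMMITTED_TOGGLES, "tax_calculator": "new"}

print(invoice_total(500))              # teammates and production: old path
print(invoice_total(500, my_toggles))  # the developer of the change: new path
```

Once the new path is live for everyone, the old implementation, the toggle, and possibly the abstraction itself can be deleted in follow-up commits.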
Giovani Asproni 00:34:32 So you mentioned, toggles and flags, which are the same things. So, can you tell us in practice how they work?
Paul Hammant 00:34:41 Yeah, it’s a software flag that says yes or no, or old or new, or A or B or C or D in some context. Now that could be something you had on the command line as you start your compiler. So if this was Maven, you could pass a -P profile or a -D flag to the build and it would go and do something instead of something else. So that would be a build-time flag. If it was a more sophisticated tool, then you’d have a parameter to the build, which would be --with-shopping-cart if that was the thing that was being added in your refactoring. That’s something that affects the binary from the same set of sources. And if you ran it twice back-to-back with different toggles, it would make two different binaries without any other changes on your part.
Paul Hammant 00:35:28 And if you’ve shipped one of those to production, it is that thing you know with shopping cart or another binary without shopping cart and you can’t in production be this other thing instantaneously you’d have to do a rebuild and a redeploy. So, the next level along would be let’s ship the binary with shopping cart and without shopping cart logic in it. But it will pick up a YAML file as it boots and say I have shopping carts, which would give you the option of maybe bringing that service down, editing the YAML, this is not advised, and then relaunching the service in the same infrastructure. Maybe this is before the virtualized era and it would suddenly have shopping cart in production when it didn’t have shopping cart a few minutes before. So that’s a toggle that would be shipped with the binary that is all things to all people, but shapes the behavior of the binary in its environment.
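A rough sketch of the boot-time variant described above. The file name `toggles.yaml`, the `shopping_cart` key, and the `Shop` class are assumptions for illustration, and the loader handles only flat `name: true` or `name: false` lines; it is not a real YAML parser. What it shows is that one binary carries both code paths, reads the toggle once at launch, and only picks up an edit after a relaunch.

```python
import tempfile
from pathlib import Path


def load_toggles(path: Path) -> dict:
    """Read flat 'name: true|false' lines. Deliberately not a full YAML parser."""
    toggles = {}
    for raw in path.read_text().splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name, _, value = line.partition(":")
        toggles[name.strip()] = value.strip().lower() == "true"
    return toggles


class Shop:
    def __init__(self, toggle_file: Path):
        # Toggles are read once, at boot; later edits need a relaunch.
        self._toggles = load_toggles(toggle_file)

    def page_sections(self) -> list:
        sections = ["catalog"]
        if self._toggles.get("shopping_cart", False):
            sections.append("shopping_cart")
        return sections


with tempfile.TemporaryDirectory() as tmp:
    toggle_file = Path(tmp) / "toggles.yaml"
    toggle_file.write_text("shopping_cart: false\n")
    running = Shop(toggle_file)
    before = running.page_sections()

    # Someone edits the file while the service is up.
    toggle_file.write_text("shopping_cart: true\n")
    still_running = running.page_sections()         # unchanged until relaunch
    relaunched = Shop(toggle_file).page_sections()  # picks up the new state

print(before, still_running, relaunched)
```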
Paul Hammant 00:36:18 The third kind of way is to have a toggle system. So, this would be a server on some basis that maintains has shopping cart or does not have shopping cart, and the application as it launches would phone home to the service and say, should I show the shopping cart or should I hide the shopping cart? And that would be able to disseminate toggle states to running services and applications such that maybe in the middle of the day, and not having to reboot the stack at all, you could change the flag from does not have shopping cart to has shopping cart, and all of a sudden the system would show that to the next user that visits the page in question, where the shopping cart could or, in alternate configurations, may not be. So that’s really, in my view, three broad categories of flags and toggles. Ones that work at build time, ones that work at boot time, and ones that don’t require any reboots or deployment of the software at all and will just work based on whatever an operations team does in a page that says on or off or yes or no.
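A sketch of the third category, the runtime toggle, with an in-memory stand-in for the toggle server. `ToggleService`, `ShopPage`, and the `shopping_cart` toggle name are hypothetical; a real toggle system would be reached over the network and would usually cache or push state rather than being queried in-process like this.

```python
class ToggleService:
    """Stand-in for a central toggle server that operations staff can change."""

    def __init__(self) -> None:
        self._states = {}

    def set(self, name: str, on: bool) -> None:
        self._states[name] = on

    def is_on(self, name: str) -> bool:
        return self._states.get(name, False)  # unknown toggles default to off


class ShopPage:
    def __init__(self, toggles: ToggleService) -> None:
        self._toggles = toggles

    def render(self) -> list:
        # The toggle is consulted on every request, so a change made in the
        # middle of the day is visible to the next visitor without a reboot.
        sections = ["catalog"]
        if self._toggles.is_on("shopping_cart"):
            sections.append("shopping_cart")
        return sections


service = ToggleService()
page = ShopPage(service)  # launched once, never restarted below

morning_visit = page.render()
service.set("shopping_cart", True)  # operations flips the toggle
afternoon_visit = page.render()
service.set("shopping_cart", False)  # and can flip it back just as quickly
evening_visit = page.render()

print(morning_visit, afternoon_visit, evening_visit)
```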
Giovani Asproni 00:37:19 So are there implications in trunk-based development for the team itself? Like if a team has specialists, say somebody that only does UI, somebody else that only does backend development, somebody else that is just database, is this a potential problem for trunk-based development?
Paul Hammant 00:37:43 If they refuse to slowly learn the other technologies, yes. So, what we really want is T-shaped people. So, you’re a bit deep on one thing, like I’m really good at CSS but I will get on, you know, less excellent but still adept with the full range of technologies that I need to use within the dev team. So, you want a team of multiple T-shaped people where every specialization is covered by someone who is a specialist, but that gradually all the skills move around the team regardless of hopes for the individuals within the team.
Giovani Asproni 00:38:16 If I’m understanding correctly, what we’re saying is if we have a team where people, well a team and perhaps a company where the culture is of high specialization of what people do using trunk-based development may become difficult in that context. Am I correct?
Paul Hammant 00:38:33 If they refuse to become poly-skilled. Every team would start where you are describing and then jump into the opportunity to train up colleagues in the same skills that they’re very good at. Not to the same level, but at least no longer beginner on those technologies.
Giovani Asproni 00:38:51 So your experience is that this is actually not a real problem in practice?
Paul Hammant 00:38:56 If you’ve achieved trunk-based development, you probably have started to solve this one problem.
Giovani Asproni 00:39:01 Let’s talk about the role of tools. Are there any limitations in the tools that we need to be aware of when we want to use trunk-based development?
Paul Hammant 00:39:10 Yeah, so historically not all the tools would be equal on their ability to do three-way merge, but I think that’s been leveled up. If we talk about Git in particular, you have to do a pull before you do a push. If somebody else pushed before you to the same branch, you won’t be able to push, you have to do a pull first. Now you know, this was for the longest time of Git’s life a problem. In the last year or so, a number of strategies and capabilities in Git as a tool have evolved to actually almost eliminate this problem. But let’s go and say what the worst incarnation of the problem would be: you’re in a large team, a hundred devs, everyone does four commits a day that they push all the way back to trunk. You might finish a piece of work and then do commits.
Paul Hammant 00:39:54 It works for you because you are distributed. Then you might do a push, which is to get all of your commits back to origin. But it says you can’t do it. So you go, oh, somebody else worked. I do a git pull, oh, merge conflicts, or I’ve got to run the build again because I’m a good citizen. And you do a push again and you find out somebody else has beaten you to it as well. So, the problem is that you might be in that push-pull bottleneck forever if the team is huge and has sort of high-throughput commits back to the trunk. So, you might regret your decision to use Git for that, but my observation is this problem has started to be eliminated by strategies for merging, you know, or merging back to origin, fast-forwards and the like, and it’s less of a problem than it was five years ago, and roll forward five years more and this might be absolutely invisible as a problem.
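A toy model of the push-pull bottleneck described above, not real Git behavior in detail. `Origin`, `land_change`, and the list of booleans are hypothetical; the one rule modeled is that a push is rejected if origin's head moved after your last pull, so every rejected push costs another pull and another local build.

```python
class Origin:
    """Shared trunk: accepts a push only if it was based on the current head."""

    def __init__(self) -> None:
        self.head = 0

    def push(self, based_on: int) -> bool:
        if based_on != self.head:
            return False  # somebody else landed first; pull and retry
        self.head += 1
        return True


def land_change(origin: Origin, teammate_pushes_during_build) -> "int | None":
    """Return how many pull/build/push rounds were needed, or None if never landed.

    teammate_pushes_during_build holds one boolean per round: True means a
    teammate landed a commit while our local build was running.
    """
    attempts = 0
    for teammate_pushed in teammate_pushes_during_build:
        attempts += 1
        base = origin.head  # pull: we now know the current head
        # ... the local build runs here, being a good citizen ...
        if teammate_pushed:
            origin.push(origin.head)  # teammate beats us to it
        if origin.push(base):
            return attempts
    return None


quiet_team = land_change(Origin(), [False])
busy_team = land_change(Origin(), [True, True, False])
never_lands = land_change(Origin(), [True, True, True])

print(quiet_team, busy_team, never_lands)
```

The longer the local build and the more people pushing, the more rounds are likely, which is why the problem grows with team size and commit frequency.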
Giovani Asproni 00:40:47 About other tools now, do IDEs have a role to play as well?
Paul Hammant 00:40:52 These days, the IDE has to understand the source control technology natively. This is not just shelling out to the commands of Git, Subversion, Mercurial. This is actually having the features built in so that it can represent change and divergence, it can arbitrate, maybe even be, at the low end, an admirable tool for merging three ways. But you know, the answer to the question that you put is we do need tight integration with the source control package in the modern age, and we would not use a source control package that didn’t have tight integration.
Giovani Asproni 00:41:28 In your book you talk about the importance of a local build. So, can you expand on that?
Paul Hammant 00:41:34 So this has been the tradition for the continuous integration teams, the test-driven development teams, the extreme programming teams, and we just have to continue it. So, you can’t really say you’re high throughput unless the change you’re pushing back to trunk passes all of the same tests that the CI bot would do as well. And you don’t want to duplicate that. So that means basically the same technologies that you would use to certify that something passes all checks should be doable on your dev workstation in your checkout, before commit, after commit. It doesn’t matter, it must run on your Mac or your Dell or your ThinkPad.
Giovani Asproni 00:42:15 Okay. So, the importance of the local build is basically to give developers that are going to push something to trunk a high confidence that the thing is going to work, and, I guess, a lesser chance of having problems with breaking the build or maybe clashes with other merges. And let’s now talk about something else, which is actually implementing trunk-based development. So, let’s say an organization has decided to go towards trunk-based development. So, are there any situations in which you would recommend the organization not to do that?
Paul Hammant 00:42:54 Well, if you don’t have any tests at all, any automated tests, you’re in a pickle; you can’t take it on at the same time. Other pitfalls, you know, what’s your current cadence into production for releases? Maybe don’t shoot for the moon and say we’re going straight to CD, which is a flavor of trunk-based development. And there are two CDs, there’s continuous deployment and continuous delivery. But don’t do either. If you’ve got no tests, don’t do either. If you are starting from a point of quarterly releases, choose a way of working with trunk-based development that matches your current cadence, then just experience trunk-based development in that design. Then think about dialing up from quarterly to monthly or monthly to weekly; that’s on the planned releases. Try and worry about your unplanned releases, which are the bug fix ones. You want to eliminate them if you can; choose any strategy to do so.
Paul Hammant 00:43:46 But the most solid one is have more automated tests. Make an obligation that the devs who do work write the tests, at least the unit tests. QAs have skills devs don’t. QAs, for one, know what not to test, but they should be in the dev team too. And they should be committing and not be manual. They should be learning the same technologies, be that TypeScript or Java. They should be crafting their skills with frameworks: NUnit, JUnit, Cucumber, Selenium, Cypress. And everyone should be living in this world where the work we do is in small batches and it goes back frequently and none of it breaks the build. All of it increases the solidity of everything, meaning all tests pass. Hopefully they’re fast. There’s other things we can apply to make them faster without running all of them. But we do that slow ratcheting forwards from the place we first arrived at as we requested for trunk-based development, say a matched set of ways of working that matched our quarterly release cycle. We experienced that and then dialed up the release cadence slowly after that.
Giovani Asproni 00:44:51 Are there any particular strategies that you would suggest for implementing trunk-based development in say greenfield projects or brownfield or legacy ones?
Paul Hammant 00:45:03 I think we covered legacy. You know, be careful, choose a way of working that doesn’t upset everything around your cadences, your batching of work, your work streams, your prioritization. But if we talk about startups, you know, there’s no employees yet. It’s you as CTO and you’re going to do all the hiring and you’re going to align people. You should start with trunk-based development and you should never lose it. And this was Google’s big advance: they didn’t quite start in, say, 96 with trunk-based development. By 98, when they put in Perforce, they had super achieved on this thing and they never lost it. Now you start as you mean to carry on. You hire people, and if they’ve never done it before, you’ll train them. How do we do the batching of work? How do we do the tests? Here’s your first pairing opportunity. This wasn’t that fun. Look, this is a build breakage, let’s do our blameless autopsy, but we’re not live yet. So, we’ve got some runup. So, we craft our skills with trunk-based development. If we think about maybe a startup as well and think there’s front ends and there’s back ends and then there’s related non-real-time services.
Giovani Asproni 00:46:10 You said that for greenfield projects you would recommend trunk-based development every single time. But brownfield and legacy ones would be using maybe GitFlow, or some other, possibly in-house custom, branch-based process. And they may also have lots of manual testing, not a lot of automated testing. So what can these legacy or brownfield projects do to move into a trunk-based development process, but in an incremental way?
Paul Hammant 00:46:46 Yeah, there’s some readiness things, right? We have to speak to the QA group and talk about shifting everyone from manual to automation skills. If we’re a generous company, generous brownfield, we’ll train in working hours using courses, Pluralsight and others, on some of that stuff. We’d give support as people are learning this craft. But that feels to be a long thing to do. You know, it’ll take six months to take somebody from manual to some level of confidence with: I can commit, I don’t break the builds, I see the new techniques. But the other aspect was short term. Let’s say we’re planning this move from another, long-lived branching model to trunk-based development, and we’re counting down to like a cutover moment. If we tried this big bang, like moving five or eight teams from the other branching model to trunk-based, we could have egg on our face, regret.
Paul Hammant 00:47:42 We could cancel it or abandon it after a week, be forced back to the old place. And now with the additional business of the business, because we have downtime in that time. So, a better plan would be to phase an implementation. And I did this in Chicago; it was a bank and they were in ClearCase and another branching model. Quarterly releases, or maybe one release every four months, and we planned between two releases to migrate everyone, ultimately 200 contributors, from the other branching model in ClearCase to, back then, trunk-based development in a monorepo, trunk-based development in Perforce. And what we did was we sequenced the teams in order: we’ll move you on Monday, we’re going to hot train everything in front of you, you have a day of downtime or half a day, we’ll show you some stuff.
Paul Hammant 00:48:31 We’ll add some tests that make things more solid at the same time, but by Wednesday we should be done with you. And I’ve moved to a new team, another team on another module within the same larger application. It was a trading application. And you know, if you have enough time between releases, you can do this migration. It’s a little bit military, but you’ve got time to complete a migration of all the teams from the old place to the new place, from the old techniques to the new techniques, if there’s some top-down encouragement and there’s some bottom-up support for the same. And you also have time, really importantly, to deliver all of the functional commitments you’d made for the same release cadence. Let’s say in this case it was four months. So as long as we achieved all of the functional deliverables that the team previously committed to or regularly did within that interval, and we did all this housekeeping stuff, which was shift to trunk-based development, then we can be left in a good place.
Paul Hammant 00:49:27 But if your migration plan included downtime that was significant and multi-person before, you’d be in the new place of trunk-based development. In that case for the bank monorepo too. If you have the downtime and there’s observation of that downtime from management, business management, there’s the possibility that somebody at some point can’t withhold their better judgment any longer and is going to blow a whistle and say get back to work, which then means go back to the functional deliverables only and do none of this housekeeping stuff. Meaning you’re forced back to the other branching model. You didn’t delete it, it was always there. You just went back to that repo and you carried on or that branch. And nobody in the same organization’s ever going to be able to suggest trunk-based development or any of the related practices or monorepo again, you know, maybe careers changed also as part of this whistle blowing activity.
Paul Hammant 00:50:24 So in summary, choose a methodical way of getting from your place in brownfield land to your better place in baby steps that is concurrently delivered with functional commitments. That way you can kind of avoid the everyone-stop fear that some of us who have been veterans for years have; we’ve experienced many canceled projects, and only canceled because we were late. So, through a methodical approach and a dual track, where we will do functional deliverables at the same time, you can possibly get to your place. And as we did for this bank (this material made it into the continuous delivery book, by the way), I think we went live two weeks early on the four-month plan and the business thought we were geniuses, which we weren’t, and, you know, wanted to know if we could do it again for the next release. So, you know, we were a bit lucky I guess in retrospect, but we achieved all the functional deliverables and the big migration to a better way of working for development, all within the committed release cadence.
Giovani Asproni 00:51:27 Which brings us to a follow-up, which is that changing this model is a big thing in a company. I mean a company that relies on software, because maybe it is what implements a service they sell or is a product they sell to their customers. Obviously, the way software is produced is important. So, what happens if you don’t have support from the top to go into trunk-based development? So, if you have the teams that say, look, we really want to do trunk-based development here. We think it’ll give us advantages, but management is not convinced.
Paul Hammant 00:52:03 Yeah, without top-down support, your bottom-up crew of, let’s say, senior techies that say we must do this, we did it in the last company and the one before that, they’re not going to prevail. You have to convince management. At the same time, you can’t have management figures, you or me in the CTO role, saying we should shift to trunk-based development when none of the practitioners support it as well. So, you need top-down and bottom-up support or you probably shouldn’t do it. Meaning every time I’ve seen one without the other, it’s always been canceled or placidly attended and then, you know, no benefits a year later despite all of the meetings we had about it. So, you need the support before you do it. And that includes management, and that includes business management, not just IT management.
Giovani Asproni 00:52:45 So one of the preconditions to move to trunk-based development is actually commitment to it from all stakeholders in the company.
Paul Hammant 00:52:55 Yes. Now you could do that through: we’re going to talk about this change, we’re going to sell it, we’ll do an offsite, all the management figures are here, we’ll explain the other companies that have done it, and the productivity benefits afterwards that don’t mean people are working twice as hard. And then you’ve convinced some of the people in the room; another group in the same room might just withhold or suspend their better judgment. You know, hopefully they become convinced in the fullness of time because you made zero mistakes against this promise. Certainly in the airline, people that were against the trunk-based development move thought, you know, for the longest time they’d go back to their branching model when, say, ThoughtWorks had left. But when this reorganization or re-sequencing moment came a couple of years later, they were total converts to the idea of trunk-based development and branch by abstraction and feature flags and toggles, because it eliminated an entire round of recrimination around a botched release and a zero-productivity moment as we unpicked merges. So, you know, they withheld judgment and later became converts because of the economics thing, which was, I think I did a speech and I called it hedging, but really it’s options on the orders of release. And they saw it from an economics point of view, when Paul, the passionate techie, only sees it from a Paul, passionate techie point of view.
Giovani Asproni 00:54:15 So you just mentioned economics; that was something that convinced people that it was the right thing to do. So, is this a way you measure the success of trunk-based development? So, when a company says, okay, we’ll go from what we are doing now to trunk-based development, how can we measure that the move to trunk-based development is actually something that is better than what we were doing before?
Paul Hammant 00:54:40 Because you’re moving faster than your competitors. Now that you have this, I’m going to say military, you have this military-organized dev force. I used to do a keynote, not for the big stage, but at corporate developer conferences. The decision that Google made to put in Perforce and choose trunk-based development, which was Craig Silverstein’s decision. He was the first hire; it was 1998. Perforce also went into Microsoft, although they called it Source Depot. And the Perforce sales folks said, you can choose any branching model you like. And Microsoft listened to that and made lots and lots of branches, like ultimately thousands within the Windows repository. And Microsoft has blog entries about how they’ve remediated that and finally gone to trunk-based development, say 2014, 2015, that sort of timeframe. But in all the intervening years, Google was doing trunk-based development in a monorepo and was adding tooling to take away human things that could be automated, add data points, make it frighteningly hard to meet standards to get a commit in.
Paul Hammant 00:55:47 And Google had significant competitive advantage on the productivity per developer from 1998 through to the middle 2010s. I think Microsoft’s closed the gap in a number of their assets. You know, Office, Windows, some of the mobile things. But Google used this economic advantage to totally dominate Microsoft on at least the topic of mobile phones, Android in particular. Before Android came along, iOS was maybe a year old, but the second technology was Pocket PC or Windows CE from Microsoft, and it just disappeared. Microsoft, using their ways of working on their multi-branch model, couldn’t compete with Google’s way of working, trunk-based development, albeit substantially in Git, not Perforce, for the Android team. And there was a mismatch on this competition towards dominance of the second mobile platform after Apple.
Giovani Asproni 00:56:48 So in short, yeah. What you’re saying is that actually using trunk-based development increases, if you like, the flow of features deployed, well, developed over time.
Paul Hammant 00:56:58 Yeah. And but fast pivoting too.
Giovani Asproni 00:57:00 And fast pivoting too. So even changes of direction can be, for whatever reason the product needs to change direction can be actually implemented much more quickly.
Paul Hammant 00:57:08 Yeah. One of my bosses was Jay Bloom, big speaker on the lean tour, and he said, think about it, the fastest moving, highest pivoting organizations are great, but if we shift to, say, Formula One or NASCAR, and the person who’s at the front of the entire pack has somebody that’s threatening them from behind, they spend more time looking in their rear-view mirror, and actually the lap times go down as they’re trying to stop the other person passing them. So, if you have a person, a competitor that’s close to you, you malperform even when you’re in front. The other person doesn’t have enough width on a Formula One track to overtake anywhere. But if those tracks were much wider with fewer turns, the person who’s approaching you from behind would get past you every time. So, your tactics change as an organization.
Paul Hammant 00:57:54 And Jay was pointing out, you know, this metaphor of the person in front looking in the mirror. What you would hope as an organization is there’s no one to look in the mirror at. What you should do is everything you can to make sure that you are so far ahead of your competitors that they never get close to you. But what a complacent, slow developing, non-trunking team is doing is driving at the front because they were first mover, but they’re moving slowly and the competitors are going to streak past you. Not just one, but maybe multiple. There was a case where Travelocity, travel booking forklift of the US, was an award winner, 95, 96, 97, 98, and then Expedia went past them on that leaderboard and a few others did too. You know, you can’t simply be first mover and keep that position. You have to continue to innovate in your engineering workforce. And that doesn’t mean offshore it. That means get better at practice, get better at batch sizes, get better at quality, automate some things away, add a whole bunch of other things you were never doing before in order to make better and faster.
Giovani Asproni 00:59:01 And trunk-based development, from what you say, seems to be one of the important tools for actually being able to do that.
Paul Hammant 00:59:08 Yeah, it’s always there.
Giovani Asproni 00:59:09 I think we’ve done a great job of introducing trunk-based development, but if there was only one thing that you’d like a listener to remember from this show, the most important one. If there is one, which one would it be?
Paul Hammant 00:59:24 Commit little and often and don’t break the builds.
Giovani Asproni 00:59:26 Commit little and often and don’t break the builds. I like that one. I’ll use that one myself. Okay. Thank you very much, Paul, for coming to the show. It has been a real pleasure. And this is Giovani Asproni for Software Engineering Radio. Thank you for listening.
[End of Audio]