In software development, everyone seems to be concerned about high-performing teams and not breaking things. Sure, none of us wants to make a change that takes down production (though I am sure more than one of us has done it!). Making a change that breaks something, though, is a great way to learn new things about your code. It also gives you a great opportunity to make improvements. If breaking things leads to improvements, breaking things is good, right?

How we broke something and made it better

Recently I was working with a team on a codebase they were unfamiliar with. It’s a complicated monolith with a shared service layer. Despite how that sounds, it has a pretty mature CI pipeline and an “always deployable” main branch. That doesn’t mean it’s invincible, though. This team found that out the hard way: a custom test tool failed to run because of changes they had made to the shared service layer.

To make matters worse, the failure was only discovered at a late stage in the CI pipeline: the stage that prepares the deployment to production. You guessed it, it was blocking production deployments. Nobody wants to be the person responsible for blocking a production deployment! Yet thanks to that team breaking it, no one else will have to be in that position again.

While the team that made the breaking change were fixing it, the team with overall responsibility for the pipeline also jumped to work. They introduced new tests that run earlier in the process to catch changes that break the test tool, so these issues will be identified before a merge in future. They also improved the later stages of the CI pipeline by introducing more thorough checks that run against builds of main after each merge. These checks raise an alert sooner if there is an issue with main, which gives time to resolve it before a production deployment.

Now the teams are in a position where the production deployment preparation can be simplified further. As a result, they will be able to deploy more frequently. Not a bad result for someone breaking something!

Conclusion

Some things never change, and breaking things in production is still bad, especially when there is significant customer impact. Yet breaking things, learning from the experience, and delivering improvements as a result? That’s what makes you a high-performing team.

Further reading

Want to read more of my stories about software development?

How about the time we reduced lead time to deployment by 3 weeks?

or How you can improve your code by using a rubber duck?

or, for more stories from other people at Zoopla, check out the Zoopla.blog
