The Zen of Continuous Deployment
As an engineering leader you’re expected to make the right organizational and technical choices to enable your team to deliver features faster and safer to your customers. As a matter of fact, you’re probably bombarded by daily ads and sales calls for “that-one-tool-that-will-10x-your-team-output”.
Few of these tools actually work, but if there’s one thing that help our team deliver features quickly and reliably to our customers, it is Continuous Deployment (CD). Continuous Deployment is, put simply, a process to have your application automatically deployed to production as soon as the code change has been approved and merged. But more than that, it’s a mindset that will make your team accountable and empowered to release high-quality software. If you are in a position to influence the processes and tools used in your company, I can’t recommend enough implementing Continuous Deployment as soon as possible. I won’t focus on the tools and processes that we use, but just the benefits for the organization.
What is Continuous Deployment?
At a high-level Continuous Deployment is a set of processes that ensures that once code is green lighted and merged into the codebase, it gets deployed in an automated way (read: without any user input). Most of the time, continuous deployment is associated with trunk based development. It builds on the idea of Continuous Delivery: after new code is merged, the application should be in a state where it can be deployed with user input.
Why Continuous Deployment is crucial for an organization
It’s obvious that we get better at things that we do often. Releasing a new version of your application is fundamentally risky. Defects can arise from, for example:
- Database Migrations
- Business Logic changes
- Configuration changes
- Design & UX changes
Errors and bugs will happen. As an organization it may be tempting to blame failures in these areas as the result of mistakes by individuals or systems. However, issues are very likely not the result of one or multiple people, but the entire process. If you are a CTO/manager I really encourage you to read “The Field Guide to Understanding Human Error” as it outlines common misconceptions.
To come back to Continuous Deployment, making it part of the principles of our engineering organization will force you to make changes in the way you approach errors. Feedback loops around production issues will be tied with code reviews and will encourage people to act more responsibly.
For Software development, here are five key areas that you’ll need to invest in over time to make continuous deployment a success:
- Mature Code Reviews: Your people are the best gatekeepers of what ultimate will be in the hands of your customers. Investing in good code reviews will give you the assurance that code meets a quality bar before it reaches prod.
- Automated Tests: You and the team will need to invest in building tests that cover existing code paths to avoid major regressions. These tests can be build over time and you don’t need 100% coverage to get started.
- Automated Build & Release Process: Automated is key here, no Engineer or SRE should have to manually step in for the build to succeed. Same goes from a new build to a full deployment to production.
- Smooth Rollback: Errors will inevitably happen. The good news is that since you are in a continuous deployment environment, you should be able to gracefully rollback in the same amount of time it took to make the mistake! I won’t go in details as to why, but you should avoid fixing forward at all costs. It’s more powerful to invest in a way to rollback painlessly than scrambling to fix forward. Don’t overlook feature flags as a rollback mechanism either. If you can switch off a problematic code path, you just avoided rolling back your code/infrastructure.
- Monitoring: You can’t improve or fix what you can’t see or measure. It’s as simple as that. If your organization is in the dark, continuous deployment will be a disaster and debugging issues with post-mortem will be challenging. Your monitoring should extend to pre/during/post deployment: from errors all the way to performance metrics.
There are some cases where continuous deployment can’t be implemented. Mission critical systems where a single failure would result in a catastrophic outcome are probably not a great place to implement it… However, I’d argue you could still build test systems where you can achieve a good degree of continuous delivery and even continuous deployment.
For all other places, especially for tech companies, if you haven’t implemented a flavor of continuous deployment yet, I strongly urge you to consider it. Paradoxically, knowing that code can reach production at any point in time, will over time become a reassuring thought. Whatever happen in production will be able to be dealt with the discipline and training that your team will have acquired through hundreds of deployments. Don’t bet on luck, bet on repetition.