The One Thing That Is Repeatedly Breaking Your CI/CD Workflow

2nd Nov 2017


What’s the missing link in a complete CI/CD toolchain and how can you add it to your workflow?

Companies and teams want to move fast. This means releasing frequently, constantly updating the product, and keeping team members on their toes about new and relevant technology. These needs led to the rise of continuous integration and continuous delivery practices.
The current widespread understanding of the CI/CD cycle adds a lot of automation to the build, test and deploy stages, but it misses a critical step in a complete release cycle. In this post we’ll look at why the CI/CD cycle doesn’t end after deployment, and why it’s important to add automation to your monitoring practices. Let’s check it out.

The missing link in the CI/CD toolchain

The common view of the CI/CD workflow holds one of the biggest misconceptions in software engineering. If so many teams are experiencing the same type of issues when code hits production, there must be something fundamentally broken in the way teams are doing automation and CI/CD.
To dig further into why this happens, we spoke with hundreds of development teams over the last few years and found some common ground as to why they’re experiencing negative effects as they move faster.
The main problem is that, knowingly or unknowingly, people tend to view CI/CD as a cycle that starts with code commits and ends when new code is deployed to production. Teams assume that the automation stops once the code is out in the wild. That assumption is wrong.
Monitoring is an inseparable part of the CI/CD cycle, and automated deployments require smarter monitoring. You want to know when a release introduces new errors without relying on user reports, and to have all the information you need to fix them.
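To make the idea of spotting errors that a release introduces a little more concrete, here is a minimal Python sketch. It is not how OverOps or any specific tool works. It assumes that your error events are already tagged with a release label and a stable "fingerprint" (for example, exception type plus the code location that threw it). The `ErrorEvent` type, its fields and the `new_errors` function are hypothetical names made up for this illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorEvent:
    # Both fields are assumptions for this sketch, not a real tool's schema.
    release: str       # label of the release that produced the error, e.g. "v42"
    fingerprint: str   # stable key for the error, e.g. exception type + location


def new_errors(events, previous_release, current_release):
    """Count errors seen in current_release whose fingerprint never
    appeared in previous_release."""
    seen_before = {e.fingerprint for e in events if e.release == previous_release}
    return Counter(
        e.fingerprint
        for e in events
        if e.release == current_release and e.fingerprint not in seen_before
    )


# Hypothetical sample data: one error carried over, one introduced by v42.
events = [
    ErrorEvent("v41", "NullPointerException@CartService.total"),
    ErrorEvent("v42", "NullPointerException@CartService.total"),
    ErrorEvent("v42", "IllegalStateException@Checkout.pay"),
    ErrorEvent("v42", "IllegalStateException@Checkout.pay"),
]

introduced = new_errors(events, "v41", "v42")
for fingerprint, count in introduced.items():
    print(f"new in v42: {fingerprint} ({count} occurrences)")
```

Running a comparison like this automatically after every deployment is one way to learn about a regression before users report it.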
Even Mark Zuckerberg had a change of heart and workflow, and in 2014 Facebook changed its motto from “Move Fast and Break Things” to “Move Fast With Stable Infra”. The social network needs to make sure that the application and its features are stable, without the risk of breaking in production.

How is it even possible to automate root cause analysis?

Engineering teams at enterprises and startups that overcome CI/CD obstacles are doing so by building a strategy across all stages of the software release cycle, from building and testing through deploying and monitoring. When Comcast’s engineering team faced the challenge of debugging their flagship X1 XFINITY platform after releases, it was critical for them to be as efficient as possible. To learn more about how Comcast introduced an automated error resolution strategy into their workflow, join our webinar.

The hidden costs of CI/CD

Let’s try to understand why so many engineering teams are facing challenges after code is actually deployed. A CI/CD methodology can help your team innovate faster and continuously release updates, as well as offer a better product for your users. However, if you’re not adding automation to your monitoring process, this cycle can come with hidden costs that might break your workflow.
These costs can sometimes be overlooked until it’s too late, and it’s important to acknowledge and understand them in advance:

1. Increased rate of production errors

Expect the unexpected. Even the most thorough testing, staging and QA process lets errors slip through the cracks. User reports remain the biggest source of information about errors, which makes the error resolution process reactive. CI/CD speeds up the rate of change, so code breaks more often.
Business outcome: bad user experience and lost users. Even a few minutes of failed transactions can cost hundreds of thousands of dollars.

2. Reduced staff efficiency

Developers already spend 20-40% of their time debugging. Beyond the impact on application and service quality, engineers spend an increasing percentage of their time debugging software instead of building new features. Trying to move faster often results in the opposite outcome.
Business outcome: loss of productivity, employee churn and uncontrolled spending.

3. Bad releases stop the deployment train

Outdated production monitoring practices that rely solely on logging and performance management tools often stall a CI/CD process for days. An informed production error handling strategy needs to be put in place to enjoy the benefits of fast-paced innovation while mitigating the associated risks.
Business outcome: delays in product roadmaps and slow time-to-market. Missed deadlines and managerial overhead.
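One small, automatable piece of such a strategy is a check that decides whether a fresh release looks bad enough to pause further rollouts. The Python sketch below is only an illustration under stated assumptions: the function name `should_halt_rollout`, its parameters and the default thresholds are all hypothetical, and real thresholds depend on your traffic and your tolerance for risk.

```python
def should_halt_rollout(
    baseline_errors,
    baseline_requests,
    release_errors,
    release_requests,
    max_ratio=2.0,      # assumed tolerance: halt if the error rate more than doubles
    min_requests=100,   # assumed minimum traffic before judging the new release
):
    """Return True if the new release's error rate is clearly worse than
    the baseline's, based on simple request and error counts."""
    if release_requests < min_requests:
        # Not enough traffic yet to make a call either way.
        return False

    release_rate = release_errors / release_requests
    baseline_rate = baseline_errors / baseline_requests if baseline_requests else 0.0

    if baseline_rate == 0.0:
        # The baseline was clean, so any error in the new release is a regression.
        return release_errors > 0

    return release_rate > max_ratio * baseline_rate


# Hypothetical numbers: baseline at 0.1% errors, new release at 0.5%.
if should_halt_rollout(10, 10_000, 5, 1_000):
    print("Error rate regressed: pause the deployment train and investigate")
```

A check like this only tells you that something is wrong. You still need the details of each error to find out why it happened.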

With fast deployments comes great responsibility

Many companies have adopted or are in the process of adopting CI/CD methodologies as part of their workflow to innovate faster. Quick time-to-market is more than a nice-to-have ability; it’s the cornerstone of a successful company.
Successful organizations that withstand the growing pains of CI/CD are getting ahead of their competition. This practice leads to high-performing engineering teams having a bigger impact on their company’s bottom line, and the satisfaction of their team members increases along with their productivity.
It’s also important to note that the CI/CD cycle puts a lot of stress on companies and teams, pushing them to innovate at an ever faster rate. Everyone on the team is responsible for their own code and has to make sure it meets the standards needed for it to be pushed into production.
After building, testing and deploying, it’s time to think of the next step in this process: monitoring your application. You need to know as soon as an error is introduced into your application. With OverOps, you’ll get the complete source code and variable state across the entire call stack for every error, exception or bug, as soon as they’re introduced into the application.

Final thoughts

CI/CD doesn’t end when code is deployed, and everyone on the team is crucial to the success of this workflow, taking responsibility for their code even after it’s deployed.
If up until now CI/CD was all about building, testing and deploying, we need to add monitoring to that mix and make sure nothing breaks in production, unless you want your developers to waste most of their time on debugging.

Henn is a marketing manager at OverOps covering topics related to Java, Scala and everything in between. She is a lover of gadgets, apps, technology and tea.
