How come there are companies who skip staging and push new code straight to production?
The classic model of developing software applications has several environments: development, integration, testing, QA, staging, and finally production. In practice, particularly today, several of these environments are streamlined or done away with altogether. As the software world has evolved, it’s worth revisiting some central assumptions about the development models we use.
Today, we’re looking at one environment in particular: staging. Many developers are forgoing it anyway and taking on the risks of testing directly in production, so the question is: do we really need staging environments anymore?
With the disclaimer that there is no one-size-fits-all development model, here are five reasons to reconsider your staging environment:
1. Staging environments were designed to solve a need that often no longer exists
Before we really dive into the question itself, let’s look at why staging environments exist in the first place. When software applications were shipped in boxes, there wasn’t a controllable production environment to test or gather data from. To get around this, staging environments were created.
Staging environments were supposed to exactly mimic production environments, in order for you to deploy your app and test it in “real life” conditions. Ultimately, the whole thing is a simulation, but a simulation can still tell you something valuable.
With the advent of the Internet (and error reports), developers were first able to start getting actual data from production. It wasn’t real time and it was passive rather than active, but it still began to chip away at the value of staging, since it was real and true data.
As the world has moved to a SaaS model and always-on connections for software, there is now a rich and full production environment that is often quite available to gather data from. Simply put, the original driver for staging environments no longer exists. Especially as modern developer teams have moved away from the waterfall model where they didn’t even have access to their own production environment.
2. Staging environments require significant upkeep
Today’s deployment environment is fast-paced and continuous. Your fixes may come out monthly, bi-weekly, or even more frequently. There’s often no time to set aside for testing a fix to a production bug in staging, and deploying straight to production is a reality for many developers. However, if you’re not careful, your staging environment can start to slip. That tiny fix you deployed straight to production and forget to later deploy to staging can break something in a bigger version you deploy a month later.
Beyond just version upkeep, staging environments require a careful eye, monitoring, and testing of their own. As they’re supposed to be mirror images of your production environment, staging environments are often deployed on the same servers or some locations as production environments. That creates a risk of leakage, which requires frequent testing to monitor.
3. The data in staging is simply not as good
The main limitation of staging environments, no matter how exactly they mimic your production environment, is that they aren’t handling real, live interactions. When simulating interactions, it can be nearly impossible to accurately simulate what your production environment will go through. High loads, corner cases, weird user actions, and many other scenarios that pop up in production are difficult to generate in staging. And if you’re fortunate enough to be working on a JVM-based language, you can actually get all the debugging data you need in production as well.
Depending on your business, you might not be able to deploy 100% of features in your staging environment anyway. For example, in a financial business, your user interaction is going to be credit card transactions. That’s not something you can simulate, unless you’re willing to set aside some budget. And that brings us to #4.
4. Staging environments are expensive
If you have a staging environment, it can nearly double your costs. To get anything out of staging, you have to make sure your staging environment mirrors your production environment as closely as possible. That can mean doubling the number of servers you have, doubling the bandwidth, and doubling engineering time. This is a challenge that holds true for both startups, where money can be tight, and for large companies, where issues of high scale come in.
5. Staging environments are hard to scale
One of the biggest challenges in development is building environments that can handle high loads and scale that can come in production (if you’re lucky). Staging environments are no exception. Systems at high scale behave differently than those at small scale, even when everything else is the same. There are limits to what you can achieve with a staging environment solely because of scaling challenges. Simulating the scale of your production environment is often unfeasible or simply impossible. There are obstacles around physical infrastructure, traffic loads, and types of interactions, none of which are easy to overcome.
So to recap, staging environments are expensive, difficult to get accurate data from, and based on an outdated need. Does that mean you should do away with them altogether? To answer that, you have to answer two questions first. One, are you able to accept some loss? Not every app can afford to take the level of risk or losses that can come from testing completely in production. For example, if you’re building an app for an operating room for patients, you can’t exactly test in production, and no loss is acceptable. In such a case, you absolutely will need a staging environment.
The second question is, are you able and willing to test completely in production? Full production testing is possible for many apps, but it requires planning, good practices, and a set of dedicated tools for the job.
OverOps is a particularly strong tool for those that are ready and willing to test in production (and also for those that aren’t ready OR willing to admit they already do test in production).