Our team is responsible for Small Business Group Quality in Intuit’s flagship product. It’s one of the biggest business units in the company, and our goal is to improve velocity and productivity for every engineer to allow faster release cycles. We run thousands of machines across production and pre-production environments, and that number keeps growing with our needs.
Some of our environments are old, running monolith modules and legacy code that’s hard to maintain. We often find ourselves dealing with issues that occur in those environments, and we’re not able to prevent them before they affect users.
Our main method of detecting errors was through logs, along with some functional tests that we’ve written. Even if we did find the issues within the logs, some of them were not reproducible. Finding, reproducing and solving issues within these areas is a real challenge.
We encountered an issue where critical exceptions kept recurring and we couldn’t find what was causing them. We could think of at least 100 different reasons for them to be thrown, so investigating the error with logs and APM wasn’t practical.
One look at the OverOps dashboard showed us that particular exception, along with the variables that caused it. OverOps immediately identified the exception’s cause when APM tools and log files were no help.
OverOps improved our development team's productivity significantly by giving us the root cause of errors.
We use OverOps email alerts to get real-time notifications whenever an issue occurs. We can now see the errors and exceptions that are thrown, get the variables and the values assigned to them, and identify the root cause in less than 20 minutes.
We use OverOps across multiple environments, including QA, pre-production and staging, so that we can detect an error before it impacts the user. This improves our application’s reliability and helps us provide an outstanding user experience.