Oliver Wyman is a leading global management consulting firm with expertise in strategy, operations, risk management, and organizational transformation.
Production Monitoring Ecosystem
About the Team
Our team is responsible for an application that verifies and aggregates data from automotive OEMs around the world. The data is then analyzed and compiled into an automotive benchmarking study, which serves as a competitive analysis guide based on ~400 plants globally.
Error handling and resolution process
Before OverOps, we used in-house products to monitor and analyze our logs, with custom messages that were sent to us occasionally. However, we still had to sift through the logs manually to find what went wrong, and that wasn't a sufficient solution for the team.
To get more out of our log files, we used log management tools to search for particular key phrases and keywords and to generate reports on an hourly basis. Even so, we couldn't look any deeper into an error or into the state of the JVM to figure out what went wrong.
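As an illustration of the keyword-based approach described above, here is a minimal sketch in Java that counts log lines matching any keyword and groups the counts by hour. It is not the team's actual tooling: the class name `HourlyKeywordReport`, the method `countByHour`, the assumed timestamp layout (`yyyy-MM-dd HH:mm:ss` at the start of each line), and the sample log lines are all hypothetical.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoUnit;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

public class HourlyKeywordReport {
    // Assumption: every log line starts with a 19-character timestamp,
    // e.g. "2024-01-15 10:05:12 ERROR something failed".
    private static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** Counts lines containing at least one keyword, grouped by the hour they were logged. */
    static Map<LocalDateTime, Integer> countByHour(List<String> lines, List<String> keywords) {
        Map<LocalDateTime, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            if (line.length() < 19) {
                continue; // too short to carry a timestamp
            }
            LocalDateTime hour;
            try {
                hour = LocalDateTime.parse(line.substring(0, 19), TS)
                        .truncatedTo(ChronoUnit.HOURS);
            } catch (DateTimeParseException e) {
                continue; // skip lines that do not match the assumed layout
            }
            String lower = line.toLowerCase(Locale.ROOT);
            for (String keyword : keywords) {
                if (lower.contains(keyword.toLowerCase(Locale.ROOT))) {
                    counts.merge(hour, 1, Integer::sum);
                    break; // count each line at most once
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "2024-01-15 10:05:12 ERROR NullPointerException in PlantService",
                "2024-01-15 10:40:00 INFO import finished",
                "2024-01-15 11:01:00 WARN timeout calling upstream");
        countByHour(lines, List.of("exception", "timeout"))
                .forEach((hour, n) -> System.out.println(hour + " -> " + n));
    }
}
```

A report like this shows that something matched a keyword and when, but it carries no variable values or call context, which is the gap described above.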
How does OverOps help you solve issues?
After adding OverOps to our toolkit and making it a part of our workflow, we now rely on it to know what causes production errors and are able to easily fix them. OverOps, along with our existing log management solutions, helps us gain more insight and information about the state of the application when production errors happen.
Now with OverOps, we know the exact application state, the values that were sent, and the user actions at the moment a critical error was thrown, so we can react and fix it immediately.
How are you integrating OverOps with your daily workflow?
We have a weekly release schedule and about 60 yearly releases in production. As part of our daily workflow and release cycle, we get email alerts from OverOps that immediately let us know if something might be off. OverOps helps us dive deeper into the code to understand when, where, and why errors happen.
We’ve integrated OverOps with JIRA and can now open tickets directly from the dashboard. If, after a brief triage, it looks like a new error, we immediately create a JIRA ticket and assign it to the right developer, with a complete reference to the source code and variable state they need to solve it.
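The "is this a new error?" step of such a triage can be sketched in plain Java. This is only an illustration under stated assumptions, not how OverOps or JIRA work internally: the class `ErrorTriage`, the fingerprint scheme (exception type plus the class and method of the top stack frame), and the in-memory set are hypothetical, and no ticketing API is called.

```java
import java.util.HashSet;
import java.util.Set;

public class ErrorTriage {
    // Fingerprints of errors already triaged (hypothetical in-memory store).
    private final Set<String> seen = new HashSet<>();

    /** Builds a fingerprint from the exception type and the top stack frame. */
    static String fingerprint(Throwable t) {
        StackTraceElement[] frames = t.getStackTrace();
        String origin = frames.length > 0
                ? frames[0].getClassName() + "." + frames[0].getMethodName()
                : "unknown";
        return t.getClass().getName() + "@" + origin;
    }

    /** Returns true the first time a fingerprint is seen, false on repeats. */
    boolean isNew(Throwable t) {
        return seen.add(fingerprint(t));
    }

    public static void main(String[] args) {
        ErrorTriage triage = new ErrorTriage();
        Throwable first = new IllegalStateException("plant id missing");
        Throwable repeat = new IllegalStateException("plant id missing again");
        System.out.println(triage.isNew(first));  // a ticket would be opened here
        System.out.println(triage.isNew(repeat)); // same type and origin, so no new ticket
    }
}
```

The fingerprint deliberately ignores the message text and line number so that repeated occurrences of the same failure collapse into one entry; a real setup would likely need a finer-grained key.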