How to Make Debugging Easier with Rollbar
Fixing software errors is more like a collection of haphazard steps than a proper process. Sure, you do extensive testing and QA and you’re continually monitoring your app’s stability, but bugs slip through. That’s OK. It’s inevitable. No matter how much you test, review, and monitor, there will always be unforeseen issues.
There are, however, better ways to get ahead of errors. With the right tools and preparation, development teams can be proactive about remediating errors before they impact users — and their businesses. The real culprit is that as software development has evolved — with cloud-based apps, more frequent releases, progressive deployments, etc. — the way developers deal with bugs hasn’t changed at the same pace.
This is a guest post from Christopher Seaman, Director of Product Management and Growth at Rollbar, a service providing AI-assisted workflows for predicting, reporting and remediating software errors.
The Current State of Debugging Is Lacking
If you ask any developer what their favorite part of their job is, they may well say “building awesome features”, or something similar. But many feel like they aren’t truly able to focus on doing that because they’re constantly “on call” to fix bugs. Fixing problems is part of the role, they’d agree, but the process is just too inefficient and takes too much time. Too often, errors aren’t surfaced until users and customers encounter them.
Even when you do get an alert that something's up — hopefully not at 3 a.m. — you still need to investigate exactly what’s going on. So, you dig into the logs and the APM. It takes a long time because there’s a lot you need to comb through. But you’re not quite sure what you’re supposed to be looking for, because the error alert didn’t give you enough context to quickly pinpoint the cause.
Not having the right information when alerted forces dev teams to spend a lot of their time investigating issues instead of fixing them. It also creates a reactive system which can lead to unhappy customers, deflated developers, and a business that’s stuck on root cause analysis instead of fixing bugs quickly and getting back to innovating.
Start With More Intelligent Error Alerts
The first step in being proactive about remediating errors is better error signal accuracy. Today, there are a number of error monitoring solutions that group errors with the same root cause together. That’s helpful because you may see hundreds of errors that all stem from the same cause, or only ten errors that each require a different fix. Grouping helps you prioritize what you need to do.
The problem is that many grouping engines rely on hard-coded rules, so they recognize only a limited set of error patterns. This causes both “false negatives”, where different, unique errors are grouped together but shouldn’t be, and “false positives”, where it seems like there are numerous errors but they are, in fact, caused by the same issue. Many APM solutions don’t group errors at all.
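To make the limitation concrete, here is a minimal Python sketch of rule-based grouping. It is an illustration only, not how Rollbar or any particular product groups errors: the rules, function names, and sample errors are all hypothetical. Each error is reduced to a fingerprint built from its type, a normalized message, and its top stack frame, and errors with the same fingerprint land in the same group.

```python
import hashlib
import re
from collections import defaultdict

# Hypothetical hard-coded rules: replace variable parts of a message
# with placeholders so that similar messages compare equal.
NORMALIZATION_RULES = [
    (re.compile(r"0x[0-9a-fA-F]+"), "<hex>"),
    (re.compile(r"\d+"), "<num>"),
]


def fingerprint(exc_type: str, message: str, top_frame: str) -> str:
    """Return a short, stable identifier for an error occurrence."""
    for pattern, placeholder in NORMALIZATION_RULES:
        message = pattern.sub(placeholder, message)
    key = f"{exc_type}|{message}|{top_frame}"
    return hashlib.sha1(key.encode("utf-8")).hexdigest()[:12]


def group_errors(errors):
    """Group (exc_type, message, top_frame) tuples by fingerprint."""
    groups = defaultdict(list)
    for exc_type, message, top_frame in errors:
        groups[fingerprint(exc_type, message, top_frame)].append(message)
    return dict(groups)


if __name__ == "__main__":
    # Made-up sample data: three occurrences of the same underlying bug.
    sample_errors = [
        ("KeyError", "user 1041 not found", "accounts.py:88"),
        ("KeyError", "user 2307 not found", "accounts.py:88"),
        ("KeyError", "user alice@example.com not found", "accounts.py:88"),
    ]
    for fp, messages in group_errors(sample_errors).items():
        print(fp, len(messages), messages)
```

In this sample, the two messages containing numeric IDs collapse into one group, but the third lands in a separate group because no rule anticipates an email address in the message. That is the kind of “false positive” described above: one root cause reported as two distinct errors until someone writes another rule.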
The solution is to have a continuously learning grouping engine. Built with machine learning, it can be trained on each new error type it encounters, so its error signals become more accurate. And, over time, it can better recognize errors similar to those that have happened before and distinguish them from new, unique errors.
Trusting the alerts you get reduces the time you need to spend investigating, allowing you to start resolving the errors sooner.
Fix Errors Faster With Immediate Code Context
Intelligent alerts are just one piece of the puzzle, though. Having an error response solution that integrates directly with your source code repositories to give you all the code context you need is also key — and helps you get to the root cause much faster.
Better code context means that you should be able to do the following quickly and easily:
- See who was the last person to edit the line of code that caused the error.
- See exactly when the error occurred alongside the last time that line of code was updated.
- See the history of the code, including why and how the code was added.
- Assign the error to the most recent author, if needed.
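As a rough illustration of where this kind of context can come from, the Python sketch below asks `git blame` for the last author, commit time, and commit summary of a single line. It assumes the failing file lives in a local Git checkout and that you already know the file path and line number from a stack trace. The function names are hypothetical, and this is not a description of how Rollbar's integration works internally.

```python
import subprocess
from datetime import datetime, timezone


def parse_blame_porcelain(output: str) -> dict:
    """Parse `git blame --porcelain` output for a single line."""
    lines = output.splitlines()
    # The first line starts with the commit hash.
    info = {"commit": lines[0].split()[0]}
    for line in lines[1:]:
        if line.startswith("\t"):
            # The source line itself is prefixed with a tab and ends the entry.
            info["code"] = line[1:]
            break
        key, _, value = line.partition(" ")
        info[key] = value
    # author-time is a Unix timestamp; convert it to an aware UTC datetime.
    info["author_time"] = datetime.fromtimestamp(
        int(info["author-time"]), tz=timezone.utc
    )
    return info


def blame_line(path: str, line_no: int) -> dict:
    """Run git blame for one line of a file in the current repository."""
    result = subprocess.run(
        ["git", "blame", "--porcelain", "-L", f"{line_no},{line_no}", "--", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_blame_porcelain(result.stdout)


def changed_before(info: dict, error_time: datetime) -> bool:
    """True if the line was last edited before the error occurred."""
    return info["author_time"] <= error_time
```

For example, `blame_line("accounts.py", 88)` (a made-up path and line number) would return a dictionary whose `author`, `author-mail`, `summary`, and `author_time` entries cover the first three bullets above. Comparing `author_time` with the error timestamp via `changed_before` hints at whether a recent edit is a likely suspect.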
Now, you’re actually fixing issues faster, decreasing your mean-time-to-resolution (MTTR) significantly. You don’t have to bounce back and forth between tools like Jira and others as much. Additionally, you can automate issue tracking by assigning tickets to the right person to resolve the issue, eliminating previously manual tasks.
The goal is not only to spend less time fixing issues but also to catch them before users do, or before they cause even bigger headaches. Being proactive means setting up the right tools and systems to get the information you need to begin remediating errors faster. Spending less time dealing with bugs means you can focus on actually improving your code.
Rollbar is a service implementing the features detailed in this article, helping you deal with errors in your application efficiently. If you find this interesting, you can learn more about Rollbar and try it out for free.