If you don’t understand a problem you’re not allowed to fix it

Steven Griffith
Aug 10, 2019

Let me repeat that: “If you don’t understand a problem you’re not allowed to fix it”

For a lot of my career I was known for my talents in debugging, reading terrible code and understanding it, and fixing bugs. I didn’t have the heavy algorithm and intense math background my coworkers did but I was curious, enjoyed puzzles, and loved to refactor code. (I still deeply enjoy refactoring code). I have new strengths now and I’ve grown a lot, but these were the building blocks of my programming journey.

We’ve all been in that situation. We’re perplexed by something and we’re trying things out without knowing exactly why: “throwing spaghetti at the wall and seeing what sticks.” We’re logging variables, setting breakpoints, and so on, trying to see the state of things and where it went wrong.

Here’s where it goes off the rails. You see a situation that’s causing the problem. Maybe it’s a known object getting passed in with incorrect properties, maybe it’s a null value where there shouldn’t be one, or maybe you can’t figure out why a specific branch is getting hit when it shouldn’t. You see the immediate issue and you get the urge to just fix it. You want to check for the null value and just return early. What could go wrong? You want to just check the properties and handle the one out-of-whack instance you’re finding. What could go wrong?

The problem is that you don’t know WHY that is happening. It’s not enough to figure out how to bandaid it; you have to actually understand the problem. Why is the object different? Why is there a null value where there shouldn’t be? Why? You don’t understand what is causing it. That means you ARE NOT READY TO FIX IT YET. That’s right. Get your fingers away from that keyboard, you’ve still got investigation to do.
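To make that concrete, here is a minimal sketch of the difference between patching the symptom and fixing the cause. Everything in it is made up for illustration: the `Order` type, the `parse_discount_*` functions, and the idea that a blank form field is where the null value comes from are all assumptions, not code from any real system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Order:
    subtotal: float
    discount: Optional[float]  # should never be None by the time we price it


def parse_discount_buggy(raw: str) -> Optional[float]:
    # Hypothetical root cause: a blank form field silently becomes None.
    return float(raw) if raw.strip() else None


def total_bandaid(order: Order) -> Optional[float]:
    # Symptom patch: the crash goes away, but now some orders have no total
    # and nobody knows why discount was None in the first place.
    if order.discount is None:
        return None
    return order.subtotal - order.discount


def parse_discount_fixed(raw: str) -> float:
    # Fix at the source, once the cause is understood: blank means no discount.
    return float(raw) if raw.strip() else 0.0


def total(order: Order) -> float:
    # Fail loudly if the "discount is always set" rule is ever broken again.
    if order.discount is None:
        raise ValueError("order.discount must be set before pricing")
    return order.subtotal - order.discount
```

The early return in `total_bandaid` makes the error disappear, but it just moves the null value one step further downstream. The second version is only possible to write after you can answer “why was it null?”.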

It’s not enough to fix a symptom. Sometimes when you do understand the problem, you still end up doing what you would have done anyway, and that’s fine. But you need to know what is causing the problem, you need to be able to explain it to someone else, and you need to be able to explain why your fix was the right approach. This is what doing your best means. This is your responsibility.

There are a lot of reasons you might reach for that quick patch. Sometimes management is essentially “standing in the door waiting”. You have a responsibility to them to do your best. Throwing in a hack is not your best, and I can guarantee you that in most situations all the time you are “saving” is still going to be spent on this problem. You’re going to find another case you didn’t expect, you’re going to see side effects pop up from the patch, the situation is going to behave a little differently in production or in a multi-user environment, or you’ll hit another of the many possibilities out there. It’s perfectly acceptable to report that you don’t understand the problem fully yet. In fact, it’s your responsibility to your team, and that includes your management.

Sometimes you’re in that “hot patch” environment. Help test bugs sort of thing. I worked in an industry where it was just the nature of the beast to be adding features to, or fixing bugs on, a short-lived live ordering site while it was running. This was an intense situation for sellers and buyers, and the needs were insanely “right now”. The clients pressured the business, and the business pressured the programmers.

It took quite some time before I realized I was failing my team by nodding, accepting ideas for that quick fix, and/or just hacking in a patch. It was my job to make sure I understood and could properly fix the problem. It was my job to keep myself from being pressured into the quick fix. The management didn’t understand what that meant, and I never conveyed the danger of it to them. Had I actually described the impact or possibilities I’m sure that they wouldn’t have wanted it that way. I can’t think of a situation where it didn’t end up with me (or worse, someone else) eventually spending more time on the issue.

Sometimes you feel like you’re taking too long on a ticket, or the code review suggestions come in and you put yourself in the bug fix mindset because you were “already done” (FYI: the code isn’t done until the review is done). You start looking for the quickest route and stop asking questions. You don’t want to ask for help after spending so long, or perhaps you can’t figure out the right questions because you don’t understand enough yet. Programming is a team sport. Ask for help.

Sometimes you’re going to get suggestions from people when you explain the side effect, even though no one understands it yet. These may or may not be good, but it’s still your responsibility to understand before you implement the suggestion. It’s acceptable to preface your explanation with: “I’m still trying to understand the issue so I’m not looking for help with any solutions. I just want to update you with my progress”. A lot of times it is part of the listener’s job to remove blockers or help you stay productive, so it’s natural for them to try to help find a fix. Many times it’s someone technical, which means they naturally want to solve problems. I find it’s better to be open to input from technical people but still say “That’s a good idea, I’ll look into that. I want to understand exactly what’s happening first”. But nothing gets you off the hook from understanding.

You have to understand the problem when you are reviewing code too. If you see something that looks like a quick fix then you can ask. If you don’t understand the issue, and the publisher didn’t leave a good enough explanation, then you can’t continue the review. It’s always acceptable to ask the person to walk you through the issue and why the solution fixes it. If you are the publisher it’s your responsibility to try and explain the problem, solution, and reproduction steps the best you can in the pull request or before the review begins. If you can’t do this then you’re not finished.

Your final task is related specifically to bug fixes. It’s important to write test coverage for edge cases that are hard to understand. You must do everything in your power to test the case that you fix, even if you already have 100% coverage without adding any new tests. If it fooled everyone the first time then tests will both ensure that the fix remains and document the problem for future developers. This is mostly a problem if you test but don’t do TDD. If you don’t currently test anything then this edge case is the perfect time to start adding coverage. It’s simple to drop in a test suite.
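As a rough sketch of what that can look like, here is a small regression test using Python’s built-in `unittest`. The `parse_discount` function and the blank-field bug it guards against are hypothetical, invented only to show the shape of such a test.

```python
import unittest


def parse_discount(raw: str) -> float:
    """Hypothetical function under test: a blank field means no discount."""
    return float(raw) if raw.strip() else 0.0


class ParseDiscountRegressionTest(unittest.TestCase):
    def test_blank_field_means_no_discount(self):
        # Regression: a blank field used to come back as None and break
        # pricing further down the line. (Made-up bug, for illustration.)
        self.assertEqual(parse_discount(""), 0.0)
        self.assertEqual(parse_discount("   "), 0.0)

    def test_numeric_field_is_parsed(self):
        # The normal path still has to work after the fix.
        self.assertEqual(parse_discount("12.5"), 12.5)
```

Saved as a file, this runs with `python -m unittest`. The comment in the first test is doing the documentation job: it tells the next developer what went wrong and why the odd-looking case matters.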

There are only two things that can stop this from happening. First … brace yourself … your ego. If your ego is an issue then it will be very hard to be a professional and a good team member. You’re probably going to end up job surfing and possibly working a lot of hours as a “rock star” or a “ninja”. Good luck to you. I wish you many glories.

The second is that management overrides you. This happens sometimes, and that is fine as long as you’ve done your part by explaining yourself. They have the business targets in mind, and if you’ve explained the risk and that you don’t understand the problem, then it is their choice. You’ve done your part; just make sure you document the situation. In some situations where there is real danger as a result of your actions, you may need to find other avenues to report through, but I’ll consider those edge cases and assume you know if that’s you.

At the core, we are problem solvers. We have a deep desire to understand problem spaces and build solutions, and we are lazy. With the right mindset, debugging and bug fixing can be fun and rewarding. All it takes is an active choice and a little diligence.

Steven Griffith

I was a software engineer for right around ten years before transitioning into management. I’m still growing in my new field.