Before we begin, the best way to debug a program is to not have bugs to begin with. Though we’re only human and we’ll certainly mak mstakes, the best way to avoid bugs is to adhere to the problem solving framework. Remember that the programming battle is in the Understand and Plan phases. The more completely and correctly you complete those phases, the fewer bugs you’ll have when coding it up.
That said, let’s talk about what to do when the inevitable happens.
This is one of the most important things about being a developer: have a mental model of computation.
That is, you should be able to read code and know what’s going to happen.
def foo(n):
= 0
i
while i < n:
= i + 1 + (i % 2)
i
print(i)
5) foo(
Read the nonsense Python code, above. Mentally compute the output. Then see if you’re right. (I wrote the code, and it still took me a solid minute to mentally model the answer. But I was right!)
If you cannot “run” the code in your head, you cannot debug. Yes, I’m going that strong. I’m sure some people disagree with me, but I want to drive home how important this is.
Debugging is the art of finding the part of the code where your mental model of the computation and the reality of the computation diverge. And then fixing it.
If you don’t have a mental model, you don’t have anything to compare against and you’ll make little progress.
How do your improve your mental model of computation?
Study code. Trace through it. There’s a ton of code out there to practice with, e.g. on GitHub, on HackerRank, your peers’ codebases, your own stuff that you wrote four months ago and have forgotten how it works, etc.
Predict the output. As you learn the code, try to predict how it will behave when run.
Manually trace a run. Use a whiteboard to manually track what values variables take on, what functions get called, and what line of code is executing.
Write a specification. Study some code and “reverse engineer” it. Figure out what it does, then write a human-readable spec that perfectly describes the algorithm or codebase to the degree that a reader could reimplement it from scratch.
Single-step through with a debugger. Have the computer show you how the program is flowing. We’ll talk more about that, below.
You’ll definitely improve this skill with practice.
First things first: see if you can get the bug to happen consistently. Being able reproduce (“repro”) the bug is the first step in being able to squash it.
Sometimes this is actually the hard part. You saw something go wrong once (or someone reported they saw it), and now you can’t repro it.
At this point you might be tempted to utilize a programming technique known as “prayer” in that you’re praying that either you didn’t really see it or the bug will never rear its ugly face again in this world. But here’s the unfortunate truth: if someone saw this rare bug just once in testing, thousands or millions of people will see the bug when it goes into production. Murphy’s Law and statistics? You have no chance against that.
If you can’t repro it, you’d just be shooting in the dark trying to fix it. Your goal at this point is to just get it to happen at all. Work backward logically. What must have happened in the program flow in order to see what the bug reporter saw? Look there first. Even if you’re sure some condition can’t possibly be true, if it must have been true to see the bug, then it must have been true! Keep tracking back, looking for where your mental model and the code diverge and use that to repro it yourself.
If you can only sporadically repro it, you have a chance, but it’s going to be hard work. Your goal is to get it to repro consistently. This saves you spending forever trying to get a rare repo. If you could make it happen consistently, that’s a big time saver, and it helps you narrow down where the bug can be.
Once you find how to repro it consistently, now you can get even more systematic about things. You want to find the minimum number of steps that can cause the bug to manifest. For example, you’re playing a game and you find a bug when you complete 20 laps around the race course and then drive into a tree. Maybe try just driving into the tree first. If you’re lucky, you saved yourself 20 laps and narrowed the bug down to the tree. If nothing happens, you know that the 20 laps is an integral part of the bug. Maybe try a single lap followed by the tree. Does that repro?
Getting the minimum number of steps not only helps you narrow down even further where the bug is in the code, but it makes it easier to test fixes because you don’t have to spend so long reproing the issue.
There is a bug somewhere. You know that because when you give your code some input, and it cranks away at it, eventually it gives you unexpected output.
Somewhere in that big process the reality of the computation diverges from your mental model of the computation.
At first, all you might know is that somewhere in 10,000 lines of code something went wrong. So you’ve narrowed it down to that. Your mental model said that if the input was 2
, the output would be 3490
. And instead the output was 299792458
.
Therefore you know the bug is somewhere between the input and the output.
You could just start changing random things in the code to see if it gets fixed. This is sometimes called shotgun debugging or prayer debugging, and it very, very, very, very, very rarely works. Way more often you’ll just mess things up and make the problem even harder to find. It’s like trying to fix your car’s electrical issues by randomly adding and cutting wires. It’s not even worth debugging this way.
And yet despite that, it’s a very common technique practiced, in vain, by students worldwide. You, however, should not use it.
Instead, it’s time to be systematic. Somewhere in that pipeline of computation the intermediate computed values diverge from your mental model. Your job is to find out where.
So you start probing inside the program at various points to see where things go off the rails. Binary search is great—jump to the middle of the process and examine the values during a run (see below). If they are as expected, the bug must therefore be between the middle of the program and the end! You’ve just halved your search space. Now do it again until you narrow it down far enough to see the bug.
I would contend, though some might disagree, the bug is not found until you understand it. That is, you must understand exactly how your program was giving the output 299792458
instead of the expected 3490
. Gaining that full understanding has a number of benefits:
Once you understand how the wrong output was produced, then decisively and correctly fix the issue, and know why the fix will work.
Finally, if you’re just filing a bug report (i.e. someone else will fix it), being able to give them the minimum steps needed to repro will make you their hero for the day. Consider it from the reverse perspective; would you rather fix a bug with a vague, long sequence of steps to repro, or one with a few steps that caused it to repro every time? The more specific things are, the happier the bug-fixer is, whether that’s you or someone else.
The good old-fashioned standard way of probing software in the middle of a run is called print debugging, or, if you’re a C programmer, printf debugging (pronounced “print eff”).
This is basically just tactically placing print statements inside your code to see what state your program is in.
There are a couple common uses of this:
Print anything at all to see if some part of the code actually executes.
print("A")
= foo()
x
print("B")
if x == 3:
print("C")
*= 2
x else:
print("D")
bar()
print("E")
Notice that when I run the code, I can see how far it gets before a crash, and I can determine if x
were 3
or not.
Print some specific values to see what they are.
Here’s an example where we’re getting data from some sensor in a loop for processing. We suspect that some of the data is wonky (maybe the sensor is busted) so we’re printing it out to see what we get.
while not done:
= get_sensor_data()
data
print(f"Got sensor data: {data}")
process_sensor_data(data)
= data < 0 done
Don’t f---ing use profanity in your debugging statements. Murphy’s Law says that if you do use profanity, you’ll forget to take it out, and even though it was in some part of the code that you’re certain will never run, it will inevitably pop onto the screen while you’re doing a client demo with your boss on the day he’s assessing you for a raise.
I know as well as anyone how infuriating programming can be. And when I’m feeling that way and forget to take a deep breath and recenter, I print this:
print("AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA");
Not only does it stand out nice and clear on the screen, but it’s really easy to type in frustration and helps dispel some of my bad energy. And if the client sees it, it’s a minor transgression.
Now, print debugging is kinda frowned upon as a lesser means of debugging compared to using a real debugger (as in the following section). But everyone does it at some point or another, and some people even swear by it.
The place that I think it really shines is when you need to gather a lot of data about the run to see a larger pattern emerge, or when you need to catch an infrequent event. If something happens one run in 10,000, single stepping through with a standard debugger is going to take forever. You can add some print statements and script a run 10,000 times and watch the output to see when it manifests.
One thing to watch out for is that if you’re printing a lot, it can be tough to visually parse the output, and error messages might be lost in it. I’d recommend redirecting the output to a file and then bringing it up in an editor to search.
And, finally, don’t forget to remove all your print statements before you ship your work!
Debuggers are tools that help you find bugs. There are many different debuggers, but virtually all of them share a common set of features. The main features are:
Additional features that are common are:
Rarer are time-travel debuggers. In addition to allowing you to step forward through your program, they allow you to step backward, as well! This is great if you step past the bug by accident and want to step back to see it.
All major IDEs35 have debugger functionality. (It’s part of what’s “integrated”, the “I” in “IDE”.) There are also standalone debuggers that you can run. And all mainstream languages have some kind of debugger support.
As you might imagine, with those features, debuggers are really powerful.
If you suspect a bug in function foo()
, you can set a breakpoint there, run the code, and then get control of the debugger when foo()
executes. Then you can step through it a line at a time, looking at how the values of variables change. And there’s no need to add any print statements.
Note that in VS Code, getting your debugger set up might be trivial, or you might have to edit some esoteric JSON files to get it going.
In any case, learning to use a debugger is a valuable skill that can save you massive amounts of time while you’re trying to track down that pesky gremlin in your code.
What is the mental model of computation and how is it important for debugging?
Why is finding a bug’s minimum reproduction case an important step in fixing that bug?
How do you narrow down where a bug is in your code?
Contrast print debugging versus using a debugger. What do you think the relative strengths and weaknesses are?