Debugging Basics

Statistical Computing, 36-350

Monday October 17, 2016

Bug!

The original name for glitches and unexpected defects: dates back to at least Edison in 1876, but a better story comes from Grace Hopper in 1947:

(From Wikipedia)

Debugging: what and why?

Debugging is the process of locating, understanding, and removing bugs from your code

Why should we care to learn about this?

The truth: you’re going to have to debug, because you’re not perfect (none of us are!) and so you can’t write perfect code
Debugging is frustrating and time-consuming, but essential
Writing code that makes it easier to debug later is worth it, even if it takes a bit more time (lots of our design ideas support this)
Simple things you can do to help: use lots of comments, use meaningful variable names!

Debugging: how?

Debugging is (largely) a process of differential diagnosis. Stages of debugging:

Reproduce the error: can you make the bug reappear?
Characterize the error: what can you see that is going wrong?
Localize the error: where in the code does the mistake originate?
Modify the code: did you eliminate the error? Did you add new ones?
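
A minimal sketch of the last stage, checking that a modification removed the error without adding new ones. The functions my.range.old() and my.range.new() are hypothetical, made up for illustration; the idea is to re-run the case that failed and also the cases that already worked:

```r
# Hypothetical function, before and after a fix for empty input
my.range.old = function(v) max(v) - min(v)
my.range.new = function(v) {
  if (length(v) == 0) return(NA)
  max(v) - min(v)
}

# The case that used to misbehave now returns NA
is.na(my.range.new(numeric(0)))

# Cases that already worked should give the same answers as before
test.inputs = list(1:10, c(-3, 3), 5)
old.vals = sapply(test.inputs, my.range.old)
new.vals = sapply(test.inputs, my.range.new)
identical(old.vals, new.vals)
```

Keeping a small list of inputs like test.inputs around makes it cheap to repeat this check after every change.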

Reproduce the bug

Step 0: make it happen again

Can we produce it repeatedly when re-running the same code, with the same input values?
And if we run the same code in a clean copy of R, does the same thing happen?
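
A minimal sketch of checking reproducibility, assuming the bug depends on random input. The function flaky.mean() and the seed value are hypothetical. Fixing the seed with set.seed() means each run sees the same input, so the same result (or the same error) should reappear:

```r
# Hypothetical function whose behavior depends on random input
flaky.mean = function(n) {
  x = rnorm(n)
  if (any(x > 2)) stop("unexpectedly large value in x")
  mean(x)
}

# Run the call from a fixed seed, capturing either the result
# or the error message
run.once = function(seed, n) {
  set.seed(seed)
  tryCatch(flaky.mean(n), error = function(e) conditionMessage(e))
}

first = run.once(seed=10, n=100)
second = run.once(seed=10, n=100)
identical(first, second) # Same seed, same input, same outcome
```

To try the same code in a clean copy of R, one option is to start a fresh session from a terminal with R --vanilla, which skips the saved workspace and startup files.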

Characterize the bug

Step 1: figure out if it’s a pervasive/big problem

How much can we change the inputs and get the same error?
Or is it a different error?
And how big is the error?
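
One way to characterize a bug is to run the code over a range of inputs and record what happens for each. The sketch below does this with tryCatch(); the function buggy.log.ratio() and the chosen inputs are hypothetical:

```r
# Hypothetical function that misbehaves for some inputs only
buggy.log.ratio = function(a, b) {
  if (b == 0) stop("division by zero")
  log(a / b)
}

# Try several inputs, recording "ok", a warning, or an error for each
inputs = list(c(1, 2), c(1, 0), c(-1, 2), c(4, 4))
outcomes = sapply(inputs, function(ab) {
  tryCatch({
    buggy.log.ratio(ab[1], ab[2])
    "ok"
  },
  warning = function(w) paste("warning:", conditionMessage(w)),
  error = function(e) paste("error:", conditionMessage(e)))
})
outcomes
```

Here the second input triggers an error and the third only a warning, so these are two different problems rather than one pervasive one.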

Localize the bug

Step 2: find out exactly where things are going wrong

This is most often the hardest part!
Today, we’ll learn how to understand errors, using traceback(), and also cat(), print()
Next time, we’ll learn how to interactively debug with the R tool browser()

Localizing can be easy or hard

Sometimes error messages are easier to decode, sometimes they’re harder; this can make locating the bug easier or harder

my.plotter = function(x, y, my.list=NULL) {
  if (!is.null(my.list)) 
    plot(my.list, main="A plot from my.list!")
  else
    plot(x, y, main="A plot from x, y!")
}

my.plotter(x=1:8, y=1:8)

my.plotter(my.list=list(x=-10:10, y=(-10:10)^3))

my.plotter() # Easy to understand error message

## Error in plot(x, y, main = "A plot from x, y!"): argument "x" is missing, with no default

my.plotter(my.list=list(x=-10:10, Y=(-10:10)^3)) # Not as clear

## Error in xy.coords(x, y, xlabel, ylabel, log): 'x' is a list, but does not have components 'x' and 'y'

Who called xy.coords()? (Not us, at least not explicitly!) And why is it saying ‘x’ is a list? (We never set it to be so!)

`traceback()`

A call to traceback(), after an error: traces back through all the function calls leading to the error

Start your attention at the “bottom”, where you recognize the function you called
Read your way up to the “top”, which is the lowest-level function that produces the error
Often the most useful bit is somewhere in the middle

If you ran the last example my.plotter(my.list=list(x=-10:10, Y=(-10:10)^3)) in the console, then called traceback(), you’d see:

> traceback()
5: stop("'x' is a list, but does not have components 'x' and 'y'")
4: xy.coords(x, y, xlabel, ylabel, log)
3: plot.default(my.list, main = "A plot from my.list!")
2: plot(my.list, main = "A plot from my.list!") at #4
1: my.plotter(my.list = list(x = -10:10, Y = (-10:10)^3))

We can see that my.plotter() calls plot(), which calls plot.default(), which calls xy.coords(), and this last function throws the error

Why? Its first argument x is being set to my.list, which is OK, but then it’s expecting this list to have components named x and y (ours are named x and Y)
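
A quick way to confirm this diagnosis, sketched below, is to inspect the component names directly and to call xy.coords() on the list by itself, which reproduces the error without any plotting. The variable names bad.list and good.list are made up for illustration:

```r
bad.list = list(x=-10:10, Y=(-10:10)^3)
names(bad.list)                        # "x" "Y"
all(c("x", "y") %in% names(bad.list))  # FALSE: no component named "y"

# Calling xy.coords() directly gives the same error message
msg = tryCatch(xy.coords(bad.list), error = function(e) conditionMessage(e))
msg

# With the component renamed to lowercase y, xy.coords() succeeds
good.list = list(x=-10:10, y=(-10:10)^3)
coords = xy.coords(good.list)
length(coords$x)                       # 21
```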

`cat()`, `print()`

You can modify your function by calling cat() or print() at various points, to print out the state of variables, to help you localize the error

my.plotter = function(x, y, my.list=NULL) {
  if (!is.null(my.list)) {
    print("Here is my.list:")
    print(my.list)
    print("Now about to plot my.list")
    plot(my.list, main="A plot from my.list!")
  }
  else {
    print("Here is x:"); print(x)
    print("Here is y:"); print(y)
    print("Now about to plot x, y")
    plot(x, y, main="A plot from x, y!")
  }
}


Here it’s not exactly critical to use print() for debugging, but this is just a demonstration of how we might check the status of variables along the way

my.plotter(my.list=list(x=-10:10, Y=(-10:10)^3))

## [1] "Here is my.list:"
## $x
##  [1] -10  -9  -8  -7  -6  -5  -4  -3  -2  -1   0   1   2   3   4   5   6
## [18]   7   8   9  10
## 
## $Y
##  [1] -1000  -729  -512  -343  -216  -125   -64   -27    -8    -1     0
## [12]     1     8    27    64   125   216   343   512   729  1000
## 
## [1] "Now about to plot my.list"

## Error in xy.coords(x, y, xlabel, ylabel, log): 'x' is a list, but does not have components 'x' and 'y'

my.plotter(x="hi", y="there")

## [1] "Here is x:"
## [1] "hi"
## [1] "Here is y:"
## [1] "there"
## [1] "Now about to plot x, y"

## Warning in xy.coords(x, y, xlabel, ylabel, log): NAs introduced by coercion

## Warning in xy.coords(x, y, xlabel, ylabel, log): NAs introduced by coercion

## Warning in min(x): no non-missing arguments to min; returning Inf

## Warning in max(x): no non-missing arguments to max; returning -Inf

## Warning in min(x): no non-missing arguments to min; returning Inf

## Warning in max(x): no non-missing arguments to max; returning -Inf

## Error in plot.window(...): need finite 'xlim' values
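
The demonstrations above use print(); cat() can serve the same purpose and is handy for putting several values on one line. A minimal sketch, with a hypothetical function running.sum() that reports its state on every pass through a loop:

```r
# Hypothetical helper: running sum, with cat() reporting state each iteration
running.sum = function(v) {
  total = 0
  for (i in seq_along(v)) {
    total = total + v[i]
    cat("i =", i, " v[i] =", v[i], " total =", total, "\n")
  }
  total
}

running.sum(c(2, 4, 6))
```

Unlike print(), cat() does not add the [1] index prefix or quotes around strings, and it needs an explicit "\n" to end the line. For a list such as my.list, print() is the safer choice, since cat() is meant for atomic values like numbers and strings.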