36-350
13 October 2014
Basic tricks for debugging:
Better success through design!
Our two competing goals:
An important distinction, as these go back and forth with each other:
Since programming means making a procedure, we check the substance primarily.
Test cases with known answers
add <- function (part1, part2) { part1 + part2 }
a <- runif(1)
add(2,3) == 5
[1] TRUE
add(a,0) == a
[1] TRUE
add(a,-a) == 0
[1] TRUE
Real numbers and floating-point precision
cor(c(1,-1,1,1),c(-1,1,-1,1))
[1] -0.5774
-1/sqrt(3)
[1] -0.5774
cor(c(1,-1,1,1),c(-1,1,-1,1)) == -1/sqrt(3)
[1] FALSE
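The two answers agree to machine precision; it is the exact == comparison that fails. A tolerance-based comparison, for instance with base R's all.equal, is the usual remedy:
isTRUE(all.equal(cor(c(1,-1,1,1),c(-1,1,-1,1)), -1/sqrt(3)))  # should be TRUE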
Compare alternate routes to the same answer:
test.unif <- runif(n=3,min=-10,max=10)
add(test.unif[1],test.unif[2]) ==
add(test.unif[2],test.unif[1])
[1] TRUE
add(add(test.unif[1],test.unif[2]),test.unif[3]) ==
add(test.unif[1],add(test.unif[2],test.unif[3]))
[1] TRUE
add(test.unif[3]*test.unif[1],test.unif[3]*test.unif[2]) ==
test.unif[3]*add(test.unif[1],test.unif[2])
[1] FALSE
Test function: numerical derivative
x <- runif(10,-10,10)
f <- function(x) {x^2*exp(-x^2)}
g <- function(x) {2*x*exp(-x^2) -2* x^3*exp(-x^2)}
isTRUE(all.equal(derivative(f,x), g(x)))
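derivative() is not a base R function and is not defined in these notes; a minimal central-difference sketch (the name, signature, and default step size h are assumptions for illustration) that the test above could exercise:
derivative <- function(f, x, h=1e-6) {
  # approximate f'(x) by a central difference with step h
  (f(x+h) - f(x-h)) / (2*h)
}
The approximation error depends on h, which is one reason the test uses all.equal rather than exact equality.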
If this seems too unstatistical…
xx <- runif(10)
aa <- runif(1)
cor(xx,xx) == 1
[1] TRUE
cor(xx,-xx) == -1
[1] TRUE
cor(xx,aa*xx) == 1
[1] FALSE
pp <- runif(10); mean <- 0; sd <- xx
all(pnorm(0,mean=mean,sd=sd) == 0.5)
[1] TRUE
pnorm(xx,mean,sd) == pnorm((xx-mean)/sd,0,1)
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
all(pnorm(xx,0,1) == 1-pnorm(-xx,0,1))
[1] TRUE
pnorm(qnorm(pp)) == pp
[1] TRUE FALSE TRUE TRUE TRUE TRUE FALSE TRUE TRUE TRUE
qnorm(pnorm(xx)) == xx
[1] FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE TRUE
With finite precision we don't really want to insist that these be exact!
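So replace exact comparison with a tolerance-based one; for instance, all.equal makes the round-trip checks above pass:
isTRUE(all.equal(pnorm(qnorm(pp)), pp))  # should be TRUE
isTRUE(all.equal(qnorm(pnorm(xx)), xx))  # should be TRUE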
Statistical hypothesis testing balances the risk of a false alarm (size, or type I error) against the probability of detection (power, related to type II error).
In software testing, no false alarms are allowed (false alarm rate = 0). This must reduce our power to detect errors: code can pass all our tests and still be wrong.
But we can direct that power toward detecting particular errors, and toward locating where an error lies, by testing small pieces of the code.
The idea behind unit testing:
After making changes to a function, re-run its tests, and those of functions that depend on it.
When we have a version of the code which we are confident gets some cases right, keep it around (under a separate name).
Now compare new versions to the old, on those cases
Keep debugging until the new version is at least as good as the old
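A minimal way to bundle such tests in R (a sketch re-using the earlier add() examples; stopifnot is just one convenient idiom):
test.add <- function() {
  a <- runif(1)
  stopifnot(add(2,3) == 5)                    # known answer
  stopifnot(isTRUE(all.equal(add(a,0), a)))   # adding zero changes nothing
  stopifnot(isTRUE(all.equal(add(a,-a), 0)))  # a plus its negative is zero
  invisible(TRUE)                             # silent if everything passes
}
test.add()  # re-run after every change to add()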
General strategy for development:
Have an idea about what the program should do, and write tests which check that it does
Modify the code until it passes all the tests
When you find a new error, write a new test
When you add a new capability, write a new test
When you change your mind about the goal, change the tests
By the end, the tests specify what the program should do, and the program does it
Boundary cases are inputs “at the edge” of something, or otherwise non-standard. Such as:
add(5,NA) # NA, presumably
[1] NA
try(add("a","b")) # NA, or error message?
divide <- function (top, bottom) top/bottom
divide(10,0) # Inf, presumably
[1] Inf
divide(0,0) # NA?
[1] NaN
Pinning down awkward cases helps specify what the function should do
var(1) # NA? error?
[1] NA
cor(c(1,-1,1,-1),c(-1,1,NA,1)) # NA? -1? -1 with a warning?
[1] NA
try(cor(c(1,-1,1,-1),c(-1,1,"z",1))) # NA? -1? -1 with a warning?
try(cor(c(1,-1),c(-1,1,-1,1))) # NA? 0? -1?
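Once we decide which answers we want, we can record those decisions as tests the function must keep passing (a sketch, assuming we accept R's current behavior in these cases):
stopifnot(is.na(add(5, NA)))      # missing input gives a missing answer
stopifnot(divide(10, 0) == Inf)   # positive number over zero is Inf
stopifnot(is.nan(divide(0, 0)))   # 0/0 is NaN
stopifnot(is.na(var(1)))          # variance of a single number is NA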