Environment: what the function can see and do

Each function has its own environment
Names here override names in the global environment
Internal environment starts with the named arguments
Assignments inside the function only change the internal environment
Names undefined in the function are looked up in the enclosing environment (the global environment, for functions defined at the top level)

Environment examples

x = 7
y = c("A","C","G","T","U")
adder = function(y) { x = x+y; x }
adder(1)

## [1] 8

x

## [1] 7

y

## [1] "A" "C" "G" "T" "U"

circle.area = function(r) { pi*r^2 }
circle.area(1:3)

## [1]  3.141593 12.566371 28.274334

true.pi = pi
pi = 3 # Wrong on purpose (the 1897 Indiana pi bill never became law)
circle.area(1:3)

## [1]  3 12 27

pi = true.pi # Restore sanity
circle.area(1:3)

## [1]  3.141593 12.566371 28.274334

Relying on variables outside of the function’s environment

Generally OK for built-in constants like pi, letters, month.name, etc.
Generally not OK for user-defined variables outside of the function
For the latter, pass these as input arguments to your function
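
A minimal sketch of the last point. The names rate, grow.global() and grow() are hypothetical, made up for illustration: the first function quietly depends on a user-defined global variable, while the second takes the same quantity as an input argument.

```r
# Hypothetical example: 'rate' is a user-defined global variable
rate = 0.05

# Fragile: the result depends on whatever 'rate' happens to be globally
grow.global = function(amount) { amount * (1 + rate) }

# Better: the dependency is an explicit argument, with a default value
grow = function(amount, rate=0.05) { amount * (1 + rate) }

before = grow.global(100)        # Uses the global rate of 0.05
rate = 0.5                       # Someone changes the global later on
after = grow.global(100)         # Same call, different answer
explicit = grow(100, rate=0.05)  # Unaffected by the global change
```

Here before and after differ even though the call is identical, which is the kind of surprise that passing rate as an argument avoids.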

Top-down function design

Start with the big-picture view of the task
Break the task into a few big parts
Figure out how to fit the parts together
Repeat this for each part

Start off with a code sketch

You can write top-level code, right away, for your function’s design:

# Not actual code
big.job = function(lots.of.arguments) {
  first.result = first.step(some.of.the.args)
  second.result = second.step(first.result, more.of.the.args)
  final.result = third.step(second.result, rest.of.the.args)
  return(final.result)
}

After you write down your design, go ahead and write the sub-functions (here first.step(), second.step(), third.step()). The process may be iterative, in that you may write these sub-functions, then go back and change the design a bit, etc.
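
To make the pattern concrete, here is a small runnable sketch on a hypothetical task (standardizing a numeric vector while ignoring missing values). The function names standardize(), drop.missing(), center.values() and scale.values() are assumptions chosen for illustration, not part of the original design.

```r
# Top-level design, written first: each line is one big part of the task
standardize = function(x, center=TRUE, scale=TRUE) {
  cleaned = drop.missing(x)
  centered = center.values(cleaned, center)
  scaled = scale.values(centered, scale)
  return(scaled)
}

# Sub-functions, written afterwards to fill in the design
drop.missing = function(x) { x[!is.na(x)] }
center.values = function(x, center) { if (center) x - mean(x) else x }
scale.values = function(x, scale) { if (scale) x / sd(x) else x }

z = standardize(c(2, 4, NA, 6))
```

Because each sub-function does one job, any of them can be rewritten later (say, to handle a constant vector, where sd() is zero) without touching the top-level design.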

Example of a code sketch

Suppose that we want (or are instructed) to write a function that takes a vector of strings (each of which is a URL), builds a document-term matrix from the documents at those URLs, computes correlations between the documents, and, as a side effect (if asked), prints a summary to the console.

Sounds complicated! But let’s write a code sketch:

compare.docs = function(str.urls, split="[[:space:]]|[[:punct:]]",
                        tolower=TRUE, keep.numbers=FALSE, print.summary=TRUE) {
  # Compute the document-term matrix
  dt.mat = get.dt.mat(str.urls, split, tolower, keep.numbers)
  # Compute correlations
  cor.mat = cor(t(dt.mat))
  # Print a summary, if we're asked to
  if (print.summary) print.dt.mat(dt.mat)
  # Return a list with document-term matrix and correlations
  return(list(dt.mat=dt.mat, cor.mat=cor.mat))
}

That wasn’t too bad, and now we know exactly what to work on next! More code sketching:

get.dt.mat = function(str.urls, split="[[:space:]]|[[:punct:]]",
                      tolower=TRUE, keep.numbers=FALSE) {
  # First, compute all the individual word tables
  wordtabs = get.wordtabs(str.urls, split, tolower, keep.numbers)
  # Then, build the document-term matrix from these, and return it
  return(dt.mat.from.wordtabs(wordtabs))
}

Luckily, we’ve already written get.wordtabs(); we still need to write dt.mat.from.wordtabs(), and to sketch and write print.dt.mat().
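
One possible sketch of the two remaining pieces, under a loudly stated assumption: that get.wordtabs() returns a list with one table() of word counts per document. That return format is a guess made for illustration, so the code below may need adjusting to fit the real get.wordtabs().

```r
# Assumption: 'wordtabs' is a list with one table() of word counts per document
dt.mat.from.wordtabs = function(wordtabs) {
  # The columns are all the distinct words across the documents, sorted
  all.words = sort(unique(unlist(lapply(wordtabs, names), use.names=FALSE)))
  # Start from zeros: one row per document, one column per word
  dt.mat = matrix(0, nrow=length(wordtabs), ncol=length(all.words),
                  dimnames=list(names(wordtabs), all.words))
  # Fill in each row with that document's counts
  for (i in seq_along(wordtabs)) {
    dt.mat[i, names(wordtabs[[i]])] = wordtabs[[i]]
  }
  return(dt.mat)
}

# Print a short summary of the matrix; return it invisibly
print.dt.mat = function(dt.mat) {
  cat("Documents:", nrow(dt.mat), "\n")
  cat("Distinct words:", ncol(dt.mat), "\n")
  cat("Total words per document:", rowSums(dt.mat), "\n")
  invisible(dt.mat)
}

# Toy input standing in for the output of get.wordtabs()
wordtabs = list(doc1=table(c("a","b","a")), doc2=table(c("b","c")))
dt.mat = dt.mat.from.wordtabs(wordtabs)
print.dt.mat(dt.mat)
```

The toy list at the end stands in for real word tables, so both functions can be tested before any URLs are fetched.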
