10 Function Factories

Alexandre Henrique edited this page Feb 26, 2020 · 2 revisions

  • Definition: A function factory is a function that makes functions. Take the following as an example:
power1 <- function(exp) {
  function(x) {
    x ^ exp
  }
}

square <- power1(2)
cube <- power1(3)

We call square() and cube() manufactured functions.
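
As a quick check, the manufactured functions can be called like any other function. The sketch below repeats the definition of power1() from above so that it runs on its own:

```r
# Function factory: returns a function that raises its input to `exp`
power1 <- function(exp) {
  function(x) {
    x ^ exp
  }
}

square <- power1(2)
cube <- power1(3)

square(3)
#> [1] 9
cube(3)
#> [1] 27
```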

Function factories are made possible by the combination of three features of R.

  1. In section 6.2.3 we talked about first-class functions. In R, you bind a function to a name in the same way as you bind any object to a name: with <-.
  2. In section 7.4.2 we discussed how a function captures (encloses) the environment in which it is created.
  3. In section 7.4.4 we learned that a function creates a new execution environment every time it is called (the fresh start principle). This environment is usually ephemeral, but here it becomes the enclosing environment of the manufactured function.

Outline

  • Section 10.2 explains how function factories work and how they can be used to give functions a memory.
  • Section 10.3 discusses how function factories power some features of ggplot2.
  • Section 10.4 uses function factories to tackle three challenges from statistics: understanding the Box-Cox transform, solving maximum likelihood problems, and drawing bootstrap resamples.
  • Section 10.5 shows how to combine function factories with functionals to rapidly generate a family of functions from data.

Prerequisites

You should be familiar with the topics covered in Sections 6.2.3 (first-class functions), 7.4.2 (the function environment), and 7.4.4 (execution environments).

library(rlang)
library(ggplot2)
library(scales)

10.2 Factory fundamentals

The key idea of function factories can be expressed in a few words:

The enclosing environment of the manufactured function is an execution environment of the function factory.
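
This can be checked directly. The sketch below uses a hypothetical helper, power_with_env() (not part of the original notes), that returns both the manufactured function and the execution environment in which it was created, so the two can be compared with base R's environment() and identical():

```r
# Hypothetical variant of power1() that also exposes its execution environment
power_with_env <- function(exp) {
  exec_env <- environment()  # the execution environment of this call
  list(
    fn = function(x) x ^ exp,
    env = exec_env
  )
}

res <- power_with_env(2)

# The enclosing environment of the manufactured function is the
# execution environment of the factory call that created it
identical(environment(res$fn), res$env)
#> [1] TRUE

res$fn(4)
#> [1] 16
```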

10.2.1 Environments

Start by printing square and cube:

square
#> function(x) {
#>     x ^ exp
#>   }
#> <environment: 0x3b00368>

cube
#> function(x) {
#>     x ^ exp
#>   }
#> <bytecode: 0x2331800>
#> <environment: 0x3b7e0a0>

The bodies are identical, so printing the functions does not tell us where exp comes from or what value it has. The difference lies in the enclosing environments, which we can inspect with rlang::env_print():

env_print(square)
#> <environment: 0x3b00368>
#> parent: <environment: global>
#> bindings:
#>  * exp: <dbl>

env_print(cube)
#> <environment: 0x3b7e0a0>
#> parent: <environment: global>
#> bindings:
#>  * exp: <dbl>

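The output shows two different environments with the same parent:

  1. Each manufactured function has its own enclosing environment, which was originally the execution environment of a single call to power1().
  2. Both environments share one parent, the global environment, which is the enclosing environment of power1().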

env_print() also shows that both environments have a binding to exp. To see the value, we first get the environment of the function and then extract the value bound to exp:

fn_env(square)$exp
#> [1] 2

fn_env(cube)$exp
#> [1] 3

This is what makes manufactured functions behave differently from one another: names in the enclosing environment are bound to different values.
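
The same relationships can be verified with base R alone. This is a minimal sketch that uses environment() as the base counterpart of rlang::fn_env() and repeats the definition of power1() so that it is self-contained:

```r
power1 <- function(exp) {
  function(x) {
    x ^ exp
  }
}

square <- power1(2)
cube <- power1(3)

env_square <- environment(square)
env_cube <- environment(cube)

# Each manufactured function has its own enclosing environment ...
identical(env_square, env_cube)
#> [1] FALSE

# ... but both share the same parent: the environment where power1() was defined
identical(parent.env(env_square), parent.env(env_cube))
#> [1] TRUE
identical(parent.env(env_square), environment(power1))
#> [1] TRUE

# The only binding in each environment is exp, bound to a different value
ls(env_square)
#> [1] "exp"
env_square$exp
#> [1] 2
env_cube$exp
#> [1] 3
```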

10.2.2 Diagram Conventions