revert use of gt for now
romainfrancois committed Oct 29, 2018
1 parent fa25c43 commit 2e5a185
Showing 2 changed files with 87 additions and 1,655 deletions.
126 changes: 44 additions & 82 deletions content/articles/2018-10-dplyr-0-8-0-release-candidate.Rmarkdown
@@ -20,45 +20,44 @@ knitr::opts_chunk$set(collapse = T, comment = "#>")
options(tibble.print_min = 4L, tibble.print_max = 4L)
library(dplyr)
# devtools::install_github("rstudio/gt@colorize")
library(gt)

show_groups <- function(tbl) {
  # Get a named list of column labels
  col_labels <-
    lapply(names(tbl), function(name) {
      x <- tbl[[name]]
      class_x <- class(x)

      value <- if (class_x == "factor") {
        paste0("fctr<", paste0(levels(x), collapse = ","), ">")
      } else {
        paste0("<", class(x), ">")
      }
      paste0(name, " ", value)
    }) %>%
    magrittr::set_names(names(tbl))

  # Create the gt table
  gt(tbl) %>%
    cols_label(.list = col_labels) %>%
    cols_color_gradient_n(
      column = "x",
      breaks = c(0, 3),
      colors = c("lightblue", "steelblue")
    ) %>%
    cols_color_manual(
      column = "f1", values = c("a", "b", "c"),
      colors = c("cornsilk", "bisque", "goldenrod")
    ) %>%
    cols_color_manual(
      column = "f2", values = c("d", "e", "f"),
      colors = c("pink", "pink2", "pink3")
    ) %>%
    cols_color_gradient_n(
      column = "n", breaks = c(0, 1),
      colors = c("red", "white")
    )
}
# library(gt)
# show_groups <- function(tbl) {
#   # Get a named list of column labels
#   col_labels <-
#     lapply(names(tbl), function(name) {
#       x <- tbl[[name]]
#       class_x <- class(x)
#
#       value <- if (class_x == "factor") {
#         paste0("fctr<", paste0(levels(x), collapse = ","), ">")
#       } else {
#         paste0("<", class(x), ">")
#       }
#       paste0(name, " ", value)
#     }) %>%
#     magrittr::set_names(names(tbl))
#
#   # Create the gt table
#   gt(tbl) %>%
#     cols_label(.list = col_labels) %>%
#     cols_color_gradient_n(
#       column = "x",
#       breaks = c(0, 3),
#       colors = c("lightblue", "steelblue")
#     ) %>%
#     cols_color_manual(
#       column = "f1", values = c("a", "b", "c"),
#       colors = c("cornsilk", "bisque", "goldenrod")
#     ) %>%
#     cols_color_manual(
#       column = "f2", values = c("d", "e", "f"),
#       colors = c("pink", "pink2", "pink3")
#     ) %>%
#     cols_color_gradient_n(
#       column = "n", breaks = c(0, 1),
#       colors = c("red", "white")
#     )
# }
```

A new release of dplyr (0.8.0) is on the horizon, and since it is a major release, we'd love for the
@@ -95,77 +94,46 @@ match the observed data. This closes the epic issue
Let's illustrate the new algorithm with the [count()](https://dplyr.tidyverse.org/reference/tally.html)
function:

```{r, results = "hide"}
```{r}
df <- tibble(
f1 = factor(c("a", "a", "a", "b", "b"), levels = c("a", "b", "c")),
f2 = factor(c("d", "e", "d", "e", "f"), levels = c("d", "e", "f")),
x = c(1, 1, 1, 2, 2),
y = 1:5
)
```

```{r, echo = FALSE}
show_groups(df)
```

```{r, results = "hide"}
df
df %>%
count(f1)
```

```{r, echo = FALSE}
df %>%
count(f1) %>%
show_groups()
```

Where previous versions of `dplyr` would have created only two groups (for levels `a` and `b`),
it now creates one group per level, and the group related to the level `c` just happens to be
empty.
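
As a cross-check that does not depend on dplyr, base R's `table()` also reports one cell per factor level, including levels that never occur in the data. This is only a sketch for comparison, not how `count()` is implemented; `f1` is rebuilt here so that the chunk stands alone.

```r
# Same factor as df$f1 above: level "c" is declared but never observed
f1 <- factor(c("a", "a", "a", "b", "b"), levels = c("a", "b", "c"))

# table() tabulates every level, so "c" appears with a count of 0
counts <- table(f1)
counts
```

The zero for `c` plays the same role as the empty group described above.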

Groups are still made to match the data on other types of columns:

```{r, results = "hide"}
```{r}
df %>%
count(x)
```

```{r, echo = FALSE}
df %>%
count(x) %>%
show_groups()
```

Expansion of groups for factors happens at each step of the grouping, so if we group
by `f1` and `f2` we get 9 groups, one for each combination of their levels:

```{r, results = "hide"}
```{r}
df %>%
count(f1, f2)
```

```{r, echo = FALSE}
df %>%
count(f1, f2) %>%
show_groups()
```
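
The figure of 9 groups is simply the product of the two level counts. The base R sketch below works through that arithmetic and also counts how many of those combinations are never observed; the variable names (`n_groups`, `cells`, `n_empty`) are ours and are not part of dplyr.

```r
f1 <- factor(c("a", "a", "a", "b", "b"), levels = c("a", "b", "c"))
f2 <- factor(c("d", "e", "d", "e", "f"), levels = c("d", "e", "f"))

# One group per combination of levels: 3 * 3
n_groups <- nlevels(f1) * nlevels(f2)

# Cross-tabulate to see which combinations actually occur in the data
cells <- table(f1, f2)
n_empty <- sum(cells == 0)

n_groups  # 9
n_empty   # 5
```

Only 4 of the 9 combinations hold any rows, so the remaining 5 groups are empty.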


When factors and non-factors are both involved in the grouping, the number of
groups depends on their order. At each level of grouping, factors are always expanded
to one group per level, but non-factors only create groups based on observed data.

```{r, results = "hide"}
```{r}
df %>%
count(f1, x)
```

```{r, echo = FALSE}
df %>%
count(f1, x) %>%
show_groups()
```


In this example, we group by `f1` then `x`. At the first layer, grouping on `f1` creates
three groups, one per level. Each of these groups is then subdivided based on the values of the second
@@ -176,17 +144,11 @@ The last group, associated with the level `c` of the factor `f1` is empty, and
consequently has no values for the vector `x`. In that case, `group_by()` uses
`NA`.

```{r, results = "hide"}
```{r}
df %>%
count(x, f1)
```

```{r, echo = FALSE}
df %>%
count(x, f1) %>%
show_groups()
```

When we group by `x` then `f1`, we initially split the data according to `x`, which
gives 2 groups. Each of these two groups is then further divided into 3 groups,
i.e. one for each level of `f1`, for a total of 6 groups.
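
To make the effect of ordering concrete, here is a base R sketch that counts the groups for both orders. It encodes our reading of the rule described above (factors expand to every level, non-factors follow the observed data) and is not dplyr's implementation; the helper names are hypothetical.

```r
f1 <- factor(c("a", "a", "a", "b", "b"), levels = c("a", "b", "c"))
x  <- c(1, 1, 1, 2, 2)

# f1 then x: every level of f1 is a group, and each one is split by the
# values of x observed inside it. An empty level still counts once,
# which corresponds to the row where x is NA.
per_level <- vapply(
  split(x, f1),  # split() keeps the empty level "c"
  function(v) max(1L, length(unique(v))),
  integer(1)
)
groups_f1_then_x <- sum(per_level)

# x then f1: each observed value of x is expanded to every level of f1
groups_x_then_f1 <- length(unique(x)) * nlevels(f1)

groups_f1_then_x  # 3
groups_x_then_f1  # 6
```

The same columns therefore give 3 groups in one order and 6 in the other, which is why the order of the grouping variables matters once factors are involved.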