-
-
Notifications
You must be signed in to change notification settings - Fork 203
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor puzzling loop in attributes.R #1327
Comments
L559 can be deleted, the rest is more difficult. |
value is a named list. the code above is for the case where we don't have as many values as vertices. we create a list as long as the number of vertices. then the indices are used to fill the list elements. the other elements remain NULL. surely there's an elegant way to do this?! |
elegant and readable |
first fill with NA, then make NA NULL? 🤔 no, they are NA in the current version. |
a good thing is that the lines are covered by tests |
first make the loop easier then use purrr::reduce? |
tmp <- rep(NA, length(vs))
tmp[index] <- value[[i]]
value[[i]] <- tmp |
oooh for value a vector, we get NA as fillers but for value a named list, we get NULL as fillers. |
My attempt to isolate the essence of the coding:
The effect of In this example using the old logic the expression
thereby blocking indexing by name. In the new logic the result will be:
Removing the names could be a matter of defensive coding. |
Create an empty named list. L558, L559 are equivalent to: To replace lines L558, L559, L560 by a single line: |
thanks! @krlmlr helped me shorten the code, I'll soon finish up the related PR. |
My thoughts.
|
note to self: use rlang's "purrr" standalone file |
Thanks, good catch. Using For continuous benchmarking, https://github.com/lorenzwalthert/touchstone does a decent job, but we'd need a good set of fast test cases -- can be small initially and grow over time. |
For reference, my own analysis, with two other variants. n <- 1E5
m <- 1E2
vs <- seq(n)
index <- seq(m)
names(index) <- index
value <- list(rev = as.list(rev(index)), nchar = as.list(nchar(index)))
set_value_at <- function(value, idx, length_out) {
out <- value[NULL]
length(out) <- length_out
out[idx] <- value
unname(out)
}
bench::mark(
"This" = {
value_nw1 <- value
for (i in seq_along(value)) {
tmp <- value[[i]]
length(tmp) <- 0
length(tmp) <- length(vs)
tmp[index] <- value[[i]]
value_nw1[[i]] <- tmp
}
lapply(value_nw1, unname)
},
"Rmatch" = {
value_nw2 <- lapply(value, function(x) {
unname(x[match(seq_along(vs), index)])
})
},
"vec_match" = {
value_nw2 <- lapply(value, function(x) {
unname(x[vctrs::vec_match(seq_along(vs), index)])
})
},
"value_at" = {
value_nw3 <- lapply(value, function(x) {
set_value_at(x, index, length(vs))
})
},
"Rbase" = {
value_nw4 <- lapply(
value,
function(x) {
tmp <- x[NULL]
length(tmp) <- length(vs)
tmp[index] <- x
tmp
}
)
lapply(value_nw4, unname)
}
)
#> # A tibble: 5 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 This 2.79ms 3.01ms 316. 8.79MB 68.1
#> 2 Rmatch 1.48ms 1.76ms 543. 4.42MB 125.
#> 3 vec_match 2.36ms 2.62ms 321. 4.99MB 72.9
#> 4 value_at 733.41µs 1.12ms 681. 4.61MB 176.
#> 5 Rbase 774µs 1.15ms 759. 4.59MB 206. Created on 2024-04-09 with reprex v2.1.0 |
Looking at the reprex results, the self-contained function doesn't seem to be that bad. With a larger input: n <- 1E6
m <- 1E3
vs <- seq(n)
index <- seq(m)
names(index) <- index
value <- list(rev = as.list(rev(index)), nchar = as.list(nchar(index)))
set_value_at <- function(value, idx, length_out) {
out <- value[NULL]
length(out) <- length_out
out[idx] <- value
unname(out)
}
bench::mark(
"This" = {
value_nw1 <- value
for (i in seq_along(value)) {
tmp <- value[[i]]
length(tmp) <- 0
length(tmp) <- length(vs)
tmp[index] <- value[[i]]
value_nw1[[i]] <- tmp
}
lapply(value_nw1, unname)
},
"Rmatch" = {
value_nw2 <- lapply(value, function(x) {
unname(x[match(seq_along(vs), index)])
})
},
"vec_match" = {
value_nw2 <- lapply(value, function(x) {
unname(x[vctrs::vec_match(seq_along(vs), index)])
})
},
"value_at" = {
value_nw3 <- lapply(value, function(x) {
set_value_at(x, index, length(vs))
})
},
"Rbase" = {
value_nw4 <- lapply(
value,
function(x) {
tmp <- x[NULL]
length(tmp) <- length(vs)
tmp[index] <- x
tmp
}
)
lapply(value_nw4, unname)
}
)
#> # A tibble: 5 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 This 17.47ms 18.5ms 51.2 50MB 79.6
#> 2 Rmatch 21.03ms 22.5ms 44.3 42.2MB 44.3
#> 3 vec_match 29.86ms 33.8ms 29.9 49.6MB 39.8
#> 4 value_at 7.85ms 9.5ms 104. 45.8MB 173.
#> 5 Rbase 9.34ms 11.5ms 89.6 45.8MB 122. Created on 2024-04-09 with reprex v2.1.0 |
If no named list is needed,
|
Another idea: Using magrittr s pipe %>%, assuming dependency is not a drawback.
|
Magrittr s pipe %>% is not faster as I thought before.
|
see current state of #1330 |
rigraph/R/attributes.R
Lines 557 to 564 in 36ad3c9
The text was updated successfully, but these errors were encountered: