div_section inside analyze call #835
I think the issue here is that when I wrote it I did not test cases with multiple analyzed variables, hence the behavior is not as expected. Can I ask what you expect from analyze with multiple variables? In other words, do you need to have
@edelarua @shajoezhu do you have any preference here? This surely needs to be fixed, but which behavior do you think would be best?
I'd prefer the divider to show up between the different variables, rather than between the rows within the output of afun.
@iaugusty I fixed the analyze call in the linked PR (does it work for you? I added a reprex here), but I think that
library(rtables)
#> Loading required package: formatters
#> Loading required package: magrittr
#>
#> Attaching package: 'rtables'
#> The following object is masked from 'package:utils':
#>
#> str
# Regression test for #835
lyt <- basic_table() %>%
split_rows_by("Species", section_div = "|") %>%
analyze(c("Petal.Width", "Petal.Length"),
afun = function(x) list("m" = mean(x), "sd" = sd(x)), section_div = "-")
tbl <- build_table(lyt, iris)
tbl
#> all obs
#> ——————————————————————————————————
#> setosa
#> Petal.Width
#> m 0.246
#> sd 0.105385589380046
#> ----------------------------------
#> Petal.Length
#> m 1.462
#> sd 0.173663996480184
#> ||||||||||||||||||||||||||||||||||
#> versicolor
#> Petal.Width
#> m 1.326
#> sd 0.197752680004544
#> ----------------------------------
#> Petal.Length
#> m 4.26
#> sd 0.469910977239958
#> ||||||||||||||||||||||||||||||||||
#> virginica
#> Petal.Width
#> m 2.026
#> sd 0.274650055636667
#> ----------------------------------
#> Petal.Length
#> m 5.552
#> sd 0.551894695663983
# One-var still works
lyt <- basic_table() %>%
split_rows_by("Species", section_div = "|") %>%
analyze("Petal.Width",
afun = function(x) list("m" = mean(x), "sd" = sd(x)), section_div = "-")
tbl <- build_table(lyt, iris)
tbl
#> all obs
#> ——————————————————————————————
#> setosa
#> m 0.246
#> ------------------------------
#> sd 0.105385589380046
#> ||||||||||||||||||||||||||||||
#> versicolor
#> m 1.326
#> ------------------------------
#> sd 0.197752680004544
#> ||||||||||||||||||||||||||||||
#> virginica
#> m 2.026
#> ------------------------------
#> sd 0.274650055636667
Created on 2024-03-06 with reprex v2.1.0
@Melkiades, I'm not yet that experienced in getting your code updates into my R environment. If you could give me some guidance on that, I can test it for my situation.
To test a feature by checking out a specific branch in Git and then installing an R package from that branch, follow the steps below. These steps assume you have Git and R installed on your machine, and that you're somewhat familiar with using the command line for Git operations and R for package installation.
Step 1: Clone the Repository (if you haven't already) - If you do just do
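The steps in the comment above are cut off here. As a rough sketch of one common alternative that works from within R: this assumes the remotes package is installed and that the repository is insightsengineering/rtables. The helper name and the branch name are placeholders for illustration, not the actual branch of the linked PR.

```r
# Hypothetical helper: install rtables from a given branch of its GitHub repo.
# Calling it requires the 'remotes' package and network access.
install_rtables_branch <- function(ref, repo = "insightsengineering/rtables") {
  remotes::install_github(repo, ref = ref)
}

# Usage (placeholder branch name; substitute the branch of the linked PR):
# install_rtables_branch("my-fix-branch")
# Then restart R and check which version is now installed:
# packageVersion("rtables")
```

Restarting the R session after installing helps make sure the newly installed version is the one that gets loaded.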
Thanks for the guidance on this.
library(rtables)
#> Loading required package: formatters
#> Loading required package: magrittr
#>
#> Attaching package: 'rtables'
#> The following object is masked from 'package:utils':
#>
#> str
library(tern)
#> Registered S3 method overwritten by 'tern':
#> method from
#> tidy.glm broom
lyt <- basic_table() %>%
split_rows_by("Species", section_div = "|") %>%
analyze(c("Petal.Width", "Petal.Length"),
afun = function(x) list("m" = mean(x), "sd" = sd(x)), section_div = "-")
tbl <- build_table(lyt, iris)
tbl
#> all obs
#> ——————————————————————————————————
#> setosa
#> Petal.Width
#> m 0.246
#> sd 0.105385589380046
#> ----------------------------------
#> Petal.Length
#> m 1.462
#> sd 0.173663996480184
#> ||||||||||||||||||||||||||||||||||
#> versicolor
#> Petal.Width
#> m 1.326
#> sd 0.197752680004544
#> ----------------------------------
#> Petal.Length
#> m 4.26
#> sd 0.469910977239958
#> ||||||||||||||||||||||||||||||||||
#> virginica
#> Petal.Width
#> m 2.026
#> sd 0.274650055636667
#> ----------------------------------
#> Petal.Length
#> m 5.552
#> sd 0.551894695663983
lyt0 <- basic_table() %>%
split_rows_by("ARM", section_div = "+") %>%
analyze(c("AGE","BMRKR1","RACE"),
afun = a_summary,
extra_args=list(
.stats=c("n","count_fraction","mean","sd")),
section_div = "~")
tbl0 <- build_table(lyt0, DM)
print(tbl0)
#> all obs
#> ——————————————————————————————————————————————————————————
#> A: Drug X
#> AGE
#> n 121
#> Mean 34.9
#> SD 7.8
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> BMRKR1
#> n 121
#> Mean 5.8
#> SD 3.0
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> RACE
#> n 121
#> ASIAN 79 (65.3%)
#> BLACK OR AFRICAN AMERICAN 28 (23.1%)
#> WHITE 14 (11.6%)
#> AMERICAN INDIAN OR ALASKA NATIVE 0
#> MULTIPLE 0
#> NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER 0
#> OTHER 0
#> UNKNOWN 0
#> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
#> B: Placebo
#> AGE
#> n 106
#> Mean 33.0
#> SD 6.3
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> BMRKR1
#> n 106
#> Mean 6.1
#> SD 3.2
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> RACE
#> n 106
#> ASIAN 68 (64.2%)
#> BLACK OR AFRICAN AMERICAN 24 (22.6%)
#> WHITE 14 (13.2%)
#> AMERICAN INDIAN OR ALASKA NATIVE 0
#> MULTIPLE 0
#> NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER 0
#> OTHER 0
#> UNKNOWN 0
#> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
#> C: Combination
#> AGE
#> n 129
#> Mean 34.6
#> SD 6.5
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> BMRKR1
#> n 129
#> Mean 5.7
#> SD 3.4
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> RACE
#> n 129
#> ASIAN 84 (65.1%)
#> BLACK OR AFRICAN AMERICAN 27 (20.9%)
#> WHITE 18 (14%)
#> AMERICAN INDIAN OR ALASKA NATIVE 0
#> MULTIPLE 0
#> NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER 0
#> OTHER 0
#> UNKNOWN 0
### the section divider from the multivar analyze call is placed between vars rather than between stats: as I'd prefer
### also, when extra analyze calls are added, the same divider is used even if it is no longer explicitly specified: OK
lyt1 <- basic_table() %>%
split_rows_by("ARM", section_div = "+") %>%
analyze(c("AGE","BMRKR1","RACE"),
afun = a_summary,
extra_args=list(
.stats=c("n","count_fraction","mean","sd")),
section_div = "~") %>%
analyze(c("STRATA1")) %>%
analyze(c("SEX")) %>%
analyze(c("STRATA1","SEX"),table_names=c("STRATAv2","SEXv2"))
tbl1 <- build_table(lyt1, DM)
print(tbl1)
#> all obs
#> ——————————————————————————————————————————————————————————
#> A: Drug X
#> AGE
#> n 121
#> Mean 34.9
#> SD 7.8
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> BMRKR1
#> n 121
#> Mean 5.8
#> SD 3.0
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> RACE
#> n 121
#> ASIAN 79 (65.3%)
#> BLACK OR AFRICAN AMERICAN 28 (23.1%)
#> WHITE 14 (11.6%)
#> AMERICAN INDIAN OR ALASKA NATIVE 0
#> MULTIPLE 0
#> NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER 0
#> OTHER 0
#> UNKNOWN 0
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> STRATA1
#> A 36
#> B 41
#> C 44
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> SEX
#> F 70
#> M 51
#> U 0
#> UNDIFFERENTIATED 0
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> STRATA1
#> A 36
#> B 41
#> C 44
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> SEX
#> F 70
#> M 51
#> U 0
#> UNDIFFERENTIATED 0
#> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
#> B: Placebo
#> AGE
#> n 106
#> Mean 33.0
#> SD 6.3
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> BMRKR1
#> n 106
#> Mean 6.1
#> SD 3.2
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> RACE
#> n 106
#> ASIAN 68 (64.2%)
#> BLACK OR AFRICAN AMERICAN 24 (22.6%)
#> WHITE 14 (13.2%)
#> AMERICAN INDIAN OR ALASKA NATIVE 0
#> MULTIPLE 0
#> NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER 0
#> OTHER 0
#> UNKNOWN 0
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> STRATA1
#> A 33
#> B 40
#> C 33
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> SEX
#> F 56
#> M 50
#> U 0
#> UNDIFFERENTIATED 0
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> STRATA1
#> A 33
#> B 40
#> C 33
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> SEX
#> F 56
#> M 50
#> U 0
#> UNDIFFERENTIATED 0
#> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
#> C: Combination
#> AGE
#> n 129
#> Mean 34.6
#> SD 6.5
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> BMRKR1
#> n 129
#> Mean 5.7
#> SD 3.4
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> RACE
#> n 129
#> ASIAN 84 (65.1%)
#> BLACK OR AFRICAN AMERICAN 27 (20.9%)
#> WHITE 18 (14%)
#> AMERICAN INDIAN OR ALASKA NATIVE 0
#> MULTIPLE 0
#> NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER 0
#> OTHER 0
#> UNKNOWN 0
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> STRATA1
#> A 45
#> B 38
#> C 46
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> SEX
#> F 61
#> M 68
#> U 0
#> UNDIFFERENTIATED 0
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> STRATA1
#> A 45
#> B 38
#> C 46
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> SEX
#> F 61
#> M 68
#> U 0
#> UNDIFFERENTIATED 0
#### situation which would be more problematic:
### first analyze call is a single var analysis, followed by a multivar analyze call
### in order to get a section_div working at the second, you need to specify a section_div in the first, but this will be a divider in between stats
lyt2 <- basic_table() %>%
split_rows_by("ARM", section_div = "+") %>%
analyze(c("STRATA1")) %>%
analyze(c("AGE","BMRKR1","RACE"),
afun = a_summary,
extra_args=list(
.stats=c("n","count_fraction","mean","sd")),section_div = "~")
tbl2 <- build_table(lyt2, DM)
print(tbl2)
#> all obs
#> ——————————————————————————————————————————————————————————
#> A: Drug X
#> STRATA1
#> A 36
#> B 41
#> C 44
#> AGE
#> n 121
#> Mean 34.9
#> SD 7.8
#> BMRKR1
#> n 121
#> Mean 5.8
#> SD 3.0
#> RACE
#> n 121
#> ASIAN 79 (65.3%)
#> BLACK OR AFRICAN AMERICAN 28 (23.1%)
#> WHITE 14 (11.6%)
#> AMERICAN INDIAN OR ALASKA NATIVE 0
#> MULTIPLE 0
#> NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER 0
#> OTHER 0
#> UNKNOWN 0
#> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
#> B: Placebo
#> STRATA1
#> A 33
#> B 40
#> C 33
#> AGE
#> n 106
#> Mean 33.0
#> SD 6.3
#> BMRKR1
#> n 106
#> Mean 6.1
#> SD 3.2
#> RACE
#> n 106
#> ASIAN 68 (64.2%)
#> BLACK OR AFRICAN AMERICAN 24 (22.6%)
#> WHITE 14 (13.2%)
#> AMERICAN INDIAN OR ALASKA NATIVE 0
#> MULTIPLE 0
#> NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER 0
#> OTHER 0
#> UNKNOWN 0
#> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
#> C: Combination
#> STRATA1
#> A 45
#> B 38
#> C 46
#> AGE
#> n 129
#> Mean 34.6
#> SD 6.5
#> BMRKR1
#> n 129
#> Mean 5.7
#> SD 3.4
#> RACE
#> n 129
#> ASIAN 84 (65.1%)
#> BLACK OR AFRICAN AMERICAN 27 (20.9%)
#> WHITE 18 (14%)
#> AMERICAN INDIAN OR ALASKA NATIVE 0
#> MULTIPLE 0
#> NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER 0
#> OTHER 0
#> UNKNOWN 0
### single analyze call without a section_div followed by a multivar analyze call with a section_div is not as I would expect it
lyt3 <- basic_table() %>%
split_rows_by("ARM", section_div = "+") %>%
analyze(c("STRATA1"),section_div = "~") %>%
analyze(c("AGE","BMRKR1","RACE"),
afun = a_summary,
extra_args=list(
.stats=c("n","count_fraction","mean","sd")))
tbl3 <- build_table(lyt3, DM)
print(tbl3)
#> all obs
#> ——————————————————————————————————————————————————————————
#> A: Drug X
#> STRATA1
#> A 36
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> B 41
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> C 44
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> AGE
#> n 121
#> Mean 34.9
#> SD 7.8
#> BMRKR1
#> n 121
#> Mean 5.8
#> SD 3.0
#> RACE
#> n 121
#> ASIAN 79 (65.3%)
#> BLACK OR AFRICAN AMERICAN 28 (23.1%)
#> WHITE 14 (11.6%)
#> AMERICAN INDIAN OR ALASKA NATIVE 0
#> MULTIPLE 0
#> NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER 0
#> OTHER 0
#> UNKNOWN 0
#> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
#> B: Placebo
#> STRATA1
#> A 33
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> B 40
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> C 33
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> AGE
#> n 106
#> Mean 33.0
#> SD 6.3
#> BMRKR1
#> n 106
#> Mean 6.1
#> SD 3.2
#> RACE
#> n 106
#> ASIAN 68 (64.2%)
#> BLACK OR AFRICAN AMERICAN 24 (22.6%)
#> WHITE 14 (13.2%)
#> AMERICAN INDIAN OR ALASKA NATIVE 0
#> MULTIPLE 0
#> NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER 0
#> OTHER 0
#> UNKNOWN 0
#> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
#> C: Combination
#> STRATA1
#> A 45
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> B 38
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> C 46
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> AGE
#> n 129
#> Mean 34.6
#> SD 6.5
#> BMRKR1
#> n 129
#> Mean 5.7
#> SD 3.4
#> RACE
#> n 129
#> ASIAN 84 (65.1%)
#> BLACK OR AFRICAN AMERICAN 27 (20.9%)
#> WHITE 18 (14%)
#> AMERICAN INDIAN OR ALASKA NATIVE 0
#> MULTIPLE 0
#> NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER 0
#> OTHER 0
#> UNKNOWN 0
### conclusion: only the section divider from the first analyze call is used
### if the first is a single var analyze call, this is not always a desirable outcome
### if the first analyze call is a multivar analyze call, the outcome is acceptable
Created on 2024-03-07 with reprex v2.1.0
Thanks for your research! I think solutions to these would be complicated, as each analysis split would need to be aware of the other analysis splits. I prefer to keep each analyze call self-consistent. As it is now, you can still construct both scenarios (a separator after each analysis line, or a separator between variable groups) if you either add section_div to each single-variable call or put the multi-variable call first. I will double-check how complicated it would be to make this a bit smarter, but I am not sure I will find a straightforward solution; in the end, it all depends on the underlying representation structure, which is built by a recursive call of splits.
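A minimal sketch of the second option (multi-variable call first), using iris so that it runs without tern. The extra Sepal.Width call is only there for illustration. Judging by the lyt1 output above, the divider from the first call should also be applied around the later call, but the exact output depends on the installed rtables version, so none is shown here.

```r
library(rtables)

# Same afun as in the reprex above: two stats per variable.
afun <- function(x) list("m" = mean(x), "sd" = sd(x))

lyt <- basic_table() %>%
  split_rows_by("Species", section_div = "|") %>%
  # Multi-variable call first, carrying the divider between variables.
  analyze(c("Petal.Width", "Petal.Length"), afun = afun, section_div = "-") %>%
  # Single-variable call afterwards, without its own section_div.
  analyze("Sepal.Width", afun = afun)

tbl <- build_table(lyt, iris)
tbl
```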
I'm happy with the current version. A single analysis var followed by other (multivar) analyze calls is not how I would use this most often.
Reporting an Issue with rtables
I'd like to get a section divider only between the different variables of an analyze call, and not between the multiple stat rows returned by the analysis function.
Because the current behavior of section_div = "~" in an analyze call is to place the divider between the rows as well as between the stats, the only way to get the wanted behavior is to use the setter method for section_div.
However, this approach does not always work as expected.
The dividers where the character + has been used are incorrectly overwritten by NA.
The setter example from the help page works as expected.
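For reference, a rough sketch of the setter approach on iris. It assumes that `section_div<-` accepts a character vector with one entry per row (NA meaning no divider) and that row.names(tbl) returns one label per printed row. Placing the divider after every sd row is only an illustration, and this sketch does not reproduce the + overwrite problem described above.

```r
library(rtables)

afun <- function(x) list("m" = mean(x), "sd" = sd(x))

# Build the table without any section_div, then set dividers row by row.
tbl <- basic_table() %>%
  split_rows_by("Species") %>%
  analyze(c("Petal.Width", "Petal.Length"), afun = afun) %>%
  build_table(iris)

# One entry per row; NA means "no divider after this row".
labels <- row.names(tbl)
divs <- rep(NA_character_, nrow(tbl))
# "sd" is the last stat of each variable, so a divider after it separates variables.
divs[labels == "sd"] <- "-"

section_div(tbl) <- divs
tbl
```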
SessionInfo:
stringr_1.5.0 tern_0.9.3.9006 rtables_0.6.6.9010 formatters_0.5.5.9009 dplyr_1.1.3