[Bug]: add_rowcounts doesn't work if layout begins with non-population variable (eg AVISIT) #1060

Open
anajens opened this issue Sep 13, 2023 · 2 comments
Labels
bug Something isn't working sme

Comments

anajens commented Sep 13, 2023

What happened?

In the layout we first split by AVISIT (a variable that is not in the population dataset) and then by ARM. Adding the row counts (N=xx) from alt_counts_df = ADSL fails with an error because ADSL does not include AVISIT.

A not very nice workaround is to add a dummy AVISIT to ADSL and repeat the dataset as many times as there are levels in AVISIT.

Noticed this when working on PKPT03.

library(rtables)
library(tern)
library(scda)
library(dplyr)

adsl <- synthetic_cdisc_dataset("latest", "adsl")
advs <- synthetic_cdisc_dataset("latest", "advs") %>%
  filter(AVISITN %in% c(0, 1)) %>%
  filter(PARAMCD %in% c("SYSBP", "DIABP"))

lyt <- basic_table() %>%
  split_cols_by_multivar(c("AVAL", "CHG")) %>%
  split_rows_by("AVISIT", split_fun = drop_split_levels) %>%
  split_rows_by("ARM", split_fun = drop_split_levels) %>%
  add_rowcounts(alt_counts = TRUE) %>%
  split_rows_by("PARAM", split_fun = drop_split_levels) %>%
  analyze_colvars(afun = mean, format = "xx.x")

# Error
build_table(lyt, advs, alt_counts_df = adsl)

Error

Error: Following error encountered in splitting alt_counts_df:  variable(s) [AVISIT] not present in data. (VarLevelSplit)
# not pretty workaround: dummy adsl with visit
adsl_visit <- rbind(adsl, adsl) %>%
  select(ARM) %>%
  mutate(
    AVISIT = factor(rep.int(c("BASELINE", "WEEK 1 DAY 8"), c(nrow(adsl), nrow(adsl))))
  )

build_table(lyt, advs, alt_counts_df = adsl_visit)

Desired output

BASELINE                                   
  A: Drug X (N=134)                        
    Diastolic Blood Pressure               
      mean                     96.5    0.0 
    Systolic Blood Pressure                
      mean                     151.7   0.0 
  B: Placebo (N=134)                       
    Diastolic Blood Pressure               
      mean                     101.1   0.0 
    Systolic Blood Pressure                
      mean                     149.5   0.0 
  C: Combination (N=132)                   
    Diastolic Blood Pressure               
      mean                     102.8   0.0 
    Systolic Blood Pressure                
      mean                     144.7   0.0 
WEEK 1 DAY 8                               
  A: Drug X (N=134)                        
    Diastolic Blood Pressure               
      mean                     100.6   4.1 
...

Example 2: With trim_levels_in_group split function

Set AVAL and CHG to NA for one specific combination of the split variables. We want this combination to stay in the table, displayed as missing.

advs_miss <- advs %>%
  mutate(
    # NA_real_ keeps if_else() type-stable for the numeric AVAL and CHG columns
    AVAL = if_else(
      AVISIT == "BASELINE" & ARM == "A: Drug X" & PARAMCD == "DIABP",
      NA_real_, AVAL
    ),
    CHG = if_else(
      AVISIT == "BASELINE" & ARM == "A: Drug X" & PARAMCD == "DIABP",
      NA_real_, CHG
    )
  )

lyt_trim <- basic_table() %>%
  split_cols_by_multivar(c("AVAL", "CHG")) %>%
  split_rows_by("AVISIT", split_fun = drop_split_levels) %>%
  split_rows_by("ARM", split_fun = trim_levels_in_group("PARAMCD")) %>% ## change split fun here <------
  add_rowcounts(alt_counts = TRUE) %>%
  split_rows_by("PARAMCD") %>%
  analyze_colvars(afun = mean, format = "xx.x")

build_table(lyt_trim, advs_miss, alt_counts_df = adsl_visit)

This gives an error because PARAMCD is not in alt_counts_df:

Error: Following error encountered in splitting alt_counts_df: Error applying custom split function: no applicable method for 'droplevels' applied to an object of class "NULL"
	split: VarLevelSplit (ARM)
	occured at path: AVISIT[BASELINE]

Now we add PARAMCD to alt_counts_df to avoid the error:

adsl_avisit_param <- adsl_visit %>%
  mutate(PARAMCD = factor(NA_character_, levels = levels(advs$PARAMCD)))

build_table(lyt_trim, advs_miss, alt_counts_df = adsl_avisit_param)

And get the desired table:

                          AVAL    CHG 
———————————————————————————————————————
BASELINE                               
  A: Drug X (N=134)                    
    DIABP                              
      mean                  NA      NA 
    SYSBP                              
      mean                 151.7   0.0 
  B: Placebo (N=134)                   
    DIABP                              
      mean                 101.1   0.0 
    SYSBP                              
      mean                 149.5   0.0 
...
anajens added the bug and sme labels on Sep 13, 2023

anajens commented Sep 14, 2023

@Melkiades I added a second example to the issue.

tl;dr : alt_counts_df (ADSL) needs to be pre-processed to include potentially all variables from the row splits in the layout (depending on type of split functions used).

From the rtables perspective this makes sense; it's just not very user-friendly.
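
For reference, the pre-processing from the first example can be sketched generically. This is only a minimal sketch: `expand_alt_counts()` and its `outer_vars` argument are hypothetical names, not part of rtables, and the toy data frames merely stand in for ADSL and ADVS. It assumes the variables passed in are row splits that sit above the `add_rowcounts()` level (like AVISIT), because crossing on a variable below that level would inflate the N values. For variables that are only needed by a split function, such as PARAMCD with `trim_levels_in_group`, the all-NA factor column from Example 2 is the safer choice.

```r
# Hypothetical helper, not part of rtables: cross-join the population data
# with the observed levels of each outer row-split variable it lacks.
expand_alt_counts <- function(alt_df, df, outer_vars) {
  stopifnot(all(outer_vars %in% names(df)))
  for (v in setdiff(outer_vars, names(alt_df))) {
    x <- df[[v]]
    lvls <- if (is.factor(x)) levels(droplevels(x)) else sort(unique(x))
    grid <- data.frame(factor(lvls, levels = lvls))
    names(grid) <- v
    # by = NULL makes merge() return the Cartesian product of both inputs
    alt_df <- merge(alt_df, grid, by = NULL)
  }
  alt_df
}

# Toy data standing in for ADSL and ADVS (illustrative only)
adsl_toy <- data.frame(USUBJID = 1:4, ARM = factor(c("A", "A", "B", "B")))
advs_toy <- data.frame(
  USUBJID = rep(1:4, each = 2),
  AVISIT = factor(rep(c("BASELINE", "WEEK 1 DAY 8"), times = 4))
)

# Every subject is repeated once per visit level
adsl_toy_visit <- expand_alt_counts(adsl_toy, advs_toy, outer_vars = "AVISIT")
```

With the real data from the first example, the equivalent call would presumably be `build_table(lyt, advs, alt_counts_df = expand_alt_counts(adsl, advs, "AVISIT"))`, which avoids hard-coding the visit labels.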

@edelarua

Related to #535
