Enable different tests per cutting for `sim_gs_n()` #229

jdblischak · 2024-04-18T20:00:25Z

My previous PR #215 enabled passing a different test function to be applied at each cutting orchestrated by sim_gs_n(). However, it was quite limited because it only worked when the test functions gave the exact same output (which was rare).

The recent PR #227 from @LittleBeannie standardized the output format of the test functions (see Issue #222 for tracking the progress of this effort). This means we can now combine the results from more diverse test functions.

The trickiest part of this PR was accommodating the tests that return more than one result (milestone() returns 2 values for se and maxcombo() returns 2 values for z). To continue returning a single data frame, I had to use list columns.
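As a rough illustration of the list-column idea, here is a minimal base R sketch. It does not use simtrial; the `results` list is a hypothetical stand-in for what two test functions might return, with values loosely echoing the example output in this description.

```r
# Hypothetical per-test results; not actual simtrial return objects.
results <- list(
  list(method = "WLR", estimation = -28.195, se = 7.521, z = -3.749),
  list(method = "milestone", estimation = 0.175, se = c(0.0350, 0.0348), z = 12.078)
)

# Scalar fields become ordinary atomic columns.
combined <- data.frame(
  method = vapply(results, function(r) r$method, character(1)),
  estimation = vapply(results, function(r) r$estimation, numeric(1)),
  z = vapply(results, function(r) r$z, numeric(1))
)

# `se` may have length 1 or 2, so it is stored as a list column:
# one list element per row, each holding a numeric vector.
combined$se <- lapply(results, function(r) r$se)

lengths(combined$se)
#> [1] 1 2
```

The trade-off is that a list column cannot be used directly in vectorized arithmetic; callers need something like `lengths()` or `vapply()` to inspect it.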

With this PR, it is now possible to combine the results of wlr() (with any arguments), rmst(), and milestone(). Below is example output where wlr() was applied at the first cutting, rmst() at the second cutting, and milestone() at the third cutting:

head(result)
##      method          parameter  estimation                     se         z analysis cut_date sim_id
## 1       WLR FH(rho=0, gamma=0) -28.1948254               7.521418 -3.748605        1 24.00000      1
## 2      RMST                 20   2.8531309              0.7147367  3.991863        2 32.00000      1
## 3 milestone                 10   0.1752731 0.03495270, 0.03484071 12.078381        3 45.00000      1
## 4       WLR FH(rho=0, gamma=0) -26.8487111               7.721484 -3.477144        1 24.00000      2
## 5      RMST                 20   2.3996924              0.7282783  3.295021        2 32.00000      2
## 6 milestone                 10   0.1173894 0.03543263, 0.03491216  5.457930        3 46.21933      2
##     n event
## 1 400   229
## 2 400   295
## 3 400   355
## 4 400   241
## 5 400   290
## 6 400   350

str(result)
## 'data.frame':	9 obs. of  10 variables:
##  $ method    : chr  "WLR" "RMST" "milestone" "WLR" ...
##  $ parameter : chr  "FH(rho=0, gamma=0)" "20" "10" "FH(rho=0, gamma=0)" ...
##  $ estimation: num  -28.195 2.853 0.175 -26.849 2.4 ...
##  $ se        :List of 9
##   ..$ : num 7.52
##   ..$ : num 0.715
##   ..$ : num  0.035 0.0348
##   ..$ : num 7.72
##   ..$ : num 0.728
##   ..$ : num  0.0354 0.0349
##   ..$ : num 7.5
##   ..$ : num 0.742
##   ..$ : num  0.0354 0.0347
##  $ z         : num  -3.75 3.99 12.08 -3.48 3.3 ...
##  $ analysis  : int  1 2 3 1 2 3 1 2 3
##  $ cut_date  : num  24 32 45 24 32 ...
##  $ sim_id    : int  1 1 1 2 2 2 3 3 3
##  $ n         : int  400 400 400 400 400 400 400 400 400
##  $ event     : num  229 295 355 241 290 350 226 282 350

Future work:

- The results from maxcombo() are still too heterogeneous to combine with the other test functions.
- If we are satisfied with this setup, I can send a future PR to enable applying multiple tests per cutting.

LittleBeannie · 2024-04-22T15:07:03Z

Hi @jdblischak, can we give an error message when people implement maxcombo test in sim_gs_n? Besides, shall we delete the examples with maxcombo tests?

jdblischak · 2024-04-22T17:35:09Z

> can we give an error message when people implement maxcombo test in sim_gs_n?

Unfortunately that is not straightforward since create_test() obscures the name of the original test function. Below is what the argument test, a list of functions, looks like when passed to sim_gs_n(). I generated this when passing wlr(), rmst(), and milestone() tests created with create_test().

str(test)
List of 3
 $ :function (data)  
  ..- attr(*, "srcref")= 'srcref' int [1:8] 380 3 382 3 3 3 380 382
  .. ..- attr(*, "srcfile")=Classes 'srcfilecopy', 'srcfile' <environment: 0x0000025bb811edb0> 
 $ :function (data)  
  ..- attr(*, "srcref")= 'srcref' int [1:8] 380 3 382 3 3 3 380 382
  .. ..- attr(*, "srcfile")=Classes 'srcfilecopy', 'srcfile' <environment: 0x0000025bb811edb0> 
 $ :function (data)  
  ..- attr(*, "srcref")= 'srcref' int [1:8] 380 3 382 3 3 3 380 382
  .. ..- attr(*, "srcfile")=Classes 'srcfilecopy', 'srcfile' <environment: 0x0000025bb811edb0>
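To see why the wrapped functions are indistinguishable from the outside, consider this minimal sketch of a function factory. `make_test()` is a hypothetical stand-in for create_test() and the two toy functions stand in for wlr(), rmst(), etc.; simtrial's real implementation may differ.

```r
# Hypothetical stand-in for create_test(); simtrial's real code may differ.
make_test <- function(test, ...) {
  function(data) test(data, ...)
}

# Two toy "test functions" standing in for wlr(), rmst(), etc.
toy_mean <- function(data, ...) list(method = "mean", estimate = mean(data))
toy_median <- function(data, ...) list(method = "median", estimate = stats::median(data))

t1 <- make_test(toy_mean)
t2 <- make_test(toy_median)

# The wrappers look the same from the outside...
same_body <- identical(body(t1), body(t2))
same_args <- identical(formals(t1), formals(t2))

# ...and differ only in the function captured in their enclosing environment.
same_captured <- identical(environment(t1)$test, environment(t2)$test)
```

Under this assumption, both wrappers share the signature `function (data)` and the same body, so recovering the original test's name would require inspecting the enclosing environment rather than the wrapper itself.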

I could try to add some error catching code that notices when the number of columns (or the column names) is different and remind the user that the test functions must return consistent results in order to be combined. But I'd prefer to address that in a future PR.
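A check along those lines could look roughly like the sketch below. `check_consistent_names()` is a hypothetical helper, not part of simtrial, and the `ok` and `bad` inputs are made-up results used only to exercise it.

```r
# Hypothetical helper; not part of simtrial.
check_consistent_names <- function(results) {
  reference <- names(results[[1]])
  mismatched <- which(!vapply(
    results,
    function(res) identical(names(res), reference),
    logical(1)
  ))
  if (length(mismatched) > 0) {
    stop(
      "Test functions must return results with identical names to be combined. ",
      "Mismatch at position(s): ", paste(mismatched, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Made-up results: `ok` is consistent, `bad` is not.
ok <- list(
  data.frame(method = "WLR", z = -3.7),
  data.frame(method = "RMST", z = 4.0)
)
bad <- list(
  data.frame(method = "WLR", z = -3.7),
  data.frame(method = "MaxCombo", p_value = 0.001)
)

check_consistent_names(ok)
msg <- tryCatch(check_consistent_names(bad), error = conditionMessage)
```

Comparing names rather than function identities sidesteps the problem that the wrapped functions do not carry the name of the original test.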

> Besides, shall we delete the examples with maxcombo tests?

I agree we should delete the examples that combine maxcombo with other tests, i.e., the current examples 8 and 9. I'll delete those now.

simtrial/R/sim_gs_n.R

Lines 200 to 201 in 42290ea

#' # Test 8: MaxCombo (WLR-FH(0,0.5) + milestone(10))
#' # for all analyses

simtrial/R/sim_gs_n.R

Lines 216 to 217 in 42290ea

#' # Test 9: MaxCombo (WLR-FH(0,0) at IAs
#' # and WLR-FH(0,0) + milestone(10) + WLR-MB(4,2) at FA)

However, I think we can leave example 7, which applies maxcombo at every cutting.

simtrial/R/sim_gs_n.R

Lines 186 to 187 in 42290ea

#' # Test 7: MaxCombo (WLR-FH(0,0) + WLR-FH(0, 0.5))
#' # for all analyses

jdblischak · 2024-04-22T17:54:20Z

The CI jobs are failing because the Suggested dependency {bshazard} was archived on Saturday, 2024-04-20.

https://cran.r-project.org/package=bshazard

The CRAN package check problems appear trivial to fix (documentation related), so I am guessing it is no longer maintained.

https://cran-archive.r-project.org/web/checks/2024/2024-04-20_check_results_bshazard.html

LittleBeannie · 2024-04-22T18:38:06Z

Yeah, I got the same message. Could you please suggest a way to fix it?

jdblischak · 2024-04-22T18:46:48Z

> Could you please suggest a way to fix it?

Long term, we should stop using it in the vignette arbitrary-hazard.Rmd

simtrial/vignettes/arbitrary-hazard.Rmd

Line 100 in 42290ea

fit <- bshazard(Surv(tte, event) ~ 1, data = y, nk = 120)

In the short term, if we are hoping the maintainers might fix it, we can install it from the CRAN mirror on GitHub by adding the following to DESCRIPTION:

Remotes:  
    cran/bshazard

Note that we won't be able to submit to CRAN using the GitHub-installed version

You can read more about the Remotes field at the links below

https://r-pkgs.org/dependencies-in-practice.html#sec-dependencies-nonstandard
https://devtools.r-lib.org/articles/dependencies.html
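Separately from the Remotes approach, a common interim pattern for Suggested packages is to guard their use so that checks still pass when the package is unavailable. This was not proposed in the thread; the sketch below only illustrates the general idea, and the variable name `has_bshazard` is hypothetical.

```r
# TRUE only if the Suggested package can be loaded on this machine.
has_bshazard <- requireNamespace("bshazard", quietly = TRUE)

# In an R Markdown vignette, this flag could be passed as a chunk option,
# e.g. {r, eval = has_bshazard}, so the bshazard() call is skipped when absent.
if (!has_bshazard) {
  message("Package 'bshazard' is not installed; skipping the smoothed hazard fit.")
}
```

This keeps the vignette buildable either way, at the cost of omitting the affected output when the package is missing.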

LittleBeannie · 2024-04-22T18:50:38Z

Can users install bshazard from anywhere?

jdblischak · 2024-04-22T18:55:34Z

> Can users install bshazard from anywhere?

Technically yes, but I don't think that helps us. We need this package to be installed in CI and on CRAN to pass R CMD check. Also, relying on https://github.com/cran/bshazard isn't a great long-term option. This is simply an unofficial mirror of packages available on CRAN. I couldn't find any explicit policy on what they do with archived packages, but certainly there is no guarantee it will continue to exist now that CRAN has archived it.

LittleBeannie

Thanks for working on this and continuously improving sim_gs_n. After reviewing, I have 1 minor comment and 1 question.

1 Minor comment

Can we arrange the order of the output? The screenshot below shows the column order we prefer.

1 Question

It appears that the current version of sim_gs_n only permits one type of test throughout the analyses. I attempted a similar example, but it did not run successfully (code attached). In this particular case, I aimed to conduct a WLR-FH test at IA1, WLR-MB test at IA2, while performing a milestone test at FA. Is it possible to incorporate this functionality into sim_gs_n as well?

I have added this example as Example 7 of the help file in sim_gs_n at my commit 6cb1d39. Can we get this example to work?

I talked with Keaven this afternoon, and he suggested adding multiple tests at a single analysis. Examples are provided in the help file in sim_gs_n at my commit 8173979 (see example 8). Can we also get this example to work?

jdblischak · 2024-04-26T14:55:35Z

> It appears that the current version of sim_gs_n only permits one type of test throughout the analyses.

The sole purpose of this PR is to enable different tests per cutting. The caveat is that the tests must return results in the same format (identical column names), as discussed in #222.

Currently WLR-FH and WLR-MB return incompatible results. The former uses estimation and the latter uses estimate.

library("simtrial")
x <- sim_pw_surv(n = 200) |> cut_data_by_event(100)

ia1_test <- create_test(wlr, weight = fh(rho = 0, gamma = 0.5))
ia2_test <- create_test(wlr, weight = mb(delay = 6, w_max = Inf))

x |> ia1_test() |> names()
## [1] "method"     "parameter"  "estimation" "se"         "z"  
x |> ia2_test() |> names()
## [1] "method"    "parameter" "estimate"  "se"        "z"
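To make the incompatibility concrete, here is a minimal base R sketch of how such results could be harmonized before being row-bound. The two data frames and the `harmonize_names()` helper are hypothetical; in this PR the mismatch was instead resolved in the test functions themselves (see the commit "The test functions now return "estimate" instead of "estimation"").

```r
# Hypothetical results mimicking the name mismatch shown above.
res_fh <- data.frame(method = "WLR", estimation = -28.2, se = 7.5, z = -3.7)
res_mb <- data.frame(method = "WLR", estimate = -30.1, se = 8.1, z = -3.7)

# Hypothetical helper: rename "estimation" to "estimate" when present.
harmonize_names <- function(res) {
  names(res)[names(res) == "estimation"] <- "estimate"
  res
}

# rbind() requires matching column names, so harmonize first.
combined <- do.call(rbind, lapply(list(res_fh, res_mb), harmonize_names))
names(combined)
#> [1] "method"   "estimate" "se"       "z"
```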

jdblischak · 2024-04-26T17:01:54Z

> Can we arrange the order of the output? The screenshot below shows the column order we prefer.

Kind of. Technically sim_gs_n() is supposed to be agnostic to the names returned by the test function. For example, this is why maxcombo() can be used by itself even though it returns different names compared to wlr(), etc.

I've updated the column order to the following. Is this close enough to what you want? If it's really important, I can move method and parameter to the 2nd and 3rd positions, with the caveat that this will reduce the flexibility, i.e., all test functions will then be required to return method and parameter.

  sim_id analysis cut_date   n event method          parameter  estimate       se         z
1      1        1 24.00000 400   229    WLR FH(rho=0, gamma=0) -28.19483 7.521418 -3.748605
2      1        2 32.00000 400   295    WLR FH(rho=0, gamma=0) -38.32581 8.459808 -4.530340
3      1        3 45.00000 400   355    WLR FH(rho=0, gamma=0) -39.49230 9.149248 -4.316453
4      2        1 24.00000 400   241    WLR FH(rho=0, gamma=0) -26.84871 7.721484 -3.477144
5      2        2 32.00000 400   290    WLR FH(rho=0, gamma=0) -32.54824 8.425310 -3.863150
6      2        3 46.21933 400   350    WLR FH(rho=0, gamma=0) -30.06631 9.172772 -3.277778
7      3        1 24.00000 400   226    WLR FH(rho=0, gamma=0) -23.06302 7.498065 -3.075863
8      3        2 32.00000 400   282    WLR FH(rho=0, gamma=0) -30.16330 8.333910 -3.619345
9      3        3 50.86585 400   350    WLR FH(rho=0, gamma=0) -38.75506 9.178027 -4.222592

LittleBeannie

Thanks, @jdblischak, it looks great to me and I only have 1 minor comment about the column order. Could you please order the columns as shown in the following screenshot?

milestone() was recently overhauled to:

1. Use the log-log method to calculate the `estimate`. Setting `test_type = "naive"` restored the previous value
2. Return a single value for `se` instead of two
3. Return the unsquared Z statistic for `z`

Merck#237

jdblischak · 2024-05-07T14:17:58Z

I updated the column order so that "method" and "parameter" are the 2nd and 3rd columns

  sim_id method          parameter analysis cut_date   n event  estimate       se         z
1      1    WLR FH(rho=0, gamma=0)        1 24.00000 400   229 -28.19483 7.521418 -3.748605
2      1    WLR FH(rho=0, gamma=0)        2 32.00000 400   295 -38.32581 8.459808 -4.530340
3      1    WLR FH(rho=0, gamma=0)        3 45.00000 400   355 -39.49230 9.149248 -4.316453
4      2    WLR FH(rho=0, gamma=0)        1 24.00000 400   241 -26.84871 7.721484 -3.477144
5      2    WLR FH(rho=0, gamma=0)        2 32.00000 400   290 -32.54824 8.425310 -3.863150
6      2    WLR FH(rho=0, gamma=0)        3 46.21933 400   350 -30.06631 9.172772 -3.277778
7      3    WLR FH(rho=0, gamma=0)        1 24.00000 400   226 -23.06302 7.498065 -3.075863
8      3    WLR FH(rho=0, gamma=0)        2 32.00000 400   282 -30.16330 8.333910 -3.619345
9      3    WLR FH(rho=0, gamma=0)        3 50.86585 400   350 -38.75506 9.178027 -4.222592
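One way to get this ordering while staying tolerant of test functions that return extra or missing columns is sketched below. `reorder_columns()` is a hypothetical helper written for illustration; the actual code in sim_gs_n() may be implemented differently.

```r
# Hypothetical helper; simtrial's actual implementation may differ.
reorder_columns <- function(df) {
  # Preferred leading columns, in order; any that are absent are skipped.
  front <- c("sim_id", "method", "parameter", "analysis", "cut_date", "n", "event")
  front <- intersect(front, names(df))
  # Remaining columns keep their original relative order.
  df[, c(front, setdiff(names(df), front)), drop = FALSE]
}

# Made-up single-row input in the "test results first" order.
raw <- data.frame(
  method = "WLR", parameter = "FH(rho=0, gamma=0)", estimate = -28.19,
  se = 7.52, z = -3.75, analysis = 1L, cut_date = 24, sim_id = 1L,
  n = 400L, event = 229
)

names(reorder_columns(raw))
```

Using `intersect()` rather than indexing by the fixed names means a result without `method` or `parameter` is reordered instead of raising an error, which softens the flexibility caveat mentioned earlier in the thread.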

LittleBeannie

I find it excellent! This demonstrates tremendous effort. Thank you, @jdblischak!

jdblischak · 2024-05-07T20:27:49Z

FYI I just realized when updating locally that man/sim_gs_n.Rd wasn't updated to change the indentation applied by the styler. This is purely cosmetic and has no effect on the software. Just wanted to let you all know in case any of you notice that this file is updated the next time you work on the package

jdblischak requested review from nanxstats and LittleBeannie April 18, 2024 20:00

jdblischak self-assigned this Apr 18, 2024

cmansch mentioned this pull request Apr 19, 2024

Enhancement for doFuture implementation and stable parallel RNG in sim_gs_n #230

Closed

jdblischak mentioned this pull request Apr 23, 2024

Release simtrial 0.4.0 #235

Closed

15 tasks

jdblischak force-pushed the different-tests-per-cutting branch from 31cd5e2 to d67cf5f Compare April 23, 2024 14:56

LittleBeannie reviewed Apr 25, 2024


jdblischak requested a review from LittleBeannie April 26, 2024 17:12

jdblischak mentioned this pull request Apr 26, 2024

Standardize the output format of the test functions #222

Closed

LittleBeannie mentioned this pull request May 2, 2024

Require a single formula syntax for rmst() #242

Merged

LittleBeannie added the development New feature or request label May 6, 2024

jdblischak force-pushed the different-tests-per-cutting branch from 2abc4fa to 73b29d1 Compare May 6, 2024 15:57

LittleBeannie reviewed May 6, 2024


jdblischak and others added 8 commits May 7, 2024 09:58

Uncomment sim_gs_n() tests

9a4164e

Fix sim_gs_n() for wlr() and rmst() test functions

8c994d3

sim_gs_n: use list columns for test functions milestone() and maxcombo()

58fb49c

maxcombo can not be combined with the other test functions

2ea803b

delete the maxcombo test at the examples

b98eab2

test -> example

a31f84e

add example 7

782e121

update Rd file

ad45e1d

LittleBeannie and others added 10 commits May 7, 2024 09:58

cmd check error

3988f45

add example 8: there are multiple tests at a single analysis

decefdd

Fix comments in sim_gs_n() examples

2a94ecd

The test functions now return "estimate" instead of "estimation"

4b278a7

Multiple tests per cut is not supported yet

e7dfada

Rearrange columns of data frame returned by sim_gs_n()

f875edf

Update sim_gs_n() tests

a3172c5

Apply styler:::style_active_pkg()

4d0a601

sim_gs_n: move "method" and "parameter" to 2nd and 3rd columns

19a6683

jdblischak force-pushed the different-tests-per-cutting branch from 0b03e6c to 19a6683 Compare May 7, 2024 14:14

jdblischak requested a review from LittleBeannie May 7, 2024 14:18

LittleBeannie approved these changes May 7, 2024


nanxstats approved these changes May 7, 2024


nanxstats merged commit bac3da0 into Merck:main May 7, 2024
7 checks passed

jdblischak deleted the different-tests-per-cutting branch May 7, 2024 20:22

nanxstats mentioned this pull request May 7, 2024

Run roxygen2 #248

Merged

cmansch mentioned this pull request May 13, 2024

Adding %doFuture% parallel framework for the sim_gs_n.R code and upda… #249

Merged

jdblischak mentioned this pull request Jun 12, 2024

Enable multiple tests per cut for sim_gs_n() #258

Open

jdblischak mentioned this pull request Aug 20, 2024

sim_gs_n: combine hetergenous test columns, fill missing with NAs #277

Merged
