diff --git a/Comp/r-sas_jonckheere.qmd b/Comp/r-sas_jonckheere.qmd new file mode 100644 index 000000000..f9c925f75 --- /dev/null +++ b/Comp/r-sas_jonckheere.qmd @@ -0,0 +1,60 @@ +--- +title: "R vs SAS on the Jonckheere-Terpstra test" +--- + +## Comparison + ++----------------------------------------------------------------------------+----------------+------------------+--------------------+------------------------------------------------------------+ +| Analysis | Supported in R | Supported in SAS | Results Match | Notes | ++============================================================================+================+==================+====================+============================================================+ +| Jonckheere-Terpstra test using normal approximation | Yes | Yes | Partial match | The test statistics was 184.5 from both languages. | +| | | | | | +| | | | | Regarding the p-value, R yields 0.002655, and SAS 0.002649 | ++----------------------------------------------------------------------------+----------------+------------------+--------------------+------------------------------------------------------------+ +| Jonckheere-Terpstra test using Monte Carlo approximation for an exact test | Yes | Yes | Partially matching | The resampling number is 10000. | +| | | | | | +| | | | | The test statistics was 184.5 from both languages. | +| | | | | | +| | | | | Regarding the p-value, R yields 0.0023, and SAS 0.0016 | ++----------------------------------------------------------------------------+----------------+------------------+--------------------+------------------------------------------------------------+ + +## Conclusion +### Results from normal approximation +For the test using normal approximation, the results look slightly different. +The reason for this gap may be either of the following. + +- Continuity correction +- Handling of ties in calculating the variance of the test statistics +- Numerical integration for normal distribution + +Regarding continuity correction, the SAS manual mentions that PROC FREQ does not +apply it. The DescTools manual does not mention anything about this point. + +Regarding variance of the test statistics, it depends only on the "cell counts" +in the context of cross tabulation analysis. From the viewpoint of rank tests, +it depends on the frequencies of each tie values. However, since the same test +statistics value was given by both R and SAS, it is less likely that a gap exists +in calculation variance between languages. + +Based on consideration above, the gap looks acceptable. However, it should +kept in mind that R and SAS may take different approaches in continuity +correction. + +### Results from Monte Carlo approximation of an exact test +For the test using simulation, the results also look slightly different. + +As mentioned above, R and SAS may take different approaches in continuity +correction and calculation of variance for the test statistics. In addition, +simulation-based results generally differ between different environments. + +The $95 \%$ CI for the approximate p-value given by SAS was $[0.0008, 0.0024]$. +Since the p-value from R, 0.0023, locates within the CI, this result looks +comparable. + +### Overall conclusion +Overall, the gap between R and SAS is accaptable regarding the Jonckheere-Terpstra +test. However, users should know that R and SAS may take different approaches +for the following aspects: + +- Continuity correction +- Handling of ties in calculating the variance of the test statistics diff --git a/R/jonckheere.qmd b/R/jonckheere.qmd new file mode 100644 index 000000000..1f286a1e8 --- /dev/null +++ b/R/jonckheere.qmd @@ -0,0 +1,69 @@ +--- +title: "Jonckheere-Terpstra test" +--- + +## Available R packages + +As far as I know, the following packages are available: + +- DescTools +- clinfun +- PMCMRplus +- fastJT + +Due to availability in the company, DescTools version 0.99.55 is used to compare the results with SAS. Of these packages DescTools is the most common. + +## Data used + +The data for testing is a sample dataset on a dose-response study. + +![](/images/jonckheere/jonck_bp.png) + +The Group indicates a dose of a drug. The scores for Group represent ordering of dose arms. Then the boxplot implies a declining dose-response relationship. + +## Example Code + +```{r} + +library(DescTools) +library(ggplot2) +library(readr) + +# +# Constants +k_n_samp <- 10000 + +set.seed(4989) +# +# The input dataset is imported. +# +inds <- read_csv("../data/jonck.csv", col_select = c(DOSE, value)) + + +# +# Analysis +# +jt_norm <- DescTools::JonckheereTerpstraTest( + value ~ DOSE, + alternative = "decreasing", + data = inds +) + +jt_resamp <- DescTools::JonckheereTerpstraTest( + value ~ DOSE, + alternative = "decreasing", + data = inds, + nperm = k_n_samp +) + +jt_norm +jt_resamp + + +``` + + + +## Reference + +Signorell A (2024). DescTools: Tools for Descriptive Statistics. R package version 0.99.55, https://github.com/AndriSignorell/DescTools/, https://andrisignorell.github.io/DescTools/. diff --git a/SAS/jonchkheere_terpstra.qmd b/SAS/jonchkheere_terpstra.qmd index f11b7c589..5bde2bbf5 100644 --- a/SAS/jonchkheere_terpstra.qmd +++ b/SAS/jonchkheere_terpstra.qmd @@ -12,19 +12,19 @@ The JT test is particularly well-suited for dose-response or trend analysis with To request Jonckheere-Terpstra test, specify the **JT** option in the Table statement like below: -```{r, eval=FALSE} +```{r, eval=FALSE} Proc freq; table Var1 * Var2 / JT ; Quit; ``` The JT option in the TABLES statement provides the Jonckheere-Terpstra test. -PROC FREQ also provides exact p-values for the Jonckheere-Terpstra test. You can request the exact test by specifying the **JT** option in the EXACT statement.$^{[3]}$ +PROC FREQ also provides exact p-values for the Jonckheere-Terpstra test. You can request the exact test by specifying the **JT** option in the EXACT statement.$^{[3]}$ ## Data used 1 This dataset has been generated using example data which aligned with the specifications outlined in the section on the Jonckheere–Terpstra test from reference \[5\]. It represents the duration of hospital stays for a randomly selected group of patients across three distinct ICU departments: cardiothoracic, medical, and neurosurgical. -```{r, eval=FALSE} +```{r, eval=FALSE} data ICU_Stay; input ICU $ Stay; label Stay = 'Length of Stay in Days'; @@ -53,13 +53,14 @@ proc sort data=ICU_Stay; run; ``` - ## Example Code using 1 The code performs a frequency analysis on the 'ICU_Stay' dataset, examining the relationship between 'ICU' and 'Stay' variables. It applies the Jonckheere-Terpstra test using JT option to identify trends in the ordered categorical 'Stay' variable. The output is streamlined by omitting percentages and totals for columns and rows with the 'nopercent nocol norow' options, emphasizing the Jonckheere-Terpstra test outcomes. -```{r, eval=FALSE} -proc freq data=ICU_Stay; table ICU * Stay / JT nopercent nocol norow; run; +```{r, eval=FALSE} +proc freq data=ICU_Stay; + table ICU * Stay / JT nopercent nocol norow; +run; ``` @@ -73,7 +74,7 @@ Comparing this with a standard Normal distribution gives a P value of 0.005, ind This dataset incorporates illustrative data extracted from reference \[3\]. It encapsulates the responses of subjects randomly assigned to one of four treatment arms: placebo, low dosage(20mg), medium dosage(60mg), and high dosage(180mg). The variable of interest is a continuous measure. The variable 'groupn' is used to provide an order of 'group'. -```{r, eval=FALSE} +```{r, eval=FALSE} data contin; input groupn group $ subject response; cards; @@ -109,14 +110,39 @@ run; The code is performing a Jonckheere-Terpstra trend test on a continuous 'response' variable, categorized by a 'group' variable, using the 'proc freq' procedure. The analysis is applied to the dataset named 'contin'. The result is presented with a title "Jonckheere-Terpstra Trend Test for Continuous Data", indicating the specific nature of the test being conducted. The 'JT' option is used to specify the Jonckheere-Terpstra test. -```{r, eval=FALSE} -proc freq data=contin; tables group * response/JT; title "Jonckheere-Terpstra Trend Test for Continuous Data"; run; +```{SAS, eval=FALSE} +proc freq data=contin; + tables group * response/JT; + title "Jonckheere-Terpstra Trend Test for Continuous Data"; +run; ``` + ## Test Result 2 ![Test Result 2](../../CAMIS/images/jonchkheere_terpstra/result2.png "Test Result 2") -There is a significant trend across different groups in the response gives a P value of <.0001. +There is a significant trend across different groups in the response gives a P value of \<.0001. + +## EXACT Options +With EXACT statement, the exact version and it Monte Carlo approximation can be also conducted. However, it should be noted that the exact test, i.e., a permuation test takes a long time to compelete the task even for a small dataset. + +```{SAS, eval = FALSE} +proc freq data = inds; + title "Asymptotic p-value calculation"; + table ICU * Stay / jt; + ods output JTTest = o_jt; +run; + +proc freq data = inds; + title "Approximation of exact test by resampling"; + table ICU * Stay / jt; + exact jt / mc seed = 4989 n = 10000 alpha = 0.05; + ods output JTTestMC = o_jt_sim; +run; + +``` + + ## Conclusion diff --git a/data/jonck.csv b/data/jonck.csv new file mode 100644 index 000000000..de219d82b --- /dev/null +++ b/data/jonck.csv @@ -0,0 +1,41 @@ +"","DOSE","value" +"1",0,153 +"2",0,134.2 +"3",0,139 +"4",0,124.2 +"5",0,157.4 +"6",0,128.8 +"7",0,140.1 +"8",0,134.1 +"9",0,129.1 +"10",0,135.3 +"11",1,124.6 +"12",1,129.5 +"13",1,138.9 +"14",1,124 +"15",1,124.6 +"16",1,124 +"17",1,135.4 +"18",1,129 +"19",1,118.5 +"20",1,136.3 +"21",2,140.1 +"22",2,141.3 +"23",2,122.8 +"24",2,143.1 +"25",2,116.4 +"26",2,117.5 +"27",2,136.5 +"28",2,130.9 +"29",2,126.2 +"30",2,139.2 +"31",3,115.7 +"32",3,115.6 +"33",3,118.6 +"34",3,121.2 +"35",3,131.7 +"36",3,109.8 +"37",3,133.8 +"38",3,118.6 +"39",3,132.9 +"40",3,120.6 diff --git a/images/jonckheere/jonck_bp.png b/images/jonckheere/jonck_bp.png new file mode 100644 index 000000000..b68c4e4ee Binary files /dev/null and b/images/jonckheere/jonck_bp.png differ diff --git a/images/mmrm/review-convergence-rate-missingness-1.png b/images/mmrm/review-convergence-rate-missingness-1.png deleted file mode 100644 index 7c6942e10..000000000 Binary files a/images/mmrm/review-convergence-rate-missingness-1.png and /dev/null differ diff --git a/images/mmrm/review-treatment-bcva-1.png b/images/mmrm/review-treatment-bcva-1.png deleted file mode 100644 index 52c06b3f0..000000000 Binary files a/images/mmrm/review-treatment-bcva-1.png and /dev/null differ diff --git a/images/mmrm/review-treatment-bcva-2.png b/images/mmrm/review-treatment-bcva-2.png deleted file mode 100644 index eed43a3fc..000000000 Binary files a/images/mmrm/review-treatment-bcva-2.png and /dev/null differ diff --git a/images/mmrm/review-treatment-fev-1.png b/images/mmrm/review-treatment-fev-1.png deleted file mode 100644 index 249a8f0f3..000000000 Binary files a/images/mmrm/review-treatment-fev-1.png and /dev/null differ diff --git a/renv.lock b/renv.lock index 2290c3a46..9b9b35c94 100644 --- a/renv.lock +++ b/renv.lock @@ -76,6 +76,35 @@ ], "Hash": "cd52c065c9e687c60c56b51f10f7bcd3" }, + "DescTools": { + "Package": "DescTools", + "Version": "0.99.57", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "Exact", + "MASS", + "R", + "Rcpp", + "base", + "boot", + "cli", + "data.table", + "expm", + "gld", + "grDevices", + "graphics", + "httr", + "methods", + "mvtnorm", + "readxl", + "rstudioapi", + "stats", + "utils", + "withr" + ], + "Hash": "3cdd73c0534df2123efc25236ec23295" + }, "DiceDesign": { "Package": "DiceDesign", "Version": "1.10", @@ -86,6 +115,20 @@ ], "Hash": "ac8b12951882c375d1a14f64c93e78f1" }, + "Exact": { + "Package": "Exact", + "Version": "3.3", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R", + "graphics", + "rootSolve", + "stats", + "utils" + ], + "Hash": "72eeb792b6ae3bef92624916c328bcff" + }, "FactoMineR": { "Package": "FactoMineR", "Version": "2.11", @@ -245,13 +288,6 @@ ], "Hash": "0776bf7526869e0286b0463cb72fb211" }, - "PropCIs": { - "Package": "PropCIs", - "Version": "0.3-0", - "Source": "Repository", - "Repository": "CRAN", - "Hash": "83c746e8590a3e64d791daca69f1bf27" - }, "R6": { "Package": "R6", "Version": "2.5.1", @@ -1265,6 +1301,17 @@ ], "Hash": "790c46974d99ff51442f4d134b2d70eb" }, + "expm": { + "Package": "expm", + "Version": "1.0-0", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "Matrix", + "methods" + ], + "Hash": "a44b2810f36c1cda5d52eee6ec96cafa" + }, "factoextra": { "Package": "factoextra", "Version": "1.0.7", @@ -1608,6 +1655,19 @@ ], "Hash": "c5ba8f5056487403a299b91984be86ca" }, + "gld": { + "Package": "gld", + "Version": "2.6.6", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "e1071", + "graphics", + "lmom", + "stats" + ], + "Hash": "71173258033324618dc8a09b3e27269e" + }, "glmnet": { "Package": "glmnet", "Version": "4.1-8", @@ -2333,6 +2393,18 @@ ], "Hash": "16a08fc75007da0d08e0c0388c7c33e6" }, + "lmom": { + "Package": "lmom", + "Version": "3.2", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R", + "graphics", + "stats" + ], + "Hash": "a75d2dd42af9b4cb291815889cdcff77" + }, "lmtest": { "Package": "lmtest", "Version": "0.9-40", @@ -3519,6 +3591,19 @@ ], "Hash": "062470668513dcda416927085ee9bdc7" }, + "rootSolve": { + "Package": "rootSolve", + "Version": "1.8.2.4", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R", + "grDevices", + "graphics", + "stats" + ], + "Hash": "c6fa270a97604238a5ce5fe5d327fdad" + }, "rpact": { "Package": "rpact", "Version": "4.1.0",