Skip to content

Commit

Permalink
Merge pull request #285 from Tsumiyama-Isao/jonckheere
Browse files Browse the repository at this point in the history
Jonckheere
  • Loading branch information
statasaurus authored Oct 10, 2024
2 parents 1db1b36 + f67f4a7 commit 10fc881
Show file tree
Hide file tree
Showing 10 changed files with 298 additions and 17 deletions.
60 changes: 60 additions & 0 deletions Comp/r-sas_jonckheere.qmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,60 @@
---
title: "R vs SAS on the Jonckheere-Terpstra test"
---

## Comparison

+----------------------------------------------------------------------------+----------------+------------------+--------------------+------------------------------------------------------------+
| Analysis | Supported in R | Supported in SAS | Results Match | Notes |
+============================================================================+================+==================+====================+============================================================+
| Jonckheere-Terpstra test using normal approximation | Yes | Yes | Partial match | The test statistics was 184.5 from both languages. |
| | | | | |
| | | | | Regarding the p-value, R yields 0.002655, and SAS 0.002649 |
+----------------------------------------------------------------------------+----------------+------------------+--------------------+------------------------------------------------------------+
| Jonckheere-Terpstra test using Monte Carlo approximation for an exact test | Yes | Yes | Partially matching | The resampling number is 10000. |
| | | | | |
| | | | | The test statistics was 184.5 from both languages. |
| | | | | |
| | | | | Regarding the p-value, R yields 0.0023, and SAS 0.0016 |
+----------------------------------------------------------------------------+----------------+------------------+--------------------+------------------------------------------------------------+

## Conclusion
### Results from normal approximation
For the test using normal approximation, the results look slightly different.
The reason for this gap may be either of the following.

- Continuity correction
- Handling of ties in calculating the variance of the test statistics
- Numerical integration for normal distribution

Regarding continuity correction, the SAS manual mentions that PROC FREQ does not
apply it. The DescTools manual does not mention anything about this point.

Regarding variance of the test statistics, it depends only on the "cell counts"
in the context of cross tabulation analysis. From the viewpoint of rank tests,
it depends on the frequencies of each tie values. However, since the same test
statistics value was given by both R and SAS, it is less likely that a gap exists
in calculation variance between languages.

Based on consideration above, the gap looks acceptable. However, it should
kept in mind that R and SAS may take different approaches in continuity
correction.

### Results from Monte Carlo approximation of an exact test
For the test using simulation, the results also look slightly different.

As mentioned above, R and SAS may take different approaches in continuity
correction and calculation of variance for the test statistics. In addition,
simulation-based results generally differ between different environments.

The $95 \%$ CI for the approximate p-value given by SAS was $[0.0008, 0.0024]$.
Since the p-value from R, 0.0023, locates within the CI, this result looks
comparable.

### Overall conclusion
Overall, the gap between R and SAS is accaptable regarding the Jonckheere-Terpstra
test. However, users should know that R and SAS may take different approaches
for the following aspects:

- Continuity correction
- Handling of ties in calculating the variance of the test statistics
69 changes: 69 additions & 0 deletions R/jonckheere.qmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,69 @@
---
title: "Jonckheere-Terpstra test"
---

## Available R packages

As far as I know, the following packages are available:

- DescTools
- clinfun
- PMCMRplus
- fastJT

Due to availability in the company, DescTools version 0.99.55 is used to compare the results with SAS. Of these packages DescTools is the most common.

## Data used

The data for testing is a sample dataset on a dose-response study.

![](/images/jonckheere/jonck_bp.png)

The Group indicates a dose of a drug. The scores for Group represent ordering of dose arms. Then the boxplot implies a declining dose-response relationship.

## Example Code

```{r}
library(DescTools)
library(ggplot2)
library(readr)
#
# Constants
k_n_samp <- 10000
set.seed(4989)
#
# The input dataset is imported.
#
inds <- read_csv("../data/jonck.csv", col_select = c(DOSE, value))
#
# Analysis
#
jt_norm <- DescTools::JonckheereTerpstraTest(
value ~ DOSE,
alternative = "decreasing",
data = inds
)
jt_resamp <- DescTools::JonckheereTerpstraTest(
value ~ DOSE,
alternative = "decreasing",
data = inds,
nperm = k_n_samp
)
jt_norm
jt_resamp
```



## Reference

Signorell A (2024). DescTools: Tools for Descriptive Statistics. R package version 0.99.55, https://github.com/AndriSignorell/DescTools/, https://andrisignorell.github.io/DescTools/.
46 changes: 36 additions & 10 deletions SAS/jonchkheere_terpstra.qmd
Original file line number Diff line number Diff line change
Expand Up @@ -12,19 +12,19 @@ The JT test is particularly well-suited for dose-response or trend analysis with

To request Jonckheere-Terpstra test, specify the **JT** option in the Table statement like below:

```{r, eval=FALSE}
```{r, eval=FALSE}
Proc freq; table Var1 * Var2 / JT ; Quit;
```

The JT option in the TABLES statement provides the Jonckheere-Terpstra test.

PROC FREQ also provides exact p-values for the Jonckheere-Terpstra test. You can request the exact test by specifying the **JT** option in the EXACT statement.$^{[3]}$
PROC FREQ also provides exact p-values for the Jonckheere-Terpstra test. You can request the exact test by specifying the **JT** option in the EXACT statement.$^{[3]}$

## Data used 1

This dataset has been generated using example data which aligned with the specifications outlined in the section on the Jonckheere–Terpstra test from reference \[5\]. It represents the duration of hospital stays for a randomly selected group of patients across three distinct ICU departments: cardiothoracic, medical, and neurosurgical.

```{r, eval=FALSE}
```{r, eval=FALSE}
data ICU_Stay;
input ICU $ Stay;
label Stay = 'Length of Stay in Days';
Expand Down Expand Up @@ -53,13 +53,14 @@ proc sort data=ICU_Stay;
run;
```


## Example Code using 1

The code performs a frequency analysis on the 'ICU_Stay' dataset, examining the relationship between 'ICU' and 'Stay' variables. It applies the Jonckheere-Terpstra test using JT option to identify trends in the ordered categorical 'Stay' variable. The output is streamlined by omitting percentages and totals for columns and rows with the 'nopercent nocol norow' options, emphasizing the Jonckheere-Terpstra test outcomes.

```{r, eval=FALSE}
proc freq data=ICU_Stay; table ICU * Stay / JT nopercent nocol norow; run;
```{r, eval=FALSE}
proc freq data=ICU_Stay;
table ICU * Stay / JT nopercent nocol norow;
run;
```

Expand All @@ -73,7 +74,7 @@ Comparing this with a standard Normal distribution gives a P value of 0.005, ind

This dataset incorporates illustrative data extracted from reference \[3\]. It encapsulates the responses of subjects randomly assigned to one of four treatment arms: placebo, low dosage(20mg), medium dosage(60mg), and high dosage(180mg). The variable of interest is a continuous measure. The variable 'groupn' is used to provide an order of 'group'.

```{r, eval=FALSE}
```{r, eval=FALSE}
data contin;
input groupn group $ subject response;
cards;
Expand Down Expand Up @@ -109,14 +110,39 @@ run;

The code is performing a Jonckheere-Terpstra trend test on a continuous 'response' variable, categorized by a 'group' variable, using the 'proc freq' procedure. The analysis is applied to the dataset named 'contin'. The result is presented with a title "Jonckheere-Terpstra Trend Test for Continuous Data", indicating the specific nature of the test being conducted. The 'JT' option is used to specify the Jonckheere-Terpstra test.

```{r, eval=FALSE}
proc freq data=contin; tables group * response/JT; title "Jonckheere-Terpstra Trend Test for Continuous Data"; run;
```{SAS, eval=FALSE}
proc freq data=contin;
tables group * response/JT;
title "Jonckheere-Terpstra Trend Test for Continuous Data";
run;
```

## Test Result 2

![Test Result 2](../../CAMIS/images/jonchkheere_terpstra/result2.png "Test Result 2")

There is a significant trend across different groups in the response gives a P value of <.0001.
There is a significant trend across different groups in the response gives a P value of \<.0001.

## EXACT Options
With EXACT statement, the exact version and it Monte Carlo approximation can be also conducted. However, it should be noted that the exact test, i.e., a permuation test takes a long time to compelete the task even for a small dataset.

```{SAS, eval = FALSE}
proc freq data = inds;
title "Asymptotic p-value calculation";
table ICU * Stay / jt;
ods output JTTest = o_jt;
run;
proc freq data = inds;
title "Approximation of exact test by resampling";
table ICU * Stay / jt;
exact jt / mc seed = 4989 n = 10000 alpha = 0.05;
ods output JTTestMC = o_jt_sim;
run;
```



## Conclusion

Expand Down
41 changes: 41 additions & 0 deletions data/jonck.csv
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
"","DOSE","value"
"1",0,153
"2",0,134.2
"3",0,139
"4",0,124.2
"5",0,157.4
"6",0,128.8
"7",0,140.1
"8",0,134.1
"9",0,129.1
"10",0,135.3
"11",1,124.6
"12",1,129.5
"13",1,138.9
"14",1,124
"15",1,124.6
"16",1,124
"17",1,135.4
"18",1,129
"19",1,118.5
"20",1,136.3
"21",2,140.1
"22",2,141.3
"23",2,122.8
"24",2,143.1
"25",2,116.4
"26",2,117.5
"27",2,136.5
"28",2,130.9
"29",2,126.2
"30",2,139.2
"31",3,115.7
"32",3,115.6
"33",3,118.6
"34",3,121.2
"35",3,131.7
"36",3,109.8
"37",3,133.8
"38",3,118.6
"39",3,132.9
"40",3,120.6
Binary file added images/jonckheere/jonck_bp.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file not shown.
Binary file removed images/mmrm/review-treatment-bcva-1.png
Binary file not shown.
Binary file removed images/mmrm/review-treatment-bcva-2.png
Binary file not shown.
Binary file removed images/mmrm/review-treatment-fev-1.png
Binary file not shown.
Loading

0 comments on commit 10fc881

Please sign in to comment.