Merge pull request #137 from dbosak01/main

Added Paired T-Test
PSIAIMS · Jan 30, 2024 · 46bf241
2 parents 0268e66 + fcc1a37
Showing 9 changed files with 319 additions and 7 deletions.
diff --git a/Comp/r-sas_ttest_Paired.qmd b/Comp/r-sas_ttest_Paired.qmd
@@ -0,0 +1,47 @@
+---
+title: "R vs SAS Paired T-Test"
+---
+
+```{r setup, include=FALSE}
+knitr::opts_chunk$set(echo = TRUE)
+library(procs)
+```
+
+# Paired t-test Comparison
+
+The following table shows the types of Paired t-test analysis, the capabilities of each language, and whether or not the results from each language match.
+
+| Analysis                      | Supported in R                            | Supported in SAS                       | Results Match    | Notes                                                 |
+|---------------|---------------|---------------|---------------|---------------|
+| Paired t-test, normal data    | [Yes](../R/ttest_Paired.html#normal)      | [Yes](../SAS/ttest_Paired.html#normal) | [Yes](#normal)   | In Base R, use `paired = TRUE` on `t.test()` function |
+| Paired t-test, lognormal data | [Maybe](../R/ttest_Paired.html#lognormal) | [Yes](../SAS/ttest_Paired.html#lognormal) | [NA](#lognormal) | May be supported by **EnvStats** package              |
+
+## Comparison Results
+
+### Normal Data {#normal}
+
+Here is a table of comparison values between `t.test()`, `proc_ttest()`, and SAS `PROC TTEST`:
+
+| Statistic          | t.test() | proc_ttest() | PROC TTEST | Match | Notes |
+|--------------------|----------|--------------|------------|-------|-------|
+| Degrees of Freedom | 11       | 11           | 11         | Yes   |       |
+| t value            | -1.089648 | -1.089648   | -1.089648  | Yes   |       |
+| p value            | 0.2992   | 0.2992       | 0.2992     | Yes   |       |
+
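As a minimal sketch of how the `t.test()` column of the table above could be reproduced, the following Base R code pulls the three statistics out of the object that `t.test()` returns. The vector names `sbp_before` and `sbp_after` and the `comparison` data frame are illustrative choices, not part of the original analysis; the values are the blood pressure data from the R example page.

```r
# Blood pressure data from the paired t-test example (names are illustrative)
sbp_before <- c(120, 124, 130, 118, 140, 128, 140, 135, 126, 130, 126, 127)
sbp_after  <- c(128, 131, 131, 127, 132, 125, 141, 137, 118, 132, 129, 135)

# Paired t-test; the result is an "htest" list
res <- t.test(sbp_before, sbp_after, paired = TRUE)

# Collect the statistics that appear in the comparison table
comparison <- data.frame(
  statistic = c("Degrees of Freedom", "t value", "p value"),
  value     = c(unname(res$parameter), unname(res$statistic), res$p.value)
)
print(comparison)
```

Extracting `parameter`, `statistic`, and `p.value` directly avoids transcribing numbers from printed output when comparing against SAS.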
+### Lognormal Data {#lognormal}
+
+Since there is currently no known support for a lognormal paired t-test in R, this comparison is not applicable.
+
+# Summary and Recommendation
+
+For normal data, the R paired t-test capabilities are comparable to SAS. Comparisons between SAS and R show identical results for the datasets tried. The **procs** package `proc_ttest()` function is very similar to SAS in both syntax and output. `proc_ttest()` also supports BY groups, whereas `t.test()` does not.
+
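Because `t.test()` has no BY-group argument, a by-group analysis in Base R needs an explicit split. The sketch below shows one possible way to do this with `split()` and `lapply()`. The `site` column is hypothetical and was added only for illustration; it is not part of the source data, so the per-group results carry no meaning beyond demonstrating the pattern.

```r
# Blood pressure data with a HYPOTHETICAL grouping column `site`
pressure <- data.frame(
  site      = rep(c("A", "B"), each = 6),  # assumption: not in the source data
  SBPbefore = c(120, 124, 130, 118, 140, 128, 140, 135, 126, 130, 126, 127),
  SBPafter  = c(128, 131, 131, 127, 132, 125, 141, 137, 118, 132, 129, 135)
)

# Run one paired t-test per group and keep the key statistics
by_site <- lapply(split(pressure, pressure$site), function(grp) {
  res <- t.test(grp$SBPbefore, grp$SBPafter, paired = TRUE)
  data.frame(df = unname(res$parameter),
             t  = unname(res$statistic),
             p  = res$p.value)
})

# Stack the per-group rows; row names carry the group labels
by_site_tbl <- do.call(rbind, by_site)
print(by_site_tbl)
```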
+The lognormal version of the t-test does not appear to be supported in either the **stats** or **procs** package. It may be supported in the **EnvStats** package. More exploration is needed to determine whether this package will produce the expected results, and whether the results will match SAS.
+
+# References
+
+R `t.test()` documentation: <https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/t.test>
+
+R `proc_ttest()` documentation: <https://procs.r-sassy.org/reference/proc_ttest.html>
+
+SAS `PROC TTEST` Paired analysis documentation: <https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/statug/statug_ttest_syntax08.htm>
diff --git a/R/ttest_Paired.qmd b/R/ttest_Paired.qmd
@@ -0,0 +1,80 @@
+---
+title: "Paired t-test"
+output: html_document
+---
+
+```{r setup, include=FALSE}
+knitr::opts_chunk$set(echo = TRUE)
+```
+
+# **Paired t-test in R**
+
+The Paired t-test is used when two samples are naturally correlated. In the Paired t-test, the mean of the within-pair differences is compared to a given number (usually zero) that represents the null hypothesis. For a Paired t-test, the number of observations in each sample must be equal.
+
+In R, a Paired t-test can be performed using the Base R `t.test()` function from the **stats** package or the `proc_ttest()` function from the **procs** package.
+
+## Normal Data {#normal}
+
+By default, the R paired t-test functions assume normality in the data and use a classic Student's t-test.
+
+### Data Used
+
+The following data was used in this example.
+
+```{r eval=TRUE, echo = TRUE}
+# Create sample data
+pressure <- tibble::tribble(
+  ~SBPbefore, ~SBPafter,
+  120, 128,   
+  124, 131,   
+  130, 131,   
+  118, 127,
+  140, 132,   
+  128, 125,   
+  140, 141,   
+  135, 137,
+  126, 118,   
+  130, 132,   
+  126, 129,   
+  127, 135
+)
+```
+
+### Base R
+
+#### Code
+
+The following code was used to test the comparison in Base R.
+
+```{r eval=TRUE, echo = TRUE}
+
+  # Perform t-test
+  t.test(pressure$SBPbefore, pressure$SBPafter, paired = TRUE)
+
+```
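The paired test is equivalent to a one-sample t-test on the within-pair differences, with $t = \bar{d} / (s_d / \sqrt{n})$ and $n - 1$ degrees of freedom. The following sketch computes the statistic by hand and checks it against `t.test()`. The variable names (`d`, `t_manual`, `p_manual`, `one_sample`) are illustrative only.

```r
# Same blood pressure data, held in plain vectors
sbp_before <- c(120, 124, 130, 118, 140, 128, 140, 135, 126, 130, 126, 127)
sbp_after  <- c(128, 131, 131, 127, 132, 125, 141, 137, 118, 132, 129, 135)

# Within-pair differences
d <- sbp_before - sbp_after
n <- length(d)

# t statistic: mean difference divided by its standard error
t_manual <- mean(d) / (sd(d) / sqrt(n))

# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_manual <- 2 * pt(-abs(t_manual), df = n - 1)

# One-sample t-test on the differences gives the same answer
one_sample <- t.test(d, mu = 0)

print(c(t = t_manual, df = n - 1, p = p_manual))
```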
+
+### Procs Package
+
+#### Code
+
+The following code from the **procs** package was used to perform a paired t-test.
+
+```{r eval=TRUE, echo = TRUE, message=FALSE, warning=FALSE}
+  library(procs)
+
+  # Perform t-test
+  proc_ttest(pressure,
+     paired = "SBPbefore*SBPafter")
+```
+
+Viewer Output:
+
+```{r, echo=FALSE, fig.align='center', out.width="50%"}
+knitr::include_graphics("../images/ttest/paired_rtest1.png")
+```
+
+## Lognormal Data {#lognormal}
+
+The Base R `t.test()` function does not have an option for lognormal data. Likewise, the **procs** `proc_ttest()` function does not have an option for lognormal data.
+
+One possibility may be the `tTestLnormAltPower()` function from the **EnvStats** package. This package has not been evaluated yet.
diff --git a/SAS/ttest.qmd b/SAS/ttest_2Sample.qmd
@@ -1,13 +1,13 @@
 ---
-title: "Students t-test"
+title: "Independent Two-Sample t-test"
 output: html_document
 ---
 
 ```{r setup, include=FALSE}
 knitr::opts_chunk$set(echo = TRUE)
 ```
 
-### **Independant Two-Sample t-test in SAS**
+### **Independent Two-Sample t-test in SAS**
 
 The null hypothesis of the Independent Samples t-test is that the means for the two populations are equal.
 
@@ -37,7 +37,7 @@ Here the t-value is --0.70, degrees of freedom is 30 and P value is 0.4912 which
 
 Note: Before entering straight into the t-test we need to check whether the assumptions (like the equality of variance, the observations should be independent, observations should be normally distributed) are met or not. If normality is not satisfied, we may consider using a suitable non-parametric test.
 
-1.  Normality: You can check for data to be normally distributed by plotting a histogram of the data by treatment. Alternatively, you can use the Shapiro-Wilk test or the Kolmogorov-Smirnov test. If the test is <0.05 and your sample is quite small then this suggests you should not use the t-test. However, if your sample in each treatment group is large (say >30 in each group), then you do not need to rely so heavily on the assumption that the data have an underlying normal distribution in order to apply the two-sample t-test. This is where plotting the data using histograms can help to support investigation into the normality assumption. We have checked the normality of the observations using the code below. Here for both the treatment groups we have P value greater than 0.05 (Shapiro-Wilk test is used), therefore the normality assumption is there for our data.
+1.  Normality: You can check for data to be normally distributed by plotting a histogram of the data by treatment. Alternatively, you can use the Shapiro-Wilk test or the Kolmogorov-Smirnov test. If the test p-value is \<0.05 and your sample is quite small then this suggests you should not use the t-test. However, if your sample in each treatment group is large (say \>30 in each group), then you do not need to rely so heavily on the assumption that the data have an underlying normal distribution in order to apply the two-sample t-test. This is where plotting the data using histograms can help to support investigation into the normality assumption. We have checked the normality of the observations using the code below. Here, for both treatment groups, the P value is greater than 0.05 (Shapiro-Wilk test is used), so the normality assumption holds for our data.
 
 ```{r}
 #| eval: false 
@@ -65,7 +65,7 @@ knitr::include_graphics("../images/ttest/trt_sas.png")
 knitr::include_graphics("../images/ttest/placb_sas.png")
 ```
 
-2.  Homogeneity of variance (or Equality of variance): Homogeniety of variance will be tested by default in PROC TTEST itself by Folded F-test. In our case the P values is 0.6981 which is greater than 0.05. So we accept the null hypothesis of F-test, i.e. variances are same. Then we will consider the pooled method for t-test. If the F test is statistically significant (p<0.05), then the pooled t-test may give erroneous results. In this instance, if it is believed that the population variances may truly differ, then the Satterthwaite (unequal variances) analysis results should be used. These are provided in the SAS output alongside the Pooled results as default.
+2.  Homogeneity of variance (or Equality of variance): Homogeneity of variance will be tested by default in PROC TTEST itself by the Folded F-test. In our case the P value is 0.6981, which is greater than 0.05. So we do not reject the null hypothesis of the F-test, i.e. that the variances are the same. We therefore consider the pooled method for the t-test. If the F test is statistically significant (p\<0.05), then the pooled t-test may give erroneous results. In this instance, if it is believed that the population variances may truly differ, then the Satterthwaite (unequal variances) analysis results should be used. These are provided in the SAS output alongside the Pooled results as default.
 
 Output:
 

diff --git a/SAS/ttest_Paired.qmd b/SAS/ttest_Paired.qmd
@@ -0,0 +1,82 @@
+---
+title: "Paired t-test"
+output: html_document
+---
+
+```{r setup, include=FALSE}
+knitr::opts_chunk$set(echo = TRUE)
+```
+
+# **Paired t-test in SAS**
+
+The Paired t-test is used when two samples are naturally correlated. In the Paired t-test, the mean of the within-pair differences is compared to a given number (usually zero) that represents the null hypothesis. For a Paired t-test, the number of observations in each sample must be equal.
+
+In SAS, a Paired t-test is typically performed using PROC TTEST.
+
+## Normal Data {#normal}
+
+By default, the SAS PROC TTEST paired analysis assumes normality in the data and uses a classic Student's t-test.
+
+### Data Used
+
+The following data was used in this example.
+
+```         
+  data pressure;
+     input SBPbefore SBPafter @@;
+     datalines;
+  120 128   124 131   130 131   118 127
+  140 132   128 125   140 141   135 137
+  126 118   130 132   126 129   127 135
+  ;
+```
+
+### Code
+
+The following code was used to test the comparison of two paired samples of Systolic Blood Pressure before and after a procedure.
+
+```         
+  proc ttest data=pressure;
+     paired SBPbefore*SBPafter;
+  run;
+```
+
+Output:
+
+```{r, echo=FALSE, fig.align='center', out.width="50%"}
+knitr::include_graphics("../images/ttest/paired_test1.png")
+```
+
+## Lognormal Data {#lognormal}
+
+The SAS paired t-test also supports analysis of lognormal data. Here is the data used for the lognormal analysis.
+
+### Data
+
+```         
+  data auc;
+     input TestAUC RefAUC @@;
+     datalines;
+  103.4 90.11  59.92 77.71  68.17 77.71  94.54 97.51
+  69.48 58.21  72.17 101.3  74.37 79.84  84.44 96.06
+  96.74 89.30  94.26 97.22  48.52 61.62  95.68 85.80
+  ;
+```
+
+### Code
+
+For cases when the data is lognormal, SAS offers the DIST option to choose between a normal and a lognormal distribution. The procedure also offers the TOST option to specify the equivalence bounds.
+
+```         
+  proc ttest data=auc dist=lognormal tost(0.8, 1.25);
+     paired TestAUC*RefAUC;
+  run;
+```
+
+Output:
+
+```{r, echo=FALSE, fig.align='center', out.width="70%"}
+knitr::include_graphics("../images/ttest/paired_test2.png")
+```
+
+As can be seen in the figure above, the lognormal variation of the TTEST procedure offers additional results for geometric mean, coefficient of variation, and TOST equivalence analysis. The output also includes multiple p-values.
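For readers who want to explore a comparable analysis in R, one possible manual approach is to run the paired test on the log scale and back-transform. The sketch below uses only Base R and the AUC data from the SAS example above. It rests on the assumption that a paired lognormal analysis amounts to a t-test on the log ratios, with the TOST carried out as two one-sided tests against the log of the bounds 0.8 and 1.25. Whether these numbers match the SAS output has not been verified here, and all variable names are illustrative.

```r
# AUC data copied from the SAS example above
test_auc <- c(103.4, 59.92, 68.17, 94.54, 69.48, 72.17,
              74.37, 84.44, 96.74, 94.26, 48.52, 95.68)
ref_auc  <- c(90.11, 77.71, 77.71, 97.51, 58.21, 101.3,
              79.84, 96.06, 89.30, 97.22, 61.62, 85.80)

# Work on the log scale: log(Test / Ref) for each pair
log_ratio <- log(test_auc) - log(ref_auc)

# One-sample t-test on the log ratios; 90% CI pairs with a 5% TOST
fit <- t.test(log_ratio, mu = 0, conf.level = 0.90)

# Back-transform to a geometric mean ratio and its confidence limits
geo_mean_ratio <- exp(unname(fit$estimate))
ci_90 <- as.numeric(exp(fit$conf.int))

# TOST: two one-sided tests against the log equivalence bounds
lower <- 0.8
upper <- 1.25
p_lower <- t.test(log_ratio, mu = log(lower), alternative = "greater")$p.value
p_upper <- t.test(log_ratio, mu = log(upper), alternative = "less")$p.value
p_tost  <- max(p_lower, p_upper)

print(list(geo_mean_ratio = geo_mean_ratio, ci_90 = ci_90, p_tost = p_tost))
```

Under this approach, equivalence at the 5% level is concluded when `p_tost` is below 0.05, which is the same condition as the 90% confidence interval lying entirely inside the bounds.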
diff --git a/data/stat_method_tbl.csv b/data/stat_method_tbl.csv
@@ -1,8 +1,9 @@
-method_grp,method_subgrp,r_links,sas_links,comparison_links
+method_grp,method_subgrp,r_links,sas_links,comparison_links
 Summary Statistics,Rounding,[R](R/rounding),[SAS](SAS/rounding),[R vs SAS](Comp/r-sas_rounding)
 Summary Statistics,Summary statistics,[R](R/summary-stats),[SAS](SAS/summary-stats),[R vs SAS](Comp/r-sas-summary-stats)
-General Linear Models,Students t-test,,[SAS](SAS/ttest),
-General Linear Models,Paired t-test,,,
+General Linear Models,One Sample t-test,,,
+General Linear Models,Paired t-test,[R](R/ttest_Paired),[SAS](SAS/ttest_Paired),[R vs SAS](Comp/r-sas_ttest_Paired)
+General Linear Models,Two Sample t-test,,[SAS](SAS/ttest_2Sample),
 General Linear Models,ANOVA,[R](R/anova),[SAS](SAS/anova),[R vs SAS](Comp/r-sas_anova)
 General Linear Models,ANCOVA,[R](R/ancova),,
 General Linear Models,MANOVA,[R](R/manova),[SAS](SAS/manova),[R vs SAS](Comp/r-sas_manova)

diff --git a/images/ttest/paired_rtest1.png b/images/ttest/paired_rtest1.png
diff --git a/images/ttest/paired_test1.png b/images/ttest/paired_test1.png
diff --git a/images/ttest/paired_test2.png b/images/ttest/paired_test2.png