-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path08.Rmd
785 lines (600 loc) · 35 KB
/
08.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
# ~~JAGS~~ brms
```{r, echo = FALSE, cachse = FALSE}
knitr::opts_chunk$set(fig.retina = 2.5)
knitr::opts_chunk$set(fig.align = "center")
options(width = 100)
```
We, of course, will be using **brms** in place of JAGS.
## ~~JAGS~~ brms and its relation to R
In the opening paragraph in his [GitHub repository for **brms**](https://github.com/paul-buerkner/brms), Bürkner explained:
> The **brms** package provides an interface to fit Bayesian generalized (non-)linear multivariate multilevel models using Stan, which is a C++ package for performing full Bayesian inference (see [http://mc-stan.org/](http://mc-stan.org/)). The formula syntax is very similar to that of the package lme4 to provide a familiar and simple interface for performing regression analyses. A wide range of response distributions are supported, allowing users to fit -- among others -- linear, robust linear, count data, survival, response times, ordinal, zero-inflated, and even self-defined mixture models all in a multilevel context. Further modeling options include non-linear and smooth terms, auto-correlation structures, censored data, missing value imputation, and quite a few more. In addition, all parameters of the response distribution can be predicted in order to perform distributional regression. Multivariate models (i.e., models with multiple response variables) can be fit, as well. Prior specifications are flexible and explicitly encourage users to apply prior distributions that actually reflect their beliefs. Model fit can easily be assessed and compared with posterior predictive checks, cross-validation, and Bayes factors. (**emphasis** in the original)
Bürkner's **brms** repository includes many helpful links, such as to where [**brms** lives on CRAN](https://CRAN.R-project.org/package=brms), a [list of blog posts](https://paul-buerkner.github.io/blog/brms-blogposts/) highlighting **brms**, and [a forum](https://discourse.mc-stan.org) where users can ask questions about **brms** in specific or about Stan in general.
You can install the current official version of **brms** in the same way you would any other **R** package (i.e., `install.packages("brms", dependencies = T)`). If you want the current developmental version, you could download it from GitHub by executing the following.
```{r, eval = F}
if (!requireNamespace("devtools")) {
install.packages("devtools")
}
devtools::install_github("paul-buerkner/brms")
```
## A complete example
We express the likelihood for our coin toss example as
$$y_{i} \sim \operatorname{Bernoulli}(\theta).$$
Our prior will be
$$\theta \sim \operatorname{Beta}(\alpha, \beta).$$
Kruschke pictured the relationships among the data, the likelihood, and the prior in the model diagram in Figure 8.2 (p. 196). If you're tricky, you can make those in **R**. I'm not going to walk my method out in great detail in this ebook. If you want a step-by-step tutorial, check my blog post, [*Make model diagrams, Kruschke style*](https://solomonkurz.netlify.app/post/2020-03-09-make-model-diagrams-kruschke-style/). Here's the code for our version of Figure 8.2.
```{r, warning = F, message = F}
library(tidyverse)
library(patchwork)
# plot of a beta density
p1 <-
tibble(x = seq(from = .01, to = .99, by = .01),
d = (dbeta(x, 2, 2)) / max(dbeta(x, 2, 2))) %>%
ggplot(aes(x = x, y = d)) +
geom_area(fill = "grey67") +
annotate(geom = "text",
x = .5, y = .2,
label = "beta",
size = 7) +
annotate(geom = "text",
x = .5, y = .6,
label = "italic(A)*', '*italic(B)",
size = 7, family = "Times", parse = TRUE) +
scale_x_continuous(expand = c(0, 0)) +
theme_void() +
theme(axis.line.x = element_line(size = 0.5))
## an annotated arrow
# save our custom arrow settings
my_arrow <- arrow(angle = 20, length = unit(0.35, "cm"), type = "closed")
p2 <-
tibble(x = .5,
y = 1,
xend = .5,
yend = 0) %>%
ggplot(aes(x = x, xend = xend,
y = y, yend = yend)) +
geom_segment(arrow = my_arrow) +
annotate(geom = "text",
x = .375, y = 1/3,
label = "'~'",
size = 10, family = "Times", parse = T) +
xlim(0, 1) +
theme_void()
# bar plot of Bernoulli data
p3 <-
tibble(x = 0:1,
d = (dbinom(x, size = 1, prob = .6)) / max(dbinom(x, size = 1, prob = .6))) %>%
ggplot(aes(x = x, y = d)) +
geom_col(fill = "grey67", width = .4) +
annotate(geom = "text",
x = .5, y = .2,
label = "Bernoulli",
size = 7) +
annotate(geom = "text",
x = .5, y = .94,
label = "theta",
size = 7, family = "Times", parse = T) +
xlim(-.75, 1.75) +
theme_void() +
theme(axis.line.x = element_line(size = 0.5))
# another annotated arrow
p4 <-
tibble(x = c(.375, .625),
y = c(1/3, 1/3),
label = c("'~'", "italic(i)")) %>%
ggplot(aes(x = x, y = y, label = label)) +
geom_text(size = c(10, 7), parse = T, family = "Times") +
geom_segment(x = .5, xend = .5,
y = 1, yend = 0,
arrow = my_arrow) +
xlim(0, 1) +
theme_void()
# some text
p5 <-
tibble(x = 1,
y = .5,
label = "italic(y[i])") %>%
ggplot(aes(x = x, y = y, label = label)) +
geom_text(size = 7, parse = T, family = "Times") +
xlim(0, 2) +
theme_void()
```
```{r fig.height = 3.5, fig.width = 2}
layout <- c(
area(t = 1, b = 2, l = 1, r = 1),
area(t = 3, b = 3, l = 1, r = 1),
area(t = 4, b = 5, l = 1, r = 1),
area(t = 6, b = 6, l = 1, r = 1),
area(t = 7, b = 7, l = 1, r = 1)
)
(p1 + p2 + p3 + p4 + p5) +
plot_layout(design = layout) &
ylim(0, 1) &
theme(plot.margin = margin(0, 5.5, 0, 5.5))
```
> Diagrams like Figure 8.2 should be scanned from the bottom up. This is because models of data always start with the data, then conceive of a likelihood function that describes the data values in terms of meaningful parameters, and finally determine a prior distribution over the parameters. (p. 196)
### Load data.
"Logically, models of data start with the data. We must know their basic scale and structure to conceive of a descriptive model" (p. 197).
Here we load the data with the `readr::read_csv()` function, the **tidyverse** version of base **R** `read.csv()`.
```{r, message = F}
my_data <- read_csv("data.R/z15N50.csv")
```
Unlike what Kruschke wrote about JAGS, the **brms** package does not require us to convert the data into a list. It can handle data in lists or data frames, of which [tibbles are a special case](https://cran.r-project.org/web/packages/tibble/vignettes/tibble.html). Here are what the data look like.
```{r}
glimpse(my_data)
```
We might visualize them in a bar plot.
```{r, fig.width = 3, fig.height = 3, warning = F, message = F}
library(cowplot)
my_data %>%
mutate(y = y %>% as.character()) %>%
ggplot(aes(x = y)) +
geom_bar() +
scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
theme_minimal_hgrid()
```
For the past couple chapters, we've been using the `cowplot::theme_cowplot()` theme for our plots. The **cowplot** package, however, includes a few more. The `theme_minimal_grid()` is similar, but subtracts the axis lines and adds in minimalistic grid lines. In his [-@Wilke2020Themes] [*Themes* vignette](https://wilkelab.org/cowplot/articles/themes.html), however, Wilke showed vertical grid lines are often unsightly in bar plots. To avoid offending Wilke with our grid lines, we used the `theme_minimal_hgrid()` theme, which only added horizontal grid lines.
Anyway, if you wanted to compute "Ntotal", the number of rows in our tibble, one way is with `count()`.
```{r}
my_data %>%
count()
```
However, we're not going to do anything with an "Ntotal" value. For **brms**, the data are fine in their current data frame form. No need for a `dataList`.
### Specify model.
Let's open **brms**.
```{r, warning = F, message = F}
library(brms)
```
The **brms** package does not have code blocks following the JAGS format or the sequence in Kruschke's diagrams. Rather, its syntax is modeled in part after the popular frequentist mixed-effects package, [**lme4**](https://cran.r-project.org/web/packages/lme4/index.html). To learn more about how **brms** compares to **lme4**, see Bürkner's [-@burknerBrmsPackageBayesian2017] overview, [*brms: An R package for Bayesian multilevel models
using Stan*](https://cran.r-project.org/web/packages/brms/vignettes/brms_overview.pdf).
The primary function in **brms** is `brm()`. Into this one function we will specify the data, the model, the likelihood function, the prior(s), and any technical settings such as the number of MCMC chains, iterations, and so forth. You can order the arguments in any way you like. My typical practice is to start with `data`, `family` (i.e., the likelihood function), the model `formula`, and my `prior`s. If there are any technical specifications such as the number of MCMC iterations I'd like to change from their default values, I usually do that last.
Here's how to fit the model.
```{r fit8.1}
fit8.1 <-
brm(data = my_data,
family = bernoulli(link = identity),
formula = y ~ 1,
prior(beta(2, 2), class = Intercept, lb = 0, ub = 1),
iter = 500 + 3334, warmup = 500, chains = 3,
seed = 8,
file = "fits/fit08.01")
```
Also note our use of the `file` argument. This automatically saved the fit object as an external file. You don't have to do that and you can avoid doing so by omitting the `file` argument. For a more detailed explanation of the `brms::brm()` function, spend some time with the `brm` section of the [**brms** reference manual](https://CRAN.R-project.org/package=brms) [@R-brms].
### Initialize chains.
In Stan, and in **brms** by extension, the initial values have default settings. In the [Initialization section of the Program Execution chapter](https://mc-stan.org/docs/2_29/reference-manual/initialization.html) in the *Stan reference manual, Version 2.29* [@standevelopmentteamStanReferenceManual2022] we read:
> If there are no user-supplied initial values, the default initialization strategy is to initialize the unconstrained parameters directly with values drawn uniformly from the interval $(−2, 2)$. The bounds of this initialization can be changed but it is always symmetric around $0$. The value of $0$ is special in that it represents the median of the initialization. An unconstrained value of $0$ corresponds to different parameter values depending on the constraints declared on the parameters.
In general, I do not recommend setting custom initial values in **brms** or Stan. Under the hood, Stan will transform the parameters to the unconstrained space in models where they are bounded. In our Bernoulli model, $\theta$ is bounded at 0 and 1. A little further down in the same section, we read:
> For parameters bounded above and below, the initial value of $0$ on the unconstrained scale corresponds to a value at the midpoint of the constraint interval. For probability parameters, bounded below by $0$ and above by $1$, the transform is the inverse logit, so that an initial unconstrained value of $0$ corresponds to a constrained value of $0.5$, $-2$ corresponds to $0.12$ and $2$ to $0.88$. Bounds other than $0$ and $1$ are just scaled and translated.
If you want to play around with this, have at it. In my experience, it sometimes helps to set these manually to zero, which you can do that by specifying `inits = 0` within `brm()`. But if you really want to experiment, you might check out my blog post, [*Don't forget your inits*](https://solomonkurz.netlify.app/post/2021-06-05-don-t-forget-your-inits/).
### Generate chains.
By default, **brms** will use 4 chains of 2,000 iterations each. The type of MCMC **brms** uses is Hamiltonian Monte Carlo (HMC). You can learn more about HMC at the Stan website, [https://mc-stan.org](https://mc-stan.org), which includes resources such as the [*Stan user's guide*](https://mc-stan.org/docs/2_29/stan-users-guide/index.html) [@standevelopmentteamStanUserGuide2022], the [*Stan reference manual*](https://mc-stan.org/docs/2_29/reference-manual/) [@standevelopmentteamStanReferenceManual2022], and a list of [tutorials](https://mc-stan.org/users/documentation/tutorials). McElreath has a [nice intro lecture](https://www.youtube.com/watch?v=BWEtS3HuU5A) on MCMC in general and HMC in particular. Michael Bentacourt has some good lectures on Stan and HMC, such as [here](https://youtu.be/pHsuIaPbNbY) and [here](https://www.youtube.com/watch?v=jUSZboSq1zg). And, of course, we will cover HMC with Kruschke in [Chapter 14][Stan].
Within each HMC chain, the first $n$ iterations are warmups. Within the Stan-HMC paradigm, [warmups are somewhat analogous to but not synonymous with burn-in iterations](https://andrewgelman.com/2017/12/15/burn-vs-warm-iterative-simulation-algorithms/) as done by the Gibbs sampling in JAGS. But HMC warmups are like Gibbs burn-ins in that both are discarded and not used to describe the posterior. As such, the **brms** default settings yield 1,000 post-warmup iterations for each of the 4 HMC chains. However, we specified `iter = 500 + 3334, warmup = 500, chains = 3`. Thus instead of defaults, we have 3 HMC chains. Each chain has 500 + 3,334 = 3,834 total iterations, of which 500 were discarded `warmup` iterations.
To learn more about the warmup stage in Stan, check out the [HMC Algorithm Parameters section of the MCMC Sampling chapter](https://mc-stan.org/docs/2_26/reference-manual/hmc-algorithm-parameters.html) of the *Stan reference manual*.
### Examine chains.
The `brms::plot()` function returns a density and trace plot for each model parameter.
```{r, fig.width = 10, fig.height = 1.5}
plot(fit8.1)
```
Note how the `brms::plot()` function simply took our model fit object as input. Other post-processing functions will require us to pass the posterior draws in the form of a data frame. To get ready for them, here we'll save them as a data frame with the `as_draws_df()` function.
```{r}
draws <- as_draws_df(fit8.1)
```
If you want to display each chain as its own density, you can use the handy `mcmc_dens_overlay()` function from the [**bayesplot** package](https://CRAN.R-project.org/package=bayesplot).
```{r, warning = F, message = F}
library(bayesplot)
```
Now we're ready to use our `draws` object within the `mcmc_dens_overlay()` function to return the overlaid densities.
```{r, fig.width = 4, fig.height = 2, message = F, warning = F}
draws %>%
mutate(chain = .chain) %>%
mcmc_dens_overlay(pars = vars(b_Intercept)) +
theme_minimal_hgrid()
```
The `bayesplot::mcmc_acf()` function will give us the autocorrelation plots.
```{r, fig.width = 4, fig.height = 4}
draws %>%
mutate(chain = .chain) %>%
mcmc_acf(pars = vars(b_Intercept), lags = 35) +
theme_minimal_hgrid()
```
With **brms** functions, we get a sole $\widehat R$ value for each parameter rather than a running vector.
```{r}
rhat(fit8.1)["b_Intercept"]
```
We'll have to employ `brms::as.mcmc()` and `coda::gelman.plot()` to make our running $\widehat R$ plot.
```{r, fig.width = 4, fig.height = 3, warning = F}
fit8.1_c <- as.mcmc(fit8.1)
coda::gelman.plot(fit8.1_c[, "b_Intercept", ])
```
For whatever reason, many of the package developers within the Stan/**brms** ecosystem don't seem interested in shrink factor plots, like this.
#### ~~The `plotPost` function~~ How to plot your brms posterior distributions.
We'll get into plotting in just a moment. But before we do, here's a summary of the model.
```{r}
print(fit8.1)
```
To summarize a posterior in terms of central tendency, **brms** defaults to the mean value (i.e., the value in the 'Estimate' column of the `print()` output). In many of the other convenience functions, you can also request the median instead. For example, we can set `robust = TRUE` to get the 'Estimate' in terms of the median.
```{r}
posterior_summary(fit8.1, robust = T)
```
Across functions, the intervals default to 95%. With `print()` and `summary()` you can adjust the level with a `prob` argument. For example, here we'll use 50% intervals.
```{r}
print(fit8.1, prob = .5)
```
But in many other **brms** convenience functions, you can use the `probs` argument to request specific percentile summaries.
```{r}
posterior_summary(fit8.1, probs = c(.025, .25, .75, .975))
```
Regardless of what `prob` or `probs` levels you use, **brms** functions always return percentile-based estimates. All this central tendency and interval talk will be important in a moment...
When plotting the posterior distribution of a parameter estimated with **brms**, you typically do so working with the results of an object returned by one of the `as_draws_` functions. Recall we already saved those results as `draws`. Here's a look at `draws`.
```{r}
head(draws)
```
With `draws` in hand, we can use **ggplot2** to do the typical distributional plots, such as with `geom_histogram()`.
```{r, fig.width = 4, fig.height = 2.5, warning = F, message = F}
draws %>%
ggplot(aes(x = b_Intercept)) +
geom_histogram(color = "grey92", fill = "grey67",
size = .2) +
scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
labs(title = "Theta via ggplot2::geom_histogram()",
x = expression(theta)) +
theme_minimal_hgrid() +
theme(plot.title.position = "plot")
```
The `bayesplot::mcmc_areas()` function offers a nice way to depict the posterior densities, along with their percentile-based 50% and 95% ranges.
```{r, fig.width = 4, fig.height = 2.5, message = F, warning = F}
mcmc_areas(
draws,
pars = vars(b_Intercept),
prob = 0.5,
prob_outer = 0.95,
point_est = "mean"
) +
scale_y_discrete(expand = expansion(mult = c(0, 0.05))) +
labs(title = "Theta via bayesplot::mcmc_areas()",
x = expression(theta)) +
theme_minimal_hgrid() +
theme(plot.title.position = "plot")
```
**brms** doesn't have a convenient way to compute the posterior mode or HDIs. Base **R** is no help, either. But Matthew Kay's **tidybayes** package makes it easy to compute posterior modes and HDIs with handy functions like `stat_halfeye()` and `stat_histinterval()`.
```{r, fig.width = 4, fig.height = 2.5, warning = F, message = F}
library(tidybayes)
draws %>%
ggplot(aes(x = b_Intercept, y = 0)) +
stat_halfeye(point_interval = mode_hdi, .width = c(.95, .5)) +
scale_y_continuous(NULL, breaks = NULL) +
labs(title = expression(theta*" via tidybayes::stat_halfeye()"),
x = expression(theta)) +
theme_minimal_hgrid()
draws %>%
ggplot(aes(x = b_Intercept, y = 0)) +
stat_histinterval(point_interval = mode_hdi, .width = c(.95, .5)) +
scale_y_continuous(NULL, breaks = NULL) +
labs(title = expression(theta*" via tidybayes::stat_histinterval()"),
x = expression(theta)) +
theme_minimal_hgrid()
```
The `tidybayes::stat_halfeye()` function returns a density with a measure of the posterior's central tendency in a dot and one or multiple interval bands as horizontal lines at the base of the density. The `stat_histinterval()` function returns much the same, but replaces the density with a histogram. For both functions, the `point_interval = mode_hdi` argument allowed us to request the mode to be our measure of central tendency and the highest posterior density intervals to be our type intervals. With `.width = c(.95, .5)`, we requested our HDIs be at both the 95% and 50% levels.
If we wanted to be more congruent with Kruschke's plotting sensibilities, we could further modify `tidybayes::stat_histinterval()`.
```{r, fig.width = 4, fig.height = 2.5, warning = F, message = F}
# this is unnecessary, but makes for nicer x-axis breaks
my_breaks <-
mode_hdi(draws$b_Intercept)[, 1:3] %>%
pivot_longer(everything(), values_to = "breaks") %>%
mutate(labels = breaks %>% round(digits = 3))
# here's the main plot code
draws %>%
ggplot(aes(x = b_Intercept, y = 0)) +
stat_histinterval(point_interval = mode_hdi, .width = .95,
fill = "steelblue", slab_color = "white",
breaks = 40, slab_size = .25, outline_bars = T) +
scale_x_continuous(breaks = my_breaks$breaks,
labels = my_breaks$labels) +
scale_y_continuous(NULL, breaks = NULL) +
labs(title = "tidybayes::stat_histinterval() all gussied up",
x = expression(theta)) +
theme_minimal_hgrid() +
theme(title = element_text(size = 10.5))
```
With the `point_interval` argument within `stat_histinterval()` and related functions, we can request different combinations of measures of central tendency (i.e., mean, median, mode) and interval types (i.e., percentile-based and HDIs). Although all of these are legitimate ways to summarize a posterior, they can yield somewhat different results. For example, here we'll contrast our mode + HDI summary with a median + percentile-based interval summary using `tidybayes::stat_pointinterval()`.
```{r, fig.width = 4, fig.height = 1.5}
draws %>%
ggplot(aes(x = b_Intercept)) +
stat_pointinterval(aes(y = 1), point_interval = median_qi, .width = c(.95, .5)) +
stat_pointinterval(aes(y = 2), point_interval = mode_hdi, .width = c(.95, .5)) +
scale_y_continuous(NULL, breaks = 1:2,
labels = c("median_qi", "mode_hdi")) +
coord_cartesian(ylim = c(0, 3)) +
labs(title = "Theta via tidybayes::stat_pointinterval()",
x = expression(theta)) +
theme_minimal_vgrid() +
theme(axis.line.y.left = element_blank(),
axis.text.y = element_text(hjust = 0),
axis.ticks.y = element_blank(),
plot.title.position = "plot",
title = element_text(size = 10.5))
```
Similar, yet distinct. To get a sense of the full variety of ways **tidybayes** allows users to summarize and plot the results of a Bayesian model, check out Kay's [-@kaySlabIntervalStats2022] vignette, [*Slab + interval stats and geoms*](https://mjskay.github.io/ggdist/articles/slabinterval.html). Also, did you notice how we switched to `theme_minimal_vgrid()`? That's how we added the vertical grid lines in the absence of horizontal grid lines.
## Simplified scripts for frequently used analyses
A lot has happened in **R** for Bayesian analysis since Kruschke wrote his [-@kruschkeDoingBayesianData2015] text. In addition to our use of the **tidyverse**, the **brms**, **bayesplot**, and **tidybayes** packages offer an array of useful convenience functions. We can and occasionally will write our own. But really, the rich **R** ecosystem already has us pretty much covered.
## Example: Difference of biases
Here are our new data.
```{r, message = F}
my_data <- read_csv("data.R/z6N8z2N7.csv")
glimpse(my_data)
```
They look like this.
```{r, fig.width = 5, fig.height = 3, warning = F, message = F}
library(ggthemes)
my_data %>%
mutate(y = y %>% as.character()) %>%
ggplot(aes(x = y, fill = s)) +
geom_bar(show.legend = F) +
scale_fill_colorblind() +
scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
theme_minimal_hgrid() +
facet_wrap(~ s)
```
Note our use of the `scale_fill_colorblind()` function from the [**ggthemes** package](https://CRAN.R-project.org/package=ggthemes) [@R-ggthemes]. When you want to use color to emphasize different factor levels, such as `s`, it's a good idea to make sure those colors can be distinguished by folks who are colorblind. The `scale_fill_colorblind()` function provides a discrete palette of eight colorblind safe colors. If you'd prefer to use a different palette, but wand to make sure its accessible to folks with colorblindness, check out the [**colorblindr** package](https://github.com/clauswilke/colorblindr) [@R-colorblindr].
Here's our **ggplot2** version of the model diagram in Figure 8.5.
```{r, fig.width = 2, fig.height = 3.5}
# plot of a beta density
p1 <-
tibble(x = seq(from = .01, to = .99, by = .01),
d = (dbeta(x, 2, 2)) / max(dbeta(x, 2, 2))) %>%
ggplot(aes(x = x, y = d)) +
geom_area(fill = "steelblue") +
annotate(geom = "text",
x = .5, y = .2,
label = "beta",
size = 7) +
annotate(geom = "text",
x = .5, y = .6,
label = "italic(A)*', '*italic(B)",
size = 7, family = "Times", parse = TRUE) +
scale_x_continuous(expand = c(0, 0)) +
theme_void() +
theme(axis.line.x = element_line(size = 0.5))
# an annotated arrow
p2 <-
tibble(x = c(.35, .65),
y = 1/3,
label = c("'~'", "italic(s)")) %>%
ggplot(aes(x = x, y = y, label = label)) +
geom_text(size = c(10, 7), parse = T, family = "Times") +
geom_segment(x = .5, xend = .5,
y = 1, yend = 0,
arrow = my_arrow) +
xlim(0, 1) +
theme_void()
# bar plot of Bernoulli data
p3 <-
tibble(x = 0:1,
d = (dbinom(x, size = 1, prob = .6)) / max(dbinom(x, size = 1, prob = .6))) %>%
ggplot(aes(x = x, y = d)) +
geom_col(fill = "steelblue", width = .4) +
annotate(geom = "text",
x = .5, y = .2,
label = "Bernoulli",
size = 7) +
annotate(geom = "text",
x = .5, y = .92,
label = "theta[italic(s)]",
size = 7, family = "Times", parse = T) +
xlim(-.75, 1.75) +
theme_void() +
theme(axis.line.x = element_line(size = 0.5))
# another annotated arrow
p4 <-
tibble(x = c(.35, .65),
y = c(1/3, 1/3),
label = c("'~'", "italic(i)*'|'*italic(s)")) %>%
ggplot(aes(x = x, y = y, label = label)) +
geom_text(size = c(10, 7), parse = T, family = "Times") +
geom_segment(x = .5, xend = .5,
y = 1, yend = 0,
arrow = my_arrow) +
xlim(0, 1) +
theme_void()
# some text
p5 <-
tibble(x = 1,
y = .5,
label = "italic(y)[italic(i)*'|'*italic(s)]") %>%
ggplot(aes(x = x, y = y, label = label)) +
geom_text(size = 7, parse = T, family = "Times") +
xlim(0, 2) +
theme_void()
# define the layout
layout <- c(
area(t = 1, b = 2, l = 1, r = 1),
area(t = 3, b = 3, l = 1, r = 1),
area(t = 4, b = 5, l = 1, r = 1),
area(t = 6, b = 6, l = 1, r = 1),
area(t = 7, b = 7, l = 1, r = 1)
)
# combine and plot!
(p1 + p2 + p3 + p4 + p5) +
plot_layout(design = layout) &
ylim(0, 1) &
theme(plot.margin = margin(0, 5.5, 0, 5.5))
```
Here is how we might fit the model with `brms::brm()`.
```{r fit8.2}
fit8.2 <-
brm(data = my_data,
family = bernoulli(identity),
y ~ 0 + s,
prior(beta(2, 2), class = b, lb = 0, ub = 1),
iter = 2000, warmup = 500, cores = 4, chains = 4,
seed = 8,
file = "fits/fit08.02")
```
More typically, we'd parameterize the model as `y ~ 1 + s`. This form would yield an intercept and a slope. Behind the scenes, **brms** would treat the nominal `s` variable as an 0-1 coded dummy variable. One of the nominal levels would become the reverence category, depicted by the `Intercept`, and the difference between that and the other category would be the `s` slope. However, with our `y ~ 0 + s` syntax, we've suppressed the typical model intercept. The consequence is that each level of the nominal variable `s` gets its own intercept or [i] index, if you will. This is analogous to Kruschke's `y[i] ∼ dbern(theta[s[i]])` code.
All that aside, here are the chains.
```{r, fig.width = 10, fig.height = 3}
plot(fit8.2, widths = c(2, 3))
```
The model `summary()` is as follows:
```{r}
summary(fit8.2)
```
The `brms::pairs()` function gets us the bulk of Figure 8.6.
```{r, fig.width = 4.5, fig.height = 4}
pairs(fit8.2,
off_diag_args = list(size = 1/3, alpha = 1/3))
```
But to get at that difference-score distribution, we'll have extract the posterior draws with `as_draws_df()`, make difference score with `mutate()`, and manually plot with **ggplot2**.
```{r}
draws <- as_draws_df(fit8.2)
draws <-
draws %>%
rename(theta_Reginald = b_sReginald,
theta_Tony = b_sTony) %>%
mutate(`theta_Reginald - theta_Tony` = theta_Reginald - theta_Tony)
glimpse(draws)
```
```{r, fig.width = 10, fig.height = 3, message = F, warning = F}
long_draws <-
draws %>%
select(starts_with("theta")) %>%
pivot_longer(everything()) %>%
mutate(name = factor(name, levels = c("theta_Reginald", "theta_Tony", "theta_Reginald - theta_Tony")))
long_draws %>%
ggplot(aes(x = value, y = 0, fill = name)) +
stat_histinterval(point_interval = mode_hdi, .width = .95,
slab_color = "white", outline_bars = T,
normalize = "panels") +
scale_fill_manual(values = colorblind_pal()(8)[2:4], breaks = NULL) +
scale_y_continuous(NULL, breaks = NULL) +
theme_minimal_hgrid() +
facet_wrap(~ name, scales = "free")
```
Note how this time we used the `colorblind_pal()` function within `scale_fill_manual()` to manually select three of the of the `scale_fill_colorblind()` colors.
Anyway, here's a way to get the numeric summaries out of `post`.
```{r}
long_draws %>%
group_by(name) %>%
mode_hdi()
```
In this context, the `mode_hdi()` summary yields:
* `name` (i.e., the name we used to denote the parameters)
* `value` (i.e., the value of the measure of central tendency)
* `.lower` (i.e., the lower level of the 95% HDI)
* `.upper` (i.e., the upper level...)
* `.width` (i.e., what interval we used)
* `.point` (i.e., the type of measure of central tendency)
* `.interval` (i.e., the type of interval)
## Sampling from the prior distribution in ~~JAGS~~ brms
There are a few ways to sample from the prior distribution with **brms**. Here we'll practice by setting `sample_prior = "only"`. As a consequence, `brm()` will ignore the likelihood and return draws based solely on the model priors.
```{r fit8.3}
fit8.3 <-
brm(data = my_data,
family = bernoulli(identity),
y ~ 0 + s,
prior = c(prior(beta(2, 2), class = b, coef = sReginald),
prior(beta(2, 2), class = b, coef = sTony),
# this just sets the lower and upper bounds
prior(beta(2, 2), class = b, lb = 0, ub = 1)),
iter = 2000, warmup = 500, cores = 4, chains = 4,
sample_prior = "only",
seed = 8,
file = "fits/fit08.03")
```
Because we set `sample_prior = "only"`, the `as_draws_df()` function will now return draws from the priors.
```{r, warning = F}
draws <- as_draws_df(fit8.3) %>%
select(starts_with("b_"))
head(draws)
```
With our prior draws in hand, we're almost ready to make the prior histograms of Figure 8.7. But first we'll want to determine the $z/N$ values in order to mark them off in the plots. [You'll note Kruschke did so with gray plus marks in his.]
```{r, message = F}
my_data %>%
group_by(s) %>%
summarise(z = sum(y),
N = n()) %>%
mutate(`z/N` = z / N)
levels <- c("theta_Reginald", "theta_Tony", "theta_Reginald - theta_Tony")
d_line <-
tibble(value = c(.75, .286, .75 - .286),
name = factor(c("theta_Reginald", "theta_Tony", "theta_Reginald - theta_Tony"),
levels = levels))
```
Behold the histograms of Figure 8.7.
```{r, fig.width = 10, fig.height = 3, message = F, warning = F}
draws %>%
rename(theta_Reginald = b_sReginald,
theta_Tony = b_sTony) %>%
mutate(`theta_Reginald - theta_Tony` = theta_Reginald - theta_Tony) %>%
pivot_longer(contains("theta")) %>%
mutate(name = factor(name, levels = levels)) %>%
ggplot(aes(x = value, y = 0)) +
stat_histinterval(point_interval = mode_hdi, .width = .95,
fill = colorblind_pal()(8)[5], normalize = "panels") +
geom_vline(data = d_line,
aes(xintercept = value),
linetype = 2) +
scale_y_continuous(NULL, breaks = NULL) +
labs(subtitle = expression("The dashed vertical lines mark off "*italic(z[s])/italic(N[s]))) +
theme_cowplot() +
facet_wrap(~ name, scales = "free")
```
Here's how you might make the scatter plot.
```{r, fig.width = 3, fig.height = 3}
draws %>%
rename(theta_Reginald = b_sReginald,
theta_Tony = b_sTony) %>%
ggplot(aes(x = theta_Reginald, y = theta_Tony)) +
geom_point(alpha = 1/4, color = colorblind_pal()(8)[6]) +
coord_equal() +
theme_minimal_grid()
```
Or you could always use a two-dimensional density plot with `stat_density_2d()`.
```{r, fig.width = 3, fig.height = 3, warning = F}
draws %>%
rename(theta_Reginald = b_sReginald,
theta_Tony = b_sTony) %>%
ggplot(aes(x = theta_Reginald, y = theta_Tony)) +
stat_density_2d(aes(fill = stat(density)),
geom = "raster", contour = F) +
scale_fill_viridis_c(option = "B", breaks = NULL) +
scale_x_continuous(expression(theta[1]),
expand = c(0, 0), limits = c(0, 1),
breaks = 0:4 / 4, labels = c("0", ".25", ".5", ".75", "1")) +
scale_y_continuous(expression(theta[2]),
expand = c(0, 0), limits = c(0, 1),
breaks = 0:4 / 4, labels = c("0", ".25", ".5", ".75", "1")) +
coord_equal() +
theme_minimal_grid()
```
The **viridis** color palettes in functions like `scale_fill_viridis_c()` and `scale_fill_viridis_d()` are designed to be colorblind safe, too [@R-viridis].
## Probability distributions available in ~~JAGS~~ brms
> [**brms**] has a large collection of frequently used probability distributions that are built-in. These distributions include the beta, gamma, normal, Bernoulli, and binomial along with many others. A complete list of distributions, and their [**brms**] names, can be found in [Bürkner's [-@Bürkner2022Parameterization] vignette [*Parameterization of response distributions in brms*](https://CRAN.R-project.org/package=brms/vignettes/brms_families.html)]. [@kruschkeDoingBayesianData2015, pp. 213--214, **emphasis** added]
### Defining new likelihood functions.
In addition to all the likelihood functions listed in above mentioned vignette, you can also make your own likelihood functions. Bürkner explained the method in his [-@Bürkner2022Define] vignette, [*Define custom response distributions with brms*](https://CRAN.R-project.org/package=brms/vignettes/brms_customfamilies.html).
## Faster sampling with parallel processing in ~~runjags~~ `brms::brm()`
We don't need to open another package to sample in parallel in **brms**. In fact, we've already been doing that. Take another look at the code use used for the last model, `fit8.2`.
```{r, eval = F}
fit8.2 <-
brm(data = my_data,
family = bernoulli(identity),
y ~ 0 + s,
prior(beta(2, 2), class = b, lb = 0, ub = 1),
iter = 2000, warmup = 500, cores = 4, chains = 4,
seed = 8,
file = "fits/fit08.02")
```
See the `cores = 4, chains = 4` arguments? With that bit of code, we told `brms::brm()` we wanted 4 chains, which we ran in parallel across 4 cores.
## Tips for expanding ~~JAGS~~ brms models
I'm in complete agreement with Kruschke, here:
> Often, the process of programming a model is done is stages, starting with a simple model and then incrementally incorporating complexifications. At each step, the model is checked for accuracy and efficiency. This procedure of incremental building is useful for creating a desired complex model from scratch, for expanding a previously created model for a new application, and for expanding a model that has been found to be inadequate in a posterior predictive check. (p. 218)
## Session info {-}
```{r}
sessionInfo()
```
```{r, message = F, warning = F, echo = F}
# Here we'll remove our objects
rm(p1, my_arrow, p2, p3, p4, p5, layout, my_data, fit8.1, draws, fit8.1_c, estimate_mode, my_breaks, fit8.2, long_draws, fit8.3, d_line, levels)
```
```{r, echo = F, message = F, warning = F, results = "hide"}
pacman::p_unload(pacman::p_loaded(), character.only = TRUE)
```