Replies: 24 comments
-
If we want to stick with …
-
Given that CMPs do not seem feasible with …
-
Just posted https://discourse.mc-stan.org/t/conditional-means-priors-in-brms/32160 to make sure we are not missing something.
-
I may be missing something, but what prevents us from just fitting the reparametrized model, where we just have a different fixed-effects design matrix …? Another interesting point: my understanding is that it actually results in a reparametrized model with a different design matrix. I don't fully understand how …
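To make this concrete, here is a small numpy sketch (my own illustration, not code from this thread) of the identity-link case: substituting `b = P_inv @ m` turns `X @ b` into `Q @ m` with `Q = X @ P_inv`, i.e. the same model with a different fixed-effects design matrix.

```python
import numpy as np

# Hypothetical 2x2 layout: treatment (0/1) x visit (0/1),
# coded as intercept + main effects + interaction.
cells = np.array([
    [1, 0, 0, 0],  # control,   visit 1
    [1, 0, 1, 0],  # control,   visit 2
    [1, 1, 0, 0],  # treatment, visit 1
    [1, 1, 1, 1],  # treatment, visit 2
], dtype=float)
X = np.repeat(cells, 2, axis=0)   # two observations per cell

# With an identity link, the cell means are m = P @ b, where P stacks the
# unique design rows; P is invertible, so b = P_inv @ m.
P = cells
P_inv = np.linalg.inv(P)

b = np.array([1.0, 0.5, -0.2, 0.3])  # arbitrary coefficients
m = P @ b                            # implied cell means

# Reparametrized design matrix: Q = X @ P_inv, and X @ b == Q @ m.
Q = X @ P_inv
assert np.allclose(X @ b, Q @ m)

# In this cell-means case, Q is just the cell-membership indicator matrix.
assert np.allclose(Q, np.repeat(np.eye(4), 2, axis=0))
```

So the "reparametrized model" really is an ordinary linear model in `m`, just with design matrix `Q` instead of `X`.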
-
Wow, you're right! Brilliant! I think I missed it before because in the more general formulation for GLMs, we would have to go through a link function, and that would make things more difficult unless we formally transform the components of …. So yes, I think it would be possible to implement this in …. Given @andrew-bean's workaround, what are everyone's thoughts about #26?
-
@andrew-bean, if you post #40 (comment) to https://discourse.mc-stan.org/t/conditional-means-priors-in-brms/32160, I will mark it as the solution.
-
@wlandau ok sounds fine, thanks. Done. On the same topic: one key difference between the two parametrizations is that cell means are correlated a priori under the usual prior (e.g. the mean response for a treatment group at two different timepoints is correlated due to the shared main effect of treatment). For me that correlation actually seems desirable, and it matters for the way historical data would inform a model. With independent priors on the cell means, historical data from one cell would influence the posterior only for that cell. That seems undesirable because, e.g., if you handed me data showing a drug was ineffective at 4 and 8 weeks in a past study, it should change my prior for that drug at 12 weeks in a future study.
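The a priori correlation described here is easy to see numerically. A numpy sketch (my own illustration, with made-up prior variances): independent priors on the usual coefficients induce a correlated prior on the cell means through the shared main effects.

```python
import numpy as np

# Cell means are m = P @ b, with P the unique-row design matrix of a
# hypothetical treatment-by-visit layout.
P = np.array([
    [1, 0, 0, 0],  # control,   week 4
    [1, 0, 1, 0],  # control,   week 8
    [1, 1, 0, 0],  # treatment, week 4
    [1, 1, 1, 1],  # treatment, week 8
], dtype=float)

V_b = np.diag([1.0, 1.0, 1.0, 1.0])   # independent prior covariance on b
V_m = P @ V_b @ P.T                    # induced prior covariance on cell means

# The two treatment cells share the main effect of treatment, so they are
# positively correlated a priori; with independent priors on m they wouldn't be.
corr = V_m[2, 3] / np.sqrt(V_m[2, 2] * V_m[3, 3])
assert corr > 0
```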
-
Thanks, @andrew-bean and @wlandau, this sounds smart! With the proposed solution, …
-
I think we could take the posterior samples of …
Yes, whereas the usual notion of conditional means priors seems to assume prior independence among the components of …
So then maybe cell means are not always the right choice for …
(1) not only justifies modeling assumptions, it could also ensure we have a nice, well-behaved posterior geometry and efficient sampling in HMC. That might be hard to combine with (2), but it's worth taking time to think about which parameterizations would make sense for both.
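On the first point, transforming posterior samples is a one-line matrix operation once the model is parameterized in cell means. A hypothetical numpy sketch (fake draws standing in for a fitted model, not output from any package here):

```python
import numpy as np

# b = P_inv @ m, applied draw by draw to posterior samples of the cell means.
rng = np.random.default_rng(0)
P = np.array([[1, 0, 0, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 0],
              [1, 1, 1, 1]], dtype=float)
P_inv = np.linalg.inv(P)

# Fake posterior draws of the four cell means (4000 draws).
draws_m = rng.normal(loc=[0.0, 0.1, 0.5, 0.9], scale=0.05, size=(4000, 4))
draws_b = draws_m @ P_inv.T          # draws of the regression coefficients

# Sanity check: under this coding the interaction is m4 - m3 - m2 + m1.
manual = draws_m[:, 3] - draws_m[:, 2] - draws_m[:, 1] + draws_m[:, 0]
assert np.allclose(draws_b[:, 3], manual)
```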
-
Was playing with some code to compare the two approaches (just simulating from priors here). Mostly this is to investigate the …. Tacked on an illustration of backtransforming ….
-
This is really helpful to walk through, @andrew-bean! I really like the way you calculate …. So for …. On the other hand, I would find it clearer to manually construct the model matrix and supply it directly to …
-
So along the lines of #40 (comment):
…
and #40 (comment):
…
@andrew-bean and others: if we choose something other than cell means for …
-
I have not absorbed all the content above, but https://bbolker.github.io/mixedmodels-misc/notes/contrasts.pdf is a good read. For MMRMs, I think that the …
-
Thanks for the reference, @weberse2. I have always had trouble with contrast notation in R, but maybe I won't after reading that article. |
-
Reading https://bbolker.github.io/mixedmodels-misc/notes/contrasts.pdf, I am reminded of how uncertain I feel when I use the contrast interface and …
I may be in the minority, but even after 10+ years of using R to do statistics, I still find it hard to trust that the contrast/formula machinery faithfully represents the model I have in mind. It takes a lot of mental arithmetic and cognitive load just to make sure I am not making obvious errors. I would prefer to set up the regression coefficients in a more explicit and pedantic way, with transparent/verbose assurances and strict guardrails (in our case, simple objects passed to well-documented, fit-for-purpose formal arguments).
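As one illustration of that "explicit and pedantic" style (a numpy sketch with hypothetical factor names, not a proposal for the actual interface): build a cell-means design matrix directly from labels, so every column can be audited by eye instead of trusting formula/contrast machinery.

```python
import numpy as np

# Hypothetical factors; one column per treatment-by-visit cell.
treatments = ["placebo", "drug"]
visits = ["week4", "week8"]
cells = [(t, v) for t in treatments for v in visits]

# Hypothetical observed rows (treatment, visit) per subject-visit.
data = [("drug", "week8"), ("placebo", "week4"), ("drug", "week4")]

# Indicator design matrix: row i has a single 1 in the column of its cell.
X = np.array([[1.0 if (t, v) == cell else 0.0 for cell in cells]
              for (t, v) in data])

# Guardrails: every row selects exactly one cell, and the right one.
assert X.sum(axis=1).tolist() == [1.0, 1.0, 1.0]
assert X[0, cells.index(("drug", "week8"))] == 1.0
```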
-
I do understand your concerns with the R contrast system, which is not super intuitive. It's just that a lot of R works this way, and going against this established system is a deliberate choice. However, you are probably right that the majority of people are not familiar with these conventions in R anyway, so it is better to create your own.
-
Taking a step back and revisiting #40 (comment), I wonder if explicit CMPs are even necessary for our identity link function. As @andrew-bean observed, the model matrix just collapses back down into a different model matrix. In that situation, I would think it is more efficient to choose whatever prior-friendly parameterization makes sense (for example, cell means) and then compute quantities like the intercept as transformed parameters if needed. Of course, if we want our priors on fixed effects to be correlated, that might complicate things. But are correlated priors even necessary to consider? It seems like a challenge to account for correlations between different time points in the prior, especially if we need to elicit that prior. And we already have a flexible correlation matrix on the residuals.
-
Well... if you are going for non-informative priors, then cell means are easy and simple to set up... but in practice I'd much prefer to put a prior on the difference in the response from visit to visit. This difference usually isn't big, and it immediately creates a correlation structure which is appropriate for the problem at hand in many cases.
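This difference parameterization can be written as another invertible map. A numpy sketch (my own numbers): with d = (mu_1, mu_2 - mu_1, mu_3 - mu_2), the visit means are cumulative sums, m = L @ d, and tight independent priors on the differences induce strong positive correlation across visit means.

```python
import numpy as np

# Lower-triangular ones: cumulative-sum map from (baseline, differences)
# to visit means, m = L @ d.
L = np.tril(np.ones((3, 3)))

# Independent priors: wide on the first visit, tight on each difference
# (made-up variances for illustration).
V_d = np.diag([1.0, 0.1, 0.1])

# Induced prior covariance across the visit means.
V_m = L @ V_d @ L.T

# Adjacent visits end up strongly correlated a priori.
corr_12 = V_m[0, 1] / np.sqrt(V_m[0, 0] * V_m[1, 1])
assert corr_12 > 0.9
```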
-
Coming back to this after almost a year, I think it would help to list all the major cases we care about for assigning priors. I can think of priors on: …
Any others?
-
I just added a prototype for time differences: #92. Would be great to discuss this and #89 in our meeting tomorrow. |
-
For this issue, we seem to be heading away from conditional means priors and towards reparameterization in general. In the original formulation by Bedrick et al., CMPs are about using elicited priors on the response scale to induce priors on the model coefficients. We don't actually need that here. All we need is a model that has a parameterization amenable to informative priors. Beyond that point, any deterministic function on posterior samples is straightforward. |
-
Moving this to a discussion in favor of #96. |
-
Another related paper: "A hierarchical prior for generalized linear models based on predictions for the mean response". |
-
On the original thread, I think we landed well in #96 and #100 with informative prior archetypes. |
-
We talked about specifying a joint prior on the placebo means or treatment effects, then translating it to an induced prior on any parameterization of regression coefficients. The literature calls this approach "conditional means priors" (CMPs), or BCJ priors after the authors of https://www.jstor.org/stable/2291571, and the topic comes up in prior elicitation.

The method itself is exactly what you would expect: for a model of the form `g(E(y|data)) = X * b` with link function `g()`, and a vector of conditional means `m` such that `g(m) = P * b` for an invertible matrix `P`, we specify independent priors on the components of `m` and simply consider the induced prior on `b = P_inverse * g(m)`. The components of `m` become the true parameters of the model, and the regression coefficients in `b` are just transformed parameters. The full augmented model is `y ~ MultivariateNormal(X * b, Sigma)` with `b = P_inverse * g(m)` and independent priors on the components of `m`. Since the link `g()` is the identity function, the model collapses down to `y ~ MultivariateNormal(Q * m, Sigma)`, where `Q = X * P_inverse`. In other words, our true parameters are `m` and `Sigma`, and when we do MCMC, we draw joint posterior samples of (`m`, `Sigma`). This seems to be the most feasible (and recommended) way to handle CMPs. The alternative is to analytically transform the joint prior on `m` to a correlated joint prior on `b` beforehand, which seems extremely messy and error-prone even in a specialized case like ours. Then we would need to hack the Stan code of `brms` to implement a custom likelihood family because of the correlated priors on the components of `b`, which is also extremely messy and error-prone. So unfortunately, I think CMPs force us away from `brms`.
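A numeric sketch of this construction in numpy (my own made-up two-cell example, identity link): independent priors on `m` induce a correlated joint prior on `b = P_inverse @ m`, and the collapsed design satisfies `X @ b == Q @ m`.

```python
import numpy as np

rng = np.random.default_rng(1)

# g(m) = P @ b with g = identity; P invertible, so b = P_inv @ m.
P = np.array([[1, 0], [1, 1]], dtype=float)
P_inv = np.linalg.inv(P)

# Independent priors on the two conditional means m ...
m_draws = rng.normal(loc=[0.0, 1.0], scale=[1.0, 1.0], size=(100_000, 2))
# ... induce a correlated joint prior on b:
b_draws = m_draws @ P_inv.T
cov_b = np.cov(b_draws.T)

# Here b = (m1, m2 - m1), so Cov(b1, b2) = -Var(m1) = -1.
assert abs(cov_b[0, 1] + 1.0) < 0.05

# Collapsed design matrix: X @ b == Q @ m with Q = X @ P_inv.
X = np.array([[1, 0], [1, 1], [1, 1]], dtype=float)
Q = X @ P_inv
assert np.allclose(X @ (P_inv @ m_draws[0]), Q @ m_draws[0])
```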