Introduction

In this sixth edition, we take a deeper look into the structure of Bayesian models by introducing the concept of conjugate prior distributions. These are prior distributions that lead to posterior distributions in the same family, offering mathematical convenience and analytical tractability.

We’ll begin with a general definition, then explore key examples of naturally conjugate models used throughout Bayesian inference. Understanding these models is essential for applying Bayesian methods efficiently in real-world problems.


What Are Conjugate Priors?

Definition

Let:

  • \(\mathcal{F}\) be a class of likelihood functions \(p(y \mid \theta)\),
  • \(\mathcal{P}\) be a class of prior distributions \(p(\theta)\).

Then \(\mathcal{P}\) is said to be conjugate to \(\mathcal{F}\) if for all \(p(\cdot \mid \theta) \in \mathcal{F}\) and all \(p(\cdot) \in \mathcal{P}\), the posterior \(p(\theta \mid y)\) also belongs to \(\mathcal{P}\).
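
To see the definition in action, take a Binomial likelihood with a Beta prior. Keeping only the factors that involve \(\theta\),

$$p(\theta \mid y) \propto \underbrace{\theta^{y}(1 - \theta)^{n - y}}_{\text{likelihood}} \, \underbrace{\theta^{\alpha - 1}(1 - \theta)^{\beta - 1}}_{\text{prior}} = \theta^{\alpha + y - 1}(1 - \theta)^{\beta + n - y - 1},$$

which is the kernel of a \(\text{Beta}(\alpha + y, \beta + n - y)\) density. The posterior stays in the Beta family, so the Beta class is conjugate to the Binomial likelihood.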

Why Are Conjugate Priors So Important?

  • Conjugate priors simplify computation: posterior distributions can be computed analytically.
  • They provide closed-form posterior expressions for many standard models.
  • The functional form of the prior mirrors the likelihood, which makes posterior updates interpretable and computationally efficient.

Common Conjugate Models

Single-Parameter Models

| Model | Likelihood | Prior | Posterior |
|---|---|---|---|
| Binomial-Beta | \(Y \sim \text{Bin}(n, \theta)\) | \(\theta \sim \text{Beta}(\alpha, \beta)\) | \(\theta \mid y \sim \text{Beta}(\alpha + y, \beta + n - y)\) |
| Negative Binomial-Beta | \(Y \sim \text{NegBin}(r, \theta)\) (failures before the \(r\)-th success) | \(\theta \sim \text{Beta}(\alpha, \beta)\) | \(\theta \mid y \sim \text{Beta}(\alpha + r, \beta + y)\) |
| Poisson-Gamma | \(Y \sim \text{Poisson}(\theta)\) | \(\theta \sim \text{Gamma}(\alpha, \beta)\) | \(\theta \mid y \sim \text{Gamma}(\alpha + y, \beta + 1)\) |
| Exponential-Gamma | \(Y \sim \text{Exp}(\theta)\) | \(\theta \sim \text{Gamma}(\alpha, \beta)\) | \(\theta \mid y \sim \text{Gamma}(\alpha + 1, \beta + y)\) |
| Normal-Normal | \(Y \sim \mathcal{N}(\theta, \sigma^2)\), \(\sigma^2\) known | \(\theta \sim \mathcal{N}(\mu, \tau^2)\) | Normal (see the example below) |
| Normal-Inverse-Gamma | \(Y \sim \mathcal{N}(\theta, \sigma^2)\), \(\sigma^2\) unknown | \(\theta \mid \sigma^2 \sim \mathcal{N}(\mu, v\sigma^2)\), \(\sigma^2 \sim \text{InvGamma}(\alpha, \beta)\) | Normal-Inverse-Gamma |

Posterior updates are shown for a single observation \(y\); the Gamma distributions use the rate parametrization.
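
In code, these updates amount to simple parameter arithmetic. Here is a minimal sketch of the Poisson-Gamma row (the prior values and the observed count are illustrative assumptions, not from the text):

# Poisson-Gamma conjugate update for a single observation y
alpha <- 3; beta <- 1                 # Gamma(shape = alpha, rate = beta) prior
y <- 7                                # observed Poisson count (illustrative)

alpha_post <- alpha + y               # posterior shape: alpha + y
beta_post <- beta + 1                 # posterior rate: beta + 1 (one observation)

alpha_post / beta_post                # posterior mean of theta
qgamma(c(0.025, 0.975), shape = alpha_post, rate = beta_post)  # 95% credible interval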

Multivariate Models

Multinomial-Dirichlet:
$$\mathbf{Y} \sim \text{Multinomial}(n, \boldsymbol{\theta}), \quad \boldsymbol{\theta} \sim \text{Dirichlet}(\boldsymbol{\alpha})$$
$$\Rightarrow \boldsymbol{\theta} \mid \mathbf{Y} \sim \text{Dirichlet}(\boldsymbol{\alpha} + \mathbf{Y})$$

Multivariate Normal:
$$\mathbf{Y} \sim \mathcal{N}_k(\boldsymbol{\theta}, \Sigma), \quad \boldsymbol{\theta} \sim \mathcal{N}_k(\boldsymbol{\mu}, V)$$
The posterior is again multivariate normal.
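
For completeness, with \(\Sigma\) known and a single observation \(\mathbf{y}\), the standard precision-weighted update is

$$\boldsymbol{\theta} \mid \mathbf{y} \sim \mathcal{N}_k\left( \left(V^{-1} + \Sigma^{-1}\right)^{-1}\left(V^{-1}\boldsymbol{\mu} + \Sigma^{-1}\mathbf{y}\right),\; \left(V^{-1} + \Sigma^{-1}\right)^{-1} \right),$$

the matrix analogue of the scalar Normal-Normal formulas derived in the next example.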


Example: Normal-Normal Conjugate Model

Suppose:

  • \(Y \mid \theta \sim \mathcal{N}(\theta, \sigma^2)\), with \(\sigma^2\) known.
  • Prior: \(\theta \sim \mathcal{N}(\mu, \tau^2)\)

Then:

$$\theta \mid y \sim \mathcal{N}(\mu_*, \tau_*^2)$$

with:

$$\mu_* = \frac{\tau^2 y + \sigma^2 \mu}{\tau^2 + \sigma^2}, \quad \tau_*^2 = \frac{\tau^2 \sigma^2}{\tau^2 + \sigma^2}$$

The posterior mean \(\mu_*\) is a weighted average of the prior mean \(\mu\) and the observation \(y\), with weights proportional to their precisions \(1/\tau^2\) and \(1/\sigma^2\).

🔍 Insight: The more precise the data (i.e., smaller \(\sigma^2\)), the more influence it has on the posterior mean.
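
These formulas are easy to evaluate directly. A minimal numerical sketch (the values of \(y\), \(\sigma^2\), \(\mu\), and \(\tau^2\) are illustrative assumptions):

# Normal-Normal update for a single observation y, with sigma^2 known
y <- 2.5; sigma2 <- 1                 # observation and known sampling variance
mu <- 0; tau2 <- 4                    # prior mean and prior variance

mu_star <- (tau2 * y + sigma2 * mu) / (tau2 + sigma2)    # posterior mean
tau2_star <- (tau2 * sigma2) / (tau2 + sigma2)           # posterior variance

mu_star                               # 2.0: pulled strongly toward y (vague prior)
tau2_star                             # 0.8: smaller than both sigma^2 and tau^2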


Example: Multinomial-Dirichlet Model

Generalizing the Binomial-Beta model to \(k\) categories:

Likelihood:
$$p(\mathbf{Y} \mid \boldsymbol{\theta}) = {n \choose y_1, \ldots, y_k} \theta_1^{y_1} \cdots \theta_k^{y_k}$$

Prior:
$$p(\boldsymbol{\theta}) = \frac{\Gamma(\sum_j \alpha_j)}{\prod_j \Gamma(\alpha_j)} \prod_j \theta_j^{\alpha_j - 1}$$

Posterior:
$$\boldsymbol{\theta} \mid \mathbf{Y} \sim \text{Dirichlet}(\alpha_1 + y_1, \ldots, \alpha_k + y_k)$$

💡 This conjugate structure makes updating beliefs over multinomial outcomes straightforward.
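
A minimal sketch in base R (the prior and the counts are illustrative assumptions): updating adds the observed counts to the prior, and Dirichlet draws can be generated by normalizing independent Gamma variables.

# Multinomial-Dirichlet update: add observed counts to the prior
alpha <- c(1, 1, 1)                   # symmetric Dirichlet(1, 1, 1) prior
y <- c(12, 5, 3)                      # observed counts in k = 3 categories
alpha_post <- alpha + y               # Dirichlet posterior parameters

alpha_post / sum(alpha_post)          # analytic posterior mean of theta

# Posterior draws via normalized independent Gamma variables
rdirichlet_one <- function(a) { g <- rgamma(length(a), shape = a); g / sum(g) }
samples <- t(replicate(4000, rdirichlet_one(alpha_post)))
colMeans(samples)                     # close to the analytic posterior mean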


Why Conjugate Priors Matter

Conjugate priors provide several key advantages in Bayesian analysis. They enable analytical solutions without numerical integration and make parameter updates easy to interpret. They also serve as a foundation for hierarchical modeling and are essential for understanding Bayesian predictive distributions. This mathematical tractability makes them invaluable for both theoretical development and practical implementation of Bayesian methods.
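
The point about predictive distributions can be made concrete with the Binomial-Beta model: integrating the likelihood of \(\tilde{y}\) successes in \(m\) future trials against the \(\text{Beta}(\alpha + y, \beta + n - y)\) posterior gives a closed-form Beta-Binomial predictive distribution,

$$p(\tilde{y} \mid y) = \binom{m}{\tilde{y}} \frac{B(\alpha + y + \tilde{y},\; \beta + n - y + m - \tilde{y})}{B(\alpha + y,\; \beta + n - y)},$$

where \(B(\cdot, \cdot)\) is the Beta function. No numerical integration is required.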


Simulation and Visualization in R: Plotting Prior and Posterior Distributions

# Load the tidyverse: provides tibble(), pivot_longer(), the %>% pipe, and ggplot2
library(tidyverse)

# Parameters
alpha_prior <- 2
beta_prior <- 2
n <- 20
y <- 14

# Posterior parameters
alpha_post <- alpha_prior + y
beta_post <- beta_prior + n - y

# Create theta grid
theta <- seq(0, 1, length.out = 1000)

# Densities
prior <- dbeta(theta, alpha_prior, beta_prior)
posterior <- dbeta(theta, alpha_post, beta_post)

# Combine into data frame
df <- tibble(
  theta = theta,
  Prior = prior,
  Posterior = posterior
) %>%
  pivot_longer(-theta, names_to = "Distribution", values_to = "Density")

# Create visualization
ggplot(df, aes(x = theta, y = Density, color = Distribution)) +
  geom_line(linewidth = 1.2) +
  labs(
    title = "Binomial-Beta Conjugate Prior Updating",
    subtitle = "Prior: Beta(2,2), Data: 14 successes in 20 trials",
    x = expression(theta),
    y = "Density"
  ) +
  theme_minimal() +
  scale_color_manual(values = c("steelblue", "firebrick"))

[Figure: prior Beta(2, 2) and posterior Beta(16, 8) densities over \(\theta\)]
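
As a quick sanity check (a minimal sketch reusing the objects defined above), a grid approximation of prior times likelihood recovers the analytic Beta posterior:

# Grid approximation of prior x likelihood, normalized to a density
unnorm <- dbeta(theta, alpha_prior, beta_prior) * dbinom(y, n, theta)
grid_post <- unnorm / (sum(unnorm) * (theta[2] - theta[1]))

max(abs(grid_post - posterior))       # should be close to zero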


Conclusion

This edition formalized the theory and practice of conjugate priors in Bayesian inference. From the intuitive Binomial-Beta to the flexible Normal-Inverse-Gamma, conjugate models allow us to update beliefs coherently and efficiently.

Understanding conjugate priors provides the mathematical foundation necessary for more advanced Bayesian modeling techniques. These distributions serve as building blocks for complex hierarchical models and offer computational advantages that make Bayesian analysis tractable in many real-world applications.

In the next edition, we will take a closer look at other commonly used priors.


Keep exploring with 3D Statistical Learning.

We thank Dr. Dany Djeudeu for his dedication to making complex statistical ideas accessible, rigorous, and inspiring.