1. From Classical to Bayesian Decision Making

In earlier editions, we defined:

  • Statistical decision problems
  • Point estimation, hypothesis testing, confidence intervals
  • Loss functions and risk functions
  • Concepts of admissibility, dominance, and
    optimality

In Edition 6, we move toward a new class of decision
rules: Bayes rules.

2. Motivation: Why Bayesian Decision Rules?

In frequentist decision theory, we evaluated a decision rule \(\delta\) using the risk
function:

$$
R(\theta, \delta) = \mathbb{E}_\theta[L(\theta, \delta(X))]
$$

where the expectation is taken over the data \(X \sim f(x|\theta)\). But this risk still depends on the unknown \(\theta\).

Idea: If we have prior beliefs about \(\theta\), we can average the risk using a
prior distribution \(\pi(\theta)\).

3. Bayes Risk

Definition (Bayes Risk)

$$B(\pi, \delta) = \int R(\theta, \delta)\, \pi(\theta)\, d\theta$$

This is the frequentist risk averaged over the prior.

  • Reduces each rule's performance to a single number, so rules can be
    compared directly
  • Does not depend on the unknown \(\theta\)
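
As a small numerical illustration (the Binomial model, the rule \(\delta(x) = x/n\), and \(n = 10\) are assumptions for this sketch, not from the text), the Bayes risk is just the risk function integrated against the prior:

# Bayes risk of delta(x) = x/n for X ~ Binomial(n, theta) under squared
# error loss; its risk function is R(theta, delta) = theta * (1 - theta) / n
# (the variance of x/n), averaged over a Uniform(0, 1) prior on theta.
n <- 10
risk <- function(theta) theta * (1 - theta) / n
integrate(risk, 0, 1)$value   # = 1 / (6 * n), approximately 0.0167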

4. Bayes Decision Rule

A decision rule \(\delta_B\) is called a Bayes rule
(with respect to prior \(\pi\)) if:

$$\delta_B = \arg\min_{\delta \in \mathcal{D}} B(\pi, \delta)$$

This rule minimizes the average risk under the prior.
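
To see the definition in action, here is a sketch in the same assumed Binomial setup as above, comparing the Bayes risks of two concrete rules. The second rule is the posterior mean under the Uniform(0, 1) prior, and it indeed achieves the smaller average risk:

# Compare two rules for X ~ Binomial(n, theta) under squared error loss:
#   delta1(x) = x / n             (the MLE)
#   delta2(x) = (x + 1) / (n + 2) (the posterior mean under a uniform prior)
n <- 10
risk_mle   <- function(theta) theta * (1 - theta) / n
risk_bayes <- function(theta) {
  # variance plus squared bias of delta2
  (n * theta * (1 - theta) + (1 - 2 * theta)^2) / (n + 2)^2
}
integrate(risk_mle,   0, 1)$value   # approximately 0.0167
integrate(risk_bayes, 0, 1)$value   # approximately 0.0139: smaller Bayes risk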

5. Calculating Bayes Rules

To find a Bayes rule:

  • Compute posterior \(\pi(\theta|x)\) using Bayes’ theorem
  • Minimize posterior expected loss:

$$\delta_B(x) = \arg\min_d \int L(\theta, d) \pi(\theta|x) d\theta$$

This depends on:

  • The loss function
  • The posterior distribution

Why does minimizing the posterior expected loss for each \(x\) minimize the Bayes risk? Because the Bayes risk can be rewritten as \(B(\pi, \delta) = \int \left[ \int L(\theta, \delta(x))\, \pi(\theta|x)\, d\theta \right] m(x)\, dx\), where \(m(x)\) is the marginal density of the data; making the inner integral as small as possible at every \(x\) makes the whole integral as small as possible.

6. Examples of Bayes Rules

Case 1: Squared Error Loss

$$L(\theta, d) = (\theta - d)^2$$

Bayes rule:

$$\delta_B(x) = \mathbb{E}[\theta | x] \quad \text{(posterior mean)}$$

Case 2: Absolute Error Loss

$$L(\theta, d) = |\theta - d|$$

Bayes rule:

$$\delta_B(x) = \text{median of } \pi(\theta|x)$$
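
Both facts can be checked numerically. Here is a quick Monte Carlo sketch (the Gamma(3, 1) posterior is an arbitrary stand-in, assumed only for illustration): we scan a grid of candidate decisions \(d\) and minimize the estimated posterior expected loss.

# Minimize posterior expected loss over a grid of candidate decisions d,
# using samples from a stand-in posterior.
set.seed(1)
post   <- rgamma(1e4, shape = 3, rate = 1)   # stand-in posterior samples
d_grid <- seq(0, 10, length.out = 501)       # candidate decisions

sq_loss  <- sapply(d_grid, function(d) mean((post - d)^2))   # squared error
abs_loss <- sapply(d_grid, function(d) mean(abs(post - d)))  # absolute error

c(d_grid[which.min(sq_loss)],  mean(post))    # minimizer is close to the posterior mean
c(d_grid[which.min(abs_loss)], median(post))  # minimizer is close to the posterior median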

7. A Full Example

Let \(X \sim \text{Uniform}(0,\theta)\), \(\theta > 0\)

Assume a prior: \(\theta \sim \text{Gamma}(2,1)\)

Loss function: \(L(\theta, d) = (\theta - d)^2\)

Steps:

  • Use Bayes' theorem to get the posterior
  • Compute the posterior mean \(\mathbb{E}[\theta|x]\); this is the Bayes rule

This gives us a data-informed estimate of \(\theta\).
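
Carrying the steps out explicitly (a short derivation; recall that the Gamma(2,1) density is \(\pi(\theta) = \theta e^{-\theta}\) for \(\theta > 0\)):

$$
\begin{aligned}
f(x \mid \theta) &= \frac{1}{\theta}, \quad 0 < x \le \theta \\
\pi(\theta \mid x) &\propto f(x \mid \theta)\, \pi(\theta) = \frac{1}{\theta} \cdot \theta e^{-\theta} = e^{-\theta}, \quad \theta \ge x \\
\pi(\theta \mid x) &= e^{-(\theta - x)}, \quad \theta \ge x \\
\delta_B(x) &= \mathbb{E}[\theta \mid x] = \int_x^{\infty} \theta\, e^{-(\theta - x)}\, d\theta = x + 1
\end{aligned}
$$

So under squared error loss the Bayes rule is simply \(\delta_B(x) = x + 1\); for the observation \(x = 4\) used in the plot of Section 10 below, the estimate is \(5\).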

8. Theoretical Properties of Bayes Rules

Theorem (Admissibility)

Under regularity conditions (for example, when the Bayes rule is unique), Bayes rules are
admissible.

  • This means: no other rule dominates them, i.e., no rule \(\delta'\) has
    \(R(\theta, \delta') \le R(\theta, \delta_B)\) for every \(\theta\) with
    strict inequality for some \(\theta\).
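
Sketch (assuming the Bayes rule is unique): if some rule \(\delta'\) satisfied \(R(\theta, \delta') \le R(\theta, \delta_B)\) for all \(\theta\), averaging over \(\pi\) would give \(B(\pi, \delta') \le B(\pi, \delta_B)\), so \(\delta'\) would itself be a Bayes rule; uniqueness forces \(\delta' = \delta_B\), and hence no rule can dominate \(\delta_B\).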

9. Important Aspects of Bayesian Statistics

  • Bayesian statistics uses probability to express
    uncertainty about parameters
  • A prior distribution \(\pi(\theta)\) represents our belief before
    observing data
  • After observing data \(x\), the
    posterior distribution is given by Bayes’ theorem:

$$\pi(\theta|x) = \frac{f(x|\theta) \pi(\theta)}{\int f(x|\theta) \pi(\theta) d\theta}$$

  • Bayesian inference focuses on the posterior (a short numerical sketch
    follows this list):
      – Point estimates: posterior mean, mode (MAP), median
      – Interval estimates: credible intervals, e.g., a 95% interval \([a, b]\)
        such that
        $$P(\theta \in [a,b] \mid x) = 0.95$$
      – Decision-making via posterior expected loss minimization

  • Advantages:
      – Intuitive probabilistic interpretation
      – Flexibility in incorporating expert knowledge
      – Naturally handles uncertainty and prediction

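Here is the promised sketch, tying these quantities to the Uniform/Gamma example of Section 7, where the posterior turned out to be \(\pi(\theta|x) = e^{-(\theta - x)}\) for \(\theta \ge x\), i.e., \(\theta \mid x \sim x + \text{Exponential}(1)\):

# Point and interval estimates for the posterior of Section 7.
x <- 4

x + 1                        # posterior mean (the Bayes rule under squared error)
x + log(2)                   # posterior median (the Bayes rule under absolute error)
x + qexp(c(0.025, 0.975))    # equal-tailed 95% credible interval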

10. Example: Visualizing Prior and Posterior

library(ggplot2)
library(dplyr)
library(tidyr)

x <- 4                                      # single observation from Uniform(0, theta)
theta <- seq(0.01, 10, length.out = 500)    # grid of theta values

prior <- dgamma(theta, shape = 2, rate = 1)     # Gamma(2, 1) prior density
likelihood <- ifelse(theta >= x, 1/theta, 0)    # Uniform(0, theta) likelihood of x
likelihood <- likelihood / max(likelihood)      # rescale to max 1 for plotting
posterior <- prior * likelihood                 # unnormalized posterior
posterior <- posterior / max(posterior)         # rescale to max 1 for plotting

data <- tibble(
  theta = theta,
  Prior = prior,
  Likelihood = likelihood,
  Posterior = posterior
) %>%
  pivot_longer(cols = -theta, names_to = "Distribution", values_to = "Density")

ggplot(data, aes(x = theta, y = Density, color = Distribution)) +
  geom_line(linewidth = 1.2) +
  labs(title = "Bayesian Updating: Prior, Likelihood, and Posterior",
       x = expression(theta), y = "Normalized Density") +
  theme_minimal()

[Figure: Bayesian Updating: Prior, Likelihood, and Posterior]

Consistent with the derivation in Section 7, the posterior is zero below \(\theta = x = 4\) and decays exponentially above it.

11. Remarks on Bayes Rules

  • Depend on the prior: Different priors may lead to
    different decisions.
  • Can be non-unique: Multiple rules may yield the
    same Bayes risk.
  • Can simplify computations using conjugate priors; a short
    illustration follows this list.
  • Connect naturally with statistical decision
    theory.
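
A minimal conjugacy sketch (the Beta(2, 2) prior and the data counts are assumed numbers, chosen only for illustration): with a Beta prior and Binomial data, the posterior is again Beta, so the Bayes rule under squared error loss is available in closed form.

# Beta-Binomial conjugacy: a Beta(a, b) prior plus x successes in n trials
# gives a Beta(a + x, b + n - x) posterior.
a <- 2; b <- 2               # assumed Beta(2, 2) prior
x <- 7; n <- 10              # assumed data: 7 successes in 10 trials
post_a <- a + x              # posterior shape parameters: Beta(9, 5)
post_b <- b + n - x
post_a / (post_a + post_b)   # posterior mean, approximately 0.643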

12. Summary and Outlook

What we learned:

  • Bayes risk combines risk and prior beliefs
  • Bayes rules minimize Bayes risk
  • Easy to compute in many common problems
  • Often admissible and interpretable
  • Basic concepts of Bayesian statistics support the theory

What’s next:

In Edition 7, we’ll explore Bayesian Estimation under Weighted Quadratic Loss.

Stay tuned for the next part!

We gratefully acknowledge Dr. Dany Djeudeu for preparing this course.