1. From Classical to Bayesian Decision Making
In earlier editions, we defined:
- Statistical decision problems
- Point estimation, hypothesis testing, confidence intervals
- Loss functions and risk functions
- Concepts of admissibility, dominance, and optimality
In Edition 6, we move toward a new class of decision rules: Bayes rules.
2. Motivation: Why Bayesian Decision Rules?
In frequentist decision theory, we evaluated a decision rule \(\delta\) using the risk function:
$$
R(\theta, \delta) = \mathbb{E}_{\theta}[L(\theta, \delta(X))]
$$
But this risk still depends on the unknown \(\theta\): one rule may have smaller risk for some parameter values and larger risk for others, so rules are not always comparable.
Idea: If we have prior beliefs about \(\theta\), we can average the risk using a prior distribution \(\pi(\theta)\).
3. Bayes Risk
Definition (Bayes Risk)
$$B(\pi, \delta) = \int R(\theta, \delta) \pi(\theta)d\theta$$
This is the expected risk averaged over the prior.
- Helps us compare decision rules more objectively
- Independent of the unknown \(\theta\)
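To make the averaging explicit, the Bayes risk can also be written as a double integral over both the parameter and the data, with \(f(x \mid \theta)\) denoting the sampling density (as in Section 9):
$$B(\pi, \delta) = \int \int L(\theta, \delta(x))\, f(x \mid \theta)\, dx\; \pi(\theta)\, d\theta$$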
4. Bayes Decision Rule
A decision rule \(\delta_B\) is called a Bayes rule (with respect to prior \(\pi\)) if:
$$\delta_B = \arg\min_{\delta \in \mathcal{D}} B(\pi, \delta)$$
This rule minimizes the average risk under the prior.
5. Calculating Bayes Rules
To find a Bayes rule:
- Compute posterior \(\pi(\theta|x)\) using Bayes’ theorem
- Minimize posterior expected loss:
$$\delta_B(x) = \arg\min_d \int L(\theta, d) \pi(\theta|x) d\theta$$
This depends on:
- The loss function
- The posterior distribution
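Why does this pointwise recipe work? Writing the Bayes risk in terms of the marginal density \(m(x) = \int f(x \mid \theta)\, \pi(\theta)\, d\theta\) of the data,
$$B(\pi, \delta) = \int \left[ \int L(\theta, \delta(x))\, \pi(\theta \mid x)\, d\theta \right] m(x)\, dx,$$
so choosing \(\delta(x)\) to minimize the inner posterior expected loss separately for each observed \(x\) minimizes the entire integral.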
6. Examples of Bayes Rules
Case 1: Squared Error Loss
$$L(\theta, d) = (\theta - d)^2$$
Bayes rule:
$$\delta_B(x) = \mathbb{E}[\theta | x] \quad \text{(posterior mean)}$$
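A short justification: expanding the posterior expected loss gives
$$\mathbb{E}\big[(\theta - d)^2 \mid x\big] = \operatorname{Var}(\theta \mid x) + \big(d - \mathbb{E}[\theta \mid x]\big)^2,$$
which is minimized exactly at \(d = \mathbb{E}[\theta \mid x]\).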
Case 2: Absolute Error Loss
$$L(\theta, d) = |\theta - d|$$
Bayes rule:
$$\delta_B(x) = \text{median of } \pi(\theta|x)$$
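Analogously, the posterior expected absolute error is minimized at any \(d\) that splits the posterior probability in half,
$$\mathbb{P}\big(\theta \le \delta_B(x) \mid x\big) = \tfrac{1}{2},$$
which is precisely the defining property of the posterior median.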
7. A Full Example
Let \(X \sim \text{Uniform}(0,\theta)\), \(\theta > 0\)
Assume a prior: \(\theta \sim \text{Gamma}(2,1)\)
Loss function: \(L(\theta, d) = (\theta - d)^2\)
Steps:
- Use Bayes' theorem to get the posterior
- Compute \(\mathbb{E}[\theta|x]\) = Bayes rule
This gives us a data-informed estimate of \(\theta\); the computation is sketched below.
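A sketch of the computation for a single observation \(x\): the likelihood is \(f(x \mid \theta) = 1/\theta\) for \(\theta \ge x\) (and 0 otherwise), and the Gamma(2, 1) prior has density \(\pi(\theta) \propto \theta e^{-\theta}\). Multiplying,
$$\pi(\theta \mid x) \propto \frac{1}{\theta} \cdot \theta e^{-\theta} = e^{-\theta}, \qquad \theta \ge x,$$
so the posterior is a unit-rate exponential shifted to start at \(x\), and the Bayes rule under squared error loss is its mean:
$$\delta_B(x) = \mathbb{E}[\theta \mid x] = x + 1.$$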
8. Theoretical Properties of Bayes Rules
Theorem (Admissibility)
Under regularity conditions, Bayes rules are admissible.
- This means: no other rule is strictly better in terms of risk (see the formal statement below).
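In symbols, using the risk function from Section 2: \(\delta_B\) is admissible if there is no rule \(\delta'\) with
$$R(\theta, \delta') \le R(\theta, \delta_B) \ \text{ for all } \theta \qquad \text{and} \qquad R(\theta, \delta') < R(\theta, \delta_B) \ \text{ for some } \theta.$$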
9. Important Aspects of Bayesian Statistics
- Bayesian statistics uses probability to express uncertainty about parameters
- A prior distribution \(\pi(\theta)\) represents our belief before observing data
- After observing data \(x\), the posterior distribution is given by Bayes’ theorem:
$$\pi(\theta|x) = \frac{f(x|\theta) \pi(\theta)}{\int f(x|\theta) \pi(\theta) d\theta}$$
- Bayesian inference focuses on the posterior (a short R sketch follows this list):
  - Point estimates: posterior mean, mode (MAP), median
  - Interval estimates: credible intervals, e.g., a 95% interval \([a, b]\) such that
$$P(\theta \in [a,b] \mid x) = 0.95$$
  - Decision-making via posterior expected loss minimization
- Advantages:
  - Intuitive probabilistic interpretation
  - Flexibility in incorporating expert knowledge
  - Naturally handles uncertainty and prediction
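To make the posterior summaries above concrete, here is a minimal R sketch reusing the running example from Section 7, where the posterior for the observation x = 4 is a unit-rate exponential shifted to start at x:

# Posterior summaries for the running example (x = 4); the posterior density is
# exp(-(theta - x)) on [x, Inf), i.e. a unit-rate exponential shifted to start at x.
x <- 4
post_mode   <- x                          # posterior mode (MAP): the density is decreasing on [x, Inf)
post_mean   <- x + 1                      # posterior mean: Bayes rule under squared error loss
post_median <- x + qexp(0.5)              # posterior median: Bayes rule under absolute error loss
cred_int    <- x + qexp(c(0.025, 0.975))  # 95% equal-tailed credible interval [a, b]
post_mode; post_mean; post_median; cred_int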
10. Example: Visualizing Prior and Posterior
library(ggplot2)
library(dplyr)
library(tidyr)

x <- 4                                      # observed data point
theta <- seq(0.01, 10, length.out = 500)    # grid of theta values for plotting

# Gamma(2, 1) prior density
prior <- dgamma(theta, shape = 2, rate = 1)

# Uniform(0, theta) likelihood of a single observation x: 1/theta for theta >= x, 0 otherwise
likelihood <- ifelse(theta >= x, 1 / theta, 0)
likelihood <- likelihood / max(likelihood)  # rescale to a common height for plotting

# Unnormalized posterior = prior * likelihood, rescaled for plotting
posterior <- prior * likelihood
posterior <- posterior / max(posterior)

data <- tibble(
  theta = theta,
  Prior = prior,
  Likelihood = likelihood,
  Posterior = posterior
) %>%
  pivot_longer(cols = -theta, names_to = "Distribution", values_to = "Density")

ggplot(data, aes(x = theta, y = Density, color = Distribution)) +
  geom_line(linewidth = 1.2) +
  labs(title = "Bayesian Updating: Prior, Likelihood, and Posterior",
       x = expression(theta), y = "Normalized Density") +
  theme_minimal()
Figure: Bayesian updating for x = 4, showing the prior, likelihood, and posterior.
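As a quick numerical cross-check of the result sketched in Section 7 (a small added sketch, using the same observation x = 4), the posterior mean can also be obtained by integrating the unnormalized posterior:

# Numerical check: the unnormalized posterior is exp(-theta) on [x, Inf),
# so the posterior mean E[theta | x] should equal x + 1 (here, 5).
x <- 4
post_kernel <- function(theta) exp(-theta) * (theta >= x)
norm_const  <- integrate(post_kernel, lower = x, upper = Inf)$value
post_mean   <- integrate(function(t) t * post_kernel(t), lower = x, upper = Inf)$value / norm_const
post_mean   # approximately 5 = x + 1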
11. Remarks on Bayes Rules
- Depend on the prior: different priors may lead to different decisions.
- Can be non-unique: multiple rules may yield the same Bayes risk.
- Can simplify computations using conjugate priors (see the short sketch below).
- Connect naturally with statistical decision theory.
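As an illustration of the conjugate-prior point (a minimal sketch with assumed numbers, not an example from the text): with a Beta(a, b) prior on a binomial success probability, the posterior after s successes in n trials is Beta(a + s, b + n - s), so the Bayes rule under squared error loss is available in closed form as the posterior mean.

# Conjugate Beta-Binomial sketch: prior Beta(a, b), data s successes in n trials.
# Posterior is Beta(a + s, b + n - s); its mean is the Bayes rule under squared error loss.
a <- 2; b <- 2                    # assumed prior hyperparameters
n <- 10; s <- 7                   # assumed data
bayes_estimate <- (a + s) / (a + b + n)
bayes_estimate                    # 9/14, about 0.643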
12. Summary and Outlook
What we learned:
- Bayes risk combines risk and prior beliefs
- Bayes rules minimize Bayes risk
- Easy to compute in many common problems
- Often admissible and interpretable
- Basic concepts of Bayesian statistics support the theory
What’s next:
In Edition 7, we’ll explore Bayesian Estimation under Weighted Quadratic Loss.
Stay tuned for the next part!
We gratefully acknowledge Dr. Dany Djeudeu for preparing this course.
We help businesses and researchers solve complex challenges by providing expert guidance in statistics, machine learning, and tailored education.
Our core services include:
– Statistical Consulting:
Comprehensive consulting tailored to your data-driven needs.
– Training and Coaching:
In-depth instruction in statistics, machine learning, and the use of statistical software such as SAS, R, and Python.
– Reproducible Data Analysis Pipelines:
Development of documented, reproducible workflows using SAS macros and customized R and Python code.
– Interactive Data Visualization and Web Applications:
Creation of dynamic visualizations and web apps with R (Shiny, Plotly), Python (Streamlit, Dash by Plotly), and SAS (SAS Viya, SAS Web Report Studio).
– Automated Reporting and Presentation:
Generation of automated reports and presentations using Markdown and Quarto.
– Scientific Data Analysis:
Advanced analytical support for scientific research projects.