1. From Classical to Bayesian Decision Making
In earlier editions, we defined:
- Statistical decision problems
- Point estimation, hypothesis testing, confidence intervals
- Loss functions and risk functions
- Concepts of admissibility, dominance, and optimality
In Edition 6, we move toward a new class of decision rules: Bayes rules.
2. Motivation: Why Bayesian Decision Rules?
In frequentist decision theory, we evaluated a decision rule \(\delta\) using the risk function:
$$
R(\theta, \delta) = \mathbb{E}_{\theta}[L(\theta, \delta(X))]
$$
But this risk still depends on the unknown \(\theta\): one rule may have smaller risk for some parameter values and larger risk for others, so rules are not always comparable.
Idea: If we have prior beliefs about \(\theta\), we can average the risk using a prior distribution \(\pi(\theta)\).
3. Bayes Risk
Definition (Bayes Risk)
$$B(\pi, \delta) = \int R(\theta, \delta) \pi(\theta)d\theta$$
This is the expected risk averaged over the prior.
- Helps us compare decision rules more objectively
- Independent of the unknown \(\theta\)
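To make the averaging explicit, the Bayes risk can also be written as a double integral over both the parameter and the data, with \(f(x \mid \theta)\) denoting the sampling density (as in Section 9):
$$B(\pi, \delta) = \int \int L(\theta, \delta(x))\, f(x \mid \theta)\, dx\; \pi(\theta)\, d\theta$$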
4. Bayes Decision Rule
A decision rule \(\delta_B\) is called a Bayes rule (with respect to prior \(\pi\)) if:
$$\delta_B = \arg\min_{\delta \in \mathcal{D}} B(\pi, \delta)$$
This rule minimizes the average risk under the prior.
5. Calculating Bayes Rules
To find a Bayes rule:
- Compute posterior \(\pi(\theta|x)\) using Bayes’ theorem
- Minimize posterior expected loss:
$$\delta_B(x) = \arg\min_d \int L(\theta, d) \pi(\theta|x) d\theta$$
This depends on:
- The loss function
- The posterior distribution
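Why does this pointwise recipe work? Writing the Bayes risk in terms of the marginal density \(m(x) = \int f(x \mid \theta)\, \pi(\theta)\, d\theta\) of the data,
$$B(\pi, \delta) = \int \left[ \int L(\theta, \delta(x))\, \pi(\theta \mid x)\, d\theta \right] m(x)\, dx,$$
so choosing \(\delta(x)\) to minimize the inner posterior expected loss separately for each observed \(x\) minimizes the entire integral.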
6. Examples of Bayes Rules
Case 1: Squared Error Loss
$$L(\theta, d) = (\theta - d)^2$$
Bayes rule:
$$\delta_B(x) = \mathbb{E}[\theta | x] \quad \text{(posterior mean)}$$
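A short justification: expanding the posterior expected loss gives
$$\mathbb{E}\big[(\theta - d)^2 \mid x\big] = \operatorname{Var}(\theta \mid x) + \big(d - \mathbb{E}[\theta \mid x]\big)^2,$$
which is minimized exactly at \(d = \mathbb{E}[\theta \mid x]\).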
Case 2: Absolute Error Loss
$$L(\theta, d) = |\theta - d|$$
Bayes rule:
$$\delta_B(x) = \text{median of } \pi(\theta|x)$$
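Analogously, the posterior expected absolute error is minimized at any \(d\) that splits the posterior probability in half,
$$\mathbb{P}\big(\theta \le \delta_B(x) \mid x\big) = \tfrac{1}{2},$$
which is precisely the defining property of the posterior median.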
7. A Full Example
Let \(X \sim \text{Uniform}(0,\theta)\), \(\theta > 0\)
Assume a prior: \(\theta \sim \text{Gamma}(2,1)\)
Loss function: \(L(\theta, d) = (\theta - d)^2\)
Steps:
- Use Bayes' theorem to get the posterior
- Compute \(\mathbb{E}[\theta|x]\) = Bayes rule
This gives us a data-informed estimate of \(\theta\); the computation is sketched below.
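A sketch of the computation for a single observation \(x\): the likelihood is \(f(x \mid \theta) = 1/\theta\) for \(\theta \ge x\) (and 0 otherwise), and the Gamma(2, 1) prior has density \(\pi(\theta) \propto \theta e^{-\theta}\). Multiplying,
$$\pi(\theta \mid x) \propto \frac{1}{\theta} \cdot \theta e^{-\theta} = e^{-\theta}, \qquad \theta \ge x,$$
so the posterior is a unit-rate exponential shifted to start at \(x\), and the Bayes rule under squared error loss is its mean:
$$\delta_B(x) = \mathbb{E}[\theta \mid x] = x + 1.$$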
8. Theoretical Properties of Bayes Rules
Theorem (Admissibility)
Under regularity conditions, Bayes rules are admissible.
- This means: no other rule is strictly better in terms of risk (see the formal statement below).
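In symbols, using the risk function from Section 2: \(\delta_B\) is admissible if there is no rule \(\delta'\) with
$$R(\theta, \delta') \le R(\theta, \delta_B) \ \text{ for all } \theta \qquad \text{and} \qquad R(\theta, \delta') < R(\theta, \delta_B) \ \text{ for some } \theta.$$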
9. Important Aspects of Bayesian Statistics
- Bayesian statistics uses probability to express uncertainty about parameters
- A prior distribution \(\pi(\theta)\) represents our belief before observing data
- After observing data \(x\), the posterior distribution is given by Bayes’ theorem:
$$\pi(\theta|x) = \frac{f(x|\theta) \pi(\theta)}{\int f(x|\theta) \pi(\theta) d\theta}$$
- Bayesian inference focuses on the posterior (a short R sketch follows this list):
  - Point estimates: posterior mean, mode (MAP), median
  - Interval estimates: credible intervals, e.g., a 95% interval \([a, b]\) such that
$$P(\theta \in [a,b] \mid x) = 0.95$$
  - Decision-making via posterior expected loss minimization
- Advantages:
  - Intuitive probabilistic interpretation
  - Flexibility in incorporating expert knowledge
  - Naturally handles uncertainty and prediction
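To make the posterior summaries above concrete, here is a minimal R sketch reusing the running example from Section 7, where the posterior for the observation x = 4 is a unit-rate exponential shifted to start at x:

# Posterior summaries for the running example (x = 4); the posterior density is
# exp(-(theta - x)) on [x, Inf), i.e. a unit-rate exponential shifted to start at x.
x <- 4
post_mode   <- x                          # posterior mode (MAP): the density is decreasing on [x, Inf)
post_mean   <- x + 1                      # posterior mean: Bayes rule under squared error loss
post_median <- x + qexp(0.5)              # posterior median: Bayes rule under absolute error loss
cred_int    <- x + qexp(c(0.025, 0.975))  # 95% equal-tailed credible interval [a, b]
post_mode; post_mean; post_median; cred_int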
10. Example: Visualizing Prior and Posterior
library(ggplot2)
library(dplyr)
library(tidyr)

x <- 4                                      # observed data point
theta <- seq(0.01, 10, length.out = 500)    # grid of theta values for plotting

# Gamma(2, 1) prior density
prior <- dgamma(theta, shape = 2, rate = 1)

# Uniform(0, theta) likelihood of a single observation x: 1/theta for theta >= x, 0 otherwise
likelihood <- ifelse(theta >= x, 1 / theta, 0)
likelihood <- likelihood / max(likelihood)  # rescale to a common height for plotting

# Unnormalized posterior = prior * likelihood, rescaled for plotting
posterior <- prior * likelihood
posterior <- posterior / max(posterior)

data <- tibble(
  theta = theta,
  Prior = prior,
  Likelihood = likelihood,
  Posterior = posterior
) %>%
  pivot_longer(cols = -theta, names_to = "Distribution", values_to = "Density")

ggplot(data, aes(x = theta, y = Density, color = Distribution)) +
  geom_line(linewidth = 1.2) +
  labs(title = "Bayesian Updating: Prior, Likelihood, and Posterior",
       x = expression(theta), y = "Normalized Density") +
  theme_minimal()
Figure: Bayesian updating for x = 4, showing the prior, likelihood, and posterior.
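As a quick numerical cross-check of the result sketched in Section 7 (a small added sketch, using the same observation x = 4), the posterior mean can also be obtained by integrating the unnormalized posterior:

# Numerical check: the unnormalized posterior is exp(-theta) on [x, Inf),
# so the posterior mean E[theta | x] should equal x + 1 (here, 5).
x <- 4
post_kernel <- function(theta) exp(-theta) * (theta >= x)
norm_const  <- integrate(post_kernel, lower = x, upper = Inf)$value
post_mean   <- integrate(function(t) t * post_kernel(t), lower = x, upper = Inf)$value / norm_const
post_mean   # approximately 5 = x + 1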
11. Remarks on Bayes Rules
- Depend on the prior: different priors may lead to different decisions.
- Can be non-unique: multiple rules may yield the same Bayes risk.
- Can simplify computations using conjugate priors (see the short sketch below).
- Connect naturally with statistical decision theory.
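As an illustration of the conjugate-prior point (a minimal sketch with assumed numbers, not an example from the text): with a Beta(a, b) prior on a binomial success probability, the posterior after s successes in n trials is Beta(a + s, b + n - s), so the Bayes rule under squared error loss is available in closed form as the posterior mean.

# Conjugate Beta-Binomial sketch: prior Beta(a, b), data s successes in n trials.
# Posterior is Beta(a + s, b + n - s); its mean is the Bayes rule under squared error loss.
a <- 2; b <- 2                    # assumed prior hyperparameters
n <- 10; s <- 7                   # assumed data
bayes_estimate <- (a + s) / (a + b + n)
bayes_estimate                    # 9/14, about 0.643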
12. Summary and Outlook
What we learned:
- Bayes risk combines risk and prior beliefs
- Bayes rules minimize Bayes risk
- Easy to compute in many common problems
- Often admissible and interpretable
- Basic concepts of Bayesian statistics support the theory
What’s next:
In Edition 7, we’ll explore Bayesian Estimation under Weighted Quadratic Loss.
Stay tuned for the next part!
We gratefully acknowledge Dr. Dany Djeudeu for preparing this course.
We help businesses and researchers solve complex challenges by providing expert guidance in statistics, machine learning, and tailored education.
Our core services include:
– Statistical Consulting:
Comprehensive consulting tailored to your data-driven needs.
– Training and Coaching:
In-depth instruction in statistics, machine learning, and the use of statistical software such as SAS, R, and Python.
– Reproducible Data Analysis Pipelines:
Development of documented, reproducible workflows using SAS macros and customized R and Python code.
– Interactive Data Visualization and Web Applications:
Creation of dynamic visualizations and web apps with R (Shiny, Plotly), Python (Streamlit, Dash by Plotly), and SAS (SAS Viya, SAS Web Report Studio).
– Automated Reporting and Presentation:
Generation of automated reports and presentations using Markdown and Quarto.
– Scientific Data Analysis:
Advanced analytical support for scientific research projects.