1. Minimax Rules: Example
Consider nonrandomized decision rules only. Randomized minimax rules will be discussed later.
$\begin{array}{c|cccccc}
 & d_1 & d_2 & d_3 & d_4 & d_5 & d_6 \\
\hline
R(\theta_1, d_i) & 17 & 19 & 14 & 10 & 9 & 9 \\
R(\theta_2, d_i) & 14 & 4 & 4 & 6 & 8 & 16 \\
\hline
\sup_{j = 1,2} R(\theta_j, d_i) & 17 & 19 & 14 & 10 & 9 & 16
\end{array}$
Nonrandomized minimax rule:
\(\sup_{\theta \in \{\theta_1, \theta_2\}}R(\theta, d_M) = \inf_{d_i} \sup_{\theta} R(\theta, d_i) = 9\)
Hence, \(d_M = d_5\) yields the minimum maximal (minimax) risk.
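A minimal R sketch that recovers \(d_5\) from the risk table above (values taken directly from the example):

```r
# Risk table from the example: rows = states of nature, columns = rules
risk <- rbind(theta1 = c(17, 19, 14, 10, 9, 9),
              theta2 = c(14, 4, 4, 6, 8, 16))
colnames(risk) <- paste0("d", 1:6)

max_risk <- apply(risk, 2, max)  # sup over theta of R(theta, d_i)
max_risk                         # 17 19 14 10  9 16
names(which.min(max_risk))       # "d5": the nonrandomized minimax rule
```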
Notes
The minimax rule \(d_5\) is not a Bayes rule except for prior weights \(w \in [\frac{2}{3}, 1]\) on \(\theta_1\).
Randomized minimax rules require least favorable distributions (to be discussed).
2. Convexity of Risk Set
Theorem: For finite \(\Theta = \{\theta_1, \ldots, \theta_k\}\), the set of risk points \(\mathcal{R}\) is convex in \(\mathbb{R}^k\).
Idea of proof: given risk points \(u, v \in \mathcal{R}\) of rules \(\delta_u, \delta_v\), the randomized rule that uses \(\delta_u\) with probability \(\lambda\) and \(\delta_v\) with probability \(1 - \lambda\) has risk point \(\lambda u + (1 - \lambda)v\), so \(\lambda u + (1 - \lambda)v \in \mathcal{R}\) for all \(\lambda \in (0, 1)\).
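Concretely, reusing the risk table from Section 1, mixing two rules traces out the segment between their risk points:

```r
# Risk points of the six nonrandomized rules (columns of the table above)
risk <- rbind(theta1 = c(17, 19, 14, 10, 9, 9),
              theta2 = c(14, 4, 4, 6, 8, 16))

# A randomized rule that picks d4 with probability lambda and d5 otherwise
# has risk point lambda * u + (1 - lambda) * v, for any lambda in (0, 1):
lambda <- 0.5
lambda * risk[, 4] + (1 - lambda) * risk[, 5]  # (9.5, 7.0): a new risk point
```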
3. Admissibility and Bayes Rules
Theorem: For finite \(\Theta\), every admissible decision rule is a Bayes rule with respect to some prior \(\pi\).
Definition: A positive Bayes rule uses a prior \(\pi\) such that \(\pi(\theta_j) > 0 \; \forall j\).
Theorem: Every positive Bayes rule is admissible.
4. Generalized Minimax Rules
Let:
\(\Theta_{\ast} = \{ \pi(\cdot) \mid \pi \text{ is a prior on } \Theta \}\)
Definition: A rule \(\delta_M \in \mathcal{D}\) is minimax if:
\(\sup_{\pi \in \Theta_{\ast}} B(\pi, \delta_M) = \inf_{\delta \in \mathcal{D}} \sup_{\pi} B(\pi, \delta)\)
\(\overline{V} = \inf_{\delta} \sup_{\pi} B(\pi, \delta)\): upper value
\(\underline{V} = \sup_{\pi} \inf_{\delta} B(\pi, \delta)\): lower value
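For the two-state example of Section 1, both values can be computed directly. The following R sketch grids over priors \((w, 1 - w)\) and restricts \(\overline{V}\) to the nonrandomized rules; here the two values coincide at 9:

```r
risk <- rbind(c(17, 19, 14, 10, 9, 9),
              c(14, 4, 4, 6, 8, 16))
w <- seq(0, 1, by = 0.001)          # prior weight on theta_1

# Bayes risk B(pi, d_i) = w * R(theta_1, d_i) + (1 - w) * R(theta_2, d_i)
bayes <- outer(w, risk[1, ]) + outer(1 - w, risk[2, ])

lower <- max(apply(bayes, 1, min))  # sup_pi inf_delta B(pi, delta)
upper <- min(apply(risk, 2, max))   # inf over nonrandomized rules of sup risk
c(lower = lower, upper = upper)     # both equal 9 here
```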
5. Remarks
\(\delta_M\) is minimax if \(\sup_{\pi} B(\pi, \delta_M) = \overline{V}\)
\(\underline{V} \leq \overline{V}\)
Equivalent characterization (since \(B(\pi, \delta)\) averages \(R(\theta, \delta)\) over \(\pi\), we have \(\sup_{\pi \in \Theta_{\ast}} B(\pi, \delta) = \sup_{\theta} R(\theta, \delta)\)):
\(\sup_{\theta} R(\theta, \delta_M) = \inf_{\delta} \sup_{\theta} R(\theta, \delta)\)
6. Least Favorable Prior
If \(\pi_0 \in \Theta_{\ast}\) is such that:
\(\inf_{\delta} B(\pi_0, \delta) = \underline{V},\)
then \(\pi_0\) is called a least favorable prior.
7. Questions
When is \(\underline{V} = \overline{V}\)?
When do minimax rules and least favorable priors exist?
8. Game Theory Application
Two-person zero-sum games: one player's loss is the other player's gain.
A statistical decision problem can be viewed as such a game: statistician vs. nature.
Nature chooses \(\theta\); the statistician chooses \(\delta\).
9. Finding a Minimax Rule
Guess a least favorable prior \(\tau_0\).
Compute the corresponding Bayes rule \(\delta_0\).
Check whether \(\delta_0\) satisfies
\(R(\theta, \delta_0) \leq B(\tau_0, \delta_0) \; \forall \theta.\)
If so, \(\delta_0\) is minimax; the sketch below applies this recipe to the example of Section 1.
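A minimal R sketch of the recipe, guessing \(\tau_0\) as the point mass on \(\theta_1\) (an assumption chosen for illustration):

```r
risk <- rbind(c(17, 19, 14, 10, 9, 9),
              c(14, 4, 4, 6, 8, 16))

w <- 1                              # guessed tau_0: all prior mass on theta_1
bayes <- w * risk[1, ] + (1 - w) * risk[2, ]
d0 <- which.min(bayes)              # Bayes rule under tau_0: rule 5
all(risk[, d0] <= bayes[d0])        # TRUE: R(theta, d_5) <= B(tau_0, d_5)
                                    # for every theta, so d_5 is minimax
```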
10. Improper Priors and Limits
Even if \(\tau_0\) is improper, one can use a sequence of proper priors \(\pi_m\) such that:
\(c_m \pi_m(\theta) \to \tau_0(\theta)\)
If \(\sup_{\theta} R(\theta, \delta_0) \leq \liminf_{m} \inf_{\delta} B(\pi_m, \delta)\), then this limiting-Bayes argument ensures that \(\delta_0\) is minimax.
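A classical example illustrating this argument: for \(X \sim N(\theta, 1)\) under squared-error loss, \(\delta_0(x) = x\) is generalized Bayes under the improper flat prior \(\tau_0\). Taking proper priors \(\pi_m = N(0, m)\) gives
\(\inf_{\delta} B(\pi_m, \delta) = \frac{m}{m+1} \to 1 = \sup_{\theta} R(\theta, \delta_0),\)
so the criterion above yields minimaxity of \(\delta_0(x) = x\).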
11. Equalizer Rules
Definition: A rule \(\delta\) is an equalizer rule if:
\(R(\theta, \delta) = c ~ \forall \theta\)
Theorem: If \(\delta\) is an equalizer rule and Bayes w.r.t. a proper prior, then \(\delta\) is minimax.
Example: Binomial Case
\(X_i \sim Bi(1, p), \quad Y = \sum X_i, \quad \hat{p} = Y/n\)
Loss: \(L(p, d) = \frac{(p-d)^2}{p(1-p)}\)
Then \(\hat{p}\) is:
an equalizer rule, since \(R(p, \hat{p}) = \frac{E(\hat{p} - p)^2}{p(1-p)} = \frac{p(1-p)/n}{p(1-p)} = \frac{1}{n}\) for all \(p\)
Bayes under a proper prior (the uniform prior on \([0, 1]\))
Hence \(\hat{p}\) is minimax under this loss.
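As a sanity check, a short Monte Carlo sketch in R (with the illustrative choice \(n = 20\)) confirms the constant risk \(1/n\):

```r
# Monte Carlo check that p-hat = Y/n is an equalizer rule under this loss
set.seed(1)
n <- 20
risk_at <- function(p, reps = 1e5) {
  phat <- rbinom(reps, n, p) / n
  mean((p - phat)^2 / (p * (1 - p)))
}
sapply(c(0.1, 0.3, 0.5, 0.9), risk_at)  # all approximately 1/n = 0.05
```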
Counterexample
Casella & Strawderman (1981): for estimating a bounded normal mean (\(X \sim N(\theta, 1)\) with \(|\theta| \leq m\), \(m\) small), \(\delta_m(X) = m \cdot \tanh(mX)\) is
Bayes under the two-point prior on \(\{-m, m\}\)
minimax but not an equalizer rule
12. Admissible + Equalizer = Minimax
Theorem: If \(\delta\) is an admissible equalizer rule, then \(\delta\) is minimax. Proof idea: any rule with strictly smaller maximal risk would have smaller risk at every \(\theta\) and thus dominate \(\delta\), contradicting admissibility.
13. Stein Phenomenon
In \(\mathbb{R}^p\) with \(p \geq 3\), for estimating the mean of \(X \sim N_p(\theta, I_p)\) under squared-error loss, the standard estimator \(\delta(x) = x\) is inadmissible.
James-Stein Estimator:
\(\delta^{JS}(x) = \left(1 - \frac{p-2}{\|x\|^2}\right) x\)
dominates \(\delta(x) = x\) (checked numerically in the sketch below)
a shrinkage estimator: it pulls \(x\) toward the origin
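The following R sketch (our own illustration; the choices \(p = 5\) and \(\theta = (1, \ldots, 1)\) are arbitrary) estimates both risks by Monte Carlo:

```r
# Monte Carlo risks of delta(x) = x and the James-Stein estimator at one theta
set.seed(1)
p <- 5; reps <- 1e4
theta <- rep(1, p)                                  # arbitrary true mean
x <- sweep(matrix(rnorm(reps * p), reps, p), 2, theta, "+")
js <- (1 - (p - 2) / rowSums(x^2)) * x              # shrink each row toward 0
c(risk_x  = mean(rowSums(sweep(x,  2, theta)^2)),   # about p = 5
  risk_js = mean(rowSums(sweep(js, 2, theta)^2)))   # strictly smaller
```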
14. Risks of James-Stein
\(R(\theta, \delta) = p\) for the standard estimator \(\delta(x) = x\)
\(R(\theta, \delta^{JS}) < p\) for all \(\theta\) when \(p \geq 3\)
For \(p = 1\): \(\delta(x) = x\) has lower risk for small \(\theta\)
For \(p = 2\): \(p - 2 = 0\), so \(\delta^{JS} = \delta\) and the risks coincide (R-equivalent)
For \(p \geq 3\): \(\delta^{JS}\) is strictly better
Presented below is a graphical depiction of the James-Stein risk function.
[Figure: James-Stein risk function]
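A minimal R sketch that approximates such a risk curve by Monte Carlo (the choices \(p = 5\) and \(\theta\) varying along one coordinate are our own):

```r
# Monte Carlo approximation of the James-Stein risk curve, p = 5
set.seed(1)
p <- 5; reps <- 2e4
norms <- seq(0, 6, by = 0.5)                 # values of ||theta||

js_risk <- sapply(norms, function(nt) {
  theta <- c(nt, rep(0, p - 1))              # mean vector with norm nt
  x <- sweep(matrix(rnorm(reps * p), reps, p), 2, theta, "+")
  js <- (1 - (p - 2) / rowSums(x^2)) * x
  mean(rowSums(sweep(js, 2, theta)^2))       # estimated risk at this theta
})

plot(norms, js_risk, type = "b", ylim = c(0, p + 1),
     xlab = "||theta||", ylab = "risk")
abline(h = p, lty = 2)                       # constant risk p of delta(x) = x
```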
Further readings
Berger (1985), Casella & Berger (1990), Casella & Strawderman (1981), Stein (1955)
Stay tuned for the next part!
We gratefully acknowledge Dr. Dany Djeudeu for preparing this course.
We help businesses and researchers solve complex challenges by providing expert guidance in statistics, machine learning, and tailored education.
Our core services include:
– Statistical Consulting:
Comprehensive consulting tailored to your data-driven needs.
– Training and Coaching:
In-depth instruction in statistics, machine learning, and the use of statistical software such as SAS, R, and Python.
– Reproducible Data Analysis Pipelines:
Development of documented, reproducible workflows using SAS macros and customized R and Python code.
– Interactive Data Visualization and Web Applications:
Creation of dynamic visualizations and web apps with R (Shiny, Plotly), Python (Streamlit, Dash by Plotly), and SAS (SAS Viya, SAS Web Report Studio).
– Automated Reporting and Presentation:
Generation of automated reports and presentations using Markdown and Quarto.
– Scientific Data Analysis:
Advanced analytical support for scientific research projects.