Content
Main topics:
- Admissible decision rules
- Bayes decision rules
- Minimax decision rules

Context: Parameter estimation
1. Admissibility: The Idea
In statistical decision theory, we aim to choose good decision rules from all possible ones.
Let \(\mathcal{D}\) = the set of all decision rules.
We want to reduce \(\mathcal{D}\) to a sensible subset \(\xi\) of good decision rules.
We do this by comparing their risk — the average loss when using a rule.
1.1. What is a Complete Class?
Definition: A subset \(\xi \subset \mathcal{D}\) is:
- Complete, if for every rule \(\delta_1 \notin \xi\) there is a rule \(\delta_2 \in \xi\) that is R-better:
\(R(\theta, \delta_2) \le R(\theta, \delta_1) \ \text{for all } \theta, \ \text{with strict inequality for some } \theta\)
If a rule lies outside \(\xi\), some rule in \(\xi\) is at least as good for every \(\theta\) and strictly better for at least one \(\theta\).
This helps us ignore bad rules and keep only sensible ones.
1.2. Essentially Complete Class
Definition: A class \(\xi\) is essentially complete if for every rule \(\delta_1 \notin \xi\) there is a rule \(\delta_2 \in \xi\) that is at least as good.
That is:
\(R(\theta, \delta_2) \le R(\theta, \delta_1) \ \text{for all } \theta\)
- Here, some rules may be just as good, but not better.
- We still eliminate inferior rules, but allow for ties.
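The comparison underlying both definitions ("at least as good for every \(\theta\), strictly better for some") can be sketched as a small helper. This is our own illustration: the function name `r_better` and the sample risk vectors are not from the text.

```python
import numpy as np

def r_better(r1, r2):
    """True if risk vector r1 R-dominates r2:
    R(theta, delta_1) <= R(theta, delta_2) for every theta,
    with strict inequality for at least one theta."""
    r1, r2 = np.asarray(r1), np.asarray(r2)
    return bool(np.all(r1 <= r2) and np.any(r1 < r2))

print(r_better([9, 8], [9, 16]))    # True: equal at theta_1, strictly better at theta_2
print(r_better([14, 4], [10, 6]))   # False: each rule wins for a different theta
print(r_better([9, 8], [9, 8]))     # False: identical risks, no strict improvement
```

A rule belongs in a complete class precisely when no competing rule makes this check return `True` against it.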
1.3. Minimal (Essentially) Complete Classes
Minimal complete: \(\xi\) is complete, and no smaller set is also complete.
Minimal essentially complete: same logic, but for essential completeness.
These are the smallest groups of unbeatable decision rules. Nothing outside can compete, and nothing inside is redundant.
1.4. A Practical Example: Investing
Suppose:
\(\theta_1\): Bull Market
\(\theta_2\): Bear Market
6 investment strategies \(d_1, \dots, d_6\)
Loss matrix:
| | d₁ | d₂ | d₃ | d₄ | d₅ | d₆ |
|---|---|---|---|---|---|---|
| \(\theta_1\) (Bull) | 17 | 19 | 14 | 10 | 9 | 9 |
| \(\theta_2\) (Bear) | 14 | 4 | 4 | 6 | 8 | 16 |
- The six pure risk points do not form a convex set, so we might benefit from randomizing between strategies.
- In this no-data example, the decision rules are independent of observed data.
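The loss matrix can be written down directly, with one row per state of nature and one column per strategy; the variable name `A` is chosen to match the matrix used later in the text.

```python
import numpy as np

# Rows: theta_1 (bull market), theta_2 (bear market)
# Columns: strategies d_1, ..., d_6
A = np.array([
    [17, 19, 14, 10,  9,  9],   # losses under theta_1
    [14,  4,  4,  6,  8, 16],   # losses under theta_2
])

print(A.shape)    # (2, 6): two states, six strategies
print(A[:, 0])    # [17 14]: risk point of the pure rule d_1
```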
1.5. Randomized vs Nonrandomized Decisions
Nonrandomized: Pick one strategy with 100% certainty. E.g., always choose \(d_1\).
Randomized: Mix strategies. E.g., pick \(d_1\) with 50%, \(d_2\) with 50%, etc.
Let \(\vec{p} = (p_1, \dots, p_6)^T\), with \(\sum p_i = 1\).
\(\delta_{\vec{p}}\): randomized rule based on probabilities \(\vec{p}\)
\(\delta_i\): deterministic rule that chooses \(d_i\) only
1.6. Calculating Risk for Randomized Rules
Under \(\theta_1\):
\(R(\theta_1, \delta_{\vec{p}}) = 17p_1 + 19p_2 + 14p_3 + 10p_4 + 9p_5 + 9p_6\)
Under \(\theta_2\):
\(R(\theta_2, \delta_{\vec{p}}) = 14p_1 + 4p_2 + 4p_3 + 6p_4 + 8p_5 + 16p_6\)
This is a linear combination:
\(\vec{R} = A \cdot \vec{p}\)
Where \(A\) is the loss matrix.
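A quick numerical check of \(\vec{R} = A \cdot \vec{p}\), assuming an even 50/50 mix of \(d_1\) and \(d_2\) (the weights are our own illustration):

```python
import numpy as np

A = np.array([[17, 19, 14, 10, 9, 9],
              [14,  4,  4,  6, 8, 16]])

p = np.array([0.5, 0.5, 0.0, 0.0, 0.0, 0.0])  # mix d_1 and d_2 equally
R = A @ p
print(R)   # [18.  9.]: risk 18 under theta_1, risk 9 under theta_2
```

The two entries agree with the formulas above: \(0.5 \cdot 17 + 0.5 \cdot 19 = 18\) and \(0.5 \cdot 14 + 0.5 \cdot 4 = 9\).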
1.7. Interpretation
Each column in the matrix corresponds to a nonrandomized rule:
For instance,
\(R(\theta, \delta_1) = \begin{bmatrix} 17 \\ 14 \end{bmatrix} = \vec{a_1} \quad \text{with } \vec{p} = (1,0,0,0,0,0)^T\)
The region of all possible risk values (randomized or not) is described in the next section.
1.8. Convex Risk Region
All risk combinations:
\(\mathcal{R} = \{ \vec{R} = A \cdot \vec{p} \mid \sum p_i = 1, \ p_i \ge 0 \}\)
- \(\mathcal{R}\) is a convex set: it contains every mixture of the pure risk points \(\vec{a_1}, \dots, \vec{a_6}\).
We seek the boundary points that are unbeatable:
\(\xi = \{ \lambda \vec{a_5} + (1-\lambda)\vec{a_4} : \lambda \in [0,1] \} \cup \{ \lambda \vec{a_4} + (1-\lambda)\vec{a_3} : \lambda \in [0,1] \}\)
These two segments of the lower-left boundary define the minimal complete class of decision rules.
The decision rules in \(\xi\) are unbeatable.
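The admissible pure rules behind \(\xi\) can be recovered mechanically by discarding every dominated column of the loss matrix. The sketch below (the helper name `dominated` is ours) returns \(d_3, d_4, d_5\), matching the boundary segments above.

```python
import numpy as np

A = np.array([[17, 19, 14, 10, 9, 9],
              [14,  4,  4,  6, 8, 16]])

def dominated(j):
    """Is pure rule d_{j+1} R-dominated by some other column of A?"""
    col = A[:, j]
    for k in range(A.shape[1]):
        if k != j and np.all(A[:, k] <= col) and np.any(A[:, k] < col):
            return True
    return False

admissible = [j + 1 for j in range(A.shape[1]) if not dominated(j)]
print(admissible)   # [3, 4, 5]: d_3, d_4, d_5 are the admissible pure rules
```

For example, \(d_1\) is dominated by \(d_4\) (losses 10 and 6 versus 17 and 14) and \(d_6\) by \(d_5\) (equal under \(\theta_1\), strictly better under \(\theta_2\)).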
2. What is Admissibility?
Definition:
A rule \(\delta\) is admissible if no other rule is R-better, i.e., at least as good for every \(\theta\) and strictly better for some \(\theta\).
Otherwise, it is inadmissible.
Admissible decision rules are unbeatable.
Usually, there are many admissible rules.
2.1. R-Incomparable Decision Rules
Definition:
Two rules \(\delta_1\) and \(\delta_2\) are R-incomparable if neither dominates the other:
- \(\delta_1\) has strictly smaller risk for some \(\theta\), and
- \(\delta_2\) has strictly smaller risk for some other \(\theta\).
Theorem:
If \(\delta_1, \delta_2\) are both admissible, then:
Either they are R-equivalent
Or R-incomparable
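The investing example illustrates the theorem: the admissible rules \(d_3\) and \(d_4\) are R-incomparable, since each wins under a different state (risk values taken from the loss matrix above).

```python
import numpy as np

r3 = np.array([14, 4])   # risks of d_3 under (theta_1, theta_2)
r4 = np.array([10, 6])   # risks of d_4 under (theta_1, theta_2)

# d_4 is strictly better under theta_1, d_3 strictly better under theta_2
incomparable = bool(r4[0] < r3[0]) and bool(r3[1] < r4[1])
print(incomparable)   # True
```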
2.2. Role of Sufficient Statistics
Let \(T(X)\) be a sufficient statistic for \(\theta\).
Then:
A rule based on \(X\) can be replaced by an equivalent rule based on \(T(X)\)
We lose no performance, but simplify the rule
See Berger, p. 36, Theorem 1
3. A Deep Result: Minimal Complete Class = Admissible Rules
Theorem:
If a minimal complete class \(\xi\) exists, then:
\(\xi = \mathcal{A}\)
Where \(\mathcal{A} = \{ \delta \in \mathcal{D} : \delta \text{ is admissible} \}\)
Proof Outline:
- Any rule not in \(\xi\) is dominated by a rule in \(\xi\), hence inadmissible \(\Rightarrow \mathcal{A} \subset \xi\)
- If some rule in \(\xi\) were inadmissible, it could be removed while keeping the class complete, so \(\xi\) would not be minimal \(\Rightarrow \xi \subset \mathcal{A}\)
Final Notes
Minimal complete class = Set of admissible rules
Admissible rules link directly to Bayes rules (coming next!)
This concept filters out all “bad” or dominated decision rules
Next: Bayes decision rules and minimax decision rules.
Stay tuned for the next part!
We gratefully acknowledge Dr. Dany Djeudeu for preparing this course.
We help businesses and researchers solve complex challenges by providing expert guidance in statistics, machine learning, and tailored education.
Our core services include:
– Statistical Consulting:
Comprehensive consulting tailored to your data-driven needs.
– Training and Coaching:
In-depth instruction in statistics, machine learning, and the use of statistical software such as SAS, R, and Python.
– Reproducible Data Analysis Pipelines:
Development of documented, reproducible workflows using SAS macros and customized R and Python code.
– Interactive Data Visualization and Web Applications:
Creation of dynamic visualizations and web apps with R (Shiny, Plotly), Python (Streamlit, Dash by Plotly), and SAS (SAS Viya, SAS Web Report Studio).
– Automated Reporting and Presentation:
Generation of automated reports and presentations using Markdown and Quarto.
– Scientific Data Analysis:
Advanced analytical support for scientific research projects.