Content

Main topics:

  • Admissible decision rules

  • Bayes decision rules

  • Minimax decision rules

Context: Parameter estimation


1. Admissibility: The Idea

In statistical decision theory, we aim to choose good decision rules from all possible ones.

Let \(\mathcal{D}\) = the set of all decision rules.
We want to reduce \(\mathcal{D}\) to a sensible subset \(\xi\) of good decision rules.

We do this by comparing their risk, \(R(\theta, \delta) = \mathbb{E}_\theta[L(\theta, \delta(X))]\), the expected loss incurred when using rule \(\delta\).


1.1. What is a Complete Class?

Definition: A subset \(\xi \subset \mathcal{D}\) is:

  • Complete, if every rule \(\delta_1 \notin \xi\) is strictly dominated by some rule \(\delta_2 \in \xi\):

\(R(\theta, \delta_2) \le R(\theta, \delta_1) \ \text{for all } \theta, \text{ with strict inequality for some } \theta\)

  • If a rule is outside the set \(\xi\), some rule in \(\xi\) is at least as good for every \(\theta\) and strictly better for at least one \(\theta\).

  • This lets us discard dominated rules and keep only sensible ones.


1.2. Essentially Complete Class

Definition: A class \(\xi\) is essentially complete if every rule \(\delta_1 \notin \xi\) is at least matched by some rule \(\delta_2 \in \xi\).

That is:
\(R(\theta, \delta_2) \le R(\theta, \delta_1) \ \text{for all } \theta\)

  • Here, the matching rule in \(\xi\) may be just as good rather than strictly better.

  • We still eliminate inferior rules, but allow for ties.

1.3. Minimal (Essentially) Complete Classes

  • Minimal complete: \(\xi\) is complete, and no smaller set is also complete.

  • Minimal essentially complete: same logic, but for essential completeness.

These are the smallest groups of unbeatable decision rules. Nothing outside can compete, and nothing inside is redundant.


1.4. A Practical Example: Investing

Suppose:

  • \(\theta_1\): Bull Market

  • \(\theta_2\): Bear Market

together with 6 investment strategies \(d_1, \dots, d_6\) to choose from.

Loss matrix:

|                     | \(d_1\) | \(d_2\) | \(d_3\) | \(d_4\) | \(d_5\) | \(d_6\) |
|---------------------|---------|---------|---------|---------|---------|---------|
| \(\theta_1\) (Bull) | 17      | 19      | 14      | 10      | 9       | 9       |
| \(\theta_2\) (Bear) | 14      | 4       | 4       | 6       | 8       | 16      |

  • The six loss columns alone do not form a convex set: we might benefit from randomizing between strategies (a numerical sketch follows this list).

  • The decisions are independent of observed data (a no-data problem), so loss and risk coincide.
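As a numerical companion, here is a minimal sketch in Python (numpy and the variable names are our own, not part of the notes) that encodes the loss matrix so that column \(i\) is the risk point of the nonrandomized rule \(d_i\):

```python
import numpy as np

# Loss matrix A: rows = states of nature (theta_1 Bull, theta_2 Bear),
# columns = the six investment strategies d_1, ..., d_6.
A = np.array([
    [17, 19, 14, 10,  9,  9],   # losses under theta_1 (Bull)
    [14,  4,  4,  6,  8, 16],   # losses under theta_2 (Bear)
])

# In this no-data problem, the risk point of the nonrandomized
# rule d_i is simply column i of A.
for i in range(A.shape[1]):
    print(f"d_{i+1}: risk point {A[:, i].tolist()}")
```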


1.5. Randomized vs Nonrandomized Decisions

  • Nonrandomized: Pick one strategy with 100% certainty. E.g., always choose \(d_1\).

  • Randomized: Mix strategies. E.g., pick \(d_1\) with 50%, \(d_2\) with 50%, etc.

Let \(\vec{p} = (p_1, \dots, p_6)^T\), with \(p_i \ge 0\) and \(\sum_i p_i = 1\).

  • \(\delta_{\vec{p}}\): randomized rule based on probabilities \(\vec{p}\)

  • \(\delta_i\): deterministic rule that chooses \(d_i\) only
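The distinction is easy to see in code. A minimal sketch (Python with numpy assumed; names are ours): a randomized rule draws a strategy according to \(\vec{p}\), and a nonrandomized rule is the special case of a unit vector.

```python
import numpy as np

rng = np.random.default_rng(0)

# Randomized rule delta_p: draw one of the six strategies according to p.
p = np.array([0.5, 0.5, 0.0, 0.0, 0.0, 0.0])   # d_1 with 50%, d_2 with 50%
strategy = rng.choice(6, p=p)                   # index 0..5 stands for d_1..d_6
print(f"chosen strategy: d_{strategy + 1}")

# Nonrandomized rule delta_1: p is the unit vector e_1, always choose d_1.
p_det = np.zeros(6)
p_det[0] = 1.0
```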


1.6. Calculating Risk for Randomized Rules

Under \(\theta_1\):
\(R(\theta_1, \delta_{\vec{p}}) = 17p_1 + 19p_2 + 14p_3 + 10p_4 + 9p_5 + 9p_6\)

Under \(\theta_2\):
\(R(\theta_2, \delta_{\vec{p}}) = 14p_1 + 4p_2 + 4p_3 + 6p_4 + 8p_5 + 16p_6\)

In matrix form, the risk vector is a linear map of \(\vec{p}\):

\(\vec{R} = A \cdot \vec{p}\)

where \(A\) is the \(2 \times 6\) loss matrix above (row \(i\) corresponds to \(\theta_i\), column \(j\) to \(d_j\)).
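The computation is a single matrix-vector product. A minimal sketch (Python with numpy assumed), mixing \(d_4\) and \(d_5\) with probability \(1/2\) each:

```python
import numpy as np

A = np.array([[17, 19, 14, 10,  9,  9],
              [14,  4,  4,  6,  8, 16]])

# Risk vector of the randomized rule delta_p: R = A @ p.
p = np.array([0, 0, 0, 0.5, 0.5, 0])   # mix d_4 and d_5, 50% each
R = A @ p
print(R)   # [9.5 7. ]  ->  R(theta_1) = 9.5, R(theta_2) = 7.0
```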


1.7. Interpretation

Each column in the matrix corresponds to a nonrandomized rule:

For instance,

\(\vec{R}(\delta_1) = \begin{bmatrix} 17 \\ 14 \end{bmatrix} = \vec{a}_1 \quad \text{with } \vec{p} = (1,0,0,0,0,0)^T\)

The region of all possible risk values (randomized or not) is visualized below:
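A minimal plotting sketch (Python; matplotlib and scipy are our assumptions, not part of the notes) that draws the six risk points \(\vec{a}_1, \dots, \vec{a}_6\) and their convex hull:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import ConvexHull

A = np.array([[17, 19, 14, 10,  9,  9],
              [14,  4,  4,  6,  8, 16]])
points = A.T   # the six risk points (R(theta_1, d_i), R(theta_2, d_i))

# The risk region is the convex hull of the six columns of A.
hull = ConvexHull(points)
for simplex in hull.simplices:
    plt.plot(points[simplex, 0], points[simplex, 1], "b-")
plt.scatter(points[:, 0], points[:, 1], color="k")
for i, (x, y) in enumerate(points):
    plt.annotate(f"$a_{i+1}$", (x, y), textcoords="offset points", xytext=(5, 5))

plt.xlabel(r"$R(\theta_1, \delta)$ (Bull)")
plt.ylabel(r"$R(\theta_2, \delta)$ (Bear)")
plt.title(r"Risk region $\mathcal{R}$")
plt.show()
```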


1.8. Convex Risk Region

All achievable risk points:
\(\mathcal{R} = \{ \vec{R} = A \cdot \vec{p} \mid p_i \ge 0, \ \sum_i p_i = 1 \}\)

  • \(\mathcal{R}\) is a convex set: it is the convex hull of the six columns \(\vec{a}_1, \dots, \vec{a}_6\) of \(A\), so it includes all mixtures, randomized or not.

We seek the boundary points that are unbeatable: the lower-left boundary of \(\mathcal{R}\), made up of two segments:

\(\xi = \{ \lambda \vec{a}_5 + (1-\lambda)\vec{a}_4 : \lambda \in [0,1] \} \cup \{ \lambda \vec{a}_4 + (1-\lambda)\vec{a}_3 : \lambda \in [0,1] \}\)

These segments trace out the risk points of the minimal complete class of decision rules.

The decision rules corresponding to \(\xi\) are unbeatable.

2. What is Admissibility?

Definition:

  • A rule \(\delta\) is admissible if no other rule dominates it: there is no \(\delta'\) with \(R(\theta, \delta') \le R(\theta, \delta)\) for all \(\theta\) and strict inequality for some \(\theta\).

  • Otherwise, it is inadmissible.

Admissible decision rules are unbeatable in this dominance sense.

Usually, there are many admissible rules. For the investing example, the sketch below checks the six deterministic rules against each other.
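A minimal sketch (Python with numpy assumed; the helper name `dominates` is ours). It checks only pairwise dominance among the six deterministic rules; in general a rule can also be dominated by a randomized mixture, but here the answer is the same:

```python
import numpy as np

A = np.array([[17, 19, 14, 10,  9,  9],
              [14,  4,  4,  6,  8, 16]])

def dominates(r1, r2):
    """True if risk point r1 dominates r2: no worse everywhere, strictly better somewhere."""
    return bool(np.all(r1 <= r2) and np.any(r1 < r2))

n = A.shape[1]
for j in range(n):
    by = [i + 1 for i in range(n) if i != j and dominates(A[:, i], A[:, j])]
    print(f"d_{j+1}:", f"inadmissible (dominated by d_{by[0]})" if by else "admissible")
# Output: d_3, d_4, d_5 are admissible; d_1, d_2, d_6 are dominated.
```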


2.1. R-Incomparable Decision Rules

Definition:

Two rules \(\delta_1\) and \(\delta_2\) are R-incomparable if:

  • \(R(\theta, \delta_1) < R(\theta, \delta_2)\) for some \(\theta\)

  • \(R(\theta', \delta_2) < R(\theta', \delta_1)\) for some other \(\theta'\)

Theorem:

If \(\delta_1, \delta_2\) are both admissible, then:

  • Either they are R-equivalent, i.e. \(R(\theta, \delta_1) = R(\theta, \delta_2)\) for all \(\theta\)

  • Or they are R-incomparable
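For a concrete instance, take the admissible rules \(\delta_3\) and \(\delta_5\) from the investing example:

\(R(\theta_1, \delta_5) = 9 < 14 = R(\theta_1, \delta_3), \qquad R(\theta_2, \delta_3) = 4 < 8 = R(\theta_2, \delta_5)\)

Each rule is strictly better in one state of nature, so the two are R-incomparable, exactly as the theorem predicts.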


2.2. Role of Sufficient Statistics

Let \(T(X)\) be a sufficient statistic for \(\theta\).

Then:

  • Any rule based on \(X\) can be replaced by a (possibly randomized) rule based on \(T(X)\) with the same risk function

  • We lose no performance, but the rule becomes simpler

See Berger, p. 36, Theorem 1.
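A standard illustration (our example, not from the notes): let \(X_1, \dots, X_n \overset{\text{iid}}{\sim} N(\theta, 1)\), so \(T(X) = \sum_i X_i\) is sufficient for \(\theta\). Under squared-error loss, the rule \(\delta(X) = X_1\) can be replaced by \(\eta(T) = T/n = \bar{X}\), which depends on the data only through \(T\):

\(R(\theta, \delta) = \mathbb{E}_\theta\big[(X_1 - \theta)^2\big] = 1, \qquad R(\theta, \eta) = \mathbb{E}_\theta\big[(\bar{X} - \theta)^2\big] = \tfrac{1}{n}\)

Nothing is lost by moving to the sufficient statistic; here the risk even improves, so \(\delta\) was in fact inadmissible.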


3. A Deep Result: Minimal Complete Class = Admissible Rules

Theorem:

If a minimal complete class \(\xi\) exists, then:

\(\xi = \mathcal{A}, \quad \text{where } \mathcal{A} = \{ \delta \in \mathcal{D} : \delta \text{ is admissible} \}\)

Proof Outline:

  1. Since \(\xi\) is complete, every rule not in \(\xi\) is dominated, hence inadmissible \(\Rightarrow \mathcal{A} \subset \xi\)

  2. If some rule in \(\xi\) were inadmissible, it could be dropped and the remaining class would still be complete, contradicting minimality \(\Rightarrow \xi \subset \mathcal{A}\)


Final Notes

  • Minimal complete class = Set of admissible rules

  • Admissible rules link directly to Bayes rules (coming next!)

  • This concept filters out all “bad” or dominated decision rules

Next: Bayes decision rules and minimax decision rules.

Stay tuned for the next part!

We gratefully acknowledge Dr. Dany Djeudeu for preparing this course.