Introduction

In the previous edition, we explored how decision theory provides rules for estimating unknown parameters via loss functions and Bayes rules. Now we go one step further: we introduce the weighted quadratic loss and show how it changes the optimal Bayesian decision rule.


1. Loss Functions Revisited

Recall that in classical Bayesian estimation, the quadratic loss function is:

$$
L(\theta, d) = (\theta - d)^2
$$

Its Bayes rule is:

$$
\delta(\vec{x}) = \mathbb{E}[\theta \mid \vec{x}]
$$

This rule minimizes the posterior expected loss: under squared-error loss, the best decision is the posterior mean of \(\theta\).
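
As a quick numerical check, here is a minimal sketch (assuming, purely for illustration, a Beta(3, 5) posterior; the distribution and grids are our choices, not part of the theory). It evaluates the posterior expected quadratic loss over a grid of candidate decisions and confirms that the minimizer sits at the posterior mean.

```python
import numpy as np

# Illustrative posterior: Beta(3, 5) (an assumption for this sketch).
theta = np.linspace(1e-4, 1 - 1e-4, 2001)          # grid over the parameter space
post = theta ** (3 - 1) * (1 - theta) ** (5 - 1)   # unnormalized Beta(3, 5) density
post /= post.sum()                                 # normalize to grid weights

# Posterior expected quadratic loss for each candidate decision d.
d_grid = np.linspace(0, 1, 2001)
risk = np.array([np.sum(post * (theta - d) ** 2) for d in d_grid])

print(f"grid minimizer: {d_grid[risk.argmin()]:.4f}")  # ~0.3750
print(f"posterior mean: {np.sum(theta * post):.4f}")   # Beta(3, 5) mean = 3/8
```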


2. Introducing Weighted Quadratic Loss

But what if some values of \(\theta\) are more important to estimate accurately than others?

We use a weighted quadratic loss function:

$$
L(\theta, d) = w(\theta)(\theta - d)^2
$$

Here, \(w(\theta)\) is a non-negative weight function that reflects how much we care about making accurate decisions near \(\theta\).

  • Example: In finance, overestimating risk may be especially costly when the actual risk is low (so you might put more weight on small \(\theta\)).
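
As one concrete illustration (our choice, assuming \(\theta > 0\)): the weight \(w(\theta) = 1/\theta^2\) turns the loss into a relative squared error, so a fixed absolute error costs more when \(\theta\) is small:

$$
w(\theta)(\theta - d)^2 = \left(\frac{\theta - d}{\theta}\right)^2
$$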

3. Bayesian Rule Under Weighted Loss

To minimize the posterior expected weighted loss, we choose the decision \(d\) that minimizes:

$$
\int w(\theta)(\theta - d)^2 \, \pi(\theta \mid \vec{x}) \, d\theta
$$
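
Differentiating under the integral sign with respect to \(d\) and setting the derivative to zero gives the first-order condition:

$$
-2 \int w(\theta)(\theta - d) \, \pi(\theta \mid \vec{x}) \, d\theta = 0
$$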

Solving this first-order condition for \(d\), the Bayes rule is a weighted posterior mean:

$$
\delta(\vec{x}) = \frac{\int \theta \, w(\theta) \, \pi(\theta \mid \vec{x}) \, d\theta}{\int w(\theta) \, \pi(\theta \mid \vec{x}) \, d\theta}
$$

This rule gives more weight to values of \(\theta\) where \(w(\theta)\) is high.
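
A convenient computational fact: the posterior's normalizing constant appears in both the numerator and the denominator, so it cancels and an unnormalized posterior suffices. The following is a minimal numerical sketch, assuming for illustration a Beta(3, 5) posterior and the increasing weight \(w(\theta) = \theta^2\) (both are our choices, not prescribed by the theory):

```python
import numpy as np

# Illustrative setup (assumptions for this sketch):
# unnormalized Beta(3, 5) posterior and increasing weight w(theta) = theta^2.
theta = np.linspace(1e-4, 1 - 1e-4, 5001)
post_unnorm = theta ** (3 - 1) * (1 - theta) ** (5 - 1)  # Beta(3, 5) kernel
w = theta ** 2                                           # emphasizes large theta

# The posterior's normalizing constant cancels in this ratio.
weighted_rule = np.sum(theta * w * post_unnorm) / np.sum(w * post_unnorm)
plain_mean = np.sum(theta * post_unnorm) / np.sum(post_unnorm)

print(f"plain posterior mean: {plain_mean:.4f}")     # 3/8 = 0.3750
print(f"weighted Bayes rule:  {weighted_rule:.4f}")  # ~0.5000
```

With this weight the rule moves from \(0.375\) to \(0.5\): errors at large \(\theta\) are penalized more heavily, so the estimate is pulled toward that region.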


4. Practical Implications

This is especially useful in applications where:

  • Some estimation errors are more costly than others,
  • You want to emphasize certain parameter regions.

Example: In insurance fraud detection, missing high-risk customers may be more expensive than falsely flagging low-risk ones.


Summary

  • Quadratic loss: \(\delta(\vec{x}) = \mathbb{E}[\theta \mid \vec{x}]\)
  • Weighted quadratic loss: \(\displaystyle \delta(\vec{x}) = \frac{\int \theta \, w(\theta) \, \pi(\theta \mid \vec{x}) \, d\theta}{\int w(\theta) \, \pi(\theta \mid \vec{x}) \, d\theta}\)

The weighted quadratic loss gives us flexible, context-aware decision rules.


What’s Next

In the next edition (8), we apply these rules concretely to well-known models:

  • The Poisson model for count data,

  • The Normal model for continuous data.

We’ll show how these rules work in practice and how the choice of weights affects decisions.

Stay tuned for the next part!

We gratefully acknowledge Dr. Dany Djeudeu for preparing this course.