Introduction
In the previous edition, we explored how decision theory provides rules for estimating unknown parameters using loss functions and Bayes rules. Now we go one step further, introducing the weighted quadratic loss and showing how it changes the optimal Bayes decision rule.
1. Loss Functions Revisited
Recall that in classical Bayesian estimation, the quadratic loss function is:
$$
L(\theta, d) = (\theta - d)^2
$$
Its Bayes rule is:
$$
\delta(\vec{x}) = \mathbb{E}[\theta \mid \vec{x}]
$$
This rule minimizes the posterior expected quadratic loss: the optimal decision is the posterior mean of \(\theta\).
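To see this numerically, here is a minimal Python sketch (the Normal posterior and all numbers are illustrative assumptions, not part of the course material): it grid-searches candidate decisions \(d\) and checks that the minimizer of the Monte Carlo expected quadratic loss matches the posterior mean.

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative posterior (hypothetical values): theta | x ~ N(2.0, 0.5^2)
posterior_samples = rng.normal(loc=2.0, scale=0.5, size=100_000)

# Monte Carlo estimate of the posterior expected quadratic loss for decision d
def expected_loss(d, samples):
    return np.mean((samples - d) ** 2)

# Grid search over candidate decisions
grid = np.linspace(0.0, 4.0, 401)
losses = [expected_loss(d, posterior_samples) for d in grid]
best_d = grid[np.argmin(losses)]

print(f"grid minimizer : {best_d:.3f}")
print(f"posterior mean : {posterior_samples.mean():.3f}")  # should agree
```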
2. Introducing Weighted Quadratic Loss
But what if some values of \(\theta\) are more important to estimate accurately than others?
We use a weighted quadratic loss function:
$$
L(\theta, d) = w(\theta)(\theta - d)^2
$$
Here, \(w(\theta)\) is a non-negative weight function that reflects how much we care about making accurate decisions near \(\theta\).
- Example: In finance, overestimating risk may be more costly when actual risk is low, so you might upweight small \(\theta\) (see the short sketch below).
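For intuition, here is a small Python sketch of two hypothetical weight functions (illustrative choices, not prescriptions), one emphasizing small \(\theta\) and one emphasizing large \(\theta\):

```python
import numpy as np

def w_small(theta):
    # emphasizes accuracy near small theta (e.g., low actual risk)
    return np.exp(-theta)

def w_large(theta):
    # emphasizes accuracy near large theta (e.g., high actual risk)
    return theta ** 2

theta = np.array([0.1, 1.0, 5.0])
print(w_small(theta))  # decreasing in theta
print(w_large(theta))  # increasing in theta
```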
3. Bayesian Rule Under Weighted Loss
To minimize the posterior expected weighted loss, we choose the decision \(d\) that minimizes:
$$
\int w(\theta)\,(\theta - d)^2\,\pi(\theta \mid \vec{x})\,d\theta
$$
This is a quadratic function of \(d\); differentiating with respect to \(d\) and setting the derivative to zero shows that the minimizer is a weighted posterior mean:
$$
\delta(\vec{x}) = \frac{\int \theta\, w(\theta)\, \pi(\theta \mid \vec{x})\, d\theta}{\int w(\theta)\, \pi(\theta \mid \vec{x})\, d\theta}
$$
This rule gives more weight to values of \(\theta\) where \(w(\theta)\) is high.
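To make the formula concrete, here is a minimal Monte Carlo sketch in Python (the Gamma posterior and the weight \(w(\theta) = \theta^2\) are illustrative assumptions): both integrals are approximated by sample averages, and the weighted rule is visibly pulled toward the region where \(w\) is large.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative posterior (hypothetical): theta | x ~ Gamma(shape=3, rate=2)
samples = rng.gamma(shape=3.0, scale=0.5, size=1_000_000)

# Hypothetical weight function emphasizing large theta
def w(theta):
    return theta ** 2

# Monte Carlo versions of the two integrals in the formula above:
# E[theta * w(theta) | x] / E[w(theta) | x]
weighted_rule = np.mean(samples * w(samples)) / np.mean(w(samples))

print(f"posterior mean      : {samples.mean():.3f}")   # ~ 1.5
print(f"weighted Bayes rule : {weighted_rule:.3f}")    # ~ 2.5
```

For this Gamma(3, rate 2) posterior the exact values are \(\mathbb{E}[\theta \mid \vec{x}] = 1.5\) and \(\mathbb{E}[\theta^3 \mid \vec{x}] / \mathbb{E}[\theta^2 \mid \vec{x}] = 2.5\), so the weighted rule shifts the decision toward large \(\theta\), as expected.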
4. Practical Implications
This is especially useful in applications where:
- Some estimation errors are more costly than others,
- You want to emphasize certain parameter regions.
Example: In insurance fraud detection, missing high-risk customers may be more expensive than falsely flagging low-risk ones.
Summary
| Loss Function | Bayes Rule \(\delta(\vec{x})\) |
|---|---|
| Quadratic | \(\mathbb{E}[\theta \mid \vec{x}]\) |
| Weighted Quadratic | \(\displaystyle \frac{\int \theta\, w(\theta)\, \pi(\theta \mid \vec{x})\, d\theta}{\int w(\theta)\, \pi(\theta \mid \vec{x})\, d\theta}\) |
The weighted quadratic loss gives us flexible, context-aware decision rules.
What’s Next
In the next edition (8), we apply these rules concretely to two well-known models:
- The Poisson model for count data,
- The Normal model for continuous data.
We’ll show how these rules work in practice and how the choice of weights affects decisions.
Stay tuned for the next part!
We gratefully acknowledge Dr. Dany Djeudeu for preparing this course.
We help businesses and researchers solve complex challenges by providing expert guidance in statistics, machine learning, and tailored education.
Our core services include:
– Statistical Consulting:
Comprehensive consulting tailored to your data-driven needs.
– Training and Coaching:
In-depth instruction in statistics, machine learning, and the use of statistical software such as SAS, R, and Python.
– Reproducible Data Analysis Pipelines:
Development of documented, reproducible workflows using SAS macros and customized R and Python code.
– Interactive Data Visualization and Web Applications:
Creation of dynamic visualizations and web apps with R (Shiny, Plotly), Python (Streamlit, Dash by Plotly), and SAS (SAS Viya, SAS Web Report Studio).
– Automated Reporting and Presentation:
Generation of automated reports and presentations using Markdown and Quarto.
– Scientific Data Analysis:
Advanced analytical support for scientific research projects.