1. Point Estimation Example
Estimating Variance in the Normal Model
Let \(X_1, \ldots, X_n \sim N(\mu, \sigma^2)\).
We are interested in estimating \(\theta = \sigma^2\).
Well-known estimators:
- Sample variance (unbiased):
$$\hat{\sigma}^2 = \frac{1}{n-1} \sum_{i=1}^n (X_i - \bar{X})^2 = s^2$$
- Maximum likelihood estimator:
$$\hat{\sigma}^2_{ML} = \frac{1}{n} \sum_{i=1}^n (X_i - \bar{X})^2$$
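Both estimators can be checked numerically; here is a minimal NumPy sketch (the sample size, mean, and variance below are arbitrary illustration values):

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(loc=0.0, scale=2.0, size=25)   # true sigma^2 = 4

s2_unbiased = x.var(ddof=1)   # divides by n - 1 (sample variance)
s2_mle = x.var(ddof=0)        # divides by n (maximum likelihood estimator)
print(s2_unbiased, s2_mle)    # the MLE is smaller by the factor (n-1)/n
```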
A class of estimators:
We define a family of estimators:
$$\delta_\lambda(\vec{X}) = \lambda s^2 \quad \text{with } \lambda \in \mathbb{R}^+$$
- When \(\lambda = 1\), we get the unbiased estimator
- When \(\lambda = \frac{n-1}{n}\), we get the MLE
Objective
We want to find the value of \(\lambda\) that minimizes the mean squared error (MSE):
$$
M(\sigma^2, \delta_\lambda) = \mathrm{Var}(\delta_\lambda) + (\mathrm{Bias}(\sigma^2, \delta_\lambda))^2
$$
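Since \((n-1)s^2/\sigma^2 \sim \chi^2_{n-1}\) in the normal model, we have \(E[s^2] = \sigma^2\) and \(\mathrm{Var}(s^2) = \frac{2\sigma^4}{n-1}\), so \(\mathrm{Bias}(\sigma^2, \delta_\lambda) = (\lambda - 1)\sigma^2\) and the MSE has a closed form (a sketch of the standard calculation):
$$
M(\sigma^2, \delta_\lambda) = \lambda^2 \frac{2\sigma^4}{n-1} + (\lambda - 1)^2 \sigma^4 = \sigma^4 \left( \frac{2\lambda^2}{n-1} + (\lambda - 1)^2 \right)
$$
Setting the derivative with respect to \(\lambda\) to zero gives \(\lambda^* = \frac{n-1}{n+1}\), so the MSE-optimal member of this family is \(\frac{1}{n+1} \sum_{i=1}^n (X_i - \bar{X})^2\), which is neither the unbiased estimator nor the MLE.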
2. Convex Loss Functions in Estimation
Remarks
- The quadratic loss \(L(\theta, d) = (\theta - d)^2\) is convex in \(d\).
- Convex loss functions are important in decision theory:
Berger (1985), Theorem 4 (Rao-Blackwell), p.41:
“For estimation problems with a convex loss function and an existing sufficient statistic \(T\), only nonrandomized decision rules based on \(T\) need to be considered.”
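To make the Rao-Blackwell idea concrete, here is a minimal simulation sketch for Bernoulli(\(p\)) data: the crude unbiased estimator \(X_1\) is replaced by its conditional expectation given the sufficient statistic \(T = \sum_i X_i\), which is just the sample mean; under the convex squared error loss, conditioning on \(T\) can only improve the risk (the sample size, \(p\), and seed below are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, reps = 20, 0.3, 200_000

x = rng.binomial(1, p, size=(reps, n))
crude = x[:, 0]                  # unbiased estimator of p based on a single observation
rao_blackwell = x.mean(axis=1)   # E[X_1 | sum of X_i] = sample mean

# Both estimators are unbiased; conditioning on the sufficient statistic shrinks the MSE.
print("MSE crude:", np.mean((crude - p) ** 2))                   # about p(1-p) = 0.21
print("MSE Rao-Blackwell:", np.mean((rao_blackwell - p) ** 2))   # about p(1-p)/n = 0.0105
```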
3. Hypothesis Testing
Setup
Let \(\Theta = \Theta_0 \cup \Theta_1\) with \(\Theta_0 \cap \Theta_1 = \emptyset\).
We test:
\(H_0: \theta \in \Theta_0 \quad \text{vs.} \quad H_1: \theta \in \Theta_1\)
Decisions:
- \(d_0\): Accept \(H_0\)
- \(d_1\): Accept \(H_1\)
A decision rule is a function:
\(\delta: \mathbb{R}^n \rightarrow \{d_0, d_1\}\)
Nonrandomized Test
Define:
- \(A = \{\vec{x} \mid \delta(\vec{x}) = d_0\}\): Acceptance region
- \(C = \{\vec{x} \mid \delta(\vec{x}) = d_1\}\): Critical region
Loss Function
0-1 loss:
$$L(\theta, d_0) = \begin{cases} 0 & \text{if } \theta \in \Theta_0 \\ 1 & \text{if } \theta \in \Theta_1 \end{cases}$$
$$L(\theta, d_1) = \begin{cases} 1 & \text{if } \theta \in \Theta_0 \\ 0 & \text{if } \theta \in \Theta_1 \end{cases}$$
Asymmetric loss:
$$L(\theta, d_0) = \begin{cases} 0 & \text{if } \theta \in \Theta_0 \\ l_0 & \text{if } \theta \in \Theta_1 \end{cases}$$
$$L(\theta, d_1) = \begin{cases} l_1 & \text{if } \theta \in \Theta_0 \\ 0 & \text{if } \theta \in \Theta_1 \end{cases}$$
where \(l_0\) is the cost of a type II error (choosing \(d_0\) when \(\theta \in \Theta_1\)) and \(l_1\) is the cost of a type I error (choosing \(d_1\) when \(\theta \in \Theta_0\)).
Randomized Test
We now allow randomization by partitioning the sample space into an acceptance region \(A\), a critical region \(C\), and a randomization region \(R\):
- \(\vec{x} \in A\): Choose \(d_0\)
- \(\vec{x} \in C\): Choose \(d_1\)
- \(\vec{x} \in R\): Choose \(d_1\) with probability \(\gamma(\vec{x})\), otherwise \(d_0\)
The decision rule:
$$
\delta(\vec{x}) = \varphi(\vec{x}) = \begin{cases}
1 & \text{if } \vec{x} \in C \\
0 & \text{if } \vec{x} \in A \\
\gamma(\vec{x}) & \text{if } \vec{x} \in R
\end{cases}
$$
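A classical situation where randomization is genuinely needed is a discrete test statistic. Below is a minimal sketch (assuming SciPy is available) of the one-sided binomial test of \(H_0: \theta = \theta_0\) versus \(H_1: \theta > \theta_0\): no nonrandomized test of the form "reject if \(X > c\)" attains the level \(\alpha = 0.05\) exactly, but randomizing on the boundary value \(X = c\) does (the values of \(n\), \(\theta_0\), and \(\alpha\) are chosen only for illustration):

```python
from scipy.stats import binom

n, theta0, alpha = 10, 0.5, 0.05

# Smallest critical value c with P_{theta0}(X > c) <= alpha
c = next(k for k in range(n + 1) if binom.sf(k, n, theta0) <= alpha)

# Randomization probability gamma on the boundary {X = c}
gamma = (alpha - binom.sf(c, n, theta0)) / binom.pmf(c, n, theta0)

def phi(x):
    """Test function: 1 on C = {x > c}, gamma on R = {x = c}, 0 on A = {x < c}."""
    return 1.0 if x > c else (gamma if x == c else 0.0)

# The size E_{theta0}[phi(X)] equals alpha exactly thanks to the randomization.
size = sum(phi(x) * binom.pmf(x, n, theta0) for x in range(n + 1))
print(c, round(gamma, 3), round(size, 4))   # here: c = 8, gamma ≈ 0.893, size = 0.05
```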
4. Risk Function Under Randomization
Assume \(\Theta = \{\theta_0, \theta_1\}\) and use the asymmetric loss introduced above.
- For \(\theta = \theta_1\) (only the decision \(d_0\) incurs a loss):
$$R(\theta_1, \varphi) = l_0\, E_{\theta_1}[1 - \varphi(\vec{X})] = l_0 \left( P_{\theta_1}(\vec{X} \in A) + E_{\theta_1}\!\left[(1 - \gamma(\vec{X}))\,\mathbf{1}\{\vec{X} \in R\}\right] \right)$$
- For \(\theta = \theta_0\) (only the decision \(d_1\) incurs a loss):
$$R(\theta_0, \varphi) = l_1\, E_{\theta_0}[\varphi(\vec{X})] = l_1 \left( P_{\theta_0}(\vec{X} \in C) + E_{\theta_0}\!\left[\gamma(\vec{X})\,\mathbf{1}\{\vec{X} \in R\}\right] \right)$$
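Continuing the binomial sketch from above, both risk values can be computed by summing over the discrete sample space (the losses \(l_0 = 5\), \(l_1 = 1\), the alternative \(\theta_1 = 0.8\), and the reused values of \(c\) and \(\gamma\) are hypothetical illustration choices):

```python
from scipy.stats import binom

n, theta0, theta1 = 10, 0.5, 0.8
l0, l1 = 5.0, 1.0            # hypothetical costs of type II and type I errors
c, gamma = 8, 0.893          # critical value and boundary randomization from the sketch above

def phi(x):
    return 1.0 if x > c else (gamma if x == c else 0.0)

# R(theta0, phi) = l1 * E_{theta0}[phi(X)]      (expected cost of a type I error)
risk0 = l1 * sum(phi(x) * binom.pmf(x, n, theta0) for x in range(n + 1))

# R(theta1, phi) = l0 * E_{theta1}[1 - phi(X)]  (expected cost of a type II error)
risk1 = l0 * sum((1.0 - phi(x)) * binom.pmf(x, n, theta1) for x in range(n + 1))

print(round(risk0, 4), round(risk1, 4))
```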
5. Special Cases
Nonrandomized Tests
- \(\theta = \theta_1\):
$$R(\theta_1, \varphi) = l_0 P_{\theta_1}(\vec{x} \in A)$$
- \(\theta = \theta_0\):
$$R(\theta_0, \varphi) = l_1 P_{\theta_0}(\vec{x} \in C)$$
Under 0-1 Loss
- \(\theta = \theta_1\):
$$R(\theta_1, \varphi) = P_{\theta_1}(\vec{x} \in A) = 1 – P_{\theta_1}(\vec{x} \in C)$$
- \(\theta = \theta_0\):
$$R(\theta_0, \varphi) = P_{\theta_0}(\vec{x} \in C)$$
So under 0-1 loss the risk is determined by the power function \(\beta(\theta) = P_\theta(\vec{x} \in C)\): at \(\theta_0\) it equals \(\beta(\theta_0)\), the type I error probability, and at \(\theta_1\) it equals \(1 - \beta(\theta_1)\), the type II error probability.
Summary
- Point estimation can be seen as a decision problem under squared error loss
- Estimator choice (e.g., scaling sample variance) affects the MSE
- With convex loss functions, attention can be restricted to nonrandomized rules based on sufficient statistics (Rao-Blackwell)
- Hypothesis testing is a binary decision problem with associated losses
- Randomized tests add flexibility, especially under asymmetric loss
- Risk functions quantify performance and help compare decision rules
We gratefully acknowledge Dr. Dany Djeudeu for preparing this course.
Stay tuned for the next part!