1. Point Estimation Example
Estimating Variance in the Normal Model
Let \(X_1, \ldots, X_n \sim N(\mu, \sigma^2)\).
We are interested in estimating \(\theta = \sigma^2\).
Well-known estimators:
- Sample variance (unbiased):
$$\hat{\sigma}^2 = \frac{1}{n-1} \sum_{i=1}^n (X_i - \bar{X})^2 = s^2$$
- Maximum likelihood estimator:
$$\hat{\sigma}^2_{ML} = \frac{1}{n} \sum_{i=1}^n (X_i - \bar{X})^2$$
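Both estimators can be checked numerically; here is a minimal NumPy sketch (the sample size, mean, and variance below are arbitrary illustration values):

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(loc=0.0, scale=2.0, size=25)   # true sigma^2 = 4

s2_unbiased = x.var(ddof=1)   # divides by n - 1 (sample variance)
s2_mle = x.var(ddof=0)        # divides by n (maximum likelihood estimator)
print(s2_unbiased, s2_mle)    # the MLE is smaller by the factor (n-1)/n
```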
A class of estimators:
We define a family of estimators:
$$\delta_\lambda(\vec{X}) = \lambda s^2 \quad \text{with } \lambda \in \mathbb{R}^+$$
- When \(\lambda = 1\), we get the unbiased estimator
- When \(\lambda = \frac{n-1}{n}\), we get the MLE
Objective
We want to find the value of \(\lambda\) that minimizes the mean squared error (MSE):
$$
M(\sigma^2, \delta_\lambda) = \mathrm{Var}(\delta_\lambda) + (\mathrm{Bias}(\sigma^2, \delta_\lambda))^2
$$
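Since \((n-1)s^2/\sigma^2 \sim \chi^2_{n-1}\) in the normal model, we have \(E[s^2] = \sigma^2\) and \(\mathrm{Var}(s^2) = \frac{2\sigma^4}{n-1}\), so \(\mathrm{Bias}(\sigma^2, \delta_\lambda) = (\lambda - 1)\sigma^2\) and the MSE has a closed form (a sketch of the standard calculation):
$$
M(\sigma^2, \delta_\lambda) = \lambda^2 \frac{2\sigma^4}{n-1} + (\lambda - 1)^2 \sigma^4 = \sigma^4 \left( \frac{2\lambda^2}{n-1} + (\lambda - 1)^2 \right)
$$
Setting the derivative with respect to \(\lambda\) to zero gives \(\lambda^* = \frac{n-1}{n+1}\), so the MSE-optimal member of this family is \(\frac{1}{n+1} \sum_{i=1}^n (X_i - \bar{X})^2\), which is neither the unbiased estimator nor the MLE.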
2. Convex Loss Functions in Estimation
Remarks
- The quadratic loss \(L(\theta, d) = (\theta - d)^2\) is convex in \(d\).
- Convex loss functions are important in decision theory:
Berger (1985), Theorem 4 (Rao-Blackwell), p.41:
“For estimation problems with a convex loss function and an existing sufficient statistic \(T\), only nonrandomized decision rules based on \(T\) need to be considered.”
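To make the Rao-Blackwell idea concrete, here is a minimal simulation sketch for Bernoulli(\(p\)) data: the crude unbiased estimator \(X_1\) is replaced by its conditional expectation given the sufficient statistic \(T = \sum_i X_i\), which is just the sample mean; under the convex squared error loss, conditioning on \(T\) can only improve the risk (the sample size, \(p\), and seed below are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, reps = 20, 0.3, 200_000

x = rng.binomial(1, p, size=(reps, n))
crude = x[:, 0]                  # unbiased estimator of p based on a single observation
rao_blackwell = x.mean(axis=1)   # E[X_1 | sum of X_i] = sample mean

# Both estimators are unbiased; conditioning on the sufficient statistic shrinks the MSE.
print("MSE crude:", np.mean((crude - p) ** 2))                   # about p(1-p) = 0.21
print("MSE Rao-Blackwell:", np.mean((rao_blackwell - p) ** 2))   # about p(1-p)/n = 0.0105
```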
3. Hypothesis Testing
Setup
Let \(\Theta = \Theta_0 \cup \Theta_1\) with \(\Theta_0 \cap \Theta_1 = \emptyset\).
We test:
\(H_0: \theta \in \Theta_0 \quad \text{vs.} \quad H_1: \theta \in \Theta_1\)
Decisions:
- \(d_0\): Accept \(H_0\)
- \(d_1\): Accept \(H_1\)
A decision rule is a function:
\(\delta: \mathbb{R}^n \rightarrow \{d_0, d_1\}\)
Nonrandomized Test
Define:
- \(A = \{\vec{x} \mid \delta(\vec{x}) = d_0\}\): Acceptance region
- \(C = \{\vec{x} \mid \delta(\vec{x}) = d_1\}\): Critical region
Loss Function
0-1 loss:
$$L(\theta, d_0) = \begin{cases} 0 & \text{if } \theta \in \Theta_0 \\ 1 & \text{if } \theta \in \Theta_1 \end{cases}$$
$$L(\theta, d_1) = \begin{cases} 1 & \text{if } \theta \in \Theta_0 \\ 0 & \text{if } \theta \in \Theta_1 \end{cases}$$
Asymmetric loss:
$$L(\theta, d_0) = \begin{cases} 0 & \text{if } \theta \in \Theta_0 \\ l_0 & \text{if } \theta \in \Theta_1 \end{cases}$$
$$L(\theta, d_1) = \begin{cases} l_1 & \text{if } \theta \in \Theta_0 \\ 0 & \text{if } \theta \in \Theta_1 \end{cases}$$
where \(l_0\) is the cost of a type II error (choosing \(d_0\) when \(\theta \in \Theta_1\)) and \(l_1\) is the cost of a type I error (choosing \(d_1\) when \(\theta \in \Theta_0\)).
Randomized Test
We now allow randomization by partitioning the sample space into an acceptance region \(A\), a critical region \(C\), and a randomization region \(R\):
- \(\vec{x} \in A\): Choose \(d_0\)
- \(\vec{x} \in C\): Choose \(d_1\)
- \(\vec{x} \in R\): Choose \(d_1\) with probability \(\gamma(\vec{x})\), otherwise \(d_0\)
The decision rule:
$$
\delta(\vec{x}) = \varphi(\vec{x}) = \begin{cases}
1 & \text{if } \vec{x} \in C \\
0 & \text{if } \vec{x} \in A \\
\gamma(\vec{x}) & \text{if } \vec{x} \in R
\end{cases}
$$
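A classical situation where randomization is genuinely needed is a discrete test statistic. Below is a minimal sketch (assuming SciPy is available) of the one-sided binomial test of \(H_0: \theta = \theta_0\) versus \(H_1: \theta > \theta_0\): no nonrandomized test of the form "reject if \(X > c\)" attains the level \(\alpha = 0.05\) exactly, but randomizing on the boundary value \(X = c\) does (the values of \(n\), \(\theta_0\), and \(\alpha\) are chosen only for illustration):

```python
from scipy.stats import binom

n, theta0, alpha = 10, 0.5, 0.05

# Smallest critical value c with P_{theta0}(X > c) <= alpha
c = next(k for k in range(n + 1) if binom.sf(k, n, theta0) <= alpha)

# Randomization probability gamma on the boundary {X = c}
gamma = (alpha - binom.sf(c, n, theta0)) / binom.pmf(c, n, theta0)

def phi(x):
    """Test function: 1 on C = {x > c}, gamma on R = {x = c}, 0 on A = {x < c}."""
    return 1.0 if x > c else (gamma if x == c else 0.0)

# The size E_{theta0}[phi(X)] equals alpha exactly thanks to the randomization.
size = sum(phi(x) * binom.pmf(x, n, theta0) for x in range(n + 1))
print(c, round(gamma, 3), round(size, 4))   # here: c = 8, gamma ≈ 0.893, size = 0.05
```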
4. Risk Function Under Randomization
Assume \(\Theta = \{\theta_0, \theta_1\}\) and use the asymmetric loss introduced above.
- For \(\theta = \theta_1\) (only the decision \(d_0\) incurs a loss):
$$R(\theta_1, \varphi) = l_0\, E_{\theta_1}[1 - \varphi(\vec{X})] = l_0 \left( P_{\theta_1}(\vec{X} \in A) + E_{\theta_1}\!\left[(1 - \gamma(\vec{X}))\,\mathbf{1}\{\vec{X} \in R\}\right] \right)$$
- For \(\theta = \theta_0\) (only the decision \(d_1\) incurs a loss):
$$R(\theta_0, \varphi) = l_1\, E_{\theta_0}[\varphi(\vec{X})] = l_1 \left( P_{\theta_0}(\vec{X} \in C) + E_{\theta_0}\!\left[\gamma(\vec{X})\,\mathbf{1}\{\vec{X} \in R\}\right] \right)$$
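Continuing the binomial sketch from above, both risk values can be computed by summing over the discrete sample space (the losses \(l_0 = 5\), \(l_1 = 1\), the alternative \(\theta_1 = 0.8\), and the reused values of \(c\) and \(\gamma\) are hypothetical illustration choices):

```python
from scipy.stats import binom

n, theta0, theta1 = 10, 0.5, 0.8
l0, l1 = 5.0, 1.0            # hypothetical costs of type II and type I errors
c, gamma = 8, 0.893          # critical value and boundary randomization from the sketch above

def phi(x):
    return 1.0 if x > c else (gamma if x == c else 0.0)

# R(theta0, phi) = l1 * E_{theta0}[phi(X)]      (expected cost of a type I error)
risk0 = l1 * sum(phi(x) * binom.pmf(x, n, theta0) for x in range(n + 1))

# R(theta1, phi) = l0 * E_{theta1}[1 - phi(X)]  (expected cost of a type II error)
risk1 = l0 * sum((1.0 - phi(x)) * binom.pmf(x, n, theta1) for x in range(n + 1))

print(round(risk0, 4), round(risk1, 4))
```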
5. Special Cases
Nonrandomized Tests
- \(\theta = \theta_1\):
$$R(\theta_1, \varphi) = l_0 P_{\theta_1}(\vec{x} \in A)$$
- \(\theta = \theta_0\):
$$R(\theta_0, \varphi) = l_1 P_{\theta_0}(\vec{x} \in C)$$
Under 0-1 Loss
- \(\theta = \theta_1\):
$$R(\theta_1, \varphi) = P_{\theta_1}(\vec{x} \in A) = 1 – P_{\theta_1}(\vec{x} \in C)$$
- \(\theta = \theta_0\):
$$R(\theta_0, \varphi) = P_{\theta_0}(\vec{x} \in C)$$
So under 0-1 loss the risk is determined by the power function \(\beta(\theta) = P_\theta(\vec{x} \in C)\): at \(\theta_0\) it equals \(\beta(\theta_0)\), the type I error probability, and at \(\theta_1\) it equals \(1 - \beta(\theta_1)\), the type II error probability.
Summary
- Point estimation can be seen as a decision problem under squared error loss
- Estimator choice (e.g., scaling sample variance) affects the MSE
- With convex loss functions, attention can be restricted to nonrandomized rules based on sufficient statistics (Rao-Blackwell)
- Hypothesis testing is a binary decision problem with associated losses
- Randomized tests add flexibility, especially under asymmetric loss
- Risk functions quantify performance and help compare decision rules
We gratefully acknowledge Dr. Dany Djeudeu for preparing this course.
Stay tuned for the next part!