Goal: to estimate unknown population parameters.
Model free: use the sample mean to estimate the population mean, the sample variance to estimate the population variance, etc.
Model based: start from a probability distribution, then use the data to estimate its unknown parameters.
We will focus first on point estimation.
Notation:
\(\theta\) is the unknown population parameter of interest.
\(X_1,\dots,X_n\) is the sample before values are observed.
Random variables.
\(x_1,\dots,x_n\) are the observed sample values, or data.
Fixed values (not random).
\(\hat \theta\) is a quantity used to estimate \(\theta\).
\(\hat \theta(X_1,\dots,X_n)\) is called an estimator: a function of the random variables.
\(\hat \theta(x_1,\dots,x_n)\) is called an estimate: a function of the observed values.
For \(\theta=\mu\), \(\hat \theta(X_1,\dots,X_n)={1\over n}\sum^n_{i=1}X_i\), the sample mean.
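As a quick numeric illustration of the estimator/estimate distinction (the sample values below are hypothetical), the estimate is just the number obtained by plugging the observed data into the estimator:

```python
# Hypothetical observed data x_1, ..., x_n (fixed numbers, not random).
x = [2.1, 3.4, 2.8, 3.0, 2.7]

# The estimate of mu: the sample-mean estimator evaluated at the data.
theta_hat = sum(x) / len(x)

print(theta_hat)  # the point estimate of the population mean
```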
Often there are multiple ways of estimating a parameter.
How do we choose one estimator over another?
What are some properties of “good” estimators?
Definition: An estimator \(\hat \theta\) of \(\theta\) is unbiased for \(\theta\) if
\[ E_{\theta}(\hat\theta)=\theta \]
Definition: The bias of \(\hat \theta\) is
\[ bias(\hat\theta)=E_{\theta}(\hat\theta)-\theta \]
Examples:
\(\bar X\) is an unbiased estimator of \(\mu=E(X_i)\).
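The first claim follows directly from linearity of expectation:
\[ E(\bar X) = E\left({1\over n}\sum^n_{i=1}X_i\right) = {1\over n}\sum^n_{i=1}E(X_i) = {1\over n}\cdot n\mu = \mu. \]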
The sample proportion \(\hat p\) is an unbiased estimator of \(p\).
Let \(X_1,\dots,X_n\) be iid with common variance \(\sigma^2\). Then
\[ S^2={1\over n-1}\sum^n_{i=1}(X_i-\bar X)^2 \]
is unbiased for \(\sigma^2\).
Note: \(S\) is a biased estimator of \(\sigma\).
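A quick Monte Carlo sketch (hypothetical normal data with a fixed seed; the specific values are assumptions for illustration) shows why the \(n-1\) denominator matters: dividing by \(n\) instead systematically underestimates \(\sigma^2\) by a factor of \((n-1)/n\).

```python
import random

random.seed(0)
mu, sigma2 = 0.0, 4.0       # hypothetical population: N(0, 4)
n, reps = 5, 200_000        # small samples, many replications

avg_nminus1, avg_n = 0.0, 0.0
for _ in range(reps):
    xs = [random.gauss(mu, sigma2 ** 0.5) for _ in range(n)]
    xbar = sum(xs) / n
    ss = sum((x - xbar) ** 2 for x in xs)
    avg_nminus1 += ss / (n - 1)   # S^2, divide by n-1
    avg_n += ss / n               # divide by n instead

avg_nminus1 /= reps
avg_n /= reps
# avg_nminus1 should be near sigma^2 = 4,
# while avg_n concentrates near (n-1)/n * sigma^2 = 3.2.
```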
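The bias of \(S\) follows from a one-line variance argument: since \(Var(S)=E(S^2)-[E(S)]^2\ge 0\),
\[ E(S) \le \sqrt{E(S^2)} = \sigma, \]
with strict inequality unless \(S\) is degenerate, so \(S\) underestimates \(\sigma\) on average.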
Definition: The standard error of an estimator \(\hat \theta\) is
\[ \sigma_{\hat \theta}=\sqrt{Var_{\theta}(\hat \theta)} \]
The estimated standard error is denoted by \(S_{\hat\theta}\).
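For example, for \(\hat\theta=\bar X\) based on an iid sample with variance \(\sigma^2\),
\[ \sigma_{\bar X}=\sqrt{Var(\bar X)}={\sigma\over \sqrt n}, \qquad S_{\bar X}={S\over \sqrt n}. \]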
Example: Let \(\bar X_1\) and \(S_1^2\) be the sample mean and variance of a simple random sample of size \(n_1\) from a population with mean \(\mu_1\) and variance \(\sigma_1^2\). Let \(\bar X_2\) and \(S_2^2\) be the sample mean and variance of a simple random sample of size \(n_2\) from a population with mean \(\mu_2\) and variance \(\sigma^2_2\). Assume the two samples are independent.
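The example appears cut off here; this two-sample setup standardly leads to estimating \(\mu_1-\mu_2\) by \(\bar X_1-\bar X_2\), which is unbiased since
\[ E(\bar X_1-\bar X_2)=\mu_1-\mu_2, \]
and, by independence of the samples, has standard error and estimated standard error
\[ \sigma_{\bar X_1-\bar X_2}=\sqrt{{\sigma_1^2\over n_1}+{\sigma_2^2\over n_2}}, \qquad S_{\bar X_1-\bar X_2}=\sqrt{{S_1^2\over n_1}+{S_2^2\over n_2}}. \]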