Hypothesis tests allow us to assess evidence provided by the data in favor of some claim about a population parameter.
Often we want to decide whether the observed value of a statistic (calculated from the data) is consistent with some hypothesized value of a parameter (the true population parameter is usually unknown).
Can the difference between the observed and hypothesized values be attributed to random chance?
Example: A well-established genetic model suggests that a particular experimental cross of lupine should have blue flowers with probability 0.75 and white flowers with probability 0.25. In an experiment with 200 seeds, 142 are blue-flowering. Is this consistent with the model?
The hypothesized value is \(p=0.75\);
the observed statistic is \(\hat p={142\over 200}=0.71\).
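As a preview of where this section is headed, here is a minimal check in Python, assuming scipy is available; `binomtest` computes an exact binomial p-value, an idea made precise later in these notes.

```python
# A quick look at the lupine example, assuming scipy is available.
# binomtest gives an exact binomial p-value for seeing 142 blue-flowering
# plants out of 200 when the model claims p = 0.75.
from scipy.stats import binomtest

result = binomtest(k=142, n=200, p=0.75, alternative="two-sided")
print(result.pvalue)  # roughly 0.2, so 0.71 looks consistent with p = 0.75
```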
A statistical hypothesis test involves two opposing hypotheses.
The null hypothesis, \(H_0\):
Often a statement of “no effect” or “no difference”, the status quo
A test is designed to assess the strength of evidence against \(H_0\).
The alternative hypothesis, \(H_a\):
Often states that there is an effect or a difference.
Test procedures do not treat the hypotheses equally: we assume \(H_0\) is true unless the data provide enough evidence to reject \(H_0\) in favor of \(H_a\).
Example: \(H_0:p=0.75\), \(H_a:p\neq 0.75\)
A hypothesis test has two possible conclusions:
Reject \(H_0\): there is sufficient evidence against \(H_0\), so we conclude that \(H_a\) is true.
Fail to reject \(H_0\): there is not sufficient evidence against \(H_0\); either \(H_0\) or \(H_a\) could still be true.
Since we use sample data to draw a conclusion about a population parameter, we may reach an incorrect conclusion (by random chance).
A Type I error occurs when we reject \(H_0\) when \(H_0\) is true.
A Type II error occurs when we fail to reject \(H_0\) when \(H_0\) is false.
| Decision \ Truth | \(H_0\) is true | \(H_0\) is false (\(H_a\) is true) |
|---|---|---|
| Do not reject \(H_0\) | True negative | Type II error (false negative) |
| Reject \(H_0\) | Type I error (false positive) | True positive |
A (very) simplified example: Suppose we receive a large shipment of parts. We take a random sample of parts in the shipment in order to determine whether to accept or reject the shipment.
\(H_0\): shipment of parts is good
\(H_a\): shipment of parts is bad
We could accept a shipment of bad parts. (Type II error)
We could reject a shipment of good parts. (Type I error)
Can we prevent Type I and Type II errors from happening?
No. But we can control the probabilities that they occur.
\(\alpha =\) probability of making a Type I error
\(\beta =\) probability of making a Type II error
As \(\alpha\) decreases, \(\beta\) increases (and vice versa). In practice we control \(\alpha\) first, since it is the easiest to control.
We call \(\alpha\) the level of significance of a hypothesis test. \(\alpha\) is chosen by the researcher and is a subjective choice.
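A minimal Monte Carlo sketch of the claim that \(\alpha\) is the probability of a Type I error, assuming numpy and using the one-proportion \(z\) statistic introduced later in these notes: when the data are generated with \(H_0\) actually true, a level-0.05 test rejects roughly 5% of the time.

```python
# Monte Carlo sketch: generate many samples for which H0 is true and
# count how often a level-0.05 two-sided z-test rejects.
import numpy as np

rng = np.random.default_rng(0)
n, p0, alpha = 200, 0.75, 0.05
z_crit = 1.96                              # two-sided critical value for alpha = 0.05

x = rng.binomial(n, p0, size=10_000)       # counts generated under H0: p = p0
p_hat = x / n
z = (p_hat - p0) / np.sqrt(p0 * (1 - p0) / n)
print(np.mean(np.abs(z) >= z_crit))        # close to alpha = 0.05
```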
Example: Consider a criminal trial
\(H_0\): Defendant is innocent.
\(H_a\): Defendant is guilty.
Reject \(H_0\): Find the defendant is guilty.
Do not reject \(H_0\): There is not sufficient evidence to show that the defendant is guilty. (Verdict: not guilty.)
Type I error: Find an innocent person guilty.
Type II error: Find a guilty person not guilty.
A hypothesis test allows us to assess the evidence provided by the data in favor of some claim about an unknown population parameter.
Our hypothesis tests will be performed with level of significance \(\alpha\). The most common choices are 0.01, 0.05, and 0.1.
We will cover hypothesis tests for proportions, means, and variances.
Procedure:
State the hypotheses.
Compute a test statistic.
It is based on an estimate of the parameter.
It has a known distribution when \(H_0\) is true.
Draw a conclusion using either a rejection rule/region or a p-value.
(A minimal sketch implementing this procedure for a proportion follows.)
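Here is a sketch of the procedure for a single proportion, assuming scipy is available; the function name and interface are illustrative, not from the notes.

```python
# One-proportion z-test, following the procedure above.
from math import sqrt
from scipy.stats import norm

def one_proportion_z_test(x, n, p0, alternative="two-sided"):
    """Test H0: p = p0 against the chosen alternative; return (z, p-value)."""
    p_hat = x / n                                   # estimate of the parameter
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)      # test statistic, ~N(0,1) under H0
    if alternative == "greater":                    # Ha: p > p0
        p_value = norm.sf(z)
    elif alternative == "less":                     # Ha: p < p0
        p_value = norm.cdf(z)
    else:                                           # Ha: p != p0
        p_value = 2 * norm.sf(abs(z))
    return z, p_value
```

Reject \(H_0\) whenever the returned p-value is at most \(\alpha\).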
Example: In 1995, 40% of adults aged 18 years or older reported that they had “a great deal” of confidence in the public schools. On June 1, 2005, the Gallup Organization released results of a poll in which 372 of 1004 adults aged 18 years or older stated that they had “a great deal” of confidence in public schools. Does the evidence suggest at the \(\alpha\) = 0.05 significance level that the proportion of adults aged 18 years or older having “a great deal” of confidence in the public schools is lower in 2005 than in 1995?
\(p=\) the proportion of adults (in 2005) having "a great deal" of confidence in the public schools.
Let \(X\sim Bin(n,p)\) and let \(\hat p={X\over n}\) be the sample proportion. (\(X\) is the number of adults having confidence in the public schools among the 1004 sampled adults.)
Let \(p_0=0.40\), the 1995 proportion.
\[ H_0:p=p_0 \qquad H_0:p=p_0 \qquad H_0:p=p_0\\ H_a:p>p_0 \qquad H_a:p<p_0 \qquad H_a:p\neq p_0 \]
The first is called a one-sided right-tailed (upper) test.
The second is called a one-sided left-tailed (lower) test.
The third is called a two-sided (two-tailed) test.
Example:
\(H_0:p=0.4\)
\(H_a:p<0.4\)
\[ Z_{H_0}={\hat p-p_0\over \sqrt{p_0(1-p_0)\over n}} \]
\(\sqrt{p_0(1-p_0)\over n}\) is the standard error of \(\hat p\) when \(H_0\) is true.
Condition: \(np_0\geq 5\) and \(n(1-p_0)\geq 5\).
Example:
\[ \hat p={372\over 1004}\approx 0.371\\ Z_{H_0}={\hat p-p_0\over \sqrt{p_0(1-p_0)\over n}}={0.371-0.4\over\sqrt{0.4(1-0.4)\over 1004} }\approx -1.876 \]
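A quick numeric check of this computation (standard library only). Note that rounding \(\hat p\) to 0.371 before dividing gives \(-1.876\), while the unrounded statistic is about \(-1.91\); the conclusion is the same either way.

```python
# Numeric check of the Gallup example.
from math import sqrt

n, p0 = 1004, 0.40
p_hat = 372 / n                            # 0.3705...
print(n * p0 >= 5 and n * (1 - p0) >= 5)   # True: the normal approximation applies
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
print(round(z, 3))                         # about -1.907 (-1.876 if p-hat is rounded first)
```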
If \(H_0\) is true, \(Z_{H_0}\mathrel{\dot\sim} N(0,1)\), i.e., approximately standard normal.
We want to determine if the observed value of \(Z_{H_0}\) is unusual for a \(N(0,1)\) random variable.
If the observed value of \(Z_{H_0}\) is unusual, then we reject \(H_0\).
i. Rejection rule or region:
\(H_a:p>p_0\): reject \(H_0\) if \(Z_{H_0}\geq z_\alpha\)
\(H_a:p<p_0\): reject \(H_0\) if \(Z_{H_0}\leq -z_\alpha\)
\(H_a:p\neq p_0\): reject \(H_0\) if \(|Z_{H_0}|\geq z_{\alpha/2}\)
(Here \(z_\alpha\) denotes the value of a \(N(0,1)\) variable with upper-tail area \(\alpha\).)
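Applying the left-tailed rule to the Gallup example, assuming scipy for the normal quantile:

```python
# Rejection-region check for Ha: p < p0 at alpha = 0.05.
from scipy.stats import norm

alpha = 0.05
z_alpha = norm.ppf(1 - alpha)   # about 1.645
z_obs = -1.876                  # test statistic from the example
print(z_obs <= -z_alpha)        # True: reject H0 at the 0.05 level
```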
ii. p-value:
- Using a p-value conveys the strength of the evidence against \(H_0\).
- The p-value is the probability, computed assuming \(H_0\) is true, of observing a test statistic at least as extreme as the one actually observed.
With \(Z\sim N(0,1)\):
\(H_a:p>p_0\): p-value \(=P(Z\geq Z_{H_0})\)
\(H_a:p<p_0\): p-value \(=P(Z\leq Z_{H_0})\)
\(H_a:p\neq p_0\): p-value \(=2P(Z\geq |Z_{H_0}|)\)
Reject \(H_0\) when the p-value \(\leq\alpha\).
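The corresponding p-value for the Gallup example, again assuming scipy:

```python
# p-value for the left-tailed Gallup test.
from scipy.stats import norm

z_obs = -1.876
p_value = norm.cdf(z_obs)       # P(Z <= z_obs), since Ha: p < p0
print(round(p_value, 4))        # about 0.0303 <= alpha = 0.05, so reject H0
```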