Introduction to Hypothesis Testing

Example: A well-established genetic model suggests that a particular experimental cross of lupine should have blue flowers with probability 0.75 and white flowers with probability 0.25. In an experiment with 200 seeds, 142 are blue-flowering. Is this consistent with the model?

The hypothesized value is \(p=0.75\).

The observed statistic is \(\hat p={142\over 200}=0.71\).
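As a rough first look at the lupine data, the following minimal sketch compares the observed count with what the model predicts. It assumes the number of blue-flowering plants is binomial with \(n=200\) and \(p=0.75\); the variable names are illustrative only.

```python
import math

n = 200          # seeds in the experiment (from the example)
p0 = 0.75        # probability of blue flowers under the genetic model
observed = 142   # blue-flowering plants observed

p_hat = observed / n                  # sample proportion
expected = n * p0                     # expected blue count if the model holds
sd = math.sqrt(n * p0 * (1 - p0))     # binomial standard deviation of the count
distance_in_sd = (observed - expected) / sd

print(f"p_hat = {p_hat:.2f}")
print(f"expected count = {expected:.0f}, sd = {sd:.2f}")
print(f"observed count is {distance_in_sd:.2f} standard deviations from expected")
```

The observed count falls a little more than one standard deviation below the expected 150, which is the kind of gap a hypothesis test is designed to judge.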

A statistical hypothesis test involves two opposing hypotheses.

*   The alternative hypothesis \(H_a\) often states that there is an effect or a difference; the null hypothesis \(H_0\) states that there is none.

*   Test procedures do not treat the hypotheses equally.

\(H_0:p=0.75\), \(H_a:p\neq 0.75\)

Example:

A hypothesis test has two possible conclusions: reject \(H_0\) or do not reject \(H_0\).

Since we are using sample data to draw a conclusion about a population parameter, we may reach an incorrect conclusion (by random chance).

| Decision \ Truth | \(H_0\) is true | \(H_0\) is false (\(H_a\) is true) |
|---|---|---|
| Do not reject \(H_0\) | True negative | Type II error (false negative) |
| Reject \(H_0\) | Type I error (false positive) | True positive |
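
The probabilities of the two error types are conventionally written \(\alpha\) and \(\beta\). The notation below is the standard one and is assumed here, not taken from these notes; \(\alpha\) is the quantity usually called the significance level.

\[ \alpha = P(\text{reject } H_0 \mid H_0 \text{ is true}), \qquad \beta = P(\text{do not reject } H_0 \mid H_0 \text{ is false}) \]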

A (very) simplified example: Suppose we receive a large shipment of parts. We take a random sample of parts in the shipment in order to determine whether to accept or reject the shipment.

\(H_0\): shipment of parts is good

\(H_a\): shipment of parts is bad

Can we prevent Type I and Type II errors from happening?

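Example: Consider a criminal trial, where \(H_0\): the defendant is innocent and \(H_a\): the defendant is guilty. A Type I error convicts an innocent defendant; a Type II error acquits a guilty one.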

Hypothesis Tests

A hypothesis test allows us to assess the evidence provided by the data in favor of some claim about an unknown population parameter.

Procedure:

  1. State the hypotheses.

  2. Compute a test statistic.

  3. Reach a conclusion to the test, using either
     i.  a rejection rule or region, or
     ii. a p-value.

  4. State your conclusion in the context of the problem.

Hypothesis Tests for Proportions

Example: In 1995, 40% of adults aged 18 years or older reported that they had “a great deal” of confidence in the public schools. On June 1, 2005, the Gallup Organization released results of a poll in which 372 of 1004 adults aged 18 years or older stated that they had “a great deal” of confidence in public schools. Does the evidence suggest at the \(\alpha\) = 0.05 significance level that the proportion of adults aged 18 years or older having “a great deal” of confidence in the public schools is lower in 2005 than in 1995?

\(p=\) proportion of adults aged 18 years or older having “a great deal” of confidence in the public schools.

Let \(X\) be the number of adults, among the \(n=1004\) sampled, who report “a great deal” of confidence in the public schools. Then \(X\sim Bin(n,p)\), and \(\hat p={X\over n}\) is the sample proportion.

  1. State the hypotheses

Let \(p_0\) denote the hypothesized value of \(p\); here \(p_0=0.4\).

\[ \begin{array}{lll} H_0:p=p_0 & H_0:p=p_0 & H_0:p=p_0\\ H_a:p>p_0 & H_a:p<p_0 & H_a:p\neq p_0 \end{array} \]

The first is called a one-sided right (upper-tailed) test.

The second is called a one-sided left (lower-tailed) test.

The third is called a two-sided (two-tailed) test.

Example:

\(H_0:p=0.4\)

\(H_a:p<0.4\)

  2. Compute the test statistic

\[ Z_{H_0}={\hat p-p_0\over \sqrt{p_0(1-p_0)\over n}} \]

\(\sqrt{p_0(1-p_0)\over n}\) is the standard error of \(\hat p\) when \(H_0\) is true.

Condition: \(np_0\geq 5\) and \(n(1-p_0)\geq5\)

Example:

\[ \hat p={372\over 1004}\approx 0.371\\ Z_{H_0}={\hat p-p_0\over \sqrt{p_0(1-p_0)\over n}}={0.371-0.4\over\sqrt{0.4(1-0.4)\over 1004} }\approx -1.876 \]
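
The calculation above can be sketched in a few lines of Python. The function name `one_proportion_z` is hypothetical and chosen only for illustration; the code simply applies the test-statistic formula and the sample-size condition stated in these notes.

```python
import math

def one_proportion_z(x, n, p0):
    """z statistic for H0: p = p0, given x successes in n trials."""
    # Condition from the notes for using the normal approximation.
    if n * p0 < 5 or n * (1 - p0) < 5:
        raise ValueError("need n*p0 >= 5 and n*(1 - p0) >= 5")
    p_hat = x / n                          # sample proportion
    se = math.sqrt(p0 * (1 - p0) / n)      # standard error of p_hat under H0
    return (p_hat - p0) / se

# Gallup example: 372 of 1004 adults, hypothesized value p0 = 0.4.
z = one_proportion_z(372, 1004, 0.4)
print(round(z, 3))
```

Because the code keeps \(\hat p\) unrounded, it gives a value near \(-1.91\) rather than the \(-1.876\) obtained by hand after rounding \(\hat p\) to 0.371. The difference comes only from rounding.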

  3. Reach a conclusion
i.    Rejection rule or region:

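  Reject $H_0$ at significance level $\alpha$ in the following cases, where $z_{\alpha}$ is the upper $\alpha$ critical value of the standard normal distribution:

  $H_a:p>p_0$: reject if $Z_{H_0}\geq z_{\alpha}$

  $H_a:p<p_0$: reject if $Z_{H_0}\leq -z_{\alpha}$

  $H_a:p\neq p_0$: reject if $|Z_{H_0}|\geq z_{\alpha/2}$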

ii.   p-value:

  -   Using a p-value conveys the strength of the evidence against $H_0$.

  -   The p-value is the probability, computed assuming $H_0$ is true, of observing a test statistic at least as extreme as the one actually observed.

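  With $Z\sim N(0,1)$ and $Z_{H_0}$ the observed test statistic:

  $H_a:p>p_0$: p-value $=P(Z\geq Z_{H_0})$

  $H_a:p<p_0$: p-value $=P(Z\leq Z_{H_0})$

  $H_a:p\neq p_0$: p-value $=2P(Z\geq |Z_{H_0}|)$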
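
A minimal sketch of the p-value calculation for the three alternatives, assuming the test statistic is approximately standard normal under $H_0$: the p-value is the upper tail, the lower tail, or twice the tail beyond $|Z_{H_0}|$. The function name `p_value` and the labels `"greater"`, `"less"`, and `"two-sided"` are hypothetical, chosen for this illustration; `NormalDist` comes from Python's standard `statistics` module.

```python
from statistics import NormalDist

def p_value(z, alternative):
    """p-value of an observed z statistic under a standard normal null distribution."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":      # H_a: p > p0, upper tail
        return 1 - std_normal.cdf(z)
    if alternative == "less":         # H_a: p < p0, lower tail
        return std_normal.cdf(z)
    if alternative == "two-sided":    # H_a: p != p0, both tails
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

alpha = 0.05
z = -1.876                               # statistic from the worked Gallup example
p = p_value(z, "less")                   # lower-tailed test, H_a: p < 0.4
critical = NormalDist().inv_cdf(alpha)   # lower-tail critical value, about -1.645
reject = p <= alpha                      # same decision as checking z <= critical

print(f"p-value = {p:.4f}, critical value = {critical:.3f}, reject H0: {reject}")
```

For the Gallup example the lower-tail p-value is roughly 0.03, below $\alpha=0.05$, so $H_0$ is rejected: the data suggest that the proportion with “a great deal” of confidence in the public schools was lower in 2005 than in 1995. A two-sided test on the same statistic would not reject at this level, which shows why the alternative must be chosen before looking at the data.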