Recap on exams

The exponential distribution is often used to model ages or lifetimes. The normal distribution is a poor model for ages because its support includes negative values.

Law of Large Numbers

The LLN tells us how well \(\bar X\) can estimate \(E(X)\).

The Law of Large Numbers (LLN): Let \(X_1,...,X_n\) be iid and let g be a function such that \(-\infty<E[g(X_1)]<\infty\). Then for any \(\epsilon > 0\)

\[ P(|{1\over n}\sum^n_{i=1}g(X_i)-E[g(X_1)]|>\epsilon)\to 0~as~n\to \infty \]

For large n, \({1\over n}\sum^n_{i=1} g(X_i)\) is very close to \(E[g(X_1)]\) with high probability.

  1. For very large n, there is a high probability that \(\bar X={1\over n}\sum X_i\) will be close to \(E(X_1)\).

  2. It is therefore reasonable to use \(\bar X\) to estimate \(E(X_1)\).

In particular, the sample proportion \(\hat p\) converges in probability to \(p\).
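A quick simulation sketch of this convergence (the choice p = 0.3 is arbitrary, for illustration only):

```r
# LLN sketch: the sample proportion approaches p as n grows.
# p = 0.3 is an arbitrary illustrative choice.
set.seed(1)
p <- 0.3
for (n in c(10, 1000, 100000)) {
  phat <- mean(rbinom(n, 1, p))   # proportion of successes in n Bernoulli trials
  cat("n =", n, "  phat =", phat, "\n")
}
```

As n grows, \(\hat p\) lands closer and closer to 0.3.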

Note: For many distributions \(-\infty<E(X)<\infty\), but there are exceptions.

Example: Let X be a random variable with PDF \(f(x)={1\over x^2}, 1\leq x<\infty\).

\[ E(X)=\int^{\infty}_{1}x\cdot {1\over x^2}~dx=\int^{\infty}_{1}{1\over x}~dx=\infty \]
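We can see what goes wrong by simulation. The CDF is \(F(x)=1-1/x\), so the inverse-CDF method gives draws as \(X = 1/U\) with \(U\sim Uniform(0,1)\); the running sample mean then never settles down:

```r
# When E(X) is infinite, the sample mean does not converge.
# Draw from f(x) = 1/x^2 on [1, Inf): F(x) = 1 - 1/x, so X = 1/U.
set.seed(2)
x <- 1 / runif(1e5)
running_mean <- cumsum(x) / seq_along(x)
# The running mean keeps drifting instead of stabilizing:
print(running_mean[c(100, 1000, 10000, 100000)])
```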

Limitation of LLN:

The LLN does not tell us the speed of convergence, i.e., how close \(\bar X\) is to \(E(X)\) for a given \(n\).

Convolutions

The convolution of two independent random variables refers to the distribution of their sum.

If X and Y are independent, the distribution of X + Y is the convolution of the distributions of X and Y.

Example: Let \(X \sim Bin(n_1,p)\) and let \(Y \sim Bin(n_2,p)\). If X and Y are independent, then \(X + Y \sim Bin(n_1 + n_2,p)\).

Let \(S_i\) and \(T_j\) be indicator variables (1 for success, 0 for failure) in independent Bernoulli(p) trials.

\[ X=\sum^{n_1}_{i=1}S_i,\quad Y=\sum^{n_2}_{j=1}T_j \]

\[ X+Y=\sum^{n_1}_{i=1}S_i+\sum^{n_2}_{j=1}T_j\sim Bin(n_1+n_2,p) \]
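This identity can be checked numerically by convolving the two pmfs directly (the values of \(n_1\), \(n_2\), and \(p\) below are arbitrary):

```r
# Check that convolving the Bin(n1, p) and Bin(n2, p) pmfs
# reproduces the Bin(n1 + n2, p) pmf.
n1 <- 5; n2 <- 7; p <- 0.4
conv <- sapply(0:(n1 + n2), function(j) {
  sum(dbinom(0:j, n1, p) * dbinom(j:0, n2, p))  # P(X + Y = j)
})
all.equal(conv, dbinom(0:(n1 + n2), n1 + n2, p))  # TRUE
```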

Other examples: Let X and Y be independent random variables.

  1. If \(X \sim Poisson(\lambda_1)\) and \(Y \sim Poisson(\lambda_2)\), then \(X+Y \sim Poisson(\lambda_1 +\lambda_2)\).

  2. If \(X \sim N(\mu_1,\sigma^2_1)\) and \(Y \sim N(\mu_2,\sigma^2_2)\), then \(X + Y \sim N(\mu_1+ \mu_2,\sigma^2_1 + \sigma^2_2)\).
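The Poisson case can be verified the same way as the binomial one; since \(X+Y=j\) only involves \(k=0,...,j\), the finite convolution sum is exact (the rates below are arbitrary):

```r
# Convolving Poisson(l1) and Poisson(l2) pmfs matches Poisson(l1 + l2).
l1 <- 2; l2 <- 3
conv <- sapply(0:30, function(j) sum(dpois(0:j, l1) * dpois(j:0, l2)))
all.equal(conv, dpois(0:30, l1 + l2))  # TRUE
```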

Result 1: Let \(X_1,...,X_n\) be independent random variables with \(X_i \sim N(\mu_i,\sigma^2_i)\) and let \(Y = a_1X_1 + ···+ a_nX_n\) for constants \(a_1,...,a_n\). Then

  1. \(Y \sim N(\mu_Y ,\sigma^2_Y )\)

  2. \(\mu_Y = a_1\mu_1 + ···+ a_n\mu_n\)

  3. \(\sigma^2_Y = a^2_1\sigma^2_1 + ···+ a^2_n\sigma^2_n\)
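A Monte Carlo sketch of Result 1, with arbitrarily chosen coefficients and parameters: for \(Y = 2X_1 - 3X_2\) with \(X_1\sim N(1,4)\) and \(X_2\sim N(2,9)\), Result 1 gives \(\mu_Y = 2(1)-3(2) = -4\) and \(\sigma^2_Y = 4(4)+9(9) = 97\).

```r
# Simulated mean and variance of Y = 2*X1 - 3*X2 should be
# close to -4 and 97 (see the calculation above).
set.seed(3)
x1 <- rnorm(1e5, mean = 1, sd = 2)   # variance 4
x2 <- rnorm(1e5, mean = 2, sd = 3)   # variance 9
y <- 2 * x1 - 3 * x2
c(mean(y), var(y))                   # close to -4 and 97
```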

Example: History suggests that scores on the Math portion of the SAT are normally distributed with a mean of 529 and a variance of 5732. History also suggests that scores on the Verbal portion of the SAT are normally distributed with a mean of 474 and a variance of 6368. Select two students at random. Let X denote the first student’s Math score, and let Y denote the second student’s Verbal score. What is P(X > Y )?

Since X and Y are independent, \(X-Y\) is normally distributed with

\[ \mu=E(X)-E(Y)=529-474=55 \]

\[ \sigma^2=Var(X)+Var(Y)=5732+6368=12100 \]

Therefore

\[ P(X>Y)=P(X-Y>0)=P\Big(Z>{0-55\over \sqrt{12100}}\Big)=P(Z>-0.5)\approx 0.6915 \]
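With the numbers above, \(X-Y\sim N(55, 110^2)\), and the probability can be computed directly:

```r
# P(X > Y) = P(X - Y > 0) with X - Y ~ N(55, 110^2)
1 - pnorm(0, mean = 55, sd = 110)   # 0.6914625
```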

Result 2: Let \(X_1,...,X_n\) be iid \(N(\mu,\sigma^2)\) and let \(\bar X\) be the sample mean. Then

\[ \bar X\sim N(\mu_{\bar X},\sigma^2_{\bar X}) \]

where \(\mu_{\bar X}=\mu\) and \(\sigma_{\bar X}^2={\sigma^2\over n}\)
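A simulation sketch of Result 2, with arbitrary parameters: sample means of \(n=25\) draws from \(N(10,4)\) should be approximately \(N(10, 4/25)\), i.e. mean 10 and variance 0.16.

```r
# Distribution of sample means of n = 25 draws from N(10, 4):
# mean should be near 10, variance near 4/25 = 0.16.
set.seed(4)
xbar <- replicate(1e4, mean(rnorm(25, mean = 10, sd = 2)))
c(mean(xbar), var(xbar))   # close to 10 and 0.16
```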

Example: Suppose we want to estimate the mean of a normal population whose variance is known to be 4. What sample size should be used to ensure that \(\bar X\) lies within 0.5 units of the population mean with probability 0.90? \[ \bar X\sim N(\mu,{4\over n})\to {\bar X-\mu\over \sqrt{4\over n}}\sim N(0,1) \]

\[ P\Big({-0.5\over \sqrt{4\over n}}<{\bar X-\mu\over \sqrt{4\over n}}<{0.5\over \sqrt{4\over n}}\Big)=0.90 \]

Setting \({0.5\over \sqrt{4/n}}=z_{0.95}=1.645\) gives \(\sqrt n={1.645\times 2\over 0.5}\approx 6.58\), so \(n\approx 43.3\); rounding up, \(n=44\).

In R:

qnorm(0.95,0,1)
## [1] 1.644854
ceiling((qnorm(0.95)*2/0.5)^2)  # smallest n meeting the requirement
## [1] 44