recap from last week.

sampling with replacement

independent Bernoulli trials

\(X\sim Ber(p)\), \(E(X)=p\), \(Var(X)=p(1-p)\), \(P(X=1)=p\)

\(Y\sim Bin(n,p)\), \(P(Y=k)=\begin{pmatrix}n\\k\end{pmatrix}p^k(1-p)^{n-k},\ k=0,1,2,\ldots,n\), \(E(Y)=np\), \(Var(Y)=np(1-p)\)

\(X\sim Geometric(p)\), \(P(X=k)=(1-p)^{k-1}p\), \(k=1,2,3,\ldots\), \(E(X)={1\over p}\), \(Var(X)={1-p\over p^2}\)

\(Y\sim NB(r,p)\) (negative binomial distribution), \(P(Y=k)={\begin{pmatrix}k-1\\r-1\end{pmatrix}}p^{r}(1-p)^{k-r},\ k=r,r+1,\ldots\), \(E(Y)={r\over p}\), \(Var(Y)={r(1-p)\over p^2}\)
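A minimal sketch for checking these PMFs numerically in R. It uses the standard forms, in which the negative binomial PMF carries \(p^r\). The values of p, n, k and r are arbitrary illustrations, not taken from the notes. One caveat: `dgeom()` and `dnbinom()` count the number of failures rather than the number of trials, so the argument has to be shifted by 1 and by r respectively.

```r
# Illustrative values (assumptions, not from the notes)
p <- 0.3   # success probability
n <- 10    # number of binomial trials
k <- 4     # value at which each PMF is evaluated
r <- 2     # number of successes required (negative binomial)

# Binomial: k successes in n trials
binom_manual <- choose(n, k) * p^k * (1 - p)^(n - k)
binom_r      <- dbinom(k, size = n, prob = p)

# Geometric: first success on trial k.
# dgeom() counts failures before the first success, hence k - 1.
geom_manual <- (1 - p)^(k - 1) * p
geom_r      <- dgeom(k - 1, prob = p)

# Negative binomial: r-th success on trial k.
# dnbinom() counts failures before the r-th success, hence k - r.
nb_manual <- choose(k - 1, r - 1) * p^r * (1 - p)^(k - r)
nb_r      <- dnbinom(k - r, size = r, prob = p)

# Each pair should agree up to floating-point error
print(rbind(binomial  = c(manual = binom_manual, builtin = binom_r),
            geometric = c(manual = geom_manual,  builtin = geom_r),
            negbinom  = c(manual = nb_manual,    builtin = nb_r)))
```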

sampling without replacement

p changes from draw to draw, so the draws are dependent on each other

\(X\sim HyperG (M_1,M_2,n)\), \(P(X=k)={\begin{pmatrix}M_1\\k\end{pmatrix}\begin{pmatrix}M_2\\n-k\end{pmatrix}\over\begin{pmatrix}M_1+M_2\\n\end{pmatrix} },\ k=\max(0,n-M_2),\ldots,\min(n,M_1)\), \(E(X)={nM_1\over N}\), \(Var(X)={nM_1\over N}(1-{M_1\over N})({N-n\over N-1})\), where \(N=M_1+M_2\) is the population size, n is the number of selected items, and N-n is the number of non-selected items.
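As a sanity check, the sketch below evaluates the hypergeometric PMF from the binomial coefficients and compares it with `dhyper()`. It also recovers the mean and variance from the PMF. The small population (M1 = 5, M2 = 10, n = 4) is a hypothetical choice for illustration only, and the support is taken as max(0, n - M2) through min(n, M1).

```r
# Hypothetical small population (assumed values, not from the notes)
M1 <- 5    # items of the first type
M2 <- 10   # items of the second type
n  <- 4    # number of items drawn without replacement
N  <- M1 + M2

# Support of X: max(0, n - M2), ..., min(n, M1)
k <- max(0, n - M2):min(n, M1)

# PMF from the formula and from R's built-in
pmf_manual <- choose(M1, k) * choose(M2, n - k) / choose(N, n)
pmf_r      <- dhyper(k, M1, M2, n)

# Mean and variance computed directly from the PMF
mean_from_pmf <- sum(k * pmf_manual)
var_from_pmf  <- sum((k - mean_from_pmf)^2 * pmf_manual)

# Closed-form mean and variance
mean_formula <- n * M1 / N
var_formula  <- n * (M1 / N) * (1 - M1 / N) * (N - n) / (N - 1)

print(c(mean_from_pmf = mean_from_pmf, mean_formula = mean_formula))
print(c(var_from_pmf = var_from_pmf, var_formula = var_formula))
```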

The hypergeometric distribution can sometimes be approximated using the binomial distribution.

When the population size N is large relative to the sample size n, the difference between sampling with and without replacement is very small, so X is approximately \(Bin(n,p)\) with \(p=M_1/N\).

Example: Suppose M1 = 100, M2 = 900, and n = 25. Find P(X = 3).

#Using the hypergeometric distribution:
dhyper(3,100,900,25)
## [1] 0.229574
#Using the binomial approximation:
dbinom(3,25,0.1)
## [1] 0.2264973
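To see how the approximation depends on the population size, the sketch below repeats the comparison for several values of N. It keeps the same sample size (n = 25), the same k = 3, and the same proportion M1/N = 0.1. The list of population sizes is an assumption chosen for illustration.

```r
n <- 25                            # sample size, as in the example
k <- 3                             # evaluate P(X = 3)
N_values <- c(100, 1000, 10000)    # assumed population sizes

abs_err <- sapply(N_values, function(N) {
  M1 <- N %/% 10                   # 10% of the population is of type 1
  M2 <- N - M1
  # absolute gap between the exact and the approximate probability
  abs(dhyper(k, M1, M2, n) - dbinom(k, n, M1 / N))
})

print(data.frame(N = N_values, abs_err = abs_err))
```

The gap should shrink as N grows, since removing 25 items changes the composition of a large population very little.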

Poisson distribution

The PMF:

\[ p(x)=P(X=x)={e^{-\lambda}\lambda^x\over x!},~~~~~~~x=0,1,2,3,\ldots,~~~\lambda>0 \]

The mean and variance for \(X\sim Poisson(\lambda)\) are

\[ E(X)=\lambda~~~~~~~\sigma_X^2=\lambda \]
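A quick numerical check of the PMF, mean and variance, assuming \(\lambda=3\) for illustration. The infinite sum is truncated at x = 100, which is an assumption: the tail beyond that point is negligible for this value of \(\lambda\).

```r
lambda <- 3      # illustrative rate (assumption)
x <- 0:100       # truncated support; the remaining tail is negligible here

# PMF from the formula and from R's built-in
pmf_manual <- exp(-lambda) * lambda^x / factorial(x)
pmf_r      <- dpois(x, lambda)

# Mean and variance computed from the PMF
mean_x <- sum(x * pmf_r)
var_x  <- sum((x - mean_x)^2 * pmf_r)

print(c(mean = mean_x, variance = var_x))   # both should be close to lambda
```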

Example: Suppose that a person taking Vitamin C supplements contracts an average of three colds per year, and that this average increases to five colds per year for persons not taking Vitamin C supplements. Suppose further that the number of colds a person contracts in a year is a Poisson random variable.

  1. Find the probability of no more than two colds for a person taking supplements and a person not taking supplements.

\(X_1\)= number of colds for patient taking VC, \(X_1\sim Poisson(3)\)

\(X_2\)= number of colds for patient not taking VC, \(X_2\sim Poisson(5)\)

\(P(X_1\le 2)=P(X_1=0)+P(X_1=1)+P(X_1=2)=0.423\)

# ppois(x, lambda) returns P(X <= x): the first argument is x, the second is lambda
ppois(2,3)
## [1] 0.4231901

\(P(X_2\le 2)=0.125\)

ppois(2,5)
## [1] 0.124652
  2. Suppose 70% of the population takes Vitamin C supplements. Find the probability that a randomly selected person will have no more than two colds in a given year.

\(A\) = taking VC, \(A^c\)= not taking VC

X = number of colds of a randomly selected person

\[ \begin{aligned} P(X\le 2)&=P(X\le2,A)+P(X\le2,A^c)\\ &=P(X\le 2|A)P(A)+P(X\le 2|A^c)P(A^c)\\ &=0.423\times 0.7+0.125\times 0.3\\ &=0.334 \end{aligned} \]
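The whole example can be reproduced in a few lines of R. This is a sketch only; the variable names are illustrative. It also shows that `ppois(2, lambda)` is the sum of the PMF over 0, 1, 2.

```r
p_vc <- 0.7                          # P(A): proportion taking Vitamin C

# Conditional probabilities of at most two colds
p_le2_vc <- ppois(2, lambda = 3)     # P(X <= 2 | A)
p_le2_no <- ppois(2, lambda = 5)     # P(X <= 2 | A^c)

# The CDF is the sum of the PMF over 0, 1, 2
p_le2_vc_sum <- sum(dpois(0:2, lambda = 3))

# Law of total probability
p_total <- p_le2_vc * p_vc + p_le2_no * (1 - p_vc)
print(round(p_total, 3))             # 0.334
```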

Exponential distribution

PDF:

\[ f(x)=\begin{cases}\lambda e^{-\lambda x},~~~x\geq0\\0,~~~~~~~~~~~x<0\end{cases} \]

CDF:

For \(x\geq 0\)

\[ \begin{aligned} F(x)=P(X\le x)&=\int^x_{-\infty}f(t)\,dt=\int^x_0 \lambda e^{-\lambda t}\,dt\\ &=\int^{\lambda x}_0 e^{-u}\,du~~~~~~~(\text{let } u=\lambda t)\\ &=-e^{-\lambda x}-(-e^{0})\\ &=1-e^{-\lambda x} \end{aligned} \]

The expected value and variance of \(X\sim Exp(\lambda)\) are

\[ E(X)={1\over \lambda}~~~~~~~\sigma_X^2={1\over \lambda^2} \]
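These results can be checked numerically. The sketch below assumes \(\lambda=0.5\) for illustration. It compares the closed-form CDF \(1-e^{-\lambda x}\) with `pexp()` and obtains the mean and variance by numerical integration of the PDF. `integrate()` is approximate, so small discrepancies are expected.

```r
lambda <- 0.5                        # illustrative rate (assumption)
x <- c(0.5, 1, 2, 5)                 # arbitrary evaluation points

# CDF from the closed form and from R's built-in
cdf_formula <- 1 - exp(-lambda * x)
cdf_r       <- pexp(x, rate = lambda)

# E(X) and E(X^2) by numerical integration of the PDF
mean_num   <- integrate(function(t) t   * dexp(t, rate = lambda), 0, Inf)$value
second_num <- integrate(function(t) t^2 * dexp(t, rate = lambda), 0, Inf)$value
var_num    <- second_num - mean_num^2

print(c(mean = mean_num, one_over_lambda    = 1 / lambda))
print(c(var  = var_num,  one_over_lambda_sq = 1 / lambda^2))
```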

Example: Let X be the amount of time (in minutes) a postal clerk spends with his or her customer. The time spent has an exponential distribution with λ = 0.25.

  1. Find the expected time spent with each customer.

\[ E(X)={1\over 0.25}=4 \]

  2. What is the probability the clerk spends between 2 and 4 minutes with a customer?

\[ \begin{aligned} P(2\le X<4)&=\int^4_2 \lambda e^{-\lambda x}\,dx=\int_2^4 0.25e^{-0.25x}\,dx=0.239\\ P(2\le X<4)&=F(4)-F(2)=(1-e^{-0.25\times 4})-(1-e^{-0.25\times 2})=e^{-0.5}-e^{-1}=0.239 \end{aligned} \]
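The same probability can be computed in R in three equivalent ways, sketched below: from the CDF with `pexp()`, from the closed form \(e^{-2\lambda}-e^{-4\lambda}\), and by numerically integrating the PDF.

```r
lambda <- 0.25                       # rate from the example

# 1. Difference of the CDF at the two endpoints
p_cdf <- pexp(4, rate = lambda) - pexp(2, rate = lambda)

# 2. Closed form: F(4) - F(2) = exp(-2 * lambda) - exp(-4 * lambda)
p_closed <- exp(-lambda * 2) - exp(-lambda * 4)

# 3. Numerical integration of the PDF over [2, 4]
p_int <- integrate(dexp, lower = 2, upper = 4, rate = lambda)$value

print(round(c(p_cdf, p_closed, p_int), 3))   # 0.239 0.239 0.239
```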

An exponential random variable X has the memoryless property: for \(s,t\geq 0\),

\[ P(X > s + t |X > s) = P(X > t) \]
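A short derivation sketch of this property, using only the CDF \(F(x)=1-e^{-\lambda x}\), so that \(P(X>x)=e^{-\lambda x}\) for \(x\geq 0\). The event \(\{X>s+t\}\) is contained in \(\{X>s\}\), which gives the second equality:

\[ \begin{aligned} P(X>s+t|X>s)&={P(X>s+t,~X>s)\over P(X>s)}\\ &={P(X>s+t)\over P(X>s)}\\ &={e^{-\lambda (s+t)}\over e^{-\lambda s}}\\ &=e^{-\lambda t}=P(X>t) \end{aligned} \]

As an illustration with the postal clerk example (\(\lambda=0.25\)): given that the clerk has already spent 2 minutes with a customer, the probability of spending more than 4 further minutes is \(P(X>6|X>2)=P(X>4)=e^{-1}\approx 0.368\).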