Binomial Distribution

The PMF:

\[ p(y)=P(Y=y)=\begin{pmatrix}n\\y\end{pmatrix}p^y(1-p)^{n-y},~~~y=0,1,2,\ldots,n \]

Example: Suppose 70% of all purchases in a certain store are made with a credit card. Let Y denote the number of purchases made with a credit card among the next 10 purchases. What is \(P(5\leq Y \leq 8)\)?

\[ Y\sim Bin(10,0.7) \]

p=0.7
n=10
result=0
for (y in 5:8){
# R will compute the PMF, P(Y=y):
  result=result+dbinom(y,n,p)
}
result
## [1] 0.8033427
# R will also compute the CDF, P(Y<=y):
pbinom(8,n,p)-pbinom(4,n,p)
## [1] 0.8033427

The expected value and variance of \(Y\sim Bin(n,p)\) are

\[ E(Y)=np~~~~~~~\sigma_Y^2=np(1-p) \]

Example: What is the expected number of credit card purchases among the next 10 purchases? What is the variance?

\[ E(Y)=n\times p=10\times 0.7=7\\ Var(Y)=10\times 0.7\times(1-0.7)=2.1 \]
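As a quick check, the mean and variance formulas can be compared against sums taken directly over the PMF. The following is a minimal sketch for the credit card example; the variable names (`support`, `pmf`, `mean_y`, `var_y`) are illustrative and not part of the original notes.

```r
# Credit card example: Y ~ Bin(10, 0.7)
n <- 10
p <- 0.7

support <- 0:n                   # all possible values of Y
pmf <- dbinom(support, n, p)     # P(Y = y) for each y in the support

# Mean and variance computed from their definitions
mean_y <- sum(support * pmf)
var_y <- sum((support - mean_y)^2 * pmf)

# Compare with the closed forms np and np(1-p)
all.equal(mean_y, n * p)
all.equal(var_y, n * p * (1 - p))

# P(5 <= Y <= 8) as a single vectorized sum instead of a loop
sum(dbinom(5:8, n, p))
```

Because `dbinom` is vectorized over its first argument, the loop over `5:8` and the single `sum` call give the same probability.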

Geometric Distribution

Constructed from Bernoulli trials: X is the number of trials needed to obtain the first success.

The PMF:

\[ p(x)=P(X=x)=(1-p)^{x-1}p,~~~~~x=1,2,3,\ldots \]

The factor \((1-p)^{x-1}\) is the probability of failure on each of the first x-1 trials.

The CDF: \[ F(x)=P(X\le x)=1-(1-p)^x~~~~~x=1,2,3,... \]
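One way to see where the CDF comes from is to sum the PMF as a finite geometric series (a short derivation added for illustration, using only the PMF given above):

\[ F(x)=\sum_{k=1}^{x}(1-p)^{k-1}p=p\cdot{1-(1-p)^x\over 1-(1-p)}=1-(1-p)^x \]

Equivalently, \(P(X>x)=(1-p)^x\) is the probability that the first x trials are all failures.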

The expected value and variance of \(X\sim Geo(p)\) are

\[ E(X)={1\over p}~~~~~~~\sigma_X^2={1-p \over p^2} \]

Example: Suppose you need to find a store that carries a special printer ink. You know that of the stores that carry printer ink, 15% of them carry the special ink. You randomly call each store until one has the ink you need.

  1. What is the probability that you first find the special ink at the third store you call?

Let X denote the number of calls until you find the ink.

\[ X\sim Geo(0.15)\\ P(X=3)=(1-0.15)^2\times 0.15=0.108 \]

  2. What is the probability that you first find the special ink in 3 calls or fewer?

\[ P(X\le 3)=1-(1-0.15)^3=0.386 \]

  3. What is the expected number of calls until you first find the special ink? What is the variance?

\[ E(X)={1\over 0.15}=6.67\\ \sigma_X^2={1-0.15\over 0.15^2}=37.78 \]
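The three answers above can be reproduced in R. This is a minimal sketch: note that R's `dgeom` and `pgeom` are parameterized by the number of failures before the first success, not the number of trials, so a first success on call x corresponds to x - 1 failures.

```r
p <- 0.15

# A first success on call 3 means 3 - 1 = 2 failures beforehand
dgeom(3 - 1, p)   # P(X = 3)
pgeom(3 - 1, p)   # P(X <= 3)

# Same values from the PMF and CDF formulas in the notes
(1 - p)^2 * p
1 - (1 - p)^3

# Mean and variance of the number of calls
1 / p
(1 - p) / p^2
```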

Negative Binomial Distribution

The PMF:

\[ p(y)=P(Y=y)=\begin{pmatrix}y-1\\r-1\end{pmatrix}p^r(1-p)^{y-r},~~~~~y=r,r+1,r+2,\ldots \]

Example: An oil company conducts a geological study that indicates that an exploratory oil well should have a 20% chance of striking oil.

  1. What is the probability the first strike comes on the third well drilled?

Let X denote the number of wells drilled until the first strike.

\[ X\sim Geo(0.2)\\ P(X=3)=(1-0.2)^2\times 0.2=0.128 \]

  2. What is the probability that the third strike comes on the seventh well drilled?

Let Y denote the number of wells drilled until the third strike.

\[ Y\sim NB(3,0.2)\\ P(Y=7)=\begin{pmatrix}6\\2\end{pmatrix}(0.2)^3(1-0.2)^{7-3}=0.049 \]

y=7
r=3
# dnbinom counts failures, not trials: y-r is the number of failures before the r-th success
p=0.2
dnbinom(y-r,r,p)
## [1] 0.049152

The expected value and variance of \(Y\sim NB(r,p)\) are

\[ E(Y)={r\over p}~~~~~~~\sigma_Y^2={r(1-p)\over p^2} \]

  3. What is the expected number of wells to be drilled until striking oil for the third time?

\[ \mu_Y=E(Y)={3\over 0.2}=15 \]
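The well-drilling answers can be checked in R. This is a minimal sketch; the seed, the simulation size, and variable names such as `wells` are illustrative choices rather than part of the original example. As with `dnbinom`, `rnbinom` returns the number of failures, so r is added to convert to the number of wells drilled.

```r
p <- 0.2
r <- 3

# Geometric as a special case of the negative binomial: Geo(p) = NB(1, p).
# Both functions take the number of failures, here 3 - 1 = 2.
dgeom(2, p)
dnbinom(2, 1, p)

# P(Y = 7) from the PMF written in terms of trials
y <- 7
choose(y - 1, r - 1) * p^r * (1 - p)^(y - r)

# Closed-form mean and variance of the number of wells
mean_y <- r / p
var_y <- r * (1 - p) / p^2
mean_y
var_y

# Simulation check: failures + r = total wells drilled
set.seed(1)
wells <- rnbinom(100000, r, p) + r
mean(wells)
var(wells)
```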

Hypergeometric Distribution

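The negative binomial distribution above extends the geometric distribution, since \(Geo(p)=NB(1,p)\). The hypergeometric distribution instead describes sampling n items without replacement from a finite population of \(M_1\) failures and \(M_2\) successes.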

The PMF:

\[ p(x)=P(X=x)={\begin{pmatrix}M_1\\x\end{pmatrix}\begin{pmatrix}M_2\\n-x\end{pmatrix}\over\begin{pmatrix}M_1+M_2\\n\end{pmatrix}} \]

\(\begin{pmatrix}M_1\\x\end{pmatrix}\) is the number of ways of choosing x failures from the \(M_1\) failures in the population.

\(\begin{pmatrix}M_2\\n-x\end{pmatrix}\) is the number of ways of choosing n-x successes from the \(M_2\) successes in the population.

\(\begin{pmatrix}M_1+M_2\\n\end{pmatrix}\) is the number of ways of choosing n items from the \(M_1+M_2\) items in the population.

The sample space of X:

\[ S_X=\{\max(0,n-M_2),\ldots,\min(n,M_1)\} \]

Example: A crate contains 50 light bulbs, of which 5 are defective and 45 are not. A quality control inspector randomly samples 4 bulbs without replacement. Let X be the number of defective bulbs in the sample.

\[ X\sim HyperG(M_1,M_2,n),M_1=5,M_2=45,n=4\\ S_X=\{0,1,2,3,4\},0=max(0,4-45),4=min(4,5) \]

Find the probability that fewer than 2 bulbs are defective.

\[ P(X=x)={\begin{pmatrix}5\\x\end{pmatrix}\begin{pmatrix}45\\4-x\end{pmatrix}\over\begin{pmatrix}50\\4\end{pmatrix}},~~~x=0,1,2,3,4\\ P(X<2)=P(X=0)+P(X=1)=P(X\leq 1) \]

dhyper(0,5,45,4)+dhyper(1,5,45,4)
## [1] 0.9550369
phyper(1,5,45,4)
## [1] 0.9550369

The expected value and variance of \(X\sim HyperG(M_1,M_2,n)\) are

\[ E(X)={nM_1\over N}~~~~~~~\sigma_X^2={nM_1\over N}\left(1-{M_1\over N}\right)\left({N-n\over N-1}\right) \]

where \(N=M_1+M_2\) is the population size.
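For the light bulb example, these formulas can be compared with sums taken directly over the PMF. The following is a minimal sketch; the variable names (`support`, `pmf`, `mean_x`, `var_x`) are illustrative. In `dhyper(x, m, n, k)`, the arguments are the count of interest, the two group sizes, and the sample size, matching \(x, M_1, M_2, n\) here.

```r
M1 <- 5    # defective bulbs
M2 <- 45   # non-defective bulbs
n <- 4     # sample size
N <- M1 + M2

# Support of X: max(0, n - M2), ..., min(n, M1)
support <- max(0, n - M2):min(n, M1)
pmf <- dhyper(support, M1, M2, n)

# Mean and variance computed from their definitions
mean_x <- sum(support * pmf)
var_x <- sum((support - mean_x)^2 * pmf)
mean_x
var_x

# Closed forms for comparison
n * M1 / N
n * M1 / N * (1 - M1 / N) * ((N - n) / (N - 1))
```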