Objectives:
(1) To identify real-life situations modelled by the negative binomial distribution.
(2) To study the relationships between the negative binomial distribution and the geometric and Poisson distributions.
(3) To find probabilities related to the negative binomial distribution.
Negative Binomial Distribution
Introduction:
Sometimes we come across a situation where we repeat trials until we get k successes (k a fixed number). For example, if we want to select 10 good articles from a lot, we go on drawing articles one by one and inspect each; the procedure continues until we get 10 good articles. In this case the number of articles inspected is a random variable, and its probability distribution is negative binomial. For k = 1 the corresponding distribution is the geometric distribution discussed earlier. Thus the negative binomial distribution is a generalisation of the geometric distribution, and the geometric distribution is a particular case of the negative binomial distribution.
Another way of looking at the negative binomial distribution is the following: the sum of two (or more) i.i.d. geometric variables is not geometric; it follows a negative binomial distribution.
Note: A random variable X following the negative binomial distribution with parameters k and p is symbolically written as X → NB(k, p).

1.2 Probability Mass Function of NB(k, p)
Suppose each trial results in one of two outcomes, success or failure, with constant probabilities p and q = 1 − p respectively at each trial. Let X denote the number of failures before the kth success; in other words, the kth success occurs at the (X + k)th trial. We derive the p.m.f. of the random variable X as follows:
P(X = x) = P(getting the kth success at the (x + k)th trial)
= P(getting k − 1 successes in the first x + k − 1 trials and the kth success at the (x + k)th trial)

Since the trials are independent,

P(X = x) = P(getting k − 1 successes in x + k − 1 trials) × P(success at the (x + k)th trial)

Note that P(getting k − 1 successes in x + k − 1 trials) is a binomial probability with parameters x + k − 1 and p, evaluated at k − 1. Hence,

$$P(X = x) = \binom{x+k-1}{k-1} p^{k-1} q^x \cdot p = \binom{x+k-1}{k-1} p^k q^x$$
Thus, the p.m.f. of the negative binomial distribution is

$$P(X = x) = \begin{cases} \dbinom{x+k-1}{k-1} p^k q^x, & x = 0, 1, 2, \ldots;\ 0 < p < 1,\ q = 1-p,\ k > 0 \\ 0, & \text{otherwise} \end{cases}$$
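This p.m.f. can be evaluated directly; the following is a minimal sketch (the function name `nb_pmf` is my own), which also confirms numerically that the probabilities sum to about 1:

```python
# Evaluate P(X = x) = C(x+k-1, k-1) p^k q^x, where X counts
# failures before the kth success.
from math import comb

def nb_pmf(x: int, k: int, p: float) -> float:
    """P(X = x) for X ~ NB(k, p): x failures before the kth success."""
    q = 1.0 - p
    return comb(x + k - 1, k - 1) * p**k * q**x

k, p = 3, 0.4
probs = [nb_pmf(x, k, p) for x in range(200)]
print(sum(probs))  # truncated sum, very close to 1
```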
Note:
1. Another form of the p.m.f. of NB(k, p). Suppose the kth success occurs at the Yth trial; then Y = X + k, and the p.m.f. of Y is

$$P(Y = y) = \begin{cases} \dbinom{y-1}{k-1} p^k q^{y-k}, & y = k, k+1, \ldots;\ 0 < p < 1,\ q = 1-p,\ k > 0 \\ 0, & \text{otherwise} \end{cases}$$

Thus,
X = the number of failures before the kth success, and
Y = the number of trials required to get the kth success = X + k.
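The shifted form can be illustrated by simulation; a hypothetical sketch (helper names are my own), comparing the empirical distribution of Y with the p.m.f. above:

```python
# Simulate Y = number of trials until the kth success and compare the
# empirical frequencies with P(Y = y) = C(y-1, k-1) p^k q^(y-k).
import random
from math import comb

def trials_until_kth_success(k: int, p: float, rng: random.Random) -> int:
    successes = trials = 0
    while successes < k:
        trials += 1
        if rng.random() < p:
            successes += 1
    return trials

rng = random.Random(42)
k, p, n = 2, 0.5, 100_000
samples = [trials_until_kth_success(k, p, rng) for _ in range(n)]
for y in range(k, k + 4):
    exact = comb(y - 1, k - 1) * p**k * (1 - p) ** (y - k)
    est = samples.count(y) / n
    print(y, round(exact, 4), round(est, 4))
```

Note that every simulated value of Y is at least k, as the support y = k, k + 1, … requires.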
2. We can verify that $\sum_{x=0}^{\infty} P(X = x) = 1$:

$$\sum_{x=0}^{\infty} P(X = x) = \sum_{x=0}^{\infty} \binom{x+k-1}{x} p^k q^x$$

Note that

$$\binom{x+k-1}{x} = (-1)^x \binom{-k}{x}$$

Hence

$$\sum_{x=0}^{\infty} P(X = x) = p^k \sum_{x=0}^{\infty} \binom{-k}{x} (-q)^x \qquad (q < 1)$$

$$= p^k (1-q)^{-k} = 1 \qquad \left(\because \sum_x \binom{n}{x} a^x = (1+a)^n,\ n \text{ may be negative}\right)$$
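The combinatorial identity $\binom{x+k-1}{x} = (-1)^x \binom{-k}{x}$ can be checked numerically; a small sketch, where the generalised-coefficient helper `gen_binom` is my own:

```python
# Check C(x+k-1, x) = (-1)^x * C(-k, x), where the generalised binomial
# coefficient is C(n, x) = n(n-1)...(n-x+1)/x! for any real n.
from math import comb, factorial

def gen_binom(n: float, x: int) -> float:
    """Generalised binomial coefficient n(n-1)...(n-x+1)/x!."""
    num = 1.0
    for i in range(x):
        num *= (n - i)
    return num / factorial(x)

k = 4
pairs = [(comb(x + k - 1, x), (-1) ** x * gen_binom(-k, x)) for x in range(8)]
for x, (lhs, rhs) in enumerate(pairs):
    print(x, lhs, rhs)  # the two columns agree
```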
3. The probabilities P(X = x) can be viewed as the terms in a binomial expansion with negative index, hence the name negative binomial distribution. This can be expressed as follows. From note (2),

$$\sum_x P(X = x) = p^k (1-q)^{-k} = \left(\frac{1-q}{p}\right)^{-k} \qquad (p = 1-q)$$

Put $P = \dfrac{q}{p}$ and $Q = \dfrac{1}{p}$, so that $\dfrac{1-q}{p} = \dfrac{1}{p} - \dfrac{q}{p} = Q - P$. Then

$$\sum_x P(X = x) = (Q - P)^{-k} \qquad (Q - P = 1)$$

Using the binomial expansion we get

$$(Q - P)^{-k} = \sum_{x=0}^{\infty} \binom{-k}{x} Q^{-k-x} (-P)^x$$

Thus $P(X = x) = \binom{-k}{x} Q^{-k-x} (-P)^x$, x = 0, 1, 2, …; Q − P = 1. Hence the probabilities are the terms in the binomial expansion of $(Q - P)^{-k}$ with −k as negative index. The p.m.f. in the form

$$P(X = x) = \binom{-k}{x} Q^{-k-x} (-P)^x, \quad x = 0, 1, 2, \ldots$$

is called the Pascal distribution. Symbolically we write X → NB(−k, P), where P = q/p.
4. Since P(X = x) is the probability of getting the kth success at the (x + k)th trial, the distribution is also called a waiting time distribution.
5. For k = 1, it is the geometric distribution.
6. In case X → B(n, p), the number of trials n is fixed and the number of successes X is a variable. On the other hand, if X → NB(k, p), the number of successes k is fixed and the number of trials is a variable.
7. $P(X = x) = \binom{x+k-1}{k-1} p^k q^x$ is mathematically well defined for fractional values of k also. However, the usual interpretation of k as the number of successes cannot then be applied.
Mean and Variance of NB(k, p)

Mean: Since

$$\sum_{x=0}^{\infty} P(X = x) = \sum_{x=0}^{\infty} \binom{x+k-1}{x} p^k q^x = 1 \qquad \ldots \text{(i)}$$

i.e.

$$\sum_{x=0}^{\infty} \binom{x+k-1}{x} q^x = p^{-k} = (1-q)^{-k}$$

The series on the L.H.S. of equation (i) is absolutely convergent, hence term-by-term differentiation w.r.t. q is valid.

$$\therefore \quad \sum_{x=0}^{\infty} \binom{x+k-1}{x} \frac{d}{dq} q^x = \frac{d}{dq} (1-q)^{-k}$$

$$\sum_{x=0}^{\infty} \binom{x+k-1}{x}\, x\, q^{x-1} = -k(1-q)^{-k-1}(-1) = k(1-q)^{-k-1}$$

Multiplying both sides by $q\,p^k$ we get

$$\sum_{x=0}^{\infty} x \binom{x+k-1}{x} p^k q^x = kq\, p^k\, p^{-k-1}$$

$$\therefore \quad E(X) = \sum_{x=0}^{\infty} x\, P(X = x) = \frac{kq}{p}$$
Variance: We first obtain E[X(X − 1)], since E(X²) cannot be obtained directly. From equation (i) we get

$$\sum_{x=0}^{\infty} \binom{x+k-1}{x} q^x = (1-q)^{-k}$$

Differentiating twice w.r.t. q,

$$\sum_{x=0}^{\infty} \binom{x+k-1}{x}\, x(x-1)\, q^{x-2} = (-k)(-k-1)(1-q)^{-k-2}(-1)^2 = k(k+1)\, p^{-k-2}$$

Multiplying both sides by $p^k q^2$ we get

$$\sum_{x=0}^{\infty} x(x-1) \binom{x+k-1}{x} p^k q^x = k(k+1)\, \frac{q^2}{p^2}$$

$$\therefore \quad E[X(X-1)] = k(k+1)\, \frac{q^2}{p^2}$$

Note that X² = X(X − 1) + X, so E(X²) = E[X(X − 1)] + E(X). Hence, using E(X) = kq/p,

$$\mathrm{Var}(X) = E(X^2) - (E(X))^2 = k(k+1)\frac{q^2}{p^2} + \frac{kq}{p} - \frac{k^2 q^2}{p^2}$$

$$= \frac{kq}{p^2}\left[(k+1)q + p - kq\right] = \frac{kq}{p^2}(q + p) = \frac{kq}{p^2}$$
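These formulas can be verified numerically against the truncated p.m.f. series; a minimal sketch (the truncation length is my own choice):

```python
# Check E(X) = kq/p and Var(X) = kq/p^2 for X ~ NB(k, p)
# against the p.m.f. C(x+k-1, k-1) p^k q^x.
from math import comb

k, p = 3, 0.4
q = 1 - p
pmf = [comb(x + k - 1, k - 1) * p**k * q**x for x in range(400)]
mean = sum(x * f for x, f in enumerate(pmf))
var = sum(x * x * f for x, f in enumerate(pmf)) - mean**2
print(mean, k * q / p)       # both ~ 4.5
print(var, k * q / p**2)     # both ~ 11.25
```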
Note: 1. If $Q = \dfrac{1}{p}$ and $P = \dfrac{q}{p}$, then

$$\text{Mean} = \frac{kq}{p} = kP, \qquad \text{Variance} = \frac{kq}{p^2} = kPQ,$$

which is similar in form to the binomial distribution.
2. A noteworthy property is that Variance of NB(k, p) > Mean of NB(k, p). Since 0 < p < 1,

$$\frac{1}{p} > 1 \;\Rightarrow\; \frac{kq}{p^2} > \frac{kq}{p} \;\Rightarrow\; \text{Variance} > \text{Mean}.$$
3. Using a procedure similar to the one used to find E(X) and E[X(X − 1)], we can find higher-order factorial moments.
M.G.F. of NB(k, p)

$$M_X(t) = E(e^{tX}) = \sum_{x=0}^{\infty} e^{tx} \binom{x+k-1}{x} p^k q^x = p^k \sum_{x=0}^{\infty} (-1)^x \binom{-k}{x} (qe^t)^x \qquad \left(\because \binom{x+k-1}{x} = (-1)^x \binom{-k}{x}\right)$$

For $qe^t < 1$, i.e. $t < -\log_e q$, the above binomial series converges.

$$\therefore \quad M_X(t) = p^k (1 - qe^t)^{-k} = \left(\frac{p}{1 - qe^t}\right)^k$$

If $P = \dfrac{q}{p}$ and $Q = \dfrac{1}{p}$, then $M_X(t) = (Q - Pe^t)^{-k}$. Simplifying further so as to make it convenient for expansion, we get

$$M_X(t) = p^k (1 - qe^t)^{-k} = \left\{\frac{1 - qe^t}{p}\right\}^{-k} = \left\{\frac{p + q - qe^t}{p}\right\}^{-k} = \left\{1 - \frac{q}{p}(e^t - 1)\right\}^{-k}$$

Thus,

$$M_X(t) = \left\{1 - \frac{q}{p}(e^t - 1)\right\}^{-k} \qquad \ldots (1)$$

or

$$M_X(t) = (Q - Pe^t)^{-k} \qquad \ldots (2)$$

Expanding equation (1) or (2) we get raw moments.
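The closed form of the m.g.f. can be checked against a long truncation of the defining series; a sketch, assuming t = 0.3 (which satisfies t < −log q for the chosen parameters):

```python
# Check that the truncated series E[e^{tX}] matches the closed form
# M_X(t) = p^k (1 - q e^t)^(-k), valid for t < -log(q).
from math import comb, exp

k, p = 3, 0.4
q = 1 - p
t = 0.3  # q * e^t ~ 0.81 < 1, so the series converges
series = sum(exp(t * x) * comb(x + k - 1, k - 1) * p**k * q**x
             for x in range(2000))
closed = p**k * (1 - q * exp(t)) ** (-k)
print(series, closed)  # the two values agree
```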
C.G.F. of NB(k, p): Taking $\log_e$ of equation (1) or (2) we get the cumulant generating function. Therefore,

$$K_X(t) = \log_e M_X(t) = \log_e \left[1 - \frac{q}{p}(e^t - 1)\right]^{-k} = -k \log_e [1 - P(e^t - 1)] \qquad \ldots (3)$$

or

$$K_X(t) = \log_e (Q - Pe^t)^{-k} = -k \log_e (Q - Pe^t) \qquad \ldots (4)$$

Expanding equation (3) in powers of t, we get the cumulants as follows:

$$K_X(t) = -k \log_e [1 - P(e^t - 1)] = k\left\{P(e^t - 1) + \frac{P^2}{2}(e^t - 1)^2 + \frac{P^3}{3}(e^t - 1)^3 + \cdots\right\}$$

Since

$$e^t - 1 = t + \frac{t^2}{2!} + \frac{t^3}{3!} + \cdots$$

we get

$$K_X(t) = kP\left(t + \frac{t^2}{2!} + \frac{t^3}{3!} + \cdots\right) + \frac{kP^2}{2}\left(t + \frac{t^2}{2!} + \cdots\right)^2 + \frac{kP^3}{3}\left(t + \cdots\right)^3 + \cdots \qquad \ldots (5)$$
To find the cumulants of X we collect the coefficients of $\dfrac{t^r}{r!}$ in expression (5).

The term involving t is kPt, hence

$$\kappa_1 = \text{coefficient of } t = kP = \text{mean}$$

Terms involving t²: $\;kP\dfrac{t^2}{2!} + \dfrac{kP^2}{2}t^2 = \dfrac{t^2}{2!}\left(kP + kP^2\right)$, hence

$$\kappa_2 = \text{coefficient of } \frac{t^2}{2!} = kP(1 + P) = kPQ$$

Terms involving t³: $\;kP\dfrac{t^3}{3!} + \dfrac{kP^2}{2}t^3 + \dfrac{kP^3}{3}t^3 = \dfrac{t^3}{3!}\,kP\left(1 + 3P + 2P^2\right)$, hence

$$\kappa_3 = \text{coefficient of } \frac{t^3}{3!} = kP(1 + 3P + 2P^2) = kP(1+P)(1+2P) = kPQ(Q + P)$$
Terms involving t⁴: collecting the t⁴ terms from (5),

$$kP\frac{t^4}{4!} + \frac{kP^2}{2}\cdot\frac{7}{12}t^4 + \frac{kP^3}{3}\cdot\frac{3}{2}t^4 + \frac{kP^4}{4}t^4 = \frac{t^4}{4!}\,kP\left(1 + 7P + 12P^2 + 6P^3\right)$$

hence

$$\kappa_4 = \text{coefficient of } \frac{t^4}{4!} = kP(1 + 7P + 12P^2 + 6P^3) = kPQ(1 + 6PQ)$$
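The cumulant formulas above can be cross-checked by computing raw moments from the p.m.f. and converting them to cumulants with the standard moment-cumulant relations; a sketch (the truncation length and parameter values are my own):

```python
# Check k1 = kP, k2 = kPQ, k3 = kPQ(Q+P), k4 = kPQ(1+6PQ),
# where P = q/p and Q = 1/p, against moments of the NB p.m.f.
from math import comb

k, p = 3, 0.4
q = 1 - p
P, Q = q / p, 1 / p
pmf = [comb(x + k - 1, k - 1) * p**k * q**x for x in range(600)]
m = [sum(x**r * f for x, f in enumerate(pmf)) for r in range(5)]  # raw moments
# standard raw-moment-to-cumulant conversions
k1 = m[1]
k2 = m[2] - m[1]**2
k3 = m[3] - 3*m[2]*m[1] + 2*m[1]**3
k4 = m[4] - 4*m[3]*m[1] - 3*m[2]**2 + 12*m[2]*m[1]**2 - 6*m[1]**4
print(k1, k * P)
print(k2, k * P * Q)
print(k3, k * P * Q * (Q + P))
print(k4, k * P * Q * (1 + 6 * P * Q))
```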