DISCRETE PROBABILITY DISTRIBUTIONS


BINOMIAL DISTRIBUTION (BERNOULLI)

The binomial and Poisson distributions are the most commonly used discrete distributions and have many applications. Key characteristics of the binomial and Poisson distributions are:

  • Binomial distribution

    • Frequently used in engineering

    • Probability of success p and failure q

    • Combinations of p's and q's

  • Poisson Distribution

    • Rare successes, p very small

    • Large sample size

    • Limit of binomial

  • Population Parameters: characteristics of the population

    • Population size: N

    • Probability of success: p

    • Probability of failure: q = 1 - p

    • Mean: μ

    • Variance: σ²

  • Sample Statistics: characteristics of the sample

    • Random variable: X_i

    • Sample size: n

    • Mean: x̄

    • Variance: s²

Samples taken without replacement are generally not independent, since p does not remain constant. If the sample size n is less than 5% of the population size N (n < 0.05N), we can consider p essentially unchanged and treat the trials as "independent."

Binomial distribution: the probability that EXACTLY x successes occur in n events is

B(x; n, p) = C(x; n) p^x q^(n - x), where C(x; n) = n!/(x!(n - x)!)

The assumptions are:

  1. Experiment of n independent events.

  2. Probability of a "success" is p.

  3. Probability p is constant for all events.

  4. Probability of "failure" is q = 1 - p.

  5. Parameters n and p are specified.

  6. Random variable is the number of successes.

  7. Random variable is a discrete integer, 0 ≤ x ≤ n.

  8. Order of successes is not important; use combinations.

Mean: μ = np

Variance: σ² = npq
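As a minimal illustration, these formulas translate directly into the following Python sketch (standard library only; the function names are illustrative):

    from math import comb

    def binomial_pmf(x, n, p):
        """B(x; n, p) = C(x; n) p^x q^(n - x), with q = 1 - p."""
        q = 1.0 - p
        return comb(n, x) * p**x * q**(n - x)

    def binomial_mean_var(n, p):
        """Return (mean, variance) = (np, npq)."""
        return n * p, n * p * (1.0 - p)

    # Example: probability of exactly 2 successes in 5 trials with p = 0.3
    print(binomial_pmf(2, 5, 0.3))       # about 0.3087
    print(binomial_mean_var(5, 0.3))     # (1.5, 1.05)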

Example 1
start example

Six tosses of a coin.

Experiment is six tosses of a coin, n = 6.

Probability of a head in one toss, p = 1/2.

Probability of a tail in one toss, q = (1 - p) = 1/2.

Find: probability of getting exactly 2 heads in 6 tosses.

Parameters: n = 6 coin tosses, a head with p = 1/2

Random variable: x = 0, 1, 2, 3, 4, 5, 6

r.v. x   C(x; n)              p^x q^(n - x)            B(x; n, p)
0        (6!)/(0! 6!) = 1     (1/2)^0 (1/2)^6 = 1/64   1/64
1        (6!)/(1! 5!) = 6     (1/2)^1 (1/2)^5 = 1/64   6/64
2        (6!)/(2! 4!) = 15    (1/2)^2 (1/2)^4 = 1/64   15/64
3        (6!)/(3! 3!) = 20    (1/2)^3 (1/2)^3 = 1/64   20/64
4        (6!)/(4! 2!) = 15    (1/2)^4 (1/2)^2 = 1/64   15/64
5        (6!)/(5! 1!) = 6     (1/2)^5 (1/2)^1 = 1/64   6/64
6        (6!)/(6! 0!) = 1     (1/2)^6 (1/2)^0 = 1/64   1/64
                                                       Sum = 64/64

If we were to graph this data, we would obtain the graph shown in Figure 16.29.

Figure 16.29: Binomial distribution histogram for six tosses of a coin.

Parameters: n = 6; p = 1/2; q = 1/2

Mean: μ = np = 6 · 1/2 = 3

Variance: σ² = npq = 6 · 1/2 · 1/2 = 3/2 = 1.5

Note that because p = 1/2, (1) the mean is equal to the mid-range, and (2) the probability density is symmetric about the mean.
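A quick verification sketch in Python reproduces the table above with exact fractions (printed in lowest terms, e.g., 6/64 appears as 3/32):

    from fractions import Fraction
    from math import comb

    n, p = 6, Fraction(1, 2)
    q = 1 - p

    for x in range(n + 1):
        print(x, comb(n, x) * p**x * q**(n - x))

    print(comb(6, 2) * p**2 * q**4)   # exactly 2 heads: 15/64
    print(n * p, n * p * q)           # mean 3, variance 3/2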

end example
 
Example 2
start example

Square rod with four sides.

Sides are denoted 1, 2, 3, 4 respectively.

Experiment is six tosses of four-sided rod, n = 6.

Random variable is tossing the number 3.

Probability of a 3 for a single toss, p = 1/4.

Probability of "no 3" in a single toss, q = (1 - p) = 3/4.

Find: Probability of exactly two 3s in 6 tosses; x = 2.

Parameters: n = 6 tosses; success is a "3," with p = 1/4.

Random variable: x = 0, 1, 2, 3, 4, 5, 6

r.v. x   C(x; n)              p^x q^(n - x)                 B(x; n, p)   %
0        (6!)/(0! 6!) = 1     (1/4)^0 (3/4)^6 = 3^6/4^6     729/4^6      17.8
1        (6!)/(1! 5!) = 6     (1/4)^1 (3/4)^5 = 3^5/4^6     1458/4^6     35.6
2        (6!)/(2! 4!) = 15    (1/4)^2 (3/4)^4 = 3^4/4^6     1215/4^6     29.7
3        (6!)/(3! 3!) = 20    (1/4)^3 (3/4)^3 = 3^3/4^6     540/4^6      13.2
4        (6!)/(4! 2!) = 15    (1/4)^4 (3/4)^2 = 3^2/4^6     135/4^6      3.30
5        (6!)/(5! 1!) = 6     (1/4)^5 (3/4)^1 = 3^1/4^6     18/4^6       0.44
6        (6!)/(6! 0!) = 1     (1/4)^6 (3/4)^0 = 3^0/4^6     1/4^6        0.02
                                                            Sum = 4096/4^6   100

where 4^6 = 4096

The histogram for this data is shown in percent of B(x;n,p) in Figure 16.30.

Figure 16.30: Histogram in percent of B(x; n, p).

Parameters: n = 6, p = 1/4, q = 3/4

Mean: μ = np = 6 · 1/4 = 3/2 = 1.5

Variance: σ² = npq = 6 · 1/4 · 3/4 = 18/16 = 1.125

Observations for p = 1/4: (1) the mean lies to the left of the mid-range, and (2) the probability density is not symmetric about the mean.
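The percentage column above can be recomputed with a brief Python sketch:

    from math import comb

    n, p = 6, 0.25
    q = 1 - p
    for x in range(n + 1):
        pct = 100 * comb(n, x) * p**x * q**(n - x)
        print(x, round(pct, 2))   # 17.8, 35.6, 29.66, 13.18, 3.3, 0.44, 0.02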

end example
 
Example 3
start example

Square rod with four sides.

Sides are denoted 1, 2, 3, 4, respectively.

Experiment is six tosses of four-sided rod, n = 6.

Random variable is tossing any number except 3.

Probability of "no 3" in a single toss, p = 3/4.

Probability of a 3 for a single toss, q = (1 - p) = 1/4.

The data are shown graphically in Figure 16.31.

Figure 16.31: Binomial distribution for the square rod.

Parameters: n = 6, p = 3/4, q = 1/4

Mean: μ = np = 6 · 3/4 = 18/4 = 4.5

Variance: σ² = npq = 6 · 3/4 · 1/4 = 18/16 = 1.125

Observations for p = 3/4: (1) the mean lies to the right of the mid-range, and (2) the probability density is not symmetric about the mean.

end example
 

HYPERGEOMETRIC DISTRIBUTION

Overview

Traditional Notation

  • N = Total population size

  • S = Number in the population defined as "successes"

  • F = Number in the population defined as "failures" (F = N - S)

Probability Density Function (p.d.f.)

This is shown mathematically as:

h(x; n, S, N) = C(x; S) · C(n - x; F) / C(n; N)

Note  

If the total number of successes S < n, then x = 0, ..., S, and the cumulative distribution function (CDF) is:

H(x; n, S, N) = Σ (k = 0 to x) C(k; S) · C(n - k; F) / C(n; N)
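Both quantities are easy to compute with Python's math.comb; the sketch below is illustrative, with helper names chosen here:

    from math import comb

    def hypergeom_pmf(x, n, S, N):
        """h(x; n, S, N) = C(x; S) * C(n - x; N - S) / C(n; N)."""
        return comb(S, x) * comb(N - S, n - x) / comb(N, n)

    def hypergeom_cdf(x, n, S, N):
        """H(x; n, S, N): sum of h(k; n, S, N) for k = 0 .. x."""
        return sum(hypergeom_pmf(k, n, S, N) for k in range(x + 1))

    # Illustration: population N = 20 with S = 5 successes, sample of n = 4
    print(hypergeom_pmf(1, 4, 5, 20))   # about 0.4696
    print(hypergeom_cdf(1, 4, 5, 20))   # about 0.7513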

General Comments

  1. Hypergeometric distribution applies to discrete samples taken from a finite population N without replacement.

  2. An integer r is often used in place of the variable x.

  3. Three parameters (N, S, and n) specify this distribution.

  4. Three alternate parameters (N, p, and n), where (p = S/N), may be used to define the distribution.

Alternate Parameters and Properties

  1. Three alternate parameters (N, p, and n), where p = S/N is the proportion of successes and q = F/N = 1 - p is the proportion of failures.

  2. Hypergeometric distribution (without replacement):

    h(x; n, p, N) = C(x; Np) · C(n - x; Nq) / C(n; N)

  3. Alternative representation of the mean and variance:

    Mean: μ = np

    Variance: σ² = npq (N - n)/(N - 1)

Comments

  1. As the population size N increases, the factor (N - n)/(N - 1) approaches 1, and the hypergeometric distribution approaches the binomial.

  2. The best unbiased estimator of the parameter p from actual data is p̂ = S/N (based on the population N, not the sample n).

COMPARISON OF HYPERGEOMETRIC AND BINOMIAL DISTRIBUTIONS

Sample Size

The hypergeometric distribution is based on a finite population of size N with a sample of size n taken without replacement. Consequence: probabilities can vary with the sample size n. The binomial and Poisson distributions assume either a finite population N with the sample of n taken with replacement, or a very large population N. Consequence: probabilities are those of the population and do not vary with the sample size n.

Discrete Two Options

Both the hypergeometric and binomial distributions are based on only two kinds of outcomes: pass or fail.

Computations

Hypergeometric probability computations can be quite involved even for small populations. On the other hand, the binomial or Poisson distributions provide good approximations to the hypergeometric and are more easily computed. For a fixed sample size n, the approximation improves with increasing population size N.
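A small Python sketch illustrates this convergence; the populations and defect rate here are chosen only for illustration:

    from math import comb

    def hypergeom_pmf(x, n, S, N):
        return comb(S, x) * comb(N - S, n - x) / comb(N, n)

    def binomial_pmf(x, n, p):
        return comb(n, x) * p**x * (1 - p)**(n - x)

    n, p, x = 8, 0.10, 1                  # sample of 8, 10% defective, exactly 1 defect
    for N in (50, 500, 5000):             # growing population, S = pN defectives
        S = int(p * N)
        print(N, hypergeom_pmf(x, n, S, N), binomial_pmf(x, n, p))
    # The hypergeometric value approaches the fixed binomial value (about 0.3826) as N grows.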

Example 1
start example

A supplier provides N = 25 precision motors; history shows that "on average" this company has 8% defects. An inspection sample of n = 8 motors is tested. A "success" S is the selection of a defective (bad) motor; a "failure" F is a good motor.

B = S = 25(0.08) = 2, G = F = 25(0.92) = 23, n = 8

The probability of selecting no defective motors (i.e., x = 0) is:

P(X = 0) = C(0; 2) · C(8; 23) / C(8; 25) = (1 · 490,314)/1,081,575 = 0.4533

The probability of selecting exactly one defective motor (i.e., x = 1) is:

P(X = 1) = C(1; 2) · C(7; 23) / C(8; 25) = (2 · 245,157)/1,081,575 = 0.4533

The probability of selecting exactly two defective motors (i.e., x = 2) is:

P(X = 2) = C(2; 2) · C(6; 23) / C(8; 25) = (1 · 100,947)/1,081,575 = 0.0933

The probability of selecting fewer than two defective motors from a sample of eight is:

P(X < 2) = P(X = 0) + P(X = 1) = 0.4533 + 0.4533 = 0.9066

That is, there is a 90% certainty of observing fewer than two bad motors in a total sample size of eight.

Here is how to determine the mean and standard deviation of the number of defective motors in a sample size of n = 8 and of n = 12:

For n = 8:

μ = np = 8 · (2/25) = 0.64

σ² = npq (N - n)/(N - 1) = 8 · 0.08 · 0.92 · (17/24) ≈ 0.417, so σ ≈ 0.65

For n = 12:

μ = np = 12 · (2/25) = 0.96

σ² = npq (N - n)/(N - 1) = 12 · 0.08 · 0.92 · (13/24) ≈ 0.478, so σ ≈ 0.69
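These values can be verified with a short Python sketch:

    from math import comb, sqrt

    N, S, n = 25, 2, 8                     # population, defectives, sample size
    p, q = S / N, 1 - S / N

    def pmf(x):
        return comb(S, x) * comb(N - S, n - x) / comb(N, n)

    print(pmf(0), pmf(1), pmf(2))          # 0.4533, 0.4533, 0.0933
    print(pmf(0) + pmf(1))                 # P(X < 2) = 0.9066

    for m in (8, 12):                      # sample sizes 8 and 12
        var = m * p * q * (N - m) / (N - 1)
        print(m, m * p, sqrt(var))         # mean and standard deviation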

end example
 

HYPERGEOMETRIC DISTRIBUTION APPLICATIONS

Often, the meaning of the terms "success" and "failure" depends upon the context in which they are used. For example, selecting a "bad" motor could be considered a "success" because it is removed from the lot. Whenever the outcomes reduce to success or failure, "bad" or "good," the hypergeometric distribution is applicable. For example, in the following sample space the good and bad motors from a given supplier are segregated as shown:

A supplier delivers a total of N motors. This population of N motors is comprised of only two classes:

  1. G motors that will be "good" or pass the specification.

  2. B (= N - G): motors that will be "bad" or fail the specification.

Quality control inspects only a small sample of n motors:

  • g motors are good

  • b (= n - g) motors are bad

Probability Considerations

  1. The inspection sample of n motors is taken without replacement from the finite population of N motors.

  2. Without replacement, the probabilities of success and failure within the inspection sample of n motors are not constant but depend upon which motors have already been selected.

  3. An inspection sample of n motors can have b bad motors and g good motors, where n = g + b.

  4. Using combination theory, the total number of unordered ways of selecting g good motors from a population containing a total of G good motors is:

    C(g; G) = G!/(g!(G - g)!)

  5. Also, the number of unordered ways of selecting b bad motors from a population containing a total of B bad motors is the combination:

    C(b; B) = B!/(b!(B - b)!)

  6. The total number of ways to get both b bad motors and g good motors is the product:

    C(g; G) · C(b; B)

  7. The total number of ways of selecting n motors from a population N (without replacement and unordered) is:

    C(n; N) = N!/(n!(N - n)!)

  8. Quality inspectors assume that it is "equally likely" to select any sample of n motors. That is, a sample of n motors containing b1 bad and g1 = (n - b1) good motors is as likely to be selected as a sample of n motors containing b2 bad and g2 = (n - b2) good motors.

  9. The probability of any one combination (sample) of n motors is given by:

    P(one particular sample of n motors) = 1/C(n; N)

  10. The probability of selecting a sample of n motors containing exactly g good motors and b bad motors is given by:

    P(g good and b bad) = C(g; G) · C(b; B) / C(n; N)

    [Recall: P(A · B) = P(A|B) P(B); if A and B are independent, this equals P(A) P(B).]

Random Variable

Generally a random variable X is used to define or describe a specific form of outcome. In our case we take the random variable to be the number of bad motors selected ("successes") in an inspection sample of n motors. (Note: "success" = bad motor.)

Quality inspectors set a threshold value for this random variable, say X = x_b, that has been established to assure that the delivered lot of N motors will be accepted. (This does not mean that all N motors "meet spec," only that some acceptable percentage do.)

x_b = b represents some integer-valued acceptance limit on the number of bad motors. The number of "failures," or good motors, in the sample is then g = n - x_b. The probability of exactly x_b bad motors ("successes") in the sample of n motors is:

P(X = x_b) = C(x_b; B) · C(n - x_b; G) / C(n; N)
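An acceptance rule based on such a threshold sums these probabilities up to x_b. Here is a brief sketch with assumed lot and sample numbers:

    from math import comb

    def prob_at_most(x_b, n, B, N):
        """P(X <= x_b): at most x_b bad motors in a sample of n from a lot of N containing B bad."""
        G = N - B
        return sum(comb(B, x) * comb(G, n - x) / comb(N, n) for x in range(x_b + 1))

    # Assumed illustration: lot of N = 50 with B = 4 bad motors, sample n = 10, threshold x_b = 1
    print(prob_at_most(1, 10, 4, 50))    # about 0.826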

Example 2
start example

A supplier provides N = 10 precision motors; history shows that "on average," this company's products have 10% defects, so B = 1 and G = 9. If an inspection sample of n = 1 motor is tested, what is the probability that exactly one defective motor is selected (i.e., x_b = 1)? The RV X is the "successful" selection of a bad motor; G denotes a good motor.

P(X = 1) = C(1; 1) · C(0; 9) / C(1; 10) = (1 · 1)/10 = 0.10

end example
 
Example 3
start example

If an inspection sample of n = 2 motors is tested, what is the probability that exactly one defective motor is selected (i.e., x_b = 1)?

P(X = 1) = C(1; 1) · C(1; 9) / C(2; 10) = (1 · 9)/45 = 0.20

This implies that with two selections the inspector has a 20% chance of detecting the bad motor.

As the sample size increases, n = 3, 4, 5, ..., N, the probability of selecting the one defective motor increases as 0.1n. So for a sample of five the probability is 50% that the inspector will have selected the one bad motor.
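The 0.1n pattern can be confirmed with a one-loop Python sketch:

    from math import comb

    N, B = 10, 1                          # lot of 10 motors, exactly 1 defective
    for n in range(1, N + 1):
        p_detect = comb(B, 1) * comb(N - B, n - 1) / comb(N, n)
        print(n, p_detect)                # 0.1, 0.2, 0.3, ..., 1.0 (i.e., 0.1 * n)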

end example
 
Example 4
start example

In the previous examples we designated a success as the selection of exactly one bad or defective motor. It may appear counterintuitive to identify a "success" with a "defect." However, examine what would happen if we defined a "success" X as selecting one "good" motor from a sample size of three:

P(X = 1) = C(1; 9) · C(2; 1) / C(3; 10)

However, C(2; 1) is undefined, since we cannot have (-1)!. The reason, of course, is that we have only one bad motor, and asking for only one "success," or good motor, in a sample of three means the other two selections must be bad motors. Because there is only one bad motor in total, this event is impossible; the requirement exceeds what the population contains.

end example
 


