A1.2 Some Notions of Probability

Programs, people, and processes, as entities, have attributes. These attributes, in turn, are defined on particular measurement scales. For each attribute there is a set of values, or domain, that defines the possible values for that attribute. For example, the sex attribute of a person is defined on the set {Male, Female}, which is a nominal scale. It will be useful to define the term "variable" to refer to an instantiation of an attribute. A sex variable can then assume the value Male or Female, depending on the sex of the person that the variable represents.

A random variable, or stochastic variable, is a variable that assumes each of its possible values with a definite probability. This concept of a random variable will be quite useful in our evolving view of the software process as a nondeterministic one. The domain on which a random variable is defined can be discrete or continuous. Thus, there are two distinct types of random variables. A discrete random variable is one that has only a countable number of possible values. This set of values may be finite or countably infinite.

A1.2.1 Discrete Random Variables

Only events have probabilities. Thus, from a measurement perspective, we can define the event that we have selected a male; if x is a random variable, we will denote this event as {x = Male}. Our random variable x is a real-valued function on the probability space. It assigns to each elementary event a a real number x(a), the value of the random variable at the event a.

Now let us take this concept of a random variable and apply it to two different contexts. Let us assume that the event space is as above, where the random variable x was defined for the events {x = Male} and {x = Female}. Now let us look at this concept as applied to two distinctly different populations of students at a typical university. First suppose that the population in question is the set of Home Economics majors at a large land-grant university. In this case, Pr{x = Male} is likely to be very small. Certainly it will be the case that Pr{x = Male} < Pr{x = Female}. Now, if we were to change the population so that we are considering mechanical engineering students, then the reverse would probably be true and Pr{x = Male} > Pr{x = Female}. The same random variable thus has a different probability distribution in each population.
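The dependence of these probabilities on the population can be sketched in a few lines of Python. The enrollment counts below are hypothetical, invented purely for illustration, and the function name event_probabilities is likewise an assumption of this sketch; each probability is taken to be the relative frequency of the value in the population.

```python
from fractions import Fraction

def event_probabilities(counts):
    """Return Pr{x = value} for each value as its relative frequency."""
    total = sum(counts.values())
    return {value: Fraction(n, total) for value, n in counts.items()}

# Hypothetical enrollment counts; not real data.
home_economics = {"Male": 5, "Female": 95}
mechanical_engineering = {"Male": 80, "Female": 20}

p_he = event_probabilities(home_economics)
p_me = event_probabilities(mechanical_engineering)

# The ordering of the two probabilities reverses between populations.
print(p_he["Male"] < p_he["Female"])
print(p_me["Male"] > p_me["Female"])
```

In both populations the probabilities of the two events sum to 1; only their relative sizes change.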

Now consider a set of discrete events, either finite, {a₁,a₂,...,a_n}, or countably infinite, {a₁,a₂,...}. The random variable x induces a probability space on the set of real numbers in which each of the real numbers a_i has a positive probability. Let p_i be the probability that x will assume the value a_i; then p_i = Pr(x = a_i) for i = 1,2,...,n or i = 1,2,.... Further, ∑_i p_i = 1. The probability distribution of x is defined by the values a_i and the probabilities p_i. The probability function of x is generally stated in terms of a variable x that ranges over the values a_i (inside Pr, the first x is the random variable and the second is the value it assumes), where:

f(x) = Pr(x = x)

and x = a₁, a₂,...,a_n or x = a₁,a₂,.... Again, it follows that:

∑_i f(a_i) = 1

The distribution function F(x) of the random variable x is defined by:

F(x) = Pr(x < x)

It represents the cumulative probability of the initial set of events a₁,a₂,...,a_k, where x = a_{k+1}; that is, F(a_{k+1}) = p₁ + p₂ + ... + p_k.
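As a minimal sketch of these definitions, the following Python fragment builds the probability function and the distribution function for a small discrete random variable. The values and probabilities are hypothetical, chosen only so that the arithmetic is easy to follow, and the distribution function uses the strict inequality given above.

```python
from fractions import Fraction

# Hypothetical discrete random variable: values a_i with probabilities p_i.
values = [1, 2, 3, 4]
probs = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]

def f(x):
    """Probability function: p_i when x equals some a_i, otherwise 0."""
    return sum((p for a, p in zip(values, probs) if a == x), Fraction(0))

def F(x):
    """Distribution function with the strict inequality used in the text."""
    return sum((p for a, p in zip(values, probs) if a < x), Fraction(0))

# The probabilities p_i must sum to 1.
assert sum(probs) == 1

print(f(3))  # probability of the single value a_3
print(F(3))  # cumulative probability of a_1 and a_2 only
```

With x = a₃ = 3, F(3) accumulates p₁ and p₂ but not p₃, which matches the description of F as the cumulative probability of the events preceding x.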

A1.2.2 Continuous Random Variables

If the random variable is continuous, then the probability that the random variable will assume any particular value is zero, Pr{x = a} = 0. There are essentially no discrete events. In this case, it only makes sense to discuss probabilities over intervals of the random variable, such as:

Pr[x < a]

Pr[a < x < b]

Pr[x ≥ a]

If f(x) is the probability density function of x, then:

Pr[a < x < b] = ∫_a^b f(x) dx

Again, it follows that:

∫_{-∞}^{∞} f(x) dx = 1

The distribution function F(x) of x is given by:

F(x) = ∫_{-∞}^{x} f(t) dt
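To make the continuous case concrete, the sketch below assumes an exponential density with a hypothetical rate parameter; neither the choice of distribution nor the names RATE, density, distribution, and integrate come from the text. It compares an interval probability obtained from the closed-form distribution function with a midpoint-rule integral of the density over the same interval.

```python
import math

RATE = 2.0  # hypothetical rate parameter, chosen only for illustration

def density(x):
    """Exponential density: RATE * exp(-RATE * x) for x >= 0, else 0."""
    return RATE * math.exp(-RATE * x) if x >= 0 else 0.0

def distribution(x):
    """Closed-form distribution function of the exponential density."""
    return 1.0 - math.exp(-RATE * x) if x >= 0 else 0.0

def integrate(g, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

a, b = 0.5, 1.5
interval_prob = distribution(b) - distribution(a)  # Pr[a < x < b] from F
numeric_prob = integrate(density, a, b)            # same probability from f

print(abs(interval_prob - numeric_prob) < 1e-6)
print(integrate(density, a, a))  # an interval of zero width has probability 0
```

The two routes agree to within the numerical error of the integration, and integrating over an interval of zero width returns zero, in line with Pr{x = a} = 0 for a continuous random variable.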