7.3. Checking for Randomness

After generating random numbers as discussed in the previous sections, you may want to make sure that they really are random. To do this, examine the statistical distribution of the values in the table of random data.

    SQL> SELECT MIN (balance), MAX (balance)
              , AVG (balance), STDDEV (balance)
           FROM accounts;

    MIN(BALANCE) MAX(BALANCE) AVG(BALANCE) STDDEV(BALANCE)
    ------------ ------------ ------------ ---------------
        10008.03     99889.97   54948.4654      25989.9271

As shown, the average (often referred to in statistics as the mean) balance is 54,948.4654 and the standard deviation is 25,989.9271. Statistical analysis describes a common pattern for how values are distributed inside a table. Assume that A = average and S = standard deviation; thus:
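About 68% of the values lie between A - S and A + S.
About 95% of the values lie between A - 2S and A + 2S.
About 99.7% of the values lie between A - 3S and A + 3S.

If the values follow this pattern, they are said to be normally distributed. In my case, however, I want an even spread of data, not a normal distribution. Here I have:

A = 54,948.4654
S = 25,989.9271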
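If the balances were normally distributed, about 68% of them would lie within one standard deviation of the average, that is, between 28,958.5383 and 80,938.3925, as shown in the following expressions:

A - S = 54,948.4654 - 25,989.9271 = 28,958.5383

and

A + S = 54,948.4654 + 25,989.9271 = 80,938.3925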
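These numbers indicate that the list is well varied, not too crowded around the average value: the standard deviation is more than a quarter of the full range between the minimum and maximum balances, close to what a perfectly even spread would give (the range divided by the square root of 12, roughly 25,947 here). It therefore satisfies our definition of a truly random sample. In creating a test bed to validate assumptions, you will have to build several random samples of data, and the type of analysis performed here will be very helpful in making sure the sample is truly random. I'll explore this topic more in the following section.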
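The summary statistics can be backed up with two direct counts. The queries below are a minimal sketch rather than a definitive test; they assume the same ACCOUNTS table and BALANCE column used above. The bounds 10000 and 100000 and the choice of nine buckets are my assumptions, picked only because the minimum and maximum shown earlier fall just inside that range. The first query reports the percentage of rows that lie within one standard deviation of the average; an even spread should give roughly 58%, whereas normally distributed data would give about 68%. The second query counts rows per bucket; for an even spread the counts should be close to one another.

```sql
-- Query 1: percentage of balances between A - S and A + S.
-- Assumes the ACCOUNTS table and BALANCE column from this section.
WITH stats AS (
   SELECT AVG (balance) AS a, STDDEV (balance) AS s
     FROM accounts
)
SELECT ROUND (100 * SUM (CASE
                            WHEN acc.balance BETWEEN st.a - st.s
                                                 AND st.a + st.s
                            THEN 1
                            ELSE 0
                         END) / COUNT (*), 2) AS pct_within_1_sd
  FROM accounts acc CROSS JOIN stats st;

-- Query 2: row counts in nine equal-width buckets.
-- The bounds 10000 and 100000 are assumed, not taken from the text;
-- adjust them to the range actually used when generating the data.
-- WIDTH_BUCKET returns 0 for values below the lower bound and 10
-- for values at or above the upper bound.
SELECT WIDTH_BUCKET (balance, 10000, 100000, 9) AS bucket,
       COUNT (*) AS cnt
  FROM accounts
 GROUP BY WIDTH_BUCKET (balance, 10000, 100000, 9)
 ORDER BY bucket;
```

If one bucket holds far more rows than the others, the values are clustering and the sample is not evenly spread, even when the average and standard deviation look reasonable.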