Hack 18. Find Out Just How Wrong You Really Are

Anytime you have used statistics to summarize observations, you've probably been wrong. If you need to know how close you have come to the truth, use standard errors.

Statisticians are perhaps the only professionals who not only proudly admit that their answers are probably wrong, but will go to great lengths to tell you exactly how wrong they are. When you conduct a survey, record observations, or conduct some sort of experiment, your results describe only your sample: the customers, patients, students, goldfish, or pieces of Kryptonite that you have in front of you. Inferential statistics uses values computed for a sample to estimate what that value would be for the population it is meant to represent. For example, the mean of a sample is a pretty good guess for the mean of the population. The problem is knowing whether to trust your results.

Calibrating Error and Calculating Precision

It is unlikely that the mean of a sample is exactly the same as the mean of the population, but it is likely to be close. If you want to know how far wrong you are, you can calibrate your precision using standard errors. The standard error of the mean gives us an estimate of the distance between our sample mean estimate and the actual population mean.

"Measure Precisely" [Hack #6] discusses how to use standard errors in the case of measurement. Calculating the standard error of measurement allows you to know how close your test score is to your typical level of performance. Just as measurement allows us to produce 95 percent confidence intervals around individual observed scores, statisticians routinely produce 95 percent confidence intervals around a wide variety of sample values.

Fortunately for anyone curious to know how far a statistical finding is from the hidden truth, every popular statistical procedure provides a standard error. After introducing some basic concepts, this hack explains how to apply the following standard errors:

Standard error of the mean in descriptive statistics
Standard error of the proportion in survey sampling
Standard error of the estimate in regression

The Central Limit Theorem [Hack #2] is a key tool for knowing how wrong we are when we sample, because it provides the formula for calculating standard errors and suggests that sample summary values are approximately normally distributed across repeated samples.

There are three common ways that standard errors are used to verify the accuracy of results of statistical analyses. The particular tool you use depends on whether you want to know how close you are to correctly estimating:

The mean score of a population on some variable (e.g., average salary of untenured college professors)
The proportion of a population that has some characteristic (e.g., who will vote for my Uncle Frank as Chief Dogcatcher)
Future performance (e.g., probable college GPA for your pet monkey, whom you have trained to take multiple-choice tests)

Mean Estimates

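The precision of a sample mean as an estimate of a population mean is based on the standard deviation and the sample size. Here's the formula:

$$\text{standard error of the mean} = \frac{\text{standard deviation}}{\sqrt{\text{sample size}}}$$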

As the sample size increases, the sample mean tends to fall closer to the true population mean. This makes sense if you think of sample size as the number of independent observations; the more looks you get at something, the more accurate your description will be.

The standard error of the mean is the average distance of sample means from their population mean.
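
As a minimal sketch of the calculation, assuming the usual formula (standard deviation divided by the square root of the sample size), the following Python snippet computes the standard error of the mean for two sample sizes. The function name is made up for illustration; the inputs (a standard deviation of 15 and samples of 30 and 60) are the ones used later in this hack's table.

```python
import math

def standard_error_of_mean(standard_deviation, sample_size):
    """Standard deviation divided by the square root of the sample size."""
    return standard_deviation / math.sqrt(sample_size)

# Standard deviation of 15 with samples of 30 and 60 observations.
se_30 = standard_error_of_mean(15, 30)
se_60 = standard_error_of_mean(15, 60)

print(round(se_30, 2))  # 2.74
print(round(se_60, 2))  # 1.94
```

Doubling the sample size does not cut the standard error in half, because the sample size sits under a square root; it shrinks by a factor of about 1.41.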

Proportion Estimates

When a sample of people is surveyed and the results are presented as some percentage or proportion (e.g., "72 percent of all sailors have knee trouble"), that percentage is some distance from the actual percentage you'd find if you surveyed the whole population. If the sample was selected randomly, the standard error of proportion indicates how close the sample percentage is to the population percentage.

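The standard error of the proportion is based on sample size and the size of the proportion. Here's the formula:

$$\text{standard error of the proportion} = \sqrt{\frac{(\text{proportion})(1 - \text{proportion})}{\text{sample size}}}$$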

As with the standard error of the mean, the standard error of the proportion decreases as the sample size increases. If you are mathematically oriented, you might notice that as the proportion moves away from .50, the number in the top part of the formula becomes smaller.

When the calculations are made, then, the further the sample proportion is from .50, the smaller the standard error of the proportion. Another point of interest is that the top part of the formula is an indication of the amount of variability in the sample. (proportion)(1 - proportion) is the squared standard deviation (that is, the variance) for proportions.

The standard error of the proportion is the average distance of sample proportions from the true proportion in the population.
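
Here is a short Python sketch of that calculation, assuming the standard formula: the square root of (proportion)(1 - proportion) divided by the sample size. The function name and the particular proportions tried are only illustrative. It shows both effects described above: the error shrinks as the sample grows, and it shrinks as the proportion moves away from .50.

```python
import math

def standard_error_of_proportion(proportion, sample_size):
    """Square root of proportion * (1 - proportion) / sample size."""
    return math.sqrt(proportion * (1 - proportion) / sample_size)

# Same sample size, proportions increasingly far from .50.
se_half = standard_error_of_proportion(0.50, 30)
se_sailors = standard_error_of_proportion(0.72, 30)
se_extreme = standard_error_of_proportion(0.90, 30)

# Same proportion, larger sample.
se_half_60 = standard_error_of_proportion(0.50, 60)

for label, value in [(".50, n=30", se_half), (".72, n=30", se_sailors),
                     (".90, n=30", se_extreme), (".50, n=60", se_half_60)]:
    print(label, round(value, 3))
```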

Estimates of Future Performance

In regression analyses, scores on one or more variables are used to estimate scores on another variable [Hack #13]. However, that predicted score is unlikely to be exactly right.

Just as we can calculate how far, on average, a sample mean is from a population mean or how far off our survey results are from theoretical population results, we can also say how far off, on average, our regression prediction will be from the actual score a person would get. Here's the formula:

$$\text{standard error of the estimate} = \text{standard deviation} \times \sqrt{1 - \text{correlation}^2}$$

The standard deviation used in the equation is the standard deviation of the criterion variable, which is the one you are predicting. The correlation is the correlation between your predictor(s) and the criterion variable.

In the interest of accuracy (the point of this hack, after all), I should point out that the standard error of the estimate formula given earlier isn't quite correct. However, it does provide almost the same result as this more complex, but correct, equation:

Notice with this formula that the larger the correlation, the smaller the standard error of the estimate. This makes sense, because if there is a lot of informational overlap between two variables, you can get a good sense of the score on one variable by looking at the other.

The standard error of the estimate is the average distance of the actual score from each predicted score.
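
The sketch below illustrates the simpler version of the calculation in Python. It assumes the common textbook form, in which the standard deviation of the criterion variable is multiplied by the square root of 1 minus the squared correlation; the function name is hypothetical, and the sample-size adjustment used by the more complex equation is deliberately left out.

```python
import math

def standard_error_of_estimate(criterion_sd, correlation):
    """Simple version: criterion SD times sqrt(1 - correlation squared)."""
    return criterion_sd * math.sqrt(1 - correlation ** 2)

# Criterion standard deviation of 15 and increasingly strong correlations.
results = {r: standard_error_of_estimate(15, r) for r in (0.0, 0.25, 0.50, 0.90)}

for r, see in results.items():
    print(r, round(see, 2))
```

With a correlation of .25, this simple version gives about 14.52. That is a little smaller than the values of 14.81 and 14.65 shown in this hack's table, which change with sample size and so presumably come from the more complex equation.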

Using Standard Errors

Here's how to use these tools to state with some confidence the range within which the truth lies. Because sampling errors are normally distributed, the standard error can be used just like a standard deviation to define specific proportions of scores under the normal curve.

For example, if we want to provide a range of values in which the population value falls 95 percent of the time, we can build a 95 percent confidence interval around our sample value. Based on the normal curve [Hack #23], 1.96 standard errors on either side of the sample value should provide a range of values that we can say with 95 percent certainty contains the population value.
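
As a rough illustration, building the interval takes only a couple of lines of Python. The function name is invented for this example; the inputs are a sample mean of 100 with a standard error of 2.74, and a sample proportion of .50 with a standard error of .09.

```python
def confidence_interval_95(sample_value, standard_error):
    """Return the range 1.96 standard errors on either side of the sample value."""
    margin = 1.96 * standard_error
    return sample_value - margin, sample_value + margin

# A sample mean of 100 with a standard error of 2.74.
low, high = confidence_interval_95(100, 2.74)
print(round(low, 2), round(high, 2))  # 94.63 105.37

# A sample proportion of .50 with a standard error of .09.
p_low, p_high = confidence_interval_95(0.50, 0.09)
print(round(p_low, 2), round(p_high, 2))  # 0.32 0.68
```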

Table 2-11 shows some examples of various standard errors and the use of sample data to produce these confidence intervals [Hack #6]. Notice how a larger sample size produces a smaller standard error, which means the sample estimate is likely to be closer to the population value and the confidence interval is narrower and more precise.

Table 2-11. Building 95 percent confidence intervals
Type of standard error	Standard deviation	Sample size	Sample value	Standard error	95 percent confidence interval
Standard error of the mean	15	30	100	2.74	94.63-105.37
Standard error of the mean	15	60	100	1.94	96.20-103.80
Standard error of the proportion	.25	30	.50	.09	.32-.68
Standard error of the proportion	.25	60	.50	.06	.38-.62
Standard error of the estimate	15	30	100	14.81	70.97-129.03
Standard error of the estimate	15	60	100	14.65	71.29-128.71

The "Sample value" column in Table 2-11 for the standard error of the estimate is an example of an estimated or predicted score on some variable. The calculations in the example assume a correlation of .25 between the predictor and the criterion.

Uncle Frank's Campaign for Dogcatcher

As the campaign manager for my Uncle Frank in his recent campaign for dogcatcher, I had an opportunity to use standard errors. Several weeks before the election, I surveyed 30 randomly chosen voters in the town of Tonganoxie, Kansas, where Frank lives. My survey found that 50 percent of respondents said they would vote for him. I warned Uncle Frank that the sample was so small that it was not a very precise reflection of the entire population of voters.

After referring to Table 2-11, I determined that if we had surveyed all the voters in town, the percentage saying they would vote for Frank might reasonably be anywhere between about 32 percent and 68 percent, though the most likely value was 50 percent. Of course, the optimist that is my uncle interpreted this as meaning he might have 68 percent of the vote and a huge lead. He spent the rest of his campaign chest on a giant victory party the night before the election. I, being the realist that I am and knowing my uncle's reputation around town, assumed the true outcome would be in the other direction. It was. That's okay, though. It was a great party.

Why It Works

We can trust the accuracy of standard errors if we accept the following assumptions and apply some common sense:

Sampling errors are normally distributed: This means that the sizes of these errors vary in a way that matches the normal curve. This allows us to produce those persuasively precise confidence intervals.
Sampling errors are nonbiased: This means that sample values are equally likely to be greater or less than the population value. This is convenient because it means that across repeated studies, one can zero in on the true population value.
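
One way to see both assumptions at work is a small simulation. The Python sketch below is only an illustration under stated assumptions: it invents a normally distributed population with a mean of 100 and a standard deviation of 15, draws 5,000 random samples of 30, and checks where the sample means land. The seed, the number of samples, and all variable names are arbitrary choices.

```python
import random
import statistics

random.seed(18)  # arbitrary seed so the run is repeatable

POPULATION_MEAN = 100
POPULATION_SD = 15
SAMPLE_SIZE = 30
NUMBER_OF_SAMPLES = 5000

# Draw many samples from the assumed population and record each sample mean.
sample_means = []
for _ in range(NUMBER_OF_SAMPLES):
    sample = [random.gauss(POPULATION_MEAN, POPULATION_SD)
              for _ in range(SAMPLE_SIZE)]
    sample_means.append(statistics.mean(sample))

# Nonbiased: the sample means should center on the population mean.
mean_of_means = statistics.mean(sample_means)

# The spread of the sample means should be close to 15 / sqrt(30), about 2.74.
observed_standard_error = statistics.stdev(sample_means)

# Normally distributed: roughly 95 percent of the sample means should land
# within 1.96 standard errors of the population mean.
theoretical_se = POPULATION_SD / SAMPLE_SIZE ** 0.5
within = sum(abs(m - POPULATION_MEAN) <= 1.96 * theoretical_se
             for m in sample_means)
share_within = within / NUMBER_OF_SAMPLES

print(round(mean_of_means, 2),
      round(observed_standard_error, 2),
      round(share_within, 3))
```

The exact numbers depend on the random draws, but the three printed values should come out near 100, 2.74, and .95.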

The formulas are constructed in such a way that if you have little or no information about the population, then the size of the error in your sample estimate is about the size of the standard deviation of the population.

Look what happens with the standard error of the mean or the standard error of the proportion when the sample size is 1, or what happens with the standard error of the estimate when the correlation is 0.00. Intuitively, a good formula for figuring the standard error size should produce smaller errors when more is known about the population.
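
To try those boundary cases yourself, here is a quick Python sketch. It assumes the usual forms of the three formulas (including the simple version of the standard error of the estimate), and the function names are made up for this example.

```python
import math

def standard_error_of_mean(standard_deviation, sample_size):
    return standard_deviation / math.sqrt(sample_size)

def standard_error_of_proportion(proportion, sample_size):
    return math.sqrt(proportion * (1 - proportion) / sample_size)

def standard_error_of_estimate(criterion_sd, correlation):
    # Simple version, without any sample-size adjustment.
    return criterion_sd * math.sqrt(1 - correlation ** 2)

# With a sample of one, or a correlation of zero, each standard error
# is as large as the standard deviation itself.
mean_case = standard_error_of_mean(15, 1)
proportion_case = standard_error_of_proportion(0.50, 1)
estimate_case = standard_error_of_estimate(15, 0.0)

print(mean_case)        # 15.0
print(proportion_case)  # 0.5
print(estimate_case)    # 15.0
```

For the proportion, .5 is the standard deviation when the proportion is .50, since it is the square root of (.50)(1 - .50).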