Hack 9. Show Cause and Effect

Statistical researchers have established some ground rules that must be followed if you hope to demonstrate that one thing causes another.

Social science research that uses statistics operates under a couple of broad goals. One goal is to collect and analyze data about the world that will support or reject hypotheses about the relationships among variables. The second goal is to test hypotheses about whether there are cause-and-effect relationships among variables. The first goal is a breeze compared to the second.

There are all sorts of relationships between things in the world, and statisticians have developed all sorts of tools for finding them, but the presence of a relationship doesn't mean that a particular variable causes another. Among humans, there is a pretty good positive correlation [Hack #11] between height and weight, for example, but if I lose a few pounds, I won't get shorter. On the other hand, if I grow a few inches, I probably will gain some weight.

Knowing only the correlation between the two, however, can't really tell me anything about whether one thing caused the other. Then again, the absence of a relationship would seem to tell me about cause and effect. If there is no correlation between two variables, that would seem to rule out the possibility that one causes the other. The presence of the correlation allows for that possibility, but does not prove it.

Designing Effective Experiments

Researchers have developed frameworks for talking about different research designs and whether such designs even allow for proof that one variable affects another. The different designs involve the presence or absence of comparison groups and how participants are assigned to those groups.

There are four basic categories of group designs, based on whether the design can provide strong evidence, moderate evidence, weak evidence, or no evidence of cause and effect:

Non-experimental designs: These designs usually involve just one group of people, and statistics are used to either describe the population or demonstrate a relationship between variables. An example of this design is a correlational study, where simple associations among variables are analyzed [Hack #11]. This type of design provides no evidence of cause and effect.
Pre-experimental designs: These designs usually involve one group of people and two or more measurement occasions to see whether change has occurred. An example of this design is to give a pretest to a group of people, do something to them, give them a post-test, and see whether their scores change. This type of design provides weak evidence of cause and effect because forces other than whatever you did to the poor folks could have caused any change in scores.
Quasi-experimental designs: These designs involve more than one group of people, with at least one group acting as a comparison group. Assignment to these groups is not random but is determined by something outside the researcher's control. An example of this design is comparing males and females on their attitudes toward statistics. At best, this sort of design provides moderate evidence of cause and effect. Without random assignment to groups, the groups are likely not equal on a bunch of unmeasured variables, and those might be the real cause for any differences that are found.
Experimental designs: These designs have a comparison group and, importantly, people are assigned to the groups randomly. The random assignment to groups allows for researchers to assume that all groups are equal on all unmeasured variables, thus (theoretically) ruling them out as alternative explanations for any differences found. An example of this design is a drug study in which all participants randomly get either the drug being tested or a comparison drug or a placebo (sugar pill).

Does Weight Cause Height?

Earlier in this hack, I mentioned a well-known correlational finding: in people, height and weight tend to be related. Taller males weigh more, usually, then shorter males, for example. I laughed off the suggestion that if we fed people more, they would get tallerbecause of what I think I know about how the body grows, the suggestion that weight causes height is theoretically unlikely. But what if you demanded scientific proof?

I could test the hypothesis that weight causes height using a basic experimental design. Experimental designs have a comparison group, and the assignment to such groups must be random. Any relationships found under such circumstances are likely causal relationships. For my study, I'd create two groups:

Group 1: Thirty college freshmen, who I would recruit from the population of the Midwestern university where I work. This group would be the experimental group; I would increase their weight and measure whether their height increases.
Group 2: Thirty college freshmen, who I would recruit from the population of the Midwestern university where I work. This group would be the control group; I would not manipulate their weight at all and would then measure whether their height changes.

In this design, scientists would call weight the independent variable (because we don't care what causes it) and height the dependent variable (because we wonder whether it depends on, or is caused by, the independent variable).

Because this design matches the criteria for experimental designs, we could interpret any relationships found as evidence of cause and effect.

Fighting Threats to Validity

Research conclusions fall into two types. They have to do with the cause-and-effect claim and whether any such claim, once it is established, is generalizable to whole populations or outside the laboratory. Table 1-9 displays the primary types of validity concerns when interpreting research results. These concerns are the hurdles that must be crossed by researchers.

Table Validity of research results
Validity concern	Validity question
Statistical conclusion validity	Is there a relationship among variables?
Internal validity	Is the relationship a cause-and-effect relationship?
Construct validity	Is the cause-and-effect relationship among the variables you believe should be affected?
External validity	Does this cause-and-effect relationship exist everywhere for everyone?

Even when researchers have chosen a true experimental design, they still must worry that any results might not really be due to one variable affecting another. A cause-and-effect conclusion has many threats to its validity, but fortunately, just by thinking about it, researchers have identified many of these threats and have developed solutions.

Researchers' understanding of group designs, the terminology used to describe them, the identification of threats to validity in research design, and the tools to guard against the threats are pretty much entirely due to the extremely influential works of Cook and Campbell, cited in the "See Also" section of this hack.

A few threats to the validity of causal claims and claims of generalizability are discussed next, along with some ways of eliminating them. There are dozens of threats identified and dealt with in the research design literature, but most of them are either unsolvable or can be solved with the same tools described here:

History: Outside events could affect results. A solution is to use a control group (a comparison group that does not receive the drug or intervention or whatever), with random assignment of subjects to groups. Another part of the solution is to control both groups' environments as much as possible (e.g., in laboratory-type settings).
Maturation: Subjects develop naturally during a study, and changes might be due to these natural developments. Random assignment of participants to an experimental group and a control group solves this problem nicely.
Selection: There might be systematic bias in assigning subjects to groups. The solution is to assign subjects randomly.
Testing: Just taking a pretest might affect the level of the research variable. Create a comparison group and give both groups the pretest, so any changes will be equal between the groups. And assign subjects to the two groups randomly (are you starting to see a pattern here?).
Instrumentation: There might be systematic bias in the measurement. The solution is to use valid, standardized, objectively scored tests.
Hawthorne Effect: Subjects' awareness that they are subjects in a study might affect results. To fight this, you could limit your subjects' awareness of what results you expect, or you could conduct a double-blind study in which subjects (and researchers) don't even know what treatment they are receiving.

The validity of research design and the validity of any claims about cause and effect are similar to claims of validity in measurement [Hack #28]. Such arguments are open and unending, and validity conclusions rest on a reasoned examination of the evidence at hand and consideration for what seems reasonable.

Designing Effective Experiments

Does Weight Cause Height?

Fighting Threats to Validity

Table Validity of research results

See Also