While entire species of creatures become extinct every day, occasionally new species are identified that were previously unknown. Surprisingly, statistical tools, not biological tools, can do the trick.
A few years back, a new species, a type of possum, was identified. The new species was named Trichosurus cunninghamii. Trichosurus means, um...possum (I guess), and the cunninghamii part refers to its discoverer, Ross Cunningham, a statistician at Australian National University. If you'd like to have a species named for you, here's how statistics can help.
Identifying Species with Statistics
There is a family of statistical analyses that looks at a bunch of variables and finds naturally occurring groupings among them. Typically, the groupings or clusters of variables are identified on the basis of the correlations among them [Hack #11].
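To make the idea of "groupings based on correlations" concrete, here is a minimal sketch that computes the correlation between every pair of variables. The measurements and the variable names (`ear_length`, `foot_length`, `tail_length`) are made up for illustration and are not taken from any real possum study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical measurements (mm) for six animals; not real data.
measurements = {
    "ear_length":  [52, 55, 41, 43, 54, 40],
    "foot_length": [60, 63, 72, 70, 61, 73],
    "tail_length": [360, 365, 340, 338, 362, 335],
}

names = list(measurements)

# Correlation for every pair of variables, keyed by (name, name).
corr = {
    (a, b): pearson(measurements[a], measurements[b])
    for a in names
    for b in names
}

# Print the matrix one row per variable.
for a in names:
    print(a.ljust(12), " ".join(f"{corr[(a, b)]:6.2f}" for b in names))
```

Variables that correlate strongly with each other (positively or negatively) are the ones a grouping procedure would tend to treat as belonging together.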
One procedure that uses this strategy attempts to find underlying dimensions or invisible, giant basic variables that account for a bunch of less important variables. This procedure is factor analysis, and elsewhere we see how it can, among other things, be used to identify writers' styles [Hack #65].
Statistics is full of similar techniques that can identify dimensions, underlying causes, and groupings. The goal of identifying groupings is of greatest use to biologically inclined statisticians who wish to identify new species.
For some group of animals to technically be a separate species, it must share a unique set of biological characteristics that make it distinct from similar animals. Sure, animals within the same family all look a little different from each other, but then, people look a lot different from each other and we are all one species (my Uncle Frank being perhaps the exception that proves the rule).
If a group of animals, such as Dr. Cunningham's possums, has more in common with each other than with the other creatures in their species, the group might be a candidate for consideration as a species in its own right. Statistics can determine the point at which the animals are "more like each other and more different from the rest of the species than chance alone would produce."
Using Cunningham's discovery as a model, there are a few steps to follow for you to make your own discovery.
Collect some data
This possum existed in Australia near people for more than 200 years and no one noticed. To be fair, it looked an awful lot like the other possums, the most common of which was Trichosurus caninus, now called the short-eared possum.
It was assumed for some time that there was really just this one species of the little guys. Part of Dr. Cunningham's job was to collect and organize descriptive data for the wildlife around him. Consequently, he had a ton of very specific quantitative descriptions of various possum parts (eyes, ears, nose, and throat) and measurements of other physical characteristics.
Choose a statistical method
Cunningham's choice was a technique similar to factor analysis but with a more imposing name: canonical variate analysis. You can use any method that uses the variability in scores to create distinct groupings. Some of those are discussed in this book, such as factor analysis, mentioned earlier in this hack, but there are many other procedures that would work.
Cunningham used this statistical procedure to examine the descriptive data for this presumably single species (you know, these Trichosurus caninus fellers) and demonstrated that there were likely two different species.
Select a hypothesis and analyze the data
Statisticians test hypotheses, so you should begin your analysis with a guess about whether there is or is not a distinction between the groups of participants who supplied your data.
In the example of our hero, Cunningham assumed that there were two different groups of critters that accounted for the data. Then, the procedure (using a computer for the calculations, of course) identified which variables worked best as key distinguishing characteristics between the theoretical groups.
Here are the variables Cunningham used:
While other variables were considered, Cunningham chose these because they were eventually found to be most important in distinguishing one species from another and also because they were characteristics that would probably be unaffected by environment.
The last step in any statistical analysis is to describe and understand whatever you found. For discovering species, you need to be able to describe that new species in enough detail to differentiate it from other, similar species.
The procedures used by Cunningham identified a series of different equations that weighted each of the biological variables differently, to find the combination that best identified two separate groups. These equations (which the procedure labels variates) are similar to regression equations, with the outcome or criterion variable determining which group a possum belongs to.
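To get a feel for what such a weighted equation looks like, here is a rough sketch of the simplest case. With exactly two groups there is a single canonical variate, and its weights are proportional to Fisher's linear discriminant: the inverse of the pooled within-group covariance matrix times the difference between the group means. This is not Cunningham's actual computation. The measurements are invented, the group labels are assumed to be known in advance, and the helper names (`mean_vector`, `scatter`, `solve`, `score`) are just for this illustration.

```python
from math import sqrt

def mean_vector(rows):
    """Column means of a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def scatter(rows, centre):
    """Sum of outer products of each row's deviation from the group mean."""
    p = len(centre)
    s = [[0.0] * p for _ in range(p)]
    for row in rows:
        d = [v - m for v, m in zip(row, centre)]
        for i in range(p):
            for j in range(p):
                s[i][j] += d[i] * d[j]
    return s

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

# Hypothetical measurements in mm: [ear, foot, tail]. Not real possum data.
group_a = [[52, 60, 360], [55, 63, 365], [54, 61, 362], [51, 62, 358], [56, 59, 366]]
group_b = [[41, 72, 340], [43, 70, 338], [40, 73, 335], [44, 71, 342], [42, 74, 337]]

mean_a = mean_vector(group_a)
mean_b = mean_vector(group_b)
s_a = scatter(group_a, mean_a)
s_b = scatter(group_b, mean_b)

# Pooled within-group covariance matrix.
dof = len(group_a) + len(group_b) - 2
pooled = [[(x + y) / dof for x, y in zip(ra, rb)] for ra, rb in zip(s_a, s_b)]

# Raw weights: pooled^-1 times the difference in group means.
diff = [a - b for a, b in zip(mean_a, mean_b)]
weights = solve(pooled, diff)

# Standardized weights: raw weight times the within-group standard deviation,
# which puts variables measured on different scales on a comparable footing.
standardized = [w * sqrt(pooled[i][i]) for i, w in enumerate(weights)]

def score(row):
    """Variate score for one animal: the weighted sum of its measurements."""
    return sum(w * v for w, v in zip(weights, row))

scores_a = [score(r) for r in group_a]
scores_b = [score(r) for r in group_b]

for name, w in zip(["ear", "foot", "tail"], standardized):
    print(f"{name:5s} standardized weight: {w:8.3f}")
print("lowest group A score: ", round(min(scores_a), 2))
print("highest group B score:", round(max(scores_b), 2))
```

Each animal gets one score from the equation, and if the two groups really are distinct, their scores should barely overlap, or not overlap at all.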
Here's the single best equation that accounted for an astonishing 89 percent of the variability on these characteristics for all the possums in his database:
I've provided the standardized weights from the study, so we can compare them to each other. The larger weights indicate the possum parts that differed the most between the mathematically chosen two groups of possums.
In this data, you could find two groups of possums that differed the most based on ear length, tail length, and foot length. The amount of variability explained was so large that, statistically, Cunningham concluded that the mathematically identified groupings were real. The two groups of possums found in the data were actually two different species of possum, and the species could be defined by their ear length and a couple of other variables. The larger the weights in the equation shown earlier, the more the two species differed on these body parts.
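One simple way to put a number on "how much of the variability separates the groups" is to take each animal's score on the equation and ask what share of the total variation in those scores lies between the groups rather than within them. The sketch below does that for made-up scores. It is only an illustration of the idea and is not meant to reproduce the percentage reported in the study, which may have been computed differently. The function name `between_group_share` is hypothetical.

```python
def between_group_share(scores_a, scores_b):
    """Share of total score variation that lies between the two groups.

    Computed as the between-group sum of squares divided by the total
    sum of squares; ranges from 0 (identical group means) to 1.
    """
    all_scores = scores_a + scores_b
    grand_mean = sum(all_scores) / len(all_scores)
    total = sum((s - grand_mean) ** 2 for s in all_scores)
    between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in (scores_a, scores_b)
    )
    return between / total

# Hypothetical variate scores for two candidate groups; not real data.
scores_a = [2.1, 1.8, 2.4, 2.0]
scores_b = [-1.9, -2.2, -2.0, -1.7]

share = between_group_share(scores_a, scores_b)
print(f"Between-group share of variability: {share:.1%}")
```

A share close to 1 means nearly all of the variation in scores comes from which group an animal is in, which is the kind of result that makes a two-species explanation plausible.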
Two Possum Species
Table 6-20 shows the official descriptions of the two possum species first identified as such by our statistician and his mathematics. Notice the names are even based on the key predictors found in the statistical analysis!
So, start collecting your own data on those odd, stinky bugs you find on your screen door and you are well on your way to greatness and immortality. Is there one species of stink bug or two? You tell me.