There are many ways to use data to make judgments about who is best in any sport. All the intuitive ways to compare performance in individual sports have validity concerns, however.
My friends and I are a competitive lot. Our arena of combat, most recently, has been poker. On a regular basis, we gather at my home for a Texas Hold 'Em poker tournament. It's an informal affair, but we all take it very seriously. In our tournaments, everyone starts with the same number of chips, and when they are gone, they are gone. There is a first one out, a last one out, and everything in between. So, for example, if seven people play, someone comes in first, second, third, fourth, fifth, sixth, and seventh.
We all think of ourselves as pretty good and, being competitive, we have longed for an objective method of comparing performance across tournaments. As one of the statisticians in the group, I took it upon myself to devise various ways of producing some sort of objective index that would allow all participants to compare their performance with each other to decide once and for all who is the best player and who is only lucky now and again. This is the story of my quest and the statistical solutions I chose. Not to give the ending away, but I learned that there is no single best solution.
How to Rank Fairly
This business of how to identify the best is a common problem for competitive organizations such as sports leagues and associations. The problem is how to summarize performance across a variety of categories, venues, and occasions.
There are three methods commonly used in the world of sports to make determinations about who is the "best." All of the approaches make some intuitive sense, though each method has its own specific advantages and disadvantages.
First, let's take a look at the nature of the data I had to analyze. Your data will likely be similar, whether you run your weekly home Monopoly game or the Professional Golfers' Association. Though poker is not a sport, any organized competitive endeavor provides data for rankings. Table 5-16 shows the results from eight tournaments in my own summer poker league.
You can see that nine players took part in at least one tournament, but no event had participation from all players. If a person received no points on a given night, it was because she didn't play. This is commonly the case in sports such as golf and tennis as well.
On two occasions, seven people played, but on other occasions, as few as five sat down together. Four people have played in all eight tournaments. (These are the hard-core players who have to admit that they have a bit of a problem recognizing what is important in life.) One player, David, played in only one tournament.
The points under each player's name indicate the order in which they went out. If there are six players and you go out first, you get one point for taking last place. If you are the winner among six players, you get six points for taking first.
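The point scheme just described can be sketched in a few lines of Python. This is only an illustration: the function name and the player labels P1 through P6 are placeholders invented here, not the league's actual data.

```python
def points_from_elimination_order(elimination_order):
    """Return {player: points}.

    The first player out earns 1 point; the winner earns as many
    points as there were players that night.
    """
    return {
        player: position
        for position, player in enumerate(elimination_order, start=1)
    }


# Hypothetical six-player night, listed from first out to last one standing.
elimination_order = ["P1", "P2", "P3", "P4", "P5", "P6"]
points = points_from_elimination_order(elimination_order)
print(points["P1"], points["P6"])  # first out gets 1, winner gets 6
```

Because the winner's points equal the number of players, a win at a seven-player table is worth more than a win at a five-player table.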
How, then, to rank players in the poker league? Here are three common solutions, all of which work to some extent.
The first thought that came to mind in my situation was to simply add up the points across tournaments and rank players based on their total points. This is the approach taken when celebrities are ranked by income or bank robbers are ranked by their number of crimes. Just participating a lot moves you up in these rankings. To be golfer of the year, you have to have played in many events, in addition to performing OK in them.
A second method is to average the points by dividing the total points by the number of tournaments in which a player participated. The beauty of producing an average is that you get a number that represents a typical level of performance. This is ideal for measuring something elusive, such as talent. Your average performance at poker (or anything else) should be the best single indicator of ability.
A third method, the simplest and most commonly used in team sports, is to count victories. The player who wins most often is the best player. This method works well for tournament-style poker (the kind we play) and any events in which there is one competitor who is the clear winner.
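To make the three methods concrete, here is a minimal sketch that computes total points, mean points, and wins from per-tournament results. The results below are made up for illustration (players A through D are hypothetical), and the sketch assumes the winner of a tournament is whoever earned the most points that night.

```python
from collections import defaultdict

# Hypothetical results, NOT the league's data: one dict per tournament,
# mapping each player who showed up to the points earned that night.
tournaments = [
    {"A": 4, "B": 3, "C": 2, "D": 1},
    {"A": 1, "B": 3, "C": 2},
    {"A": 3, "B": 2, "D": 1},
]


def score_players(tournaments):
    """Return {player: {"points": total, "mean": average, "wins": count}}."""
    totals = defaultdict(int)
    played = defaultdict(int)
    wins = defaultdict(int)
    for result in tournaments:
        top = max(result.values())  # the winner holds the most points
        for player, pts in result.items():
            totals[player] += pts
            played[player] += 1
            if pts == top:
                wins[player] += 1
    return {
        player: {
            "points": totals[player],
            "mean": totals[player] / played[player],  # only nights played count
            "wins": wins[player],
        }
        for player in totals
    }


scores = score_players(tournaments)
for player, s in sorted(scores.items()):
    print(player, s["points"], round(s["mean"], 2), s["wins"])
```

Note that the mean divides by the number of tournaments a player actually entered, so a missed night neither helps nor hurts the average, while it does cost ground under the total-points method.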
Comparing the Three Methods
Each ranking approach has some clear advantages and does the job adequately. Table 5-17 shows the values for each player under all three ranking systems.
All three scoring systems make sense. But the question about who is the best has a different answer under each of the three systems! This is certainly a frustrating finding for a poker scientist like me. Because one could defend any of the three methods as the "best" way to rank, it is a bit of a paradox that each method produces a different "best" poker player. Table 5-18 shows how the rankings differ under each scoring method.
Notice how the "best player" is different under each system. BJ is the best under the Points system. Lisa is the best under the Mean system. Three people tie for first under the Wins system, but BJ and Lisa are not among them. The only real agreement across the three methods is that David is ranked as the worst player. (Sorry, David, but numbers don't lie. And sorry about the public ridicule. Maybe I can make it up to you with a free copy of this book?)
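Turning scores into ranks, ties included, can be sketched as follows. The scores here are invented for five hypothetical players (V through Z) and are not the values from the tables. The sketch uses standard competition ranking, in which tied players share a rank and the next rank is skipped; that tie rule is an assumption made for this example.

```python
def competition_ranks(scores):
    """Rank players so the highest score is 1; tied players share a rank."""
    return {
        player: 1 + sum(other > score for other in scores.values())
        for player, score in scores.items()
    }


# Hypothetical scores under each system, NOT the league's data.
hypothetical_scores = {
    "points": {"X": 15, "W": 12, "Z": 11, "V": 9, "Y": 4},
    "mean": {"X": 3.0, "W": 2.4, "Z": 2.75, "V": 1.8, "Y": 4.0},
    "wins": {"X": 1, "W": 1, "Z": 2, "V": 1, "Y": 0},
}

rankings = {
    method: competition_ranks(values)
    for method, values in hypothetical_scores.items()
}

for method, ranks in rankings.items():
    leaders = sorted(player for player, rank in ranks.items() if rank == 1)
    print(method, leaders)
```

With these made-up numbers, the frequent player X leads on points, the one-time player Y leads on mean, and Z leads on wins, which is the same kind of disagreement described above.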
If three different scoring systems result in three different rankings, it is clear they cannot all be equally valid. They cannot all produce scores that truly reflect the same variable of interest: poker-playing ability. The solution does not involve picking the single best approach. It was not my goal to identify the best system and go with it; my goal was to provide valid information and let others interpret the data how they want.
My solution was to provide all three rankings based on the three scoring methods. That way, players could choose to focus on the ranking results from the method that makes the most sense to them.
The End of the Story
The system that made the most sense to the players in my poker league turned out to be the one that ranked them the highest. Imagine that.
I sleep at night secure in the knowledge that any of the methods is probably acceptable and "accurate." After all, none of the three methods makes the mistake of identifying me as the one best player. That's got to be some sort of validity evidence in and of itself!
Real-life professional sports organizations have dealt with the advantages and disadvantages of each system by creating composite point systems. Some of the tinkering to improve ranking systems in tennis and golf (and tournament poker, too) includes:
It is a bit ironic that these systems, which are likely fairer and more accurate, are often perceived by the press and fans as overly complex and crazy. Attempts to make the ranking systems more valid have often resulted in the public rejecting the systems as invalid.