22.3. Design Principles for a Referee Function

Voluntarism is an important force in human affairs, and the open source software process would not work without it. But harnessing the efforts of volunteers is not enough to build a piece of software or, for that matter, anything else that is even moderately complex. As I've said elsewhere, the reason there is almost no collective poetry in the world is not that it is hard to get people to contribute words. Rather, it is that the voluntary contributions of words would not work together as a poem. They'd just be a jumble of words, the whole less than the sum of its parts.[4]

[4] Steven Weber, The Success of Open Source (Harvard University Press, 2004).

In my view, this implies that the bulk of social science research that tries to parse the motivations of open source developers, while interesting, basically aims at the wrong target. Noneconomic motivations (or at least motivations that are not narrowly defined by money in a direct sense) are a principal source of much human behavior, not a bizarre puzzle that requires some major theoretical innovation in social science. The harder and more interesting question is governance. Who organizes the contributions, and according to what principles? Which "patches" get into the codebase and which do not? What choices are available to the people whose contributions are rejected?

The real puzzles lie in what I'll call the "referee function," the set of rules that govern how voluntary contributions work together over time.

In other words, what makes the open source process so interesting and important is not that it taps into voluntarist motivations per se, but rather that it is evolving referee functions that channel those motivations, with considerable success, into a joint product, and that it does so without relying on traditional forms of authority. No referee function is perfect, and among the variety of open source projects we can see people experimenting with different permutations of rules. I believe I can generalize from that set of experiments to suggest seven discrete design issues that any referee system will have to grapple with. Certainly this is not a comprehensive list, and the seven principles I suggest are not sharply exclusive of each other. Each incorporates a tradeoff between different and sometimes competing values. And I am not proposing, at this point, where the "sweet spot" lies for any particular community or any particular problem-solving challenge; my goal is much more modest than that. The point is simply to lay out more systematically what the relevant tradeoffs are, so that experiments can explore the underlying issues that might cause groups to move, or want to move, the "levers" of these seven principles in one direction or another over time.

22.3.1. Weighting of Contributions

No problem-solving community is homogeneous (in fact, that's why it makes sense for individuals to combine forces). Not everyone is equally knowledgeable about a particular problem. Different people know different things. And they know them with different levels of accuracy or confidence. A referee system needs a means of weighting contributions that reflects these differences, so that when one piece of information conflicts with another, a more finely grained judgment can be made about how to resolve the conflict. Mass politics teaches us a great deal about bad ways to weight contributions (for example, by giving more credence to information coming from someone who is tall, or rich, or loud). One of the interesting insights from the open source process is the way in which relatively thin-bandwidth communication, such as email lists, facilitates the removal of some of the social contextual factors in weighting that are ultimately dysfunctional. Tall, handsome men have a significant advantage in televised political debates, but not on an email list. Collaborative problem solving at a distance probably leans toward egalitarianism to start. But egalitarianism does not automatically resolve to meritocracy. The transparency of any weighting algorithm is both desirable and risky: desirable because it makes visible whose contributions carry weight and why, and risky for, well, exactly the same reasons.
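To make the tradeoff concrete, here is a minimal sketch, in Python, of confidence-weighted conflict resolution. Everything in it, the contributor names, the accuracy scores, and the resolution rule itself, is a hypothetical illustration of one way a referee function might weight contributions, not a description of any real project's mechanism.

    # A minimal, hypothetical sketch of confidence-weighted conflict
    # resolution. Names and accuracy scores are invented; a real referee
    # system would derive weights from observed track records.

    def resolve(claims):
        """Pick the claim whose supporters carry the most total weight.

        claims maps a candidate answer to a list of (contributor,
        accuracy) pairs, where accuracy is a 0..1 estimate of how often
        that contributor has been right before.
        """
        def total_weight(supporters):
            return sum(accuracy for _, accuracy in supporters)
        return max(claims, key=lambda c: total_weight(claims[c]))

    conflicting = {
        "bug is in the parser": [("alice", 0.9), ("bob", 0.4)],
        "bug is in the lexer":  [("carol", 0.7)],
    }
    print(resolve(conflicting))  # -> "bug is in the parser"

Even this toy makes the transparency point visible: anyone who can read the function can see exactly whose word counts for how much, which is precisely what makes such a rule both auditable and gameable.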

22.3.2. Evaluating the Contributor Versus Evaluating the Contribution

A piece of information can in principle be evaluated on its own terms, regardless of its source. But in practice it is often easier to (partially) prequalify information based on the reputation of the person who contributes it. Take this to an extreme (trusted people get a free ride, and anything they say goes) and you risk creating a winner-takes-all dynamic that is open to abuse. But ignore it entirely and you give up a lot of potential efficiency; after all, there is almost certainly some relevant metadata about the quality of a piece of knowledge in both what we can know about the contribution and what we can know about the contributor. eBay strongly substitutes the reputation of the person (seller or buyer) for information about what is at stake in a particular transaction. I suspect that software patches submitted to Linux from well-known developers with excellent reputations are scrutinized somewhat less closely than patches from unknown contributors, but that's only a hypothesis or a hunch at this point. We don't really have a good measure of how large open source projects actually deal with this issue, and it would be a very useful thing to know, if someone could develop a reasonable set of measurements.
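One way to see this design space is as a single tunable function. The sketch below, again hypothetical, maps a contributor's reputation to a level of review scrutiny, with a floor that rules out the "anything they say goes" extreme; the shape of the function and the numbers are assumptions for illustration, not measurements of any real project.

    # A toy model of the scrutiny/reputation tradeoff discussed above.
    # The function and its parameters are illustrative assumptions, not
    # a description of how Linux or eBay actually works.

    def scrutiny_level(reputation, floor=0.2):
        """Return a review-effort multiplier in [floor, 1.0].

        reputation: 0.0 (unknown) to 1.0 (long, excellent track record).
        The floor guards against the free-ride extreme: even the most
        trusted contributor gets some review.
        """
        return max(floor, 1.0 - reputation)

    for rep in (0.0, 0.5, 0.95):
        print(rep, "->", scrutiny_level(rep))
    # 0.0 -> 1.0 (full review); 0.95 -> 0.2 (the floor, never zero)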

22.3.3. Status Quo Versus Change Bias

The notion of a refereed repository, whether it is made up of software code or social rules or knowledge about how to solve particular problems, is inherently conservative. That is, once a piece of information has passed successfully through the referee function, it gains status that other information does not have. Yet we know that in much of human knowledge (individual and collective), the process of learning is in large part really a process of forgetting: in other words, leaving behind what we thought was correct, getting rid of information that had attained special status at one time. The design issue here is just how conservative a referee function should be, how protective of existing knowledge. There are at least two distinct parameters that bear on that: the nature of the community that produces the knowledge, and the nature of the environment in which that community is operating. Consider, for example, a traditional community that is culturally biased toward the status quo, perhaps because of an ingrained respect for authority. This community might benefit from a referee function that compensates with a bias toward change. If the community is living in a rapidly shifting environment, the case for a change bias is stronger still. The parameters could point in the other direction as well. Too much churn in a repository would rapidly reduce its practical usefulness, particularly in a problem environment that is relatively stable.
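The conservatism of a referee function can be made explicit as a single parameter. The sketch below is a hypothetical illustration rather than any real repository's policy: it expresses status quo bias as an incumbency margin that a challenger must beat, with the scores assumed to come from whatever evaluation the community already performs.

    # Status quo versus change bias as a tunable "incumbency margin".
    # Hypothetical sketch: the scores and margins are illustrative only.

    def accept_replacement(incumbent_score, challenger_score, margin):
        """margin > 0 protects the status quo; margin < 0 biases
        toward change; margin == 0 is neutral."""
        return challenger_score > incumbent_score + margin

    # A stable problem environment might justify margin = 0.2, while a
    # rapidly shifting one might even set margin = -0.1 to favor change.
    print(accept_replacement(0.70, 0.75, margin=0.2))   # False: incumbent kept
    print(accept_replacement(0.70, 0.75, margin=-0.1))  # True: replaced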

22.3.4. Timing

Separate from the issue of status quo versus change bias is the question of timing. How urgently should information be tested, refereed, and updated? The clear analogy in democratic electoral systems is to the question of how frequently to hold elections, which is obviously a separable question from whether incumbents have a significant electoral advantage. A major design consideration here follows from a sense of just how "bursty" input and contributions are likely to be. Will people contribute at a fairly regular rate, or will they tend to contribute in short, high-activity bursts followed by longer periods of quiet? We know from the open source process that contributors want to see their work incorporated in a timely fashion, but we also know that speeding up the clock makes increasing demands on the referee. This is probably one of the most difficult design tradeoffs because it is so closely tied to levels of human effort. And it's made harder by the possibility that there may be elements of reflexivity in it: that is, a more rapidly evolving system may elicit more frequent input, and vice versa.
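The contributor-wait versus referee-load tradeoff can be sketched numerically. In the back-of-the-envelope model below, the arrival times are invented to mimic a bursty contribution pattern, and the refereeing "interval" stands in for election frequency; none of it models any real project.

    # Back-of-the-envelope model of the timing tradeoff: shorter
    # refereeing intervals reduce contributors' waits but multiply the
    # passes the referee must run. All numbers are invented.

    def costs(arrivals, interval):
        """Average wait until the next refereeing pass, and the total
        number of passes over the whole period."""
        horizon = max(arrivals)
        passes = int(horizon / interval) + 1
        avg_wait = sum((-t) % interval for t in arrivals) / len(arrivals)
        return avg_wait, passes

    bursty = [0.1, 0.2, 0.3, 5.1, 5.2, 9.8, 9.9, 10.0]  # two quiet gaps
    for interval in (0.5, 2.0, 5.0):
        wait, passes = costs(bursty, interval)
        print(f"interval={interval}: avg wait={wait:.2f}, passes={passes}")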

22.3.5. Granularity of Knowledge

Modular design is a central part of open source software engineering. The question is where to draw the boundaries around a module. And that is almost certainly a more complicated question for social knowledge systems than it is for engineered software. No referee function can possibly be effective and efficient across the many different configurations in which knowledge claims arrive. And there is likely to be a significant tradeoff between the generality of information, the utility of information, and the ease and precision of evaluation. Put differently, rather general knowledge is often more difficult to evaluate precisely because it makes broader claims about a problem, but it is also extremely useful across a range of issues and for many people if it is in fact valid. Highly granular and specific knowledge is often easier to evaluate, but it is often less immediately useful to as many people in as many different settings precisely because it is specific and bounded in its applicability.

22.3.6. System Failure Mode

All systems, technical and political, will fail and should be expected to fail. In the early stages of design and experimental implementation, failures are likely to be frequent. At least some failures, and probably most, will present with a confusing mix of technical and social elements. How failures present themselves, to whom, and what the respective roles of system designers, community members, and outsiders are at that moment are critical design challenges. In Exit, Voice, and Loyalty, Albert Hirschman distinguished three categories of response to failure: you can leave for another community (exit), you can stick with it and remain loyal (loyalty), or you can put in effort to reform the system (voice). One of the most striking features of the Linux experience is that this community, by empowering exit and more or less deriding loyalty, has had the effect of promoting the use of voice. It is precisely the outcome we want: a system that fails transparently, in ways that incentivize voice rather than exit (which is often extremely costly in political systems) or loyalty (which is not a learning mode).

22.3.7. Security

How to design and implement security functions within a referee system depends sensitively on the assumptions we make about what the system needs to guard against. In other words, what level and style of opportunism or guile on the part of potential attackers or "gamers" do we believe we ought to plan for? This is simply a way of saying that no system can be made secure against all potential challenges. Security is always a tradeoff against other considerations, in particular ease of use, privacy, and openness. And security likely becomes a greater consideration as the value that the system provides rises over time. Hackers and crackers, whether benign or malicious in their intentions, are an important part of software ecologies precisely because they test the boundaries of security and force recognition of weaknesses. Can political communities be designed to tolerate (and benefit from) this kind of stress testing on a regular basis?


