16.2. Network-Enabled Collaboration

To understand the nature of competitive advantage in the new paradigm, we should look not to Linux, but to the Internet, which has already shown signs of how the open source story will play out.

The most common version of the history of free software begins with Richard Stallman's ethically motivated 1984 revolt against proprietary software. It is an appealing story centered on a charismatic figure, and leads straight into a narrative in which the license he wrote, the GPL, is the centerpiece. But like most open source advocates, who tell a broader story about building better software through transparency and code sharing, I prefer to start the history with the style of software development that was normal in the early computer industry and academia. Because software was not seen as the primary source of value, source code was freely shared throughout the early computer industry.

The Unix software tradition provides a good example. Unix was developed at Bell Labs, and was shared freely with university software researchers, who contributed many of the utilities and features we take for granted today. The fact that Unix was provided under a license that later allowed AT&T to shut down the party when it decided it wanted to commercialize Unix, leading ultimately to the rise of BSD Unix and Linux as free alternatives, should not blind us to the fact that the early, collaborative development preceded the adoption of an open source licensing model. Open source licensing began as an attempt to preserve a culture of sharing, and only later led to an expanded awareness of the value of that sharing.

For the roots of open source in the Unix community, you can look to the research orientation of many of the original participants. As Bill Joy noted in his keynote at the O'Reilly Open Source Convention in 1999, in science, you share your data so that other people can reproduce your results. And at Berkeley, he said, we thought of ourselves as computer scientists.[7]

[7] I like to say that software enables speech between humans and computers. It is also the best way to talk about certain aspects of computer science, just as equations are the best way to talk about problems in physics. If you follow this line of reasoning, you realize that many of the arguments for free speech apply to open source as well. How else do you tell someone how to talk with their computer other than by sharing the code you used to do so? The benefits of open source are analogous to the benefits brought by the free flow of ideas through other forms of information dissemination.

But perhaps even more important was the fragmented nature of the early Unix hardware market. With hundreds of competing computer architectures, the only way to distribute software was as source! No one had access to all the machines to produce the necessary binaries. (This demonstrates the aptness of another of Christensen's "laws," the law of conservation of modularity. Because PC hardware was standardized and modular, it was possible to concentrate value and uniqueness in software. But because Unix hardware was unique and proprietary, software had to be made more open and modular.)

This software source code exchange culture grew from its research beginnings, but it became the hallmark of a large segment of the software industry because of the rise of computer networking.

Much of the role of open source in the development of the Internet is well known: the most widely used TCP/IP protocol implementation was developed as part of Berkeley networking; BIND runs the DNS, without which none of the web sites we depend on would be reachable; Sendmail is the heart of the Internet email backbone; Apache is the dominant web server; Perl the dominant language for creating dynamic sites; and so on.

Less often considered is the role of Usenet in mothering the Net we now know. Much of what drove public adoption of the Internet was in fact Usenet, that vast distributed bulletin board. You "signed up" for Usenet by finding a neighbor willing to give you a newsfeed. This was a true collaborative network, where mail and news were relayed from one cooperating site to another, often taking days to travel from one end of the Net to another. Hub sites formed an ad hoc backbone, but everything was voluntary.

Rick Adams, who created UUNET, the first major commercial ISP, was a free software author (though he never subscribed to any of the free software ideals; it was simply an expedient way to distribute software he wanted to use). He was the author of B News (at the time the dominant Usenet news server) as well as Serial Line IP (SLIP), the first implementation of TCP/IP for dial-up lines. But more importantly for the history of the Net, Rick was also the hostmaster of the world's largest Usenet hub. He realized that the voluntary Usenet was becoming unworkable, and that people would pay for reliable, well-connected access. UUNET started out as a nonprofit, and for several years much more of its business was based on the earlier Unix-to-Unix Copy Protocol (UUCP) dial-up network than on TCP/IP. As the Internet caught on, UUNET and others like it helped bring the Internet to the masses. But at the end of the day, the commercial Internet industry started out of a need to provide infrastructure for the completely collaborative UUCPnet and Usenet.

The UUCPnet and Usenet were used for email (the first killer app of the Internet), but also for software distribution and collaborative tech support. When Larry Wall (later famous as the author of Perl) introduced the patch program in 1984, the ponderous process of sending around nine-track tapes of source code was replaced by the transmission of "patches": editing scripts that update existing source files. Add in Richard Stallman's GNU C compiler (gcc), and early source code control systems like RCS (eventually replaced by CVS and now Subversion), and you had a situation where anyone could share and update free software. The early Usenet was as much a "Napster" for shared software as it was a place for conversation.

The mechanisms that the early developers used to spread and support their work became the basis for a cultural phenomenon that reached far beyond the tech sector. The heart of that phenomenon was the use of wide area networking technology to connect people around interests, rather than through geographical location or company affiliation. This was the beginning of a massive cultural shift that we're still seeing today.

This cultural shift may have had its first flowering with open source software, but it is not intrinsically tied to the use of free and open source licenses and philosophies.

In 1999, together with Brian Behlendorf of the Apache project, O'Reilly founded a company called CollabNet to commercialize not the Apache product but the Apache process. Unlike many other OSS projects, Apache wasn't founded by a single visionary developer but by a group of users who'd been abandoned by their original "vendor" (NCSA) and who agreed to work together to maintain a tool they depended on. Apache gives us lessons about intentional wide-area collaborative software development that can be applied even by companies that haven't fully embraced open source licensing practices. For example, it is possible to apply open source collaborative principles inside a large company, even without the intention to release the resulting software to the outside world.

While CollabNet is best known for hosting high-profile, corporate-sponsored, open source projects like OpenOffice.org (http://www.openoffice.org), its largest customer is actually HP's printer division, where CollabNet's SourceCast platform is used to help more than 3,000 internal developers share their code within the corporate firewall. Other customers use open source-inspired development practices to share code with their customers or business partners or to manage distributed worldwide development teams.

But an even more compelling story comes from that archetype of proprietary software, Microsoft. Far too few people know the story of the origin of ASP.NET. As told to me by its creators, Mark Anders and Scott Guthrie, the two of them wanted to re-engineer Microsoft's ASP product to make it XML aware. They were told that doing so would break backward compatibility, and the decision was made to stick with the old architecture. But when Anders and Guthrie had a month between projects, they hacked up their vision anyway, just to see where it would go. Others within Microsoft heard about their work, found it useful, and adopted pieces of it. Some six or nine months later, they had a call from Bill Gates: "I'd like to see your project."

In short, one of Microsoft's flagship products was born as an internal "code fork," the result of two developers "scratching their own itch," and spread within Microsoft in much the same way as open source projects spread on the open Internet. It appears that open source is the "natural language" of a networked community. Given enough developers and a network to connect them, open source-style development behavior emerges.

If you take the position that open source licensing is a means of encouraging Internet-enabled collaboration, and focus on the end rather than the means, you'll open a much larger tent. You'll see the threads that tie together not just traditional open source projects, but also collaborative "computing grid" projects like SETI@home (http://setiathome.ssl.berkeley.edu), user reviews on Amazon.com, technologies like collaborative filtering, new ideas about marketing such as those expressed in The Cluetrain Manifesto (http://www.cluetrain.com/book.html), weblogs, and the way that Internet message boards can now move the stock market. What started out as a software development methodology is increasingly becoming a facet of every field, as network-enabled conversations become a principal carrier of new ideas.

I'm particularly struck by how collaboration is central to the success and differentiation of the leading Internet applications.

eBay is an obvious example (almost the definition of a "network effects" business) in which competitive advantage is gained from the critical mass of buyers and sellers. New entrants into the auction business have a hard time competing, because there is no reason for either buyers or sellers to go to a second-tier player.

Amazon is perhaps even more interesting. Unlike eBay, whose constellation of products is provided by its users and changes dynamically day to day, Amazon sells products that are identical to those available from other vendors. Yet Amazon seems to enjoy an order-of-magnitude advantage over those other vendors. Why? Perhaps it is merely better execution, better pricing, better service, and better branding. But one clear differentiator is the superior way that Amazon has leveraged its user community.

In my talks, I give a simple demonstration. I do a search for products in one of my publishing areas, JavaScript. On Amazon.com, the search produces a complex page with four main areas. On the top is a block showing the three "most popular" products. Down below is a longer search listing that allows the customer to list products by criteria such as best-selling, highest-rated, by price, or simply alphabetically. On the right and the left are user-generated "ListMania" lists. These lists allow customers to share their recommendations for other titles related to the given subject.

The section labeled "most popular" might not jump out at first, but as a vendor who sells to Amazon.com, I know that it is the result of a complex, proprietary algorithm that combines not just sales but also the number and quality of user reviews, user recommendations for alternative products, links from ListMania lists, "also bought" associations, and all the other things that Amazon refers to as the "flow" around products.

The particular search that I like to demonstrate is usually topped by my own JavaScript: The Definitive Guide. The book has 192 reviews, averaging 4 1/2 stars. Those reviews are among the more than 10 million user reviews contributed by Amazon.com customers.

Now contrast that with the #2 player in online books, Barnesandnoble.com. The top result is a book published by Barnes & Noble itself, and there's no evidence of user-supplied content. JavaScript: The Definitive Guide has only 18 comments, and the order-of-magnitude difference in user participation mirrors the order-of-magnitude difference in sales.

Amazon doesn't have a natural network-effect advantage like eBay, but it has built one by architecting its site for user participation. Everything from user reviews, to alternate product recommendations, to ListMania, to the Associates program that allows users to earn commissions for recommending books, encourages users to collaborate in enhancing the site. Amazon Web Services, introduced in 2001, takes the story even further, allowing users to build alternate interfaces and specialized shopping experiences (as well as other unexpected applications) using Amazon's data and commerce engine as a back end.

Amazon's distance from its competitors, and the security it enjoys as a market leader, are driven by the value added by its users. If, as Eric Raymond said in The Cathedral & the Bazaar, one of the secrets of open source is "treating your users as co-developers," Amazon has learned this secret. But note that it's completely independent of open source licensing practices! We start to see that what has been presented as a rigidly constrained model for open source may consist of a bundle of competencies, not all of which will always be found together.

Google makes a subtler case for the network-effect story. Google's initial innovation was the PageRank algorithm, which leverages the collective preferences of web users, expressed by their hyperlinks to sites, to produce better search results. In Google's case, the user participation is extrinsic to the company and its product, and so can be copied by competitors. If this analysis is correct, Google's long-term success will depend on finding additional ways to leverage user-created value as a key part of its offering. Services such as orkut (http://www.orkut.com) and Gmail (https://gmail.google.com) suggest that this lesson is not lost on them.
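
For readers who want to see the mechanism, here is a minimal sketch of the PageRank idea in Python, using power iteration over a toy link graph. The damping factor and the tiny example graph are illustrative; this is the textbook formulation of the algorithm, not Google's production system.

    # Minimal PageRank sketch: each page's rank is derived from the ranks
    # of the pages that link to it, i.e., from users' collective linking.
    def pagerank(links, damping=0.85, iterations=50):
        """links: dict mapping each page to the list of pages it links to."""
        pages = set(links) | {dst for dsts in links.values() for dst in dsts}
        n = len(pages)
        rank = {page: 1.0 / n for page in pages}
        for _ in range(iterations):
            new_rank = {page: (1.0 - damping) / n for page in pages}
            for page in pages:
                outlinks = links.get(page, [])
                if outlinks:
                    # Distribute this page's rank across its outbound links.
                    share = damping * rank[page] / len(outlinks)
                    for dst in outlinks:
                        new_rank[dst] += share
                else:
                    # A page with no outbound links spreads its rank evenly.
                    for other in pages:
                        new_rank[other] += damping * rank[page] / n
            rank = new_rank
        return rank

    # Toy example: everyone links to "c", so "c" ends up ranked highest.
    print(pagerank({"a": ["c"], "b": ["c"], "c": ["a"]}))

The key observation is that the input is nothing but hyperlinks created by other people, which is why the underlying user participation sits outside Google itself.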

Now consider a counter-example. MapQuest is another pioneer that created an innovative type of web application that almost every Internet user relies on. Yet the market is shared fairly evenly among MapQuest (now owned by AOL), Maps.yahoo.com, and Maps.msn.com (powered by MapPoint). All three provide a commodity business powered by standardized software and databases. None of them has made a concerted effort to leverage user-supplied content, or engage its users in building out the application. (Note also that all three are enabling an Intel Inside-style opportunity for data suppliers such as NAVTEQ, now planning a multibillion-dollar IPO!)

The Architecture of Participation

I've come to use the phrase the architecture of participation to describe the nature of systems that are designed for user contribution. Larry Lessig's book, Code and Other Laws of Cyberspace (http://www.code-is-law.org), which he characterizes as an extended meditation on Mitch Kapor's maxim, "architecture is politics," made the case that we need to pay attention to the architecture of systems if we want to understand their effects.

I immediately thought of Kernighan and Pike's description of the Unix software tools philosophy (http://tim.oreilly.com/articles/paradigmshift_0504.html). I also recalled an unpublished portion of the interview we did with Linus Torvalds to create his essay for the 1998 book, Open Sources (http://www.oreilly.com/catalog/opensources). Linus too expressed a sense that architecture may be more important than source code. "I couldn't do what I did with Linux for Windows, even if I had the source code. The architecture just wouldn't support it." Too much of the Windows source code consists of interdependent, tightly coupled layers for a single developer to drop in a replacement module.

And of course, the Internet and the World Wide Web have this participatory architecture in spades. As outlined earlier in the section on software commoditization (http://tim.oreilly.com/articles/paradigmshift_0504.html), a system designed around communications protocols is intrinsically designed for participation. Anyone can create a participating, first-class component.
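
As a small illustration of what "a participating, first-class component" means in practice, the following sketch uses only Python's standard library to speak HTTP. The port number and the page it serves are arbitrary choices for the example, but any browser or client that speaks the protocol can talk to it on equal terms with any other server on the Net.

    # A minimal HTTP participant: anyone who implements the protocol is a
    # first-class peer on the Web, no permission or license required.
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class HelloHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Answer any GET request with a tiny HTML page.
            body = b"<html><body><p>A first-class participant on the Web.</p></body></html>"
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        # Any standards-compliant client can now connect to localhost:8000.
        HTTPServer(("localhost", 8000), HelloHandler).serve_forever()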

In addition, the IETF (http://www.ietf.org), which runs the Internet standards process, has a great many similarities to an open source software project. The only substantial difference is that the IETF's output is a standards document rather than a code module. Especially in the early years, anyone could participate simply by joining a mailing list and having something to say, or by showing up at one of the three annual face-to-face meetings. Standards were decided by participating individuals, irrespective of their company affiliations. The very name for proposed Internet standards, Requests for Comment (RFCs), reflects the participatory design of the Net. Though commercial participation was welcomed and encouraged, companies, like individuals, were expected to compete on the basis of their ideas and implementations, not their money or disproportionate representation. The IETF approach is where open source and open standards meet.

And while there are successful open source projects like Sendmail, which are largely the creation of a single individual and have a monolithic architecture, those that have built large development communities have done so because they have a modular architecture that allows easy participation by independent or loosely coordinated developers. The use of Perl, for example, exploded along with CPAN (http://www.cpan.org), the Comprehensive Perl Archive Network, and Perl's module system, which allowed anyone to enhance the language with specialized functions, and make them available to other users.

The Web, however, took the idea of participation to a new level, because it opened that participation not just to software developers but to all users of the system.

It has always baffled and disappointed me that the open source community has not claimed the Web as one of its greatest success stories. If you ask most end users, they are more likely to associate the Web with proprietary clients such as Microsoft's Internet Explorer than with the revolutionary open source architecture that made the Web possible. That's a PR failure! Tim Berners-Lee's original web implementation was not just open source, it was public domain. NCSA's web server and Mosaic browser were not technically open source, but source was freely available. While the move of the NCSA team to Netscape sought to take key parts of the web infrastructure to the proprietary side, and the Microsoft-Netscape battles made it appear that the Web was primarily a proprietary software battleground, we should know better. Apache, the phoenix that grew from the NCSA server, kept the open vision alive and the standards honest, refusing to succumb to proprietary embrace-and-extend strategies.

But even more significantly, HTML, the language of web pages, opened participation to ordinary users, not just software developers. The "View Source" menu item migrated from Tim Berners-Lee's original browser, to Mosaic, and then on to Netscape Navigator and even Microsoft's Internet Explorer. Though no one thinks of HTML as an open source technology, its openness was absolutely key to the explosive spread of the Web. Barriers to entry for "amateurs" were low, because anyone could look "over the shoulder" of anyone else producing a web page. Dynamic content created with interpreted languages continued the trend toward transparency.

And more germane to my argument here, the fundamental architecture of hyperlinking ensures that the value of the Web is created by its users.



