The distinction between infrastructure and application has been emphasized repeatedly in earlier chapters. As elsewhere in the economy, infrastructure is an important economic enabler for the software industry. Economics can provide some important insights into the role of infrastructure and help clarify its value.
The definition of infrastructure used earlier was "everything that supports a number of applications and is specific to none." Here we use an expansive interpretation that includes capabilities directly assisting the execution of an application as well as the development, provisioning, and administration of applications. These latter elements matter economically, given that development, provisioning, and operation are major contributors to the total cost of ownership (see section 5.2). A capable infrastructure makes it quicker and cheaper to develop, provision, and operate software. This lowers the barrier to entry for individual applications and enables a greater diversity of applications serving more specialized needs. A significant trend is toward more diverse and specialized applications (see section 3.1), and infrastructure is an important enabler.
Infrastructure is an essential underpinning of many industries, but there are three particularly important economic foundations for infrastructure supporting software: the sharing of resources, the reuse of designs, and productivity-enhancing tools.
A classic purpose for infrastructure (not only software) is resource sharing. An important economic benefit of this is lowered marginal costs for provisioning and operating a new application based on an infrastructure already operational in the customer's environment.
Another economic motivation for sharing is benefiting from economies of scale. If, as a resource is provisioned in larger quantities, the unit cost decreases, it becomes attractive to provision and share larger quantities. The motivation for sharing in the case of software is, however, more complex than this. Software, like information, is freely replicated and nonrival in use (see section 2.1). Thus, there is no economic motivation to share software code itself—it can be freely replicated and executed in as many places as make sense. However, software also requires a material infrastructure to support its execution, and sharing that material infrastructure can exhibit economies of scale. There are two primary examples of this. The first is sharing a single host over two or more users or two or more applications. In fact, this is the central idea of both the mainframe and time-sharing computing (see table 2.3) and is also a motivation for the client-server architecture (see section 4.4). The second is the sharing of the network by many hosts. The economic justification for these two instances of sharing is somewhat different.
A primary motivation for host sharing is the convenience and efficiency of administration of information that is common to users and applications. A secondary motivation is sharing of the processor resource. There are economies of scale in processing, because equipment costs increase more slowly than performance (at least up to whatever technological limits are in place). A bigger factor is the locality of data access: if a common repository of data is accessed, then as shared data are split over two or more hosts, efficiency is lost and the administrative costs of maintaining the data increase substantially.
These considerations are different for the network, where resource sharing is much more common; in fact, it is almost universal. Higher-bandwidth switches and communication links exhibit economies of scale like processors. A much bigger consideration is the sharing of real estate and radio spectrum: in a wide-area network, it is necessary to use either the radio spectrum (typically licensed from the government) or else accumulate the geographic right-of-way to run fiber optic cables. There are large economies of scale in the latter because the cost of acquiring right-of-way and the cost of laying trenches and cable conduits are substantial, whereas the aggregate bandwidth capability is effectively unlimited for a given right-of-way.
Relieving congestion is another motivation for sharing. Congestion occurs when the load on a resource (either processing or network) is irregular and variable with time (see section 2.3.1). The variable load results in the arrival of work (tasks to processor or packets to communicate) temporarily exceeding capacity, and the excess work has to be deferred until there is an opportunity to complete it, creating an excess congestion-induced delay. Performance considerations or service-level agreements typically limit the acceptable delay (congestion-induced or otherwise) and hence place a limit on the resource utilization. When a single resource is shared over multiple workloads, it turns out that a higher utilization can be accommodated for the same congestion-induced delay. The reason is simple: the service time of a higher-performing resource is reduced, and congestion delay is proportional to the service time. This is called a statistical multiplexing advantage and is another source of scale economy.
Example If irregular automobile traffic arrives at a bridge, occasionally the traffic arrival rate will exceed the bridge capacity, and there will be a backup in front of the bridge. To limit the average congestion delay, the average incoming traffic rate has to be limited. If the same traffic is divided into two equal pieces, each crossing two parallel bridges with half the speed limit and hence half the capacity, then the total incoming traffic will have to be further limited to maintain the same average congestion delay.
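The bridge analogy can be made concrete with a small queueing simulation. The sketch below is illustrative only: the M/M/1-style model (exponential interarrival and crossing times) and all parameter values are our assumptions, not from the text. It compares a fast bridge against one of the two half-capacity bridges, each carrying half the traffic, so that both operate at 75 percent utilization:

```python
import random

def mm1_mean_delay(arrival_rate, service_rate, n=200_000, seed=1):
    """Simulate a single-server FIFO queue with exponential interarrival
    and service times; return the mean time an arrival spends in the
    system (waiting plus being served)."""
    rng = random.Random(seed)
    t = 0.0          # clock tracking arrival times
    free_at = 0.0    # time at which the server next becomes free
    total = 0.0
    for _ in range(n):
        t += rng.expovariate(arrival_rate)
        start = max(t, free_at)               # wait if the server is busy
        free_at = start + rng.expovariate(service_rate)
        total += free_at - t                  # this arrival's total delay
    return total / n

# One fast bridge: traffic arrives at rate 1.5, crosses at rate 2 (u = 0.75).
fast = mm1_mean_delay(1.5, 2.0)
# One of two half-capacity bridges: half the traffic at half the crossing rate.
slow = mm1_mean_delay(0.75, 1.0)

print(f"single fast bridge: {fast:.2f}, each slow bridge: {slow:.2f}")
```

Under this model the analytic averages are the service time divided by (1 − u): 0.5/0.25 = 2 for the fast bridge and 1/0.25 = 4 for each slow one. Splitting the traffic thus doubles the average delay at the same utilization, which is the bridge example's point.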
Because of a statistical multiplexing advantage, provisioning and sharing a larger resource (processor or communication link) instead of a number of smaller resources with the same total capacity increases the average utilization for the same average congestion delay.
Example Assuming the simplest model for irregular loads described in section 2.3.1, the total average delay D is related to the average service time D_s and the average utilization u by

D = D_s/(1 − u).
Now compare two situations: N identical servers each with the same utilization u_N and service time D_s, and a single high-capacity server with N times the capacity, utilization u_1, and the same average delay D. The higher-capacity server, because it completes tasks faster by a factor of N, has a service time smaller by the same factor, D_s/N. This is the source of the statistical multiplexing advantage. These two situations yield the same average congestion delay when

D_s/(1 − u_N) = (D_s/N)/(1 − u_1), or equivalently u_1 = 1 − (1 − u_N)/N.
This increasing return to scale is in addition to any capital cost advantage of a single server over multiple lower-capacity servers and any reduction in administrative costs from operating a single server.
Example Suppose we have 100 identical servers operating at a 75 percent utilization, and assume the simple congestion model of the last example. This utilization was chosen to achieve an acceptable average delay (four times as large as the average service time). If these 100 servers are replaced by a single server that is 100 times faster, then the utilization can be increased to 1 − (1 − 0.75)/100 = 0.9975, or 99.75 percent. At the same overall average delay, the result of sharing this single server is the accommodation of a 33 percent higher throughput (by increasing utilization from 75 percent to 99.75 percent) for the same total capacity. The relation is plotted in figure 9.5 for two through five slower servers replaced by a single faster server.
Figure 9.5: The utilization that can be achieved for a single faster server (y-axis) in comparison to N slower servers (x-axis) for the same average delay.
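The arithmetic behind this example and figure 9.5 can be checked directly. The following is a minimal sketch (the function names are ours; the delay model is the simple one from section 2.3.1):

```python
def delay(service_time, utilization):
    # Simple congestion model of section 2.3.1: D = D_s / (1 - u)
    return service_time / (1.0 - utilization)

def shared_utilization(u_n, n):
    """Utilization a single n-times-faster server can sustain at the same
    average delay as n servers each running at utilization u_n."""
    return 1.0 - (1.0 - u_n) / n

# Reproduce the example: 100 servers at 75 percent utilization.
u1 = shared_utilization(0.75, 100)
print(u1)            # 0.9975

# Same average delay in both configurations (service time 1 vs. 1/100):
d_many = delay(1.0, 0.75)
d_one = delay(1.0 / 100, u1)
print(d_many, d_one)  # both ~4.0, i.e., four times the service time of 1

# Throughput gain at the same total capacity:
print(u1 / 0.75)      # ~1.33, the 33 percent increase cited above
```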
There are also disadvantages to gaining efficiency by this scale-up (replacement of servers by fewer faster ones) rather than scale-out (adding additional slower servers). For example, availability may be adversely affected, since single outages affect more users and can be more disruptive. Also, and important for organizations facing unpredictable growth, it is easier as demand increases to scale out an existing installation (by adding more servers) than it is to scale up (by replacing servers with faster ones).
Another challenge with resource sharing is that one user's increasing resource consumption causes increased congestion experienced by other users. If there is no compensating payment, this is a negative congestion externality. Another way to state this: How do we maintain control over the utilization, thereby controlling the congestion delay? There are a number of forms of congestion control. One is admission control, which is just another form of access control (see section 5.4.2) based on the current level of congestion rather than access privileges.
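As an illustration of admission control driven by the current level of congestion rather than access privileges, the following sketch (entirely hypothetical; the class, method, and parameter names are ours) admits new work only while a utilization cap that bounds congestion delay is respected:

```python
class AdmissionController:
    """Illustrative congestion-based admission control (a sketch, not a
    production design): admit new work only while the resource's
    utilization stays below a threshold chosen to bound congestion delay."""

    def __init__(self, capacity, max_utilization=0.75):
        self.capacity = capacity
        self.max_utilization = max_utilization
        self.load = 0.0  # currently admitted demand

    def admit(self, demand):
        # Admit only if the added demand keeps utilization under the cap.
        if (self.load + demand) / self.capacity <= self.max_utilization:
            self.load += demand
            return True
        return False

    def release(self, demand):
        # Work completed; free its share of the resource.
        self.load = max(0.0, self.load - demand)

ac = AdmissionController(capacity=100.0)
print(ac.admit(50.0))  # True: utilization would be 50 percent
print(ac.admit(30.0))  # False: would push utilization past the 75 percent cap
```

The choice of threshold reflects the trade-off described above: a lower cap keeps congestion delay small at the cost of refusing more requests.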
Another approach often advocated by technologists is overprovisioning, which is simply provisioning a resource so ample that throughput fluctuations can never credibly overwhelm it. An advantage often cited is that hardware and communication resources are cheap (and getting cheaper, per Moore's law), whereas congestion control increases complexity and cost. Often lost in this argument is the value that congestion control may bring to users: the cost of congestion control can only be evaluated in relation to any added value that it may bring. Of course, overprovisioning adds costs, too, and it is not clear what economic incentives service providers have to expend these costs. Unlike congestion control, overprovisioning also leaves resources vulnerable to abusive use. One security hole is the denial-of-service attack, in which a vandal generates many spurious requests on a resource with the goal of creating artificial congestion and denying that resource to legitimate users. Once such an attack is identified, some form of admission control can be invoked to block these abusive requests. A more sophisticated attack is mounted simultaneously from many locations, making defense more difficult. The recent incidents of Internet worms (rogue programs that replicate across vast numbers of machines) demonstrate the vulnerability of any shared resources made available on the Internet and the difficulty of mounting defenses. Overprovisioning is an ineffective defense for any but the largest-capacity operations, whereas other congestion control mechanisms can help.
Economists have their own prescription (seldom seen in practice today) for congestion control through congestion pricing. Each user is charged a fluctuating price based on the current congestion and reflecting the cost imposed by that use on other users. Accurately implemented, congestion pricing allows a resource to be allocated optimally based on the user's willingness to pay, taking into account both the benefit of resource utilized and the impairment of congestion-induced delay. Congestion pricing has two important advantages. First, users who have genuine need to use more of the resource (and corresponding willingness to pay) are able to get it. This is not a capability of involuntary admission control. Second, the revenue from congestion pricing increases as congestion increases, and this extra revenue is available to the service provider to add capacity and relieve congestion. Of course, a skeptic would assert that the provider might just pocket this revenue. In fact the economic theory indicates not. If the congestion price is high enough, the provider can increase profits by adding capacity. This price threshold is identical to the condition under which there is a collective social benefit to users and providers in adding capacity (MacKie-Mason and Varian 1995). Congestion pricing does require an especially complex infrastructure, although it can make use of billing and payment mechanisms shared by usage-based pricing. Weaker forms of congestion pricing, such as pricing based on time-of-day (where statistically predictable congestion exists) are easier (but also less effective).
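A stylized version of congestion pricing can be written down for the simple delay model used earlier. The sketch below is our illustration, not the mechanism of MacKie-Mason and Varian (1995): it charges each arrival for the extra delay its load imposes on all the other users of the shared resource, valued at a given cost per unit of delay:

```python
def congestion_price(service_time, utilization, value_of_delay):
    """Stylized congestion price (an illustrative model, not from the text).

    With average delay D = D_s / (1 - u) and utilization u proportional to
    the arrival rate, the extra delay one additional unit of load imposes
    on all other users works out to D_s * u / (1 - u)**2; the price is
    that externality valued at value_of_delay per unit of delay."""
    u = utilization
    externality = service_time * u / (1.0 - u) ** 2
    return value_of_delay * externality

# The price rises steeply as congestion builds, signaling users to back
# off and the provider to add capacity:
for u in (0.25, 0.5, 0.75, 0.9):
    print(u, round(congestion_price(1.0, u, 1.0), 2))
```

Note how the price is negligible at low utilization and grows without bound as utilization approaches one, which is exactly the property that lets it ration the resource by willingness to pay.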
While sharing software code as a resource offers no advantage, using the same software code in multiple designs is of intense interest. This reuse or multiple-use sense of infrastructure is shared with many products in the material world (e.g., automobile platforms share common components over multiple models). Reusable modules, components, and frameworks seek to contain one of the most important costs, that of development (see section 7.3). On the other hand, designing and developing reusable software modules, components, and frameworks invariably takes considerably more time, effort, and expense than single-purpose designs. Thus, there must be a compelling expectation of multiple uses before reuse becomes economically feasible.
Reuse and multiple use apply in other contexts as well. Much of operation and administration involves setting up organization-specific processes and partly automating those processes through various programs and scripts. If those processes and scripts can be shared with other administrators, as they often are, this constitutes reuse.
Tools are another form of software infrastructure that reduce the time and effort required to develop, provision, and operate applications (see section 4.4.7). By automating many tasks that would otherwise have to be performed manually, tools reduce time, effort, and expense, and contribute to quality. Integrated development environments include rich toolkits for software development (see section 7.3.6). Tools are crucial to operation as well as development.
Example Network management is a category of software tool that allows system and network administrators to keep track of and configure various equipment resources (e.g., computers, network switches) from a central location as well as signal and trace the causes of problems that may arise. Not only can such tools increase worker productivity, but they also contribute to effectiveness by allowing each worker to encounter and deal with more problems and challenges.