Businesses that adopt the J2EE programming model to write their enterprise applications have specific requirements of the application server. The following are some of the most common requirements:
Generally, customers give very high importance to the first four items in the list: performance, scalability, high availability, and fault tolerance. There is a very high expectation that application servers perform well and stay operational 24/7. Performance is the capability to sustain "good" or "acceptable" response times to user requests. In some cases, this expectation is driven by the nature of the business. Some businesses simply cannot afford to be unresponsive, not even for a minute. This is true of time-critical systems; good examples are stock trading, online auctions, banking, and airline reservations. Some people consider good performance to include the capability to withstand catastrophic situations, which can be categorized under fault tolerance and high availability.

In some cases, businesses are motivated by the quality of service they can give to customers. Response time and user experience are very important to businesses such as merchandisers and retailers. A useful performance metric that shows how well an enterprise is doing is the system's overall throughput (the amount of work done over a period of time). Of course, the higher the throughput, the better for the company's bottom line. The quality of service is also important to ensure that there are repeat customers (users who come back to the web site after a successful transaction). A simple yet useful metric for determining the popularity or success of a web site is the average number of hits to the site every day. The more users who visit the site, the higher the likelihood that serious transactions from customers occur. The business motivation, therefore, is to accommodate as many visitors as possible without sacrificing an excellent user experience.

Before analyzing the performance characteristics of an application server, it is helpful to first identify some of the application server's characteristics with regard to its usage and environment.
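The throughput and hits-per-day metrics described above are simple ratios. As a small illustration (all numbers here are hypothetical, chosen only to show the arithmetic):

```java
public class ThroughputExample {
    public static void main(String[] args) {
        // Hypothetical measurements from a load test: completed requests
        // and the elapsed wall-clock time of the measurement interval.
        long completedRequests = 90_000;
        double elapsedSeconds = 600.0; // a 10-minute interval

        // Throughput: amount of work done over a period of time.
        double requestsPerSecond = completedRequests / elapsedSeconds;
        System.out.println("Throughput: " + requestsPerSecond + " req/s");

        // Average daily hits over a hypothetical week of access-log counts.
        long[] dailyHits = {12_000, 15_500, 14_200, 13_800, 16_100, 9_700, 8_900};
        long total = 0;
        for (long hits : dailyHits) {
            total += hits;
        }
        System.out.println("Average hits/day: " + (total / dailyHits.length));
    }
}
```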
Characteristics of the Application Server

The following eight characteristics define the typical application server:
Application Servers on Linux

Commercial application servers that run on Linux exist today and have become undeniably popular. Vendors of application server products, including those known to support only the powerhouse servers, now have Linux in their platform portfolios. Table 18-1 is just a short list of these vendors. Notice that even the major application server vendors, such as IBM, BEA, and Oracle, are major supporters of Linux.
Enterprise businesses are investing seriously in Linux-based applications. Predictions from Giga and Gartner are all in favor of Linux. For example, the Gartner Group thinks that by 2006, Linux will be a key foundation for a strategic cross-development platform environment, creating a powerful alternative to Microsoft's .NET on Windows. Reports like these are among the strong reasons why people migrate to Linux, not to mention the attractive cost-effectiveness of the platform. The good thing is that although application servers are not lightweight servers, they are small enough to run on small, less-expensive machines. Thus, a network of Linux servers can provide the necessary horsepower.

The Software Performance Stack

This section focuses on the performance needs of application servers and is centered on three major topics:
Unlike the performance characterization of other systems, such as network servers and file servers, the performance characterization of application servers is much more involved because of the complex nature of application servers. Network and file servers are easier to understand and control because they are stand-alone and very lightweight, and the interactions involved with them are minimal or limited only to the operating system. The J2EE specification continues to evolve to include more services, functional requirements, and enhancements. For example, the latest specification, J2EE 1.4, includes web services, among other things, which necessitates more computation and processing (for example, SOAP protocol processing, XML parsing, and JAX-RPC, to name a few). For more information on J2EE specifications, visit the J2EE web site at http://java.sun.com/j2ee. Application servers are also very high-level applications whose scope encompasses many, if not all, of the subsystems discussed in this book. For that matter, application servers can serve as excellent end-to-end benchmark applications for Linux servers.

The Complexity of the J2EE Framework

Under the hood, the J2EE-based application server is a complex aggregate of components that correspond to numerous services. To characterize the application server's overall performance, the individual performance of these components and their underlying infrastructure must be taken into account. Because this is not an easy task, the most common and accepted approach is to benchmark the application server with a representative J2EE application. Different benchmark applications can be written, with each focusing on a certain aspect of the application server. For example, one application can exercise HTTP session performance, while another can exercise the persistence performance of entity beans.
An industry-standard benchmark application by the Standard Performance Evaluation Corporation (SPEC; http://www.spec.org), called SPECjAppServer, is frequently used to benchmark application servers. What happens beneath the application is transparent when benchmarking an application server. Figure 18-1 shows the layers of components that are involved in a J2EE environment, where the performance of one layer affects that of the next. We call this layered model the software performance stack.

Figure 18-1. Software performance stack. The stack is intended to show which components affect the performance of another component.

At the very top of the stack are the J2EE enterprise applications that are executed by the application server. The application server strictly provides the runtime and threads of execution for the applications as defined by the J2EE specifications. Notice that applications can interact with back-end resources either through the application server or directly: the application can call special libraries that provide a wrapper for accessing databases, or the application can use data sources via JDBC. Nothing prohibits application developers from writing some parts of their applications in native code. The same is true of back-end resources and other software services. The application server itself is a Java application (although some parts of it can be written in native code), which means that the JVM's performance plays a major part. The enterprise applications coexist with the application server in the JVM's runtime environment. From the JVM's perspective, the enterprise applications are just part of the application server because the latter executes the former. The remaining layers, of course, are typical of many other systems. Note that as method or function calls go down the stack, some overhead is accrued, and the overall performance of the application is affected by how well or poorly the underlying layers perform.
The Enterprise Application

Many performance analysts say that at least 75% of the time, the leading cause of bad performance is the enterprise application itself. If the problem in the performance stack is at the uppermost layer, the lower layers cannot do much to help improve it. In our experience with many customers in the field, many of the problems uncovered can be fixed by making changes in the application alone. The two main reasons for a badly performing application are as follows:
The Application Server

The application server could be written in native code, but J2EE is essentially a Java framework. For easier interaction and integration with other Java-based frameworks, such as JMX, JDBC, and JNDI, most application servers are written in Java. This also favors application server vendors because they can easily port their products to different platforms. As a Java application, the application server is subject to the same assessments as enterprise applications. A badly designed application server will very likely exhibit poor performance. In this market, vendors compete over who has the best-performing application server. Thus, every ounce of improvement in the design and implementation of an application server can go a long way. A good design is one that is performance-oriented from the start. Figure 18-2 shows the generic architecture of a J2EE application server. Note that the diagram is not a software architecture; instead, it simply shows which components are included and how those components interact.

Figure 18-2. The application server that is shown here is based mainly on the J2EE specifications, which require two containers: the web and EJB containers.

Requests can be channeled to the application server from two possible types of clients: the HTTP client and the thick Java client. HTTP requests usually come from a web browser and are transmitted to a web server that acts as a front end to the application server. A plug-in is typically installed into the web server to enable communication with the application server; this installation can be done when the application server itself is installed. When an application server receives an HTTP request from the web server, it determines which web application must handle the request by examining the Uniform Resource Identifier (URI). For dynamic HTML pages, the request is served by a servlet (JavaServer Pages are translated into servlets and compiled).
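The URI-based routing just described can be sketched in plain Java. This is a deliberately simplified illustration, not any vendor's actual dispatch code; the class and application names are hypothetical, and a real web container builds its routing table from each application's deployment descriptor:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class UriDispatcher {
    // Hypothetical mapping of context roots to deployed web applications.
    private final Map<String, String> contextRoots = new LinkedHashMap<>();

    public void deploy(String contextRoot, String appName) {
        contextRoots.put(contextRoot, appName);
    }

    // Pick the application whose context root is the longest prefix of the
    // request URI, roughly what a web container does before handing the
    // request to a servlet. Returns null if no application matches.
    public String route(String uri) {
        String best = null;
        int bestLen = -1;
        for (Map.Entry<String, String> e : contextRoots.entrySet()) {
            String root = e.getKey();
            if (uri.startsWith(root) && root.length() > bestLen) {
                best = e.getValue();
                bestLen = root.length();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        UriDispatcher d = new UriDispatcher();
        d.deploy("/shop", "StorefrontApp");
        d.deploy("/shop/admin", "AdminApp");
        System.out.println(d.route("/shop/admin/users")); // AdminApp
        System.out.println(d.route("/shop/cart"));        // StorefrontApp
    }
}
```

The longest-prefix rule matters because context roots can nest, as the `/shop` versus `/shop/admin` example shows.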
Per the J2EE specification, web requests are processed in a web container inside the application server. Servlets can also call an EJB, which resides in the EJB container of the application server. Depending on the business logic, the request may involve accessing resources, such as a data source through a JDBC provider or a message queue through JMS.

A thick client is a complete Java application running somewhere in its own Java runtime environment. EJB lookups, creations, and calls to EJB methods are coded in the application. This is similar to Remote Procedure Calls (RPC), a mechanism used in distributed systems prior to the creation of Java. The protocol used to execute the remote method is typically Remote Method Invocation over Internet Inter-ORB Protocol (RMI/IIOP), which allows the capabilities of the Common Object Request Broker Architecture (CORBA) for distributed systems to be implemented on the Java 2 platform. CORBA is a specification by the Object Management Group (OMG). Some vendors, such as BEA, use a proprietary protocol, but RMI/IIOP is the protocol expected by most J2EE application servers. Java clients communicate directly with EJBs; requests are routed through the EJB container of the application server.

The dominant type of client remains the HTTP client (or web client). Most J2EE enterprise applications have a web module and EJB modules. The Java-based clients are catching up, but the web clients are likely to remain the dominant interface in the coming years. This is especially true now that web services are the cutting-edge technology; most web services use the SOAP protocol with HTTP as the transport mechanism. In summary, the major components in an application server that are critical to performance are the web container, the web server plug-in, the EJB container, the ORB, and the interfaces to the various resources.
The Java Virtual Machine

From a performance perspective, the Java Virtual Machine (JVM) is one of the most critical components of the application server because most, if not all, of the application server's code is executed by the JVM. The application server must be written in such a way that it leverages the best features of the JVM. Some vendors offer several JVMs for running the application server and let users select their preference; others ship with a prepackaged JVM. Ideally, however, users should be able to manually change the JVM (possibly with extra steps that are specific to the application server configuration). The ability to change the JVM is advantageous because it lets you use the best-performing JVM currently available.

The SPEC organization provides a standard benchmark application called SPECjbb2000 that can be used to evaluate the performance of servers running typical Java business applications. The evaluation takes into consideration both the hardware and software aspects of JVM servers. SPECjbb2000 represents an order-processing application for a wholesale supplier. For more information, visit SPEC at http://www.spec.org/jbb2000.

The specific components of the JVM that are critical to application servers include the networking libraries, the garbage-collection mechanism, the memory management subsystem, and the JIT compiler. Extensions to the JVM, such as RMI/IIOP and Java security, are also important components. Another important area to consider is the set of flags that are available for the JVM. You can use these flags to tune the JVM based on the characteristics of the J2EE application and the application server itself.

Native Code

In the performance stack, some native code has to be executed one way or another. Native code is closest to the operating system.
In the case of the application server, the JVM is written in a third-generation language such as C or C++ and is compiled directly to native code that is then executed by the host machine. Other components, such as back-end resources and enterprise applications, might contain some native code as well. We believe that native application performance also depends heavily on how the application is written. The JVM itself is an application program that has been architected, designed, and implemented. A critical component of the JVM is the JIT (just-in-time) compiler, through which some methods of a Java application might be compiled to native code at runtime. In general, compilers (the JIT compiler and C++ compiler, for example) contain code optimization features so that the code they produce performs as well as possible. To achieve high performance, compiler writers need a deep understanding of the host machine's architecture as well as the underlying operating system. They also need to understand the resources and system calls available from the operating system, as well as its policies. In this case, understanding how the Linux kernel works, and the major features and fixes that come with every release of the kernel, is very important.

The Hardware and the Operating System

The choice of hardware platform on which to run your application server is a big factor to consider because it is the hardware that executes all instructions. It is almost superfluous to discuss hardware in this section, but some things need to be emphasized. First, choosing a hardware platform is a task that every business owner must do at an early stage. This task may or may not be easy, depending on many factors, such as the existing hardware the business currently has and the legacy applications on these assets. Second, business owners constantly face the challenges posed by rapid advances in hardware performance.
For example, processor speeds continue to go up, and performance techniques, such as hyperthreading, are on the rise. Computer architecture continues to evolve toward better performance and higher capacity: 64-bit machines are on their way, Non-Uniform Memory Access (NUMA) is getting a lot of attention, and blade servers are currently very popular because they are small yet very powerful and scalable. The third and probably most important issue is the choice of operating system for these machines. The purpose of the operating system is to manage the hardware resources and the processes that run on the machine. A bad, inefficient operating system can waste much of what the hardware has to offer. Thus, the operating system is a key contributor to the performance stack. An operating system is essentially a software application, and its performance depends heavily on its overall design and implementation.

The Linux operating system started as a UNIX-like operating system for the personal computer. It has now matured into the operating system for many enterprise-level systems, such as IBM's eServers that span the Intel, PowerPC, and 390 architectures; the Sun Fire systems from Sun; the SGI Altix systems; the HP ProLiant servers; and the HP Integrity Superdome (based on Itanium processors).

One of the advantages of application servers using Java is portability, which is ideal for a heterogeneous environment. Portability lets you write applications without worrying about the target platform ahead of time. Heterogeneity gives you the flexibility to mix and match available hardware and operating system platforms that you may already have or are planning to requisition. Because the application server is just one component of a bigger enterprise system, you need to plan which platform and operating system each component should run on.
For example, most systems use Intel servers for their web servers and big UNIX systems for their application servers. The back-end data sources are typically mainframe or legacy systems. The following sections examine the Linux operating system as the platform of choice for application servers. The growing popularity of Linux in the open-source community has helped improve its performance and reliability. With the support of big companies like IBM, Red Hat, HP, and Sun, the Linux operating system has evolved from an inexpensive UNIX-like desktop operating system into a powerful enterprise server platform. Many Linux boxes are now used as web servers, mail servers, network gateways, and DNS servers. In the application server space, big companies have already begun migrating their J2EE applications to Linux. At the same time, more and more application server vendors are supporting different versions of Linux.

Application Server Hot Spots

This section discusses some of the performance "hot spots" common across most application servers.

JVM

The Java Virtual Machine that comes with the application server interprets and executes both the application server and the enterprise applications. A poorly performing JVM can cause the whole system to degrade. On the same hardware, a JVM can perform better on another operating system than it does on Linux. A possible reason for this is that the Linux implementation might not use the right system calls for a given function. The JIT compiler is a critical piece of the JVM because it translates Java byte code to native code. Again, if the JIT compiler does not use, or misuses, the services available in Linux, performance can be affected significantly. Performance differences can also be seen for the same vendor's JVM on Linux but on two different hardware platforms. A classic example is the use of the system call usleep() emitted by the JIT compiler for the PowerPC platform on Linux kernel 2.4.
It turns out that on the IA32 platform, the JIT compiler emits sched_yield() instead of usleep(). The semantics of usleep(n) are to make the calling thread sleep for n microseconds, but usleep() calls nanosleep(), which on the 2.4 kernel has a minimum granularity of 10 milliseconds, far longer than the few microseconds intended. Thus, the thread sleeps much longer than it should. This caused tremendous scalability problems on the PowerPC platform, and the JIT compiler had to be fixed to use the sched_yield() system call instead. It is important not only to choose a high-performing JVM, but also to tune the JVM properly. Performance tuning is discussed later in this chapter in the "Performance Tuning" section.

Networking

Because application servers are servers, network communication is implied. However, application servers are not just plain file or time servers. Application servers are highly complex, and consequently, high network traffic is an inherent characteristic of application servers.

Threads

Another important aspect of the application server is thread management. As stipulated in the J2EE specifications, both the web and EJB containers are responsible for providing threads of execution for every request that comes in. Threads are hot spots because each application server manages its threads in its own way. Some application servers expose user parameters pertaining to threads, and you should pay close attention to what these parameters mean. Moreover, the thread management characteristics of the underlying operating system are very critical to performance.

Memory Usage

Application servers tend to use a lot of memory. Most implementations cache objects, sessions, connections, results, and many other artifacts to expedite execution. The typical virtual memory used for data by commercial application servers ranges from 300MB to 400MB; the Linux process size in real memory ranges from 50MB to 125MB. Memory usage is critical in Java because of garbage collection.
Garbage collection affects performance because it freezes the application server for a while; no real work is done during this time. The amount of memory allocated to the JVM must be carefully measured, and the right balance must be determined for your applications, depending on their usage patterns.

Synchronization

Another hot spot is the handling of synchronization. Although application servers process requests concurrently, each request is not entirely independent of other requests. Some requests are processed by the same application code or might share the same resources; thus, synchronization is necessary for proper functioning. Points of synchronization are normally found where data and resources are shared. Pools are also places of synchronization because there is a limited number of objects in the pool. Some threads (a running thread is always associated with a particular request) are forced to wait until objects are placed back in the pool.

String Manipulation

An application server is a huge string manipulation program. The profile of an application server typically shows that the majority of the objects being created are String objects. This is not surprising, considering that most inputs to the program are HTTP requests and the outputs are HTML, which is also string-intensive. With XML coming into the picture, application servers spend a significant amount of time parsing XML files, constructing trees, and the like. Protocols for communication with resource providers are also mostly string-based, involving SQL queries, messages, and URLs.

Web Server

Because an application server is a back-end server for the web server, the first potential bottleneck is the web server. The function of the application server depends on the number of requests channeled to it by the web server. Properly tuning the web server is therefore a critical step in the entire process.
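The pool synchronization discussed above, where a thread blocks until another request returns an object to the pool, can be sketched with a bounded blocking queue. This is a simplified illustration under our own naming (no application server implements pooling exactly this way):

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class ConnectionPool<T> {
    private final BlockingQueue<T> pool;

    public ConnectionPool(List<T> objects) {
        // A fair, fixed-capacity queue: the pool can never grow beyond
        // the objects it was created with.
        pool = new ArrayBlockingQueue<>(objects.size(), true, objects);
    }

    // borrow() blocks the calling thread until an object is available,
    // which is exactly the synchronization point described above.
    public T borrow() throws InterruptedException {
        return pool.take();
    }

    public void release(T obj) {
        pool.offer(obj);
    }

    public static void main(String[] args) throws InterruptedException {
        ConnectionPool<String> p =
            new ConnectionPool<>(List.of("conn-1", "conn-2"));
        String c1 = p.borrow();
        String c2 = p.borrow();   // pool is now empty; another borrow would block
        p.release(c1);            // a waiting thread could now proceed
        System.out.println(p.borrow()); // prints conn-1
    }
}
```

The contention visible here is why pool sizes (for JDBC connections, threads, or EJB instances) are among the most important tuning parameters an application server exposes.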
File Systems

The Linux file system is another hot spot because an application server performs a lot of reading and writing to the local file system. Commercial application servers typically log their activities. For the JVM, classes and jar files are loaded from the file system. When class garbage collection is enabled, the same class or jar file can be loaded several times, depending on whether it was garbage-collected.