Chapter 3: WebSphere 4 and 5 Component Architectures


Throughout this book, I talk about the major components of WebSphere versions 4 and 5. It's therefore important to highlight and detail these key component areas, which in later chapters are the focal points, or performance levers, used for tuning and optimizing WebSphere.

Component Architecture Overview

Both versions of WebSphere are complex software platforms, comprising many components and differing technologies. Ensuring that users are offered the fastest response times and the greatest availability of your application is the most obvious reason why WebSphere performance and optimization is so important, especially given the myriad component setting combinations.

For a single software platform to efficiently handle hundreds of thousands (and quite probably millions) of user transactions per day with maximum availability, and given the sheer number of interface possibilities available with WebSphere, you need a good optimization strategy.

It's important, however, to understand the two platforms at a component level. Although at the end of the day both WebSphere 4 and 5 perform the same tasks, by virtue of ever-evolving application software, they're different.

So that you don't confuse the high-level components, and to give you a clear view of the subtle differences in each platform, the following sections explore the major components of the two versions and their differences.

WebSphere Common Components between Versions 4 and 5

The following sections give you an overview of the major components that are common between WebSphere 4 and WebSphere 5. Please note, however, that although at a high level the components in this chapter are the same, there are subtle differences between version 4 and version 5. Be sure to read the notes and cautions in subsequent chapters regarding specific features.

Caution  

One setting used in a component of WebSphere 4 may have disastrous effects if used blindly in WebSphere 5!

Session Database

One of the most useful capabilities associated with server failover and general load-balanced performance is the ability to share user sessions between application servers, server groups, domains, cells, clones, and nodes.

The session database is used for what's known as session persistence, which allows a user's session details to be persisted, or stored, on a common, accessible database. Within WebSphere, you can tune and optimize session persistence in many areas, as well as the database used to store the persisted user session details.

Not all applications can take advantage of session persistence; however, if you're looking to better support users during failover situations, session persistence is a sound solution.
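To make this concrete, the following is a minimal sketch of the kind of servlet code whose state session persistence protects. The servlet and attribute names are hypothetical; the point is that anything placed in the HttpSession (which should be serializable) is what WebSphere writes to the session database when persistence is enabled.

 import java.io.IOException;
 import java.io.PrintWriter;
 import javax.servlet.ServletException;
 import javax.servlet.http.HttpServlet;
 import javax.servlet.http.HttpServletRequest;
 import javax.servlet.http.HttpServletResponse;
 import javax.servlet.http.HttpSession;

 public class CartServlet extends HttpServlet {
     public void doGet(HttpServletRequest req, HttpServletResponse res)
             throws ServletException, IOException {
         // getSession(true) creates a session if one doesn't already exist.
         HttpSession session = req.getSession(true);

         // Session attributes should implement java.io.Serializable; with
         // session persistence enabled, WebSphere can store them in the
         // session database so another clone or node can recover them
         // during a failover.
         Integer count = (Integer) session.getAttribute("itemCount");
         count = (count == null) ? new Integer(1) : new Integer(count.intValue() + 1);
         session.setAttribute("itemCount", count);

         PrintWriter out = res.getWriter();
         out.println("Items in cart: " + count);
     }
 }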

I'll address session databases in detail during later chapters.

Virtual Hosts

The virtual host capability of WebSphere is somewhat analogous to virtual hosts in a static Web server world. A virtual host is essentially a configuration that provides the system manager with the ability to house multiple, logically defined virtual hosts within the same physical server. By default, WebSphere provides several base configurations such as http://YourLocalServerName:9080/ and http://YourLocalServerName:9443/ for testing and example purposes.

Note  

The term virtual host refers to a concept that allows you to configure a single physical machine to appear, listen, and process requests as multiple logical (as opposed to physical) servers. Depending on the underlying technology, this may be multiple logical server instances, or it may be a single server process. In WebSphere, virtual host configurations allow you to make your WebSphere server appear as multiple servers. For example, it could appear as both www.mydomain.com and www.theotherdomain.com.

The invocation process is the same as it is for static Web server requests; basic content routing, through the Hypertext Transfer Protocol (HTTP) plug-in or by other means, sends the user request for a particular Uniform Resource Locator (URL) to the WebSphere node associated with the URL (for example, YourLocalServerName). WebSphere is configured, through the virtual host configuration, to listen on the configured port and act on the request.

In essence, virtual hosts give WebSphere the ability to house multiple Web-based application domains on a single physical WebSphere application server, and they, combined with WebSphere server groups, allow for vast clusters of WebSphere servers to operate and process requests in a WebSphere farm-type configuration.

Virtual hosts simply allow a more compartmentalized management and operational model for your WebSphere application server environment.

What this means is that when combined with the server group capability, virtual hosts allow the system manager to isolate and manage, in a compartmentalized manner, any Web-based application on a WebSphere cluster.

Administrative Consoles

The system manager has several standard options and interfaces to control and manage the vast number of components available via the administration consoles. Generally speaking, however, there aren't many options available to tune these aspects of WebSphere. Furthermore, given that they don't tend to be user facing, they don't rank high in the list of potential areas of performance improvement!

It's important to remember that WebSphere version 4 is managed via a Java/Swing-based Graphical User Interface (GUI), and WebSphere version 5 is managed via a Web-based console. These consoles are where a great deal of your performance and optimization work will occur, so it's best to be familiar with them if you aren't currently.

HTTP Server

The HTTP server is an external-based server, facilitating HTTP requests and redirecting them to the appropriate backend WebSphere application server host and port. Operating a stand-alone Web server can be advantageous in many respects (which you'll see in Chapter 5), but it may introduce unnecessary complexity or security risks into your environment. Obviously, this is an architect's issue and requires resolution prior to implementation!

That said, when you use an external HTTP server, the common topology design is to operate one or more (for redundancy and performance) static Web servers, each configured to use the WebSphere HTTP plug-in (see the next section). This model is typical because it allows integrators and system managers to house their business logic and data behind protected firewall environments, leaving noncritical, commodity Web servers within a secured Demilitarized Zone (DMZ).

By using this approach, traffic coming into the Web server traverses a firewall (or series of firewalls) through port 80, or port 443 for Secure Sockets Layer (SSL)/Secure HTTP (HTTPS), to the specific Web server Internet Protocol (IP) addresses. The WebSphere HTTP plug-in, configured using Extensible Markup Language (XML)-defined rules, then takes the HTTP requests from the customer browser and directs them to the configured backend server IP and port number.

The external HTTP server can be one of several third-party HTTP server platforms, including the following:

  • The Apache Web server

  • The IBM HTTP server

  • The Microsoft Internet Information Services (IIS) server

  • The Netscape Web server

  • The SunOne/iPlanet Web server

WebSphere HTTP Plug-In

The WebSphere HTTP plug-in is a key component in a Web-based WebSphere environment. The plug-in routes and directs user requests from one or a number of Web servers to backend WebSphere application servers.

Essentially, the plug-in controls where a user request goes (which port, which server, and so forth), so it doubles as a load balancer, routing requests to the most appropriate application server based on the configured settings.

Figure 3-1 shows a basic load-balanced topology that utilizes the WebSphere HTTP plug-in to distribute requests between multiple back-end application servers.

Figure 3-1: Example WebSphere HTTP plug-in topology

To achieve load distribution across more than one HTTP Web server, you can typically deploy a hardware-based load balancer in front of the HTTP servers to farm out inbound HTTP requests to all configured servers.

Each HTTP server is configured with a copy of the WebSphere HTTP plug-in configuration file, which details the various URL and Uniform Resource Identifier (URI) context path names, the associated route group, and the associated server and port to which the HTTP request should be routed.

Listing 3-1 shows an example WebSphere HTTP plug-in configuration file.

Listing 3-1: Example HTTP Plug-in Configuration File
 <?xml version="1.0"?>
 <Config>
     <Log LogLevel="Error" Name="/opt/WebSphere/AppServer/logs/native.log"/>
     <VirtualHostGroup Name="default_host">
         <VirtualHost Name="*:80"/>
         <VirtualHost Name="*:5500"/>
     </VirtualHostGroup>
     <UriGroup Name="MySampleApplication/myApplication_URIs">
         <Uri Name="/myJsps/examples/*"/>
     </UriGroup>
     <ServerGroup Name="mySampleAppServerGrp">
         <Server Name="MyApp">
             <Transport Hostname="AppServer1" Port="5500" Protocol="http"/>
         </Server>
     </ServerGroup>
     <Route ServerGroup="mySampleAppServerGrp"
         UriGroup="MySampleApplication/myApplication_URIs"
         VirtualHostGroup="default_host"/>
 </Config>

The WebSphere HTTP plug-in configuration file shown in Listing 3-1 has five main sections. Let's now look at the important parts of this file.

Section One

The first section is the noncontext configuration section. In the example, it's as follows:

 <Log LogLevel="Error" Name="/opt/WebSphere/AppServer/logs/native.log"/> 

I'll discuss the meaning of the various directives in later chapters; however, for the purpose of this chapter, know that this line declares the location for the plug-in log file and the debug or logging level required.

Section Two

The second section lists the directive that specifies the default virtual host. Again, I'll discuss this in more detail later in the book; however, for now, know that the following directive establishes the default host name and the ports that are applicable to the backend WebSphere application server:

 <VirtualHostGroup Name="default_host">
     <VirtualHost Name="*:80"/>
     <VirtualHost Name="*:5500"/>
 </VirtualHostGroup>

This directive lists two virtual host ports: port 80 and port 5500. Port 80 is the default port implemented, and port 5500, as you'll see shortly, is the port for the example application.

What this directive means is that for the default host (whatever that may be), two ports are valid and configured.

This configuration must also match that which is configured in the HTTP Transport settings of the backend WebSphere application server node.

Note  

I discuss the WebSphere application server HTTP Transport settings in more detail in later chapters. Briefly, however, the HTTP Transport settings, along with other parameters, manage the queue and connections into the WebSphere application server port (in other words, port 5500).

Section Three

The third section declares the possible URI groups. The UriGroup directive declares the URL contexts that this WebSphere plug-in should trap as part of user requests. A URL context is defined as the nonhost component of the URL. For example:

 http://www.mysite.com/myJsps/examples/HelloWorld.jsp 

The URL context is the defining characteristic of what the HTTP plug-in should do with the request.

A UriGroup directive looks like this:

 <UriGroup Name="MySampleApplication/myApplication_URIs">
     <Uri Name="/myJsps/examples/*"/>
 </UriGroup>

This directive basically defines a URI group with the name of MySampleApplication/myApplication_URIs and declares that any URL whose context matches the expression /myJsps/examples/* (regardless of host), such as the previous URL, will be trapped and associated with this URI group.

Section Four

The fourth section has the responsibility of setting up the routing of the requests. That is, now that the plug-in has the user's URL request, a route association needs to be defined so that the user request can be forwarded to the appropriately configured server.

The Route directive looks like this:

 <Route ServerGroup="mySampleAppServerGrp"
     UriGroup="MySampleApplication/myApplication_URIs"
     VirtualHostGroup="default_host"/>

What this is declaring is that a server group known or defined as mySampleAppServerGrp is associated with a URI group name of MySampleApplication/myApplication_URIs. As noted previously, there's a URI group defined with that name. Therefore, this Route directive associates the server group name with the URI group name.

This is then used for the final section.

Section Five

This final section of a basic WebSphere HTTP plug-in configuration file sends the user's request, based on the data built up over the four previous sections, to an appropriate application server and port.

This final section is a server group definition, and it looks like this:

 <ServerGroup Name="mySampleAppServerGrp">
     <Server Name="MyApp">
         <Transport Hostname="AppServer1" Port="5500" Protocol="http"/>
     </Server>
 </ServerGroup>

Because there can be (and most likely will be) many ServerGroup directives in an HTTP plug-in configuration file, the ServerGroup directive name is matched via the Route directive as detailed in the fourth section.

To summarize, the final action on the part of the plug-in itself is to match the server group name with the route, as defined by the Route directive, and then forward (proxy) the request to the application server and port, defined by the Transport declaration, within the ServerGroup directive.

Depending on your configuration and how many application servers are operating in your backend, there may be multiple Server or Transport declarations defined (for example, multiple Hostname, Port, and Protocol statements).

Note  

Please note that this is a basic plug-in configuration file. You'll need to configure this plug-in file specifically for your environment; the previous example will not function in your WebSphere server environment.

Application Server

The application server is the key component of any WebSphere implementation.

The application server is the encapsulating construct within which the Java Virtual Machine (JVM) provides the runtime for deployed Java 2 Enterprise Edition (J2EE) and Java-based applications. The application server comprises various containers, such as Enterprise JavaBean (EJB) and Web containers, as well as other communications-based components. Further, the application server services requests for other WebSphere and J2EE technologies such as the following:

  • Tracing services

  • JavaMail services

  • Java Database Connectivity (JDBC) and pooling

  • Messaging interfaces (for example, Java Message Service (JMS))

  • Java Management Extensions (JMX) administrative services

  • CORBA Object Request Broker (ORB) services

  • Java Naming and Directory Interface (JNDI) services

  • JVM Profiler Interface (JVMPI)

Essentially, the application server calls on these services as the deployed application components request them. Although many of the services listed aren't technically operating within the JVM space that the application server is operating within, the client, or stub, components exist for the purposes of service request and activation when needed by your deployed applications.

Given the overall platform importance and centralization of the WebSphere application server, this is one of the key areas I'll focus on for optimizing, tuning, and configuring WebSphere.

Finally, the application server also manages the key containers that are central to J2EE environments, including the EJB container, the Web container, and, for WebSphere 5, the Java Connector Architecture (JCA) container.

I'll discuss each of these containers in more detail in later sections of this chapter.

Web Container

The Web container is one of the well-known components of a J2EE environment.

Quite simply, the Web container is responsible for processing presentation-based components such as Java Server Pages (JSPs), servlets, and other types of presentation such as static content from Hypertext Markup Language (HTML) and Extensible HTML (XHTML) files.

Because the Web container operates in its own JVM (of which you can have multiple), it provides several Java object management services such as garbage collection and object allocation/deallocation for servlets and JavaBeans.

Although the Web container is synonymous with Web-based technologies, it's capable of running standard Java technologies such as JavaBeans and data connectivity services such as JDBC. In smaller environments, many application architects choose not to use EJBs (and hence the EJB container) and elect to operate all code and functionality from within the Web container.

There's nothing technically wrong with this approach; however, a few limitations from both a performance and an application functionality point of view affect this decision. These mainly relate to application distribution and integration with legacy systems. This type of approach is acceptable because you can use several alternative application and platform design options.

Figure 3-2 shows an example Web container implementation within WebSphere.

Figure 3-2: Web container services within WebSphere

As depicted, the system manager configures the Web container with options and settings such as ports, type and setup of connectivity between the Web server and the Web container (for example, queuing and threads), and session management.

Embedded HTTP Server

Within the application server exists an embedded HTTP server. This component provides a good platform for testing code during deployments, as well as a fallback in the event that connectivity is lost between the frontend static Web server(s) and the application servers.

Caution  

Although the embedded HTTP server makes Web server-like services available, you shouldn't use this component alone as the frontend Web server. This role should be handled by something such as the SunOne/iPlanet Web server, Apache, or the IBM HTTP Server. Essentially, the embedded HTTP server doesn't have the configuration capability or performance to manage the vast numbers of incoming connections that are synonymous with static Web servers. Security is another reason not to operate the HTTP server in this manner. Always use a standard Web server (or many) to rate limit and control the inbound connections to your Web containers.

EJB Container

The EJB container is analogous to the Web container in that it provides and facilitates the runtime requirements and services for operating EJBs.

One of the big selling points of EJBs is that an EJB container handles all the low-level work that the developer typically needs to do. This includes what I like to call the plumbing: all the things that relate to file management, database connectivity (depending, of course, on whether you use container-managed persistence (CMP), bean-managed persistence (BMP), or direct JDBC), threading, transaction management, and so on. This type of development is typically the most developer-intensive aspect of coding applications that require any of these services.

The other important point to note is that the EJB container operates from within the application server. Essentially, the EJB container provides an operational construct for the EJBs.

The EJBs communicate to the outside world via an intermediate layer. Figure 3-3 shows the way in which EJB clients communicate to EJBs within the EJB container.

Figure 3-3: EJB container insulation and implementation

It's important for system managers to get an overview of the EJB technology in the context of the WebSphere EJB container. Therefore, as a high-level overview of how the EJB works in relation to the WebSphere EJB container, you'll look at a simple EJB transaction.

As depicted, once the client has obtained a reference to the business components' home object (Point 1), the client/client application requests the home object to find (Point 2) or create (Point 3) an EJB.

The home object creates or finds the EJB, and a reference to the remote object (in other words, the remote EJB object) is returned to the client (Point 4). The client then calls a business method on the EJB object. This EJB object works with the container to manage the transaction, the communications, and the thread between the client and the EJB itself. The EJB object at this point proxies the business method call and the associated values to the bean for processing.

The EJB itself then processes the request using properties and values stored in the JNDI context, and then, once processing has been completed, the return values are proxied back to the EJB object, which, in turn, returns the values to the client.

The stage where the proxy between the client, EJB object, and the EJB itself takes place is a fairly complex phase of the transaction. You can tweak many performance levers at this point, both in code and in the EJB container and application server, to optimize or tune the transaction.
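To put the walkthrough into code, here's a minimal sketch of an EJB client following the same points. The JNDI name ejb/Account and the Account/AccountHome interfaces are hypothetical and exist only for this illustration; a real client would use the home and remote interfaces generated for your deployed beans.

 import java.rmi.RemoteException;
 import javax.ejb.CreateException;
 import javax.ejb.EJBHome;
 import javax.ejb.EJBObject;
 import javax.naming.Context;
 import javax.naming.InitialContext;
 import javax.rmi.PortableRemoteObject;

 // Hypothetical remote and home interfaces for a simple Account bean.
 interface Account extends EJBObject {
     double getBalance() throws RemoteException;
 }

 interface AccountHome extends EJBHome {
     Account create(String customerId) throws CreateException, RemoteException;
 }

 public class AccountClient {
     public static void main(String[] args) throws Exception {
         // Point 1: obtain a reference to the bean's home object via JNDI.
         // The JNDI name "ejb/Account" is an assumption for this sketch.
         Context ctx = new InitialContext();
         Object homeRef = ctx.lookup("ejb/Account");
         AccountHome home = (AccountHome)
                 PortableRemoteObject.narrow(homeRef, AccountHome.class);

         // Points 2/3: ask the home object to create (or find) the EJB.
         // Point 4: a reference to the remote EJB object is returned.
         Account account = home.create("customer-42");

         // The business method call is proxied by the EJB object through the
         // container to the bean itself, and the return value is proxied back.
         System.out.println("Balance: " + account.getBalance());
     }
 }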

Overall, EJBs are one of the more complex areas of J2EE application development. The bulk of the complexity lies in the application code itself; however, the system manager should be familiar with what's actually going on in the EJB container for obvious reasons!

Application Database

The application database, or databases, provide the deployed applications with their long- and short-term application storage. This could be anything, including application configuration, profile information, customer information, or application data. I'll cover the myriad parameters and options available for managing performance with these application databases in later chapters.

Although the application database isn't specifically a WebSphere component, it's important to note that the application database (or the accessing of it) is one of the main causes of poorly performing WebSphere application environments. Poorly modeled data schemas and poorly tuned database configurations can quickly be the demise of a deployed WebSphere application's performance.
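For reference, this is typically how deployed code reaches the application database: through a data source bound into JNDI and backed by WebSphere's connection pool. The JNDI name jdbc/AppDataSource and the customer table in this sketch are assumptions.

 import java.sql.Connection;
 import java.sql.PreparedStatement;
 import java.sql.ResultSet;
 import javax.naming.InitialContext;
 import javax.sql.DataSource;

 public class CustomerLookup {
     // "jdbc/AppDataSource" is a hypothetical JNDI name; the real name is
     // whatever the system manager configured for the application's data source.
     public String findCustomerName(int customerId) throws Exception {
         InitialContext ctx = new InitialContext();
         DataSource ds = (DataSource) ctx.lookup("jdbc/AppDataSource");

         Connection con = null;
         try {
             // The connection comes from WebSphere's connection pool; pool
             // sizing and tuning are what later chapters address.
             con = ds.getConnection();
             PreparedStatement ps =
                     con.prepareStatement("SELECT name FROM customer WHERE id = ?");
             ps.setInt(1, customerId);
             ResultSet rs = ps.executeQuery();
             return rs.next() ? rs.getString(1) : null;
         } finally {
             if (con != null) {
                 con.close(); // returns the connection to the pool
             }
         }
     }
 }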

Session Database

The session database is the critical data storage mechanism for multiapplication server environments. User session information (that is, information pertaining to a user's environment and session state) needs to be persisted or distributed so multiple application server instances or nodes can reference that data. If you're looking to load balance users between multiple nodes, or if you're simply operating multiple application servers within a single node, you need to use session persistence if the information that a user builds up in their session is important for using the application.

Typically, the data is persisted to a database such as Oracle, DB2, or Sybase. Through the JSESSIONID session identifier, which is the standard session identifier, the WebSphere environment can reference and retrieve session information that's persisted from multiple nodes or application servers configured to interface with a common session persistence data store.

Obviously, if the relational database that's used to maintain this persisted data isn't tuned correctly, or the JDBC parameters on the WebSphere end aren't correctly configured, then user response times will degrade almost exponentially as the load increases.

WebSphere 5 provides a new technology known as the Data Replication Service (DRS). It's possible to use this memory-to-memory replication capability between multiple application server nodes in place of a relational database to persist the session information. Depending on your availability requirements, a hybrid solution may be the chosen option, where session information is both persisted and published onto the configured DRS queue.

I'll discuss more of these options in later chapters.

Web Module

WebSphere 4 introduced a new component definition, known as a Web module. The Web module is effectively a WebSphere representation of a Web Archive (WAR) file. A Web module provides the ability to stop, start, and manage Web-specific applications (contained in WAR files) independently of an entire Enterprise Archive (EAR) file.

Even when you deploy WAR files that are contained within a J2EE EAR file, the WAR files are extracted and can be managed via the Web module services. The Web module allows a system manager to change settings specific to a WAR file's contents without affecting the rest of the application (for example, EJBs, other WARs, and so on). Within each Web module or WAR, you'll typically house servlets, JSPs, and static HTML content.

EJB Module

The EJB module isn't unlike a Web module. The EJB module is used to compartmentalize one or more EJBs. Typically, the module is encapsulated as an EJB Java Archive (JAR) and includes all the deployment descriptors and the EJBs themselves.

Compartmentalizing EJBs into EJB modules allows the system manager to stop and start specific, usually associated, groups of EJBs at runtime.

How you package the EJB modules has some bearing on the operational integrity of your system, so there are best practices associated with this. There aren't a great many performance options associated with the EJB modules themselves; however, given that they operate within the EJB container, I'll focus on this aspect of EJB technology within this book.

WebSphere 4-Specific Component Architecture

IBM WebSphere 4 arrived on the market boasting some significant architectural differences from its predecessor, WebSphere 3.54. With the large number of fairly fundamental changes from version 3.54 to version 4 came the added benefit of a much improved platform in the way of performance and robustness.

Prior to version 4 of WebSphere, BEA WebLogic's application server was the market leader, touted as having the best high-availability features. With version 4 of WebSphere, the gap closed significantly, and now both BEA WebLogic and IBM WebSphere are on a level playing field.

As discussed in earlier chapters, WebSphere 4 follows the standard J2EE component architecture. This is good for J2EE compliance, and IBM has been able to obtain market differentiation from other vendors with WebSphere's underlying engine components (a.k.a. containers) and some proprietary features such as domains, server groups, and clones (which I'll discuss later in this chapter).

The following sections present the major components specific to WebSphere 4.

WebSphere Control Program

As an alternative to the graphical administration console of WebSphere, WebSphere 4 offers a command-line management interface known as the WebSphere Control Program (WSCP). WSCP is a command-line interface providing access to all settings and commands available under the graphical consoles.

You'll realize the power of WSCP for tasks such as deployment and bulk work. You can write scripts to interface with WSCP that allow you to noninteractively install, modify, or manage your WebSphere environment.

For example, if you wanted to install a new EAR at 3 a.m., you could create a Unix shell script or a Windows/DOS batch file that triggers the shutdown of a particular application server, the removal of the existing EAR, the installation of the new or updated version of the EAR, and a restart of that EAR. From time to time in the book, I'll refer to using the WSCP interface for its ease of changing parameters. If you haven't used WSCP before, I highly recommend you try it. You can find it in your WebSphere bin directory on Windows ( <WebSphere_HOME>/bin/wscp.bat ) or Unix ( <WebSphere_HOME>/bin/wscp.sh ) platforms.

XMLConfig

The XMLConfig tool in WebSphere provides an alternative system management facility to the WebSphere administration console. You can configure all the components of WebSphere 4 in a batch-like mode using this tool. You can't do much to tune or optimize the performance of this component, but I mention it here for completeness.

Server Groups

WebSphere 4 improved the concept of server groups, and it provides additional management capabilities over those of version 3.54.

A server group is a logical grouping of near-identical copies of an application server and its configuration, deployed applications, and resources. In fact, a server group consists of the same structure and constructs as a normal application server. That is, the server group, like an application server, consists of components such as EJB and Web containers and the deployed configuration that's part of an application server (see the "Application Server" section for more details).

The key difference between a server group and an application server is that a server group isn't associated with any particular node. The logical representation of an application server through a server group provides the system manager with the ability to maintain groups of application servers distributed across multiple nodes.

The grouping also allows the system manager to stop, start, and make changes to common application servers distributed across a WebSphere cluster of common components, rather than having to configure similar application servers on disparate physical nodes one by one.

Figure 3-4 depicts the association between server groups and other components within WebSphere 4.

Figure 3-4: Server group association with WebSphere 4 components

A server group consists of a single application server configuration template; however, it can be operating or distributed to multiple WebSphere nodes. In Figure 3-4, there are three application server clones within a single application server group. The server group is a logical template and configuration mapping construct for cloned application servers (JVMs).

Each application server operating in the environment can be configured as a clone (see the next section), which allows the association to extend to one server group consisting of multiple application server clones, all operating on multiple physical nodes. This capability within WebSphere provides the ability to distribute common WebSphere components (that is, application servers) to multiple physical nodes, hence providing greater levels of availability and heightened performance.

The server group is the key component in giving WebSphere 4 its ability to scale both vertically and horizontally. I'll discuss the various options of WebSphere server groups in greater detail throughout the book.

Clones

Not unlike version 3.5, WebSphere 4 provides the ability to clone components of an environment. With WebSphere 4 cloning, system managers can create clones of server groups based on preconfigured operating instances. The key difference between a clone and a server group is that a clone, unlike a server group, consists of actual operating processes and components.

When clones are used in conjunction with server groups, system managers are able to change the master server group configuration, and, via the server group capability, the configuration change is then replicated (or synchronized) to all other clones operating within that server group.

What this capability allows the system manager to do is distribute application servers, or clones of an application server, across multiple servers to provide horizontal scaling or create multiple clones of an application server within a single node to provide vertical scaling.

Cloning supports both horizontal and vertical cloning simultaneously.

Note  

It's possible to use clones without server groups; however, more effort is required to maintain multiple independent configurations. Server groups provide the encompassing, centralized management control of multiple clones.

One of WebSphere's key features is the ability to load balance (a.k.a. workload management) between multiple servers and clones. For example, Figure 3-5 shows a high-level process of how you can use cloning (via server groups) to extend the load-balancing capability of WebSphere.

Figure 3-5: WebSphere 4 cloning and workload management flowchart

Figure 3-5 shows the flow of an inbound request from the frontend Web server HTTP plug-in. The request, based on a context of /myApplication/logon.jsp , is routed through to one of the clones on the application server. Clone 3 responds with the result of processing logon.jsp .

If clone 3 in Figure 3-5 became unavailable (for example, if the JVM crashed), clones 1 and 2 would continue to operate. The HTTP plug-in on the front of the Web server would detect the "dead" clone and route further requests through to either clone 1 or clone 2 available in the Work Load Management (WLM) pool.

Caution  

It's possible to modify a clone directly; however, in doing so you immediately introduce risk into your environment by having clones operating with different configurations. You should make changes to clones through the server group settings. This method then synchronizes the changes down to the associated clones.

Administrative Repository

WebSphere 4 operates using an administrative repository to store and facilitate global WebSphere configuration settings. Unless you operate many WebSphere 4 nodes within a domain, the repository database doesn't require a high-performance database environment. It does help, however, to ensure that it's operating on a highly available database, typically a high availability or active cluster.

In a basic WebSphere environment, the repository can operate on the local server; however, larger environments should look toward operating the repository on a dedicated database server or, preferably, a high-availability database cluster.

I'll discuss options for tuning the repository in later chapters of the book; however, the basic topology for a repository is almost trivial and can operate on Microsoft SQL, Oracle, DB2, Sybase, or Informix database platforms.

Administration Server

The WebSphere 4 administration server is the centralized management server for WebSphere.

The administration server is a critical component of WebSphere. It provides, among other features, functions such as stopping and starting application servers, nodes, server groups, and so forth. The administration server is also responsible for controlling, operating, and facilitating the operational runtime of the workload management and the general runtime control of the WebSphere server.

In a multinode deployment of WebSphere 4, each node operates an independent administration server that also manages the interaction and state of the other nodes in a WebSphere domain.

WebSphere 5-Specific Component Architecture

As mentioned earlier, WebSphere 5 includes some fairly significant changes in its architecture from that of WebSphere 4. The most noticeable changes pertain to JMS services, data stores (repositories and so on), and the concept of a cell, node, and deployment manager.

Overall, there are essentially two WebSphere 5 server implementation types. The first, as shown in Figure 3-6, is the standard WebSphere 5 and is typically associated with simple, single server installations.

Figure 3-6: WebSphere 5 standard deployment component architecture

The second implementation type, as shown in Figure 3-7, is the WebSphere network deployment configuration. This implementation type is used for multiserver configurations or when your environment requirements become more complex (horizontal and vertical scaling).

Figure 3-7: WebSphere 5 network deployment component architecture

Like WebSphere 4, version 5 follows a standard J2EE deployment architecture but also includes some additional advanced proprietary features to extend key features such as high availability, clustering, and redundancy. Through the following sections, I'll discuss the various key components of the WebSphere 5 platform.

Cell

The cell is a new technology in WebSphere 5. It provides one of the key combined high-availability and management (administrative) features of WebSphere 5. Basically, a cell is analogous to a WebSphere 4 domain. In WebSphere 4, a domain was a grouping of associated application servers in a commonly administered group, or domain. This grouping capability allowed system managers to effectively split their WebSphere environment into separated administrative zones that removed single points of failure associated with common administrative repositories.

The technology allowed a system manager to shut down a section of an operating WebSphere environment and upgrade that section independently of the still-functioning environment.

The alternative is to have a single-domain environment. In that case, regardless of how many application servers, clones, and server groups are configured, if the environment is configured as a single administrative domain, then an outage in the administrative repository (or, in WebSphere 5 terms, in a deployment manager, node agent, or something in between) would mean the entire application environment would fail. Therefore, for WebSphere 4, it's advantageous in larger environments, where horizontal as well as vertical scaling is used, to operate multiple administrative domains.

In WebSphere 5, the concept of a domain has been replaced with that of a cell, which has a few additional features over a domain. A WebSphere cell encompasses the entire WebSphere 5 environment from an administrative point of view.

A cell is an autonomous administrative WebSphere zone; because of this, if you increase or partition your WebSphere operating environment into more administrative cells, you'll incur management overhead.

What this means is that because each cell is administered independently, you must deploy software to each cell, hence requiring more checks and effort for each application change. Obviously, the trade-off, albeit positive, is that you can isolate hardware and software failures using cells. With the alternative, each major software or hardware fault in a common component (for example, the repository database) will cause a total outage.

You can extend the multicell approach to more than two cells, but as indicated previously, the more cells, the more overhead.

Note  

You can reduce the management overhead associated with multiple cells by configuring semi-automated or automated WebSphere scripts (which I'll also discuss in this book). I've seen one WebSphere 5 implementation configure a local administrative portal page, which allows operations staff to administer and manage a multicell environment from a single location. Without this approach, management of the environment increases proportionally to the number of cells!

If the deployment manager process for one particular cell failed, on most occasions the WebSphere application servers in the failed cell would also fail.

Note  

This isn't always the case; I'll cover the details regarding this in future chapters.

However, all application services (for example, the applications themselves, the WebSphere application servers, and so on) would continue to operate as normal, within the remaining cell.

Caution  

For this type of environment topology to function correctly, you must ensure that if cell A failed, all physical servers in cell B would be able to continue operating, conducting business as usual, with the extra load. Furthermore, you'll need to ensure that as part of your server capacity modeling, you capture not only the runtime load but the added overhead of the initial failover load (in other words, servers in cell B handling the additional peak requests as all traffic and load is diverted).

This is an important configuration and tuning aspect of WebSphere and therefore will be discussed in more detail during later chapters.

Node

A node is typically associated with a physical server entity and operates, on that physical server, WebSphere application servers and other common server processes. A node will contain the following:

  • One or more application servers, possibly consisting of multiple Web, EJB, and JCA containers

  • JMS server(s)

  • Administrative services

  • JNDI server(s)

  • Security services

In a network deployment configuration, a node will also consist of a node agent (see the next section). A network deployment implementation of WebSphere will also consist of multiple nodes, with each node consisting of one or more application servers, all managed and orchestrated within an administration cell by a deployment manager.

Node Agent

A node agent is the coordinating agent process for a network deployment configuration of WebSphere. The node agent has no management services available to applications, essentially making it transparent to deployed J2EE components. The node agent also facilitates performance monitoring (used by JVMPI) and deployed configuration management (synchronization between nodes for the deployment manager). The agent communicates to the application, JMS, and other servers, as well as the deployment manager, to manage and coordinate the operations of the node on which the agent resides.

This should highlight the agent's role in configuration synchronization between the various application servers within a node.

Deployment Manager

The deployment manager, like the node, is fundamental in a network deployment configuration of WebSphere. The deployment manager in essence provides the sole central point of control for all administrative functions pertaining to all components within a cell.

The deployment manager governs the content mastered within the various repositories on each node within a cell. The synchronization is managed via the node agent operating within each node in the cell.

Given that the deployment manager governs many of the centralized administrative functions in a cell, it therefore hosts the administration console.

Web Service Engine

The Web Service engine provides J2EE draft-compliant Web Services. It's important to briefly discuss the status of the Web Service components on the WebSphere 5 platform.

Because Web Services, at the time of writing this book, weren't completely ratified, only draft specifications of the technology existed. Therefore, IBM (in my opinion, correctly) used an open-source series of Application Programming Interfaces (APIs) available from the Apache Axis project.

Axis is a series of APIs that provide the key ingredients of a Web Service implementation. They include the following:

  • Universal Description, Discovery, and Integration (UDDI)

  • Simple Object Access Protocol (SOAP)

  • Web Services Inspection Language (WSIL)

  • Web Services Description Language (WSDL)

  • Web Services Invocation Framework (WSIF)

The Axis framework APIs and components are instantiated, on request, within the Web container. Future releases of WebSphere will possibly operate the Web Service engine in some other form (for example, either as a Web Service container or as a Web Service server analogous to the JMS server).
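As a rough illustration of how the Axis APIs are exercised, the following sketch uses the Axis dynamic invocation client to call a SOAP operation. The endpoint URL, namespace, and operation name are hypothetical; real values come from the service's WSDL.

 import javax.xml.namespace.QName;
 import org.apache.axis.client.Call;
 import org.apache.axis.client.Service;

 public class QuoteClient {
     public static void main(String[] args) throws Exception {
         // Hypothetical endpoint and operation; substitute the WSDL-defined
         // values for a real service.
         String endpoint = "http://localhost:9080/myServices/StockQuote";

         Service service = new Service();
         Call call = (Call) service.createCall();
         call.setTargetEndpointAddress(new java.net.URL(endpoint));
         call.setOperationName(new QName("urn:example", "getQuote"));

         // Axis serializes the parameters into a SOAP request, posts it to
         // the endpoint, and deserializes the SOAP response.
         Object result = call.invoke(new Object[] { "IBM" });
         System.out.println("Quote: " + result);
     }
 }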

UDDI Registry

Given the hype around Web Services and their potential increased integration into the enterprise, IBM has included a UDDI Registry as part of WebSphere 5 to help sites better use and integrate Web Service technologies into legacy environments.

The UDDI Registry is in fact a J2EE application developed by IBM that the system manager needs to deploy into each application server for custom or third-party applications to use.

Note  

If you're not familiar with what UDDI is, think of it as the Yellow Pages for SOAP or any of the Web Service protocols. UDDI is somewhat analogous to NIS/NIS+ in Unix environments.

JCA Container

The JCA container is a lesser-known component of the J2EE world. JCA allows components, primarily EJBs, to communicate with myriad legacy systems typically associated with enterprise information systems. One example is communicating from an EJB to some form of transaction processing-based legacy mainframe.

IBM and other J2EE application server vendors provide many forms of connectivity solutions such as JDBC and MQ Series; however, there may be a case where an enterprise houses disparate legacy backend systems that include specialized schemas or highly normalized database systems such as when working with Online Analytical Processing (OLAP) cubes.

In essence, the JCA container is a "catch all" for connectivity within a J2EE environment where native drivers may not exist.
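A hedged sketch of what JCA access looks like from application code follows, using the Common Client Interface (CCI). The JNDI name eis/LegacySystem is hypothetical, and the InteractionSpec and Record implementations come from the particular resource adapter you deploy.

 import javax.naming.InitialContext;
 import javax.resource.cci.Connection;
 import javax.resource.cci.ConnectionFactory;
 import javax.resource.cci.Interaction;
 import javax.resource.cci.InteractionSpec;
 import javax.resource.cci.Record;

 public class LegacyCall {
     // "eis/LegacySystem" is a hypothetical JNDI name for a deployed
     // resource adapter's connection factory; the spec and record objects
     // are supplied by the adapter vendor.
     public Record callLegacySystem(InteractionSpec spec, Record input)
             throws Exception {
         InitialContext ctx = new InitialContext();
         ConnectionFactory cf = (ConnectionFactory) ctx.lookup("eis/LegacySystem");

         Connection con = cf.getConnection();
         try {
             Interaction interaction = con.createInteraction();
             // The container manages pooling, security, and transactions for
             // the adapter, which is the "plumbing" the JCA container removes
             // from application code.
             return interaction.execute(spec, input);
         } finally {
             con.close();
         }
     }
 }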

Name Services Server

The name services server is pivotal to J2EE environments. JNDI provides a distributed contextual tree for storing all types of information relevant to J2EE resources such as JMS queues, general application properties, EJB states, JDBC locators, and much more.

Each application server operates its own JNDI namespace and stores the information just described.

Distributing JNDI information between multiple nodes is an area of interest for performance and scalability. I'll go through the options in later chapters of this book.
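For illustration, a client or remote process resolves objects from a particular application server's namespace along these lines. The host name, bootstrap port, and JNDI name are assumptions for this sketch (2809 is a common default bootstrap port in WebSphere 5).

 import java.util.Hashtable;
 import javax.naming.Context;
 import javax.naming.InitialContext;
 import javax.sql.DataSource;

 public class NamingLookup {
     public static void main(String[] args) throws Exception {
         Hashtable env = new Hashtable();
         // WebSphere's JNDI provider; the host and bootstrap port below are
         // assumptions and must match your application server's settings.
         env.put(Context.INITIAL_CONTEXT_FACTORY,
                 "com.ibm.websphere.naming.WsnInitialContextFactory");
         env.put(Context.PROVIDER_URL, "iiop://appserver1:2809");

         Context ctx = new InitialContext(env);

         // Look up a resource bound into the server's namespace, for example
         // a data source registered under a hypothetical name.
         DataSource ds = (DataSource) ctx.lookup("jdbc/AppDataSource");
         System.out.println("Looked up: " + ds);
     }
 }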

Security Services Server

WebSphere 5 has a number of security services available to developers. The security server provides the hooks between applications and the WebSphere application containers to facilitate authentication and authorization into varying levels of the WebSphere environment (in other words, administration, application access, fine-grained and coarse-grained access control, and so on).

JMS Server

JMS is a technology that allows communication on both a point-to-point and publish/subscribe model. The key areas in WebSphere 5 where JMS is used are internal communications (messages) within the components of a WebSphere cell and when using message-driven beans.

JMS has a multitude of uses; it integrates well with IBM MQ Series, and many Enterprise Application Integration (EAI) vendors use JMS as the entry protocol into their proprietary EAI bus architectures.

The JMS server also is integrated into the transaction management services provided by WebSphere. This provides the integration layer into heavyweight transactional systems such as MQ Series. In context, when a series of MQ requests makes up a single transaction, one request failing will cause the transaction management service to roll back the entire transaction.
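As a simple illustration of the message-driven bean side of this, here's a minimal sketch of an EJB 2.0 message-driven bean. The bean and destination are hypothetical, the rollback behavior shown assumes container-managed transactions, and the actual queue or topic binding is defined in the deployment descriptor rather than in code.

 import javax.ejb.MessageDrivenBean;
 import javax.ejb.MessageDrivenContext;
 import javax.jms.JMSException;
 import javax.jms.Message;
 import javax.jms.MessageListener;
 import javax.jms.TextMessage;

 public class OrderListenerBean implements MessageDrivenBean, MessageListener {
     private MessageDrivenContext ctx;

     public void setMessageDrivenContext(MessageDrivenContext ctx) { this.ctx = ctx; }
     public void ejbCreate() { }
     public void ejbRemove() { }

     // Called by the container for each message arriving on the configured
     // queue or topic (the JMS destination binding lives in the deployment
     // descriptor, not in the code).
     public void onMessage(Message msg) {
         try {
             if (msg instanceof TextMessage) {
                 String body = ((TextMessage) msg).getText();
                 // Process the order payload here.
                 System.out.println("Received order: " + body);
             }
         } catch (JMSException e) {
             // With container-managed transactions, marking the transaction
             // for rollback causes the message to be redelivered, consistent
             // with the transaction management noted previously.
             ctx.setRollbackOnly();
         }
     }
 }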

In a network deployment configuration, the JMS server operates in its own JVM space, independent of the application server JVM; under a base configuration of WebSphere, the JMS server operates within the same JVM as the application server.

Administrative Services

The administration service provides the interface and capability for all the configuration services within WebSphere. Under a base configuration deployment, an administration service runs within the core application server. This differs from a network deployment, in which the administration service operates in each of the four key servers: the node agent, the deployment manager, the application servers, and the JMS server.

The administration service stores the WebSphere configuration in a set of XML files stored natively on the operating server's file system. This is different from the WebSphere 4 platform where the repository or configuration information is stored in a relational database.

You can also configure the administration service to use differing levels of access to the service. Please see your WebSphere administration manual for more information about this.

Note  

As discussed in the "Deployment Manager" section, under a network deployment configuration, the deployment manager takes control of all administrative functions. So, although an administration service may be actively operating on each of the four key servers (in other words, the JMS server, application servers, node agent, and deployment manager), under a network deployment configuration the deployment manager will transparently manage the administrative functions on all four servers.

Data Stores and Repositories

An environment such as WebSphere has several needs for short- and long-term storage of information. The following sections present an overview of the key repositories used in WebSphere 5.

Configuration Repository

The configuration repository is the replacement for the WebSphere 4 relational database-stored configuration repository. For WebSphere 5, IBM has chosen to replace the relational schema with flat XML configuration files.

These files are managed by the administration service operating under the deployment manager (under a network deployment configuration) and by the administration service operating under the core application server (under a base deployment configuration).

All aspects of the server's configuration are stored in these XML files.

Caution  

Although it's technically possible to edit the XML-based repositories by hand, unless you're completely sure about what you're editing, you should use the Web-based administration console to manage these configuration repositories.

Master Repository

In a network deployment configuration where one or more cells exist, the master repository maintains the entire cell's configuration information.

As discussed in the "Configuration Repository" section, the configuration repository stores all relevant configuration data for a particular node's environment. Under a network deployment configuration, each node's configuration repository maintains a synchronized view of the master repository, with the deployment manager of each cell mastering all data.

Node-specific information is still mastered in the configuration repository; however, the changes for node-specific items are routed via the cell's deployment manager and then synchronized back down to each node. Under the network deployment configuration mode, the local configuration repositories are read-only, and only the deployment manager can write to the configuration repository.

Note  

Throughout the book I use both the terms configuration repository and node repository. When I talk about a configuration repository, I'm referring to the configuration repository of a WebSphere implementation using a base configuration model. Node repository refers to a configuration repository under a network deployment implementation model, to highlight that I'm referring to the node-specific configuration rather than the cell or master repository.

Other Components

A number of products are available from IBM and other vendors that can aid in promoting high availability and robustness. Load balancers, content distributors, and content proxy solutions are just some of the available platform tools. You'll look at some of them now.

IBM Edge Components

IBM's Edge components include software-based network services such as load balancers and caching servers. In this book, I briefly cover some design and environment considerations relating to Edge components; however, because these aren't technically WebSphere, I've excluded the finer details.

Effectively, what these Edge load-balancing components achieve is to balance user requests between multiple frontend Web servers. Common alternatives to this software option are hardware appliances from vendors such as Cisco Systems/ArrowPoint (for example, the CS-11500 content switch), Radware, Intel, and many more.

In summary, there are many ways to fulfill your frontend load-balancing requirements; IBM WebSphere Edge components can provide you with a near turnkey solution to do this if you're happy with operating these layer 4 through 7 services (of the Open Systems Interconnection (OSI) model) via software on low-end systems. Chapter 5 covers the concepts and technology associated with frontend load balancers such as the Edge server.

WebSphere Cluster Environment

In WebSphere 5, the notion of a WebSphere cluster has somewhat dwindled because of the concept of a cell and the services pertaining to cells.

In WebSphere 4, server groups and clones are the foundation of a cluster. In WebSphere 5, server groups and clones have been replaced with other technologies; however, the term cluster is still loosely attributed to a situation where you have a cluster of physical servers, all operating the same J2EE applications. These physical servers in the cluster aren't necessarily aware of one another. The glue that allows them to be termed a cluster is typically mastered through frontend standard or geographical load balancing. The session state may or may not be persisted in this situation.

In summary, the term cluster in WebSphere 5 is a logical grouping rather than a capability. It's the same as a cell of nodes participating in application workload management, without the umbrella management of the cell and deployment manager. I'll discuss this in detail in Chapter 5.



