The TriMont web site team reviewed your list of questions, and provided additional information prior to your second visit.
Now that you have some more information about the site, you may run some additional estimates prior to your follow-up visit with the customer. Let's begin with the throughput calculations to complete the Capacity Sizing worksheet, and then we'll move on to the network analysis. Table 10.3 shows the remaining input data for the Capacity Sizing worksheet.

Table 10.3. Capacity Sizing Worksheet, Remaining Input Data
Calculating Throughput (Page Rate and Request Rate)

Page Rate

The customer tells us the average user views five pages during her visit. Let's use this number to determine how many pages per second the site must serve at peak. The peak page rate calculation looks like this:

Users arriving/hour * Pages each user requests / Minutes/hour / Seconds/minute = Page arrival rate/second

63,000 users/hour * 5 pages/user visit / 60 minutes/hour / 60 seconds/minute = 87.5 pages/second

Network Traffic Mix

TriMont told you something very interesting about its web pages: Each page contains 15 static elements, on average. So, for every page the site returns, the browser asks for 15 static elements from the site to complete the page. At peak, the site receives the following request burden:

87.5 pages/second * 15 static requests/page = 1,312.5 static requests/second

Assuming the pages don't contain framesets (which might result in multiple dynamic calls), each page request results in a call to one dynamic element, such as a servlet or JSP. This means the site takes the following total request rate during peak:

1,312.5 static requests/second + 87.5 dynamic requests/second = 1,400 total requests/second

You might encourage this customer to consider a caching proxy server for the static content, as we discussed in Chapter 3. Such a server might improve the overall performance of the web site by returning static elements more quickly to the users. Table 10.4 shows the completed throughput section of the Capacity Sizing worksheet.

Table 10.4. Capacity Sizing Worksheet, Throughput
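The throughput arithmetic in this section can be sketched in a few lines of Python. This is only an illustration of the calculation, not part of the Capacity Sizing worksheet; the function and variable names are hypothetical, and the one-dynamic-request-per-page default reflects the no-framesets assumption stated in the text.

```python
SECONDS_PER_HOUR = 60 * 60  # 60 minutes/hour * 60 seconds/minute


def peak_page_rate(users_per_hour, pages_per_visit):
    """Pages the site must serve per second during the peak hour."""
    return users_per_hour * pages_per_visit / SECONDS_PER_HOUR


def request_rates(page_rate, static_per_page, dynamic_per_page=1):
    """Return (static, dynamic, total) requests per second.

    dynamic_per_page=1 assumes no framesets, so each page makes
    exactly one call to a servlet or JSP.
    """
    static = page_rate * static_per_page
    dynamic = page_rate * dynamic_per_page
    return static, dynamic, static + dynamic


# TriMont inputs taken from the text
page_rate = peak_page_rate(users_per_hour=63_000, pages_per_visit=5)
static_rate, dynamic_rate, total_rate = request_rates(page_rate, static_per_page=15)

print(f"Page rate:        {page_rate} pages/second")
print(f"Static requests:  {static_rate} requests/second")
print(f"Dynamic requests: {dynamic_rate} requests/second")
print(f"Total requests:   {total_rate} requests/second")
```

Keeping the inputs as parameters makes it easy to rerun the estimate if the customer revises the peak-hour user count or the number of static elements per page.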
Network Analysis

Now let's complete a rough network analysis as described in Chapter 9. For this example, we'll go through the steps; later we'll show you how to use the Network Capacity worksheet in Appendix A to build the estimate.

This example assumes that all of the components of network traffic (outbound, inbound, database transfers, and so on) share the same network at some point. Some web sites use different networks for inbound versus outbound traffic. Likewise, database traffic sometimes moves on a separate network as well. In practice, you need to consider the size of data elements sharing the same network in order to size the network in question.

Always use peak load for your estimates. You must build a site capable of handling the largest anticipated traffic volume.

Calculating Network Analysis Components

Most networks rate their transmission capabilities in terms of the bits or bytes per second they transmit. Therefore, we need to break our data into bytes and focus on reducing our data to the peak second.

User Arrival Rate

A key estimate required for subsequent network analysis calculations is the user arrival rate. This estimate tells us how many users come onto the site during any given second in the peak hour. On some sites, the user's first page request "weighs" more than subsequent requests. For example, the user might trigger a download of a frameset, JavaScript, or graphics on her first visit. The browser caches this information for subsequent requests, so the subsequent requests generate less network traffic. In addition, the server sets up HTTP session data and perhaps retrieves preference data from a database on the first viewed page. For this site, we know some users make use of a heavyweight applet. We need the user arrival rate for later estimates involving the impact of this applet on the network traffic.
If you didn't do this calculation before, the user arrival rate per second is

Users arriving at the peak hour / Minutes/hour / Seconds/minute = User arrival rate/second

63,000 users at peak/hour / 60 minutes/hour / 60 seconds/minute = 17.5 users/second

The input data for the Network Capacity Sizing worksheet is shown in Table 10.5.

Table 10.5. Network Capacity Worksheet, Input Data
Outbound Sustained Traffic

First, let's look at the traffic generated by the page requests. As we discussed in Chapter 5, e-Commerce sites typically do not benefit significantly from browser caching. Thus, let's assume most of the data on each page requires loading on each request. The calculation in this case looks as follows:

Page rate * Average page size = Bytes/second

87.5 pages/second * 60KB/page = 5.25MB/second

Inbound Sustained Traffic

Outbound traffic is always larger than the inbound request traffic. As a generalization, we assume inbound traffic at 20% of the outbound traffic. For an e-Commerce web site, this might be a bit generous, given the overall size of the pages, but we always prefer to overestimate rather than underestimate. (Slightly too much capacity is less problematic than slightly insufficient capacity.) This is our estimate for inbound traffic in this case:

Outbound sustained traffic * 20% = Inbound traffic/second

5.25MB/second * 20% = 1.05MB/second

Database Transfer

We must also consider other data moving in the network, such as data returned from application databases, in our calculations. The customer tells us the database returns about 2KB of data per page. About 90% of the site's page requests generate a database call. The database transfer calculation looks like this:

Page rate * Data retrieved/page * Weighting factors = Bytes from db/second

87.5 pages/second * 2KB/page * 90% = 157.5KB/second

Notice that TriMont did not provide any information about the database transfers from the account and order databases. These data transfers may be very small relative to other traffic (such as the outbound HTML pages), but you need to verify this with the customer at your next meeting.

HTTP Session Transfer

The customer wants to use a database for HTTP session failover purposes. For this calculation, we like to assume the full contents of the HTTP session transfers between the application server and the persistent HTTP session database.
This transfer occurs on every request, depending on the customer's application's characteristics. We also assume some type of caching mechanism at the server, so we do not consider a round trip for this information in our calculations. This means that the session database is not read on every request, but is updated on every request, at a minimum to update the last access timestamp. If the caching scheme is not in place or does not work effectively, you must account for this in your calculation. The HTTP session transfer calculation looks like this:

Page rate * Average HTTP session data transferred/page = Bytes to the HTTP session db/second

87.5 page requests/second * 2KB/request = 175KB/second

Applet Transfer

This web site contains a large applet (750KB). Depending on the frequency of the applet's download, the applet might impact the network capacity significantly. The customer tells us that about 10% of the site's users access this applet, so let's use the user arrival rate to give us a feel for the impact of the applet at any given second. The applet transfer calculation looks like this:

User arrival rate * Applet size * Weighting factors = Applet bytes transferred/second

17.5 users/second * 750KB/user transfer * 10% = 1.313MB/second

Total Transfers

Now we add up all of the elements of the network traffic to get a rough total of sustained network traffic:

Outbound traffic + Inbound traffic + Database transfer + HTTP session database transfer + Applet transfer = Total known data transferred/second

5.25MB/second + 1.05MB/second + 157.5KB/second + 175KB/second + 1.313MB/second = 7.946MB/second total known data transfer

In Chapter 9, we estimate how much sustained traffic three popular network types support. These estimates include some allowance for protocol overhead (packet headers, connection "handshaking," and so on). This web site probably needs a Gigabit network to give us a comfortable operating margin in production.
If we use a switched Ethernet, we might experience a site slowdown if our estimates prove even just a little too low. Also, the switched network does not give us a lot of room for future web site growth.

Using the Network Planning Worksheet

Let's complete the calculations in the worksheet from Appendix A in Table 10.6.

Table 10.6. Determining a Minimum Network Requirement of 1Gb
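As a cross-check on the hand calculations, the network traffic components can be recomputed with a short Python sketch. It is not the Appendix A worksheet itself; the function name and parameter names are hypothetical, all sizes are kept in KB/second, and the input values are the ones quoted in the text.

```python
SECONDS_PER_HOUR = 3600


def network_estimate(users_per_hour, pages_per_visit, page_kb,
                     inbound_fraction, db_kb_per_page, db_fraction,
                     session_kb_per_page, applet_kb, applet_fraction):
    """Return each sustained-traffic component in KB/second, plus the total."""
    arrival_rate = users_per_hour / SECONDS_PER_HOUR   # users/second
    page_rate = arrival_rate * pages_per_visit         # pages/second
    outbound = page_rate * page_kb
    parts = {
        "outbound": outbound,
        "inbound": outbound * inbound_fraction,
        "database": page_rate * db_kb_per_page * db_fraction,
        "http_session": page_rate * session_kb_per_page,
        "applet": arrival_rate * applet_kb * applet_fraction,
    }
    # The right-hand side is evaluated before "total" is added to the dict.
    parts["total"] = sum(parts.values())
    return parts


# TriMont inputs taken from the text
estimate = network_estimate(
    users_per_hour=63_000, pages_per_visit=5, page_kb=60,
    inbound_fraction=0.20, db_kb_per_page=2, db_fraction=0.90,
    session_kb_per_page=2, applet_kb=750, applet_fraction=0.10,
)

for name, kb_per_second in estimate.items():
    print(f"{name:>12}: {kb_per_second:,.1f} KB/second")
```

The unrounded applet term is 1,312.5KB/second, so the unrounded total comes to about 7,945KB/second. The 7.946MB/second figure in the text results from rounding the applet term up to 1.313MB/second before adding.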
Dial-up User Considerations

If TriMont receives a lot of dial-up traffic, the page sizes might become a factor over slow phone lines. The average page size is 60KB. Over a 24.4Kbps phone line, this requires at least

24,400 bits/second / 8 bits/byte = 3,050 bytes/second

60,000 bytes / 3,050 bytes/second = 19.7 seconds!

This far exceeds the five-second response time threshold the TriMont team set for themselves. You discuss this with the TriMont team. They believe most of their traffic comes from high-speed connections or users equipped with 56Kbps modems. This brings the transfer time on the large pages to the sub-10-second threshold. They believe this is a reasonable response time expectation for people using this level of equipment.

You consider the TriMont team's attitude a bit optimistic. In case they change their minds in the future, you give them the following suggestions to consider:
It is also worth mentioning to the customer that the applet requires very long download times over a 24.4Kbps modem (more than four minutes). Even doubling the modem speed does not move the applet into the desired response time range.

HTTP Session Pressure

After completing the network analysis, you turn your attention to memory issues. Since the TriMont site uses HTTP sessions, you need an estimate for how much memory the HTTP sessions require during the peak period. As we discussed in Chapter 4, you need to make sure the Java application server has enough memory to operate.

The TriMont site removes the user's HTTP session when the user logs out. Do not count on this in your estimations of HTTP session pressure! Most users do not log out at the end of their visits; they just move on to another web site. So, when you calculate HTTP sessions, do not assume the web site gets significant benefit from the logout function. (Likewise, do not write your test scripts so that each virtual user dutifully logs out of the test case.) See Chapter 7 for further information.

TriMont's HTTP session timeout is 30 minutes. Given that the average user visit is 7 minutes, let's assume the user's session lasts 37 minutes (7 minutes for the actual visit and 30 minutes for HTTP session timeout). Given our previous user arrival rate estimate of 17.5 users per second, the web site might have the following number of simultaneous HTTP sessions in memory during the peak:

User arrival rate/second * 60 seconds/minute * Number of minutes user's session lasts = Maximum HTTP sessions

17.5 users/second * 60 seconds/minute * 37 minutes = 38,850 HTTP sessions

TriMont HTTP sessions average 2KB of data in size. At peak load, the web site requires the following memory to hold the HTTP sessions:

38,850 simultaneous HTTP sessions * 2KB/HTTP session = 77.7MB

Using the JVM Sizing Worksheet

Let's put all of the HTTP session sizing into the worksheet from Appendix A in Table 10.7.

Table 10.7. HTTP Session Impact Worksheet
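The HTTP session estimate can also be expressed as a small Python sketch. The function names are hypothetical, and the conversion assumes 1MB = 1,000KB, which matches the 77.7MB figure in the text.

```python
KB_PER_MB = 1000  # assumption: decimal units, matching 38,850 * 2KB = 77.7MB


def max_http_sessions(arrivals_per_second, visit_minutes, timeout_minutes):
    """Sessions held in memory at peak.

    Each session is assumed to live for the visit plus the full timeout,
    because most users never log out explicitly.
    """
    session_lifetime_minutes = visit_minutes + timeout_minutes
    return arrivals_per_second * 60 * session_lifetime_minutes


def session_memory_mb(sessions, session_kb):
    """Memory needed to hold all sessions, in MB."""
    return sessions * session_kb / KB_PER_MB


# TriMont inputs taken from the text
sessions = max_http_sessions(arrivals_per_second=17.5,
                             visit_minutes=7, timeout_minutes=30)
memory_mb = session_memory_mb(sessions, session_kb=2)

print(f"Peak HTTP sessions: {sessions:,.0f}")
print(f"Session memory:     {memory_mb:.1f} MB")
```

Because the memory requirement is a simple product, it scales linearly with the average session size and with the session timeout, so either input is worth re-verifying with the development team before testing.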
Given the number of users on the web site during the peak time, this is a very reasonable HTTP session footprint. In fact, the HTTP sessions for all these users could, depending on the web application footprint, fit inside a single JVM.

Keep in mind that a small increase in the average HTTP session size on this site means a significantly larger HTTP session footprint. If the session size grows to 10KB (still not very much data), we'd need about 388MB (38,850 sessions * 10KB) just for session data at peak usage. Your JVM might not have enough heap space to handle this. Make a note to recheck the average HTTP session size with the TriMont development team before the testing begins.

The TriMont team's usage of HTTP sessions puts them in a great position regarding their memory footprint. Of course, not every customer manages their HTTP session footprint as well as TriMont apparently has. See our discussion in Chapter 2 for more details on techniques to reduce HTTP session pressure if you find yourself working with a web site with large HTTP sessions.

Test Scenarios

The customer gave you a lot of information about visitor usage patterns. You can use this information to develop a rough breakout of the test scenarios you need to test the web site.

Potential Test Scripts

We know of the following scripts and their relative weighting:
We also know of some special functions of the web site:
Notice our "bread and butter" functions of Browse, Check Order, and Check Account do not add up to 100% of the traffic. Maybe the TriMont team left some buffer for users of the Boat Selector function and River Conditions applet who do not go further into the site to use other features such as Browse. You need to double-check this with the TriMont team (we don't want to miss a significant function).

Also, you might want to request a further breakdown of the Browse activity. Do most users take advantage of the Search functionality discussed earlier while browsing, or do most users browse by category rather than searching? Finally, will it make a significant difference in our performance test if the users search versus walking through a series of categories? Again, clarification on small details might make a significant difference in the success of the test.

Test Scenario Considerations

You need a very large shopping list. The catalog database contains one million items. Avoid reusing the same items repeatedly in your scripts, or the items you use repeatedly might be cached at the host database. This results in a lower response time for the test than you will see in the field. See Chapter 7 for details.

You also need to interact with a production account and order system. TriMont plans to exercise the site fully, including placing orders and checking on their status. However, you don't want to trigger delivery of merchandise to the test lab! You want to operate outside the order and delivery system, but keep the shopping experience as authentic as possible for the performance tests.

In these cases, you need to work closely with the TriMont team. As we discussed in Chapter 9, the customer has probably resolved this problem at some time in the past. TriMont may use a special set of account numbers to generate test orders. The order system recognizes these special accounts and does not reduce or ship inventory.
Other options include building a special order database just for the test that bypasses the standard order and inventory system. (You also need a set of order numbers for the Check Order function. Again, the customer might generate a set of dummy orders just for this part of the test.)

Moving Ahead

The Java application server vendor probably provides some capacity planning information to customers. Now that you know more about the TriMont application, you should consult the vendor's web site for white papers to help you pick the server capacity the application needs.

The test needs hardware. TriMont needs to pick a hardware vendor for their new site, if they haven't already. They also need to pick machines for testing that match the eventual production configuration.

Capacity planning at this stage is only an educated guess. The vendor's guidance probably narrows the range of possible servers. For example, the application might require something in an 8-12 CPU configuration, but it definitely needs more than a 4 CPU server provides. However, the estimates give you a starting point for machine sizing, which you will confirm as part of the performance test. (See Chapter 15 for more details on determining machine capacity.)