Although you should consider performance in each phase of your CMS project, the design phase is certainly the most important. Designing a site with performance in mind will help you code and deploy a fast site.
ASP versus ASP.NET
When you are planning your CMS site, one of the first decisions must be whether to use ASP or ASP.NET. Ultimately, the decision is based on many factors, but from a performance viewpoint, the advantages of ASP.NET are quite clear.
The most obvious benefit of ASP.NET is the new output caching mechanism. This new feature is provided within the ASP.NET framework. Output caching is discussed later in this chapter. Also, considering that ASP.NET performs so well, most of the topics in this chapter will discuss ASP.NET site considerations.
The single greatest difference you can make in the performance of your code is your caching strategy. If you are serious about performance, then you need to get serious about your caching design. Early in your Web site design process, you should identify which parts of your site and which parts of each page could be cached. Generally speaking, you will want to cache the code that takes the longest to run. However, if you have a highly personalized site, you might not want to implement caching at all. As with most decisions, you need to find the technological solution that best addresses your business needs.
A number of different caching techniques are described in this chapter. You will need to balance these different caches (see Figure 34-1) according to the resources that you have available. For example, both ASP.NET output caching and CMS node caching use memory resources. In general, you will want to use output caching on as much of your site as possible. However, if you cannot fit all your most popular pages in the output cache, you will want to ensure that you allocate sufficient resources to the CMS node cache. There is no easy rule to follow. You will have to test your site to find the appropriate balance.
Figure 34-1. CMS 2002 caching architecture
For more information about caching, refer to "Best Practices for Improving MCMS Web Site Performance" in the MCMS 2002 documentation.
The Working Set
The term "working set" is used to refer to the set of pages that are most frequently accessed. This group of pages usually accounts for roughly 80% of the site traffic. For this reason, caching the working set results in the greatest performance gain across the site.
The default configuration for ASP.NET output caching is to use up to 60% of total physical memory. Using this configuration, a CMS server with 1GB of physical RAM can store roughly 2,500 pages in the output cache. Of course, this number is significantly influenced by the size of the pages. If your working set is larger than 2,500 pages, you may want to consider dedicating some testing to determine the size of these pages. Once you have done this, you can target the "heavy" pages and try to scale them down. Refer to the CMS partner page (http://www.microsoft.com/cmserver/partners) for information about companies that offer products that perform these types of tests.
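As a rough illustration of the arithmetic behind these figures, the following Python sketch estimates how many pages fit in the output cache. The function names and the sample page sizes are assumptions made for illustration. Only the 60% limit, the 1GB server, and the 2,500-page estimate come from the text above; the average page size of about 250 KB is simply what those figures imply, not a documented or measured value.

```python
def pages_that_fit(physical_ram_mb, cache_fraction, avg_page_kb):
    """Estimate how many cached pages fit in the output cache budget."""
    budget_kb = physical_ram_mb * 1024 * cache_fraction
    return int(budget_kb // avg_page_kb)


def implied_avg_page_kb(physical_ram_mb, cache_fraction, page_count):
    """Average cached page size implied by a memory budget and a page count."""
    return physical_ram_mb * 1024 * cache_fraction / page_count


if __name__ == "__main__":
    # 1GB server, 60% of memory available to the output cache (from the text).
    print(round(implied_avg_page_kb(1024, 0.6, 2500)), "KB per page implied")
    # Hypothetical page sizes: doubling the average size halves the capacity.
    for size_kb in (250, 500):
        print(size_kb, "KB pages:", pages_that_fit(1024, 0.6, size_kb))
```

If measurements show that your average cached page is much larger than the implied figure, the number of pages that fit shrinks proportionally, which is why targeting the "heavy" pages pays off.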
If you find that you are still having trouble getting your working set into the cache, you should consider increasing the server's memory or partitioning your site across multiple CMS servers. This topic is discussed later in this chapter.
Another key point to mention is that the size of the output cache does not significantly affect performance. This means that you can safely increase the resources allocated to the output cache without worrying that this in itself will negatively affect performance. The ASP.NET output caching mechanism is intelligent enough that this is not a concern. As long as increasing this memory allocation is not starving other systems of memory, then it is a good use of your resources.
ASP.NET Output Caching
As mentioned already, caching is your best way to increase performance. The next point to consider is that ASP.NET output caching is the most effective caching mechanism. Output caching is so effective that the performance of the output-cached WoodgroveNet site is over 400% better than that of the WoodgroveASP site.
Since ASP.NET output caching is implemented within the .NET Framework, this technique obviously applies only to ASP.NET pages. However, it is so effective that it's worth considering migrating to ASP.NET purely to use this feature. Figure 34-2 shows the impressive impact of output caching.
Figure 34-2. WoodgroveNet with .NET output caching and WoodgroveASP with ASP fragment caching
This chart is dramatic, but it is also important to show the results when pages are served from outside the cache. This is referred to as a "cache miss." Figure 34-3 shows what happens when cache misses increase.
Figure 34-3. .NET output cache ratio
The tests show that performance is seriously affected by cache misses. When a cache miss occurs, quite often it results in a trip to the database and also a write to the CMS node cache. This work is considerably slower than simply serving the cached HTML.
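One way to reason about the cost of cache misses is a simple weighted average of the time to serve a hit and the time to serve a miss. The Python sketch below uses that model. The function names and the timings (5 ms for a hit, 100 ms for a miss) are hypothetical and are not taken from Figure 34-3. The model also ignores queuing and CPU contention, so treat it as a lower bound on the real effect.

```python
def average_response_ms(hit_ratio, hit_ms, miss_ms):
    """Weighted average response time for a given output cache hit ratio."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * hit_ms + (1.0 - hit_ratio) * miss_ms


def relative_throughput(hit_ratio, hit_ms, miss_ms):
    """Throughput relative to a 100% hit ratio (1.0 means no slowdown)."""
    return hit_ms / average_response_ms(hit_ratio, hit_ms, miss_ms)


if __name__ == "__main__":
    # Hypothetical timings: 5 ms for cached HTML, 100 ms for a full assembly.
    for ratio in (1.0, 0.9, 0.5):
        print(ratio, round(relative_throughput(ratio, 5, 100), 2))
```

Under these assumed timings, even a 90% hit ratio cuts throughput to roughly a third of the fully cached case, because each miss costs as much as twenty hits.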
The reason that output caching is so effective is that it is the first operation that the server attempts. If the HTML for the page is cached, it is not necessary to do any other calculations. The cached HTML is simply sent to the browser client. It is possible to cache entire pages or fragments of pages. Performance is obviously optimized if the whole page is cached. However, the decision about how to implement this method of caching will be determined by your business requirements. For example, you may have some legal information displayed on your site. If your business requires that this data is always up to date, it may be that it can never be cached. In this case, you would not be able to output cache the entire page.
Partial-page caching is also an important means of using output caching. For example, you could choose to only cache your navigation controls. Whether you cache the whole page or just part of the page, a number of custom settings are available. For example, you can set the cache lifetime. You can also set some CMS-specific properties such as "vary by cms role". This allows you to cache the page content based on the role of the user who is accessing the site.
For more information about output caching, refer to the MCMS documentation, the WoodgroveNet sample site, and the ASP.NET documentation.
ASP Fragment Caching
If you are not running an ASP.NET site, then your best bet is to implement ASP fragment caching. CMS provides a built-in mechanism for doing this, but there are other choices. For example, if you are running Microsoft Commerce Server (CS) 2002, you might decide to use the CS LRU cache instead of the CMS fragment caching mechanism.
Fragment caching works in a similar fashion to ASP.NET output caching. The idea is to compile pieces of HTML and then serve these from memory instead of compiling them each time they are needed. Of course, ASP fragment caching can be used on any site since it does not require the ASP.NET framework or ASPX pages.
Figure 34-4 shows the benefit of fragment caching on the Woodgrove ASP site.
Figure 34-4. Fragment caching on the WoodgroveASP site
The following brief example shows how fragment caching is implemented in ASP code. For a more thorough example of this, refer to the WoodgroveASP sample site available on MSDN.
<%
'Display HTML for left navigation of this posting.
'First check the LRU cache; if the HTML is not found, then compile it
'and populate the LRU cache.
Dim strHtml, strKey
strKey = Autosession.This.GUID & "left nav"
strHtml = cache.lookup(strKey)
If strHtml = "" Then
    'Execute code to compile the required HTML
    strHtml = compiledHtml
    'VBScript does not allow parentheses around multiple arguments
    'when a method is called as a statement
    cache.add strKey, strHtml
End If
Response.Write(strHtml)
%>
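The look-up-then-populate pattern in the ASP example can also be sketched independently of any particular cache object. The following Python sketch is a minimal least-recently-used (LRU) cache with the same lookup and add operations. It is not the CMS or Commerce Server implementation. The class name, the function names, and the eviction limit are all hypothetical and chosen only for illustration.

```python
from collections import OrderedDict


class LruFragmentCache:
    """Minimal LRU cache for HTML fragments (illustrative only)."""

    def __init__(self, max_items):
        if max_items < 1:
            raise ValueError("max_items must be at least 1")
        self._max_items = max_items
        self._items = OrderedDict()

    def lookup(self, key):
        """Return the cached HTML, or None on a miss."""
        if key not in self._items:
            return None
        # A hit makes this entry the most recently used.
        self._items.move_to_end(key)
        return self._items[key]

    def add(self, key, html):
        """Store a fragment, evicting the least recently used if full."""
        self._items[key] = html
        self._items.move_to_end(key)
        if len(self._items) > self._max_items:
            self._items.popitem(last=False)


def get_fragment(cache, key, build_html):
    """Serve a fragment from the cache, building it only on a miss."""
    html = cache.lookup(key)
    if html is None:
        html = build_html()
        cache.add(key, html)
    return html
```

A caller would pass a key such as the posting GUID plus a fragment name, together with a function that compiles the HTML. The compile function runs only when the fragment is not already cached.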
CMS Node Cache
If a page is not available via output caching or fragment caching, CMS will have to assemble the HTML. To maximize the performance of this assembly process, CMS has a number of its own caching strategies. Only a couple of these caches are discussed here since many of them are not meant to be configured by users. Refer to Chapter 3, CMS Architecture, for more details about these caches.
The performance of the CMS server is highly dependent upon its customized caching strategies. The most prominent of these is the CMS node cache. The node cache is used to store various types of data, for example, the channel structure, the template gallery items, and the placeholder content.
Since the node cache is so important, it can easily be adjusted by CMS administrators. To configure this cache, open the Server Configuration Application (SCA) and click the Cache tab (see Figure 34-5). This allows you to alter the number of nodes in the node cache.
Figure 34-5. Cache settings in the SCA
As mentioned previously, you do not want to blindly increase this number. The more nodes you have in the CMS cache, the less memory there is available for output caching. You will need to find the right balance. Refer to the next chapter for information about CMS performance counters. These counters can help you find the right balance for your available resources.
The other cache that can be configured in the SCA is the disk cache. The disk cache stores data that would otherwise have to be retrieved from the CMS database. To minimize the trips to SQL Server, these files are cached on the server's hard drive. The combination of this cache and the inherent performance of SQL Server is so effective that SQL Server is rarely the performance bottleneck for a CMS site. This is true even when a significant number of CMS servers are accessing a single SQL Server. More often, the CMS server CPU is simply not able to keep up with the writes to all the appropriate caches.
To make use of this cache, you must have sufficient disk space available on your CMS server. Do not set the disk cache size so high that the server could run out of free space. One recommendation from Microsoft is to keep the Internet Information Services (IIS) logs on a different drive from the CMS disk cache. This is one example of how disk space can be freed up for use by CMS.
Template Design Considerations
CMS template design almost equates to CMS performance design. If your templates are designed well, it is straightforward to deploy a high-performance CMS site. If your template performs poorly, then you may find that you have to spend substantial time tweaking your caches and adding more resources to your deployment.
When you are creating CMS templates, trade-offs have to be made between offering increased flexibility and increased performance. Just as in any other Web site, the fastest templates are the ones that run the least amount of code and serve the least amount of data. On a CMS site, placeholders are used to store content, but each added placeholder also requires more code to run. The fastest CMS templates are those that have few placeholders with small amounts of data.
The amount of data in placeholders has a significant effect on the performance of CMS pages. But this is a very straightforward principle. Just as in any other Web site, the more data you send to the client, the slower the page will render. For CMS users, it is key to remember that placeholders do not nullify this principle. A well-administered CMS site will include guidelines for authors about the size and type of content that can be included on a particular page. Since image placeholders usually contain more data than HTML placeholders, the type of placeholder can also have an effect on performance. Make sure that your template design decisions include this consideration.
Another issue with large amounts of placeholder data is that caching is no longer as effective. You may save time compiling HTML, but this savings can be negated if the cached HTML is so large that transferring it across the wire takes a long time. Figure 34-6 demonstrates how throughput is affected as the size of placeholder data increases. In this test, each line of content contained 36 characters.
Figure 34-6. Placeholder size performance
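To get a feel for why large placeholder content erodes the benefit of caching, the following Python sketch estimates the transfer time for a given number of 36-character content lines. The function names, the one-byte-per-character encoding, and the 1 Mbps link speed are assumptions for illustration only. The sketch also ignores HTML markup, compression, and protocol overhead, so real transfer times will differ.

```python
CHARS_PER_LINE = 36  # line length used in the test described above


def payload_bytes(lines, bytes_per_char=1):
    """Approximate content size, assuming a single-byte character encoding."""
    return lines * CHARS_PER_LINE * bytes_per_char


def transfer_seconds(size_bytes, link_bits_per_second):
    """Time to push the payload over a link, ignoring protocol overhead."""
    return size_bytes * 8 / link_bits_per_second


if __name__ == "__main__":
    # Hypothetical 1 Mbps client connection.
    for lines in (100, 1000, 10000):
        size = payload_bytes(lines)
        print(lines, "lines:", size, "bytes,",
              round(transfer_seconds(size, 1_000_000), 3), "s")
```

Because the transfer time grows linearly with the content size, a cached page that is ten times larger takes ten times longer to deliver, no matter how quickly the server retrieves it from the cache.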
Number of Placeholders
Placeholder data is an important factor, but the number of placeholders is also important. All too often, site developers give in to pressure from their authors and try to build the ultimate template. The problem with trying to please everyone with one template is that you end up with so many placeholders that you significantly degrade performance. Each placeholder that you add requires the server to fetch and render more content. This clearly increases the amount of code that the server must execute.
Of course, there are other considerations. You do not want to create too many templates, because this leads to many different problems. Some sites find that after years of running their CMS system, they have produced so many templates that site management becomes awkward. The trick is to find a balance between performance and flexibility. CMS 2002 provides the ability to create your own placeholders. A custom placeholder may allow you to significantly decrease the number of placeholders on your template.
Microsoft has suggested that keeping the number of placeholders on each template under 100 is important. Although each site has its own performance goals, this is an awful lot of placeholders. If you find that you are using more than 20 placeholders on a page, then take a step back and consider if what you really need is another template.
Figure 34-7 shows how the number of placeholders affects performance of the site.
Figure 34-7. Placeholder number performance
Postings and Other CMS Containers
CMS containers are virtual structures. For example, you will not find anything resembling the CMS channel structure within your file system. Based on this architecture, CMS containers must be inflated from the database and cached on the server. These operations take time and are a significant factor in the performance of a CMS site. Fortunately, there are ways to mitigate this performance hit. A well-designed CMS site will not suffer from this issue.
The solution is quite simple. Make sure that all your CMS containers are organized in a distributed hierarchy. When it comes to adding channels, template galleries, resource galleries, and user roles, divide your container content into well-proportioned trees. Microsoft has suggested that you should keep your container content to fewer than 300 items. The most important case of this rule is the root level of the container trees. Figure 34-8 illustrates this point.
Figure 34-8. Performance of navigation code based on postings in a single container
In addition to the number of items in your containers, consider the depth of your trees. Rather than having a few deep trees, divide your containers into trees that balance both depth and width. If you have to make a choice, depth is preferable to width. Figure 34-9 shows the results of deep channel structures.
Figure 34-9. Posting depth performance
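The container guidelines above lend themselves to a simple audit. The following Python sketch walks a hypothetical in-memory model of a container tree, reports every container that holds 300 or more items, and returns the depth of the deepest branch. The Container class, the audit function, and the decision to count child containers as items are assumptions for illustration. The sketch does not use the CMS Publishing API.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    """Hypothetical stand-in for a channel or gallery."""
    name: str
    item_count: int = 0  # postings or resources held directly
    children: list = field(default_factory=list)


def audit(container, max_items=300, path="", depth=1):
    """Return (paths of oversized containers, depth of the deepest branch)."""
    full_path = f"{path}/{container.name}"
    # Assumption: child containers count toward the item total.
    total = container.item_count + len(container.children)
    oversized = [full_path] if total >= max_items else []
    deepest = depth
    for child in container.children:
        child_oversized, child_depth = audit(
            child, max_items, full_path, depth + 1)
        oversized.extend(child_oversized)
        deepest = max(deepest, child_depth)
    return oversized, deepest
```

Running such an audit periodically against an export of your channel structure would highlight containers that are candidates for splitting into subcontainers.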
CMS resources have many benefits. They allow a single file to be easily used in many places and by many authors. However, just as in any other Web site, resource files should be used carefully. These files generally comprise a significant portion of the data downloaded to the Web browser client. Resources can be any sort of file. However, for simplicity's sake, image files will be used as the default example.
CMS resources are the files stored within a CMS resource gallery. If you add a file directly to a template or add a file directly within a placeholder, then these files are not managed by CMS. Management has advantages but it also comes with a performance implication. This impact can be mitigated by caching your resources. If all your CMS resources are cached, they will behave just like resources on any other Web site. Refer to the earlier section about adjusting the size of your CMS disk cache.