Pay special attention to scaling issues when you are writing applications for a clustered environment. Poorly written code can suffocate any Web site, no matter how much hardware you throw at it. Building applications that scale depends on good coding techniques and on writing clean, well-thought-out code. Scalable code is well organized, modular in nature, structured, and free of common bottlenecks.

Code Organization

A stable and scalable application typically is the end product of careful design and planning before any code is written. Good software comes from good planning and from following common software development life cycle (SDLC) methodologies. Some attributes of good code are easy to spot: it is well organized, sufficiently commented, and easy to follow. Your software should also be thoroughly documented. Most people don't associate this necessity with scaling, but there is nothing worse than trying to grow, extend, and manage software over its life cycle without proper documentation.

Before you begin to write any of your application's code, design a directory structure that matches the requirements of your specific application. For example, suppose you're building a large application that makes use of a large number of images and static content. Rather than creating an image directory on each server, you might want to link to content on a separate cluster of file servers or proxy servers.

You'll also want to design with encapsulation and reuse in mind. Try to employ as many of ColdFusion MX 7's new features as possible to create loosely coupled applications that have few dependencies on other files, allowing easier management and less complicated application changes over time.

Always decide beforehand on specific coding standards and conventions that will be enforced. This not only allows other developers to collaborate more easily, but also acts as system documentation for you and for future developers who maintain the system.
Along with coding conventions, consider using a specific application-development methodology or framework such as Mach-II (www.mach-ii.org) or Fusebox (www.fusebox.org), especially for large projects. Established methodologies allow groups to work together with less difficulty, and they often solve problems common to most applications. Not every application will be a good fit for public methodologies, and you'll have to consider the particular skills of your team as well as the application's requirements before making that decision.

Modularity

Modular code helps promote code reuse. Code that is used many times in an application, such as a CFC for authentication, should become more stable over time as developers fix bugs and tweak it for performance. Modular code is also much easier to test, and there are a number of useful testing tools to help developers unit-test CFCs. Well-written modular code follows good coding practices and helps you to avoid common bottlenecks. It also eases development efforts because developers do not have to rewrite this code every time they need similar functionality.

Streamlined, Efficient Code

Implementing best practices for Web site development is an important discipline for developers building highly scalable applications. The following example illustrates that point. The code attempts to find the name of the first administrator user. Each administrator user has a security level of 1.
The code queries all users, loops through the record set searching for the first administrator record, and returns that user's name:

    <cfquery name="getAdminUser" datasource="db_Utility">
        SELECT * FROM tbl_User
    </cfquery>

    <!--- Loop until you find first user with security level of 1 --->
    <cfloop query="getAdminUser">
        <cfif trim(getAdminUser.int_Security) IS 1>
            <cfset AdminName = getAdminUser.vc_name>
        </cfif>
    </cfloop>

    Admin User Name: <cfoutput>#AdminName#</cfoutput>

This example demonstrates inefficient code that can slow your application if this piece of code sustains many requests. It retrieves every column of every row, and even after it finds the first administrator record, it does not stop looping through the returned user record set (so AdminName actually ends up holding the last match, not the first). What if the user table contained thousands of records? This code would take a long time to process and consume valuable system resources.

Here's a more efficient version for finding the first administrator record and returning the name:

    <cfquery name="getAdminUser" datasource="db_Utility">
        SELECT TOP 1 vc_name
        FROM tbl_User
        WHERE int_security = 1
    </cfquery>

    <!--- Default to an empty string so the output below cannot fail
          when no administrator record exists --->
    <cfset AdminName = "">
    <cfif getAdminUser.RecordCount GT 0>
        <cfset AdminName = getAdminUser.vc_name>
    </cfif>

    Admin User Name: <cfoutput>#AdminName#</cfoutput>

This code is much more efficient and easier to understand. The query isolates only the records and columns that need to be used in the code. It returns at most one record, and only if some record has a security level of 1.

NOTE

As important as efficiency is, don't fall into the trap of always trying to write the best and most efficient code possible. Often developers will sacrifice readability, modularity, and maintainability for the sake of performance; as a general rule, this is a huge mistake. Focus on writing good, solid, maintainable code. When you do, your application will be easy to scale via the addition of more hardware, which is always cheaper and easier to budget for, and has fewer consequences, than spending too much time rewriting code to get minor performance gains.
Avoiding Common Bottlenecks

The preceding example illustrated a simple way to write more efficient code. Let's look at other coding bottlenecks and discuss ways to avoid them.

Querying a Database

When writing queries to retrieve data for output on the screen or into form variables, pay careful attention to the number of records to be returned and the structure of the SQL itself. A bottleneck common to complex queries results from a query returning more records than are required, with the code using only a subset of them. Such a query should be rewritten to return only the required records.

In addition, database software is much more efficient at processing database requests than ColdFusion is. For a highly scalable Web site, it's best to create views for selecting data, and stored procedures for inserting, updating, and deleting data in the database. Design your ColdFusion templates to call these views and stored procedures to interact with the database. Asking the database server to perform this kind of work is much more efficient and tends to stabilize performance.

Here is an example of a poorly coded set of queries to retrieve data. This code is not scalable and will adversely affect Web site performance. Notice that the same table is queried twice to return different data. One query, in this case, would be sufficient:
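As a rough sketch of the pattern just described, the two listings below query the same table twice and then once. The data source and table names are borrowed from the earlier administrator example; the int_UserID column and the URL.userID variable are hypothetical and are used here only for illustration.

```cfml
<!--- Inefficient: two round trips to the same table for the same row.
      int_UserID and URL.userID are hypothetical names. --->
<cfquery name="getUserName" datasource="db_Utility">
    SELECT vc_name
    FROM tbl_User
    WHERE int_UserID = <cfqueryparam value="#URL.userID#" cfsqltype="cf_sql_integer">
</cfquery>

<cfquery name="getUserSecurity" datasource="db_Utility">
    SELECT int_security
    FROM tbl_User
    WHERE int_UserID = <cfqueryparam value="#URL.userID#" cfsqltype="cf_sql_integer">
</cfquery>

<!--- Better: a single query returns both columns in one round trip --->
<cfquery name="getUser" datasource="db_Utility">
    SELECT vc_name, int_security
    FROM tbl_User
    WHERE int_UserID = <cfqueryparam value="#URL.userID#" cfsqltype="cf_sql_integer">
</cfquery>
```

The combined version halves the number of database calls per request, which matters most on templates that are hit frequently.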
As you can see, only one query need be called to return this data. This is a common mistake.

Absolute Path, Relative Path, and Other Links

One of the more common problems when working with ColdFusion is confusion about when to use the absolute or relative path for a link. Both methods can be employed, but you must be cognizant of the impact of each approach when you are coding for a clustered environment. Here are a couple of questions to ask before utilizing absolute or relative paths in your application:
NOTE

A relative path is relative to the current template. An absolute path is relative to the root of the Web site.

Hard-coding links will cause problems with clustered machines. Say that you have an upload facility on your Web site that allows users to upload documents. The code needs to know a physical path in order to upload the documents to the correct place. Server 1 contains the mapped drive E pointing to the central file server where all the documents are stored. The file server has an uploadedfiles directory located on its D drive, so the path can be set to e:\uploadedfiles. But Server 2 does not contain a mapped drive named E pointing to the file server. If you deploy your code from Server 1 to Server 2, the upload code will break because Server 2 does not know where e:\uploadedfiles is. It's better to use Universal Naming Convention (UNC) syntax in the upload path: \\servername\d\uploadedfiles. Note that having one file server in the configuration described creates a single point of failure for your Web site.

NOTE

Universal Naming Convention (UNC) is a standard method for identifying the server name and the network name of a resource. UNC names use one of the following formats: \\servername\netname\path\filename or \\servername\netname\devicename.

Nesting Files Too Deeply

Nesting files using cfinclude, cfmodule, or any other mechanism is considered a valuable tool for developers in building complex applications. Nesting too many files within other files, however, can cause code to become unmanageable and virtually incomprehensible. A developer working on a Web site where nesting is especially deep may eventually stop trying to follow all the levels of nesting and just attempt to write new workaround code. This approach may cause the application to function in unexpected ways. Too many nested files in your code can also affect performance.
In part this is because of how CFMX 7 compiles things like file includes and cfmodule calls, but it is also due to the sheer number of functions and processes that a file made up of many nested files will call. It is always better to simplify your code by encapsulating specific functions into highly cohesive units and by calling as few operations as necessary per application request. Doing so will streamline the application, reduce nested layers, improve code readability, and increase performance. This all goes back to planning and designing before you code; nesting problems usually occur in applications whose requirements have changed over time.