You have a maximum budget of $500,000 for all database-
Software licensing and software support come out of a different budget, and are not factors in this exercise. Components such as cabling and racks are also outside the scope of this exercise, but in the real world you need to consider these factors as well.
| Note |
Keep in mind that this scenario is heavily focused on the SQL Server side of things and not on the application or the rest of the infrastructure, including the aforementioned cabling, racks, and other forms of hardware and software that will comprise the entire solution. For a full real-world solution, those would all be part of the planning process. |
The company has decided that hardware can be purchased only from the options listed in the Excel spreadsheet HW_Cost.xls on the Hardware Base Costs tab.
The bullet points below
Here is what you know so far:
There will be a minimum of two databases: one for reporting and one for your main data. The main data will be OLTP, and the reporting data will be read-only with regular data loading.
Near-real-time reporting looks to be an absolute requirement.
With 700 GB of total disk space needed, following the 80/20 split, there will be 560 GB of data and 140 GB of log. However, that really means 1120 GB of data and 280 GB of log because you have at least two systems hosting copies of the data.
You will have both directly connected and Internet-based users, therefore you will need at least two different security models for the solution.
Because the one system is mainly
Here are some questions you must answer:
You have two security models: authenticating users through the Web and directly in the application. Will they be handled the same way in the code?
How will the database logic be implemented ”are you using direct code or stored procedures? This affects your memory requirements.
What is your network bandwidth? Can it handle whatever mechanism you
What type of staffing will be required? What type of expertise is needed, and is it already in-house?
What are the service level agreements (SLAs) beyond the general four nines? Is it four nines from 5 A.M. to 9 P.M. Eastern Standard Time since you cover both North America and Europe? What are the standard support contracts for your company?
Do you need to create a warm standby in addition to the reporting server? Is it possible to make them the same?
At the same time, draw up a list of initial risks:
How will the disks be protected?
Are there any barriers to adopting a technology such as failover clustering?
Where should the servers be placed for ultimate availability?
Will $500,000 be enough? Will trade-offs need to be made?
What is the business impact of downtime?
Are there proper monitoring systems in place to handle the new systems?
What type of maintenance will be needed, and how will it interface with current plans for other systems?
How many Web servers are needed, and do you set up a demilitarized zone (DMZ) environment? Where will each server be placed? (This is outside the database scope, but in an overall project, you need to consider this.)
How will you load test the system? How will you test it in general? Will there be the same equipment to develop and test on? If not, what is the risk from a production standpoint? What is the budget for doing a load test (including dedicated systems and software), and will it be done in-house or externally?
With enough basic information in hand, you can take the known factors, questions, and risks and
Tackling the database synchronization issue, you have a few choices available to you:
Log shipping This would solve both the reporting and availability and standby issues, but with frequent transaction loads, you really could not use it for reporting.
Replication
This would solve the reporting problem, but does the data you need to send fit in the row
Data Transformation Services (DTS) packages Creating a DTS package is definitely possible, but how would you monitor such a solution and ensure that it is working? What logic would you put into it? How would it impact the OLTP database? Do your developers have the expertise to code it? Would it affect the development timeline?
From the main OLTP database standpoint, how frequently is data changing? That is, are there many
As a first stab, you decide to use failover clustering for protecting the main OLTP database and the distributor ”they will be housed in one instance, transactional replication to populate the reporting server, and log shipping for protecting the entire solution in the event of a whole site failure. Using log shipping means that you not only have another data center but also that you are taking the risk that the log shipped database will serve as both the reporting database and the OLTP database. That might impact performance and would need to be
You also decide to implement this entire solution on Microsoft Windows Server 2003 because it will provide the longest supportability and you know the system will stay in production for at least two
There are a few phases to designing your architecture: designing the hardware and then dealing with people and processes.
| Note |
Because people and processes are extremely important in your design, they are covered extensively and in much detail in other chapters. They are also presented at a high level in this chapter, and all considerations are reviewed. |
First, examine your disk requirements. On the primary OLTP system, you decide that it will serve as both the Publisher and the Distributor. You need to plan for the OLTP database, a distribution database, msdb usage, transaction logs, MS DTC, and a quorum disk, as shown in Table 11-1. You are looking at more than 700 GB of disk space for each copy of the data.
|
Requirement |
Disk Space Planned |
Separate LUN |
|---|---|---|
|
Publisher/main database ”data portion |
600 GB |
Yes |
|
Publisher/main database ”log portion |
200 GB |
Yes |
|
Distributor |
50 GB |
Yes |
|
MS DTC |
1 GB |
Yes |
|
Quorum |
1 GB |
Yes |
Notice that extra space was allocated to all databases. This allows for some extra room should the estimates provided actually turn out to be lower than what is needed or for additional growth.
Now it is time to address the problem of how to deal with the reporting server and warm standby. Should you use the warm standby as a replication primary? No, because you have clustering in place. It will be used purely for high availability, so you need 800 GB of disk space to match the primary. On the replicated server, you need the 800 GB as well. It is beginning to look as if you will incur a large cost by purchasing enough disk space to host the three copies of the data.
At this point, start thinking about how you will monitor, maintain, and administer these systems in a production environment. Choosing the technologies involved will
Although it does not matter in what order you configure the systems, this example starts with the cluster. Disks will more than likely represent your greatest cost. Do you go with larger, slower disks or smaller, faster disks? Start by looking at the number of disks needed for an optimal deployment.
| Note |
The next section goes into depth on cost and disk configuration. Although the amount of detail might be painful, it is necessary. |
{% if main.adsdop %}{% include 'adsenceinline.tpl' %}{% endif %}
Quorum disk Two dedicated disks (must be the same size as all other disks, even though you are using a small portion).
MS DTC Same as the quorum requirements.
Publisher/main data
At 600 GB, in terms of raw space, that is 17 36-GB disks or 9 72-GB disks. However, that is raw space only; Windows needs a bit more for formatting. Pad that by at least one drive, meaning 18 and 10, respectively. Now, if you implement RAID 5, add another disk for parity, which
Publisher/main log At 200 GB, this is approximately six 36-GB disks and three 72-GB disks in terms of raw space. Again, add another disk for Windows formatting, making it seven or four. You do need to mirror your log disk, so double the number, making it 14 or 8.
Distributor (data and log)
At 50 GB, you could
System databases This means mainly msdb and tempdb. Here, plan on at least the same space as the Distributor. Remember, however, that tempdb usage will vary. Because you do not have any testing numbers, you are guessing about the usage you need.
Now you must configure your cluster based on the numbers you determined. Because there is only one main cluster system on your pick list, it makes your decision a bit easier.
You already decided to use the best processor. This will be a two- node cluster. That translates into the 900-MHz model at $81,200 per node (all dollar amounts given are in U.S. currency) or $162,400 for the two cluster nodes alone. You might have to rethink the decision to cluster, as this represents 32.5 percent of your entire budget.
You need two network cards for each node and, at $250 per card, that is $1,000. You are now up to $163,400.
You will use two controller cards per disk enclosure at $5,000 each. Combined with the disk enclosure, which holds 12 disks, each will be $17,000. The cost of the entire disk subsystem will be added in later.
One host bus adapter (HBA) is already included, and you will not have redundant disk paths. This is an availability risk, but you are trying to keep your costs down, and you determine that having only one card combined with clustering is an acceptable risk for this solution.
You need a fibre hub for your disks, which is $3,000, bringing your new total to $166,400.
| On the CD |
For your disk situation, the math can be found on the CD in the file HW_Cost.xls on the Cluster Disk Costs tab. |
As you can see in the file on the CD, HW_Cost.xls, there are many
| Tip |
Remember, in a real-world situation outside of this book example, take into account the cost of racks and other infrastructure-
|
Because you use racks in your data center, only stand-alone system option number two is valid. That translates into the following:
$45,000 for the base system that comes with 2 GB of memory
$3,400 to add another 2 GB of memory
$1,600 for the fibre adapter (HBA) for the server
$1,200 for the fibre hub
$7,000 for the fibre switch
$3,500 for the extra warranty
This calculates to a base cost of $61,700. Add an additional $250 for the log shipped secondary, as you want to use a private network to send the transaction logs, bringing the cost to $61,950 for that system. Those two elements added to the cluster now cost $123,900 for a new total of $406,100. Next comes the disk configuration for the two systems, which can be found on the Rack System “ Non Cluster tab of the Excel spreadsheet. Remember that you will have separate storage area networks (SANs) for each system, which means that you need separate controllers, hubs, and so on. You could use the same SAN, but that would be an availability risk. The cost of the hub and switch was already
Because the reporting server is in the same data center as the cluster, you can utilize the same SAN (
Although you are still over budget, you go to management to explain the trade-offs you had to make and to get approval for the additional $5,750, which, over the life of the system, works out to be just under $240 per month for the two years. Although you have what seems to be a winning design, management questions the need for the log shipping server and says that if they approve any additional money to go over budget, it should be used for a testing system. This puts your availability in a disaster recovery scenario in jeopardy.
You need to factor in the disk enclosure space. Because you need to use the 36-GB 10,000 rpm disks, that is $11,400 for the disks. Because you are using 82 slots and each enclosure has 12 slots, you must add two more disk enclosures for a total of $34,000. That brings the system cost back up to $539,750. If the log shipped system ever becomes the primary, you will need to do
At this point, you have the following problems:
You are over budget.
You have no test system.
There is not a lot of backup room for every database as it grows, reducing the amount of retention you can have on the active system. You have not budgeted disk space or cost for dedicated backup
You have what you consider nonoptimal disk configurations for your data (log is fine because it is using RAID 1), possibly exposing you to availability and performance problems.
You have no hot spares for disks.
Unfortunately, there is no good solution without significantly increasing budget. If you eliminate the log shipped server, which dramatically puts you in harm s way in a disaster recovery scenario, you can subtract $160,475, bringing you under budget for the system as it is configured. However, you might lose $160,475 or more in revenue and leads if the system goes down. Will saving money on test systems further increase your risk for potential problems in production?
At this point, you need to make some hard decisions. What is most important to the
|
Function |
# of Disks |
Cost |
|---|---|---|
|
Quorum RAID 1 |
2 |
$2,000 |
|
MS DTC RAID 1 |
2 |
$2,000 |
|
System databases RAID 5 |
3 |
$3,000 |
|
OLTP data |
36 |
$36,000 |
|
OLTP log |
14 |
$14000 |
|
Backups |
40 |
$40,000 |
|
Total |
97 |
$97,000 |
Taking into account the nine disk enclosures at $17,000 per unit, that is a cost of $153,000 for a total disk subsystem price of $250,000. The cluster will now cost $414,400. However, because you are adding load to the system, you add another 4 GB ($8,500) of memory per node, bringing the total cost to $431,400. This solution should satisfy your main concerns about the reporting and main OLTP capabilities. The one question here is your use of tempdb. If it is heavily used, you will need more or separate disks, but you now have the drive space for it, not to mention the room for hot spares, so you have also increased your expandability and availability.
You now have $68,600 left in your budget. This is not enough money for either a test system or a log shipped secondary for disaster recovery. At this point, you need to look at the following options:
Request the additional $91,875 for the log shipped secondary to cover yourself if something catastrophic happens, sacrificing the test system, although the log shipped secondary is not to the same specifications as the production system.
Request the additional $91,875 for the test system to cover yourself if something catastrophic happens, sacrificing the log shipped secondary. Although this test system is not to the same specifications as the production cluster, it would be better than having no test system at all.
Request another $431,400 for a proper test system, no matter what decision you make about the log shipped secondary. This ensures that your production and development environments will be equal. If this is done, the entire solution will cost $1,000,000 or double the original budget.
Document your exposures and hope for the best.
The reality is that the last point ”documenting your risks ”is most likely to be the case, and the remaining $68,600 will go into other aspects such as cost overruns in development. The money might not be spent, but multiple people weighing the risk or reward of doing so will make that decision.