Foreword


As I sat behind the wheel of Engine 11, peering out the windshield into the morning fog, my mind was racing. I looked at my watch – 0746 hrs. Fourteen minutes left in my 10-year career as a firefighter. I looked on with a renewed curiosity as my comrades walked back toward the engine, having finished cleaning up from a traffic accident we had responded to. I was staring at the intersection where the accident had occurred, contemplating what was before me as I stood at an intersection in my own life: leaving the fire department and going to work for Microsoft. I could not help but doubt myself – what on earth could I do as a firefighter at Microsoft? I had an IT education from college and extensive opportunities to participate in IT projects in both the fire department and my second job, but I had to wonder… was that enough?

In a short period of time at Microsoft, I developed a strong love for IT, databases, and messaging technologies. When the opportunity arose for me to join the Messaging and Collaboration team within the Operations and Technology Group, I could not have been more excited. It probably should have occurred to me sooner why there was an opening for an Operations Manager on this team in the middle of the Exchange 2000 deployment, but I was excited for the challenge. There was this new product called Windows 2000 with its Active Directory, and this brave new next-generation Exchange Server product built on top of it, and I wanted to be a part of it! Back then, the Exchange environment at Microsoft was fairly distributed, with over 220 servers around the world running Exchange in about 80 sites. The complexity of managing this distributed environment, along with our newfound dependencies on Active Directory, proved to be quite challenging. At a basic level, it required the baseline IQ of Exchange administrators to rise considerably, as the competent management of an Exchange environment required additional expertise in a new Exchange release, Active Directory, and other Windows services such as Internet Information Services (IIS).

I quickly learned that the approach to Exchange outages was far more similar to fire fighting than I ever could have anticipated. While there was no life or property at stake (except maybe my own at times…), the stress level for those involved in this project was quite similar. What was different, however, were the resources that were available to me in the heat of this battle. There were books available by Jerry Cochran and Tony Redmond that put the teachings and competencies of Exchange experts at our fingertips, enabling us to stop and take a breath, read a few pages, and take a different approach toward the fire before us.

Another analogy that proved valuable in my new job was the fundamental difference between fire fighting and fire prevention. In the fire service, it seemed much of the glory was in fire fighting, with little benefit to the individual in fire prevention. From a distance, we know that the true contribution in both scenarios is in prevention: proper steps and planning that work to prevent the fire… to prevent the outage. Jerry Cochran’s first book, Mission-Critical Microsoft Exchange 2000 Server, quickly became the “bible” for my team in the area of prevention. Within that book were kernels of knowledge that helped us to think beyond reactive outage management (fire fighting) and work towards high availability (prevention).

With the advent of Exchange Server 2003 and Outlook 2003, we found ourselves in new territory – a more robust server and client that provided functionality enabling us to consolidate our operations significantly. We began an aggressive campaign to consolidate servers worldwide into 7 sites (from over 70). Along with this, we augmented our service offering internally by increasing mailbox storage limits from 100 MB to 200 MB, adding mobility services, scaling up servers in the consolidated sites, and utilizing clustering to provide higher availability. Where our messaging service SLA (service level agreement) commitment to the CIO of Microsoft was previously 99.9%, we got aggressive and raised that SLA to 99.99%. For Microsoft as an enterprise IT shop, this was a stretch endeavor and a bit of a leap of faith. We have the unique responsibility of ‘dogfooding’, or running our own software in the early beta stages. The significance of this is in its contribution to downtime. Prior to employing clustering, the impact of 2-3 pre-release product upgrades per month was sizeable, not to mention any downtime associated with finding bugs in beta software. We designed and developed 7-node clusters, we deployed at the various consolidation sites, we planned, we moved mailboxes – the execution was magnificent.

However, the most important piece was the work we didn’t have to do… the work to make our environment truly mission critical – our investment in prevention and thinking proactively about our deployment. Our investment in becoming a mission-critical deployment was the key.

From a personal perspective, what was most interesting to me about this exercise was our effort to understand what caused downtime in our environment. By developing a detailed taxonomy to which we assigned every minute of downtime we experienced, we gathered an immense amount of data that pointed quite clearly at the remediation steps that were necessary. The steps we took – storage design, clustering, network interface teaming, disaster recovery planning (using recovery storage groups and Windows VSS), security, management practices, and so on – all adhered to principles and guidance outlined in Mission-Critical Microsoft Exchange 2003. What was our goal? Elimination of all single points of failure and tight recovery plans for those that remained unforeseen.

As a manager of an enterprise Exchange deployment, providing email and calendaring services to those who work at Microsoft, my team’s responsibilities are significant. It is more than just email. To run an effective service and team – to be effective at my job – my team and I have to be thinking mission critical constantly. It has been said that a multi-day email outage can unleash as much or more stress on an Exchange administrator as a divorce. To be truly successful in my job means avoiding this possibility. For my team and me, proactive thinking and mission-critical execution ensure we avoid this stress...

Derek Ingalls
Director, Messaging and Collaboration Services
Operations and Technology Group (OTG)
Microsoft Corporation




Mission-Critical Microsoft Exchange 2003: Designing and Building Reliable Exchange Servers (HP Technologies)
ISBN: 155558294X
Year: 2003
Pages: 91
Authors: Jerry Cochran
