You are establishing redundant data centers. You are bringing in banks of blade servers. You are adding other high-density systems to harness the latest high-performance, high-reliability systems.
But what you may not be doing (or even thinking about) is putting in the appropriate power and cooling systems to meet the demands of these more concentrated architectures.
Earle Weaver, president of Liebert Corp., a data center cooling vendor, advises that managing the high degree of change in IT systems and networks without compromising continuity requires an adaptive power and cooling infrastructure.
Most people have never bothered to figure out the amount of time their computer systems can operate without cooling, and many lack redundancy in their power distribution or back-up power systems.
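That runtime is worth estimating rather than guessing. Here is a back-of-envelope sketch of how long a server room can ride out a cooling failure on the heat capacity of the room air alone; all of the input figures are illustrative assumptions, not numbers from the article.

```python
# Rough estimate of runtime without cooling, using only the room air's
# heat capacity. Every input below is an illustrative assumption.

room_volume_m3 = 200.0      # e.g., a ~70 m^2 room with 3 m ceilings
it_load_kw = 40.0           # heat the equipment dumps into the room
air_density = 1.2           # kg/m^3
air_specific_heat = 1005.0  # J/(kg*K)
allowed_rise_k = 10.0       # e.g., 22 C -> 32 C before thermal shutdown

energy_j = room_volume_m3 * air_density * air_specific_heat * allowed_rise_k
minutes = energy_j / (it_load_kw * 1000) / 60
print(f"~{minutes:.1f} minutes of headroom")  # about a minute with these inputs
```

In practice walls, raised floors, and equipment mass buy somewhat more time, but the exercise makes the point: at modern densities, the margin is minutes, not hours.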
Weaver’s job, of course, is to sell cooling systems. But Clive Longbottom, an analyst at UK-based Quocirca Ltd., agrees with his view: the greater density of modern systems requires a change of approach in the data center and the server room.
“What we are looking at nowadays is a full set of high heat hotspots from even the lower end Intel systems which are packed closer and closer together in racks,” said Longbottom. “The old approach of just trying to keep the temperature in the room at a steady level, with specific cooling only for the high-value boxes, is no longer valid.”
CIOs, therefore, need to pay far more attention in business continuity planning (BCP) to such factors as the localization of cooling. Instead of one large cooler for the whole room, additional AC units need to be deployed where they are most needed. This is particularly important when data centers use blade servers and other high-density, rack-mounted server configurations.
When racks were first introduced, they created massive heat problems, as designers tended to place power at the bottom, storage next and CPUs at the top.
Recent racks, though, are better engineered: each type or section of a rack can have its own cooling technology and design. Even then, they can still be a problem. A single rack of IBM p5-575 servers, for example, consumes up to 41.6 kilowatts, or 5,000 watts per square foot, far above the industry standard of 50 to 100 watts per square foot.
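The gap between those figures is easy to check. The rack power and density numbers below come from the article; the footprint is simply what they imply.

```python
# Checking the arithmetic behind the p5-575 figures above.

rack_power_w = 41_600                 # IBM p5-575 rack, maximum draw (from the article)
stated_density = 5_000                # W/sq ft, as quoted in the article
implied_footprint = rack_power_w / stated_density  # sq ft per rack

standard_high = 100                   # top of the 50-100 W/sq ft standard range
print(f"implied rack footprint: {implied_footprint:.1f} sq ft")
print(f"{stated_density / standard_high:.0f}x the high end of the standard range")
```

In other words, even counting only the rack's own footprint of roughly eight square feet, one fully loaded rack runs at fifty times the high end of the density the room was likely designed for.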
“Ensure that racks are well designed, mixing heat generators and cooling systems effectively,” said Longbottom. “Dell’s racks are some of the best designed around for heat dissipation.”
He recommends that server rooms be designed with the heat load dotted around as much as possible, with localized cooling provided for each set of hot spots. As well as being more effective, this also provides a higher degree of resilience: If one cooling system goes down, the heat from that area can still bleed through into other cooling areas, giving more time to troubleshoot.
To keep costs down, he suggests using commodity items wherever possible. Fans, for example, are an area where costs can be cut without compromising cooling capacity.
“Good fan technologies are now relatively cheap, but don’t go for the cheapest possible as the fan life will be limited,” said Longbottom. “And these days it is okay to use ordinary air conditioning units instead of large, monolithic server room specials.”
It’s during a disaster, of course, that the value of power and cooling becomes especially apparent. Recent hurricane seasons, in particular, have taught many companies a nasty lesson about what can happen without high-quality backup power.
Coleman Technologies (CTI) of Orlando, Fla., a provider of networking and Internet services, is a good example. After experiencing disruptions to its IP communications network—network switches rebooting without warning, along with reboots in the IP phone system and company computers—the company did some sleuthing.
Initially, the company suspected a switch fault. Investigation revealed, though, that the cause was a faulty UPS—one of the units didn’t deliver any power in the event of a utility power loss. Further, it had a circuitry problem that could trigger a reboot. These failings resulted in two lost switches and two damaged servers before the problem was located.
“If someone was on a call and the switch rebooted, the person would lose the call, with a minute or two passing before the phone would become operational,” said Kirk Sawyer, CTI’s CFO. “We also could lose our connection to the e-mail and file server.”
As the company had been systematically updating its networking and server infrastructure, it realized that it was now time to do the same on the power side. It added new 6 kVA Liebert GXT2 UPS units that convert incoming power from AC to DC and then back to AC to provide an isolated power source to the critical load. This supplies a stable 120-volt source to the connected load even during an extended brownout.
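The design difference is worth spelling out. In an online (double-conversion) unit like the one described, the load always runs off the inverter, so the output stays regulated no matter how the input sags; a standby unit passes utility power straight through until it crosses a transfer threshold. The toy model below illustrates that behavior; the threshold value is an assumption for illustration, not a GXT2 specification.

```python
# Toy comparison of online (double-conversion) vs. standby UPS output
# under a sagging utility input. Purely illustrative; the 106 V
# transfer threshold is an assumed value, not a product spec.

def online_ups_output(input_v: float) -> float:
    # Power is always rectified to DC and re-inverted, so the load
    # sees a regulated 120 V regardless of how deep the sag is.
    return 120.0

def standby_ups_output(input_v: float, transfer_threshold: float = 106.0) -> float:
    # Utility power passes straight through until it drops below the
    # transfer threshold; only then does the inverter take over.
    return input_v if input_v >= transfer_threshold else 120.0

for sag_v in (118.0, 110.0, 95.0):   # progressively deeper brownouts
    print(sag_v, online_ups_output(sag_v), standby_ups_output(sag_v))
```

With a mild brownout (110 V in this sketch), the standby unit passes the degraded voltage through to the load, while the online unit's output never moves, which is the property CTI was buying.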
“Based on our experience, we are convinced that mission-critical networks require mission-critical online UPS technology,” said Sawyer.
He further notes that these units saved the CTI network during a season of heavy hurricanes last year. No interruptions took place during the entire period despite its headquarters losing power as many as 10 times.
The main lesson, then, is to factor power and cooling into the equation as the infrastructure is built out. Particularly when it comes to BCP, companies either fail to pay these factors enough attention or cut corners on them to reduce what can be a hefty price tag when it comes time to build a secondary site.
“While IT systems themselves have been effectively integrated into business continuity plans,” said Longbottom, “in some cases, critical infrastructure systems (power and cooling) have not, and that can leave an organization vulnerable to disruption.”