Automating the Data Center … Priceless

Prior to 2001, CIO Ron Rose’s IT staff had its work cut out for it. With 400 Web servers supporting Priceline.com’s e-commerce Web site, provisioning and updating were time-consuming and often less-than-accurate tasks.

Most of the time, the bulk of the provisioning, updating and patching work went well. But just as often, each server ended up slightly different from its neighbor. For availability, troubleshooting and performance, this meant that when something went wrong, each server involved in the snafu had to be fixed manually.

“We always try to figure out, ‘How do we avoid that (problem) in the future?’” Rose said. “And so it was the process of doing post-mortems begun three years ago that led us down the path of (automation).”

Invariably, said Rose, because a staffer had to go through each server individually to make changes or fixes, the servers still ended up configured slightly differently from one another, and the whole cycle began again.

Fast-forward to today and Rose no longer worries about this problem. Since deploying BladeLogic’s Operations Manager software, all his Web servers are configured automatically, exactly the same way every time.

This means provisioning, updating, patching and troubleshooting are much less time-consuming and the results far more accurate. And, for Priceline, whose only product is tied directly to the performance of its Web servers, this equals the high availability crucial to the company’s success.

“Our business is the Web and a key part of my job is to ensure the store is always open,” said Rose. “The biggest benefit is consistency of infrastructure and consistency of infrastructure means I have less variability, and variability is bad.

“And it means I have fewer problems in production and I have less wasted time in chasing problems throughout the infrastructure and it means I have less wasted manpower doing things manually that really shouldn’t be done manually.”

On the manpower side, Rose has been able to make better use of his staff because of BladeLogic’s automation, saving him hundreds of thousands of dollars in salaries over three years. This alone justified the investment in the software, he said.

As part of his ROI analysis, Rose also looked very closely at availability, since an outage, depending on how long it lasts, could cost Priceline hundreds of thousands of dollars in lost revenue. Beyond downtime, debugging carries its own headcount cost, which Rose was able to eliminate using BladeLogic.

“The ROI implications are more than manpower for doing the changes, but the manpower for doing the changes was enough to justify the tool,” he said. “And the knock-on benefits more than justified the tools because greater consistency of architecture means greater stability and greater stability reduces your costs in ways you don’t really appreciate until you actually have it.”

For the most part, Rose uses BladeLogic only on his Web-facing servers (WebSphere and Microsoft Transaction Server), his middleware servers and his QA environment. Although he could be using it on his 70 Sun database servers, those do not require the same effort to keep rolling. And since BladeLogic’s products are aimed at provisioning, Rose relies on diagnostic tools from BMC, Mercury and others.

“When you’re looking to tools of this nature, you should look for tools that give you the ability to cross-architecture,” he said. “It’s a very important part of the tools selection process and it’s one of the reasons we selected these guys.”

BladeLogic fits into this mix by allowing Rose’s staff to quickly institute changes and fixes across Priceline’s three data centers and by automating the rollout of 100 to 200 software updates a month, he said. These range from simple text changes on a Web site to a full-blown rollout of one of Priceline’s vendor offerings.

“Every time we do one of these things … we’re using BladeLogic to make this process high-end quality and decrease the risk associated with it,” he said. “Frankly, BladeLogic is a great tool and … provisioning is still a key for high availability, and automated provisioning is a key for avoiding downtime and avoiding needless mean-time-to-repair cost.”