Linux Can, Linux SAN

With such attractions as lower costs and flexibility, it was only a matter of time before the success of Linux in the server sector translated into broader application within the storage market. Doubts and questions remain about its viability as a storage platform. But when a company that boasts the fourth-largest commercial supercomputer system in the world successfully deploys a Storage Area Network (SAN) using the Linux operating system, it’s time to take a closer look.

The company in question is NuTec Energy, which provides seismic imaging services to the oil and gas industry. Based in Houston, Texas, NuTec employs 35 staff. In early 2000, the company struck a deal with IBM to develop a massively parallel supercomputing system capable of handling the ever-increasing demands of seismic signal processing applications in the oil and gas industry.

The storage system initially consisted of 3000 Power 3/3+ CPUs, with AIX on each server and each CPU running its own analysis. The Network File System (NFS) file server used two IBM ‘Shark’ units connected to three B80 servers, providing shared file access to all CPUs. By 2003, however, the system was no longer keeping up with the demands placed on it, so a project was set up to specify a replacement.

Project Aims

According to Sampath Gajawada, manager of software development at NuTec Energy, “The target was a super-scalable SAN – a high-performance, single image storage environment using Intel, Linux, Fibre Channel and Ethernet.” He defined several key objectives for the SAN:

– Software tuned to be latency-tolerant and massively parallel, with buffered asynchronous communication and I/O

– High I/O bandwidth (>500 processing nodes)

– High computing power (processing power >2 Teraflops)

– Large flat file system (10-100TB), with easy storage management

– Cost effective, price/performance balance, scalable at low incremental costs.

The main issues with the incumbent UNIX system were the high cost of the proprietary software and its associated support and management, computing power and bandwidth that were barely adequate for some processing requirements, and a bottleneck at the NFS storage layer.

“The existing system just couldn’t cope with the demands of our Depth-domain Analysis and Time-domain Analysis,” said Gajawada. “We had reached the stage where business requirements were forcing us to reconsider our entire system. We looked at all the alternatives, and settled on a combination of Intel and Linux.”
