Just a few years ago, an enterprise with 1TB of data was considered huge. Now, 1TB can be stored on as little as three disks and even the smallest companies can quickly consume multiple TBs of storage.
Scaling storage to meet today’s needs is now a pretty simple task. Unfortunately, backing up this data has become very difficult. In fact, backup has traditionally been a problem area in most enterprises and is considered a “necessary evil” by many people.
One of the biggest problems with backups is that the data being backed up is growing, but the time allotted for a backup (the backup window) is shrinking or remaining static.
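The squeeze between growing data and a fixed backup window comes down to simple arithmetic. The sketch below is only an illustration: the function names and every figure passed to them are hypothetical, and sustained throughput in a real environment depends on the drives, network and backup software involved.

```python
def backup_hours(data_gb: float, throughput_mb_per_s: float) -> float:
    """Hours needed to move data_gb at a sustained throughput (1 GB = 1000 MB)."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    seconds = (data_gb * 1000) / throughput_mb_per_s
    return seconds / 3600


def fits_window(data_gb: float, throughput_mb_per_s: float, window_hours: float) -> bool:
    """True if the backup finishes within the allotted backup window."""
    return backup_hours(data_gb, throughput_mb_per_s) <= window_hours


if __name__ == "__main__":
    # Hypothetical figures: 2 TB of data, one drive at 30 MB/s, 8-hour window.
    print(round(backup_hours(2000, 30), 1), fits_window(2000, 30, 8))
```

With these assumed numbers, 2TB at 30MB/s takes about 18.5 hours, which is more than double an eight-hour window. If the data doubles while the window stays the same, the required throughput doubles with it.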
Most companies have addressed the problem by adding more and more tape drives and libraries or bigger, faster tape drives. In some extreme cases, decisions had to be made and some servers were either not backed up or backed up infrequently.
Another common problem is that recovering data from tape can be very slow, especially as tapes have grown in capacity. A common compromise for some production databases is to do one full backup a week and then back up only the archive logs on the other days.
Recovering a downed system with these types of backups could take hours or days.
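To see why such a recovery drags on, it helps to list what has to be restored. The sketch below assumes a simple weekly cycle with the full backup on day 0 and one set of archive logs per day after that; the function and label names are hypothetical and are not taken from any particular backup product.

```python
from typing import List


def restore_chain(days_since_full: int) -> List[str]:
    """List, in order, the backup pieces needed to recover the database."""
    if not 0 <= days_since_full <= 6:
        raise ValueError("expected 0 to 6 days since the weekly full backup")
    # The full backup is restored first, then each day's logs are applied in order.
    chain = ["full-backup"]
    chain += [f"archive-logs-day-{d}" for d in range(1, days_since_full + 1)]
    return chain


if __name__ == "__main__":
    for piece in restore_chain(6):
        print(piece)
```

A failure on the last day of the cycle means seven separate restores, each of which may require locating and loading a different tape.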
Clearly, backing up data is a daunting task. Unfortunately, too many existing solutions have only addressed backing up the data and have neglected the fact that backups are useless unless the data can be recovered in a timely manner.
Thankfully, there is now an option beyond throwing more tape into the solution.
In an industry fraught with catchphrases, buzzwords and “solutions du jour”, disk-to-disk backup, the current trend, is a concept worth considering.
Disk-to-disk backup, also called disk-to-disk-to-tape backup or tape-cache, writes the primary backup to disk instead of tape. That backup can then be copied, cloned or migrated to tape at a later time (hence the term “disk-to-disk-to-tape”).
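The two stages can be sketched in a few lines. This is a toy model under stated assumptions, not how any backup product works: the disk pool is an ordinary directory, the "tape" is a tar archive on disk, and all function names and paths are hypothetical.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path
from typing import List


def backup_to_disk(source: Path, disk_pool: Path, label: str) -> Path:
    """Stage 1: write the primary backup to the disk pool."""
    target = disk_pool / label
    shutil.copytree(source, target)
    return target


def migrate_to_tape(staged: Path, tape_file: Path) -> None:
    """Stage 2: at a later time, copy the staged backup to 'tape' (a tar file here)."""
    with tarfile.open(tape_file, "w") as tape:
        tape.add(staged, arcname=staged.name)


def demo() -> List[str]:
    """Run both stages against throwaway directories and list the tape contents."""
    root = Path(tempfile.mkdtemp())
    source = root / "production"
    source.mkdir()
    (source / "orders.db").write_text("sample data")
    pool = root / "disk_pool"
    pool.mkdir()
    staged = backup_to_disk(source, pool, "monday")
    tape_file = root / "offsite_tape.tar"
    migrate_to_tape(staged, tape_file)
    with tarfile.open(tape_file) as tape:
        return tape.getnames()


if __name__ == "__main__":
    print(demo())
```

The point of the split is that the first stage runs inside the backup window at disk speed, while the second can run later without touching the production servers.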
So, why is this such a great concept? There are a couple of important reasons.
Before disk-based backup solutions, the only way to increase the performance of backups was to buy more tape drives or upgrade to faster tape drives.
Over time, solutions became very complex and very expensive. Adding even a small amount of disk into the backup solution can significantly reduce the load on existing libraries and extend their useful life.
Additionally, introducing disk to the backup solution improves reliability. Since there are no tapes in a disk-based backup, there will be no backup failures due to a bad tape.
Too many backup projects focus only on backing up the data and forget the sole reason that backups are performed: recoveries.
Very elaborate solutions can be implemented to back up data very quickly, but if the data cannot be restored in a timely manner, if at all, then the speed of the backup is irrelevant.
With a disk-to-disk solution, files can be recovered at disk speeds with no delay to locate and load a tape and then advance the tape to the data. In most cases, a disk-to-disk solution can finish recovering the data before a tape-only solution has even started streaming the data from tape.
Most disk-to-disk solutions restore files five to 10 times faster than a conventional tape library. As with backups, recoveries will never fail due to a tape going bad.
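A rough model shows where the difference comes from: a tape restore pays a fixed setup cost to locate, load and position the tape before any data moves, while a disk restore starts almost immediately. The function name and all of the figures below are hypothetical and chosen only for illustration; they are not measurements.

```python
def restore_seconds(size_mb: float, throughput_mb_per_s: float,
                    setup_seconds: float = 0.0) -> float:
    """Estimated restore time: fixed setup delay plus transfer time."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return setup_seconds + size_mb / throughput_mb_per_s


if __name__ == "__main__":
    # Assumed: a 600 MB restore, 60 s to locate/load/position the tape.
    tape = restore_seconds(600, 30, setup_seconds=60)
    disk = restore_seconds(600, 60)
    print(tape, disk, tape / disk)
```

Under these assumptions the tape restore takes 80 seconds and the disk restore 10 seconds. For small restores the setup delay dominates, which is why single-file recoveries benefit the most from disk.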
Many companies require that tapes be stored offsite, away from the main datacenter. To do this, backup tapes are copied or cloned to create the offsite copies. By backing up to disk and then copying to tape, only one set of tapes is needed.
The main backup can remain on disk and the copy can be sent offsite. If there is a robust enough network between the main site and the offsite storage facility, the backups can also just be copied to a remote disk pool, which could completely eliminate tape.
In addition, if a tape goes bad, all the data on it is lost and the tape must be replaced. Tape replacement costs can mount as the tape pool ages. If a disk goes bad, the replacement is typically covered by warranty, so it is free, and because the disk was protected by RAID, the data is not lost.
Implementing a Solution
Disk-to-disk functionality can be implemented by using existing backup software and a large pool of disk. If the backup servers are attached to a SAN, then the disk pool used to hold the backup data can easily grow from GBs (gigabytes) to TBs (terabytes) to PBs (petabytes) of usable storage.
Older storage that is “retired” from production use can be redeployed to this pool, extending the useful life of the retired asset.
Since disk-to-disk functionality implemented at the host is just an add-on to existing backup software, host-based solutions are the least expensive to implement. All the major backup software vendors support some form of disk-to-disk backup.
Virtual tape library (VTL) is a term used to describe a server (or appliance) that runs software to emulate a tape library. Like a host-based solution, most of these VTLs can access a huge pool of disk through a SAN. The backup server sees the VTL as a library with tapes when it is actually just a server with disk.
These solutions tend to be more expensive than a host-based solution since you must have the backup software, a large disk pool and the VTL itself. Also, some VTL solutions come pre-packaged with disk and can be limited in capacity.
A third option is to use an integrated disk and tape solution, in which a vendor packages disk, tape and software that automates the cloning of backups from disk to tape.
Many host-based and appliance-based solutions also provide this functionality while having the advantage of using heterogeneous storage and tape devices.
Whether you are looking to increase backup and recovery speed or reliability, extend the useful life of existing tape or disk devices, or simply expand your current backup solution, disk-to-disk backups are the way to go. Even a small amount of disk can make a dramatic difference to how the entire enterprise operates. And there is a solution for every budget and need.