“You’re Fired!” Why Snapshots + Replication (Donald) Trump Your Old Backup Strategy

Lately, the question on everyone’s mind has been: is it possible to replace your aging backup strategy with array-based snapshot and replication technology? The inevitable follow-up: why do so many of us find that so hard to swallow? It boils down to what we are used to and comfortable doing. Change is hard to accept and even harder to implement. With some further explanation, I hope to highlight the benefits of moving away from antiquated backup technology.

First let’s delve into the traditional backup strategy:

>> Incrementals
>> Differentials
>> Nightlies
>> Weeklies
>> Monthlies
>> Auto-loaders
>> Offsite
>> Retention-period

These are all terms we use on a daily basis. Traditional backups are a huge pain in the $#%, but they have to be done because the business dictates it. The gist of it is this: traditional backup strategies have been around since the 90s. They cut into production hours, they take dedicated server and backup hardware, and we are lucky if the backups actually complete most of the time. Lastly, let’s be honest: when it comes time to do a restore, our fingers are crossed and we hit the RESTORE button with a hope and a prayer that things will actually work.

The industry’s fear of snapshot technology boils down to a few reasons:
[framed_box bgColor=”EFEFEF” rounded=”true”]

  • Snapshots don’t protect against drive failures.
  • Snapshots cannot be moved offsite, or offloaded onto physical media.
  • Data that is not stored centrally on my array is susceptible to loss if it’s not getting backed up.
  • Too many snapshots will alter the performance of the array.
  • Snapshots are not integrated into my applications.
  • Snapshots take up too much disk space.
[/framed_box] Now, let me break down these fears and sway you towards snapshots…

Snapshots don’t protect against drive failures.
This depends on two things: 1) how your LUNs or aggregates are carved out, and 2) whether you are replicating your snapshots to a secondary array. The easiest way to guard against drive failure on the primary array is to use a RAID level that tolerates more than one simultaneous drive failure; replicating to a second array covers you if the primary loses more drives than that.

Snapshots cannot be moved offsite, or offloaded onto physical media.
Replicating your data to a secondary array can kill two birds with one stone. First, it helps protect you against drive failures or a total disaster on your primary array. Second, certain manufacturers support NDMP (Network Data Management Protocol), an open standard for offloading centrally attached storage devices directly to tape. Why bring that up when I am trying to get you away from tape? Because there is still a real business case for it, long-term retention, that your organization might not be able to get away from.

What about my data that is not stored centrally on my array? Isn’t it susceptible to loss if it is not getting backed up?
Two things can help you here: MS VSS and OSSV. For this discussion, I will spew forth about OSSV (Open Systems SnapVault). This is a technology developed by NetApp that offloads data from a server with locally attached storage to the array, so that snapshots can be taken at the array level. These snapshots can then be used to rebuild servers, and can even aid in bare-metal restores if third-party agents are used.

Too many snapshots will alter the performance of the array.
You know, I can’t deny this point, but it also depends on the manufacturer. It boils down to the file system on the array and how snapshots are written. With copy-on-write snapshots, the more snapshots you take and keep online, the more performance drops. We have seen up to 60% performance degradation in the field on arrays from manufacturers using copy-on-write technology. Arrays that use WAFL and pointer-based snapshot technology see only a very slight performance degradation, and only when the number of retained snapshots reaches the hundreds.
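To make the difference concrete, here is a rough toy model in Python. It is only a sketch under simplifying assumptions, not a measurement of any vendor's array: it assumes a copy-on-write array pays one extra read and one extra write the first time a block is overwritten after a snapshot, while a pointer-based array simply writes the new data to a fresh location. The function name `count_ios` and its parameters are made up for this illustration.

```python
import random

def count_ios(num_blocks, writes, snapshot_every, copy_on_write, seed=0):
    """Count physical I/Os for a stream of random block overwrites (toy model)."""
    rng = random.Random(seed)
    preserved = set()      # blocks already copied out since the latest snapshot
    have_snapshot = False
    ios = 0
    for i in range(writes):
        if snapshot_every and i % snapshot_every == 0:
            have_snapshot = True
            preserved.clear()  # a new snapshot pins every block again
        block = rng.randrange(num_blocks)
        if copy_on_write and have_snapshot and block not in preserved:
            # Assumption: read the old block, then write it to the snapshot area
            ios += 2
            preserved.add(block)
        ios += 1  # the write of the new data itself
    return ios

no_snapshots = count_ios(1000, 10_000, 0, copy_on_write=True)
cow_frequent = count_ios(1000, 10_000, 100, copy_on_write=True)
cow_rare = count_ios(1000, 10_000, 1000, copy_on_write=True)
pointer_based = count_ios(1000, 10_000, 100, copy_on_write=False)

print("no snapshots:          ", no_snapshots)
print("copy-on-write, frequent:", cow_frequent)
print("copy-on-write, rare:    ", cow_rare)
print("pointer-based, frequent:", pointer_based)
```

In this model the pointer-based run costs the same as running with no snapshots at all, while the copy-on-write run gets more expensive the more often snapshots are taken. Real arrays have caches, metadata overhead and background cleanup that this sketch ignores, so treat it as an illustration of the mechanism only.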

Snapshots are not integrated into my applications.
Again, another point I cannot deny for the majority of array manufacturers. Usually, to meet aggressive RPO or RTO targets with snapshots, you need some kind of application-specific appliance connected to the array. High-five to NetApp here for their snapshot-based application integration with SQL, Oracle, Exchange and virtual environments. With a simple licensed add-on, snapshots can meet even the most demanding RPO and RTO requirements of an organization.

Snapshots take up too much disk space.
Once again, this depends on the manufacturer. With copy-on-write technology, then yes, snapshotting your array will take up a considerable amount of disk space, and the number of snapshots you can keep online is severely limited. A pointer-based snapshot solution lets you keep a tremendous number of snapshots online at any time while consuming very little space on your array. Think about being able to keep a year’s worth of snapshots online for a 10 TB dataset, and only using 2 TB of space to save those snapshots.
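For a feel of what the 10 TB / 2 TB example implies, here is a back-of-the-envelope calculation in Python. It is a sketch that assumes snapshot space is simply the sum of unique changed blocks retained each day, which ignores metadata and any overlap between days; both function names are hypothetical helpers for this illustration.

```python
def snapshot_space_tb(dataset_tb, daily_change_rate, days_retained):
    """Estimate snapshot space, assuming each day's changed blocks are unique."""
    return dataset_tb * daily_change_rate * days_retained

def implied_daily_change_rate(dataset_tb, snapshot_tb, days_retained):
    """Work backwards from the space used to the average daily change rate."""
    return snapshot_tb / (dataset_tb * days_retained)

# The example from the text: 10 TB dataset, 2 TB of snapshots, one year online
rate = implied_daily_change_rate(10, 2, 365)
print(f"Implied daily change rate: {rate:.4%}")  # prints 0.0548%

# Plug in your own change rate to size the snapshot space you would need
print(f"Space at 0.1% daily change: {snapshot_space_tb(10, 0.001, 365):.2f} TB")
```

In other words, the example works out to roughly 5.5 GB of unique changed data per day. If your own dataset changes faster than that, the same arithmetic tells you how much more snapshot space to plan for.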

I hope I’ve helped you to understand how snapshots can be used to replace aging tape and backup environments. Please feel free to drop a comment below if you would like to dive deeper into how snapshot-based technology can help you in your fight to replace traditional tape backup.

Photo Credit: daveoleary