HEROware: November 2010

Hey All - this is Lynn Shourds, President and CTO of HEROware. When you're talking about Business Continuity it's important to keep in mind the Cost vs. Functionality quotient.

At the enterprise level, end users are not typically concerned with costs. They know that a high-end business continuity solution is going to cost them dollars. And for those extra dollars they expect lower RPOs (Recovery Point Objectives) and lower RTOs (Recovery Time Objectives). However, if they’re mindful of costs, there are technologies available that can still give them a very low RPO and a very low RTO.

Let’s examine the options. We all know that tape backup is going to be at the bottom end of the chart in terms of both RPO and RTO. If we agree that tape is at the bottom, then let’s agree that some form of active/active cluster is at the top. Agreed? Well, since you can’t answer I’ll take that as a yes.

What, then, is in the middle? In the middle we have disk-to-disk backup, snapshot technology, and asynchronous replication. Let’s look closer. Disk-to-disk backup uses tape backup technology; it just backs up to disk instead of tape. Recovery still involves a restore from a proprietary file format. Snapshots are incremental “pictures” of the data or system and also require a restore. Asynchronous replication is a byte-by-byte mirroring of only the changes (deltas). These three technologies all have varying RPO and RTO times.

Of the three, however, asynchronous replication brings the biggest bang for the buck. The RPO with asynchronous replication is nearly zero, especially if the replication is taking place on the LAN. The reason is that every byte-level change is replicated in real time, as the change takes place on the production server. Because it’s asynchronous replication, no CPU cycles are held up waiting for a commit on the backup side. Instead, built-in buffers verify that the files have been written, continuously and with write-order preservation.
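To make the "no waiting for a commit" idea concrete, here's a rough Python sketch of asynchronous replication. It is a toy model only, not how Double-Take or any real product implements it: the class name `AsyncReplicator`, its methods, and the 16-byte in-memory "disks" are all made up for illustration. The write to the production side returns right away, and a background thread ships each change to the backup side from a first-in, first-out buffer, which is what preserves write order.

```python
import queue
import threading


class AsyncReplicator:
    """Toy model of asynchronous, write-order-preserving replication.

    Hypothetical illustration only; the names and sizes are assumptions.
    """

    def __init__(self, size=16):
        self.primary = bytearray(size)   # stands in for the production server
        self.replica = bytearray(size)   # stands in for the backup server
        self._pending = queue.Queue()    # FIFO buffer keeps changes in write order
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def write(self, offset, data):
        # Apply the change locally and return immediately;
        # the caller never waits for the replica to commit.
        self.primary[offset:offset + len(data)] = data
        self._pending.put((offset, bytes(data)))

    def _drain(self):
        # Background thread: apply buffered changes to the replica, oldest first.
        while True:
            offset, data = self._pending.get()
            self.replica[offset:offset + len(data)] = data
            self._pending.task_done()

    def wait_until_synced(self):
        # Block until every buffered change has reached the replica.
        self._pending.join()
```

In use, two overlapping writes such as `write(0, b"abc")` followed by `write(1, b"XY")` land on the replica in the same order they hit the primary, so both sides end up identical once the buffer drains. The gap between the two sides at any instant is whatever is still sitting in the buffer, which is why the RPO is near zero rather than exactly zero.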

But how about RTO, you ask? Let’s examine. We’ve already agreed that tape RTO is long, because you have to restore from that tape and those restores don’t always succeed. And we know that an active/active cluster can be very fast, yet it’s not very affordable and it takes significant engineering skill to maintain. So, what about disk-to-disk backups? Disk-to-disk backups require a restoration, either to a new physical box or to a virtual session that needs to be built out. Again, it’s tape backup technology, just built for disk instead. According to the engineers I know and have spoken with, this can be a guessing game in terms of whether it will recover or not. Many of them are moving away from this technique and these solutions.

Next we have snapshots. Microsoft has done a nice job with VSS. Snapshots have become a very popular solution and are considered a very good “second” line of defense in your business continuity strategy. Depending on the vendor you can recover these snapshots in various ways, and restore them usually to a virtual machine.

Again, replication technologies typically have a faster RTO. The reason for this is that when data is replicated asynchronously it resides in Windows Native File Format. Because there doesn’t have to be a conversion process, these RTOs can be seconds to minutes.

Even within a single replication technology there can be varying RTOs. Take Double-Take Software, for example. Their RPOs remain extremely low, since the underlying asynchronous replication is the same across all platforms. However, because Double-Take uses their technology to vary the costs, the RTO is different depending on, wait for it, yes I did, I’ve brought it back to the opening paragraph…the RTO is different depending on the “Cost vs. Functionality Quotient”.

Let me explain. In a typical multi-server environment, there are usually at least three tiers of redundancy needs. Tier 1 is application servers that need to be up and running within minutes should there be planned or unplanned downtime. Tier 2 is servers that could be down for several minutes to an hour, and Tier 3 is servers that could be down for hours at a time.

Double-Take Software from Vision Solutions has taken these three tiers, with their varying RTOs, and given the user choices based on how much they want to pay for their individual RTO needs.

For the highest cost you’ll get application-level failover that happens in seconds to minutes. For the next highest dollar amount you’ll get whole-server failover that happens in several minutes. And lastly, for the least amount of money you get replicated backup (Native Windows File Format) that restores to physical or virtual servers in several minutes to a couple of hours (depending on size).

This is what I’m talking about when I refer to “Keeping it Real”. You must realize what you get for what you pay. In the big scheme of things, the lowest cost (Tier 3) is plenty good enough for 80% of companies. However, should you need to get your RTO closer to zero minutes, it’s nice to know that option is available to you; it’s just going to cost you a few more bucks.
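If you like to see the trade-off spelled out, here's a small Python sketch of picking the cheapest option that still meets a server's downtime tolerance. The three option names come from the price points described above, but everything else is my own assumption for illustration: the `Option` class, the `cheapest_option` function, the worst-case minute figures (5, 15, and 120), and the cost ranks, which only order the options and are not real prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str
    worst_case_rto_minutes: float  # illustrative assumption, not a vendor figure
    cost_rank: int                 # 1 = cheapest; ordering only, not a price


# Assumed worst-case numbers loosely based on "seconds to minutes",
# "several minutes", and "several minutes to a couple of hours".
OPTIONS = [
    Option("replicated backup", 120, 1),
    Option("whole-server failover", 15, 2),
    Option("application-level failover", 5, 3),
]


def cheapest_option(max_downtime_minutes):
    """Return the lowest-cost option whose worst-case RTO fits the tolerance."""
    candidates = [
        o for o in OPTIONS if o.worst_case_rto_minutes <= max_downtime_minutes
    ]
    if not candidates:
        return None  # nothing in this list is fast enough
    return min(candidates, key=lambda o: o.cost_rank)
```

With these assumed numbers, a server that can be down for four hours lands on replicated backup, while one that can only tolerate 30 minutes gets bumped up to whole-server failover. Swap in your own figures and the logic stays the same: pay only for the RTO each server actually needs.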

Thanks for listening-

Lynn

>check out all the Business Continuity options we have at HEROware

Wednesday, November 17, 2010

Business Continuity - Keeping it Real
