Many IT pros today share a widespread belief about the Azure Blob Storage redundancy levels: that merely selecting the correct redundancy level provides a backup of data stored in Azure Blob Storage.
The belief is that by selecting Geo-Redundant Storage (GRS) as the replication level, one will achieve a geographically segregated backup copy of the data.
To be blunt, that’s an extremely dangerous misconception.
Let me be clear:
None of the Azure Blob Storage replication levels, RA-GRS included, gives you a backup. Replication levels are provided only as a way of increasing the resiliency and availability of data.
Azure Blob Storage Replication vs. Backup
There are several key attributes of backup which are entirely missing in the replication models provided in Azure; this post will discuss those in detail, and then offer a simple solution for Azure Blob Storage backup.
Problem 1: Synchronized modification and deletion of blobs
One of the critical components of a backup is that even if, and especially when, a blob is changed or deleted, it remains recoverable as it existed at a desired previous point in time.
Think about it: With GRS, you get no such capability, since any modifications or deletions replicate to the secondary region.
If an object is deleted, accidentally or maliciously, GRS provides absolutely no recovery option, since the change replicates to all copies, including those in the secondary region. On this basis alone, GRS, like every other replication level provided by Azure Blob Storage, fails to meet the most basic requirement of a backup.
The bottom line: replication is not a real backup!
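To make the difference concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions: the `ReplicatedStore` and `PointInTimeBackup` classes, the blob name, and the snapshot label are all hypothetical and do not correspond to any Azure or HubStor API. It shows a delete propagating to the replica while a point-in-time copy still holds the earlier version.

```python
class ReplicatedStore:
    """Toy model of replication: every write and delete is mirrored to the secondary."""

    def __init__(self):
        self.primary = {}
        self.secondary = {}

    def put(self, name, data):
        self.primary[name] = data
        self.secondary[name] = data  # the modification replicates

    def delete(self, name):
        self.primary.pop(name, None)
        self.secondary.pop(name, None)  # the deletion replicates too


class PointInTimeBackup:
    """Toy model of backup: independent snapshots kept under a label."""

    def __init__(self):
        self.snapshots = {}

    def take_snapshot(self, label, store):
        # Copy the current state so later changes in the store cannot touch it.
        self.snapshots[label] = dict(store.primary)

    def restore(self, label, name):
        return self.snapshots[label].get(name)


store = ReplicatedStore()
backup = PointInTimeBackup()

store.put("report.docx", b"v1")
backup.take_snapshot("monday", store)   # hypothetical snapshot label

store.put("report.docx", b"v2-corrupted")
store.delete("report.docx")             # accidental or malicious deletion

in_secondary = "report.docx" in store.secondary
restored = backup.restore("monday", "report.docx")

print("still in secondary region:", in_secondary)
print("restored from backup:", restored)
```

In this model the secondary copy is gone the moment the primary copy is deleted, while the snapshot taken earlier still returns the original content.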
Problem 2: Container or account deletion
Thinking at a higher level than the individual blobs, any backup solution for Azure Blob Storage would need to protect against any deletion, accidental or otherwise, of a container, or an entire storage account.
Once again, none of the replication levels, GRS included, provide any such protection: If you delete a container, or the entire storage account, all copies of the affected blobs in all regions are deleted.
So, the simple act of deleting a container, accidentally or maliciously, an operation that only takes a couple of clicks in the Azure portal, can result in permanent data loss.
Problem 3: Cost
In general, when dealing with backup, the goal is to have the maximum retention and granularity for recovery at the lowest possible cost.
We’ve already talked about how replication levels like GRS do not provide any guaranteed retention and lack point-in-time recovery. But more than that, replication with Azure Blob Storage GRS is expensive. Storing blobs with GRS will more than double the cost, since you have to pay the equivalent of LRS storage twice. Storage operation costs also increase, and you incur a geo-replication bandwidth cost on top.
In total, this can add up even when we’re talking about modest data volumes. Consider the additional expense of storing 500 TB of data on the Cool tier using GRS for three years:
As you can see, the cost of enabling GRS or RA-GRS replication is a significant jump, especially when you consider this much higher storage cost is just getting you replication, NOT a dependable backup!
Now, what if I told you that there was a way to achieve an actual point-in-time backup of the same 500 TB for three years, at an additional cost of only $70K, versus the $335K premium for GRS? If you’re interested, keep reading because we’ll delve into this at the end of the post.
Problem 4: Replication region fixed pairings
Another issue with the way Azure Blob Storage GRS replication works is that you have no control over which region your data replicates to. Microsoft predetermines the Azure regional pairing for storage replication, meaning that blobs in a given Azure region will always go to the paired region when using GRS or RA-GRS. You can view the region pairings here.
So, what happens if you want to store data using GRS in a region other than the default paired region? Unfortunately, you are entirely out of luck, and you cannot adjust the replication region with GRS or RA-GRS options.
In most cases, I don’t think this fixed pairing is a huge deal, but it can be in some scenarios. It really depends on what you want to protect against.
Take Brazil, for example. The Azure region of ‘Brazil South’ is in São Paulo State, and its regional pair for storage replication is South Central US, which is in Texas. An organization in Brazil may not want its data stored in the US, perhaps either due to the PATRIOT Act or because of a requirement to maintain a safe copy of its data outside of the Western hemisphere.
In many of the other regional pairings, the distance between the two Azure regions may be less than desirable. If you are paranoid about your valuable data, then presumably, you are worried about a real regional failure event, in which case you probably want more distance between your primary and remote copy than some of the default pairings.
Problem 5: At Microsoft’s mercy for failover
With the GRS replication level, you are entirely at Microsoft’s mercy for when a failover is performed. In the event of a temporary regional outage, Microsoft will not perform the failover, meaning that your data is inaccessible during the blackout (even for reads)!
Now, you can use RA-GRS to provide read accessibility to your data, but this, as we showed above, is prohibitively expensive in most cases, and comes with a slew of restrictions and caveats.
So, when does Microsoft perform a failover? Virtually never, given the complexity and cost involved. Instead, Microsoft allows temporary outages to take data offline (even GRS data) for the duration of the outage. In severe circumstances, after making attempts to recover the primary storage account, Microsoft may elect to open up the replica storage account for you, but it is entirely at their discretion when this would happen.
Problem 6: The undefined recovery point objective
Even if one were willing to pay exorbitant costs to use GRS as a non-backup backup, one would at least hope to get a clear RPO (recovery point objective), to know how long it takes for data to replicate, and to be proactively informed of any issues.
With GRS, this is also not the case. While Microsoft states a GRS replication RPO of 15 minutes, they are careful not to commit to that, meaning you have no guarantees about how long it will take for your data to replicate.
And, because of how GRS obfuscates the replication process, you have no way of monitoring or knowing when replication stalls or fails.
Think about any true backup solution: One of the critical aspects is that you have some capability to know precisely how up-to-date your backup is, and control the process and granularity as needed, being notified of any issues.
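As a rough illustration of what "knowing how up-to-date your backup is" means in practice, the sketch below compares the time of the last successful backup against an RPO target. The function names and timestamps are hypothetical, chosen only for the example; the 15-minute figure is the RPO mentioned above.

```python
from datetime import datetime, timedelta, timezone


def backup_lag(last_success, now):
    """Return how far behind the backup copy is."""
    return now - last_success


def rpo_breached(last_success, now, rpo):
    """True when the backup is older than the recovery point objective allows."""
    return backup_lag(last_success, now) > rpo


# Hypothetical timestamps for illustration only.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
last_success = datetime(2024, 1, 1, 11, 30, tzinfo=timezone.utc)
rpo = timedelta(minutes=15)

lag = backup_lag(last_success, now)
breached = rpo_breached(last_success, now, rpo)

print("backup lag:", lag)
print("RPO breached:", breached)
```

A check like this is only possible when the backup process exposes the time of its last successful run, which is the visibility this section argues GRS does not give you.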
The Solution: HubStor SaaS Backup
Given that Azure Blob Storage GRS or RA-GRS options are in no way a backup, and also that they are costly, I am happy to report that there is a simple solution for Azure Blob Storage backup which:
- Delivers a true point-in-time backup and recovery of Azure Blob Storage data.
- Unlocks backup from any Azure region to any Azure region.
- Uses a segregated Azure Blob Storage account as the backend (for separation of duties).
- Supports separate tiering rules between primary and secondary accounts (so your data can be Hot or Cool in primary, but the backup copy can tier to Archive storage).
- Achieves pricing that is a mere fraction of GRS and RA-GRS.
I am talking about HubStor’s market-leading cloud data platform built on Azure. With HubStor, you can back up any blob storage account into the region and tier of your choice.
The backup copy can tier down to Archive if you want to make it very affordable. You also benefit from the compression and deduplication functionality built into the HubStor platform, which reduces the volume of data in the backup.
Let’s extend the earlier example to show the cost of backing up the 500 TB of Cool tier blobs for three years with HubStor, assuming a modest 30% reduction due to compression and deduplication, with the backup copy on the Archive tier:
Thus, HubStor delivers a true point-in-time backup for only $3.88/TB/month.
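For readers who want to check the arithmetic, the short sketch below uses only figures quoted in this post (500 TB, three years, $3.88/TB/month, and the $335K GRS premium). The variable names are illustrative, and the results are rough sanity checks rather than a pricing calculator.

```python
# Figures quoted in this post.
DATA_TB = 500
MONTHS = 36                 # three years
BACKUP_RATE = 3.88          # USD per TB per month for the backup
GRS_PREMIUM = 335_000       # USD, additional cost of GRS over three years

tb_months = DATA_TB * MONTHS

# Total backup cost implied by the per-TB monthly rate.
backup_total = BACKUP_RATE * tb_months

# The GRS premium expressed in the same per-TB monthly unit.
grs_rate = GRS_PREMIUM / tb_months

# How many times more the GRS premium costs than the backup.
ratio = GRS_PREMIUM / backup_total

print(f"backup total: ${backup_total:,.0f}")
print(f"GRS premium per TB per month: ${grs_rate:.2f}")
print(f"GRS premium vs. backup: {ratio:.1f}x")
```

The per-TB rate works out to about $69,840 over three years, consistent with the roughly $70K quoted earlier, and the GRS premium comes to several times that amount for the same data volume.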
If you’re interested in learning more about how HubStor is simplifying the backup space – or you would like to get pricing for your scenario – contact us today.