Speaking with storage administrators, I often hear cold storage and archiving used synonymously. But are these concepts distinct? I believe they are. Here’s how they differ and why it matters:
What is Cold Storage?
Cold storage typically refers to the idea of storing low-touch data on low-cost media. Any data that doesn’t require frequent, low-latency access is a candidate, and the low-cost storage media could be tape, optical, object storage in the cloud, etc. The idea is driven by cost optimization and capacity control.
Sounds like archiving, you say?
Well, it depends on what you expect from an archive system. The problem is that most people are thinking of ‘active archiving’ when they say “archive”, whereas cold storage is really deep archiving. Cold storage is akin to tape: retrieval is really slow (e.g. hours to get something back). This is the case with Amazon Glacier, for instance, and I suspect it will also be the case with Microsoft’s upcoming cold storage tier, Azure Archive Storage.
When we talk about cold storage, features like retention controls, holds, auditing, full-text search, and access rights analysis, to name a few, are dangerously missing from the conversation. And what about self-service access for users? If it takes hours to pull something back, you can bet your archive solution isn’t going to be very popular with your users.
Remember, cold storage isn’t immune to audit or litigation. So the savings you gained from using ultra-cheap cold storage can be completely wiped out by the cost of eDiscovery or other types of investigations or audits.
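To make that trade-off concrete, here is a rough back-of-the-envelope sketch. Every name and number in it is hypothetical (the per-TB prices, the retrieval fee, and the cost of an investigation are placeholders, not real vendor pricing), so treat it as a way to frame the arithmetic rather than a pricing model.

```python
def net_cold_storage_savings(
    capacity_tb: float,
    archive_price_per_tb_month: float,  # hypothetical price of an active archive
    cold_price_per_tb_month: float,     # hypothetical price of cold storage
    months: int,
    retrieved_tb: float,                # data pulled back for an audit or eDiscovery
    retrieval_fee_per_tb: float,        # hypothetical per-TB retrieval charge
    investigation_overhead: float,      # hypothetical cost of delays and manual review
) -> float:
    """Return storage savings minus the cost of one investigation.

    A negative result means the investigation wiped out the savings.
    """
    price_gap = archive_price_per_tb_month - cold_price_per_tb_month
    storage_savings = price_gap * capacity_tb * months
    retrieval_cost = retrieved_tb * retrieval_fee_per_tb
    return storage_savings - retrieval_cost - investigation_overhead


# Illustrative inputs only: 100 TB kept for a year, then one investigation.
result = net_cold_storage_savings(
    capacity_tb=100,
    archive_price_per_tb_month=20.0,
    cold_price_per_tb_month=4.0,
    months=12,
    retrieved_tb=50,
    retrieval_fee_per_tb=30.0,
    investigation_overhead=20000.0,
)
print(result)
```

With these made-up inputs, a year of savings is more than cancelled by a single investigation. Plug in your own figures to see where your break-even point sits.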
In a perfect world
Carving out an enterprise storage strategy, I’d distinguish between my cold storage and my archive storage. I wouldn’t call cold storage an archive. Cold storage would only be used for things like backups, which I’d never have to look to for eDiscovery since my eDiscovery-ready archive would already contain the same data. The archive would be a repository equipped with data governance controls and hooks optimized for analytics and discovery, holding any data that is in any way sensitive or likely to come under scrutiny.
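As a minimal sketch of how that split might look as a placement policy: the `DataSet` fields, the `Tier` names, and the 180-day threshold below are all my own illustrative assumptions, not features of any particular product. Sending low-touch, non-sensitive data to cold storage is also an assumption on my part, since only backups are named explicitly above.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    PRIMARY = "primary"
    ARCHIVE = "archive"  # governed, searchable, eDiscovery-ready
    COLD = "cold"        # cheap, slow to retrieve, no governance features


@dataclass(frozen=True)
class DataSet:
    name: str
    days_since_last_access: int
    is_sensitive: bool           # e.g. regulated or personal data
    likely_under_scrutiny: bool  # e.g. may be subject to audit or litigation
    is_backup_copy: bool         # a copy of data the archive already holds


def choose_tier(ds: DataSet, cold_after_days: int = 180) -> Tier:
    """Pick a storage tier following the cold-versus-archive split."""
    # Backups go to cold storage: the archive already holds the same data,
    # so eDiscovery never has to touch them.
    if ds.is_backup_copy:
        return Tier.COLD
    # Anything sensitive or likely to be examined needs governance controls.
    if ds.is_sensitive or ds.likely_under_scrutiny:
        return Tier.ARCHIVE
    # Assumption: low-touch data with no governance needs can also go cold.
    if ds.days_since_last_access >= cold_after_days:
        return Tier.COLD
    return Tier.PRIMARY


nightly_backup = DataSet("nightly-backup", 400, True, False, True)
hr_records = DataSet("hr-records", 400, True, True, False)
old_build_logs = DataSet("old-build-logs", 365, False, False, False)
team_share = DataSet("team-share", 3, False, False, False)

for ds in (nightly_backup, hr_records, old_build_logs, team_share):
    print(ds.name, "->", choose_tier(ds).value)
```

The ordering of the checks is the point: a backup lands in cold storage even when its contents are sensitive, because the governed copy lives in the archive, while sensitive primary data never reaches the cold tier no matter how rarely it is touched.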