A key attraction of the cloud is that it can tap economies of scale for compute and storage. Not surprisingly, with the rise of managed cloud database services like Amazon RDS, an ecosystem of third parties has emerged, claiming that they can automate some of the housekeeping tasks better than the cloud guys.
What piqued our interest in Clumio a few months back was that they were using AWS’s own technologies to build their own cloud backup and recovery solutions. In essence, fighting fire with fire. They’ve now taken that a step further with long-term data retention, with a new solution targeted at Amazon RDS customers. The new solution adds automation to what would otherwise be a multi-step process for backing up and accessing data under long-term retention.
With RDS, customers typically have access to snapshots for the latest 35 days of data. After that, if you want to keep the point-in-time records to satisfy internal policies or public mandates for data retention, you would have to either replicate data to cold storage, such as Amazon Glacier, or store snapshots manually in S3, and then endure a multi-step process of restoring the data to a full-blown RDS instance in order to query it.
Instead, Clumio’s solution converts those snapshots into small 16-Kbyte chunks that can be packed into Parquet files that physically reside in Amazon S3 cloud storage. To access the data, you would simply use Amazon’s own Athena service, which is designed for ad hoc queries against S3. Admittedly, querying via Athena might be less efficient than running a query directly in RDS, but on the other hand, you avoid all the time and expense of restoring a full replica of the database from either S3 or Glacier.
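To give a sense of what that access path might look like from the customer side, here is a minimal Python sketch of submitting an ad hoc Athena query over Parquet data in S3. It is not Clumio’s implementation. The database name, table name, `backup_date` column, and S3 output location are all hypothetical placeholders, and the sketch assumes a table over the Parquet files has already been registered in the Athena catalog. The only real API it relies on is the `start_query_execution` call of the boto3 Athena client, which is passed in rather than created here.

```python
import re
from datetime import date
from typing import Any, Dict

# Plain SQL identifiers only, since the names are interpolated into the query.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_athena_request(database: str, table: str, as_of: str,
                         output_s3: str) -> Dict[str, Any]:
    """Build keyword arguments for Athena's StartQueryExecution call.

    `table` and the `backup_date` column are hypothetical names for a
    table defined over the archived Parquet files.
    """
    if not _IDENTIFIER.fullmatch(database) or not _IDENTIFIER.fullmatch(table):
        raise ValueError("database and table must be plain identifiers")
    snapshot_day = date.fromisoformat(as_of)  # raises ValueError if malformed
    query = (
        f'SELECT * FROM "{table}" '
        f"WHERE backup_date = DATE '{snapshot_day.isoformat()}' "
        "LIMIT 100"
    )
    return {
        "QueryString": query,
        "QueryExecutionContext": {"Database": database},
        # Athena writes query results to this S3 location.
        "ResultConfiguration": {"OutputLocation": output_s3},
    }


def run_query(athena_client: Any, request: Dict[str, Any]) -> str:
    """Submit the query and return Athena's execution ID for later polling."""
    response = athena_client.start_query_execution(**request)
    return response["QueryExecutionId"]
```

With AWS credentials configured, usage would be along the lines of `run_query(boto3.client("athena"), build_athena_request("rds_archive", "orders", "2020-01-15", "s3://example-bucket/athena-results/"))`. Athena runs queries asynchronously, so the returned ID would then be used to poll for completion and fetch results. Nothing is restored to RDS at any point, which is the saving described above.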
The long-term backup solution for RDS comes on the heels of a backup solution for Microsoft 365, announced about a month back. That release places, under a common control plane, backup for a variety of on-premises and cloud sources including VMware, VMware Cloud on AWS, and Amazon EBS (Elastic Block Store, which is typically used for databases). Clumio’s case is that its backup solution spans multiple SaaS services.
The key to Clumio’s solution is that it repurposes the cloud provider’s own infrastructure for the housekeeping of its own backup and recovery tools. For instance, it uses Amazon DynamoDB for metadata storage, RDS Postgres for tracking backup configuration, and Lambda functions to execute backup and restore operations. Clumio’s original backup and restore solution for Amazon EBS was released last fall; the extension for RDS will become available on June 11.