
Simplify Management of Amazon Redshift Snapshots using AWS Lambda


Post Syndicated from Ian Meyers original https://blogs.aws.amazon.com/bigdata/post/Tx3KMZO4C3CD1HI/Simplify-Management-of-Amazon-Redshift-Snapshots-using-AWS-Lambda

Ian Meyers is a Solutions Architecture Senior Manager with AWS.

Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse that makes it simple and cost-effective to analyze all your data using your existing business intelligence tools. A cluster is automatically backed up to Amazon S3 by default, and three automatic snapshots of the cluster are retained for 24 hours. You can also convert these automatic snapshots to ‘manual’ snapshots, which are kept until you delete them. Snapshots are incremental, so they store only the changes made since the last snapshot was taken, which makes them very space efficient.

You can restore manual snapshots into new clusters at any time, or you can use them to do table restores, without having to use any third-party backup/recovery software. (For an overview of how to build systems that use disaster recovery best practices, see the AWS white paper Using AWS for Disaster Recovery.)

When creating cluster backups for a production system, you must carefully consider two dimensions:

  • RTO: Recovery Time Objective. How long does it take to recover from a disaster recovery scenario?
  • RPO: Recovery Point Objective. When you have recovered, to what point in time will the system be consistent?

Recovery Time Objective

When using Amazon Redshift, your RTO is determined by the node type you are using, how many of those nodes you have, and the size of the data they store. It is vital that you practice restoring from snapshots created on the cluster so that you can correctly determine your Recovery Time Objective. It is also important that you re-test restore performance any time you resize the cluster or your data volume changes significantly.

Recovery Point Objective

Amazon Redshift’s automatic recovery snapshots are created every 8 hours, or every 5 GB of data changed on disk, whichever comes first. For some customers, an 8-hour RPO is too long and they need to take snapshots more frequently. For other customers with a large amount of data change, these snapshots might be taken far too frequently. That’s where this module comes in. By supplying a simple configuration, you can ensure that snapshots are taken on a fixed-time basis that meets your data recovery needs.
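To make the fixed-time idea concrete, here is a minimal sketch of the kind of check a scheduled function could run. It is not the module's actual code: the function name `snapshot_is_due` and its parameters are hypothetical, and it assumes you already know when the newest snapshot was created.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def snapshot_is_due(last_snapshot: Optional[datetime],
                    interval_hours: float,
                    now: datetime) -> bool:
    """Return True if a new snapshot is needed to stay within the target interval.

    Hypothetical helper: interval_hours stands in for the RPO you configure.
    """
    if last_snapshot is None:
        # No snapshot exists yet, so one is always due.
        return True
    # Due once the newest snapshot is at least interval_hours old.
    return now - last_snapshot >= timedelta(hours=interval_hours)


if __name__ == "__main__":
    now = datetime(2016, 1, 1, 12, 0, tzinfo=timezone.utc)
    last = now - timedelta(hours=3)
    print(snapshot_is_due(last, interval_hours=2, now=now))  # True
    print(snapshot_is_due(last, interval_hours=4, now=now))  # False
```

With a check like this, the worst-case amount of change you could lose is roughly the configured interval plus however often the check itself runs, so the check should run noticeably more often than the interval.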

What’s New?

We’ve just launched a new Amazon Redshift Utils module that helps you manage the snapshots that your cluster creates. You supply a simple configuration, and then AWS Lambda ensures that you have cluster snapshots as frequently as required to meet your RPO. It also manages the retention of the snapshots it creates, and lets you create layered backup schedules to meet backup requirements for both the short term and the long term.
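As an illustration of how retention and layered schedules might fit together, the sketch below picks out snapshots that have aged past a retention window. It is an assumption-laden example rather than the Snapshot Manager's real logic: the function `expired_snapshots` and the name prefixes (`hourly-`, `weekly-`) are hypothetical, and the dictionary keys simply mirror the `SnapshotIdentifier` and `SnapshotCreateTime` fields that the Redshift API reports for each snapshot.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def expired_snapshots(snapshots: List[Dict],
                      prefix: str,
                      retention_days: float,
                      now: datetime) -> List[str]:
    """Return identifiers of snapshots in one schedule that are past retention."""
    cutoff = now - timedelta(days=retention_days)
    return [
        s["SnapshotIdentifier"]
        for s in snapshots
        # Only consider snapshots this schedule created, matched here by a
        # hypothetical name prefix, so other snapshots are never touched.
        if s["SnapshotIdentifier"].startswith(prefix)
        and s["SnapshotCreateTime"] < cutoff
    ]


if __name__ == "__main__":
    now = datetime(2016, 3, 1, tzinfo=timezone.utc)
    snapshots = [
        {"SnapshotIdentifier": "hourly-001", "SnapshotCreateTime": now - timedelta(days=3)},
        {"SnapshotIdentifier": "hourly-002", "SnapshotCreateTime": now - timedelta(hours=5)},
        {"SnapshotIdentifier": "weekly-001", "SnapshotCreateTime": now - timedelta(days=30)},
        {"SnapshotIdentifier": "user-made", "SnapshotCreateTime": now - timedelta(days=400)},
    ]
    # Short-term layer: frequent snapshots, kept for 2 days.
    print(expired_snapshots(snapshots, "hourly-", 2, now))   # ['hourly-001']
    # Long-term layer: weekly snapshots, kept for 90 days.
    print(expired_snapshots(snapshots, "weekly-", 90, now))  # []
```

Running each layer with its own prefix and retention period keeps the short-term and long-term schedules independent, and snapshots that match neither prefix are left alone.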

Have a look at the Snapshot Manager project and let us know how it works for you!

If you have questions or suggestions, please leave a comment below.

——————————

Related

Top 10 Performance Tuning Techniques for Amazon Redshift

