How to do a serverless AWS Route53 backup

Route53 is the service you use inside AWS to manage your DNS. It is simple, solid, cheap, and reliable. However, it lacks something very important: backups. If you change your DNS records and mess them up, you cannot natively revert to the previous version. With this AWS Route53 backup tutorial, we will see how to build a backup ourselves, all inside AWS!

Note: to get the most out of this AWS Route53 backup tutorial, you should be using Route53 already. Furthermore, some basic knowledge of Python will be helpful.

AWS Route53 Backup Tutorial

The Route53 environment

Whether you are just starting out with Route53 or already have a complex deployment, this post is for you. In fact, the backup solution we are going to apply can digest any Route53 configuration and back it all up.

In this solution, we see how to back up Route53 zones, but you can extend the same concepts to the backup of domains.

Furthermore, this entire backup solution is cloud-based and serverless: it runs entirely inside AWS. However, please note it carries some additional costs: the storage of the backups and the execution of the function. For a typical set of zones, those costs are minimal.

A place for backups: S3

Like backups of any kind, we need some storage space to hold the information we are backing up. We are going to do it the Amazon way, using AWS S3. Even if you have not used it before, just know it is extremely simple. It works like a folder, in which you can create files or subfolders.

Log in to your AWS Console, navigate to the S3 service, and create a new bucket with a name you like. The important thing here, during the creation of the bucket, is to enable versioning. The script always writes the backup of a zone to the same file name, so without versioning each new backup would overwrite the previous one and the whole backup script would be useless.
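If you prefer code over the console, the same setting can be applied to an existing bucket with Boto3. This is only a minimal sketch: the helper name enable_versioning is made up for this example, and it expects an S3 client such as boto3.client('s3') to be passed in.

```python
def enable_versioning(s3_client, bucket_name):
    """Turn on versioning for an existing bucket and return the resulting status.

    s3_client is expected to be a Boto3 S3 client, e.g. boto3.client('s3').
    The function name is hypothetical, it is not part of Boto3.
    """
    s3_client.put_bucket_versioning(
        Bucket=bucket_name,
        VersioningConfiguration={'Status': 'Enabled'},
    )
    # Read the setting back, so a permission problem or a typo shows up right away
    response = s3_client.get_bucket_versioning(Bucket=bucket_name)
    return response.get('Status')

# Example call (the bucket name is a placeholder):
#   import boto3
#   enable_versioning(boto3.client('s3'), 'my-route53-backups')
```

If the call works, the returned status should be 'Enabled'.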

If you need more help on how to configure an S3 bucket for versioning, we have a wonderful guide to teach you the basics of S3 (including versioning).

Once you have your bucket ready, we can proceed with the heart of the script: Lambda.

A runner for the AWS Route53 backup: Lambda

Now we know what data to take (Route53 configuration) and where to put it (S3 bucket). We need something to do that.

This is where AWS Lambda comes in. AWS Lambda is a serverless function product. In other words, you provide some files containing the code to execute, and Amazon just executes them (when you trigger them). You only pay for the execution time. If our backup doesn’t run, we won’t pay for it.

This is perfectly in the spirit of AWS – and cloud in general – and that’s what we are going to do.

AWS Route53 backup function

Creating the Lambda function

The Lambda function is the core of our backup. It is the worker that actually does the backup. To create a new Lambda function, search for the AWS Lambda service, and in it create a new function.

When creating a function, you have several options. Choose Author from scratch, give the function a name, and set the runtime to Python 3.x (3.7 at the time of writing this post).

As an execution role, select to create a new role. This is the set of permissions your function will have. We will add the permissions to access Route53 and S3 to it later.

Once you create your function, you will end up in a basic Python editor where you can manage your function. By default, it contains only one file, lambda_function.py.

Writing the Lambda function

Instead of a single file, we are going to use two. In one file, zone.py, we model a Route53 zone into a class. Instead, in the lambda_function.py file, we trigger the backup of all zones through the class we defined in zone.py.

The Zone class

Below, you find the entire zone.py file that you can copy-paste into your function. After that, we will break it down into multiple snippets.

import boto3
import json
from datetime import datetime

route53 = boto3.client('route53')
s3 = boto3.resource('s3')
cloudwatch = boto3.client('cloudwatch')

class Zone:
    def __init__(self, id):
        self.id = id
        
    @staticmethod
    def log(zone_id, message, backup_name=None, has_error=False):
        print('zone_id="{}" message="{}" backup_name="{}" has_error="{}"'.format(
            zone_id,
            message,
            backup_name,
            has_error
        ))
    
    def get_hosted_zone(self):
        """Get the hosted zone backup"""
        try:
            return {
                'success': True,
                'response': route53.get_hosted_zone(Id=self.id)
            }
        except Exception as e:
            self.log(
                self.id,
                'Error while dumping hosted zone: {}'.format(str(e)),
                has_error=True
            )
            return {
                'success': False,
                'error': str(e)
            }
    
    def get_record_set(self):
        """Get the record set backup"""
        try:
            return {
                'success': True,
                'response': route53.list_resource_record_sets(HostedZoneId=self.id)
            }
        except Exception as e:
            self.log(
                self.id,
                'Error while dumping record sets: {}'.format(str(e)),
                has_error=True
            )
            return {
                'success': False,
                'error': str(e)
            }
        
    def get_tags(self):
        """Get the tags backup"""
        try:
            return {
                'success': True,
                'response': route53.list_tags_for_resource(
                    ResourceType='hostedzone',
                    ResourceId=self.id
                )
            }
        except Exception as e:
            self.log(
                self.id,
                'Error while dumping tag: {}'.format(str(e)),
                has_error=True
            )
            return {
                'success': False,
                'error': str(e)
            }
        
    def backup(self):
        """Get the full backup"""
        hosted_zone = self.get_hosted_zone()
        record_set = self.get_record_set()
        tags = self.get_tags()
        if not hosted_zone['success'] or not record_set['success'] or not tags['success']:
            return None
        return {
            'HostedZone': hosted_zone['response']['HostedZone'],
            # Private hosted zones may not include a DelegationSet, so avoid a KeyError
            'DelegationSet': hosted_zone['response'].get('DelegationSet'),
            'ResourceRecordSets': record_set['response']['ResourceRecordSets'],
            'Tags': tags['response']['ResourceTagSet']['Tags']
        }
        
    @staticmethod
    def get_backup_name(id, path, backup):
        name = backup['HostedZone']['Name'][:-1] + '-' + id + '.json'
        if path is not None:
            name = path + '/' + name
        return name
    
    def upload_backup(self, bucket, path):
        """Push the full backup to S3"""
        backup = self.backup()
        if backup is None:
            self.log(
                self.id,
                'Attempted backup failed because there was an error in generating the backup',
                has_error=True
                )
            cloudwatch.put_metric_data(
                Namespace='Network',
                MetricData=[{
                    'MetricName': 'Route53ZoneBackup',
                    'Dimensions': [
                        {
                            'Name': 'Status',
                            'Value': 'Failure'
                        }
                    ],
                    'Timestamp': datetime.now(),
                    'Value': 1,
                    'Unit': 'Count'
                }])
            return {
                'success': False,
                'error': 'No backup was generated'
            }
        name = self.get_backup_name(self.id, path, backup)
        # Check if data has changed since the last backup
        obj = s3.Object(bucket, name)
        try:
            last_backup = json.loads(obj.get()['Body'].read())
            if last_backup == backup:
                self.log(
                    self.id,
                    'Backup succeeded, current configuration is identical to the one already in backup (VersionId={}). No new file was uploaded.'.format(obj.version_id),
                    backup_name=name
                )
                cloudwatch.put_metric_data(
                Namespace='Network',
                MetricData=[{
                    'MetricName': 'Route53ZoneBackup',
                    'Dimensions': [
                        {
                            'Name': 'Status',
                            'Value': 'Success'
                        },
                        {
                            'Name': 'NewUpload',
                            'Value': 'False'
                        }
                    ],
                    'Timestamp': datetime.now(),
                    'Value': 1,
                    'Unit': 'Count'
                }])
                return {
                    'success': True,
                    'uploaded_new_version': False,
                    'existing_valid_version': obj.version_id
                }
        except Exception:
            # No previous backup exists yet, or it could not be read: upload a new one
            pass
        obj.put(Body=json.dumps(backup, indent=4, sort_keys=True))
        self.log(
            self.id,
            'Backup succeeded, configuration changed since last backup and new file was uploaded.',
            backup_name=name
        )
        cloudwatch.put_metric_data(
            Namespace='Network',
            MetricData=[{
                'MetricName': 'Route53ZoneBackup',
                'Dimensions': [
                    {
                        'Name': 'Status',
                        'Value': 'Success'
                    },
                    {
                        'Name': 'NewUpload',
                        'Value': 'True'
                    }
                ],
                'Timestamp': datetime.now(),
                'Value': 1,
                'Unit': 'Count'
            }])
        return {
            'success': True,
            'uploaded_new_version': True
        }

Boto3

The first thing we need to do in our file is to create the clients for the AWS APIs. In fact, our function calls the APIs of Amazon to do the backup of Route53. We need to use three AWS APIs:

  • route53, to fetch data from Route53
  • s3, to upload data into our bucket
  • cloudwatch, to publish metrics about the operation of our function

We import them and create three global variables. Global variables may not be the best thing to do, but as this is a very simple function they will do the job.

import boto3
import json
from datetime import datetime

route53 = boto3.client('route53')
s3 = boto3.resource('s3')
cloudwatch = boto3.client('cloudwatch')

The constructor

Now, we have to write a constructor for our Zone. To identify a zone, we have to provide the unique ID that AWS associates with it. It is unique across all of AWS, so it is a good way to get the right zone.

class Zone:
    def __init__(self, id):
        self.id = id
        

Logging

Logging is quite important for our script. In fact, the function will check if the backup already present in S3 is identical to the export of Route53. If so, it will not create a new version of our file in S3. However, we may want to know if this check actually happens, or if we have had any error. Thus, we will write some logs to CloudWatch. Anything a Lambda function prints ends up in CloudWatch Logs, so a simple print() is enough.

We want to write all logs in a way that is standard, easy to read, and easy to parse. To do that, we use a dedicated function, log(). This is a static method, as it is not strictly related to the zone being instantiated.

    @staticmethod
    def log(zone_id, message, backup_name=None, has_error=False):
        print('zone_id="{}" message="{}" backup_name="{}" has_error="{}"'.format(
            zone_id,
            message,
            backup_name,
            has_error
        ))
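To show what "easy to parse" means in practice, here is a small sketch that turns one of these log lines back into a dictionary. The parse_log_line helper and its regular expression are assumptions made for this example, they are not part of the backup function, and they only cope with the key="value" layout produced by log().

```python
import re

# A value ends at a quote that is followed by the next key="..." pair or by
# the end of the line, so plain text inside the message is kept intact.
LOG_FIELD = re.compile(r'(\w+)="(.*?)"(?=\s+\w+="|\s*$)')


def parse_log_line(line):
    """Return the key/value pairs of a line written by Zone.log() as a dict."""
    return dict(LOG_FIELD.findall(line))


line = ('zone_id="Z0123456789ABC" message="Error while dumping tag: boom" '
        'backup_name="None" has_error="True"')
fields = parse_log_line(line)
print(fields['zone_id'], fields['has_error'])
```

Note that every value comes back as a string, including "None" and "True", because that is how log() formats them.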

Backup methods

In total, we need to fetch three different items from Route53: the hosted zone configuration, the record sets, and the tags. Thus, we create three methods to do exactly that. They call various AWS APIs and get the JSON data needed. If the call to the API fails, they log the error and return a dictionary with success set to False and the error message, instead of raising an exception.

    def get_hosted_zone(self):
        """Get the hosted zone backup"""
        try:
            return {
                'success': True,
                'response': route53.get_hosted_zone(Id=self.id)
            }
        except Exception as e:
            self.log(
                self.id,
                'Error while dumping hosted zone: {}'.format(str(e)),
                has_error=True
            )
            return {
                'success': False,
                'error': str(e)
            }
    
    def get_record_set(self):
        """Get the record set backup"""
        try:
            return {
                'success': True,
                'response': route53.list_resource_record_sets(HostedZoneId=self.id)
            }
        except Exception as e:
            self.log(
                self.id,
                'Error while dumping record sets: {}'.format(str(e)),
                has_error=True
            )
            return {
                'success': False,
                'error': str(e)
            }
        
    def get_tags(self):
        """Get the tags backup"""
        try:
            return {
                'success': True,
                'response': route53.list_tags_for_resource(
                    ResourceType='hostedzone',
                    ResourceId=self.id
                )
            }
        except Exception as e:
            self.log(
                self.id,
                'Error while dumping tag: {}'.format(str(e)),
                has_error=True
            )
            return {
                'success': False,
                'error': str(e)
            }

And, finally, we combine all three results in a single method, which returns None if at least one of the previous three has failed.

    def backup(self):
        """Get the full backup"""
        hosted_zone = self.get_hosted_zone()
        record_set = self.get_record_set()
        tags = self.get_tags()
        if not hosted_zone['success'] or not record_set['success'] or not tags['success']:
            return None
        return {
            'HostedZone': hosted_zone['response']['HostedZone'],
            # Private hosted zones may not include a DelegationSet, so avoid a KeyError
            'DelegationSet': hosted_zone['response'].get('DelegationSet'),
            'ResourceRecordSets': record_set['response']['ResourceRecordSets'],
            'Tags': tags['response']['ResourceTagSet']['Tags']
        }
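One caveat worth knowing: a single call to list_resource_record_sets returns only one page of results, and the response flags this with IsTruncated. For large zones, a paginator can collect all the pages. The sketch below is one possible way to do it, not the method used in this post; the function name list_all_record_sets is hypothetical, and it expects a Route53 client such as boto3.client('route53').

```python
def list_all_record_sets(route53_client, zone_id):
    """Collect every record set of a zone by following all result pages.

    route53_client is expected to be a Boto3 Route53 client.
    """
    paginator = route53_client.get_paginator('list_resource_record_sets')
    record_sets = []
    for page in paginator.paginate(HostedZoneId=zone_id):
        # Each page carries its own slice of the zone's records
        record_sets.extend(page['ResourceRecordSets'])
    return record_sets

# Example call (the zone ID is a placeholder):
#   import boto3
#   records = list_all_record_sets(boto3.client('route53'), 'Z0123456789ABC')
```

The returned list could then take the place of record_set['response']['ResourceRecordSets'] in the backup dictionary.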

The backup name

Now, we need to give our Route53 backup a name. We could just use the ID of the zone, but if we have many backups it would be very hard to browse among them manually. So, as the name, we use a combination of zone name and unique ID. Route53 returns zone names with a trailing dot (for example, example.com.), so we strip the last character.

To create this name, we have a dedicated function. It takes as input the JSON object returned by the backup function, the zone ID, and a path, in case we don’t want to put our backup in the root of the bucket.

    @staticmethod
    def get_backup_name(id, path, backup):
        name = backup['HostedZone']['Name'][:-1] + '-' + id + '.json'
        if path is not None:
            name = path + '/' + name
        return name
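To make the naming rule concrete, here is the same function outside the class, called with made-up values. The zone name, the ID, and the normalize_zone_id helper are assumptions for illustration. The helper is there because some Route53 calls, such as list_hosted_zones, return IDs in the form /hostedzone/Z..., and a slash in the ID would end up in the file name.

```python
def normalize_zone_id(zone_id):
    """Keep only the last part of an ID like '/hostedzone/Z0123456789ABC'."""
    return zone_id.split('/')[-1]


def get_backup_name(zone_id, path, backup):
    """Same logic as Zone.get_backup_name, copied here to run on its own."""
    name = backup['HostedZone']['Name'][:-1] + '-' + zone_id + '.json'
    if path is not None:
        name = path + '/' + name
    return name


backup = {'HostedZone': {'Name': 'example.com.'}}
zone_id = normalize_zone_id('/hostedzone/Z0123456789ABC')

print(get_backup_name(zone_id, 'route53', backup))
# prints: route53/example.com-Z0123456789ABC.json
print(get_backup_name(zone_id, None, backup))
# prints: example.com-Z0123456789ABC.json
```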

Uploading the backup

Now our Route53 backup tutorial gets interesting. This part of the code contains most of the logic. First, it attempts to obtain a backup, and if that fails it logs an error message and publishes a failure metric to CloudWatch.

    def upload_backup(self, bucket, path):
        """Push the full backup to S3"""
        backup = self.backup()
        if backup is None:
            self.log(
                self.id,
                'Attempted backup failed because there was an error in generating the backup',
                has_error=True
                )
            cloudwatch.put_metric_data(
                Namespace='Network',
                MetricData=[{
                    'MetricName': 'Route53ZoneBackup',
                    'Dimensions': [
                        {
                            'Name': 'Status',
                            'Value': 'Failure'
                        }
                    ],
                    'Timestamp': datetime.now(),
                    'Value': 1,
                    'Unit': 'Count'
                }])
            return {
                'success': False,
                'error': 'No backup was generated'
            }

If, instead, the backup was obtained, it works out the name of the backup file in S3 and creates an object to represent it.

        name = self.get_backup_name(self.id, path, backup)
        # Check if data has changed since the last backup
        obj = s3.Object(bucket, name)

It attempts to download the object from S3 and compare it to the one just obtained from Route53. If they are equal, it writes that to CloudWatch and stops; otherwise, it continues.

        try:
            last_backup = json.loads(obj.get()['Body'].read())
            if last_backup == backup:
                self.log(
                    self.id,
                    'Backup succeeded, current configuration is identical to the one already in backup (VersionId={}). No new file was uploaded.'.format(obj.version_id),
                    backup_name=name
                )
                cloudwatch.put_metric_data(
                Namespace='Network',
                MetricData=[{
                    'MetricName': 'Route53ZoneBackup',
                    'Dimensions': [
                        {
                            'Name': 'Status',
                            'Value': 'Success'
                        },
                        {
                            'Name': 'NewUpload',
                            'Value': 'False'
                        }
                    ],
                    'Timestamp': datetime.now(),
                    'Value': 1,
                    'Unit': 'Count'
                }])
                return {
                    'success': True,
                    'uploaded_new_version': False,
                    'existing_valid_version': obj.version_id
                }
        except Exception:
            # No previous backup exists yet, or it could not be read: upload a new one
            pass

If we made it this far, it means either the two backups are different or no previous backup could be read (for example, on the very first run). The function uploads a new version to S3 and logs it.

        obj.put(Body=json.dumps(backup, indent=4, sort_keys=True))
        self.log(
            self.id,
            'Backup succeeded, configuration changed since last backup and new file was uploaded.',
            backup_name=name
        )
        cloudwatch.put_metric_data(
            Namespace='Network',
            MetricData=[{
                'MetricName': 'Route53ZoneBackup',
                'Dimensions': [
                    {
                        'Name': 'Status',
                        'Value': 'Success'
                    },
                    {
                        'Name': 'NewUpload',
                        'Value': 'True'
                    }
                ],
                'Timestamp': datetime.now(),
                'Value': 1,
                'Unit': 'Count'
            }])
        return {
            'success': True,
            'uploaded_new_version': True
        }

While this may seem a complex script, its job is very simple. Running it in various cases and checking the logs in CloudWatch will help you understand it better.

lambda_function

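Unlike our previous file, lambda_function.py is much more succinct. It expects the zone ID, the bucket name, and the bucket path as arguments, and then triggers the backup. Furthermore, it can run whether you call the Lambda function directly or subscribe it to an SNS topic.

import json

from zone import Zone


def lambda_handler(event, context):
    try:
        # Invoked through SNS: the arguments are a JSON string inside the message
        args = json.loads(event['Records'][0]['Sns']['Message'])
    except (KeyError, IndexError, TypeError, ValueError):
        # Invoked directly: the event itself carries the arguments
        args = event
    zone = Zone(args['zone_id'])
    return {
        'statusCode': 200,
        # Without a bucket_path, the backup goes to the root of the bucket
        'body': json.dumps(zone.upload_backup(args['bucket_name'], args.get('bucket_path')))
    }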

Save both files, and your lambda function is ready to go.
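To see the two ways of calling the function side by side, the sketch below builds a direct event and the same arguments wrapped the way the handler reads them from SNS. The parse_args helper mirrors the first lines of lambda_handler and is written only for this example; the zone ID and bucket name are placeholders. A real SNS event contains many more fields, only the ones the handler reads are shown.

```python
import json


def parse_args(event):
    """Extract the backup arguments from an SNS event or from a direct event."""
    try:
        return json.loads(event['Records'][0]['Sns']['Message'])
    except (KeyError, IndexError, TypeError, ValueError):
        return event


# Direct invocation, for example from the Test button in the Lambda console
direct_event = {
    'zone_id': 'Z0123456789ABC',          # placeholder zone ID
    'bucket_name': 'my-route53-backups',  # placeholder bucket name
    'bucket_path': None,                  # None stores the file in the bucket root
}

# The same arguments, as they arrive when the function is triggered by SNS
sns_event = {'Records': [{'Sns': {'Message': json.dumps(direct_event)}}]}

assert parse_args(direct_event) == parse_args(sns_event) == direct_event
```

You can paste the content of direct_event as JSON (with null instead of None) into a test event in the Lambda console to run a first backup.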

Overview of the lambda function

If you save your function, AWS will show you a summary of its triggers and resources. A trigger is something that makes the function run, while a resource is something your function uses.

Figure: the Lambda function summary, showing the function in use.

Note that you won’t have any trigger in your function, because you haven’t subscribed it to any SNS topic yet. You can attach this function to either an API or an SNS topic: this is up to you.

Using SNS may be simpler if you want to quickly test a manual backup. Once you have your topic set, you can publish into it a JSON message with the zone_id, bucket_name, and bucket_path of the zone you wish to back up, and the function will take care of that.

You can also use another Lambda function to write the zone IDs into that topic automatically. In this way, you will have one function doing the backup (this one), and another saying when and what to back up. That’s what I have put in production in an enterprise-grade environment.
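As an illustration of that second function, here is a minimal sketch that lists every hosted zone and publishes one backup request per zone. It is not the production code mentioned above: the function name, the topic ARN, and the bucket values are assumptions, and the clients are expected to be boto3.client('route53') and boto3.client('sns').

```python
import json


def publish_backup_requests(route53_client, sns_client, topic_arn,
                            bucket_name, bucket_path=None):
    """Publish one backup request per hosted zone and return the zone IDs."""
    published = []
    paginator = route53_client.get_paginator('list_hosted_zones')
    for page in paginator.paginate():
        for hosted_zone in page['HostedZones']:
            # IDs come back as '/hostedzone/Z...', keep only the last part
            zone_id = hosted_zone['Id'].split('/')[-1]
            message = json.dumps({
                'zone_id': zone_id,
                'bucket_name': bucket_name,
                'bucket_path': bucket_path,
            })
            sns_client.publish(TopicArn=topic_arn, Message=message)
            published.append(zone_id)
    return published

# Example call (the ARN and the bucket name are placeholders):
#   import boto3
#   publish_backup_requests(
#       boto3.client('route53'), boto3.client('sns'),
#       'arn:aws:sns:eu-west-1:123456789012:route53-backup',
#       'my-route53-backups', 'route53')
```

Keeping the dispatcher separate means the backup function stays small and handles one zone per invocation.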

IAM Permissions

If you run your function now, it will not work. By default, in fact, it has no permission to access Route53 or S3. We need to adjust its role to have the permissions it needs. It is outside the purpose of this article to give you a detailed explanation of IAM. Just know that the right production approach is to give the minimal privileges required. This means the function should only access Route53 in read-only, and S3 in read-write but only for the bucket where you want to store the backups. On top of that, the function should not have the right to alter the properties of the bucket itself.

Additionally, the function must have the ability to create logs and publish metrics in CloudWatch.

If you are just testing out, you can give your function all permissions in Route53, S3, and CloudWatch. Of course, this is not recommended for production.
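For reference, a least-privilege policy along those lines could look like the sketch below, which builds the policy document as a Python dictionary and prints it as JSON. The list of actions is an assumption derived from the API calls made in zone.py, and the bucket name is a placeholder, so review it against your own setup before attaching it to the role.

```python
import json


def build_backup_policy(bucket_name, bucket_path=None):
    """Return an IAM policy document covering only what the backup function calls."""
    prefix = bucket_path + '/' if bucket_path else ''
    return {
        'Version': '2012-10-17',
        'Statement': [
            {   # Read-only access to zones, records, and tags
                'Sid': 'ReadRoute53',
                'Effect': 'Allow',
                'Action': [
                    'route53:GetHostedZone',
                    'route53:ListResourceRecordSets',
                    'route53:ListTagsForResource',
                ],
                'Resource': '*',
            },
            {   # Read and write backup files, but no rights on the bucket itself
                'Sid': 'ReadWriteBackups',
                'Effect': 'Allow',
                'Action': ['s3:GetObject', 's3:PutObject'],
                'Resource': 'arn:aws:s3:::{}/{}*'.format(bucket_name, prefix),
            },
            {   # Custom metrics published with put_metric_data
                'Sid': 'PublishMetrics',
                'Effect': 'Allow',
                'Action': 'cloudwatch:PutMetricData',
                'Resource': '*',
            },
            {   # Log output of the function (print statements)
                'Sid': 'WriteLogs',
                'Effect': 'Allow',
                'Action': [
                    'logs:CreateLogGroup',
                    'logs:CreateLogStream',
                    'logs:PutLogEvents',
                ],
                'Resource': '*',
            },
        ],
    }


print(json.dumps(build_backup_policy('my-route53-backups', 'route53'), indent=4))
```

The printed JSON can be pasted into the policy editor of the IAM console. You may want to narrow the Resource of the log statement to the log group of your function.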

Running our AWS Route53 Backup

If you now run your AWS Route53 backup, you will be able to see its results in CloudWatch and S3. For example, in the environment where I have this function backing up all the zones, I can see the following in our bucket.

Figure: the AWS Route53 backups in S3.

You can download each file in the version you need and inspect it for yourself. If you need to do a restore, you will have to call the API and provide the three objects stored in each backup: hosted zone details, record sets, and tags.
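As a starting point for the record sets, the sketch below turns a downloaded backup into change batches for change_resource_record_sets, using the UPSERT action. Treat it as an untested outline under several assumptions: the function names and the batch size of 100 are made up for this example, the NS and SOA records at the top of the zone are skipped because Route53 manages them, and UPSERT will not delete records that were created after the backup was taken.

```python
def build_change_batches(backup, batch_size=100, skip_apex_ns_soa=True):
    """Turn the record sets of a backup into a list of Route53 change batches."""
    zone_name = backup['HostedZone']['Name']
    changes = []
    for record_set in backup['ResourceRecordSets']:
        is_apex = record_set['Name'] == zone_name
        if skip_apex_ns_soa and is_apex and record_set['Type'] in ('NS', 'SOA'):
            continue  # leave the records managed by Route53 untouched
        changes.append({'Action': 'UPSERT', 'ResourceRecordSet': record_set})
    # Split the changes so that a single request does not grow too large
    return [
        {'Comment': 'Restore from backup', 'Changes': changes[i:i + batch_size]}
        for i in range(0, len(changes), batch_size)
    ]


def restore_records(route53_client, zone_id, backup):
    """Apply every batch to the zone; route53_client is a Boto3 Route53 client."""
    for batch in build_change_batches(backup):
        route53_client.change_resource_record_sets(
            HostedZoneId=zone_id,
            ChangeBatch=batch,
        )
```

Here backup is the dictionary you get by loading one of the JSON files from S3 with json.loads(). Try it on a test zone first.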

Last words on AWS Route53 Backup

In this intense post, we saw how to use AWS tools to create an AWS Route53 backup. Of course, I recommend extending that to a more production-ready solution. Things that may be great to do include: periodic backup, automatic trigger of backup upon changes, and automated restore, without having to call the API manually.

What do you think of this solution? Does it help you manage your Route53 zones? Let me know in the comments.
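For the periodic backup, one option is a scheduled EventBridge rule that invokes the function once a day with a fixed event. The sketch below shows the idea for a single zone. The function name, the rule name, and the statement ID are assumptions for this example, and the clients are expected to be boto3.client('events') and boto3.client('lambda').

```python
import json


def schedule_daily_backup(events_client, lambda_client, function_arn,
                          zone_id, bucket_name, bucket_path=None,
                          rule_name='route53-backup-daily'):
    """Create a daily schedule that invokes the backup function for one zone."""
    rule = events_client.put_rule(
        Name=rule_name,
        ScheduleExpression='rate(1 day)',
        State='ENABLED',
    )
    # Allow EventBridge to invoke the function. Calling this a second time
    # with the same StatementId fails, so run the setup only once.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=rule_name + '-invoke',
        Action='lambda:InvokeFunction',
        Principal='events.amazonaws.com',
        SourceArn=rule['RuleArn'],
    )
    # The Input is delivered to the function as its event, so it has the
    # same shape as a direct invocation.
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{
            'Id': 'route53-backup',
            'Arn': function_arn,
            'Input': json.dumps({
                'zone_id': zone_id,
                'bucket_name': bucket_name,
                'bucket_path': bucket_path,
            }),
        }],
    )
    return rule['RuleArn']
```

Because the function skips the upload when nothing has changed, running it every day does not fill the bucket with identical versions.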

Alessandro Maggio

Project manager, critical-thinker, passionate about networking & coding. I believe that time is the most precious resource we have, and that technology can help us not to waste it. I founded ICTShore.com with the same principle: I share what I learn so that you get value from it faster than I did.

2020-02-13T16:30:58+00:00

Cloud, AWS, Python