Optimise Restores from S3


Optimise Restores from S3

Postby neildobson on Mon Oct 21, 2013 9:08 pm

Hi,
I'm trying to mirror some MS Access databases, and I've successfully set up a backup job that copies only the changes (delta copy) to S3 every 10 minutes. This is working well: AWS inbound traffic is free, and the storage cost for the full copy plus deltas is minimal. However, restores are expensive. I'm restoring to a server every 10 minutes, and the restore job appears to delete the file and overwrite it with a fresh copy from S3. The zipped DB is about 40 MB, so the restores every 10 minutes result in roughly 5 GB of outbound traffic per day (144 restores × 40 MB), and growing.

I guess delta updates on the restore side might be too much to ask, but is there any other way to optimise the restores? For instance, if the file hasn't been modified since the last restore, can the job be set to skip it? In our case the restored copy sits in a staging area, and after each restore a PowerShell job copies the restored DB into the production area, so the staged copy is never touched. The restore job should therefore be able to compare its timestamp with the copy on S3. A rough sketch of that staged-copy step is shown below.
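
For illustration, the staged-copy step might look something like this. It is a minimal sketch with placeholder paths and a hypothetical newer-than check, not the actual script (a real version, dbrestore.ps1, is attached later in this thread):

# Hypothetical sketch of the staged-copy step; paths are placeholders.
$staged     = 'C:\Restore\Staging\mydb.mdb'      # written by the restore job
$production = 'D:\Data\Production\mydb.mdb'      # live copy used by the app

if (Test-Path $staged) {
    $src = Get-Item $staged
    $dst = if (Test-Path $production) { Get-Item $production } else { $null }

    # Copy only when the staged file is newer, so a restore run that
    # changed nothing never disturbs the production copy.
    if ($null -eq $dst -or $src.LastWriteTimeUtc -gt $dst.LastWriteTimeUtc) {
        Copy-Item -Path $staged -Destination $production -Force
    }
}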

Our aim is to copy (mirror) the on-premise Access DBs to a number of load-balanced cloud servers, as shown below. All traffic is one-way:

BACKUP:
ON-PREMISE--->S3

RESTORE:
S3--->CLOUD SERVER1
  +-->CLOUD SERVER2
  +-->CLOUD SERVER3

Any assistance would be appreciated.

Thanks,
Neil

Re: Optimise Restores from S3

Postby superflexible on Tue Oct 22, 2013 2:57 am

Hello,

Yes, you should set up two jobs for the restore.

The first job would mirror the many zip files on the S3 storage 1:1 to a local copy. This job should not have any special options enabled (no unzipping etc.). In addition, under Versioning->More, it needs two checkmarks:
X Do Not Decode Left-Hand Filenames When Building File List
X Do Not Decode Right-Hand Filenames When Building File List

That way, you should get a 1:1 local copy.

The second job can then do the actual Synthetic Restore from this copy quickly.

Re: Optimise Restores from S3

Postby superflexible on Tue Oct 22, 2013 2:59 am

The first job should also use the following checkmark on the "Files" category:

X Automatically resume

This will download with a temporary filename until the copy is complete, so that the second job won't try to process incomplete files that are still downloading.
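
Syncovery's Automatically resume option handles this on the sync side. If your own post-processing script needs a similar safeguard, one common pattern (illustrative only, not part of Syncovery) is to check that no other process still has the file open before touching it:

# Illustrative guard, not part of Syncovery: skip a file that another
# process still holds open for writing (e.g. a download in progress).
function Test-FileReady {
    param([string]$Path)
    try {
        # Opening with FileShare.None fails while the file is being written.
        $fs = [System.IO.File]::Open($Path, 'Open', 'Read', 'None')
        $fs.Close()
        return $true
    } catch {
        return $false
    }
}

if (Test-FileReady 'C:\Restore\Staging\mydb.zip') {
    # Safe to let the post-processing script handle the file here.
}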

Re: Optimise Restores from S3

Postby neildobson on Tue Oct 22, 2013 5:29 pm

Hi,
Thanks for the detailed response. I think I understand the first part. On the backup side, because we're mirroring, we limit the number of versions required: the synthetic backups are set to create a checkpoint every day, and we remove unneeded incrementals older than one day, so the data on S3 effectively contains a full backup plus all the deltas created every 5 minutes over one full day.

With this in mind, the traffic for step 1 of your suggestion would be minimal, because the mirror would copy only a handful of files, and each subsequent run would need to copy only the new deltas, which in some cases are only a few KB. Pretty good so far.

However, what I would be left with is a load of zip files in a folder on my restore machines. I don't know how to set up the second job. Would it point to S3 or to the local files? I assume the latter, but some advice would be good here. Once I've got it figured out I can provide a set of backup & restore jobs for others to use.

Thanks.

Re: Optimise Restores from S3

Postby neildobson on Tue Oct 22, 2013 8:11 pm

Got it working.
The first job exactly mirrors the S3 bucket and the second job restores from the local copy. The average S3 download is now about 200 KB rather than 30 MB, and files are only copied when necessary. A great saving.
Thank you.

Re: Optimise Restores from S3

Postby neildobson on Mon Oct 28, 2013 6:45 pm

As promised, here are the config files to schedule backups to S3 using synthetic backups, plus the restore jobs, which mirror the S3 files locally and then extract the mirrored parts to update/overwrite your local copy.

With this set-up you could, for instance, back up a single file or a set of files and simultaneously restore it to an unlimited number of machines.

In my case I back up and restore every 5 minutes. On the backup side I am copying an MS Access DB which is about 300 MB in size and in constant use. The delta copies are about 300 KB each. This results in about 200 MB of upload & download traffic per day; a rough estimate of that figure is sketched below.
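
As a back-of-envelope check (an assumption of roughly one 300 KB delta per 5-minute run; actual delta sizes vary):

# Rough traffic estimate based on the figures above; assumptions, not
# measurements: one ~300 KB delta per run, runs every 5 minutes.
$intervalMinutes = 5
$deltaKB         = 300
$runsPerDay      = (24 * 60) / $intervalMinutes     # 288 runs per day
$uploadMB        = $runsPerDay * $deltaKB / 1024    # ~84 MB up to S3
"{0} runs/day, about {1:N0} MB uploaded; each restore server downloads a similar amount" -f $runsPerDay, $uploadMB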
Attachments:
dbrestore.ps1.doc - sample PowerShell script to copy the extracted file to production (373 Bytes)
2. Restore.xml - restore config (2 jobs) which first mirrors all S3 files and then extracts files from the zips (12.13 KiB)
1. Backup.xml - backup config which uploads deltas to S3 (7.7 KiB)

Re: Optimise Restores from S3

Postby superflexible on Sun Nov 03, 2013 3:00 pm

Many thanks!

