Disaster Recovery, High Availability, Performance, Recovery, SQL Server, Troubleshooting

How to halve your database recovery time in 60 seconds

Introduction

Picture the scene. Your sitting at you’re desk one morning and a user bursts in in a blind panic and tells you your company’s most critical system is down and your business is currently dead in the water.

Hundreds, maybe thousands of users are now sitting idle wondering what’s going on. Scores of customers are also getting increasingly irate with each passing second. Senior management are asking what the hell is going on.

After some frantic phone calls and emergency investigation work you realise that something catastrophic has happened to the single SQL Server database that underpins your entire operation. The damage to the database is so bad that the only way to recover it is a full database restore from last nights backup.

Oh, and the database is 500 GB+

  • Lets leave aside the time it takes you to figure out roughly what’s going on
  • Lets leave aside the time it takes for you to decide that a restore is the only option
  • Lets also leave aside the  fact that you have no replacement hardware and assume you have a perfectly sized location ready and waiting for you (really?)
  • Lets leave aside any time that may be required to find and transfer backups to where they need to be
  • Lets assume we don’t have the added complexity of needing to figure out how to do a transaction log restore because you’ve never actually needed to do one

All we’re interested in is how long it would take from the moment you ran the RESTORE command, for the database to be fully recovered?

Or put another way, how long would your business be well and truly up shit creek?

Instant File Initialisation

If you would be the poor happless schmuck who has to paddle the business out of said creek using only your hands, you owe it to yourself to take a look at whether your databases could take advantage of Instant File Initialisation.

If you deal in “large” databases, you should check this out right now because this feature may turn out to be the difference between a shit day and a very shit day.

Instant File Initialisation is a feature introduced in SQL Server 2005 that allows SQL Server to allocate MDF space instantly without having to zero initialise it first.

There are a range of scenarios that this improves (creating and autogrowing databases amongst them) but by far the most important scenario it improves is during a database recovery.

The concept of zero initialisation probably doesnt mean that much to you if you are dealing with small-ish databases. However if you deal in databases in the hundreds of gigabyte range or larger then you really need to understand the impact it will have.

In the scenario I described above, as part of a recovery of 500 GB SQL Server would have to write ~500 GB of zeros to the recovered data file before even starting to restore actual data.

I’m not going to explain how to switch Instant File Initialisation on or provide much background material on the feature. Far smarter people than I have already done this.

Check out the following couple of articles for the necessary details:

  • Kimberley Tripp did a great article on this back in the day. She also gave the most understandable description I’ve seen about why some hyper security concious setups might not want to do this (but in fact probably still should)
  • You can also look here for the basic details on how to enable this feature.

In the rest of this post I’m going to show a quick test I did on a 260 GB database to demonstrate the scale of improvement this simple change can have.

The Test

Equipment

  • HP Elite Book
  • Quad Core I7
  • 32GB RAM
  • C Drive: 7200 RPM disk with 20GB SSD cache
  • E Drive: 7200 RPM external disk over USB 3.0

Database to Recover:

  • 123 GB compressed backup
  • 267 GB Data file
  • 65 GB Log file

To avoid disk contention as much as possible, the backup file is located on E and the database will be recovered to C. Not exactly server hardware but its the best I could do!

Test 1: Instant File Initialisation Off

  • Total Time to Recover: 2 hours 15 Minutes

Test: 2 Instant File Initialisation On

  • Total Time to Recover: 1 hour 19 minutes

Conclusions

In a recovery crisis, having Instant File Initialisation enabled could reduce your restore time by approaching half.

In the scenario we have above, we could improve the situation even further by attempting to minimise the size of the log file. At 65 GB we still need to zero initialise a substantial amount of data. This is because unlike data files the log file must be zero initialised before use. If we could control the size of the log file (perhaps through changing bulk import or indexing strategies) we could potentially minimise this cost and reduce the time a bit further.

I say a bit because as far as I can tell SQL Server seems to recover both the data file and log file in parallel rather than sequentially. I’ve never looked in detail at what exactly SQL Server is doing whilst writing to both the data file and log file at the same time, but at a guess it could be recovering the data file whilst simultaneously zeroing out the log.

This means it’s not as though the recovery is completely stalled whilst zero initialisation takes place on the log.

If we extrapolate the numbers up a bit, if we were dealing with a 500 GB database we might expect that to restore in about 4 – 5 hours. With Instant File Initialisation enabled that number might be around the 2 – 2.5 hour mark.

This probably won’t be much comfort the day this happens to you for real, but at least you’ll know that you’ve made the best  you could have out of a pretty bad hand

I hope this helps someone!

Advertisements

One thought on “How to halve your database recovery time in 60 seconds

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s