Performance, SQL Server

SQL Server: Are your non-clustered indexes useless?

Introduction

If you’ve been into your SQL Server performance tuning for a while you’ve probably come across the stock advice that if your query returns too many rows, your indexes may not be used. Although it’s sometimes skimmed over, this logic generally only applies to non-covering, non-clustered indexes.

Before long various stats are flying around, usually in the form of a quote like:

“If your query returns more than X% of the rows in a table, your index won’t be used”

Where X = 1%, 3%, 5%, 10%, 15%, 30% or any other number that seems to be plucked out of the ether somewhere.

But what if I told you something even more outlandish than that 1% figure? What if I told you the number could be as low as ~0.15% of a table’s rows before your non-clustered index becomes effectively useless?

I don’t know about you but that seems fairly improbable to me. We’re basically saying that even though the index could be used to eliminate 99.85% of the table from consideration in the blink of an eye, we’re going to scan the entire table anyway. What the what now!?

Don’t take my word for it though. Let’s take a look at a couple of examples and then I’ll explain why this makes sense.

Scenario 1: Small Rows

Let’s create a sample database and see if we can recreate this scenario:

USE Master

GO

IF DATABASEPROPERTY('Selectivity', 'Version') > 0
    DROP DATABASE Selectivity;

GO

CREATE DATABASE Selectivity

GO

USE Selectivity

GO

CREATE TABLE tblData(
	ID INT IDENTITY(1, 1),
	Data VARCHAR(50),					-- NOTE: We are deliberately creating small rows to fit as many individual rows on a page as possible
	NonClusteredColumn INT
)

This database and table structure is unremarkable in all respects except one. The rows are very small – unusually so in fact, but not beyond the realms of possibility. Even if this table were pressing 1 million+ rows, it’s not going to take up very much room and, as we’ll see later, this is quite important.

Let’s hoof some rows in there and build both a clustered and a non-clustered index on two different columns. Remember we’re going to be basing our queries on the non-clustered index as the behaviour of the clustered index will be quite different.


-- Disable notifications of row inserts for a small performance boost
SET NOCOUNT ON 

GO

DECLARE @counter AS INT = 0

BEGIN TRAN

	WHILE @counter < 1000000
		BEGIN

			INSERT INTO tblData (Data, NonClusteredColumn)
			VALUES ('Data', @counter + 1)

			SET @counter = @counter + 1
		END

COMMIT

GO
-- Create indexes once the table is built. This should guarantee that our statistics are accurate
CREATE CLUSTERED INDEX PK_tblData ON tblData(ID)
GO
CREATE NONCLUSTERED INDEX IX_tblData_NonClusteredColumn ON tblData(NonClusteredColumn)
GO

Again nothing remarkable there except the fact that we build the indexes at the end simply to make sure that the index statistics will be up to date.
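If you want to confirm that the statistics really are fresh, the STATS_DATE function will tell you when each statistics object on the table was last updated (this is just a convenience check, not part of the original walkthrough):

-- Check when the statistics behind each index on tblData were last updated
SELECT s.name AS StatsName,
       STATS_DATE(s.object_id, s.stats_id) AS LastUpdated
FROM sys.stats AS s
WHERE s.object_id = OBJECT_ID('dbo.tblData')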

So – on with the testing. Give the following query a go:


SELECT * FROM tblData
WHERE tblData.NonClusteredColumn > 998485

Small Rows Seek

That query will return 1,515 rows (0.1515% of the table) and will cause a non-clustered index seek. I don’t know about you, but that’s exactly what I was expecting given the search predicate is perfectly suited to using the non-clustered index.
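If you want to check that figure for yourself, a quick count of the matching rows will do it (this is purely a convenience query, not part of the original test):

-- Count the rows matching the predicate and express them as a percentage of the table
SELECT COUNT(*) AS MatchingRows,
       COUNT(*) * 100.0 / (SELECT COUNT(*) FROM tblData) AS PercentOfTable
FROM tblData
WHERE tblData.NonClusteredColumn > 998485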

So let’s modify the query to return just one more row:


SELECT * FROM tblData
WHERE tblData.NonClusteredColumn > 998484

And the result is a clustered index scan:

Small Rows Scan

In other words, once we started returning 0.1516% of the rows in the table, SQL Server went “bugger it” and started scanning the whole table. Even though it could have used the non-clustered index to eliminate 99.8484% of the table at a stroke.

Now, just to prove that this was a deliberate decision on the part of SQL Server, take a look at the row estimates used by the plan:

Accurate Statistics

The Estimated Number of Rows entry for the plan proves that SQL Server knew exactly what it was doing when it chose to scan the table. Those row estimates could barely have been any better, which shows that SQL Server was not confused or in some way misled by its statistics as to what was in the table.
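If you want to see exactly what SQL Server was working from, you can inspect the statistics object directly (by default it shares its name with the non-clustered index):

-- Show the statistics header, density vector and histogram for the non-clustered index
DBCC SHOW_STATISTICS ('dbo.tblData', IX_tblData_NonClusteredColumn)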

So what accounts for that behaviour? Let’s take a look at another example, this time with more typical row sizes.

Scenario 2: Large Rows

I’m not going to give you a blow-by-blow description this time. It’s the same setup as before, the one difference being we have artificially set each row to be substantially larger than in the first scenario. This means that fewer rows will fit on each 8K data page and the table will be physically much larger than before:

USE Master

GO

IF DATABASEPROPERTY('Selectivity', 'Version') > 0
    DROP DATABASE Selectivity;

GO

CREATE DATABASE Selectivity

GO

USE Selectivity

GO

CREATE TABLE tblData(
	ID INT IDENTITY(1, 1),
	Data CHAR(2000),					-- NOTE: Using a largeish CHAR data type to simulate larger rows. This table will be substantially larger than before
	NonClusteredColumn INT
)

-- Disable notifications of row inserts for a small performance boost
SET NOCOUNT ON 

GO

DECLARE @counter AS INT = 0

-- Use an explicit transaction to speed the individual inserts up

BEGIN TRAN

	WHILE @counter < 1000000
		BEGIN

			INSERT INTO tblData (Data, NonClusteredColumn)
			VALUES ('Data', @counter + 1)

			SET @counter = @counter + 1
		END

COMMIT

GO
-- Create indexes once the table is built. This should guarantee that our statistics are accurate
CREATE CLUSTERED INDEX PK_tblData ON tblData(ID)
GO
CREATE NONCLUSTERED INDEX IX_tblData_NonClusteredColumn ON tblData(NonClusteredColumn)
GO

With the larger rows, we actually get up to around the 6.5% mark before the plan switches over to a clustered index scan, as you can see below:

Large Rows Seek

Flipping over to a scan around the 6.5% mark:

Large Rows Scan

So that’s a little more in keeping with the conventional advice of around the 5% mark.
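For reference, the queries behind those two plans were the same shape as before, just with a lower threshold. The exact crossover point will vary from system to system, so treat the values below as illustrative rather than definitive:

-- Still a non-clustered index seek at roughly 6.5% of the table
SELECT * FROM tblData
WHERE tblData.NonClusteredColumn > 935000

-- Push the predicate a little further and the plan flips to a clustered index scan
SELECT * FROM tblData
WHERE tblData.NonClusteredColumn > 930000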

But what causes the difference?

Unlike some SQL Server query plan behaviour – this situation is really easy to explain. Take a look at the following screen shot of how big the “small” 1 million row table is:

Small table size

Even though it holds 1000000 rows – it’s still pretty small in memory terms. So small in fact that it would fit fairly comfortably in memory and, at 26MB, SQL Server would be through that in a flash. No clumsy bookmark lookups from the separate non-clustered index. No fuss.
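If you’d like to check the numbers on your own copy of the table, sp_spaceused will give you comparable figures to the screenshot:

-- Report the row count and the space used by the table and its indexes
EXEC sp_spaceused 'tblData'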

Now let’s take a look at the larger table:

Large Rows

Hmmm. That’s the exact same 1000000 rows, but in memory terms, the table is much larger.

SQL Server could still very well hold 2GB of table in memory – but it doesn’t take a genius to realise that scanning 2GB of data to get a small percentage of rows is a different proposition when compared to scanning 26MB.

That’s not to say that SQL Server won’t scan the whole 2GB – it will (once we reach the 6.5% rows returned mark). But Microsoft have clearly set the bar a bit higher with larger (and perhaps more typical) tables.

Conclusion

It would be nice if there was some sort of simple formula that I could share with you to tell you what the percentages are at different table sizes. If there is such a formula I certainly don’t know what it is and it would likely be pretty complicated given the number of factors the Query Optimiser needs to take account of. It almost certainly considers far more than just how big the table is or how many pages it contains.

The objective of this post was really just to point out that on small tables, your non-clustered indexes may well be useless even with relatively high row counts. At first sight this seems pretty counter-intuitive. But when you consider what is involved in doing a straight read of, say, a 26MB table, versus having to jump in and out of a separate non-clustered index structure, it begins to make sense.

The thing to understand is that small tables behave very differently to large tables in this respect, and the only way to know the exact percentage for your table is to try it yourself.

Update: Gail Shaw provided some additional info on the formula that SQL Server uses to flip between a seek and a scan in the comments below.

Performance, Recovery, SQL Server

Delayed Durability in SQL Server 2014 – Part 3 – When should I use it?

In Part 1 of this series we introduced the concept of transaction Durability and its importance in modern relational databases. We also saw that for the first time in SQL Server 2014 Microsoft are going to allow us to control or at least influence transaction durability through its new Delayed Durability feature. In Part 2 we saw that taking advantage of the Delayed Durability feature may offer write-heavy workloads a substantial performance improvement.

In this final post I’m going to offer some brief thoughts on the questions you should ask when considering use of Delayed Durability.

Is my workload suitable?

This is the single most important question we need to answer when considering using Delayed Durability. All other factors are academic if you answer “no” to this question.

So when might a workload be suitable? The short answer to this is that your workload is suitable if you can tolerate data loss. So perhaps the question here should be, when might I be able to tolerate data loss?

When data can be reconstituted from other sources (easily)

It is not uncommon to have data being transferred between systems in a way that is somewhat repeatable. This might be some sort of import into a reporting system or an overnight download of transactions. I’ve highlighted the word easily above because the fact that data can theoretically be reconstituted does not mean it can be done in practice. You don’t want to be figuring out the distance between theory and practice whilst you are trying to recover from some sort of data loss event!

That said, if the mechanism for reconstituting the data is relatively simple, rapid and well understood, you may well be able to cope with data loss.

So when else might you be able to tolerate data loss?

When losing data in one system does not automatically imply corruption in another

When considering whether you can deal with data loss you need to think very carefully about how your system interacts with other related systems. Many modern IT systems involve multiple co-dependent databases.

If you consider a web application that is supported by 3 separate databases, is it acceptable for a transaction to go missing in one of them, but not the other two? It’s certainly possible, but you’re going to want to see the small print on that one. The devil is going to be in the detail.

Is Delayed Durability really the only option I have?

It takes a lot to saturate a well tuned database running on decent hardware. In fact, I have only ever come across 2 systems that were being clogged up by write traffic.  It is much more common to see systems struggling with read performance. In the first case the issue was a single poorly structured stored procedure causing system wide havoc. In the second case the workload was actually quite modest – it was just being run against vastly underpowered hardware (think 4GB of RAM for a 130GB database with data, logs and OS all on the C drive!).

The point is – good database design and appropriately specified hardware can easily get you into the realms of tens of thousands of transactions per second. Because Delayed Durability implies data loss it should probably be well down your list of things to look into if you need to squeeze some more performance out of a system.

Bottom Line

Enabling Delayed Durability is a little bit like disabling the air bags in your car. For 99% of us that would be a pretty stupid and thankfully unnecessary thing to do. For the remaining 1% of systems that really need the performance we need to think carefully about how we would deal with the implied data loss in practice. We would also want to make sure that we can’t meet our performance requirements through design changes or more appropriate hardware – after all, why settle for data loss if we don’t need to?

The addition of Delayed Durability in SQL Server 2014 is a welcome one. It brings SQL Server more into line with both Oracle and MySQL in terms of giving the DBA/Developer the flexibility to control how strictly SQL Server enforces the Durability aspect of ACID compliance.

Performance, Recovery, SQL Server

Delayed Durability in SQL Server 2014 – Part 2 – Performance Testing

Introduction

In Part 1 of this series we took a look at some background information on the introduction of Delayed Durability in SQL Server 2014. In this post we take a rough and ready look at the sorts of performance improvements one might expect with Delayed Durability enabled and in Part 3 we take a look at whether your workload is suitable for running in a Delayed Durability environment.

The Tests

The basic principle behind our test is to ensure consistency across test runs. To that end I ran the following script to set up a clean database for each run:


USE Master
GO

IF (DATABASEPROPERTY('DelayedDurability', 'Version') > 0)
	DROP DATABASE DelayedDurability

CREATE DATABASE DelayedDurability ON PRIMARY(
	NAME = 'DelayedDurability_Data',

	-- You may need to change this
	FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\DATA\DelayedDurability.mdf',

	-- We want the data size to be large enough to not require an autogrowth op
	SIZE = 200MB,
	FILEGROWTH = 100MB

)
LOG ON (
	NAME = 'DelayedDurability_Log',

		-- You may need to change this
	FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\DATA\DelayedDurability.ldf',

	-- We want the log size to be large enough to not require an autogrowth op
	SIZE = 200MB,
	FILEGROWTH = 100MB

)
GO

USE DelayedDurability

GO

CREATE TABLE tblData(
	ID INT IDENTITY(1, 1),
	Data1 VARCHAR(50),
	Data2 INT
)

CREATE CLUSTERED INDEX PK_tblData ON tblData(ID)
GO

CREATE NONCLUSTERED INDEX IX_tblData_Data2 ON tblData(Data2)

GO

This creates a new, very simple database with one key attribute – the data file and log file are set larger than they need to be. This should ensure that no time is spent performing an auto growth operation on either the data file or log.

On test runs where we enabled Delayed Durability I set it at the database level to Forced – meaning all commits return immediately without having to specifically request it in each query. This was done to simplify the code and is not necessarily what you should do in the real world.
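For reference, forcing it at the database level is a one-line change (the database name below is the test database created above):

-- Force delayed durability for every transaction in this database.
-- The other options are DISABLED (the default) and ALLOWED (opt-in per commit).
ALTER DATABASE DelayedDurability SET DELAYED_DURABILITY = FORCED
GO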

The test script was fairly primitive – but good enough for a simple brute force test:


USE DelayedDurability

SET NOCOUNT ON

DECLARE @counter AS INT = 0

PRINT (CONVERT(VARCHAR, SYSDATETIME(), 121))

	WHILE (@counter < 1000000)
		BEGIN

			INSERT INTO tblData (Data1, Data2)
			VALUES('My Data', @counter)

			SET @counter = @counter + 1

		END

GO

PRINT (CONVERT(VARCHAR, SYSDATETIME(), 121))
GO

We basically send 1 million individual insert transactions as quickly as SSMS can send them.

The performance difference was pretty stark.

Operating out of the box in what I’ll call ACID-compliant mode, 1 million inserts took an average of:

360 seconds

With Delayed Durability Enabled the same operation took an average of:

27 seconds

which by my reckoning is considerably more than a 10X improvement in throughput.

Hey – I get far better/far worse results than these!

Now – I am not trying to convince you that if you enable Delayed Durability “you will see a 10 fold improvement in write performance”. In fact you almost certainly won’t see that size of improvement because I deliberately designed the test to maximise the impact of the Delayed_Durability setting. There are a huge number of variables at play that will affect your numbers. Off the top of my head these might include:

1. You are running SQL Server 2014 rather than 2014 CTP2
2. You are not running your tests on my laptop!
3. Your write workload may be substantially different to my deliberately primitive test workload. You might already be bundling up large operations into single transactions for example
4. You might have a snazzy fibre channel SAN with huge amounts of cache sitting in front of SSDs, in which case I’m pretty jealous of you right now.

Clearly my tests are hardly scientific and were deliberately set up to maximise the effect of batching up large numbers of very small transactions. In the real world your mileage will vary considerably. That said, I think it’s fair to say that under certain circumstances the performance improvement from Delayed Durability could be significant.

Be sure to read Part 1 and Part 3 for some background and guidance on using Delayed Durability.

Cheers

Performance, Recovery, SQL Server

Delayed Durability in SQL Server 2014 – Part 1 – Introduction

Check out the following additional posts in this series:

Part 2 – Performance Testing Delayed Durability
Part 3 – Delayed Durability – When should I use it?

Delayed Durability – An Introduction

When first learning about the A.C.I.D.(Atomicity, Consistency, Isolation, Durability) principles of modern relational databases it can be tempting to think of them as a set of immutable laws, as unbending and inviolable as the laws of physics.

Once we look a bit deeper, however, we begin to see that all is not as it seems. Some of these properties are implemented along a spectrum of strictness, sometimes even allowing the developer to configure exactly how strictly A.C.I.D. compliant a given system should be in certain regards.

The reason for this configurability is relatively straightforward – strict A.C.I.D. compliance is computationally expensive. What’s more, under certain scenarios at least, strict compliance is also unnecessary.

One of the most interesting changes coming in SQL Server 2014 is that for the first time we will be able to control the database engine’s approach to transaction Durability using a new feature called Delayed Durability. This is similar to, though not quite as configurable as, the Isolation Level modifiers we can use to control how strictly SQL Server enforces transaction Isolation.

As we shall see in Part 2, taking advantage of the Delayed Durability option can lead to a substantial improvement to write throughput. Sticking with this post for now though – I want to provide some background material on the concept of delayed durability.

Why is Durability Important Anyway?

In ACID parlance, Durability is the principle that once COMMITTED, a transaction becomes permanent.

This seems simple enough, but let’s take a look at a concrete example of why it would be very bad if this were not true in most systems.

An insurance brokerage system places a transaction to take out a £25 million insurance contract on an oil tanker. The database informs the brokerage application that the transaction has been committed successfully. The application then automatically issues paperwork advising the client that they are insured and initiates various other workflows relating to billing.

The importance of transaction permanence or durability here should be quite clear. Based on what the database has told us – we have advised the client that they are insured and have potentially set in motion a number of related actions in secondary systems. If we later found out that the transaction was not really committed – we would run in to a lot of problems very quickly – not least of all because our transaction has caused a cascading series of events and possible legal obligations that may well be impossible to unwind.

Write Ahead Logging

SQL Server achieves transaction durability by ensuring the client application is only notified of success once the transaction has made it to the transaction log on disk. This concept is known as write-ahead logging. The transaction will also be committed in memory and will eventually be flushed out to the data file but it is this idea of writing to the log file on disk *before* notifying the client that guarantees durability.

Although durability is an essential property of many systems it is unfortunately a very expensive safety mechanism. Unlike read operations that are free to make extensive use of lightning-fast RAM, all write-ahead logging operations must contend with the latencies involved in writing to a permanent storage system – often a spinning disk.

Even taking a fairly benign scenario – where we have a high-end storage array fronted by some sort of cache – our writes to permanent storage are going to be some orders of magnitude slower than a write to RAM. In systems that require extreme levels of write throughput it is not an exaggeration to say that IO on the transaction log is likely to be the bottleneck.
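If you suspect this is happening on one of your own systems, the WRITELOG wait type is a rough but useful indicator of how much time is being spent waiting on log flushes:

-- Cumulative time spent waiting on transaction log flushes since the last restart
-- (or since the wait statistics were last cleared)
SELECT wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = 'WRITELOG'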

This situation is made worse in scenarios where we have large numbers of very small, discrete transactions that each need to be acknowledged in serial as they are written to disk.

Delayed Durability – What Happens When I Switch it On?

When Delayed Durability is enabled – either at the database or transaction level – SQL Server simply doesn’t wait for confirmation that the commit has been hardened to the log on disk before reporting success. Writes are still flushed out to the transaction log as a batch shortly afterwards, but the transaction will have long since completed as far as the client is concerned.
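To give a flavour of what that looks like in practice, here is a minimal sketch of both options. The database and table names are purely illustrative:

-- (Illustrative objects only – any database and table will do)
ALTER DATABASE MyDatabase SET DELAYED_DURABILITY = ALLOWED
GO

USE MyDatabase
GO

CREATE TABLE tblOrders (OrderRef VARCHAR(20))
GO

-- This particular commit returns without waiting for the log flush to complete
BEGIN TRANSACTION
	INSERT INTO tblOrders (OrderRef) VALUES ('ABC123')
COMMIT TRANSACTION WITH (DELAYED_DURABILITY = ON)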

This has a number of advantageous effects on write throughput (at the expense of durability):

1. The CPU can continue processing without needing to engage in monstrously long waits for the disk to commit. Less waiting means more useful work on the processor.

2. Because our writes are now bundled up into batches the IO operations themselves are considerably more efficient when compared to writing out each individual write transaction as it arrives.

With this in mind then – and assuming we are willing to accept the risks – it would be reasonable to assume that we should get a pretty sizeable performance improvement by switching Delayed Durability on.

Take a look at Part 2 to see just how big a difference and Part 3 for some guidance on whether Delayed Durability is really what you need in your life.

Performance, SQL Server

SQL Server: Inserting test data FAST

You’ll sometimes want to create a table quickly in SQL Server to test some concept or feature. If you’re anything like me you’ll quite often want to do so with a reasonable amount of data.
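The examples below assume a trivially simple table along the following lines (the exact definition isn’t important; any narrow table will behave much the same):

-- A minimal table for the examples that follow
CREATE TABLE tblData(
	ID INT IDENTITY(1, 1),
	Data VARCHAR(50),
	MoreData VARCHAR(50)
)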

The brute force approach to loading a table in this scenario might be to use a simple while loop. When you’re starting out with SQL Server you might be tempted to do something like this:

-- Disable notifications of row inserts for a small performance boost
SET NOCOUNT ON
GO

DECLARE @counter AS INT = 0

WHILE @counter < 1000000

  BEGIN

    INSERT INTO tblData (Data, MoreData)
    VALUES ( 'Data',  'MoreData')

    SET @counter = @counter + 1
  END

Can’t argue with that for simplicity. This approach is effective, but won’t get you finished in a hurry. On my laptop, a while loop insert of 1 million rows takes around 4 minutes 50 seconds. 

Now I like reasonable test volumes – but I don’t like waiting the best part of 5 minutes to get them.

There are a few more adventurous measures you can take to improve this dramatically, but in terms of keeping things simple and effective you can’t beat simply wrapping the WHILE loop in an explicit transaction:

DECLARE @counter AS INT = 0

BEGIN TRANSACTION

  WHILE @counter < 1000000
    BEGIN

      INSERT INTO tblData (Data, MoreData)
      VALUES ( 'Data',  'MoreData')

      SET @counter = @counter + 1
    END

COMMIT

With that tiny change, you have turned a 1 million transaction operation into a single transaction. 

In my environment that drops the operation from approaching 5 minutes down to 18 seconds.

I know it seems obvious, but I’ve seen numerous developers forget about the advantages of an explicit over an implicit transaction at the right moment.

Not bad for adding 3 words!