Thursday, March 18, 2010

A Clustered Index Does Not Guarantee Order

Learned something the hard way recently: just because a table has a clustered index doesn't mean a SELECT without an ORDER BY will return the rows in order. For example, given this table:

CREATE TABLE MyTable
(
    ID  INT IDENTITY (1, 1)  NOT NULL
        CONSTRAINT PK_tblMyTable PRIMARY KEY CLUSTERED (ID)
)

... if you say:

SELECT * FROM MyTable

... then the rows will be returned in undefined (that is, random) order.

"Clustering", I've learned, is a physical construct, not a logical one. The rows are laid out in physical order on the hard drive, but SQL Server is free (in the abscence of an ORDER BY clause) to return them in whatever order it thinks most efficient.

As developers, we don't usually see this behavior, because we tend to have fresh, well-maintained tables. In fact, I only discovered it in a customer's database that hadn't had any maintenance done to it in years. So, the bottom line is, as always, "no ORDER BY, no ordering."

2 comments:

Sankar said...

Hi Larry,

There are different scenarios and its about the access patterns SQL Server uses to fetch the data. One such pattern and most people don't know about are the "Merry-go-round" scans aka advanced scanning only available in Enterprise editions. In short, If a query is scanning a table/index and another query also scans the same table/index while the first one still running, the second query will start using/sharing the data from the 1st query and goes back to fetch the remaining data that wasn't fetched/caught from 1st query. This is an extremely helpful optimization in SQL Server 2005 & above but without an order by one can't never guarantee the order even if a CI exists.

http://msdn.microsoft.com/en-us/library/ms191475.aspx

Michael J. Swart said...

About terminology, When talking to developers or management (or basically, anyone else who cares) I like to use the word "arbitrary" instead of random.

Queries without an order by clause are returned in an arbitrary order. Often you can detect a pattern, and in that sense it isn't random. But you certainly can't count on it!

I remember being asked "What changed last night that we're now see the bug today. Things were fine yesterday". The bug was ulitmately a missing "order by" clause. We were just lucky (unlucky?) that it didn't reveal itself earlier.

Post a Comment