Should I index a bit field in SQL Server?

Consider what an index is in SQL - and index is really a chunk of memory pointing at other chunks of memory (i.e. pointers to rows). The index is broken into pages so that portions of the index can be loaded and unloaded from memory depending on usage.

When you ask for a set of rows, SQL uses the index to find the rows more quickly than table scanning (looking at every row).

SQL has clustered and non-clustered indexes. My understanding of clustered indexes is that they group similar index values into the same page. This way when you ask for all the rows matching an index value, SQL can return those rows from a clustered page of memory. This is why trying to cluster index a GUID column is a bad idea - you don't try to cluster random values.

When you index an integer column, SQL's index contains a set of rows for each index value. If you have a range of 1 to 10, then you would have 10 index pointers. Depending on how many rows there are this can be paged differently. If your query looks for the index matching "1" and then where Name contains "Fred" (assuming the Name column is not indexed), SQL gets the set of rows matching "1" very quickly, then table scans to find the rest.

So what SQL is really doing is trying to reduce the working set (number of rows) it has to iterate over.

When you index a bit field (or some narrow range), you only reduce the working set by the number of rows matching that value. If you have a small number of rows matching it would reduce your working set a lot. For a large number of rows with 50/50 distribution, it might buy you very little performance gain vs. keeping the index up to date.

The reason everyone says to test is because SQL contains a very clever and complex optimizer that may ignore an index if it decides table scanning is faster, or may use a sort, or may organize memory pages however it darn well likes.


I just came across this question by way of another. Assuming that your statement that only a handful of the records assume the value of 1 (and that those are the ones you're interested in), then a filtered index could be a good choice. Something like:

create index [IX_foobar] on dbo.Foobar (FooID) where yourBitColumn = 1

This will create a substantially smaller index that the optimizer is smart enough to use when that is a predicate in your query.


100-million records with only a few having the bit field set to 1? Yes, I would think indexing the bit field would definitely speed up querying the bit=1 records. You should get logarithmic search time from the index and then only touch the few pages with bit=1 records. Otherwise, you'd have to touch all pages of the 100-million record table.

Then again, I'm definitely not a database expert and could be missing something important.


If your distribution is pretty known and unbalanced, like 99% of the rows are bit = 1 and the 1% are bit = 0, when you do a WHERE clause with bit = 1, a full table scan will be around the same time as the index scan. If you want to have a fast query where bit = 0, the best way I know is create a filtered index, adding a clause WHERE bit = 0. That way, that index will only store the 1% row. Then doing a WHERE bit = 0 will simply let the query optimizer choose that index, and all rows from it will be bit = 0. You also have the benefit to have a very small amount of disk space required compare a full index on the bit.