What are Covering Indexes and Covered Queries in SQL Server?
Can you explain the concepts of, and relationship between, Covering Indexes and Covered Queries in Microsoft's SQL Server?
A covering index is one which can satisfy all requested columns in a query without performing a further lookup into the clustered index.
There is no such thing as a covering query.
Have a look at this Simple-Talk article: Using Covering Indexes to Improve Query Performance.
If all the columns requested in the select
list of query, are available in the index, then the query engine doesn't have to lookup the table again which can significantly increase the performance of the query. Since all the requested columns are available with in the index, the index is covering the query. So, the query is called a covering query and the index is a covering index.
A clustered index can always cover a query, if the columns in the select list are from the same table.
The following links can be helpful, if you are new to index concepts:
- Excellent video on advantages and disadvantages of index and covering queries and indexes.
- Indexes in SQL Server
A Covering Index is a Non-Clustered
index. Both Clustered and Non-Clustered indexes use B-Tree data structure to improve the search for data, the difference is that in the leaves of a Clustered Index a whole record (i.e. row) is stored physically right there!, but this is not the case for Non-Clustered indexes. The following examples illustrate it:
Example: I have a table with three columns: ID, Fname and Lname.
However, for a Non-Clustered index, there are two possibilities: either the table already has a Clustered index or it doesn't:
As the two diagrams show, such Non-Clustered indexes do not provide a good performance, because they cannot find the favorite value (i.e. Lname) solely from the B-Tree. Instead they have to do an extra Look Up step (either Key or RID look up) to find the value of Lname. And, this is where covered index comes to the screen. Here, the Non-Clustered index on ID coveres the value of Lname right next to it in the leaves of the B-Tree and there is no need for any type of look up anymore.
A covered query is a query where all the columns in the query's result set are pulled from non-clustered indexes.
A query is made into a covered query by the judicious arrangement of indexes.
A covered query is often more performant than a non-covered query in part because non-clustered indexes have more rows per page than clustered indexes or heap indexes, so fewer pages need to be brought into memory in order to satisfy the query. They have more rows per page because only part of the table row is part of the index row.
A covering index is an index which is used in a covered query. There is no such thing as an index which, in and of itself, is a covering index. An index may be a covering index with respect to query A, while at the same time not being a covering index with respect to query B.
Here's an article in devx.com that says:
Creating a non-clustered index that contains all the columns used in a SQL query, a technique called index covering
I can only suppose that a covered query is a query that has an index that covers all the columns in its returned recordset. One caveat - the index and query would have to be built as to allow the SQL server to actually infer from the query that the index is useful.
For example, a join of a table on itself might not benefit from such an index (depending on the intelligence of the SQL query execution planner):
PersonID ParentID Name
1 NULL Abe
2 NULL Bob
3 1 Carl
4 2 Dave
Let's assume there's an index on PersonID,ParentID,Name
- this would be a covering index for a query like:
SELECT PersonID, ParentID, Name FROM MyTable
But a query like this:
SELECT PersonID, Name FROM MyTable LEFT JOIN MyTable T ON T.PersonID=MyTable.ParentID
Probably wouldn't benifit so much, even though all of the columns are in the index. Why? Because you're not really telling it that you want to use the triple index of PersonID,ParentID,Name
.
Instead, you're building a condition based on two columns - PersonID
and ParentID
(which leaves out Name
) and then you're asking for all the records, with the columns PersonID, Name
. Actually, depending on implementation, the index might help the latter part. But for the first part, you're better off having other indexes.