Does assigning stored procedure input parameters to local variables help optimize the query?

I have a stored procedure that takes 5 input parameters. The procedure is a bit complicated and takes around 2 minutes to execute. I am in process of optimizing query.

So, my question is, does it always help to assign input parameters to local variables and then use local variables in the procedure?

If so, how does it help?

I will not try and explain the full details of parameter sniffing, but in short, no it does not always help (and it can hinder).

Imagine a table (T) with a primary key and an indexed Date column (A), in the table there are 1,000 rows, 400 have the same value of A (lets say today 20130122), the remaining 600 rows are the next 600 days, so only 1 record per date.

This query:

SELECT *
FROM T
WHERE A = '20130122';

Will yield a different execution plan to:

SELECT *
FROM T
WHERE A = '20130123';

Since the statistics will indicate that for the first 400 out of 1,000 rows will be returned the optimiser should recognise that a table scan will be more efficient than a bookmark lookup, whereas the second will only yield 1 rows, so a bookmark lookup will be much more efficient.

Now, back to your question, if we made this a procedure:

CREATE PROCEDURE dbo.GetFromT @Param DATE
AS
    SELECT *
    FROM T
    WHERE A = @Param

Then run

EXECUTE dbo.GetFromT '20130122'; --400 rows

The query plan with the table scan will be used, if the first time you run it you use '20130123' as a parameter it will store the bookmark lookup plan. Until such times as the procedure is recompiled the plan will remain the same. Doing something like this:

CREATE PROCEDURE dbo.GetFromT @Param VARCHAR(5)
AS
    DECLARE @Param2 VARCHAR(5) = @Param;
    SELECT *
    FROM T
    WHERE A = @Param2

Then this is run:

EXECUTE dbo.GetFromT '20130122';

While the procedure is compiled in one go, it does not flow properly, so the query plan created at the first compilation has no idea that @Param2 will become the same as @param, so the optimiser (with no knowledge of how many rows to expect) will assume 300 will be returned (30%), as such will deem a table scan more efficient that a bookmark lookup. If you ran the same procedure with '20130123' as a parameter it would yield the same plan (regardless of what parameter it was first invoked with) because the statistics cannot be used for an unkonwn value. So running this procedure for '20130122' would be more efficient, but for all other values would be less efficient than without local parameters (assuming the procedure without local parameters was first invoked with anything but '20130122')

Some queries to demonstate so you can view execution plans for yourself

Create schema and sample data

CREATE TABLE T (ID INT IDENTITY(1, 1) PRIMARY KEY, A DATE NOT NULL, B INT,C INT, D INT, E INT);

CREATE NONCLUSTERED INDEX IX_T ON T (A);

INSERT T (A, B, C, D, E)
SELECT  TOP 400 CAST('20130122' AS DATE), number, 2, 3, 4 
FROM    Master..spt_values 
WHERE   type = 'P'
UNION ALL
SELECT TOP 600 DATEADD(DAY, number, CAST('20130122' AS DATE)), number, 2, 3, 4 
FROM    Master..spt_values 
WHERE   Type = 'P';
GO
CREATE PROCEDURE dbo.GetFromT @Param DATE
AS
    SELECT *
    FROM T
    WHERE A = @Param
GO
CREATE PROCEDURE dbo.GetFromT2 @Param DATE
AS
    DECLARE @Param2 DATE = @Param;
    SELECT *
    FROM T
    WHERE A = @Param2
GO

Run procedures (showing actual execution plan):

EXECUTE GetFromT '20130122';
EXECUTE GetFromT '20130123';
EXECUTE GetFromT2 '20130122';
EXECUTE GetFromT2 '20130123';
GO
EXECUTE SP_RECOMPILE GetFromT;
EXECUTE SP_RECOMPILE GetFromT2;
GO
EXECUTE GetFromT '20130123';
EXECUTE GetFromT '20130122';
EXECUTE GetFromT2 '20130123';
EXECUTE GetFromT2 '20130122';

You will see that the first time GetFromT is compiled it uses a table scan, and retains this when run with the parameter '20130122', GetFromT2 also uses a table scan and retains the plan for '20130122'.

After the procedures have been set for recompilation and run again (note in a different order), GetFromT uses a bookmark loopup, and retains the plan for '20130122', despite having previously deemed that an table scan is a more approprate plan. GetFromT2 is unaffected by the order and has the same plan as before the recompliateion.

So, in summary, it depends on the distribution of your data, and your indexes, your frequency of recompilation, and a bit of luck as to whether a procedure will benefit from using local variables. It certainly does not always help.

Hopefully I have shed some light on the effect of using local parameters, execution plans and stored procedure complilation. If I have failed completely, or missed a key point a much more in depth explanation can be found here:

http://www.sommarskog.se/query-plan-mysteries.html

I don't believe so. Modern computer architectures have plenty of cache close to the processor for putting in stored procedure values. Essentially, you can consider these as being on a "stack" which gets loaded into local cache memory.

If you have output parameters, then possibly copying input values to a local variable would eliminate one step of indirection. However, the first time that indirection is executed, the destination memory will be put in the local cache and it will probably remain there.

So, no, I don't think this is an important optimization.

But, you could always time different variants of a stored procedure to see if this would help.

It does help.

Below links contain more details about parameter sniffing.

http://blogs.msdn.com/b/turgays/archive/2013/09/10/parameter-sniffing-problem-and-workarounds.aspx

http://sqlperformance.com/2013/08/t-sql-queries/parameter-sniffing-embedding-and-the-recompile-options

When you execute a SP with parameters for the first time, query optimizer creates the query plan based on the value of the parameter. Query optimizer uses statistics data for that particular value to decide the best query plan. But cardinality issues can affect this. Which means if you execute the same SP with different parameter value that previously generated query plan may not be the best plan.

By assigning parameters to local variables we hide the parameter values from query optimizer. So it creates the query plan for general case.

this is same as using "OPTIMIZE FOR UNKNOWN" hint in the SP.

Does assigning stored procedure input parameters to local variables help optimize the query?

Related

Recent Posts