Favourite performance tuning tricks [closed]

When you have a query or stored procedure that needs performance tuning, what are some of the first things you try?


Here is the handy-dandy list of things I always give to someone asking me about optimisation.
We mainly use Sybase, but most of the advice will apply across the board.

SQL Server, for example, comes with a host of performance monitoring / tuning bits, but if you don't have anything like that (and maybe even if you do) then I would consider the following...

99% of problems I have seen are caused by putting too many tables in a join. The fix for this is to do half the join (with some of the tables) and cache the results in a temporary table. Then do the rest of the query joining on that temporary table.
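As a rough illustration of splitting a join through a temporary table, here is a minimal sketch. The tables (`orders`, `customers`, `order_lines`, `products`, `suppliers`) and their columns are hypothetical, invented only for this example; the `SELECT ... INTO #temp` pattern itself should work on both Sybase ASE and SQL Server.

```sql
-- Hypothetical schema, for illustration only.

-- Step 1: do half the join and cache the result in a temporary table.
SELECT o.order_id, o.order_date, c.customer_name
INTO   #recent_orders
FROM   orders o
JOIN   customers c ON c.customer_id = o.customer_id
WHERE  o.order_date >= '20240101'

-- Step 2: join the remaining tables against the (much smaller) temp table.
SELECT r.order_id, r.customer_name, p.product_name, s.supplier_name
FROM   #recent_orders r
JOIN   order_lines ol ON ol.order_id   = r.order_id
JOIN   products p     ON p.product_id  = ol.product_id
JOIN   suppliers s    ON s.supplier_id = p.supplier_id

-- Clean up once the result has been read.
DROP TABLE #recent_orders
```

Each statement now joins at most four tables, so the optimiser has far fewer join orders to weigh up. Whether this actually beats the single big join depends on your data, so compare the I/O figures before and after.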

Query Optimisation Checklist

  • Run UPDATE STATISTICS on the underlying tables
    • Many systems run this as a scheduled weekly job
  • Delete records from underlying tables (possibly archive the deleted records)
    • Consider doing this automatically once a day or once a week.
  • Rebuild Indexes
  • Rebuild Tables (bcp data out/in)
  • Dump / Reload the database (drastic, but might fix corruption)
  • Build new, more appropriate index
  • Run DBCC to see if there is possible corruption in the database
  • Locks / Deadlocks
    • Ensure no other processes running in database
      • Especially DBCC
    • Are you using row or page level locking?
    • Lock the tables exclusively before starting the query
    • Check that all processes are accessing tables in the same order
  • Are indices being used appropriately?
    • Joins will only use index if both expressions are exactly the same data type
    • Index will only be used if the first field(s) on the index are matched in the query
    • Are clustered indices used where appropriate?
      • range data
      • WHERE field between value1 and value2
  • Small Joins are Nice Joins
    • By default the optimiser will only consider the tables 4 at a time.
    • This means that in joins with more than 4 tables, it has a good chance of choosing a non-optimal query plan
  • Break up the Join
    • Can you break up the join?
    • Pre-select foreign keys into a temporary table
    • Do half the join and put results in a temporary table
  • Are you using the right kind of temporary table?
    • #temp tables may perform much better than @table variables with large volumes (thousands of rows).
  • Maintain Summary Tables
    • Build with triggers on the underlying tables
    • Build daily / hourly / etc.
    • Build ad-hoc
    • Build incrementally or teardown / rebuild
  • See what the query plan is with SET SHOWPLAN ON
  • See what's actually happening with SET STATISTICS IO ON
  • Force an index using the pragma: (index: myindex)
  • Force the table order using SET FORCEPLAN ON
  • Parameter Sniffing:
    • Break Stored Procedure into 2
    • call proc2 from proc1
    • allows optimiser to choose index in proc2 if @parameter has been changed by proc1
  • Can you improve your hardware?
  • What time are you running? Is there a quieter time?
  • Is Replication Server (or another non-stop process) running? Can you suspend it? Run it e.g. hourly?
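A few of the checklist items above can be sketched in Sybase-style T-SQL. Treat this as an illustration under assumptions rather than a recipe: the `orders` table, the `idx_orders_date` index, and the `proc1` / `proc2` bodies are all hypothetical, and the index hint is written the way I understand ASE spells it, `(index myindex)` after the table name. Other products use a different hint syntax.

```sql
-- Hypothetical table, index, and procedure names throughout.

-- 1. Look at the plan and the real I/O for a query.
set showplan on
set statistics io on
go

-- Force a specific index (ASE-style hint; assumed syntax, check your version).
select order_id, order_date
from   orders (index idx_orders_date)
where  order_date between '20240101' and '20240131'
go

set showplan off
set statistics io off
go

-- 2. Parameter sniffing: split the procedure in two.
-- proc2 does the real work, so its plan is built from the value it is
-- actually called with, not from the value proc1 originally received.
create procedure proc2 @start_date datetime
as
    select order_id, order_date
    from   orders
    where  order_date >= @start_date
go

create procedure proc1 @start_date datetime = null
as
    -- proc1 may change the parameter before the query runs ...
    if @start_date is null
        select @start_date = dateadd(dd, -7, getdate())

    -- ... so hand the final value to proc2 instead of querying here.
    exec proc2 @start_date
go
```

Run the first batch with and without the index hint and compare the logical and physical reads that STATISTICS IO reports; if the hint does not lower them, remove it and let the optimiser choose.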

  1. Have a pretty good idea in your head of the optimal path for running the query.
  2. Check the query plan - always.
  3. Turn on STATISTICS IO and STATISTICS TIME, so that you can examine both IO and CPU performance. Focus on driving those numbers down, not necessarily the query time (as that can be influenced by other activity, cache, etc.).
  4. Look for large numbers of rows coming into an operator, but small numbers coming out. Usually, an index would help by limiting the number of rows coming in (which saves disk reads).
  5. Focus on the largest cost subtree first. Changing that subtree can often change the entire query plan.
  6. Common problems I've seen are:
    • If there are a lot of joins, sometimes SQL Server will choose to expand the joins, and then apply WHERE clauses. You can usually fix this by moving the WHERE conditions into the JOIN clause, or a derived table with the conditions inlined. Views can cause the same problems.
    • Suboptimal joins (LOOP vs HASH vs MERGE). My rule of thumb is to use a LOOP join when the top input has very few rows compared to the bottom, a MERGE when the sets are roughly equal and ordered, and a HASH for everything else. Adding a join hint will let you test your theory.
    • Parameter sniffing. If you ran the stored proc with unrealistic values at first (say, for testing), then the cached query plan may be suboptimal for your production values. Running again WITH RECOMPILE should verify this. For some stored procs, especially those that deal with varying sized ranges (say, all dates between today and yesterday - which would entail an INDEX SEEK - or, all dates between last year and this year - which would be better off with an INDEX SCAN) you may have to run it WITH RECOMPILE every time.
    • Bad indentation... Okay, so SQL Server doesn't have an issue with this, but I sure find it impossible to understand a query until I've fixed up the formatting.
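To make the common problems above a bit more concrete, here is a hedged sketch in SQL Server syntax. The procedure `dbo.GetOrdersByDate`, its parameters, and the `dbo.orders` / `dbo.customers` tables are made up for illustration; the hints are meant as experiments for testing a theory, not as permanent fixes.

```sql
-- Hypothetical procedure, tables, and columns throughout.

-- Parameter sniffing: run the procedure once with a freshly compiled plan.
-- If this is much faster than a normal call, the cached plan is suspect.
EXEC dbo.GetOrdersByDate @from = '20240101', @to = '20240102' WITH RECOMPILE

-- Join strategy: force a HASH join to test a theory, then compare the
-- STATISTICS IO / TIME output against the unhinted query.
SELECT o.order_id, c.customer_name
FROM   dbo.orders o
INNER HASH JOIN dbo.customers c ON c.customer_id = o.customer_id
WHERE  o.order_date >= '20240101'

-- Filter early: inline the condition in a derived table so fewer rows
-- reach the join.
SELECT r.order_id, c.customer_name
FROM  (SELECT order_id, customer_id
       FROM   dbo.orders
       WHERE  order_date >= '20240101') AS r
JOIN   dbo.customers c ON c.customer_id = r.customer_id
```

If the recompiled call wins consistently for a procedure with widely varying ranges, declaring the procedure itself WITH RECOMPILE trades some compile time on every call for a plan that fits each set of parameters.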