Maximum size for a SQL Server Query? IN clause? Is there a Better Approach [duplicate]
Possible Duplicate:
T-SQL WHERE col IN (…)
What is the maximum size for a SQL Server query? (# of characters)
Max size for an IN clause? I think I saw something about Oracle having a 1000 item limit but you could get around this with ANDing 2 INs together. Similar issue in SQL Server?
UPDATE So what would be the best approach if I need to take say 1000 GUIDs from another system (Non Relational Database) and do a "JOIN in code' against the SQL Server? Is it to submit the list of 1000 GUIDs to an IN clause? Or is there another technique that works more efficiently?
I haven't tested this but I wonder if I could submit the GUIDs as an XML doc. For example
<guids>
<guid>809674df-1c22-46eb-bf9a-33dc78beb44a</guid>
<guid>257f537f-9c6b-4f14-a90c-ee613b4287f3</guid>
</guids>
and then do some kind of XQuery JOIN against the Doc and the Table. Less efficient than 1000 item IN clause?
Every SQL batch has to fit in the Batch Size Limit: 65,536 * Network Packet Size.
Other than that, your query is limited by runtime conditions. It will usually run out of stack size because x IN (a,b,c) is nothing but x=a OR x=b OR x=c which creates an expression tree similar to x=a OR (x=b OR (x=c)), so it gets very deep with a large number of OR. SQL 7 would hit a SO at about 10k values in the IN, but nowdays stacks are much deeper (because of x64), so it can go pretty deep.
Update
You already found Erland's article on the topic of passing lists/arrays to SQL Server. With SQL 2008 you also have Table Valued Parameters which allow you to pass an entire DataTable as a single table type parameter and join on it.
XML and XPath is another viable solution:
SELECT ...
FROM Table
JOIN (
SELECT x.value(N'.',N'uniqueidentifier') as guid
FROM @values.nodes(N'/guids/guid') t(x)) as guids
ON Table.guid = guids.guid;
The SQL Server Maximums are disclosed http://msdn.microsoft.com/en-us/library/ms143432.aspx (this is the 2008 version)
A SQL Query can be a varchar(max) but is shown as limited to 65,536 * Network Packet size, but even then what is most likely to trip you up is the 2100 parameters per query. If SQL chooses to parameterize the literal values in the in clause, I would think you would hit that limit first, but I havn't tested it.
Edit : Test it, even under forced parameteriztion it survived - I knocked up a quick test and had it executing with 30k items within the In clause. (SQL Server 2005)
At 100k items, it took some time then dropped with:
Msg 8623, Level 16, State 1, Line 1 The query processor ran out of internal resources and could not produce a query plan. This is a rare event and only expected for extremely complex queries or queries that reference a very large number of tables or partitions. Please simplify the query. If you believe you have received this message in error, contact Customer Support Services for more information.
So 30k is possible, but just because you can do it - does not mean you should :)
Edit : Continued due to additional question.
50k worked, but 60k dropped out, so somewhere in there on my test rig btw.
In terms of how to do that join of the values without using a large in clause, personally I would create a temp table, insert the values into that temp table, index it and then use it in a join, giving it the best opportunities to optimse the joins. (Generating the index on the temp table will create stats for it, which will help the optimiser as a general rule, although 1000 GUIDs will not exactly find stats too useful.)