MySQL "IN" queries terribly slow with subquery but fast with explicit values
I have a MySQL query (Ubu 10.04,Innodb, Core i7, 16Gb RAM, SSD drives, MySQL params optimized):
SELECT
COUNT(DISTINCT subscriberid)
FROM
em_link_data
WHERE
linkid in (SELECT l.id FROM em_link l WHERE l.campaignid = '2900' AND l.link != 'open')
The table em_link_data has about 7million rows, em_link has a few thousand. This query will take about 18 seconds to complete. However, if I substitute the results of the subquery and do this:
SELECT
COUNT(DISTINCT subscriberid)
FROM
em_link_data
WHERE
linkid in (24899,24900,24901,24902);
then the query will run in less than 1 millisecond. The subquery alone runs in less than 1ms, the column linkid is indexed.
If I rewrite the query as a join, also less than 1ms. Why is a "IN" query so slow with a subquery in it and why so fast with values in it? I can't rewrite the query (bought software) so I was hoping there is some tweak or hint to speedup this query! Any help is appreciated.
Solution 1:
Subqueries execute every time you evaluate them (in MySQL anyway, not all RDBMSes), i.e. you're basically running 7 million queries! Using a JOIN, if possible, will reduce this to 1. Even if adding indexing improves performance of those, you're still running them.
Solution 2:
Yes, IN
with subqueries is slow. Use a join instead.
SELECT
COUNT(DISTINCT subscriberid)
FROM em_link_data JOIN em_link ON em_link_data.linkid=em_link.id
WHERE em_link.campaignid = '2900' AND em_link.link != 'open'
And make sure you've defined indexes on em_link_data.linkid
and em_link.id
.