MySQL "IN" queries terribly slow with subquery but fast with explicit values

I have a MySQL query (Ubu 10.04,Innodb, Core i7, 16Gb RAM, SSD drives, MySQL params optimized):

SELECT
COUNT(DISTINCT subscriberid)
FROM
em_link_data
WHERE
linkid in (SELECT l.id FROM em_link l WHERE l.campaignid = '2900' AND l.link != 'open')

The table em_link_data has about 7million rows, em_link has a few thousand. This query will take about 18 seconds to complete. However, if I substitute the results of the subquery and do this:

SELECT
COUNT(DISTINCT subscriberid)
FROM
em_link_data
WHERE
linkid in (24899,24900,24901,24902);

then the query will run in less than 1 millisecond. The subquery alone runs in less than 1ms, the column linkid is indexed.

If I rewrite the query as a join, also less than 1ms. Why is a "IN" query so slow with a subquery in it and why so fast with values in it? I can't rewrite the query (bought software) so I was hoping there is some tweak or hint to speedup this query! Any help is appreciated.


Solution 1:

Subqueries execute every time you evaluate them (in MySQL anyway, not all RDBMSes), i.e. you're basically running 7 million queries! Using a JOIN, if possible, will reduce this to 1. Even if adding indexing improves performance of those, you're still running them.

Solution 2:

Yes, IN with subqueries is slow. Use a join instead.

SELECT
COUNT(DISTINCT subscriberid)
FROM em_link_data JOIN em_link ON em_link_data.linkid=em_link.id
WHERE em_link.campaignid = '2900' AND em_link.link != 'open'

And make sure you've defined indexes on em_link_data.linkid and em_link.id.