FIND_IN_SET() vs IN()

I have 2 tables in my database. One is for orders, and one is for companies.

Orders has this structure:

OrderID     |     attachedCompanyIDs
------------------------------------
   1                     1,2,3
   2                     2,4

And Company has this structure:

CompanyID      |        name
--------------------------------------
    1                 Company 1
    2                 Another Company
    3                 StackOverflow
    4                 Nothing

To get an order's companies names, I can do a query as such:

SELECT name FROM orders,company
WHERE orderID = 1 AND FIND_IN_SET(companyID, attachedCompanyIDs)

That query works fine, but the following query does not.

SELECT name FROM orders,company
WHERE orderID = 1 AND companyID IN (attachedCompanyIDs)

Why does the first query work but not the second one?

The first query returns:

name
---------------
Company 1
Another Company
StackOverflow

The second query only returns:

name
---------------
Company 1

Why is this, why does the first query return all the companies, but the second query only returns the first one?


Solution 1:

SELECT  name
FROM    orders,company
WHERE   orderID = 1
        AND companyID IN (attachedCompanyIDs)

attachedCompanyIDs is a scalar value which is cast into INT (type of companyID).

The cast only returns numbers up to the first non-digit (a comma in your case).

Thus,

companyID IN ('1,2,3') ≡ companyID IN (CAST('1,2,3' AS INT)) ≡ companyID IN (1)

In PostgreSQL, you could cast the string into array (or store it as an array in the first place):

SELECT  name
FROM    orders
JOIN    company
ON      companyID = ANY (('{' | attachedCompanyIDs | '}')::INT[])
WHERE   orderID = 1

and this would even use an index on companyID.

Unfortunately, this does not work in MySQL since the latter does not support arrays.

You may find this article interesting (see #2):

  • 10 things in MySQL (that won’t work as expected)

Update:

If there is some reasonable limit on the number of values in the comma separated lists (say, no more than 5), so you can try to use this query:

SELECT  name
FROM    orders
CROSS JOIN
        (
        SELECT  1 AS pos
        UNION ALL
        SELECT  2 AS pos
        UNION ALL
        SELECT  3 AS pos
        UNION ALL
        SELECT  4 AS pos
        UNION ALL
        SELECT  5 AS pos
        ) q
JOIN    company
ON      companyID = CAST(NULLIF(SUBSTRING_INDEX(attachedCompanyIDs, ',', -pos), SUBSTRING_INDEX(attachedCompanyIDs, ',', 1 - pos)) AS UNSIGNED)

Solution 2:

attachedCompanyIDs is one big string, so mysql try to find company in this its cast to integer

when you use where in

so if comapnyid = 1 :

companyID IN ('1,2,3')

this is return true

but if the number 1 is not in the first place

 companyID IN ('2,3,1')

its return false

Solution 3:

To get the all related companies name, not based on particular Id.

SELECT 
    (SELECT GROUP_CONCAT(cmp.cmpny_name) 
    FROM company cmp 
    WHERE FIND_IN_SET(cmp.CompanyID, odr.attachedCompanyIDs)
    ) AS COMPANIES
FROM orders odr