SQL server ignore case in a where expression

How do I construct a SQL query (MS SQL Server) where the "where" clause is case-insensitive?

SELECT * FROM myTable WHERE myField = 'sOmeVal'

I want the results to come back ignoring the case


Solution 1:

In the default configuration of a SQL Server database, string comparisons are case-insensitive. If your database overrides this setting (through the use of an alternate collation), then you'll need to specify what sort of collation to use in your query.

SELECT * FROM myTable WHERE myField = 'sOmeVal' COLLATE SQL_Latin1_General_CP1_CI_AS

Note that the collation I provided is just an example (though it will more than likely function just fine for you). A more thorough outline of SQL Server collations can be found here.

Solution 2:

Usually, string comparisons are case-insensitive. If your database is configured to case sensitive collation, you need to force to use a case insensitive one:

SELECT balance FROM people WHERE email = '[email protected]'
  COLLATE SQL_Latin1_General_CP1_CI_AS 

Solution 3:

I found another solution elsewhere; that is, to use

upper(@yourString)

but everyone here is saying that, in SQL Server, it doesn't matter because it's ignoring case anyway? I'm pretty sure our database is case-sensitive.

Solution 4:

The top 2 answers (from Adam Robinson and Andrejs Cainikovs) are kinda, sorta correct, in that they do technically work, but their explanations are wrong and so could be misleading in many cases. For example, while the SQL_Latin1_General_CP1_CI_AS collation will work in many cases, it should not be assumed to be the appropriate case-insensitive collation. In fact, given that the O.P. is working in a database with a case-sensitive (or possibly binary) collation, we know that the O.P. isn't using the collation that is the default for so many installations (especially any installed on an OS using US English as the language): SQL_Latin1_General_CP1_CI_AS. Sure, the O.P. could be using SQL_Latin1_General_CP1_CS_AS, but when working with VARCHAR data, it is important to not change the code page as it could lead to data loss, and that is controlled by the locale / culture of the collation (i.e. Latin1_General vs French vs Hebrew etc). Please see point # 9 below.

The other four answers are wrong to varying degrees.

I will clarify all of the misunderstandings here so that readers can hopefully make the most appropriate / efficient choices.

  1. Do not use UPPER(). That is completely unnecessary extra work. Use a COLLATE clause. A string comparison needs to be done in either case, but using UPPER() also has to check, character by character, to see if there is an upper-case mapping, and then change it. And you need to do this on both sides. Adding COLLATE simply directs the processing to generate the sort keys using a different set of rules than it was going to by default. Using COLLATE is definitely more efficient (or "performant", if you like that word :) than using UPPER(), as proven in this test script (on PasteBin).

    There is also the issue noted by @Ceisc on @Danny's answer:

    In some languages case conversions do not round-trip. i.e. LOWER(x) != LOWER(UPPER(x)).

    The Turkish upper-case "İ" is the common example.

  2. No, collation is not a database-wide setting, at least not in this context. There is a database-level default collation, and it is used as the default for altered and newly created columns that do not specify the COLLATE clause (which is likely where this common misconception comes from), but it does not impact queries directly unless you are comparing string literals and variables to other string literals and variables, or you are referencing database-level meta-data.

  3. No, collation is not per query.

  4. Collations are per predicate (i.e. something operand something) or expression, not per query. And this is true for the entire query, not just the WHERE clause. This covers JOINs, GROUP BY, ORDER BY, PARTITION BY, etc.

  5. No, do not convert to VARBINARY (e.g.convert(varbinary, myField) = convert(varbinary, 'sOmeVal')) for the following reasons:

    1. that is a binary comparison, which is not case-insensitive (which is what this question is asking for)
    2. if you do want a binary comparison, use a binary collation. Use one that ends with _BIN2 if you are using SQL Server 2008 or newer, else you have no choice but to use one that ends with _BIN. If the data is NVARCHAR then it doesn't matter which locale you use as they are all the same in that case, hence Latin1_General_100_BIN2 always works. If the data is VARCHAR, you must use the same locale that the data is currently in (e.g. Latin1_General, French, Japanese_XJIS, etc) because the locale determines the code page that is used, and changing code pages can alter the data (i.e. data loss).
    3. using a variable-length datatype without specifying the size will rely on the default size, and there are two different defaults depending on the context where the datatype is being used. It is either 1 or 30 for string types. When used with CONVERT() it will use the 30 default value. The danger is, if the string can be over 30 bytes, it will get silently truncated and you will likely get incorrect results from this predicate.
    4. Even if you want a case-sensitive comparison, binary collations are not case-sensitive (another very common misconception).
  6. No, LIKE is not always case-sensitive. It uses the collation of the column being referenced, or the collation of the database if a variable is compared to a string literal, or the collation specified via the optional COLLATE clause.

  7. LCASE is not a SQL Server function. It appears to be either Oracle or MySQL. Or possibly Visual Basic?

  8. Since the context of the question is comparing a column to a string literal, neither the collation of the instance (often referred to as "server") nor the collation of the database have any direct impact here. Collations are stored per each column, and each column can have a different collation, and those collations don't need to be the same as the database's default collation or the instance's collation. Sure, the instance collation is the default for what a newly created database will use as its default collation if the COLLATE clause wasn't specified when creating the database. And likewise, the database's default collation is what an altered or newly created column will use if the COLLATE clause wasn't specified.

  9. You should use the case-insensitive collation that is otherwise the same as the collation of the column. Use the following query to find the column's collation (change the table's name and schema name):

    SELECT col.*
    FROM   sys.columns col
    WHERE  col.[object_id] = OBJECT_ID(N'dbo.TableName')
    AND    col.[collation_name] IS NOT NULL;
    

    Then just change the _CS to be _CI. So, Latin1_General_100_CS_AS would become Latin1_General_100_CI_AS.

    If the column is using a binary collation (ending in _BIN or _BIN2), then find a similar collation using the following query:

    SELECT *
    FROM   sys.fn_helpcollations() col
    WHERE  col.[name] LIKE N'{CurrentCollationMinus"_BIN"}[_]CI[_]%';
    

    For example, assuming the column is using Japanese_XJIS_100_BIN2, do this:

    SELECT *
    FROM   sys.fn_helpcollations() col
    WHERE  col.[name] LIKE N'Japanese_XJIS_100[_]CI[_]%';
    

For more info on collations, encodings, etc, please visit: Collations Info