JPA2: Case-insensitive like matching anywhere

It may seem a little awkward at first, but the Criteria API is type-safe. Queries built from strings are not, so their errors only surface at runtime instead of at compile time. You can make criteria queries more readable by indenting them or by building each step separately, instead of writing an entire WHERE clause on a single line.

To make your query case-insensitive, convert both your keyword and the compared field to lower case:

query.where(
    builder.or(
        builder.like(
            builder.lower(
                root.get(
                    type.getDeclaredSingularAttribute("username", String.class)
                )
            ), "%" + keyword.toLowerCase() + "%"
        ), 
        builder.like(
            builder.lower(
                root.get(
                    type.getDeclaredSingularAttribute("firstname", String.class)
                )
            ), "%" + keyword.toLowerCase() + "%"
        ), 
        builder.like(
            builder.lower(
                root.get(
                    type.getDeclaredSingularAttribute("lastname", String.class)
                )
            ), "%" + keyword.toLowerCase() + "%"
        )
    )
);

As I commented on the (currently) accepted answer, there is a pitfall in using the DBMS's lower() function on one side and Java's String.toLowerCase() on the other: the two methods are not guaranteed to produce the same output for the same input string.
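One reason the two sides can disagree is that String.toLowerCase() depends on a locale, while the database applies its own rules. The following is a minimal sketch of the Java side only; the class and method names are made up for illustration, and it does not query a database.

```java
import java.util.Locale;

// Hypothetical demo class: shows that Java's lower-casing depends on the locale.
final class LowerCaseLocaleDemo {

    // Lower-cases the keyword with an explicit locale instead of the JVM default.
    static String lowerWith(String keyword, Locale locale) {
        return keyword.toLowerCase(locale);
    }

    public static void main(String[] args) {
        String keyword = "TITLE";
        // Locale-neutral rules: prints "title"
        System.out.println(lowerWith(keyword, Locale.ROOT));
        // Turkish rules map 'I' to dotless 'ı' (U+0131): prints "tıtle"
        System.out.println(lowerWith(keyword, Locale.forLanguageTag("tr")));
    }
}
```

A pattern lowered with Turkish rules in the JVM would therefore not match a column lowered by a DBMS that maps 'I' to 'i'.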

I finally found a much safer (yet not bullet-proof) solution which is to let the DBMS do all the lowering using a literal expression:

builder.lower(builder.literal("%" + keyword + "%"))

So the complete solution would look like this:

query.where(
    builder.or(
        builder.like(
            builder.lower(
                root.get(
                    type.getDeclaredSingularAttribute("username", String.class)
                )
            ), builder.lower(builder.literal("%" + keyword + "%"))
        ), 
        builder.like(
            builder.lower(
                root.get(
                    type.getDeclaredSingularAttribute("firstname", String.class)
                )
            ), builder.lower(builder.literal("%" + keyword + "%"))
        ), 
        builder.like(
            builder.lower(
                root.get(
                    type.getDeclaredSingularAttribute("lastname", String.class)
                )
            ), builder.lower(builder.literal("%" + keyword + "%"))
        )
    )
);

Edit:
When @cavpollo asked me for an example, I had to think twice about my solution, and I realized it is not that much safer than the accepted answer:

DB value* | keyword | accepted answer | my answer
------------------------------------------------
elie     | ELIE    | match           | match
Élie     | Élie    | no match        | match
Élie     | élie    | no match        | no match
élie     | Élie    | match           | no match

Still, I prefer my solution because it does not compare the outcomes of two different functions that are merely supposed to work alike. I apply the very same function to all of the strings, so comparing the outputs becomes more "stable".

A bullet-proof solution would involve the locale, so that SQL's lower() can correctly lower accented characters. (But this goes beyond my humble knowledge.)

*DB values stored in PostgreSQL 9.5.1 with the 'C' locale
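The table above can be reproduced with a small simulation. This is only a sketch: it assumes that lower() under the 'C' locale folds ASCII A-Z and leaves accented letters untouched, and it replaces LIKE '%keyword%' with String.contains. The class and method names are hypothetical, and Locale.ROOT stands in for the JVM default locale so the result is deterministic.

```java
import java.util.Locale;

// Hypothetical simulation of the comparison table; no database is involved.
final class LowerComparisonDemo {

    // Assumption: mimics SQL lower() under a 'C' locale by folding only ASCII A-Z.
    static String asciiLower(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            sb.append(c >= 'A' && c <= 'Z' ? (char) (c + 32) : c);
        }
        return sb.toString();
    }

    // Accepted answer: the database lowers the column, Java lowers the keyword.
    static boolean mixedMatch(String dbValue, String keyword) {
        return asciiLower(dbValue).contains(keyword.toLowerCase(Locale.ROOT));
    }

    // This answer: the database lowers both the column and the keyword.
    static boolean dbOnlyMatch(String dbValue, String keyword) {
        return asciiLower(dbValue).contains(asciiLower(keyword));
    }

    public static void main(String[] args) {
        // \u00C9 is 'É' and \u00E9 is 'é'; escapes avoid source-encoding issues.
        String[][] rows = {
            {"elie", "ELIE"},
            {"\u00C9lie", "\u00C9lie"},
            {"\u00C9lie", "\u00E9lie"},
            {"\u00E9lie", "\u00C9lie"},
        };
        for (String[] r : rows) {
            System.out.println(r[0] + " | " + r[1]
                + " | " + (mixedMatch(r[0], r[1]) ? "match" : "no match")
                + " | " + (dbOnlyMatch(r[0], r[1]) ? "match" : "no match"));
        }
    }
}
```

Under these assumptions, the two approaches disagree exactly on the rows where the keyword contains an upper-case accented letter, because only Java lowers it.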


This works for me:

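CriteriaBuilder critBuilder = em.getCriteriaBuilder();

// The query type and the root type must both match the queried entity class.
CriteriaQuery<Users> critQ = critBuilder.createQuery(Users.class);
Root<Users> root = critQ.from(Users.class);

Expression<String> path = root.get("lastName");
Expression<String> upper = critBuilder.upper(path);
// Upper-case the pattern as well, otherwise it can never match the upper-cased column.
// stringToFind is assumed to be a String variable holding the search term.
Predicate ctfPredicate = critBuilder.like(upper, "%" + stringToFind.toUpperCase() + "%");
critQ.where(critBuilder.and(ctfPredicate));
em.createQuery(critQ.select(root)).getResultList();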

It is easier and more efficient to enforce case insensitivity within the database than in JPA.

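  1. Under the SQL 2003, 2006, and 2008 standards, you can do this by adding a case-insensitive collation, such as COLLATE SQL_Latin1_General_CP1_CI_AS (SQL Server) or COLLATE latin1_general_ci (MySQL; the _cs variant is case-sensitive), to the following: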

    • Column Definition

      CREATE TABLE <table name> (
        <column name> <type name> [DEFAULT...] 
                                  [NOT NULL|UNIQUE|PRIMARY KEY|REFERENCES...]
                                  [COLLATE <collation name>], 
        ...
      )
      
    • Domain Definition

      CREATE DOMAIN <domain name> [ AS ] <data type>
        [ DEFAULT ... ] [ CHECK ... ] [ COLLATE <collation name> ]
      
    • Character Set Definition

      CREATE CHARACTER SET <character set name>
      [ AS ] GET <character set name> [ COLLATE <collation name> ]
      

    For a full description of the above, refer to:
    http://savage.net.au/SQL/sql-2003-2.bnf.html#column%20definition
    http://dev.mysql.com/doc/refman/5.1/en/charset-table.html
    http://msdn.microsoft.com/en-us/library/ms184391.aspx

  2. In Oracle, you can set the NLS session or configuration parameters:

     SQL> ALTER SESSION SET NLS_COMP=LINGUISTIC;
     SQL> ALTER SESSION SET NLS_SORT=BINARY_CI;
     SQL> SELECT ename FROM emp1 WHERE ename LIKE 'McC%e';
    
     ENAME
     ----------------------
     McCoye
     Mccathye
    

    Or, in init.ora (or OS-specific name for initialization parameter file):

    NLS_COMP=LINGUISTIC
    NLS_SORT=BINARY_CI
    

    Binary sorts can be case-insensitive or accent-insensitive. When you specify BINARY_CI as a value for NLS_SORT, it designates a sort that is accent-sensitive and case-insensitive. BINARY_AI designates an accent-insensitive and case-insensitive binary sort. You may want to use a binary sort if the binary sort order of the character set is appropriate for the character set you are using. Use the NLS_SORT session parameter to specify a case-insensitive or accent-insensitive sort:

    Append _CI to a sort name for a case-insensitive sort.
    Append _AI to a sort name for an accent-insensitive and case-insensitive sort. 
    

    For example, you can set NLS_SORT to the following types of values:

    FRENCH_M_AI
    XGERMAN_CI
    

    Setting NLS_SORT to anything other than BINARY [with optional _CI or _AI] causes a sort to use a full table scan, regardless of the path chosen by the optimizer. BINARY is the exception because indexes are built according to a binary order of keys. Thus the optimizer can use an index to satisfy the ORDER BY clause when NLS_SORT is set to BINARY. If NLS_SORT is set to any linguistic sort, the optimizer must include a full table scan and a full sort in the execution plan.

    Or, if NLS_COMP is set to LINGUISTIC, as above, then sort settings can be applied locally to indexed columns, rather than globally across the database:

    CREATE INDEX emp_ci_index ON emp (NLSSORT(emp_name, 'NLS_SORT=BINARY_CI'));
    

    References: ORA 11g Linguistic Sorting and String Searching; ORA 11g Setting Up a Globalization Support Environment
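The difference between a case-insensitive (_CI) and an accent-insensitive (_AI) comparison can be illustrated on the Java side with java.text.Collator. This is only a loose analogy, not Oracle's implementation: it compares whole strings in the JVM rather than running LIKE in the database, and the class and method names are made up for the example.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical demo class: collator strengths as an analogy for _CI and _AI sorts.
final class CollatorStrengthDemo {

    // Builds a locale-neutral collator with the given strength. Canonical
    // decomposition lets precomposed accented letters compare by base letter plus accent.
    static Collator collator(int strength) {
        Collator c = Collator.getInstance(Locale.ROOT);
        c.setDecomposition(Collator.CANONICAL_DECOMPOSITION);
        c.setStrength(strength);
        return c;
    }

    // SECONDARY strength ignores case but still distinguishes accents (similar in spirit to _CI).
    static boolean equalIgnoringCase(String a, String b) {
        return collator(Collator.SECONDARY).equals(a, b);
    }

    // PRIMARY strength ignores both case and accents (similar in spirit to _AI).
    static boolean equalIgnoringCaseAndAccents(String a, String b) {
        return collator(Collator.PRIMARY).equals(a, b);
    }

    public static void main(String[] args) {
        // \u00C9 is 'É' and \u00E9 is 'é'.
        System.out.println(equalIgnoringCase("elie", "ELIE"));
        System.out.println(equalIgnoringCase("\u00E9lie", "elie"));
        System.out.println(equalIgnoringCaseAndAccents("\u00C9lie", "elie"));
    }
}
```

Doing the comparison in the database, as described above, remains preferable for queries, because filtering in the JVM would require loading the candidate rows first.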