PreparedStatement IN clause alternatives?

Solution 1:

An analysis of the various options available, and the pros and cons of each is available here.

The suggested options are:

  • Prepare SELECT my_column FROM my_table WHERE search_column = ?, execute it for each value and UNION the results client-side. Requires only one prepared statement. Slow and painful.
  • Prepare SELECT my_column FROM my_table WHERE search_column IN (?,?,?) and execute it. Requires one prepared statement per size-of-IN-list. Fast and obvious.
  • Prepare SELECT my_column FROM my_table WHERE search_column = ? ; SELECT my_column FROM my_table WHERE search_column = ? ; ... and execute it. [Or use UNION ALL in place of those semicolons. --ed] Requires one prepared statement per size-of-IN-list. Stupidly slow, strictly worse than WHERE search_column IN (?,?,?), so I don't know why the blogger even suggested it.
  • Use a stored procedure to construct the result set.
  • Prepare N different size-of-IN-list queries; say, with 2, 10, and 50 values. To search for an IN-list with 6 different values, populate the size-10 query so that it looks like SELECT my_column FROM my_table WHERE search_column IN (1,2,3,4,5,6,6,6,6,6). Any decent server will optimize out the duplicate values before running the query.

None of these options are ideal.

The best option if you are using JDBC4 and a server that supports x = ANY(y), is to use PreparedStatement.setArray as described here

There doesn't seem to be any way to make setArray work with IN-lists, though.


Sometimes SQL statements are loaded at runtime (e.g., from a properties file) but require a variable number of parameters. In such cases, first define the query:

query=SELECT * FROM table t WHERE t.column IN (?)

Next, load the query. Then determine the number of parameters prior to running it. Once the parameter count is known, run:

sql = any( sql, count );

For example:

/**
 * Converts a SQL statement containing exactly one IN clause to an IN clause
 * using multiple comma-delimited parameters.
 *
 * @param sql The SQL statement string with one IN clause.
 * @param params The number of parameters the SQL statement requires.
 * @return The SQL statement with (?) replaced with multiple parameter
 * placeholders.
 */
public static String any(String sql, final int params) {
    // Create a comma-delimited list based on the number of parameters.
    final StringBuilder sb = new StringBuilder(
        String.join(", ", Collections.nCopies(possibleValue.size(), "?")));

    // For more than 1 parameter, replace the single parameter with
    // multiple parameter placeholders.
    if (sb.length() > 1) {
        sql = sql.replace("(?)", "(" + sb + ")");
    }

    // Return the modified comma-delimited list of parameters.
    return sql;
}

For certain databases where passing an array via the JDBC 4 specification is unsupported, this method can facilitate transforming the slow = ? into the faster IN (?) clause condition, which can then be expanded by calling the any method.

Solution 2:

Solution for PostgreSQL:

final PreparedStatement statement = connection.prepareStatement(
        "SELECT my_column FROM my_table where search_column = ANY (?)"
);
final String[] values = getValues();
statement.setArray(1, connection.createArrayOf("text", values));

try (ResultSet rs = statement.executeQuery()) {
    while(rs.next()) {
        // do some...
    }
}

or

final PreparedStatement statement = connection.prepareStatement(
        "SELECT my_column FROM my_table " + 
        "where search_column IN (SELECT * FROM unnest(?))"
);
final String[] values = getValues();
statement.setArray(1, connection.createArrayOf("text", values));

try (ResultSet rs = statement.executeQuery()) {
    while(rs.next()) {
        // do some...
    }
}

Solution 3:

No simple way AFAIK. If the target is to keep statement cache ratio high (i.e to not create a statement per every parameter count), you may do the following:

  1. create a statement with a few (e.g. 10) parameters:

    ... WHERE A IN (?,?,?,?,?,?,?,?,?,?) ...

  2. Bind all actuall parameters

    setString(1,"foo"); setString(2,"bar");

  3. Bind the rest as NULL

    setNull(3,Types.VARCHAR) ... setNull(10,Types.VARCHAR)

NULL never matches anything, so it gets optimized out by the SQL plan builder.

The logic is easy to automate when you pass a List into a DAO function:

while( i < param.size() ) {
  ps.setString(i+1,param.get(i));
  i++;
}

while( i < MAX_PARAMS ) {
  ps.setNull(i+1,Types.VARCHAR);
  i++;
}

Solution 4:

You can use Collections.nCopies to generate a collection of placeholders and join them using String.join:

List<String> params = getParams();
String placeHolders = String.join(",", Collections.nCopies(params.size(), "?"));
String sql = "select * from your_table where some_column in (" + placeHolders + ")";
try (   Connection connection = getConnection();
        PreparedStatement ps = connection.prepareStatement(sql)) {
    int i = 1;
    for (String param : params) {
        ps.setString(i++, param);
    }
    /*
     * Execute query/do stuff
     */
}

Solution 5:

An unpleasant work-around, but certainly feasible is to use a nested query. Create a temporary table MYVALUES with a column in it. Insert your list of values into the MYVALUES table. Then execute

select my_column from my_table where search_column in ( SELECT value FROM MYVALUES )

Ugly, but a viable alternative if your list of values is very large.

This technique has the added advantage of potentially better query plans from the optimizer (check a page for multiple values, tablescan only once instead once per value, etc) may save on overhead if your database doesn't cache prepared statements. Your "INSERTS" would need to be done in batch and the MYVALUES table may need to be tweaked to have minimal locking or other high-overhead protections.