Most efficient conversion of ResultSet to JSON?
Solution 1:
I think there's a way to use less memory (a fixed and not linear amount depending on data cardinality) but this imply to change the method signature. In fact we may print the Json data directly on an output stream as soon as we fetch them from the ResultSet: the already written data will be garbage collected since we don't need an array that keeps them in memory.
I use GSON that accepts type adapters. I wrote a type adapter to convert ResultSet to JsonArray and it looks very like to your code. I'm waiting the "Gson 2.1: Targeted Dec 31, 2011" release which will have the "Support for user-defined streaming type adapters". Then I'll modify my adapter to be a streaming adapter.
Update
As promised I'm back but not with Gson, instead with Jackson 2. Sorry to be late (of 2 years).
Preface: The key to use less memory of the result itsef is in the "server side" cursor. With this kind of cursors (a.k.a. resultset to Java devs) the DBMS sends data incrementally to client (a.k.a. driver) as the client goes forward with the reading. I think Oracle cursor are server side by default. For MySQL > 5.0.2 look for useCursorFetch at connection url paramenter. Check about your favourite DBMS.
1: So to use less memory we must:
- use server side cursor behind the scene
- use resultset open as read only and, of course, forward only;
- avoid to load all the cursor in a list (or a
JSONArray
) but write each row directly on an output line, where for output line I mean an output stream or a writer or also a json generator that wraps an output stream or a writer.
2: As Jackson Documentation says:
Streaming API is best performing (lowest overhead, fastest read/write; other 2 methods build on it)
3: I see you in your code use getInt, getBoolean. getFloat... of ResultSet without wasNull. I expect this can yield problems.
4: I used arrays to cache thinks and to avoid to call getters each iteration. Although not a fan of the switch/case construct, I used it for that int
SQL Types
.
The answer: Not yet fully tested, it's based on Jackson 2.2:
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<version>2.2.2</version>
</dependency>
The ResultSetSerializer
object instructs Jackson on how to serialize (tranform the object to JSON) a ResultSet. It uses the Jackson Streaming API inside. Here the code of a test:
SimpleModule module = new SimpleModule();
module.addSerializer(new ResultSetSerializer());
ObjectMapper objectMapper = new ObjectMapper();
objectMapper.registerModule(module);
[ . . . do the query . . . ]
ResultSet resultset = statement.executeQuery(query);
// Use the DataBind Api here
ObjectNode objectNode = objectMapper.createObjectNode();
// put the resultset in a containing structure
objectNode.putPOJO("results", resultset);
// generate all
objectMapper.writeValue(stringWriter, objectNode);
And, of course, the code of the ResultSetSerializer class:
public class ResultSetSerializer extends JsonSerializer<ResultSet> {
public static class ResultSetSerializerException extends JsonProcessingException{
private static final long serialVersionUID = -914957626413580734L;
public ResultSetSerializerException(Throwable cause){
super(cause);
}
}
@Override
public Class<ResultSet> handledType() {
return ResultSet.class;
}
@Override
public void serialize(ResultSet rs, JsonGenerator jgen, SerializerProvider provider) throws IOException, JsonProcessingException {
try {
ResultSetMetaData rsmd = rs.getMetaData();
int numColumns = rsmd.getColumnCount();
String[] columnNames = new String[numColumns];
int[] columnTypes = new int[numColumns];
for (int i = 0; i < columnNames.length; i++) {
columnNames[i] = rsmd.getColumnLabel(i + 1);
columnTypes[i] = rsmd.getColumnType(i + 1);
}
jgen.writeStartArray();
while (rs.next()) {
boolean b;
long l;
double d;
jgen.writeStartObject();
for (int i = 0; i < columnNames.length; i++) {
jgen.writeFieldName(columnNames[i]);
switch (columnTypes[i]) {
case Types.INTEGER:
l = rs.getInt(i + 1);
if (rs.wasNull()) {
jgen.writeNull();
} else {
jgen.writeNumber(l);
}
break;
case Types.BIGINT:
l = rs.getLong(i + 1);
if (rs.wasNull()) {
jgen.writeNull();
} else {
jgen.writeNumber(l);
}
break;
case Types.DECIMAL:
case Types.NUMERIC:
jgen.writeNumber(rs.getBigDecimal(i + 1));
break;
case Types.FLOAT:
case Types.REAL:
case Types.DOUBLE:
d = rs.getDouble(i + 1);
if (rs.wasNull()) {
jgen.writeNull();
} else {
jgen.writeNumber(d);
}
break;
case Types.NVARCHAR:
case Types.VARCHAR:
case Types.LONGNVARCHAR:
case Types.LONGVARCHAR:
jgen.writeString(rs.getString(i + 1));
break;
case Types.BOOLEAN:
case Types.BIT:
b = rs.getBoolean(i + 1);
if (rs.wasNull()) {
jgen.writeNull();
} else {
jgen.writeBoolean(b);
}
break;
case Types.BINARY:
case Types.VARBINARY:
case Types.LONGVARBINARY:
jgen.writeBinary(rs.getBytes(i + 1));
break;
case Types.TINYINT:
case Types.SMALLINT:
l = rs.getShort(i + 1);
if (rs.wasNull()) {
jgen.writeNull();
} else {
jgen.writeNumber(l);
}
break;
case Types.DATE:
provider.defaultSerializeDateValue(rs.getDate(i + 1), jgen);
break;
case Types.TIMESTAMP:
provider.defaultSerializeDateValue(rs.getTime(i + 1), jgen);
break;
case Types.BLOB:
Blob blob = rs.getBlob(i);
provider.defaultSerializeValue(blob.getBinaryStream(), jgen);
blob.free();
break;
case Types.CLOB:
Clob clob = rs.getClob(i);
provider.defaultSerializeValue(clob.getCharacterStream(), jgen);
clob.free();
break;
case Types.ARRAY:
throw new RuntimeException("ResultSetSerializer not yet implemented for SQL type ARRAY");
case Types.STRUCT:
throw new RuntimeException("ResultSetSerializer not yet implemented for SQL type STRUCT");
case Types.DISTINCT:
throw new RuntimeException("ResultSetSerializer not yet implemented for SQL type DISTINCT");
case Types.REF:
throw new RuntimeException("ResultSetSerializer not yet implemented for SQL type REF");
case Types.JAVA_OBJECT:
default:
provider.defaultSerializeValue(rs.getObject(i + 1), jgen);
break;
}
}
jgen.writeEndObject();
}
jgen.writeEndArray();
} catch (SQLException e) {
throw new ResultSetSerializerException(e);
}
}
}
Solution 2:
A simpler solution (based on code in question):
JSONArray json = new JSONArray();
ResultSetMetaData rsmd = rs.getMetaData();
while(rs.next()) {
int numColumns = rsmd.getColumnCount();
JSONObject obj = new JSONObject();
for (int i=1; i<=numColumns; i++) {
String column_name = rsmd.getColumnName(i);
obj.put(column_name, rs.getObject(column_name));
}
json.put(obj);
}
return json;
Solution 3:
Two things that will make this faster are:
Move your call to rsmd.getColumnCount()
out of the while loop. The column count should not vary across rows.
For each column type, you end up calling something like this:
obj.put(column_name, rs.getInt(column_name));
It will be slightly faster to use the column index to retrieve the column value:
obj.put(column_name, rs.getInt(i));
Solution 4:
The JIT Compiler is probably going to make this pretty fast since it's just branches and basic tests. You could probably make it more elegant with a HashMap lookup to a callback but I doubt it would be any faster. As to memory, this is pretty slim as is.
Somehow I doubt this code is actually a critical bottle neck for memory or performance. Do you have any real reason to try to optimize it?
Solution 5:
Use a third party library for the JSON export
You could use jOOQ for the job. You don't have to use all of jOOQ's features to take advantage of some useful JDBC extensions. In this case, simply write:
String json = DSL.using(connection).fetch(resultSet).formatJSON();
Relevant API methods used are:
-
DSLContext.fetch(ResultSet)
to convert a JDBC ResultSet into a jOOQ Result. -
Result.formatJSON()
to format the jOOQ Result into a JSON String.
The resulting formatting will look like this:
{"fields":[{"name":"field-1","type":"type-1"},
{"name":"field-2","type":"type-2"},
...,
{"name":"field-n","type":"type-n"}],
"records":[[value-1-1,value-1-2,...,value-1-n],
[value-2-1,value-2-2,...,value-2-n]]}
You could also create your own formatting rather easily, through Result.map(RecordMapper)
This essentially does the same as your code, circumventing the generation of JSON objects, "streaming" directly into a StringBuilder
. I'd say that the performance overhead should be negligible in both cases, though.
(Disclaimer: I work for the company behind jOOQ)
Use SQL/JSON features instead
Of course, you don't have to use your middleware to map JDBC ResultSets to JSON. The question doesn't mention for which SQL dialect this needs to be done, but many support standard SQL/JSON syntax, or something similar, e.g.
Oracle
SELECT json_arrayagg(json_object(*))
FROM t
SQL Server
SELECT *
FROM t
FOR JSON AUTO
PostgreSQL
SELECT to_jsonb(array_agg(t))
FROM t