Cassandra Client Java APIs [closed]

Thrift is becoming more of a legacy API:

First, you should be aware that the Thrift API is not going to be getting new features; it's there for backwards compatibility, and not recommended for new projects.
- the paul

So I'd avoid Thrift-based APIs (Thrift is kept only for backwards compatibility).

That said, if you do need a Thrift-based API, I'd go for Astyanax. Astyanax is very easy to use compared to other Thrift APIs, though in my personal experience DataStax's driver is even easier.

So you should have a look at DataStax's API (and its GitHub repo). I'm not sure if there are any compiled versions of the API for download, but you can easily build it with Maven. Also, if you take a look at the GitHub repo's commit log, you'll see it gets very frequent updates.

The driver works exclusively with CQL3 and is asynchronous, but be warned that Cassandra 1.2 is the earliest supported version.

Performance
Astyanax is Thrift-based, while DataStax's driver uses the binary protocol. Here are the latest benchmarks I could find comparing Thrift and CQL (note these are definitely out of date). In fairness, the small difference in performance shown in these benchmarks will rarely matter.

Async support
DataStax's async support is a definite advantage over Astyanax (Netflix tried implementing it but decided not to).
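
To make concrete why async support matters, here is a minimal sketch of the pattern it enables: start several queries without blocking, then collect the results. Everything here is an assumption for illustration. `AsyncQuerySketch`, `executeAsync` and `runAll` are hypothetical stand-ins built on the JDK's `CompletableFuture`; they are not the DataStax driver's API, and no Cassandra node is contacted.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

// Hypothetical stand-in for an asynchronous driver session; NOT the DataStax API.
final class AsyncQuerySketch {

    // Pretends to run a CQL statement on a pool thread and returns a future immediately.
    static CompletableFuture<String> executeAsync(String cql, ExecutorService pool) {
        return CompletableFuture.supplyAsync(() -> "result of: " + cql, pool);
    }

    // Starts every query before waiting on any of them, then collects results in input order.
    static List<String> runAll(List<String> statements) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // collect() forces all futures to be created (and started) up front.
            List<CompletableFuture<String>> pending = statements.stream()
                    .map(cql -> executeAsync(cql, pool))
                    .collect(Collectors.toList());
            // Only now block, one future at a time, while the others keep running.
            return pending.stream()
                    .map(CompletableFuture::join)
                    .collect(Collectors.toList());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> results = runAll(Arrays.asList(
                "SELECT * FROM users WHERE id = 1",
                "SELECT * FROM users WHERE id = 2"));
        results.forEach(System.out::println);
    }
}
```

With a blocking client, the two statements would run back to back; with futures, the waiting overlaps. A real driver would return its own future type rather than this stub.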

Documentation
I can't really argue against Netflix's wiki. The documentation is excellent and it's updated fairly frequently. Their wiki includes code examples, and you can find tests in the source code if you need to see the code at work. I struggled to find any documentation for the DataStax driver; however, tests are provided in the GitHub repository, so that is a starting point.

Also have a look at this answer (well, not my one anyway). It looks into some advantages and disadvantages of Thrift and CQL.


I would recommend the DataStax Java driver for Cassandra: http://www.datastax.com.

For JPA-like support, try my mapping tool: http://valchkou.com/cassandra-driver-mapping.html

It is annotation driven: no mapping files, no scripts, no configuration files. There is no need for DDL scripts; the schema is automatically synchronized with the entity definition.

Usage sample:

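   // mappingSession and id are assumed to be set up elsewhere
   Entity entity = new Entity();
   mappingSession.save(entity);                     // write the entity
   entity = mappingSession.get(Entity.class, id);   // load it back by id
   mappingSession.delete(entity);                   // remove it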

Available on Maven Central:

   <dependency>
      <groupId>com.valchkou.datastax</groupId>
      <artifactId>cassandra-driver-mapping</artifactId>
   </dependency>

I would also add decent support as a factor. We post answers about playORM on Stack Overflow all the time ;). It is also about to start supporting MongoDB (work is nearly finished), so any client can run on MongoDB or Cassandra. It has its own query language, so the port works just fine. You always have access to the raw Astyanax interface too, for when you really need the speed.

Also, on your note about async: Thrift previously did not support async, so no clients did either, since they were built on the generated Thrift code. That has since changed, but I don't know of a client that has added async support.

I know HBase has an async client, though. Anyway, just thought I would add my 2 cents in case it helps a little.

EDIT: I was recently in the cassandra-thrift generated source code, and it is not a very good API for async development. It has a send method and a recv() method, but you don't know when to call recv(). Aaron Morton on the Cassandra user list has a blog post on how you can really do it, but it is not clean at all: you have to grab the selector from deep down in Thrift and do some work so you know when to call recv(). Pretty nasty stuff.
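
The awkwardness of a split send/recv API can be pictured with a small stand-in. This is only a sketch under stated assumptions: `SendRecvSketch`, `send`, `recv` and `getAsync` are hypothetical names, not the generated Thrift code, and the in-memory queue merely plays the role of the wire. It also does not show the selector-based approach from the blog post; it swaps in a simpler workaround that parks the blocking `recv()` on another thread.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical split send/recv client; names are illustrative, NOT the generated Thrift API.
final class SendRecvSketch {

    // Stands in for the wire: send() enqueues a reply that recv() later picks up.
    private final BlockingQueue<String> wire = new LinkedBlockingQueue<>();

    // Fires the request and returns without telling the caller when the reply is ready.
    void send(String key) {
        wire.add("value-for-" + key);
    }

    // Blocks until a reply is available; the caller has to guess when to call this.
    String recv() throws InterruptedException {
        return wire.take();
    }

    // Workaround: run the blocking recv() on a worker thread and expose a future instead.
    CompletableFuture<String> getAsync(String key) {
        send(key);
        return CompletableFuture.supplyAsync(() -> {
            try {
                return recv();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        });
    }
}
```

Calling `new SendRecvSketch().getAsync("k")` returns a future right away, but a worker thread stays tied up in `recv()` until the reply arrives, which is part of why this kind of wrapper is not a real substitute for native async support.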

later, Dean