Has anyone worked with Aerospike? How does it compare to MongoDB? [closed]

Can anyone say if Aerospike is as good as they claim it to be? I'm a bit skeptical since it's a commercial enterprise. As far as I understand they just released an open source version, but the claims on their website could still be exaggerated.

I'm especially interested in how Aerospike compares to MongoDB.


Solution 1:

I have used Aerospike, MongoDB and Redis and have tested many other NoSQL databases. I would say Aerospike is very good at what it does but it is different than MongoDB. Everything depends on what you are planning on using a database for. I can give you an example of what I am using my different databases for. I can also go over the differences between them and discuss the benefits of Aerospike.

MongoDB

I am using MongoDB as a SQL alternative. In my MongoDB database I have many different fields. The fields often change and I need to query on arbitrary fields at short notice. It is a very unstructured database and MongoDB is amazing at that. I have also used MongoDB as a standard key-value store. It performs well, but I have had MongoDB perform sub-optimally both at transaction scale and at database size scale. Admittedly, the database might have been optimized a little better, but I find it very hard to find documentation on configuring MongoDB correctly for different situations.

Redis

Redis is a pure key-value store. Redis' biggest problem is that it is purely in-memory (it can persist to disk as a backup, but you cannot store more data than you have memory available). It is extremely fast for what it is used for. I personally use it for a small transactional database: I do very simple operations on keys, like counting how many times an event happened for a certain user. I also do quick in-memory lookups that I need mapped to different values. Redis is a great tool for a small dataset and it is extremely fast. Configuration is very easy as well.
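
To make the counting use case concrete, here is a minimal sketch in plain Python. `CounterStore` and `event_key` are made-up names for illustration, not part of any Redis client; the `incr` method only mimics the behavior of the Redis `INCR`/`INCRBY` commands, where a missing key is treated as 0.

```python
from collections import defaultdict


class CounterStore:
    """In-memory stand-in for a key-value store with increment support."""

    def __init__(self):
        self._data = defaultdict(int)

    def incr(self, key, amount=1):
        # Same semantics as Redis INCR/INCRBY: a missing key starts at 0.
        self._data[key] += amount
        return self._data[key]

    def get(self, key):
        # Reading a missing key must not create it.
        return self._data.get(key, 0)


def event_key(user_id, event):
    # Hypothetical key layout; any unambiguous naming scheme works.
    return f"user:{user_id}:events:{event}"


store = CounterStore()
store.incr(event_key(42, "login"))
store.incr(event_key(42, "login"))
store.incr(event_key(7, "login"))
```

The point is that each operation touches exactly one key, which is why this kind of workload moves easily from Redis to another key-value store when the data outgrows memory.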

Aerospike

I personally use Aerospike to replace Redis when it's time to scale. From my understanding, it can be used for more. Like Redis, Aerospike is a key-value store. I believe the open source edition also supports secondary indexes, which Redis does not (I have not used secondary indexes in production but have done a little testing on them).

Aerospike's best feature is its ability to scale. The biggest problem I needed to solve when looking into Aerospike was scaling my system to handle large data sets while remaining extremely fast. The project I use Aerospike for has very stringent requirements on speed. I usually make 3-4 database lookups plus other processing and need to have sub-50ms transaction times. A few lookups are on data sets which are 300GB+. I could not find a solution to hold this data and make it accessible in a reasonable amount of time. Redis obviously won't work unless I have a machine with 300GB+ of RAM. MongoDB started to perform extremely poorly at a size much lower than 300GB. So I gave Aerospike a shot, and it was able to handle everything very well. The best thing about Aerospike: as my data set has grown I have not had to do much more than stand up a new box when needed. The speed has stayed consistent.

I also find Aerospike's documentation very good. It isn't too hard to configure and it's pretty easy to find answers for any issue that comes up.

Conclusion

So, is Aerospike as good as they claim? Personally, I have seen nothing less than what has been claimed. I haven't had to scale to 1 million TPS but I do believe that would be possible with enough hardware. I also believe the numbers showing a speed difference between Aerospike and MongoDB. Aerospike is a much more "configured" and "planned out" database than MongoDB. Because of this Aerospike will be much faster at scale than MongoDB. It only has to maintain a single index (or, with secondary indexes, up to a few hundred), unlike MongoDB, whose indexes can change dynamically. The question you really need to be asking is what you are trying to accomplish with your database. Then look into which database will fit your needs best. If you need a scalable, fast key-value store, I would say Aerospike is probably the best out there.

Let me know if you have any specific questions or need anything clarified. I would probably be able to help you out.

Solution 2:

Speed

Aerospike is faster. Almost any system will be quick with low load or simple data access, but Aerospike has stayed consistently fast by optimizing for in-memory and SSD-based storage options. Mongo is fast when used with lots of RAM for caching but is otherwise slow and has low write performance.

Reliability

Aerospike is very stable, although with simpler data access. MongoDB has historically been problematic with persisting data and failover but is much better now. Because Aerospike has better performance and easier management, it leads to fewer potential problems when scaling.

Setup/Configuration

Clustering with Aerospike is much easier to set up since all nodes are the same and the client drivers handle connections and failover automatically. MongoDB can be easier if you're setting up a single server, as it runs on more platforms natively and you can start it without any configuration.

MongoDB has two major ways of clustering: replica sets (for availability) and sharding (for scalability). We had 5 shards and each shard had a replica set of 3 servers. That's 15 servers to hold data. Then we had 3 config servers that maintained the cluster configuration, and we had to add 2 arbiter processes after our first major outage to deal with properly escalating a slave to master. That's a lot of moving pieces, and it also makes it incredibly hard to change your layout in the future.
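
To put a number on "a lot of moving pieces", here is a small sketch that just adds up the processes in the layout described above. The function name and its parameters are made up for this example; the figures are the ones from our cluster.

```python
def mongo_process_count(shards, replicas_per_shard, config_servers, arbiters):
    """Count the processes in a sharded, replicated MongoDB layout."""
    data_servers = shards * replicas_per_shard
    return {
        "data_servers": data_servers,
        "config_servers": config_servers,
        "arbiters": arbiters,
        "total": data_servers + config_servers + arbiters,
    }


# 5 shards x 3 replicas, plus 3 config servers and 2 arbiters.
layout = mongo_process_count(shards=5, replicas_per_shard=3,
                             config_servers=3, arbiters=2)
print(layout["data_servers"], layout["total"])
```

Note that this counts processes, not machines: arbiters are lightweight and can share hardware, so the machine count may be lower than the total.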

In contrast, Aerospike took much less effort but requires more configuration, most of which cannot be changed once the cluster has started, whereas with MongoDB you can create and alter databases anytime.

Aerospike does have the ability to sync multiple clusters (which is complicated to set up) so you can have different active datacenters replicating data and accepting writes, something that MongoDB doesn't really support at all.

Data Access

MongoDB has database/collection/document where each document is just JSON. Aerospike has namespace/set/record where each record is a collection of key-value "bins", which can in turn hold nested key/value structures. Namespaces are pre-configured and are not dynamic, and names for properties (bins) are limited to 14 characters, which is annoying to work with.
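
As a rough illustration of the two data models, here is a sketch in plain Python with no database client involved. `make_record` and `MAX_BIN_NAME_LEN` are hypothetical names for this example, and the 14-character limit is simply the figure quoted above, so check it against the server version you run.

```python
# Limit as stated in this answer; an assumption, not read from any server.
MAX_BIN_NAME_LEN = 14

# MongoDB: a document lives in database -> collection and is just JSON.
mongo_document = {
    "_id": "u1",
    "name": "Ann",
    "last_login_timestamp": 1700000000,  # long field names are fine here
}


def make_record(namespace, set_name, key, bins):
    """Model an Aerospike-style record: (namespace, set, key) plus named bins."""
    for name in bins:
        if len(name) > MAX_BIN_NAME_LEN:
            raise ValueError(
                f"bin name {name!r} exceeds {MAX_BIN_NAME_LEN} characters"
            )
    return {"key": (namespace, set_name, key), "bins": dict(bins)}


# The same data as a record; the long field name has to be shortened.
record = make_record(
    "test", "users", "u1",
    {"name": "Ann", "last_login": 1700000000, "prefs": {"lang": "en"}},
)
```

In practice the name limit means you end up keeping a mapping between readable field names and abbreviated bin names somewhere in your application.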

Both have secondary indexes, although MongoDB lets you query immediately by anything while Aerospike requires index setup or custom scripting. Both have built-in aggregation frameworks. Aerospike clients support Lua scripting, while MongoDB supports map-reduce and custom JavaScript functions.

It really depends on what your application needs, but MongoDB wins in flexibility, easier querying and fewer restrictions.

Cost

Both are now open-source and free. Both have enterprise versions with extra features, but licensing is expensive if you have lots of data. Aerospike might be cheaper since it requires fewer machines for the same performance.

Overall

For most scenarios, I would recommend Aerospike. The document-store semantics and flexibility of MongoDB are great but scaling and maintaining it as a distributed database is painful. Aerospike is fast and reliable and can run with fewer nodes that are easier to scale.


January 2016: MongoDB has released MongoDB Cloud Manager which is a paid SaaS service that can provision and manage your clusters. This solves a lot of the trouble with configuring Mongo.

March 2017: Both databases have come a long way. Aerospike now has faster replication and more flexible config settings that can be changed without restarting the whole cluster. MongoDB has new schema enforcement, better performance and even supports joins, along with the MongoDB Atlas managed service to take away all the scaling issues.


I now highly recommend ScyllaDB, which is a Cassandra-compatible open-source database with incredible performance, multi-datacenter replication, and no limits on usage.

Solution 3:

I have used MongoDB (2.4) and Aerospike 3 in our production systems. These are a few observations from our team:

1) Aerospike's read/write throughput is unbeatable. MongoDB usually holds up to a certain scale when the workload is read-heavy, but with concurrent reads and writes at a 95/5 ratio it degrades badly. With Aerospike we have seen very little impact even at a 90/10 ratio. On AWS we have achieved 200k TPS using Aerospike.

2) Latency in Aerospike is very low. Server-side read latency was sub-millisecond at the 99th percentile. Write latency was sub-millisecond at the 80th percentile and within 8ms at the 100th percentile. Best of all, we got nearly the same numbers across different POCs, so performance is consistent.

3) An Aerospike cluster needs very few nodes compared to other solutions. The SSD-based data store also gives quite impressive numbers, so it is very cost effective with little maintenance overhead.

4) Aerospike is now open source, so we hope for wider community support :-)

So we are using Aerospike for all new systems and are trying to migrate the existing ones from MongoDB.
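
For anyone unsure how to read the percentile figures in point 2, here is a minimal sketch of a nearest-rank percentile in Python. The `percentile` helper is written for this example and the latency samples are made up for illustration; they are not our measurements.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical write latencies in milliseconds (illustrative only).
write_latencies_ms = [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.9, 0.95, 2.5, 8.0]

p80 = percentile(write_latencies_ms, 80)    # 80% of writes at or below this
p100 = percentile(write_latencies_ms, 100)  # the slowest write observed
print(p80, p100)
```

So "sub-millisecond at the 80th percentile and within 8ms at the 100th" means 80% of writes finished in under 1ms and the slowest one took no more than 8ms.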