What is the difference between Google Cloud Bigtable and Google Cloud Datastore / App Engine datastore, and what are the main practical advantages/disadvantages? AFAIK Cloud Datastore is built on top of Bigtable.


Solution 1:

Based on experience with Datastore and reading the Bigtable docs, the main differences are:

  • Bigtable was originally designed for HBase compatibility, but now has client libraries in multiple languages. Datastore was initially geared more towards Python/Java/Go web app developers (it started out as part of App Engine)
  • Bigtable is 'a bit more IaaS' than Datastore in that it's not 'just there' but requires a cluster to be configured.
  • Bigtable supports only one index - the 'row key' (the entity key in Datastore)
    • This means queries are on the Key, unlike Datastore's indexed properties
  • Bigtable supports atomicity only on a single row - there are no multi-row transactions
  • Mutations and deletions that span multiple rows are therefore not atomic in Bigtable, whereas Datastore offers transactions and provides eventual or strong consistency, depending on the read/query method
  • The billing model is very different:
    • Datastore charges for read/write operations, storage and bandwidth
    • Bigtable charges for 'nodes', storage and bandwidth
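
To illustrate the single-index point above, here is a minimal sketch in plain Python. It is a toy in-memory model, not the real Bigtable or Datastore client library; the class `RowKeyTable`, its methods, and the row-key layout are all made up for illustration. It only shows why a lookup by row-key prefix is cheap while a filter on any other column has to scan everything when no secondary index exists.

```python
import bisect


class RowKeyTable:
    """Toy stand-in for a Bigtable-style table: rows sorted by key, no other index.

    Hypothetical class for illustration only, not a real client API.
    """

    def __init__(self):
        self._keys = []  # row keys, kept sorted
        self._rows = {}  # row key -> dict of column values

    def put(self, key, columns):
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = dict(columns)

    def scan_prefix(self, prefix):
        """Cheap: binary-search to the first candidate, then read a contiguous range."""
        start = bisect.bisect_left(self._keys, prefix)
        matches = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            matches.append(key)
        return matches

    def filter_by_column(self, column, value):
        """Expensive: without a secondary index, every row has to be examined."""
        return [k for k in self._keys if self._rows[k].get(column) == value]


table = RowKeyTable()
table.put("user#alice#2024-01-01", {"country": "DE"})
table.put("user#alice#2024-01-02", {"country": "DE"})
table.put("user#bob#2024-01-01", {"country": "US"})

alice_rows = table.scan_prefix("user#alice#")          # uses the only index
german_rows = table.filter_by_column("country", "DE")  # full scan
```

In Datastore the second kind of query would typically be served by an index on the `country` property, which is the practical difference the bullets describe.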

Solution 2:

Bigtable is optimized for high volumes of data and analytics

  • In the default configuration, Cloud Bigtable doesn’t replicate data across zones or regions (data within a single cluster is replicated and durable). This makes Bigtable faster, more efficient, and much cheaper, but also less durable and less available
  • It uses the HBase API - there’s no risk of lock-in or new paradigms to learn
  • It is integrated with the open-source Big Data tools, meaning you can analyze the data stored in Bigtable with most of the analytics tools customers already use (Hadoop, Spark, etc.)
  • Bigtable is indexed by a single Row Key
  • Bigtable is in a single zone

Cloud Bigtable is designed for larger companies and enterprises who often have larger data needs with complex backend workloads.

Datastore is optimized to serve high-value transactional data to applications

  • Cloud Datastore has extremely high availability with replication and data synchronization
  • Datastore, because of its versatility and high availability, is more expensive
  • Datastore is slower at writing data because of synchronous replication
  • Datastore has much better functionality around transactions and queries (since secondary indexes exist)

Solution 3:

Bigtable and Datastore are extremely different. Yes, Datastore is built on top of Bigtable, but that does not make it anything like it. That is kind of like saying a car is built on top of wheels, and so a car is not much different from wheels.

Bigtable and Datastore provide very different data models and very different semantics in how the data is changed.

The main difference is that the Datastore provides SQL-database-like ACID transactions on subsets of the data known as entity groups (though the query language GQL is much more restrictive than SQL). Bigtable is strictly NoSQL and comes with much weaker guarantees.
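
As a rough illustration of that difference in guarantees, here is a toy in-memory sketch in Python. It does not use the real client libraries; `SingleRowStore`, `EntityGroupStore`, the `fail_on` parameter, and the account rows are all hypothetical. The first store applies changes row by row, so a failure partway through leaves a partial update; the second stages all changes for one group and commits them together or not at all.

```python
import copy


class SingleRowStore:
    """Bigtable-like toy: each row write stands alone, a batch is not atomic."""

    def __init__(self, rows):
        self.rows = copy.deepcopy(rows)

    def mutate_rows(self, mutations, fail_on=None):
        # mutations: list of (row_key, column, new_value), applied one row at a time.
        for row_key, column, value in mutations:
            if row_key == fail_on:
                raise RuntimeError(f"simulated failure at {row_key}")
            self.rows[row_key][column] = value


class EntityGroupStore:
    """Datastore-like toy: a transaction on one group commits fully or not at all."""

    def __init__(self, rows):
        self.rows = copy.deepcopy(rows)

    def transact(self, mutations, fail_on=None):
        staged = copy.deepcopy(self.rows)  # work on a private copy
        for row_key, column, value in mutations:
            if row_key == fail_on:
                raise RuntimeError(f"simulated failure at {row_key}")
            staged[row_key][column] = value
        self.rows = staged  # commit only after every change succeeded


def try_apply(apply, mutations):
    """Run a batch with a simulated failure on the second row and swallow the error."""
    try:
        apply(mutations, fail_on="acct#b")
    except RuntimeError:
        pass


initial = {"acct#a": {"balance": 100}, "acct#b": {"balance": 0}}
transfer = [("acct#a", "balance", 50), ("acct#b", "balance", 50)]

row_store = SingleRowStore(initial)
try_apply(row_store.mutate_rows, transfer)

group_store = EntityGroupStore(initial)
try_apply(group_store.transact, transfer)
```

After the simulated failure, `row_store` has debited the first account without crediting the second, while `group_store` is unchanged. With Bigtable, an application that needs this kind of multi-row consistency has to design around it, for example by keeping related data in a single row.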