Difference between local and global indexes in DynamoDB
I'm curious about these two kinds of secondary index and the differences between them. It is hard to picture what they look like, and I think an answer will help more people than just me.
Local Secondary Indexes (LSIs) still rely on the table's original hash key. If the table has a hash+range key, think of its LSIs as hash+range1, hash+range2, ... hash+range6: you get up to 5 more range attributes to query on. LSIs also share the table's provisioned throughput instead of having their own.
Global Secondary Indexes (GSIs) define a new paradigm: each index can have its own hash/range keys.
This breaks the original model of one hash key per table.
This is also why, when defining a GSI, you are required to specify provisioned throughput per index and pay for it.
More detailed information about the differences can be found in the GSI announcement.
Here is the formal definition from the documentation:
Global secondary index — an index with a hash and range key that can be different from those on the table. A global secondary index is considered "global" because queries on the index can span all of the data in a table, across all partitions.
Local secondary index — an index that has the same hash key as the table, but a different range key. A local secondary index is "local" in the sense that every partition of a local secondary index is scoped to a table partition that has the same hash key.
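To make the two definitions easier to picture, here is a minimal in-memory sketch in plain Java. It does not call DynamoDB; the `Item` class, its attribute names, and the `localView`/`globalView` helpers are all hypothetical and exist only for illustration. The local view keeps the table's hash key (`user`) and re-sorts each partition by another attribute, while the global view regroups every item under a different hash key (`status`), so one index key can span items from many table partitions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory illustration only: nothing here calls DynamoDB.
class IndexKeyDemo {
    // Hypothetical item. The table key is (user, timestamp).
    static final class Item {
        final String user;      // table hash key
        final String timestamp; // table range key
        final String status;    // hash key of the GSI-style view
        final int score;        // alternate range key of the LSI-style view

        Item(String user, String timestamp, String status, int score) {
            this.user = user;
            this.timestamp = timestamp;
            this.status = status;
            this.score = score;
        }

        @Override
        public String toString() {
            return user + "/" + timestamp + "/" + status + "/" + score;
        }
    }

    // LSI-style view: same hash key as the table (user), different range key (score).
    static Map<String, List<Item>> localView(List<Item> items) {
        Map<String, List<Item>> view = new TreeMap<>();
        for (Item item : items) {
            view.computeIfAbsent(item.user, k -> new ArrayList<>()).add(item);
        }
        for (List<Item> partition : view.values()) {
            partition.sort(Comparator.comparingInt((Item i) -> i.score));
        }
        return view;
    }

    // GSI-style view: different hash key (status), with timestamp as its range key.
    static Map<String, List<Item>> globalView(List<Item> items) {
        Map<String, List<Item>> view = new TreeMap<>();
        for (Item item : items) {
            view.computeIfAbsent(item.status, k -> new ArrayList<>()).add(item);
        }
        for (List<Item> partition : view.values()) {
            partition.sort(Comparator.comparing((Item i) -> i.timestamp));
        }
        return view;
    }

    public static void main(String[] args) {
        List<Item> items = List.of(
            new Item("alice", "2024-01-01", "OPEN", 7),
            new Item("alice", "2024-01-02", "CLOSED", 3),
            new Item("bob", "2024-01-03", "OPEN", 5));
        // Scoped to one table hash key, ordered by score.
        System.out.println(localView(items).get("alice"));
        // Spans items of both users, ordered by timestamp.
        System.out.println(globalView(items).get("OPEN"));
    }
}
```

Note that in both views several items can share the same index key, which is why an index query returns a list rather than a single item.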
However, the differences go well beyond what you can define as keys. Below are some important factors that directly affect the cost and effort of maintaining the indexes:
- Throughput:
Local Secondary Indexes consume throughput from the table. When you query records via the local index, the operation consumes read capacity units from the table. When you perform a write operation (create, update, delete) on a table that has a local index, there will be two write operations, one for the table and another for the index. Both operations consume write capacity units from the table.
Global Secondary Indexes have their own provisioned throughput. When you query the index, the operation consumes read capacity from the index. When you perform a write operation (create, update, delete) on a table that has a global index, there will be two write operations, one for the table and another for the index*; the index write consumes write capacity from the index.
*When defining the provisioned throughput for a Global Secondary Index, pay special attention to the following requirement:
In order for a table write to succeed, the provisioned throughput settings for the table and all of its global secondary indexes must have enough write capacity to accommodate the write; otherwise, the write to the table will be throttled.
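The requirement quoted above can be sketched as a simple check. This is a deliberately simplified model, not the real service logic: the `tryWrite` helper and the index names are hypothetical, capacity is treated as a single number instead of a per-second budget, and it assumes the write costs the same number of units on the table and on every index.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model only: real DynamoDB tracks capacity per second and per partition.
class WriteCapacityDemo {
    // Returns true if the write is accepted, false if it would be throttled.
    // Assumption: the write needs `unitsNeeded` on the table and on each GSI.
    static boolean tryWrite(int unitsNeeded, int tableWriteCapacity,
                            Map<String, Integer> gsiWriteCapacity) {
        if (tableWriteCapacity < unitsNeeded) {
            return false; // the table itself is short on write capacity
        }
        for (int capacity : gsiWriteCapacity.values()) {
            if (capacity < unitsNeeded) {
                return false; // one under-provisioned GSI throttles the table write
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Integer> indexes = new LinkedHashMap<>();
        indexes.put("by-status", 10); // hypothetical index names
        indexes.put("by-region", 1);
        System.out.println(tryWrite(2, 10, indexes)); // by-region is too small
        indexes.put("by-region", 10);
        System.out.println(tryWrite(2, 10, indexes)); // every index can absorb the write
    }
}
```

The point of the sketch is that the table's own capacity is not enough on its own: the smallest index effectively caps the write rate of the table.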
- Management:
Local Secondary Indexes can only be created when you create the table. There is no way to add a Local Secondary Index to an existing table, and once the index is created you cannot delete it.
Global Secondary Indexes can be created together with the table or added to an existing table. Deleting an existing Global Secondary Index is also allowed.
- Read Consistency:
Local Secondary Indexes support eventual or strong consistency, whereas Global Secondary Indexes only support eventual consistency.
- Projection:
Local Secondary Indexes allow retrieving attributes that are not projected into the index (although at additional cost in performance and consumed capacity units). With a Global Secondary Index you can only retrieve the attributes projected into the index.
Special consideration about the uniqueness of the keys defined for secondary indexes:
In a Local Secondary Index, the range key value DOES NOT need to be unique for a given hash key value. The same applies to Global Secondary Indexes: the key values (hash and range) DO NOT need to be unique.
Source: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/SecondaryIndexes.html
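The read consistency and projection rules listed above can be condensed into a small sketch. The `IndexType` enum and both helper methods are hypothetical and only encode the rules as stated in this answer; they do not reproduce how the service itself responds when a query against a global index asks for attributes that are not projected.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Encodes the rules from the list above; not a client for the real service.
class IndexReadRules {
    enum IndexType { LOCAL, GLOBAL }

    // Only local indexes can serve strongly consistent reads.
    static boolean supportsConsistentRead(IndexType type) {
        return type == IndexType.LOCAL;
    }

    // Attributes a query can return. A local index can fall back to the table
    // for non-projected attributes (at extra cost); a global index cannot.
    static Set<String> retrievable(IndexType type, Set<String> projected,
                                   Set<String> requested) {
        Set<String> result = new LinkedHashSet<>(requested);
        if (type == IndexType.GLOBAL) {
            result.retainAll(projected);
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> projected = Set.of("user", "timestamp");
        Set<String> requested = Set.of("user", "status");
        System.out.println(retrievable(IndexType.LOCAL, projected, requested));
        System.out.println(retrievable(IndexType.GLOBAL, projected, requested));
    }
}
```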
These are the possible searches by index:
- By Hash
- By Hash + Range
- By Hash + Local Index
- By Global index
- By Global index + Range Index
Hash and range indexes of a table: these are the usual indexes from previous versions of the AWS SDK.
Global and local indexes: these are 'additional' indexes created on a table, on top of the table's existing hash and range indexes. A global index is similar to a hash. Its range index behaves like the range index used with the table's hash. In the entity model in your code, the getters must be annotated as follows:
- For global indexes:

    @DynamoDBIndexHashKey(globalSecondaryIndexName = INDEX_GLOBAL_RANGE_US_TS)
    @DynamoDBAttribute(attributeName = PROPERTY_USER)
    public String getUser() {
        return user;
    }

- For the range index associated with the global index:

    @DynamoDBIndexRangeKey(globalSecondaryIndexName = INDEX_GLOBAL_RANGE_US_TS)
    @DynamoDBAttribute(attributeName = PROPERTY_TIMESTAMP)
    public String getTimestamp() {
        return timestamp;
    }
Also, if you read a table through a global index, it must be an eventually consistent read (not a consistent read):
queryExpression.setConsistentRead(false);