Redis strings vs Redis hashes to represent JSON: efficiency?

Solution 1:

This article can provide a lot of insight here: http://redis.io/topics/memory-optimization

There are many ways to store an array of Objects in Redis (spoiler: I like option 1 for most use cases):

  1. Store the entire object as JSON-encoded string in a single key and keep track of all Objects using a set (or list, if more appropriate). For example:

    INCR id:users
    SET user:{id} '{"name":"Fred","age":25}'
    SADD users {id}
    

    Generally speaking, this is probably the best method in most cases. If there are a lot of fields in the Object, your Objects are not nested with other Objects, and you tend to only access a small subset of fields at a time, it might be better to go with option 2.

    Advantages: considered a "good practice." Each Object is a full-blown Redis key. JSON parsing is fast, especially when you need to access many fields for this Object at once. Disadvantages: slower when you only need to access a single field.

  2. Store each Object's properties in a Redis hash.

    INCR id:users
    HMSET user:{id} name "Fred" age 25
    SADD users {id}
    

    Advantages: considered a "good practice." Each Object is a full-blown Redis key. No need to parse JSON strings. Disadvantages: possibly slower when you need to access all/most of the fields in an Object. Also, nested Objects (Objects within Objects) cannot be easily stored.

  3. Store each Object as a JSON string in a Redis hash.

    INCR id:users
    HMSET users {id} '{"name":"Fred","age":25}'
    

    This allows you to consolidate a bit and only use two keys instead of lots of keys. The obvious disadvantage is that you can't set the TTL (and other stuff) on each user Object, since it is merely a field in the Redis hash and not a full-blown Redis key.

    Advantages: JSON parsing is fast, especially when you need to access many fields for this Object at once. Less "polluting" of the main key namespace. Disadvantages: About same memory usage as #1 when you have a lot of Objects. Slower than #2 when you only need to access a single field. Probably not considered a "good practice."

  4. Store each property of each Object in a dedicated key.

    INCR id:users
    SET user:{id}:name "Fred"
    SET user:{id}:age 25
    SADD users {id}
    

    According to the article above, this option is almost never preferred (unless the property of the Object needs to have specific TTL or something).

    Advantages: Object properties are full-blown Redis keys, which might not be overkill for your app. Disadvantages: slow, uses more memory, and not considered "best practice." Lots of polluting of the main key namespace.

Overall Summary

Option 4 is generally not preferred. Options 1 and 2 are very similar, and they are both pretty common. I prefer option 1 (generally speaking) because it allows you to store more complicated Objects (with multiple layers of nesting, etc.) Option 3 is used when you really care about not polluting the main key namespace (i.e. you don't want there to be a lot of keys in your database and you don't care about things like TTL, key sharding, or whatever).

If I got something wrong here, please consider leaving a comment and allowing me to revise the answer before downvoting. Thanks! :)

Solution 2:

It depends on how you access the data:

Go for Option 1:

  • If you use most of the fields on most of your accesses.
  • If there is variance on possible keys

Go for Option 2:

  • If you use just single fields on most of your accesses.
  • If you always know which fields are available

P.S.: As a rule of the thumb, go for the option which requires fewer queries on most of your use cases.