Is Redis just a cache?
I have been reading some Redis docs and trying the tutorial at http://try.redis-db.com/. So far, I can't see any difference between Redis and caching technologies like Velocity or the Enterprise Library Caching Framework.
You're effectively just adding objects to an in-memory data store using a unique key. There do not seem to be any relational semantics...
What am I missing?
No, Redis is much more than a cache.
Like a cache, Redis stores key-value pairs. But unlike a cache, Redis lets you operate on the values. There are 5 data types in Redis: Strings, Sets, Hashes, Lists and Sorted Sets. Each data type exposes various operations.
The best way to understand Redis is to model an application without thinking about how you are going to store it in a database.
Let's say we want to build StackOverflow.com. To keep it simple, we need Questions, Answers, Tags and Users.
Modeling Questions, Users and Answers
Each object can be modeled as a Map. For example, a Question is a map with fields {id, title, date_asked, votes, asked_by, status}. Similarly, an Answer is a map with fields {id, question_id, answer_text, answered_by, votes, status}. Similarly, we can model a user object.
Each of these objects can be directly stored in Redis as a Hash. To generate unique ids, you can use the atomic increment command. Something like this -
$ HINCRBY unique_ids question 1
(integer) 1
$ HMSET question:1 title "Is Redis just a cache?" asked_by 12 votes 0
OK
$ HINCRBY unique_ids answer 1
(integer) 1
$ HMSET answer:1 question_id 1 answer_text "No, it's a lot more" answered_by 15 votes 1
OK
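The same steps from application code might look like the sketch below. It assumes `client` is a redis-py style client (for example `redis.Redis(decode_responses=True)`), and the function name `create_question` is made up for illustration. The key names are the ones used in the commands above.

```python
from typing import Any


def create_question(client: Any, title: str, asked_by: int) -> int:
    """Store a new question as a Redis hash and return its id."""
    # HINCRBY is atomic, so two concurrent callers never get the same id.
    question_id = client.hincrby("unique_ids", "question", 1)
    # HSET with a mapping writes all fields of the hash in one call.
    client.hset(
        f"question:{question_id}",
        mapping={"title": title, "asked_by": asked_by, "votes": 0},
    )
    return question_id
```

Note that generating the id and writing the hash are two separate commands here. That is fine for this use, because an id that is reserved but never written does no harm.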
Handling Up Votes
Now, every time someone upvotes a question or an answer, you just need to do this:
$ HINCRBY question:1 votes 1
(integer) 1
$ HINCRBY question:1 votes 1
(integer) 2
List of Questions for Homepage
Next, we want to store the most recent questions to display on the home page. If you were writing a .NET or Java program, you would store the questions in a List. Turns out, that is the best way to store this in Redis as well.
Every time someone asks a question, we add its id to the list.
$ lpush questions question:1
(integer) 1
$ lpush questions question:2
(integer) 2
Now, when you want to render your homepage, you ask Redis for the most recent 25 questions.
$ lrange questions 0 24
1) "question:100"
2) "question:99"
3) "question:98"
4) "question:97"
5) "question:96"
...
25) "question:76"
Now that you have the ids, retrieve items from Redis using pipelining and show them to the user.
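A rough sketch of that pipelined fetch, again assuming a redis-py style `client`. The helper name `recent_questions` is hypothetical.

```python
from typing import Any, Dict, List


def recent_questions(client: Any, count: int = 25) -> List[Dict[str, str]]:
    """Return the hashes of the `count` most recently asked questions."""
    # LRANGE bounds are inclusive, so 0..count-1 yields `count` keys.
    keys = client.lrange("questions", 0, count - 1)
    # Queue one HGETALL per key; nothing is sent until execute().
    pipe = client.pipeline(transaction=False)
    for key in keys:
        pipe.hgetall(key)
    # execute() sends the whole batch and returns replies in queue order.
    return pipe.execute()
```

The point of the pipeline is that the 25 lookups cost one network round trip instead of 25.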
Questions by Tags, Sorted by Votes
Next, we want to retrieve questions for each tag. But SO allows you to see top voted questions, new questions or unanswered questions under each tag.
To model this, we use Redis' Sorted Set feature. A Sorted Set allows you to associate a score with each element. You can then retrieve elements based on their scores.
Let's go ahead and do this for the Redis tag:
$ zadd questions_by_votes_tagged:redis 2 question:1
(integer) 1
$ zadd questions_by_votes_tagged:redis 10 question:2
(integer) 1
$ zadd questions_by_votes_tagged:redis 5 question:613
(integer) 1
$ zrange questions_by_votes_tagged:redis 0 5
1) "question:1"
2) "question:613"
3) "question:2"
$ zrevrange questions_by_votes_tagged:redis 0 5
1) "question:2"
2) "question:613"
3) "question:1"
What did we do over here? We added questions to a sorted set, and associated a score (number of votes) with each question. Each time a question gets upvoted, we will increment its score. And when a user clicks "Questions tagged Redis, sorted by votes", we just do a zrevrange and get back the top questions.
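To keep the vote count in the hash and the score in the sorted set in step, both increments can go into one transaction. This is only a sketch: it assumes a redis-py style `client`, and the function name and the idea of passing the question's tags as an argument are my own.

```python
from typing import Any, Iterable


def upvote_question(client: Any, question_id: int, tags: Iterable[str]) -> int:
    """Add one vote to a question and to its score under each tag."""
    key = f"question:{question_id}"
    # transaction=True wraps the queued commands in MULTI/EXEC.
    pipe = client.pipeline(transaction=True)
    pipe.hincrby(key, "votes", 1)
    for tag in tags:
        # ZINCRBY bumps the member's score, creating it if missing.
        pipe.zincrby(f"questions_by_votes_tagged:{tag}", 1, key)
    results = pipe.execute()
    # The first reply is the new vote count from HINCRBY.
    return results[0]
```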
Realtime Questions without refreshing page
And finally, a bonus feature. If you keep the questions page opened, SO will notify you when a new question is added. How can Redis help over here?
Redis has a pub-sub model. You can create channels, for example "channel_questions_tagged_redis". You then subscribe users to a particular channel. When a new question is added, you would publish a message to that channel. All users would then get the message. You will have to use a web technology like web sockets or comet to actually deliver the message to the browser, but Redis helps you with all the plumbing on the server side.
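On the server side, the publish and subscribe halves could be sketched roughly like this. It assumes a redis-py style `client`. The helper names are hypothetical, and delivering the keys to the browser is left out.

```python
from typing import Any, Iterator


def channel_for(tag: str) -> str:
    """Channel name for a tag, following the naming used above."""
    return f"channel_questions_tagged_{tag}"


def announce_question(client: Any, tag: str, question_key: str) -> int:
    """Publish a new question's key; returns how many subscribers got it."""
    return client.publish(channel_for(tag), question_key)


def new_questions(client: Any, tag: str) -> Iterator[str]:
    """Yield question keys as they are published for `tag`."""
    pubsub = client.pubsub()
    pubsub.subscribe(channel_for(tag))
    # listen() blocks and also yields subscribe confirmations,
    # so keep only real messages.
    for message in pubsub.listen():
        if message["type"] == "message":
            yield message["data"]
```

Keep in mind that pub-sub is fire and forget: a subscriber that is disconnected when a message is published never sees it. Also, the payload arrives as bytes unless the client was created with `decode_responses=True`.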
Persistence, Reliability etc.
Unlike a typical cache, Redis can persist data to disk. You can have a master-slave setup to provide better reliability. To learn more, go through the Persistence and Replication topics over here - http://redis.io/documentation
Not just a cache.
- In memory key-value storage
- Supports multiple data types (strings, hashes, lists, sets, sorted sets, bitmaps, and hyperloglogs)
- Can persist cached data to physical storage (if needed)
- Supports the pub-sub model
- Provides replication for high availability (master/slave)
Redis has unique abilities such as ultra-fast Lua scripts. Their execution time is comparable to that of native C commands. Scripts also run atomically, which makes possible the sophisticated data manipulation needed by advanced objects like Locks and Semaphores.
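As an illustration of what script atomicity buys you, here is a sketch of a simple single-instance lock with a redis-py style `client`. The compare-and-delete script is the commonly documented pattern for releasing such a lock. The function names and the default timeout are my own assumptions, and this is not how Redisson implements its locks.

```python
import uuid
from typing import Any, Optional

# Delete the key only if it still holds our token. Redis runs the whole
# script atomically, so no other client can slip in between GET and DEL.
RELEASE_SCRIPT = """
if redis.call("GET", KEYS[1]) == ARGV[1] then
    return redis.call("DEL", KEYS[1])
end
return 0
"""


def acquire_lock(client: Any, name: str, ttl_ms: int = 10_000) -> Optional[str]:
    """Try to take the lock; return a token on success, None otherwise."""
    token = uuid.uuid4().hex
    # SET with NX and PX: set only if absent, and expire after ttl_ms.
    if client.set(name, token, nx=True, px=ttl_ms):
        return token
    return None


def release_lock(client: Any, name: str, token: str) -> bool:
    """Release the lock only if `token` still owns it."""
    return client.eval(RELEASE_SCRIPT, 1, name, token) == 1
```

Without the script, a client would have to GET and then DEL in two steps, and could delete a lock that had expired and been taken by someone else in between.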
There is a Redis-based in-memory data grid called Redisson, which makes it easy to build distributed applications in Java thanks to distributed Lock, Semaphore, ReadWriteLock, CountDownLatch and ConcurrentMap objects, among many others.
It works well in the cloud and supports AWS ElastiCache, AWS ElastiCache Cluster and Azure Redis Cache.
Actually, there is no dependency between relational data representation (or any other type of data representation) and the role a database plays (cache, permanent persistence, etc.).
Redis is good as a cache, it's true, but it's much more than just a cache. It's a high-speed, fully in-memory database. It does persist data on disk. It's not relational; it's key-value storage.
We use it in production. Redis helps us build software that handles thousands of requests per second and keeps customer business data throughout its whole natural lifecycle.
Redis is a cache that is best suited to distributed environments and microservice architectures.
It is fast and reliable, provides atomicity and consistency, and has a range of data types such as sets, hashes, lists, etc.
I have been using it for the last year, and it really comes as a saviour when you need to provide a production-ready solution very quickly, and for any performance-related issues, since you can always use it to cache data.