Redis Python - how to delete all keys matching a specific pattern in Python, without iterating in Python
I'm writing a Django management command to handle some of our Redis caching. Basically, I need to select all keys that conform to a certain pattern (for example: "prefix:*") and delete them.
I know I can use the cli to do that:
redis-cli KEYS "prefix:*" | xargs redis-cli DEL
But I need to do this from within the app. So I need to use the python binding (I'm using py-redis). I have tried feeding a list into delete, but it fails:
from common.redis_client import get_redis_client
cache = get_redis_client()
x = cache.keys('prefix:*')
x == ['prefix:key1','prefix:key2'] # True
# And now
cache.delete(x)
# returns 0 . nothing is deleted
I know I can iterate over x:
for key in x:
    cache.delete(key)
But that would mean losing Redis's awesome speed and misusing its capabilities. Is there a pythonic solution with py-redis, without iteration and/or the cli?
Thanks!
Solution 1:
Use SCAN iterators: https://pypi.python.org/pypi/redis
for key in r.scan_iter("prefix:*"):
    r.delete(key)
Solution 2:
Here is a full working example using py-redis:
from redis import StrictRedis
cache = StrictRedis()
def clear_ns(ns):
    """
    Clears a namespace
    :param ns: str, namespace i.e your:prefix
    :return: int, cleared keys
    """
    count = 0
    ns_keys = ns + '*'
    for key in cache.scan_iter(ns_keys):
        cache.delete(key)
        count += 1
    return count
You can also use scan_iter to collect all the keys into memory and then pass them all to delete for a bulk delete, but this may take a good chunk of memory for larger namespaces. So it is probably best to run a delete for each key.
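For reference, the bulk variant described above could look roughly like the sketch below. It is not part of the original answer: the function name delete_matching and the client parameter are made up for illustration, and client is assumed to be a redis-py client such as the cache object above. The key detail is that delete takes keys as separate positional arguments, so the list has to be unpacked.

```python
def delete_matching(client, pattern):
    """Delete every key matching `pattern` with a single DEL call.

    `client` is assumed to be a redis-py client (e.g. StrictRedis).
    Returns the number of keys the server reports as removed.
    """
    # Materialise the iterator: this holds every matching key in memory.
    keys = list(client.scan_iter(match=pattern))
    if not keys:
        # DEL needs at least one key, so skip the call when nothing matched.
        return 0
    # delete() takes keys as separate positional arguments, hence the unpacking.
    return client.delete(*keys)
```

With the cache object from the example above, the call would be delete_matching(cache, 'your:prefix*').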
Cheers!
UPDATE:
Since writing the answer, I have started using Redis's pipelining feature to send all the commands in one request and avoid network latency:
from redis import StrictRedis
cache = StrictRedis()
def clear_cache_ns(ns):
    """
    Clears a namespace in redis cache.
    This may be very time consuming.
    :param ns: str, namespace i.e your:prefix*
    :return: int, num cleared keys
    """
    count = 0
    pipe = cache.pipeline()
    for key in cache.scan_iter(ns):
        pipe.delete(key)
        count += 1
    pipe.execute()
    return count
UPDATE2 (Best Performing):
If you use scan instead of scan_iter, you can control the chunk size and iterate through the cursor using your own logic. This also seems to be a lot faster, especially when dealing with many keys. If you add pipelining to this you will get a bit of a performance boost, 10-25% depending on chunk size, at the cost of memory usage since you will not send the execute command to Redis until everything is generated. So I stuck with scan:
from redis import StrictRedis
cache = StrictRedis()
CHUNK_SIZE = 5000
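def clear_ns(ns):
    """
    Clears a namespace
    :param ns: str, namespace i.e your:prefix
    :return: bool, True once the whole keyspace has been scanned
    """
    # Start from the string '0' so the loop runs at least once;
    # scan() returns the next cursor as an int, and 0 means the scan is done.
    cursor = '0'
    ns_keys = ns + '*'
    while cursor != 0:
        cursor, keys = cache.scan(cursor=cursor, match=ns_keys, count=CHUNK_SIZE)
        if keys:
            # Delete the whole batch with one DEL call.
            cache.delete(*keys)
    return True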
Here are some benchmarks:
5k chunks using a busy Redis cluster:
Done removing using scan in 4.49929285049
Done removing using scan_iter in 98.4856731892
Done removing using scan_iter & pipe in 66.8833789825
Done removing using scan & pipe in 3.20298910141
5k chunks and a small idle dev redis (localhost):
Done removing using scan in 1.26654982567
Done removing using scan_iter in 13.5976779461
Done removing using scan_iter & pipe in 4.66061878204
Done removing using scan & pipe in 1.13942599297
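The code for the "scan & pipe" variant in the benchmarks is not shown in the answer. A rough sketch of what such a variant might look like follows. It is a guess, not the benchmarked code: the name clear_ns_scan_pipe and the client and chunk_size parameters are hypothetical, and client is assumed to be a redis-py client such as the cache object above.

```python
def clear_ns_scan_pipe(client, ns, chunk_size=5000):
    """Delete keys starting with `ns`, queuing one DEL per SCAN batch.

    `client` is assumed to be a redis-py client (e.g. StrictRedis).
    Returns the total number of keys the server reports as removed.
    """
    pattern = ns + '*'
    pipe = client.pipeline()
    cursor = 0
    while True:
        # SCAN returns the next cursor plus a batch of matching keys.
        cursor, keys = client.scan(cursor=cursor, match=pattern, count=chunk_size)
        if keys:
            # Queued on the client side only; nothing is sent to Redis yet.
            pipe.delete(*keys)
        if cursor == 0:
            # A cursor of 0 means the scan has covered the whole keyspace.
            break
    # All queued DEL commands go out together; execute() returns one
    # result per command, here the number of keys each DEL removed.
    return sum(pipe.execute())
```

Because nothing is sent until execute() is called, every queued command stays in client memory for the duration of the scan, which matches the memory trade-off mentioned above.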
Solution 3:
I think the loop
for key in x: cache.delete(key)
is pretty good and concise. delete really wants one key at a time, so you have to loop.
Otherwise, this previous question and answer points you to a lua-based solution.