StackExchange.Redis timeout and "No connection is available to service this operation"

I am seeing the following issues in our production environment (a web farm of 4 nodes behind a load balancer):

1) Timeout performing HGET key, inst: 3, queue: 29, qu=0, qs=29, qc=0, wr=0/0 at StackExchange.Redis.ConnectionMultiplexer.ExecuteSyncImpl[T](Message message, ResultProcessor`1 processor, ServerEndPoint server) in ConnectionMultiplexer.cs:line 1699. This happens 3-10 times a minute.

2) No connection is available to service this operation: HGET key at StackExchange.Redis.ConnectionMultiplexer.ExecuteSyncImpl[T](Message message, ResultProcessor`1 processor, ServerEndPoint server) in ConnectionMultiplexer.cs:line 1666

I tried to follow Marc's suggestion (maybe I interpreted it incorrectly) that it is better to have fewer connections to Redis rather than many. I ended up with the following implementation:

public class SeRedisConnection
{
    private static ConnectionMultiplexer _redis;

    private static readonly object SyncLock = new object();

    public static IDatabase GetDatabase()
    {
        if (_redis == null || !_redis.IsConnected || !_redis.GetDatabase().IsConnected(default(RedisKey)))
        {
            lock (SyncLock)
            {
                try
                {
                    var configurationOptions = new ConfigurationOptions
                    {
                        AbortOnConnectFail = false
                    };
                    configurationOptions.EndPoints.Add(new DnsEndPoint(ConfigurationHelper.CacheServerHost,
                        ConfigurationHelper.CacheServerHostPort));

                    _redis = ConnectionMultiplexer.Connect(configurationOptions);
                }
                catch (Exception ex)
                {
                    IoC.Container.Resolve<IErrorLog>().Error(ex);
                    return null;
                }
            }
        }
        return _redis.GetDatabase();
    }

    public static void Dispose()
    {
        _redis.Dispose();
    }
}

Actually, Dispose is not being used right now. There are also some specifics of my implementation that could cause this behavior (I'm only using hashes):

1. Add and Remove on hashes are async
2. Get is sync

Could somebody help me avoid this behavior?

Thanks a lot in advance!

SOLVED - by increasing the client connection timeout after evaluating network capabilities.

UPDATE 2: Actually, it didn't solve the problem. When the cache volume started to grow (e.g. from 2 GB upward), I saw the same pattern again: the timeouts happened about every 5 minutes, and our sites froze for some time every 5 minutes until the fork operation finished. Then I found out that Redis has an option to fork (save to disk) every x seconds:

save 900 1
save 300 10
save 60 10000

In my case it was "save 300 10": save every 5 minutes if at least 10 updates have happened. I also found out that "fork" can be very expensive. Commenting out the "save" section resolved the problem completely. We can comment out the "save" section because we use Redis only as an in-memory cache and don't need any persistence. Our cache servers run the "Redis 2.4.6" Windows port: https://github.com/rgl/redis/downloads
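
For reference, the change amounts to roughly the following in redis.conf. This is a sketch of the snapshot section with every save point commented out, not a copy of our actual file:

```conf
# RDB snapshotting disabled: with no active "save" line, Redis does not
# fork to write periodic snapshots to disk.
# save 900 1
# save 300 10
# save 60 10000
```

Only do this if losing the whole data set on restart is acceptable, as it is for a pure cache.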

Maybe this has been solved in recent versions of the Redis Windows port from MSOpenTech: http://msopentech.com/blog/2013/04/22/redis-on-windows-stable-and-reliable/ but I haven't tested it yet.

Anyway, StackExchange.Redis has nothing to do with this issue, and it has been quite stable in our production environment, thanks to Marc Gravell.

FINAL UPDATE: Redis is a single-threaded solution. It is extremely fast, but when it has to release memory (removing stale or expired items), problems emerge because the one thread that reclaims the memory (not a fast operation, whatever algorithm is used) is the same thread that handles GET and SET operations. Of course, this applies to a production environment under at least medium load. Even if you use a cluster with slaves, you will see the same behavior once the memory limit is reached.


Solution 1:

It looks like in most cases this exception is a client-side issue. Earlier versions of StackExchange.Redis used Win32 sockets directly, which sometimes had a negative impact. ASP.NET's internal routing is probably related to it somehow.
The good news is that StackExchange.Redis's network infrastructure was completely rewritten recently. The latest version is 2.0.513. Try it; there is a good chance that your problem will go away.
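
Independent of the library version, the multiplexer is designed to be created once and shared, and it reconnects on its own, so there should be no need to rebuild it whenever IsConnected is false. Below is a minimal sketch of that pattern using Lazy<T>. The class name RedisConnection and the localhost:6379 endpoint are placeholders of mine, not taken from the question:

```csharp
using System;
using StackExchange.Redis;

public static class RedisConnection
{
    // Hypothetical endpoint; replace with your own host and port.
    private const string Host = "localhost";
    private const int Port = 6379;

    // Lazy<T> is thread-safe by default, so Connect runs at most once
    // even if many requests hit GetDatabase() at the same time.
    private static readonly Lazy<ConnectionMultiplexer> LazyConnection =
        new Lazy<ConnectionMultiplexer>(() =>
        {
            var options = new ConfigurationOptions
            {
                // Keep retrying in the background instead of throwing
                // when the server is unreachable at startup.
                AbortOnConnectFail = false
            };
            options.EndPoints.Add(Host, Port);
            return ConnectionMultiplexer.Connect(options);
        });

    // The single shared multiplexer for the whole process.
    public static ConnectionMultiplexer Multiplexer => LazyConnection.Value;

    // GetDatabase() is a cheap pass-through; call it per operation.
    public static IDatabase GetDatabase() => Multiplexer.GetDatabase();
}
```

Unlike the code in the question, this never replaces the multiplexer, so no old instances are left behind undisposed and no lock is needed on the read path.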