ElasticSearch Server Randomly Stops Working

The usual suspects for ES with Kibana are:

  • Too little memory available for ES (which you can investigate with any monitoring tool such as Marvel, or anything that ships JVM metrics out of the VM so they survive a hang)
  • Long GC pauses (turn on GC logging and check whether the pauses line up with the moments when ES stops responding)
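
If you have no monitoring in place yet, a quick way to check both suspects is to poll the node stats API yourself. The sketch below is only an illustration: it assumes ES answers on http://localhost:9200, that the `_nodes/stats/jvm` response carries `heap_used_percent` and the `old` collector counters (as it does on the versions I have seen), and the thresholds and function names are made up, not recommendations.

```python
import json
from urllib.request import urlopen


def find_jvm_pressure(stats, heap_pct_limit=85, avg_old_gc_ms_limit=1000):
    """Return warnings from a parsed _nodes/stats/jvm response.

    Both limits are arbitrary illustrative defaults, not ES recommendations.
    """
    warnings = []
    for node_id, node in stats.get("nodes", {}).items():
        name = node.get("name", node_id)
        jvm = node.get("jvm", {})

        # Suspect 1: the heap is almost full.
        heap_pct = jvm.get("mem", {}).get("heap_used_percent")
        if heap_pct is not None and heap_pct >= heap_pct_limit:
            warnings.append(f"{name}: heap at {heap_pct}%")

        # Suspect 2: old-generation collections take a long time on average.
        old = jvm.get("gc", {}).get("collectors", {}).get("old", {})
        count = old.get("collection_count", 0)
        total_ms = old.get("collection_time_in_millis", 0)
        if count and total_ms / count >= avg_old_gc_ms_limit:
            warnings.append(
                f"{name}: old-gen GC averages {total_ms / count:.0f} ms"
            )
    return warnings


def fetch_jvm_stats(base_url="http://localhost:9200"):
    """Fetch JVM stats from a node; base_url is an assumed default."""
    with urlopen(base_url + "/_nodes/stats/jvm", timeout=5) as resp:
        return json.load(resp)
```

Calling `find_jvm_pressure(fetch_jvm_stats())` from cron every minute and logging the result somewhere off the ES box should be enough to tell whether memory is the problem. Keep in mind that the counters are cumulative since node start, so the average hides a single very long pause.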

Also, the "usual" setup for ES is 3 servers, so that the cluster keeps some redundancy when one server is down. But YMMV.

You can also try the newer G1 garbage collector, which (in my case) behaves much better than CMS on my Kibana ES cluster.

The GC duration problem is usually the one that strikes while you're looking somewhere else, and it will typically lead to a loss of data because ES stops responding for as long as the pause lasts.
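
To catch those pauses after the fact, you can scan the GC log for long stop-the-world times. This is a rough sketch, assuming a Java 7/8 HotSpot JVM started with `-XX:+PrintGCDetails -XX:+PrintGCApplicationStoppedTime` and a log destination set via `-Xloggc`; newer JVMs use a different log format, and the 1 second threshold and function name are arbitrary choices of mine.

```python
import re

# Line emitted by HotSpot when -XX:+PrintGCApplicationStoppedTime is set.
# Some locales print a decimal comma, so both separators are accepted.
STOPPED_RE = re.compile(
    r"Total time for which application threads were stopped: "
    r"(\d+(?:[.,]\d+)?) seconds"
)


def long_pauses(lines, threshold_s=1.0):
    """Yield (line_number, pause_seconds) for pauses at or above threshold_s."""
    for number, line in enumerate(lines, start=1):
        match = STOPPED_RE.search(line)
        if match:
            pause = float(match.group(1).replace(",", "."))
            if pause >= threshold_s:
                yield number, pause
```

Pass it an open file object for your GC log and compare the reported lines against the times Kibana went blank. If they match, the GC is your culprit.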

Good luck with these :)