HBase Space Used Started Climbing Rapidly

Solution 1:

I think my replication went bad. It seems that .oldlogs is where the write-ahead logs (WALs) go, according to this Safari article. They should have been cleaned up but, for some reason, were not.

I used the following to clean it up:

HADOOP_USER_NAME=hdfs hadoop fs -rm -skipTrash /hbase/.oldlogs/*

I noticed this while building a replacement cluster to serve as the replication target, so I've stopped replication for now, and the directory no longer seems to be growing without bound. This is something I will monitor going forward, particularly because it may be a bug, according to HBase issue 3489.
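For the monitoring part, a minimal sketch of a size check that could run from cron. The 100 GiB limit, the variable names, and the check_oldlogs_size helper are my own placeholders, not anything HBase provides. It assumes a Hadoop version whose `hadoop fs -du -s` prints the size in bytes as the first column (very old releases used `-dus` instead).

```shell
#!/bin/sh
# Hypothetical threshold check for the size of /hbase/.oldlogs.
# Both values below are placeholders; adjust them for your cluster.
OLDLOGS_DIR="${OLDLOGS_DIR:-/hbase/.oldlogs}"
LIMIT_BYTES="${LIMIT_BYTES:-107374182400}"  # 100 GiB, an arbitrary example limit

# Reads one line of `hadoop fs -du -s` output on stdin (first column = bytes)
# and compares it with the limit passed as the first argument.
check_oldlogs_size() {
  limit="$1"
  read -r size _rest
  if [ "$size" -gt "$limit" ]; then
    echo "WARN: $size bytes exceeds limit $limit"
    return 1
  fi
  echo "OK: $size bytes"
}

# Only query the cluster when the hadoop CLI is actually installed.
if command -v hadoop >/dev/null 2>&1; then
  HADOOP_USER_NAME=hdfs hadoop fs -du -s "$OLDLOGS_DIR" | check_oldlogs_size "$LIMIT_BYTES"
fi
```

The function returns a non-zero status when the limit is exceeded, so a cron wrapper or monitoring agent can alert on the exit code.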

Solution 2:

HBase is crash-safe, and .logs is where the WALs (HLogs) needed for crash recovery are kept. Once the region servers' in-memory data has been flushed to HFiles, those WALs are no longer needed for crash recovery and are moved to .oldlogs. Old logs are typically used for cluster-to-cluster replication. .oldlogs has a configurable retention period, for example 3 days. In that case, if something breaks your replication, you have 3 days to fix it without needing to reseed. I hope this helps you investigate what happened on Nov 24 to cause the growth in .oldlogs, and work out when to expect the HLogs in .oldlogs to be deleted automatically.
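To make the retention period concrete, here is a small sketch that converts a number of days to milliseconds and prints a matching hbase-site.xml stanza. It assumes the retention is controlled by the hbase.master.logcleaner.ttl property (a value in milliseconds) on your HBase version; please check that against the documentation for your release. The helper names are made up for this example.

```shell
#!/bin/sh
# Convert a retention period in days to milliseconds.
days_to_ms() {
  echo $(( $1 * 24 * 60 * 60 * 1000 ))
}

# Print an hbase-site.xml property stanza for the given retention in days.
# Assumption: hbase.master.logcleaner.ttl governs .oldlogs retention on your version.
print_ttl_property() {
  printf '<property>\n  <name>hbase.master.logcleaner.ttl</name>\n  <value>%s</value>\n</property>\n' "$(days_to_ms "$1")"
}

# Example: the 3-day retention mentioned above.
print_ttl_property 3
```

The printed stanza would go into hbase-site.xml on the master; a change like this normally needs a master restart to take effect.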