Apache's htcacheclean doesn't scale: How to tame a huge Apache disk_cache?

through my recent investigations, triggered by similar travails with htcacheclean, i have concluded that the main problem with cleaning large or deep caches, especially those involving Vary headers, lies in the design of the utility itself.

based on poking around in the source code, and watching the output from strace -e trace=unlink, the general approach seems to be as follows:

  1. iterate over all top-level directories (/htcache/B/x/, above)
    • delete any .header and .data files for already-expired entries
    • gather the metadata for all nested entries (/htcache/B/x/i_iGfmmHhxJRheg8NHcQ.header.vary/A/W/oGX3MAV3q0bWl30YmA_A.header, above)
  2. iterate over all nested entry metadata and purge entries whose response time, .header modtime or .data modtime is in the future
  3. iterate over all nested entry metadata and purge those that have expired
  4. iterate over all nested entry metadata to find the oldest entry; purge it; repeat

and each of the last three steps will return from the purging subroutine as soon as the cache size has dropped below the set threshold.
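you can watch this behavior yourself by running htcacheclean under strace, as mentioned above; with a large cache, note how long the walk in step #1 runs before the first unlink ever appears. the path and limit below are just placeholders for your own settings:

strace -e trace=unlink htcacheclean -t -p/htcache -l12288M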

so with a fast-growing and/or already-large cache, the growth that occurs during the extended time required for step #1 can easily prove insurmountable, even once you progress to steps #2-#4.

further compounding the problem, if you still have not satisfied the size limit once the expired entries have been purged, step #4 must iterate over all of the nested entry metadata to find the single oldest entry, delete just that one entry, and then do the whole scan over again. the cache is once more being allowed to grow faster than you will ever be able to trim it.

/* process remaining entries oldest to newest; the check for an empty
 * ring actually isn't necessary, except when the compiler does
 * corrupt 64-bit arithmetic, which happened to me once, so better safe
 * than sorry
 */
while (sum > max && !interrupted && !APR_RING_EMPTY(&root, _entry, link)) {
    oldest = APR_RING_FIRST(&root);

    for (e = APR_RING_NEXT(oldest, link);
         e != APR_RING_SENTINEL(&root, _entry, link);
         e = APR_RING_NEXT(e, link)) {
        if (e->dtime < oldest->dtime) {
            oldest = e;
        }
    }

    delete_entry(path, oldest->basename, pool);
    sum -= oldest->hsize;
    sum -= oldest->dsize;
    entries--;
    APR_RING_REMOVE(oldest, link);
}
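to put rough numbers on it, that loop is quadratic in the number of entries: as a back-of-the-envelope example, trimming 100,000 entries means roughly 100,000 full scans of the shrinking ring, on the order of 5 billion dtime comparisons in total, and every single deletion also pays for a pair of unlink calls somewhere in a deep directory tree.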

the solution?

obviously faster disks would help, but it is not at all clear to me how much of an increase in I/O throughput would be required to overcome the inherent problems in htcacheclean's current approach. no dig against the creators or maintainers, but it sure seems like this design was either never tested against, or never expected to perform well against, broad, deep, fast-growing caches.

but what does seem to work, and what i am still confirming right now, is to trigger htcacheclean from within a bash script that itself loops over the top-level directories.

#!/bin/bash

# desired cache size in integer gigabytes
SIZE=12
# divide that by the number of top-level directories (4096)
# to get the per-directory limit, in megabytes
# (for SIZE=12 this works out to 3M per directory)
LIMIT=$(( SIZE * 1024 * 1024 * 1024 / 4096 / 1024 / 1024 ))M

# loop forever, trimming one top-level directory at a time
while true
do
  for i in /htcache/*/*
  do
    htcacheclean -t -p"$i" -l"$LIMIT"
  done
done

basically, this approach allows you to reach the purging steps (#2-#4) much more quickly and frequently, even if only for a small subset of entries. this means you have a fighting chance of purging content faster than it is being added to the cache. again, it seems to be working for us, but i have only been testing it for a few days; our cache targets and growth seem to be on par with yours, but ultimately your mileage may vary.
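one practical note: since the script loops forever, you will want to guard against starting overlapping copies from cron or an init script. wrapping it in flock from util-linux is one way to do that; the script name and lock file here are just hypothetical:

flock -n /var/run/htcache-trim.lock /usr/local/bin/htcache-trim.sh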

of course the main point of this posting is that maybe it will be helpful to someone else who stumbles across this question the same way that i did.


10 seconds for a directory read sounds like you might not be using dir_index

check with:

/sbin/tune2fs -l /dev/wherever | grep dir_index
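if the feature is enabled, it should show up in the feature list, something like this (the exact set of features will vary by filesystem):

Filesystem features:      has_journal ext_attr resize_inode dir_index filetype sparse_super large_file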

to turn it on:

tune2fs -O dir_index /dev/wherever

but this will only affect newly created directories; to reindex everything, run:

e2fsck -D -f /dev/wherever
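note that e2fsck should only be run against an unmounted (or read-only mounted) filesystem, so if the cache lives on its own filesystem mounted at, say, /htcache, the full sequence might look like:

umount /htcache
e2fsck -D -f /dev/wherever
mount /htcache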