Automatically suspend/hibernate a process when too much memory taken
Solution 1:
What you are referring to is process checkpointing. There is some work in recent kernels to offer this (in conjunction with the freezer cgroup), but it's not ready yet.
Unfortunately this is very difficult to do well, because certain shared resources go stale after being unavailable for some period of time. TCP connections spring to mind, although the same may apply to applications that rely on a wall clock, or to shared memory that changes state while a process is offline.
As for stopping the process when it reaches a certain memory utilization, there's a hack I can think of that will do this.
- You create a cgroup that contains the freezer and memory subsystems.
- Place your task(s) inside of the cgroup.
- Attach a process to cgroup.event_control and set a memory threshold that you do not want to exceed (this is somewhat explained in the kernel documentation).
- When the threshold is exceeded, freeze the cgroup. The kernel should eventually evict these pages to swap (provided there is enough swap available).
Note that freezing the cgroup does not by itself write the pages out to persistent storage; the kernel will only swap them out once enough time has passed and the memory is needed for something else.
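The steps above can be sketched roughly as follows. This is a minimal sketch, not a tested tool: it assumes a cgroup v1 layout, Linux, Python 3.10+ (for os.eventfd), and that the tasks have already been placed in the cgroup. The memory_dir and freezer_dir arguments and the function names are hypothetical; the file names (cgroup.event_control, memory.usage_in_bytes, freezer.state) and the registration string format come from the kernel's cgroup v1 documentation.

```python
import os


def event_control_line(event_fd: int, usage_fd: int, threshold_bytes: int) -> str:
    """Format the registration string written to cgroup.event_control (cgroup v1)."""
    return f"{event_fd} {usage_fd} {threshold_bytes}"


def set_freezer_state(freezer_dir: str, state: str) -> None:
    """Write FROZEN or THAWED to the cgroup's freezer.state file."""
    if state not in ("FROZEN", "THAWED"):
        raise ValueError(f"unsupported freezer state: {state}")
    with open(os.path.join(freezer_dir, "freezer.state"), "w") as f:
        f.write(state)


def read_usage(memory_dir: str) -> int:
    """Return the cgroup's current memory usage in bytes."""
    with open(os.path.join(memory_dir, "memory.usage_in_bytes")) as f:
        return int(f.read().strip())


def freeze_on_threshold(memory_dir: str, freezer_dir: str, threshold_bytes: int) -> None:
    """Block until usage reaches the threshold, then freeze the cgroup.

    memory_dir and freezer_dir are assumed paths to the same cgroup in the
    memory and freezer hierarchies (they may be one directory if co-mounted).
    """
    event_fd = os.eventfd(0)
    usage_fd = os.open(os.path.join(memory_dir, "memory.usage_in_bytes"), os.O_RDONLY)
    try:
        # Register the threshold: "<event_fd> <usage_fd> <threshold>".
        with open(os.path.join(memory_dir, "cgroup.event_control"), "w") as f:
            f.write(event_control_line(event_fd, usage_fd, threshold_bytes))
        while True:
            os.read(event_fd, 8)  # blocks until the kernel signals the eventfd
            # The event can fire when usage crosses the threshold in either
            # direction, so confirm usage is actually at or above it.
            if read_usage(memory_dir) >= threshold_bytes:
                set_freezer_state(freezer_dir, "FROZEN")
                return
    finally:
        os.close(usage_fd)
        os.close(event_fd)
```

Calling set_freezer_state(freezer_dir, "THAWED") later resumes the tasks. The watcher must run outside the cgroup it freezes, otherwise it would freeze itself.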
Even if this does work (and it's pretty hacky if it does), you need to consider whether it really does anything to solve your problem.
- How do you know it wouldn't be better to let a process that uses a lot of memory simply run faster, so it finishes its memory-intensive period quickly and relinquishes the memory?
- If you try to wake processes up fairly by round-robining them, you could argue you're doing a worse job than the CPU scheduler already does for you.
- If some processes are more important than others (and should stay awake longer or finish quicker), it's probably better to just allocate them more CPU time than to keep other processes completely frozen.
- Whilst it would be slow, you could add a lot of swap (so you can never overcommit) and then greatly reduce the interactivity of the scheduler to help reduce aggressive page evictions. This is done with sched_min_granularity_ns.
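As a rough illustration of the last point, the tunable can be raised by writing a nanosecond value to it as root. This is only a sketch: the default path below is an assumption that holds on older kernels (it has moved or been replaced on newer ones), the function name is hypothetical, and the 10 ms value is purely illustrative rather than a recommendation.

```python
# Assumed location on older kernels; pass a different path if yours differs.
DEFAULT_TUNABLE = "/proc/sys/kernel/sched_min_granularity_ns"


def set_min_granularity(nanoseconds: int, path: str = DEFAULT_TUNABLE) -> int:
    """Write a new minimum scheduling granularity and return the previous value."""
    if nanoseconds <= 0:
        raise ValueError("granularity must be a positive number of nanoseconds")
    with open(path) as f:
        previous = int(f.read().strip())
    with open(path, "w") as f:
        f.write(str(nanoseconds))
    return previous


if __name__ == "__main__":
    # Illustrative only: 10 ms means longer time slices and less preemption.
    old = set_min_granularity(10_000_000)
    print(f"previous value: {old} ns")
```

Keeping the returned previous value makes it easy to restore the setting afterwards.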
Unfortunately, the best solution would be the ability to checkpoint your tasks. It's a shame that most of the implementations are not mature enough yet.
Alternatively, you could wait a couple of years for proper checkpoint/restore to be available in the kernel!