How to monitor memory usage for alarm purposes

We have an embedded Linux system without swap.

Currently we must raise an alarm when the memory usage percentage rises above a threshold, and reboot when it rises above a second, higher threshold.

Why we want to do that: if some program leaks memory, we can do a safety reboot before the kernel starts killing our processes (which may lead to data corruption or unavailability).

But we have a problem:

How do we calculate a memory usage percentage that can be used for this purpose?

We tried to calculate memory usage from the values in /proc/meminfo:

/ # cat /proc/meminfo
MemTotal:       126744 kB
MemFree:         58256 kB
Buffers:         16740 kB
Cached:          31308 kB
SwapCached:          0 kB
Active:          37580 kB
Inactive:        24000 kB

Without success:

(MemTotal - MemFree) is not usable, because it includes caches, for example.

(MemTotal - MemFree - Buffers - Cached) ignores the effect of Inactive, so it also gives memory usage values that are too high.

(MemTotal - MemFree - Buffers - Cached - Inactive) is unusable, because the result can be negative.


Solution 1:

Monitor system via free

[root@localhost ~]# free
          total       used       free     shared    buffers     cached
Mem:    2058240    1776788     281452          0      89780    1335840
-/+ buffers/cache:  351168    1707072
Swap:   4095992        100    4095892

Look at the used and free columns of the -/+ buffers/cache line.
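
As a rough sketch of reading that line from a script: the function below (the name buffers_cache_used_pct is made up for illustration) takes free output on stdin and prints the used figure from the -/+ buffers/cache line as a whole-number percentage of total memory. It assumes an older free that still prints that line; newer versions drop it in favor of an "available" column, in which case the function returns a non-zero status.

```shell
#!/bin/sh
# Hypothetical helper: reads `free` output on stdin and prints the
# "-/+ buffers/cache" used value as an integer percentage of total memory.
# Assumes the older output layout shown above.
buffers_cache_used_pct() {
    awk '
        $1 == "Mem:"           { total = $2 }   # total memory in kB
        $2 == "buffers/cache:" { used = $3 }    # used, excluding buffers/cache
        END {
            if (total > 0 && used != "") printf "%d\n", used * 100 / total
            else exit 1                         # expected line not found
        }
    '
}

# Usage: free | buffers_cache_used_pct
```

With the sample output above, this works out to 351168 * 100 / 2058240, i.e. 17 after truncation.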

Monitor each process via /proc

I used this Python script together with /proc/pid/stat to monitor the memory of a process:

http://phacker.org/2009/02/20/monitoring-virtual-memory-usage-with-python/

You would probably want to translate something like this to C.
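
If neither Python nor a C toolchain is handy on the target, a per-process check can also be sketched in plain shell. This version reads the VmRSS field from /proc/PID/status instead of parsing /proc/pid/stat; that substitution and the function name rss_kb are choices made for this example only.

```shell
#!/bin/sh
# Hypothetical helper: print the resident set size (VmRSS, in kB) found in a
# /proc/PID/status-style file. Takes the file path as $1.
rss_kb() {
    awk '$1 == "VmRSS:" { print $2; found = 1 }
         END { exit !found }' "$1"
}

# Usage: rss_kb /proc/1234/status   (1234 is an example PID)
```

The function returns a non-zero status when the file has no VmRSS line, so the caller can tell "no data" apart from a real value.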

Limit resources for each process

Use ulimit / setrlimit:

https://stackoverflow.com/questions/4983120/limit-memory-usage-for-a-single-linux-process
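
A minimal sketch of the ulimit approach: set the limit inside a subshell so that it applies only to the one program started there, not to the calling shell. The function name run_with_vm_limit and the values in the usage line are placeholders. Note that ulimit -v caps virtual address space (in kB), which is not the same thing as resident memory, so treat the number as something to tune.

```shell
#!/bin/bash
# Hypothetical wrapper: run a command with a cap on its virtual memory.
# $1 = limit in kB, remaining arguments = command to run.
run_with_vm_limit() {
    (
        ulimit -v "$1" || exit 1   # affects this subshell and its children only
        shift
        exec "$@"
    )
}

# Usage: run_with_vm_limit 65536 /usr/bin/my_daemon   (example values)
```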

Solution 2:

#!/bin/bash

threshold=90
threshold2=95

# Used memory in MB (column 3 of the "Mem:" line), multiplied by 100
usedmem=$(($(free -m | awk 'NR==2 {print $3}') * 100))

# Usage as a percentage; 512 is the total memory in MB
usage=$((usedmem / 512))

if [ "$usage" -gt "$threshold" ]
then
    /etc/init.d/service_name restart

    if [ "$usage" -gt "$threshold2" ]
    then
        echo "The memory usage has reached $usage% on $HOSTNAME." | mail -s "High Memory Usage Alert" "[email protected]"
    fi
fi

Save this as alert.sh and make it executable with: chmod +x alert.sh

Configure a cron job to run this script every 10 minutes.

Make sure to replace '512' with your server's total memory in MB and '[email protected]' with an actual email address. The script restarts the service "service_name" when memory usage exceeds 90% and additionally sends an email alert when it exceeds 95%.
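
To avoid hardcoding the total, both figures can be taken from the same line of free output. A sketch, with the function name mem_usage_pct invented here; it reads free output on stdin and assumes the total is in column 2 and the used value in column 3 of the "Mem:" line. Keep in mind that on older versions of free the used column includes buffers and cache.

```shell
#!/bin/sh
# Hypothetical helper: print used memory as an integer percentage of total,
# both taken from the "Mem:" line of `free` output read on stdin.
mem_usage_pct() {
    awk '$1 == "Mem:" && $2 > 0 { printf "%d\n", $3 * 100 / $2 }'
}

# Usage: usage=$(free -m | mem_usage_pct)
```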

Solution 3:

You can run a shell script from cron that uses the free command to monitor memory and act according to its values. For example, to monitor RAM:

#!/bin/bash

# Log file that receives one line per run
LOG_FILE=/var/log/memory_monitor.log

DATE=$(date +%d/%m/%Y)
TIME=$(date +%H:%M)
TIMESTAMP="$DATE $TIME"

# "Mem:" line of free: column 3 is used, column 4 is free (in kB)
MONITOR=$(free | grep Mem)
MEM_USED=$(echo "$MONITOR" | awk '{ print $3 }')
MEM_FREE=$(echo "$MONITOR" | awk '{ print $4 }')

echo "$TIMESTAMP $MEM_USED $MEM_FREE" >> "$LOG_FILE"

Instead of echoing the output, you could compare the values against the limits you want and then mail, reboot, or take whatever action you want:

THRESHOLD=100000  # example limit in kB; pick a value that suits your system

if [ "$MEM_USED" -gt "$THRESHOLD" ]
then
    # Do stuff (mail, reboot, etc)
    :
fi

Then add it to crontab to run at the intervals you want.
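
For the two-level scheme in the question (alarm at one threshold, reboot at a higher one), the comparison could look like the sketch below. The function name mem_action and the printed keywords are invented for this example; the caller decides what to do for each.

```shell
#!/bin/sh
# Hypothetical helper: map a usage percentage to an action keyword.
# $1 = usage in percent, $2 = alarm threshold, $3 = reboot threshold.
mem_action() {
    if [ "$1" -gt "$3" ]; then
        echo reboot
    elif [ "$1" -gt "$2" ]; then
        echo alarm
    else
        echo ok
    fi
}

# Usage (thresholds are example values):
#   case $(mem_action "$usage" 80 95) in
#       alarm)  logger "memory usage high: $usage%" ;;
#       reboot) reboot ;;
#   esac
```

Keeping the decision separate from the measurement means the same check works whichever of the usage formulas above you settle on.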