Is there a way to know why a service is restarted and who did it?
- Ubuntu 14.04
- clamav 0.98.7
The problem is that clamav-daemon is restarted nearly daily:
Sep 1 06:30:00 x-master clamd[6778]: Pid file removed.
clamd[6778]: --- Stopped at Tue Sep 1 06:30:00 2015
clamd[5979]: clamd daemon 0.98.7 (OS: linux-gnu, ARCH: x86_64, CPU: x86_64)
clamd[5979]: Running as user root (UID 0, GID 0)
clamd[5979]: Log file size limited to 4294967295 bytes.
clamd[5979]: Reading databases from /var/lib/clamav
clamd[5979]: Not loading PUA signatures.
clamd[5979]: Bytecode: Security mode set to "TrustSigned".
It causes a problem when clamdscan is running:
/etc/cron.daily/clamav_scan:
ERROR: Could not connect to clamd on x.x.x.x: Connection refused
Notice that I said "nearly" at the beginning:
/var/log/syslog:Sep 1 06:30:00 x-master clamd[6778]: Pid file removed.
/var/log/syslog.1:Aug 31 06:27:54 x-master clamd[20128]: Pid file removed.
/var/log/syslog.4.gz:Aug 28 06:28:34 x-master clamd[4475]: Pid file removed.
/var/log/syslog.5.gz:Aug 27 06:27:47 x-master clamd[21466]: Pid file removed.
As you can see:
- it did not happen on 29 and 30 Aug
- it often restarted around 06:27, which is the time cron.daily is run:

27 6 * * * root nice -n 19 ionice -c3 run-parts --report /etc/cron.daily
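To see every restart across the current and rotated logs in one shot, something like the following works (paths and message format as in the excerpts above):

```shell
# Collect all "Pid file removed" events from current and rotated syslogs;
# zgrep reads the .gz rotations transparently, -h suppresses filenames.
zgrep -h 'clamd\[[0-9]*\]: Pid file removed' /var/log/syslog* | sort
```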
The content of /etc/cron.daily/clamav_scan:
find / $exclude_string ! \( -path "/tmp/clamav-*.tmp" -prune \) ! \( -path "/var/lib/elasticsearch" -prune \) ! \( -path "/var/lib/mongodb" -prune \) ! \( -path "/var/lib/graylog-server" -prune \) -mtime -1 -type f -print0 | xargs -0 clamdscan --quiet -l "$status_file" || retval=$?
There is a logrotate file for clamav-daemon:
/var/log/clamav/clamav.log {
rotate 12
weekly
compress
delaycompress
create 640 clamav adm
postrotate
/etc/init.d/clamav-daemon reload-log > /dev/null
endscript
}
but it just reloads the log:
Sep 1 02:30:24 uba-master clamd[6778]: SIGHUP caught: re-opening log file.
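The reload-log action amounts to sending SIGHUP, which only re-opens the log file without restarting the daemon. A minimal sketch, assuming the default pid file location from the Debian packaging:

```shell
# Re-open clamd's log file without restarting the daemon; this produces
# the "SIGHUP caught: re-opening log file" line shown above.
kill -HUP "$(cat /var/run/clamav/clamd.pid)"
```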
I know that we can use auditd to monitor the binary file, and here's an example log:
ausearch -f /usr/sbin/clamd
----
time->Tue Sep 1 07:56:44 2015
type=PATH msg=audit(1441094204.559:15): item=1 name=(null) inode=2756458 dev=fc:00 mode=0100755 ouid=0 ogid=0 rdev=00:00
type=PATH msg=audit(1441094204.559:15): item=0 name="/usr/sbin/clamd" inode=3428628 dev=fc:00 mode=0100755 ouid=0 ogid=0 rdev=00:00
type=CWD msg=audit(1441094204.559:15): cwd="/"
type=EXECVE msg=audit(1441094204.559:15): argc=1 a0="/usr/sbin/clamd"
type=SYSCALL msg=audit(1441094204.559:15): arch=c000003e syscall=59 success=yes exit=0 a0=7ffd277e03dc a1=7ffd277dfa78 a2=7ffd277dfa88 a3=7ffd277df570 items=2
ppid=5708 pid=5946 auid=4294967295 uid=109 gid=114 euid=109 suid=109 fsuid=109 egid=114 sgid=114 fsgid=114 tty=pts1 ses=4294967295 comm="clamd" exe="/usr/sbin/clamd" key=(null)
109 is the UID of... the clamav user:

getent passwd clamav
clamav:x:109:114::/var/lib/clamav:/bin/false
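The SYSCALL record already carries a ppid= field, so if you catch it while the parent process is still alive, you can see what started clamd. A rough sketch, assuming ausearch output shaped like the log above:

```shell
# Grab the parent PID from the most recent SYSCALL record for clamd and,
# if that process is still running, print its command line.
ppid=$(ausearch -f /usr/sbin/clamd 2>/dev/null |
       awk '/type=SYSCALL/ {
               for (i = 1; i <= NF; i++)
                   if ($i ~ /^ppid=/) { sub(/^ppid=/, "", $i); p = $i }
            }
            END { print p }')
[ -n "$ppid" ] && ps -o pid,ppid,user,cmd -p "$ppid"
```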
Is there another way to troubleshoot in this case?
Reply to @HBruijn:
Possibly freshclam after updating AV definitions?
I thought about it. Here's the log:
Sep 1 05:31:04 x-master freshclam[16197]: Received signal: wake up
Sep 1 05:31:04 x-master freshclam[16197]: ClamAV update process started at Tue Sep 1 05:31:04 2015
Sep 1 05:31:04 x-master freshclam[16197]: main.cvd is up to date (version: 55, sigs: 2424225, f-level: 60, builder: neo)
Sep 1 05:31:05 x-master freshclam[16197]: Downloading daily-20865.cdiff [100%]
Sep 1 05:31:09 x-master freshclam[16197]: daily.cld updated (version: 20865, sigs: 1555338, f-level: 63, builder: neo)
Sep 1 05:31:10 x-master freshclam[16197]: bytecode.cvd is up to date (version: 268, sigs: 47, f-level: 63, builder: anvilleg)
Sep 1 05:31:13 x-master freshclam[16197]: Database updated (3979610 signatures) from db.local.clamav.net (IP: 168.143.19.95)
Sep 1 05:31:13 x-master freshclam[16197]: Clamd successfully notified about the update.
Sep 1 05:31:13 x-master freshclam[16197]: --------------------------------------
Sep 1 04:34:10 x-master clamd[6778]: SelfCheck: Database status OK.
Sep 1 05:31:13 x-master clamd[6778]: Reading databases from /var/lib/clamav
Sep 1 05:31:22 x-master clamd[6778]: Database correctly reloaded (3974071 signatures)
I'm not sure about this, but it looks like freshclam has an "internal mechanism" to notify clamd about the update, and after that clamd can just reload the database; there is no need to restart the process. Can you confirm?
Moreover, from the timestamps, I saw that clamav-daemon restarted one hour after freshclam updated the database. Is that normal?
UPDATE Tue Sep 1 22:10:49 ICT 2015
but it looks like freshclam has an "internal mechanism" to notify clamd about the update, and after that it can just reload the database; no need to restart the process.
I can confirm this is correct by doing a test:
- edit the freshclam.conf file to change the check interval to one minute (Checks 1440)
- restart clamav-freshclam
- cd /var/lib/clamav
- rm daily.cvd
- wait for one minute
Sep 1 14:49:25 p freshclam[7654]: Downloading daily.cvd [100%]
Sep 1 14:49:28 p freshclam[7654]: daily.cvd updated (version: 19487, sigs: 1191913, f-level: 63, builder: neo)
Sep 1 14:49:28 p freshclam[7654]: Reading CVD header (bytecode.cvd):
Sep 1 14:49:28 p freshclam[7654]: OK
Sep 1 14:49:28 p freshclam[7654]: bytecode.cvd is up to date (version: 245, sigs: 43, f-level: 63, builder: dgoddard)
Sep 1 14:49:31 p freshclam[7654]: Database updated (3616181 signatures) from clamav.local (IP: 10.0.2.2)
Sep 1 14:49:31 p freshclam[7654]: Clamd successfully notified about the update.
Sep 1 14:49:31 p freshclam[7654]: --------------------------------------
Sep 1 14:49:32 p clamd[6693]: Reading databases from /var/lib/clamav
Sep 1 14:49:39 p clamd[6693]: Database correctly reloaded (3610621 signatures)
and the clamav-daemon is not restarted.
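A quick way to tell the two cases apart straight from syslog: the PID inside clamd[NNNN] changes only on a real restart, while a database reload keeps it. A rough sketch over the log format shown above:

```shell
# Walk the syslog and report every point where the clamd PID changes,
# i.e. every real restart (reloads keep the same PID).
awk 'match($0, /clamd\[[0-9]+\]/) {
        pid = substr($0, RSTART + 6, RLENGTH - 7)
        if (prev != "" && pid != prev)
            print "restart: PID " prev " -> " pid
        prev = pid
     }' /var/log/syslog
```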
Please check whether you're using a configuration management system, e.g. Puppet, Chef, CFEngine, etc. They can interfere with services at regular intervals. The exact action to take to rectify this depends on how the service is managed in the config management system.
Note to myself.
The output from job cache:
----------
ID: clamav-daemon
Function: service.running
Result: True
Comment: Service restarted
Started: 06:27:52.736890
Duration: 12997.632 ms
Changes:
----------
clamav-daemon:
True
Look at the clamav formula:
clamav-daemon:
service:
- running
- order: 50
- require:
- service: clamav-freshclam
- watch:
- pkg: clamav-daemon
- file: clamav-daemon
- user: clamav
Nothing in the watched states was changed:
----------
ID: clamav-daemon
Function: pkg.latest
Result: True
Comment: Package clamav-daemon is already up-to-date.
Started: 06:27:51.531415
Duration: 53.224 ms
Changes:
----------
ID: clamav-daemon
Function: file.managed
Name: /etc/clamav/clamd.conf
Result: True
Comment: File /etc/clamav/clamd.conf is in the correct state
Started: 06:27:51.760019
Duration: 625.075 ms
Changes:
----------
ID: clamav
Function: user.present
Result: True
Comment: User clamav is present and up to date
Started: 06:27:51.590214
Duration: 2.455 ms
Changes:
Why has the service been restarted?
Looking at the watch_in requisites, I found a state that manages the pid file; the service will be restarted if the pid file changes:
{%- macro manage_pid(path, user, group, watch_in_service, mode=644) -%}
{%- if salt['file.file_exists'](path) %}
{{ path }}:
file:
- managed
- user: {{ user }}
- group: {{ group }}
- mode: {{ mode }}
- replace: False
{%- if caller is defined -%}
{%- for line in caller().split("\n") -%}
{%- if loop.first %}
- require:
{%- endif %}
{{ line|trim|indent(6, indentfirst=True) }}
{%- endfor -%}
{%- endif %}
- watch_in:
- service: {{ watch_in_service }}
{%- else %}
# {{ path }} does not exist, no need to manage
{%- endif -%}
{%- endmacro -%}
{%- call manage_pid('/var/run/clamav/clamd.pid', 'clamav', 'clamav', 'clamav-daemon', 664) %}
- pkg: clamav-daemon
{%- endcall %}
In the output of salt-run jobs.lookup_jid <job id number>, I saw this:
----------
ID: /var/run/clamav/clamd.pid
Function: file.managed
Result: True
Comment:
Started: 06:27:52.392555
Duration: 2.364 ms
Changes:
----------
group:
clamav
user:
clamav
So, the owner/group of that pid file had been changed to clamav. Finally, I found that the reason is the clamav daemon running in network mode as the root user; therefore, the pid file was created as root. So, the state that manages the pid file must be changed to something like:
{%- call manage_pid('/var/run/clamav/clamd.pid', 'root', 'root', 'clamav-daemon', 664) %}
- pkg: clamav-daemon
{%- endcall %}
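After changing the state, a quick sanity check that the pid file is now owned as the state expects (path as above):

```shell
# Show owner, group, mode, and name of the pid file; a mismatch with the
# Salt state would trigger another "change" and restart on the next run.
stat -c '%U:%G %a %n' /var/run/clamav/clamd.pid
```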