Can't kill a sleeping process
I don't seem to be able to kill -9 a process which is in an interruptible sleep (S) state:
[root@jupiter ~]# ps -elf | grep yum
4 S root 16790 1 0 75 0 - 73779 - Jan15 ? 00:00:04 /usr/bin/python /usr/bin/yum -y install python-pip
[root@jupiter ~]# kill -9 16790
[root@jupiter ~]# ps -elf | grep yum
4 S root 16790 1 0 75 0 - 73779 - Jan15 ? 00:00:04 /usr/bin/python /usr/bin/yum -y install python-pip
How is this possible? Is there any way to kill the process without rebooting?
BOUNTY: I am really more interested in an explanation of how it is possible for this to occur.
UPDATE: This is the output of lsof:
[root@jupiter ~]# lsof -p 16790
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
yum 16790 root cwd DIR 1166,56842 4096 16886249 /home/del
yum 16790 root rtd DIR 253,0 4096 2 /
yum 16790 root txt REG 253,0 8304 336177337 /usr/bin/python
yum 16790 root mem REG 253,0 144776 346128569 /lib64/ld-2.5.so
yum 16790 root mem REG 253,0 1718232 346128573 /lib64/libc-2.5.so
yum 16790 root mem REG 253,0 23360 346128599 /lib64/libdl-2.5.so
yum 16790 root mem REG 253,0 145872 346128584 /lib64/libpthread-2.5.so
yum 16790 root mem REG 253,0 615136 346128602 /lib64/libm-2.5.so
yum 16790 root mem REG 253,0 1244792 336171087 /usr/lib64/libpython2.4.so.1.0
yum 16790 root mem REG 253,0 95464 346128744 /lib64/libselinux.so.1
yum 16790 root mem REG 253,0 53448 346128750 /lib64/librt-2.5.so
yum 16790 root mem REG 253,0 13960 336187564 /usr/lib64/libplds4.so
yum 16790 root mem REG 253,0 58400 346128752 /lib64/libgcc_s-4.1.2-20080825.so.1
yum 16790 root mem REG 253,0 78384 336173796 /usr/lib64/libelf-0.137.so
yum 16790 root mem REG 253,0 1139672 336187570 /usr/lib64/librpmdb-4.4.so
yum 16790 root mem REG 253,0 407792 336187568 /usr/lib64/librpmio-4.4.so
yum 16790 root mem REG 253,0 233144 336171420 /usr/lib64/libnspr4.so
yum 16790 root mem REG 253,0 375656 336187569 /usr/lib64/libsqlite3.so.0.8.6
yum 16790 root mem REG 253,0 17992 336187563 /usr/lib64/libplc4.so
yum 16790 root mem REG 253,0 386784 336187571 /usr/lib64/librpm-4.4.so
yum 16790 root mem REG 253,0 154776 336170228 /usr/lib64/librpmbuild-4.4.so
yum 16790 root mem REG 253,0 647608 346128759 /lib64/libglib-2.0.so.0.1200.3
yum 16790 root mem REG 253,0 1297136 336176959 /usr/lib64/libxml2.so.2.6.26
yum 16790 root mem REG 253,0 15584 346128756 /lib64/libtermcap.so.2.0.8
yum 16790 root mem REG 253,0 1234328 336187566 /usr/lib64/libnss3.so
yum 16790 root mem REG 253,0 18152 346128670 /lib64/libutil-2.5.so
yum 16790 root mem REG 253,0 34240 336177071 /usr/lib64/libpopt.so.0.0.0
yum 16790 root mem REG 253,0 67792 336187567 /usr/lib64/libbz2.so.1.0.3
yum 16790 root mem REG 253,0 143144 346128763 /lib64/libexpat.so.0.5.0
yum 16790 root mem REG 253,0 56434416 336184082 /usr/lib/locale/locale-archive
yum 16790 root mem REG 253,0 132656 336560181 /usr/lib64/python2.4/site-packages/rpm/_rpmmodule.so
yum 16790 root mem REG 253,0 154016 336187565 /usr/lib64/libnssutil3.so
yum 16790 root mem REG 253,0 96885 345638632 /usr/local/greenplum-loaders-3.3.0.0-build-3/lib/libz.so.1.2.3
yum 16790 root mem REG 253,0 247496 346128741 /lib64/libsepol.so.1
yum 16790 root mem REG 253,0 369144 336168883 /usr/lib64/libsoftokn3.so
yum 16790 root mem REG 253,0 312336 336178453 /usr/lib64/libfreebl3.so
yum 16790 root mem REG 253,0 20240 336530067 /usr/lib64/python2.4/lib-dynload/timemodule.so
yum 16790 root mem REG 253,0 25048 336529953 /usr/lib64/python2.4/lib-dynload/stropmodule.so
yum 16790 root mem REG 253,0 18984 336530051 /usr/lib64/python2.4/lib-dynload/cStringIO.so
yum 16790 root mem REG 253,0 21816 336529943 /usr/lib64/python2.4/lib-dynload/collectionsmodule.so
yum 16790 root mem REG 253,0 52152 336530044 /usr/lib64/python2.4/lib-dynload/_socketmodule.so
yum 16790 root mem REG 253,0 17200 336530045 /usr/lib64/python2.4/lib-dynload/_ssl.so
yum 16790 root mem REG 253,0 315080 346128749 /lib64/libssl.so.0.9.8e
yum 16790 root mem REG 253,0 1366912 346128748 /lib64/libcrypto.so.0.9.8e
yum 16790 root mem REG 253,0 190976 336187552 /usr/lib64/libgssapi_krb5.so.2.2
yum 16790 root mem REG 253,0 613928 336184245 /usr/lib64/libkrb5.so.3.3
yum 16790 root mem REG 253,0 11760 346128747 /lib64/libcom_err.so.2.1
yum 16790 root mem REG 253,0 153720 336181723 /usr/lib64/libk5crypto.so.3.1
yum 16790 root mem REG 253,0 35984 336177832 /usr/lib64/libkrb5support.so.0.1
yum 16790 root mem REG 253,0 9472 346128681 /lib64/libkeyutils-1.2.so
yum 16790 root mem REG 253,0 92816 346128730 /lib64/libresolv-2.5.so
yum 16790 root mem REG 253,0 75384 336530050 /usr/lib64/python2.4/lib-dynload/cPickle.so
yum 16790 root mem REG 253,0 23736 336530064 /usr/lib64/python2.4/lib-dynload/structmodule.so
yum 16790 root mem REG 253,0 27336 336528958 /usr/lib64/python2.4/lib-dynload/operator.so
yum 16790 root mem REG 253,0 21520 336529958 /usr/lib64/python2.4/lib-dynload/zlibmodule.so
yum 16790 root mem REG 253,0 37944 336528952 /usr/lib64/python2.4/lib-dynload/itertoolsmodule.so
yum 16790 root mem REG 253,0 21528 336528929 /usr/lib64/python2.4/lib-dynload/_localemodule.so
yum 16790 root mem REG 253,0 21208 336529939 /usr/lib64/python2.4/lib-dynload/binascii.so
yum 16790 root mem REG 253,0 12080 336530062 /usr/lib64/python2.4/lib-dynload/shamodule.so
yum 16790 root mem REG 253,0 13168 336530058 /usr/lib64/python2.4/lib-dynload/md5module.so
yum 16790 root mem REG 253,0 18000 336529947 /usr/lib64/python2.4/lib-dynload/mathmodule.so
yum 16790 root mem REG 253,0 12504 336529934 /usr/lib64/python2.4/lib-dynload/_randommodule.so
yum 16790 root mem REG 253,0 15320 336528948 /usr/lib64/python2.4/lib-dynload/fcntlmodule.so
yum 16790 root mem REG 253,0 32816 336530049 /usr/lib64/python2.4/lib-dynload/bz2.so
yum 16790 root mem REG 253,0 8608 336529946 /usr/lib64/python2.4/lib-dynload/grpmodule.so
yum 16790 root mem REG 253,0 38696 336529819 /usr/lib64/python2.4/site-packages/cElementTree.so
yum 16790 root mem REG 253,0 42672 336530047 /usr/lib64/python2.4/lib-dynload/arraymodule.so
yum 16790 root mem REG 253,0 9368 336528915 /usr/lib64/python2.4/lib-dynload/_bisect.so
yum 16790 root mem REG 253,0 74992 336529944 /usr/lib64/python2.4/lib-dynload/datetime.so
yum 16790 root mem REG 253,0 372912 336560510 /usr/lib64/python2.4/site-packages/M2Crypto/__m2crypto.so
yum 16790 root mem REG 253,0 7120 336529937 /usr/lib64/python2.4/lib-dynload/_weakref.so
yum 16790 root mem REG 253,0 17496 336528966 /usr/lib64/python2.4/lib-dynload/selectmodule.so
yum 16790 root mem REG 253,0 46448 336528961 /usr/lib64/python2.4/lib-dynload/pyexpat.so
yum 16790 root mem REG 253,0 33896 336529820 /usr/lib64/python2.4/site-packages/_sqlite.so
yum 16790 root mem REG 253,0 41784 336530075 /usr/lib64/python2.4/site-packages/_sqlitecache.so
yum 16790 root mem REG 253,0 25104 336530066 /usr/lib64/python2.4/lib-dynload/termios.so
yum 16790 root mem REG 253,0 7280 336530065 /usr/lib64/python2.4/lib-dynload/syslog.so
yum 16790 root mem REG 253,0 25464 336265457 /usr/lib64/gconv/gconv-modules.cache
yum 16790 root mem REG 253,0 66544 336528926 /usr/lib64/python2.4/lib-dynload/_cursesmodule.so
yum 16790 root mem REG 253,0 380336 336181932 /usr/lib64/libncurses.so.5.5
yum 16790 root mem REG 253,0 405880 336529957 /usr/lib64/python2.4/lib-dynload/unicodedata.so
yum 16790 root mem REG 253,0 24576 236520047 /var/lib/rpm/__db.001
yum 16790 root mem REG 253,0 53880 346128424 /lib64/libnss_files-2.5.so
yum 16790 root mem REG 253,0 23736 346128408 /lib64/libnss_dns-2.5.so
yum 16790 root mem REG 253,0 1318912 236520050 /var/lib/rpm/__db.002
yum 16790 root mem REG 253,0 663552 236520051 /var/lib/rpm/__db.003
yum 16790 root mem REG 253,0 769074 336174965 /usr/share/locale/en_US/LC_MESSAGES/redhat-dist.mo
yum 16790 root 0u CHR 136,8 0t0 10 /dev/pts/8 (deleted)
yum 16790 root 1u CHR 136,8 0t0 10 /dev/pts/8 (deleted)
yum 16790 root 2u CHR 136,8 0t0 10 /dev/pts/8 (deleted)
yum 16790 root 3u unix 0xffff8104388d2e40 0t0 4675113 socket
yum 16790 root 4w REG 253,0 0 236522326 /var/log/yum.log
yum 16790 root 5u REG 253,0 605184 236520025 /var/cache/yum/WANdisco-dev/primary.xml.gz.sqlite
yum 16790 root 6u REG 253,0 20480 236524002 /var/cache/yum/addons/primary.sqlite.old.tmp (deleted)
yum 16790 root 7u REG 253,0 12578816 236519970 /var/cache/yum/base/primary.xml.gz.sqlite.old.tmp (deleted)
yum 16790 root 8u REG 253,0 17972224 236523993 /var/cache/yum/epel/317109b44f1b0b40d910dc60c9080e62c7f4b16a-primary.sqlite.old.tmp (deleted)
yum 16790 root 9u REG 253,0 967680 236524055 /var/cache/yum/extras/primary.sqlite.old.tmp (deleted)
yum 16790 root 10u REG 253,0 459776 246415366 /var/cache/yum/pgdg92/primary.sqlite.old.tmp (deleted)
yum 16790 root 11u REG 253,0 4927488 236524060 /var/cache/yum/updates/primary.sqlite.old.tmp (deleted)
yum 16790 root 12r REG 253,0 65204224 236519434 /var/lib/rpm/Packages
yum 16790 root 13r REG 253,0 45056 236519438 /var/lib/rpm/Name
yum 16790 root 14u IPv4 4675317 0t0 TCP jupiter.example.com:33597->riksun.riken.go.jp:http (ESTABLISHED)
yum 16790 root 15u IPv4 4675939 0t0 TCP jupiter.example.com:52708->freedom.itsc.cuhk.edu.hk:http (CLOSE_WAIT)
yum 16790 root 16r REG 253,0 65204224 236519434 /var/lib/rpm/Packages
yum 16790 root 17r REG 253,0 45056 236519438 /var/lib/rpm/Name
yum 16790 root 18r REG 253,0 12288 236519440 /var/lib/rpm/Pubkeys
yum 16790 root 20r FIFO 0,6 0t0 4676024 pipe
yum 16790 root 24w FIFO 0,6 0t0 4676024 pipe
A process in S or D state is usually in a blocking system call, such as reading or writing to a file or the network, waiting for a called program to finish, or while waiting on semaphores or other synchronization primitives. It will go into the sleep state while waiting.
You can't "wake it up" - it will only proceed when the data/resource it is waiting for becomes available. This is all normal and expected, and only a problem when trying to kill it.
You can try and use strace -p pid to find out which system call is currently happening for process pid.
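If strace is not installed or cannot attach, the state letter and the kernel wait channel can also be read directly from /proc on Linux. The following is a minimal sketch; the helper names state_from_status and show_wait are made up for illustration, and the wchan file may be empty or 0 depending on the kernel.

```shell
#!/bin/sh
# Sketch: read a task's state letter and kernel wait channel from /proc (Linux only).
# state_from_status and show_wait are hypothetical helper names.

# Print the one-letter state code from /proc/<pid>/status content on stdin.
# The relevant line looks like "State:<TAB>S (sleeping)".
state_from_status() {
    awk '$1 == "State:" { print $2; exit }'
}

# Print the state and the wait channel for the PID given as $1.
show_wait() {
    pid="$1"
    printf 'state: %s\n' "$(state_from_status < "/proc/$pid/status")"
    # wchan names the kernel function the task is sleeping in, when known.
    printf 'wchan: %s\n' "$(cat "/proc/$pid/wchan" 2>/dev/null)"
}

# Usage with the PID from the question:
#   show_wait 16790
```

A state of D together with a wait channel inside a filesystem or network driver would point at a stuck kernel operation rather than at the program itself.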
From Wikipedia:
An uninterruptible sleep state is a sleep state that won't handle a signal right away. It will wake only as a result of a waited-upon resource becoming available or after a time-out occurs during that wait (if specified when put to sleep). It is mostly used by device drivers waiting for disk or network IO (input/output). When the process is sleeping uninterruptibly, signals accumulated during the sleep will be noticed when the process returns from the system call or trap.
A process blocked in a system call may be in uninterruptible sleep (D state), which, as its name says, cannot be interrupted, even by root. (A process in interruptible sleep, S, is normally woken by a signal.)
Normally, processes cannot block SIGKILL. But kernel code can, and processes execute kernel code when they make system calls. Kernel code blocks signals wherever interrupting the call would leave a kernel data structure in an inconsistent state. So if a system call blocks indefinitely, there may effectively be no way to kill the process. The SIGKILL takes effect only once the process completes the system call.
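One way to see whether the SIGKILL was queued but not yet acted on is to look at the pending-signal masks in /proc/PID/status on Linux. This is a sketch only: mask_has_signal and kill_pending are hypothetical helper names, and it assumes the SigPnd and ShdPnd lines are present on the kernel in question.

```shell
#!/bin/sh
# Sketch: test whether a signal number is set in a hex mask taken from the
# SigPnd/ShdPnd lines of /proc/<pid>/status (Linux only).
# mask_has_signal and kill_pending are hypothetical helper names.

# Return 0 if signal $2 is set in hex mask $1 (bit N-1 represents signal N).
# Intended for the standard signals 1-31.
mask_has_signal() {
    [ $(( (0x$1 >> ($2 - 1)) & 1 )) -eq 1 ]
}

# Report whether SIGKILL (9) is pending for PID $1, per-thread or process-wide.
kill_pending() {
    pid="$1"
    for field in SigPnd ShdPnd; do
        mask=$(awk -v f="$field:" '$1 == f { print $2 }' "/proc/$pid/status")
        if [ -n "$mask" ] && mask_has_signal "$mask" 9; then
            echo "SIGKILL pending in $field"
        fi
    done
}

# Usage with the PID from the question:
#   kill_pending 16790
```

If SIGKILL shows up as pending, the signal was accepted and the process is simply not getting to the point where it is acted on.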
Background on a sleeping process
You might want to take a look at this Unix & Linux post.
- https://unix.stackexchange.com/questions/5642/what-if-kill-9-does-not-work
Specifically this answer, https://unix.stackexchange.com/a/5648/7453.
Excerpt from that post
kill -9 (SIGKILL) always works, provided you have the permission to kill the process. Basically either the process must be started by you and not be setuid or setgid, or you must be root. There is one exception: even root cannot send a fatal signal to PID 1 (the init process).
However kill -9 is not guaranteed to work immediately. All signals, including SIGKILL, are delivered asynchronously: the kernel may take its time to deliver them. Usually, delivering a signal takes at most a few microseconds, just the time it takes for the target to get a time slice. However, if the target has blocked the signal, the signal will be queued until the target unblocks it.
Normally, processes cannot block SIGKILL. But kernel code can, and processes execute kernel code when they call system calls. Kernel code blocks all signals when interrupting the system call would result in a badly formed data structure somewhere in the kernel, or more generally in some kernel invariant being violated. So if (due to a bug or misdesign) a system call blocks indefinitely, there may effectively be no way to kill the process. (But the process will be killed if it ever completes the system call.)
...
...
I highly suggest reading the rest of that answer!
Killing a process that's blocked by a resource (file or network)
Here are 2 things to try.
1. Removing yum's .pid file
Is there a yum lock file present? What happens when you remove that lock file? I think that might allow it to proceed.
rm /var/run/yum.pid
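Before removing the lock file, it may be worth checking which process it refers to. The sketch below assumes the lock file contains a bare PID, which should be verified on the system in question; pidfile_owner_alive is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: report whether the PID stored in a lock file refers to a live process.
# pidfile_owner_alive is a hypothetical helper name; it assumes the file holds
# a bare PID.
#
# Returns 0 if the process exists, 1 if it does not (or the caller may not
# signal it), and 2 if the file is missing, unreadable, or holds no PID.
pidfile_owner_alive() {
    file="$1"
    [ -r "$file" ] || return 2
    pid=$(tr -cd '0-9' < "$file")     # keep digits only
    [ -n "$pid" ] || return 2
    kill -0 "$pid" 2>/dev/null        # signal 0 tests for existence only
}

# Usage:
#   pidfile_owner_alive /var/run/yum.pid && echo "lock holder is still running"
```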
2. Forcing any hanging CLOSE_WAIT TCP connections closed
CLOSE_WAIT is described as follows:
CLOSE_WAIT Indicates that the server has received the first FIN signal from the client and the connection is in the process of being closed
So this essentially means that this is a state where the socket is waiting for the application to execute close().
A socket can stay in the CLOSE_WAIT state indefinitely until the application closes it. Faulty scenarios include a file descriptor leak, or a server that never executes close() on the socket, leading to a pile-up of CLOSE_WAIT sockets.
NOTE: Excerpt from technet website.
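To pick out the connections stuck in CLOSE_WAIT from lsof output such as that in the question, a small filter along these lines may help. It is a sketch; close_wait_conns is a hypothetical helper name, and it relies on lsof printing the TCP state in parentheses as the last field.

```shell
#!/bin/sh
# Sketch: print the "local->remote" endpoint pair of every TCP connection that
# lsof reports in the CLOSE_WAIT state. Reads lsof output on stdin.
# close_wait_conns is a hypothetical helper name.
close_wait_conns() {
    awk '$NF == "(CLOSE_WAIT)" { print $(NF-1) }'
}

# Usage with the PID from the question; -n and -P keep addresses and ports
# numeric, -a ANDs the -p and -i selections together:
#   lsof -nP -a -p 16790 -i TCP | close_wait_conns
```

The numeric addresses and ports printed this way are the values the tools described next expect as arguments.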
There are 2 tools you can try to use to accomplish this.
- cutter
- Killcx
These tools work by simulating the FIN-ACK-RST exchange that is necessary for a TCP connection to be closed completely.
Killcx works by creating a fake SYN packet with a bogus SeqNum, spoofing the remote client IP/port and sending it to the server. It will fork a child process that will capture the server response, extract the 2 magic values from the ACK packet and use them to send a spoofed RST packet. The connection will then be closed.
NOTE: Excerpt from the Killcx website.
Using cutter
Cuts the specific connection between the two ip/port number pairs given.
# cutter ip-address-1 port-1 ip-address-2 port-2
% cutter 200.1.2.3 22 10.10.0.45 32451
Using Killcx
Cuts connections to remote ip & port.
# killcx remote-ip-address:port
% killcx 120.121.122.123:1234
Resources
- serverfault - How to Forcibly close a socket in time wait
You could try killing the parent process. Use ps to check:
ps xjf -C yum
Then kill -9 any parent processes.
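The parent PID can also be read from the ps -elf output already shown in the question, where PID is the fourth column and PPID the fifth. A minimal sketch, with ppid_of_pid as a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: print the parent PID of process $1 from `ps -elf` output on stdin.
# In that format the columns start: F S UID PID PPID ...
# ppid_of_pid is a hypothetical helper name.
ppid_of_pid() {
    awk -v pid="$1" '$4 == pid { print $5 }'
}

# Usage with the PID from the question:
#   ps -elf | ppid_of_pid 16790
```

For the line in the question this prints 1, which suggests the process has already been reparented to init, so there may be no other parent left to kill.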