Can't kill a sleeping process

I don't seem to be able to kill -9 a process which is in an interruptible sleep (S) state:

[root@jupiter ~]# ps -elf | grep yum
4 S root     16790     1  0  75   0 - 73779 -      Jan15 ?        00:00:04 /usr/bin/python /usr/bin/yum -y install python-pip
[root@jupiter ~]# kill -9 16790
[root@jupiter ~]# ps -elf | grep yum
4 S root     16790     1  0  75   0 - 73779 -      Jan15 ?        00:00:04 /usr/bin/python /usr/bin/yum -y install python-pip

How is this possible? Is there any way to kill the process without rebooting?

BOUNTY: I am really more interested in an explanation of how it is possible for this to occur.

UPDATE: This is the output of lsof:

[root@jupiter ~]# lsof -p 16790
COMMAND   PID USER   FD   TYPE             DEVICE SIZE/OFF      NODE NAME
yum     16790 root  cwd    DIR         1166,56842     4096  16886249 /home/del
yum     16790 root  rtd    DIR              253,0     4096         2 /
yum     16790 root  txt    REG              253,0     8304 336177337 /usr/bin/python
yum     16790 root  mem    REG              253,0   144776 346128569 /lib64/ld-2.5.so
yum     16790 root  mem    REG              253,0  1718232 346128573 /lib64/libc-2.5.so
yum     16790 root  mem    REG              253,0    23360 346128599 /lib64/libdl-2.5.so
yum     16790 root  mem    REG              253,0   145872 346128584 /lib64/libpthread-2.5.so
yum     16790 root  mem    REG              253,0   615136 346128602 /lib64/libm-2.5.so
yum     16790 root  mem    REG              253,0  1244792 336171087 /usr/lib64/libpython2.4.so.1.0
yum     16790 root  mem    REG              253,0    95464 346128744 /lib64/libselinux.so.1
yum     16790 root  mem    REG              253,0    53448 346128750 /lib64/librt-2.5.so
yum     16790 root  mem    REG              253,0    13960 336187564 /usr/lib64/libplds4.so
yum     16790 root  mem    REG              253,0    58400 346128752 /lib64/libgcc_s-4.1.2-20080825.so.1
yum     16790 root  mem    REG              253,0    78384 336173796 /usr/lib64/libelf-0.137.so
yum     16790 root  mem    REG              253,0  1139672 336187570 /usr/lib64/librpmdb-4.4.so
yum     16790 root  mem    REG              253,0   407792 336187568 /usr/lib64/librpmio-4.4.so
yum     16790 root  mem    REG              253,0   233144 336171420 /usr/lib64/libnspr4.so
yum     16790 root  mem    REG              253,0   375656 336187569 /usr/lib64/libsqlite3.so.0.8.6
yum     16790 root  mem    REG              253,0    17992 336187563 /usr/lib64/libplc4.so
yum     16790 root  mem    REG              253,0   386784 336187571 /usr/lib64/librpm-4.4.so
yum     16790 root  mem    REG              253,0   154776 336170228 /usr/lib64/librpmbuild-4.4.so
yum     16790 root  mem    REG              253,0   647608 346128759 /lib64/libglib-2.0.so.0.1200.3
yum     16790 root  mem    REG              253,0  1297136 336176959 /usr/lib64/libxml2.so.2.6.26
yum     16790 root  mem    REG              253,0    15584 346128756 /lib64/libtermcap.so.2.0.8
yum     16790 root  mem    REG              253,0  1234328 336187566 /usr/lib64/libnss3.so
yum     16790 root  mem    REG              253,0    18152 346128670 /lib64/libutil-2.5.so
yum     16790 root  mem    REG              253,0    34240 336177071 /usr/lib64/libpopt.so.0.0.0
yum     16790 root  mem    REG              253,0    67792 336187567 /usr/lib64/libbz2.so.1.0.3
yum     16790 root  mem    REG              253,0   143144 346128763 /lib64/libexpat.so.0.5.0
yum     16790 root  mem    REG              253,0 56434416 336184082 /usr/lib/locale/locale-archive
yum     16790 root  mem    REG              253,0   132656 336560181 /usr/lib64/python2.4/site-packages/rpm/_rpmmodule.so
yum     16790 root  mem    REG              253,0   154016 336187565 /usr/lib64/libnssutil3.so
yum     16790 root  mem    REG              253,0    96885 345638632 /usr/local/greenplum-loaders-3.3.0.0-build-3/lib/libz.so.1.2.3
yum     16790 root  mem    REG              253,0   247496 346128741 /lib64/libsepol.so.1
yum     16790 root  mem    REG              253,0   369144 336168883 /usr/lib64/libsoftokn3.so
yum     16790 root  mem    REG              253,0   312336 336178453 /usr/lib64/libfreebl3.so
yum     16790 root  mem    REG              253,0    20240 336530067 /usr/lib64/python2.4/lib-dynload/timemodule.so
yum     16790 root  mem    REG              253,0    25048 336529953 /usr/lib64/python2.4/lib-dynload/stropmodule.so
yum     16790 root  mem    REG              253,0    18984 336530051 /usr/lib64/python2.4/lib-dynload/cStringIO.so
yum     16790 root  mem    REG              253,0    21816 336529943 /usr/lib64/python2.4/lib-dynload/collectionsmodule.so
yum     16790 root  mem    REG              253,0    52152 336530044 /usr/lib64/python2.4/lib-dynload/_socketmodule.so
yum     16790 root  mem    REG              253,0    17200 336530045 /usr/lib64/python2.4/lib-dynload/_ssl.so
yum     16790 root  mem    REG              253,0   315080 346128749 /lib64/libssl.so.0.9.8e
yum     16790 root  mem    REG              253,0  1366912 346128748 /lib64/libcrypto.so.0.9.8e
yum     16790 root  mem    REG              253,0   190976 336187552 /usr/lib64/libgssapi_krb5.so.2.2
yum     16790 root  mem    REG              253,0   613928 336184245 /usr/lib64/libkrb5.so.3.3
yum     16790 root  mem    REG              253,0    11760 346128747 /lib64/libcom_err.so.2.1
yum     16790 root  mem    REG              253,0   153720 336181723 /usr/lib64/libk5crypto.so.3.1
yum     16790 root  mem    REG              253,0    35984 336177832 /usr/lib64/libkrb5support.so.0.1
yum     16790 root  mem    REG              253,0     9472 346128681 /lib64/libkeyutils-1.2.so
yum     16790 root  mem    REG              253,0    92816 346128730 /lib64/libresolv-2.5.so
yum     16790 root  mem    REG              253,0    75384 336530050 /usr/lib64/python2.4/lib-dynload/cPickle.so
yum     16790 root  mem    REG              253,0    23736 336530064 /usr/lib64/python2.4/lib-dynload/structmodule.so
yum     16790 root  mem    REG              253,0    27336 336528958 /usr/lib64/python2.4/lib-dynload/operator.so
yum     16790 root  mem    REG              253,0    21520 336529958 /usr/lib64/python2.4/lib-dynload/zlibmodule.so
yum     16790 root  mem    REG              253,0    37944 336528952 /usr/lib64/python2.4/lib-dynload/itertoolsmodule.so
yum     16790 root  mem    REG              253,0    21528 336528929 /usr/lib64/python2.4/lib-dynload/_localemodule.so
yum     16790 root  mem    REG              253,0    21208 336529939 /usr/lib64/python2.4/lib-dynload/binascii.so
yum     16790 root  mem    REG              253,0    12080 336530062 /usr/lib64/python2.4/lib-dynload/shamodule.so
yum     16790 root  mem    REG              253,0    13168 336530058 /usr/lib64/python2.4/lib-dynload/md5module.so
yum     16790 root  mem    REG              253,0    18000 336529947 /usr/lib64/python2.4/lib-dynload/mathmodule.so
yum     16790 root  mem    REG              253,0    12504 336529934 /usr/lib64/python2.4/lib-dynload/_randommodule.so
yum     16790 root  mem    REG              253,0    15320 336528948 /usr/lib64/python2.4/lib-dynload/fcntlmodule.so
yum     16790 root  mem    REG              253,0    32816 336530049 /usr/lib64/python2.4/lib-dynload/bz2.so
yum     16790 root  mem    REG              253,0     8608 336529946 /usr/lib64/python2.4/lib-dynload/grpmodule.so
yum     16790 root  mem    REG              253,0    38696 336529819 /usr/lib64/python2.4/site-packages/cElementTree.so
yum     16790 root  mem    REG              253,0    42672 336530047 /usr/lib64/python2.4/lib-dynload/arraymodule.so
yum     16790 root  mem    REG              253,0     9368 336528915 /usr/lib64/python2.4/lib-dynload/_bisect.so
yum     16790 root  mem    REG              253,0    74992 336529944 /usr/lib64/python2.4/lib-dynload/datetime.so
yum     16790 root  mem    REG              253,0   372912 336560510 /usr/lib64/python2.4/site-packages/M2Crypto/__m2crypto.so
yum     16790 root  mem    REG              253,0     7120 336529937 /usr/lib64/python2.4/lib-dynload/_weakref.so
yum     16790 root  mem    REG              253,0    17496 336528966 /usr/lib64/python2.4/lib-dynload/selectmodule.so
yum     16790 root  mem    REG              253,0    46448 336528961 /usr/lib64/python2.4/lib-dynload/pyexpat.so
yum     16790 root  mem    REG              253,0    33896 336529820 /usr/lib64/python2.4/site-packages/_sqlite.so
yum     16790 root  mem    REG              253,0    41784 336530075 /usr/lib64/python2.4/site-packages/_sqlitecache.so
yum     16790 root  mem    REG              253,0    25104 336530066 /usr/lib64/python2.4/lib-dynload/termios.so
yum     16790 root  mem    REG              253,0     7280 336530065 /usr/lib64/python2.4/lib-dynload/syslog.so
yum     16790 root  mem    REG              253,0    25464 336265457 /usr/lib64/gconv/gconv-modules.cache
yum     16790 root  mem    REG              253,0    66544 336528926 /usr/lib64/python2.4/lib-dynload/_cursesmodule.so
yum     16790 root  mem    REG              253,0   380336 336181932 /usr/lib64/libncurses.so.5.5
yum     16790 root  mem    REG              253,0   405880 336529957 /usr/lib64/python2.4/lib-dynload/unicodedata.so
yum     16790 root  mem    REG              253,0    24576 236520047 /var/lib/rpm/__db.001
yum     16790 root  mem    REG              253,0    53880 346128424 /lib64/libnss_files-2.5.so
yum     16790 root  mem    REG              253,0    23736 346128408 /lib64/libnss_dns-2.5.so
yum     16790 root  mem    REG              253,0  1318912 236520050 /var/lib/rpm/__db.002
yum     16790 root  mem    REG              253,0   663552 236520051 /var/lib/rpm/__db.003
yum     16790 root  mem    REG              253,0   769074 336174965 /usr/share/locale/en_US/LC_MESSAGES/redhat-dist.mo
yum     16790 root    0u   CHR              136,8      0t0        10 /dev/pts/8 (deleted)
yum     16790 root    1u   CHR              136,8      0t0        10 /dev/pts/8 (deleted)
yum     16790 root    2u   CHR              136,8      0t0        10 /dev/pts/8 (deleted)
yum     16790 root    3u  unix 0xffff8104388d2e40      0t0   4675113 socket
yum     16790 root    4w   REG              253,0        0 236522326 /var/log/yum.log
yum     16790 root    5u   REG              253,0   605184 236520025 /var/cache/yum/WANdisco-dev/primary.xml.gz.sqlite
yum     16790 root    6u   REG              253,0    20480 236524002 /var/cache/yum/addons/primary.sqlite.old.tmp (deleted)
yum     16790 root    7u   REG              253,0 12578816 236519970 /var/cache/yum/base/primary.xml.gz.sqlite.old.tmp (deleted)
yum     16790 root    8u   REG              253,0 17972224 236523993 /var/cache/yum/epel/317109b44f1b0b40d910dc60c9080e62c7f4b16a-primary.sqlite.old.tmp (deleted)
yum     16790 root    9u   REG              253,0   967680 236524055 /var/cache/yum/extras/primary.sqlite.old.tmp (deleted)
yum     16790 root   10u   REG              253,0   459776 246415366 /var/cache/yum/pgdg92/primary.sqlite.old.tmp (deleted)
yum     16790 root   11u   REG              253,0  4927488 236524060 /var/cache/yum/updates/primary.sqlite.old.tmp (deleted)
yum     16790 root   12r   REG              253,0 65204224 236519434 /var/lib/rpm/Packages
yum     16790 root   13r   REG              253,0    45056 236519438 /var/lib/rpm/Name
yum     16790 root   14u  IPv4            4675317      0t0       TCP jupiter.example.com:33597->riksun.riken.go.jp:http (ESTABLISHED)
yum     16790 root   15u  IPv4            4675939      0t0       TCP jupiter.example.com:52708->freedom.itsc.cuhk.edu.hk:http (CLOSE_WAIT)
yum     16790 root   16r   REG              253,0 65204224 236519434 /var/lib/rpm/Packages
yum     16790 root   17r   REG              253,0    45056 236519438 /var/lib/rpm/Name
yum     16790 root   18r   REG              253,0    12288 236519440 /var/lib/rpm/Pubkeys
yum     16790 root   20r  FIFO                0,6      0t0   4676024 pipe
yum     16790 root   24w  FIFO                0,6      0t0   4676024 pipe

A process in S or D state is usually in a blocking system call, such as reading from or writing to a file or the network, waiting for a called program to finish, or waiting on semaphores or other synchronization primitives. It goes into the sleep state while waiting.

You can't "wake it up" - it will only proceed when the data/resource it is waiting for becomes available. This is all normal and expected, and only a problem when trying to kill it.

You can use strace -p pid to find out which system call process pid is currently blocked in.
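
As a complement to strace, the kernel's own view of the process can be read from /proc. The following is a minimal sketch, assuming a Linux /proc filesystem; the helper names state_from_status and describe_pid are made up for illustration and are not part of any tool mentioned here.

```shell
#!/bin/sh
# Sketch: show the scheduler state and wait channel of a process.
# Assumes Linux /proc; state_from_status and describe_pid are illustrative names.

# Extract the one-letter state (R, S, D, T, Z, ...) from /proc/PID/status text on stdin.
state_from_status() {
    awk '$1 == "State:" { print $2; exit }'
}

# Print state and wait channel for the PID given as $1.
describe_pid() {
    pid=$1
    echo "state: $(state_from_status < "/proc/$pid/status")"
    # wchan names the kernel function the task sleeps in; it may be 0 or empty
    # when the task is not blocked or the kernel hides symbol names.
    echo "wchan: $(cat "/proc/$pid/wchan" 2>/dev/null)"
}

# Demonstrate on the current shell; pass the stuck PID (16790 above) instead.
if [ -r "/proc/$$/status" ]; then
    describe_pid "$$"
fi
```

A state of D together with a wait channel inside a filesystem or driver function would point at uninterruptible I/O, whereas S means the sleep itself is interruptible.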

From Wikipedia:

An uninterruptible sleep state is a sleep state that won't handle a signal right away. It will wake only as a result of a waited-upon resource becoming available or after a time-out occurs during that wait (if specified when put to sleep). It is mostly used by device drivers waiting for disk or network IO (input/output). When the process is sleeping uninterruptibly, signals accumulated during the sleep will be noticed when the process returns from the system call or trap.

A process blocked in a system call that cannot be interrupted is in uninterruptible sleep (D state), which, as its name says, cannot be interrupted even by root.

Normally, processes cannot block SIGKILL. But kernel code can, and processes execute kernel code when they make system calls; during some of these calls, kernel code blocks all signals. So if a system call blocks indefinitely, there may effectively be no way to kill the process. The SIGKILL takes effect only once the process completes the system call.
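
To check whether the SIGKILL has been recorded but not yet acted on, the pending-signal masks in /proc/PID/status can be inspected. This is a rough sketch, assuming a Linux kernel that exposes the SigPnd and ShdPnd fields; mask_has_signal and sigkill_pending are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: test whether a signal number is set in a hex mask such as the
# SigPnd/ShdPnd fields of /proc/PID/status. Helper names are illustrative.

# mask_has_signal HEXMASK SIGNUM -> exit status 0 if the bit for SIGNUM is set.
# Signal N is stored in bit N-1, so SIGKILL (9) corresponds to 0x100.
mask_has_signal() {
    [ $(( (0x$1 >> ($2 - 1)) & 1 )) -eq 1 ]
}

# sigkill_pending PID -> succeeds if SIGKILL is pending per-thread or process-wide.
sigkill_pending() {
    for mask in $(awk '$1 == "SigPnd:" || $1 == "ShdPnd:" { print $2 }' "/proc/$1/status"); do
        if mask_has_signal "$mask" 9; then
            return 0
        fi
    done
    return 1
}

# Demonstrate on the current shell, which should have no SIGKILL pending.
if [ -r "/proc/$$/status" ]; then
    if sigkill_pending "$$"; then
        echo "SIGKILL pending"
    else
        echo "no SIGKILL pending"
    fi
fi
```

Running sigkill_pending 16790 right after the kill -9 would indicate whether the signal is still queued for the process.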


Background on a sleeping process

You might want to take a look at this Unix & Linux post.

  • https://unix.stackexchange.com/questions/5642/what-if-kill-9-does-not-work

Specifically this answer, https://unix.stackexchange.com/a/5648/7453.

Excerpt from that post

kill -9 (SIGKILL) always works, provided you have the permission to kill the process. Basically either the process must be started by you and not be setuid or setgid, or you must be root. There is one exception: even root cannot send a fatal signal to PID 1 (the init process).

However kill -9 is not guaranteed to work immediately. All signals, including SIGKILL, are delivered asynchronously: the kernel may take its time to deliver them. Usually, delivering a signal takes at most a few microseconds, just the time it takes for the target to get a time slice. However, if the target has blocked the signal, the signal will be queued until the target unblocks it.

Normally, processes cannot block SIGKILL. But kernel code can, and processes execute kernel code when they call system calls. Kernel code blocks all signals when interrupting the system call would result in a badly formed data structure somewhere in the kernel, or more generally in some kernel invariant being violated. So if (due to a bug or misdesign) a system call blocks indefinitely, there may effectively be no way to kill the process. (But the process will be killed if it ever completes the system call.)

...

...

I highly suggest reading the rest of that answer!

Killing a process that's blocked by a resource (file or network)

Here are 2 things to try.

1. Removing yum's .pid file

Is there a yum lock file present? What happens when you remove that lock file? I think that might allow it to proceed.

rm /var/run/yum.pid
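
Before removing the file, it may be worth checking which process it names. The sketch below assumes that /var/run/yum.pid contains the PID of the yum process holding the lock; lock_owner and lock_is_stale are illustrative names, not yum commands.

```shell
#!/bin/sh
# Sketch: decide whether a pid-style lock file is stale before deleting it.
# lock_owner and lock_is_stale are illustrative names, not yum commands.

# Print the PID recorded in the lock file given as $1 (digits only).
lock_owner() {
    tr -dc '0-9' < "$1"
}

# Succeed only if the lock file exists and no live process has the recorded PID.
# kill -0 sends no signal; it only checks that the PID can be signalled.
lock_is_stale() {
    [ -f "$1" ] || return 1
    owner=$(lock_owner "$1")
    [ -n "$owner" ] || return 0
    ! kill -0 "$owner" 2>/dev/null
}

# Example: inspect the lock without removing anything.
lock=/var/run/yum.pid
if [ -f "$lock" ]; then
    echo "lock held by PID $(lock_owner "$lock")"
    if lock_is_stale "$lock"; then
        echo "owner is gone; lock looks stale"
    fi
fi
```

If the recorded PID is 16790 and that process is still alive, the lock is not stale, and removing it only lets a second yum start alongside the stuck one.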

2. Forcing any hanging CLOSE_WAIT TCP connections closed

A CLOSE_WAIT is described as follows:

CLOSE_WAIT indicates that the server has received the first FIN signal from the client and the connection is in the process of being closed.

So this essentially means that this is a state where the socket is waiting for the application to execute close().

A socket can remain in the CLOSE_WAIT state indefinitely until the application closes it. Faulty scenarios include a file descriptor leak, or a server not executing close() on a socket, leading to a pile-up of CLOSE_WAIT sockets.

NOTE: Excerpt from technet website.

There are 2 tools you can try to use to accomplish this.

  • cutter
  • Killcx

These tools work by simulating the FIN-ACK-RST exchange that is necessary for a TCP connection to be closed completely.

Killcx works by creating a fake SYN packet with a bogus SeqNum, spoofing the remote client IP/port and sending it to the server. It will fork a child process that will capture the server response, extract the 2 magic values from the ACK packet and use them to send a spoofed RST packet. The connection will then be closed.

NOTE: Excerpt from the Killcx website.

Using cutter

Cuts the specific connection between the two ip/port number pairs given.

# cutter ip-address-1 port-1 ip-address-2 port-2
% cutter 200.1.2.3 22 10.10.0.45 32451

Using Killcx

Cuts connections to remote ip & port.

# killcx remote-ip-address:port
% killcx 120.121.122.123:1234
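
Both tools need the address and port pairs of the connection. One way to get them is to filter lsof output for sockets in CLOSE_WAIT. This is only a sketch: close_wait_peers is a made-up helper name, and the addresses in the sample lines are invented, modelled on the lsof output in the question.

```shell
#!/bin/sh
# Sketch: pull the endpoints of CLOSE_WAIT sockets out of lsof output.
# close_wait_peers is an illustrative name; it reads lsof-style lines on stdin.

close_wait_peers() {
    # The last field is the TCP state, the one before it is "local->remote".
    awk '$NF == "(CLOSE_WAIT)" {
        split($(NF - 1), ends, "->")
        print ends[1], ends[2]
    }'
}

# Typical use, with numeric addresses and ports (-n -P), limited to one PID:
#   lsof -nP -a -p 16790 -i TCP | close_wait_peers
# Sample lines modelled on the lsof output in the question (addresses are made up):
printf '%s\n' \
  'yum 16790 root 15u IPv4 4675939 0t0 TCP 10.0.0.5:52708->192.0.2.7:80 (CLOSE_WAIT)' \
  'yum 16790 root 14u IPv4 4675317 0t0 TCP 10.0.0.5:33597->192.0.2.9:80 (ESTABLISHED)' |
  close_wait_peers
```

Each output line holds the local and remote endpoint of one CLOSE_WAIT socket, which can then be split into the arguments cutter or killcx expects.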

Resources

  • serverfault - How to Forcibly close a socket in time wait

You could try killing the parent process. Use ps to check:

ps xjf -C yum

Then kill -9 any parent processes.
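
Note that the ps -elf output in the question shows a parent PID of 1 for the yum process, so in that particular case there is no ordinary parent left to kill. A small sketch of that check follows; ppid_of and parent_is_killable are illustrative names.

```shell
#!/bin/sh
# Sketch: look up a process's parent and decide whether killing it is an option.
# ppid_of and parent_is_killable are illustrative names.

# Print the parent PID of the process given as $1.
ppid_of() {
    ps -o ppid= -p "$1" | tr -d ' '
}

# A parent PID of 1 means the process was reparented to init, which cannot be
# killed with SIGKILL, so there is no useful parent to kill.
parent_is_killable() {
    [ "$1" -gt 1 ] 2>/dev/null
}

# Demonstrate on the current shell; use the stuck PID (16790 above) instead.
if command -v ps >/dev/null 2>&1; then
    parent=$(ppid_of "$$")
    if parent_is_killable "$parent"; then
        echo "parent is PID $parent"
    else
        echo "parent is PID ${parent:-unknown}; nothing to kill"
    fi
fi
```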