Why does my system hang when I run ps, w and possibly other commands?
I had that happen once when an NFS server went down.
The fact that it's hung trying to read information about pid 17398, and pid 17398 is in D
(disk wait) state, suggests that could be the cause for you too.
read(6, "Name:\tconvert\nState:\tD (disk sle"..., 1023) = 664
open("/proc/17398/cmdline", O_RDONLY) = 6
If you do have NFS mounts, I think the best option is to try to bring the NFS server back up.
Otherwise, umount -f <mount> might help.
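If you want to see which processes are stuck in D state without running ps, here is a minimal sketch that reads only the status files, which is the file ps managed to read in the trace above. The function name list_d_state and its optional proc_root argument are my own inventions for illustration (the argument only exists so you can point it at a test directory), and I'm assuming the status files stay readable on your system as they did in the trace.

```shell
# List processes in uninterruptible sleep (state D) by reading only
# <proc_root>/<pid>/status, never cmdline.
# proc_root is a hypothetical parameter; it defaults to /proc.
list_d_state() {
    proc_root="${1:-/proc}"
    for status in "$proc_root"/[0-9]*/status; do
        # Skip entries that vanished or that the glob did not match.
        [ -r "$status" ] || continue
        # The State line looks like: "State:<TAB>D (disk sleep)"
        state=$(awk '/^State:/ {print $2; exit}' "$status" 2>/dev/null)
        if [ "$state" = "D" ]; then
            pid=${status%/status}
            pid=${pid##*/}
            name=$(awk '/^Name:/ {print $2; exit}' "$status" 2>/dev/null)
            printf '%s %s\n' "$pid" "$name"
        fi
    done
}
```

Calling list_d_state with no arguments scans /proc and prints one "pid name" pair per D-state process.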
sigh closed-question handling is pretty poor, this'll be the third time I try typing this, so please forgive the terseness.
First, use intr NFS mounts. The default hard NFS mounts hang forever. soft NFS mounts error out after a timeout (which might be stupid for transient errors). intr lets you decide to interrupt a hung NFS operation. Just right.
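To check which of these modes your existing mounts are actually using, something like the sketch below might help. It only parses the mount table, so it should not touch the NFS server itself. The function name nfs_mount_modes and the mounts_file argument are hypothetical, and I'm assuming the hard or soft option shows up in the options column of /proc/mounts, which is what I've seen on Linux. Also note that, per nfs(5), Linux kernels after 2.6.25 accept intr but ignore it, and only SIGKILL interrupts a pending NFS operation there, so check your kernel before relying on it.

```shell
# Print each NFS mount point with its hard/soft mode.
# mounts_file is a hypothetical parameter; it defaults to /proc/mounts.
nfs_mount_modes() {
    mounts_file="${1:-/proc/mounts}"
    # Columns: device, mount point, fs type, options, dump, pass
    awk '$3 == "nfs" || $3 == "nfs4" {
        mode = "unknown"
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++) {
            if (opts[i] == "hard" || opts[i] == "soft") mode = opts[i]
        }
        print $2, mode
    }' "$mounts_file"
}
```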
Second, to fix this stupid problem, I've used a stupid trick before; it probably still works. Bring up an interface alias on lo with the NFS server's IP address (edit: ifconfig eth0:0 <ipaddress>). Create an /etc/exports file that contains a line to export the filesystem that you're hung on (edit: export a filesystem with the same name as the 'hung' filesystem; you'll have to create the same pathname as what you've mounted). Start your NFS server on your local machine, and hopefully your hung program can error out with "file not found" or "directory not found" or something like that, letting you get on with your work without rebooting.
Don't forget to turn off your NFS server again and remove the interface alias when you're done.
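Roughly, the steps of the trick look like the sketch below. It is a dry run that only prints commands so you can review them before running anything as root. The function name print_fake_nfs_plan, the lo:0 alias name, the host netmask and the "*(ro)" export options are all my assumptions, not something the trick requires, and the command that starts the NFS server varies by distribution, so it is left as a comment.

```shell
# Dry run: print (do not execute) the commands for the loopback-alias trick.
# server_ip   = IP address of the dead NFS server (placeholder)
# export_path = path the dead server was exporting (placeholder)
print_fake_nfs_plan() {
    server_ip="$1"
    export_path="$2"
    echo "ifconfig lo:0 $server_ip netmask 255.255.255.255 up"
    echo "mkdir -p $export_path"
    echo "echo '$export_path *(ro)' >> /etc/exports"
    echo "exportfs -ra"
    echo "# start the NFS server here (command varies by distribution)"
    echo "# once the hung process has errored out, undo everything:"
    echo "exportfs -ua"
    echo "# stop the NFS server and remove the line added to /etc/exports"
    echo "ifconfig lo:0 down"
}
```

For example, print_fake_nfs_plan 192.0.2.10 /srv/data prints the plan for a server at that (documentation-range) address.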
I'm not sure why the focus on NFS? Is the asker running NFS? Didn't see anything about that.
Anyways, this is a very strange problem since it's /proc. Try the following things to give yourself more info about the problem:
- Go into /proc and find other pid directories and try reading the cmdline files from those directories.
- Try reading /proc/pid/stat as well; if that doesn't work, I'd say your system is having kernel issues.
- Are you able to run netstat -n? This reads from different parts of /proc so it might work and would indicate less of a problem with the proc interface.
- Try remounting /proc with mount -o remount /proc although I have no idea what this would do in this situation.
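The first check in the list above can be scripted so that a stuck read doesn't take your shell down with it. This is a minimal sketch: it starts every read in the background, waits once, then reports which readers are still running. The function name probe_cmdline and its proc_root and wait_secs arguments are hypothetical. One caveat: if a reader really is blocked in D state, the kill at the end may not take effect until whatever it is waiting on comes back.

```shell
# Try to read <proc_root>/<pid>/cmdline for every pid, in the background,
# and report which reads are still blocked after wait_secs seconds.
# "ok" only means the read returned (possibly with an error), not that
# the process is healthy.
probe_cmdline() {
    proc_root="${1:-/proc}"
    wait_secs="${2:-2}"
    pairs=""
    for dir in "$proc_root"/[0-9]*; do
        [ -d "$dir" ] || continue
        cat "$dir/cmdline" >/dev/null 2>&1 &
        # Remember "pid:reader_pid" so we can check on the reader later.
        pairs="$pairs ${dir##*/}:$!"
    done
    sleep "$wait_secs"
    for pair in $pairs; do
        pid=${pair%%:*}
        reader=${pair##*:}
        if kill -0 "$reader" 2>/dev/null; then
            echo "$pid hung"
            # Best-effort cleanup of the stuck reader.
            kill -9 "$reader" 2>/dev/null
        else
            echo "$pid ok"
        fi
    done
}
```

Any pid reported as hung is a candidate for the process that ps is getting stuck on.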
I would just suggest rebooting. If you can't read stuff from proc, I'm not sure what you're going to find through other methods. If it happens again, then start worrying.