What effect does it have on a server when you kill all root processes?

Solution 1:

The quick answer is that you killed sshd (and lord knows what else) and will be unable to log back into the system over SSH. Unless you have some other way of reaching the system (such as a remote console, IPMI, etc.), you will need to reboot it, which will restore SSH and the other services.

Hopefully you have physical access to the box, in which case you probably just need to hit the power button. Realize that you killed many processes and be prepared for some corruption. Linux is designed to recover from a system crash, and you essentially triggered a 'manual' crash. Most things should recover fine after a reboot. You may have all sorts of interesting error messages in the logfiles.


Long answer:

This is a great thought experiment and a good job interview question. "What happens if you did X..." This is a fun thing to try on your own private virtual machine, but should never be done on a real box. Everyone makes mistakes. Remember and learn from your mistake. Making mistakes is the best way to learn. Making mistakes on production is a painful lesson that will happen occasionally in your career.

pkill -KILL -u root

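This command sends SIGKILL to every process owned by root (KILL is just another name for SIGKILL, i.e. signal 9, the same thing kill -9 sends). It is a very bad thing to do to a system. kill -9 should be avoided except as a last resort, because a process cannot catch SIGKILL and therefore gets no chance to shut down cleanly.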
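To make the difference between a polite kill and kill -9 concrete, here is a small sketch you can run safely as a normal user. The worker function and the marker file are made up for illustration; the marker stands in for a lock file or pid file that a real daemon would remove on shutdown.

```shell
#!/bin/sh
# Demo: a process can catch SIGTERM and clean up, but it never sees SIGKILL.
# "worker" and the marker file are hypothetical stand-ins for a real daemon.

marker=$(mktemp)

worker() {
    # On SIGTERM: remove the marker, then exit cleanly.
    trap 'rm -f "$marker"; exit 0' TERM
    while :; do sleep 1; done
}

# Run 1: polite termination with SIGTERM.
worker &
pid=$!
sleep 1                          # give the worker time to install its trap
kill -TERM "$pid"
term_status=0
wait "$pid" || term_status=$?
if [ -e "$marker" ]; then term_result="left behind"; else term_result="cleaned up"; fi

# Run 2: the same worker, killed with SIGKILL.
: > "$marker"                    # recreate the marker removed in run 1
worker &
pid=$!
sleep 1
kill -KILL "$pid"
kill_status=0
wait "$pid" || kill_status=$?
if [ -e "$marker" ]; then kill_result="left behind"; else kill_result="cleaned up"; fi

echo "TERM: marker $term_result (exit status $term_status)"
echo "KILL: marker $kill_result (exit status $kill_status)"

rm -f "$marker"                  # tidy up after the demo
```

With SIGTERM the trap runs and the marker is removed. With SIGKILL the trap never runs, so the marker stays behind, which is exactly the kind of stale state you should expect to find after the reboot.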

Your command killed every process owned by root immediately, without giving any of them a chance to clean up. To get a sense of what you killed, log into a healthy Linux box and list the processes owned by root with one of the commands below. You typically do not need to be root to run them:

$ pgrep -u root -l
$ ps -u root u

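On Linux you most likely did not kill init (PID 1) itself: the kernel only delivers a signal to PID 1 if init has installed a handler for it, and SIGKILL cannot be handled, so it is dropped. What you did kill is nearly everything init had started. Whether any of those services come back on their own depends on how your init system is configured to respawn them. So the system may continue to function for now, but it is sick and needs to be repaired as soon as possible. The longer you wait, the worse it will get.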
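If you still have a console on the box, one way to check on init is to read its entries under /proc. The sketch below is Linux-specific and assumes /proc is mounted; the helper name signal_is_caught is made up for illustration. It prints the name of PID 1 and whether PID 1 has a handler installed for SIGKILL (9) and SIGTERM (15). No process can install a handler for SIGKILL, so that one should always show as having no handler.

```shell
#!/bin/sh
# Inspect PID 1 through /proc (Linux only). Read-only, no root needed.

# signal_is_caught MASK SIGNUM
# MASK is the hex bitmask from the SigCgt line of /proc/PID/status.
# Succeeds if the bit for SIGNUM (bit SIGNUM-1) is set. Only signals
# 1..31 are supported: we keep the low 32 bits to avoid overflow.
signal_is_caught() {
    low=$(printf '%s' "$1" | tail -c 8)
    [ $(( (0x$low >> ($2 - 1)) & 1 )) -eq 1 ]
}

init_name=$(cat /proc/1/comm 2>/dev/null)
sigcgt=$(awk '/^SigCgt:/ {print $2}' /proc/1/status 2>/dev/null)

echo "PID 1 is: ${init_name:-unknown}"
for sig in 9 15; do
    if signal_is_caught "${sigcgt:-0}" "$sig"; then
        echo "signal $sig: handler installed"
    else
        echo "signal $sig: no handler"
    fi
done
```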

UPDATE: Web server is still running. But I can't connect by SSH now. I have no idea what I've done.

I am guessing that you are using Apache. It appears that the child processes of the webserver are still running because they are not owned by user 'root'. However, the parent webserver process is normally owned by root and you killed it. As a result, new child processes will not spawn. This will be fine for a time, because you probably have enough child processes to serve requests, and typically those child processes will persist until they are killed or they crash. Again, the quickest fix is to reboot the machine.
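A rough way to spot this state from a shell is to look for web server workers that are not owned by root and whose parent is gone. When a parent dies, the kernel hands its surviving children to another process, usually PID 1, so a non-root worker with PPID 1 is a strong hint. This is only a sketch: it assumes a procps-style ps, assumes the workers are named apache2 or httpd, and find_orphaned_workers is a made-up helper name.

```shell
#!/bin/sh
# Read "USER PID PPID COMMAND" lines on stdin and print web server
# workers that are not owned by root and have been reparented to PID 1.
# The process names apache2/httpd are assumptions; adjust for your distro.
find_orphaned_workers() {
    awk '$1 != "root" && $3 == 1 && ($4 == "apache2" || $4 == "httpd")'
}

# The header line is skipped automatically because its PPID column is not 1.
ps -eo user,pid,ppid,comm | find_orphaned_workers
```

On a healthy server this prints nothing, because every worker's parent is the root-owned master process.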

Solution 2:

You will most likely have to restart your system as you have killed pretty much every critical service on it. How you do that depends on what tools you have or what transport you have to get to the data centre.

Solution 3:

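The system is still running because the kernel is still running. You can't log in over SSH because you have killed the sshd daemon. init itself (PID 1) normally survives on Linux, since the kernel does not deliver SIGKILL to it, but the root-owned daemons it started are gone. So Apache may stop accepting new connections once its remaining worker processes exit, because the root-owned parent is no longer there to spawn replacements (depending on your configuration ;)).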

Kernel threads ignore signals sent from user space, which is why the system stays up even though the root-owned services have been terminated. For a clean resurrection, you ought to reboot.
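If you are curious which of the "root processes" are really kernel threads, here is a small sketch. It assumes a normal Linux host where PID 2 is kthreadd, the parent of all kernel threads, and a procps-style ps that supports --ppid; list_kernel_threads is a made-up helper name. Inside a container PID 2 is usually something else, in which case the function prints nothing.

```shell
#!/bin/sh
# Print the PID and name of every kernel thread, i.e. every child of
# kthreadd (PID 2). Does nothing if PID 2 is not kthreadd.
list_kernel_threads() {
    if [ "$(cat /proc/2/comm 2>/dev/null)" = "kthreadd" ]; then
        ps --ppid 2 -o pid=,comm=
    fi
}

list_kernel_threads
```

These show up as owned by root in the output of pgrep -u root -l, but the kill had no effect on them.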