How are these 'bad bots' finding my closed webserver?

I installed Apache a while ago, and a quick look at my access.log shows all sorts of unknown IPs connecting, mostly getting status codes 403, 404, 400, and 408. I have no idea how they're finding my IP, because I only use the server for personal use, and I added a robots.txt hoping it'd keep search engines away. I block indexes and there's nothing really important on it.

How are these bots (or people) finding the server? Is it common for this to happen? Are these connections dangerous/what can I do about it?

Also, lots of the IPs come from all sorts of countries, and don't resolve to a hostname.

Here's a bunch of examples of what comes through:

In one large sweep, this bot tried to find phpMyAdmin:

"GET /w00tw00t.at.blackhats.romanian.anti-sec:) HTTP/1.1" 403 243 "-" "ZmEu"
"GET /3rdparty/phpMyAdmin/scripts/setup.php HTTP/1.1" 404 235 "-" "ZmEu"
"GET /admin/mysql/scripts/setup.php HTTP/1.1" 404 227 "-" "ZmEu"
"GET /admin/phpmyadmin/scripts/setup.php HTTP/1.1" 404 232 "-" "ZmEu"

I get plenty of these:

"HEAD / HTTP/1.0" 403 - "-" "-"

Lots of "proxyheader.php"; I get quite a few requests with http:// links in the GET:

"GET http://www.tosunmail.com/proxyheader.php HTTP/1.1" 404 213 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"

"CONNECT"

"CONNECT 213.92.8.7:31204 HTTP/1.0" 403 - "-" "-"

"soapCaller.bs"

"GET /user/soapCaller.bs HTTP/1.1" 404 216 "-" "Morfeus Fucking Scanner"

and this really sketchy hex crap..

"\xad\r<\xc8\xda\\\x17Y\xc0@\xd7J\x8f\xf9\xb9\xc6x\ru#<\xea\x1ex\xdc\xb0\xfa\x0c7f(" 400 226 "-" "-"

empty

"-" 408 - "-" "-"

That's just the gist of it. I get all sorts of junk, even with win95 user-agents.

Thanks.


Welcome to the internet :)

  • How they found you: most likely brute-force IP scanning, the same way they keep up a constant stream of vulnerability scans against your host once they've found it.
  • To prevent it in the future: while not totally avoidable, you can inhibit them with security tools like Fail2Ban for Apache, rate limits, manual bans, or ACLs.
  • It's very common to see this on any outside accessible hardware that responds on common ports
  • It's only dangerous if you have unpatched versions of software on the host that may be vulnerable. These are merely blind attempts to see if you've got anything 'cool' for these script kiddies to tinker with. Think of it as someone walking around the parking lot trying car doors to see if they're unlocked: make sure yours is locked and chances are he'll leave it alone.

These are just people trying to find vulnerabilities in servers, almost certainly using compromised machines.

It'll just be people scanning certain IP ranges -- you can see from the phpMyAdmin one, that it is trying to find a badly secured pre-install version of PMA. Once it's found one, it can get surprising access to the system.

Ensure that your system is kept up to date, and you don't have any services that aren't required.


These are robots scanning for known security exploits. They simply scan entire network ranges and will therefore find unadvertised servers like yours. They're not playing nice and don't care about your robots.txt. If they find a vulnerability, they'll either log it (and you can expect a manual attack shortly) or will automatically infect your machine with a rootkit or similar malware. There is very little you can do about this and it's just normal business on the internet. They are the reason why it's important to always have the latest security fixes for your software installed.
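
To make the patterns in your log easier to see, here is a rough Python sketch that labels request lines by the kind of probe they look like. The patterns are modelled only on the excerpts in the question, and the category names and the `classify_request` helper are made up for illustration; real scanners use far more variations than this.

```python
import re

# Request-line patterns modelled on the log excerpts in the question.
# The labels are invented for this sketch, not standard terminology.
PROBE_PATTERNS = [
    ("proxy-connect", re.compile(r"^CONNECT\s")),
    ("open-proxy-test", re.compile(r"^[A-Z]+ https?://", re.IGNORECASE)),
    ("phpmyadmin-setup", re.compile(r"scripts/setup\.php", re.IGNORECASE)),
    ("scanner-banner", re.compile(r"w00tw00t", re.IGNORECASE)),
    # Apache writes non-printable bytes to the log as literal \xhh escapes.
    ("binary-garbage", re.compile(r"\\x[0-9a-fA-F]{2}")),
]


def classify_request(request_line: str) -> str:
    """Return a rough label for one logged request line."""
    if request_line in ("", "-"):
        # A bare "-" is what Apache logs when no request line arrived.
        return "empty-or-timeout"
    for label, pattern in PROBE_PATTERNS:
        if pattern.search(request_line):
            return label
    return "other"
```

Feeding it the request part of each log entry (the first quoted field) gives a quick tally of what is hitting the server. It only describes the traffic; it does not block anything.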


As others have noted, they are likely doing brute force scanning. If you are on a dynamic IP address they might be more likely to scan your address. (The following advice assumes Linux/UNIX, but most of it can also be applied to Windows servers.)

The easiest ways to block them are:

  • Firewall port 80 and only allow a limited range of IP addresses to access your server.
  • Configure ACL(s) in your Apache configuration which only allow certain addresses to access your server. (You can have different rules for different content.)
  • Require authentication for access from the Internet.
  • Change the server signature to exclude your build. (Not a lot of increased security, but it makes version-specific attacks a little more difficult.)
  • Install a tool like fail2ban to automatically block their addresses. Getting the matching pattern(s) right may take a bit of work, but if 400-series errors are uncommon for your site, it may not be that difficult.
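
Before tuning a fail2ban pattern, it can help to see which addresses actually rack up 400-series errors. The following is a minimal Python sketch, not fail2ban's own matching logic. It assumes your access.log uses Apache's common or combined log format (client address first, then the timestamp in brackets, then the quoted request and the status code); the function names and the default threshold of 3 are arbitrary choices for illustration.

```python
import re
from collections import Counter

# Start of an Apache "common"/"combined" log line (assumed format):
# client address, identity, user, [timestamp], "request", status.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] '
    r'"(?P<request>(?:[^"\\]|\\.)*)" (?P<status>\d{3}) '
)


def count_client_errors(lines):
    """Count 4xx responses per client address."""
    errors = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("status").startswith("4"):
            errors[match.group("ip")] += 1
    return errors


def offenders(lines, threshold=3):
    """Return addresses with at least `threshold` 4xx responses, sorted."""
    counts = count_client_errors(lines)
    return sorted(ip for ip, n in counts.items() if n >= threshold)
```

Calling `offenders(open("access.log"))` (the path is whatever your setup uses) lists the candidates. If your own address never shows up there, a simple "ban after N 4xx responses" rule is probably safe for your site.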

To limit the damage they can do to your system, make sure that the Apache process can only write to directories and files that it should be able to change. In most cases the server only needs read access to the content it serves.
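
As a partial check of the write-access advice, this Python sketch walks a directory tree and reports anything with the "other" write bit set, which any local user (including the one Apache runs as) could modify. It is only a first pass: it does not check files owned by, or group-writable for, the Apache user itself, and that account's name varies by distribution. The function name and the example path in the usage note are assumptions.

```python
import os
import stat


def world_writable_paths(root):
    """Yield paths under `root` whose mode has the 'other' write bit set."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            mode = os.lstat(path).st_mode
            if stat.S_ISLNK(mode):
                # Skip symlinks; their own mode bits say nothing useful.
                continue
            if mode & stat.S_IWOTH:
                yield path
```

Running `list(world_writable_paths("/var/www"))` against your document root (adjust the path to your setup) should ideally return an empty list.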