Apache access.log interpretation

Okay, well, I can get that exact log message using the following Python code:

# Python 2: send the request through the Apache server as if it were an HTTP proxy
import urllib
proxies = {'http': 'http://myapacheserver'}
file_handle = urllib.urlopen('http://www.siasatema.com', proxies=proxies)

Which gives me the log entry:

192.168.0.28 - - [18/Mar/2011:14:40:40 +0200] "GET http://www.siasatema.com HTTP/1.0" 200 453 "-" "Python-urllib/1.17"

Incidentally, all I get back from this is the contents of my default web page.
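For anyone on Python 3, where urllib.urlopen no longer takes a proxies argument, a rough equivalent might look like the sketch below. The function names build_proxy_opener and fetch_via_proxy are made up for this example, and the proxy URL you pass in is whatever your own Apache server is called.

```python
import urllib.request


def build_proxy_opener(proxy_url):
    """Return an opener that sends plain-HTTP requests through proxy_url."""
    # An explicit ProxyHandler replaces the default one that reads
    # proxy settings from the environment.
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


def fetch_via_proxy(url, proxy_url, timeout=10):
    """Fetch url through proxy_url and return (status, body bytes).

    Error statuses such as 404 are raised as urllib.error.HTTPError.
    """
    opener = build_proxy_opener(proxy_url)
    with opener.open(url, timeout=timeout) as response:
        return response.status, response.read()
```

Calling fetch_via_proxy('http://www.siasatema.com', 'http://myapacheserver') should produce the same kind of request line in access.log, although the HTTP version and user agent string will differ from the Python 2 entry above.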

So yes, something is trying to proxy through your web server; it is probably attackers looking for open, badly configured proxies that they can connect to and use to abuse another site. To stop it, set:

ProxyRequests Off 

Incidentally, I can replicate the other ones by doing:

$ nc ubuntuvm 80
telnet 64.12.244.203 80

Which gives:

192.168.0.28 - - [18/Mar/2011:14:58:47 +0200] "telnet 64.12.244.203 80" 400 505 "-" "-"


Your server is an open proxy, which is a security problem. Spammers can send mail through your server, and others can fetch illegal content through it. Fix it ASAP.

Set

ProxyRequests Off

in the proxy configuration to fix it.


I do not think that your Apache is an open proxy.

174.34.231.19 - - [18/Mar/2011:02:24:56 +0200] "GET http://www.siasatema.com HTTP/1.1" 200 469 "-" "Python-urllib/2.4"

This could be someone sending an HTTP request to your Apache server with www.siasatema.com as the hostname in the request. In that case Apache serves the default page (the first virtual host). You can configure Apache to reject this kind of request or to redirect it to your main page.
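To make the fallback concrete, here is a toy model in Python of name-based virtual host selection. It is only an illustration of the rule described above, not Apache's actual code; VHOSTS, select_vhost and the example hostnames and paths are all invented for the sketch.

```python
# Hypothetical virtual hosts, in the order they would appear in the config.
VHOSTS = [
    {"server_name": "example.org", "aliases": ["www.example.org"],
     "docroot": "/var/www/main"},
    {"server_name": "other.example", "aliases": [],
     "docroot": "/var/www/other"},
]


def select_vhost(requested_host, vhosts):
    """Pick the vhost whose name or alias matches, else the first one."""
    # Drop an optional ":port" suffix and compare case-insensitively.
    # (Simplification: IPv6 literals such as "[::1]:80" are not handled.)
    name = requested_host.split(":", 1)[0].strip().lower()
    for vhost in vhosts:
        names = {vhost["server_name"].lower()}
        names |= {alias.lower() for alias in vhost.get("aliases", [])}
        if name in names:
            return vhost
    # No match: fall back to the first virtual host, i.e. the default page.
    return vhosts[0]
```

With these made-up hosts, select_vhost("www.siasatema.com", VHOSTS) falls through to the first entry, which is why an unknown hostname still gets a 200 and the default page.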

187.35.50.61 - - [18/Mar/2011:01:28:20 +0200] "POST http://72.26.198.222:80/log/normal/ HTTP/1.0" 404 491 "-" "Octoshape-sua/1010120"
87.117.203.177 - - [18/Mar/2011:01:29:59 +0200] "CONNECT 64.12.244.203:80 HTTP/1.0" 405 556 "-" "-"
87.117.203.177 - - [18/Mar/2011:01:29:59 +0200] "open 64.12.244.203 80" 400 506 "-" "-"
87.117.203.177 - - [18/Mar/2011:01:30:04 +0200] "telnet 64.12.244.203 80" 400 506 "-" "-"
87.117.203.177 - - [18/Mar/2011:01:30:09 +0200] "64.12.244.203 80" 400 301 "-" "-"

Those are unsuccessful requests that Apache rejected, as the 404, 405 and 400 status codes show.
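If you want to pick this kind of probe out of a larger log, a rough sketch like the following may help. It assumes the combined log format seen in the entries above; LOG_PATTERN, classify_request and parse_line are names made up for this example, and the categories are my own labels rather than anything Apache reports.

```python
import re

# Combined log format: host ident user [time] "request" status size "referer" "agent"
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>.*?)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>.*?)" "(?P<agent>.*?)"$'
)


def classify_request(request):
    """Label a request line as normal, absolute-uri, connect or malformed."""
    parts = request.split()
    # A well-formed request line is "METHOD target HTTP/x.y".
    if len(parts) != 3 or not parts[2].startswith("HTTP/"):
        return "malformed"
    method, target, _version = parts
    if method == "CONNECT":
        return "connect"
    # A full URL as the target is what a client sends to a proxy.
    if target.lower().startswith(("http://", "https://")):
        return "absolute-uri"
    return "normal"


def parse_line(line):
    """Return a dict for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    entry["kind"] = classify_request(entry["request"])
    return entry
```

Anything labelled absolute-uri or connect with a 2xx status deserves a closer look; the rejected ones, like the entries above, are just noise from scanners.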


I believe the case is very simple: someone reaches your server at the correct IP address but under a name you do not expect, and gets the front page. You can reproduce this, for example, as follows:

A broadband user adds to his hosts file (/etc/hosts, or /windows/system32/drivers/etc/hosts) the entry

aa.bb.cc.dd rubbishrubbishrubbish.com

where aa.bb.cc.dd is your IP address, and after that uses any browser to access http://rubbishrubbishrubbish.com

You will see the access in the log file, with a "GET" for the / of rubbishrubbishrubbish.com. Typical Apache installations do not care about the hostname part of the URL, only about the rest of it, and so return your homepage.
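The same experiment can be done without touching the hosts file by connecting to the IP address and sending an arbitrary Host header. This is only a sketch using Python's standard http.client; fetch_with_host is a name invented here, and the IP address and hostname you pass in are up to you.

```python
import http.client


def fetch_with_host(ip_address, host_header, port=80, path="/", timeout=10):
    """GET path from ip_address while claiming to want host_header.

    Returns (status, body bytes).
    """
    conn = http.client.HTTPConnection(ip_address, port, timeout=timeout)
    try:
        # Supplying our own Host header stops http.client from adding
        # one based on the address we connected to.
        conn.request("GET", path, headers={"Host": host_header})
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
```

For example, fetch_with_host("aa.bb.cc.dd", "rubbishrubbishrubbish.com"), with aa.bb.cc.dd replaced by your server's address, should come back with your homepage if the server behaves as described.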

Note also that your server can of course be accessed by its IP address or FQDN, or by possible nicknames, which (unless explicitly configured) do not need to be known to the Apache server. My home web server can be reached under multiple names over IPv4 and IPv6, but the server itself does not know any of those domain names.

The remaining question is: why? My guess is: to test whether you act as a proxy. And you do not. Python-urllib/2.4 has also been reported as a bot user agent, although not every client that sends it is a bot.