Mysterious misdirected Chinese traffic: How can I find out what DNS server an HTTP request used?
Solution 1:
There is one theoretical way of determining the DNS resolver your clients use, but it's quite advanced and I don't know of any off-the-shelf software that will do it for you. You will definitely need to run an authoritative DNS server in addition to your nginx.
When the HTTP Host header is incorrect, serve an error document that embeds a request to a dynamically generated FQDN, unique to each and every request, and log that FQDN to a database, e.g.:
http://e2665feebe35bc97aff1b329c87b87e7.example.com/img.png
As long as China's Great Firewall doesn't tamper with that request and the client actually requests the document from that unique FQDN+URI, each request will result in a new DNS lookup against your authoritative DNS server for example.com, because a unique name cannot already be cached anywhere. There you can log the IP of the DNS resolver and later correlate it with your dynamically generated URIs.
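The correlation step could look roughly like the following. This is only a minimal sketch, not a finished tool: `make_probe`, `correlate`, the in-memory `issued` dict (standing in for the database) and the `(queried name, resolver IP)` log format are all assumptions made for illustration. How you actually extract those pairs depends on the query logging of whatever authoritative DNS server you run.

```python
import secrets

ZONE = "example.com"  # assumption: a zone whose authoritative DNS server you run and log


def make_probe(client_ip, issued):
    """Create a unique probe URL and remember which HTTP client it was served to."""
    token = secrets.token_hex(16)  # 32 hex characters, a valid single DNS label
    issued[token] = client_ip      # hypothetical stand-in for the database row
    return f"http://{token}.{ZONE}/img.png"


def correlate(dns_log, issued):
    """Map HTTP client IPs to the resolver IPs that looked up their probe names.

    dns_log is an iterable of (queried_name, resolver_ip) pairs, assumed to have
    been parsed from the authoritative server's query log.
    """
    suffix = "." + ZONE
    matches = []
    for qname, resolver_ip in dns_log:
        # Normalise: drop the trailing root dot and ignore case.
        name = qname.rstrip(".").lower()
        if not name.endswith(suffix):
            continue
        token = name[: -len(suffix)]
        if token in issued:
            matches.append((issued[token], resolver_ip))
    return matches
```

The error document would embed the URL returned by `make_probe`, and `correlate` would be run later over the DNS query log. Note that the resolver IP you see is the one that contacted your authoritative server, which for large public resolvers may be an egress address rather than the address the client has configured.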
Solution 2:
I've heard the Great Firewall used to redirect "blocked" traffic to a handful of phony IPs, but this made the blocks easy to spot (I'm not sure whether it also made them easy to circumvent). In any case, the administrators have started redirecting to random IPs. This has apparently led to some Chinese users getting porn instead of Facebook or VPNs.
I suspect one of your IPs has turned out to be a recipient of blocked Chinese traffic, hence the Facebook IPI user agents you are seeing.
This means the host-header check should be a good one. Most user agents support SNI these days, so you should be able to drop no-host-header traffic with relative impunity.
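As an illustration of the Host header check, here is a minimal sketch of the decision logic. The `ALLOWED_HOSTS` set and the `is_expected_host` helper are hypothetical names; in practice the same effect is usually achieved in the web server configuration itself (for example with a catch-all default server) rather than in application code.

```python
ALLOWED_HOSTS = {"example.com", "www.example.com"}  # assumption: the names you really serve


def is_expected_host(host_header):
    """Return True only if the Host header names one of our own sites."""
    if not host_header:
        return False  # missing or empty Host header: treat as misdirected
    host = host_header.strip().lower()
    if host.startswith("["):
        # Bracketed IPv6 literal such as "[::1]:80": keep only the bracketed part.
        host = host.split("]", 1)[0] + "]"
    else:
        # Drop an optional ":port" suffix.
        host = host.split(":", 1)[0]
    return host.rstrip(".") in ALLOWED_HOSTS
```

Requests for which this returns False (for instance `Host: facebook.com` arriving at your IP) are the misdirected ones and can be dropped or answered with an error document.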
Edit: http://www.infosecurity-magazine.com/news/great-firewall-upgrade-redirects/
Solution 3:
How can I find out what DNS server those customers are using?
Contact Chinanet and ask? Seriously, DNS is configurable on the client side. Most people get DNS settings via DHCP, but OpenDNS and Google's DNS offering wouldn't have a business model if you couldn't change them.
Is there any way to determine if an HTTP request is coming from a VPN?
Not really, except that the IP would be of the VPN, not the end user in China.
What is really going on here?
That I can't tell you, but perhaps there's some kind of misconfiguration in the Great Firewall of China?