How do sites detect bots behind proxies or company networks?

Solution 1:

No, they'll ban the public IP and everyone who is NAT'd to that IP will also be banned.

Although, at least at Stack, if we think we are going to ban a college or something like that, we'll reach out to their abuse contact to get them to track the offender down and stop the issue.

Solution 2:

A site cannot directly ban an individual IP that is behind NAT, because it only ever sees the shared public address. It could act on IPs passed along by non-anonymising HTTP proxies: when such a proxy forwards a request, it typically appends the client's address to an X-Forwarded-For header, so if access from your private network has to go through such a proxy, the internal IP could be exposed. However, most sites (Wikipedia included) wouldn't trust the information in that header anyway, because it's easy to spoof, whether to implicate innocent IPs or to evade bans.
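To make the trust problem concrete, here is a rough Python sketch of how a site might decide which address to act on. The `TRUSTED_PROXIES` set and the `client_ip` helper are made up for illustration (the addresses come from the documentation ranges), and this is one conservative policy rather than what any particular site actually does: the header is only consulted when the request arrives from a proxy the site already trusts, and only the entry that proxy itself appended is used.

```python
import ipaddress
from typing import Optional

# Hypothetical: proxies the site operates or has explicitly agreed to trust.
TRUSTED_PROXIES = {ipaddress.ip_address("203.0.113.10")}


def client_ip(peer_addr: str, xff_header: Optional[str]) -> str:
    """Return the address a ban would be applied to.

    peer_addr  -- address of the TCP peer that connected to the site
    xff_header -- raw X-Forwarded-For value, or None if the header is absent
    """
    peer = ipaddress.ip_address(peer_addr)

    # Anyone can send an X-Forwarded-For header, so ignore it unless the
    # request really came from a proxy we trust.
    if peer not in TRUSTED_PROXIES or not xff_header:
        return str(peer)

    # Proxies append to the list, so the rightmost entry is the one our
    # trusted proxy added. Everything to its left is client-supplied.
    last = xff_header.split(",")[-1].strip()
    try:
        return str(ipaddress.ip_address(last))
    except ValueError:
        # Malformed entry: fall back to the address we can actually verify.
        return str(peer)


# A spoofed header from an untrusted peer is ignored entirely.
print(client_ip("198.51.100.7", "192.0.2.99"))
# Via the trusted proxy, only the entry the proxy appended is used.
print(client_ip("203.0.113.10", "192.0.2.99, 10.0.0.5"))
```

Note that in the second call the address the site ends up with is a private one (`10.0.0.5`), which only means something in combination with the proxy's own public address.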

There are, however, other techniques that attempt to identify users uniquely, independently of IP address. You can interrogate a web browser for a lot of information about it and the system it's running on, such as the user-agent, screen resolution, list of plugins, etc. See https://github.com/carlo/jquery-browser-fingerprint for an example of this in practice. You could use such fingerprints to control access, but there are limits. Depending on the site's design, a client may be able to interact with it without ever engaging with the fingerprinting process. Even where it can't, a bot whose author knows this kind of protection is in place could supply spurious, randomised data so that it never presents a consistent fingerprint. This method of control also runs the risk of false positives, especially on mobile devices, where large numbers of clients will probably be running identical stock software on identical stock hardware (most people on a specific model of iPhone running a specific version of iOS, for instance, would probably get the same fingerprint). Fingerprinting like this is normally used just for user tracking rather than to enforce controls, but I am aware of places that do use it to implement bans when there is concern that an IP block would be too broad, and it could be effective against a naive bot.
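As a rough server-side illustration of the idea, the sketch below hashes a set of reported browser attributes into a single fingerprint. The `fingerprint` function, the attribute names, and the sample values are all invented for the example; the linked jQuery plugin collects its data in the browser and is not guaranteed to work this way.

```python
import hashlib
import json


def fingerprint(attrs: dict) -> str:
    """Hash reported browser attributes into a stable hex identifier."""
    # Serialise with sorted keys so the same attributes always produce
    # the same string regardless of the order they were collected in.
    canonical = json.dumps(attrs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical attribute sets, as a client-side script might report them.
phone_a = {"user_agent": "ExampleBrowser/1.0 (Phone X)", "screen": "390x844", "plugins": []}
phone_b = {"screen": "390x844", "plugins": [], "user_agent": "ExampleBrowser/1.0 (Phone X)"}
bot = {"user_agent": "ExampleBrowser/1.0 (Phone X)", "screen": "391x844", "plugins": []}

# Two different people on identical stock devices collide (a false positive).
print(fingerprint(phone_a) == fingerprint(phone_b))
# A bot that tweaks any one reported value gets a different fingerprint.
print(fingerprint(phone_a) == fingerprint(bot))
```

Both weaknesses described above show up here: the two stock phones are indistinguishable, while the bot only has to change one reported value per request to look like a new client every time.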