How does Google reCAPTCHA v2 work behind the scenes?

This is speculation, but based on Google's reference to the "risk analysis engine" they use (http://googleonlinesecurity.blogspot.com/2014/12/are-you-robot-introducing-no-captcha.html)

I would assume it looks at how you behaved prior to clicking, how your cursor moved on its way to the check (organic path/acceleration), which part of the checkbox was clicked (random places, or dead on center every time), browser fingerprint, Google cookies & contents, click location history tied to your fingerprint or account if it detects one etc.

It's fairly difficult to fake "organic" behavior in such a way that it would fool a continuously learning pattern detection engine. In the cases where it's not sure, it still prompts you to match an actual CAPTCHA string.


A new paper has been released with several tests against reCAPTCHA:

https://www.blackhat.com/docs/asia-16/materials/asia-16-Sivakorn-Im-Not-a-Human-Breaking-the-Google-reCAPTCHA-wp.pdf

Some highlights:

  • By keeping a cookie active for +9 days (by browsing sites with Google resources), you can then pass reCAPTCHA by only clicking the checkbox;
  • There are no restrictions based on requests per IP;
  • The browser's user agent must be real, and Google run tests against your environment to ensure it matches the user agent;
  • Google tests if the browser can render a Canvas;
  • Screen resolution and mouse events don't affect the results;

Google has already fixed the cookie vulnerability and is probably restricting some behaviors based on IPs.

Another interesting finding is that Google runs a VM in JavaScript that obfuscates much of reCAPTCHA code and behavior. This VM is known as botguard and is used to protect other services besides reCAPTCHA:

https://github.com/neuroradiology/InsideReCaptcha

UPDATE 2017

A recent paper (from August) was published on WOOT 2017 achieving 85% accuracy in solving noCAPTCHA reCAPTCHA audio challenges:

http://uncaptcha.cs.umd.edu/papers/uncaptcha_woot17.pdf

UPDATE 2018

Google is introducing reCAPTCHA v3, which looks like a "human score prediction engine" that is calibrated per website. It can be installed into different pages of a website (working like a Google Analytics script) to help reCAPTCHA and the website owner to understand the behaviour of humans vs. bots before filling a reCAPTCHA.

https://www.google.com/recaptcha/intro/v3beta.html