Why do some query strings work even if parameters are not URL-encoded?

Here's an example:

https://drive.google.com/viewerng/viewer?embedded=true&url=http://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf

The url parameter, http://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf, is not encoded. It contains reserved characters, like the colon, slashes, and question mark.

Why does this still work? And why bother encoding if it works without it?


The reserved characters of a URI are mostly used as delimiters. That doesn’t mean they may not be used; it only means that they have a special purpose, and if you want to use them as plain data rather than for that purpose, you have to percent-encode them.

The query component begins after the first ? and ends before the first # (or at the end of the URI if there is no #). Within the query component itself, the URI standard does not give any character a delimiter role.

The URI standard RFC 3986 defines that the query component can contain these characters:

  • a-z, A-Z
  • 0-9
  • / ? : @ ! $ & ' ( ) * + , ; = - . _ ~
  • percent-encoded characters

It even explicitly mentions:

The characters slash ("/") and question mark ("?") may represent data within the query component.


The query component of your example URI is this:

embedded=true&url=http://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf

Apart from letters, it contains =, &, :, /, ., ?, _, all of which are allowed in the query.
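
To check this claim mechanically, here is a minimal sketch that tests a string against the character set listed above. The regular expression is my own transcription of the query rule from RFC 3986, and the names QUERY_RE and is_valid_query are made up for this illustration; they do not come from any library.

```python
import re

# Transcription of the RFC 3986 rule: query = *( pchar / "/" / "?" )
# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
# (QUERY_RE and is_valid_query are hypothetical names for this sketch.)
QUERY_RE = re.compile(r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@/?]|%[0-9A-Fa-f]{2})*")


def is_valid_query(query: str) -> bool:
    """Return True if every character is allowed in a URI query component."""
    return QUERY_RE.fullmatch(query) is not None


example = (
    "embedded=true&url=http://journals.plos.org/plosone/s/file"
    "?id=wjVg/PLOSOne_formatting_sample_main_body.pdf"
)

print(is_valid_query(example))    # True: only allowed characters
print(is_valid_query("q=a b"))    # False: a raw space is not allowed
print(is_valid_query("q=100%"))   # False: % must be followed by two hex digits
```

This only checks syntax. It says nothing about how a server will interpret the characters it accepts.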

Note that the name=value format (with pairs separated by &) in the query component is just a convention, not something defined in the URI specification.
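
As an illustration of how that convention is applied in practice, the sketch below feeds the example URI to Python's standard urllib.parse module. This shows one common parser's behaviour and is an assumption about what a typical server does, not a statement about how Google's viewer is implemented.

```python
from urllib.parse import urlsplit, parse_qs

uri = (
    "https://drive.google.com/viewerng/viewer?embedded=true"
    "&url=http://journals.plos.org/plosone/s/file"
    "?id=wjVg/PLOSOne_formatting_sample_main_body.pdf"
)

# The query runs from the first "?" to the end (there is no "#" here).
query = urlsplit(uri).query

# Pairs are split on "&", and each pair is split on its first "=" only,
# so the ":", "/", "?" and second "=" inside the value stay part of the value.
params = parse_qs(query)

print(params["embedded"])  # ['true']
print(params["url"])
# ['http://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf']
```

The nested URL survives intact because it happens to contain no & or #, which are the characters this convention would trip over.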


Because some characters have special meanings in a URL: a question mark (?) separates the path from the query, and an ampersand (&) separates key-value pairs. If such a character appeared as data inside a query value, the parser could not tell data from delimiter. For example, an & inside the nested URL would be read as the start of a new parameter. We encode so that the data is unambiguous. The characters in your example are not treated ambiguously, because each one appears in a place where the http URL scheme allows it.
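
To make the ambiguity concrete, here is a small sketch using Python's urllib.parse. The inner URL (example.org with two parameters) is invented for this illustration and is not taken from the question.

```python
from urllib.parse import parse_qs, quote

# Hypothetical nested URL that itself contains an "&".
inner = "http://example.org/file?id=1&page=2"

# Unencoded: the "&" inside the nested URL is taken as a pair separator.
naive = "embedded=true&url=" + inner
print(parse_qs(naive))
# {'embedded': ['true'], 'url': ['http://example.org/file?id=1'], 'page': ['2']}

# Percent-encoded: safe="" makes quote() encode ":", "/", "?", "=" and "&" too.
encoded = "embedded=true&url=" + quote(inner, safe="")
print(encoded)
# embedded=true&url=http%3A%2F%2Fexample.org%2Ffile%3Fid%3D1%26page%3D2

print(parse_qs(encoded)["url"] == [inner])  # True: the value round-trips
```

So encoding is worth the trouble because you usually cannot guarantee that the value will never contain &, # or another character the parser gives meaning to.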