What is the minimum valid JSON?

I've carefully read the JSON description http://json.org/ but I'm not sure I know the answer to the simple question. What strings are the minimum possible valid JSON?

  • "string" is the string valid JSON?
  • 42 is the simple number valid JSON?
  • true is the boolean value a valid JSON?
  • {} is the empty object a valid JSON?
  • [] is the empty array a valid JSON?

Solution 1:

At the time of writing, JSON was solely described in RFC4627. It describes (at the start of "2") a JSON text as being a serialized object or array.

This means that only {} and [] are valid, complete JSON strings in parsers and stringifiers which adhere to that standard.

However, the introduction of ECMA-404 changes that, and the updated advice can be read here. I've also written a blog post on the issue.


To confuse the matter further however, the JSON object (e.g. JSON.parse() and JSON.stringify()) available in web browsers is standardised in ES5, and that clearly defines the acceptable JSON texts like so:

The JSON interchange format used in this specification is exactly that described by RFC 4627 with two exceptions:

  • The top level JSONText production of the ECMAScript JSON grammar may consist of any JSONValue rather than being restricted to being a JSONObject or a JSONArray as specified by RFC 4627.

  • snipped

This would mean that all JSON values (including strings, nulls and numbers) are accepted by the JSON object, even though the JSON object technically adheres to RFC 4627.

Note that you could therefore stringify a number in a conformant browser via JSON.stringify(5), which would be rejected by another parser that adheres to RFC4627, but which doesn't have the specific exception listed above. Ruby, for example, would seem to be one such example which only accepts objects and arrays as the root. PHP, on the other hand, specifically adds the exception that "it will also encode and decode scalar types and NULL".

Solution 2:

There are at least four documents which can be considered JSON standards on the Internet. The RFCs referenced all describe the mime type application/json. Here is what each has to say about the top-level values, and whether anything other than an object or array is allowed at the top:

RFC-4627: No.

A JSON text is a sequence of tokens. The set of tokens includes six structural characters, strings, numbers, and three literal names.

A JSON text is a serialized object or array.

JSON-text = object / array

Note that RFC-4627 was marked "informational" as opposed to "proposed standard", and that it is obsoleted by RFC-7159, which in turn is obsoleted by RFC-8259.

RFC-8259: Yes.

A JSON text is a sequence of tokens. The set of tokens includes six structural characters, strings, numbers, and three literal names.

A JSON text is a serialized value. Note that certain previous specifications of JSON constrained a JSON text to be an object or an array. Implementations that generate only objects or arrays where a JSON text is called for will be interoperable in the sense that all implementations will accept these as conforming JSON texts.

JSON-text = ws value ws

RFC-8259 is dated December 2017 and is marked "INTERNET STANDARD".

ECMA-262: Yes.

The JSON Syntactic Grammar defines a valid JSON text in terms of tokens defined by the JSON lexical grammar. The goal symbol of the grammar is JSONText.

Syntax JSONText :

JSONValue

JSONValue :

JSONNullLiteral

JSONBooleanLiteral

JSONObject

JSONArray

JSONString

JSONNumber

ECMA-404: Yes.

A JSON text is a sequence of tokens formed from Unicode code points that conforms to the JSON value grammar. The set of tokens includes six structural tokens, strings, numbers, and three literal name tokens.

Solution 3:

According to the old definition in RFC 4627 (which was obsoleted in March 2014 by RFC 7159), those were all valid "JSON values", but only the last two would constitute a complete "JSON text":

A JSON text is a serialized object or array.

Depending on the parser used, the lone "JSON values" might be accepted anyway. For example (sticking to the "JSON value" vs "JSON text" terminology):

  • the JSON.parse() function now standardised in modern browsers accepts any "JSON value"
  • the PHP function json_decode was introduced in version 5.2.0 only accepting a whole "JSON text", but was amended to accept any "JSON value" in version 5.2.1
  • Python's json.loads accepts any "JSON value" according to examples on this manual page
  • the validator at http://jsonlint.com expects a full "JSON text"
  • the Ruby JSON module will only accept a full "JSON text" (at least according to the comments on this manual page)

The distinction is a bit like the distinction between an "XML document" and an "XML fragment", although technically <foo /> is a well-formed XML document (it would be better written as <?xml version="1.0" ?><foo />, but as pointed out in comments, the <?xml declaration is technically optional).

Solution 4:

JSON stands for JavaScript Object Notation. Only {} and [] define a Javascript object. The other examples are value literals. There are object types in Javascript for working with those values, but the expression "string" is a source code representation of a literal value and not an object.

Keep in mind that JSON is not Javascript. It is a notation that represents data. It has a very simple and limited structure. JSON data is structured using {},:[] characters. You can only use literal values inside that structure.

It is perfectly valid for a server to respond with either an object description or a literal value. All JSON parsers should be handle to handle just a literal value, but only one value. JSON can only represent a single object at a time. So for a server to return more than one value it would have to structure it as an object or an array.