How to describe JSON data in a spec?

I would recommend my js-schema JavaScript library. The primary motivation behind it was the same as what you describe in the question. It is a simple, easy-to-understand notation for describing JSON schemas (or specifications, if you prefer).

An example schema described in JSON Schema:

{
  "type":"object",
  "properties":{
    "id":{
      "type":"number",
      "required":true
    },
    "name":{
      "type":"string",
      "required":true
    },
    "price":{
      "type": "number",
      "minimum":0,
      "required":true
    },
    "tags":{
      "type":"array",
      "items":{
        "type":"string"
      }
    }
  }
}

and the same schema description with js-schema:

{
  "id"    : Number,
  "name"  : String,
  "price" : Number.min(0),
  "?tags" : Array.of(String)
}

The library is able to validate objects against schemas, generate random objects conforming to a given schema, and serialize/deserialize to/from JSON Schema.
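To make the "validate objects against schemas" part concrete, here is a minimal, self-contained sketch of the idea in plain JavaScript. It is not the js-schema API: the helper names (`isNumber`, `isString`, `min`, `arrayOf`, `validate`) are hypothetical and only mimic the compact notation above, including the `?` prefix for optional keys.

```javascript
// Hypothetical mini-validator for illustration only; NOT the js-schema API.
// A schema is a plain object mapping property names to predicate functions.
// A key that starts with "?" marks the property as optional.

const isNumber = (v) => typeof v === "number" && !Number.isNaN(v);
const isString = (v) => typeof v === "string";
const min = (lo) => (v) => isNumber(v) && v >= lo;
const arrayOf = (pred) => (v) => Array.isArray(v) && v.every((x) => pred(x));

function validate(schema, obj) {
  // Only plain (non-null, non-array) objects can match an object schema.
  if (obj === null || typeof obj !== "object" || Array.isArray(obj)) {
    return false;
  }
  return Object.entries(schema).every(([key, pred]) => {
    const optional = key.startsWith("?");
    const name = optional ? key.slice(1) : key;
    // A missing property is acceptable only if it was declared optional.
    if (!(name in obj)) return optional;
    return pred(obj[name]);
  });
}

// The product schema from the example above, written with the helpers.
const productSchema = {
  id: isNumber,
  name: isString,
  price: min(0),
  "?tags": arrayOf(isString),
};
```

With these definitions, `validate(productSchema, { id: 1, name: "Tea", price: 2.5 })` passes, while an object with a negative `price` or a non-string entry in `tags` is rejected.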


I know this is an older question, but it might be useful to someone else: when looking for ways to describe JSON data, I stumbled upon Orderly. Here's the abstract, right off the front page:

Orderly is a textual format for describing JSON. Orderly can be compiled into JSONSchema. It is designed to be easy to read and write.

I can agree with that, but I have only tried it with relatively simple structures so far.


How about using some kind of extended BNF?

PERSON <- { "firstname": FIRSTNAMES, "lastname": LASTNAME, "age": AGE, "version": VERSION, "parents": PARENTS }

FIRSTNAMES <- [ FIRSTNAME+ ]

FIRSTNAME <- STRING

LASTNAME <- STRING

PARENTS <- [ PERSON{0,2} ]

AGE <- INTEGER

VERSION <- 1 | 2

You'd have to define the meaning of atomic type descriptions like INTEGER and STRING somewhere. If you wanted to add non-hardcoded keys for dictionaries, you would define that as follows:

BREADLOOKUP <- { (TYPE : HOWMANY)+ }

TYPE <- "white" | "dark" | "french" | "croissant"

HOWMANY <- POSITIVE-INTEGER

This would allow stuff like

{ "white": 5, 
  "french": 2
}

Since both regular expressions and BNF are pretty well known, this might be an easy way to go. ?, +, *, {n}, {min,max} would be easy ways to specify a number of elements (taken from regexes) and the rest is pretty much pure BNF.

If you did that rigorously enough, it might even be parsable by a validator.
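As a rough sketch of what such a validator could check, the BREADLOOKUP rules can be translated by hand into small JavaScript predicates. Nothing here parses the BNF text itself; the helper names (`oneOf`, `positiveInteger`, `dict`, `list`) are made up for this example, and the `{min,max}` quantifiers are passed as plain arguments.

```javascript
// Hand-written translation of the grammar rules into predicates.
// All helper names are hypothetical; no grammar text is parsed here.

// "a" | "b" | ...  : the value must be one of the listed literals
const oneOf = (...values) => (v) => values.includes(v);

// POSITIVE-INTEGER : an atomic type whose meaning we define ourselves
const positiveInteger = (v) => Number.isInteger(v) && v > 0;

// { (KEY : VALUE){min,max} } : a dictionary with a quantified number of entries
const dict = (keyRule, valueRule, minCount = 0, maxCount = Infinity) => (v) => {
  if (v === null || typeof v !== "object" || Array.isArray(v)) return false;
  const entries = Object.entries(v);
  return (
    entries.length >= minCount &&
    entries.length <= maxCount &&
    entries.every(([k, val]) => keyRule(k) && valueRule(val))
  );
};

// [ ITEM{min,max} ] : an array with a quantified number of items
const list = (itemRule, minCount = 0, maxCount = Infinity) => (v) =>
  Array.isArray(v) &&
  v.length >= minCount &&
  v.length <= maxCount &&
  v.every((x) => itemRule(x));

// TYPE <- "white" | "dark" | "french" | "croissant"
const TYPE = oneOf("white", "dark", "french", "croissant");
// HOWMANY <- POSITIVE-INTEGER
const HOWMANY = positiveInteger;
// BREADLOOKUP <- { (TYPE : HOWMANY)+ }   ("+" means at least one entry)
const BREADLOOKUP = dict(TYPE, HOWMANY, 1);
```

Here `BREADLOOKUP({ "white": 5, "french": 2 })` is accepted, while an empty object, an unknown bread type, or a count of zero is rejected. The `list` helper covers rules such as `PARENTS <- [ PERSON{0,2} ]` in the same way.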


You could combine a W3C XML Schema, or some less ugly schema like RelaxNG, with conversion conventions.