Explanation of JSONB introduced by PostgreSQL
First, hstore
is a contrib module, which only allows you to store key => value pairs, where keys and values can only be text
s (however values can be sql NULL
s too).
Both json
& jsonb
allows you to store a valid JSON value (defined in its spec).
F.ex. these are valid JSON representations: null
, true
, [1,false,"string",{"foo":"bar"}]
, {"foo":"bar","baz":[null]}
- hstore
is just a little subset compared to what JSON is capable (but if you only need this subset, it's fine).
The only difference between json
& jsonb
is their storage:
-
json
is stored in its plain text format, while -
jsonb
is stored in some binary representation
There are 3 major consequences of this:
-
jsonb
usually takes more disk space to store thanjson
(sometimes not) -
jsonb
takes more time to build from its input representation thanjson
-
json
operations take significantly more time thanjsonb
(& parsing also needs to be done each time you do some operation at ajson
typed value)
When jsonb
will be available with a stable release, there will be two major use cases, when you can easily select between them:
- If you only work with the JSON representation in your application, PostgreSQL is only used to store & retrieve this representation, you should use
json
. - If you do a lot of operations on the JSON value in PostgreSQL, or use indexing on some JSON field, you should use
jsonb
.
Peeyush:
The short answer is:
- If you are doing a lot of JSON manipulation inside PostgreSQL, such as sorting, slicing, splicing, etc., you should use JSONB for speed reasons.
- If you need indexed lookups for arbitrary key searches on JSON, then you should use JSONB.
- If you are doing neither of the above, you should probably use JSON.
- If you need to preserve key ordering, whitespace, and duplicate keys, you should use JSON.
For a longer answer, you'll need to wait for me to do a full "HowTo" writeup closer to the 9.4 release.
A simple explanation of the difference between json and jsonb (original image by PostgresProfessional):
SELECT '{"c":0, "a":2,"a":1}'::json, '{"c":0, "a":2,"a":1}'::jsonb;
json | jsonb
------------------------+---------------------
{"c":0, "a":2,"a":1} | {"a": 1, "c": 0}
(1 row)
- json: textual storage «as is»
- jsonb: no whitespaces
- jsonb: no duplicate keys, last key wins
- jsonb: keys are sorted
More in speech video and slide show presentation by jsonb developers. Also they introduced JsQuery, a pg.extension that provides the powerful jsonb query language.
-
hstore
is more of a "wide column" storage type, it is a flat (non-nested) dictionary of key-value pairs, always stored in a reasonably efficient binary format (a hash table, hence the name). -
json
stores JSON documents as text, performing validation when the documents are stored, and parsing them on output if needed (i.e. accessing individual fields); it should support the entire JSON spec. Since the entire JSON text is stored, its formatting is preserved. -
jsonb
takes shortcuts for performance reasons: JSON data is parsed on input and stored in binary format, key orderings in dictionaries are not maintained, and neither are duplicate keys. Accessing individual elements in the JSONB field is fast as it doesn't require parsing the JSON text all the time. On output, JSON data is reconstructed and initial formatting is lost.
IMO, there is no significant reason for not using jsonb
once it is available, if you are working with machine-readable data.