Using json_array_elements() to break out rows with elements of JSON array

I have a table of the following format with just one column. There are around 700 entries in total, here are 5 samples:

{"year":"2021","category":"chemistry","laureates":[{"id": "1002", "firstname": "Benjamin", "surname": "List", "motivation": "\"for the development of asymmetric organocatalysis\"", "share": "2"}, {"id": "1003", "firstname": "David", "surname": "MacMillan", "motivation": "\"for the development of asymmetric organocatalysis\"", "share": "2"}],"reason":null}
{"year":"2021","category":"economics","laureates":[{"id": "1007", "firstname": "David", "surname": "Card", "motivation": "\"for his empirical contributions  to labour economics\"", "share": "2"}, {"id": "1008", "firstname": "Joshua", "surname": "Angrist", "motivation": "\"for their methodological contributions to the analysis  of causal relationships\"", "share": "4"}, {"id": "1009", "firstname": "Guido", "surname": "Imbens", "motivation": "\"for their methodological contributions to the analysis  of causal relationships\"", "share": "4"}],"reason":null}
{"year":"2021","category":"literature","laureates":[{"id": "1004", "firstname": "Abdulrazak", "surname": "Gurnah", "motivation": "\"for his uncompromising and compassionate penetration of the effects of colonialism and the  fate of the refugee in the gulf between cultures and continents\"", "share": "1"}],"reason":null}
{"year":"2021","category":"peace","laureates":[{"id": "1005", "firstname": "Maria", "surname": "Ressa", "motivation": "\"for their efforts to safeguard freedom of expression, which  is a precondition for democracy and lasting peace\"", "share": "2"}, {"id": "1006", "firstname": "Dmitry", "surname": "Muratov", "motivation": "\"for their efforts to safeguard freedom of expression, which  is a precondition for democracy and lasting peace\"", "share": "2"}],"reason":null}
{"year":"2021","category":"physics","laureates":[{"id": "999", "firstname": "Syukuro", "surname": "Manabe", "motivation": "\"for the physical modelling of Earth’s climate, quantifying variability and reliably predicting global warming\"", "share": "4"}, {"id": "1000", "firstname": "Klaus", "surname": "Hasselmann", "motivation": "\"for the physical modelling of Earth’s climate, quantifying variability and reliably predicting global warming\"", "share": "4"}, {"id": "1001", "firstname": "Giorgio", "surname": "Parisi", "motivation": "\"for the discovery of the interplay of disorder and fluctuations in physical systems from atomic to planetary scales\"", "share": "2"}],"reason":"for groundbreaking contributions to our understanding of complex physical systems"}

I want to print the output in this format:

year category r
"2021" "chemistry" {"id": "1002", "firstname": "Benjamin", "surname": "List", "motivation": ""for the development of asymmetric organocatalysis"", "share": "2"}
"2021" "chemistry" {"id": "1003", "firstname": "David", "surname": "MacMillan", "motivation": ""for the development of asymmetric organocatalysis"", "share": "2"}
"2021" "economics" {"id": "1007", "firstname": "David", "surname": "Card", "motivation": ""for his empirical contributions to labour economics"", "share": "2"}
"2021" "economics" {"id": "1008", "firstname": "Joshua", "surname": "Angrist", "motivation": ""for their methodological contributions to the analysis of causal relationships"", "share": "4"}
"2021" "economics" {"id": "1009", "firstname": "Guido", "surname": "Imbens", "motivation": ""for their methodological contributions to the analysis of causal relationships"", "share": "4"}

I was able to get the output above with the following query:

SELECT tuple->'year' AS year, tuple->'category' AS category,json_array_elements(tuple->'laureates') FROM prizes LIMIT 5;

What I want is to output all values in the table with a query like:

SELECT tuple->'year' AS year, tuple->'category' AS category,json_array_elements(tuple->'laureates') FROM prizes;

When I run this query I get this error:

cannot call json_array_elements on a scalar

I don't understand why this is happening and how I would go about outputting all possible elements.


Since json_array_elements() can only be used with JSON arrays, split cases and UNION ALL:

SELECT tuple->'year' AS year, tuple->'category' AS category, json_array_elements(tuple->'laureates') AS r
FROM   prizes
WHERE  json_typeof(tuple->'laureates') = 'array'
UNION ALL
SELECT tuple->'year' AS year, tuple->'category' AS category, tuple->'laureates'
FROM   prizes
WHERE  json_typeof(tuple->'laureates') IS DISTINCT FROM 'array';

The manual:

json_typeof ( json ) → text
[...]

Returns the type of the top-level JSON value as a text string. Possible types are object, array, string, number, boolean, and null. (The null result should not be confused with an SQL NULL; [...])

Note the use of IS DISTINCT FROM in the second leg. The operator <> would eliminate rows with (tuple->'laureates') IS NULL.

jsonb_path_query() in Postgres 12 or later

With jsonb_path_query() we don't need to split cases (like Edouard suggested). But to include rows without laureate, we need to LEFT JOIN to it:

SELECT p.id, p.tuple->>'year' AS year, p.tuple->>'category' AS category, l.r
FROM   prizes p
LEFT   JOIN LATERAL jsonb_path_query((tuple->'laureates')::jsonb, '$[*]', '{"foo":"bar"}') AS l(r) ON true;

db<>fiddle here

See:

  • What is the difference between LATERAL JOIN and a subquery in PostgreSQL?

If you are using PostGreSQL version 12 or above, then you can use jsonb_path_query instead of json_array_elements like this :

SELECT tuple->>'year'
     , tuple->>'category'
     , jsonb_path_query((tuple->'laureates') :: jsonb, '$[*]')
  FROM prizes

jsonb_path_query is going to break down tuple->'laureates' when it is a json array, but is won't fail when tuple->'laureates' is not a json array.

see the test result in dbfiddle.

see the SQL/JSON Path Language in the manual.