How to prettyprint a JSON file?

I have a JSON file that is a mess that I want to prettyprint. What's the easiest way to do this in Python?

I know PrettyPrint takes an "object", which I think can be a file, but I don't know how to pass a file in. Just using the filename doesn't work.


Solution 1:

The json module already implements some basic pretty printing in the dump and dumps functions, with the indent parameter that specifies how many spaces to indent by:

>>> import json
>>>
>>> your_json = '["foo", {"bar":["baz", null, 1.0, 2]}]'
>>> parsed = json.loads(your_json)
>>> print(json.dumps(parsed, indent=4, sort_keys=True))
[
    "foo", 
    {
        "bar": [
            "baz", 
            null, 
            1.0, 
            2
        ]
    }
]

To parse a file, use json.load():

with open('filename.txt', 'r') as handle:
    parsed = json.load(handle)

Solution 2:

You can do this on the command line:

python3 -m json.tool some.json

(as already mentioned in the commentaries to the question, thanks to @Kai Petzke for the python3 suggestion).

Actually python is not my favourite tool as far as json processing on the command line is concerned. For simple pretty printing is ok, but if you want to manipulate the json it can become overcomplicated. You'd soon need to write a separate script-file, you could end up with maps whose keys are u"some-key" (python unicode), which makes selecting fields more difficult and doesn't really go in the direction of pretty-printing.

You can also use jq:

jq . some.json

and you get colors as a bonus (and way easier extendability).

Addendum: There is some confusion in the comments about using jq to process large JSON files on the one hand, and having a very large jq program on the other. For pretty-printing a file consisting of a single large JSON entity, the practical limitation is RAM. For pretty-printing a 2GB file consisting of a single array of real-world data, the "maximum resident set size" required for pretty-printing was 5GB (whether using jq 1.5 or 1.6). Note also that jq can be used from within python after pip install jq.

Solution 3:

You could use the built-in module pprint (https://docs.python.org/3.9/library/pprint.html).

How you can read the file with json data and print it out.

import json
import pprint

json_data = None
with open('file_name.txt', 'r') as f:
    data = f.read()
    json_data = json.loads(data)

print(json_data)
{"firstName": "John", "lastName": "Smith", "isAlive": "true", "age": 27, "address": {"streetAddress": "21 2nd Street", "city": "New York", "state": "NY", "postalCode": "10021-3100"}, 'children': []}

pprint.pprint(json_data)
{'address': {'city': 'New York',
             'postalCode': '10021-3100',
             'state': 'NY',
             'streetAddress': '21 2nd Street'},
 'age': 27,
 'children': [],
 'firstName': 'John',
 'isAlive': True,
 'lastName': 'Smith'}

The output is not a valid json, because pprint use single quotes and json specification require double quotes.

If you want to rewrite the pretty print formated json to a file, you have to use pprint.pformat.

pretty_print_json = pprint.pformat(json_data).replace("'", '"')

with open('file_name.json', 'w') as f:
    f.write(pretty_print_json)

Solution 4:

Pygmentize + Python json.tool = Pretty Print with Syntax Highlighting

Pygmentize is a killer tool. See this.

I combine python json.tool with pygmentize

echo '{"foo": "bar"}' | python -m json.tool | pygmentize -l json

See the link above for pygmentize installation instruction.

A demo of this is in the image below:

demo