What are the differences between json and simplejson Python modules?

I have seen many projects using simplejson module instead of json module from the Standard Library. Also, there are many different simplejson modules. Why would use these alternatives, instead of the one in the Standard Library?


Solution 1:

json is simplejson, added to the stdlib. But since json was added in 2.6, simplejson has the advantage of working on more Python versions (2.4+).

simplejson is also updated more frequently than Python, so if you need (or want) the latest version, it's best to use simplejson itself, if possible.

A good practice, in my opinion, is to use one or the other as a fallback.

try:
    import simplejson as json
except ImportError:
    import json

Solution 2:

I have to disagree with the other answers: the built in json library (in Python 2.7) is not necessarily slower than simplejson. It also doesn't have this annoying unicode bug.

Here is a simple benchmark:

import json
import simplejson
from timeit import repeat

NUMBER = 100000
REPEAT = 10

def compare_json_and_simplejson(data):
    """Compare json and simplejson - dumps and loads"""
    compare_json_and_simplejson.data = data
    compare_json_and_simplejson.dump = json.dumps(data)
    assert json.dumps(data) == simplejson.dumps(data)
    result = min(repeat("json.dumps(compare_json_and_simplejson.data)", "from __main__ import json, compare_json_and_simplejson", 
                 repeat = REPEAT, number = NUMBER))
    print "      json dumps {} seconds".format(result)
    result = min(repeat("simplejson.dumps(compare_json_and_simplejson.data)", "from __main__ import simplejson, compare_json_and_simplejson", 
                 repeat = REPEAT, number = NUMBER))
    print "simplejson dumps {} seconds".format(result)
    assert json.loads(compare_json_and_simplejson.dump) == data
    result = min(repeat("json.loads(compare_json_and_simplejson.dump)", "from __main__ import json, compare_json_and_simplejson", 
                 repeat = REPEAT, number = NUMBER))
    print "      json loads {} seconds".format(result)
    result = min(repeat("simplejson.loads(compare_json_and_simplejson.dump)", "from __main__ import simplejson, compare_json_and_simplejson", 
                 repeat = REPEAT, number = NUMBER))
    print "simplejson loads {} seconds".format(result)


print "Complex real world data:" 
COMPLEX_DATA = {'status': 1, 'timestamp': 1362323499.23, 'site_code': 'testing123', 'remote_address': '212.179.220.18', 'input_text': u'ny monday for less than \u20aa123', 'locale_value': 'UK', 'eva_version': 'v1.0.3286', 'message': 'Successful Parse', 'muuid1': '11e2-8414-a5e9e0fd-95a6-12313913cc26', 'api_reply': {"api_reply": {"Money": {"Currency": "ILS", "Amount": "123", "Restriction": "Less"}, "ProcessedText": "ny monday for less than \\u20aa123", "Locations": [{"Index": 0, "Derived From": "Default", "Home": "Default", "Departure": {"Date": "2013-03-04"}, "Next": 10}, {"Arrival": {"Date": "2013-03-04", "Calculated": True}, "Index": 10, "All Airports Code": "NYC", "Airports": "EWR,JFK,LGA,PHL", "Name": "New York City, New York, United States (GID=5128581)", "Latitude": 40.71427, "Country": "US", "Type": "City", "Geoid": 5128581, "Longitude": -74.00597}]}}}
compare_json_and_simplejson(COMPLEX_DATA)
print "\nSimple data:"
SIMPLE_DATA = [1, 2, 3, "asasd", {'a':'b'}]
compare_json_and_simplejson(SIMPLE_DATA)

And the results on my system (Python 2.7.4, Linux 64-bit):

Complex real world data:
json dumps 1.56666707993 seconds
simplejson dumps 2.25638604164 seconds
json loads 2.71256899834 seconds
simplejson loads 1.29233884811 seconds

Simple data:
json dumps 0.370109081268 seconds
simplejson dumps 0.574181079865 seconds
json loads 0.422876119614 seconds
simplejson loads 0.270955085754 seconds

For dumping, json is faster than simplejson. For loading, simplejson is faster.

Since I am currently building a web service, dumps() is more important—and using a standard library is always preferred.

Also, cjson was not updated in the past 4 years, so I wouldn't touch it.