what is the difference between json and xml [closed]
What is the difference between JSON and XML?
Solution 1:
The fundamental difference, which no other answer seems to have mentioned, is that XML is a markup language (as it actually says in its name), whereas JSON is a way of representing objects (as also noted in its name).
A markup language is a way of adding extra information to free-flowing plain text, e.g
Here is some text.
With XML (using a certain element vocabulary) you can put:
<Document>
<Paragraph Align="Center">
Here <Bold>is</Bold> some text.
</Paragraph>
</Document>
This is what makes markup languages so useful for representing documents.
An object notation like JSON is not as flexible. But this is usually a good thing. When you're representing objects, you simply don't need the extra flexibility. To represent the above example in JSON, you'd actually have to solve some problems manually that XML solves for you.
{
"Paragraphs": [
{
"align": "center",
"content": [
"Here ", {
"style" : "bold",
"content": [ "is" ]
},
" some text."
]
}
]
}
It's not as nice as the XML, and the reason is that we're trying to do markup with an object notation. So we have to invent a way to scatter snippets of plain text around our objects, using "content" arrays that can hold a mixture of strings and nested objects.
On the other hand, if you have typical a hierarchy of objects and you want to represent them in a stream, JSON is better suited to this task than HTML.
{
"firstName": "Homer",
"lastName": "Simpson",
"relatives": [ "Grandpa", "Marge", "The Boy", "Lisa", "I think that's all of them" ]
}
Here's the logically equivalent XML:
<Person>
<FirstName>Homer</FirstName>
<LastName>Simpsons</LastName>
<Relatives>
<Relative>Grandpa</Relative>
<Relative>Marge</Relative>
<Relative>The Boy</Relative>
<Relative>Lisa</Relative>
<Relative>I think that's all of them</Relative>
</Relatives>
</Person>
JSON looks more like the data structures we declare in programming languages. Also it has less redundant repetition of names.
But most importantly of all, it has a defined way of distinguishing between a "record" (items unordered, identified by names) and a "list" (items ordered, identified by position). An object notation is practically useless without such a distinction. And XML has no such distinction! In my XML example <Person>
is a record and <Relatives>
is a list, but they are not identified as such by the syntax.
Instead, XML has "elements" versus "attributes". This looks like the same kind of distinction, but it's not, because attributes can only have string values. They cannot be nested objects. So I couldn't have applied this idea to <Person>
, because I shouldn't have to turn <Relatives>
into a single string.
By using an external schema, or extra user-defined attributes, you can formalise a distinction between lists and records in XML. The advantage of JSON is that the low-level syntax has that distinction built into it, so it's very succinct and universal. This means that JSON is more "self describing" by default, which is an important goal of both formats.
So JSON should be the first choice for object notation, where XML's sweet spot is document markup.
Unfortunately for XML, we already have HTML as the world's number one rich text markup language. An attempt was made to reformulate HTML in terms of XML, but there isn't much advantage in this.
So XML should (in my opinion) have been a pretty limited niche technology, best suited only for inventing your own rich text markup languages if you don't want to use HTML for some reason. The problem was that in 1998 there was still a lot of hype about the Web, and XML became popular due to its superficial resemblance to HTML. It was a strange design choice to try to apply to hierarchical data a syntax actually designed for convenient markup.
Solution 2:
They are both data formats for hierarchical data, so while the syntax is quite different, the structure is similar. Example:
JSON:
{
"persons": [
{
"name": "Ford Prefect",
"gender": "male"
},
{
"name": "Arthur Dent",
"gender": "male"
},
{
"name": "Tricia McMillan",
"gender": "female"
}
]
}
XML:
<persons>
<person>
<name>Ford Prefect</name>
<gender>male</gender>
</person>
<person>
<name>Arthur Dent</name>
<gender>male</gender>
</person>
<person>
<name>Tricia McMillan</name>
<gender>female</gender>
</person>
</persons>
The XML format is more advanced than shown by the example, though. You can for example add attributes to each element, and you can use namespaces to partition elements. There are also standards for defining the format of an XML file, the XPATH language to query XML data, and XSLT for transforming XML into presentation data.
The XML format has been around for some time, so there is a lot of software developed for it. The JSON format is quite new, so there is a lot less support for it.
While XML was developed as an independent data format, JSON was developed specifically for use with Javascript and AJAX, so the format is exactly the same as a Javascript literal object (that is, it's a subset of the Javascript code, as it for example can't contain expressions to determine values).
Solution 3:
The difference between XML and JSON is that XML is a meta-language/markup language and JSON is a lightweight data-interchange. That is, XML syntax is designed specifically to have no inherent semantics. Particular element names don't mean anything until a particular processing application processes them in a particular way. By contrast, JSON syntax has specific semantics built in stuff between {} is an object, stuff between [] is an array, etc.
A JSON parser, therefore, knows exactly what every JSON document means. An XML parser only knows how to separate markup from data. To deal with the meaning of an XML document, you have to write additional code.
To illustrate the point, let me borrow Guffa's example:
{ "persons": [
{
"name": "Ford Prefect",
"gender": "male"
},
{
"name": "Arthur Dent",
"gender": "male"
},
{
"name": "Tricia McMillan",
"gender": "female"
}
]
}
The XML equivalent he gives is not really the same thing since while the JSON example is semantically complete, the XML would require to be interpreted in a particular way to have the same effect. In effect, the JSON is an example uses an established markup language of which the semantics are already known, whereas the XML example creates a brand new markup language without any predefined semantics.
A better XML equivalent would be to define a (fictitious) XJSON language with the same semantics as JSON, but using XML syntax. It might look something like this:
<xjson>
<object>
<name>persons</name>
<value>
<array>
<object>
<value>Ford Prefect</value>
<gender>male</gender>
</object>
<object>
<value>Arthur Dent</value>
<gender>male</gender>
</object>
<object>
<value>Tricia McMillan</value>
<gender>female</gender>
</object>
</array>
</value>
</object>
</xjson>
Once you wrote an XJSON processor, it could do exactly what JSON processor does, for all the types of data that JSON can represent, and you could translate data losslessly between JSON and XJSON.
So, to complain that XML does not have the same semantics as JSON is to miss the point. XML syntax is semantics-free by design. The point is to provide an underlying syntax that can be used to create markup languages with any semantics you want. This makes XML great for making up ad-hoc data and document formats, because you don't have to build parsers for them, you just have to write a processor for them.
But the downside of XML is that the syntax is verbose. For any given markup language you want to create, you can come up with a much more succinct syntax that expresses the particular semantics of your particular language. Thus JSON syntax is much more compact than my hypothetical XJSON above.
If follows that for really widely used data formats, the extra time required to create a unique syntax and write a parser for that syntax is offset by the greater succinctness and more intuitive syntax of the custom markup language. It also follows that it often makes more sense to use JSON, with its established semantics, than to make up lots of XML markup languages for which you then need to implement semantics.
It also follows that it makes sense to prototype certain types of languages and protocols in XML, but, once the language or protocol comes into common use, to think about creating a more compact and expressive custom syntax.
It is interesting, as a side note, that SGML recognized this and provided a mechanism for specifying reduced markup for an SGML document. Thus you could actually write an SGML DTD for JSON syntax that would allow a JSON document to be read by an SGML parser. XML removed this capability, which means that, today, if you want a more compact syntax for a specific markup language, you have to leave XML behind, as JSON does.