It Had to be Said: XML vs. JSON
James Clark, in "XML vs. The Web", has finally said what needed to be said -- that XML is a singularly bad format for data transmission. Here is the crux of what Mr. Clark had to say:
It's "Yay", because for important use cases JSON is dramatically better than XML. In particular, JSON shines as a programming language-independent representation of typical programming language data structures. This is an incredibly important use case and it would be hard to overstate how appallingly bad XML is for this. The fundamental problem is the mismatch between programming language data structures and the XML element/attribute data model of elements.
For me, XML is great at describing documents. XML is also (IMHO) reasonable for describing hierarchies of nodes with attributes (like config files) -- although XML is a little wordy for my tastes when used to describe hierarchies. XML, however (and you knew there had to be a however in there), is a painful format for data transmission. XML's overhead is just way too high for the simple task of transmitting common data structures; for example, the simplest XML representation I can think of for x = 1 is:
<i n="x">1</i>
which is a total of 16 Unicode characters (14 plus a CRLF line ending). In JSON (if I understand the format correctly), x = 1 could be expressed as:
x: 1
for a grand total of 6 Unicode characters (including the CRLF line ending). Much better.
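The counts above can be checked with a short Python sketch. It assumes, as the totals imply, that every representation is followed by a two-character CRLF. The variable names are illustrative only, and the third string is the strict JSON form, which needs braces and a quoted key.

```python
# Compare the sizes of the representations of x = 1.
# Assumption: each one is followed by CRLF, which adds two characters.
CRLF = "\r\n"

xml_repr = '<i n="x">1</i>'
short_repr = "x: 1"          # the shorthand used in the post
strict_json_repr = '{"x":1}' # strict JSON: braces and a quoted key

xml_len = len(xml_repr + CRLF)
short_len = len(short_repr + CRLF)
strict_len = len(strict_json_repr + CRLF)

print(xml_len, short_len, strict_len)
```

Even in its strict form the JSON version stays shorter than the XML one, though the gap is smaller than the shorthand suggests.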
(James Clark wrote the XSLT spec and came up with the name XML, so his opinion on XML vs. JSON should probably be listened to.)
(As a Perl programmer, I was hoping for YAML, but I think the lack of a Java YAML parser for several years led to the triumph of JSON.)
I can't say he's saying anything new, but I'm happy someone influential is stating the obvious. Maybe people will be listening.
XML is for documents
JSON is for data
It really is and should be that simple.
I don't disagree with anything you're saying here but I have to question this part.
the simplest XML representation I can think of for x = 1 is:
<i n="x">1</i>
Really? That's the simplest you could think of?
<x>1</x>
Didn't come to mind? Yes, it's more verbose than x: 1 ... but far simpler than you're implying XML requires.
“Finally”? Hasn’t this been being said for, what, a decade or so?
The simplest representation in XML is <x v="1"/>. (It’s a bit longer than <x>1</x>, but it would win for any meaningful key names.)
The simplest representation in JSON is {x:1}. You cannot leave off the braces.
The difference is not huge: 4 characters (5 vs 9). And it would get overwhelmed by the payload if you used longer keys and values than 1 character each.
The problem is not with verbosity of the syntax anyway. Why does everyone doggedly fixate on that? No, the problem is how you go about accessing the data inside a program. With XML, you get a DOM. JSON maps directly onto native data structures. That is the difference.
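A minimal Python sketch of that difference, using only the standard library (`json` and `xml.dom.minidom`). The tiny documents are made up for illustration, and other languages or parsers may behave differently.

```python
import json
from xml.dom import minidom

# JSON: the parsed result is already a native dict holding an int.
data = json.loads('{"x": 1}')
json_value = data["x"]

# XML: the parser returns a DOM tree. The value has to be located by
# walking nodes, and it arrives as text that needs explicit conversion.
doc = minidom.parseString("<x>1</x>")
text = doc.documentElement.firstChild.data
xml_value = int(text)

print(type(data).__name__, json_value, repr(text), xml_value)
```

Both paths end with the integer 1, but only the XML path requires the program to know where in the tree the value lives and what type it should be.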
You are mistaken. The complexity of the YAML spec is what led to the triumph of JSON. I am not sure there is even one fully compliant YAML parser for Perl on the CPAN yet. (I know that less than 2 years ago, there wasn’t.) Have you ever looked at the YAML spec? If not, you’re in for a rude shock.
In contrast, any competent programmer can write a good stab at a JSON parser for the language of his choice in a couple of hours. (Writing a really correct one is trickier than that. However, it is nowhere near as huge a task as a YAML parser.)
It’s a pity, because sure, if we were talking about a subset of YAML that’s roughly equivalent to JSON’s expressive power (YAML Tiny essentially), then I would agree: that is a great idea. But real full YAML is complex beyond sanity.
Thank you for the post, and thanks to those who commented. +5 Interesting!
It's worth mentioning that some JSON parsers (Python's, for example, IIRC) have problems with unquoted keys and values. That is: {x:1} should be {"x":"1"}, though I'm not certain about the "1" value, since it's numeric.
I agree with Aristotle. YAML is nicer to look at, but insane to parse correctly. The fact of the matter is that YAML::Tiny leads the way even though it admittedly doesn't even try to accomplish 100% of the spec. This might have changed recently since Adam Kennedy took hold of the YAML dist.
Sorry for being pedantic, but the simplest representation in JSON is {"x":1}. You can't leave off the quotation marks either, or can you?
Yes, it’s true – you have to quote the key. I forgot.
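For anyone who wants to check the quoting rules discussed here, this is a small sketch using Python's standard `json` module. It only shows how that one parser behaves, and the variable names are illustrative.

```python
import json

# Unquoted key: rejected by Python's strict JSON parser.
try:
    json.loads("{x:1}")
    unquoted_ok = True
except json.JSONDecodeError:
    unquoted_ok = False

# Quoted key, bare number: the value parses as an int.
as_number = json.loads('{"x":1}')

# Quoted key, quoted value: the value parses as a str instead.
as_string = json.loads('{"x":"1"}')

print(unquoted_ok, as_number, as_string)
```

So the key must be quoted, while a numeric value should stay unquoted if it is meant to arrive as a number.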
I left out the "as integer" phrase. If you don't need to specify that the value is an integer, then you can specify x=1 more simply in JSON.
"Finally" is because someone you would hope is an authority to be listened to on the subject has said it. (We have all likely seen this opinion before, but not, IIRC, from someone of Clark's stature in the XML field.)
IMHO, a Java implementation on the level of YAML::Tiny would have gone a long way towards pushing YAML to where JSON is now, but when I was really deep into YAML in 2005-2006 the Java parsers were pre-alpha quality -- you could not expect to do even simple real work with them.