JSON Lesson

I just learned (maybe everyone else already knew) that it’s legal to have duplicate keys in JSON text. But please don’t.

There are three definitions of JSON. The lovely graphical one at json.org, the less lovely monospaced ASCII in RFC 4627, and Section 15.12 of ECMAScript 5.1.

json.org says nothing about duplicate keys. RFC 4627 (section 2.2) says “The names within an object SHOULD be unique.” ECMAScript (15.12.2) says “NOTE In the case where there are duplicate name Strings within an object, lexically preceding values for the same key shall be overwritten.”

So I guess this is legal:

{ "key":"a345", "key":"b678" }

Gag. Blecch. Puke. It seems that different software implementations do different things; I guess the last-dupe-wins ECMAScript behavior is the most common, but one hears of JSON parsers that blow up when encountering dupes.
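The last-dupe-wins behavior is easy to check in any ECMAScript engine. Here's a minimal sketch that uses only the built-in `JSON.parse`; the variable names are just for illustration.

```javascript
// The duplicate-key text from above, fed to the built-in parser.
const text = '{ "key":"a345", "key":"b678" }';
const parsed = JSON.parse(text);

// The lexically later value overwrites the earlier one, and no error is raised.
console.log(parsed.key);                 // b678
console.log(Object.keys(parsed).length); // 1
```

Note that nothing in the result tells you a duplicate was ever there.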

Like many others, I’ve been using JSON for nearly all my new Net-protocol work for years. I haven’t been bitten by this, but I gather some have.

Anyhow, I think most reasonable people will agree: Encountering duplicate keys in incoming JSON is evidence of insanity on the other end of the line. There’s serious consideration, over in the IETF, of maybe redoing the RFC to say this in a stern voice (plus a few other little clean-ups). I’m not sure whether that’s worth doing, but probably more people need to be aware of this little duplicate-keys gotcha.
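If you want to notice dupes on the receiving end, `JSON.parse` alone won't do it, since it quietly keeps the last value. Here's a rough sketch of a checker. `findDuplicateKeys` is a hypothetical helper, not part of any library, and it assumes it's fine to let `JSON.parse` reject malformed input first so that the scan itself can stay simple.

```javascript
// Hypothetical helper: returns the names of keys that repeat within any one object.
function findDuplicateKeys(text) {
  JSON.parse(text); // throws on malformed input, so the scan below can assume valid JSON

  const dupes = [];
  const stack = []; // one entry per open container: a Set of keys for {...}, null for [...]
  const stringRe = /"(?:[^"\\]|\\.)*"/y; // a JSON string literal, matched at lastIndex
  let i = 0;

  while (i < text.length) {
    const ch = text[i];
    if (ch === "{") { stack.push(new Set()); i++; }
    else if (ch === "[") { stack.push(null); i++; }
    else if (ch === "}" || ch === "]") { stack.pop(); i++; }
    else if (ch === '"') {
      stringRe.lastIndex = i;
      const raw = stringRe.exec(text)[0];
      i += raw.length;

      // In valid JSON, a string followed by ":" is always an object key.
      let j = i;
      while (j < text.length && /\s/.test(text[j])) j++;
      if (text[j] === ":") {
        const key = JSON.parse(raw); // decode escapes so "a" and "\u0061" compare equal
        const seen = stack[stack.length - 1];
        if (seen.has(key)) dupes.push(key);
        else seen.add(key);
      }
    } else {
      i++; // numbers, literals, commas, colons, whitespace
    }
  }
  return dupes;
}

console.log(findDuplicateKeys('{ "key":"a345", "key":"b678" }')); // reports "key"
```

Whether to blow up, log, or shrug when the list comes back non-empty is a policy call for the receiver.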

Contributions

From: peter keane (Feb 21 2013, at 11:19)

Might be worth thinking about parallels w/ query parameters. Duplicate query parameters are OK:

example.com?sort=by_title&sort=by_author
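To make the parallel concrete, here is a small sketch using the standard `URLSearchParams` API, which keeps every value for a repeated name. The variable name is just for illustration.

```javascript
// The query string from the example above.
const params = new URLSearchParams("sort=by_title&sort=by_author");

// getAll returns every value for the name, in order.
console.log(params.getAll("sort")); // ["by_title", "by_author"]

// get returns only the first one.
console.log(params.get("sort"));    // by_title
```

So the query-string API leaves the choice to the caller, whereas `JSON.parse` makes it for you.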

One end of the spectrum sees JSON as simplified XML; the other end sees it as fancy application/x-www-form-urlencoded.

From: Mike Capp (Feb 21 2013, at 13:11)

For such a simple spec, JSON is surprisingly warty in practice. You've got the RFC/ECMA disagreement over whether or not non-collection values can be at the top level; you've got the optional and seemingly-pointless character escape for solidus which turns out to be really quite important for JSONP; you've got those pesky LINE and PARAGRAPH SEPARATORs that prevent all valid JSON being valid JavaScript; and you've got this.
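As a rough illustration of the solidus and separator warts, here is a sketch of one way to post-process serialized JSON before dropping it into a script. `toScriptSafeJson` is a hypothetical helper name, and the assumption is that the output may be read by JavaScript engines that predate ES2019, which rejected raw U+2028 and U+2029 inside string literals.

```javascript
// Hypothetical helper: serialize a value so the text is safe to embed in a script.
function toScriptSafeJson(value) {
  return JSON.stringify(value)
    .replace(/\u2028/g, "\\u2028") // LINE SEPARATOR, emitted as an escape sequence
    .replace(/\u2029/g, "\\u2029") // PARAGRAPH SEPARATOR, likewise
    .replace(/<\//g, "<\\/");      // use the optional solidus escape so "</script>" never appears
}

const out = toScriptSafeJson({ html: "</script>", text: "line\u2028break" });
console.log(out);
console.log(JSON.parse(out).html); // </script>
```

The result is still valid JSON, so a JSON parser reads back exactly the original values.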

Maybe it was to allow indefinitely long streams of JSON, where keeping track of every key used so far would be prohibitively expensive. But I struggle to see a use case for that which couldn't be handled just as well another way.

From: Peter Morlion (Feb 22 2013, at 02:00)

It might be interesting to implement this the way HTML was set up: "Be strict in what you send, but generous in what you receive".

So don't construct your JSON with duplicate keys. I share your distaste for this. But libraries might as well accept the fact that it could happen and not crash on it.

From: Bhasker Kode (Feb 25 2013, at 21:12)

It's definitely an interesting thought.

My first thought was that this would be fairly common in things like XML-to-JSON parsers, CSV-to-JSON parsers, list-to-JSON converters, etc., where, although we wouldn't want to encounter something like this, it's easily foreseeable.

Ideally duplicates should not be tolerated. But through an example it might help show that perhaps the developer needs to decide what's best.

Here's an example where duplicates would create confusion, but at the same time - the developer decides what course of action to take.

Consider a hypothetical word count / map-reduce example, where an object is of the form:

<pre>
foo 1
bar 1
foo 1
</pre>

I say hypothetical, because I think you'd agree that the example above should have been a list instead of an object.

A map-reduce over the above should give

<pre>
foo 2
bar 1
</pre>

It's down to you to take care of integrity.

Something along the lines of:

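<pre>
// First time we see this word: start its tally at zero.
if (!count[word]) {
  count[word] = 0;
}
// Bump the tally for this word, not the whole table.
count[word]++;
</pre>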

There could easily be another case where a variant logic is applied for duplicate terms. So to re-iterate, the developer should be able to handle such cases - because like you said - more often these occur in conditions where you're converting some not-so-structured input, into a more structured output.
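Here is a small sketch of the word-count case, assuming the input is a list of pairs rather than an object, so that no duplicate is lost before the developer gets a say. The names `pairs` and `counts` are made up for illustration.

```javascript
// The example input as a list of [word, count] pairs.
const pairs = [["foo", 1], ["bar", 1], ["foo", 1]];

// The "reduce" step: repeated words are summed rather than overwritten.
const counts = new Map();
for (const [word, n] of pairs) {
  counts.set(word, (counts.get(word) || 0) + n);
}

console.log(counts.get("foo")); // 2
console.log(counts.get("bar")); // 1
```

A different policy, such as keeping the first value or rejecting the input, would only change the body of the loop.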

I still have mixed feelings about this though. Thanks for bringing this up.

From: Bob Foster (Mar 05 2013, at 15:20)

I consider myself a reasonable person and I don't agree that duplicate keys in a JSON object reflect on the mental health of the sender. Duplicate keys are conformant with existing standards. In ECMAScript they have a well-defined behavior; in the json.org description they are simply allowed without comment, and in RFC 4627 they are mildly discouraged but not prohibited. Why did Douglas Crockford write SHOULD instead of MUST? Someone SHOULD ask him.

Here's another perspective: duplicate keys are harmless as long as both sender and receiver agree on what they mean. If some pair of senders and receivers can agree on a useful meaning they are free to do so. For example, a reader might interpret the object as a multi-map. Perhaps a reader translates the object to XML in a simple-minded way, by mapping keys to element names and their values to contents.

In any case, the cat is already out of the bag. It's good you highlighted this point: there is a need for senders and receivers to agree what duplicate keys mean or to agree not to use them. That's all.

February 21, 2013

By Tim Bray.
