RFC 7493: The I-JSON Message Format

The Olde ASCII is at rfc7493.txt. I’ll put a nicely-formatted HTML version here as soon as I pull a few pieces together. This is really, really simple stuff and should be about as controversy-free as an RFC can be.

Back story · Basically, RFC 7159 is “the JSON RFC”; it describes the existing panoply of JSON specs, and also more-or-less unifies the (small) incompatibilities between them. The history is here, from which I quote: ¶

If you’re interested, I recommend opening up the HTML version and searching forward for the string “interop”. There are 17 occurrences. If you’re generating JSON — something a lot of us do all the time — and make sure you avoid the mistakes highlighted in those 17 places, you’re very unlikely to cause pain or breakage in software that’s receiving it.

I-JSON is just a note saying that if you construct a chunk of JSON and avoid the interop failures described in RFC 7159, you can call it an “I-JSON Message”. If any known JSON implementation creates an I-JSON message and sends it to any other known JSON implementation, the chance of software surprises is vanishingly small.

Which I think is a good thing to do.

Conformance? · The RFC doesn’t actually describe anything called “I-JSON conformance”. It just specifies an object called an “I-JSON Message”; the idea is that anyone writing an Internet-protocol spec can specify that the payload be an I-JSON Message. ¶

For security geeks · JSON is starting to be used a lot in security-related protocols: Crypto, authentication/authorization, and so on. It turns out that the security people worry about Bad People and Government Employees using Stupid JSON Tricks like duplicate keys and carefully-malformed Unicode to attack these protocols. ¶

So if you specify that your payload MUST be an I-JSON message, and the receiver checks that, there’s one particular class of attacks that you no longer have to worry about. Which has to be a good thing.
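Here is a rough sketch of what "the receiver checks that" might look like in Python, using only the standard library. The names `load_i_json` and `NotIJSON` are made up for this illustration and are not from the RFC. It covers the two tricks mentioned above (duplicate keys and unpaired surrogates) plus the UTF-8 requirement, and it does not attempt the RFC's other constraints, such as the ban on Unicode noncharacters.

```python
import json


class NotIJSON(ValueError):
    """Hypothetical error: the text is JSON but breaks an I-JSON constraint."""


def _reject_duplicate_keys(pairs):
    # json.loads passes each object's members to this hook as (key, value)
    # pairs in document order, before duplicates would be collapsed.
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise NotIJSON(f"duplicate object key: {key!r}")
        seen[key] = value
    return seen


def _has_unpaired_surrogate(text):
    # A str holding a lone surrogate cannot be encoded as strict UTF-8.
    try:
        text.encode("utf-8")
    except UnicodeEncodeError:
        return True
    return False


def _check_strings(value):
    # Walk the parsed value and inspect every string, including object keys.
    if isinstance(value, str):
        if _has_unpaired_surrogate(value):
            raise NotIJSON("unpaired surrogate in string")
    elif isinstance(value, list):
        for item in value:
            _check_strings(item)
    elif isinstance(value, dict):
        for key, item in value.items():
            _check_strings(key)
            _check_strings(item)


def load_i_json(raw: bytes):
    # I-JSON messages are UTF-8, so a strict decode rejects malformed bytes.
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise NotIJSON("message is not valid UTF-8") from exc
    value = json.loads(text, object_pairs_hook=_reject_duplicate_keys)
    _check_strings(value)
    return value
```

With this in place, `load_i_json(b'{"a": 1, "a": 2}')` and `load_i_json(b'"\\uDEAD"')` both raise `NotIJSON`, while `load_i_json(b'"\\uD800\\uDEAD"')` succeeds, because the escaped pair decodes to the single character U+102AD.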

Bonus: Protocol pointers! · The RFC has a section “4. Recommendations for Protocol Design” that summarizes a bunch of lessons hard-won over the years about things that make JSON-based protocols work better and more interoperably. ¶

Things like using RFC3339 dates, making your message’s top level a JSON object so you can have a Must-Ignore policy, being careful with numeric precision and range because JavaScript, and so on. I suspect most readers of this space will nod their heads, unsurprised, at each and every one.
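To make a few of those concrete, here is a small Python sketch of what a sender and receiver might do. The helper names (`encode_integer`, `rfc3339_utc`, `read_known_members`) are invented for this example; the RFC gives recommendations, not an API.

```python
from datetime import datetime, timezone

# Integers in [-(2**53)+1, (2**53)-1] survive a round trip through an
# IEEE 754 double, which is what JavaScript uses for every number.
MAX_SAFE_INTEGER = 2**53 - 1


def encode_integer(n: int):
    """Return n unchanged if a double holds it exactly, else as a string."""
    if -MAX_SAFE_INTEGER <= n <= MAX_SAFE_INTEGER:
        return n
    return str(n)


def rfc3339_utc(moment: datetime) -> str:
    """Format a timezone-aware datetime with uppercase T and Z and
    the seconds always present, even when they are "00"."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def read_known_members(message: dict, known: set) -> dict:
    """Must-Ignore: keep the members this receiver understands and
    silently skip the rest, so senders can add members later."""
    return {key: value for key, value in message.items() if key in known}


# Why the range matters: two different integers collapse to one double.
assert float(2**53) == float(2**53 + 1)

payload = {
    "id": encode_integer(2**53 + 1),
    "sent": rfc3339_utc(datetime(2015, 3, 23, 12, 0, tzinfo=timezone.utc)),
    "x-new-feature": True,
}
print(read_known_members(payload, {"id", "sent"}))
# {'id': '9007199254740993', 'sent': '2015-03-23T12:00:00Z'}
```

The Must-Ignore filter only works because the top level is an object: an older receiver can skip a member name it has never seen, which it could not do with an unexpected element in a top-level array.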

Thanks · To the WG and chairs and Area Directors and IESG and RFC Editor; every step of the IETF process improved the initial draft. ¶

Contributions


From: Ivan Sagalaev (Mar 22 2015, at 22:35)

First of all, thanks for doing this! I've posted my suggestions regarding this on my blog: http://softwaremaniacs.org/blog/2015/03/22/json-encoding-problem/en/


From: Graham Hay (Mar 23 2015, at 01:24)

A typo (JOSN instead of JSON) in the section:

"Bonus: Protocol pointers! · The RFC has a section “4. Recommendations for Protocol Design” that summarize a bunch of lessons hard-won over the years about things that make JOSN-based protocols work better and more interoperably."


From: John Cowan (Mar 23 2015, at 06:33)

The main advantage of the \uxxxx escapes (to which Ivan objects) is that when writing JSON by hand, which does happen, you can insert an arbitrary character into it, and you can be sure that you have done so. The string "foo\u2002bar" is much more obviously a seven-character string than "foobar" is.


From: Andrew Cherry (Mar 23 2015, at 07:26)

Interesting and sensible evolution. You also have a JOSN message towards the end, which I couldn't find a spec for... :)


From: Mike Capp (Mar 23 2015, at 07:44)

One interop omission/nitpick:

Section 12 states that JSON is "a subset of JavaScript"; this is almost but not entirely true.

Section 7.3 of ECMA-262 states that "Line Terminators" cannot occur inside a string. Section 7.8.4 repeats this. Two of the terminator characters listed - \u2028 "Line separator" and \u2029 "Paragraph separator" - are not among the characters that JSON requires to be escaped.

For maximum interoperability, including the still-important JSONP use case, these characters SHOULD be escaped too.


From: Paul J. Davis (Mar 23 2015, at 08:16)

One question about duplicate keys, the RFC mentions "identical sequences of Unicode characters", should that be code points instead of characters? Or is there an expectation that JSON generators be able to access a library like ICU to do normalization before string comparison?


From: Kevin Johnson (Mar 23 2015, at 08:34)

Typo in "Bonus" paragraph, JOSN -> JSON.


From: Mark S (Mar 23 2015, at 09:02)

Can you reflect on "Don't Invent XML Languages" that you wrote in 2006?

http://www.tbray.org/ongoing/When/200x/2006/01/08/No-New-XML-Languages

What's your current take on the Big Five (XHTML, DocBook, ODF, UBL, Atom) then? Also what's your take on Markdown/CommonMark?


From: Larry West (Mar 23 2015, at 09:52)

It would be helpful to see a forward link (something like "updated by RFC-7493", though I'm not sure what verb is appropriate here) in RFC-7159.


From: Deron Meranda (Mar 23 2015, at 10:54)

Interoperable JSON is something easily overlooked, so thanks for making this more explicit.

For those interested my "jsonlint" tool (part of my "demjson" python module) should catch all of these potential problems and more, except for the ISO date formats. See http://deron.meranda.us/python/demjson/ or on Github as dmeranda/demjson


From: skagedal (Mar 25 2015, at 08:50)

From section 4.3: '...and that optional trailing seconds be included even when their value is "00".'

You mean the fraction part here, right?


From: Hugh Fisher (Mar 31 2015, at 01:50)

Something that ought to be on the list of protocol recommendations and isn't yet in JSON: allow for numerically typed arrays.

So far we've seen it with SVG, Python, and JavaScript: dynamically typed polymorphic arrays are neat and elegant ... until some graphics programmer or scientist wants to dump 100,000 floating point numbers into one. There are times when the right thing to do is strong typing.

The SVG designers realised that parsing <x>1.0</x><y>0.0</y> over and over would be ridiculously inefficient in both time and memory so created a new notation. Python has the array module and its supercharged numpy add-on for homogeneous arrays of numbers. And JavaScript got typed arrays because WebGL won't work without them.

Sure, typed arrays are maybe not needed right now, but JSON 2?


From: Roger Costello (Apr 29 2015, at 10:17)

RFC7493 says:

"\uDEAD" is invalid because it is an unpaired surrogate, while "\uD800\uDEAD" would be legal.

Would someone explain that please? I thought that "surrogate" is a block of code points that are not (and never will be) assigned a character. \uDEAD is in that block. Yes? If yes, then how can \uDEAD ever be valid in JSON?

It seems like the recommendation can be stated much more simply. Why isn't this the recommendation: When using \uxxxx in JSON, don't specify a hex value for xxxx that doesn't map to a character. Yes?



March 23, 2015

By Tim Bray.
