I’ve edited a couple of the JSON RFCs, and am working on the design of a fairly complex DSL, so I think I can claim to have dug deeper in the JSON mines than most. We can easily agree on what’s wrong with JSON, and I can’t help wondering if it’d be worth fixing it.

Major irritant: Commas · Hand-editing JSON may not be the most important way of interacting with it, but it shouldn’t be as hard as it is. In particular, when I’m moving things around in a chunk of JSON I can never, as in NEVER, get the commas right.

The fix is easy: Just remove them. They’re inessential to the grammar, just there for JavaScript compatibility.

Alternatively, you could make them optional, or you could allow a comma after the last member of an array or object. But, in Internet protocols, less is more. Just nuking the commas and requiring whitespace separators is the best way forward. Like so:

{
  "IDs": [116 943 234 38793]
  "Settings": { "Aperture": 2.8 "Shutter speed": "1/250" }
}
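
To show how little machinery the comma-free form needs, here is a minimal Python sketch that rewrites whitespace-separated JSON into standard JSON so an ordinary parser can read it. The function name `commaless_to_json` and the tokenizing approach are assumptions made for illustration, not part of any spec or proposal text.

```python
import json
import re

# A token is a JSON string, one structural character, or a bare atom
# (number, true, false, null). Leading whitespace is skipped.
TOKEN = re.compile(r'\s*("(?:[^"\\]|\\.)*"|[{}\[\]:]|[^\s{}\[\]:"]+)')

OPENERS = {"{", "[", ":"}   # after these, the next token never needs a comma
CLOSERS = {"}", "]", ":"}   # before these, no comma is needed either


def commaless_to_json(text: str) -> str:
    """Rewrite whitespace-separated JSON as standard comma-separated JSON."""
    text = text.strip()
    out = []
    prev = None
    pos = 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected input at offset {pos}")
        tok = m.group(1)
        pos = m.end()
        # Insert a comma when the previous token ended a value and this
        # token starts a new member or element.
        if prev is not None and prev not in OPENERS and tok not in CLOSERS:
            out.append(",")
        out.append(tok)
        prev = tok
    return "".join(out)


sample = """
{
  "IDs": [116 943 234 38793]
  "Settings": { "Aperture": 2.8 "Shutter speed": "1/250" }
}
"""
print(json.loads(commaless_to_json(sample)))
```

The sketch leans on the standard parser for all real validation; it only decides where separators belong, which is possible because a value can never be directly followed by another value in the existing grammar.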

Irritant: Timestamps · JSON is chiefly used, these days, in HTTP requests and results. Among the RESTful APIs that I can think of, exactly zero don’t have timestamps.

This one is easy to fix: Just introduce them already. Use the well-established well-debugged RFC 3339 format, insisting on capital T and Z.

Strictly speaking you don’t need any syntax because grammatically a timestamp can’t be a number and a number can’t be a timestamp. However, I think a syntactic signal might be nice: I was thinking of prefixing timestamps with “@”, like so:

{ "Capture Time": @2016-08-01T18:15:00Z }
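
As a rough illustration of how a parser might recognize such a literal, here is a Python sketch. It assumes the literal is “@” followed by an RFC 3339 date-time with a capital T separator and a capital Z suffix; numeric UTC offsets and leap seconds are deliberately left out, and the name `parse_timestamp` is hypothetical.

```python
import re
from datetime import datetime, timezone

# Assumed literal shape: "@" + RFC 3339 date-time, capital "T", capital "Z".
TIMESTAMP = re.compile(
    r"@([0-9]{4})-([0-9]{2})-([0-9]{2})"
    r"T([0-9]{2}):([0-9]{2}):([0-9]{2})(?:\.([0-9]+))?Z"
)


def parse_timestamp(literal: str) -> datetime:
    """Parse an "@"-prefixed timestamp literal into an aware UTC datetime."""
    m = TIMESTAMP.fullmatch(literal)
    if m is None:
        raise ValueError(f"not a timestamp literal: {literal!r}")
    year, month, day, hour, minute, second = (int(g) for g in m.groups()[:6])
    # Optional fractional seconds, truncated to microsecond precision.
    frac = m.group(7) or ""
    micro = int(frac[:6].ljust(6, "0")) if frac else 0
    return datetime(year, month, day, hour, minute, second, micro,
                    tzinfo=timezone.utc)


print(parse_timestamp("@2016-08-01T18:15:00Z").isoformat())
```

A regular expression is used rather than `datetime.strptime` because the point is to reject lowercase “t” and “z” outright.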

Here’s the canonical example of a JSON object, originally due to Doug Crockford, as modified in RFC 7159, and slightly extended to illustrate the cleanups I’m talking about.

{
    "Image": {
        "Width":  800
        "Height": 600
        "Title":  "View from 15th Floor"
        "Thumbnail": {
            "Url":    "http://www.example.com/image/481989943"
            "Height": 125
            "Width":  100
        }
        "Animated": false
        "Capture Time": @2016-08-01T18:15:00Z
        "IDs": [ 116 943 234 38793 ]
    }
}

Major irritant: Schemas · Specifying a JSON DSL is a major pain in the ass. JSON Schema will get you part of the way there, but if you use its modularity features, centered around “anyOf”, it becomes very hard for implementations to generate helpful high-quality error messages. The world has room for something considerably better, and I may be driven to a proposal myself.
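
The error-message problem can be seen even in a toy validator. The Python sketch below is hand-rolled for illustration and is not JSON Schema or any real library; `check_type` and `any_of` are made-up helpers. When every branch of an “anyOf” fails, all the validator has is one list of complaints per branch, with nothing to say which branch the document’s author was aiming for.

```python
def check_type(expected):
    """Return a validator that reports a problem when the type is wrong."""
    def validate(value):
        if isinstance(value, expected):
            return []
        return [f"expected {expected.__name__}, got {type(value).__name__}"]
    return validate


def any_of(*branches):
    """Pass if any branch passes; otherwise report every branch's errors."""
    def validate(value):
        failures = []
        for branch in branches:
            errors = branch(value)
            if not errors:
                return []
            failures.append(errors)
        # No branch matched, and nothing here tells us which one was intended,
        # so the best we can do is dump all of them.
        return [f"branch {i}: {e}"
                for i, errs in enumerate(failures) for e in errs]
    return validate


validator = any_of(check_type(str), check_type(int), check_type(list))
for message in validator({"Width": 800}):
    print(message)
```

With three one-line branches the output is merely noisy; with nested, modular schemas the list of “possible” errors grows quickly and the useful one gets buried.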

But those syntax-irritant fruits around commas and timestamps hang so low that it’d be lovely to grab ’em. I raised the idea in the IETF JSON working group and the consensus seemed to be “It’s not that terrible, there are too many markup-language proposals already, just live with it.” Feaugh.



Contributions

From: Paul Hoffman (Aug 20 2016, at 11:11)

Sounds like you are inventing an input language for something that could be converted to JSON. Or CBOR. :-)

From: Phil Hunt (Aug 20 2016, at 11:28)

Tim

We had a lot of this discussion about JSON in SCIM (RFC 7643, RFC 7644).

The main requirement we had was telling what data parsers should look for. Is this thing a user, a group?

We felt a need to avoid using schema for validation and enforcement. We felt a "robust" approach even in security was key to interop and fit many of the current patterns in many json apis.

Yet many people have a narrow definition of schema based on xsl. This continues to cause confusion as people have to let go.

From: Anonymous (Aug 20 2016, at 11:43)

The reason JSON objects have commas is that JavaScript, like a lot of other languages, doesn't care about endlines or amounts of whitespace.

Removing commas is fine when you provide the endlines and indents you provided, but without them, if it is collapsed as it should be for transmission over the network, it would be difficult for a human to see differentiation between key pairs.

You *don't* have to put whitespace at all between key pairs. You can just use a comma, so it does not save space.

From: Daniel (Aug 20 2016, at 12:01)

JSON is a horrible choice for a data interchange format, there are other, typesafe options out there.

From: Juri Pakaste (Aug 20 2016, at 12:07)

One more thing: comments. Everyone's using JSON for configuration anyway, so it would be nice if the format wasn't actively hostile.

From: Stan (Aug 20 2016, at 12:16)

You have described Rich Hickey's EDN format.

https://github.com/edn-format/edn

From: tony kerz (Aug 20 2016, at 12:35)

irritant: mandatory quotes around keys

irritant: double quotes

i would love to see syntax closer to an actual javascript object:

{
  a: 'foo',
  b: 'bar',
  'c.d': 'baz'
}

From: Mathis (Aug 20 2016, at 12:37)

No thoughts on comments? I think that's one of the benefits of YAML.

From: Michael Manoochehri (Aug 20 2016, at 12:40)

Commas are not a (huge) annoyance when editing JSON in a modern IDE. However, lack of a standard way to include comments in a JSON document is a huge productivity sink.

From: Nick (Aug 20 2016, at 12:50)

You could go a step further and remove the quotes from keys. In your example, only "Capture Time" needs to have quotes surrounding it. Everything else can just be used as-is: image, width, height, etc. don't need quotes around them.

From: katox (Aug 20 2016, at 12:52)

What about adopting Rich Hickey's Transit? It's backwards compatible and it solves the timestamp problem almost the way you suggested (but it is more general).

A bonus is that there are already working implementations for several languages:

https://github.com/cognitect/transit-format

Regarding the comma thing - the Clojure solution is a really nice one too. Commas are just whitespace. It works great!

From: Joe Hildebrand (Aug 20 2016, at 13:08)

To be fair, my response was that you just described hjson (https://hjson.org/) which already does what you want. The developer of hjson wants to standardize it, to the point of having started an Internet-Draft: https://raw.githubusercontent.com/hjson/hjson/master/rfc/hjson.txt

From: Jim deVos (Aug 20 2016, at 13:20)

Hi Tim, interesting article! With respect to commas: I agree they are an annoyance (I often get burned by leaving a trailing comma in a list) but to say it's "just" for javascript compatibility feels (to me) a bit like saying that wood is "just" for keeping the dinghy afloat.

Javascript compatibility is arguably an essential feature of JSON. In taking that away from a new JSON spec, do you fear it would hamstring adoption?

From: Anon (Aug 20 2016, at 13:27)

Hexadecimal numbers.

From: Carlos Vergara (Aug 20 2016, at 13:57)

Hi Tim, great fan, first time commenter here.

At that point, isn't it better to just pass data around in YAML?

From: Domen Kožar (Aug 20 2016, at 14:07)

It's important to mention http://jsonnet.org/ at this point, as it's implemented on top of JSON and gives you the expressive power of Turing completeness.

From: Schmidty (Aug 20 2016, at 14:35)

When I whiteboard javascript I usually use a shorthand that incorporates some python and css concepts. It's indent-aware, has no closing delimiters, and by assuming a particular constructor pattern can elegantly describe mixins. I've long itched to write a transpiler for it but it seems like kind of a big project.

.MasterClass.Element/

(constructorArg) {

this.innerHTML = constructorArg

.Decorator/

kind = "div"

someNames[

"foo"

"bar"

var button = .MasterClass.Decorator(BUTTON_HTML)/

class = "button"

onClick (event) {

// ...

From: Bob Foster (Aug 20 2016, at 14:51)

Please don't "fix" JSON. Most of it is computer-generated. Programs don't get the commas wrong. You can represent any data type as a string. And so on. It's very close to the simplest thing that could possibly work (STTCPW). Invent more things like that.

From: Anton (Aug 20 2016, at 15:15)

Hello,

Just curious: apparently, you have worked on or with many of the contemporary digital formats. Which ones do you like most? I'm asking because I find designing new formats very interesting, so perhaps you could give me/us some advice on where to look for good practices, taste, valuable lessons learned, etc. in a few examples?

Thanks a lot in advance!

From: alisdair (Aug 20 2016, at 15:24)

tighter specifications on types would be nice. the rfcs imply but don't spell out that strings are utf8 and numbers are ieee 754 doubles but most implementations don't enforce either of these constraints

From: Jan Moren (Aug 20 2016, at 16:41)

We use JSON for simulation checkpointing, serializing, simulation parameter sets, and things like that. A major irritation is that the float number format doesn't allow the entire IEEE 754 standard; specifically, NaN and Inf are not allowed values in "standard" JSON. But they frequently do occur in simulation data. We get around it by simply writing our own parser (they're trivial after all), and accept those anyway.

If you're touching commas, make them optional rather than disallowed. You'll break a lot of existing JSON files otherwise.

Timestamps: you can just put them in a string:

"Capture Time": "2016-08-01T18:15:00Z"

No pressing need to add another data type just for such a special case, no? If you're going to add more types, surely there are other, less corner-casely ones to focus on first. Complex numbers come to mind (you have to serialize them as strings now).

From: Edoc (Aug 20 2016, at 17:32)

Douglas Crockford drew a lot of his inspiration for JSON from Rebol, a homoiconic functional style scripting language. Rebol uses blocks to store itself (both code and data, which are the same thing in this language). Rebol does not use commas to separate values in a block-- it might have been better if DC kept that aspect intact.

http://blog.revolucent.net/2009/05/javascript-rebol.html?m=1

Http://www.red-Lang.org

From: David Glasser (Aug 20 2016, at 22:06)

The fact that so far commenters have mentioned at least 6 distinct projects that fix these aspects of JSON, none of which have even a hundredth of the mind share of JSON, seems to make this quest a little quixotic. The best thing about JSON is not anything about its contents but just the fact that every programming environment these days has an easily accessible essentially compatible implementation. (And yes, I realize I'm speaking in the home of somebody who is intimately aware of the limitation of the word "essentially" there.)

From: Rob Speer (Aug 20 2016, at 22:39)

Alisdair: It's not the *strings* that are UTF-8.

JSON is made of Unicode codepoints. When you have bytes representing JSON, you first decode them into Unicode. Hopefully your file format or protocol intends for those bytes to be UTF-8.

So if you wanted to encode JSON in UTF-16 (boooo) you would have null bytes interspersed between all the characters, not just within strings.

From: Nathan (Aug 21 2016, at 01:19)

Looking through the comments, fixing JSON nitpicks does seem to invite quite a bit of bikeshedding. I think you can claim a higher degree of expertise in this area than most, but suggesting breaking JS compatibility because comma syntax is hard is, to put it mildly, a questionable decision given the name of the format. Once you break away from the "a valid JavaScript object is valid JSON, and vice-versa" mold, you aren't describing JSON anymore. You're describing something else entirely.

I humbly suggest that the thing you are iterating toward is, in fact, YAML. I have used XML, JSON, and YAML for a variety of purposes over the years, and I find each has its place. I think rather than trying to turn JSON into YAML, perhaps the conversation needs to be choosing the correct representation for your data.

From: Aleksander Gurin (Aug 21 2016, at 01:42)

A year ago I wrote simple object notation (SON), which is similar and has comments.

https://github.com/aleksandergurin/simple-object-notation

From: Mihailik (Aug 21 2016, at 12:14)

Breaking changes to JSON screw SO MANY other people and systems up, so -10000 points to Gryffindor for this idea straight away.

Now making it easier to copy-paste helps some people, so get +1 point to Gryffindor for that.

Of course making it easier to copy-paste also encourages the very worst instincts in practice (talk about configuration!), so -10 points again.

All in all, -10009 points.

Great! Sounds like an idea a technical committee will definitely accept, and half of browsers adopt for a good measure too! Brilliant, well done Gryffindor. TBray — Malfoys thank you ALOT :-)

From: PeterL (Aug 21 2016, at 14:38)

As I recall, XML was never intended for people to write in. Rather, the intention was that tools would generate XML and XML would help in debugging because it's somewhat human readable. And then people started writing raw XML ... And then abominations like jquery were created ...

Same with JSON. So, this is how I do JSON: https://docs.python.org/3.4/library/json.html (which allows me to spew extra commas into my source, and even lets me leave out quotes in some situations, using dict(kw="value") instead of {"kw": "value"}) ... it doesn't solve the "time" format problem but a slight extension to it could. (For the query problem, something like this helps: http://www.swi-prolog.org/pldoc/man?section=jsonsupport)

I refuse to go anywhere near the schema stuff because that leads to Type Theory and that leads to madness.

From: Walter Underwood (Aug 21 2016, at 17:20)

Geez, the commas. I get them wrong all the time. Python allows a final trailing comma, just allow that.

Yeah, not having date types is a botch, but I can test for those in JSON Schema, so it isn't a disaster, just wrong.

Please, please, please can we have comments in JSON.

From: Jilles van Gurp (Aug 22 2016, at 01:33)

Sounds like these would be good improvements. Why not create a jackson plugin to prototype this? I maintain a small Java library (jsonj) to facilitate working with json that leans heavily on jackson. Over time I've added support for yaml, hocon, plist and bson; mostly through jackson plugins. JSON dialects like hocon in particular try to improve on json by adding e.g. multi-line comments, variable substitution, etc.

IMHO the problem with solutions for schemas in json is that they add a lot of API bloat in the form of namespaced attributes, schema urls, and other bloat. If double digit percentages of your API traffic are meaningless schema urls, something is wrong. The whole point of json is to be minimalistic. I suspect this is a big reason many developers ignore the handful of libraries out there that attempt to support stuff like this. However, I'd love to have some sort of DSL to generate validation logic from.

From: pudge (Aug 22 2016, at 08:39)

I like commas and don't find them hard.

But on the timestamp syntax, I think the @ is irrelevant as you said, but maybe it's reasonable as a marker to humans. But if it is a marker to humans, most humans think of @ as referring to a person. Plus, @ reminds me of Swatch time. *shudder*

From: Joe (Aug 22 2016, at 09:01)

Remove commas? No. ISO8601 timestamps? Yes, but without the proprietary non-standard '@' symbol.

The entire thing about JSON is that there are only a few simple rules. More complex representations require knowledge and marshallers. For instance, there's no definitive means of transmitting a long. Try it with '1' sometime. In every typesafe language and library I've used, that comes out as an int by default.

As such, this entire proposal pretty much comes down to "I like JSON, but need more and am too lazy to write the extra 3 line wrapper to process type 'x'." I'd say no thanks.

Keep It Simple, Stupid.

From: Muhammad Rehan Saeed (Aug 23 2016, at 01:26)

Irritant 3:

Douglas Crockford has no problem with comments in JSON if it's for a config file, so why is the community actively hostile to this idea?

Why do NPM and Bower have dozens of issues with several hundred comments asking for comments to be added? One workaround people have used is the following syntax:

{
  "//": "This is apparently a comment"
}

Why force an ugly workaround to add a comment to a config file, when we have a perfectly workable solution already?

My 2 pence...

From: rektide de la faye (Aug 31 2016, at 10:05)

Yup. http://json5.org has these changes, I believe.

From: PaulTaylor (Sep 15 2016, at 12:17)

JSON is a major irritant for me; really, I cannot see any advantages to it over XML except when writing JavaScript.

XML is powerful and expressive and easy to read, whereas JSON is just horrible.

XML Schema makes it easy to convert from XML to code and vice versa, whereas support for this in JSON is much weaker.

Anyone agree?

From: Matěj Cepl (Oct 24 2016, at 00:14)

a) What Mihailik said. Unstable format is a complete disaster (see falling respect for whatever-is-the-current-version of RSS or Markdown for that matter).

b) However, if anything then comments and optional trailing comma in lists and objects. Please!

August 20, 2016

I am an employee
of Amazon.com, but
the opinions expressed here
are my own, and no other party
necessarily agrees with them.

A full disclosure of my
professional interests is
on the author page.