ongoing by Tim Bray

Well, well, we now have two freshly-baked HTTP-based Web Resource CRUD protocols which advertise themselves as being RESTful. Microsoft’s new Web3S is designed to support remote update of Live Contacts, which is, and I quote: “the central data store in Windows Live for address book information. All Hotmail contacts, Messenger buddies and Spaces’ friends are recorded in Live Contacts. There are currently approximately 500,000,000 active address books in Live Contacts.” See Yaron Goland’s intro, “APP and Dare, the sitting duck” (read the comments too), then the draft spec, “Web Structured, Schema’d & Searchable (Web3S)”, and its FAQ. There’s a reaction from David Ing, “Not Your Father’s MData”; the comments below might be a good place to aggregate more links. [Update: Yaron Goland has addressed the issues I raised here, FAQ-style, in a comment below.]

History · With all due respect to Yaron, it is really unfortunate that the first time the world heard of Web3S it was in a post of Dare’s aimed at explaining APP’s failings, which unfortunately revealed that he hadn’t read the spec and didn’t understand it. The treatment he got, especially from me, was pretty harsh; but then the post was pretty clueless, and in my experience Dare normally isn’t, so Occam’s razor naturally generated suspicion of what lay behind the content.

And guys, if you’re going to work at Microsoft, you’ll just have to accept that you’re in a world where you’re routinely suspected of the worst. It’s too late to go back and change the history that led here; they teach distrust-of-Microsoft as an MBA-level business basic these days. Learn to deal with it or find a new job.

It might have been nice if the Live team had, you know, actually tried interacting with the other group which was noisily and in public designing a general-purpose REST-based protocol, but that’s water under the bridge.

Elephant in the Room · Um... LDAP? I thought the technology for accessing and maintaining online address-books and directories was kind of well-established and there were lots of perfectly good commercial products that exist to do just this. Why re-invent the world? Anyhow, this deserves an entry in the FAQ.

Web3S · I spent an hour with the spec and the FAQ and, while there are things that make you go “Huh!?” there’s nothing obviously insane. If you decide you totally can’t model your world as collections of entries populated with hyperlinks to express relationships, well then I guess APP’s not for you. And at the level of engineering intuition, I have to say that a monster online address book does feel different at a deep level from most online “publications” (I thought that was why we had LDAP... but I repeat myself).

A Suggestion for the Web3S Team · Get yourself a test suite! APP has already been helped by the existence of code from Joe Gregorio, me, and others. Test suites matter way more than specs, in the big picture.

Design Notes · This is not a comprehensive technical review; I’m too overloaded for a deep-dive right now. Rather, it’s a series of half-cooked questions that went through my mind as I read the spec & FAQ.

LDAP? (See above).
XML and JSON... why not just use JSON? Having two serializations makes implementors’ work twice as difficult, and forces spec-writers into stinky contortions and abstractions like EII and SII; the APP spec gets to talk just about “elements” and “attributes” and is easier to read. I don’t see where, in an LDAP-like protocol, you’d need “document-like” constructs; the data looks mostly like relational data records, so I’d just wrap ’em in JSON, which means you get compulsory UTF-8 too, further simplifying implementors’ lives.
If you’re going to abstract your data store as a big XML doc, mightn’t XQuery be useful? There’s already some talk of XPath.
Damn, those are some butt-ugly URIs. I guess having a straight formal mapping from data model to URI simplifies things, though, and I suspect that people mostly won’t see them anyhow. Addressability is good.
I’m having trouble sorting out this reverse-domain-name to XML-namespace mapping. In the example, you have:
```
<LiveContacts xmlns="Web3SBase:com.live.livecontacts"
              xmlns:Web3S = "Web3S:">
   …
   <Contacts>
```
But then shouldn’t the namespace of the Contacts element be com.live.livecontacts.contacts? Needs more digging.

Also, I suspect the invention of a new URI scheme is really misbegotten. But tons of engineering groups make that particular mistake. You might want to consult Webarch §2.4 on this subject.
In Example 2, they say “the ID is intentionally omitted” but in fact it seems to be there in a slightly variant form, 123A instead of 123ABC. Error? Or just a little more explanation needed?
I really have trouble believing that inventing the new UPDATE HTTP method is a good idea. They want to do a potentially complicated in-place update of their data structure while preserving schema-validity, which is perfectly reasonable. Furthermore, they define a “delta” data type to be used to accomplish this, which seems really architecturally sound.

So why not just use POST? All it does is say “I’m going to change the state of the system in a way that depends on lots more than just the verb.” You’re not going to be able to understand the change anyhow without a full implementation of Web3S semantics, so what value is gained by having the verb be UPDATE instead of POST?

Introducing a new HTTP verb to the Web infrastructure (firewall administrators will just love you) and to the HTTP libraries of C, C++, C#, Perl, Python, Ruby, Java, and all the other languages people use is a big fucking deal, and it would be really nice not to go there.
Order matters on the wire but not on disk? Yow, my brain hurts.
This notion of “annotation” is hard to understand, maybe they should start with some motivating examples of why you’d want this and what you’re trying to accomplish.
Open Issue 3SAFR on white-space... not sure why this is hard. You have to be clear about grammar rules inside your elements, but who cares what if any WS occurs between them?
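
To make the POST-versus-UPDATE question concrete, here is a minimal Python sketch. The URL, the media type `application/x-web3s-delta`, and the delta body are hypothetical placeholders, not taken from the Web3S spec; the only real API used is `urllib.request.Request` from the standard library. Nothing is sent over the network: the sketch just builds the same request twice and checks which parts differ.

```python
from urllib.request import Request

# Hypothetical values for illustration only; none come from the Web3S spec.
URL = "http://example.com/contacts/123ABC"
DELTA_TYPE = "application/x-web3s-delta"   # made-up media type
DELTA_BODY = b"<delta>...</delta>"         # stand-in for a real delta document


def build_request(method: str) -> Request:
    """Build (but do not send) a request carrying the delta payload."""
    return Request(
        URL,
        data=DELTA_BODY,
        headers={"Content-Type": DELTA_TYPE},
        method=method,
    )


def describe(req: Request) -> dict:
    """Summarize the parts of a request a server could dispatch on."""
    return {
        "method": req.get_method(),
        "url": req.full_url,
        # urllib stores header names capitalized as "Content-type".
        "content_type": req.get_header("Content-type"),
        "body": req.data,
    }


post = describe(build_request("POST"))
update = describe(build_request("UPDATE"))

# List the fields whose values differ between the two requests.
differing = sorted(k for k in post if post[k] != update[k])
print(differing)  # prints ['method']
```

Either way, a receiver has to understand the delta payload to know what the request does; the method token is the only thing that changes, and it is also the only part that intermediaries and client libraries may not recognize.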

Conclusion · I’m an APP partisan, but I’m a bigger REST partisan. I’m not close enough to the Live Contacts problem to have a useful technical opinion on how to solve it, or whether it needs something new as opposed to something that’s already here. But if they need to invent a Web interface I think it’s smart to make it RESTful.

Good luck to them; they’ll need it, and a test suite too.

Contributions

From: Aristotle Pagaltzis (Jun 18 2007, at 04:32)

Or at least: why not rekindle PATCH? (If nothing else, that’s much less vaguely named; “UPDATE” could refer to a number of distinct operation concepts.)

From: Sam Ruby (Jun 18 2007, at 05:17)

My take: http://intertwingly.net/blog/2007/06/18/Web3S

From: David Smith (Jun 18 2007, at 08:32)

Posts like this keep me coming back - thanks.

You don't have to be part of an MBA program to have a little Microsoft distrust - a few decades of close relationships will do the same thing. Respect, yes, but distrust as well.

From: Robert Sayre (Jun 18 2007, at 09:13)

Two points:

1.) JSON can be encoded using UTF-8, UTF-16, or UTF-32. See the RFC. Everyone uses UTF-8 or ASCII, though. ASCII? Yes, most serious JSON encoders end up getting patched with a mode that escapes all non-ASCII.

2.) Most of the novel things Web3S tries to do were either formally rejected by the Atom working group or filibustered by a few of its louder members. So, I'm not sure the "why didn't you come talk to us" pleas can be taken very seriously.
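
Point 1 can be illustrated with Python's standard `json` module, used here only as one example of an encoder; the contact record is made up. By default `json.dumps` escapes every non-ASCII character, while `ensure_ascii=False` keeps the characters and leaves the choice of byte encoding to the caller.

```python
import json

# Made-up record for illustration.
contact = {"name": "Zoë"}

# Default mode: non-ASCII characters are escaped, so the text is pure ASCII.
ascii_text = json.dumps(contact)

# With ensure_ascii=False the characters survive as-is, and the byte
# encoding becomes a separate decision made when the text is encoded.
raw_text = json.dumps(contact, ensure_ascii=False)
utf8_bytes = raw_text.encode("utf-8")
utf16_bytes = raw_text.encode("utf-16")

print(ascii_text)            # the escaped, ASCII-only form
print(ascii_text.isascii())  # True
print(raw_text.isascii())    # False

# json.loads accepts bytes and detects UTF-8, UTF-16, or UTF-32 itself.
print(json.loads(utf8_bytes) == json.loads(utf16_bytes) == contact)
```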

From: Yaron Y. Goland (Jun 26 2007, at 16:54)

Your questions made for great FAQ entries so I wrote them up that way.

Why not use LDAP? – http://dev.live.com/livedata/web3sFAQ.htm#WhyNoLDAP

Why not just mandate JSON? – http://dev.live.com/livedata/web3sFAQ.htm#XMLJSON

Why not just use XQUERY? - http://dev.live.com/livedata/web3sFAQ.htm#XQUERYSupport

UGLY URLS – http://dev.live.com/livedata/web3sFAQ.htm#LongURLs

Question about reverse domain to XML-namespace mapping – You got it exactly right, it’s com.live.livecontacts.Contacts.

Question about inventing yet another URI scheme – I have to admit that I’m not that worried about new schemes that are not dereferenceable. But still we could use tag. Although tag:ietf.org,2007:web3sBase:com.microsoft.foobar looks a little long-winded, I suppose it’s not too bad. Obviously can’t use ietf.org before we are an RFC (assuming we go to the IETF and assuming we become an RFC). So I suppose that means we have to use tag:live.com,2007:web3sBase:com.microsoft.foobar. That will make transitioning to the standard later a real pain. Whereas if we invent Web3sBase:com.microsoft.foobar the situation is easier.

The real question is – what’s the benefit that we and the rest of the universe get from using tag over inventing Web3sBase?

Example 2 – 123A is the ID for the telephone entry. That is distinct from the ID for the contact which is 123ABC. The 123ABC ID was specified in the request-URI. So no bug. It is confusing but that is mildly intentional as I want to hammer home the point that IDs are only unique within their local context. So, for example, you will see a lot of <Web3S:ID>1</Web3S:ID> in our data.

New UPDATE Method – http://dev.live.com/livedata/web3sFAQ.htm#WhyUPDATENotPOST

Ordered in message but not in DB – ATOM does exactly the same thing. Section 4.1.1 of RFC 4287 explicitly states that by default there is no semantic meaning to the order of entries in a feed. But section 10 of APP explicitly states that ATOM feeds retrieved via the protocol should be returned with a specific serialization order, in this case, the order specified by their “app:edited” property. The distinction is that the data model is by default unordered but the serialization must, by definition, have an ordering, and that ordering could have semantic relevance. Web3S is no different. Data stored on the server is unordered. But clients can request that a serialized representation of that data have an order. For example, a client can submit a sort value that it would like its serialization to be sorted by.
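
A small Python sketch of that split between data model and serialization. The store, the field names, and the `serialize` helper are hypothetical and do not reflect the Web3S or APP wire formats; the point is only that the stored collection carries no meaningful order, while each serialization picks one from a client-supplied sort key.

```python
# Hypothetical in-memory store keyed by locally unique ID. The model
# attaches no meaning to which contact happens to come "first".
contacts = {
    "2": {"name": "Carol", "edited": "2007-06-18"},
    "1": {"name": "Alice", "edited": "2007-06-26"},
    "3": {"name": "Bob", "edited": "2007-06-17"},
}


def serialize(store, sort_by, descending=False):
    """Return the stored entries as a list ordered by the requested field."""
    return sorted(
        store.values(),
        key=lambda entry: entry[sort_by],
        reverse=descending,
    )


# Two clients ask for two different orderings of the same unordered data.
by_name = [c["name"] for c in serialize(contacts, "name")]
by_edit = [c["name"] for c in serialize(contacts, "edited", descending=True)]

print(by_name)  # ['Alice', 'Bob', 'Carol']
print(by_edit)  # ['Alice', 'Carol', 'Bob']
```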

Annotations are hard to understand – I think that’s a good suggestion. I have left myself a ToDo to try and rework the annotation discussion around an example.

Issue 3SAFR & Whitespace – The problem is that we really want to let people ‘pretty print’ their Web3S data structures but this can introduce extraneous whitespace. And then we have the problem of how to treat white space in annotations which might theoretically have different rules than Web3S infoset content. My guess is that I just don’t really understand whitespace handling in XML and need to do more research.

From: Julian Reschke (Jun 27 2007, at 00:28)

Yaron,

wrt choosing a URI: you may want to consider a URN in the "IETF" namespace, see RFC2648 (and RFC4791 for an example).

..."tag" is only one of many alternatives.

In general, I really would avoid minting new URI schemes as long as existing ones work; your use case seems to be totally covered by things that already exist.

Best regards, Julian

From: Yaron Y. Goland (Jul 08 2007, at 16:27)

The IETF URN solution doesn't work because we aren't an RFC and won't be any time soon and I have no desire to see a huge name translation when we become an RFC. Besides, this brings up interesting issues about what happens when an RFC gets updated and thus generates a new RFC number. I believe it's fine to have URNs point to RFCs but I think it's an abuse of the mechanism to use those URNs to point to protocol elements.

Using URNs in general is an interesting question. I personally tend to see URNs as being a location independent lookup into a location specific mechanism. E.g. I personally only see URNs as being useful when they can be translated into a URL. It is certainly legal to have a URN that can't be translated into a URL but as RFC 1737 says "It is strongly recommended that there be a mapping between the names generated by each naming authority and URLs." This resolvability is explicitly a non-goal in our case which I believe argues that URNs are in fact the wrong solution to use.

From: Julian Reschke (Jul 09 2007, at 01:20)

Yaron,

the IETF URN scheme doesn't require the RFC # to appear in the URN; for instance, CalDAV uses "urn:ietf:params:xml:ns:caldav".

WRT URN schemes being recommended to be resolvable... You are quoting a spec from back when people liked the distinction between URLs and URNs. The thing you currently have already is a URN, so the resolvability requirement would apply to it as well.

In general, I see little point in trying to mint namespace names that cannot be resolved. It's certainly against W3C's recommended practice.

Best regards, Julian

June 17, 2007

By Tim Bray.

The opinions expressed here
are my own, and no other party
necessarily agrees with them.
