I cooked up a RelaxNG schema for Pie/Not-Echo or whatever you want to call it, in its 0.1 snapshot form, which, as a side-effect, generates a W3C XML Schema. This note includes specific commentary on this schema, general commentary on schemas (summary: Why would you ever use XML Schema?), and some recommendations for pruning Pie/Not-Echo.
Pie.rnc v0.1 ·
The schema is available at
as the snapshot versions advance, I’ll try to make sure there are
snapshot schemas under directories named by the version number; so the 0.2
schema will be in
http://www.tbray.org/ongoing/pie/0.2/pie.rnc, and so on.
While I’ve fooled around with RelaxNG, this is my first attempt to take on something substantial from scratch. It’s perfectly possible that I’ve done this in a way that is stupid or wrong or suboptimal; I’d be delighted to get feedback and will incorporate it to the extent possible. I’ve created a discussion page at the Wiki; feedback there, please.
Here is a list of points with reference to the existing schema, in no particular order:
The XSD version doesn’t apply some of the same controls as the RNC
version; I’m not enough of an XSD expert to know whether XSD just
can’t do this stuff, or whether Trang doesn’t know how to generate
the XSD. In particular the XSD doesn’t do the selection magic based on
src= attributes of
I’d welcome feedback on the quality of the XSD as well as the RNC.
I can’t get Trang to generate a DTD, because there are just too many things in the RNC that have no remote equivalent in DTDs.
I changed the namespace, because
the snapshot uses
one based in
example.com, and it’s just not OK to use that
for anything but an example. So for the moment I’m using
This version of the schema forces the top-level
attribute to have the value
here would just be incorrect and dangerous.
I tried to follow the snapshot as closely as possible.
The elements inside
allowed to appear in any order.
I’m not sure this is cost-effective.
Since these things are usually going to be machine-generated, it might
be a good idea to lock down the order of the elements.
It might also be a good idea to force any foreign-namespace elements off into
a ghetto at the end of the parent element.
It would provide another level of sanity-checking and simplify the lives of
those who are doing quickie jobs with regular expressions or whatever.
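To make the trade-off concrete, here is a minimal RelaxNG compact-syntax sketch of the two styles. It is not taken from pie.rnc: the namespace URI and the element names (parent, title, link, summary) are placeholders invented for illustration.

```rnc
# Placeholder namespace, not the real Pie namespace.
default namespace pie = "http://example.org/placeholder"

start = ordered-parent

# Any-order style: the children are interleaved with "&".
unordered-parent =
  element parent {
    element title { text }
    & element link { text }
    & element summary { text }?
  }

# Locked-down style: a fixed sequence with ",", and any
# foreign-namespace elements pushed to the end of the parent.
ordered-parent =
  element parent {
    element title { text },
    element link { text },
    element summary { text }?,
    foreign*
  }

# Any element that is not in the pie namespace, with arbitrary content.
foreign =
  element * - pie:* { (attribute * { text } | text | any-element)* }

any-element =
  element * { (attribute * { text } | text | any-element)* }
```

With the second form, a quick-and-dirty consumer can count on title coming before link, and can stop reading at the first element it doesn’t recognize.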
In <content mode="xml"> (the default), the most common
contents will be XHTML.
So for the moment, there’s a rule that allows any mixture of elements in
the XHTML namespace, with any attributes at all.
This means that you have to have a topmost XHTML element (for example
immediately inside the
This will be useful anyhow because you have to have somewhere
to declare the XHTML namespace. Alternately, if you had declared a prefix
for the XHTML namespace higher-up in the feed, you could just plunge into
mixed XHTML content with all the elements prefixed.
If there’s demand for that scenario it would be easy enough to re-write
But requiring a top-level element feels cleaner anyhow to me.
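A rule of this kind can be sketched in RelaxNG compact syntax roughly as follows. This is an illustration of the idea rather than the text of pie.rnc; the pattern names are made up, and whether one or several topmost XHTML elements are allowed is an assumption.

```rnc
namespace xhtml = "http://www.w3.org/1999/xhtml"

start = content

# Any element in the XHTML namespace, with any attributes,
# containing text and further XHTML elements in any mixture.
xhtml-element =
  element xhtml:* {
    (attribute * { text } | text | xhtml-element)*
  }

# Strict variant: content must contain XHTML elements directly,
# with no loose text outside them.
content =
  element content { attribute mode { "xml" }?, xhtml-element+ }

# Looser variant (unused here): text and prefixed XHTML elements mixed
# directly inside content, for feeds that declare the prefix higher up.
loose-content =
  element content { attribute mode { "xml" }?, (text | xhtml-element)* }
```

Switching between the two variants is a one-line change to the start pattern.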
For this cut, I didn’t put in support for embedding other things like
<ent:topic> found in the example.
This is trivially easy to add later with RelaxNG; let’s get the base
language right first.
I used the
Jing tool to validate
a slightly-modified version of the example in the snapshot (namespace name,
version, and so on).
I’m not planning to post the modified version; anyone who is close
enough to the problem to care is capable of grabbing Jing and fixing it up.
I will also intermittently create a Pie version of the ongoing feed at
the one there right now validates with
The RNC makes use of the XSD predeclared datatypes
dateTime, which are now built-in to Jing.
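For anyone who hasn’t seen datatypes in the compact syntax, a tiny sketch: the xsd prefix is predeclared, so no extra declaration is needed. The element names below are placeholders, not necessarily the ones in pie.rnc.

```rnc
# "xsd:" refers to the W3C XML Schema datatype library by default.
start =
  element stamp-example {
    # Must be a lexically valid dateTime, e.g. 2003-07-01T12:00:00Z
    element modified { xsd:dateTime },
    # Must be a valid URI reference
    element link { xsd:anyURI }
  }
```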
What Needs Fixing in Pie · The elements and attributes that are in the 0.1 snapshot are OK, except there are too many of them. The following need removal forthwith, simply because previous generations of syndication technology got by without them just fine, and we’re not here to invent stuff:
subtitle · Exactly what can we not do if we don’t have this? What prior art demonstrates its necessity?
weblog/homepage · The debate over in the Wiki had, I thought, some
crushing arguments in favor of just having a
per-person; the extras are at best un-necessary and in some cases actively
Why do we ever need more than one
<content> element per
This has never been proved necessary in previous syndication formats, and now
is the wrong time to invent it.
We have the ability to embed XML in the
and XML provides many nice mechanisms for marking-up lists of things, so
anybody who really needs this functionality can work out the bugs in that
sandbox until we know what needs to go in at the Pie level.
<content src= ·
Content-by-reference is a bold new idea, and we don’t need bold new
ideas, we need to write down what already works.
<content> can contain XML, and XML provides
excellent ways to insert hyperlinks to other things.
Work it out there and when you prove that you understand the issues, then
it’s a candidate for first-class citizenship in the syndication format.
...But the Glass is Half-Full · These gripes aside, the Pie format feels reasonably well-baked to me. All we have to do is lose the superfluous bits, find it a name, sort out a pure-HTTP API and derive XML-RPC and SOAP versions from that (let the market sort ’em out), figure out a neutral, long-lived home for the spec, and declare victory.
RelaxNG vs W3C XML Schemas ·
I invite people, even those who don’t think they’re schema weenies
(I for example am definitely not a schema guy) to have a look at
that RelaxNG compact-syntax schema.
It’s readable, it only took me two hours to get it working (that includes
downloading the Jing and Trang software, downloading and installing Java 1.4
from Apple, rebooting, and sorting out the usual
It does some pretty magical things with the allowed content of
<content>, based on attribute values.
It calls out to precooked definitions of dates and URIs, and it generates
XML Schema files for free.
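For the curious, the attribute-driven trick looks roughly like this in compact syntax. This is a simplified sketch of the technique, not a copy of pie.rnc; it assumes only the src and mode attributes mentioned above, and the pattern names are invented.

```rnc
start = content

content =
  element content {
    # By reference: a src attribute, and nothing allowed inline.
    (attribute src { xsd:anyURI }, empty)
    |
    # Inline: mode="xml" or no mode attribute, then child elements.
    (attribute mode { "xml" }?, any-element+)
  }

any-element =
  element * { (attribute * { text } | text | any-element)* }
```

Each branch of the choice pairs an attribute pattern with the content it permits, so a validator accepts a content element only if its attributes and its children match the same branch.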
I’d really like to see a best effort from an XML Schemas maven which duplicates the functionality of the RNC as closely as possible, as readably as possible; and maybe does some more things that the RNC can’t do.
Until I’ve seen that, my provisional conclusion is that XML Schemas are basically second-rate in terms of functionality and usability, and you can get them for free by starting with Relax NG.
So, why would you use anything else?