Atom Community Meeting
----------------------
Tim Bray, convener
2 Jun 2004

Not a decision-making meeting, or an authoritative group. Discuss issues, get to know each other. Development of syndication technology has been marred by bad behavior and personal attacks. Let's leave that behind.

Morning is about process, progress, charters, and which organization to go with. Afternoon: the multi-modal post issue, requirements gathering, how to handle entries in isolation (from pubsub).

Paul Hoffman: not a representative of the IETF. The IETF is almost entirely volunteer, with a few employees in the secretariat. This would be an IETF working group in the applications area. First propose a charter to an area director, Scott Hollenbeck in this case. He would be our area director. Ted Hardie is the other director for this area. Working group chairs are appointed by the area director. Working group document authors are appointed by the working group chair. Individual submissions can always be submitted. Working groups provide a review cycle and a place to collect the docs. Drafts from the working group are considered by the IESG to have had more review than most individual submissions. Working groups are inherently open. Scott has asked Paul and Tim to volunteer as working group chairs.

In a good applications area working group, only about 10% of the mailing list membership will post at least once per year. xml2rfc is a wonderful tool, but not required for RFC submission. A working group chair also decides whether contributions are within the charter and manages changes to the charter. A good chair will do this in public.

In the IETF *all* work must be done on the mailing list. Blog or wiki discussions are OK, but they are not working group work, by definition. Face-to-face group meetings can make decisions, but they are not official until posted and discussed on the mailing list. This is good, because it means better review at leisure. The overriding goal is to make decisions by consensus based on technical merit. Decisions can be appealed. Face-to-face voting is usually done by humming, to avoid vote counting and identifying voters.

Please read the Tao of the IETF posted to atom-syntax or here: http://www.ietf.org/tao.html

Editorial workloads seem to be similar in the IETF and W3C, but the tools are "lighter" for IETF docs. W3C doc syntax/format is enforced at a much earlier point in the process.

Eric is a representative of the W3C. Active in the URI, URN, U-R-kidding working groups, including the transition from IETF working group to W3C. Also worked in ISO, Dublin Core. The W3C is a membership organization; it does have paid staff (65 people). Work is done in activities. It strives for good social engineering over process. There is process for bad behavior. It has considerable infrastructure: floor bots, teleconferencing, explicit working group membership, including members and invited experts. There are also horizontal activities: internationalization, accessibility. There are checks and balances to help the various specs complement each other.

Working groups are chartered for a specified time. Membership decides on extensions. Recommend a 3-month pulse on draft publication. Working group members are part of the voting process. Consensus is greatly preferred to voting. There are lots of straw polls, but few votes. After the draft is finished, the membership has a full vote on the recommendation.

Interest groups are different: open membership, but non-voting. Considered to be much larger, open for comment.
Comment can be picked up by the working group at the working group's discretion. Interest groups may be made private according to the charter of the group.

Tim: in the W3C, many of the critical decisions are made in teleconferences or face-to-face. Eric: running a group with no teleconferences or face-to-face meetings could be done by specifying it in the charter; it is a bit foreign to the usual way of working. In many cases, the technical work happens in the interest group, and the voting happens in the working group. Voting is by company, not by member. Invited experts also get votes. Working group membership rules are determined in the charter, along with the expected number of meetings and participation level.

Discussion: Can we do both, IETF and W3C? We would have the overhead of both, but would lose some unique benefits of each. In the past, this has not worked well: marketing departments got involved with "we won" claims during the IETF process, which interfered with the progress of the group. The IETF process is vulnerable to vendors trying to influence the standards through public announcements. In those cases the standards work is better done in another group, like the W3C or OASIS.

It looks like about 3/4 of the people here are not W3C members. What is the chance that there are parallel groups, one at each? The W3C is very interested in this work, but is proposing this to the Atom community, not trying to run the show. WebDAV worked like this: in the IETF process, but with continuing W3C interest. If this is an IETF group, can we depend on W3C involvement in IETF meetings? No promises, but this fits with accessibility and the semantic web.

Atom does belong with the W3C as a web protocol, but there is worry about delays and complications caused by required compatibility with all the W3C architecture. But do we want to get something done quickly? This is a legitimate fear (Eric), but the liaison and coordination needs to get done anyway, now that Atom is above the radar. Frontloading the work will take longer at the beginning, but will get to the end faster. The Director said that the RDF/XML serialization format would not be a requirement; the RDF model might or might not be a requirement.

Going with the IETF, you would lose the additional tools and structures like straw-poll tools. Going with the W3C would lose a lot of people who would not be able to participate fully (interest group members cannot submit drafts). It would also lose security expertise, which mainly lives in the IETF. RFCs are somewhat more widely distributed than W3C recommendations. The risk of going with the IETF is chaos; the risk with the W3C is being run by companies. The W3C process is perceived as closed, so an open group there is perceived as less open, even when the charter is explicitly open. Conspicuous openness is important to this effort. We've already done quite a bit of work going down the IETF path, so a W3C process needs to be more than 10% better to justify a switch.

Why do we need a group at all? Sam: because there are things which are not coming to a close, and we need process to make more progress.

Intellectual property: both organizations are averse to IP-encumbered specs. The IETF has a stated preference for non-discriminatory licensing, but no guarantee of royalty-free licensing. The W3C has a formal process for accepting or rejecting IP-encumbered recommendations; the IETF does not. The open discussion and consensus in the IETF will impede encumbered specs.
In both cases, the only standards that will require licensed IP will be those where it is unavoidable. There is some lingering disagreement about whether there are differences in the IP processes and policies of the two organizations.

The IETF is based on rough consensus and running code. The W3C is restricted by full consensus and can be swayed by a small number of very committed individuals, something that is quite likely in the syndication community. Also, addressing every comment is quite a bit of overhead. The W3C is better equipped to do use cases, test cases, and validation. Translation and outreach are funded in the W3C. Tighter coordination is a feature, not a bug. Staffing helps things move along more quickly.

Difference in produced specs: the W3C controls future versions, e.g. Atom 2.0, and that would be decided on by the membership. In the IETF, Atom 2.0 would be a proposal to the area director. On balance, the working groups within the W3C have a greater affinity with Atom than those in the IETF. Experts on staff are very useful. The W3C has a communication staff which can help market a spec. There is a feeling that what the W3C does is newsworthy. The W3C would probably dedicate 30-40% of a person to support this working group, plus the normal slices from coordinating groups. Need to be sold on the benefits of either to the Atom community.

How would the two groups handle a comment-based denial-of-service attack attempting to bog down the process -- thousands of comments from thousands of people? W3C: if the comments are considered out of scope, then they are out of scope. If it is a form letter, it is a petition and gets one response. If they are separate, they must be replied to. IETF: when this happened last year in the IDN working group, the chair replied individually to see if the poster wanted to continue participating, and a few people did join.

***Lunch***

Technical discussions start.

Issue: multi-modal post. How should Atom support this? Mixed content, text plus one or more photos, or a camera-phone photo plus text. The current draft says nothing about this. Older drafts have a media object, which is implemented in a few tools.

General observations: servers may have constraints on whether the client can specify the name of the URL; on cell phones people may care very much about the number of connections, the latency, and the number of bytes. Cell phones may have large error rates, 10-20% failed transactions. How does the creation tool handle the bundling? This is a compound document.

What is the downside of ignoring the problem? Lose interoperability around specifying URL name portions. More expensive for cell phones. What about posting the sub-portions first, returning placeholders, then rewriting the parent document? Are there servers that have limitations on URLs/names? MT has a separate repository for photo albums, with photos posted there and linked to. This is mostly for gallery creation, but can be used for independent images. As the protocol stands, everything posted goes into the feed, but we don't want that. Blogger: a private method -- give it some supporting data and an image, and it can be placed at an absolute path, returning success/fail. An XML-RPC API. Doing it in a limited fashion and waiting for something better to come along.

Currently think that the simplest way to go is to do one file per post. For the mobile people, multiple postings don't seem to be a serious problem. Considered using HTTP pipelining instead of batching, but POSTs can't be pipelined.
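As a rough sketch of the one-file-per-post flow just described -- post each part, let the server pick its URL, then rewrite and post the parent entry -- something like the following could work. The endpoint URLs, the cid: placeholder scheme, and the use of Location/Content-Location response headers are assumptions made purely for illustration; none of this is defined by the current drafts.

# Hypothetical sketch: post each media part separately, let the server
# choose each part's final URL, then rewrite the entry and post it.
# Endpoints, headers, and the cid: placeholders are illustrative assumptions.
import requests

MEDIA_ENDPOINT = "https://example.org/atom/media"     # assumed
ENTRY_ENDPOINT = "https://example.org/atom/entries"   # assumed

def post_media(path, mime_type):
    """POST one binary part; return the URL the server chose for it."""
    with open(path, "rb") as f:
        resp = requests.post(MEDIA_ENDPOINT, data=f.read(),
                             headers={"Content-Type": mime_type})
    resp.raise_for_status()
    # The server, not the client, decides where the part ends up.
    return resp.headers.get("Content-Location") or resp.headers["Location"]

def post_entry(entry_xml, placeholders):
    """Rewrite placeholder hrefs to server-assigned URLs, then POST the entry."""
    for placeholder, url in placeholders.items():
        entry_xml = entry_xml.replace(placeholder, url)
    return requests.post(ENTRY_ENDPOINT, data=entry_xml.encode("utf-8"),
                         headers={"Content-Type": "application/atom+xml"})

# Camera-phone use case: two photos plus text, three separate transactions.
urls = {"cid:photo1": post_media("photo1.jpg", "image/jpeg"),
        "cid:photo2": post_media("photo2.jpg", "image/jpeg")}
entry = '<entry>...<img src="cid:photo1"/>...<img src="cid:photo2"/>...</entry>'
post_entry(entry, urls)

Each part is a separate HTTP transaction -- more connections, but smaller units of failure on a lossy cell-phone link.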
Solutions so far:
- do nothing; post non-feed objects to a different place or use a non-Atom post, deposit them elsewhere, then post the parent content
- create a new thing in the Atom protocol, using the object element to include non-feed items
- don't syndicate: add an attribute to say "don't put this in the feed" (but then this is not an entry)
- atom:resource, which is like an entry but isn't one
- use MIME multipart
- post and return a Content-Location header in the response

If the server needs to control the URLs, then the links will need rewriting. Is this done on the server? There is content rewriting there. Two issues: non-feed objects and the packaging mechanism.

Tim's use case -- a separate staging server with relative URLs, then publish them to the live site. One transfer or more than one? Multiple transfers can end up with orphaned data. Are the clients going to implement transactional semantics? Failures in large transfers might be handled by the future HTTP PATCH method. Are orphans a problem? A client bug could create multiple copies very fast. Orphans will exist without any help from the protocol. Deleting entries and partial transfers (intentional or accidental) will create orphans. Servers can manage this, and account policies can clean it up.

Who chooses the URI? The client requests a URI, and the server can reply with its own choice. Is the inefficiency of base64 encoding a real problem? Not for anyone in this room.

As a use case, pubsub entries are atomic, without any external files. If there are multiple parts, they must be integral, because the entries don't have a location. Could do this with a MIME package. Atom would need to be able to express the concept of a self-contained multipart document.

Strawman solution: adopt one of the first four above, and have an entry with multiple resource fields. This would not work well for publishing to a site which chooses its own URLs. Using a URL to reference resources works fine when you are connected, but not when you are reading on the train -- disconnected operation. Also can be used for signatures. Use MHTML packaging. Could use the data: URL scheme. This is not quite the same thing as posting a multipart item; it is more like disconnected web browsing. Wofl is a web off-line reading system, for example.

Three possibilities: don't solve it, solve it as part of Atom, or solve it external to Atom with a recommended solution. Similar to HTML authoring: 1) don't over-constrain the problem, let the market solve it, e.g. the staging->live process, 2) xxx

Similar to RSS enclosure, so they can pre-fetch/cache the item. Imagine an ordinary Atom entry with normal, broken HTML containing three hrefs, and an extra element which lists the included hrefs so they can be pre-fetched. For example, two versions of the feed: one with embedded images, one without, or one with only thumbnails. Could be used for archive or import/export. In IETF style, this could be an individually submitted draft to the working group.

MHTML solution: a feed with three entries, one of which has three images. Do that with a single MHTML package, with the Atom feed as the root part, containing all three entries, and the images as separate parts. MHTML as a packaging mechanism does not work with Jabber/XMPP; that would need an XML packaging mechanism. Jabber is pure XML, same for SIP. Does this need to be a core part of Atom? Could it be a companion spec which showed how to use Atom with XML packaging? Leaning towards not addressing multipart feeds in the Atom core.
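For reference, a hedged sketch of the MHTML-style package discussed above, whether or not it ends up in the core. It uses Python's standard email library; the feed XML, Content-IDs, and cid: references are invented for this sketch and are not defined by any Atom draft.

# Sketch of a multipart/related package: the Atom feed is the root part,
# and each image is a separate part referenced from the feed by a cid: URL.
from email.mime.multipart import MIMEMultipart
from email.mime.application import MIMEApplication
from email.mime.image import MIMEImage

feed_xml = """<feed>
  <entry>...<img src="cid:photo1@example.org"/>...</entry>
  <entry>...</entry>
  <entry>...</entry>
</feed>"""

# 'start' names the root part's Content-ID; 'type' names its media type.
package = MIMEMultipart("related", start="<feed@example.org>",
                        type="application/atom+xml")

root = MIMEApplication(feed_xml.encode("utf-8"), "atom+xml")
root.add_header("Content-ID", "<feed@example.org>")
package.attach(root)

with open("photo1.jpg", "rb") as f:
    img = MIMEImage(f.read(), _subtype="jpeg")
img.add_header("Content-ID", "<photo1@example.org>")
package.attach(img)

# The compound document now travels as one file or one request body.
print(package.as_string()[:300])

Because this is MIME rather than XML, the same trick does not carry over to Jabber/XMPP or SIP, which is the objection noted above.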
Now what about a posting which is logically multipart, e.g. text plus two images? What about multiple representations of a single resource, like text and audio of the same news item? How do we submit separate items? PUT works OK, but it forces a specific URL, which doesn't work for a load-shared server farm. POST allows you to avoid that.

Are we re-inventing WebDAV? Blogger started out looking at that. Sam: the features needed for Atom don't really match any of the conformance levels of WebDAV. We would need to work with an expert and maybe define some extensions or profiles. WebDAV has had some interoperability problems, but it is getting better, specifically for files. Is WebDAV as simple as Atom? Would including it make Atom much more complex? Probably not. The protocol portions of it are not that big. That is a possibility we should look at. How do we mix the protocols (WebDAV and Atom)? Do we have different endpoints, options, or always do it the same way? Do it the same way. The great advantage of Atom is that the published entry and the feed are the same format. WebDAV supports reading directories, checking permissions, and sending files, all over HTTP. Would Atom be defined as an extension to WebDAV, with a new profile and some new operations? We need an exact proposal for how we would serve this with WebDAV. In the absence of that, we have workable solutions which are Atom-only, for example the existing coherent proposal for atom:resource.

Need to make a repository of use cases, and match those to the options which satisfy them, as a help in deciding which one is the right choice. Should the use case document be formal or a wiki page? A companion test case document is also useful.

New technical issues: Discoverability -- how do I find all the old edit URIs with a new client tool? Creation info -- return a list of resources that were touched by the post, for example the categories. Entries in isolation -- the canonical form of an entry. Extensions. Is there an abstraction for templates which could be common across tools? "Wouldn't it be nice if Atom did X."

If Atom has a well-thought-out publishing protocol and does what RSS does, then we are successful. There is a need for a marketing angle. RSS is currently winning the buzz war. If there are some things which Atom does beyond RSS, then it has a story. Define a sparse, well-defined core with a set of extensions. Having a good extensibility story would be a plus. Other clear pain points we could address? Should have a must-understand attribute and version numbers tied to attributes, for future extensibility. That is a big win. It allows vendors to safely make local extensions.

It is possible to miss items when fetching sequential top-N feeds. If Atom could address this, providing a way to reliably fetch all entries, it would be a distinctive advantage. Look at work in OAI (Open Archives Initiative), though there are some serious problems with their specific protocols. Also check NNTP. Create a body to review and register extensions. Or use the W3C for that?

Everyone who posts notes, please send a link to your notes to the atom-syntax list.

wunder
--
Walter Underwood
Principal Architect, Verity