Network Working Group                                            T. Bray
Internet-Draft                                 Textuality Services, Inc.
Expires: August 20, 2010                               February 16, 2010


The Web Socket protocol
draft-bray-thewebsocketprotocol-00.txt

Abstract

... as in Hickson ...

Status of this Memo

This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress.”

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on August 20, 2010.

Copyright Notice

Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents in effect on the date of publication of this document (http://trustee.ietf.org/license-info). Please review these documents carefully, as they describe your rights and restrictions with respect to this document.



Table of Contents

1.  Introduction
    1.1.  Background
    1.2.  Protocol overview
    1.3.  Design philosophy
    1.4.  Security model
    1.5.  Relationship to TCP/IP and HTTP
    1.6.  Establishing a connection
    1.7.  Writing a simple Web Socket server
    1.8.  Subprotocols using the Web Socket protocol
    1.9.  Web Socket and HTTP headers
2.  Conformance requirements
3.  Web Socket URLs
    3.1.  Parsing Web Socket URLs
    3.2.  Constructing Web Socket URLs
4.  Client-side requirements
    4.1.  Web Socket Handshake
        4.1.1.  Establishing the connection
        4.1.2.  Initial To Server
        4.1.3.  Initial From Server
    4.2.  Data Interchange
        4.2.1.  Input Data Framing
        4.2.2.  Output Data Framing
5.  Server-side Requirements
    5.1.  Web Socket Handshake
        5.1.1.  Establishing the connection
        5.1.2.  Initial From Server
        5.1.3.  Initial To Server
    5.2.  Data Interchange
6.  Closing the Connection
7.  Informative References
§  Author's Address





1.  Introduction




1.1.  Background

... as in Hickson ...




1.2.  Protocol overview

... as in Hickson ...




1.3.  Design philosophy

... as in Hickson ...




1.4.  Security model

... as in Hickson ...




1.5.  Relationship to TCP/IP and HTTP

... as in Hickson ...




1.6.  Establishing a connection

... as in Hickson ...




1.7.  Writing a simple Web Socket server

_This section is non-normative._

This section provides an example of a simple use of Web Sockets to provide a feature for a specific site, with a hardcoded handshake that (safely) ignores the client handshake data.

Suppose that the server is willing to accept connections from "wsock.example.com" and run the script "/bin/demo" on that server. It accepts connections on some port and sends the following Web Socket headers to the connecting client:

HTTP/1.1 101 Web Socket Protocol Handshake
Upgrade: WebSocket
Connection: Upgrade
WebSocket-Origin: http://wsock.example.com
WebSocket-Location: ws://wsock.example.com/bin/demo

Note that the order of the WebSocket-* headers is significant; they must appear exactly as shown.

The server can safely ignore the client half of the handshake by reading bytes until it sees the double-CRLF terminating sequence.

NOTE: User agents will drop the connection after the handshake if the origin and URL sent as specified above don't match what the client sent to the server, to protect the server from third-party scripts. This is why the server has to send these strings: to confirm which origins and URLs the server is willing to service.

At this point, the server and client can exchange messages of arbitrary length, each framed by a leading byte whose value is 0x00 and a trailing byte whose value is 0xFF. The bytes between the framing bytes are interpreted as UTF-8 text, and either party can close the connection.
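
The behavior described in this section can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function names, the "hello" greeting, and port 8080 are assumptions made for the example, and the handshake bytes are simply the hardcoded headers shown above.

```python
import socket

# Hardcoded server handshake, exactly as listed in this section.
HANDSHAKE = (
    b"HTTP/1.1 101 Web Socket Protocol Handshake\r\n"
    b"Upgrade: WebSocket\r\n"
    b"Connection: Upgrade\r\n"
    b"WebSocket-Origin: http://wsock.example.com\r\n"
    b"WebSocket-Location: ws://wsock.example.com/bin/demo\r\n"
    b"\r\n"
)


def skip_client_handshake(conn) -> bytes:
    """Discard bytes up to the double CRLF; return any bytes read past it."""
    buffer = b""
    while b"\r\n\r\n" not in buffer:
        chunk = conn.recv(4096)
        if not chunk:
            raise ConnectionError("client closed before finishing its handshake")
        buffer += chunk
    return buffer.split(b"\r\n\r\n", 1)[1]


def greet(conn, text="hello"):
    """Ignore the client handshake, reply, send one text frame, and close."""
    skip_client_handshake(conn)
    conn.sendall(HANDSHAKE)
    # A message is framed by a leading 0x00 byte and a trailing 0xFF byte.
    conn.sendall(b"\x00" + text.encode("utf-8") + b"\xff")
    conn.close()


def serve_forever(port=8080):
    """Accept connections one at a time (the port number is an assumption)."""
    with socket.create_server(("", port)) as listener:
        while True:
            conn, _ = listener.accept()
            greet(conn)
```

A real service would keep the connection open and read frames from the client instead of closing after one greeting; that part is omitted here to keep the sketch short.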




1.8.  Subprotocols using the Web Socket protocol

... as in Hickson ...




1.9.  Web Socket and HTTP headers

The Web Socket protocol relies on the use of headers very similar to those specified for HTTP [RFC2616]: limited character set, CRLF-delimited, double-CRLF terminated. We refer to these as "Web Socket Headers". The differences between them and HTTP headers are as follows:




2.  Conformance requirements

... as in Hickson ...




3.  Web Socket URLs




3.1.  Parsing Web Socket URLs

[Not having studied HTML's treatment of what it calls URLs, I use terminology appropriate for URIs as defined in RFC3986. If the appropriate terms for HTML URLs are different, presumably the mapping is simple.]

A Web Socket URL MUST be Absolute and MUST NOT contain a fragment identifier.

Its scheme after transformation to lower-case MUST be "ws" or "wss"; if the scheme is "ws", its /secure/ property is false, otherwise true. If the port is not provided, it is assumed to be 443 if /secure/ is true, otherwise 80.

The lower-case version of the Host component is its /host/ and the provided or assumed port is its /port/. Its /resource name/ is the combination of the path and query components, separated by a question mark, and with the path replaced by "/" if it is empty.
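
The parsing rules above might be sketched as follows. The function name parse_ws_url is hypothetical, the sketch leans on Python's urllib.parse.urlsplit rather than a hand-written RFC 3986 parser, and it assumes that the question mark is included in the /resource name/ only when a query component is present.

```python
from urllib.parse import urlsplit


def parse_ws_url(url: str):
    """Return (secure, host, port, resource_name) for a Web Socket URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    if scheme not in ("ws", "wss"):
        raise ValueError("scheme must be 'ws' or 'wss'")
    if "#" in url:
        raise ValueError("fragment identifiers are not allowed")
    if not parts.hostname:
        raise ValueError("URL must be absolute and name a host")
    secure = scheme == "wss"
    # Default port: 443 when secure, otherwise 80.
    port = parts.port if parts.port is not None else (443 if secure else 80)
    # Path (or "/" if empty), then "?" and the query if there is one.
    resource_name = parts.path or "/"
    if parts.query:
        resource_name += "?" + parts.query
    # urlsplit already reports the hostname in lower case.
    return secure, parts.hostname, port, resource_name
```

For example, parse_ws_url("WS://Example.COM/chat?x=1") yields a /secure/ of false, a /host/ of "example.com", a /port/ of 80, and a /resource name/ of "/chat?x=1".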




3.2.  Constructing Web Socket URLs

The ingredients are /secure/, /host/, /port/, and /resource name/. In the constructed URI, the scheme is "wss" or "ws" depending on whether or not /secure/ is true. The host is /host/ and the port is /port/; the port need appear in the URI only if it does not match the defaults of 80 for "ws" and 443 for "wss". /resource name/ is used for the path component.
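
As a rough illustration of the construction rule, the following sketch assembles a URL from the four ingredients. The function name build_ws_url is an assumption of the example; it expects a /resource name/ that already begins with "/".

```python
def build_ws_url(secure: bool, host: str, port: int, resource_name: str) -> str:
    """Assemble a Web Socket URL, omitting the port when it is the default."""
    scheme = "wss" if secure else "ws"
    default_port = 443 if secure else 80
    authority = host if port == default_port else f"{host}:{port}"
    return f"{scheme}://{authority}{resource_name}"
```

With the values from Section 1.7, build_ws_url(False, "wsock.example.com", 80, "/bin/demo") gives "ws://wsock.example.com/bin/demo".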




4.  Client-side requirements

... as in Hickson ...




4.1.  Web Socket Handshake

Suppose that a user agent is to establish a Web Socket connection to a host /host/, on port /port/, from an origin /origin/, with a boolean secure setting of /secure/, with a resource name /resource name/, and optionally with a protocol /protocol/.

All of these inputs except for /secure/ MUST be encoded in printable ASCII characters, which may require that the /host/ has previously been punycoded. The /resource name/ and /protocol/ (if provided) MUST NOT be empty. The /resource name/ MUST start with "/" and MUST NOT contain any spaces. /server/ is /host/, followed, if the port number is other than the default (443 when /secure/ is true, 80 when it is false), by ":" and the port number.




4.1.1.  Establishing the connection

... include items 2-4 from Hickson's list but as normal running text...




4.1.2.  Initial To Server

Once a connection is established, the client MUST send the following Web Socket headers to the server. The header values enclosed with "/" are the values for /resource name/, /server/, and so on, as defined above.

GET /resource name/ HTTP/1.1
Upgrade: WebSocket
Connection: Upgrade
Host: /server/
Origin: /origin/
WebSocket-Protocol: /protocol/

Note that the headers MUST be sent in exactly the order shown above, and that the WebSocket-Protocol header MUST be omitted if no /protocol/ is provided.

If the client has any cookies that would be relevant to a resource accessed over HTTP, if /secure/ is false, or HTTPS, if it is true, on host /host/, port /port/, with /resource name/ as the path (and possibly query parameters), then Web Socket headers that would be appropriate for that information MUST be sent at this point. [RFC2616] [RFC2109] [RFC2965] This includes "HttpOnly" cookies (cookies with the http-only-flag set to true); the Web Socket protocol is not considered a non-HTTP API.
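
The client's opening headers could be produced along the following lines. This is a sketch under stated assumptions: the function names are hypothetical, the input checks follow the constraints given in Section 4.1, and cookie headers are left out entirely.

```python
def server_field(host: str, port: int, secure: bool) -> str:
    """Build /server/: the host, plus ":port" when the port is not the default."""
    default_port = 443 if secure else 80
    return host if port == default_port else f"{host}:{port}"


def client_handshake(host, port, origin, secure, resource_name, protocol=None) -> bytes:
    """Return the client's opening Web Socket headers as bytes."""
    fields = [host, origin, resource_name]
    if protocol is not None:
        fields.append(protocol)
    for value in fields:
        # Printable ASCII only; a non-ASCII host must be punycoded beforehand.
        if not (value.isascii() and value.isprintable()):
            raise ValueError(f"not printable ASCII: {value!r}")
    if not resource_name.startswith("/") or " " in resource_name:
        raise ValueError("resource name must start with '/' and contain no spaces")
    if protocol is not None and protocol == "":
        raise ValueError("protocol, if provided, must not be empty")
    # The order of these lines is fixed.
    lines = [
        f"GET {resource_name} HTTP/1.1",
        "Upgrade: WebSocket",
        "Connection: Upgrade",
        f"Host: {server_field(host, port, secure)}",
        f"Origin: {origin}",
    ]
    if protocol is not None:
        lines.append(f"WebSocket-Protocol: {protocol}")
    # Headers are CRLF-delimited and terminated by a double CRLF.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
```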




4.1.3.  Initial From Server

The client attempts to read a Web Socket header sequence from the server. The first line MUST match this:

HTTP/1.1 101 Web Socket Protocol Handshake

An exact match is required, meaning for example that HTTP 1.0 or 1.2 is not acceptable. If the status code (the second space-delimited field) is 407, this indicates that proxy authentication is required [RFC2616] and the client MAY attempt to provide that and restart the process.

Following this, the client reads Web Socket headers up to their double-CRLF terminator. The following MUST be true (Web Socket header names are converted to lower case before the comparisons described):

If there are any Web Socket headers whose name is "set-cookie" or "set-cookie2", these should be handled as defined by the appropriate specification.

If the message received from the server does not meet all these conditions, the client MUST NOT establish a Web Socket connection.

User agents may apply a timeout to this step, failing the Web Socket connection if the server does not send back data in a suitable time period.




4.2.  Data Interchange

Once the Web Socket handshake has completed successfully, the client and server are free to interchange messages.




4.2.1.  Input Data Framing

There are two framing mechanisms which may be distinguished by the value of the first byte received.

If the first byte received has the value 0x00, the message data includes all the bytes between that byte and the first following byte whose value is 0xFF. The data is to be interpreted as Unicode text encoded using UTF-8. UTF-8 encoding errors MUST be handled by replacing the damaged portion of the text with instances of U+FFFD REPLACEMENT CHARACTER.
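
A text frame of this kind might be extracted from a receive buffer as sketched below. The function name and the convention of returning None for an incomplete frame are assumptions of the example; Python's "replace" error handler substitutes U+FFFD for malformed UTF-8 sequences.

```python
def decode_text_frame(buffer: bytes):
    """Return (text, remaining_bytes), or None if the frame is not yet complete."""
    if not buffer or buffer[0] != 0x00:
        raise ValueError("buffer does not start with a 0x00 frame byte")
    end = buffer.find(b"\xff", 1)
    if end == -1:
        return None  # no terminating 0xFF yet; the caller should read more
    # Damaged portions of the text are replaced with U+FFFD.
    text = buffer[1:end].decode("utf-8", errors="replace")
    return text, buffer[end + 1:]
```

The byte 0xFF never occurs in well-formed UTF-8, which is why it can serve as an unambiguous terminator.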

If the first byte has its most significant bit set, it is the first of a series of bytes which encode the length of the frame, this encoding terminated by the first byte which does not have the most significant bit set. The length is computed by clearing the most significant bit of each of these bytes, leaving a value from 0 to 127, and treating the resulting values as base-128 digits, most significant first. Thus, a frame beginning with 0x81 0x82 0x83 0x04 has a length of (1 * 128 * 128) + (2 * 128) + 3, or 16643. The frame length applies only to the data which follows the encoding of the length.
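
The length arithmetic can be checked with a short sketch. The function name is hypothetical, and the sketch follows the worked example above in assuming that the terminating byte contributes nothing to the length; whether that byte is otherwise consumed or belongs to the frame data is not decided here.

```python
def decode_frame_length(data: bytes):
    """Return (length, count), where count is the number of length bytes used."""
    length = 0
    count = 0
    for byte in data:
        if byte & 0x80 == 0:
            # First byte without the most significant bit: the encoding ends.
            return length, count
        # Clear the top bit and shift in the next base-128 digit.
        length = length * 128 + (byte & 0x7F)
        count += 1
    raise ValueError("length encoding is not terminated")
```

Applied to the bytes 0x81 0x82 0x83 0x04, this returns a length of 16643 computed from three length bytes.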

Clients MUST read and ignore frames which begin with a length, and only process those which begin with 0x00.

Framing errors, i.e. the connection closing before end-of-frame, or an end-of-frame not being followed by a valid frame-start byte, are fatal errors and the client MUST fail the connection and MAY report the problem to the user.




4.2.2.  Output Data Framing

To send data on a WebSocket connection, the client MUST send it in frames comprising a leading byte whose value is 0x00, data which comprises Unicode characters encoded in UTF-8, and a trailing byte whose value is 0xFF.
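
Output framing is simple enough to show in a few lines. The function name encode_text_frame is an assumption of this sketch.

```python
def encode_text_frame(text: str) -> bytes:
    """Wrap UTF-8 encoded text in a leading 0x00 and a trailing 0xFF byte."""
    return b"\x00" + text.encode("utf-8") + b"\xff"
```

The result can be written to the connection directly, for example with a socket's sendall method.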




5.  Server-side Requirements




5.1.  Web Socket Handshake




5.1.1.  Establishing the connection

... include items 2-4 from Hickson's text down to definition of /subprotocol/, and add a definition for /url/: A Web Socket URL constructed as defined above in Section 3.2 (Constructing Web Socket URLs) using /host/, /port/, /resource name/, and /secure/.




5.1.2.  Initial From Server

Once the connection is established, the server MUST send the following Web Socket headers to the client. The header values enclosed with "/" are the values for /origin/, /url/, and so on as defined above.

HTTP/1.1 101 Web Socket Protocol Handshake
Upgrade: WebSocket
Connection: Upgrade
WebSocket-Origin: /origin/
WebSocket-Location: /url/
WebSocket-Protocol: /subprotocol/

Note: The WebSocket-Protocol header MUST NOT be included if /subprotocol/ is not provided.
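
A server might assemble these headers as in the following sketch. The function name server_handshake is hypothetical, and the values passed in are assumed to be printable ASCII already.

```python
def server_handshake(origin: str, url: str, subprotocol=None) -> bytes:
    """Return the server's Web Socket headers as bytes."""
    lines = [
        "HTTP/1.1 101 Web Socket Protocol Handshake",
        "Upgrade: WebSocket",
        "Connection: Upgrade",
        f"WebSocket-Origin: {origin}",
        f"WebSocket-Location: {url}",
    ]
    # WebSocket-Protocol appears only when a subprotocol was provided.
    if subprotocol is not None:
        lines.append(f"WebSocket-Protocol: {subprotocol}")
    # Headers are CRLF-delimited and terminated by a double CRLF.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
```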




5.1.3.  Initial To Server

The first CRLF-delimited line of input received by the server MUST be:

GET /resource-name/ HTTP/1.1

Where /resource-name/ MUST be a printable ASCII string containing no space characters that gives the resource name.

Following this, the server reads Web Socket headers up to their double-CRLF terminator. The first headers MUST appear in this order, identified by name: Upgrade, Connection, Host, Origin, WebSocket-Protocol, with the final WebSocket-Protocol being optional.

The value of the Host header gives the hostname that the client intended to use when opening the Web Socket. It would be of interest in particular to virtual hosting environments, where one server might serve multiple hosts, and might therefore want to return different data. The value must be interpreted as UTF-8.

The value of the Origin header gives the scheme, hostname, and port (if it's not the default port for the given scheme) of the page that asked the client to open the Web Socket. It would be of interest if the server's operator had deals with operators of other sites, since the server could then decide how to respond (or indeed, _whether_ to respond) based on which site was requesting a connection. The value must be interpreted as UTF-8.

The value of the optional WebSocket-Protocol header gives the name of a subprotocol that the client is intending to select. It would be of interest if the server supports multiple protocols or protocol versions. The value must be interpreted as UTF-8.

Other fields can be used, such as "Cookie", for authentication purposes. Their semantics are equivalent to the semantics of the HTTP headers with the same names.

If a server reads fields for authentication purposes (such as "Cookie"), or if a server assumes that its clients are authorized on the basis that they can connect (e.g. because they are on an intranet firewalled from the public Internet), then the server should also verify that the client's handshake includes the invariant "Upgrade" and "Connection" parts of the handshake. Otherwise, an attacker could trick a client into sending Web Socket frames to a server (e.g. using |XMLHttpRequest|) and cause the server to perform actions on behalf of the user without the user's consent.

Whether the server does or does not read the client handshake, it must at a minimum read (and optionally discard) bytes until it has read the double-CRLF terminator. Servers may do this before or after sending their handshake, but must do it before reading frames from the client as described in the next section.
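
A server that does choose to inspect the client handshake could proceed roughly as follows. This is only a sketch: the function name and the returned dictionary are assumptions of the example, header names are compared exactly as written above, and headers beyond the first five (such as "Cookie") are not interpreted.

```python
def parse_client_handshake(raw: bytes):
    """Return a dict describing the handshake, or None if it is incomplete."""
    head, terminator, rest = raw.partition(b"\r\n\r\n")
    if not terminator:
        return None  # double CRLF not seen yet; the caller should read more
    # Header values are interpreted as UTF-8.
    lines = head.decode("utf-8", errors="replace").split("\r\n")
    request = lines[0].split(" ")
    if len(request) != 3 or request[0] != "GET" or request[2] != "HTTP/1.1":
        raise ValueError("unexpected first line")
    resource_name = request[1]
    if not (resource_name and resource_name.isascii() and resource_name.isprintable()):
        raise ValueError("resource name must be non-empty printable ASCII")
    headers = []
    for line in lines[1:]:
        name, colon, value = line.partition(":")
        if not colon:
            raise ValueError(f"malformed header line: {line!r}")
        headers.append((name, value.strip()))
    names = [name for name, _ in headers]
    if names[:4] != ["Upgrade", "Connection", "Host", "Origin"]:
        raise ValueError("required headers missing or out of order")
    # Verify the invariant parts of the handshake.
    if headers[0][1] != "WebSocket" or headers[1][1] != "Upgrade":
        raise ValueError("invariant Upgrade/Connection values not present")
    protocol = None
    if len(headers) > 4 and names[4] == "WebSocket-Protocol":
        protocol = headers[4][1]
    return {
        "resource_name": resource_name,
        "host": headers[2][1],
        "origin": headers[3][1],
        "protocol": protocol,
        "rest": rest,  # bytes after the double CRLF, i.e. the start of frame data
    }
```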




5.2.  Data Interchange

Once the Web Socket handshake has completed successfully, the client and server are free to interchange messages.

The rules which the server must follow to interchange data with the client are exactly the same as those that apply to the client, as described above in Section 4.2.1 (Input Data Framing), except for the handling of UTF-8 encoding errors, for which this specification defines no particular server behavior. A server could close the connection, convert invalid byte sequences to U+FFFD REPLACEMENT CHARACTERs, store the data verbatim, or perform application-specific processing. Subprotocols layered on the Web Socket protocol might define specific behavior for servers.




6.  Closing the Connection

... as in Hickson ...




7.  Informative References

[RFC1345] Simonsen, K., “Character Mnemonics and Character Sets,” RFC 1345, June 1992.
[RFC2109] Kristol, D. and L. Montulli, “HTTP State Management Mechanism,” RFC 2109, February 1997.
[RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., Leach, P., and T. Berners-Lee, “Hypertext Transfer Protocol -- HTTP/1.1,” RFC 2616, June 1999.
[RFC2965] Kristol, D. and L. Montulli, “HTTP State Management Mechanism,” RFC 2965, October 2000.



Author's Address

  Tim Bray
  Textuality Services, Inc.
Email:  tbray@textuality.com