@bluesky Identity

Twitter announced Project @bluesky back in December 2019. I blogged about it supportively, then reached out to say I was interested, and was invited to join the conversation; thanks! Several of us offered proposals; this is part of mine, concerned with how identity might work in a world of diverse federated social networks.

Goal · On the Internet, there are many entities that provide online conversations, whether short-form like Twitter or bulletin-board-esque like Reddit. Then there are a nearly infinite number of specialized communities, for photographers, dog groomers, and the owners of particular types of boats or cameras. Let’s call these entities “Providers”.

@bluesky envisions allowing online conversations to span Providers. Which is to say, from inside Twitter I could follow not only other Twitter accounts, but posts on my boat-owners’ forum. And vice versa. This is a straightforward and easy-to-understand — if not necessarily easy to build — vision, and might be worth doing by itself.

In this simple vision, there’s no linkage between Twitter Tim (“Canadian Web geek with a camera”) and boat-forum Tim (“Jeanneau NC 795 tied up in Vancouver”). For a lot of people that’s probably OK or even desirable. But in some cases, people would like to take their identity, and perhaps their reputation, with them. If you’re a big hot-shot on Parler, you’d maybe like people on Twitter and your horse-dressage conversations to know that you’ve got serious alt-right credentials (shudder).

This is a proposal for a higher-level form of cross-Provider identity.

Provider Identity · Providers who participate in @bluesky have users with identities; they control user access and behavior. Let’s call those “Provider Identities”, PIDs for short. It’s easy to imagine a syntax to express this: I would be twitter.com@timbray. This would open the door to using OAuth 2.0-based techniques such as OpenID Connect (OIDC) ID Tokens for Providers’ identity assertions, which would have the advantage of excellent library support on most programming platforms. For those who care about decentralized identity, it can be represented in the OIDC framework.
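
Here is a minimal sketch of handling the provider@account syntax suggested above. The names `PID`, `parse_pid`, and `format_pid` are hypothetical, and splitting at the first "@" is an assumption (domain names cannot contain "@", while account names might).

```python
from typing import NamedTuple


class PID(NamedTuple):
    provider: str  # e.g. "twitter.com"
    account: str   # e.g. "timbray"


def parse_pid(text: str) -> PID:
    """Split 'provider@account' at the first '@' (an assumed rule)."""
    provider, sep, account = text.partition("@")
    if not sep or not provider or not account:
        raise ValueError(f"not a provider@account PID: {text!r}")
    return PID(provider, account)


def format_pid(pid: PID) -> str:
    return f"{pid.provider}@{pid.account}"


pid = parse_pid("twitter.com@timbray")
print(pid.provider, pid.account)
```

Keeping the Provider as a separate field makes it easy to use the originating Provider as an input to filtering.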

Any @bluesky post has an originating Provider. This information would be valuable input to reputation and other filtering operations. It would not make sense to apply the same set of criteria to posts from 4chan as to those from a Pediatric Endocrinologists’ forum.

Bluesky Identity · Most users are likely content to remain associated with their home Provider, but it would be of value to @bluesky to have a global notion of identity that is not tied to any Provider. Let’s call this a “Bluesky Identity” or BID for short.

A BID would typically be associated (“mapped” for short) with multiple PIDs, normally but not necessarily on different Providers. The goal of the Bluesky identity protocol is that multiple parties can easily maintain databases of mappings between PIDs and BIDs suitable for quick lookup, for example in reputation and search applications.

A BID is represented by a globally unique opaque bit string. There are multiple plausible ways to generate them, discussed below.

This protocol assumes the existence of a reliable Ledger service, shared by all Providers, to which arbitrary messages can be committed and which records them immutably with strict ordering semantics. There are multiple plausible ways to implement the ledger, discussed below. Note that the transaction load would be read-mostly with a low update rate.

In the following discussion, “structured”, when applied to Provider posts and ledger messages, implies the use of an agreed-on syntax specified as part of the @bluesky protocol, to facilitate unambiguous assertion parsing.

BID Identity Protocol · I propose that Providers offer APIs to facilitate implementing this protocol, but it’s the protocol that matters so I’m focusing on that.

The protocol assumes that the user has access to an account on a Provider which we’ll call P1, and write access to the ledger.

Claiming a new BID · Summary: The user generates a BID, makes a Provider post claiming it, and records that post on the ledger.

The user generates a BID.
The user makes a structured post to P1 containing the BID. Let’s call this a BID-claim post.
Once the post has been created and the user knows its URL, the user commits a structured message to the ledger containing the URL of the BID-claim post.

Now the ledger contains a permanent immutable record of a PID-BID mapping.
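
The three claim steps can be sketched as follows. Everything concrete here is an assumption for illustration: JSON as the “structured” syntax, the field names, a random 128-bit BID, the example URL, and a plain list standing in for the shared ledger.

```python
import json
import secrets


def new_bid() -> str:
    """Step 1: generate an opaque BID (assumed: 128 random bits, hex-encoded)."""
    return secrets.token_hex(16)


def bid_claim_post(bid: str) -> str:
    """Step 2: body of the structured BID-claim post made to P1."""
    return json.dumps({"type": "bid-claim", "bid": bid}, sort_keys=True)


def bid_claim_message(post_url: str) -> str:
    """Step 3: structured ledger message pointing at the BID-claim post."""
    return json.dumps({"type": "bid-claim", "post": post_url}, sort_keys=True)


ledger = []  # stand-in for the shared ledger

bid = new_bid()
post_body = bid_claim_post(bid)
post_url = "https://p1.example/posts/1"  # hypothetical; P1 would return the real URL
ledger.append(bid_claim_message(post_url))
```

Note that the ledger message carries only the post URL; an agent reading the ledger would fetch the post to learn which PID made it and which BID it claims.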

Note that there is no particular relationship between a BID and the Provider where it was originally claimed. BIDs can be passed from PID to PID even in the case where the originating Provider has ceased operation.

Grant a BID from one PID to another · The protocol assumes that the user is executing on a computer with access to accounts on two Providers, P1 and P2, and also to the ledger. Summary: a user creates a zero-knowledge proof that the owners of the two accounts know a shared secret, posts the proof to both Providers, and records the URLs of the posts on the ledger.

The user creates an asymmetric keypair. Unusually, no special care need be taken to secure the private key, which only needs to exist in one computer’s memory for a few seconds.
The user creates a nonce, signs it with the private key, and makes a structured post to P1 containing the BID, the public key, the nonce, the signature, and the P2 PID which is to be mapped to the BID. Let’s call this a BID-grant post.
The user creates a different nonce, signs it with the private key, and makes a structured post to P2 containing the BID, the public key, the nonce, the signature, and the PID on P1 which granted the BID mapping. Let’s call this a BID-accept post.
The user forgets the private key, presumably by overwriting it in memory.
Once both posts have been made, the user commits a structured message to the ledger containing the URLs of the BID-grant and BID-accept posts. Let’s call this a BID-grant transaction.

Now the ledger contains permanent immutable evidence backing the mapping of a BID from one PID to another.

Unmap a BID from a PID · Suppose the user’s PID at Provider P1 is bound to a particular BID.

The user makes a structured post to P1 containing the BID. Let’s call this a BID-unclaim post.
The user commits a structured message to the ledger containing the URL of the BID-unclaim post.

By processing the ledger in sequence, any software agent can build a consistent mapping between PIDs and BIDs.
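
As a rough sketch of that sequential processing, the function below replays ledger entries that have already been resolved into simple records (fetching and validating the posts behind each URL is left out). The record fields are hypothetical, and two rules are assumptions rather than part of the proposal: a BID can be claimed only once, and a grant counts only if the granting PID currently holds the BID.

```python
def build_mapping(entries):
    """Replay ledger entries in order; return {bid: set of PIDs mapped to it}."""
    holders = {}
    for entry in entries:
        kind, bid = entry["type"], entry["bid"]
        if kind == "claim":
            # Assumed rule: only the first claim of a BID counts.
            if bid not in holders:
                holders[bid] = {entry["pid"]}
        elif kind == "grant":
            # Assumed rule: the granting PID must currently hold the BID.
            if entry["from"] in holders.get(bid, set()):
                holders[bid].add(entry["to"])
        elif kind == "unclaim":
            holders.get(bid, set()).discard(entry["pid"])
    return holders


entries = [
    {"type": "claim", "bid": "b1", "pid": "twitter.com@timbray"},
    {"type": "grant", "bid": "b1", "from": "twitter.com@timbray", "to": "boats.example@tim"},
    {"type": "grant", "bid": "b1", "from": "evil.example@eve", "to": "evil.example@eve2"},
    {"type": "unclaim", "bid": "b1", "pid": "twitter.com@timbray"},
    {"type": "claim", "bid": "b1", "pid": "evil.example@eve"},
]
print(build_mapping(entries))  # {'b1': {'boats.example@tim'}}
```

Because every agent applies the same rules to the same ordered entries, they all arrive at the same mapping.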

Single-use keys · The protocol asserts one more rule: No key-pair can be used more than once in BID-grant operations. That is to say, a software agent building a BID/PID map MUST remember which keys it has seen and ignore any BID-grant transactions which re-use a previously-used key.
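
A minimal sketch of the single-use rule, assuming each BID-grant transaction has been reduced to a record with a `key` field holding the public key bytes (a hypothetical shape). Storing a fixed-size digest of each key rather than the key itself is a choice made here, not something the protocol specifies; the set still grows with the number of grants.

```python
import hashlib


def drop_reused_keys(grants):
    """Keep, in ledger order, only grants whose public key has not been seen."""
    seen = set()
    accepted = []
    for grant in grants:
        fingerprint = hashlib.sha256(grant["key"]).digest()
        if fingerprint in seen:
            continue  # key was used before: ignore this transaction
        seen.add(fingerprint)
        accepted.append(grant)
    return accepted


grants = [
    {"id": 1, "key": b"key-one"},
    {"id": 2, "key": b"key-two"},
    {"id": 3, "key": b"key-one"},  # re-uses the first key
]
print([g["id"] for g in drop_reused_keys(grants)])  # [1, 2]
```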

Implementation notes ·

To be useful, Providers and other interested parties would use the ledger to generate and maintain a database of mappings between PIDs and BIDs. Presumably it would be keyed by both PID and BID.
The protocol requires a single-use keypair because it is assumed that storing and securing private keys is difficult and probably beyond the capabilities of many users of these APIs. Should a private key leak, it would allow an adversary to assert PID-PID linkages.
This description of the protocol assumes User Agents writing directly to the ledger. In practice, write access would be better limited to Providers, which would offer APIs such as ClaimBID, GrantBID, and UnclaimBID. These could take care of enforcing protocol constraints (such as single-use key-pairs) and correct structuring of messages, while reducing ledger spam.
A Provider might limit a PID to a single BID claim.
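
A database keyed by both PID and BID might look something like this in-memory sketch. The class and method names are hypothetical, and it allows a PID to map to several BIDs since the protocol does not forbid that.

```python
from collections import defaultdict


class MappingIndex:
    """Two-way lookup between PIDs and BIDs."""

    def __init__(self):
        self._pids_by_bid = defaultdict(set)
        self._bids_by_pid = defaultdict(set)

    def map(self, pid: str, bid: str) -> None:
        self._pids_by_bid[bid].add(pid)
        self._bids_by_pid[pid].add(bid)

    def unmap(self, pid: str, bid: str) -> None:
        self._pids_by_bid[bid].discard(pid)
        self._bids_by_pid[pid].discard(bid)

    def pids_for(self, bid: str) -> frozenset:
        return frozenset(self._pids_by_bid.get(bid, ()))

    def bids_for(self, pid: str) -> frozenset:
        return frozenset(self._bids_by_pid.get(pid, ()))

    def share_a_bid(self, pid_a: str, pid_b: str) -> bool:
        """True if the two PIDs are mapped to at least one common BID."""
        return bool(self.bids_for(pid_a) & self.bids_for(pid_b))


index = MappingIndex()
index.map("twitter.com@timbray", "b1")
index.map("boats.example@tim", "b1")
print(index.share_a_bid("twitter.com@timbray", "boats.example@tim"))  # True
```

A reputation or search application could call `share_a_bid` to decide whether two accounts on different Providers belong to the same identity.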

BID generation and ledger implementation · Ways that BIDs could be generated:

A BID could be a 128-bit integer, the first 64 bits identifying the Provider where it was claimed. Any provider joining the @bluesky protocol would be given a 64-bit range and hand out BIDs in sequence.
Some organization could hand out BIDs as a service, for example a @bluesky nonprofit, IANA, ISOC, or the ITU.
A BID could be a 64-bit integer picked at random; if a collision is discovered at ledger commit, the user gives that one up and retries with another.
If the ledger were implemented as a blockchain, the BID could be the transaction hash for the transaction recording the BID claim.
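
Three of these options are simple enough to sketch. The function names, the `is_taken` callback standing in for the ledger's collision check, the retry limit, and the choice of SHA-256 are all assumptions.

```python
import hashlib
import secrets


def sequential_bid(provider_id: int, counter: int) -> int:
    """128-bit BID: upper 64 bits identify the Provider, lower 64 are a sequence number."""
    if not (0 <= provider_id < 2**64 and 0 <= counter < 2**64):
        raise ValueError("provider_id and counter must each fit in 64 bits")
    return (provider_id << 64) | counter


def random_bid(is_taken, max_tries: int = 10) -> int:
    """Random 64-bit BID; retry when the (hypothetical) ledger check reports a collision."""
    for _ in range(max_tries):
        candidate = secrets.randbits(64)
        if not is_taken(candidate):
            return candidate
    raise RuntimeError("could not find an unused BID")


def hash_bid(claim_transaction: bytes) -> str:
    """BID derived from a hash of the claim transaction, as a blockchain ledger might do."""
    return hashlib.sha256(claim_transaction).hexdigest()


bid = sequential_bid(provider_id=7, counter=42)
print(bid >> 64, bid & (2**64 - 1))  # 7 42
```

With the first scheme, the originating Provider can be read back from the top 64 bits, though the BID has no ongoing relationship with that Provider once claimed.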

Ways that the ledger could be implemented:

An organization such as a @bluesky nonprofit, IANA, ISOC, or the ITU could offer it as a service. It would not be particularly technically challenging.
The ledger could be operated in a decentralized fashion based on blockchain technology. Since write transactions are rare, the poor update throughput typical of blockchain implementations shouldn’t be a problem.
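
For the single-operator option, here is a minimal in-memory sketch of an append-only ledger with strict ordering. The hash chain that makes tampering detectable is an added assumption, not something the proposal requires, and the class and method names are hypothetical.

```python
import hashlib


class Ledger:
    """Append-only log; each entry records its sequence number and the previous hash."""

    GENESIS = "0" * 64

    def __init__(self):
        self._entries = []

    @staticmethod
    def _hash(seq: int, prev: str, message: str) -> str:
        return hashlib.sha256(f"{seq}:{prev}:{message}".encode()).hexdigest()

    def commit(self, message: str) -> int:
        """Append a message and return its position in the total order."""
        seq = len(self._entries)
        prev = self._entries[-1]["hash"] if self._entries else self.GENESIS
        self._entries.append({"seq": seq, "prev": prev, "message": message,
                              "hash": self._hash(seq, prev, message)})
        return seq

    def read(self, start: int = 0):
        """Return messages in commit order, beginning at position `start`."""
        return [entry["message"] for entry in self._entries[start:]]

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev = self.GENESIS
        for seq, entry in enumerate(self._entries):
            if (entry["seq"] != seq or entry["prev"] != prev
                    or entry["hash"] != self._hash(seq, prev, entry["message"])):
                return False
            prev = entry["hash"]
        return True


ledger = Ledger()
ledger.commit('{"type": "bid-claim", "post": "https://p1.example/posts/1"}')
print(ledger.verify())  # True
```

The `read(start)` call reflects the read-mostly workload: agents building a mapping can poll from the last position they processed.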

Conclusion · I’m not sure this is the optimal scheme for establishing a higher-level shared-identity construct in a @bluesky-like federated system. I might be prepared to argue that this is the simplest thing that could possibly work.

Credits · The notion of using social-media posts to establish key ownership was originated by keybase.io back in the day. Paul Hoffman and Lauren Wood contributed comments that led to significant clarification.

Contributions

From: Kevin Marks (Dec 01 2020, at 14:44)

I promise you that this isn't the simplest thing that could work, as we already have that and it's called URLs.

If you just put a slash in and get twitter.com/@timbray you're done, and can leverage the existing fungible infrastructure for registration.

This also has the advantage that you can use https://www.tbray.org/ instead of twitter.

Like I just did in the form above.

We even have a way of resolving multiple identity claims by linking with rel="me"

http://www.kevinmarks.com/distributed-verify.html

From: Kevin Marks (Dec 01 2020, at 15:23)

Oh, and if you want an immutable ledger for registration of these domain based identities, we have that too, as long as you use https - it's called certificate transparency https://www.certificate-transparency.org/

From: Golda Velez (Dec 01 2020, at 18:12)

hm so regarding Kevin's comment below - I think the key bit is having a registered way to create the rel= assertions, other than having to trace the certificate for the page where the rel= happens to show up...

From: Alex (Dec 01 2020, at 18:47)

Nice proposal! Fun to read stuff like this :-)

> The protocol asserts one more rule: No key-pair can be used more than once in BID-grant operations. That is to say, a software agent building a BID/PID map MUST remember which keys it has seen and ignore any BID-grant transactions which re-use a previously-used key.

Why? Doesn't this lead to unbounded storage requirements?

> Unmap a BID from a PID

How do you unmap your BID from a provider that is dead?

From: Gifford Hesketh (Jan 13 2021, at 15:42)

How might this rationalize with RFC 7033's WebFinger protocol?

From: Fediverse user (Feb 25 2021, at 13:24)

So basically have a way to verify that you are the creator of the posts across multiple servers and platforms?
