Everyone has had the experience of answering the question “Who are you?”. When you were asked, how did you introduce yourself? Did you give your name and title? Your identity card number? Or your registration number for an event? When we answer “Who are you?”, we are also defining our identity. Identity varies from situation to situation: sometimes it is a name, sometimes an identity card number, and sometimes a temporary number.
In this article, we will try to answer the following questions: How did digital identity evolve into its current form? What is Self-Sovereign Identity (SSI)? How do we achieve SSI on top of a distributed ledger?
p.s. Here is the talk given this August in Taipei.
What is Digital Identity?
Digital identity is identity expressed and stored in digital form. It has been developing ever since the World Wide Web was invented. Website domain names, e-mail addresses, social media accounts, and so on are all digital identities. Our daily life is inseparable from the use of digital identity. It can be argued that without digital identity, there would be no modern, convenient life.
According to this 2016 article, the development of digital identity can be roughly divided into three phases before the emergence of SSI:
Phase 1: Centralized Identity
Digital identity first came into large demand with the popularity of the World Wide Web. The many websites that sprang up raised an urgent question: how do you prove that the website you are browsing is trustworthy? An intuitive idea is to issue a certificate to a trusted domain name. So who issues it? Since the issuer must be a credible institution, Certificate Authorities (CAs) were established to review domain names and issue certificates. Since their development in 1995, CAs have remained the backbone of public key infrastructure (PKI).
However, the CA system is centralized and hierarchical: the root CA issues a certificate to a second-level CA, the second-level CA issues a certificate to a third-level CA, and the third-level CA can issue a certificate to a registered domain name. For a website, a domain name with a certificate can be trusted by users, so users are willing to create an account on that website. Under such a hierarchical structure, the identity of a user can always be traced up to the root CA; that is, the root CA is the root of the identity.
Such a digital identity relies heavily on a trusted root certificate authority, and the identity of the user is completely controlled by the domain name owner. As service usage grows, a user may have to register for dozens of services at the same time, so the user’s identity becomes fragmented and fragile.
Phase 2: Federated Identity
To solve the fragmentation of identity, an intuitive idea is to let identity be managed by a coalition of several organizations. An identity registered with any domain name in the alliance can be used throughout the alliance. One example is the Liberty Alliance led by Sun. Although federated identity partly solves the problem of fragmented identity within an alliance, identity outside the alliance is still fragmented, and identity is still controlled by the service provider.
Phase 3: User-Centric Identity
This is the stage we are in currently. Its goal is to connect different services and different alliances and to give users more control over their identity. To make an identity from one service usable across multiple services, the services need to work together on a common set of specifications for verifying identity across services. User consent and interoperability make the user the center of identity. Users can decide whether to share their identity from one service to another, which prevents the fragmentation of digital identity. Authentication protocols well known to developers, such as OpenID (2005), OAuth (2010), and FIDO (2013), are the product of this principle.
Although users gain more control over their identity and better interoperability, they also become more dependent on centralized services, which gives service providers the power to “abuse” user privacy, for example for advertising revenue. A company with a financial interest can use or sell user information without the user’s consent, and the user’s privacy is at risk of being violated.
The value and thickness of identity come from social behavior and interactions. In a perfectly ideal (e.g. non-digital) scenario, identity should be a whole, and different information can be revealed depending on the situation, just as the question “Who am I?” can be answered with different identifications depending on the scenario.
However, the digital identity we use today is both fragile and unable to express the thickness of identity. So, how do you implement an identity that is not controlled by any centralized service? The answer to this question has only recently appeared: a distributed ledger is the last piece of the puzzle to achieve SSI.
What is Self-Sovereign Identity?
Self-Sovereign Identity (SSI) is a digital identity that users can fully control and use with any service. SSI differs from today’s digital identities: it is anchored to a distributed ledger and is not controlled by any centralized service. A distributed ledger gives digital identity the following characteristics, and it is these characteristics that make a digital identity self-sovereign:
Existence: Centralized services can tamper with the existence of digital identities at any time. A distributed ledger lets identity be anchored in the form of a Decentralized Identifier (DID) and protected from tampering.
Control: Centralized services can completely control digital identity. On a distributed ledger, identity is controlled through digital signatures made with a unique private key, and the private key is kept by the user.
Access: Centralized services can easily restrict access to identity. A distributed ledger is a replicated state machine, so users can access their identity at any time.
Transparency: Centralized services are mostly closed-source projects. Distributed ledgers are mostly open-source projects, so users can inspect the details of how the software operates.
Persistence: Centralized services carry the risk of service disruption. Distributed ledgers are mostly maintained by economically motivated nodes and are not easily interrupted.
Digital identity consists of three elements: Identifier, Authentication, and Credential. In addition to these three elements, SSI has a fourth: the Decentralized Key Management System (DKMS), which is needed because SSI relies on digital signatures and therefore on managing private keys.
SSI is not a completely new invention. Many of its technical ideas follow existing specifications. The real innovation of SSI is a common specification, the Decentralized Identifier (DID), which lets identities follow the same standard while being anchored to different distributed ledgers.
The four elements of SSI have the relationship shown above, and together they form a stack: the bottom layer (#1) is responsible for anchoring identity; the second layer (#2) interacts with the underlying distributed ledger and is responsible for storing user data and private keys; and the third layer (#3) uses the second layer’s data to authenticate the user’s identity. After successful authentication, the topmost layer (#4) can send various credentials to indicate the identity of the user. Each upper layer relies on the layer below it, and peers interoperate at the same layer, similar to the TCP/IP suite. Each layer has its own protocols and specifications, and the operational details of each layer are abstracted away from the others.
Since SSI requires close coordination among a series of protocols, its progress depends on unified specifications and well-designed protocols, which in turn need to be promoted and maintained by non-profit organizations formed by the industry. Many non-profit organizations continue to contribute in the area of SSI, such as:
These organizations have been very productive over the past three years. The most active is probably Rebooting the Web of Trust (RWoT): since 2016, RWoT has published more than 40 papers, technical specifications, and pieces of open-source code. RWoT’s technical specifications have been further proposed to the W3C or IETF for standardization. The DID specification draft is largely based on RWoT’s work; even the term “SSI” was created in RWoT.
So how does each layer in the SSI architecture work? Let’s take a look at the specifications used by each layer.
DID is the lowest and most critical layer in the SSI architecture. It is responsible for writing identity to and reading identity from the distributed ledger, and it clearly defines the format and resolution of identifiers. The following is a brief description of some important components:
DID (Decentralized Identifier): A DID is an identifier made up of letters and digits. It is unique and maps to a DID document located on a particular distributed ledger. A DID consists of three parts: the scheme, the DID method, and the DID method-specific string. The DID method is explained in the next item; how the method-specific string is generated must be clearly defined in the specification of the DID method.
DID Methods: A string inside the DID that specifies how the DID is resolved. Each type of ledger has a DID method specific to that ledger, with corresponding rules for creating and parsing DID documents. For example, a DID registered on Ethereum takes the form did:eth:12345. A DID method needs to be registered with the W3C to be recognized by the resolver.
DID Document: A distributed ledger can be thought of as a key-value database. The DID is the key, and the corresponding value is the DID document written on the distributed ledger. The DID document contains a public key representing the identity, the authentication method, service endpoints that can interact with this identity, and so on.
DID Resolver: It lets upper-layer protocols query DID documents easily. The resolver can parse different DID methods and return the result to the upper layer, so the upper-layer protocol does not need to know the details of document parsing. The Decentralized Identity Foundation (DIF) developed a Universal Resolver for this purpose, so a resolver only needs to be deployed once. If a new DID method is registered in the future, support for that method only needs to be added to the Universal Resolver.
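To make the four components above more concrete, here is a minimal Python sketch. It is only an illustration under our own assumptions: the in-memory dictionary standing in for a ledger, the field names inside the DID document, the example.com endpoint, and the Resolver class are all hypothetical and are not taken from the W3C specification or from DIF’s Universal Resolver.

```python
from typing import Callable, Dict, NamedTuple


class DID(NamedTuple):
    scheme: str              # always "did"
    method: str              # the DID method, e.g. "eth"
    method_specific_id: str  # the method-specific string


def parse_did(did: str) -> DID:
    """Split a DID into scheme, method, and method-specific string."""
    # Split at most twice so the method-specific string may itself contain ":".
    parts = did.split(":", 2)
    if len(parts) != 3 or parts[0] != "did" or not parts[1] or not parts[2]:
        raise ValueError(f"not a valid DID: {did!r}")
    return DID(*parts)


# Toy "ledger": a key-value store mapping a DID to its DID document.
# The document fields below are illustrative placeholders only.
ETH_LEDGER: Dict[str, dict] = {
    "did:eth:12345": {
        "id": "did:eth:12345",
        "publicKey": "hypothetical-public-key",
        "authentication": "hypothetical-signature-scheme",
        "service": ["https://example.com/hub"],
    }
}


class Resolver:
    """Dispatches a DID to the driver registered for its DID method."""

    def __init__(self) -> None:
        self._drivers: Dict[str, Callable[[DID], dict]] = {}

    def register(self, method: str, driver: Callable[[DID], dict]) -> None:
        self._drivers[method] = driver

    def resolve(self, did: str) -> dict:
        parsed = parse_did(did)
        if parsed.method not in self._drivers:
            raise LookupError(f"no driver for DID method {parsed.method!r}")
        return self._drivers[parsed.method](parsed)


resolver = Resolver()
# Supporting a new DID method only means registering one more driver.
resolver.register("eth", lambda d: ETH_LEDGER[f"did:eth:{d.method_specific_id}"])

document = resolver.resolve("did:eth:12345")
```

The upper layer only calls resolve() and never needs to know which ledger was queried, which is the separation of concerns the resolver is meant to provide.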
DKMS is the main interface through which users use their own SSI. In addition to connecting with the underlying DID layer, it also needs to provide storage for credentials, backup of private keys, and so on, so its tasks are quite diverse. In terms of specifications, DKMS can be divided into three sub-layers:
DID Layer: Responsible for linking with the lower-level distributed ledger to perform DID lookups.
Cloud Layer (Hub): Responsible for storing the user’s personal data for use by upper-layer protocols, such as verifiable credentials.
Edge Layer (DApp): Responsible for managing the private key. This is also a decentralized application (DApp) that allows users to use their own identity.
No proposal for a DID authentication specification has yet become a common standard, and only one RWoT document explores authentication in depth. DID authentication has only one task: to enable users to prove that they own an identity. The user only needs to prove possession of the private key that matches an SSI public key. After authentication, a trustworthy, longer-lived communication channel can be established between the parties to facilitate the exchange of other resources, such as verifiable credentials.
Many authentication protocols have been in use for years, such as OAuth and OpenID. Similar to these protocols, DID authentication uses a challenge-response cycle: the verifier issues a challenge, the identity owner responds to it, and the verifier checks that the response is valid. The form of the challenge is not clearly defined, but we have all had the experience of responding to one: the password we must enter before logging in to an account is one form of challenge.
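The challenge-response cycle can be sketched in a few lines of Python. Since no common DID authentication standard exists yet, everything here is our own assumption: the challenge is a random nonce, and the signature scheme is a toy Lamport one-time signature built from SHA-256, chosen only so that the example runs with the standard library. A real system would use an established scheme such as Ed25519 or ECDSA, and a Lamport key must never sign more than one message.

```python
import hashlib
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _bits(message: bytes) -> list:
    """The 256 bits of SHA-256(message), least significant bit first."""
    digest = int.from_bytes(_h(message), "big")
    return [(digest >> i) & 1 for i in range(256)]


def generate_keypair():
    # Private key: 256 pairs of random secrets. Public key: their hashes.
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(a), _h(b)) for a, b in private]
    return private, public


def sign(private, message: bytes) -> list:
    # Reveal one secret from each pair, selected by the message digest bits.
    return [private[i][bit] for i, bit in enumerate(_bits(message))]


def verify(public, message: bytes, signature: list) -> bool:
    if len(signature) != 256:
        return False
    return all(
        _h(revealed) == public[i][bit]
        for i, (revealed, bit) in enumerate(zip(signature, _bits(message)))
    )


def make_challenge() -> bytes:
    # A fresh random nonce, so that an old response cannot be replayed.
    return secrets.token_bytes(32)


# Identity owner: keeps the private key; the public key would sit in the DID document.
private_key, public_key = generate_keypair()

# 1. The verifier issues a challenge.
challenge = make_challenge()
# 2. The identity owner responds by signing the challenge.
response = sign(private_key, challenge)
# 3. The verifier checks the response against the public key.
authenticated = verify(public_key, challenge, response)
```

The private key never leaves the identity owner; the verifier only needs the public key and the nonce it generated itself.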
Verifiable Credentials (VC) is the earliest and most mature specification in the SSI architecture. As the top-level protocol of the SSI architecture, it has only one purpose: to replace all the ID cards in the user’s wallet. A VC is a cryptography-based digital certificate that can be used across different applications. It returns identity to its optimal state: the identity is in one piece and completely controlled by the user, and the user can reveal different credentials depending on the scenario. Since every SSI can issue and store credentials, identity is no longer fragmented.
VC consists of three parts:
Claims: A statement about a subject that expresses the relationship between [subject-property-content], for example: [Alice — Student — Some school] represents Alice as a student of a school.
Credential Metadata: Other information about the credential, such as type, issuer, issuing time, etc.
Proof: The issuer’s digital signature over the contents of the credential.
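The three parts can be pictured as a small data structure. The Python sketch below is an assumption-laden illustration, not the VC data model itself: the class, its field names, the metadata values, and the JSON-based canonical form are all hypothetical, and the signature scheme is deliberately left to the caller as a pair of sign and verify functions.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Credential:
    claims: Dict[str, Dict[str, str]]  # subject -> {property: content}
    metadata: Dict[str, str]           # type, issuer, issuing time, ...
    proof: Optional[str] = None        # hex-encoded issuer signature

    def signing_input(self) -> bytes:
        # Deterministic bytes covering claims and metadata, but not the proof.
        body = {"claims": self.claims, "metadata": self.metadata}
        return json.dumps(body, sort_keys=True, separators=(",", ":")).encode()


def issue(credential: Credential, sign: Callable[[bytes], bytes]) -> Credential:
    """Attach the issuer's signature; `sign` wraps the issuer's private key."""
    credential.proof = sign(credential.signing_input()).hex()
    return credential


def check(credential: Credential, verify: Callable[[bytes, bytes], bool]) -> bool:
    """Return True only if the proof matches the current claims and metadata."""
    if credential.proof is None:
        return False
    return verify(credential.signing_input(), bytes.fromhex(credential.proof))


# The [Alice, Student, Some school] claim from the text, not yet signed.
# The metadata values are placeholders chosen for illustration.
student_card = Credential(
    claims={"Alice": {"Student": "Some school"}},
    metadata={"type": "StudentCredential", "issuer": "did:eth:12345"},
)
```

Because the proof covers both the claims and the metadata, changing either one after issuance makes check() fail, provided the sign and verify functions come from a real digital signature scheme.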
How do you avoid exposing excessive private information when using a VC to reveal your identity? The Verifiable Presentation is an advanced specification that uses zero-knowledge proofs to protect the credential. The details are left for a future article.
This article spends quite a bit of space introducing the background and development of SSI. Although SSI has been developing for only four years, it has already achieved considerable results. New applications, protocols, and specifications continue to evolve, and the ecosystem is becoming more and more complete. However, because it is a fairly novel field, information is often scattered and lacks context, and readers have to dig through documents to piece together clues. We hope this article helps researchers and developers quickly grasp the essence of SSI.