Encoding Structured Binary Data for Transport in URIs and QR Codes
© 2020 Blockchain Commons
Authors: Wolf McNally, Christopher Allen
Version: 2.1.0
Revised: Aug 21, 2023
In order to increase security, developers of hardware cryptocurrency wallets deliberately elide wireless networking capability from their devices. Nonetheless, such devices must send and receive data through some channel to function, and the quantity of data can easily exceed human patience for manual transcription. Many device makers have settled on QR codes as a way of optically sending data from their device displays to network-connected devices. Unconnected devices that include a camera can also read QR codes. Exclusively using QR codes for the transmission of data has the advantages of transparency and the reduction of the attack surface.
While QR codes have built-in error correction and several different encoding modes optimized for different forms of data, they do not impose an internal structure on the data they convey. They do however limit the maximum amount of data that can be conveyed in a single QR code. Ultimately this limitation is due to the inherent limitations of optical readers to resolve a captured image. The largest QR code ("version 40") consists of 177x177 "modules" (pixels). Version 40 QR codes, using the binary encoding mode and the lowest level of error correction have a capacity of 2,953 bytes. This maximum capacity on QR codes becomes an issue when one wishes to convey data messages longer than the maximum supported by the standard. In addition, since the assumed use case of QR codes is usually to convey human-readable text (the canonical example being a URL) the native binary encoding mode of QR codes is not consistently supported by readers.
QR Codes support an "alphanumeric" mode optimized for efficiently conveying a subset of ASCII consisting of 45 characters:
0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:
This character set is optimized for industrial applications, not general text (e.g., lower case letters are not included) or even URL encoding (symbols used in URIs such as ?
, =
, and #
are not included). It is also impossible to convey binary data encoded as Base64 or Base64URL using this character set as these formats require the use of both upper case and lower case letters.
Developers of cryptocurrency wallets currently all have their own bespoke ways of breaking a binary message into several parts suitable for display as a series of QR codes, and reassembling them on the destination device. This lack of standardization is one of several problems hampering interoperability between such devices.
This document specifies a method of encoding binary data of arbitrary content and length so that it is suitable for transport in either URIs or QR codes.
The name of the URI scheme for this encoding is "UR" and is intended to be analogous to existing names such as "URL" ("Uniform Resource Locator"), "URI" ("Uniform Resource Identifier") and URN ("Uniform Resource Name"). As this encoding method is intended for self-contained resources themselves, we have chosen "UR" ("Uniform Resource").
This proposed method has the following goals:
- Transport binary data of arbitrary content and length using a sequence of one or more URIs or QR codes.
- Remain agnostic about whether QR codes are displayed together or time-sequenced (animated).
- Avoid the use of QR code binary mode to support transparency and wide compatibility with QR code reader libraries.
- Use the alphanumeric QR code mode for efficiency.
- Be case agnostic, allowing use of all upper case letters (for QR code transport) or all lower case letters (canonical for display and URIs.)
- Include a CRC-32 checksum of the entire message in each part to tie them together and ensure the transmitted message has been reconstructed.
- Each single part should also be a valid URI and not require escaping (e.g. percent-encoding) of any of its characters.
- Support the addition of structure in the binary data.
- Support transmitting an arbitrary amount of data both as a minimal, finite sequence of parts and as an indefinite sequence of parts using a "rateless encoding". Our architecture for this is described in the Multipart UR (MUR) Implementation Guide.
Type | Name | Language | Unit Tests | Demo |
---|---|---|---|---|
Reference | URKit | Swift | URKitTests | URDemo |
Reference | bc-ur | C++ | test.cpp | |
Third-party | foundation-ur-py | Python | test.py | |
Third-party | ur-rs | Rust | search code for #[test] |
UR demo |
Third-party | Hummingbird | Java | tests | |
Third-party | bc-ur | TypeScript/JavaScript | tests | |
Third-party | BitConserve-UR | C++/Arduino (fork of bc-ur) |
Compliant UR codec implementations MUST pass the unit tests from the reference implementations above.
URDemo provides an interactive demonstration of single and multi-part encoding and decoding using URKit under iOS. There are also two videos of this demonstration available:
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC2119.
fragment : Refers to one of a subsequence of bytes from the message.
message : The original CBOR structure to be encoded and recovered through decoding
part : Refers to one of a set of multi-part URs that together convey the message.
At the binary level, the goal of adding structure is accomplished by specifying the message in Gordian Deterministic CBOR (dCBOR), which is an application profile of Concise Binary Object Representation (CBOR).
All binary sequences encoded according to this specification MUST be valid dCBOR. Such a dCBOR-encoded payload is henceforth referred to as a message. All further references to CBOR in this document refer to dCBOR.
CBOR has many desirable traits, including being self-describing, fast to encode and decode, and minimal implementation complexity.
The method of encoding binary data as printable characters specified in this proposal is Bytewords.
Each UR encoded object includes a type
component as the first path component after the ur
scheme. Types MUST consist only of characters from the English letters (ignoring case), Arabic numerals, and the hyphen -
.
bytes
which represents an undifferentiated string of bytes of any length. The bytes
type exists only for testing and validation of UR implementations and MUST NOT be used for any other purpose. It also has no corresponding CBOR tag (described below). Other specifications register and document types that specify forms of structured content intended to address various application domains.
To define a structure that can be isomorphically converted between tagged CBOR and UR formats, you must allocate both a UR type and a CBOR tag, which have a one-to-one correspondence.
For example, the seed
UR type corresponds to the CBOR tag 40300
:
UR type | CBOR Tag |
---|---|
ur:seed |
#6.40300 |
IANA maintains a list of registered CBOR tags. Tags above 32767 are allocated by IANA on a first-come, first-served basis and are easy to obtain. Tags below 32768 require review and approval by IANA experts, and are therefore more difficult to obtain, while tags below 24 require an RFC and are most difficult to obtain.
The UR type is a string that is assigned by the developer of the UR type, and may be registered with Blockchain Commons along with its CBOR tag on a first-come, first-served basis.
When a CBOR with a UR type is encoded as standalone CBOR or anywhere embedded in a CBOR structure, its tag MUST match the tag registered with the UR type. So an example in CBOR diagnostic notation for a cryptographic seed to be serialized as a standalone binary object might be:
40300({1: h'c7098580125e2ab0981253468b2dbc52'})
Which corresponds to the following CBOR hex:
d99d6ca10150c7098580125e2ab0981253468b2dbc52
On the other hand, if the CBOR structure is the top-level object in a UR, then it MUST NOT be tagged, as the UR type provides that information. So the untagged CBOR is:
{1: h'c7098580125e2ab0981253468b2dbc52'}
And when serialized to hex this untagged CBOR is:
a10150c7098580125e2ab0981253468b2dbc52
Finally when encoded as a UR, it is:
ur:seed/oyadgdstaslplabghydrpfmkbggufgludprfgmamdpwmox
A list of registered CBOR types and their corresponding tags is here.
A single-part UR has the following form:
ur:<type>/<message>
For example:
ur:seed/oyadhdeynteelblrcygldwvarflojtcywyjytpdkfwprylienshnjnpluypmamtkmybsjkspvseesawmrltdlnlgkplfbkqzzoglfeoyaegslobemohs
A multi-part UR has the following form:
ur:<type>/<seq>/<fragment>
For example:
ur:seed/1-3/lpadaxcsencylobemohsgmoyadhdeynteelblrcygldwvarflojtcywyjydmylgdsa
For a single-part UR, message
is created by simply encoding the untagged CBOR structure as Bytewords.
For a multi-part UR, the procedure is more complex. The decoder differentiates between a single-part and multi-part UR by the presence of the seq
path component, which is only present in multi-part URs. The seq
path component has the form:
<seqNum>-<seqLen>
seqLen
is the length of the sequence and seqNum
and is the 1-based index into the sequence.
So for a 10-part UR, the first part will have the seq
1-10
and the tenth will have the seq
10-10
. However, parts beyond this can be generated by the fountain encoder, hence seq
values of 11-10
and up are normal.
The implementation of multi-part URs where seqNum
<= seqLen
is straightforward, as it is simply a matter of breaking message
into seqLen
fragments of fixed size (with the last fragment including padding if necessary to make it equal in size to the others) and encoding each fragment as a multi-part UR. This is all you need to create a basic multi-part UR, and you can transmit a message of any length by repeating this fixed sequence of parts until the receiver has managed to read every one exactly once. This is also called a "fixed-rate approach."
But this approach has a serious drawback: as the sender does not know which parts the receiver has successfully read and which it still needs, if any of the codes in the series is missed by the receiver, the entire sequence will need to be repeated. As long as the sequence is short, this is not usually a problem, but as the sequence gets longer, the probability of one or more missed codes increases, and the time to transmit the entire message diverges from optimal as long as the sequence is not fully received and must be repeated again.
UR uses a hybrid fixed-rate and rateless partitioning scheme called Multipart UR (MUR). MUR fountain codes are used to address the above problem by including a pseudo-random "mix" of one or more fragments in each part where seqNum
> seqLen
. These fragments are overlaid using XOR, which is an involutory (reversible) function. The receiver keeps track of the pure and mixed fragments it has received, and uses them to involute ("subtract") the parts it has from the mixed parts again using XOR. When a mixed part is the result of overlaying several fragments you have and one you don't, you can use this to extract the part you don't have. This increases the probability that a given mixed part can be reduced to a needed fragment, resulting a more robust stream of parts that, for the receiver, converges on the original message quickly, reducing message transmission time under lossy conditions.
For all the details, see the Multipart UR (MUR) Implementation Guide.
CBOR was chosen because some of the goals of URs are to 1) translate structured binary data with minimal adoption barriers, and 2) encourage interoperability between adopters.
- Protocol buffers require the use of a separate tool, the protocol buffer compiler
protoc
, with a target language-specific plugin, and also requires linking with a protocol buffers runtime library. This can be rather heavy-weight for smaller, embedded platforms. - In addition, Protocol Buffers require the use of a schema in the form of
.proto
file to define the structure of the data, which is more complex than the self-describing nature of CBOR. - There are several levels of adoption for CBOR, ranging from just a few lines of custom code in the target language needed to translate the minimally required CBOR structures, to header-only C++ implementations that only compile code actually referenced, to complete implementations that can translate structures having indefinite lengths and structures unknown in advance. CBOR allows developers to choose a level of adoption that suits them. A list of implementations is here.
- CBOR is an IETF standard, which means wide adoption and an open development process.
- IANA maintains a list of registered CBOR tags, helping standardize commonly used embedded CBOR data types and increasing interoperability.
- Building on the IANA registry, Blockchain Commons has its own registry of CBOR data types oriented towards blockchain and cryptocurrency development, each of which can be used within a larger CBOR type, or as a stand-alone top-level UR with its own UR
type
.
- JSON-LD is a text-based serialization format, which is not as efficient or compact as a binary format such as CBOR. CBOR-LD is a binary version of JSON-LD that does not take full advantage CBOR features, and in particular does not use deterministic CBOR for canonicalization. In addition, both require data be structured in a semantic graph, and they inherits from RDF, along with an "open world" data model and linked-data principles, that require to parsing of URIs/URLs and a special schema syntax for linked data and describing data types. This is more complex than the self-describing nature of CBOR.
- A conversation around Flatbuffers vs CBOR can be found here.
- An analysis of CBOR vs several other data serialization methods can be found in Appendix E of the CBOR RFC