Asset and Metadata Encryption
Every asset Capsule stores — original bytes, derivative bytes, metadata blob — is encrypted client-side before it ever crosses a network boundary. The encryption code lives in capsule-core::crypto::encryption and is the only place AES-256-GCM is invoked in the codebase. Two constructions live here:
- STREAM for asset bytes (originals + derivatives) — supports streaming, ranged reads, and per-chunk authentication.
- Standalone AEAD for metadata blobs — a single contiguous byte string with a fixed wire format.
The split is intentional: assets are huge and accessed in pieces; metadata blobs are small and always fetched whole.
Authenticated Asset Encryption
Section titled “Authenticated Asset Encryption”Every asset is content-addressed by the SHA-256 of its ciphertext and encrypted with a unique file key. The file key is derived from the appropriate AMK; the AMK itself is recoverable from the account’s master key (see Identity-Based Key Derivation).
Asset Key Derivation
Section titled “Asset Key Derivation”Each asset is encrypted with a key derived from a versioned album master key (AMK), distributed and ledgered over MLS (see MLS). Capsule never derives a key from the MLS epoch’s internal state.
An album’s AMK ledger:
Album { id: UUID, mls_group: MlsGroup, keys: [ AMK_v1: (random 32 bytes, created at album creation), AMK_v2: (random 32 bytes, created when member X was removed), AMK_v3: ... ], current_version: 3,}The per-file key is derived from the AMK version that encrypted it, using the KDF:
file_key = HKDF_SHA512( ikm: AMK_v{amk_version}, salt: file_id, info: "asset-file/v1", length: 32 // 32 bytes for AES-256; HKDF-SHA512 expand truncates safely)AMKs are delivered over MLS application messages. When epoch N’s MLS group is established, the creating device sends an AlbumKeyDistribution { amk_version, amk_bytes } message through MLS. Every current member’s device receives and stores it locally (hardware-wrapped).
Distribution lag is expected and is not a failure. An epoch bump and its AlbumKeyDistribution broadcast are separate MLS messages, so during a bump a device can legitimately receive an asset manifest referencing an amk_version whose key bytes have not yet arrived. A device that lacks the AMK for an amk_version that is otherwise within the MLS-attested epoch range treats the asset as pending — held and retried as MLS state catches up — rather than as a decryption failure or a forged manifest. Only an amk_version beyond the MLS-attested epoch, or one still missing after the retry timeout, is escalated. This is the verify_asset pending outcome and the matching Failure Modes row; it is what keeps a concurrent upload during an epoch bump from being misread as an attack.
Encryption Workflow
Section titled “Encryption Workflow”Encrypting an asset for upload:
- Derive
file_keyfromAMK_v{current}(above). - Generate a random 7-byte
nonce_prefixfrom the OS CSPRNG (7 = the 12-byte AES-GCM nonce minus STREAM’s 4-byte chunk counter and 1-byte last-chunk flag). - Split the plaintext into 65,520-byte chunks and encrypt sequentially with
EncryptorBE32<Aes256Gcm>, producing 64 KiB ciphertext chunks (16-byte tag each); the final chunk is flagged as last. - Compute the
ciphertext_hashincrementally over the produced ciphertext (algorithm fixed bycrypto_suite_id). - Build and sign the manifest (device signature + write-tier signature).
- Upload the blob (see Upload Protocol).
Streaming download / ranged reads:
- Sequential:
DecryptorBE32<Aes256Gcm>consumes chunks in order, verifying each tag. - Ranged: to start at plaintext byte
B, computechunk_index = B / 65,520. Because the STREAM construction derives each chunk’s nonce deterministically, chunkidecrypts independently givenfile_keyandi— the server need only serve that 64 KiB ciphertext chunk, which the client decrypts and verifies.
STREAM Construction
Section titled “STREAM Construction”Capsule strictly requires streaming.
The chosen method is AES-256-GCM with the STREAM construction (Hoang-Reyhanitabar-Rogaway-Vizár, 2015). STREAM splits the file into chunks, encrypts each with AES-GCM using a structured nonce (prefix || counter || last-chunk-flag), and guarantees you detect truncation, reordering, and chunk deletion.
In Rust: the RustCrypto aead crate exposes stream::EncryptorBE32<Aes256Gcm> and stream::DecryptorBE32<Aes256Gcm> — drop-in. We use a 65,520-byte plaintext chunk → 64 KiB ciphertext chunk. (Note the upload transport’s 4 KiB chunk alignment, described in Upload Protocol, is a separate concern from this crypto chunk size.)
Metadata Encryption
Section titled “Metadata Encryption”Not all metadata can be encrypted — some must stay server-readable for routing and preview. The split is deliberate:
- Encrypted (AES-256-GCM under a key derived from the album’s AMK, fresh random nonce per blob): the CBOR sidecar / metadata blobs — including the chromahash LQIP and
dominant_color, so image-derived display hints never leak to a server that never decodes assets. Each blob is independently versioned and signed like an asset manifest. - Server-plaintext by necessity:
owner_id, the ciphertext content hash, and the ciphertext size — the routing and storage-accounting facts a key-less server needs. This is a deliberate, documented trade-off. - AI embeddings (semantic-search vectors, face embeddings) are sensitive — a user can be re-identified from them. They are kept plaintext locally (vector search requires it) but encrypted at rest in the server-side backup.
CBOR metadata blobs use deterministic encoding per the canonical CBOR ruleset owned by Metadata — the same byte-exact rules the plaintext sidecar follows, since the metadata blob’s plaintext is that CBOR document. Because a blob’s hash is what content-addresses it and what the signed manifest commits to, two implementations encoding the same logical metadata must produce byte-identical output — otherwise the hash diverges and the signature fails to verify across federated peers. Conformance to the canonical ruleset is mandatory and is the load-bearing check behind cross-platform and cross-language interop.
Metadata Blob Wire Format
Section titled “Metadata Blob Wire Format”An encrypted metadata blob is a single contiguous byte string. Implementations MUST produce and consume exactly this layout, with no framing variations, so two correct implementations can compute identical content hashes byte-for-byte. This wire format is itself the contract: any byte-level deviation breaks cross-peer signature verification.
+---------------------+---------------------+--------------------------+---------------+| crypto_suite_id (2) | nonce (12 bytes) | ciphertext (variable) | tag (16 bytes)|+---------------------+---------------------+--------------------------+---------------+| big-endian u16 | fresh CSPRNG draw | AES-256-GCM(plaintext) | GCM tag |crypto_suite_id(2 bytes, big-endianu16) — pins the AEAD and KDF used to derive the key. Identical to the field carried inside the manifest (see Versioning Identifiers), and a mismatch with the manifest’s value rejects the blob at decode.nonce(12 bytes) — fresh OS-CSPRNG per blob; never reused, never derived.ciphertext— the deterministically-encoded CBOR plaintext, sealed with AES-256-GCM underHKDF-SHA512(ikm=AMK_v{n}, salt=blob_id, info="metadata-blob/v1", length=32).tag(16 bytes) — GCM authentication tag.
The total blob’s ciphertext_hash (in the asset’s signed manifest) is computed over the full byte string above — header, nonce, ciphertext, and tag concatenated.
Validation
Section titled “Validation”- Encrypt-decrypt round-trip — for both STREAM and standalone metadata AEAD, unit tests that randomized plaintext bytes encrypt and decrypt to themselves. Fixed-vector cases pin the per-primitive parameters.
- STREAM tamper-detection — unit tests that mutate each chunk in turn (single bit flip, chunk swap, chunk drop, final-chunk-flag toggle) and assert
DecryptorBE32rejects. - Ranged-read correctness — unit test that fetching chunk
iin isolation decrypts to the matching plaintext slice (no off-by-one), and that ranged reads stitched together byte-match a sequential decrypt. - Metadata blob wire-format determinism — cross-language conformance test (Rust ↔ any FFI consumer) that encoding the same logical CBOR map produces byte-identical blobs against the shared canonical CBOR known-answer vectors. This is a blocking conformance gate, not advisory: a consumer that drifts cannot be shipped, because its signatures would not verify across peers.
- Nonce-misuse refusal — unit test that the metadata-blob writer rejects an attempt to reuse a previously-emitted nonce (defense in depth on top of the CSPRNG fresh-draw rule).
Wire-format compatibility with the upload protocol is exercised by Upload Protocol smoke tests; this doc’s responsibility is the byte-level correctness of the AEAD itself.