Metadata
The CBOR sidecar is the canonical, plaintext-local-only metadata record for every asset (see Filesystem — Client). It is self-describing: field 0 carries the schema version so any reader can detect a schema it does not implement before parsing the rest. Versioning the schema in-band is what prevents a faulty or old client from corrupting state with a partial parse.
This doc is the single source of truth for the CBOR sidecar schema. The schema below — every field, type, and ordering rule — is the contract every implementation must conform to byte-for-byte (else cross-peer signatures break). Per the SSoT rule, other docs reference fields here by name and never re-declare them.
All metadata processing lives in capsule-core::metadata (extraction, filtering, querying) and capsule-core::sidecar (encoding, signing, schema versioning). Implementation is in Rust and exposed to all native clients via FFI from capsule-core — the I/O is handled natively to minimize FFI surface.
Sidecar Schema v1
    SidecarV1 {
        sidecar_schema: u16,   // FIELD 0: readable before parsing the rest. Currently 1.
        crypto_suite_id: u16,  // matches the asset's manifest; see Cryptography
        uuid: UUIDv7,
        hash: bytes,           // canonical plaintext digest; algorithm + length fixed by crypto_suite_id (see Primitives)
        capture_timestamp: RFC3339,
        import_timestamp: RFC3339,
        content_type: String,  // closed enum per protocol_version
        dimensions: Option<{ width: u32, height: u32 }>,

        // display placeholder: image-derived, lives inside this encrypted sidecar (see LQIP in Thumbnails)
        lqip: Option<{ chromahash: bytes, format_version: u16, dominant_color: [u8; 3] }>,

        // collaborative metadata (see Collaborative Metadata below)
        tags_user: OR_set<(tag: String, add_id)>,
        tags_ai: OR_set<(tag: String, add_id, model_id: String, model_version: String)>,
        caption_lww: Option<{ value: String, ts: RFC3339, by: device_id }>,
        superseded_captions: Vec<{ value: String, written_by: device_id, ts: RFC3339 }>, // bounded ≤ 16
        rating_lww: Option<{ value: u8, ts: RFC3339, by: device_id }>,

        // organization: stack grouping; StackMembership shape owned by Asset Organization
        stack_membership: Option<StackMembership>,

        // identifiers (see Identifiers below; privacy-on-export rules apply)
        camera_id: Option<{ model: String, serial: String }>,
        device_id: UUIDv4,
        session_id: UUIDv7,

        // geolocation (see Geolocation below)
        gps: Option<{ lat: f64, lon: f64, source: GpsSource }>,

        // provenance binding
        provenance_chain_hash: [u8; 32], // hash of the latest ProvenanceRecord for this asset

        // forward-compat
        _unknown: Map, // unknown CBOR keys preserved verbatim, never executed

        // signature
        signature: Hybrid(Ed25519, ML-DSA-65), // covers every byte above, including _unknown
    }

Schema Versioning Rules
- `sidecar_schema` is CBOR field 0 by deterministic key order (RFC 8949 §4.2). A reader can determine the schema before allocating a parser for the rest.
- A client whose `max_known_sidecar_schema < this.sidecar_schema` refuses to write to that sidecar. Reading is allowed only in read-only mode, and only with explicit opt-in. This is the refuse-by-default rule from the threat model: an old client cannot strip-and-resign a newer sidecar.
- The signature covers every byte including `_unknown`, so stripping unknown fields invalidates the signature and is detectable.
- A schema bump is a coordinated change; per Album Protocol Version Pinning (in Versioning), an album’s pinned protocol version constrains which sidecar schemas may be written into it.
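The refuse-by-default rule can be sketched as a small gate that runs on field 0 alone. This is a minimal illustration, not the capsule-core API: the names `Access`, `gate`, and `MAX_KNOWN_SIDECAR_SCHEMA` are hypothetical, and the opt-in is modelled as a plain boolean.

```rust
/// Highest sidecar schema this hypothetical client implements.
const MAX_KNOWN_SIDECAR_SCHEMA: u16 = 1;

/// What a client may do with a sidecar after reading field 0.
#[derive(Debug, PartialEq, Eq)]
enum Access {
    /// Schema is implemented: full read and write.
    ReadWrite,
    /// Schema is newer, and read-only access was explicitly opted into.
    ReadOnly,
    /// Schema is newer and no opt-in was given: do not touch the sidecar.
    Refused,
}

/// Decide access from the schema version alone, before parsing any other field.
fn gate(sidecar_schema: u16, max_known: u16, read_only_opt_in: bool) -> Access {
    if sidecar_schema <= max_known {
        Access::ReadWrite
    } else if read_only_opt_in {
        Access::ReadOnly
    } else {
        Access::Refused
    }
}
```

Note that no branch ever grants write access to a newer schema, which is what keeps an old client from stripping and re-signing fields it does not understand.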
Canonical CBOR Encoding
The sidecar, and the encrypted metadata blob whose plaintext is this same CBOR document, must serialize byte-identically across every implementation and language. The bytes are what the signed manifest and content hash commit to, so one divergent byte makes an honest sidecar look forged to another platform or federated peer. The canonical rules are the RFC 8949 §4.2 deterministic encoding rules, which are normative here:
- Definite-length encoding only: no indefinite-length maps, arrays, text strings, or byte strings.
- Shortest-form integers: values 0 to 23 are carried in the initial byte; larger values use the smallest of the 1/2/4/8-byte arguments that represents the value.
- Map keys sorted by the bytewise lexicographic order of their encoded form, with no duplicate keys. This ordering governs every map, including `_unknown`: unknown keys are re-sorted into the same canonical order on write, so a round-trip through any conformant client is byte-stable and the signature (which covers `_unknown`) still verifies.
- Floats in the shortest IEEE-754 form (16/32/64-bit) that round-trips the value exactly; the canonical quiet NaN for NaN. Capsule avoids floats in signed structures where an integer or string suffices.
- Field 0 (`sidecar_schema`) sorts first under the rule above, so a reader reads the schema version before parsing the rest.
Every implementation — the Rust capsule-core::sidecar encoder and any FFI consumer — MUST emit identical bytes for the same document, enforced as a blocking cross-language conformance gate against shared known-answer vectors committed in capsule-core::sidecar (the same fixtures Encryption tests against): a consumer that drifts cannot ship, because its signatures would not verify across peers.
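Two of the rules above, shortest-form integers and bytewise key ordering, can be illustrated with a standard-library-only sketch. It handles just unsigned integers and maps of unsigned-integer keys and values, which is far less than a sidecar needs; the function names are hypothetical and this is not the `capsule-core::sidecar` encoder.

```rust
/// Write a CBOR head (major type plus argument) in its shortest form.
fn encode_head(major: u8, value: u64, out: &mut Vec<u8>) {
    let mt = major << 5;
    if value < 24 {
        // Values 0..=23 live in the initial byte itself.
        out.push(mt | value as u8);
    } else if value <= u8::MAX as u64 {
        out.push(mt | 24);
        out.push(value as u8);
    } else if value <= u16::MAX as u64 {
        out.push(mt | 25);
        out.extend_from_slice(&(value as u16).to_be_bytes());
    } else if value <= u32::MAX as u64 {
        out.push(mt | 26);
        out.extend_from_slice(&(value as u32).to_be_bytes());
    } else {
        out.push(mt | 27);
        out.extend_from_slice(&value.to_be_bytes());
    }
}

/// Encode an unsigned integer (major type 0).
fn encode_uint(value: u64) -> Vec<u8> {
    let mut out = Vec::new();
    encode_head(0, value, &mut out);
    out
}

/// Encode a definite-length map (major type 5) with entries sorted by the
/// bytewise lexicographic order of their encoded keys.
/// Returns None if two entries share a key.
fn encode_map(entries: &[(u64, u64)]) -> Option<Vec<u8>> {
    let mut encoded: Vec<(Vec<u8>, Vec<u8>)> = entries
        .iter()
        .map(|&(k, v)| (encode_uint(k), encode_uint(v)))
        .collect();
    encoded.sort_by(|a, b| a.0.cmp(&b.0));
    if encoded.windows(2).any(|w| w[0].0 == w[1].0) {
        return None; // duplicate keys are not allowed
    }
    let mut out = Vec::new();
    encode_head(5, encoded.len() as u64, &mut out);
    for (k, v) in encoded {
        out.extend(k);
        out.extend(v);
    }
    Some(out)
}
```

Because the sort runs on the encoded key bytes rather than on insertion order, two callers that build the same map in different orders get the same bytes, which is the property the signature depends on.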
Add-id Binding
`add_id` is the tuple `(device_id: UUIDv4, monotonic_counter: u64)`, where `monotonic_counter` is incremented per device for each (asset, OR-set) pair. Every OR-set add carries an `add_id`; every OR-set remove targets a specific `add_id`. A remove that names an `add_id` for which the receiver has never observed an add is rejected rather than silently treated as a no-op, which prevents the “remove an element you never added” attack noted in the Threat Model.
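The rejection rule can be sketched with a small in-memory OR-set. This is an illustration under stated assumptions: `TagSet` and `RemoveError` are hypothetical names, `device_id` is shown as a `u128` standing in for a UUIDv4, and signatures and persistence are left out.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// (device_id, monotonic_counter); the u128 stands in for a UUIDv4.
type AddId = (u128, u64);

#[derive(Debug, PartialEq, Eq)]
enum RemoveError {
    /// The remove names an add_id this replica has never observed.
    UnknownAddId,
}

#[derive(Default)]
struct TagSet {
    /// Every add ever observed, keyed by its add_id.
    observed: BTreeMap<AddId, String>,
    /// add_ids that have been removed (tombstones).
    removed: BTreeSet<AddId>,
}

impl TagSet {
    /// Record an add. Replaying the same add_id is a no-op.
    fn add(&mut self, id: AddId, tag: &str) {
        self.observed.entry(id).or_insert_with(|| tag.to_string());
    }

    /// Remove one specific add. Unknown add_ids are rejected, not ignored.
    fn remove(&mut self, id: AddId) -> Result<(), RemoveError> {
        if !self.observed.contains_key(&id) {
            return Err(RemoveError::UnknownAddId);
        }
        self.removed.insert(id);
        Ok(())
    }

    /// Tags that still have at least one add that was not removed.
    fn tags(&self) -> BTreeSet<&str> {
        self.observed
            .iter()
            .filter(|(id, _)| !self.removed.contains(*id))
            .map(|(_, tag)| tag.as_str())
            .collect()
    }
}
```

Since a remove targets an `add_id` rather than a tag string, removing one device's add of a tag leaves another device's add of the same tag in place.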
Counter durability across restarts. A monotonic_counter must never repeat for a given (device_id, asset, OR-set): a reused add_id would alias two distinct adds, so removing one would silently delete the other and break OR-set convergence. The counter is persisted in the local index, and on client restart or reinstall it is reseeded to one past the maximum add_id.counter this device has ever issued, recovered from the signed sidecars themselves (a device’s own past add_ids are durably recorded in the sidecars it wrote). An add lost to a crash before its sidecar was persisted was never observed by any peer, so its counter may be safely reused — correctness depends only on never reusing a counter that ever reached a written sidecar. A counter is reset to zero only when the device can prove it has issued nothing — i.e. no sidecar bears its device_id. This makes the counter monotonic over the lifetime of a device_id, not merely within one process.
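The reseeding rule reduces to "one past the maximum ever written, or zero if nothing was written". A minimal sketch, assuming the caller has already collected this device's own counters for one (asset, OR-set) pair from the sidecars it wrote; `reseed_counter` is a hypothetical name:

```rust
/// Next counter to issue after a restart or reinstall.
/// `own_counters` holds the monotonic_counter values recorded under this
/// device's id in the sidecars it wrote (collection not shown here).
fn reseed_counter<I: IntoIterator<Item = u64>>(own_counters: I) -> u64 {
    match own_counters.into_iter().max() {
        // One past the highest counter that ever reached a written sidecar.
        Some(max) => max.checked_add(1).expect("add_id counter exhausted"),
        // No sidecar bears this device_id, so starting from zero is safe.
        None => 0,
    }
}
```

The order in which the counters are scanned does not matter, since only the maximum is used.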
Identifiers
The three identifying fields defined inside the sidecar schema are subject to the Privacy on Export rules below when an asset crosses a trust boundary.
- Camera identifier (`camera_id`). Model ID of the device plus a unique identifier for the specific device (e.g. serial number). Useful for grouping shots from the same physical camera across libraries.
- Device identifier (`device_id`). UUIDv4 generated on the original importing device. Useful for provenance.
- Session ID (`session_id`). Identifies the authenticated session in which the asset was imported. Defined in Session Management.
Privacy on Export
The identifiers above and several other metadata fields are a fingerprinting surface if they leave the user’s trust boundary unredacted: a camera serial uniquely links every photo to one physical device, and precise GPS coordinates can reveal a home address. When an asset crosses a boundary, Capsule strips these fields by default and includes them only on explicit opt-in.
A boundary crossing is any of:
- A share link is generated for a non-member of the album.
- An external backup is exported to media the user will hand off (e.g. cloud storage shared with someone else, a physical drive given to a friend).
- A federated peer outside the owning user’s home server fetches the asset (see Federation).
When the boundary is crossed, the following fields are stripped from the exported metadata blob unless the user has explicitly opted in to retain them:
| Field | Default on export | Opt-in retains |
|---|---|---|
| Camera serial number | Stripped | Full value |
| Device identifier (UUIDv4) | Stripped | Full value |
| Session ID | Stripped | Full value |
| GPS coordinates | Rounded to 2 decimal places (≈1 km) | Full precision |
| Personal contact tags (faces matched to a known person) | Stripped | Retained |
Stripping happens at the moment of export — the encrypted sidecar inside the user’s library is untouched, so the user does not lose the data locally. Retention opt-in is per-export, not a sticky account setting, to prevent foot-guns where a user opts in once and forgets.
Capsule’s own devices syncing the same user’s library do not trigger this redaction — that is intra-trust, not a boundary crossing.
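The export table can be read as a pure function from the local record and a per-export opt-in to a redacted copy. The sketch below assumes a simplified, hypothetical shape (`ExportMeta`, `Retain`, `redact_for_export`) with identifiers as strings; it is not the actual export path.

```rust
/// Simplified view of the fields the export table covers (hypothetical shape).
#[derive(Clone, Debug, PartialEq)]
struct ExportMeta {
    camera_serial: Option<String>,
    device_id: Option<String>,
    session_id: Option<String>,
    gps: Option<(f64, f64)>, // (lat, lon) in WGS-84
    person_tags: Vec<String>,
}

/// Per-export opt-in flags. All default to false, so stripping is the default.
#[derive(Clone, Copy, Default)]
struct Retain {
    camera_serial: bool,
    device_id: bool,
    session_id: bool,
    gps_full_precision: bool,
    person_tags: bool,
}

/// Round a coordinate to 2 decimal places.
fn round2(v: f64) -> f64 {
    (v * 100.0).round() / 100.0
}

/// Build the redacted copy for export. The local record is only borrowed,
/// so the data in the user's library is left untouched.
fn redact_for_export(local: &ExportMeta, retain: Retain) -> ExportMeta {
    let mut out = local.clone();
    if !retain.camera_serial {
        out.camera_serial = None;
    }
    if !retain.device_id {
        out.device_id = None;
    }
    if !retain.session_id {
        out.session_id = None;
    }
    if !retain.gps_full_precision {
        out.gps = out.gps.map(|(lat, lon)| (round2(lat), round2(lon)));
    }
    if !retain.person_tags {
        out.person_tags.clear();
    }
    out
}
```

Passing `Retain` as an argument on every call, rather than reading it from stored settings, mirrors the per-export (non-sticky) opt-in described above.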
Collaborative Metadata
User-editable metadata on a shared album (tags, captions, ratings) can be edited concurrently on different devices, including offline. To make these merges deterministic, such fields are modelled as CRDTs:
- Tags: an OR-set (observed-remove set) with explicit `add_id` binding, so a tag added on one device and removed on another converges predictably, and a remove that targets an unknown `add_id` is rejected rather than treated as a no-op.
- Single-value fields (`caption_lww`, `rating_lww`): last-writer-wins registers ordered by a signed timestamp, with the writing `device_id` as the lexicographic tiebreaker.
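A last-writer-wins merge with a device tiebreak can be sketched as follows. The simplifications are assumptions of this example: the timestamp is a millisecond integer rather than a signed RFC 3339 value, signature verification is omitted, and the tie goes to the lexicographically greater `device_id` (the direction of the tiebreak is not stated above).

```rust
/// Last-writer-wins register. `ts_millis` stands in for the signed timestamp
/// and `by` for the writer's device_id (both simplifications).
#[derive(Clone, Debug, PartialEq, Eq)]
struct Lww<T> {
    value: T,
    ts_millis: u64,
    by: [u8; 16],
}

/// Merge two replicas: the later timestamp wins, and on an exact tie the
/// lexicographically greater device_id wins (assumed direction).
fn merge<T: Clone>(a: &Lww<T>, b: &Lww<T>) -> Lww<T> {
    if (b.ts_millis, b.by) > (a.ts_millis, a.by) {
        b.clone()
    } else {
        a.clone()
    }
}
```

Because the winner depends only on the `(timestamp, device_id)` pair, the result is the same whichever replica is merged first, provided one device never issues two different values with the same timestamp.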
Surfacing Concurrent Edits
A plain LWW register silently discards one side of a concurrent edit, which is a real problem when two people caption the same photo from different devices within seconds. Capsule keeps the most recent value as authoritative and preserves the displaced ones:
- The losing value of every concurrent caption edit lands in `superseded_captions`, capped at 16 entries (oldest evicted). Each entry carries who wrote it and when, so the UI can surface a “this caption replaced another” hint and let the user restore the earlier value.
- Ratings are unambiguous numerically; they do not need a superseded log.
This converts a silent-data-loss damage vector (a buggy client clobbering another device’s edit) into an explicit, recoverable surface. See Threat Model — Forbidden Client Behaviors for the corresponding rule that clients must never strip superseded_captions.
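The bounded superseded log can be sketched like this. It is a simplification with hypothetical names: it logs every displaced caption without distinguishing concurrent from sequential edits, treats "oldest" as oldest by insertion order, uses a millisecond integer for the timestamp, and assumes the greater `device_id` wins a tie.

```rust
use std::collections::VecDeque;

/// Cap on the superseded log, per the schema comment.
const MAX_SUPERSEDED: usize = 16;

#[derive(Clone, Debug, PartialEq, Eq)]
struct Caption {
    value: String,
    ts_millis: u64, // stands in for the RFC 3339 timestamp
    by: u128,       // stands in for the writer's device_id
}

#[derive(Default)]
struct CaptionField {
    current: Option<Caption>,
    superseded: VecDeque<Caption>,
}

impl CaptionField {
    /// Apply one caption write. The loser of the comparison is kept in the
    /// superseded log instead of being dropped.
    fn apply(&mut self, incoming: Caption) {
        let loser = match self.current.take() {
            None => {
                self.current = Some(incoming);
                return;
            }
            // Replaying the write that is already current changes nothing.
            Some(cur) if cur == incoming => {
                self.current = Some(cur);
                return;
            }
            Some(cur) => {
                if (incoming.ts_millis, incoming.by) > (cur.ts_millis, cur.by) {
                    self.current = Some(incoming);
                    cur
                } else {
                    self.current = Some(cur);
                    incoming
                }
            }
        };
        self.superseded.push_back(loser);
        if self.superseded.len() > MAX_SUPERSEDED {
            self.superseded.pop_front(); // evict the oldest entry
        }
    }
}
```

A write that arrives late with an older timestamp does not replace the current caption, but it still lands in the log, so the UI can offer it for restoration.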
How Operations Travel
We encrypt the operations, not the resulting state. Merges are then commutative and associative, so order of arrival does not matter and a peer replaying a stale operation cannot corrupt current state. The operation log reconciles into the canonical CBOR sidecar, which remains the source of truth (see Core Principles: recovery-first).
Each operation carries the same prior_provenance_hash chain link as any lifecycle action, so a metadata-update is provenance-tracked exactly like a create or delete.
Album membership is deliberately not a CRDT here — it is driven by MLS proposals and commits (see Cryptography — MLS), which already resolve concurrent changes.
The same encrypted-operation path also carries the per-owner library-settings document — smart-album definitions (predicate + display name) and similar client-authored organizational state — synced and merged across devices like any other collaborative metadata, and never legible to the server. (The default-album designation is separate: a non-secret server-side owner pointer, not part of this encrypted document.)
This LWW/OR-set approach is intentionally simpler than a full event-graph with state resolution: photo metadata does not need it, and the extra machinery would not be functionally justified.
Tag Provenance and Namespacing
User tags and AI-suggested tags live in structurally separate OR-sets (`tags_user` and `tags_ai` in the sidecar schema). The separation is structural, not policy:
- An AI tag can never overwrite a user tag and vice versa: they are different fields, so the question does not arise. A hallucinating model cannot pollute user intent.
- Every `tags_ai` entry carries `model_id` and `model_version` (see AI: Embedding Provenance). When the canonical model for that slot changes, AI tags from the old model are flagged as stale; cross-model semantic comparison is forbidden (see Threat Model: Client-Side Validation Invariants).
- A user can promote an AI tag: an explicit user action copies the entry to `tags_user` (with a fresh user-scoped `add_id`) and may optionally remove it from `tags_ai`. Promotion is a signed lifecycle operation and is never automatic.
- A user can dismiss an AI tag: an OR-set remove on `tags_ai` keyed by the original `add_id`.
The same dual-namespace structure applies to any future ML-derived metadata field that overlays a user-editable one (face labels, location guesses, etc.). The owner doc for the model is AI/ML Integrations; the storage shape is owned here.
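The two namespaces, promotion, and staleness can be sketched with two separate maps. The names here (`Tags`, `AiTag`, `promote`, `is_stale`) are hypothetical, signing is omitted, dismissal is shown as a plain map removal rather than a tombstoned OR-set remove, and staleness is assumed to mean "model id or version differs from the canonical one".

```rust
use std::collections::BTreeMap;

/// (device_id, monotonic_counter); the u128 stands in for a UUIDv4.
type AddId = (u128, u64);

#[derive(Clone, Debug, PartialEq, Eq)]
struct AiTag {
    tag: String,
    model_id: String,
    model_version: String,
}

/// User and AI tags are separate fields, so neither can overwrite the other.
#[derive(Default)]
struct Tags {
    user: BTreeMap<AddId, String>,
    ai: BTreeMap<AddId, AiTag>,
}

impl Tags {
    /// Copy an AI tag into the user namespace under a fresh user-scoped
    /// add_id, optionally dismissing the AI entry afterwards.
    fn promote(
        &mut self,
        ai_id: AddId,
        fresh_user_id: AddId,
        also_dismiss: bool,
    ) -> Result<(), &'static str> {
        let tag = self.ai.get(&ai_id).ok_or("unknown AI add_id")?.tag.clone();
        if self.user.contains_key(&fresh_user_id) {
            return Err("user add_id already used");
        }
        self.user.insert(fresh_user_id, tag);
        if also_dismiss {
            self.ai.remove(&ai_id);
        }
        Ok(())
    }
}

/// An AI tag is stale when it was produced by a different model than the
/// canonical one for its slot (assumed meaning of "stale").
fn is_stale(tag: &AiTag, canonical_id: &str, canonical_version: &str) -> bool {
    tag.model_id != canonical_id || tag.model_version != canonical_version
}
```

The promoted entry keeps only the tag text. The model fields stay behind in `tags_ai`, which is what keeps user intent and model output in separate fields.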
Geolocation
GPS is stored canonically in WGS-84 (`gps.lat` / `gps.lon`), the near-universal camera format. Some jurisdictions mandate obfuscated coordinates for display, notably China’s GCJ-02 and Baidu’s BD-09 (a second obfuscation layer over GCJ-02). Capsule always stores WGS-84 and converts to the required system deterministically and client-side (in capsule-core) at plot time; the stored coordinate is never the obfuscated one. Per-platform map-provider selection is a client/deployment concern, not part of this schema.
Validation
The sidecar schema is the contract; validation focuses on serialization determinism and CRDT correctness.
- Canonical CBOR conformance (unit + cross-language). Encode a fixture sidecar (including a populated `_unknown` map); assert byte-identical output across runs, platforms, and every FFI consumer, matching the shared known-answer vectors for the canonical ruleset: key sort including `_unknown`, shortest-form integers, definite-length only. Re-decode; assert structural equality. This is a blocking conformance gate, not advisory.
- Add-id counter durability (unit). Issue adds advancing the counter; drop the in-memory counter to simulate a restart/reinstall; reseed from the device’s existing sidecars; assert the next `add_id.counter` is strictly greater than every counter the device previously issued, never a reuse.
- Schema versioning enforcement (unit). Construct a sidecar with `sidecar_schema = N+1`; load on a reader whose `max_known = N`; assert write-refusal. Construct with `sidecar_schema = N`; assert acceptance.
- OR-set merge convergence (unit). Generate add/remove operations from N devices in random order; merge in every permutation; assert byte-identical final state across permutations.
- Add-id rejection (unit). Issue a remove with an `add_id` never observed locally; assert rejection (not silent no-op).
- LWW with superseded capture (unit). Two devices write captions within milliseconds; merge; assert the winner is the one chosen by the lexicographic tiebreak, and the loser appears in `superseded_captions`.
- Privacy-on-export stripping (unit). Each row of the privacy table is a fixture test: assert the field is stripped by default, retained when opt-in is set, and that the local sidecar is unchanged either way.
- Concurrent-edit reconciliation (smoke). Two test clients edit the same album offline; merge over MLS; assert convergence with no manual conflict resolution needed.
Cross-module case: metadata edited on device A → synced via server → applied on device B with correct CRDT merge. Bounded E2E surface in Module Map.
Related
- Asset Organization: albums and stacks that consume the `stack_membership` field.
- AI/ML Integrations: owner of the models behind `tags_ai` and the reserved AI-facet fields.
- Thumbnails and Previews: owner of the LQIP scheme carried in the `lqip` field.