GitHub - backbone-hq/cord: Canonical serialization in Rust for security-sensitive applications.

Cord is a compact deterministic serialization format for Rust with first-class serde integration.

Rich type system — structs, enums, sets, maps, byte arrays, date-times, decimals, UUIDs, options, and more
Dynamic schemas — define data structures at runtime, encode and decode without compiled types, and mix freely with the serde path
Forward evolution — wrap fields in Evolving<T> to round-trip unknown data (e.g., new enum variants) without data loss
Fine-grained wire control — tune integer encoding, length prefix widths, and variant index sizes per field
Deterministic output — every unique value produces exactly one byte sequence, making it safe to sign, hash, cache, and deduplicate serialized data

Installation

cargo add cord

Quick Start

Any type that derives Cord just works:

use cord::{serialize, deserialize, Cord};

#[derive(Cord, Debug, PartialEq)]
struct User {
    id: u32,
    name: String,
    active: bool,
}

let user = User {
    id: 42,
    name: "Alice".to_string(),
    active: true,
};

let bytes = serialize(&user).unwrap();
let deserialized: User = deserialize(&bytes).unwrap();
assert_eq!(user, deserialized);

#[derive(Cord)] generates both Serialize and Deserialize implementations. Types that already derive serde::Serialize and serde::Deserialize also work — #[derive(Cord)] is only needed when using Cord-specific field attributes.

Cord supports booleans, integers (i8–i128, u8–u128), floats (f32, f64), strings, byte arrays, options, sequences, structs, tuple structs, and enums out of the box.

Beyond the Basics

Beyond primitive types and structs, Cord provides DateTime, Map, Set, Decimal, and Uuid — use them directly as field types, no annotations needed. Enums, options, and Vec<u8> all work out of the box.

use cord::{serialize, deserialize, Cord, DateTime, Decimal, Map, Set, Uuid};
use std::collections::{HashMap, HashSet};

#[derive(Cord, Debug, PartialEq)]
enum AccessLevel {
    Public,
    Restricted(Vec<String>),
}

#[derive(Cord, Debug, PartialEq)]
struct Document {
    title: String,
    access: AccessLevel,
    created: DateTime,             // Nanosecond-precision UTC timestamp
    tags: Set<String>,             // Serialized in sorted order
    attributes: Map<String, String>, // Serialized sorted by key
    description: Option<String>,
    id: Uuid,                      // 16-byte canonical UUID
    price: Decimal,                // Arbitrary-precision decimal
}

let mut tags = HashSet::new();
tags.insert("important".to_string());
tags.insert("draft".to_string());

let mut attributes = HashMap::new();
attributes.insert("priority".to_string(), "high".to_string());
attributes.insert("version".to_string(), "2.0".to_string());

let doc = Document {
    title: "Design Doc".to_string(),
    access: AccessLevel::Restricted(vec!["alice".into(), "bob".into()]),
    created: DateTime::now(),
    tags: Set::from(tags),
    attributes: Map::from(attributes),
    description: None,
    id: Uuid::from(uuid::Uuid::nil()),
    price: Decimal::from_i64(1999, 2), // 19.99
};

let bytes = serialize(&doc).unwrap();
let decoded: Document = deserialize(&bytes).unwrap();
assert_eq!(doc, decoded);
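
The `Decimal::from_i64(1999, 2)` call above is annotated as 19.99, which suggests an unscaled integer paired with a base-10 scale. The following std-only sketch illustrates that interpretation. `format_decimal` is a hypothetical helper written for this example and is not part of Cord's API.

```rust
/// Render `unscaled * 10^(-scale)` as a decimal string.
/// Hypothetical helper for illustration; not a Cord function.
fn format_decimal(unscaled: i64, scale: u8) -> String {
    let sign = if unscaled < 0 { "-" } else { "" };
    let digits = unscaled.unsigned_abs().to_string();
    let scale = scale as usize;

    // Left-pad with zeros so at least one digit precedes the decimal point.
    let padded = format!("{:0>width$}", digits, width = scale + 1);
    let (int_part, frac_part) = padded.split_at(padded.len() - scale);

    if scale == 0 {
        format!("{}{}", sign, int_part)
    } else {
        format!("{}{}.{}", sign, int_part, frac_part)
    }
}
```

Under this reading, `format_decimal(1999, 2)` yields `"19.99"` and `format_decimal(-5, 2)` yields `"-0.05"`.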

Forward Evolution

When different parts of a system run different versions of the same schema, you need a way to handle unknown data without losing it. Evolving<T> length-prefixes the serialized payload so that if deserialization of the inner type fails (e.g., an unknown enum variant), the raw bytes are preserved and can be round-tripped without data loss:

use cord::{serialize, deserialize, Cord, Evolving};

#[derive(Cord, Debug, PartialEq)]
enum Status {
    Active,
    Inactive,
    // Future versions may add more variants
}

#[derive(Cord, Debug, PartialEq)]
struct Message {
    id: u32,
    status: Evolving<Status>,
}

let msg = Message {
    id: 1,
    status: Evolving::new(Status::Active),
};

let bytes = serialize(&msg).unwrap();
let decoded: Message = deserialize(&bytes).unwrap();

// Known values are accessible
assert!(decoded.status.is_known());
assert_eq!(decoded.status.known(), Some(&Status::Active));

If a newer version adds Status::Pending and serializes it, older code will deserialize it as Evolving::Unknown(bytes) — and re-serializing produces identical bytes.

The #[cord(evolving = N)] attribute controls the width of the length prefix used for the envelope:

Attribute	Payload Length Prefix	Max Payload Size
`#[cord(evolving = 8)]`	u8	255 bytes
`#[cord(evolving = 16)]`	u16	65,535 bytes
`#[cord(evolving = 32)]`	u32 (default)	~4 GiB

#[derive(Cord, Debug, PartialEq)]
struct CompactMessage {
    id: u32,
    #[cord(evolving = 8)]
    status: Evolving<Status>,  // 1-byte length prefix instead of 4
}

Without the attribute, Evolving<T> defaults to a 32-bit length prefix.
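
To make the envelope idea concrete, here is a std-only sketch of a length-prefixed payload that falls back to raw bytes when the inner value is not recognized. It is not Cord's implementation: the `Envelope` type, the helper functions, and the exact byte layout (a big-endian u32 length followed by a big-endian u32 variant index) are assumptions made for illustration.

```rust
/// An older reader's view of `Status` (it does not know about `Pending`).
#[derive(Debug, PartialEq)]
enum Status {
    Active,
    Inactive,
}

/// Hypothetical envelope: a decoded value, or the raw payload kept verbatim.
#[derive(Debug, PartialEq)]
enum Envelope {
    Known(Status),
    Unknown(Vec<u8>),
}

/// Read exactly four bytes as a big-endian u32.
fn read_u32(bytes: &[u8]) -> Option<u32> {
    match bytes {
        [a, b, c, d] => Some(u32::from_be_bytes([*a, *b, *c, *d])),
        _ => None,
    }
}

/// Assumed payload layout for a unit variant: its index as a big-endian u32.
fn parse_status(payload: &[u8]) -> Option<Status> {
    match read_u32(payload)? {
        0 => Some(Status::Active),
        1 => Some(Status::Inactive),
        _ => None, // unknown variant, e.g. one added by a newer writer
    }
}

fn encode(envelope: &Envelope) -> Vec<u8> {
    let payload = match envelope {
        Envelope::Known(Status::Active) => 0u32.to_be_bytes().to_vec(),
        Envelope::Known(Status::Inactive) => 1u32.to_be_bytes().to_vec(),
        Envelope::Unknown(raw) => raw.clone(),
    };
    // Length prefix first, then the payload bytes.
    let mut out = (payload.len() as u32).to_be_bytes().to_vec();
    out.extend_from_slice(&payload);
    out
}

fn decode(bytes: &[u8]) -> Option<Envelope> {
    if bytes.len() < 4 {
        return None;
    }
    let (prefix, payload) = bytes.split_at(4);
    if payload.len() != read_u32(prefix)? as usize {
        return None; // length prefix must match the payload exactly
    }
    Some(match parse_status(payload) {
        Some(status) => Envelope::Known(status),
        None => Envelope::Unknown(payload.to_vec()),
    })
}
```

Because the length prefix tells the reader where the payload ends, an unrecognized payload can be skipped and stored without being understood, and re-encoding it reproduces the original bytes.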

Dynamic Schemas

For use cases where the data structure isn't known at compile time, Cord provides a dynamic path with runtime schemas. Schemas are themselves serializable Cord types, so you get schema hashing, compact binary representation, and schema-as-data for free.

Defining a Schema

use cord::Schema;

let user_schema = Schema::Struct(vec![
    ("name".into(), Schema::string()),
    ("age".into(), Schema::U32),
    ("active".into(), Schema::Bool),
]);

Schemas support the full range of Cord types via convenience constructors:

use cord::Schema;

let schema = Schema::Struct(vec![
    ("id".into(), Schema::varint(Schema::U32)),
    ("tags".into(), Schema::set(Schema::string())),
    ("metadata".into(), Schema::map(Schema::string(), Schema::string())),
    ("nickname".into(), Schema::option(Schema::string())),
]);

Encoding and Decoding

Use cord::dynamic::encode and cord::dynamic::decode when both sides agree on the schema:

use cord::{Schema, Value};
use cord::dynamic;

let schema = Schema::Struct(vec![
    ("name".into(), Schema::string()),
    ("age".into(), Schema::U32),
]);

let value = Value::Struct(vec![
    ("name".into(), Value::String("Alice".into())),
    ("age".into(), Value::U32(30)),
]);

// Encode to bytes — produces the same output as the serde path
let bytes = dynamic::encode(&value, &schema).unwrap();

// Decode back
let decoded = dynamic::decode(&schema, &bytes).unwrap();
assert_eq!(decoded, value);

Cross-Path Compatibility

The dynamic path produces identical bytes to the serde path, so you can freely mix them:

use cord::{serialize, deserialize, Cord, Schema, Value};
use cord::dynamic;

#[derive(Cord, Debug, PartialEq)]
struct User {
    name: String,
    age: u32,
}

let schema = Schema::Struct(vec![
    ("name".into(), Schema::string()),
    ("age".into(), Schema::U32),
]);

// Serialize with serde, decode dynamically
let user = User { name: "Alice".into(), age: 30 };
let serde_bytes = serialize(&user).unwrap();
let dynamic_val = dynamic::decode(&schema, &serde_bytes).unwrap();

// Encode dynamically, deserialize with serde
let dynamic_bytes = dynamic::encode(&dynamic_val, &schema).unwrap();
assert_eq!(serde_bytes, dynamic_bytes);
let decoded_user: User = deserialize(&dynamic_bytes).unwrap();
assert_eq!(decoded_user, user);

Hashing

Since Cord guarantees deterministic serialization, you can compute canonical hashes of any serializable value — schemas, typed structs, dynamic values, or anything else. Enable the hash feature for built-in SHA3-256 hashing:

cargo add cord --features hash

use cord::{hash, Cord};

#[derive(Cord)]
struct User {
    name: String,
    age: u32,
}

let user = User { name: "Alice".into(), age: 30 };

// Compute a canonical SHA3-256 hash
let h: [u8; 32] = hash(&user).unwrap();

// Same value always produces the same hash, regardless of when or where
let h2: [u8; 32] = hash(&user).unwrap();
assert_eq!(h, h2);

Or bring your own hash — Cord's deterministic encoding means serialize(value) always produces the same bytes for the same value:

use cord::{serialize, Cord};

#[derive(Cord)]
struct User {
    name: String,
    age: u32,
}

let user = User { name: "Alice".into(), age: 30 };
let bytes = serialize(&user).unwrap();
// Hash bytes with any algorithm you prefer

Dynamic Values

Build dynamic values with JSON-like syntax using the cord_value! macro:

use cord::{cord_value, to_value, from_value, Cord, Value};

#[derive(Cord, Debug, PartialEq)]
struct User {
    name: String,
    age: u32,
    active: bool,
}

// Build a Value with JSON-like syntax
let value = cord_value!({
    "name": "Alice",
    "age": 30_u32,
    "tags": ["admin", "user"],
    "active": true,
});

// Convert a Value to a typed struct with from_value
let user_value = cord_value!({
    "name": "Alice",
    "age": 30_u32,
    "active": true,
});
let user: User = from_value(&user_value).unwrap();
assert_eq!(user.name, "Alice");

// Go the other direction: typed struct → Value
let value2 = to_value(&user).unwrap();
assert_eq!(user_value, value2);

// Pattern match to inspect fields directly
if let Value::Struct(fields) = &user_value {
    let (name, val) = &fields[0];
    assert_eq!(name, "name");
    assert_eq!(*val, Value::String("Alice".into()));
}

Type mapping for cord_value!:

{ "key": value, ... } becomes Value::Struct
[a, b, c] becomes Value::Seq
String literals become Value::String
true/false become Value::Bool
Integer literals use Rust's type inference — unsuffixed integers default to i32, use suffixes like 30_u32 or 7_u8 for explicit types
Parenthesized expressions (expr) allow embedding variables and function calls

Tuning the Wire Format

By default, Cord uses fixed-width big-endian encoding for integers, 32-bit (u32) length prefixes for sequences/strings/bytes, and 32-bit (u32) variant indices for enums. This makes the format predictable and easy to implement across languages.

For size-sensitive protocols, Cord provides field attributes to control encoding width. These require #[derive(Cord)] on the containing type.

Variable-Length Integers

Use #[cord(varint)] for compact variable-length encoding (LEB128 for unsigned, zigzag + LEB128 for signed). It works with all integer types (i8–i128, u8–u128):

use cord::Cord;

#[derive(Cord, Debug, PartialEq)]
struct Compact {
    #[cord(varint)]
    small_value: u32,       // 1 byte for values < 128
    large_value: u32,       // Always 4 bytes
    #[cord(varint)]
    big_id: u128,           // Variable-length 128-bit support
}
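
For readers unfamiliar with the two encodings named above, the following std-only sketch shows LEB128 and zigzag for 64-bit values. It is independent of Cord's implementation. The function names are hypothetical, and the rejection of non-minimal encodings in the decoder is an assumption about what a canonical decoder would need, not a documented Cord behavior.

```rust
/// LEB128: 7 payload bits per byte, low bits first; the high bit is set
/// on every byte except the last.
fn leb128_encode(mut value: u64) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return out;
        }
        out.push(byte | 0x80);
    }
}

/// Decode one LEB128 value, returning it with the number of bytes consumed.
/// Returns `None` for truncated, overlong, or non-minimal input.
fn leb128_decode(bytes: &[u8]) -> Option<(u64, usize)> {
    let mut value: u64 = 0;
    for (i, &byte) in bytes.iter().enumerate() {
        let payload = (byte & 0x7f) as u64;
        // A u64 needs at most 10 bytes, and the 10th carries a single bit.
        if i >= 10 || (i == 9 && payload > 1) {
            return None;
        }
        value |= payload << (7 * i as u32);
        if byte & 0x80 == 0 {
            // A zero final byte after the first means the value was padded.
            if i > 0 && byte == 0 {
                return None;
            }
            return Some((value, i + 1));
        }
    }
    None
}

/// Zigzag: interleave negative and positive values so that small
/// magnitudes map to small unsigned numbers (0, -1, 1, -2 -> 0, 1, 2, 3).
fn zigzag_encode(value: i64) -> u64 {
    ((value << 1) ^ (value >> 63)) as u64
}

/// Inverse of `zigzag_encode`.
fn zigzag_decode(value: u64) -> i64 {
    ((value >> 1) as i64) ^ -((value & 1) as i64)
}
```

With these definitions, 127 encodes to the single byte `0x7f`, while 300 takes two bytes, `0xac 0x02`. A signed value would be passed through `zigzag_encode` before `leb128_encode`.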

Width

Control the width of length prefixes (strings, byte arrays, sequences) and variant indices (enums) with #[cord(width = N)]. The attribute applies to whichever is relevant for the field type:

Attribute	Wire Width	Applies To
`#[cord(width = 8)]`	u8 (1B)	Length prefix or variant index
`#[cord(width = 16)]`	u16 (2B)	Length prefix or variant index
`#[cord(width = 64)]`	u64 (8B)	Length prefix or variant index
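
The effect of the prefix width on a string field can be sketched with the standard library alone. The `Width` enum and `encode_str` function below are hypothetical and only mirror the options in the table. The big-endian prefix and the error on overflow are assumptions for illustration, not a description of Cord's exact behavior.

```rust
/// Hypothetical prefix widths mirroring `width = 8/16/64` plus the 32-bit default.
#[derive(Clone, Copy)]
enum Width {
    W8,
    W16,
    W32,
    W64,
}

/// Write `s` as a big-endian length prefix of the chosen width followed by
/// its UTF-8 bytes. Fails if the length does not fit in the prefix.
fn encode_str(s: &str, width: Width) -> Result<Vec<u8>, String> {
    let len = s.len();
    let mut out = Vec::new();
    match width {
        Width::W8 => {
            if len > u8::MAX as usize {
                return Err(format!("length {} does not fit in u8", len));
            }
            out.push(len as u8);
        }
        Width::W16 => {
            if len > u16::MAX as usize {
                return Err(format!("length {} does not fit in u16", len));
            }
            out.extend_from_slice(&(len as u16).to_be_bytes());
        }
        Width::W32 => {
            if len as u64 > u32::MAX as u64 {
                return Err(format!("length {} does not fit in u32", len));
            }
            out.extend_from_slice(&(len as u32).to_be_bytes());
        }
        Width::W64 => out.extend_from_slice(&(len as u64).to_be_bytes()),
    }
    out.extend_from_slice(s.as_bytes());
    Ok(out)
}
```

In this sketch, "Alice" occupies 6 bytes with an 8-bit prefix and 9 bytes with the 32-bit default. A narrower prefix saves space on every value but caps the maximum length.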

Custom Variant Indices

Use #[cord(index = N)] on enum variants to assign explicit wire indices:

use cord::Cord;

#[derive(Cord, Debug, PartialEq)]
enum Command {
    #[cord(index = 1)]
    Ping,
    #[cord(index = 5)]
    Pong(u32),
    #[cord(index = 100)]
    Reset,
}

If any variant has #[cord(index)], all variants must have it.
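
As a rough picture of what explicit indices mean on the wire, the sketch below hand-encodes the same `Command` enum using only the standard library. It assumes the default 32-bit big-endian variant index followed directly by the variant's payload. The `encode` and `decode` functions are hypothetical and are not generated by Cord.

```rust
#[derive(Debug, PartialEq)]
enum Command {
    Ping,
    Pong(u32),
    Reset,
}

/// Write the explicit index (1, 5, or 100) as a big-endian u32,
/// followed by the payload if the variant has one.
fn encode(cmd: &Command) -> Vec<u8> {
    let (index, payload): (u32, Option<u32>) = match cmd {
        Command::Ping => (1, None),
        Command::Pong(n) => (5, Some(*n)),
        Command::Reset => (100, None),
    };
    let mut out = index.to_be_bytes().to_vec();
    if let Some(n) = payload {
        out.extend_from_slice(&n.to_be_bytes());
    }
    out
}

/// Accept only the three assigned indices; anything else is an error.
fn decode(bytes: &[u8]) -> Option<Command> {
    match bytes {
        [0, 0, 0, 1] => Some(Command::Ping),
        [0, 0, 0, 5, a, b, c, d] => {
            Some(Command::Pong(u32::from_be_bytes([*a, *b, *c, *d])))
        }
        [0, 0, 0, 100] => Some(Command::Reset),
        _ => None,
    }
}
```

Pinning indices this way keeps the wire value of each variant stable even if variants are later reordered in the source.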

Combining Attributes

use cord::Cord;

#[derive(Cord, Debug, PartialEq)]
struct Packet {
    // `Status` is the enum from the Forward Evolution example
    #[cord(width = 8)]
    kind: Status,           // 1-byte variant index instead of 4
    #[cord(width = 8)]
    name: String,           // 1-byte length prefix instead of 4
    #[cord(varint)]
    sequence: u64,          // Variable-length encoding
    fixed: u32,             // Standard 4-byte encoding
}

Deterministic Serialization

Cord guarantees that every unique value has exactly one binary representation. This is a property of the format itself — sorted collections, NFC-normalized strings, fixed-width or minimal-length encodings — not something you opt into.

This matters most when serialized bytes are inputs to cryptographic operations. If you sign or hash a data structure and later need to re-serialize it to verify the signature, you need identical bytes. Most formats can't promise that — key order in maps, variable-length integer encodings, and Unicode normalization differences can all silently produce different output for the same logical value.

With Cord, any implementation that follows the spec will produce the same bytes for the same data. You can serialize, deserialize, re-serialize, and the output is always identical. This makes it straightforward to use with signing, hashing, content-addressing, caching, and deduplication.
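
The map-ordering problem mentioned above is easy to reproduce with the standard library. In the sketch below, `encode_map` is a hypothetical function (not Cord's encoder) that sorts entries by key before writing them, so two `HashMap`s built in different insertion orders yield the same bytes. The length-prefix layout and the byte-wise key ordering are assumptions made for the example.

```rust
use std::collections::HashMap;

/// Write a string as a big-endian u32 length followed by its UTF-8 bytes.
fn encode_str(s: &str, out: &mut Vec<u8>) {
    out.extend_from_slice(&(s.len() as u32).to_be_bytes());
    out.extend_from_slice(s.as_bytes());
}

/// Encode a map as an entry count followed by key/value pairs in sorted
/// key order. `HashMap` iteration order is unspecified, so sorting first
/// is what makes the output independent of insertion order.
fn encode_map(map: &HashMap<String, String>) -> Vec<u8> {
    let mut entries: Vec<(&String, &String)> = map.iter().collect();
    // Keys are unique, so ordering by key (byte-wise) is a total order.
    entries.sort_by(|a, b| a.0.cmp(b.0));

    let mut out = (entries.len() as u32).to_be_bytes().to_vec();
    for (key, value) in entries {
        encode_str(key, &mut out);
        encode_str(value, &mut out);
    }
    out
}
```

Encoding the map by iterating it directly, without the sort, could produce different bytes from run to run, which would break any signature or hash computed over the output.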

Threat Model

Cord is designed to defend against scenarios where attackers exploit ambiguities in data representation to bypass security controls, particularly in cryptographic contexts:

Canonicalization bypass: Cryptographic systems often verify signatures against a normalized form while operating on raw input. Attackers exploit this gap by crafting inputs with trailing data, comment fields, or flexible encodings that bypass verification but execute differently. Classic examples include XML signature wrapping attacks and JWT header manipulation.
Protocol confusion: When data is parsed differently across system boundaries, attackers can craft inputs that pass one subsystem's verifications and authorize malicious actions in downstream systems, effectively amounting to a payload substitution attack.
Inconsistency: When third parties cannot independently reproduce the exact byte sequence of cryptographically authenticated data, verification becomes dependent on trusting the original signer's environment. In distributed verification systems like blockchains or certificate transparency logs, this can lead to consensus failures or validation errors.

Cord does not protect against:

Side-channel attacks during serialization/deserialization
Memory safety issues outside of Cord's implementation
Malicious inputs exceeding reasonable size limits
Implementation flaws in cryptographic primitives used with Cord outputs

Unicode Normalization

Cord enforces NFC (Canonical Decomposition followed by Canonical Composition) normalization for all strings. Strings are automatically normalized to NFC during serialization, and the deserializer rejects non-NFC strings. This prevents equivalent Unicode sequences (e.g., é as a single code point vs. e + combining acute accent) from producing different binary representations.
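
The byte-level difference between the two forms can be seen with the standard library alone. This sketch only inspects UTF-8 bytes; it does not perform normalization, and the constant and function names are chosen for this example.

```rust
/// "é" as one precomposed code point, U+00E9. This is the NFC form.
const COMPOSED: &str = "\u{00E9}";

/// "é" as "e" followed by U+0301 COMBINING ACUTE ACCENT. This is the NFD form.
const DECOMPOSED: &str = "e\u{0301}";

/// Return the UTF-8 bytes that a length-prefixed string encoding would carry.
fn utf8_bytes(s: &str) -> Vec<u8> {
    s.as_bytes().to_vec()
}
```

`utf8_bytes(COMPOSED)` is `[0xC3, 0xA9]`, while `utf8_bytes(DECOMPOSED)` is `[0x65, 0xCC, 0x81]`. The two strings render identically but would hash and sign differently if both were accepted as-is.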

Depth Limiting

The deserializer enforces a maximum nesting depth to protect against stack overflows from deeply nested or malicious input. Both the serde and dynamic decoding paths track depth and return CordError::DepthLimitExceeded if the limit is exceeded. The default limit is 128, available as cord::DEFAULT_MAX_DEPTH.

use cord::deserialize;

// Deeply nested options: Some(Some(Some(... None ...)))
// A 200-level nesting will be rejected at depth 128
let mut bytes = vec![0x01; 200]; // 200 layers of Some(...)
bytes.push(0x00);                // innermost None

let result: Result<_, _> = deserialize::<Option<Option<Option<u8>>>>(&bytes);
// Fails with CordError::DepthLimitExceeded
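
The general technique of depth tracking in a recursive decoder can be sketched without Cord. The code below walks nested option tags, taking 0x01 for `Some` and 0x00 for `None` as in the byte layout above, and passes a depth counter down each recursive call. `DecodeError`, `MAX_DEPTH`, and `count_some_layers` are hypothetical names, and the exact point at which the limit triggers is an assumption.

```rust
#[derive(Debug, PartialEq)]
enum DecodeError {
    DepthLimitExceeded,
    UnexpectedEnd,
    InvalidTag(u8),
}

/// Illustrative limit, set to the same value as the default quoted above.
const MAX_DEPTH: usize = 128;

/// Count how many `Some` layers wrap the innermost `None`.
/// `depth` is the nesting level of the tag about to be read.
fn count_some_layers(bytes: &[u8], depth: usize) -> Result<usize, DecodeError> {
    // Check the limit before consuming input, so hostile data cannot
    // drive the recursion (and the call stack) arbitrarily deep.
    if depth > MAX_DEPTH {
        return Err(DecodeError::DepthLimitExceeded);
    }
    match bytes.split_first() {
        None => Err(DecodeError::UnexpectedEnd),
        Some((&0x00, _)) => Ok(0),
        Some((&0x01, rest)) => count_some_layers(rest, depth + 1).map(|n| n + 1),
        Some((&tag, _)) => Err(DecodeError::InvalidTag(tag)),
    }
}
```

Called with `depth = 0`, this returns `Ok(10)` for ten `Some` tags followed by `None`, and `Err(DecodeError::DepthLimitExceeded)` for the 200-layer input.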

Feature Flags

Feature	Default	Description
`hash`	off	Adds `cord::hash()` (SHA3-256 hashing)

Supported Types Reference

Type	Support	Notes
Boolean	yes
Integers (i8–i128, u8–u128)	yes	Fixed-width big-endian encoding (default)
Integers (varints)	yes	Opt-in variable-length encoding (LEB128/zigzag)
Floats (f32, f64)	yes	Big-endian IEEE 754; NaN rejected, −0 canonicalized to +0
Char	yes	UTF-8, NFC-normalized, with length prefix
Strings	yes	UTF-8, NFC-normalized, with length prefix (u32 default)
Byte arrays	yes	With length prefix (u32 default)
Sequences	yes	With length prefix (u32 default)
Options	yes
Struct/Tuple struct	yes
Enums	yes	Variant index u32 default
Evolving	yes	Forward-compatible enum wrapper with length-prefixed payload
Set	yes	Sorted during serialization
Map	yes	Sorted by key during serialization
DateTime	yes	Nanosecond-precision UTC timestamp (seconds + nanos)
Decimal	yes	Arbitrary-precision decimal (u8 scale + two's complement unscaled)
Uuid	yes	16-byte canonical UUID
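
The float row above lists two canonicalization rules: NaN is rejected and −0 becomes +0. A std-only sketch of those rules for `f64` follows. `encode_f64` and `FloatError` are hypothetical names for this example, not Cord's API.

```rust
#[derive(Debug, PartialEq)]
enum FloatError {
    NaN,
}

/// Encode an f64 as 8 big-endian IEEE 754 bytes, rejecting NaN and
/// collapsing both zeros to +0.0 so each value has one representation.
fn encode_f64(value: f64) -> Result<[u8; 8], FloatError> {
    // NaN has many bit patterns, so no single encoding could be canonical.
    if value.is_nan() {
        return Err(FloatError::NaN);
    }
    // -0.0 == 0.0 under IEEE 754 comparison, so this branch catches both.
    let canonical = if value == 0.0 { 0.0 } else { value };
    Ok(canonical.to_bits().to_be_bytes())
}
```

Without the zero rule, `0.0` and `-0.0` would compare equal in Rust yet differ in their sign bit on the wire, giving one logical value two encodings.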

Limitations and Trade-offs

Not human-readable: Binary output requires tooling to inspect
Additive schema evolution: Fields cannot be removed once added without breaking compatibility
Wire format versioning: The format may change between major versions (v1 and v2 are not wire-compatible)

Performance

Cord v2 uses fixed-width big-endian encoding by default (16 bytes for 128-bit integers), which is fast to encode and decode. For size-sensitive applications, #[cord(varint)] and #[cord(width = N)] trade some speed for smaller output. Sets and Maps incur a sort during serialization.

cargo bench --bench performance

Migrating from v1

Cord v2 is a breaking change — the wire format is not compatible with v1. Data serialized with v1 cannot be deserialized with v2, and vice versa. If you have persisted v1 data, you will need to migrate it (deserialize with v1, re-serialize with v2).

Current Status

Cord is a mature project that has seen production use in Backbone. Nevertheless, we urge users to:

Thoroughly test before using in critical systems
Be prepared for breaking changes in major versions
Consider serialization format lock-in for long-term data storage

Roadmap

Our current priorities are:

Comprehensive fuzzing
Language bindings (Python, JavaScript, ...)
Configurable limits for nested structures
Formal verification of components

Anything else you'd like to see? Suggest a feature!

Built by Backbone
