Cord is a compact deterministic serialization format for Rust with first-class serde integration.
- Rich type system: structs, enums, sets, maps, byte arrays, date-times, decimals, UUIDs, options, and more
- Dynamic schemas: define data structures at runtime, encode and decode without compiled types, and mix freely with the serde path
- Forward evolution: wrap fields in Evolving<T> to round-trip unknown data (e.g., new enum variants) without data loss
- Fine-grained wire control: tune integer encoding, length prefix widths, and variant index sizes per field
- Deterministic output: every unique value produces exactly one byte sequence, making it safe to sign, hash, cache, and deduplicate serialized data
cargo add cord
Any type that derives Cord just works:
use cord::{serialize, deserialize, Cord};
#[derive(Cord, Debug, PartialEq)]
struct User {
id: u32,
name: String,
active: bool,
}
let user = User {
id: 42,
name: "Alice".to_string(),
active: true,
};
let bytes = serialize(&user).unwrap();
let deserialized: User = deserialize(&bytes).unwrap();
assert_eq!(user, deserialized);
#[derive(Cord)] generates both Serialize and Deserialize implementations. Types that already derive serde::Serialize and serde::Deserialize also work; #[derive(Cord)] is only needed when using Cord-specific field attributes.
Cord supports booleans, integers (i8–i128, u8–u128), floats (f32, f64), strings, byte arrays, options, sequences, structs, tuple structs, and enums out of the box.
Beyond primitive types and structs, Cord provides DateTime, Map, Set, Decimal, and Uuid — use them directly as field types, no annotations needed. Enums, options, and Vec<u8> all work out of the box.
use cord::{serialize, deserialize, Cord, DateTime, Decimal, Map, Set, Uuid};
use std::collections::{HashMap, HashSet};
#[derive(Cord, Debug, PartialEq)]
enum AccessLevel {
Public,
Restricted(Vec<String>),
}
#[derive(Cord, Debug, PartialEq)]
struct Document {
title: String,
access: AccessLevel,
created: DateTime, // Nanosecond-precision UTC timestamp
tags: Set<String>, // Serialized in sorted order
attributes: Map<String, String>, // Serialized sorted by key
description: Option<String>,
id: Uuid, // 16-byte canonical UUID
price: Decimal, // Arbitrary-precision decimal
}
let mut tags = HashSet::new();
tags.insert("important".to_string());
tags.insert("draft".to_string());
let mut attributes = HashMap::new();
attributes.insert("priority".to_string(), "high".to_string());
attributes.insert("version".to_string(), "2.0".to_string());
let doc = Document {
title: "Design Doc".to_string(),
access: AccessLevel::Restricted(vec!["alice".into(), "bob".into()]),
created: DateTime::now(),
tags: Set::from(tags),
attributes: Map::from(attributes),
description: None,
id: Uuid::from(uuid::Uuid::nil()),
price: Decimal::from_i64(1999, 2), // 19.99
};
let bytes = serialize(&doc).unwrap();
let decoded: Document = deserialize(&bytes).unwrap();
assert_eq!(doc, decoded);
When different parts of a system run different versions of the same schema, you need a way to handle unknown data without losing it. Evolving<T> length-prefixes the serialized payload so that if deserialization of the inner type fails (e.g., an unknown enum variant), the raw bytes are preserved and can be round-tripped without data loss:
use cord::{serialize, deserialize, Cord, Evolving};
#[derive(Cord, Debug, PartialEq)]
enum Status {
Active,
Inactive,
// Future versions may add more variants
}
#[derive(Cord, Debug, PartialEq)]
struct Message {
id: u32,
status: Evolving<Status>,
}
let msg = Message {
id: 1,
status: Evolving::new(Status::Active),
};
let bytes = serialize(&msg).unwrap();
let decoded: Message = deserialize(&bytes).unwrap();
// Known values are accessible
assert!(decoded.status.is_known());
assert_eq!(decoded.status.known(), Some(&Status::Active));
If a newer version adds Status::Pending and serializes it, older code will deserialize it as Evolving::Unknown(bytes), and re-serializing produces identical bytes.
The #[cord(evolving = N)] attribute controls the width of the length prefix used for the envelope:
| Attribute | Payload Length Prefix | Max Payload Size |
|---|---|---|
| #[cord(evolving = 8)] | u8 | 255 bytes |
| #[cord(evolving = 16)] | u16 | 65,535 bytes |
| #[cord(evolving = 32)] | u32 (default) | ~4 GiB |
#[derive(Cord, Debug, PartialEq)]
struct CompactMessage {
id: u32,
#[cord(evolving = 8)]
status: Evolving<Status>, // 1-byte length prefix instead of 4
}
Without the attribute, Evolving<T> defaults to a 32-bit length prefix.
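The envelope idea behind Evolving<T> can be illustrated without Cord itself. The following is a minimal sketch under stated assumptions, not Cord's implementation: the names Envelope, decode_envelope, and encode_envelope are hypothetical, the payload is assumed to be a u32 big-endian variant index assigned in declaration order, and the envelope is assumed to be a u32 big-endian length followed by the payload.

```rust
// Hypothetical sketch of a length-prefixed, forward-compatible envelope.
// This is NOT Cord's code; the byte layout is an assumption for illustration.

#[derive(Debug, PartialEq)]
enum Status {
    Active,
    Inactive,
}

#[derive(Debug, PartialEq)]
enum Envelope {
    Known(Status),
    Unknown(Vec<u8>), // raw payload this reader could not interpret
}

// Assumed payload layout: a u32 big-endian variant index.
fn decode_status(payload: &[u8]) -> Option<Status> {
    match payload {
        [0, 0, 0, 0] => Some(Status::Active),
        [0, 0, 0, 1] => Some(Status::Inactive),
        _ => None,
    }
}

fn encode_status(status: &Status) -> Vec<u8> {
    let index: u32 = match status {
        Status::Active => 0,
        Status::Inactive => 1,
    };
    index.to_be_bytes().to_vec()
}

// Reads a u32 big-endian length, then tries to decode exactly that many bytes.
// If the inner decode fails, the raw payload is kept instead of being dropped.
fn decode_envelope(bytes: &[u8]) -> Option<Envelope> {
    if bytes.len() < 4 {
        return None;
    }
    let (prefix, rest) = bytes.split_at(4);
    let len = u32::from_be_bytes([prefix[0], prefix[1], prefix[2], prefix[3]]) as usize;
    if rest.len() != len {
        return None; // this sketch expects the envelope to span the whole input
    }
    Some(match decode_status(rest) {
        Some(status) => Envelope::Known(status),
        None => Envelope::Unknown(rest.to_vec()),
    })
}

// Re-encoding an unknown payload writes the preserved bytes back unchanged.
fn encode_envelope(envelope: &Envelope) -> Vec<u8> {
    let payload = match envelope {
        Envelope::Known(status) => encode_status(status),
        Envelope::Unknown(raw) => raw.clone(),
    };
    let mut out = (payload.len() as u32).to_be_bytes().to_vec();
    out.extend_from_slice(&payload);
    out
}
```

In this sketch, a payload carrying variant index 2 (a variant the reader does not know) decodes to Envelope::Unknown, and encode_envelope reproduces the input bytes exactly. The length prefix is what lets the reader skip over a payload it cannot parse.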
For use cases where the data structure isn't known at compile time, Cord provides a dynamic path with runtime schemas. Schemas are themselves serializable Cord types, so you get schema hashing, compact binary representation, and schema-as-data for free.
use cord::Schema;
let user_schema = Schema::Struct(vec![
("name".into(), Schema::string()),
("age".into(), Schema::U32),
("active".into(), Schema::Bool),
]);
Schemas support the full range of Cord types via convenience constructors:
use cord::Schema;
let schema = Schema::Struct(vec![
("id".into(), Schema::varint(Schema::U32)),
("tags".into(), Schema::set(Schema::string())),
("metadata".into(), Schema::map(Schema::string(), Schema::string())),
("nickname".into(), Schema::option(Schema::string())),
]);
Use cord::dynamic::encode and cord::dynamic::decode when both sides agree on the schema:
use cord::{Schema, Value};
use cord::dynamic;
let schema = Schema::Struct(vec![
("name".into(), Schema::string()),
("age".into(), Schema::U32),
]);
let value = Value::Struct(vec![
("name".into(), Value::String("Alice".into())),
("age".into(), Value::U32(30)),
]);
// Encode to bytes — produces the same output as the serde path
let bytes = dynamic::encode(&value, &schema).unwrap();
// Decode back
let decoded = dynamic::decode(&schema, &bytes).unwrap();
assert_eq!(decoded, value);
The dynamic path produces identical bytes to the serde path, so you can freely mix them:
use cord::{serialize, deserialize, Cord, Schema, Value};
use cord::dynamic;
#[derive(Cord, Debug, PartialEq)]
struct User {
name: String,
age: u32,
}
let schema = Schema::Struct(vec![
("name".into(), Schema::string()),
("age".into(), Schema::U32),
]);
// Serialize with serde, decode dynamically
let user = User { name: "Alice".into(), age: 30 };
let serde_bytes = serialize(&user).unwrap();
let dynamic_val = dynamic::decode(&schema, &serde_bytes).unwrap();
// Encode dynamically, deserialize with serde
let dynamic_bytes = dynamic::encode(&dynamic_val, &schema).unwrap();
assert_eq!(serde_bytes, dynamic_bytes);
let decoded_user: User = deserialize(&dynamic_bytes).unwrap();
assert_eq!(decoded_user, user);
Since Cord guarantees deterministic serialization, you can compute canonical hashes of any serializable value: schemas, typed structs, dynamic values, or anything else. Enable the hash feature for built-in SHA3-256 hashing:
cargo add cord --features hash
use cord::{hash, Cord};
#[derive(Cord)]
struct User {
name: String,
age: u32,
}
let user = User { name: "Alice".into(), age: 30 };
// Compute a canonical SHA3-256 hash
let h: [u8; 32] = hash(&user).unwrap();
// Same value always produces the same hash, regardless of when or where
let h2: [u8; 32] = hash(&user).unwrap();
assert_eq!(h, h2);
Or bring your own hash function. Cord's deterministic encoding means serialize(value) always produces the same bytes for the same value:
use cord::{serialize, Cord};
#[derive(Cord)]
struct User {
name: String,
age: u32,
}
let user = User { name: "Alice".into(), age: 30 };
let bytes = serialize(&user).unwrap();
// Hash bytes with any algorithm you prefer
Build dynamic values with JSON-like syntax using the cord_value! macro:
use cord::{cord_value, to_value, from_value, Cord, Value};
#[derive(Cord, Debug, PartialEq)]
struct User {
name: String,
age: u32,
active: bool,
}
// Build a Value with JSON-like syntax
let value = cord_value!({
"name": "Alice",
"age": 30_u32,
"tags": ["admin", "user"],
"active": true,
});
// Convert a Value to a typed struct with from_value
let user_value = cord_value!({
"name": "Alice",
"age": 30_u32,
"active": true,
});
let user: User = from_value(&user_value).unwrap();
assert_eq!(user.name, "Alice");
// Go the other direction: typed struct → Value
let value2 = to_value(&user).unwrap();
assert_eq!(user_value, value2);
// Pattern match to inspect fields directly
if let Value::Struct(fields) = &user_value {
let (name, val) = &fields[0];
assert_eq!(name, "name");
assert_eq!(*val, Value::String("Alice".into()));
}
Type mapping for cord_value!:
- { "key": value, ... } becomes Value::Struct
- [a, b, c] becomes Value::Seq
- String literals become Value::String
- true/false become Value::Bool
- Integer literals use Rust's type inference: unsuffixed integers default to i32; use suffixes like 30_u32 or 7_u8 for explicit types
- Parenthesized expressions (expr) allow embedding variables and function calls
By default, Cord uses fixed-width big-endian encoding for integers, 32-bit (u32) length prefixes for sequences/strings/bytes, and 32-bit (u32) variant indices for enums. This makes the format predictable and easy to implement across languages.
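To make the default layout concrete, here is a minimal sketch of fixed-width big-endian integers and u32 length prefixes, using only the standard library. It is an illustration under assumptions, not Cord's encoder: the function names are hypothetical, the length prefix is assumed to count UTF-8 bytes, fields are assumed to be written back to back in declaration order, and NFC normalization is left out.

```rust
// Hypothetical sketch of the default wire layout described above.
// NOT Cord's encoder; the layout details are assumptions for illustration.

// Fixed-width big-endian u32: always 4 bytes, most significant byte first.
fn encode_u32(value: u32, out: &mut Vec<u8>) {
    out.extend_from_slice(&value.to_be_bytes());
}

// A string as a u32 big-endian byte length followed by its UTF-8 bytes.
// (NFC normalization is skipped in this sketch.)
fn encode_str(value: &str, out: &mut Vec<u8>) -> Result<(), String> {
    if value.len() > u32::MAX as usize {
        return Err("string does not fit a u32 length prefix".to_string());
    }
    encode_u32(value.len() as u32, out);
    out.extend_from_slice(value.as_bytes());
    Ok(())
}

// Encodes the `id` and `name` of a hypothetical record back to back.
fn encode_record(id: u32, name: &str) -> Result<Vec<u8>, String> {
    let mut out = Vec::new();
    encode_u32(id, &mut out);
    encode_str(name, &mut out)?;
    Ok(out)
}
```

Under these assumptions, encode_record(42, "Al") yields ten bytes: 00 00 00 2A for the id, 00 00 00 02 for the length, then 41 6C for the text. A fixed layout like this is easy to reimplement in another language, at the cost of a few bytes per field.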
For size-sensitive protocols, Cord provides field attributes to control encoding width. These require #[derive(Cord)] on the containing type.
Use #[cord(varint)] for compact variable-length encoding (LEB128 for unsigned, zigzag + LEB128 for signed). Works with all integer types from u8 to u128:
use cord::Cord;
#[derive(Cord, Debug, PartialEq)]
struct Compact {
#[cord(varint)]
small_value: u32, // 1 byte for values < 128
large_value: u32, // Always 4 bytes
#[cord(varint)]
big_id: u128, // Variable-length 128-bit support
}
Control the width of length prefixes (strings, byte arrays, sequences) and variant indices (enums) with #[cord(width = N)]. The attribute applies to whichever is relevant for the field type:
| Attribute | Wire Width | Applies To |
|---|---|---|
| #[cord(width = 8)] | u8 (1B) | Length prefix or variant index |
| #[cord(width = 16)] | u16 (2B) | Length prefix or variant index |
| #[cord(width = 64)] | u64 (8B) | Length prefix or variant index |
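For readers unfamiliar with the encodings that #[cord(varint)] refers to, the following is a minimal sketch of standard unsigned LEB128 and zigzag encoding for 64-bit values. The function names are hypothetical, and the sketch assumes Cord uses the textbook form of both schemes; it is not taken from Cord's source.

```rust
// Textbook LEB128 and zigzag, shown for illustration only.
// Function names are hypothetical and not part of Cord's API.

// Unsigned LEB128: 7 payload bits per byte, least significant group first.
// The high bit is set on every byte except the last one.
fn leb128_encode(mut value: u64) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return out;
        }
        out.push(byte | 0x80);
    }
}

// Zigzag maps signed to unsigned so small magnitudes stay small:
// 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
fn zigzag(value: i64) -> u64 {
    ((value << 1) ^ (value >> 63)) as u64
}

// Signed varint: zigzag first, then unsigned LEB128.
fn leb128_encode_signed(value: i64) -> Vec<u8> {
    leb128_encode(zigzag(value))
}
```

With this scheme, values below 128 take one byte, 300 takes two bytes (AC 02), and u64::MAX takes ten. The encoder always emits the shortest form, which is the property a deterministic format needs from a variable-length integer.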
Use #[cord(index = N)] on enum variants to assign explicit wire indices:
use cord::Cord;
#[derive(Cord, Debug, PartialEq)]
enum Command {
#[cord(index = 1)]
Ping,
#[cord(index = 5)]
Pong(u32),
#[cord(index = 100)]
Reset,
}
If any variant has #[cord(index)], all variants must have it.
use cord::Cord;
#[derive(Cord, Debug, PartialEq)]
struct Packet {
#[cord(width = 8)]
kind: Status, // 1-byte variant index instead of 4
#[cord(width = 8)]
name: String, // 1-byte length prefix instead of 4
#[cord(varint)]
sequence: u64, // Variable-length encoding
fixed: u32, // Standard 4-byte encoding
}
Cord guarantees that every unique value has exactly one binary representation. This is a property of the format itself (sorted collections, NFC-normalized strings, fixed-width or minimal-length encodings), not something you opt into.
This matters most when serialized bytes are inputs to cryptographic operations. If you sign or hash a data structure and later need to re-serialize it to verify the signature, you need identical bytes. Most formats can't promise that — key order in maps, variable-length integer encodings, and Unicode normalization differences can all silently produce different output for the same logical value.
With Cord, any implementation that follows the spec will produce the same bytes for the same data. You can serialize, deserialize, re-serialize, and the output is always identical. This makes it straightforward to use with signing, hashing, content-addressing, caching, and deduplication.
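The role of sorted collections can be shown with a small standard-library sketch. A HashMap iterates in an unspecified order, so writing entries in iteration order could give different bytes for equal maps; sorting by key first removes that freedom. The function names are hypothetical, and the layout (u32 big-endian entry count, then length-prefixed keys and values sorted by key bytes) is an assumption for illustration rather than Cord's specified format.

```rust
use std::collections::HashMap;

// Hypothetical helper: u32 big-endian byte length, then the UTF-8 bytes.
fn put_str(s: &str, out: &mut Vec<u8>) {
    out.extend_from_slice(&(s.len() as u32).to_be_bytes());
    out.extend_from_slice(s.as_bytes());
}

// Hypothetical sketch of a deterministic map encoding: entries are sorted
// by key bytes before writing, so insertion order cannot affect the output.
fn encode_map(map: &HashMap<String, String>) -> Vec<u8> {
    let mut entries: Vec<(&String, &String)> = map.iter().collect();
    entries.sort_by(|a, b| a.0.as_bytes().cmp(b.0.as_bytes()));

    let mut out = Vec::new();
    out.extend_from_slice(&(entries.len() as u32).to_be_bytes());
    for (key, value) in entries {
        put_str(key, &mut out);
        put_str(value, &mut out);
    }
    out
}
```

Two maps holding the same entries produce the same bytes from encode_map no matter how they were built, so a hash or signature over those bytes is stable across processes and machines.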
Cord is designed to defend against scenarios where attackers exploit ambiguities in data representation to bypass security controls, particularly in cryptographic contexts:
- Canonicalization bypass: Cryptographic systems often verify signatures against a normalized form while operating on raw input. Attackers exploit this gap by crafting inputs with trailing data, comment fields, or flexible encodings that bypass verification but execute differently. Classic examples include XML signature wrapping attacks and JWT header manipulation.
- Protocol confusion: When data is parsed differently across system boundaries, attackers can craft inputs that pass one subsystem's verifications and authorize malicious actions in downstream systems, effectively amounting to a payload substitution attack.
- Inconsistency: When third parties cannot independently reproduce the exact byte sequence of cryptographically authenticated data, verification becomes dependent on trusting the original signer's environment. In distributed verification systems like blockchains or certificate transparency logs, this can lead to consensus failures or validation errors.
Cord does not protect against:
- Side-channel attacks during serialization/deserialization
- Memory safety issues outside of Cord's implementation
- Malicious inputs exceeding reasonable size limits
- Implementation flaws in cryptographic primitives used with Cord outputs
Cord enforces NFC (Canonical Decomposition followed by Canonical Composition) normalization for all strings. Strings are automatically normalized to NFC during serialization, and the deserializer rejects non-NFC strings. This prevents equivalent Unicode sequences (e.g., é as the single code point U+00E9 vs. e followed by the combining acute accent U+0301) from producing different binary representations.
The deserializer enforces a maximum nesting depth to protect against stack overflows from deeply nested or malicious input. Both the serde and dynamic decoding paths track depth and return CordError::DepthLimitExceeded if the limit is exceeded. The default limit is 128, available as cord::DEFAULT_MAX_DEPTH.
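How a depth guard of this kind works can be sketched with a tiny recursive decoder for nested options. This is an illustration under assumptions, not Cord's decoder: decode_nested and DecodeError are hypothetical names, the tags 0x00 for None and 0x01 for Some are assumed, and the exact level at which Cord starts rejecting input may differ from this sketch.

```rust
// Hypothetical sketch of depth tracking in a recursive decoder.
// NOT Cord's decoder; tag values and boundary behavior are assumptions.

const MAX_DEPTH: usize = 128; // mirrors the documented default limit

#[derive(Debug, PartialEq)]
enum DecodeError {
    DepthLimitExceeded,
    UnexpectedEof,
    InvalidTag(u8),
}

// Counts the Some(...) layers wrapped around an innermost None.
// Each recursive call passes depth + 1, so hostile input cannot drive
// the recursion deeper than MAX_DEPTH frames.
fn decode_nested(bytes: &[u8], depth: usize) -> Result<usize, DecodeError> {
    if depth >= MAX_DEPTH {
        return Err(DecodeError::DepthLimitExceeded);
    }
    match bytes.split_first() {
        None => Err(DecodeError::UnexpectedEof),
        Some((&0x00, _)) => Ok(0),
        Some((&0x01, rest)) => Ok(1 + decode_nested(rest, depth + 1)?),
        Some((&tag, _)) => Err(DecodeError::InvalidTag(tag)),
    }
}
```

Calling decode_nested with depth 0 on three 0x01 bytes followed by 0x00 returns Ok(3), while 200 leading 0x01 bytes return DepthLimitExceeded before the stack can grow further. The check runs before any byte is read, so the cost of rejecting hostile input stays bounded.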
use cord::deserialize;
// Deeply nested options: Some(Some(Some(... None ...)))
// A 200-level nesting will be rejected at depth 128
let mut bytes = vec![0x01; 200]; // 200 layers of Some(...)
bytes.push(0x00); // innermost None
let result: Result<_, _> = deserialize::<Option<Option<Option<u8>>>>(&bytes);
// Fails with CordError::DepthLimitExceeded
| Feature | Default | Description |
|---|---|---|
| hash | off | Adds cord::hash() (SHA3-256 hashing) |
| Type | Support | Notes |
|---|---|---|
| Boolean | yes | |
| Integers (i8–i128, u8–u128) | yes | Fixed-width big-endian encoding (default) |
| Integers (varints) | yes | Opt-in variable-length encoding (LEB128/zigzag) |
| Floats (f32, f64) | yes | Big-endian IEEE 754; NaN rejected, −0 canonicalized to +0 |
| Char | yes | UTF-8, NFC-normalized, with length prefix |
| Strings | yes | UTF-8, NFC-normalized, with length prefix (u32 default) |
| Byte arrays | yes | With length prefix (u32 default) |
| Sequences | yes | With length prefix (u32 default) |
| Options | yes | |
| Struct/Tuple struct | yes | |
| Enums | yes | Variant index u32 default |
| Evolving | yes | Forward-compatible enum wrapper with length-prefixed payload |
| Set | yes | Sorted during serialization |
| Map | yes | Sorted by key during serialization |
| DateTime | yes | Nanosecond-precision UTC timestamp (seconds + nanos) |
| Decimal | yes | Arbitrary-precision decimal (u8 scale + two's complement unscaled) |
| Uuid | yes | 16-byte canonical UUID |
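The float row in the table above is worth a short illustration, since IEEE 754 has several bit patterns for values that compare equal or have no single canonical form. The sketch below shows one way to reject NaN and fold negative zero into positive zero before writing big-endian bytes. The name encode_f64 and the error type are hypothetical; this is not Cord's code.

```rust
// Hypothetical sketch of canonical float encoding, for illustration only.
fn encode_f64(value: f64) -> Result<[u8; 8], &'static str> {
    // NaN has many bit patterns, so no single encoding can be canonical.
    if value.is_nan() {
        return Err("NaN cannot be encoded canonically");
    }
    // -0.0 == 0.0 is true in IEEE 754, so this branch also catches -0.0
    // and replaces it with +0.0.
    let canonical = if value == 0.0 { 0.0 } else { value };
    Ok(canonical.to_be_bytes())
}
```

With this rule, 0.0 and -0.0 both encode to eight zero bytes, and two values that compare equal can never hash differently.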
- Not human-readable: Binary output requires tooling to inspect
- Additive schema evolution: Fields cannot be removed once added without breaking compatibility
- Wire format versioning: The format may change between major versions (v1 and v2 are not wire-compatible)
Cord v2 uses fixed-width big-endian encoding by default (16 bytes for 128-bit integers), which is fast to encode and decode. For size-sensitive applications, #[cord(varint)] and #[cord(width = N)] trade some speed for smaller output. Sets and Maps incur a sort during serialization.
cargo bench --bench performance
Cord v2 is a breaking change: the wire format is not compatible with v1. Data serialized with v1 cannot be deserialized with v2, and vice versa. If you have persisted v1 data, you will need to migrate it (deserialize with v1, re-serialize with v2).
Cord is a mature project that has seen production use in Backbone. Nevertheless, we urge users to:
- Thoroughly test before using in critical systems
- Be prepared for breaking changes in major versions
- Consider serialization format lock-in for long-term data storage
Our current priorities are:
- Comprehensive fuzzing
- Language bindings (Python, JavaScript, ...)
- Configurable limits for nested structures
- Formal verification of components
Anything else you'd like to see? Suggest a feature!
Built by Backbone