feat: tiered data storage #7

Open
enigbe wants to merge 4 commits into main from 2025-10-tiered-data-storage
@enigbe enigbe commented Oct 21, 2025

Owner

What this PR does:

We introduce TierStore, a KVStore implementation that manages data across
three distinct storage layers.

The layers are:

  1. Primary: The main/remote data store.
  2. Ephemeral: A secondary store for non-critical, easily-rebuildable data
    (e.g., network graph). This tier aims to improve latency by leveraging a
    local KVStore designed for fast/local access.
  3. Backup: A tertiary store for disaster recovery. Backup operations are sent
    asynchronously/lazily to avoid blocking primary store operations.

We also allow Node to be configured with these stores: callers can set
exponential back-off parameters, supply backup and ephemeral stores, and build
the Node with a given store as TierStore's primary store. These configuration
options also extend to our foreign interface, allowing binding targets to build
the Node with their own ffi::KVStore implementations.

A sample Python implementation is added and tested.
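
As a rough illustration of what a foreign store might look like, here is a minimal in-memory sketch. The class name `InMemoryKVStore` is hypothetical, and the method names and parameters (`read`, `write`, `remove`, `list` over a primary namespace, secondary namespace, and key) are assumed to mirror LDK's `KVStore` trait; the generated bindings may differ (for example in base class, error types, or async variants).

```python
import threading
from typing import Dict, List, Tuple

# (primary_namespace, secondary_namespace, key)
Key = Tuple[str, str, str]


class InMemoryKVStore:
    """Hypothetical dict-backed store; not the sample shipped in this PR."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._data: Dict[Key, bytes] = {}

    def read(self, primary_namespace: str, secondary_namespace: str, key: str) -> bytes:
        with self._lock:
            # Raises KeyError when the entry does not exist.
            return self._data[(primary_namespace, secondary_namespace, key)]

    def write(self, primary_namespace: str, secondary_namespace: str, key: str, buf: bytes) -> None:
        with self._lock:
            self._data[(primary_namespace, secondary_namespace, key)] = bytes(buf)

    def remove(self, primary_namespace: str, secondary_namespace: str, key: str, lazy: bool = False) -> None:
        with self._lock:
            # Removing a missing key is treated as a no-op.
            self._data.pop((primary_namespace, secondary_namespace, key), None)

    def list(self, primary_namespace: str, secondary_namespace: str) -> List[str]:
        with self._lock:
            return sorted(
                k
                for (p, s, k) in self._data
                if p == primary_namespace and s == secondary_namespace
            )
```

A store of this shape could stand in for any of the three tiers in tests, since it has no external dependencies.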

Additionally, we add comprehensive testing for TierStore by introducing:

  1. Unit tests for TierStore core functionality.
  2. Integration tests for Node built with tiered storage.
  3. Python FFI tests for foreign ffi::KVStore implementations.

Concerns

It is worth considering how retry logic is handled, especially given the risk of
nested retries. TierStore ships with basic retry logic by default, but some
KVStore implementations (e.g. VssStore) already have retries built in and thus
have no need for the wrapper store's own logic.
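
To make the nested-retry concern concrete, the sketch below wraps an operation in a generic exponential back-off helper and then nests one helper inside another. The helper `with_backoff` and its parameters are illustrative assumptions, not TierStore's actual API; the point is only that attempts multiply when both layers retry.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_backoff(
    op: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.01,
    max_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `op` on OSError, doubling the delay after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return op()
        except OSError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(min(max_delay, base_delay * 2 ** attempt))
    raise AssertionError("unreachable")


attempts = []


def always_failing_write() -> None:
    attempts.append(1)
    raise OSError("store unavailable")


def inner_store_write() -> None:
    # Stands in for a store that already retries internally.
    with_backoff(always_failing_write, max_attempts=3, sleep=lambda _: None)


try:
    # The wrapper retries on top of the inner store's retries.
    with_backoff(inner_store_write, max_attempts=3, sleep=lambda _: None)
except OSError:
    pass

# 3 outer attempts x 3 inner attempts
print(len(attempts))
```

Passing `max_attempts=1` to the outer call would leave retrying entirely to the inner store, which is one way a wrapper could defer to implementations that already retry.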

@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch 9 times, most recently from 29f47f3 to 264aa7f Compare November 4, 2025 22:07
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch 3 times, most recently from a30cbfb to 1e7bdbc Compare December 4, 2025 23:30
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch 5 times, most recently from b5e980f to 67d47c2 Compare February 4, 2026 16:28
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch 2 times, most recently from 95285b0 to 4b2d345 Compare February 18, 2026 11:23
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch 2 times, most recently from cba29a3 to db1fe83 Compare February 24, 2026 23:03
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch from db1fe83 to 35f9ec2 Compare March 9, 2026 15:47
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch 3 times, most recently from e89ada5 to 3abd0a5 Compare April 1, 2026 21:44
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch 5 times, most recently from 8f122cb to 7155c40 Compare April 6, 2026 10:15
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch 2 times, most recently from 7720935 to 181db03 Compare April 9, 2026 07:24
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch 5 times, most recently from 5da76a8 to dec4aa7 Compare April 27, 2026 08:52
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch 3 times, most recently from c1563e3 to a2458e4 Compare May 6, 2026 07:37
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch from 8dbb312 to 0ae61b7 Compare May 19, 2026 06:05
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch 7 times, most recently from cadf81f to 1f2cd4d Compare June 16, 2026 08:34
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch 3 times, most recently from 3eef01d to c4aa2b0 Compare June 24, 2026 09:06
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch 2 times, most recently from 0a240d4 to 10b9721 Compare June 29, 2026 07:01
enigbe added 3 commits July 3, 2026 18:35
This commit adds `TierStore`, a tiered `KVStore` implementation that
routes node persistence across three storage roles:

- a primary store for durable, authoritative data
- an optional backup store for a second durable copy of primary-backed data
- an optional ephemeral store for rebuildable cached data such as the
  network graph and scorer

TierStore routes ephemeral cache data to the ephemeral store when
configured, while durable data remains primary+backup. Reads and lists
do not consult the backup store during normal operation.
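
The routing rule described above can be sketched as follows. This is a simplified model using plain dictionaries as stores; the `TierRouter` class and the contents of `EPHEMERAL_KEYS` (the key names for the network graph and scorer in the root namespace) are assumptions made for illustration, and the backup store is left out because reads never consult it.

```python
from typing import Dict, Optional, Tuple

# (primary_namespace, secondary_namespace, key)
Key = Tuple[str, str, str]

# Assumed identifiers for the rebuildable cached data.
EPHEMERAL_KEYS = {("", "", "network_graph"), ("", "", "scorer")}


class TierRouter:
    """Hypothetical router: picks the tier that owns a given key."""

    def __init__(
        self,
        primary: Dict[Key, bytes],
        ephemeral: Optional[Dict[Key, bytes]] = None,
    ) -> None:
        self.primary = primary
        self.ephemeral = ephemeral

    def store_for(self, key: Key) -> Dict[Key, bytes]:
        # Ephemeral-cached keys go to the ephemeral store only when one is
        # configured; everything else stays on the primary store.
        if self.ephemeral is not None and key in EPHEMERAL_KEYS:
            return self.ephemeral
        return self.primary

    def write(self, key: Key, value: bytes) -> None:
        self.store_for(key)[key] = value

    def read(self, key: Key) -> bytes:
        return self.store_for(key)[key]
```

Without an ephemeral store, the same keys fall back to the primary store, so the node behaves the same way with or without the extra tier.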

For primary+backup writes and removals, this implementation treats the
backup store as part of the persistence success path rather than as a
best-effort background mirror. Earlier designs used asynchronous backup
queueing to avoid blocking the primary path, but that weakens the
durability contract by allowing primary success to be reported before
backup persistence has completed. TierStore now issues primary and backup
operations together and only returns success once both complete.

This gives callers a clearer persistence guarantee when a backup store is
configured: acknowledged primary+backup mutations have been attempted
against both durable stores. The tradeoff is that dual-store operations
are not atomic across stores, so an error may still be returned after one
store has already been updated.
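
A minimal sketch of this contract, assuming dictionary-like stores and a thread pool to issue both writes together (the function `write_durable` and the `FailingStore` helper are hypothetical, not the Rust implementation):

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Optional


class FailingStore(dict):
    """Test double: a store whose writes always fail."""

    def __setitem__(self, key, value):
        raise OSError("backup unavailable")


def write_durable(primary: dict, backup: Optional[dict], key: str, value: bytes) -> None:
    """Return normally only once every configured durable store has the value."""
    if backup is None:
        primary[key] = value
        return
    # Issue both writes together; leaving the `with` block waits for both.
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [
            pool.submit(store.__setitem__, key, value)
            for store in (primary, backup)
        ]
    errors = [f.exception() for f in futures if f.exception() is not None]
    if errors:
        # Not atomic: the other store may already hold the new value.
        raise errors[0]


primary: dict = {}
try:
    write_durable(primary, FailingStore(), "channel_manager", b"state")
except OSError:
    # The caller sees an error even though the primary write went through.
    print("write failed; primary has key:", "channel_manager" in primary)
```

The example at the bottom shows the tradeoff named above: the error is reported, yet one store has already been updated.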

Additionally, this commit adds unit coverage for the current contract, including:
- basic read/write/remove/list persistence
- routing of ephemeral data away from the primary store
- backup participation in the foreground success path for writes and removals

Add native builder support for configuring ephemeral storage and a local
SQLite backup mirror.

Wrap the primary store in TierStore during node construction and create
configured secondary stores using dedicated SQLite database files.

Implement paginated listing through TierStore and update filesystem-backed
tests to use FilesystemStoreV2.

Add full-cycle integration coverage verifying durable backup mirroring.

Preserve call-time ordering for ephemeral writes and removes by routing them
through the same versioned lock path as primary-backed mutations.

Add regression coverage for stale ephemeral writes and removes.
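
One way to picture call-time ordering is a store that hands out a version when a mutation is requested and refuses to apply a mutation whose version is older than the last one applied to the same key. The sketch below is an assumption about the general technique, with hypothetical names (`VersionedStore`, `reserve_version`, `apply`); it is not the actual lock path used by `TierStore`.

```python
import threading
from typing import Dict, Optional


class VersionedStore:
    """Hypothetical store that drops mutations that arrive out of order."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._next_version = 0
        self._applied: Dict[str, int] = {}  # last applied version per key
        self.data: Dict[str, bytes] = {}

    def reserve_version(self) -> int:
        # Called when the mutation is requested, fixing its place in the order.
        with self._lock:
            self._next_version += 1
            return self._next_version

    def apply(self, version: int, key: str, value: Optional[bytes]) -> bool:
        """Apply a write (or a remove when `value` is None) unless it is stale."""
        with self._lock:
            if version <= self._applied.get(key, 0):
                return False  # a newer mutation for this key already landed
            self._applied[key] = version
            if value is None:
                self.data.pop(key, None)
            else:
                self.data[key] = value
            return True


store = VersionedStore()
write_version = store.reserve_version()   # write requested first
remove_version = store.reserve_version()  # remove requested second

# The remove happens to execute first; the late write is then stale.
store.apply(remove_version, "scorer", None)
applied = store.apply(write_version, "scorer", b"old")
print(applied, "scorer" in store.data)
```

Under this scheme the final state reflects the order in which the caller issued the operations, not the order in which they happened to run.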
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch 2 times, most recently from 7750cf8 to 9820126 Compare July 3, 2026 23:13
Previously, the implementations of `list` and paginated listing on `TierStore`
contained a bug where, if the ephemeral store was configured, these operations
could return only keys from the ephemeral tier for shared root namespaces such
as ("", ""). This meant durable keys saved to the primary store could be ignored
whenever their namespace was shared with ephemeral-cached keys.

While the fix for `list` was straightforward, the fix for paginated listing
raised questions: the primary and ephemeral stores may be of different store
types, and each has its own pagination order and tokens, which would conflict.
`TierStore`, as currently designed, does not maintain a cross-store cursor, so
we rely on a simple overlay strategy: paginate through primary first, then
append ephemeral keys once primary is on its terminal page.

This does mean the resulting paginated listing is not a strict cross-store
creation-order merge: ephemeral keys are returned after primary pagination is
exhausted, regardless of their creation order in the ephemeral store. We accept
that tradeoff because `TierStore` has no shared creation order across backing
stores, and the overlay is bounded to the current ephemeral-cached key set.

This commit allows us to merge listings across `TierStore`'s primary and
ephemeral tiers by treating the latter as an overlay for ephemeral-cached keys
only. We filter stale primary copies of ephemeral-cached keys, append live
ephemeral copies once primary pagination is complete, and only consult the
ephemeral store for namespaces that can contain ephemeral-cached keys.
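
The overlay strategy can be sketched as follows, assuming list-backed stores and integer offsets as page tokens. The function names, the token format, and the example key names are hypothetical; real stores define their own ordering and tokens.

```python
from typing import List, Optional, Set, Tuple

Page = Tuple[List[str], Optional[int]]  # (keys, next_token)


def paginate(keys: List[str], page_size: int, token: Optional[int]) -> Page:
    """Stand-in for the primary store's own pagination (offset tokens)."""
    start = token or 0
    end = start + page_size
    return keys[start:end], (end if end < len(keys) else None)


def list_page(
    primary_keys: List[str],
    ephemeral_keys: List[str],
    ephemeral_cached: Set[str],
    page_size: int,
    token: Optional[int],
) -> Page:
    page, next_token = paginate(primary_keys, page_size, token)
    # Drop stale primary copies of keys that now live in the ephemeral tier.
    page = [k for k in page if k not in ephemeral_cached]
    if next_token is None:
        # Terminal primary page: append the live ephemeral copies.
        page += [k for k in ephemeral_keys if k in ephemeral_cached]
    return page, next_token


primary_keys = ["channel_a", "network_graph", "channel_b"]  # one stale copy
ephemeral_keys = ["network_graph", "scorer"]
cached = {"network_graph", "scorer"}

token: Optional[int] = None
pages = []
while True:
    page, token = list_page(primary_keys, ephemeral_keys, cached, 2, token)
    pages.append(page)
    if token is None:
        break
print(pages)
```

Note that a page may hold fewer keys than the page size after filtering, and the terminal page may hold more once the ephemeral keys are appended.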
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch from 9820126 to 81e65cf Compare July 3, 2026 23:19