You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Project priority sequence per #28: complete encoder rewrite (#111 incl. #23) → speed/ratio optimizations (#178, #180) → params API (#27) → magicless (#26) → Phase 6 C-ABI / CLI drop-in (#126/#127/#128/#130/#131/#132) → THEN this track. lsm-tree bilateral coordination accepted 2026-05-18 — commitment preserved, but execution defers until drop-in parity ships. Pre-Phase-6 work on this issue will not be scheduled.
⚠️Feature gate (mandatory): all Rust code added by this issue is compiled only when the lsm Cargo feature is enabled (#[cfg(feature = "lsm")] on every new public item — module, struct, enum variant, impl block, function). Feature is default off, opt-in for downstream consumers. Without lsm: build is byte-identical to today, no new public symbols, cdylib from Phase 6 stays strict drop-in for donor libzstd v1.5.7. C FFI surface is unaffected regardless of feature state.
Status
Confirmed scope (2026-06-06). Originally framed as a recovery-only spike ("decode as much as possible past corruption"). Re-scoped after checking the downstream consumer (lsm-tree #257) and the live ECC path:
The consumer actually needs partial decode primarily for range-query performance: decode only the inner zstd blocks covering a key range instead of full-block decompression + row filtering.
So this lands as a block-subset primitive decode_blocks_partial(src, start_block, end_block) that serves performance first, with best-effort recovery as a general-purpose secondary mode of the same call (useful to non-encrypted callers; lsm-tree's corruption recovery lives one layer down on Page ECC). No longer a "may close without landing" spike: the perf path is a concrete, measurable deliverable.
Context
FrameDecoder::decode_blocks always decodes a frame from its first block and treats any block-decode failure as terminal. Two distinct consumer needs are unmet:
Performance (primary): a range query needs only a subset of keys from an SST block. Mapping that key range to a decompressed byte range and then to inner-zstd-block indices, the caller wants to decode only the covering blocks, skip the rest, and avoid materialising / scanning the non-target output.
Recovery (secondary): when ECC has exhausted its parity budget on a block and a zstd block is still unrecoverable, a caller may still want "give me whatever decoded successfully before the corruption" rather than discarding the whole frame.
A single block-subset partial decode covers both: a half-open [start_block, end_block) range for the perf path, and graceful stop-with-preserved-prefix for the recovery path.
Proposed scope
Decode side — decode_blocks_partial
#[cfg(feature = "lsm")]implFrameDecoder{/// Decode the inner blocks `[start_block, end_block)` of the current/// frame. Blocks before `start_block` are decoded into the window/// (required for match resolution — zstd blocks share one window, so/// leading blocks generally cannot be skipped) but their output is not/// returned; blocks at/after `end_block` are not decoded at all (the/// trailing-block perf win). Returns the decompressed bytes of the/// in-range blocks as one contiguous owned buffer.////// Doubles as best-effort recovery: if a block decode fails, decoding/// stops, the clean prefix of in-range output is preserved in `data`,/// and the failure is reported via `stopped_at`. `[0, u32::MAX)` decodes/// the whole frame, stopping at the first corrupt block (pure recovery).pubfndecode_blocks_partial(&mutself,src:implRead,start_block:u32,end_block:u32,// exclusive; u32::MAX = to end of frame) -> Result<PartialDecode,FrameDecoderError>;}#[cfg(feature = "lsm")]pubstructPartialDecode{/// Decompressed bytes of the in-range blocks actually decoded/// (contiguous; `data.len()` == sum of their decompressed sizes).pubdata:Vec<u8>,/// First block whose output is in `data` (== the requested start_block).pubstart_block:u32,/// Count of in-range blocks successfully decoded into `data`.pubblocks_decoded:u32,/// `Some((block_index, error))` if decode stopped on a corrupt block/// before reaching `end_block`; `None` if the range decoded cleanly/// (or the frame's last block was reached first).pubstopped_at:Option<(u32,FrameDecoderError)>,/// True if the frame's last block was reached during this decode.pubframe_finished:bool,}
Returns an owned Vec<u8> rather than the read()-cursor interface: a range query needs a contiguous buffer to scan keys anyway (the ring backend exposes two slices), and an owned buffer removes the window-retention ambiguity (the leading blocks' window bytes are physically required for in-range match resolution but must not leak into the returned output). A misused range (start_block > end_block) returns Err(FrameDecoderError::InvalidBlockRange { .. }), distinct from a corrupt-frame stop (reported via stopped_at).
Encode side — per-block decompressed byte ranges (consumer prerequisite)
#257 maps a key range → decompressed byte range → (start_block, end_block) and references FrameEmitInfo.blocks[*].decompressed_byte_range "from #173". That field was never specced or shipped in #173: FrameEmitInfo carries only compressed-side coordinates (offset_in_frame, body_size, ...). A Compressed block's decompressed size is not recoverable from its wire header (Block_Size is the compressed length), so the mapping must be captured encode-side. This issue adds it:
#[cfg(feature = "lsm")]pubstructFrameBlock{// ...existing fields.../// Decompressed (regenerated) size of this block's output in bytes.pubdecompressed_size:u32,}#[cfg(feature = "lsm")]implFrameEmitInfo{/// Half-open decompressed byte range `[start, end)` of `blocks[i]`/// in the frame's full decompressed output (prefix sum of/// `decompressed_size`). `None` if `i` is out of range.pubfndecompressed_byte_range(&self,block_index:usize) -> Option<core::ops::Range<u64>>;}
Populated from the encoder's actual per-block input chunk sizes, threaded through both emit paths (one-shot borrowed + streaming) at the same block-emit sites as the existing per-block XXH64 sidecar (including post-split partitions).
Why this is hard
Subset correctness: zstd blocks share one window, so leading blocks [0, start_block) cannot simply be skipped — they must be decoded into the window so the in-range blocks' match offsets resolve, then their output discarded from the front once decoding is complete (match resolution done). The win is on the trailing side (blocks past end_block are never touched) plus not materialising / scanning the leading output.
Recovery boundary: the fused sequence executor already rolls the output buffer back to a pre-block checkpoint on a sequence error, so blocks 0..F-1 stay clean; the exact in-range byte count is tracked via per-block length deltas, and any trailing garbage from the failing block is simply left undrained. No FSE/window unwinding is needed because decode stops rather than resumes.
Acceptance criteria
Happy-path full decode (no corruption, [0, u32::MAX)) yields bytes identical to decode_blocks for the same frame.
Subset [s, e) returns PartialDecode.data byte-identical to the [decompressed_byte_range(s).start .. decompressed_byte_range(e-1).end) slice of the full decode, for every boundary in {1 block, 2 blocks, half, all}.
Trailing blocks past end_block are not decoded (assert via a sentinel / instrumented reader that the source past the last in-range block's body is not consumed).
Corrupt block at index N within range → stopped_at == Some((N, _)), data holds exactly the decompressed bytes of the in-range blocks start_block..N, blocks_decoded == N - start_block.
start_block > end_block → Err(InvalidBlockRange).
FrameEmitInfo::decompressed_byte_range(i) matches the actual decoded output offsets for Raw / RLE / Compressed blocks, including post-split frames.
No happy-path perf regression on compare_ffi decompress benches (< 1% delta) — the new path is lsm-gated and off the default decode path.
All lib tests pass.
At least one integration test demonstrating lsm-tree-style range-query partial recovery: encode a multi-block frame, use decompressed_byte_range to pick (start, end), decode the subset, assert it equals the corresponding slice of a full decode.
decode_blocks_partial, PartialDecode, the new FrameBlock::decompressed_size field, and FrameEmitInfo::decompressed_byte_range are all gated behind the lsm Cargo feature (default off) — same gate as #171/#172/#173/#174. Default-build cdylib from #126/#127 remains strict drop-in for donor v1.5.7 (donor has no partial-decode primitive, so its absence in the no-feature build is correct).
Status
Confirmed scope (2026-06-06). Originally framed as a recovery-only spike ("decode as much as possible past corruption"). Re-scoped after checking the downstream consumer (lsm-tree #257) and the live ECC path:
So this lands as a block-subset primitive
decode_blocks_partial(src, start_block, end_block)that serves performance first, with best-effort recovery as a general-purpose secondary mode of the same call (useful to non-encrypted callers; lsm-tree's corruption recovery lives one layer down on Page ECC). No longer a "may close without landing" spike: the perf path is a concrete, measurable deliverable.Context
FrameDecoder::decode_blocksalways decodes a frame from its first block and treats any block-decode failure as terminal. Two distinct consumer needs are unmet:A single block-subset partial decode covers both: a half-open
[start_block, end_block)range for the perf path, and graceful stop-with-preserved-prefix for the recovery path.Proposed scope
Decode side —
decode_blocks_partialReturns an owned
Vec<u8>rather than theread()-cursor interface: a range query needs a contiguous buffer to scan keys anyway (the ring backend exposes two slices), and an owned buffer removes the window-retention ambiguity (the leading blocks' window bytes are physically required for in-range match resolution but must not leak into the returned output). A misused range (start_block > end_block) returnsErr(FrameDecoderError::InvalidBlockRange { .. }), distinct from a corrupt-frame stop (reported viastopped_at).Encode side — per-block decompressed byte ranges (consumer prerequisite)
#257maps a key range → decompressed byte range →(start_block, end_block)and referencesFrameEmitInfo.blocks[*].decompressed_byte_range"from #173". That field was never specced or shipped in #173:FrameEmitInfocarries only compressed-side coordinates (offset_in_frame,body_size, ...). A Compressed block's decompressed size is not recoverable from its wire header (Block_Sizeis the compressed length), so the mapping must be captured encode-side. This issue adds it:Populated from the encoder's actual per-block input chunk sizes, threaded through both emit paths (one-shot borrowed + streaming) at the same block-emit sites as the existing per-block XXH64 sidecar (including post-split partitions).
Why this is hard
[0, start_block)cannot simply be skipped — they must be decoded into the window so the in-range blocks' match offsets resolve, then their output discarded from the front once decoding is complete (match resolution done). The win is on the trailing side (blocks pastend_blockare never touched) plus not materialising / scanning the leading output.0..F-1stay clean; the exact in-range byte count is tracked via per-block length deltas, and any trailing garbage from the failing block is simply left undrained. No FSE/window unwinding is needed because decode stops rather than resumes.Acceptance criteria
[0, u32::MAX)) yields bytes identical todecode_blocksfor the same frame.[s, e)returnsPartialDecode.databyte-identical to the[decompressed_byte_range(s).start .. decompressed_byte_range(e-1).end)slice of the full decode, for every boundary in{1 block, 2 blocks, half, all}.end_blockare not decoded (assert via a sentinel / instrumented reader that the source past the last in-range block's body is not consumed).Nwithin range →stopped_at == Some((N, _)),dataholds exactly the decompressed bytes of the in-range blocksstart_block..N,blocks_decoded == N - start_block.start_block > end_block→Err(InvalidBlockRange).FrameEmitInfo::decompressed_byte_range(i)matches the actual decoded output offsets for Raw / RLE / Compressed blocks, including post-split frames.compare_ffidecompress benches (< 1% delta) — the new path islsm-gated and off the default decode path.decompressed_byte_rangeto pick(start, end), decode the subset, assert it equals the corresponding slice of a full decode.Related
decompressed_byte_rangefield added here is the mapping primitive perf(bench): pre-touch output Vec before measurement loop to eliminate page-fault artifact #257 needs; its reference to "feat(encoding+decoding): FrameEmitInfo block-layout introspection + opt-in per-block XXH64 sidecar #173" should point here.stopped_atreuses theblock_indexplumbing.FrameBlockwith the decompressed-size field feat(encoding+decoding): FrameEmitInfo block-layout introspection + opt-in per-block XXH64 sidecar #173 did not carry.ADDENDUM (2026-05-18): feature gating
decode_blocks_partial,PartialDecode, the newFrameBlock::decompressed_sizefield, andFrameEmitInfo::decompressed_byte_rangeare all gated behind thelsmCargo feature (default off) — same gate as #171/#172/#173/#174. Default-buildcdylibfrom #126/#127 remains strict drop-in for donor v1.5.7 (donor has no partial-decode primitive, so its absence in the no-feature build is correct).Bilateral cross-reference
decompressed_byte_range.stopped_atreuses feat(decoding): block-precise error position on FrameDecoderError variants #174'sFailedToReadBlockBodyAtblock_indexplumbing internally; feat(decoding): block-precise error position on FrameDecoderError variants #174 stays valid as shipped even though its lsm-bilateral consumer moved on.