feat(proxy): support OPE index type#390
Conversation
…-suite Picks up the in-flight OPE work in cipherstash-suite that hasn't been published yet. The newer suite renames Plaintext::Utf8Str → Text and Plaintext::JsonB → Json (same for ColumnType); this commit applies the mechanical follow-on changes in the proxy. Workspace builds clean and proxy unit tests (97) pass. The path override is temporary; the workspace dep should be reverted to the crates.io version once cipherstash-suite is published.
Recognises the new 'ope' index in encrypt config (alongside ore, match, unique, ste_vec) and routes it through cipherstash-client's `IndexType::Ope` so that ORE-style range/order operators work against OPE-indexed columns. - Adds `OpeIndexOpts` and `ope` field to the proxy's `Indexes` config. - Wires `Index::new_ope()` into `Column::into_column_config()`. - Adds an `encrypted_ope` integration-test table mirroring `encrypted` but with `ope`+`unique` indexes per column, and extends `clear()` to truncate it. - Adds 7 WHERE tests (int2/4/8, float8, date, text, bool) and 6 ORDER BY tests (asc/desc, NULLs first/last) targeting the new table. - Adds a `can_parse_ope_index` unit test mirroring `can_parse_ore_index`. CHANGELOG entry under [Unreleased].
Eliminates the parallel-test races that plagued the ORE WHERE/ORDER tests when run alongside each other (and the OPE tests against the shared `encrypted_ope` table). Each test now owns a dedicated table generated up front by a DO block in `tests/sql/schema.sql`, with the same shape as `encrypted` (minus jsonb) and the matching `add_search_config` calls. Other changes: - Drop `#[serial]` from `map_ore_index_order` — per-table isolation removes the need. - Drop the shared `encrypted_ope` table and the hand-added `clear_ope` helper. Tests use the new generic `clear_table(name)` to truncate their dedicated fixture. - Parameterise `ore_order_helpers` on the table name so the same helpers serve `map_ore_index_order` (per-test tables) and multitenant tests (shared `encrypted` table, isolated via keyset + `#[serial]`). End-to-end (parallel, default thread count): 38/38 ORE+OPE tests pass in ~1.6s, repeatedly. Was previously 14s serial / flaky in parallel.
📝 WalkthroughWalkthroughThis pull request introduces OPE (Order-Preserving Encryption) index type support alongside existing ORE functionality, refactors test infrastructure to use per-test tables instead of shared state, updates type system mappings from Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (2)
tests/sql/schema.sql (1)
172-307: Factor the ORE/OPE fixture setup into one generator.The two DO blocks duplicate the schema and search-config matrix. The next column or config change now has to stay in sync across
encrypted,encrypted_elixir, and both fixture generators, which is easy to drift.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/sql/schema.sql` around lines 172 - 307, The two DO $$ blocks duplicate table creation and eql_v2.add_search_config calls for ORE and OPE; refactor by extracting a single generator function or DO block that accepts an index_kind parameter ('ore' or 'ope') and a list of table name suffixes, then loops once to CREATE TABLE and call eql_v2.add_search_config and eql_v2.add_encrypted_constraint accordingly; update calls that currently use the separate test_tables arrays to pass the appropriate names and index_kind to the single generator and remove the duplicated second DO block (refer to the existing identifiers: the two DO $$ blocks, the test_tables arrays, and functions eql_v2.add_search_config / eql_v2.add_encrypted_constraint to locate where to consolidate).packages/cipherstash-proxy-integration/src/map_ope_index_order.rs (1)
14-56: Add text ordering test with non-ASCII or mixed-case data to verify collation semantics.The current text ordering tests (
map_ope_order_text_ascandmap_ope_order_text_desc) use only lowercase ASCII strings (["aardvark", "aplomb", "chimera", "chrysalis", "zephyr"]). Without explicitCOLLATE "C"in the query or test data containing mixed-case or non-ASCII characters, these tests won't detect if OPE byte-order semantics diverge from PostgreSQL's default text collation. For transparent zero-change SQL semantics, consider adding a case with non-ASCII characters (e.g., accented letters) or explicitly specifying collation to make the ordering contract explicit.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/cipherstash-proxy-integration/src/map_ope_index_order.rs` around lines 14 - 56, Update the tests map_ope_order_text_asc and map_ope_order_text_desc to validate collation semantics by either (a) changing the values array to include mixed-case and non-ASCII strings (e.g., accented characters) so ordering compares against PostgreSQL's text ordering for those characters, or (b) explicitly specifying a deterministic byte-wise collation in the SELECT (e.g., add COLLATE "C" to the ORDER BY clause) so the OPE byte-order behavior is asserted; adjust the expected vectors accordingly (use values.iter().map(...) or values.iter().rev().map(...) as already done) and keep the insert/select logic (insert variable, SELECT encrypted_text FROM {table} ORDER BY encrypted_text ...) the same.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@Cargo.toml`:
- Around line 46-50: The Cargo.toml currently uses local sibling path
dependencies cipherstash-client and cts-common which break clean CI/contributor
builds; either revert these entries to the published crates.io versions by
replacing the path entries for cipherstash-client and cts-common with the
appropriate crate version specifications, or add explicit provisioning in
CI/bootstrap to checkout the sibling cipherstash-suite repository before
building (ensuring the workspace path "../cipherstash-suite/packages/..."
exists), and remove or update the TODO comment accordingly so builds no longer
rely on an unchecked-out sibling repo.
In `@CHANGELOG.md`:
- Around line 9-11: The changelog entry includes implementation-level details;
revise the "Added" OPE entry to be release-facing by keeping only the
user-visible summary (e.g., "Added OPE (Order-Preserving Encryption) index for
range and ORDER BY queries") and remove the SQL example and internal note about
IndexType::Ope and CipherStash build requirements; move the SQL usage example
(SELECT eql_v2.add_search_config...) and the IndexType::Ope/CipherStash build
details into the docs or PR description instead so the CHANGELOG remains
high-level and user-oriented.
In `@packages/cipherstash-proxy-integration/src/common.rs`:
- Around line 94-97: The helper clear_table_with_client is interpolating the
table name directly into a TRUNCATE SQL string; instead validate/whitelist the
table identifier before formatting to prevent injection or accidental SQL
execution. Update clear_table_with_client to first check the table argument
(e.g. match against a whitelist of allowed fixture names or validate with a
strict regex like only ASCII letters/numbers/underscores) and return/raise an
error if it fails validation, and avoid using unwrap() on client.simple_query
(propagate or handle the Result); only format and execute the TRUNCATE when the
table name passes the guard.
In `@packages/cipherstash-proxy-integration/src/map_ope_index_where.rs`:
- Around line 199-206: The loop that zips actual and expected and asserts
equality per element should be replaced with a direct assert_eq!(actual,
expected, "value mismatch for {col_name} via {sql}") after the length check;
remove the for (a, e) in actual.iter().zip(expected.iter()) { assert!(a == e,
...) } block and use assert_eq! on the whole Vecs (actual and expected) to get
clearer diffs and simpler code while keeping the existing message placeholders
{col_name} and {sql}.
---
Nitpick comments:
In `@packages/cipherstash-proxy-integration/src/map_ope_index_order.rs`:
- Around line 14-56: Update the tests map_ope_order_text_asc and
map_ope_order_text_desc to validate collation semantics by either (a) changing
the values array to include mixed-case and non-ASCII strings (e.g., accented
characters) so ordering compares against PostgreSQL's text ordering for those
characters, or (b) explicitly specifying a deterministic byte-wise collation in
the SELECT (e.g., add COLLATE "C" to the ORDER BY clause) so the OPE byte-order
behavior is asserted; adjust the expected vectors accordingly (use
values.iter().map(...) or values.iter().rev().map(...) as already done) and keep
the insert/select logic (insert variable, SELECT encrypted_text FROM {table}
ORDER BY encrypted_text ...) the same.
In `@tests/sql/schema.sql`:
- Around line 172-307: The two DO $$ blocks duplicate table creation and
eql_v2.add_search_config calls for ORE and OPE; refactor by extracting a single
generator function or DO block that accepts an index_kind parameter ('ore' or
'ope') and a list of table name suffixes, then loops once to CREATE TABLE and
call eql_v2.add_search_config and eql_v2.add_encrypted_constraint accordingly;
update calls that currently use the separate test_tables arrays to pass the
appropriate names and index_kind to the single generator and remove the
duplicated second DO block (refer to the existing identifiers: the two DO $$
blocks, the test_tables arrays, and functions eql_v2.add_search_config /
eql_v2.add_encrypted_constraint to locate where to consolidate).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: bce3ce58-efa8-4f1a-b428-54f5d387c63e
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (15)
CHANGELOG.mdCargo.tomlpackages/cipherstash-proxy-integration/src/common.rspackages/cipherstash-proxy-integration/src/lib.rspackages/cipherstash-proxy-integration/src/map_ope_index_order.rspackages/cipherstash-proxy-integration/src/map_ope_index_where.rspackages/cipherstash-proxy-integration/src/map_ore_index_order.rspackages/cipherstash-proxy-integration/src/map_ore_index_where.rspackages/cipherstash-proxy-integration/src/multitenant/ore_order.rspackages/cipherstash-proxy-integration/src/ore_order_helpers.rspackages/cipherstash-proxy/src/postgresql/context/column.rspackages/cipherstash-proxy/src/postgresql/data/from_sql.rspackages/cipherstash-proxy/src/postgresql/data/to_sql.rspackages/cipherstash-proxy/src/proxy/encrypt_config/config.rstests/sql/schema.sql
| # TODO: revert to crates.io once the cipherstash-suite version with `IndexType::Ope` | ||
| # is published. See https://github.com/cipherstash/cipherstash-suite for the | ||
| # OPE-enabled change. | ||
| cipherstash-client = { path = "../cipherstash-suite/packages/cipherstash-client" } | ||
| cts-common = { path = "../cipherstash-suite/packages/cts-common" } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Path dependencies that leave the repository:"
rg -n 'path\s*=\s*"\.\./cipherstash-suite/'
echo
echo "Bootstrap / CI references to that sibling checkout:"
rg -n -C2 'cipherstash-suite|../cipherstash-suite'Repository: cipherstash/proxy
Length of output: 3799
Don't merge sibling path dependencies into main.
Lines 49-50 reference ../cipherstash-suite/packages/... which only resolve when that repository is checked out as a sibling. The search found no CI or bootstrap setup that provisions this checkout, meaning clean builds in CI and for contributors will fail. The TODO comment (lines 46-48) acknowledges this is temporary, but it cannot ship in this state.
Resolve this before merge by either:
- Switching back to a published crate.io version of
cipherstash-clientandcts-common - Implementing explicit CI/bootstrap provisioning of the sibling repository
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@Cargo.toml` around lines 46 - 50, The Cargo.toml currently uses local sibling
path dependencies cipherstash-client and cts-common which break clean
CI/contributor builds; either revert these entries to the published crates.io
versions by replacing the path entries for cipherstash-client and cts-common
with the appropriate crate version specifications, or add explicit provisioning
in CI/bootstrap to checkout the sibling cipherstash-suite repository before
building (ensuring the workspace path "../cipherstash-suite/packages/..."
exists), and remove or update the TODO comment accordingly so builds no longer
rely on an unchecked-out sibling repo.
| ### Added | ||
|
|
||
| - **OPE (Order-Preserving Encryption) index**: New `ope` index type alongside the existing `ore` for range and ordering queries. OPE ciphertexts compare under standard lexicographic byte ordering, so they're a drop-in alternative to ORE for range and `ORDER BY`. Configure with `SELECT eql_v2.add_search_config('table', 'column', 'ope', 'int')` (any cast type that `ore` supports). Requires EQL with OPE support and a CipherStash client/config build that includes `IndexType::Ope`. |
There was a problem hiding this comment.
Make the changelog entry release-facing.
The first sentence is changelog material; the SQL example and IndexType::Ope note read like implementation details. I'd move those details into docs or the PR description so [Unreleased] stays user-facing.
As per coding guidelines, "Write changelog entries from the user's perspective, not implementation details."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@CHANGELOG.md` around lines 9 - 11, The changelog entry includes
implementation-level details; revise the "Added" OPE entry to be release-facing
by keeping only the user-visible summary (e.g., "Added OPE (Order-Preserving
Encryption) index for range and ORDER BY queries") and remove the SQL example
and internal note about IndexType::Ope and CipherStash build requirements; move
the SQL usage example (SELECT eql_v2.add_search_config...) and the
IndexType::Ope/CipherStash build details into the docs or PR description instead
so the CHANGELOG remains high-level and user-oriented.
| pub async fn clear_table_with_client(client: &Client, table: &str) { | ||
| let sql = format!("TRUNCATE {}", table); | ||
| client.simple_query(&sql).await.unwrap(); | ||
| } |
There was a problem hiding this comment.
Guard table names before building TRUNCATE SQL.
Line 95 interpolates table directly into SQL. Even in tests, this helper is pub and can execute unintended SQL if passed unexpected input. Constrain fixture table names before formatting.
Suggested hardening
pub async fn clear_table_with_client(client: &Client, table: &str) {
+ assert!(
+ table
+ .chars()
+ .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_'),
+ "invalid fixture table name: {table}"
+ );
let sql = format!("TRUNCATE {}", table);
client.simple_query(&sql).await.unwrap();
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| pub async fn clear_table_with_client(client: &Client, table: &str) { | |
| let sql = format!("TRUNCATE {}", table); | |
| client.simple_query(&sql).await.unwrap(); | |
| } | |
| pub async fn clear_table_with_client(client: &Client, table: &str) { | |
| assert!( | |
| table | |
| .chars() | |
| .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_'), | |
| "invalid fixture table name: {table}" | |
| ); | |
| let sql = format!("TRUNCATE {}", table); | |
| client.simple_query(&sql).await.unwrap(); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/cipherstash-proxy-integration/src/common.rs` around lines 94 - 97,
The helper clear_table_with_client is interpolating the table name directly into
a TRUNCATE SQL string; instead validate/whitelist the table identifier before
formatting to prevent injection or accidental SQL execution. Update
clear_table_with_client to first check the table argument (e.g. match against a
whitelist of allowed fixture names or validate with a strict regex like only
ASCII letters/numbers/underscores) and return/raise an error if it fails
validation, and avoid using unwrap() on client.simple_query (propagate or handle
the Result); only format and execute the TRUNCATE when the table name passes the
guard.
| assert_eq!( | ||
| actual.len(), | ||
| expected.len(), | ||
| "wrong row count for {col_name} via {sql}" | ||
| ); | ||
| for (a, e) in actual.iter().zip(expected.iter()) { | ||
| assert!(a == e, "value mismatch for {col_name} via {sql}"); | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
find . -type f -name "map_ope_index_where.rs" | head -5Repository: cipherstash/proxy
Length of output: 129
🏁 Script executed:
if [ -f "packages/cipherstash-proxy-integration/src/map_ope_index_where.rs" ]; then
wc -l "packages/cipherstash-proxy-integration/src/map_ope_index_where.rs"
fiRepository: cipherstash/proxy
Length of output: 131
🏁 Script executed:
if [ -f "packages/cipherstash-proxy-integration/src/map_ope_index_where.rs" ]; then
sed -n '190,215p' "packages/cipherstash-proxy-integration/src/map_ope_index_where.rs" | cat -n
fiRepository: cipherstash/proxy
Length of output: 903
Replace manual equality loop with assert_eq!.
Line 205 uses assert!(a == e, ...) in a loop; prefer assert_eq! for guideline compliance and clearer failure diffs. Since both actual and expected are sorted Vec<T> with T: PartialEq, this can be simplified to a single direct comparison.
Suggested fix
assert_eq!(
actual.len(),
expected.len(),
"wrong row count for {col_name} via {sql}"
);
- for (a, e) in actual.iter().zip(expected.iter()) {
- assert!(a == e, "value mismatch for {col_name} via {sql}");
- }
+ assert_eq!(actual, expected, "value mismatch for {col_name} via {sql}");📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| assert_eq!( | |
| actual.len(), | |
| expected.len(), | |
| "wrong row count for {col_name} via {sql}" | |
| ); | |
| for (a, e) in actual.iter().zip(expected.iter()) { | |
| assert!(a == e, "value mismatch for {col_name} via {sql}"); | |
| } | |
| assert_eq!( | |
| actual.len(), | |
| expected.len(), | |
| "wrong row count for {col_name} via {sql}" | |
| ); | |
| assert_eq!(actual, expected, "value mismatch for {col_name} via {sql}"); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/cipherstash-proxy-integration/src/map_ope_index_where.rs` around
lines 199 - 206, The loop that zips actual and expected and asserts equality per
element should be replaced with a direct assert_eq!(actual, expected, "value
mismatch for {col_name} via {sql}") after the length check; remove the for (a,
e) in actual.iter().zip(expected.iter()) { assert!(a == e, ...) } block and use
assert_eq! on the whole Vecs (actual and expected) to get clearer diffs and
simpler code while keeping the existing message placeholders {col_name} and
{sql}.
This PR also includes changes to bring Proxy up to date with the refactored
cipherstash-config(which I will try to separate into its own PR).Acknowledgment
By submitting this pull request, I confirm that CipherStash can use, modify, copy, and redistribute this contribution, under the terms of CipherStash's choice.
Summary by CodeRabbit
New Features
Tests