CloakPipe

Privacy middleware for LLM & RAG pipelines

CloakPipe is a Rust-native privacy proxy for LLM and RAG pipelines. It sits between your application and any OpenAI-compatible API, automatically detecting sensitive entities, replacing them with consistent pseudonyms, and rehydrating responses -- so your LLM provider never sees real data.

The Problem

Every RAG pipeline that calls an external API sends sensitive data in plaintext:

                        WITHOUT CLOAKPIPE
                        =================

 Your App                              LLM / Embedding API
    |                                        |
    |  "Tata Motors reported Rs 3.4L Cr      |
    |   revenue. Contact: cfo@tata.com       |
    |   AWS key: AKIAIOSFODNN7EXAMPLE"       |
    |                                        |
    +--------- PLAINTEXT over HTTPS -------->|  <-- Provider sees everything
    |                                        |
    |  "Tata Motors reported strong Q3..."   |
    |<---------------------------------------+

Naive redaction ([REDACTED]) destroys semantic meaning and breaks retrieval. Python-based PII tools can add 50-200ms of latency per request and often miss financial data. Cloud-locked solutions work only within their own ecosystem.

The Solution: Consistent Pseudonymization

CloakPipe replaces sensitive entities with consistent tokens that preserve semantic structure:

                         WITH CLOAKPIPE
                         ==============

 Your App              CloakPipe Proxy              LLM API
    |                       |                          |
    |  "Tata Motors         |                          |
    |   reported Rs 3.4L    |                          |
    |   Cr revenue in       |                          |
    |   Q3 2025. Contact:   |                          |
    |   cfo@tata.com"       |                          |
    +---------------------->|                          |
                            |                          |
                     DETECT & PSEUDONYMIZE             |
                     +-------------------------+       |
                     | Tata Motors -> ORG_7    |       |
                     | Rs 3.4L Cr  -> AMOUNT_12|       |
                     | Q3 2025     -> DATE_3   |       |
                     | cfo@tata.com-> EMAIL_5  |       |
                     +-------------------------+       |
                            |                          |
                            |  "ORG_7 reported         |
                            |   AMOUNT_12 revenue      |
                            |   in DATE_3. Contact:    |
                            |   EMAIL_5"               |
                            +------------------------->|
                            |                          |
                            |  "ORG_7 had strong       |  Provider sees
                            |   AMOUNT_12 growth..."   |  only pseudonyms
                            |<-------------------------+
                            |
                     REHYDRATE RESPONSE
                     +-------------------------+
                     | ORG_7     -> Tata Motors|
                     | AMOUNT_12 -> Rs 3.4L Cr |
                     +-------------------------+
                            |
    |  "Tata Motors had     |
    |   strong Rs 3.4L Cr   |
    |   growth..."          |
    |<----------------------+

    User sees real data.
    LLM never saw it.

The same entity always maps to the same token across documents, queries, and sessions. This means:

  • Embeddings preserve semantic structure (vector search still works)
  • Multi-turn conversations stay coherent
  • The LLM reasons over pseudonyms, and rehydration restores real values
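
The consistency property can be illustrated with a small in-memory sketch. This is not CloakPipe's implementation: `TokenVault` and its methods are hypothetical names, the real vault is encrypted and persistent, and the entities are passed in by hand rather than detected.

```python
from collections import defaultdict


class TokenVault:
    """In-memory stand-in for the vault: one counter per category, two lookup maps."""

    def __init__(self):
        self._counters = defaultdict(int)   # category -> last issued number
        self._forward = {}                  # (category, entity) -> token
        self._reverse = {}                  # token -> entity

    def token_for(self, category, entity):
        """Return the existing token for an entity, or mint the next one."""
        key = (category, entity)
        if key not in self._forward:
            self._counters[category] += 1
            token = f"{category}_{self._counters[category]}"
            self._forward[key] = token
            self._reverse[token] = entity
        return self._forward[key]

    def pseudonymize(self, text, entities):
        """Replace each (category, entity) pair found in text with its token."""
        # Longest entities first so a short entity never clobbers part of a longer one.
        for category, entity in sorted(entities, key=lambda e: -len(e[1])):
            text = text.replace(entity, self.token_for(category, entity))
        return text

    def rehydrate(self, text):
        """Swap tokens back to real values, longest token first (ORG_10 before ORG_1)."""
        for token in sorted(self._reverse, key=len, reverse=True):
            text = text.replace(token, self._reverse[token])
        return text


vault = TokenVault()
doc = vault.pseudonymize(
    "Tata Motors reported Rs 3.4L Cr revenue.",
    [("ORG", "Tata Motors"), ("AMOUNT", "Rs 3.4L Cr")],
)
query = vault.pseudonymize("How did Tata Motors do?", [("ORG", "Tata Motors")])
print(doc)    # ORG_1 reported AMOUNT_1 revenue.
print(query)  # How did ORG_1 do?
print(vault.rehydrate("ORG_1 had strong AMOUNT_1 growth"))
```

Because the document and the later query both see `ORG_1`, an embedding of the query can still land near the embedding of the document even though neither contains the real name.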

Where CloakPipe Sits in a RAG Pipeline

 Documents                          User Queries
     |                                   |
     v                                   v
 +---------+                       +-----------+
 | Chunker |                       | Query     |
 +---------+                       +-----------+
     |                                   |
     v                                   v
 +--------------------------------------------------+
 |                   CLOAKPIPE                       |
 |                                                   |
 |  +------------+  +-------+  +-----------------+  |
 |  | Detection  |->| Vault |->| Pseudonymize    |  |
 |  | Engine     |  | (AES) |  | (consistent)    |  |
 |  +------------+  +-------+  +-----------------+  |
 |   regex|finance|custom|NER    entity -> token     |
 +--------------------------------------------------+
     |                                   |
     v                                   v
 Embedding API                     LLM API
 (sees pseudonyms)                 (sees pseudonyms)
     |                                   |
     v                                   v
 Vector DB                         +--------------------------------------------------+
 (pseudonymized                    |                   CLOAKPIPE                       |
  embeddings)                      |  +-----------------+  +-------+                   |
     |                             |  | Rehydrate       |->| Vault |                   |
     +--- retrieve context ------->|  | (streaming SSE) |  | (AES) |                   |
                                   |  +-----------------+  +-------+                   |
                                   +--------------------------------------------------+
                                                             |
                                                             v
                                                        User sees
                                                        real data

A standard RAG pipeline has four leak points. CloakPipe covers all of them.

Features

  • Drop-in proxy -- OpenAI-compatible API; change one URL and your app is protected
  • Multi-layer detection -- Regex patterns, financial intelligence, custom TOML rules, optional NER
  • Consistent pseudonymization -- Same entity always maps to the same token across sessions
  • Encrypted vault -- AES-256-GCM at rest; key material is zeroed from memory via zeroize
  • SSE streaming rehydration -- Token-aware buffering handles pseudonyms split across chunks
  • Audit logging -- Structured JSONL logs for compliance (metadata only, never raw values)
  • Industry profiles -- Pre-tuned detection for legal, healthcare, fintech; guided setup wizard
  • MCP server -- Expose privacy tools to AI agents (Claude, Cursor, custom harnesses)
  • Admin dashboard -- Privacy-aware chat UI, detection feed, compliance audit, policy management (local-first via PowerSync)
  • Single binary -- No Docker, no Python, no microservices. Deploy in seconds
  • <5ms overhead -- Rust-native, sits in the hot path without you noticing

Quick Start

Install from crates.io

cargo install cloakpipe-cli

Or build from source

git clone https://github.com/rohansx/cloakpipe.git
cd cloakpipe
cargo build --release

Initialize configuration

./target/release/cloakpipe init
# Creates cloakpipe.toml with sensible defaults

Set environment variables

export OPENAI_API_KEY="sk-..."
export CLOAKPIPE_VAULT_KEY=$(openssl rand -hex 32)

Start the proxy

./target/release/cloakpipe start
# Listening on 127.0.0.1:8900

Point your app at CloakPipe

from openai import OpenAI

# Before -- data sent in plaintext
client = OpenAI()

# After -- data pseudonymized automatically
client = OpenAI(base_url="http://127.0.0.1:8900/v1")

That's it. Beyond the base URL, there are no SDK changes, no framework plugins, and no other code modifications.

Works with any OpenAI-compatible client

LangChain

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    openai_api_base="http://127.0.0.1:8900/v1",
    model="gpt-4o",
)

LlamaIndex

from llama_index.llms.openai import OpenAI

llm = OpenAI(
    api_base="http://127.0.0.1:8900/v1",
    model="gpt-4o",
)

curl

curl http://127.0.0.1:8900/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Summarize Q3 results for Tata Motors"}]
  }'

Ollama / local models

Point CloakPipe upstream at your local Ollama instance:

[proxy]
upstream = "http://localhost:11434"

Test Detection (no API key needed)

# Built-in sample text
./target/release/cloakpipe test

# Custom text (single quotes keep the shell from expanding $1)
./target/release/cloakpipe test --text 'Send $1.2M to alice@acme.com by Q3 2025'

# From file
./target/release/cloakpipe test --file document.txt

Example output:

Detected 3 entities:
  EMAIL     alice@acme.com         -> EMAIL_1
  AMOUNT    $1.2M                  -> AMOUNT_1
  DATE      Q3 2025                -> DATE_1

Pseudonymized:
  "Send AMOUNT_1 to EMAIL_1 by DATE_1"

Rehydrated:
  "Send $1.2M to alice@acme.com by Q3 2025"

Roundtrip: OK
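
For readers who want to see the shape of that roundtrip in code, here is a rough Python approximation. The three regular expressions are illustrative guesses written for this one sentence, not the patterns CloakPipe ships, and `detect` and `roundtrip` are hypothetical helper names.

```python
import re

# Illustrative patterns only; the real engine's rules are not shown in this README.
PATTERNS = [
    ("EMAIL", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    ("AMOUNT", re.compile(r"\$\d+(?:\.\d+)?[KMB]?")),
    ("DATE", re.compile(r"\bQ[1-4] \d{4}\b")),
]


def detect(text):
    """Return (category, matched_text) pairs in pattern order."""
    return [(cat, m.group()) for cat, rx in PATTERNS for m in rx.finditer(text)]


def roundtrip(text):
    """Pseudonymize, rehydrate, and report whether the original came back."""
    mapping, counters, masked = {}, {}, text
    for cat, entity in detect(text):
        if entity not in mapping:
            counters[cat] = counters.get(cat, 0) + 1
            mapping[entity] = f"{cat}_{counters[cat]}"
        masked = masked.replace(entity, mapping[entity])
    restored = masked
    for entity, token in mapping.items():
        restored = restored.replace(token, entity)
    return masked, restored == text


masked, ok = roundtrip("Send $1.2M to alice@acme.com by Q3 2025")
print(masked)                       # Send AMOUNT_1 to EMAIL_1 by DATE_1
print("Roundtrip:", "OK" if ok else "FAILED")
```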

Configuration

CloakPipe is configured via cloakpipe.toml:

[proxy]
listen = "127.0.0.1:8900"
upstream = "https://api.openai.com"
api_key_env = "OPENAI_API_KEY"
timeout_seconds = 120

[vault]
path = "./vault.enc"
encryption = "aes-256-gcm"
key_env = "CLOAKPIPE_VAULT_KEY"

[detection]
secrets = true           # API keys, JWTs, connection strings
financial = true         # Currency amounts, percentages, fiscal dates
dates = true
emails = true
phone_numbers = true
ip_addresses = true
urls_internal = true

[detection.custom]
patterns = [
    { name = "project_codename", regex = "Project\\s+(Alpha|Beta|Gamma)", category = "PROJECT" },
    { name = "client_tier", regex = "Tier\\s+[A-C]\\s+client", category = "CLASSIFICATION" },
]

[detection.overrides]
preserve = ["OpenAI", "GPT-4", "Claude"]    # Never pseudonymize
force = ["internal-secret"]                  # Always pseudonymize

[audit]
enabled = true
log_path = "./audit/"
log_entities = true    # Log entity metadata (never raw values)
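
To sanity-check custom patterns before putting them in cloakpipe.toml, they can be tried in a few lines of Python. The sketch below transcribes the two patterns above by hand and guesses at the override semantics (preserve drops a hit, force always adds one). The `scan` helper and the `FORCED` category are made up for illustration, and Python's `re` syntax may differ in details from the regex engine CloakPipe uses.

```python
import re

# The two [detection.custom] patterns above, transcribed by hand.
# TOML "\\s" becomes the regex \s once the string is unescaped.
CUSTOM = [
    ("PROJECT", re.compile(r"Project\s+(Alpha|Beta|Gamma)")),
    ("CLASSIFICATION", re.compile(r"Tier\s+[A-C]\s+client")),
]
PRESERVE = {"OpenAI", "GPT-4", "Claude"}   # [detection.overrides] preserve
FORCE = ["internal-secret"]                # [detection.overrides] force


def scan(text):
    """Return sorted (start, category, match) tuples, honouring both override lists."""
    hits = [(m.start(), cat, m.group()) for cat, rx in CUSTOM for m in rx.finditer(text)]
    for term in FORCE:
        # "FORCED" is a made-up category name for this sketch.
        hits += [(m.start(), "FORCED", term) for m in re.finditer(re.escape(term), text)]
    return sorted(h for h in hits if h[2] not in PRESERVE)


text = "Ask OpenAI about Project Gamma for our Tier A client (internal-secret)."
for start, category, match in scan(text):
    print(start, category, match)
```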

Industry Profiles

CloakPipe ships with pre-tuned detection profiles for different industries:

# Interactive setup wizard — choose your industry, provider, and vault backend
cloakpipe setup

| Profile    | What it adds                                         | Use case                   |
|------------|------------------------------------------------------|----------------------------|
| General    | Balanced defaults: secrets, financial, dates, emails | Most applications          |
| Legal      | Case numbers, docket IDs, SSNs, bar numbers          | Law firms, legal tech      |
| Healthcare | MRN, NPI, DEA numbers, ICD codes (HIPAA-aware)       | Health tech, clinical AI   |
| Fintech    | SWIFT/BIC, ISIN, IBAN, routing numbers, IP detection | Banking, trading platforms |

Set in config:

profile = "healthcare"

Or switch at runtime via the MCP configure tool or API.

MCP Server (Agentic Integrations)

CloakPipe exposes its privacy tools as an MCP server, so AI agents (Claude Code, Cursor, custom agents) can pseudonymize data as a skill:

cloakpipe mcp

Available tools:

| Tool         | Description                                                       |
|--------------|-------------------------------------------------------------------|
| pseudonymize | Detect and replace sensitive entities with consistent tokens      |
| rehydrate    | Restore pseudo-tokens back to original values                     |
| detect       | Dry-run scan: see what would be caught without replacing          |
| vault_stats  | Show total mappings and per-category counts                       |
| configure    | Switch industry profile or toggle detection categories at runtime |

Claude Desktop / Claude Code config:

{
  "mcpServers": {
    "cloakpipe": {
      "command": "cloakpipe",
      "args": ["mcp"]
    }
  }
}
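
Under the hood, an MCP client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request after the usual initialization handshake. The sketch below only builds such a message. The `tool_call` helper is hypothetical, and the `text` argument name is a guess, since the input schemas of CloakPipe's tools are not listed here.

```python
import json


def tool_call(request_id, tool, arguments):
    """Serialize one MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


# "text" is a guessed argument name; check the tool's published input schema.
message = tool_call(1, "pseudonymize", {"text": "Contact cfo@tata.com"})
print(message)
```

In practice the agent host (Claude Desktop, Cursor, and so on) builds these messages for you; the sketch is only meant to show what travels over the wire.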

Detection Layers

| Layer            | What it catches                                     | Examples                                        |
|------------------|-----------------------------------------------------|-------------------------------------------------|
| Secrets          | API keys, JWTs, connection strings, tokens          | AKIAIOSFODNN7EXAMPLE, eyJhbG...                 |
| Financial        | Multi-currency amounts, percentages, fiscal dates   | $1.2M, Rs 3.4L Cr, 15.7%, Q3 2025               |
| Contact          | Emails, phone numbers, IP addresses, internal URLs  | alice@acme.com, 192.168.1.1                     |
| Custom           | User-defined TOML patterns                          | Project codenames, client tiers, internal terms |
| NER              | Persons, organizations, locations                   | ONNX-based (optional, --features ner)           |
| Fuzzy Resolution | Variant spellings, misspellings, nicknames          | Rishikesh = Rishi = Rishiksh (typo)             |

Fuzzy Entity Resolution (v0.6)

CloakPipe can merge variant spellings of the same entity into a single token:

Without resolver:
  "Rishi"      → PERSON_1
  "Rishikesh"  → PERSON_2    ← 2 tokens for same person
  "Rishiksh"   → PERSON_3    ← typo = 3rd token

With resolver:
  "Rishi"      → PERSON_1
  "Rishikesh"  → PERSON_1    ← same token (prefix match)
  "Rishiksh"   → PERSON_1    ← same token (Jaro-Winkler 0.96)

Enable in config:

[detection.resolver]
enabled = true
threshold = 0.90       # Minimum similarity score (conservative)
min_prefix_len = 4     # Minimum length for prefix matching

[[detection.resolver.aliases]]
group = ["Rishikesh Kumar", "Rishi", "Rishi kesh"]

Matching uses Jaro-Winkler similarity + prefix bonuses, only within the same entity category. "Rishikesh" (person) and "Rishikesh" (city) stay as separate tokens.
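
The matching idea can be sketched in plain Python. This is a textbook Jaro-Winkler plus a simple prefix check, not CloakPipe's resolver: the case-folding, the order of the two checks, and the `resolve` helper are assumptions, so scores from this sketch need not match the figures quoted above.

```python
def jaro(a, b):
    """Textbook Jaro similarity between two strings."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_hit, b_hit = [False] * len(a), [False] * len(b)
    matches = 0
    for i, ch in enumerate(a):
        for j in range(max(0, i - window), min(len(b), i + window + 1)):
            if not b_hit[j] and b[j] == ch:
                a_hit[i] = b_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that line up in a different order count as transpositions.
    a_seq = [ch for ch, hit in zip(a, a_hit) if hit]
    b_seq = [ch for ch, hit in zip(b, b_hit) if hit]
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) // 2
    return (matches / len(a) + matches / len(b)
            + (matches - transpositions) / matches) / 3


def jaro_winkler(a, b, scaling=0.1):
    """Jaro score boosted by the length of the shared prefix (up to 4 characters)."""
    score = jaro(a, b)
    prefix = 0
    for x, y in zip(a[:4], b[:4]):
        if x != y:
            break
        prefix += 1
    return score + prefix * scaling * (1 - score)


def resolve(name, known, threshold=0.90, min_prefix_len=4):
    """Return the known name this one should merge into, or None.

    `known` should hold names from a single entity category only.
    """
    n = name.lower()
    for canonical in known:
        c = canonical.lower()
        shorter, longer = sorted((n, c), key=len)
        if len(shorter) >= min_prefix_len and longer.startswith(shorter):
            return canonical          # prefix match
        if jaro_winkler(n, c) >= threshold:
            return canonical          # similarity match
    return None


people = ["Rishikesh"]
print(resolve("Rishi", people))      # Rishikesh (prefix match)
print(resolve("Rishiksh", people))   # Rishikesh (similarity above threshold)
print(resolve("Ramesh", people))     # None
```

Keeping one `known` list per category is what prevents a person and a city with the same spelling from collapsing into one token.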

How It Works

Request Flow:

  Incoming request
       |
       v
  +------------------+     +------------------+     +------------------+
  |  1. DETECT       |---->|  2. PSEUDONYMIZE |---->|  3. FORWARD      |
  |                  |     |                  |     |                  |
  |  Multi-layer     |     |  Entity -> Token |     |  Sanitized req   |
  |  engine scans    |     |  stored in AES   |     |  to upstream API |
  |  request body    |     |  encrypted vault |     |                  |
  +------------------+     +------------------+     +------------------+
                                                           |
  +------------------+     +------------------+            |
  |  5. AUDIT        |<----|  4. REHYDRATE    |<-----------+
  |                  |     |                  |
  |  Log metadata    |     |  Token -> Entity |    Response Flow
  |  (never raw      |     |  in response,    |    (including SSE
  |   values)        |     |  including SSE   |     streaming)
  +------------------+     |  streaming with  |
                           |  chunk buffering |
                           +------------------+
                                  |
                                  v
                           Response to app
                           (real values restored)

Key design decisions:

  • Consistent mappings -- "Tata Motors" always maps to ORG_7, across all documents, queries, and sessions
  • Encrypted vault -- Mappings persisted with AES-256-GCM; keys zeroed from memory via zeroize
  • Streaming-aware -- SSE rehydration handles tokens split across chunks (e.g., OR + G_7)
  • Metadata-only audit -- Logs record entity counts and categories, never the actual values
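
The streaming case is the least obvious of these, so here is one way chunk buffering could work, sketched in Python. It assumes tokens always look like `CATEGORY_N` and holds back any trailing fragment that might still grow into a token. `StreamRehydrator` is a hypothetical name, and the real proxy may buffer differently.

```python
import re

TOKEN = re.compile(r"[A-Z]+_\d+")
# A tail that might still grow into a token: "OR", "ORG_", or "ORG_1" (more digits may follow).
PENDING_TAIL = re.compile(r"[A-Z]+(?:_\d*)?$")


class StreamRehydrator:
    def __init__(self, mapping):
        self._mapping = mapping   # token -> original value
        self._buffer = ""

    def _restore(self, text):
        # Unknown tokens are passed through unchanged.
        return TOKEN.sub(lambda m: self._mapping.get(m.group(), m.group()), text)

    def feed(self, chunk):
        """Accept one streamed chunk; return the text that is safe to emit now."""
        self._buffer += chunk
        tail = PENDING_TAIL.search(self._buffer)
        cut = tail.start() if tail else len(self._buffer)
        ready, self._buffer = self._buffer[:cut], self._buffer[cut:]
        return self._restore(ready)

    def flush(self):
        """Emit whatever is still buffered once the stream ends."""
        ready, self._buffer = self._buffer, ""
        return self._restore(ready)


stream = StreamRehydrator({"ORG_7": "Tata Motors", "AMOUNT_12": "Rs 3.4L Cr"})
chunks = ["OR", "G_7 had strong AMOUNT_1", "2 growth..."]
output = "".join(stream.feed(c) for c in chunks) + stream.flush()
print(output)  # Tata Motors had strong Rs 3.4L Cr growth...
```

The cost of this approach is a little latency: any chunk ending in capital letters or a partial token is delayed until the next chunk (or the end of the stream) settles what it was.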

Project Structure

| Crate            | crates.io | Description                                                                        |
|------------------|-----------|------------------------------------------------------------------------------------|
| cloakpipe-cli    | crates.io | CLI binary (start, test, stats, init, setup, mcp, tree, vector)                    |
| cloakpipe-core   | crates.io | Detection, pseudonymization, vault (file + SQLite), rehydration, industry profiles |
| cloakpipe-proxy  | crates.io | Axum HTTP proxy (chat completions + embeddings)                                    |
| cloakpipe-audit  | crates.io | Audit logging (JSONL + SQLite) with rotation                                       |
| cloakpipe-tree   | crates.io | CloakTree: vectorless document retrieval                                           |
| cloakpipe-vector | crates.io | ADCPE: distance-preserving vector encryption                                       |
| cloakpipe-mcp    | crates.io | MCP server for agentic integrations                                                |
| cloakpipe-local  | crates.io | Fully local RAG mode (planned)                                                     |
| dashboard        |           | Admin UI: privacy chat, detection feed, compliance, policies (React + PowerSync)   |

Dashboard

CloakPipe includes an admin dashboard with a privacy-aware chat interface. The chat detects PII in your browser, pseudonymizes it, and calls the LLM directly — no proxy setup required.

cd dashboard
npm install
npm run dev
# Opens at http://localhost:5173

Runs in demo mode by default with seeded data. Add your OpenAI/Anthropic API key in Settings to start chatting.

Pages: Chat (with live Privacy Shield) · Overview · Detection Feed · Compliance & Audit · Policies · Instances · Settings

Built with React, TypeScript, Tailwind CSS, and PowerSync for local-first SQLite. See dashboard/README.md for full documentation.

Roadmap

| Version | Feature                                                                                                                     | Status   |
|---------|-----------------------------------------------------------------------------------------------------------------------------|----------|
| v0.1    | Multi-layer detection, consistent pseudonymization, encrypted vault, OpenAI-compatible proxy, SSE streaming, audit logging | Released |
| v0.2    | CloakTree: vectorless, reasoning-based retrieval for structured documents                                                   | Released |
| v0.3    | ONNX NER, SQLite vault/audit, multi-user support                                                                            | Released |
| v0.4    | Distance-preserving vector encryption (ADCPE)                                                                               | Released |
| v0.5    | Industry profiles, MCP server for agentic integrations, guided setup wizard                                                 | Released |
| v0.6    | Fuzzy entity resolution: Jaro-Winkler matching, alias groups                                                                | Released |
| v0.7    | Admin dashboard: privacy chat, detection feed, compliance, local-first PowerSync                                            | Released |
| v0.8    | TEE support (AWS Nitro Enclaves, Intel TDX)                                                                                 | Planned  |

Running Tests

cargo test

The suite contains 66 tests covering vault encryption, multi-layer detection, pseudonymization roundtrips, streaming rehydration, SQLite vault/audit, ADCPE vector encryption, industry profiles, MCP server tools, and end-to-end proxy behavior.

Security

CloakPipe handles sensitive data by design. Security considerations:

  • Vault encryption: All entity mappings are encrypted with AES-256-GCM at rest. Keys are never written to disk.
  • Memory safety: Sensitive key material is deterministically zeroed via zeroize rather than left in memory after use.
  • Audit trail: Structured logs record what happened (entity counts, categories, timing) without recording what the entities actually were.
  • No telemetry: CloakPipe sends zero data anywhere. The proxy connects only to your configured upstream.
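
To make the audit-trail point concrete, the sketch below builds a metadata-only JSONL record in Python. The field names (`ts`, `entity_count`, `categories`, `elapsed_ms`) and the `audit_record` helper are invented for illustration; CloakPipe's actual log schema is not documented in this README.

```python
import json
import time
from collections import Counter


def audit_record(entities, elapsed_ms):
    """Build one JSONL line from (category, raw_value) pairs without keeping raw values."""
    counts = Counter(category for category, _raw in entities)
    return json.dumps({
        "ts": int(time.time()),
        "entity_count": sum(counts.values()),
        "categories": dict(sorted(counts.items())),
        "elapsed_ms": elapsed_ms,
    })


line = audit_record(
    [("EMAIL", "alice@acme.com"), ("AMOUNT", "$1.2M"), ("DATE", "Q3 2025")],
    elapsed_ms=3,
)
print(line)
```

The raw values are read only to count categories and never reach the serialized record, which is the property a compliance reviewer would want to verify.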

If you discover a security vulnerability, please report it privately via GitHub Security Advisories.

Contributing

Contributions are welcome. Please open an issue to discuss your idea before submitting a PR.

# Development build
cargo build

# Run tests
cargo test

# Run with tracing
RUST_LOG=debug cargo run -- start

License

This project is licensed under the MIT License.
