scribe turns healthcare X12 EDI into auditable events.
Instead of giving you a huge opaque JSON dump, it emits small domain events with control numbers, segment positions, byte offsets, and run IDs so pipelines can validate, load, replay, and debug claims/remits without hand-rolling brittle string parsers.
Use it in five layers:
- Parse EDI into JSON events to understand/debug content.
- Ingest X12 files into an immutable journal.
- Stitch journal events into claim and coverage aggregate versions, matching 837 claim submissions with 835 remittances.
- Project the journal into balance and notification read models.
- Consume the projected read store(s) directly from applications and reports.
The parser handles 834 enrollment, 837 claims, 835 remits, and 270/271 eligibility traffic. Raw PHI can be kept out of normal flows by writing tokenised events and resolving sensitive values through a separate PHI vault only when required.
scribe parses X12 syntax and maps selected healthcare EDI facts into journal events. It is not (yet) a full X12/TR3 validator.
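Because each event carries a byte offset, a debugging session can jump from an event back to the raw segment that produced it. The sketch below is an illustration only, not part of scribe: it assumes the offset points at the first byte of a segment in the original file and that the file starts with a standard ISA header. That header is fixed-length (106 bytes) and its last byte is the segment terminator. The function name `segment_at` is hypothetical.

```python
def segment_at(data: bytes, byte_offset: int) -> bytes:
    """Return the X12 segment starting at byte_offset, without its terminator.

    Assumption: byte_offset is the start of a segment in the original file.
    """
    if not data.startswith(b"ISA") or len(data) < 106:
        raise ValueError("not an X12 interchange: missing fixed-length ISA header")
    # The 106th byte of the ISA segment declares the segment terminator.
    terminator = data[105:106]
    end = data.find(terminator, byte_offset)
    if end == -1:
        # Last segment without a trailing terminator: read to end of data.
        end = len(data)
    return data[byte_offset:end]
```

Reading the terminator from the ISA header rather than hard-coding `~` keeps the helper usable on files that use a different terminator.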
Parse an 837 claim file and filter the emitted event stream:
scribe parse --type 837 claims.edi \
| jq 'select(.event_type=="SubscriberReferenced")'
Ingest multiple inputs into a replayable evidence stream:
scribe ingest --out journal.scribe \
--837 claims.edi \
--835 remit.edi
Stitch claim versions by matching 837 claim facts with 835 remittance facts, then populate read-store indexes:
scribe stitch claims \
--journal journal.scribe \
--read-store read_store.sqlite \
--out claim_aggregates.ndjson \
--notify-out notifications.ndjson
Claims are matched on the tokenised claim identifiers, CLM01 in the 837 and
CLP01 in the 835. Service lines are paired by procedure and charge, by date
when available, or by line order; the chosen match_method is included in the
aggregate output.
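The pairing order for service lines can be pictured as a sequence of passes, from the strictest key to the loosest. The following is a minimal sketch of that idea, not scribe's actual implementation. The `ServiceLine` type, its field names, and the `match_method` strings are all hypothetical, and the reading that line order is the final fallback is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServiceLine:
    procedure: str
    charge: str                 # kept as text to avoid float rounding
    date: Optional[str] = None  # not every line carries a service date


def pair_service_lines(claim_lines, remit_lines):
    """Return sorted (claim_index, remit_index, match_method) tuples."""
    pairs = []
    open_claim = list(range(len(claim_lines)))
    open_remit = list(range(len(remit_lines)))

    def run_pass(method, key):
        # Pair each still-open claim line with the first open remit line
        # that has the same key; lines whose key is None sit this pass out.
        for i in list(open_claim):
            wanted = key(claim_lines[i])
            if wanted is None:
                continue
            for j in open_remit:
                if key(remit_lines[j]) == wanted:
                    pairs.append((i, j, method))
                    open_claim.remove(i)
                    open_remit.remove(j)
                    break

    run_pass("procedure_charge_date",
             lambda l: (l.procedure, l.charge, l.date) if l.date else None)
    run_pass("procedure_charge", lambda l: (l.procedure, l.charge))
    # Whatever is left is paired purely by position.
    for i, j in zip(list(open_claim), list(open_remit)):
        pairs.append((i, j, "line_order"))
    return sorted(pairs)
```

Recording the method on every pair lets a reviewer see which matches rest on strong evidence and which are positional guesses.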
Project a balance from the journal:
scribe project balance \
--journal journal.scribe
scribe is a small C binary. Run it interactively, in shell pipelines, as a
serverless trigger when files arrive in S3, as a cron/batch job, as a K8S job, or
inside common schedulers.
For more advanced usage see scripts/stroke-demo.sh,
demo.sh, or scribe --help.
No releases yet, but binaries are attached to each Action run.
The case study script builds scribe first if needed. On Linux and Windows you'll need sqlite3 installed; see CI for the exact packages. macOS ships with it.
./scripts/stroke-demo.sh
or
cmake -S . -B build
cmake --build build
- Inputs: 834 enrollment, 837 claims, 835 remits, 270/271 eligibility
- Events: small auditable facts with source transaction, control numbers, segment index, byte offset, and optional run ID
- Journal: immutable binary evidence stream
- PHI vault: raw PHI resolver, separate from normal stores
- Read store: indexes, versioned aggregate snapshots, and latest rows
- Outputs: claim aggregates, member coverage, balances, and outbox facts
SQLite is used as a stand-in for a managed database to back the vault and read stores, but this can be swapped out in the future.
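Since the read store is a plain SQLite file, an application can read it with any SQLite client. The sketch below uses Python's standard `sqlite3` module and the `claim_aggregate_latest` table and columns that appear in the demo query in this README. It assumes `state_json` holds a JSON document. The function name `latest_claims` is hypothetical.

```python
import json
import sqlite3


def latest_claims(db_path: str):
    """Yield (aggregate_id, version, state) for every latest claim row."""
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute(
            "select aggregate_id, version, state_json "
            "from claim_aggregate_latest order by aggregate_id"
        )
        for aggregate_id, version, state_json in rows:
            # Assumption: state_json is a JSON-encoded aggregate snapshot.
            yield aggregate_id, version, json.loads(state_json)
    finally:
        con.close()
```

For example, `for claim_id, version, state in latest_claims("demo/stroke_read_store.sqlite"): ...` would walk the generated demo store.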
The synthetic stroke case study lives in tests/fixtures/stroke_encounter/. Generated reference output lives in demo/.
./scripts/stroke-demo.sh
./demo.sh
Inspect the claim latest table:
sqlite3 -header -column demo/stroke_read_store.sqlite "
select aggregate_id, version, state_json
from claim_aggregate_latest
order by aggregate_id;
"
See scripts/stroke-demo.sh and demo.sh for the full ingest, stitch, coverage, PHI, and balance command lines.
Default flows stay tokenised. Use --include-phi --phi-vault ... only for
controlled PHI read stores.
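To show what "tokenised by default, raw values in a separate vault" can look like, here is a small conceptual sketch. It does not describe scribe's actual token format or vault schema, neither of which is specified here. It uses a keyed HMAC-SHA256 as a stand-in technique, and the `tokenise` function, `PhiVault` class, and `tok_` prefix are all hypothetical.

```python
import hashlib
import hmac


def tokenise(value: str, key: bytes) -> str:
    """Derive a stable, non-reversible token for a sensitive value."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "tok_" + digest[:32]


class PhiVault:
    """In-memory stand-in for a separate store that maps tokens to raw PHI."""

    def __init__(self, key: bytes):
        self._key = key
        self._raw = {}

    def put(self, value: str) -> str:
        # Events would carry only the returned token, never the raw value.
        token = tokenise(value, self._key)
        self._raw[token] = value
        return token

    def resolve(self, token: str) -> str:
        # Only controlled flows should ever call this.
        return self._raw[token]
```

A deterministic token means the same identifier yields the same token in two different files, so records can still be joined without exposing the raw value.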
All PHI-looking fixture values are made up. The stroke case study is only inspired by a UK, non-US healthcare episode that I personally had. IDs, payer details, dates, amounts, and EDI content are made up.
- theory.md: compact model notes
- events.md: event names
- tests/fixtures/stroke_encounter/README.md: fixture map
MIT. See LICENSE.