Updated 2026-05-08 per Codex review (rounds 1 + 2) on #165. §5 added in round 1; round 2 reframed it as a BM25 oracle + escalation trigger, not a substitute for a hosted-search comparison. Hosted search remains a permanent contingency, not a question DuckDB FTS closes.
Sub-issue of #165. Depends on #170 (offline builder).
Goal
Implement the browser-side query path against the v1 substrate, behind a feature flag, and run the canonical benchmark to compare against the v1 contract budgets and the curated benchmark from #169.
Scope
1. Browser query path
- New cell in explorer.qmd (or extracted module): searchSubstrate(term).
- … term using the JS tokenizer from Explorer FTS Track 3: Offline index builder + tokenizer regression set #170.
- … pid IN (...) join back to samples_map_lite.parquet for display fields (label, source, lat, lng, place_name).
- … sourceFilterSQL() and facetFilterSQL() (existing helpers in explorer.qmd:377 / :528).
2. Feature flag
- ?fts=v1 routes doSearch() to searchSubstrate(term).
3. Benchmark run (browser-path)
- … tests/test_search_perf.py (from Explorer FTS Track 1a: Browser perf-smoke baseline #167) to run the canonical query set against both:
  - … ?fts=v1)
- … tests/search_substrate_benchmark_<YYYY-MM-DD>.json
Per-query metrics required in the benchmark JSON. This list is the contract between #171 (produce the data) and #172 (consume it as a hard-fail). Adding a hard-fail in #172 without adding the metric here means the gate evaluates against absent data.
- performance.measure('search-…')
- … ceramic, bone, mammal, +1-2)
- … pottery from Cyprus and pottery Cyprus
- … pottery pottery cyprus vs pottery cyprus
- … a the of) outcome
- … samples_map_lite row)
4. Worst-case + concept-label coverage
- Concept-only queries from the curated benchmark (ceramic, bone, mammal) must return non-empty results; this verifies that the v1 substrate dereferences vocabulary labels correctly. Failing any concept-only query is a hard fail, regardless of latency.
- Stopword-heavy queries (pottery from Cyprus) must return non-empty results; this verifies that query-time stopword removal works.
- Wildcard literals (%, _) handled by tokenizer (no ILIKE escape, since the substrate path doesn't use ILIKE).
5. DuckDB FTS local-only relevance oracle
Run a parallel relevance evaluation against a known-good BM25 system as a v1 quality oracle and escalation trigger. This anchors "does our static substrate approximate BM25 over the same document projection?"; it does not answer "is static-browser the right product boundary for good search?"
- … tools/build_fts_index.py (PR Improve search: multi-term AND + relevance ranking (FTS spike) #95 spike artifact) to build a local-only DuckDB FTS index over the same v1 sample search document projection.
What this oracle does NOT cover (and therefore does NOT close the hosted-search question):
- … explain traces)
These are reasons hosted search remains a permanent contingency; see §7 below + #172 NO-GO framing.
6. Acceptance criteria for the prototype
This issue ships the prototype + benchmark data. It does not decide GO/NO-GO; that's #172.
- … ?fts=v1
- Top-3 / top-10 overlap numbers vs hand-labeled set AND vs DuckDB FTS reference posted to this issue
7. Hosted-search backend as a permanent contingency (not just NO-GO downstream)
The DuckDB FTS oracle in §5 anchors v1 quality. It does not close the hosted-search question. Hosted search (Solr / Meilisearch / Typesense / equivalent) remains a contingency for either of the following triggers:
(b) v2+ quality requirements that exceed what a static substrate can deliver, e.g., phrase search, typo tolerance, richer analyzer pipelines, or v2 field growth that pushes the static substrate over its byte budget.
Either trigger fires the same downstream issue: Explorer FTS Track 6: Hosted-search backend. A v1 GO does not close (b); it just means we ship the static substrate and revisit hosted search when v2 requirements demand it.
Out of scope
- … ?fts=v1 flag and shipping (that's Explorer FTS Track 5: GO/NO-GO decision gate #172 GO).
Refs
#165, #169, #170, #172, PR #95
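
For the top-3 / top-10 overlap numbers that the relevance oracle comparison calls for, a minimal sketch of one way to compute them is below. The issue does not define the overlap formula, so this assumes plain set overlap within the top k (rank order ignored). The helper name `top_k_overlap` and the pid values are hypothetical and are not taken from tests/test_search_perf.py or tools/build_fts_index.py.

```python
from typing import Sequence


def top_k_overlap(candidate: Sequence[str], reference: Sequence[str], k: int) -> float:
    """Fraction of the reference top-k that also appears in the candidate top-k.

    Hypothetical helper: assumes set overlap is the intended metric and
    ignores rank order inside the top k.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    ref_top = set(reference[:k])
    if not ref_top:
        # No reference results for this query, so there is nothing to overlap with.
        return 0.0
    cand_top = set(candidate[:k])
    return len(cand_top & ref_top) / len(ref_top)


# Made-up pids for a single query. Real rankings would come from the
# substrate path and from the DuckDB FTS reference (or the hand-labeled set).
substrate_ranking = ["p1", "p2", "p3", "p4", "p5"]
reference_ranking = ["p2", "p1", "p9", "p3", "p4"]

overlap_at_3 = top_k_overlap(substrate_ranking, reference_ranking, k=3)
overlap_at_10 = top_k_overlap(substrate_ranking, reference_ranking, k=10)
print(f"top-3 overlap: {overlap_at_3:.2f}, top-10 overlap: {overlap_at_10:.2f}")
```

When a ranking has fewer than k entries, the sketch divides by the number of reference results actually available, so short result lists are not penalized for their length. Whether that is the right convention for the benchmark is a choice for the benchmark owners.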