feat: add hyperparams probe for inference parameter sweep (CWE-1434) by JakeBx · Pull Request #1772 · NVIDIA/garak

JakeBx · 2026-05-15T04:03:10Z

Adds garak/probes/hyperparams.py — a generic probe (HyperparamBasher) that sweeps generator inference parameters (temperature, top_p, top_k, etc.) across the prompts of any existing probe, measuring how model behaviour changes under non-default settings. Implements detection of CWE-1434: Insecure Setting of Generative AI/ML Model Inference Parameters. Also adds the CWE-1434 entry to garak/data/tags.misp.tsv.

Also adds garak/analyze/hyperparam_summary.py — a standalone post-run script that reads harness detector results from the report JSONL and prints a per-combo pass/fail table.

Closes #1233.

Design decisions

Probe-based rather than harness-based

The most complete solution may be a HyperparamHarness that runs any existing probe N times with different generator configs — full probe fidelity, zero base-class changes, normal parallelism per run. The trade-off is N full probe runs (one per param combo), separate report files per combo, and a new harness invocation path.

A probe-based approach fits the existing _generator_precall_hook pattern established by promptinject and divergence, ships as a single self-contained module, and produces one report with per-attempt notes. The cost is that source probe hooks and buffs are not fully inherited — only _attempt_prestore_hook is chained (see below). The harness remains the right answer for thorough sweeps of complex probes and is a possible follow-on.

Fail-fast on source probes with a custom `_generator_precall_hook`

promptinject and divergence.Repeat both define _generator_precall_hook with instance state that cannot be safely transferred to HyperparamBasher (e.g. promptinject reads attempt.notes["settings"] set by its own _attempt_prestore_hook; divergence reads self.override_maxlen). Rather than run silently with broken behaviour, HyperparamBasher raises PluginConfigurationError at initialisation. These probes are the correct candidates for the harness approach instead.

Generator compatibility — soft warning

HyperparamBasher sweeps params by calling setattr(generator, param, value) before each attempt. This works correctly for openai.OpenAICompatible and subclasses, which read instance attributes dynamically via inspect.signature when constructing API requests. Template-based generators such as rest.RestGenerator store the attribute but never interpolate it into the request body — the sweep has no effect and results are misleading. _validate_params_against_generator cannot detect this because the attribute exists on the generator; it simply goes unused.

Rather than a hard isinstance gate (which would break valid future generators that expose dynamic params without subclassing OpenAICompatible), the probe emits a prominent WARNING log if the generator's MRO does not include OpenAICompatible. A cleaner long-term solution could be that generators opt in by declaring supports_dynamic_inference_params = True as a class attribute. This shifts the burden to generator authors who actually know whether their class interpolates instance attributes into requests, avoids class hierarchy coupling, and is future-proof for generators that don't subclass OpenAICompatible.

Per-combo results via sidecar index + post-run script

The core value of this probe is identifying which parameter settings cause failures. Per-combo detection requires harness-produced detector_results, which are only available after probe() returns — attempts are serialised before harness detection runs, so attempt.detector_results is always {} at generation time.

An earlier version of this commit solved this inline: _summarise_by_combo() loaded the primary detector via _plugins.load_plugin, ran detection across all completed attempts inside probe(), and stored results in attempt.notes["hyperparam_detector_results"] to survive JSONL serialisation. This worked, but on review the architectural impact was a bit too icky — probes generate attempts, detectors classify them, and crossing that boundary doubles inference cost for any non-cached detector (e.g. an LLM-as-judge).

The current approach keeps the boundary clean:

Each attempt carries notes["hyperparam_combo"] for direct grouping in the report jsonl.
garak.analyze.hyperparam_summary is a standalone post-run script that reads detector_results from the harness-produced JSONL and prints the per-combo breakdown. The terminal prints a copy-pasteable invocation with the report path at the end of each run.

The UX trade-off (running one extra command after the probe) is preferable to a boundary violation with hidden cost. The inline version printed results immediately; the post-run script requires an extra step but operates on correct, harness-produced scores.

A follow up would be to update the report generation to display in the report results by parameter setting.

Verification

Smoke tested against mistralai/mistral-nemo via OpenRouter (openai.OpenAICompatible).

{
    "run": {"generations": 2, "soft_probe_prompt_cap": 3},
    "plugins": {
        "generators": {"openai": {"OpenAICompatible": {"uri": "https://openrouter.ai/api/v1"}}},
        "probes": {
            "hyperparams": {
                "HyperparamBasher": {
                    "source_probe": "packagehallucination.Python",
                    "param_space": {"temperature": [0.0, 1.5, 2.0]},
                    "sweep_strategy": "single"
                }
            }
        }
    }
}

export OPENAICOMPATIBLE_API_KEY=$OPENROUTER_API_KEY
python -m garak
--model_type openai.OpenAICompatible
--model_name mistralai/mistral-nemo
--config

View per-combo results (run after garak completes):
python -m garak.analyze.hyperparam_summary --report ~/.local/share/garak/.report.jsonl

Full verification smoke test configs and output: https://github.com/JakeBx/mtls-testing/tree/main/garak/param-basher

…eping Adds garak/probes/hyperparams.py — a generic probe (HyperparamBasher) that sweeps generator inference parameters across the prompts of any existing probe, measuring how model behaviour changes under non-default settings. Implements CWE-1434 detection. Adds CWE-1434 entry to tags.misp.tsv. Random sweep builds the full param_space Cartesian product upfront and draws without replacement via np.random.default_rng, avoiding the seen-set rejection loop that could silently under-sample dense spaces. Supports random_seed for reproducible sweeps without contaminating global random state. Emits a prominent WARNING (logged and printed to terminal) if the generator's MRO does not include OpenAICompatible, catching the silent-failure mode where template-based generators store swept attributes but never interpolate them into requests. Closes NVIDIA#1233. Signed-off-by: JakeBx <jacob.j.lee@live.com>

Remove _summarise_by_combo() from HyperparamBasher, which violated the probe/detector boundary by loading and running a detector inside probe(). This also doubled inference cost for non-cached detectors and relied on a timing workaround (writing to attempt.notes) to survive JSONL serialisation before harness detection runs. Replace with garak/analyze/hyperparam_summary.py: standalone post-run script that reads detector_results from the harness-produced JSONL and prints the per-combo pass/fail table; invoke with `python -m garak.analyze.hyperparam_summary --report <path> probe() now prints a copy-pasteable command with the report path after each run. Signed-off-by: JakeBx <jacob.j.lee@live.com>

Signed-off-by: JakeBx <jacob.j.lee@live.com>

JakeBx force-pushed the feat/hyperparam-basher branch from 349e6ea to 5715905 Compare May 15, 2026 04:16

JakeBx marked this pull request as ready for review May 16, 2026 10:51

JakeBx force-pushed the feat/hyperparam-basher branch from 83d48f8 to 5a69e35 Compare May 16, 2026 22:59

JakeBx force-pushed the feat/hyperparam-basher branch from 5b38923 to 0e064a8 Compare May 17, 2026 02:16

JakeBx force-pushed the feat/hyperparam-basher branch from 2a65393 to e102573 Compare May 17, 2026 12:42

fix: refactor means parallel is now viable

3968e2f

Signed-off-by: JakeBx <jacob.j.lee@live.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add hyperparams probe for inference parameter sweep (CWE-1434)#1772

feat: add hyperparams probe for inference parameter sweep (CWE-1434)#1772
JakeBx wants to merge 3 commits into
NVIDIA:mainfrom
JakeBx:feat/hyperparam-basher

JakeBx commented May 15, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

JakeBx commented May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Design decisions

Probe-based rather than harness-based

Fail-fast on source probes with a custom _generator_precall_hook

Generator compatibility — soft warning

Per-combo results via sidecar index + post-run script

Verification

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

JakeBx commented May 15, 2026 •

edited

Loading

Fail-fast on source probes with a custom `_generator_precall_hook`