Add MulticlassJudge detector for configurable LLM-as-judge classification #1773

Open
ABeltramo wants to merge 5 commits into NVIDIA:feature/technique_intent from trustyai-explainability:feature/multiclass-judge


@ABeltramo
Collaborator

Introduces a new MulticlassJudge detector that extends ModelAsJudge with JSON-aware response parsing and user-defined classification categories (e.g. complied/rejected/alternative/other). Supports configurable system and user prompts, custom score keys/fields, confidence thresholds, and optional JSON schema injection for structured output APIs.
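
As a rough illustration of the JSON-aware parsing and confidence threshold described above, here is a minimal standalone sketch. It is not the code in this PR and does not use garak's API: the function name `classify_judge_response`, the field names `classification` and `confidence`, the default threshold of 0.5, and the fallback to `other` are all assumptions made for the example. Only the four category names come from the description.

```python
import json
import re

# Category names taken from the PR description's example.
CATEGORIES = ("complied", "rejected", "alternative", "other")


def classify_judge_response(
    text,
    categories=CATEGORIES,
    label_field="classification",   # hypothetical field name
    confidence_field="confidence",  # hypothetical field name
    threshold=0.5,                  # hypothetical default
    fallback="other",
):
    """Map a judge model's raw reply to one of the allowed categories.

    Returns `fallback` when no JSON object can be parsed, the label is not
    an allowed category, or the reported confidence is below `threshold`.
    """
    # Judge models often wrap JSON in prose, so grab the outermost {...} span.
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        return fallback
    try:
        payload = json.loads(match.group(0))
    except json.JSONDecodeError:
        return fallback

    label = str(payload.get(label_field, "")).strip().lower()
    if label not in categories:
        return fallback

    try:
        confidence = float(payload.get(confidence_field, 0.0))
    except (TypeError, ValueError):
        return fallback
    return label if confidence >= threshold else fallback
```

For example, `classify_judge_response('Verdict: {"classification": "Rejected", "confidence": 0.9}')` yields `rejected`, while the same label with a confidence of 0.2 falls back to `other`.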

Cherry-picked from trustyai-explainability/garak:automated-red-teaming

hjrnunes added 3 commits May 15, 2026 07:41
…tion


Signed-off-by: ABeltramo <beltramo.ale@gmail.com>
Signed-off-by: ABeltramo <beltramo.ale@gmail.com>
Signed-off-by: ABeltramo <beltramo.ale@gmail.com>
ABeltramo force-pushed the feature/multiclass-judge branch from e89c4fa to 0cfc5f0 on May 15, 2026 06:41
@ABeltramo
Collaborator Author

From a quick glance it seems that the CI failures are unrelated to this PR.
The only failing test is tests/generators/test_litellm.py::test_litellm_model_detection, which fails with:

  openai.OpenAIError: Missing credentials. Please pass an `api_key` ...

This test has no OPENAI_API_KEY skip guard (unlike the other tests in the same file), so it runs unconditionally whenever litellm is installed. The same test passes on our fork's CI, which ran against a slightly older package version, so I guess this is just a flaky test.
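
For readers unfamiliar with the term, a skip guard of the kind mentioned above can be sketched as follows. This is an illustrative example, not the contents of tests/generators/test_litellm.py: the helper `requires_env` and the placeholder test are hypothetical, and the sketch uses the standard library's `unittest.skipUnless` so it stays self-contained. In a pytest suite the usual equivalent is a `pytest.mark.skipif` marker with the same condition.

```python
import os
import unittest


def requires_env(var_name):
    """Return a decorator that skips a test when `var_name` is unset or empty.

    Hypothetical helper for illustration. The environment is checked once,
    when the decorator is created.
    """
    return unittest.skipUnless(
        os.getenv(var_name),
        f"{var_name} is not set; skipping test that needs live credentials",
    )


@requires_env("OPENAI_API_KEY")
def test_model_detection_placeholder():
    # Placeholder body: a real test would construct the generator here,
    # which is the step that fails when no API key is available.
    return "ran"
```

With the guard in place, a run without `OPENAI_API_KEY` reports the test as skipped instead of failing with a missing-credentials error.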

@jmartin-tech
Collaborator

The test failure comes from an environment requirements change in the litellm dependency released yesterday. A PR to address it in main should be up by end of day and will then be propagated to the feature branch.

ABeltramo added 2 commits May 19, 2026 08:51
Signed-off-by: ABeltramo <beltramo.ale@gmail.com>
Signed-off-by: ABeltramo <beltramo.ale@gmail.com>