Implement OpenAIResponsesGenerator by ABeltramo · Pull Request #1806 · NVIDIA/garak

ABeltramo · 2026-05-28T09:26:55Z

Implements a new OpenAI compatible generator OpenAIResponsesGenerator that uses the responses API instead of ChatCompletion.

This started as way to run Agent Breaker against agents that support the responses API, but I think it's valuable as a standalone Generator for any probe.
I'm currently storing the returned tool calls as Message notes, mainly so that I can do manual inspection of the attempts in the report.jsonl file. Ideally, I think there should be a generic mechanism for storing those so that detectors can pick those up properly. Probably outside the scope for this PR, just wanted to know if you already have a direction for integrating tool calling into Garak.

Verification

The new Generator can be used like others OpenAI compatible generators, ex:

  generators:
    openai:
      OpenAIResponsesGenerator:
        uri: http://localhost:1234/v1/
        api_key: dummy 
        tools:
          - type: mcp
            server_label: mcp_server
            server_url: http://localhost:8888/sse
            require_approval: "never"

Signed-off-by: ABeltramo <beltramo.ale@gmail.com>

jmartin-tech

This PR looks pretty useful, the new formats are something we are seeing more middle layers starting to mimic.

Some overall asks:

Please rebase this on main and adjust the PR target.
Given the comments made here about OpenAICompatible this might be functionality that has value in finding a way to integrate there.

jmartin-tech · 2026-05-28T12:46:50Z

    }


+class OpenAIResponsesGenerator(Generator):


This should probably extend OpenAICompatible or possibly be integrated into that class.

If kept separate, I would even go as far as suggesting this should be OpenAIReponsesCompatible as the intent described in the description suggests usage with any service that supports the Responses APi.

jmartin-tech · 2026-05-28T12:53:43Z

+        # response.output item types to collect into the final text.
+        # "message" captures standard assistant text; "reasoning" captures
+        # the model's reasoning summary (ResponseReasoningItem).
+        "output_types": ["message"],


This needs first class support, reasoning traces are additional data that garak needs to start gathering during the inference phase. As a first to support view I could see storing them as notes on the message but I don't think we should land them as an alternative value collected like this option suggests would be possible.

While is suspect the intent of making this a list is to gather both the option to not include messages becomes implied as valid, I am suggesting that should not be the case.

jmartin-tech · 2026-05-28T13:02:33Z

+        "uri": None,
+        "instructions": None,
+        "tools": [],
+        "max_output_tokens": 1024,


This value should map to max_tokens. Since this generator is targeting the responses API this class should promote the garak standard generator option of max_tokens to be max_output_tokens during __init__ of this class instead of exposing max_output_tokens.

I understand this diverges from how OpenAIReasoningGenerator handled max_completion_tokens, however the evolution of how max_tokens is treated needs to standardize.

jmartin-tech · 2026-05-28T13:05:37Z

+        super().__init__(self.name, config_root=config_root)
+
+    @staticmethod
+    def _build_input(prompt: Union[Conversation, str]):


No need to support string here, all prompts must be of type Conversation at this time.

Suggested change

def _build_input(prompt: Union[Conversation, str]):

def _build_input(prompt: Conversation):

jmartin-tech · 2026-05-29T14:18:57Z

+            if item_type == "message":
+                for part in item.content:
+                    if getattr(part, "type", None) == "output_text":
+                        text_parts.append(part.text)
+            elif item_type == "reasoning":
+                for summary in getattr(item, "summary", []):
+                    if getattr(summary, "type", None) == "summary_text":
+                        text_parts.append(summary.text)


Merging message text and reasoning text without may marker to delineate between them is not viable. Reasoning either needs to be stored separately or enclosed in something like the <think></think> tag from legacy APIs to ensure garak can properly exclude reasoning from the responses.

jmartin-tech · 2026-05-29T14:36:05Z

+        instructions = self.instructions or self._extract_system_prompt(prompt)
+        create_args = {
+            "model": self.name,
+            "input": self._build_input(prompt),
+            "max_output_tokens": self.max_output_tokens,
+        }


While I understand the format is different here, _extract_system_prompt and _build_input do not take into account additional input modalities supported by the Responses API that are already incorporated in OpenAICompatible._conversation_to_list(). This can be refactored to share that ETL process and enable this generator to support the additional input modalities.

jmartin-tech · 2026-05-29T14:39:13Z

+        self.key_env_var = self.ENV_VAR
+        self._load_unsafe()
+        super().__init__(self.name, config_root=config_root)


The Configurable class handles ENV_VAR support no need set here. Also _load_unsafe() should be called after all initialization and validation has completed.

Suggested change

self.key_env_var = self.ENV_VAR

self._load_unsafe()

super().__init__(self.name, config_root=config_root)

super().__init__(self.name, config_root=config_root)

self._load_unsafe()

feat: implement OpenAIResponsesGenerator

169a5cc

Signed-off-by: ABeltramo <beltramo.ale@gmail.com>

jmartin-tech requested changes May 29, 2026

View reviewed changes

jmartin-tech mentioned this pull request May 29, 2026

probe: pdf injection via hidden text in documents #1757

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement OpenAIResponsesGenerator#1806

Implement OpenAIResponsesGenerator#1806
ABeltramo wants to merge 1 commit into
NVIDIA:feature/technique_intentfrom
trustyai-explainability:feature/openai-responses-generator

ABeltramo commented May 28, 2026

Uh oh!

jmartin-tech left a comment

Uh oh!

jmartin-tech May 28, 2026

Uh oh!

jmartin-tech May 28, 2026

Uh oh!

jmartin-tech May 28, 2026

Uh oh!

jmartin-tech May 28, 2026

Uh oh!

jmartin-tech May 29, 2026

Uh oh!

jmartin-tech May 29, 2026

Uh oh!

jmartin-tech May 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	def _build_input(prompt: Union[Conversation, str]):
	def _build_input(prompt: Conversation):

Conversation

ABeltramo commented May 28, 2026

Verification

Uh oh!

jmartin-tech left a comment

Choose a reason for hiding this comment

Uh oh!

jmartin-tech May 28, 2026

Choose a reason for hiding this comment

Uh oh!

jmartin-tech May 28, 2026

Choose a reason for hiding this comment

Uh oh!

jmartin-tech May 28, 2026

Choose a reason for hiding this comment

Uh oh!

jmartin-tech May 28, 2026

Choose a reason for hiding this comment

Uh oh!

jmartin-tech May 29, 2026

Choose a reason for hiding this comment

Uh oh!

jmartin-tech May 29, 2026

Choose a reason for hiding this comment

Uh oh!

jmartin-tech May 29, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants