fix(divergence): guard max_tokens override when the generator lacks it#1795
fix(divergence): guard max_tokens override when the generator lacks it#1795adityasingh2400 wants to merge 2 commits into
Conversation
generators.function.Single overrode DEFAULT_PARAMS with only {"kwargs": {}},
so it did not pick up the base Generator.DEFAULT_PARAMS values. Because
_apply_missing_instance_defaults iterates over the most-derived
DEFAULT_PARAMS, params such as max_tokens, temperature, top_k, context_len,
skip_seq_start and skip_seq_end were never set as instance attributes.
Accessing them (e.g. generator.max_tokens) raised AttributeError.
Merge Generator.DEFAULT_PARAMS into the class dict, matching the pattern
used by the other generators (openai, rest, huggingface, cohere, ollama,
etc.). Multiple inherits from Single and is covered as well.
Adds a regression test asserting both Single and Multiple expose the
inherited defaults while keeping the function-specific kwargs default.
Fixes NVIDIA#1096
Signed-off-by: Aditya Singh <adisin650@gmail.com>
jmartin-tech
left a comment
There was a problem hiding this comment.
Is this how we want the function generators to work? Note that this would not pass these options to the function as only kwargs are passed by design. The current docs define that only the prompt as a str and the kwargs dictionary provided as static config are passed to these generators.
As noted in the reporting issue, there are currently two probes that expect to set generator options in pre or post inference stages. The promptinject probe guards to ensure the attribute is on the generator. I would suggest that divergence should also guard the expectation that max_tokens will exist on a the generator though I do think there might be some value in ensuring the base.Probe.DEFAULT_PARAMS are made available to function generators.
I could seen enhancing the function generators to inspect the signature of the function passed and add any generator attributes with matching names into the kwargs dictionary passed at inference time.
+1 on all of this. I think it changes the way that |
|
Good points, and I agree the current change is not the right shape on its own. Since function generators only receive the prompt and the static kwargs by design, inheriting DEFAULT_PARAMS on the generator does not actually make max_tokens reach the function, so it does not fully fix the divergence case and it blurs how function differs from the other generators. I think the cleanest split is what you both described:
I am happy to rework this PR down to item 1 as the focused fix, and to add item 2 if you want function generators to surface those params. Which would you prefer, or both? |
|
Refocus the PR on a defined target, the reported issue seem to focus on an expectation that all probes will have some parameters available which is simply not true today. Accounting for that gap should be the first iteration. If you would like to address enhancing the capabilities of |
|
Done in 3d0a50e, refocused as you suggested.
|
|
DCO sign-off is required in all commit messages of the branch.
|
…erators Per review, refocus on the reported gap: not every generator exposes max_tokens, so the divergence probe guards the attribute before overriding it, the same way promptinject does with 'in dir(generator)'. Revert the function generator DEFAULT_PARAMS change, which is better handled as a separate enhancement PR. Fixes NVIDIA#1096. Signed-off-by: Aditya Singh <adisin650@gmail.com>
3d0a50e to
a901de6
Compare
|
Thanks for flagging. I added the DCO sign-off to all commits on the branch and force-pushed, so every commit now carries the Signed-off-by trailer and the DCO check is green. Let me know if anything else is needed. |
Fixes #1096.
Refocused per review. The reported problem is that the divergence probe's
_generator_precall_hookreadsgenerator.max_tokensunconditionally, so generators that do not carrymax_tokens(the function generators, which by design only receive the prompt and static kwargs) raiseAttributeErrorwhen run with this probe.This guards the attribute the same way the promptinject probe does, checking that
max_tokensis present on the generator before overriding it, and restoring it afterward only when it was actually changed. Generators withoutmax_tokensare left untouched.The earlier approach of having the function generators inherit the base
DEFAULT_PARAMSis dropped here. As discussed, that changes how function generators are configured and is better evaluated as a separate enhancement PR on its own merits.