Bill inference to an org + fix per-project .env loading #324
Open
FabienDanieau wants to merge 2 commits into
All HF Router LLM calls (main agent, research sub-agent, and context compaction) previously billed the token owner's personal monthly Inference Providers allowance, with no way to redirect to an org. When that allowance is exhausted the router returns 402 and the turn dies. Add an optional HF_BILL_TO env var. When set, an X-HF-Bill-To header is attached to every HF Router call so usage is charged to that org's credits instead. This mirrors huggingface_hub's `bill_to=` constructor arg, which sets the same header; we set it directly because we drive the router through LiteLLM's OpenAI-compatible path rather than the InferenceClient. Local OpenAI-compatible endpoints never receive it. Assisted-by: Claude:claude-opus-4-8
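For reviewers, the header gating described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper name `bill_to_headers` and the router hostname check are assumptions made for the example. Only `HF_BILL_TO` and `X-HF-Bill-To` come from the commit message.

```python
import os
from urllib.parse import urlparse

# Assumption for this sketch: HF Router calls are recognised by hostname.
HF_ROUTER_HOST = "router.huggingface.co"


def bill_to_headers(base_url, env=None):
    """Return extra headers for one LLM call (hypothetical helper).

    The X-HF-Bill-To header is produced only when HF_BILL_TO is set and the
    call targets the HF Router, so local OpenAI-compatible endpoints get none.
    """
    env = os.environ if env is None else env
    org = (env.get("HF_BILL_TO") or "").strip()
    host = urlparse(base_url).hostname or ""
    if not org or host != HF_ROUTER_HOST:
        return {}
    return {"X-HF-Bill-To": org}
```

The returned dict would then be merged into whatever extra-headers argument the LLM client accepts. An empty or whitespace-only `HF_BILL_TO` is treated as unset here, which is a design choice of the sketch rather than something the PR states.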
load_config was intended to read a .env from the directory the user launches ml-intern from, but it called load_dotenv() with no path. python-dotenv's find_dotenv then walks up from config.py's own location (inside the repo), never from the launch CWD, so a per-project .env was silently ignored and only the ml-intern repo's own .env was ever loaded. Resolve the launch-directory .env explicitly with find_dotenv(usecwd=True) and load it after the repo .env, so it fills in any vars the repo .env didn't set. Precedence is unchanged: the repo .env still wins on conflicts. Assisted-by: Claude:claude-opus-4-8
Two independent fixes, prompted by hitting a 402 from the router: the monthly Inference Providers credits on a personal account were exhausted, and there was no way to redirect billing.
feat: HF_BILL_TO env var. All HF Router calls (main agent, research sub-agent, compaction) previously billed the token owner's personal allowance. Set HF_BILL_TO to an org to attach an X-HF-Bill-To header and charge that org's credits instead. Mirrors huggingface_hub's bill_to=; local OpenAI-compatible endpoints never receive the header.
fix: launch-directory .env ignored. load_config was meant to read a .env from the directory you run ml-intern from, but a bare load_dotenv() makes find_dotenv walk up from config.py's own location, never from the launch CWD, so a per-project .env was silently dropped and only the repo's own .env loaded. The launch-directory .env is now resolved explicitly with find_dotenv(usecwd=True). Precedence is unchanged: the repo .env still wins on conflicts.
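To make the precedence concrete, here is a stdlib-only sketch of the load order. It is a stand-in for python-dotenv, not the PR's code: `find_upwards` imitates the upward search that find_dotenv performs, and `fill_from_env_file` imitates a non-overriding load. All three function names are hypothetical, and the parser is deliberately simplistic.

```python
import os
from pathlib import Path


def find_upwards(start, name=".env"):
    """Return the first `name` found in `start` or any parent, else None."""
    start = Path(start).resolve()
    for directory in (start, *start.parents):
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None


def fill_from_env_file(path, env):
    """Set KEY=VALUE pairs from `path` only where KEY is not already set."""
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env.setdefault(key.strip(), value.strip().strip("'\""))


def load_config_env(repo_dir, launch_dir, env=None):
    """Load the repo .env first, then let the launch-directory .env fill gaps."""
    env = os.environ if env is None else env
    for start in (repo_dir, launch_dir):  # repo first, so it wins on conflicts
        path = find_upwards(start)
        if path is not None:
            fill_from_env_file(path, env)
    return env
```

Because neither file overrides values that are already present, variables exported in the shell would take priority over both files in this sketch, the repo .env comes next, and the per-project .env only supplies what is still missing.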
Testing