Reduce translator-query skill token usage #7

Open
sierra-moxon wants to merge 3 commits into main from reduce-skill-token-usage

Conversation

@sierra-moxon

Member

Summary

  • Token cost scales with assistant turn count, because the cached prompt prefix is re-sent on every turn. A sample session used 17 turns and 560,075 tokens, 86% of them cache_read.
  • Add an Efficiency section to SKILL.md: work silently (no narration between tool calls), batch independent commands, plan hops up front.
  • Add drug-class guidance to Strategy 1 so class-member queries are issued deliberately instead of after a surprise re-query.
  • Move Strategy 3/4 and Domain Knowledge into an on-demand REFERENCE.md so they no longer load into the prefix on every turn.
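
To make the turn-count argument concrete, here is a rough arithmetic sketch using only the sample-session figures quoted above (17 turns, 560,075 tokens, 86% cache_read). The function names and the linear projection are illustrative assumptions, not part of the skill or its benchmark tooling; in particular, the projection assumes the per-turn cache_read cost stays constant, which is only an approximation because the prefix grows as a session proceeds.

```python
def session_breakdown(total_tokens: int, turns: int, cache_read_pct: int) -> dict:
    """Split a session's token total into cache_read and everything else.

    Integer arithmetic is used throughout so the results are exact.
    """
    cache_read = total_tokens * cache_read_pct // 100
    return {
        "cache_read_tokens": cache_read,
        "other_tokens": total_tokens - cache_read,
        "tokens_per_turn": total_tokens // turns,
        "cache_read_per_turn": cache_read // turns,
    }


def project_total(breakdown: dict, new_turns: int) -> int:
    """Crude linear projection (an assumption, not a measurement):
    non-cache tokens stay fixed, cache_read scales with turn count."""
    return breakdown["other_tokens"] + breakdown["cache_read_per_turn"] * new_turns


# Figures from the sample session in the summary.
sample = session_breakdown(total_tokens=560_075, turns=17, cache_read_pct=86)
print(sample)
print(project_total(sample, new_turns=10))
```

Under this simplified model, each turn removed saves roughly the per-turn cache_read amount, which is why the changes target both the number of turns and the size of the always-loaded prefix.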

Closes #6

Test plan

  • Run the translator-query skill on a drug-class question (e.g. "related drugs to wegovy") and confirm fewer assistant turns and lower total_tokens in benchmarks/results/tct_session_metrics.csv.
  • Confirm the model still reaches REFERENCE.md when a multi-hop / single-KP task needs it.
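
The first check could be scripted along these lines. This is a minimal sketch: the test plan only names the file and the `total_tokens` column, so the `assistant_turns` column name and both helper functions are hypothetical and would need adjusting to the real CSV layout.

```python
import csv
from typing import Iterable


def summarize_sessions(
    lines: Iterable[str],
    turns_col: str = "assistant_turns",  # hypothetical column name
    tokens_col: str = "total_tokens",    # column named in the test plan
) -> list:
    """Parse CSV lines and return turns, total tokens, and tokens per turn."""
    summaries = []
    for row in csv.DictReader(lines):
        turns = int(row[turns_col])
        tokens = int(row[tokens_col])
        summaries.append(
            {
                "turns": turns,
                "total_tokens": tokens,
                "tokens_per_turn": tokens // turns,
            }
        )
    return summaries


def improved(before: dict, after: dict) -> bool:
    """True when the later session used fewer turns and fewer total tokens."""
    return (
        after["turns"] < before["turns"]
        and after["total_tokens"] < before["total_tokens"]
    )


# Usage against the real file (path taken from the test plan):
# with open("benchmarks/results/tct_session_metrics.csv", newline="") as f:
#     sessions = summarize_sessions(f)
#     print(improved(sessions[0], sessions[-1]))
```

Comparing the first and last rows assumes the CSV is ordered chronologically; if it is not, select the before and after sessions explicitly.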

🤖 Generated with Claude Code

sierra-moxon and others added 3 commits June 5, 2026 16:41
Token cost scales with assistant turn count because the cached prompt
prefix is re-sent every turn. Cut turns and shrink the always-loaded
prefix:

- Add an Efficiency section: work silently (no narration between tool
  calls), batch independent commands, plan hops up front.
- Add drug-class guidance to Strategy 1 so class-member queries are
  issued deliberately instead of after a surprise re-query.
- Move Strategy 3/4 and Domain Knowledge to an on-demand REFERENCE.md
  so they no longer load into context on every turn.

Closes #6

Co-Authored-By: Claude <noreply@anthropic.com>