vortex opt-in triton kernels + gene completion eval #225
garykbrixi commented Jun 17, 2026
- Add support for Vortex's opt-in Triton kernels
- Add the gene completion eval from the Evo 2 paper
- Default fp8 to true for 7b 8k
- Update the Evo 2 version
Release the gene completion benchmark from the Evo 2 paper: prompt the model with the start of a gene, complete it, and measure percent amino-acid (AA) recovery over the non-prompt region. Includes the prokaryote (MSA-scored) and eukaryote (exonerate-scored) panels, evaluation code against the public Evo 2 API, the gene panels with reference proteins, and precomputed per-gene results for the headline models (1B base, 7B, 20B, 40B). Addresses ArcInstitute#217.
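To make the metric concrete, here is a minimal sketch of percent AA recovery over the non-prompt region. It assumes the reference and generated proteins have already been aligned (the benchmark itself scores with an MSA or exonerate, which this sketch does not reproduce); the function name `percent_aa_recovery` and its arguments are hypothetical and not part of the PR.

```python
def percent_aa_recovery(ref_aln: str, gen_aln: str, prompt_residues: int) -> float:
    """Percent of non-prompt reference residues matched by the generated protein.

    ref_aln and gen_aln are the two rows of a pairwise alignment (equal length,
    '-' marks a gap). prompt_residues is the number of reference residues
    encoded by the prompt; those positions are excluded from scoring.
    Hypothetical helper for illustration only.
    """
    if len(ref_aln) != len(gen_aln):
        raise ValueError("aligned sequences must have equal length")
    seen = 0     # reference residues walked so far
    scored = 0   # reference residues outside the prompt
    matched = 0  # scored residues that are identical in the generated row
    for r, g in zip(ref_aln, gen_aln):
        if r == "-":
            continue  # insertion in the generated sequence: no reference residue here
        seen += 1
        if seen <= prompt_residues:
            continue  # still inside the prompt region
        scored += 1
        if g == r:
            matched += 1
    return 100.0 * matched / scored if scored else 0.0


# Three prompt residues (MKT) are skipped; A, Y, A match out of A, Y, I, A, K.
recovery = percent_aa_recovery("MKTAYIAK", "MKTAYLA-", prompt_residues=3)
```

A gap in the generated row counts as a miss, so a truncated completion lowers the score rather than being ignored.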
Code Review
This pull request introduces support for Vortex's opt-in Triton inference kernels to accelerate Hyena convolutions, updates the vtx dependency, and adds a new gene completion benchmark under scripts/gene_completion/. Feedback on the benchmark script recommends safer unpacking of the generation output using the .sequences attribute, failing early if the exonerate dependency is missing before loading the model, and adding the --use_kernels command-line argument to allow benchmarking with the newly introduced Triton kernels.
seqs, _ = evo_model.generate(
    prompt_seqs=[prompt] * n,
    n_tokens=n_tokens,
    temperature=temperature,
    top_k=top_k,
    batched=True,
    cached_generation=True,
    verbose=0,
    force_prompt_threshold=min(200, len(prompt)),
)
Unpacking the output of evo_model.generate as seqs, _ is risky: if the return type is a dotdict or dictionary, the unpacking can fail or silently bind dictionary keys instead of values. It is safer to use the documented .sequences attribute directly, which also matches the usage in README.md and test_evo2_generation.py.
gen_out = evo_model.generate(
    prompt_seqs=[prompt] * n,
    n_tokens=n_tokens,
    temperature=temperature,
    top_k=top_k,
    batched=True,
    cached_generation=True,
    verbose=0,
    force_prompt_threshold=min(200, len(prompt)),
)
seqs = gen_out.sequences
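The failure mode described in this comment can be reproduced with plain Python. The sketch below uses a stand-in `DotDict` class and a made-up second field, `logprobs_mean`; both are assumptions for illustration and are not claimed to match the object that evo_model.generate actually returns.

```python
class DotDict(dict):
    """Minimal attribute-access dict (hypothetical stand-in, not the library's class)."""
    __getattr__ = dict.__getitem__


# Hypothetical generation output with two fields.
out = DotDict(sequences=["ACGT", "TTGA"], logprobs_mean=[-0.1, -0.2])

# Tuple-unpacking a mapping iterates over its keys, so with exactly two keys
# this succeeds silently and binds the key name, not the sequences.
seqs_wrong, _ = out

# Attribute access returns the actual list of generated sequences.
seqs_right = out.sequences

# With any other number of keys the same unpacking raises ValueError instead.
try:
    a, b = DotDict(sequences=[], logprobs_mean=[], extra=[])
    unpack_failed = False
except ValueError:
    unpack_failed = True
```

Here `seqs_wrong` ends up as the string "sequences", which downstream code would then treat as a generated sequence.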
# Lazily import Evo 2 so --help / dataset inspection works without a GPU.
from evo2 import Evo2
evo_model = Evo2(args.model_name)
if df.empty:
    raise SystemExit("No genes selected.")
For the eukaryote panel, check if exonerate is available early in evaluate before loading the model. Loading a large model (e.g., 7B or 40B) takes significant time and VRAM, so failing early if dependencies are missing provides a much better user experience.
if df.empty:
    raise SystemExit("No genes selected.")
if args.panel == "eukaryote":
    from scoring import _resolve
    try:
        _resolve(args.exonerate_path)
    except FileNotFoundError as e:
        raise SystemExit(str(e))
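For readers without the scoring module at hand, the same fail-early idea can be sketched with the standard library alone. `require_executable` and `preflight` below are hypothetical helpers; how the PR's `_resolve` is implemented is not shown here, and this sketch only assumes it raises FileNotFoundError when the binary is missing.

```python
import shutil


def require_executable(name_or_path: str) -> str:
    """Return the resolved path of an executable, or raise FileNotFoundError.

    Hypothetical helper: shutil.which searches PATH for a bare name and
    checks the file directly when given a path.
    """
    resolved = shutil.which(name_or_path)
    if resolved is None:
        raise FileNotFoundError(
            f"'{name_or_path}' not found; install it or pass an explicit path"
        )
    return resolved


def preflight(panel: str, exonerate_path: str = "exonerate") -> None:
    """Run cheap dependency checks before any expensive model load."""
    if panel == "eukaryote":
        try:
            require_executable(exonerate_path)
        except FileNotFoundError as e:
            # Exit with a readable message instead of a traceback.
            raise SystemExit(str(e))
```

Calling `preflight(args.panel, args.exonerate_path)` before constructing the model means a missing binary is reported within milliseconds rather than after the weights have been loaded.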
p.add_argument("--top_k", type=int, default=4)
p.add_argument("--batch_size", type=int, default=8)
p.add_argument("--n_tokens", type=int, default=None, help="Default: cover the remainder")
p.add_argument("--exonerate_path", default="exonerate")
Add the --use_kernels command-line argument to the parser so that users can opt in to the Triton inference kernels when running the benchmark.
p.add_argument("--exonerate_path", default="exonerate")
p.add_argument("--use_kernels", action="store_true",
               help="Enable Vortex opt-in Triton HC{S,M,L} inference kernels (requires vtx>=1.1.0)")
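As a quick check of how the suggested flag behaves, the sketch below builds a cut-down parser with only the two arguments from the suggestion. `build_parser` is a hypothetical wrapper; how the resulting value is forwarded to the model constructor is not shown in this thread and is left out here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Cut-down parser for illustration (hypothetical wrapper, not the PR's code)."""
    p = argparse.ArgumentParser(description="Gene completion benchmark (sketch)")
    p.add_argument("--exonerate_path", default="exonerate")
    # store_true makes the flag opt-in: False unless passed on the command line.
    p.add_argument("--use_kernels", action="store_true",
                   help="Enable Vortex opt-in Triton inference kernels")
    return p


default_args = build_parser().parse_args([])
kernel_args = build_parser().parse_args(["--use_kernels"])
```

Because the default is False, existing invocations of the benchmark keep their current behaviour, and only runs that pass `--use_kernels` exercise the new kernels.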
PR tested by @danielchang2002