Pull requests: sgl-project/SpecForge
Expose flex_attention kernel options in DFlash and Domino training
#586
opened Jun 18, 2026 by
heiheiha798
[Feature] VLM DFlash Training: Multi-Model Support for Qwen3-VL / Qwen3.5 / Qwen3.6
#585
opened Jun 18, 2026 by
zyk42
Add Ascend NPU support for the Domino training pipeline (Qwen3.5-4B)
#584
opened Jun 18, 2026 by
curnane-lab
feat: Train-Inference Disaggregation for Remote Target Model Serving
#573
opened Jun 2, 2026 by
moehanabi
Contributor
4 of 6 tasks
feat: add automatic device detection for non-CUDA backends
#559
opened May 25, 2026 by
curnane-lab
6 tasks done
Support sharded target logits for EAGLE3 online training
#558
opened May 25, 2026 by
Yukino256
6 tasks
Supports eagle3 training for Gemma3 27B and Gemma4 26B.
#553
opened May 1, 2026 by
pyc96
Collaborator
6 tasks
Add transformers-like checkpoint parameters (--save-total-limit, --save-strategy, and so on)
#547
opened Apr 27, 2026 by
thechaos16
2 of 6 tasks
feat: add is_vlm param to safe_conversations_generator for multimodal data support
#545
opened Apr 24, 2026 by
sunny-infra
2 of 6 tasks
chore: regenerate_train_data accepts API key and https URL
#543
opened Apr 24, 2026 by
lianakoleva
1 of 6 tasks
add the configs for qwen3-vl-8b-instruct model
#542
opened Apr 23, 2026 by
sunny-infra
1 of 6 tasks
fix: EAGLE-3 training compatibility with multimodal-wrapped targets and large vocabs
#535
opened Apr 16, 2026 by
elad-inferize
4 tasks done
Preserve drafter vocab mapping when fine-tuning from a checkpoint
#534
opened Apr 15, 2026 by
luv-bansal
[Fix] preserve image data in preprocess for VLM training on multimodal data
#532
opened Apr 13, 2026 by
jamesahou
2 of 6 tasks
fix: Bump sglang version from 0.5.9 to 0.5.10
#529
opened Apr 13, 2026 by
moehanabi
Contributor
1 of 6 tasks
Reduce peak GPU memory in Eagle3 online target generation by avoiding an extra logits copy
#528
opened Apr 9, 2026 by
zijiexia
1 of 6 tasks
Fix VLM preprocessing and add mRoPE position handling in target head
#527
opened Apr 8, 2026 by
liusy58
6 tasks
Fix multimodal hidden-state preparation for Qwen3-VL models
#526
opened Apr 8, 2026 by
liusy58
6 tasks