fix(stem): IndexError on models with more than 38 layers (Qwen3-32B/235B, DeepSeek-R1) #351

Open
hobostay wants to merge 1 commit into Tencent:main from hobostay:fix/stem-keep-ratio-index-error

@hobostay

Contributor

Problem

Running Stem sparse-attention prefill on a model with more than 38 transformer layers crashes on the first prefill, before producing any output:

IndexError: list index out of range

This affects flagship models AngelSlim explicitly supports, e.g. Qwen3-32B (64 layers) and Qwen3-235B-A22B (94 layers), as well as DeepSeek-R1/V3 (61 layers).

Root cause

generate_exact_k_schedule() reads the per-layer keep ratio from a fixed-length list:

# angelslim/compressor/sparsity/stem/backends/torch_impl.py:49
_DEFAULT_LAYER_KEEP_RATIOS: list[float] = [1.0, 1.0] + [0.2] * 36   # length 38
...
keep_ratio = _DEFAULT_LAYER_KEEP_RATIOS[layer_idx]

layer_idx is assigned to every transformer layer (stem/patch.py sets self.self_attn.layer_idx = i for all layers), so for any model with more than 38 layers the lookup indexes past the end of the list and raises IndexError as soon as layer_idx reaches 38.

Fix

Express the default keep-ratio as a function of layer_idx instead of a length-38 list. It reproduces the exact previous values for layers 0..37 (1.0 for the first two layers, 0.2 thereafter) and extends the same rule to deeper layers:

-_DEFAULT_LAYER_KEEP_RATIOS: list[float] = [1.0, 1.0] + [0.2] * 36
+def _default_layer_keep_ratio(layer_idx: int) -> float:
+    return 1.0 if layer_idx < 2 else 0.2
...
-    keep_ratio = _DEFAULT_LAYER_KEEP_RATIOS[layer_idx]
+    keep_ratio = _default_layer_keep_ratio(layer_idx)

Verification

  • python3 -m py_compile passes.
  • Value-equivalence check:
    old = [1.0, 1.0] + [0.2] * 36
    assert all(_default_layer_keep_ratio(i) == old[i] for i in range(38))   # identical 0..37
    for i in range(38, 94): assert _default_layer_keep_ratio(i) == 0.2      # no longer raises
  • Confirmed _DEFAULT_LAYER_KEEP_RATIOS had no other references (only the definition + this call site).

generate_exact_k_schedule() read the per-layer keep ratio from a fixed-length
list:

    _DEFAULT_LAYER_KEEP_RATIOS = [1.0, 1.0] + [0.2] * 36   # length 38
    keep_ratio = _DEFAULT_LAYER_KEEP_RATIOS[layer_idx]

layer_idx is assigned to every transformer layer (stem/patch.py sets
self_attn.layer_idx = i for all layers). For models with more than 38 layers
this raised IndexError on the first prefill, before producing any output:

    IndexError: list index out of range

This affects flagship models AngelSlim explicitly supports, e.g. Qwen3-32B
(64 layers) and Qwen3-235B-A22B (94 layers).

Replace the length-38 list with a function of layer_idx. It reproduces the
exact previous values for layers 0..37 (1.0 for the first two layers, 0.2
thereafter) and extends the same rule to deeper layers.

Co-Authored-By: Claude <noreply@anthropic.com>
@yisunlp

yisunlp commented Jun 18, 2026

Collaborator

Thanks for fixing this. Please make sure the pre-commit checks pass:

pip3 install pre-commit black isort flake8; cd AngelSlim; pre-commit install;
