fix(stem): IndexError on models with more than 38 layers (Qwen3-32B/235B, DeepSeek-R1) by hobostay · Pull Request #351 · Tencent/AngelSlim

hobostay · 2026-06-17T12:56:28Z

Problem

Running Stem sparse-attention prefill on a model with more than 38 transformer layers crashes immediately:

IndexError: list index out of range

This affects flagship models AngelSlim explicitly supports, e.g. Qwen3-32B (64 layers) and Qwen3-235B-A22B (94 layers), as well as DeepSeek-R1/V3 (61 layers).

Root cause

generate_exact_k_schedule() reads the per-layer keep ratio from a fixed-length list:

# angelslim/compressor/sparsity/stem/backends/torch_impl.py:49
_DEFAULT_LAYER_KEEP_RATIOS: list[float] = [1.0, 1.0] + [0.2] * 36   # length 38
...
keep_ratio = _DEFAULT_LAYER_KEEP_RATIOS[layer_idx]

layer_idx is assigned to every transformer layer (stem/patch.py sets self.self_attn.layer_idx = i for all layers), so the list only covers indices 0..37. For any model with more than 38 layers, the lookup at layer_idx = 38 indexes past the end of the list and raises.
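The failure can be reproduced without loading a model. The following is a minimal sketch, assuming only the list definition quoted above; `first_failing_layer` is a hypothetical helper written for this illustration and is not part of AngelSlim, and the layer counts are the ones stated in this description.

```python
from typing import List, Optional

# Stand-in for the module-level default in torch_impl.py (values as quoted above).
_DEFAULT_LAYER_KEEP_RATIOS: List[float] = [1.0, 1.0] + [0.2] * 36  # length 38


def first_failing_layer(num_layers: int) -> Optional[int]:
    """Hypothetical helper: first layer_idx whose lookup raises IndexError, else None."""
    for layer_idx in range(num_layers):
        try:
            _DEFAULT_LAYER_KEEP_RATIOS[layer_idx]
        except IndexError:
            return layer_idx
    return None


# Layer counts as stated in this PR description.
for name, num_layers in [("Qwen3-32B", 64), ("DeepSeek-R1", 61), ("Qwen3-235B-A22B", 94)]:
    print(name, "first failing layer_idx:", first_failing_layer(num_layers))
```

A model with exactly 38 layers never reaches the failing index, which is presumably why the bug did not surface on shallower models.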

Fix

Express the default keep-ratio as a function of layer_idx instead of a length-38 list. It reproduces the exact previous values for layers 0..37 (1.0 for the first two layers, 0.2 thereafter) and extends the same rule to deeper layers:

-_DEFAULT_LAYER_KEEP_RATIOS: list[float] = [1.0, 1.0] + [0.2] * 36
+def _default_layer_keep_ratio(layer_idx: int) -> float:
+    return 1.0 if layer_idx < 2 else 0.2
...
-    keep_ratio = _DEFAULT_LAYER_KEEP_RATIOS[layer_idx]
+    keep_ratio = _default_layer_keep_ratio(layer_idx)

Verification

python3 -m py_compile passes.

Value-equivalence check:

old = [1.0, 1.0] + [0.2] * 36
assert all(_default_layer_keep_ratio(i) == old[i] for i in range(38))   # identical 0..37
for i in range(38, 94): assert _default_layer_keep_ratio(i) == 0.2      # no longer raises
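The same check could be packaged as a standalone script. This is a sketch only: the function is redefined locally so it runs without AngelSlim installed, and `MODEL_DEPTHS` and `keep_ratio_schedule` are hypothetical names introduced for illustration, with depths taken from the Problem section.

```python
from typing import Dict, List


def _default_layer_keep_ratio(layer_idx: int) -> float:
    # Local copy of the function proposed in this PR.
    return 1.0 if layer_idx < 2 else 0.2


# Hypothetical table for illustration; depths as stated in this description.
MODEL_DEPTHS: Dict[str, int] = {
    "Qwen3-32B": 64,
    "DeepSeek-R1": 61,
    "Qwen3-235B-A22B": 94,
}


def keep_ratio_schedule(num_layers: int) -> List[float]:
    """Hypothetical helper: default keep ratio for every layer of a model."""
    return [_default_layer_keep_ratio(i) for i in range(num_layers)]


for name, depth in MODEL_DEPTHS.items():
    schedule = keep_ratio_schedule(depth)
    assert len(schedule) == depth                # one ratio per layer, no IndexError
    assert schedule[:2] == [1.0, 1.0]            # first two layers stay dense
    assert all(r == 0.2 for r in schedule[2:])   # every deeper layer keeps 20%
    print(f"{name}: {depth} layers OK")
```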

Confirmed _DEFAULT_LAYER_KEEP_RATIOS had no other references (only the definition + this call site).

generate_exact_k_schedule() read the per-layer keep ratio from a fixed-length list:

_DEFAULT_LAYER_KEEP_RATIOS = [1.0, 1.0] + [0.2] * 36   # length 38
keep_ratio = _DEFAULT_LAYER_KEEP_RATIOS[layer_idx]

layer_idx is assigned to every transformer layer (stem/patch.py sets self_attn.layer_idx = i for all layers). For models with more than 38 layers this raised IndexError on the first prefill, before producing any output:

IndexError: list index out of range

This affects flagship models AngelSlim explicitly supports, e.g. Qwen3-32B (64 layers) and Qwen3-235B-A22B (94 layers).

Replace the length-38 list with a function of layer_idx. It reproduces the exact previous values for layers 0..37 (1.0 for the first two layers, 0.2 thereafter) and extends the same rule to deeper layers.

Co-Authored-By: Claude <noreply@anthropic.com>

yisunlp · 2026-06-18T03:29:51Z

Thanks for fixing this. Please make sure the pre-commit checks pass:

pip3 install pre-commit black isort flake8; cd AngelSlim; pre-commit install;

hobostay wants to merge 1 commit into Tencent:main from hobostay:fix/stem-keep-ratio-index-error
