hpc: in hpc/stem.py at https://github.com/Tencent/hpc-ops/tree/main, stem_tpd now takes num_prompt_tokens plus the last four parameters:
def stem_tpd(
    block_logits: Tensor,
    q_seq_lens: Tensor,
    kv_seq_lens: Tensor,
    num_prompt_tokens: Tensor,
    block_size: int = 128,
    alpha: float = 1.0,
    initial_blocks: int = 4,
    window_size: int = 4,
    k_block_num_rate_medium: float = 0.2,
    k_block_num_bias_medium: int = 30,
    k_block_num_rate_large: float = 0.1,
    k_block_num_bias_large: int = 30,
)
stem: the call in angelslim/compressor/sparsity/stem/backends/hpc_impl.py passes the extra arguments k_block_num_rate and k_block_num_bias, and does not pass num_prompt_tokens:
return hpc.stem_tpd(
    block_logits,
    q_seq_lens,
    kv_seq_lens,
    params["block_size"],
    params["alpha"],
    params["initial_blocks"],
    params["window_size"],
    params["k_block_num_rate"],
    params["k_block_num_bias"],
)
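To see where each positional argument of this call lands under the new signature, the binding can be reproduced without hpc or torch. The sketch below is only an illustration: stem_tpd_stub is a hypothetical stand-in that copies the parameter list quoted above, and every value in params except block_size=128 (which matches the log) is a placeholder.

```python
import inspect


# Hypothetical stub that mirrors the parameter list quoted from hpc/stem.py.
# It is not the real op and is never called; only its signature is used.
def stem_tpd_stub(
    block_logits,
    q_seq_lens,
    kv_seq_lens,
    num_prompt_tokens,
    block_size=128,
    alpha=1.0,
    initial_blocks=4,
    window_size=4,
    k_block_num_rate_medium=0.2,
    k_block_num_bias_medium=30,
    k_block_num_rate_large=0.1,
    k_block_num_bias_large=30,
):
    raise NotImplementedError("signature-only stub")


# Placeholder values; strings stand in for the three tensors.
params = {
    "block_size": 128,
    "alpha": 1.0,
    "initial_blocks": 4,
    "window_size": 4,
    "k_block_num_rate": 0.2,
    "k_block_num_bias": 30,
}
old_call_args = (
    "block_logits",
    "q_seq_lens",
    "kv_seq_lens",
    params["block_size"],
    params["alpha"],
    params["initial_blocks"],
    params["window_size"],
    params["k_block_num_rate"],
    params["k_block_num_bias"],
)

# Bind the old positional call against the new parameter order.
bound = inspect.signature(stem_tpd_stub).bind(*old_call_args)
binding = dict(bound.arguments)
for name, value in binding.items():
    print(f"{name:26s} <- {value!r}")
```

With these placeholders, block_size (128) is bound to num_prompt_tokens, and every later argument is shifted one slot to the left.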
Error log:
[Stem] Patch applied. backend=hpc, hpc_dtype=fp8, num_layers=36
Loaded prompt from: prompt_16k.txt
[Input Stats] token_length=39966
Generating ...
[Stem][HPC] first prefill chunk (q_len == kv_len, no history); using varlen fp8 path.
[Stem][HPC] paged fp8 path failed (hpc::stem_tpd() Expected a value of type 'Tensor' for argument 'num_prompt_tokens' but instead found type 'int'.
Position: 3
Value: 128
Declaration: hpc::stem_tpd(Tensor block_logits, Tensor q_seq_lens, Tensor kv_seq_lens, Tensor num_prompt_tokens, int block_size, float alpha, int initial_blocks, int window_size, float k_block_num_rate_medium, int k_block_num_bias_medium, float k_block_num_rate_large, int k_block_num_bias_large) -> Tensor
Cast error details: Unable to cast 128 to Tensor); falling back to torch backend.
Mode: stem
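One possible way to make the call site robust against this kind of positional shift is to pass everything after the three tensors by keyword. The sketch below is not the maintainers' fix: build_stem_tpd_kwargs is a hypothetical helper, and reusing the single k_block_num_rate / k_block_num_bias values for both the medium and large tiers is an assumption that may not match the intended semantics. What num_prompt_tokens should contain is not shown in this report, so the caller has to supply it.

```python
def build_stem_tpd_kwargs(params, num_prompt_tokens):
    """Map hpc_impl.py-style params onto the keyword names of the new signature.

    Hypothetical helper. Tier-specific keys in params win; otherwise the
    legacy k_block_num_rate / k_block_num_bias value is reused for both tiers
    (an assumption). If neither is present, the keyword is omitted so the
    op's own default applies.
    """
    kwargs = {
        "num_prompt_tokens": num_prompt_tokens,
        "block_size": params["block_size"],
        "alpha": params["alpha"],
        "initial_blocks": params["initial_blocks"],
        "window_size": params["window_size"],
    }
    for tier in ("medium", "large"):
        for kind in ("rate", "bias"):
            legacy_key = f"k_block_num_{kind}"
            tier_key = f"{legacy_key}_{tier}"
            if tier_key in params:
                kwargs[tier_key] = params[tier_key]
            elif legacy_key in params:
                kwargs[tier_key] = params[legacy_key]
    return kwargs


# Placeholder inputs for illustration; a real call would pass a Tensor
# for num_prompt_tokens.
example_params = {
    "block_size": 128,
    "alpha": 1.0,
    "initial_blocks": 4,
    "window_size": 4,
    "k_block_num_rate": 0.2,
    "k_block_num_bias": 30,
    "k_block_num_rate_large": 0.1,
}
example_kwargs = build_stem_tpd_kwargs(example_params, "num_prompt_tokens")
```

The call in hpc_impl.py could then read hpc.stem_tpd(block_logits, q_seq_lens, kv_seq_lens, **build_stem_tpd_kwargs(params, num_prompt_tokens)), so that a future reordering of parameters raises a clear keyword error instead of silently shifting values.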