update chunk prefill torch #679

ttanzhiqiang · 2025-04-27T09:28:40Z

What this PR does / why we need it?

Update the vLLM chunked prefill path and the torch kernels (concat_and_cache_mla_torch, merge_attn_states_torch, gather_cache_torch, flash_attn_varlen_func_torch, flash_mla_with_kvcache_torch).
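
For readers unfamiliar with why chunked prefill needs a merge step: attention computed separately over two blocks of keys can be combined exactly if each block also reports the log-sum-exp (LSE) of its scores. The sketch below illustrates that general technique in pure Python for a single query. It is not the torch code from this PR; `attend` and `merge_attn_states_sketch` are hypothetical names chosen for illustration, and the real kernels operate on batched tensors.

```python
import math


def attend(q, keys, values):
    """Softmax attention for one query over one block of keys.

    Returns (output_vector, lse), where lse is the log-sum-exp of the scores.
    Hypothetical helper for illustration only.
    """
    scores = [sum(a * b for a, b in zip(q, k)) for k in keys]
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    out = [sum(w * v[d] for w, v in zip(weights, values)) / total
           for d in range(dim)]
    return out, m + math.log(total)


def merge_attn_states_sketch(out_a, lse_a, out_b, lse_b):
    """Combine two partial attention results into the result over both blocks.

    Each partial output is reweighted by exp(lse) of its block, which is the
    softmax denominator that block contributed.
    """
    m = max(lse_a, lse_b)
    w_a = math.exp(lse_a - m)
    w_b = math.exp(lse_b - m)
    denom = w_a + w_b
    merged = [(w_a * a + w_b * b) / denom for a, b in zip(out_a, out_b)]
    return merged, m + math.log(denom)
```

Under these assumptions, attending to the first half and second half of the keys separately and then merging gives the same output as attending to all keys at once, up to floating-point error.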

Does this PR introduce any user-facing change?

How was this patch tested?

cd vllm-ascend/examples
python offline_inference_npu_v1.py

Signed-off-by: ttanzhiqiang <389825161@qq.com>

ttanzhiqiang · 2025-04-28T02:44:33Z

open #609

ttanzhiqiang · 2025-04-28T02:44:45Z

@wangxiyuan update

update chunk prefill torch

763088c

Signed-off-by: ttanzhiqiang <389825161@qq.com>

github-actions bot added module:tests module:ops labels Apr 27, 2025
