[AMD/MI300A] DeepSeek-V4 DSA indexer + attention: validate/provide gfx942 forward path #27

Description

@xzyaoi

Context

DeepSeek-V4 (deepseek_v4) on AMD MI300A (gfx942). V4's attention is a hybrid of Compressed Sparse Attention (CSA) and Heavily Compressed Attention (HCA), with a DSA indexer that selects the top 512 (Flash) or 1024 (Pro) KV entries per query (index_n_heads: 64, index_head_dim: 128), plus per-layer compress_ratios and sliding_window: 128.
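For reference, the selection step can be sketched in plain Python. This is a minimal sketch, not the model's kernel: `select_topk_kv`, `num_visible`, and the tie-breaking rule are hypothetical, and it assumes the indexer has already produced one score per KV position. Only the top-k sizes (512 for Flash, 1024 for Pro) come from the description above.

```python
# Top-k sizes taken from the issue text; the dict itself is illustrative.
TOP_K = {"flash": 512, "pro": 1024}


def select_topk_kv(index_scores, top_k, num_visible):
    """Pick the top_k highest-scoring KV positions for a single query.

    index_scores: one indexer score per KV position (hypothetical input).
    num_visible:  number of leading positions the query may see (causal limit).
    Returns the selected positions in ascending order.
    """
    n = min(num_visible, len(index_scores))
    # Rank by descending score; break ties by lower position (an assumption).
    ranked = sorted(range(n), key=lambda s: (-index_scores[s], s))
    return sorted(ranked[:top_k])


if __name__ == "__main__":
    scores = [0.1, 0.9, 0.5, 0.7, 0.3]
    print(select_topk_kv(scores, top_k=2, num_visible=4))
```

When fewer than `top_k` positions are visible, the sketch simply returns all of them, which is the behaviour a reference check would need for short prompts.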

Status (single-node probe, MI300A, 2026-06-11)

Good: the V4 model imports and constructs cleanly on AMD — the attention + indexer modules build without error (the probe reached MoE weight allocation past them). No import wall.

Unknown: the forward path is unvalidated. The probe ran out of memory (OOM) during MoE weight allocation before running a forward pass (blocked on the mxfp4-EP gap; see companion issue). The indexer forward uses NVIDIA-only kernels:

  • deep_gemm.fp8_fp4_mqa_logits(...) (thirdparty/deep_gemm, CUDA) for the indexer logits, and
  • indexer_mxfp4_paged_gather(...) (tokenspeed_kernel/ops/attention/cuda/deepseek_v4.py, CUDA-only — the Triton variant ops/attention/triton/deepseek_v4.py does not have the mxfp4 paged gather).
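To make the second kernel concrete, here is a slow pure-Python reference for what an mxfp4 paged gather has to compute. It is a sketch under stated assumptions, not the layout used by `indexer_mxfp4_paged_gather`: the cache layout (`pages`, `block_table`, `page_size`), the nibble order, and all function names are hypothetical. The number format itself follows the OCP MX definition of MXFP4: E2M1 elements in blocks of 32 sharing one E8M0 scale.

```python
# E2M1 magnitudes indexed by the low 3 bits of a 4-bit code (OCP MX format).
E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
MX_BLOCK = 32  # number of elements that share one E8M0 scale


def decode_fp4(code):
    """Decode one 4-bit E2M1 code: bit 3 is the sign, bits 0-2 the magnitude."""
    sign = -1.0 if code & 0x8 else 1.0
    return sign * E2M1_MAGNITUDES[code & 0x7]


def dequant_mxfp4_row(packed, scales):
    """Dequantize one row.

    packed: bytes, two FP4 codes per byte, low nibble first (assumption).
    scales: one E8M0 byte per 32 elements; value e means 2**(e - 127).
            The NaN encoding (255) is not handled in this sketch.
    """
    out = []
    for byte in packed:
        for code in (byte & 0xF, byte >> 4):
            scale = 2.0 ** (scales[len(out) // MX_BLOCK] - 127)
            out.append(decode_fp4(code) * scale)
    return out


def paged_gather(token_ids, block_table, page_size, pages):
    """Gather and dequantize rows for logical token positions.

    block_table[i] is the physical page holding logical page i;
    pages[p][slot] is a (packed, scales) pair. Hypothetical layout.
    """
    rows = []
    for t in token_ids:
        page = block_table[t // page_size]
        packed, scales = pages[page][t % page_size]
        rows.append(dequant_mxfp4_row(packed, scales))
    return rows
```

A reference like this is far too slow for serving, but it could serve as an oracle when checking a Triton or AITER port on small inputs.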

There is a fallback gate _deepseek_v4_deepgemm_fp4_indexer_available() in models/deepseek_v4.py, but its AMD branch is unverified.
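As an illustration of what a verified AMD branch could look like, the sketch below shows one possible shape for such a gate. It is not the code in models/deepseek_v4.py: `indexer_backend` and the backend names are hypothetical, and it assumes the caller passes `torch.version.hip`, which is `None` on CUDA builds of PyTorch and a version string on ROCm builds.

```python
import importlib.util


def indexer_backend(hip_version, deep_gemm_importable=None):
    """Choose an indexer backend name (hypothetical names).

    hip_version: value of torch.version.hip supplied by the caller.
    deep_gemm_importable: override for tests; probed via importlib if None.
    """
    if deep_gemm_importable is None:
        deep_gemm_importable = importlib.util.find_spec("deep_gemm") is not None
    # On ROCm, never take the CUDA-only deep_gemm path, even if it imports.
    if hip_version is not None:
        return "triton_fallback"
    return "deepgemm_fp4" if deep_gemm_importable else "triton_fallback"
```

Logging the returned value once at model construction would make it easy to confirm from the probe output which branch MI300A actually takes.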

Ask

Once the mxfp4-EP blocker is resolved and a V4 model fits on MI300A, run a forward and:

  1. Validate that the indexer and CSA/HCA attention produce correct output on gfx942 (the fallback path may already work via Triton).
  2. If the deep_gemm fp8/fp4 MQA-logits kernel and the CUDA mxfp4 paged gather turn out to be required, provide gfx942 equivalents (Triton/Gluon or AITER) for both: the indexer logits (fp8/fp4 MQA) and the mxfp4 paged KV gather.
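One way to frame "correct output" in step 1 is to compare the gfx942 run against a trusted reference on two axes: which KV positions the indexer selected, and how far the attention output drifts. The following is a minimal sketch; the function names and the thresholds (`min_recall`, `max_err`) are assumptions and would need tuning for fp8/fp4 paths.

```python
def topk_recall(reference, candidate):
    """Fraction of reference top-k positions that the candidate also selected."""
    ref = set(reference)
    return len(ref & set(candidate)) / len(ref) if ref else 1.0


def max_rel_error(reference, candidate, eps=1e-6):
    """Largest elementwise relative error between two flat output lists."""
    return max(
        (abs(r - c) / (abs(r) + eps) for r, c in zip(reference, candidate)),
        default=0.0,
    )


def check_forward(ref_topk, got_topk, ref_out, got_out,
                  min_recall=0.99, max_err=2e-2):
    """Summarize indexer agreement and output drift (thresholds are assumptions)."""
    recall = topk_recall(ref_topk, got_topk)
    err = max_rel_error(ref_out, got_out)
    return {
        "recall": recall,
        "max_rel_error": err,
        "ok": recall >= min_recall and err <= max_err,
    }
```

Recall is used instead of exact set equality because low-precision logits can reorder near-tied positions at the top-k boundary without materially changing the attention output. Relative error grows quickly for reference values near zero, so a combined absolute and relative tolerance may be a better fit in practice.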

Repro / HW

jobs/serve-v4pro-1node-probe2.sbatch on beverin; V4-Pro staged at infra01/hf_models/.../DeepSeek-V4-Pro. MI300A gfx942 / ROCm 7.2 / torch 2.11. Companion to the mxfp4-EP issue + the V4 tracking issue.

Metadata

Labels: enhancement