PANews reported on April 16th that DeepSeek's open-source matrix operation library, DeepGEMM, has opened a merge request titled "Public release 26/04," introducing new features including Mega MoE and FP4 Indexer. Mega MoE fuses the MoE stages of dispatch, linear1/SwiGLU/linear2, and combine into a single mega-kernel and overlaps NVLink communication with tensor core computation. For now it supports only FP8 x FP4 MoE with EP≤8 and requires PyTorch≥2.9. The update also adds FP4 Indexer (for MQA logits, supporting larger MTP), FP8 x FP4 GEMM, PDL, and the DeepEPv2 MoE GEMM layout. In addition, it tunes GEMM heuristics and kernels, speeds up JIT compilation, and fixes issues such as JIT crashes under distributed file systems and hangs in some kernels. The release concerns DeepGEMM development only and is unrelated to any internal model release.
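For readers unfamiliar with the stages named above, the following is a minimal, unfused reference sketch of dispatch, linear1, SwiGLU, linear2, and combine in plain Python. It is not DeepGEMM's API or kernel: the function names (`moe_forward`, `swiglu`, `matvec`) and the data layout are hypothetical, routing decisions are assumed to be given, and FP8/FP4 quantization and the NVLink overlap are not modeled. A fused mega-kernel would perform these same steps in a single launch rather than as separate passes.

```python
import math


def matvec(w, x):
    # w is a list of rows; returns the matrix-vector product w @ x.
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def silu(v):
    # SiLU activation: v * sigmoid(v).
    return v / (1.0 + math.exp(-v))


def swiglu(h):
    # Assumed layout: first half of h is the gate, second half the "up" values.
    half = len(h) // 2
    return [silu(g) * u for g, u in zip(h[:half], h[half:])]


def moe_forward(tokens, routes, experts):
    """Unfused reference of one MoE layer (illustrative only).

    tokens:  list of input vectors, each of length dim
    routes:  routes[i] is a list of (expert_id, weight) pairs for token i
    experts: experts[e] is (w1, w2); w1 has 2*hidden rows of length dim,
             w2 has dim rows of length hidden
    """
    # Dispatch: group token indices by the expert they were routed to.
    buckets = {e: [] for e in range(len(experts))}
    for t, route in enumerate(routes):
        for e, weight in route:
            buckets[e].append((t, weight))

    out = [[0.0] * len(tok) for tok in tokens]
    for e, items in buckets.items():
        w1, w2 = experts[e]
        for t, weight in items:
            h = matvec(w1, tokens[t])  # linear1
            a = swiglu(h)              # SwiGLU activation
            y = matvec(w2, a)          # linear2
            # Combine: weighted sum of expert outputs, back in token order.
            for j, yj in enumerate(y):
                out[t][j] += weight * yj
    return out
```

In a multi-GPU expert-parallel setup, the dispatch and combine steps are where tokens cross devices, which is why overlapping that communication with the two linear layers matters.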
DeepSeek DeepGEMM releases major updates including Mega MoE and FP4 Indexer.
Author: PA一线

