Code for DeepSeek's new model MODEL1 leaks, suggesting an entirely new architecture

PANews reported on January 21 that, according to QuantumBit, the name "MODEL1" has appeared for the first time in DeepSeek's updated FlashMLA code on GitHub. It is mentioned 28 times across 114 files and is listed alongside the existing V32 (DeepSeek-V3.2), which suggests that MODEL1 is a next-generation model with a new architecture. Differences in the code indicate optimizations in areas such as KV cache layout, sparsity handling, and FP8 decoding, and the model may be officially released around the Spring Festival. Combined with the recently disclosed mHC residual connection mechanism and Engram memory module, MODEL1 is expected to integrate several of DeepSeek's self-developed innovations.


Author: PA一线

This content is for informational purposes only and does not constitute investment advice.
