DeepSeek launches NSA mechanism to improve long-context training and inference efficiency

PANews reported on February 18 that DeepSeek announced NSA (Native Sparse Attention), a hardware-aligned, natively trainable sparse attention mechanism designed for ultra-fast long-context training and inference. With a design optimized for modern hardware, NSA significantly reduces pre-training costs and speeds up inference without compromising model performance.

According to the official introduction, NSA matches or outperforms full-attention models on general benchmarks, long-context tasks, and instruction-based reasoning.

Author: PA一线

This content is for informational purposes only and does not constitute investment advice.
