Alibaba launches more efficient Qwen3-Next artificial intelligence model

PANews reported on September 12 that Alibaba's Tongyi Qianwen has released Qwen3-Next, its next-generation foundation model architecture, and open-sourced the Qwen3-Next-80B-A3B series of models built on it. Compared with the Qwen3 MoE (mixture-of-experts) architecture, Qwen3-Next makes four core improvements: a hybrid attention mechanism, a highly sparse MoE structure, a set of optimizations that make training more stable, and a multi-token prediction mechanism that improves inference efficiency. On this architecture Alibaba trained the Qwen3-Next-80B-A3B-Base model, which has 80 billion total parameters but activates only 3 billion of them. The Base model performs on par with or slightly better than the dense Qwen3-32B model, while its training cost (in GPU hours) is less than one-tenth of Qwen3-32B's. For contexts longer than 32k tokens, its inference throughput is more than ten times that of Qwen3-32B, making it highly cost-effective for both training and inference.
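To illustrate how a sparse MoE model can hold 80 billion parameters yet use only about 3 billion at a time, here is a minimal sketch of top-k expert routing. It is a generic illustration, not Alibaba's implementation: the function names, the toy experts, the gate scores, and the choice of 8 experts with 2 active are all hypothetical. Only the 80 billion and 3 billion figures come from the report.

```python
import math

def top_k_route(gate_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(gate_logits)),
                 key=lambda i: gate_logits[i], reverse=True)[:k]
    # Softmax over only the selected logits, so the k weights sum to 1.
    peak = max(gate_logits[i] for i in top)
    exps = [math.exp(gate_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def sparse_moe(x, experts, gate_logits, k):
    """Combine the outputs of the k selected experts; all others are skipped."""
    return sum(w * experts[i](x) for i, w in top_k_route(gate_logits, k))

# Figures from the report: 80 billion total parameters, 3 billion activated.
active_fraction = 3e9 / 80e9  # 3.75% of the weights are used at once

# Hypothetical toy setup: 8 scalar "experts", 2 of them active.
experts = [lambda x, s=s: s * x for s in range(8)]
gate_logits = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]
routes = top_k_route(gate_logits, k=2)
output = sparse_moe(1.0, experts, gate_logits, k=2)
```

Because only the selected experts run, compute scales with the active parameters rather than the total, which is the general reason a sparse model can be cheaper to train and serve than a dense model of similar quality.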

Author: PA一线

This content is for informational purposes only and does not constitute investment advice.
