Stanford's 423-page AI report: the gap between China and the US is only 2.7%, and Tsinghua University and DeepSeek have broken into the global top ten.

Author: New Zhiyuan

Edited by: Sleepy Peach

[New Zhiyuan Summary] Stanford's "2026 AI Index Report" is out! This 423-page report is packed with findings: the AI gap between China and the US has almost closed, shrinking to just 2.7%. The world produced 95 representative top models last year, over 90% of them from industry. Most strikingly, employment for developers aged 22-25 has fallen by 20%.

Today, Stanford HAI released its "2026 AI Index Report"!

This 423-page annual report comprehensively reveals the latest power landscape of the global AI industry.

It draws a core conclusion: AI's capabilities are growing rapidly; however, humanity's ability to measure and manage it has not kept pace.

The most shocking conclusion:

The performance gap between Chinese and American AI models has largely disappeared, with the two sides repeatedly trading the top spot. Currently, Anthropic's lead is only 2.7%.

The United States is spending more money on AI than anyone else, but it is finding it increasingly difficult to attract top talent.

The report also points out that the evolution of AI has not encountered any so-called "bottleneck" but is instead accelerating at an unprecedented speed.

Over the past year, more than 90% of the world's top models have matched or even surpassed human performance in doctoral-level scientific problems, multimodal reasoning, and competitive mathematics.

Especially in terms of coding ability, SWE-bench scores soared from 60% to nearly 100% within a year.

However, AI's capabilities are extremely "uneven," presenting a distorted picture:

An LLM can win an IMO gold medal, yet it reads analog clocks correctly only 50.1% of the time.

Meanwhile, the prediction that AI would take away jobs has become reality, and the first to suffer are today's young workers.

Here's the real deal: 12 key trends to watch in the "2026 AI Index Report".

Other highlights at a glance:

Global AI computing power has increased 30 times in 3 years, with Nvidia accounting for 60% of the market share, and almost all chips are produced by TSMC.
Global corporate AI investment reached $581.7 billion in 2025, more than doubling year-on-year, with the United States accounting for nearly half of that.
The number of AI researchers entering the United States has dropped by 89% in seven years, with an 80% drop in the past year alone.
Employment for software developers aged 22-25 has declined by 20% since 2024, with entry-level positions being precisely targeted for elimination.
China has built a total of 85 public AI supercomputers, more than twice that of North America, ranking first in the world.
The usage rate of AI in the Chinese workplace exceeds 80%, far surpassing the global average of 58%.
The most powerful models are becoming increasingly opaque; 80 out of 95 representative models do not have publicly available training code.

China and the US neck and neck: the gap has narrowed to only 2.7%.

Stanford plotted the top U.S. and top Chinese models on the Arena leaderboard since May 2023 on the same chart.

In May 2023, GPT-4-0314 led with 1320 points, while China's best entry was still chatglm-6b, more than 300 points behind.

In February 2025, DeepSeek-R1 briefly tied with the top US model for the first time.

In March 2026, Claude Opus 4.6 from the United States scored 1503 points, while dola-seed-2.0-preview from China scored 1464 points.

The gap between the top Chinese and US models now stands at just 39 points, or about 2.7%.
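The 39-point and 2.7% figures can be checked with a few lines of arithmetic. The sketch below is only an illustration: the article does not say which score serves as the denominator, so the function name `gap_percent` and the choice of base are assumptions. Dividing by the Chinese model's score reproduces 2.7%, while dividing by the US model's score gives a slightly smaller number.

```python
# Arena scores quoted in the article for March 2026.
US_TOP = 1503   # Claude Opus 4.6
CN_TOP = 1464   # dola-seed-2.0-preview


def gap_percent(leader: float, follower: float, base: float) -> float:
    """Return the score gap as a percentage of the chosen base score."""
    return (leader - follower) / base * 100


gap_points = US_TOP - CN_TOP
# Assumption: the article's 2.7% uses the trailing (Chinese) score as the base.
pct_of_follower = gap_percent(US_TOP, CN_TOP, CN_TOP)
pct_of_leader = gap_percent(US_TOP, CN_TOP, US_TOP)

print(f"gap: {gap_points} points")
print(f"relative to the Chinese model's score: {pct_of_follower:.1f}%")
print(f"relative to the US model's score: {pct_of_leader:.1f}%")
```

Either way the gap rounds to well under 3%, so the choice of base does not change the article's conclusion.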

What's even more noteworthy is how often the lead has changed over the past year. Since the beginning of 2025, the leading models from the two countries have swapped positions on Arena several times.

In terms of quantity, the two are also in the same league.

In 2025, the United States released 50 "significant models," and China followed with 30 top-tier large models.

In the top tier, OpenAI, Google, Alibaba, Anthropic, and xAI share the stage, and the global top five are evenly matched.

Looking further down into the top 10, Chinese institutions and companies occupy four spots: Alibaba, DeepSeek, Tsinghua University, and ByteDance.

The focus of the open source ecosystem has also clearly shifted eastward this year.

DeepSeek, Qwen, GLM, MiniMax, and Kimi have all pushed the open-source weight capability curve forward.

If we also consider the number of published papers, citations, patent output, and industrial robot installations, China ranks first in the world in all of these areas.

Price is another front.

Overseas developers have done the math on X, and the output price of Seed 2.0 Pro is only about one-tenth of that of Claude Opus 4.6.

Near-parity performance at a tenth of the price. The chain reaction from this is just beginning.

90% of frontier models come from industry, and the pace of iteration is unprecedented.

Of the 95 most representative models released last year, over 90% came from industry, not academic institutions or government laboratories.

Academia has fallen behind the frontier.

The pace of releases is also accelerating sharply.

In February 2026 alone, eight or nine flagship models entered the market, including Gemini 3.1 Pro, Claude Opus 4.6, GPT-5.3 Codex, Grok 4.20, Qwen 3.5, Seed 2.0 Pro, MiniMax M2.5, and GLM-5.

The cycle for crowning a new champion has shrunk from "years" to "months".

Benchmarks maxed out within a year; AI has hit no bottleneck

The most dramatic curve is programming.

SWE-bench Verified, the benchmark for real bug fixing, increased from 60% to nearly 100% in one year.

It didn't just rise a few points; it has essentially hit the ceiling.

Terminal-Bench tests an agent's ability to handle real terminal tasks, and the top score on it has risen from 20% last year to 77.3%.

The success rate of cybersecurity agents in resolving issues has increased from 15% to 93%.

Gemini Deep Think won a gold medal at the International Mathematical Olympiad.

PhD-level scientific question answering (GPQA Diamond), competition mathematics (AIME), and multimodal reasoning (MMMU) were once considered tasks only human experts could handle, but cutting-edge models have conquered them all.

The most telling example is Humanity's Last Exam.

This is a test specifically designed to "stump AI and favor human experts," with questions contributed by top experts in various fields.

Last year, OpenAI's o1 scored 8.8% on it. In just one year, cutting-edge models have pushed the score up by another 30 percentage points. Currently, Claude Opus 4.6 and Gemini 3.1 Pro have both exceeded 50%.

The jagged frontier: AI can win an IMO gold medal but can't read a clock.

But the same index produced a different set of figures.

The most powerful model achieved an accuracy of 50.1% on the task of "reading an analog clock".

Robots achieved an 89.4% success rate in a laboratory simulation environment (RLBench). However, when placed in real homes to perform chores such as washing dishes and folding clothes, their success rate dropped to 12%.

The difference between the laboratory and the kitchen is 77 percentage points.

Researchers have named this phenomenon the "jagged frontier." The distribution of AI capabilities is uneven; it might win a gold medal in a math Olympiad but not be able to reliably tell you the current time.

AI can win gold medals in math Olympiads, but it only has a 50% chance of understanding analog clocks. AI is accelerating, but it's accelerating in a different direction.

On agent tasks, frontier AI (66.3%) is approaching the human baseline on the OSWorld test.

However, on the PaperArena test, which specifically assesses scientific research reasoning, the strongest AI agent scored only 39%, roughly half the level of a doctoral student.

However, this unevenness no longer prevents companies from integrating AI into their production lines.

Another figure from AI Index is that the global enterprise AI adoption rate has reached 88%. Nine out of ten companies have already integrated AI into some kind of workflow.

The costs are rising in tandem. The number of AI-related incidents has increased from 233 in 2024 to 362.

Money is pouring into AI at an accelerating pace: $581.7 billion.

Global corporate AI investment reached $581.7 billion in 2025, a year-on-year increase of 130%. Of this, private investment reached $344.7 billion, a year-on-year increase of 127.5%.

Both curves more than doubled.
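A year-on-year increase of 130% means the new value is 2.3 times the old one, so both investment figures slightly more than doubled. The sketch below works this out and backs out the implied 2024 values. The helper `prior_year_value` is a hypothetical name, and the 2024 figures it prints are derived from the quoted percentages, not taken from the report.

```python
def prior_year_value(current: float, yoy_increase_pct: float) -> float:
    """Back out last year's value from this year's value and the YoY increase."""
    return current / (1 + yoy_increase_pct / 100)


# Figures quoted in the article, in billions of US dollars.
corporate_2025 = 581.7   # global corporate AI investment, +130% YoY
private_2025 = 344.7     # private investment, +127.5% YoY

# An increase of p percent is a multiple of (1 + p / 100).
corporate_multiple = 1 + 130 / 100
private_multiple = 1 + 127.5 / 100

# Implied 2024 values (derived here, not quoted in the article).
corporate_2024 = prior_year_value(corporate_2025, 130)
private_2024 = prior_year_value(private_2025, 127.5)

print(f"corporate: {corporate_multiple:.2f}x, implied 2024 value ${corporate_2024:.1f}B")
print(f"private:   {private_multiple:.3f}x, implied 2024 value ${private_2024:.1f}B")
```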

By country, the United States is far ahead. Private AI investment in the US reached $285.9 billion in 2025. The country also saw 1,953 new AI startups founded during the year, more than ten times the number in the second-ranked country.

Money is flowing rapidly into the United States. But another core resource of the United States is flowing in the opposite direction.

The number of AI researchers moving to the United States has dropped by 89%.

One set of numbers in the report took me aback.

Since 2017, the number of AI researchers and developers entering the United States has decreased by 89%.

More importantly, this decline is accelerating. In the past year alone, the drop was 80%.
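To see how much of the decline is concentrated in the most recent year, the two percentages can be combined. This is a rough sketch under a stated assumption: the 80% figure is read as a year-on-year drop in the same inflow measure as the 89% figure. The article does not spell this out, and the helper `remaining_share` is a hypothetical name.

```python
def remaining_share(decline_pct: float) -> float:
    """Share of the baseline that is left after a percentage decline."""
    return 1 - decline_pct / 100


# Figures quoted in the article.
TOTAL_DECLINE_SINCE_2017 = 89   # percent
LAST_YEAR_DECLINE = 80          # percent, assumed to be year-on-year

# Current inflow as a fraction of the 2017 level.
level_now = remaining_share(TOTAL_DECLINE_SINCE_2017)
# Inflow one year ago, implied by undoing the last year's drop.
level_year_ago = level_now / remaining_share(LAST_YEAR_DECLINE)
decline_before_last_year = (1 - level_year_ago) * 100

print(f"current inflow: {level_now:.0%} of the 2017 level")
print(f"implied inflow a year ago: {level_year_ago:.0%} of the 2017 level")
print(f"implied decline before the last year: {decline_before_last_year:.0f}%")
```

Under that reading, the inflow had fallen by a little under half over the earlier years and then dropped to about a tenth of the 2017 level within a single year.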

The United States remains the country with the highest density of AI researchers globally, but the tap is being turned off.

The curves for money and people have started to move in opposite directions. This has not happened in the past decade.

The key to a 30-fold increase in computing power over three years lies in the hands of one company.

The AI capability curve is accelerating, and the computing power curve behind it is running even faster.

Since 2021, the total global AI computing power has increased 30-fold. In the past three years, it has more than tripled every year.
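The two growth figures are consistent with each other, as a quick compounding check shows. The sketch assumes a three-year window, as in the article's summary. The function names are illustrative.

```python
def compound_growth(annual_factor: float, years: int) -> float:
    """Total growth multiple after compounding an annual factor for some years."""
    return annual_factor ** years


def implied_annual_factor(total_factor: float, years: int) -> float:
    """Constant annual growth factor needed to reach a total multiple."""
    return total_factor ** (1 / years)


# Exactly tripling every year for three years.
tripling_for_three_years = compound_growth(3, 3)
# Annual factor needed for a 30-fold increase over three years (assumed window).
factor_for_30x = implied_annual_factor(30, 3)

print(f"tripling each year for 3 years: {tripling_for_three_years}x")
print(f"annual factor for 30x in 3 years: {factor_for_30x:.2f}x")
```

Tripling three times gives 27x, so a 30-fold total requires slightly more than tripling each year, which matches the article's wording.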

Only a few companies are supporting this curve.

Nvidia's GPUs account for over 60% of the world's AI computing power. Amazon and Google rank second and third with their self-developed chips, but even combined, they are far from catching up with Nvidia.

Almost all of these chips come from a single foundry, TSMC. The steeper the computing power curve, the narrower the chokepoint.

At the same time, the costs are also increasing.

The total power of global AI data centers has reached 29.6 GW, equivalent to the entire electricity demand of New York State during peak hours. The estimated carbon emissions of a single xAI Grok 4 training session are 72,816 tons of CO2 equivalent, equivalent to the exhaust emissions of 17,000 cars running for a whole year.
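The car comparison implies a per-vehicle emissions figure, which can be recovered by simple division. This is a back-of-the-envelope sketch using only the two numbers quoted above. The variable names are illustrative, and the result is derived here, not stated in the report.

```python
# Figures quoted in the article.
GROK4_TRAINING_EMISSIONS_T = 72_816   # tons of CO2 equivalent for one training run
CAR_YEARS = 17_000                    # cars driven for one year

# Implied annual emissions per car (derived, not quoted in the article).
tons_per_car_year = GROK4_TRAINING_EMISSIONS_T / CAR_YEARS

print(f"implied emissions per car per year: {tons_per_car_year:.2f} t CO2e")
```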

Where to build data centers, where to get electricity, and where to produce chips—these three questions have become the most pressing headaches for all AI company CEOs this year.

Generative AI penetration reached 53% in three years, with usage exceeding 80% in Chinese workplaces.

Generative AI has reached a global population penetration rate of 53% within three years.

This is faster than the adoption of either the personal computer or the internet.

However, the penetration rate varies widely by country. Singapore (61%) and the UAE (54%) are both ahead of the United States. The United States ranks only 24th among the countries covered in the survey, with a penetration rate of 28.3%.

The contrast becomes even greater if we shift the perspective from consumers to the workplace.

Another set of data in the report shows that in 2025, 58% of employees worldwide were regularly using AI at work. However, in five countries (China, India, Nigeria, the UAE, and Saudi Arabia) this figure exceeds 80%.

The penetration rate of AI in the workplace in China is already more than 20 percentage points higher than the global average.

What's even more interesting is the consumer value.

AI Index estimates that, as of early 2026, generative AI tools were generating $172 billion in value annually for U.S. consumers. The median value per user tripled between 2025 and 2026.

The vast majority of users still use the free version.

The average person is willing to pay far less for AI than the value AI creates for them. This gap is what all AI companies are trying to bridge.

Entry-level positions have plummeted, with development jobs for the 22-25 age group seeing a dramatic 20% cut.

The part of the AI Index most likely to leave Chinese readers silent is the section on youth employment.

Employment among software developers aged 22 to 25 has fallen by about 20% since 2024.

Meanwhile, employment among their older colleagues is actually increasing.

This isn't limited to software development roles. The same pattern is emerging in other industries with high AI exposure, such as customer service.

What's even more worrying are the results of the corporate survey. The surveyed executives generally expect future layoffs to be even larger than those of the past few months.

This isn't about the macro unemployment rate; it's about the precise elimination of entry-level jobs.

Losing your first job cuts off a step in your career ladder. The long-term impact of this is something no one can fully calculate right now.

AI is rewriting the way scientific discovery is made.

If employment is the cold side of the story, science is the hot side.

The number of AI-related papers in the fields of natural sciences, physical sciences, and life sciences increased by 26% to 28% year-on-year in 2025.

In terms of specific applications, this year marked the first time that AI successfully completed a full end-to-end weather forecasting process. It directly outputs final forecasts for temperature, wind speed, and humidity from raw meteorological observation data, without any intervention from traditional numerical models.

AI is evolving from "helping you write papers" and "helping you calculate numbers" to "making discoveries on its own."

The same applies to hospitals. In 2025, many hospitals began deploying AI tools that automatically generate clinical records from patient conversations. Doctors across multiple hospital systems reported that the time spent writing medical records fell by up to 83%, and burnout dropped significantly.

However, the same index poured cold water on medical AI. A review of more than 500 clinical AI studies found that nearly half of the studies relied on exam-style datasets, while only 5% used real clinical data.

It is clear that AI can reduce the time doctors spend typing. Its clinical value for real patients, however, remains an open question.

The wave of self-learning has swept the globe, and formal education has fallen behind.

Formal education is falling behind AI.

In the United States, four out of five high school and college students now use AI to complete their schoolwork. However, only half of high schools have AI usage policies, and only 6% of teachers believe these policies are clearly written.

Students are racing ahead while teachers stand still, and the rules have yet to be spelled out.

While formal education lags behind, a wave of self-learning is sweeping the globe. The report states that the three countries with the fastest growth in AI engineering skills learning are the UAE, Chile, and South Africa.

It's not the United States, it's not Europe.

The steepest part of the skill curve is where no one is looking.

The most powerful models are the least transparent, and experts and the public are drifting apart.

The strongest models are becoming the most opaque.

The Foundation Model Transparency Index's average score this year dropped from 58 last year to 40. AI Index directly named Google, Anthropic, and OpenAI, stating that they have all stopped publicly disclosing the training data size and training time of their latest models.

Of the 95 most representative models released last year, 80 did not have their training code made public.

Public sentiment has become more complex.

Globally, the percentage of people who believe AI is more beneficial than harmful rose from 52% to 59%. However, during the same period, the percentage who felt nervous about AI rose from 50% to 52%.

Growth is occurring in both directions simultaneously.

The most divided country is the United States. Only 33% of Americans believe AI will make their jobs better, compared to a global average of 40%. Americans also have the lowest level of trust in their government's regulation of AI among the surveyed countries, at 31%.

Singaporeans have an 81% level of trust in their government's regulation of AI.

Following the recent attack on Sam Altman's home, people in Silicon Valley were "surprised" to find that ordinary people in the Instagram comments section showed little sympathy, with some even feeling that "it should have been more intense."

They didn't realize things had gotten this bad.

According to Pew and Ipsos data cited in the research report, the gap between experts' and the public's perceptions of AI's impact on employment, healthcare, and the economy generally exceeds 30 percentage points, with the largest gap reaching 50 percentage points.

On one hand, the curves in the laboratory are soaring, and on the other hand, the anxiety in the hearts of ordinary people is accumulating.

There is no bridge in the middle.

In conclusion

The 423-page report contains hundreds of charts, but in essence it draws just one.

The horizontal axis represents time, and the vertical axis represents ability.

The curves for model capability are soaring, computing power is soaring, investment is soaring, and adoption rate is soaring. Everything else is either stagnating or declining.

That's all for the 2026 AI Index.

AI is accelerating. Everything else is falling behind.

If you're in this industry, the question you should be asking now isn't "What will the future hold?", but rather "Which curve am I on?"