Source: AI Cambricon
Another year has passed since the last I/O.
Google's CEO set the tone right from the start: The past year has seen the AI industry enter a new phase; people are no longer just concerned with the technology itself, but want to see AI truly bring value to everyday products. Google's answer is today's presentation.
Token volume: From 480 trillion to over 3 quadrillion
Tokens are a straightforward metric for measuring the scale of AI adoption.
Two years ago, Google's various products processed a total of 9.7 trillion tokens per month. At last year's I/O, that number had grown to approximately 480 trillion. This year, it jumped to over 3.2 quadrillion per month, a roughly sevenfold increase.
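As a quick check on the growth figures above, the year-over-year multiples can be computed directly. This is a minimal sketch: the variable and function names are illustrative, and the inputs are the rounded monthly token counts quoted in this article, not exact figures.

```python
# Monthly token counts as quoted in the article (rounded keynote figures).
TRILLION = 10**12

tokens_two_years_ago = 9.7 * TRILLION
tokens_last_year = 480 * TRILLION
tokens_this_year = 3200 * TRILLION  # "over 3.2 quadrillion", used as a lower bound


def growth_multiple(new: float, old: float) -> float:
    """Return how many times larger `new` is than `old`."""
    return new / old


first_jump = growth_multiple(tokens_last_year, tokens_two_years_ago)
second_jump = growth_multiple(tokens_this_year, tokens_last_year)

print(f"Two years ago -> last year: {first_jump:.1f}x")
print(f"Last year -> this year:     {second_jump:.1f}x")
```

With exactly 3.2 quadrillion as the input, the second multiple comes out slightly under seven, so the "sevenfold" figure holds if the actual monthly count is somewhat above 3.2 quadrillion.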
The data from both the developer and enterprise sides are equally impressive:
More than 8.5 million developers build apps using Google's models every month.
The model API currently processes approximately 19 billion tokens per minute.
Over the past 12 months, more than 375 Google Cloud customers have each processed more than 1 trillion tokens.
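The API figure above is quoted per minute, while the product-wide totals are quoted per month. The sketch below converts it to the same unit for comparison. It assumes a constant rate around the clock and a 30-day month, neither of which is stated in the keynote, so the result is only an order-of-magnitude estimate.

```python
# Convert the quoted API throughput (tokens per minute) to daily and monthly totals.
# Assumptions (not from the keynote): constant rate, 30-day month.
TOKENS_PER_MINUTE = 19e9
MINUTES_PER_DAY = 60 * 24
DAYS_PER_MONTH = 30

api_tokens_per_day = TOKENS_PER_MINUTE * MINUTES_PER_DAY
api_tokens_per_month = api_tokens_per_day * DAYS_PER_MONTH

print(f"API tokens per day:   {api_tokens_per_day / 1e12:.1f} trillion")
print(f"API tokens per month: {api_tokens_per_month / 1e12:.0f} trillion")
```

Under these assumptions the API alone handles on the order of 800 trillion tokens per month.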
Product scale: 13 products with over one billion users
Google currently has 13 products with over 1 billion monthly active users, 5 of which have over 3 billion.
Search remains the most widely used entry point for AI products. AI Overviews has over 2.5 billion monthly active users. AI Mode, the biggest upgrade in Search's history, surpassed 1 billion monthly active users in just one year, and the way people use it is also changing, evolving from single queries to continuous conversations.
The Gemini app had 400 million monthly active users at I/O last year. This year it has exceeded 900 million, more than doubling, with daily requests increasing more than sevenfold. The Nano Banana image generation model has generated over 50 billion images in total.
Conversational AI is being integrated into more products.
Ask YouTube
YouTube has a vast library of videos, but finding truly relevant content isn't easy. Ask YouTube has redesigned this experience, not only showing matching videos but also jumping directly to the most relevant segments. It's currently in testing and will roll out fully in the US this summer.
Docs Live
Google Docs has added a new voice feature, Docs Live. Previously, writing documents with Gemini required explicit input; now, simply speak your ideas into the microphone, and Gemini will automatically organize them into a document. Future updates will also support creating and editing documents directly with your voice. Docs Live will roll out to subscribers this summer, with Gmail and Keep also adding voice functionality at the same time.
Ask Maps
Maps is getting its biggest upgrade in a decade, including the Ask Maps feature, which supports more complex and longer questions.
Infrastructure: Capital expenditures rise from $31 billion to $180 billion
Supporting the large-scale operation of these products requires massive infrastructure investment.
In 2022, Google's annual capital expenditures were $31 billion. This year, they are projected to reach approximately $180 billion to $190 billion, roughly six times the 2022 figure.
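The "roughly six times" comparison can be verified from the two capital expenditure figures quoted above. This is a minimal sketch with illustrative constant names; the inputs are the article's rounded numbers in billions of US dollars.

```python
# Capital expenditure figures quoted in the article, in billions of USD.
CAPEX_2022_BN = 31.0
CAPEX_THIS_YEAR_LOW_BN = 180.0
CAPEX_THIS_YEAR_HIGH_BN = 190.0

# Ratio of this year's projected range to the 2022 baseline.
low_multiple = CAPEX_THIS_YEAR_LOW_BN / CAPEX_2022_BN
high_multiple = CAPEX_THIS_YEAR_HIGH_BN / CAPEX_2022_BN

print(f"Projected capex is {low_multiple:.1f}x to {high_multiple:.1f}x the 2022 level")
```

Both ends of the projected range land close to six times the $31 billion baseline.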
At the chip level, Google released its eighth-generation TPU at Cloud Next, adopting a dual-chip strategy for the first time, with dedicated architectures for training and inference:
The TPU 8t is used for large-scale pre-training, with raw computing power approximately three times that of the previous generation. Combined with JAX and Pathways, training is no longer limited to a single data center and can be distributed across multiple sites. More than 1 million TPUs can be accessed globally, forming the world's largest training cluster. Model training time has been reduced from months to weeks.
The TPU 8i is designed specifically for inference and features comprehensive speed optimizations. Both chips offer approximately twice the energy efficiency of their predecessors.
New models: Gemini Omni and Gemini 3.5 Flash
Gemini Omni
AI is shifting from predicting text to simulating reality. Gemini Omni, Google's newly released multimodal world model, can accept input from any modality and generate output from any modality. The initial version primarily outputs video, with support for images and text to follow. Gemini Omni Flash is available today on the Gemini app, Google Flow, and YouTube Shorts, with developers and enterprise customers gaining access via API in the coming weeks.
1) Editing videos using natural language
Omni supports progressive video editing through dialogue, with each instruction building upon the previous one, ensuring consistency between characters, adherence to physical laws, and coherence between scenes.
2) Physical Understanding and World Knowledge
Omni has a more accurate and intuitive understanding of physical laws such as gravity, kinetic energy, and fluid dynamics, resulting in more realistic physical representations of generated scenes. Furthermore, it can draw upon Gemini's historical, scientific, and cultural background knowledge to connect language, images, and meaning, rather than simply performing pattern matching.
3) Any combination of inputs
Omni supports using images, text, video, and audio as input simultaneously to generate output with a consistent style.
4) Digital Avatar
Users can use Omni to create their own digital avatars, generating videos that look and sound like themselves. Google says it's still testing modifications to the audio and voice within the videos.
However, some users' initial tests suggest that Omni's video generation is quite poor, falling far short of Seedance 2.0.
Gemini 3.5 Flash
Google today launched Gemini 3.5 Flash, positioning it as a next-generation model that combines cutting-edge intelligence with mobility.
3.5 Flash outperforms 3.1 Pro on several benchmarks, including Terminal-Bench 2.1 (76.2%), GDPval-AA (1656 Elo), and MCP Atlas (83.6%), and scores 84.2% on CharXiv Reasoning for multimodal understanding. The improvement is particularly noticeable on GDPval, which measures performance on real-world tasks with economic value. In terms of speed, its output rate in tokens per second is four times that of other state-of-the-art models.
3.5 Flash, in conjunction with Antigravity, can schedule multiple sub-agents to work collaboratively, handling complex tasks on a large scale.
Front-end generation remains a strength. Building on its multimodal capabilities, 3.5 Flash can generate richer interactive web UIs and graphics, such as producing interactive animations for a research paper directly in AI Studio.
In terms of price, 3.5 Flash costs less than half the price of comparable cutting-edge models. Google estimates that leading companies process approximately 1 trillion tokens daily, and if they switched 80% of their workloads from other cutting-edge models to 3.5 Flash, they could save over $1 billion annually.
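The savings estimate above implies an average price gap per token, which can be backed out from the quoted numbers. The sketch below is a rough calculation under stated assumptions: the 1 trillion tokens per day is taken as a single company's volume, a year is 365 days, the savings are taken as exactly $1 billion (the article says "over"), and input and output tokens are treated as one blended price. None of these details are specified in the keynote, and all names are illustrative.

```python
# Back out the implied average saving per million tokens from the quoted estimate.
# Assumptions (not from the keynote): per-company volume, 365-day year,
# savings of exactly $1 billion, one blended price for input and output tokens.
TOKENS_PER_DAY = 1e12
SHARE_MOVED_TO_FLASH = 0.80
DAYS_PER_YEAR = 365
ANNUAL_SAVINGS_USD = 1e9

tokens_moved_per_year = TOKENS_PER_DAY * SHARE_MOVED_TO_FLASH * DAYS_PER_YEAR
implied_saving_per_million = ANNUAL_SAVINGS_USD / (tokens_moved_per_year / 1e6)

print(f"Tokens moved per year: {tokens_moved_per_year / 1e12:.0f} trillion")
print(f"Implied saving: ${implied_saving_per_million:.2f} per million tokens")
```

Because the quoted savings figure is a lower bound, the implied gap of a few dollars per million tokens is also a lower bound under these assumptions.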
Gemini 3.5 Flash is available to all users and APIs starting today. Gemini 3.5 Pro is currently used internally at Google and will be released next month.
According to internal Google data, token processing volume for AI development tools has grown more than sixfold, from 500 billion per day in March to more than 3 trillion per day today, driven by the Antigravity platform and 3.5 Flash.
Antigravity 2.0: Agent Development Platform
Antigravity, originally an AI programming environment, is now expanding into a complete autonomous AI agent development and management platform.
Antigravity 2.0 is a new standalone desktop application that serves as the central hub for agent interaction, allowing users to coordinate multiple agents on different tasks. In terms of speed, this version uses a specially optimized version of Flash, making it 12 times faster than other cutting-edge models. That said, Antigravity 2.0 looks almost identical to Codex. 😂
Antigravity users can try it starting today. See the official announcement for details:
https://deepmind.google/technologies/antigravity/
Gemini Spark: A 24/7 Personal AI Agent
The Gemini app is about to launch a personal AI agent—Gemini Spark—which will take actions in the digital world on behalf of the user with the user's authorization.
Several key features:
• Runs on a dedicated Google Cloud virtual machine, providing continuous 24/7 operation without requiring your computer to be constantly on.
• Powered by Gemini 3.5 and Antigravity, it can easily handle long-running tasks in the background.
• Integrates with Google's own tools first; third-party tools will be added via MCP in the coming weeks.
• Supports interaction within the Gemini app, and will also be available via email and instant messaging in the future.
• On Android, you can view the Agent's real-time progress through the new UI space, Android Halo, which will be available later this year.
• Later this summer, Spark will run directly in Chrome, becoming a cross-page agent browser.
Spark is open to trusted test users starting this week, and the Beta version will be rolled out to Google AI Ultra subscribers in the US next week.
Search enters the Agent era
Search is also evolving towards becoming an agent.
Information Agent: Users can set up a personalized AI agent to run continuously in the background, proactively finding the necessary information and assisting in taking action when appropriate. It will be rolled out to Google AI Pro and Ultra subscribers starting this summer.
Generative UI: Combining Gemini 3.5 Flash and Antigravity, Search will dynamically generate a customized interface for each question, including personalized layouts and interactive visual content. It will be available to all users for free this summer.
Persistent Custom Kanban Boards: For long-term tasks requiring continuous tracking, Search allows users to build custom kanban boards or tracking tools, similar to mini-apps tailored for specific tasks. Starting in the coming months, it will be available to Google AI Pro and Ultra subscribers in the US.
Other announcements
Daily Brief: An upcoming out-of-the-box agent in the Gemini app that pulls together your inbox, calendar, and tasks to generate a personalized daily summary. It not only summarizes information but also prioritizes and organizes it and suggests next steps.
Google Flow: Starting today, Google is rolling out a new agent to all users that plans and carries out complex tasks while keeping the user involved and in control. It also supports vibe coding custom creative tools directly within Flow, such as video effects design, hand-drawn animation, or text overlay tools.
Google Pics: An AI-powered image creation and editing tool based on the latest Nano Banana model. It treats each element in an image as an independent object rather than as part of a flat image, allowing for precise creation, replacement, and adjustment of specific details. Currently available to trusted beta users, it will roll out to Google AI Pro and Ultra subscribers in Workspace later this summer.
Smart Glasses: More details have emerged about the AI glasses that debuted early last year. They come in two types: audio glasses (earphone-like, with voice prompts) and display glasses (capable of displaying information), both supporting hands-free use with Gemini. The audio glasses will be available this fall.
Gemini for Science: An AI toolset for scientific research that integrates Gemini's deep reasoning and research capabilities (Deep Think and Deep Research) and adds Science Skills, which connects agent platforms like Antigravity to over 30 major life science databases and tools. Users can apply to try Gemini for Science's experimental features at Google Labs, while Science Skills is available directly on GitHub and Antigravity starting today.
From the TPU 8i to Gemini 3.5, and then to Antigravity and Spark, what Google presented at this year's I/O was a complete system that is evolving towards agents, from chips to applications.




