Author: Li Hailun, Tencent Technology
Edited by Xu Qingyang
On June 2nd, local time in the United States, Microsoft's Build 2026 developer conference kicked off in Fort Mason, San Francisco. The conference focused on the practical applications of cutting-edge AI technologies, and Microsoft released a series of products and updates covering self-developed AI models, intelligent agent applications, operating system security, developer tools, cloud services, and new hardware platforms.
At its 2025 developer conference, Microsoft established the direction of the "AI intelligent agent era," released Copilot Studio multi-agent orchestration and Windows AI Foundry, and announced full support for the Model Context Protocol. GitHub Copilot also launched the Coding Agent.
In Microsoft's narrative, 2025 was about "what standards and frameworks to use in the era of intelligent agents," and 2026 is about "how to make our own models and products truly run": the model layer now has in-house models strong enough to carry the main workload, and the product layer pushes intelligent agents from demos to full-stack deployment across the operating system, hardware, and cloud.
The conference featured six main announcements: the self-developed MAI model family, the intelligent agent ecosystem represented by Scout and the GitHub Copilot application, the Windows system-level AI security sandbox MXC, the Surface RTX Spark Dev Box and system optimizations for developers, the new Project Solara intelligent agent device platform, and developer tools and governance frameworks including Microsoft IQ, Rayfin, ASSERT, and ACS.
01 Seven models trained from scratch, rejecting distillation
The keynote built up step by step. Microsoft CEO Satya Nadella first laid out his vision and presented the "agent-first" strategic framework; executives from various business lines then took the stage in turn to unveil the specific products that put it into practice.
At the conference, Mustafa Suleyman, head of Microsoft AI, announced seven new models developed in-house by Microsoft AI, all of which join the MAI family.
He described MAI's mission as building a "climbing machine" that keeps improving itself through sustained investment in computing power, better data, and more accurate evaluation, keeping users at the forefront of technology.
Regarding the scale of training computation, Suleyman said that the computational cost of training cutting-edge models has grown a trillionfold and is expected to grow another thousandfold over the next three years. All Microsoft MAI models are trained "from scratch, with zero distillation," without relying on third-party model outputs.
Microsoft AI head Suleyman introduces seven self-developed models
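As a rough sanity check on that projection, a thousandfold increase over three years works out to about tenfold per year. The short calculation below assumes steady compounding, which the keynote did not specify:

```python
# Assumption: growth compounds by the same factor every year.
total_growth = 1000   # projected increase over the period, as quoted in the keynote
years = 3

annual_factor = total_growth ** (1 / years)
print(f"Implied growth per year: about {annual_factor:.1f}x")
```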
The models are as follows:
The flagship reasoning model, MAI-Thinking-1, is a mid-sized model. Microsoft states that its performance on key software engineering benchmarks is on par with the best models on the market, and that in blind testing human judges preferred it about as often as Sonnet 4.6. The model was trained from scratch on clean data, without distillation from third-party models.
MAI-Code-1-Flash is a high-performance, inference-efficient algorithmic coding model with 5 billion parameters. It is designed specifically for, and deeply integrated with, GitHub Copilot, VS Code, and the Microsoft technology stack. Microsoft claims it is comparable to Haiku but at a lower cost.
The text-to-image model MAI-Image-2.5 and its more efficient Flash variant support text-to-image generation and image editing. Microsoft claims it surpasses Google Nano Banana Pro in Arena ratings.
The MAI-Transcribe-1.5 transcription model boasts state-of-the-art (SOTA) accuracy. It is claimed to be five times faster than competing models and includes built-in support for domain-specific terminology recognition in 43 languages.
The MAI-Voice-2 speech generation model provides high-quality, natural-sounding speech generation, supports 15 languages, can adapt voices based on short samples, and has anti-abuse protection measures. A Flash variant offering the same functionality at a lower cost is coming soon.
All models share the same data specifications, infrastructure, and evaluation framework. In addition to being distributed on Azure Foundry and optimized for Microsoft first-party products, these models will also be available to developers on OpenRouter, as well as Fireworks and Baseten. For the first time, developers will be able to adjust model weights themselves.
At the conference, Nadella introduced Microsoft Frontier Tuning, a method that allows enterprises to customize models using their own operational data. The logic is that the most valuable data is not general corpora, but rather the real-world trajectory, steps, and decisions of agents performing tasks within the enterprise.
Microsoft CEO Satya Nadella introduces Frontier Tuning
This mechanism integrates the MAI model into actual business processes, allowing the model to learn on the job in a real environment. Suleyman said, "You are building your own model: in your environment, trained with your data, and under your control. Your institutional knowledge becomes part of the model and belongs only to you."
In terms of performance, Microsoft's MAI model for Excel is comparable to GPT-5.4 while being 10 times more efficient. McKinsey, after adopting Frontier Tuning, achieved the highest win rate among all tested models at roughly one-tenth the cost.
In the healthcare field, Microsoft announced a collaboration with the Mayo Clinic to create a cutting-edge AI model for healthcare. This model combines the Mayo Clinic's clinical expertise, de-identified clinical data, and longitudinal insights with Microsoft's foundational AI capabilities.
Microsoft also revealed that the MAI model is being co-designed with its self-developed Maia 200 chip, and through joint hardware and software optimization, it has achieved a 1.4x efficiency improvement.
02 The intelligent agent ecosystem has been fully implemented
At the conference, Microsoft announced a major shift to "Agent First," aiming to automate how knowledge workers use software and integrate AI assistants into daily office interactions.
Scout is the core intelligent agent product released this time. This AI agent, described as "always online," is built on the OpenClaw framework and can interact in Microsoft Teams like a human colleague.
Scout can browse a user's work messages, calendar, and email inbox, automate tasks, reschedule conflicting meetings, and draft professional-sounding responses. Users can send it commands directly within Teams and give it a name of their own.
Microsoft's newly appointed corporate vice president, Omar Shaheen, explained Scout's design philosophy: "Your company is essentially employing your assistants. The whole point of having personal assistants is that they are still working when you are not working."
Scout is offered through Microsoft's Frontier program and requires a GitHub Copilot subscription. Microsoft is testing a Scout desktop application that will be rolled out to subscribers who opt for "Frontier" access. Within Microsoft, Shaheen said the sales department is the largest and fastest-growing user group of the tool.
The GitHub Copilot desktop application is another important release. GitHub Chief Product Officer Mario Rodriguez describes it as "a desktop experience built on top of GitHub, with Agent-native capabilities."
Through a unified "My Work" view, developers can see ongoing work across connected repositories, including active sessions, issues, pull requests, and background automations. Each session runs in its own Git worktree, so parallel agents operate independently. The application features Agent Merge, which guides pull requests through review, checks, and merging. A Canvas interface supports two-way interaction between humans and agents, allowing developers to inspect, guide, and validate work that agents perform on their behalf.
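The one-worktree-per-session idea can be illustrated with the standard `git worktree add` command. The sketch below only builds the commands and does not run them; the repository path, branch prefix, and directory layout are hypothetical, and the article does not describe how the GitHub Copilot application actually names or places its worktrees.

```python
from __future__ import annotations
import shlex

def worktree_command(repo_dir: str, session_id: str, base_ref: str = "HEAD") -> list[str]:
    """Build a `git worktree add` command that gives one agent session
    its own checkout on its own branch."""
    branch = f"agent/{session_id}"                # hypothetical branch naming
    path = f"{repo_dir}-worktrees/{session_id}"   # hypothetical directory layout
    return ["git", "-C", repo_dir, "worktree", "add", "-b", branch, path, base_ref]

# Two parallel sessions get two separate working directories,
# so edits made by one agent never touch the other's files.
sessions = ["fix-login-bug", "update-docs"]
commands = [worktree_command("/src/myapp", s) for s in sessions]
for cmd in commands:
    print(shlex.join(cmd))
```

Because every worktree has its own branch and directory, each session's changes can later be reviewed and merged as an ordinary pull request.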
The GitHub Copilot application is available in technical preview for Windows 11, Windows 11 on Arm, Mac, and Linux, requiring a GitHub Copilot subscription. It will be available to Copilot Free users in the future. The application supports cloud and local sandboxes and code reviews, both with policy support.
In terms of intelligent agent security governance, Microsoft has released the Agent Control Specification (ACS), a new open-source standard designed to provide developers with a more consistent and granular approach to controlling the behavior of AI agents. ACS enables development, compliance, and security teams to define policy documents for agents, specifying what agents can and absolutely cannot do, when human approval is required, and what evidence should be logged for review.
ACS is released as an SDK, bundled with plugins for LangChain, OpenAI Agents SDK, Anthropic Agents SDK, AutoGen, CrewAI, Semantic Kernel, Microsoft.Extensions.AI, MCP tools, and more. Because policies can be written in a single file, they can be bundled with agents and follow them across different frameworks and environments.
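To make the idea of a policy document concrete, here is a minimal sketch of a policy that answers the four questions described above: what is allowed, what is forbidden, what needs human approval, and what gets logged. The class, field names, and verdict strings are invented for illustration and are not the ACS schema or SDK, which the article does not detail.

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class AgentPolicy:
    # Hypothetical fields; the real ACS policy format may look quite different.
    allowed: set[str] = field(default_factory=set)
    forbidden: set[str] = field(default_factory=set)
    needs_approval: set[str] = field(default_factory=set)
    audit_log: list[dict] = field(default_factory=list)

    def decide(self, agent_id: str, action: str) -> str:
        """Return 'allow', 'ask_human', or 'deny', and log evidence for review."""
        if action in self.forbidden:
            verdict = "deny"
        elif action in self.needs_approval:
            verdict = "ask_human"
        elif action in self.allowed:
            verdict = "allow"
        else:
            verdict = "deny"  # anything not listed is denied by default
        self.audit_log.append({"agent": agent_id, "action": action, "verdict": verdict})
        return verdict

policy = AgentPolicy(
    allowed={"read_calendar"},
    forbidden={"delete_mailbox"},
    needs_approval={"send_email"},
)
for action in ["read_calendar", "send_email", "delete_mailbox", "export_contacts"]:
    print(action, "->", policy.decide("demo-agent", action))
```

Keeping the rules in one self-contained object mirrors the single-file idea: the same policy can travel with the agent regardless of which framework runs it.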
ASSERT (Adaptive Spec-driven Scoring for Evaluation and Regression Testing) is a companion testing tool. It is an open-source framework that uses AI to transform high-level natural language descriptions of goals, strategies, or expected behaviors into structured scoring tests.
ASSERT receives a concise, language-based description of the expected behavior of an AI model, generates sets of acceptable and unacceptable behaviors, problem scenarios, and test cases, runs tests on the target system, and scores them. It can also record the paths taken by the AI system, including intermediate operations and tool calls, so developers can check for failures.
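The run-score-trace loop described above can be sketched in a few lines. This is not ASSERT's API: in the real tool an AI model would generate the cases from a natural language description, whereas here they are written by hand, and the toy agent, case fields, and pass criteria are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class BehaviorCase:
    prompt: str
    must_contain: str       # marker of acceptable behavior
    must_not_contain: str   # marker of unacceptable behavior

def toy_agent(prompt: str, trace: list) -> str:
    """Stand-in for the system under test; records its tool calls in `trace`."""
    trace.append({"tool": "lookup_policy", "input": prompt})
    if "refund" in prompt.lower():
        return "Refunds are available within 30 days."
    return "I can't help with that."

def run_suite(target, cases):
    """Run every case, keep the trace of intermediate steps, and score the run."""
    results = []
    for case in cases:
        trace = []
        output = target(case.prompt, trace)
        passed = case.must_contain in output and case.must_not_contain not in output
        results.append({"prompt": case.prompt, "passed": passed, "trace": trace})
    score = sum(r["passed"] for r in results) / len(results)
    return score, results

cases = [
    BehaviorCase("How do I get a refund?", "30 days", "guarantee"),
    BehaviorCase("Can you share another customer's address?", "can't", "address is"),
    BehaviorCase("What is the refund policy for digital goods?", "digital", "guarantee"),
]
score, results = run_suite(toy_agent, cases)
print(f"score = {score:.2f}")
for r in results:
    if not r["passed"]:
        print("FAILED:", r["prompt"], "| trace:", r["trace"])
```

Printing the trace next to each failed case is what lets a developer see which intermediate step or tool call led to the wrong answer.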
03 The more autonomous an agent is, the more dangerous it becomes: Microsoft uses MXC to draw a red line at the system level
As AI agents become increasingly powerful and autonomous, Microsoft has identified a critical issue: the more autonomous an agent is, the more useful it becomes, but the more dangerous it is to allow it to operate unchecked on enterprise networks. Microsoft's official blog describes this as a "multi-layered systems problem," where every interaction between the agent and humans, tools, applications, models, and other agents "exposes new attack surfaces and introduces different failure modes."
To address this issue, Microsoft introduced Microsoft Execution Containers (MXC), a policy-driven execution layer built into the Windows operating system itself. Pavan Davuluri, Microsoft's Executive Vice President of Windows and Devices, emphasized that this is crucial for making AI agents commercially viable: the containers are "centered around security, inclusion, isolation, and user control," ensuring that agents are secure enough for both consumer and enterprise deployments.
Microsoft CEO Satya Nadella introduces the system-level security sandbox MXC
MXC is essentially an SDK and policy model embedded in Windows and Windows Subsystem for Linux, providing what Microsoft calls a “composable sandbox spectrum.” This spectrum ranges from lightweight process isolation (adopted by the GitHub Copilot command-line interface) to micro virtual machines, Linux containers, and full cloud instances running on Windows 365.
This system decouples agent execution from the user's desktop, clipboard, user interface, and input devices. Each agent is bound to an identity, either a local ID or a cloud-provisioned identity powered by Microsoft Entra, ensuring that every action of the agent can be attributed, audited, and governed.
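The two ideas above, a spectrum of sandbox tiers and an identity attached to every action, can be sketched as follows. None of these names come from the MXC SDK: the enum, the tier-selection rule, and the log format are hypothetical and only follow the order in which the article lists the tiers.

```python
from dataclasses import dataclass
from enum import IntEnum

class Isolation(IntEnum):
    # Ordered as listed in the article; names are illustrative, not MXC identifiers.
    PROCESS = 1
    MICRO_VM = 2
    LINUX_CONTAINER = 3
    CLOUD_INSTANCE = 4

@dataclass(frozen=True)
class AgentIdentity:
    agent_id: str
    provider: str  # "local" or "entra" (cloud-provisioned)

def choose_tier(runs_untrusted_code: bool, needs_linux: bool) -> Isolation:
    """Hypothetical selection rule: pick the lightest tier that meets the task's needs."""
    if needs_linux:
        return Isolation.LINUX_CONTAINER
    if runs_untrusted_code:
        return Isolation.MICRO_VM
    return Isolation.PROCESS

audit_log = []

def record_action(identity: AgentIdentity, tier: Isolation, action: str) -> dict:
    """Attribute an action to an identity so it can be audited later."""
    entry = {"agent": identity.agent_id, "provider": identity.provider,
             "tier": tier.name, "action": action}
    audit_log.append(entry)
    return entry

agent = AgentIdentity("build-helper", "entra")
tier = choose_tier(runs_untrusted_code=True, needs_linux=False)
print(record_action(agent, tier, "run_tests"))
```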
MXC is currently available in an early preview. Agent 365, integrated with the Microsoft Enterprise Security Stack, will be available in preview in July 2026, layering Entra Identity Services, Intune Device Management, Defender Threat Protection, and Purview Data Governance capabilities onto MXC, enabling IT departments to centrally manage Agent isolation.
In terms of partners, OpenAI, NVIDIA, Manus, Nous Research (maker of Hermes Agent), and the OpenClaw open-source project have announced that they are building on MXC.
It's worth mentioning that the collaboration with OpenClaw began when its creator, Peter Steinberger, proactively contacted Microsoft to express his interest, which eventually developed into a comprehensive platform-level partnership.
04 Three updates enable Edge's AI to "run offline"
Microsoft's Edge browser has also received an upgrade to its native AI capabilities. Microsoft stated that since the introduction of Phi-4-mini at Build 2025, the team has expanded its on-device AI capabilities based on feedback from web developers.
The first item is Aion-1.0-Instruct, a smaller, faster, and more efficient local language model than Phi-4-mini. It can run on PCs with limited GPU and CPU capabilities, is currently available in developer preview, and is scheduled for release on Hugging Face in July.
The second is a pair of APIs for language detection and translation, available with Edge version 148. Both are JavaScript APIs powered by Edge's built-in on-device AI model, allowing websites and browser extensions to detect the language of a piece of text and translate it between language pairs. Microsoft claims the feature "delivers fast, high-quality translations, supports over 145 languages, and is optimized for translation workloads on the web," and that the service is free.
The third feature is speech recognition via the Web Speech API, available experimentally in the Edge Canary and Dev channels. This API helps developers integrate voice or audio input into websites and browser extensions, running locally on the device, or backed by cloud-based speech-to-text and text-to-speech services.
05 Developer Tools and Cloud Service Iterations
In terms of data intelligence, Microsoft released Microsoft IQ, which merges four previously separate context sources into a shared foundation for agents.
Microsoft Fabric Chief Technology Officer Amir Netz used an analogy: the green code waterfall in "The Matrix" wasn't decoration, but the foundation upon which that world was built. He said, "What we do in the data world is create a data-driven reality for agents."
Microsoft IQ's four context sources are: Work IQ, which captures how organizations operate day-to-day, leveraging emails, documents, meetings, and schedules; Foundry IQ, which manages organizational knowledge, curating and indexing knowledge bases; Fabric IQ, which models the real-time operational status of a business through data, defining entities, relationships, and business rules anchored to real-time signals based on Fabric Real-Time Intelligence (this feature is expected to be officially released in the coming months); and Web IQ, which adds real-time global context from the web.
With this contextual system, the Agent is no longer just a tool that executes commands, but a virtual employee who understands how the company operates.
A shared "foundation" alone is not enough. When agents start building applications, each application needs a backend. If left unchecked, these applications will create new data silos outside the context layer. To address this, Microsoft released Rayfin, an open-source SDK and CLI. It deploys applications built by agents directly to the Fabric platform as a governed production backend. By default, application data lands in the unified OneLake data lake and then feeds back into Microsoft IQ instead of accumulating externally.
Microsoft positions Rayfin as a competitor to Supabase and Neon, with the core difference being governance: all applications use the same data and compliance channels. Netz explains that this is a two-way process: when an agent builds an application, it retrieves information from the enterprise's data rules; the data generated by the application then updates these rules, allowing the next agent to use the latest information.
Microsoft also launched a WSL container feature that lets developers create and manage Linux containers directly on Windows. It comes with a command-line interface and an API, so that native Windows applications can run Linux containers. The feature will enter public preview in the coming months.
To save developers time on environment configuration, Microsoft also released Windows Developer Configurations, which can quickly set up a new machine and apply developer-optimized configurations, automatically install WSL, PowerShell 7, and Visual Studio Code, and enable Git version control and show hidden files in File Explorer.
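For comparison, a similar setup can be scripted by hand today. The sketch below is not how Windows Developer Configurations works internally, which the article does not describe; it is a manual equivalent that uses existing `wsl`, `winget`, and `reg` commands. It defaults to a dry run that only prints the commands and leaves out the File Explorer Git integration, which it does not attempt to script.

```python
import subprocess

# Manual equivalents of part of the setup described above (Windows only).
SETUP_COMMANDS = [
    ["wsl", "--install"],
    ["winget", "install", "--id", "Microsoft.PowerShell", "--exact", "--source", "winget"],
    ["winget", "install", "--id", "Microsoft.VisualStudioCode", "--exact", "--source", "winget"],
    # Show hidden files in File Explorer (Hidden = 1) for the current user.
    ["reg", "add", r"HKCU\Software\Microsoft\Windows\CurrentVersion\Explorer\Advanced",
     "/v", "Hidden", "/t", "REG_DWORD", "/d", "1", "/f"],
]

def apply(commands, dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failure

apply(SETUP_COMMANDS)  # dry run: prints the plan without changing the machine
```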
06 Two new hardware devices bring heavy AI tasks back to the local device
This Build event wasn't just a software showcase of models, agents, and development tools; hardware was also present. As AI computing becomes increasingly resource-intensive and agentic workflows need to run continuously, Microsoft has turned its attention to the devices available to developers. Instead of having developers rent expensive cloud GPUs each time, the new hardware lets these tasks run directly on local machines.
Andrew Hill, Vice President of Surface Products, announced two new devices:
The Surface RTX Spark Dev Box is a compact developer PC powered by the NVIDIA RTX Spark superchip, which combines an NVIDIA Blackwell RTX GPU and an NVIDIA Grace CPU to deliver up to 1 petaflop of AI computing power, and comes with 128 GB of unified memory.
The device utilizes an aluminum chassis that also serves as a heat sink, designed for long-running training tasks, large model inference, and complex agentic workflows. It comes pre-installed with Windows 11 Pro and pre-configured for developers at the image level: dark theme, simplified taskbar for development, removal of widgets, "Do Not Disturb" mode enabled, developer mode enabled, and PowerShell 7 as the default shell. WSL 2 is configured with GPU passthrough and CUDA support, and VS Code, GitHub Copilot, Git, Python, and Node.js are all installed.
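To give a feel for what 128 GB of unified memory means for local inference, the back-of-the-envelope calculation below estimates how much memory a model's weights need at different precisions. The model sizes are hypothetical examples, and the estimate ignores the KV cache, activations, and operating system overhead, so real limits will be lower.

```python
UNIFIED_MEMORY_GB = 128  # figure quoted for the Surface RTX Spark Dev Box

def weights_gb(params_billion: float, bits_per_weight: int) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8

def fits(params_billion: float, bits_per_weight: int) -> bool:
    """True if the weights alone fit in unified memory (no runtime overhead counted)."""
    return weights_gb(params_billion, bits_per_weight) <= UNIFIED_MEMORY_GB

# Hypothetical model sizes, chosen only to illustrate the arithmetic.
for size in (8, 70, 200):
    for bits in (16, 8, 4):
        verdict = "fits" if fits(size, bits) else "too large"
        print(f"{size}B params at {bits}-bit: {weights_gb(size, bits):.0f} GB -> {verdict}")
```

The pattern is that quantization, not just raw memory, decides which models can run locally: halving the bits per weight halves the footprint.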
In terms of security, the Surface RTX Spark Dev Box is built on chip-to-cloud security in accordance with Microsoft's zero-trust principle, including a Secured-core PC architecture, BitLocker encryption, and Microsoft Defender protection, and can be integrated with Entra ID and Intune for large-scale management and governance.
Hill explained, "The way developers build software is fundamentally changing. AI models are becoming increasingly powerful and complex, agentic workflows require continuous computing power, and even for tasks that don't require state-of-the-art models, each iteration can incur cloud costs."
The second device, the Surface Laptop Ultra, is a high-performance laptop designed for developers, creators, and technology professionals; it was launched earlier. Together, the two represent the next step for Surface: creating dedicated devices for people who build the future. The Surface RTX Spark Dev Box will be available in the US later this year, exclusively through Microsoft.com.
07 A new platform that allows devices to run AI agents instead of applications
Stevie Bathiche, head of Microsoft's Applied Science division, introduced an internal project known as Project Solara.
This is a new chip-to-cloud platform, based on Android rather than Windows, designed to let devices run AI agents instead of applications. Bathiche explains its motivation: "The boundaries are collapsing. You don't necessarily need traditional application models. You don't need the traditional way to develop experiences."
The first two concept devices were showcased at the Build conference:
The desktop hub, placed next to a PC, responds to voice commands, allows users to log in via facial recognition, and displays the day's most urgent tasks. When connected to a monitor, it transforms into a full-fledged Windows machine running in the cloud.
The wearable employee badge redefines the standard employee ID card. A single fingerprint press wakes the agent, a touch records and transcribes conversations, and a built-in camera allows the agent to take action based on what the user sees.
In a healthcare demonstration, the badge functioned as an agent designed for healthcare workers, capable of scanning patient QR codes, recording and transcribing patient visits, recording vital signs, and issuing prescriptions. In another application, a built-in camera scanned a brainstorming board displaying office renovation ideas and suggested adding greenery.
Bathiche stated that Microsoft will not manufacture these devices itself, but envisions hardware manufacturers and other industry partners turning these reference designs into their own products, each targeting a specific industry, company, or scenario.
08 Quantum chip upgrade improves reliability by a thousandfold
Microsoft also released its next-generation topological quantum chip, Majorana 2.
Compared to its predecessor, Majorana 1, the core change this time is that the superconductor material has been changed from aluminum to lead. This adjustment improves the reliability of the qubit by 1,000 times, and the average qubit lifetime reaches 20 seconds, with some instances lasting up to one minute.
Other technological approaches typically result in qubit lifetimes on the order of microseconds. Based on this progress, Microsoft has halved its anticipated timeline for scalable quantum computers, now projecting it to be achieved by 2029.
The chip was developed using the agentic AI capabilities of the Microsoft Discovery platform throughout. AI agents handled tasks such as manufacturing management, automated quantum state measurement, and interdisciplinary data analysis, shortening measurement cycles that previously took several weeks by several orders of magnitude and identifying correlations, difficult for humans to perceive, in nearly two decades of accumulated data.
Microsoft Technical Fellow Chetan Nayak said, "Agentic AI has permeated almost everything we do." But he emphasized that AI only provides guidance, "and scientists are always in the loop."
The Microsoft Discovery platform was also officially launched at the conference. This is an organization-level platform for cutting-edge research and development, allowing researchers to deploy human-guided teams of autonomous agents for hypothesis generation, experimental optimization, and theoretical verification. Microsoft also released an early preview version of the Microsoft Discovery app, which individuals can download for free and run locally using their GitHub Copilot account.



