ainewsblitz.com

Breaking · Microsoft AI

Microsoft Unveils Seven Homegrown 'MAI' AI Models to Cut OpenAI Reliance

Microsoft on June 2 announced a new family of seven in-house AI models, the "MAI model family," at its annual Build 2026 developer conference in San Francisco. Spanning reasoning, coding, image, transcription and voice synthesis, all of the models were built from scratch with "zero distillation"—no knowledge distillation from rival models—and a traceable "clean data lineage" sourced from commercially licensed data. The launch was led by the Microsoft AI team (CEO: Mustafa Suleyman), which emphasized efficiency and seamless interoperability as a family. (GeekWire, Mashable)

The seven models comprise MAI-Thinking-1, its first reasoning model; the coding-focused MAI-Code-1-Flash; the image generation and editing MAI-Image-2.5 and MAI-Image-2.5-Flash; the transcription model MAI-Transcribe-1.5; and the voice synthesis MAI-Voice-2 and MAI-Voice-2-Flash. Microsoft positioned them as a "full-stack AI ecosystem," with some available immediately through integrations into its own products such as Copilot, PowerPoint, OneDrive and VS Code. (@MicrosoftAI)

The centerpiece MAI-Thinking-1 uses a sparse MoE (Mixture of Experts) design with 35B active parameters and roughly 1 trillion total parameters, supporting a 256K context window and function calling. On the AIME math benchmark it scored 97.0% on the 2025 edition and 94.5% on the 2026 edition, and it is said to reach parity with Claude Opus 4.6 on the demanding SWE-Bench Pro software-engineering benchmark. In blind tests by independent evaluators it tended to be preferred over Claude Sonnet 4.6. Microsoft, prioritizing low token cost, is offering the model in private preview on Foundry. (microsoft.ai)

The coding-oriented MAI-Code-1-Flash posted 71.6 on SWE-Bench Verified (above Claude Haiku 4.5's 66.6), 51.2 on SWE-Bench Pro and 54.8 on Terminal Bench 2, while claiming up to 60% fewer tokens. It is already rolling out to GitHub Copilot and VS Code. The MAI-Image-2.5 series ranked third in text-to-image and second in image editing on the Arena leaderboard, with the Flash variant priced at $1.75 per 1M text input tokens and $19.50 per 1M image output tokens. The transcription model MAI-Transcribe-1.5 topped the 43-language FLEURS benchmark with a WER of 2.4% and is said to be up to five times faster than rivals, processing one hour of audio in under 15 seconds. The synthesis model MAI-Voice-2 supports 15 languages with emotion control and code-switching such as Hindi-English, and was preferred by 72% of users over the earlier MAI-Voice-1. (@MicrosoftAI)

The announcement is widely seen as part of Microsoft's strategy to reduce reliance on OpenAI and pursue long-term self-sufficiency. The company stressed cost efficiency, low token costs and enterprise deployment, presenting its commercially licensed, auditable data lineage as a reassurance for corporate use. Against rivals such as Anthropic's Claude, OpenAI's GPT line and Google's Gemini, it is differentiating through a "family" strategy that bundles task-specialized models. (CNBC, Yahoo Finance) Distribution centers on Microsoft Foundry, with some models on OpenRouter and the Thinking model coming soon to Fireworks AI and Baseten. Microsoft describes its philosophy of continuous improvement via RL and data pipelines as a "hill-climbing machine." (microsoft.ai)

The official thread on X drew thousands of likes and high engagement. Developers praised Code-1-Flash's SWE-Bench gains and token reduction, the Image models' Arena rankings and Transcribe's speed, with strong interest in "VS Code/Copilot enhancements" and "image-editing precision." Others were more cautious, noting that on human evaluation Sonnet appeared stronger on factuality, flagging rollout delays such as the VS Code picker not yet reflecting the change, voicing concerns over overly aggressive safety filters, arguing that "one unified model would be better than seven," and questioning whether open weights would be released—leaving lingering caution over immediacy and openness. (@MicrosoftAI)

Source post →