Archive2026.06.03

AI Industry Daily News

A roundup of the AI industry's day, centered on Codex Windows support, grok-build-0.1, Claude Opus 4.8, Command A+, and Rosalind Biodefense.

Today's highlights

Policy / AI Executive Order

Trump Signs Frontier AI Security Executive Order; Anthropic Voices Support

President Trump signed an executive order on June 2 balancing frontier-AI innovation and national security, which Anthropic welcomed as 'an important step' toward strengthening US AI leadership.

OpenAI / Codex

OpenAI Extends Codex With Role-Specific Plugins and 'Sites'

OpenAI introduced six role-specific Codex plugins spanning 62 apps and 110 skills, plus a 'Sites' feature that turns work into shareable interactive web apps via a single URL.

Google DeepMind / Co-Scientist

Google DeepMind Opens 'Co-Scientist' Research Partner to Researchers

Google DeepMind made its Gemini-based multi-agent system 'Co-Scientist,' which generates, debates and refines scientific hypotheses, available to general researchers.

Microsoft AI / MAI

Microsoft Unveils Seven MAI Models at Build 2026

Microsoft AI launched a family of seven in-house MAI models, with its image-editing MAI-Image-2.5 landing at No. 2 on the Image Edit Arena and three models now available on OpenRouter.

Microsoft / Scout

Microsoft Launches 'Scout,' Its First Always-On Autopilot Agent

At Build 2026, Microsoft introduced Microsoft Scout, an always-on autonomous agent for Microsoft 365 that works in the background without prompts.

Key topics and reactions

Policy / AI Executive Order

Trump Signs Frontier AI Security Executive Order; Anthropic Voices Support

The order, 'Promoting Advanced Artificial Intelligence Innovation and Security,' defines models with advanced cyber capabilities as 'covered frontier models' and tasks the NSA with developing benchmarks to identify them within 60 days. It establishes a framework under which developers voluntarily submit models for review up to 30 days before release, and directs the Treasury to set up an 'AI Cybersecurity Clearinghouse' within 30 days.

Notably, the order explicitly rejects any 'mandatory governmental licensing, preclearance, or permitting requirement,' limiting government engagement to voluntary cooperation. According to NPR and The New York Times, an initial 90-day review window was shortened to 30 days after industry concerns.

Anthropic shared the White House announcement and pledged to help implement the order. The move follows Anthropic's April decision to restrict the release of its 'Mythos Preview' model over cyber-exploit capabilities, underscoring how regulation, geopolitics and model deployment are now tightly intertwined.

OpenAI / Codex

OpenAI Extends Codex With Role-Specific Plugins and 'Sites'

The six plugins cover data analytics, creative production, sales, product design, public equity investing, and investment banking, turning Codex into a role specialist with one code-free install. Data analytics connects to Snowflake, Databricks Genie, Hex and Tableau; creative production combines Figma, Canva, Shutterstock and Picsart; sales links Salesforce, HubSpot, Slack, Outreach and Clay.

The new 'Sites' feature lets users convert work, ideas and plans into interactive websites, dashboards or internal tools that Codex builds and hosts, sharing them via a single URL. It uses Cloudflare Workers-compatible ES module output with support for D1 databases, R2 object storage, workspace auth and RBAC.

The push reflects Codex's expansion beyond developers: analysts, marketers, designers and investors now make up about 20% of users and are growing more than three times faster than developer users. The plugins are described as productized versions of tools OpenAI's own internal teams built and use.

Google DeepMind / Co-Scientist

Google DeepMind Opens 'Co-Scientist' Research Partner to Researchers

Offered as an experimental 'Hypothesis Generation' tool within 'Gemini for Science,' the system lets researchers specify a goal in natural language, then generates thousands of hypotheses and ranks and refines them through a 'tournament of ideas.' Its architecture comprises specialized Gemini agents handling generation, critique, ranking and supervision.

Co-Scientist evolved from the February 2025 arXiv paper 'Towards an AI co-scientist' and was formalized in a Nature paper around May 2026. The work demonstrated biomedical validations, including in vitro drug-repurposing candidates for acute myeloid leukemia and a novel epigenetic target for liver fibrosis confirmed in human liver organoids.

DeepMind positions the system as a research partner that assists rather than replaces human scientists.

Microsoft AI / MAI

Microsoft Unveils Seven MAI Models at Build 2026

The MAI family spans reasoning, code, image, transcription and voice, including MAI-Thinking-1 (35B reasoning), MAI-Image-2.5, MAI-Transcribe-1.5, MAI-Voice-2 (15 languages) and MAI-Code-1-Flash, with immediate VS Code/GitHub Copilot support and a Mayo Clinic-linked medical model.

MAI-Image-2.5 debuted at No. 3 on the Text-to-Image Arena with a 1,254 score (+72 over MAI-Image-2) and reached No. 2 on the Image Edit Arena with 1,401, beating Google's Nano Banana 2, xAI's Grok Imagine and OpenAI's ChatGPT-Image-Latest-High Fidelity by over 10 points while trailing the leading gpt-image-2 (medium, 1,466) narrowly.

Microsoft also began offering MAI-Image-2.5, MAI-Transcribe-1.5 and MAI-Voice-2 on the OpenRouter routing platform, advancing Mustafa Suleyman's 'AI self-sufficiency' strategy to compete directly with rival hyperscalers. Some users, however, found the quality gap with Nano Banana Pro less decisive than the official claims suggest.

Category highlights

Gemini 3.5 Flash Improvements Roll Out

Google DeepMind rolled out an improved Gemini 3.5 Flash for Antigravity, announcing significantly reduced hallucinations and greater resilience on hard tasks, alongside a rate-limit reset for all users. The model rose 15 ranks to fourth on the Mobile Arena (Elo 1,257) and gained about eight ranks on average in the code category. Some users criticized over-autonomous agent behavior, such as making unrequested code changes.

Anthropic Ships 'ant' CLI and Reworks Claude Code Commands

Anthropic introduced 'ant,' an official CLI exposing all Claude Platform API endpoints as terminal commands, with Claude Code able to use it natively via a 'claude-api' skill. Separately, Claude Code's '/fork' was reworked into a background agent that inherits the full session context and prompt cache, with the prior copy-and-control behavior moved to '/branch.' Anthropic also published guidance on building self-verification loops that let Claude Code check its own work until tests pass.

Chrome Publishes Prompt API Polyfill for Unsupported Browsers

Google's Chrome team released an experimental polyfill for the Prompt API ('window.LanguageModel'), letting the same code run on-device AI in environments without native support, including Firefox, Safari and mobile. It defaults to a local backend of Transformers.js plus Gemma 3 1B (WebGPU or WASM) and can switch to cloud backends such as Gemini API or OpenAI, giving developers control over privacy, offline use and cost trade-offs.

Open-Weight Model Race Heats Up

Beyond MiniMax M3, JetBrains open-sourced Mellum2, a 12B MoE software-engineering model trained on about 10.6T tokens, while China's Kimi 2.6 drew praise as 'GPT-Medium class' and production-ready. Microsoft's MAI models also began shipping via OpenRouter, Baseten and fal.

Perplexity Adds Hybrid Agentic Reasoning and Apple Health

Perplexity announced hybrid agentic reasoning and an Apple Health integration. The updates extend the assistant's reasoning capabilities and personal data context as agent-based platforms broaden their reach into everyday workflows.

Key trends

Build 2026 Underway; NeurIPS Tightens Authorship Rules

Microsoft Build 2026 is in progress, anchoring this week's MAI and Scout announcements. Meanwhile, NeurIPS 2026 set a policy requiring its Position Paper Track to be 'substantially human-authored.' Google Flow also highlighted artists Keenan MacWilliam and Julie W in its 'Flow Sessions' creator showcase.