Archive 2026.06.12

AI Industry Daily News

A roundup of the AI industry's day, centered on Codex Windows support, grok-build-0.1, Claude Opus 4.8, Command A+, and Rosalind Biodefense.

Today's highlights

OpenAI / Codex

OpenAI Agrees to Acquire Ona, Reportedly Files Confidential IPO Paperwork

OpenAI moves to acquire secure cloud execution startup @ona_hq to strengthen long-running Codex agents, while reportedly submitting a confidential S-1 to the SEC.

Anthropic / Claude

Anthropic Releases Claude Fable 5, Tops Design Arena With 1365 Elo

Anthropic's first publicly available Mythos-class model leads Design Arena overall and Code Arena Frontend, and is being adopted in Microsoft 365 Copilot.

Google / Gemini

Google's Gemini Omni Flash Tops Video Arena for Text- and Image-to-Video

Gemini Omni Flash leads both Text-to-Video and Image-to-Video, while Gemini 3.5 Audio adds Live Translate across more than 70 languages.

OpenAI / GPT-5.5

GPT-5.5 Places Second in Agent Arena Behind Claude Fable 5

OpenAI's GPT-5.5 (xHigh) records a +10.6% net improvement to rank second on Arena.ai's Agent Arena, narrowly behind Claude Fable 5.

xAI / Grok Build

xAI Opens Grok Build Plugin Marketplace in Beta With Five Plugins

xAI launches a plugin marketplace for its terminal-based Grok Build agent, with MongoDB, Vercel, Sentry, Cloudflare and Chrome DevTools support.

Key topics and reactions

OpenAI / Codex

OpenAI Agrees to Acquire Ona, Reportedly Files Confidential IPO Paperwork

OpenAI has agreed to acquire Ona (@ona_hq), a startup with secure cloud execution technology. The deal is meant to support "long-run" operations, in which Codex keeps working after a laptop is closed, and to expand agent deployment in production environments.

At the same time, OpenAI is reported to have filed a confidential S-1 with the U.S. Securities and Exchange Commission in preparation for an IPO. Sam Altman has framed a "third phase" centered on automating AI research and distributing personal AGI.

The moves align with a broader industry push toward long-running, stateful agents that can operate over multiple days, a theme echoed by Claude Managed Agents scheduling support and other infrastructure work.

Anthropic / Claude

Anthropic Releases Claude Fable 5, Tops Design Arena With 1365 Elo

Anthropic publicly released Claude Fable 5, its first Mythos-class model, on June 9. The model recorded an overall Elo of 1365 on the crowdsourced Design Arena benchmark, 22 points above Claude Opus 4.8. It also took first place in Code Arena: Frontend and led Agent Arena by a large margin.
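For readers who want a feel for what a 22-point Elo gap means, the sketch below converts it into an expected head-to-head preference rate. It assumes the standard logistic Elo model with a 400-point scale; Design Arena's exact rating method is not described here, so the function name, the scale and the result are illustrative only.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of A against B under the standard logistic Elo model.

    Assumption: a 400-point scale, as in classic Elo. The benchmark's own
    rating method may differ.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Figures reported above: Fable 5 at 1365, Opus 4.8 22 points lower.
fable_5 = 1365
opus_4_8 = fable_5 - 22  # 1343, derived from the reported gap

p = elo_expected_score(fable_5, opus_4_8)
print(f"Expected head-to-head preference for Fable 5: {p:.3f}")
```

Under these assumptions the gap works out to a preference rate of roughly 0.53, meaning Fable 5 would be expected to win a little over half of blind comparisons against Opus 4.8.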

Fable 5 is a safety-classified version of the top Mythos tier that had previously been withheld as too dangerous to release. It automatically routes high-risk cybersecurity, biology and chemistry queries to Claude Opus 4.8, and Anthropic has made the unrestricted Claude Mythos 5 available to vetted users such as cyber defenders.

Developers report strong autonomy in real use, including building 3D game prototypes overnight and running multi-project workflows. Reported limitations include token consumption nearly double that of Opus and rate limits during long sessions. The model is available on Pro, Max, Team and Enterprise plans and via the API.

Google / Gemini

Google's Gemini Omni Flash Tops Video Arena for Text- and Image-to-Video

Google's Gemini Omni Flash took first place in both the Text-to-Video and Image-to-Video categories of Video Arena. In Text-to-Video it scored 158 points above Veo 3.1 and 61 points above the next-ranked Seedance 2.0.

Google also released Gemini 3.5 Audio's Live Translate, providing real-time speech translation across more than 70 languages.

Separately, Google published Gemma 4 12B, a multimodal, on-device-oriented model, and Google DeepMind launched a roughly $10M fund with Schmidt Sciences and others to study emergent collective behavior among millions of interacting agents.

OpenAI / GPT-5.5

GPT-5.5 Places Second in Agent Arena Behind Claude Fable 5

On the Agent Arena leaderboard run by Arena.ai (formerly LMArena), OpenAI's GPT-5.5 (xHigh) ranked second with a net improvement of +10.6%, the highest position among OpenAI models and just behind Claude Fable 5 (High) at +11.2%.

Agent Arena evaluates models on behavioral signals from real long-horizon agent tasks rather than static benchmarks or Elo voting, drawing on about 559,200 sessions, 23 models, over 300,000 tasks, more than 2 million tool calls and 40 million lines of generated code.

GPT-5.5 led the Praise vs. Complaint (+29.4%) and Bash Recovery (+14.1%) signals, but trailed Fable 5 on confirmed success rate and steerability, leaving it narrowly in second place overall.

Category highlights

NVIDIA Unveils RTX Spark Superchip for Arm Windows PCs at GTC Taipei

At GTC Taipei 2026, NVIDIA's Jensen Huang announced RTX Spark (codename N1X), a superchip that links a 20-core Arm-based Grace CPU and a Blackwell RTX GPU with up to 6,144 CUDA cores over NVLink-C2C. It offers up to 128 GB of unified memory and up to 1 petaflop of FP4 AI performance, targeting always-on personal AI agents, creative work and gaming. The chip was built with Microsoft for Windows on Arm, and devices from ASUS, Dell, HP, Lenovo, Microsoft Surface, MSI, Acer and GIGABYTE are expected in fall 2026.

Design Arena Passes 4 Million Users and 30-Plus Arenas

Design Arena, the crowdsourced benchmark that compares AI-generated designs through blind head-to-head user votes, said it surpassed 4 million registered users (shown as 4.4M+ on its site) and expanded from 5-6 arenas to more than 30. The Y Combinator S25 startup says it has tested over 300 models and that its community has produced more than 5 million designs across more than 140 countries, with Anthropic models holding the top overall positions.

Microsoft, MiniMax, Stability AI and Google Update Model Lineups

Microsoft rolled out MAI-Code-1-Flash to 100% of GitHub Copilot plans; MiniMax announced a Friday weights release for M3, with free access through June via a PBD token router; Stability AI released Creative Image Model v4 with improved prompt fidelity; and Google published Gemma 4 12B for multimodal on-device use.

HeyGen Adds CLI, Avatars and Free Music and SFX Library

HeyGen drew strong support from solo creators with a CLI, Avatars, and a free library of more than 250,000 music and SFX tracks, letting users generate complete videos with avatar, voice and script from a single terminal command. In one published case, Belgian cat behaviorist Anneleen Bru used HeyGen to cut video production time to about one-fifth and localize 56 video courses into French, German, Italian, Russian, Hungarian and Japanese. Long, complex videos still require manual fine-tuning.

Runway, Higgsfield and ElevenLabs Expand Video and Audio Tools

Runway deepened its Lionsgate partnership for original IP development and a New York film festival, while Higgsfield AI shipped a professional video editing plugin bundling seven AI tools. In audio, ElevenLabs demonstrated Matthew McConaughey dubbing with Dubbing v2 and added Avatars to ElevenCreative, while Grok Voice, adopted as Vapi's default voice engine, reached more than 2.5 million voice agents.

Key trends

Databricks, Anthropic and Apple Expand Platform Features

Databricks announced Zerobus Ingest for serverless streaming data ingestion; Claude Managed Agents gained scheduled execution and environment variable support; and Apple advanced its large-scale rollout of photorealistic image generation, along with Reframe, Extend and Cleanup features.

GTC Taipei, Automate2026 and PixVerse Tokyo Event Headline Calendar

NVIDIA published behind-the-scenes coverage of GTC Taipei, and humanoids were a focus of debate at Automate2026. Luma and CapCut previewed Cannes Lions 2026 and AI film festivals, and PixVerse will hold its first offline event in Tokyo on June 16.