Archive 2026.06.13

AI Industry Daily News

A roundup of the AI industry's day, centered on Codex Windows support, grok-build-0.1, Claude Opus 4.8, Command A+, and Rosalind Biodefense.

Today's highlights

Google / Legal

Google Sues China-Based Network Accused of Using Gemini to Build 9,000 Phishing Sites

Google filed a civil suit in federal court against the 'Outsider Enterprise' network, in what it calls its first lawsuit over abuse of Gemini for financial crime.

MiniMax / M3

MiniMax Releases Open-Weight M3 Model With 1M-Token Context and Day-Zero Hosting

MiniMax published its new flagship M3, a roughly 428B-parameter MoE with about 23B active parameters, with native multimodal support and immediate availability across inference platforms.

Moonshot / Kimi

Moonshot Open-Sources Kimi-K2.7-Code With 21.8% Coding Gain and 30% Fewer Reasoning Tokens

Moonshot released its coding-focused Kimi-K2.7-Code under Apache 2.0, a 1T-parameter MoE with about 32B active parameters and a 256K context window.

Google / Gemini

Google Previews Gemini 3.5 Live Translate With Single-Model Speech-to-Speech

Google released a preview of Gemini 3.5 Live Translate in the Gemini Live API and AI Studio, handling speech input to translated speech output in one model.

NVIDIA / Nemotron

NVIDIA's Open-Weight Nemotron 3 Ultra Ranks 20th Overall, 5th Among Open Models in Agent Arena

NVIDIA's 550B-A55B open-weight MoE placed 20th overall and 5th among open models on Arena.ai's new Agent Arena leaderboard.

Key topics and reactions

Google / Legal

Google Sues China-Based Network Accused of Using Gemini to Build 9,000 Phishing Sites

Google filed a civil lawsuit on June 12, 2026 in the U.S. District Court for the Southern District of New York against 'Outsider Enterprise,' a China-based cybercrime network it accuses of using its Gemini model to generate large volumes of phishing sites. The company describes it as its first lawsuit over Gemini abuse and seeks to disrupt the group's infrastructure and recover damages. According to the complaint, the defendants had Gemini generate fake-site code and distributed it as 'phishing-as-a-service' kits via Telegram.

The complaint alleges Outsider created about 9,000 fraudulent sites and roughly 1.59 million malicious URLs. The fakes impersonated Google and YouTube corporate pages as well as government and public bodies such as the U.S. Postal Service and New York's E-ZPass toll system, with SMS phishing ('smishing') mainly targeting Android users. At a peak around May 2026, some 2.5 million scam texts were sent in two weeks, with more than 100,000 victims and estimated losses ranging from hundreds of thousands to billions of dollars. The kits sold for about $88 per week on Telegram.

Google's Threat Intelligence Group has previously reported state-sponsored hackers abusing Gemini, but the Outsider case marks its first judicial response to abuse aimed at financial fraud. Google said it is coordinating with the FBI and telecom carriers.

MiniMax / M3

MiniMax Releases Open-Weight M3 Model With 1M-Token Context and Day-Zero Hosting

MiniMax released M3, a new open-weight flagship built on a Mixture-of-Experts architecture with roughly 428B total parameters and about 23B active. The model emphasizes frontier coding, long-horizon agent workflows, native multimodal input, and a 1M-token context window. It was published on Hugging Face and deployed day-zero across inference platforms including Fireworks, Telnyx, Modular, Parasail, Unsloth and SambaNova.
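The gap between total and active parameters is what keeps per-token compute low in a Mixture-of-Experts model. As a rough illustration, the sketch below computes the active share from the rounded figures reported above; the function name is ours, and the result is only as precise as the "roughly 428B" and "about 23B" numbers.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Return the share of parameters used per token in an MoE model.

    Both arguments are in billions of parameters.
    """
    if total_params_b <= 0 or not 0 < active_params_b <= total_params_b:
        raise ValueError("expected 0 < active <= total")
    return active_params_b / total_params_b


# Rounded figures as reported for M3; treat the result as approximate.
m3_fraction = active_fraction(total_params_b=428, active_params_b=23)
print(f"M3 active share: {m3_fraction:.1%}")  # prints "M3 active share: 5.4%"
```

On these figures, only about one parameter in nineteen is exercised for any given token.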

Early user reports highlighted long-running agent workflows that used the 1M context to carry complex creative tasks end-to-end, from planning and code generation to iterative fixes. The model scored 59% on SWE-Bench Pro and 66% on Terminal-Bench, placing it at frontier level, though some tasks still lagged Claude in direct comparison.

Pricing during a current 50%-off period was widely cited as roughly one-sixteenth that of Claude Opus 4.8. Observers also noted sparse-attention efficiency and native multimodality, while flagging long deliberation times and high output-token counts as current limitations.

Moonshot / Kimi

Moonshot Open-Sources Kimi-K2.7-Code With 21.8% Coding Gain and 30% Fewer Reasoning Tokens

Moonshot released Kimi-K2.7-Code on June 12, 2026 and open-sourced it under Apache 2.0 on Hugging Face. Built on Kimi K2.6, the model is positioned by the company as its strongest coding model to date, with a 1T-parameter MoE architecture (about 32B active), native multimodal input across text, images and video, an always-on thinking mode, and a 256K context length. It was immediately available via the Kimi API, Kimi Code CLI, and the Design Arena comparison platform.

Against K2.6, the model gained 21.8% on Kimi Code Bench v2, 11.0% on Program Bench and 31.5% on MLS Bench Lite, while cutting reasoning-token usage by about 30% to reduce 'overthinking.' Kimi API pricing is $0.19 per MTok for cache hits, $0.95 for input and $4.00 for output.
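To see how the three listed rates combine, here is a minimal cost sketch. It assumes the $0.95 rate applies to input tokens that miss the cache and that reasoning tokens are billed as output; neither is confirmed above. The class, function and token counts are illustrative and not part of any Kimi SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD per million tokens (MTok)."""
    cache_hit_per_mtok: float
    input_per_mtok: float
    output_per_mtok: float


# Rates as listed for the Kimi API in the item above.
KIMI_K27_CODE = Pricing(cache_hit_per_mtok=0.19, input_per_mtok=0.95, output_per_mtok=4.00)


def estimate_cost(pricing: Pricing, cached_input_tokens: int,
                  uncached_input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a request mix under the assumptions stated above."""
    total = (
        cached_input_tokens * pricing.cache_hit_per_mtok
        + uncached_input_tokens * pricing.input_per_mtok
        + output_tokens * pricing.output_per_mtok
    )
    return total / 1_000_000


# Hypothetical workload: 2M cached input, 1M uncached input, 0.5M output tokens.
cost = estimate_cost(KIMI_K27_CODE, 2_000_000, 1_000_000, 500_000)
print(f"${cost:.2f}")  # prints "$3.33"
```

Output tokens dominate this hypothetical mix ($2.00 of $3.33), which is why a cut in reasoning tokens matters for cost as well as latency.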

The base K2.6 already topped Design Arena's 3D Design leaderboard, outscoring Anthropic Opus 4.7, OpenAI GPT 5.5 and Google Gemini 3.5 Flash at lower cost. The Kimi Code CLI supports MCP and ACP for IDE integration, prompting views that it could serve as a Claude Code CLI alternative, particularly in Asia.

Google / Gemini

Google Previews Gemini 3.5 Live Translate With Single-Model Speech-to-Speech

Google made Gemini 3.5 Live Translate available in preview through the Gemini Live API and AI Studio. The system performs speech input to translated speech output within a single model, removing the previously fragmented multi-model stack.

Early reactions focused on low latency, with reports of under 500 ms in real-time voice dialogue and discussion of customer-support use cases where minimal lag preserves conversational trust.

Some observers were more cautious, noting that few figures have been published for real-world load and asking for measured results outside demos. Others raised the possibility that the excessive tool use seen in Flash versions could also appear in translation tasks.

Category highlights

Anthropic Adds Five Sandbox Guides to Claude Managed Agents

Anthropic added new guides for running Claude Managed Agents (CMA) in user-controlled sandboxes, covering Blaxel, E2B, Google Cloud, Namespace Labs and Superserve AI. Developers can now choose between Anthropic-managed cloud environments and their own infrastructure by specifying 'type: cloud' or 'type: self_hosted' at environment creation, which addresses compliance, data-residency and performance needs. The design separates the 'brain' (Claude and its harness) from the 'hands' (the sandbox and tool execution) so that interfaces stay stable across model updates.
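The cloud versus self-hosted choice comes down to a single field set at environment creation. The sketch below shows one way a client might validate that field before sending a request. Only the two 'type' values come from the item above; the helper name and the shape of the returned dictionary are hypothetical and are not Anthropic's API.

```python
from typing import Literal, get_args

# The two values reported above; everything else in this sketch is illustrative.
EnvironmentType = Literal["cloud", "self_hosted"]


def make_environment_spec(env_type: str) -> dict[str, str]:
    """Build a minimal, hypothetical environment spec after validating its type."""
    allowed = get_args(EnvironmentType)
    if env_type not in allowed:
        raise ValueError(f"type must be one of {allowed}, got {env_type!r}")
    return {"type": env_type}


print(make_environment_spec("self_hosted"))  # prints {'type': 'self_hosted'}
```

Validating the value locally means a typo such as 'self-hosted' fails before any request is made.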

Huang Frames AI as a Five-Layer Industry Stack Shifting From Retrieval to Generation

NVIDIA CEO Jensen Huang argued that computing is moving from retrieval to generation, calling it the biggest change in 60 years, and organized the view into a 'five-layer AI cake' framework: energy, chips, infrastructure, models and applications. Speaking with Sequoia Capital partner Konstantine Buhler in a conversation NVIDIA amplified on June 12, 2026, Huang said every successful application draws on all the layers beneath it and referenced an AI economy exceeding $20 trillion. The framework was also discussed at Davos 2026.

NVIDIA Unveils Omnimodal World Model Cosmos 3 at CVPR 2026

NVIDIA presented Cosmos 3, an omnimodal world model that processes language, image, video, audio and action in a single model and extends to robot control, at CVPR 2026. The announcement accompanied the open-weight Nemotron 3 Ultra MoE, reinforcing NVIDIA's push into physical AI and unified multimodal modeling.

Google Demonstrates Character Re-Styling in Gemini Omni Video Editing

Google's Google Flow account published a demo on June 12, 2026 showing Gemini Omni's video editing changing a single character into multiple different looks while maintaining consistency through conversational editing. Gemini Omni, which replaces Veo 3.1, accepts any combination of text, image, audio and video, and automatically applies SynthID and C2PA Content Credentials. It is available to Google AI Plus, Pro and Ultra subscribers via Google Flow, the Gemini app and YouTube Shorts.

Open Coding Models Multiply Across U.S. and Chinese Labs

Beyond MiniMax M3 and Kimi-K2.7-Code, Google released the experimental 26B open model DiffusionGemma for fast text-diffusion generation, and Cohere released the open-source North Mini Code. Claude Fable 5 showed advanced agentic ability by producing a working V8 engine CAD design in under 10 minutes, while Anthropic's Opus 4.8 reportedly slipped to 23rd in Design Arena single-turn web development. Together, the results underscore intensifying competition in open-weight agentic and coding models.

Key trends

Video and Audio Tooling Updates From fal, PixVerse and Others

fal open-sourced an audio-reactive LoRA for LTX2.3 to strengthen beat synchronization, while PixVerse launched a Canvas workflow on the web and offered Seedance 2.0 discounts. Kling and Vidu shared new creative examples, and Pictory published an industry report analyzing over 1.5 million videos. In audio, Suno highlighted sampling-based psych-rock creation, and Vapi argued that perceived-quality factors such as silence and interruption matter more than raw latency.