Archive2026.06.25

AI Industry Daily News

A roundup of the AI industry's day, centered on Codex Windows support, grok-build-0.1, Claude Opus 4.8, Command A+, and Rosalind Biodefense.

Today's highlights

OpenAI / Hardware

OpenAI Unveils In-House AI Chip Jalapeño, to Be Mass-Produced With Broadcom

OpenAI announced its first in-house AI chip, Jalapeño, designed for LLM inference workloads across ChatGPT, Codex, the API and agent products.

OpenAI / GPT-5.5 Instant

OpenAI Rolls Out Updated GPT-5.5 Instant to ChatGPT and API

OpenAI released an updated version of GPT-5.5 Instant, ChatGPT's default model, with improvements to intent understanding, complex-constraint handling and shopping experiences.

Perplexity / Legal

Perplexity Launches Computer for Counsel for Legal Work

Perplexity introduced Computer for Counsel, a legal-focused product integrating research databases, document tools and matter management.

Z.ai / GLM-5.2

Z.ai's GLM-5.2 (Max) Ranks Second on Code Arena Frontend Leaderboard

The open-weight GLM-5.2 (Max) placed second overall on Code Arena: Frontend, beating several Claude Opus models in head-to-head matchups.

ByteDance / Seedance

ByteDance's Seedance 2.0 Mini Arrives on Replicate and fal

Seedance 2.0 Mini offers roughly double the speed and half the cost of prior versions while maintaining multi-shot consistency, with native 4K generation under $10.

Key topics and reactions

OpenAI / Hardware

OpenAI Unveils In-House AI Chip Jalapeño, to Be Mass-Produced With Broadcom

OpenAI introduced Jalapeño, its first internally developed AI chip, which it will mass-produce with Broadcom. The chip is optimized for large language model workloads spanning ChatGPT, Codex, the API and agent products.

The move underscores a broader industry shift in which major vendors are turning to custom silicon to reduce inference costs. It sits alongside trends such as open-model adoption and cheaper-model prioritization as companies look to manage runaway AI compute spending.

OpenAI / GPT-5.5 Instant

OpenAI Rolls Out Updated GPT-5.5 Instant to ChatGPT and API

On June 24, 2026, OpenAI began rolling out an updated GPT-5.5 Instant, the default model in ChatGPT. The update improves how the model grasps user intent, follows complex constraints, and summarizes shopping and local recommendations. It reached paid users the same day and free users by June 25, and applies to both ChatGPT and the API via the `chat-latest` alias.

GPT-5.5 Instant is the low-latency line that prioritizes concise answers and personalization while reducing hallucinations. It was introduced as ChatGPT's default model on May 5, 2026, replacing GPT-5.3 Instant, and has received successive updates including response-style improvements on May 28 and personalization upgrades on June 9. Context windows vary by plan: 16K for Free, 32K for Plus/Business, and 128K for Pro/Enterprise.

OpenAI has not published pricing or specific benchmark figures for the June 24 update. Some users praised the improved intent understanding and everyday usefulness, while others said the frequent model swaps were tiring and that a higher-reasoning mode remained preferable for tasks requiring deep thinking.

Perplexity / Legal

Perplexity Launches Computer for Counsel for Legal Work

Perplexity announced Computer for Counsel, a product tailored to legal work that links research databases, document tools and matter management in one workflow.

It is available to Pro and Max users.

Z.ai / GLM-5.2

Z.ai's GLM-5.2 (Max) Ranks Second on Code Arena Frontend Leaderboard

GLM-5.2 (Max), the latest open-weight flagship from Tsinghua-affiliated Z.ai (Zhipu AI), ranked second overall on Arena.ai's Code Arena: Frontend, behind the currently unavailable Fable 5. It took second in React, fourth in HTML, and led on Design Arena with an Elo of 1360. In one-on-one matchups it beat Opus 4.7 Thinking 55.0% of the time and led Kimi-K2.6 and Sonnet 4.6 by wider margins; its closest competitors were GPT-5.5 (xHigh) and Opus 4.6.

Released in mid-June 2026, GLM-5.2 builds on the GLM-5 series and substantially raises long-context and agentic performance over GLM-5.1. It targets repo-scale refactoring, multi-hour autonomous agent work, and frontend design and coding. It carries an unrestricted MIT license with day-one support across Claude Code, Cline, Roo Code, Goose, OpenCode and ZCode.

Technically, it is a Mixture-of-Experts model with 744B–753B total parameters and roughly 40B active per token. It adds an IndexShare optimization to DeepSeek Sparse Attention, reducing per-token FLOPs at one million tokens by about 2.9x, and offers a practical one-million-token context window with output up to roughly 131,000 tokens.

Category highlights

Qwen-AgentWorld Models Environments Directly to Cover Seven Settings

Qwen-AgentWorld introduced an approach that treats imitating the environment itself as a learning objective, allowing a single model to handle seven environments. Separately, AI21 took first place on DeepResearch Bench II, and Google DeepMind has begun discussing an 'agent economy' in which millions of AI agents negotiate and transact.

Runway Adds Ad Localization; xAI Releases Grok Imagine Video 1.5

Runway added an ad localization feature that generates versions for all markets and languages from a single image with one click. xAI's Grok Imagine Video 1.5 became generally available for fast video generation. Vidu, PixVerse and Higgsfield shared production tips, while Seedance 2.0's skin texture and realism drew attention.

ElevenLabs Launches Colin Cowherd Sports AI in FOX Sports App

ElevenLabs released a 'Colin Cowherd Sports AI' in the FOX Sports app to answer questions about the World Cup and fantasy sports. Separately, Vapi detailed barge-in technology for interruptible voice conversations.

Claude Desktop, Kimi API and Databricks Updates

Claude Desktop gained full Chat/Cowork/Code functionality via Amazon Bedrock, and the Kimi API arrived on AWS Marketplace. Databricks was named a Leader in Gartner's Magic Quadrant for AI platforms. Luma launched Connectors for Airtable, Dropbox and Drive, while xAI expanded Grok Build plugins including MongoDB and Firecrawl. Gemini Omni is slated to arrive soon on the Gemini API and Enterprise Agent Platform.

HeyGen Adds Look Packs for Consistent AI Avatar Faces

HeyGen introduced Look Packs, which keep an AI avatar's face consistent across different outfits, settings and styles after a Digital Twin is created. The feature addresses the common problem of faces changing between generations, though it requires an initial Digital Twin and the free pack is limited to first-time use.

Key trends

Google Details Using Gemini for Founder Personal Branding

A partner article published around June 24, 2026 outlines using Google Gemini to help founders build a personal brand through brainstorming, organizing expertise and generating strong first drafts. It presents a workflow from uploading materials to drafting LinkedIn posts, available via gemini.google.com and the Gemini for Google Workspace integration, while noting edits to match one's own voice remain essential.

Eco Wave Power Uses NVIDIA Omniverse Digital Twins for Wave Power and Data Centers

NVIDIA disclosed that wave power developer Eco Wave Power (NASDAQ: WAVE) is using NVIDIA Omniverse digital twins and GPU-accelerated computing to allocate data center compute tasks based on wave-state forecasts. A participant in NVIDIA's Inception program, the company aims to match data center workloads to 24-hour wave power generation, with a pipeline exceeding 400MW.

Google I/O 2026 Deep Dive and Together AI Workshop

A Google I/O 2026 deep dive was published on the Google Developer Program dashboard, and the Gemma team sponsored a one-day hackathon on Kaggle. Together AI held an agent inference workshop at the AI Engineer World's Fair and reported migrating 400 trillion tokens of production workloads.