ainewsblitz.com

Breaking · Design Arena

MiniMax M3 Ranks #4 on Design Arena's Open-Weights Leaderboard With 1319 Elo

MiniMax M3, the newly released open-weights model from Chinese AI company MiniMax (@MiniMax_AI), has landed at #4 overall on the open-weights leaderboard of the design-focused benchmark Design Arena, posting an Elo rating of 1319. Design Arena (@Designarena) announced the figure on June 2, 2026; it represents a 35-point Elo improvement over the previous model, M2.7.

The score places M3 in the same performance band as GLM 5.1 (@Zai_org), Kimi K2.6 (@Kimi_Moonshot), and MiMo-V2.5 Pro (@XiaomiMiMo). A follow-up post on Design Arena emphasized that M3 is positioned as the first open-weights model to support coding, agentic and multimodal capabilities together.

MiniMax formally released M3 around June 1, 2026 via API, with weights and a technical report due within roughly ten days of launch (see the official blog and an apidog explainer). The evaluation venue, Design Arena, is a crowdsourced benchmark from Y Combinator's S25 batch that ranks design outputs (UI, Website, 3D Design, Data Visualization) via Elo ratings derived from real user votes. Unlike text-centric benchmarks, it emphasizes look, usability and aesthetics, and Y Combinator notes participation from over 50,000 users across more than 140 countries.
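To give a sense of what an Elo gap means in a vote-based ranking, here is a minimal sketch of the standard Elo model in Python. Design Arena's actual rating procedure is not described here, so the function names, the K-factor of 32 and the 400-point scale are assumptions taken from the textbook formula, not from the benchmark itself. The M2.7 rating of 1284 is simply 1319 minus the reported 35-point gap.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A is preferred over B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return the new (rating_a, rating_b) after one head-to-head vote.

    k is an assumed K-factor; the benchmark's real value is not known here.
    """
    expected_a = expected_score(rating_a, rating_b)
    actual_a = 1.0 if a_won else 0.0
    delta = k * (actual_a - expected_a)
    # Points gained by one side are lost by the other.
    return rating_a + delta, rating_b - delta


# Reported M3 rating versus the rating implied for M2.7 (1319 - 35).
m3_rating = 1319
m27_rating = m3_rating - 35
print(f"Expected M3 preference rate: {expected_score(m3_rating, m27_rating):.3f}")
```

Under these assumptions, a 35-point lead corresponds to being preferred in roughly 55% of head-to-head votes, which is a modest but consistent edge.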

The M series strengthened agentic and coding performance through M2, M2.5 and M2.7, and M3 adds Sparse Attention to speed up handling of 1M-token contexts. VentureBeat had earlier reported up to a 15.6x decoding speedup. MiniMax frames M3 as the first open-weights model to combine frontier-level coding, a 1M context window and native multimodality simultaneously. The claim is a sign of intensifying competition among Chinese model families, including GLM, Kimi and Qwen.

By category on Design Arena, M3 scored Elo 1324–1346 in 3D Design (win rate 61.7–63.7%, top 5–9%), Elo 1305–1316 in Website, and Elo 1319–1320 in code categories (top 8%), while Data Visualization sat in the top 10–15% and UI Component in the top 11–14%. Average generation time is about 292 seconds, with strength in blog and dashboard outputs, per OpenRouter benchmarks. Official specs cite SWE-Bench Pro 59.0%, Terminal Bench 2.1 66.0% and MCP Atlas 74.2%, and claim it surpasses Gemini 3.1 Pro on OmniDocBench. M3 is available via the MiniMax API and OpenRouter, with pricing cited around $0.30 per million input tokens and $1.20 per million output tokens.
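As a rough illustration of the cited pricing, the sketch below estimates the cost of a single request in Python. The rates are the approximate figures quoted above ($0.30 and $1.20 per million tokens); the function name and the example token counts are hypothetical, and actual billing may differ by provider.

```python
# Approximate rates cited above, in USD per one million tokens.
INPUT_USD_PER_MILLION = 0.30
OUTPUT_USD_PER_MILLION = 1.20


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the cited per-token rates."""
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


# Hypothetical request: a full 1M-token context with a 50,000-token reply.
print(f"${estimate_cost_usd(1_000_000, 50_000):.2f}")
```

At these rates, filling the entire 1M-token context costs about $0.30 in input alone, so output length drives cost only when replies are long.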

Reactions are largely positive. On Reddit, users said that thinking and planning improved clearly over M2.7, that it looks promising for agentic workflows and code review, and that paired with OpenCode it could approach Claude-level quality. Others flagged that it is somewhat slow, inconsistent in coding, weak at game generation and not yet sufficient for serious app development (see a YouTube review). With user reports still limited just after launch, the prevailing view is that M3 is a notable open-weights model now standing alongside frontier systems.
