Twitter/X, June 14, 2026
Model

On AIME 2025 math reasoning: 97%. On SWE-Bench Pro: matches Claude Opus 4.6. In blind evals: preferred over Claude Sonnet 4.6. If you're building on Claude or GPT, you now have a serious third option on the cloud you probably already use.
